Skip to content

Conversation

@petrelharp
Copy link
Collaborator

No description provided.

@bhaller
Copy link
Contributor

bhaller commented Aug 18, 2018

Use the PedigreeID() method to access the pedigree id, rather than accessing pedigree_id_ directly; that's a private ivar. Your build didn't fail because you're building SLiMgui, which has special access to ivars all over the place to make life simpler. If you build the SLiM target instead, your build will fail when you use private ivars like this. Nice that Travis-CI automatically checks pull requests, I didn't know it would do that! :->

@bhaller
Copy link
Contributor

bhaller commented Aug 18, 2018

I'm not sure why the reset to -1 for the individual column would be unneeded now. Simplify may not throw away individual table entries any more if the corresponding nodes are gone; but the same is not true for SLiM, right? There can be nodes in the tree sequence for which (a) SLiM has forgotten the individual because it is dead and gone, and (b) it was not a "remembered individual", so there's no saved individual table entry for it, right? So won't those nodes still need to be given -1?

Re: the "which I think is OK but should be noted in the documentation" comment, I don't see that that is an issue at all; like, what would it even mean for it to be otherwise? Remembering an individual is like setting a flag, right? Once the flag is set, setting it again makes no difference; flags can't be "double-set". So in my opinion you can just delete that "NOTE:" comment entirely, I think it's obvious.

No other comments at the moment; I haven't reviewed it in fine detail yet, but it looks good to me. Thanks!

// TODO: CHECK THIS IS UNNEEDED?!?
// reset the individual column to -1 (which is MSP_NULL_INDIVIDUAL)
memset(p_tables->nodes->individual, 0xff, p_tables->nodes->num_rows * sizeof(individual_id_t));
// memset(p_tables->nodes->individual, 0xff, p_tables->nodes->num_rows * sizeof(individual_id_t));
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why the reset to -1 for the individual column would be unneeded now. Simplify may not throw away individual table entries any more if the corresponding nodes are gone; but the same is not true for SLiM, right? There can be nodes in the tree sequence for which (a) SLiM has forgotten the individual because it is dead and gone, and (b) it was not a "remembered individual", so there's no saved individual table entry for it, right? So won't those nodes still need to be given -1?

Ah, this is important. We don't want to recompute tables.nodes.individualeach time around. But, correct me if I am wrong, but I think this is OK. Simplify does throw away individuals with no remaining nodes; but these individual's nodes are guaranteed to be around still. The only individuals kept in tables.individuals are Remembered ones -- others get added to the copy of the tables that get output, but that procedure doesn't touch the underlying tables that get carried forward.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect I'm missing the point somehow. But p_tables is the output table, and contains all of the nodes in the current tree sequence. The job of this method is to set up individual-related information in p_tables, not just for the remembered and extant individuals, but also for all of the nodes in p_tables. Some of the nodes in p_tables will correspond to individuals that were not remembered and are no longer extant, and so the p_tables->nodes->individual entries for those nodes need to have -1 set for their individual. So the strategy up until now, as I understand it, has been to set p_tables->nodes->individual to all -1, and then to fill in the right values for those nodes that still correspond to an individual in the table. The only way that setting to -1 would be unnecessary would be if every entry in p_tables->nodes corresponded to an entry in the individuals table anyway, in which case every -1 value would end up being overwritten in the loop below. But that is not the case here, as I understand it. No? Am I being dense? Maybe we need to talk about this in person.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the nodes in p_tables will correspond to individuals that were not remembered and are no longer extant, and so the p_tables->nodes->individual entries for those nodes need to have -1 set for their individual.

Ah, I see. If an individual was not remembered, then it will not appear in the individual table. Even if it was alive at the time of a previous output, it never would have been added to tables, only the temporary-tables-for-output.

@petrelharp
Copy link
Collaborator Author

Ah, thanks. I'm moving discussion to the code, so we can keep the different thread separate.


// add this individual to the individual table if not present already
// NOTE: this means if an individual is remembered more than once, subsequent times have no effect
// ... which I think is OK but should be noted in the documentation
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see that that is an issue at all; like, what would it even mean for it to be otherwise? Remembering an individual is like setting a flag, right? Once the flag is set, setting it again makes no difference; flags can't be "double-set". So in my opinion you can just delete that "NOTE:" comment entirely, I think it's obvious.

Recall that we store mutable things like age and location.

And, I think I disagree with my former self -- a more consistent behavior would be to store the state at the last time an individual was Remembered. For one thing, it's wierd for subsequent Remembers to have no effect; for another, if an individual is both Remembered and alive, then what gets output is the alive version, which is necessary to allow recovering from the saved state.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I misunderstood what you were talking about here. So there is, we agree, no such thing as being "double-remembered" as such; the point is just that each time an individual is remembered its information in the table should be updated to the current value (or not). I agree with your current self: the information ought to be updated, yes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some time you should show me how you manage to associate discussion threads with particular places in the code of the pull request, that's nice. :->

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In "files changed", hovering over a line of code makes a blue plus mark pop up; if so, when you click you can make a comment there.

@petrelharp
Copy link
Collaborator Author

Hm, there is one way that the current scheme will not allow complete recovery of saved state: if an individual is both Remembered and alive, but had a different age when Remembered, then currently we have no way of passing on what their age was when Remembered. I don't see how to do this at all reasonably, though; maybe not a big deal?

@bhaller
Copy link
Contributor

bhaller commented Aug 18, 2018

Hmm. In some ways the natural way to preserve that information would be to write the remembered/extant individual to the individuals table twice, once in the remembered section with the remembered info, once in the extant section with the extant info; but of course that does not mesh well with the rest of the design, and is not worth pursuing. I agree that it is not a big deal. The user has no way, in SLiM, of accessing information about remembered state; all that is accessible in SLiM is current state, so that is all that needs to be saved/restored as far as SLiM is concerned. It is possible that someone would want to use rememberIndividuals() to specifically remember individual state at a particular moment in time, with the intention of using that information in analysis in Python, and that getting later information instead would be a problem for them. But I think it's reasonable for us to just say, well, that's not what rememberIndividuals() means. If they want that, they can (a) write a .trees file out at the point in time they're interested in, or (b) write out the information they want in an auxiliary file, at the appropriate point in time, rather than relying on the .trees file for it. So there are solutions for that situation.

@bhaller
Copy link
Contributor

bhaller commented Aug 18, 2018

And how would I have gotten my comment above to be nested under your comment, as a reply? You did that above, somehow, but I see no "reply" option attached to your comment, so I had to start a new unlinked comment.

@petrelharp
Copy link
Collaborator Author

There's been a fair number of upstream changes to tskit, and I've had to merge those in, too. But, it seems to be working, superficially.

}

// our tables copy needs to have a population table to be able to sort it
WritePopulationTable(&tables);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a change: tskit now checks that each node.population refers to a valid entry in the population table.


void SLiMSim::AddIndividualToTable(Individual *p_individual, table_collection_t *p_tables, bool p_addToRemembered)
{
// Maybe this should take a list of individuals instead of a single one?
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had to do the same thing in two places, so I factored this out.

Maybe it should take a vector of individuals? I wasn't sure how to get that in one place, though.


// Add in the mutation.parent information
ret = table_collection_build_indexes(&output_tables, 0);
if (ret < 0) handle_error("table_collection_build_indexes", ret);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also a new requirement; previously, tskit wasn't being careful enough about when its indexes were invalidated.

// This overwrites whatever might be previously in the individual table
individual_table_clear(p_tables->individuals);
// First prepend the existing remembered individuals
individual_table_copy(tables.individuals, p_tables->individuals);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, but I'm being dense. This function is called here, on output_tables, which has just been copied from tables. So these two lines (individual_table_clear and individual_table_copy) are unnecessary. What we want to do is add some more lines to the end of the existing individual table.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this method should be called AddCurrentGenerationToIndividuals instead of WriteIndividuals.

individual_table_clear(p_tables->individuals);
// First prepend the existing remembered individuals
individual_table_copy(tables.individuals, p_tables->individuals);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is called here, on output_tables, which has just been copied from tables. So these two lines (individual_table_clear and individual_table_copy) were unnecessary. What we want to do is add some more lines to the end of the existing individual table.

Maybe this makes more clear that we don't need to erase the existing nodes.individual column?

}

void SLiMSim::WriteIndividualTable(table_collection_t *p_tables)
void SLiMSim::AddCurrentGenerationToIndividuals(table_collection_t *p_tables)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now reflecting what this actually does

@petrelharp
Copy link
Collaborator Author

And how would I have gotten my comment above to be nested under your comment, as a reply? You did that above, somehow, but I see no "reply" option attached to your comment, so I had to start a new unlinked comment.

In the "discussion" section there's just one thread; that's why starting threads in the code is useful.

@petrelharp petrelharp force-pushed the remember_individuals branch from 63e3f4d to 68c71d1 Compare August 22, 2018 18:01
@petrelharp
Copy link
Collaborator Author

This is good to go, I think. I've tidied up the commits somewhat.

@petrelharp petrelharp force-pushed the remember_individuals branch from 68c71d1 to e8b503f Compare August 22, 2018 23:35
@bhaller bhaller merged commit 5ee6eb8 into MesserLab:master Aug 22, 2018
@bhaller bhaller deleted the remember_individuals branch August 22, 2018 23:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants