I think I discovered a bug in SLiM's processing of recombination rates.
I'm trying to simulate the impact of purifying selection against non-synonymous deleterious variants on linked intergenic/intronic neutral sites. I've generated a recombination map based on real genome annotations, specifying recombination rates between adjacent exons and/or neutral sites.
I noticed that once the number of recombination regions gets large enough (in my case, a little over 1 million individual regions), SLiM crashes with a segmentation fault.
Investigating this more closely, I found that this is the offending line:
https://github.com/MesserLab/SLiM/blob/master/core/chromosome.cpp#L129
SLiM creates an array of doubles, B, on the stack to store the recombination rates. The stack is limited in size, so if the number of recombination rates is too large, the allocation overflows the stack and the program crashes with a segmentation fault.
Whether this happens depends on the stack limit. For example, the stack limit on my system is ~8 MB. Since sizeof(double) is 8 bytes, 8 MB holds a bit over 1 million doubles, so the stack overflows when more than roughly 1 million recombination regions are specified by the user (my case exactly).
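To illustrate the failure mode (this is not SLiM's actual code, just a minimal sketch; the function name fill_rates is made up, and it assumes GCC/Clang, which accept variable-length arrays in C++ as an extension):

```cpp
#include <cstddef>
#include <cstdio>

// Minimal illustration (not SLiM's code): an array whose size comes from
// user data is allocated on the stack, so a large enough input overflows it.
void fill_rates(std::size_t n)
{
    double B[n];                        // stack allocation: n * 8 bytes (VLA, a compiler extension in C++)
    for (std::size_t i = 0; i < n; ++i)
        B[i] = 0.5;                     // writing into the array is what actually hits the guard page
    std::printf("last rate: %f\n", B[n - 1]);
}

int main()
{
    // With an ~8 MB stack limit, about 1,048,576 doubles is the ceiling;
    // asking for well past that typically ends in a segmentation fault.
    fill_rates(2'000'000);
    return 0;
}
```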
The stack overflow was easy to fix by using a std::vector (which allocates its storage on the heap) instead of a plain C array. You can see the change I made in this commit: https://github.com/bodkan/SLiM/commit/7073e7ffc37de882b3b6f2a5f854d63399056df3
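The general shape of that change (sketched here as a pattern, not copied from the commit) is simply swapping the stack array for a heap-backed std::vector:

```cpp
#include <cstddef>
#include <vector>

// Same hypothetical function as above, with the fix applied.
void fill_rates(std::size_t n)
{
    // Before (stack, crashes for large n):
    //   double B[n];
    //
    // After (heap, limited by available memory rather than the stack limit):
    std::vector<double> B(n);
    for (std::size_t i = 0; i < n; ++i)
        B[i] = 0.5;
    // B.data() yields a double* if downstream code expects a raw pointer.
}
```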
I've uploaded a test recombination map file and a simple SLiM configuration script generated from it, which should be enough to reproduce the bug.
Let me know if this makes any sense!