Converting time boundaries into years #56

Open
rancilhac opened this issue Sep 21, 2023 · 3 comments

Comments

@rancilhac

Hello,

I am using MSMC2 to infer split times between several populations, and when it comes to converting the time boundaries output by MSMC2 into years I see that there are two formulas available:

As far as I understand, the difference is that the second formula places the time of a given segment at its midpoint, while the first places it at the left boundary. However, the two give fairly different estimates in my case, and I'm wondering which one I should prefer.

Thanks in advance,
Loïs

@stschiff
Owner

Hmm, I'm not sure I understand the context of the two quotes above. I can't remember exactly what's in the book, but I guess the way to plot curves and estimate split times goes like this:

  1. Determine the cross-coalescence rate as a continuous but step-wise function, as done by the tool combineCrossCoal.py from the MSMC-tools repo.
  2. Look at this curve and check when it crosses the CCR=0.5 threshold.

That's it, right?

So basically, you should never convert a time segment to a single time. Instead, treat each time segment as an interval with a start and an end point, and treat the coalescence rates as continuous but step-wise functions through time.
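The interval view described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `first_crossing_interval` and all numbers are hypothetical, the boundaries are assumed to be already converted to years, and the relative CCR per segment is assumed to be already computed. It is not taken from combineCrossCoal.py.

```python
def first_crossing_interval(left, right, ccr, threshold=0.5):
    """Return the (start, end) of the first segment, going back in time,
    whose relative cross-coalescence rate is at or above `threshold`.

    The CCR is treated as a step function: constant within each segment.
    Returns None if the threshold is never reached.
    """
    for start, end, value in zip(left, right, ccr):
        if value >= threshold:
            return (start, end)
    return None


# Hypothetical segments (in years) and CCR values, for illustration only.
left_years = [0, 1000, 5000, 20000]
right_years = [1000, 5000, 20000, 100000]
rel_ccr = [0.02, 0.2, 0.7, 0.98]

split_interval = first_crossing_interval(left_years, right_years, rel_ccr)
print(split_interval)
```

With a pure step function the answer is an interval rather than a single time, which is the point made above.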

@rancilhac
Author

Hi @stschiff,
Sorry, my message was unclear, and things were not very clear in my head either. Indeed, the time segments are intervals, but I'm confused about how to convert the segment boundaries in the output of combineCrossCoal.py into years. I think the first formula, time_boundary/mu*gen, makes sense and corresponds to what you describe. However, in the script associated with the book chapter (https://github.com/StatisticalPopulationGenomics/MSMCandMSMC2/blob/master/plot_msmc.py), the midpoints of the time segments are used to calculate the time at which the CCR crosses the 0.5 threshold, and I don't understand why. Sorry if I'm missing something obvious!
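For reference, the boundary conversion mentioned here, time_boundary/mu*gen, can be written as a small Python sketch. The mutation rate `MU`, the generation time `GEN`, the helper name `to_years`, and the scaled boundaries are all placeholder assumptions chosen for illustration; use the values appropriate for your species and your own output file.

```python
def to_years(scaled_time, mu, gen):
    """Convert a mutation-scaled time boundary into years: t / mu * gen."""
    return scaled_time / mu * gen


# Assumed values, for illustration only.
MU = 1.25e-8   # mutation rate per site per generation (assumption)
GEN = 30.0     # generation time in years (assumption)

# Hypothetical scaled left/right boundaries of four time segments.
left_scaled = [0.0, 1.0e-5, 2.5e-5, 5.0e-5]
right_scaled = [1.0e-5, 2.5e-5, 5.0e-5, 1.0e-4]

left_years = [to_years(t, MU, GEN) for t in left_scaled]
right_years = [to_years(t, MU, GEN) for t in right_scaled]

# Midpoint of each segment, in years (the alternative placement).
mid_years = [(lo + hi) / 2 for lo, hi in zip(left_years, right_years)]

for lo, mid, hi in zip(left_years, mid_years, right_years):
    print(f"{lo:.0f}  {mid:.0f}  {hi:.0f}")
```

The conversion itself is the same in both cases; the two approaches differ only in which point of each segment (left boundary or midpoint) the segment's value is attached to.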

@stschiff
Owner

Those are minor differences. Just look at the scripts and judge whether the calculation makes sense to you. It's just linear interpolation. Not sure what to advise here.
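To make the comparison concrete, here is a minimal Python sketch of linear interpolation for the CCR = 0.5 crossing, run once with segment values attached to left boundaries and once with values attached to midpoints. The function `interpolate_crossing` and the data are hypothetical and written for this illustration; this is not a copy of plot_msmc.py, so check the script itself for what it actually does.

```python
def interpolate_crossing(times, ccr, threshold=0.5):
    """Linearly interpolate the time at which `ccr` first rises through
    `threshold`, given one time point per segment. Returns None if the
    curve never crosses the threshold."""
    for i in range(1, len(times)):
        lo, hi = ccr[i - 1], ccr[i]
        if lo < threshold <= hi:
            frac = (threshold - lo) / (hi - lo)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    return None


# Hypothetical segments (in years) and CCR values, for illustration only.
left_years = [0, 1000, 5000, 20000]
right_years = [1000, 5000, 20000, 100000]
rel_ccr = [0.02, 0.2, 0.7, 0.98]
mid_years = [(lo + hi) / 2 for lo, hi in zip(left_years, right_years)]

split_from_left = interpolate_crossing(left_years, rel_ccr)
split_from_mid = interpolate_crossing(mid_years, rel_ccr)
print(split_from_left, split_from_mid)
```

On these made-up numbers the two placements give roughly 3400 and 8700 years. Because segments get wider further back in time, the gap between the two estimates grows with the width of the segment in which the crossing falls.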
