Friday, January 21, 2022

Deconstructing Jeanson's Study on Y-Chromosome Recent Common Ancestor

 Dr. Nathaniel Jeanson has released a study on Y-chromosome mutation rates and their implications for the age of the Most Recent Common Ancestor (MRCA) that has caused quite a stir. This is because the results of his study point to all or almost all living males descending from a single male ancestor (or at least a small group of closely related males) as recently as 4,500 years ago. 

Not coincidentally, this roughly matches the date of the flood of Noah if one applies the reasoning of Bishop Ussher to the dates of the Masoretic text of Genesis. Ussher's reasoning is missing something, and there is a case to be made for the Septuagint numbers matching the original autographs, but even if you double, triple, or multiply this number by ten, the implications of what Jeanson is claiming are still astounding. His findings fly in the face of a lot of other scientific research. That in itself is a red flag, though not necessarily a disqualifying one, because much advancement in science has come from showing that what the previous group of scientists believed was either mistaken or incomplete in an important respect. 

Nevertheless, Jeanson has this one wrong, and I am going to explain why. Real science should not just dismiss contrary findings, but also be able to explain why they are mistaken. Sadly, I haven't found any unbelieving science aficionados either willing or able to do this. Thus it falls to me, a creationist (albeit one who thinks creationism and evolution have a complicated relationship), to explain to unbelieving skeptics and believers alike where Jeanson went wrong. He is the most wrong, but that doesn't mean I think the current mainstream date is correct either. The latest research shows that conventional wisdom also has the Y-haplogroup MRCA date wrong. Taking this latest information into account, I explain here why I think it is very possible that the 'real' Y-haplogroup MRCA date is only between 50 and 70 thousand years ago. 

To the unbelievers, if you read through this and decide that my science and reasoning are sound, well I tell you that my theology is way better. It is more worth listening to than what I say about science. And I ask you to consider it with an open mind and dare to sincerely ask God to show you Himself in it. 

To the believers I say, do not be dismayed if the earth and humanity are older than you thought. Nor even if you are mistaken concerning what the flood was about. As Christians, our hope isn't in never being mistaken, but that Christ has atoned for all of our sins, and loves us through our mistakes. Believe me, the text can still be true, and is true and is more wonderful and points more to the truth of Christ than you (or I ten years ago) could have even imagined. The church has been misreading it, but that doesn't make it false. It makes what they have been saying about it false. In former times, these errors did not matter, but now in order to witness the truth of God's Word we must cast aside what we thought it said before and be like the more noble Bereans who study carefully. When we do, we will find that the narrative of the text is Christ-centric and that just seeing it that way resolves all of these supposed conflicts with science. 

Now before we start I should point out the nature of MRCA data. It is theoretical; it may not even be describing an event that really happened. For example, if humanity started as a population instead of as a single couple, as most skeptics and I (though I still believe in a literal historical Adam) think, then the "MRCA" number may be "how far in the past something that never happened occurred". You can't say humanity is over 200,000 years old just because the MRCA date for Y-haplogroups or anything else is that far back. The date would only apply if humanity sprang from a single couple and did not start as a population. If humanity started as a population with some diversity, then the MRCA date is going to be much older than the actual age of our species. I don't want to go into the weeds here, so go to what I write about that here for more details if you care to. 

Even this study, were it correct, can't tell us that all male humans spring from a single common ancestor only 4,500 years ago. It could be that there were a thousand fathers at that time and they all had the same Y-haplogroup with little genetic variation on this chromosome. Or it could be that half of them had other Y-haplogroups but they failed to leave any male descendants in living populations. We know for example that Y-haplogroup diversity declined precipitously from 9,000 to 5,000 years ago before making a rebound (which I fold into my theology here). The recent decline in Y-haplogroup diversity is real, but that doesn't mean that humanity was ever reduced to a single male. 

That said, I am now going to explain where Jeanson went wrong. I need not delve into the technical jargon and minute details of his methodology to do so. His mistake is so fundamental I don't have to do much of that. I say "mistake," though I think there is more than one in his paper; the others would be smaller errors that he used to massage the data closer to his desired outcome once he got things in the same ballpark. I am going to describe how he got them in the same ballpark when they should not be. 

He starts off with a long discussion of the low-coverage scans of Y-haplogroup data which defined previous such studies. He points out that these miss a lot of variations/mutations. He is right about that and also right about the way time-depth has been calculated. We take the known mutation rate from the short term, such as father to son, and then compare it to the total differences we have found among the various types of Y-haplogroups. If we find 10,000 times as many mutations amongst all haplogroups as we find between father and son (on average), then we conclude that 10,000 generations were needed to produce the mutations that we see. If we assume a reasonable generation time of 20 years, then the MRCA was 200,000 years ago. Those are the basics. The fine-tuning gets more complicated of course, for example factoring in back-mutations, which can mask the total number of mutations that have occurred, but we need not go into those details. 
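
The arithmetic just described can be written out as a minimal sketch. The function name `mrca_years` and its parameter names are my own illustrative choices, and the numbers are the round example figures from the paragraph above, not measured values.

```python
def mrca_years(total_mutations, mutations_per_generation, generation_years):
    """Naive time to MRCA: the number of generations needed to accumulate
    the observed diversity, multiplied by the length of a generation."""
    generations = total_mutations / mutations_per_generation
    return generations * generation_years


# Example figures from the text: 10,000 times as many mutations across all
# haplogroups as between one father and son, with 20-year generations.
print(mrca_years(total_mutations=10_000,
                 mutations_per_generation=1,
                 generation_years=20))
# prints 200000.0
```

Everything that follows is about what happens to this ratio when the numerator and the denominator are not counted the same way.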

Notice that when you follow the procedure I described above, the more mutations you find between father and son (compared to total human mutations), the shorter the time to the supposed MRCA. Conversely, the more total mutations there are in all humans (compared to the number of father-son mutations), the further back in time this MRCA would be. So the trick, if one is attempting trickery, and I am not saying for certain that Jeanson is, would be to measure father-son mutations in a way which would find as many of them as possible, but measure overall human Y-haplogroup diversity in a way which would leave many of the mutations and much of the variation unfound. For a recent date, you want to find lots of mutations in your father-son searches but fewer in your all-of-humanity searches. Jeanson has found a way to do this discreetly.

You can see this when you consider the effects of how well your data collection method picks up these mutations. If you have a low-coverage scan, it may only pick up a fraction of the mutations/variation that are there. Let's say one out of four for the sake of discussion. How can you then get a good number? Through the miracle of averaging. So long as the same collection method is used for determining both father-son rate and total amount of diversity in humanity, the numbers derived should still be sound. Yes, you missed 75% of the mutations in humanity (not in number of Y-haplogroups but all the differences in them) but you also missed 75% of the variation between father-son. Because of that, your time to MRCA should be just as sound as if you found all of the variation in both sets you are comparing. 
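
Here is a small sketch of why the undercount cancels out. The true counts (100,000 total mutations, 10 per generation) and the detection fractions are hypothetical round numbers chosen for illustration, and `estimated_generations` is a name I made up for this example.

```python
def estimated_generations(true_total, true_per_generation, detection_fraction):
    """Generations to MRCA when the SAME scan, which detects only
    `detection_fraction` of the real mutations, is used for both counts."""
    observed_total = true_total * detection_fraction
    observed_per_generation = true_per_generation * detection_fraction
    return observed_total / observed_per_generation


# The same hypothetical truth scanned at full, half, and one-quarter
# sensitivity. The detection fraction divides out of the ratio.
for fraction in (1.0, 0.5, 0.25):
    print(fraction, estimated_generations(100_000, 10, fraction))
# each line reports 10000.0 generations
```

The answer only stays put because the same fraction multiplies both the top and the bottom of the ratio.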

Remember I started by saying Jeanson spent time discussing low vs. high coverage scans? His father-son mutation rates are derived from high-coverage data. He uses that to develop what he calls "tips," which are in effect tiny branch lengths. He then compares that to branch lengths of all Y-haplogroups derived from other studies, but those studies used low-coverage scanning of the genome. So his data for the father-son rate was obtained via a method that maximized the number of mutations found, and he in effect compared that to total branch lengths for humanity obtained via a method that undercounted these mutations. 

I taught middle school science, almost all in public schools, for thirteen years. I taught them that for the results of an experiment to be valid you have to collect the data you are comparing under the same conditions. You can't find out who is the best free-throw shooter by having one person go outside on a windy day and shoot with a lumpy ball in dress shoes at a court by the highway with a bent rim and no net while the other person uses their favorite ball in a quiet gym. 

To stick with our example, if high-coverage data found four times the mutations of low-coverage data then the father-son data might find 10 mutations instead of 2.5. The total number of mutations for mankind was derived from low-coverage methods that perhaps found 25,000 mutations when high-coverage methods would have found 100,000. The real numbers would be different but the proportions are what is important here. 

So using high-coverage data on both ends might give us:

100,000 total mutations / 10 father-son mutations   = 10,000 generations. If generation time is 20 years that is a MRCA date of 200,000 years ago.

Or using the low-coverage data on both ends might give us:

25,000 total mutations / 2.5 father-son mutations  = 10,000 generations. If generation time is 20 years that is a MRCA date of 200,000 years ago. 

But what Jeanson did, once you follow the shells, was use the higher rate for the father-son side and the lower rate for the all-of-humanity side:

25,000 total mutations / 10 father-son mutations  = 2,500 generations. If generation time is 20 years then that is a MRCA date of 50,000 years ago. 

So that takes care of 150,000 of the 195,500 years Jeanson needs to lose. Now he just needs to whittle off another 45,500 years and he has a MRCA date consistent with his (almost certainly incorrect) date of the flood of Noah. Bear in mind that if the high-coverage scans pick up even more mutations relative to the low-coverage scans, this pushes the dates down even more. 
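
To see how the mismatch scales, here is a sketch using the same example numbers. The `coverage_ratio` parameter is my own device for this illustration: it is how many times more mutations the high-coverage scan finds than the low-coverage one. None of these figures are Jeanson's actual values.

```python
def mixed_coverage_years(true_total, true_per_generation, coverage_ratio,
                         generation_years=20):
    """MRCA date when the father-son count comes from a high-coverage scan
    but the total for humanity comes from a scan that finds only
    1/coverage_ratio of the mutations."""
    low_coverage_total = true_total / coverage_ratio
    generations = low_coverage_total / true_per_generation
    return generations * generation_years


# A ratio of 1 means matched methods. A ratio of 4 is the example in the text.
for ratio in (1, 4, 8):
    print(ratio, mixed_coverage_years(100_000, 10, ratio))
# prints:
# 1 200000.0
# 4 50000.0
# 8 25000.0
```

The date shrinks by exactly the coverage ratio, so the bigger the gap between the two scanning methods, the younger the MRCA appears.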

An example of how he whittled the dates down further in an invalid way? I think he did this in part by assuming a generation time of not twenty years, but fifty years (see his supp. table nine). That is, he figured the average age of the father at the birth of the average son was fifty. At first you might think this would make his problem worse because it would be multiplying by longer generations, but this is where his "filtering" came in. What I think he did here was say "we are looking at this data with ten father-son mutations when the average age of the father is twenty, but evidence indicates fathers at fifty have a lot more mutations so we need to adjust the father-son mutation rate up for that". It looks like from supp. figure nine that he adjusted the mutations up four times or even five times. This is masked because he used generation times of fifteen years and fifty years. 

In other words, he used one generation time that is unreasonably young for the average age of a father and one that is unreasonably old. If you consider a more reasonable age of twenty for the father, his use of age fifty is jacking the generation time up 2.5 times (50 vs. 20), but his mutation rate goes up four times or even five times. Let's say it is only four and do the math again.

25,000 total mutations / 40 father-son mutations = 625 generations x 50 years per generation = 31,250 years. 

He's managed to lose another big chunk of time. Again, these are just example numbers except for those from his "supplement nine". To find his exact numbers and use them would be very tedious, but in principle what he is doing is going to drive the numbers way down. Once we see this, the details of his actual calculations are not important. 
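
The net effect of that adjustment can be checked with a short sketch. It uses the example numbers from this post (a fourfold rate increase and a jump from 20-year to 50-year generations), not Jeanson's exact figures, and `mrca_years` is just an illustrative helper name.

```python
def mrca_years(total_mutations, mutations_per_generation, generation_years):
    """Generations needed to reach the total, times years per generation."""
    return total_mutations / mutations_per_generation * generation_years


# Mixed-coverage starting point: 10 father-son mutations, 20-year generations.
baseline = mrca_years(25_000, 10, 20)

# Generation time goes up 2.5x (20 -> 50 years), but the per-generation
# mutation count goes up 4x (10 -> 40), so the date still falls.
adjusted = mrca_years(25_000, 10 * 4, 50)

print(baseline, adjusted, adjusted / baseline)
# prints 50000.0 31250.0 0.625
```

The date is multiplied by 2.5 / 4 = 0.625. Any adjustment that raises the mutation rate faster than it raises the generation time will pull the MRCA date closer to the present.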

The above doesn't even consider something I mentioned earlier: back-mutations. A back-mutation is when a gene mutates, and then mutates again to its previous state. If you take a snapshot at the end, you might think no mutations have occurred when in fact two have. If humanity is as young as he thinks, this would not be a huge factor because the number of back-mutations would not be huge, but it would not be zero either. If mankind has a long history stretching tens of thousands, or even hundreds of thousands, of years into the past, then it is a bigger factor and one which would raise the number you use for "total number of mutations in humanity". In other words, this would push the MRCA date back in time. Jeanson doesn't account for it. 
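
As a rough illustration of how hidden changes grow with time, here is a sketch using the textbook Jukes-Cantor substitution model. That model is my own choice for the example; it is not taken from Jeanson's paper, it counts every kind of repeat hit at a site rather than only exact back-mutations, and the input values are arbitrary.

```python
import math


def observed_fraction(substitutions_per_site):
    """Expected fraction of sites that LOOK different after the given number
    of real substitutions per site, under the Jukes-Cantor model."""
    return 0.75 * (1.0 - math.exp(-4.0 * substitutions_per_site / 3.0))


# Arbitrary illustrative amounts of real change per site, small to large.
for actual in (0.001, 0.01, 0.1, 0.5):
    seen = observed_fraction(actual)
    hidden_share = 1.0 - seen / actual
    print(f"actual {actual:.3f}  observed {seen:.4f}  hidden {hidden_share:.1%}")
```

With very little real change, almost nothing is hidden. As the amount of real change grows, a larger share of it goes unseen in the snapshot, which is why ignoring it matters more for an old population than for a young one.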

But let's just focus on the different scan-coverage rates. It makes a huge difference that gets rid of almost all the years Jeanson needs to lose to support his conclusion. From there he could make much smaller mistakes, or to look at it in a more sinister light, find smaller ways to massage the data, to whittle off the remaining years he needs to lose. I have some idea about other places he did the smaller data massages, but there is no need to explore them. The methodology is already wrecked. 

I conclude that this paper is fundamentally flawed. But I also conclude that Jesus Christ is God and that the scriptures are, when properly understood, true. Young-earth creationists (YECs) are wrong about early Genesis, but they are not wrong about Jesus, and they are not wrong about the scriptures being sacred. 

For more information, please consider my book below. Thank you. 


Get the Book



2 comments:

  1. i didn't understand any of that, but thank you ...it's enough that I think you did understand it ...and as you said, Jesus is really where we should draw our strength and not the age old arguments.

    Thanks again,Mark.
