Saturday, 13 March 2021

The I-Y33765 clade-specific SNP mutation rate and age estimations

Estimating the time to most recent common ancestor (TMRCA) of two men by using SNPs found in the non-recombining regions of the Y-chromosome is helpful in better understanding their supposed relationship.  

To attempt this estimation we need to know the number of reliable SNPs that have formed on each man's Y-chromosome during the period that separates them.  If both men carry a SNP such as I-Y33765, and we want to estimate the time to their most recent shared male ancestor who also carried the I-Y33765 mutation, we start by counting the number of reliable mutations that have formed on their Y-chromosomes since (that is, downstream from) I-Y33765.

From the chart below, using the example of Jacobsson IN70815, we count the observed high quality SNPs listed on the branches of the chart that link Jacobsson to I-Y33765.  High quality SNPs are those shown in red, which indicates that they lie within the CombBED region of the Y-chromosome as defined by Adamov et al. (2015).  The pink annotations identify six such SNPs that are of suitable quality to be used in our calculation.

Next, because each Big Y-700 test will analyse or provide "coverage" of a slightly different length or proportion of the 857 segments that make up the complete CombBED region, we apply a correction to adjust the number of observed SNPs to what they would have been if all CombBED segments (CombBED "coverage" length used by YFull is 8,467,165bp) had been fully analysed.  The formula for this SNP correction is:

 (Observed number of SNPs / Coverage in Big Y-700 test) * Coverage length of CombBED

In the case of Jacobsson IN70815, his Big Y-700 test gave "coverage" of 8,223,356bp.  So, entering his detail into the SNP correction formula we get:

 (6 / 8,223,356) * 8,467,165 = 6.18 SNPs
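The coverage correction above can be sketched in a few lines of Python.  This is only an illustration of the arithmetic in this post: the function name and parameter names are my own, and the CombBED length is the YFull figure quoted above.

```python
# CombBED length used by YFull, as quoted in this post (base pairs).
COMBBED_LENGTH_BP = 8_467_165


def corrected_snp_count(observed_snps, test_coverage_bp,
                        combbed_length_bp=COMBBED_LENGTH_BP):
    """Scale an observed SNP count up to full CombBED coverage.

    Hypothetical helper: (observed / test coverage) * CombBED length.
    """
    if test_coverage_bp <= 0:
        raise ValueError("test coverage must be a positive number of base pairs")
    return observed_snps / test_coverage_bp * combbed_length_bp


# Jacobsson IN70815: 6 observed SNPs, 8,223,356 bp of CombBED coverage.
jacobsson_corrected = corrected_snp_count(6, 8_223_356)
print(f"{jacobsson_corrected:.2f} SNPs")  # 6.18 SNPs
```

A test with lower coverage is scaled up by a larger factor, while a test covering the whole CombBED region would be left unchanged.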

Hence, we estimate that 6.18 mutations have occurred on Jacobsson's ancestral line since the formation of I-Y33765.  As explained in an earlier post, a 19 generation Swedish pedigree suggests that within the I-Y33765 clade a new mutation occurs on average every 109.5 years.  Consequently we can estimate Jacobsson's TMRCA for the I-Y33765 mutation by multiplying the corrected number of SNPs by this clade-specific rate of 109.5 years per SNP and then adding an allowance for Jacobsson's age (YFull adds 60 years) as follows:

(6.18 * 109.5) + 60y = 737 ybp  
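The same step can be written as a small Python sketch.  The function and parameter names are assumptions made for illustration; the 109.5 years per SNP and the 60 year allowance are the values given above.

```python
def tmrca_ybp(corrected_snps, years_per_snp=109.5, age_allowance_years=60):
    """Estimate TMRCA in years before present (ybp).

    Hypothetical helper: corrected SNP count * years per SNP,
    plus an allowance for the tester's age.
    """
    return corrected_snps * years_per_snp + age_allowance_years


# Jacobsson IN70815, using the corrected count rounded to 6.18 as above.
print(round(tmrca_ybp(6.18)))  # 737
```

Note that the result is sensitive to rounding: carrying the unrounded corrected count (about 6.178) through the calculation gives roughly 736.5 ybp rather than 737 ybp, a difference far smaller than the uncertainty in the mutation rate itself.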

 
 
By applying this method to all six NGS results that we have at present within the I-Y33765 clade we can produce estimates of TMRCA for the five phylogenetically significant SNPs within our clade (Table 1).
 

Table 1: Single nucleotide polymorphism (SNP) age estimates made using the mutation rate observed in the Nils Swennsson pedigree

 References

Adamov, D., Guryanov, V., Karzhavin, S., Tagankin, V., Urasin, V. (2015) Defining a new rate constant for Y-chromosome SNPs based on full sequencing data. Russian Journal of Genetic Genealogy 1:3–36
