The Proceedings of the Eighth International Conference on Creationism (2018)
clade resolution (Handt et al. 1998). Y chromosome typing began with more limited sampling of short tandem repeats (STRs) (Purps et al. 2014) and Alu insertion events (Romualdi et al. 2002). Several decades ago, geneticists moved to analyses at the level of single nucleotide variations (SNVs), but only recently did nearly complete whole-chromosome SNV data become available for the Y chromosome. Now that nearly full-length sequences are abundant and readily accessible (Smith 2015), haplogroup identification is no longer restricted to a small set of specific alleles but can employ all variation data found within representative members of all known clades. This is powerful information that can help answer crucial questions regarding human origins. Using several different methods, researchers can create phylogenetic trees that reflect the genetic history of any given set of related people living today. The tree-building algorithms are forced to use approximations when comparing sequence data, and thus the nodes and interior branches do not necessarily reflect real individuals that lived in the past. However, as we will demonstrate, in the case of the human Y and mitochondrial gene trees, each branch point on each tree reflects a historical individual who passed one or more de novo mutations to a child. This means that any branch arises at a specific time, in a specific individual, and that event provides an informative reference point that enables the study of both the group founder and his or her descendants. After combining related sequences into natural haplogroups, it is possible to reconstruct the ancestral sequence of each group with a high degree of confidence. Ancestral sequence reconstruction dates back as far as the pioneering work of Pauling and Zuckerkandl (1963), who introduced the term ‘paleogenetics’.
This field has a strong mathematical basis that has advanced continuously over the decades since the work began. Early parsimony methods like those of Jermann et al. (1995) were largely eclipsed by maximum likelihood methods like those of Pupko et al. (2000), which were followed by Bayesian methods like those of Huelsenbeck and Bollback (2001). Historical sequence reconstructions have many complexities and are subject to multiple confounding factors, such as the presence of incomplete lineage sorting, genomic rearrangements, gene duplication and deletion, varying mutation rates over time, gene conversion, and differing rates of specific mutations. Worse, phylogenetic reconstructions will always yield a tree, even for unrelated organisms (i.e., different created kinds). Furthermore, the assumption that an accurate ‘molecular clock’ exists can also affect the final shape of the tree. Yet the presence of an accurate molecular clock is a highly debatable subject (Wood 2012, 2013; Tomkins and Bergman 2015; Jeanson 2016). Despite the controversy, the molecular clock hypothesis has a profound effect on how phylogenetic trees are constructed. For example, the most dissimilar sequences are usually labeled as the oldest and are generally shown as outlying branches, ignoring the possibility that they might be the same age as the others but have accumulated more mutations. We understand these complexities, but for special chromosomes such as chrY and chrM (i.e., for non-recombining DNA elements with uniparental inheritance), the reconstruction of the ancestral sequence can be relatively simple. There is often no need to identify regions of synteny among diverse lineages, for example, and the alignment is often trivial. This gives us the unprecedented opportunity to examine, in parallel, the histories of both chromosomes, and it has allowed us to shed new light on the genetics of both our primary patriarchal ancestor and our primary matriarchal ancestor.
Methods
The latest Y chromosome, mitochondrial (see Diroma et al. 2014), and chromosome 22 sequence data were obtained from the 1000 Genomes Project page (accessed 17 Apr 2015). High-coverage, high-quality, long-read Y chromosome data for 25 of the 1000 Genomes individuals were obtained from Complete Genomics (ftp://ftp2.completegenomics.com/Multigenome_summaries/Complete_Public_Genomes_69genomes_VQHIGH_testvariants.tsv, accessed 3 Feb 2015). High-coverage Y chromosome data for 176 additional individuals from a diverse worldwide sampling were obtained from the Simons Genome Diversity Project (Malik et al. 2016). We constructed a full distance matrix for the 1000 Genomes Y and mitochondrial sequence data and then created naive neighbor-joining trees using MEGA, version 7 (Tamura et al. 2013) (Figs 1–3). Since our methodology requires multiple sequences within each group under consideration, two Y chromosome sequences (HG03742 and HG02040, from haplogroups K2a1* and F*, respectively) were dropped from the analysis. The International Society for Genetic Genealogy (ISOGG) has curated a detailed table of Y chromosome variants (isogg.org/tree/ISOGG_YDNATreeTrunk.html, accessed 8 Feb 2016). We consulted this table to double-check the 1000 Genomes haplotype assignments and were surprised that two of the “A1b” sequences were strongly associated with variants that define haplogroup A0. Since the generally accepted phylogenetic root falls between these two clades, we split them into groups A0 and A1, following Karmin et al. (2015). Y chromosome haplogroup A0 and mitochondrial haplogroup L0 were used as outgroups. We filtered out any location where more than half of the readings were missing data or where missing data created a complex situation in which the called ancestral allele was incongruent with the main phylogeny. We reconstructed the ancestral sequence for each major haplogroup using a simple decision tree similar to that of Pauling and Zuckerkandl (1963).
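The missing-data filter and the distance matrix described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study (the trees here were built in MEGA), and several details are assumptions made only for illustration: the sample names and toy sequences are hypothetical, a missing call is written as "N", and distance is taken to be a simple count of differing sites, skipping any site that is missing in either sequence. The second filter criterion (missing data that makes the called ancestral allele incongruent with the main phylogeny) is not modeled.

```python
from itertools import combinations

MISSING = "N"  # assumed placeholder for a missing call


def filter_sites(calls):
    """Drop sites where more than half of the calls are missing.

    `calls` maps a sample name to a string of allele calls,
    one character per aligned site.
    """
    names = list(calls)
    n_sites = len(calls[names[0]])
    keep = [
        i for i in range(n_sites)
        if sum(calls[s][i] == MISSING for s in names) * 2 <= len(names)
    ]
    return {s: "".join(calls[s][i] for i in keep) for s in names}


def distance_matrix(calls):
    """Count pairwise differences, skipping sites missing in either sample."""
    names = list(calls)
    dist = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        d = sum(
            1 for x, y in zip(calls[a], calls[b])
            if x != y and MISSING not in (x, y)
        )
        dist[a][b] = dist[b][a] = d
    return dist


# Hypothetical toy alignment: four samples, five sites.
toy = {
    "s1": "ACGTN",
    "s2": "ACGAN",
    "s3": "ATGAN",
    "s4": "ATNAA",
}
filtered = filter_sites(toy)        # the last site is missing in 3 of 4 samples
matrix = distance_matrix(filtered)  # symmetric, zero on the diagonal
```

In this toy case the last site is removed because three of its four calls are missing, while the third site is kept because only one call is missing. A matrix of this form is the kind of input a neighbor-joining procedure takes.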
To assign ancestral alleles, the state of each allele within a group is compared to its state outside the group. There are four possible results:
A. No within-group variability and all other groups fixed for the alternate allele. The change must have happened within the ancestral stem of the group. It is unreasonable to think that multiple parallel mutations happened in all groups but the one under consideration, so that alternative can be discounted. In these cases, the ancestral allele is set to the “Out” value. A special case arises when considering the outgroup (either included by design or by default as the deepest-branching group on an unrooted tree). If the outgroup is different from all others, it is impossible to directly identify the ancestral state, for the mutation could have happened on either side of the main stem. That is, along the branch that leads to the outgroup or
Carter et al. ◀ Y Chromosome Noah and mitochondrial Eve ▶ 2018 ICC 134