Learn all about Y-DNA and its role in ancestry
What is Y-DNA?
Y-DNA is DNA that is found in the Y-Chromosome. Humans have 23 pairs of chromosomes. 22 pairs are autosomal chromosomes and the 23rd pair are the sex chromosomes, which determine a person’s biological sex. Males have one Y-Chromosome and one X-Chromosome (XY), while females have two X-Chromosomes (XX) and no Y-Chromosome.
Paternal inheritance pattern of Y-DNA
Your Y-DNA is passed down to you directly from your paternal ancestors (father to son inheritance pattern). All males who have descended from the same paternal lineage will have exactly the same or very similar Y-DNA profiles.
If two males have very different Y-DNA profiles, it will conclusively confirm that they are not related along their direct paternal line.
Only males can take the Y-DNA Test
Since only males have Y-DNA, only males can take the Y-DNA test to trace their paternal ancestry. Females wishing to trace their paternal ancestry must test the Y-DNA of a male family member such as a brother, father, uncle, or male cousin along their direct paternal lineage.
Why does Y-DNA hold ancestral information?
All males who descended from a common forefather will have the same or a very similar Y-Chromosome.
The paternal inheritance pattern of the Y-Chromosome is identical to the manner in which the surname is passed down in many cultures (i.e. from father to son along the male lineage). As a result, Y-DNA testing will allow any two males to determine whether they descended from the same original family line.
Y-DNA testing helps to solve questions about an individual’s paternal ancestry and can be used to discover and re-unite paternal family lines.
Here is a summary of the unique characteristics of Y-DNA which make it ideal for paternal ancestry analysis:
1. It is found only in males and inherited strictly in a single line of descent from father to son
2. It has a low recombination rate, so it remains stable and unmixed
3. It contains fast mutating STR markers which can be used to compare with other males to see if they descended from the same paternal lineage
4. It contains slow mutating SNP markers which can be used to determine an individual’s ancient ancestry (haplogroups and subclades)
Y-DNA has two types of markers which provide ancestral information
The Y-DNA has two very different types of markers which are useful for tracing paternal ancestry: STR markers and SNP markers.
1. Y-DNA STR Markers
STR stands for “Short Tandem Repeat”. STR markers have a very fast mutation rate (approximately one mutation every 20 generations). The rapid mutation rate of Y-DNA STR markers makes this marker type useful for examining recent ancestry – ancestral events within the past few hundred years.
2. Y-DNA SNP Markers
SNP stands for “Single Nucleotide Polymorphism”. Unlike the rapid mutation rate of STR markers, “SNP” markers have an extremely slow mutation rate (approximately one mutation every few thousand years). SNP markers are used for investigating deep ancestry (ancient ancestry from tens of thousands of years ago) and do not carry any information pertaining to recent ancestry. Y-DNA SNP marker testing will allow you to determine which Y-DNA haplogroup you belong to. Haplogroups are ancient family groups which can be traced back to ancient origins in Africa over 100,000 years ago.
What are Y-DNA STR markers?
The Y-DNA contains many STR (“Short Tandem Repeat”) markers. STR markers are also known as microsatellites and are short sequences of DNA (2-13 base pairs) which are repeated over and over again. The number of repeated sequences for each STR marker varies between individuals and typically ranges from 5-50 repeats.
During a Y-DNA STR marker test, the number of repeats is determined for each STR marker in a large panel. The panel size (number of analyzed markers) varies between tests, but generally at least 20 different markers are analyzed and some tests can examine over 100 different Y-DNA STR markers. The resulting set of repeat numbers is called the individual’s Y-DNA profile or “haplotype”. Please note that “haplotype” is not the same as “haplogroup” (we will be discussing “haplogroups” later in the tutorial).
The number of repeats for each Y-DNA STR marker does not contain ancestral information by itself. However, it becomes useful when it is used to compare against other individuals or against specific population groups. All males who have descended from the same paternal lineage have the same or very similar Y-DNA STR marker profiles. Comparing your Y-DNA STR marker profile against another male in question or against a database of indigenous populations allows you to search for relatives along your paternal line and also allows you to find out which indigenous groups are the closest match to your paternal lineage.
The next four sections discuss the most common types of Y-DNA STR markers that are used for paternal ancestry testing.
Single-Copy Y-DNA STR Markers
Single-copy Y-DNA STR markers are STR markers that occur only once in the human genome. That means that when this marker type is tested to investigate the number of repeats, only one repeat number or “allele value” will be obtained for each marker in your report.
Multi-Copy Y-DNA STR Markers
Multi-copy Y-DNA STR markers are markers that occur more than once (i.e. more than one copy) in the human genome. That means that when this marker type is tested, more than one allele value can be obtained for each of these markers in your report.
For example, the Y-DNA STR markers DYS385, DYS459 and YCAII are typically present at two different locations on the Y-chromosome; therefore, they are also termed “duplicated markers”.
The allele values for each copy are not reported in any specific order, as the exact order of copies cannot be determined, but typically, the smaller allele value is reported first, followed by the larger allele value.
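As a minimal sketch of how such a report could be held in code, the example below stores each marker as a tuple of allele values: one value for a single-copy marker, two for a duplicated marker. The marker names come from the text above; the repeat counts and the `normalize` helper are made up for illustration.

```python
# Hypothetical haplotype: marker names are real, repeat counts are invented.
haplotype = {
    "DYS19": (14,),      # single-copy marker: one allele value
    "DYS385": (14, 11),  # multi-copy marker: two allele values, copy order unknown
}


def normalize(haplotype):
    """Return a copy with each marker's allele values sorted smallest first."""
    return {marker: tuple(sorted(values)) for marker, values in haplotype.items()}


normalized = normalize(haplotype)
print(normalized["DYS385"])  # (11, 14)
```

Sorting the values mirrors the reporting convention described above (smaller value first) and makes two reports comparable even when the copies were read in a different order.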
Special Multi-Copy Marker DYS389
DYS389 is a special marker. Unlike other multi-copy Y-DNA STR markers, only one location of DYS389 is amplified. The forward primer for DYS389 binds at a specific location on the Y-chromosome, whereas the reverse primer binds at two different locations. Such amplification yields two PCR products: the shorter DYS389I fragment and the longer DYS389II fragment, so two allele values are always reported for this marker.
Special rules for reporting Y-DNA STR Marker DYS464
DYS464 is a special Y-DNA STR marker which is known to have 4 to 7 alleles (a to d for 4 or a to g for 7). Previously, the “genotype” was reported. When the genotype is reported, identical repeat values are reported multiple times if the same value is present more than once. For example, someone may be reported as having the following values for DYS464:
DYS464a = 17
DYS464b = 18
DYS464c = 18
DYS464d = 19
In the example above, the value “18” is reported twice. This type of reporting is known as “genotype” reporting. However, recent policies implemented by the American Association of Blood Banks (AABB) mandate that all accredited DNA testing laboratories report the “phenotype” instead of the “genotype” for multi-copy markers, especially if the marker has more than 2 alleles. Although the AABB requirements refer to apparent homozygotes detected on autosomal DNA, the interpretation concerns are similar for multi-copy alleles on the Y-Chromosome, because peak height ratio detection in STR fragment analysis is not a validated method for copy number detection in the absence of real-time PCR and sequencing.
If the phenotype is reported for the above individual, then the same results will be shown as follows:
DYS464a = 17
DYS464b = 18
DYS464c = 19
In the example above, duplicate repeat values for DYS464 are only reported once when reporting phenotypes, so when reporting by phenotype, “18” is only reported once instead of twice.
Phenotype reporting is especially important in cases with more than 2 alleles such as DYS464 which is known to have up to 7 alleles. Currently, only the phenotype will be reported for Y-DNA STR marker DYS464.
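The difference between the two reporting styles can be sketched in a few lines. This is only an illustration of the rule described above (each distinct repeat value is listed once); the function names `to_phenotype` and `label` are hypothetical and do not describe any laboratory's actual software.

```python
def to_phenotype(genotype):
    """Collapse repeated allele values so each distinct value appears once, ascending."""
    return sorted(set(genotype))


def label(values, marker="DYS464"):
    """Attach copy letters a, b, c, ... to the allele values in order."""
    return {f"{marker}{chr(ord('a') + i)}": value for i, value in enumerate(values)}


genotype = [17, 18, 18, 19]          # genotype reporting: 18 appears twice
phenotype = to_phenotype(genotype)   # phenotype reporting: 18 appears once

print(label(genotype))   # four entries, DYS464a to DYS464d
print(label(phenotype))  # three entries, DYS464a to DYS464c
```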
Y-DNA STR Marker Testing
Y-DNA STR testing involves analyzing a panel of STR markers on the Y-Chromosome (usually at least 20 different markers). The combined result of all the markers tested determines your Y-DNA haplotype (or Y-DNA STR Profile) and this haplotype represents the unique genetic code for your paternal ancestral line. All males who have descended from the same paternal lineage as you will have exactly the same or a very similar Y-DNA haplotype as you.
Your results report for your Y-DNA STR marker test lists each marker that was tested and the number next to each marker name is your unique allele value (number of repeats) for that marker.
Using your Y-DNA haplotype to search for or verify family linkages
Your Y-DNA haplotype is the same or very similar to that of all males who have descended from the same forefather as yourself. This means that your father, grandfather and great-grandfather along your paternal lineage all carry the same Y-DNA haplotype as you.
Once you have tested your Y-DNA STR markers, you can use your haplotype to search the DNA Reunion database for people who are linked to you on your paternal line. You can also use it to verify whether any other male individual descended from the same paternal line as you. Groups of males with the same surname often consider using this test type to see if they have descended from the same male lineage.
What is the Atlantic Modal Haplotype?
Some Y-DNA haplotypes (profiles) occur more frequently in certain parts of the world. For example, people whose ancestors are from the western coast of Europe often share a small group of Y-Chromosome STR marker values, which is called the Atlantic Modal Haplotype. The Atlantic Modal Haplotype is tied to the R1b haplogroup and is characterized by the following Y-DNA STR markers:
DYS19 = 14
DYS388 = 12
DYS390 = 24
DYS391 = 11
DYS392 = 13
DYS393 = 13
More information about the Atlantic Modal Haplotype can be found in Wilson et al. (2001) Genetic Evidence for Different Male and Female Roles During Cultural Transition in the British Isles.
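A quick way to see how close a haplotype is to the Atlantic Modal Haplotype is to compare it marker by marker against the six values listed above. The sketch below does exactly that; the `sample` haplotype and the helper name `amh_mismatches` are invented for illustration.

```python
# The six marker values listed above for the Atlantic Modal Haplotype.
ATLANTIC_MODAL_HAPLOTYPE = {
    "DYS19": 14,
    "DYS388": 12,
    "DYS390": 24,
    "DYS391": 11,
    "DYS392": 13,
    "DYS393": 13,
}


def amh_mismatches(haplotype):
    """Return the AMH markers whose value differs from, or is missing in, the haplotype."""
    return [
        marker
        for marker, value in ATLANTIC_MODAL_HAPLOTYPE.items()
        if haplotype.get(marker) != value
    ]


# Hypothetical test result that differs from the AMH at one marker.
sample = {"DYS19": 14, "DYS388": 12, "DYS390": 23, "DYS391": 11, "DYS392": 13, "DYS393": 13}
print(amh_mismatches(sample))  # ['DYS390']
```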
What is the difference between “Y-DNA haplotype” and “Y-DNA haplogroup”?
Y-DNA haplotypes should not be confused with Y-DNA haplogroups. A Y-DNA haplotype is an individual’s Y-DNA STR profile and includes the number of repeats at specific STR markers located on the Y-chromosome. Y-DNA haplotypes are useful for tracing recent paternal lineages and connections.
An individual’s Y-DNA haplogroup represents his “deep ancestry” or ancient family group. Y-DNA studies have shown that all males living today are descendants of a single root paternal ancestor who lived in Africa approximately 150,000 years ago. Over time, our ancestors migrated out of Africa in waves and populated the world. All males can be traced to one of fewer than two dozen main Y-DNA haplogroups. Y-DNA haplogroups are designated by letters, such as “Y-DNA Haplogroup J”. Haplogroups are useful for scientists who are studying human migration patterns and have archaeological value.
Y-DNA haplogroups are determined by testing a type of marker in the Y-DNA known as “SNP” (single nucleotide polymorphism) markers. SNP markers are not the same as STR markers. STR marker testing determines your haplotype and pertains to recent ancestry and will not tell you your haplogroup. Although your haplotype can be used to predict your haplogroup, confirmation of Y-DNA haplogroups must be made through Y-DNA SNP testing. We will be discussing Y-DNA haplogroups in more detail later in this tutorial.
What is Genetic Distance?
After you test your Y-DNA STR markers, you can compare your markers against the markers of any other male individual to see whether you share a recent common male ancestor. If a match is found, you can determine the TMRCA (time to most recent common ancestor) – a measurement of approximately how long ago you and the matching individual shared the common ancestor.
A key measurement value when comparing the Y-DNA STR markers between two different individuals is Genetic Distance.
Genetic distance is a measurement of the total difference in allele values of different genetic markers between two individuals. The smaller the genetic distance for a given set of Y-DNA STR markers, the more closely two individuals are related, and the more recently they shared a common ancestor. The method used to determine genetic distance for four different Y-DNA STR marker types is explained in the following sections.
The DNA Reunion database will automatically calculate the Genetic Distance and TMRCA values between you and your matches. You can also perform your own matches by plugging in the marker values at www.dnacalculator.org.
Calculation of genetic distance for single-copy and most multi-copy Y-DNA STR markers
For single-copy STR markers, the calculation is straightforward. The genetic distance for each single copy marker between two individuals is the absolute value of the difference between the value of the markers.
For most multi-copy markers, genetic distance can be calculated by adding the differences in allele values for each of the two copies.
The total genetic distance between two individuals is the sum of the genetic distances of all markers compared.
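The rules above can be sketched as a short function. One assumption is made that the text does not spell out: for multi-copy markers, the copies of the two individuals are paired in sorted order (smallest with smallest) before the differences are added. The haplotypes shown are invented.

```python
def marker_distance(values_a, values_b):
    """Sum of absolute differences in allele values, pairing copies in sorted order."""
    if len(values_a) != len(values_b):
        raise ValueError("both individuals need the same number of copies")
    return sum(abs(x - y) for x, y in zip(sorted(values_a), sorted(values_b)))


def genetic_distance(haplotype_a, haplotype_b):
    """Total genetic distance over the markers that both haplotypes contain."""
    shared = haplotype_a.keys() & haplotype_b.keys()
    return sum(marker_distance(haplotype_a[m], haplotype_b[m]) for m in shared)


# Hypothetical haplotypes: tuples hold one value per copy of the marker.
person_a = {"DYS19": (14,), "DYS390": (24,), "DYS385": (11, 14)}
person_b = {"DYS19": (14,), "DYS390": (22,), "DYS385": (11, 15)}

print(genetic_distance(person_a, person_b))  # 0 + 2 + 1 = 3
```

Note that DYS390 contributes 2 to the distance even though it is only one mismatching marker, which is why genetic distance and the number of mismatching markers can differ.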
Calculation of genetic distance for multi-copy marker DYS464 – using Infinite allele model
Assuming mutations at the same marker took place in a single generation, the “Infinite allele” method of determining genetic distance counts the total difference between all copies of the same marker as 1 genetic distance, despite the fact that more than one mutation exists.
The genetic distance for DYS464 is calculated using this method. Regardless of the number of mismatches between two individuals at Y-DNA STR marker DYS464, the genetic distance is always reported as a maximum of 1. Please remember that “genetic distance” is not the same as “mismatching markers”.
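In code, the infinite allele rule reduces to a yes/no comparison: any difference at all counts as a distance of 1. The sketch below is a minimal illustration with invented allele values; the function name is hypothetical.

```python
def infinite_allele_distance(values_a, values_b):
    """Return 0 if both sets of allele values are identical, otherwise 1."""
    return 0 if sorted(values_a) == sorted(values_b) else 1


# Hypothetical DYS464 results for three individuals.
person_a = [15, 15, 17, 17]
person_b = [15, 16, 17, 18]  # differs from person_a at two copies
person_c = [17, 15, 17, 15]  # same values as person_a, listed in another order

print(infinite_allele_distance(person_a, person_b))  # 1, not 2
print(infinite_allele_distance(person_a, person_c))  # 0
```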
Calculation of genetic distance for DYS389i/ii
DYS389i is embedded in DYS389ii; therefore, the DYS389i values are included in DYS389ii values. Genetic distance can be determined by adding up two differences: differences in DYS389i values and differences in the second part of DYS389ii values, which are obtained by subtracting the DYS389i values from the DYS389ii values.
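The subtraction step matters because a single change in DYS389i would otherwise be counted twice, once in each reported value. A minimal sketch of the calculation described above, with invented allele values and a hypothetical function name:

```python
def dys389_distance(i_a, ii_a, i_b, ii_b):
    """Genetic distance for DYS389i/ii, counting the embedded DYS389i part only once."""
    first_part = abs(i_a - i_b)
    # The second part of DYS389ii is what remains after removing DYS389i.
    second_part = abs((ii_a - i_a) - (ii_b - i_b))
    return first_part + second_part


# Person A: DYS389i = 13, DYS389ii = 29. Person B: DYS389i = 14, DYS389ii = 30.
# Both reported values differ by 1, but the change sits entirely in DYS389i.
print(dys389_distance(13, 29, 14, 30))  # 1 + 0 = 1

# Here DYS389i is identical and only the second part differs.
print(dys389_distance(13, 29, 13, 30))  # 0 + 1 = 1
```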
What is TMRCA?
TMRCA stands for “Time to Most Recent Common Ancestor”. It’s a measure of how long ago any two male individuals likely shared a common patrilineal ancestor.
Determining TMRCA through DNA testing:
The TMRCA for any two male individuals can be determined by testing and comparing their Y-DNA STR marker profiles. The more Y-DNA STR markers that are tested and compared, the higher the accuracy of the TMRCA prediction.
Here are some examples of how testing more Y-DNA STR markers can increase the precision of the TMRCA value:
12 Y-DNA STR marker test: If you and someone else test 12 STR markers and match each other perfectly at all 12 markers, your TMRCA is approximately 14.4 generations. This means that you and the other individual likely shared a common ancestor between 0 and 14.4 generations ago. That is a very broad time frame and does not provide solid evidence that two individuals are from the same line.
20 Y-DNA STR marker test: If two males match perfectly at 20 STR markers, the TMRCA is narrowed down to 8.3 generations. This means that these two individuals likely descended from the same line and shared a common ancestor between 0 and 8.3 generations ago.
44 Y-DNA STR marker test: If two males match perfectly at 44 STR markers, the TMRCA becomes only 3.8 generations. This means that these two individuals likely shared a common ancestor between 0 and 3.8 generations ago.
As you can see, the more STR markers that are compared, the more precisely you can narrow down the time frame that you and another person shared a common paternal ancestor.
Why test more Y-DNA STR markers?
A number of different STR markers can be tested in the Y-DNA. The more Y-DNA STR markers that are tested, the more discriminating the matches will be when comparing against other individuals.
For example, comparison of 12 Y-DNA STR markers alone is generally not powerful enough to distinguish family lines and can give inconclusive results. The more markers that are available for comparison, the more discriminating the comparison becomes.
There are two major advantages for comparing more markers:
1. To prevent false positives
2. To obtain conclusive results
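The false-positive problem can be illustrated with a toy comparison. In the sketch below, the marker names `M1` to `M20` are placeholders rather than real Y-DNA STR markers, and all repeat values are invented. The two profiles are identical on the first 12 markers and only diverge on the additional ones.

```python
def mismatches(haplotype_a, haplotype_b, markers):
    """Count the markers in the panel at which the two haplotypes differ."""
    return sum(1 for m in markers if haplotype_a[m] != haplotype_b[m])


# Placeholder panels: a 20-marker panel whose first 12 markers form the smaller panel.
panel_20 = [f"M{i}" for i in range(1, 21)]
panel_12 = panel_20[:12]

# Hypothetical profiles: person_b differs from person_a only at M15 and M18.
person_a = {m: 12 for m in panel_20}
person_b = dict(person_a, M15=13, M18=11)

print(mismatches(person_a, person_b, panel_12))  # 0: looks like a perfect match
print(mismatches(person_a, person_b, panel_20))  # 2: the larger panel reveals differences
```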
Y-DNA STR testing scenario
Mr. Jones has been studying his family’s ancestry for several years and has started a “Jones” family study based in Arizona. He is interested in confirming that his family line is linked to a “Jones” line in New York. Although there are rumours that the two lines are related, Mr. Jones does not have the paperwork to prove this link. Mr. Jones is also interested in finding out whether his line is linked to any other Jones lines worldwide.
Mr. Jones had previously chosen to test just 12 Y-DNA STR markers. After testing, he uses the 12 markers to search the DNA database and finds out that he is a perfect match to the Jones line in New York. However, he also finds that he has a perfect match to over 200 individuals in the DNA Reunion database, and over half of them do not even share his surname. How is this possible? Does it mean that he is related to everyone who matches him at the 12 markers? No, this simply means that data from only 12 markers is not powerful enough to distinguish Mr. Jones from other family lines.
To clarify this, Mr. Jones decides to increase his number of Y-DNA STR markers to 20. He searches the database using 20 Y-DNA STR markers and this time narrows down the number of matches. In fact, now only 18 people match him perfectly at his 20 markers, including the Jones line in New York. Surprisingly, many of the individuals who matched perfectly at 12 markers match at only 14 or fewer of the 20 markers tested. This confirms that there is no familial link with most of the 200 individuals identified in the 12 marker test, as more than 3 mismatches indicate that two family lines are not closely related.
To further clarify the findings, Mr. Jones decides to upgrade to a 44 marker test. This time, he finds out that he is a perfect match at all 44 markers to only two lines: a Jones line in England, and a Jones line in the United States. After contacting the two lines and comparing paperwork and stories, Mr. Jones was able to confirm that his line was indeed linked to both lines and he is now able to add both new lines to his family tree.
Surprisingly, Mr. Jones was also able to find out that only 43 out of the 44 markers matched with the Jones line in New York. This suggests that although the Jones line in New York is related to his line, they are likely more distantly related.
Mr. Jones also discovered that he had a close match to 4 other Jones lines (43 out of the 44 markers matched) and he is now pursuing the possibility that the 4 other lines are also distantly related to him. The TMRCA analysis assumes that each marker mutates about once every 500 generations, so across the 44 markers tested we would expect to detect a mutation roughly every 11 to 12 generations (500 / 44 ≈ 11.4).
Mr. Jones is now trying to recruit more Jones males from throughout Europe to try to reconstruct and relink his family line.
A comparison using only 12 Y-DNA STR markers was not discriminating enough for Mr. Jones to pinpoint his family lines. After increasing to 20 Y-DNA STR markers, Mr. Jones was able to obtain more useful information and to eliminate false matches initially generated when only 12 markers were compared. After increasing to 44 markers, Mr. Jones was able to pinpoint the people that he was looking for and to accurately answer his questions about his relationship to the Jones line in New York. Mr. Jones can continue his research, and as more and more people are tested and added to the DNA Reunion database, he will be able to reconstruct his family line in greater detail and re-unite with Jones lines worldwide who are descendants of his family line.
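The mutation arithmetic used in the scenario can be checked with a few lines of code. This sketch assumes the scenario's figure of one mutation every 500 generations applies to each marker individually and that all markers mutate at the same rate, which is a simplification; the constant and function names are illustrative only.

```python
# Assumption taken from the scenario: one mutation per marker every 500 generations.
GENERATIONS_PER_MUTATION_PER_MARKER = 500


def generations_per_observed_mutation(marker_count):
    """Expected number of generations between mutations somewhere in the panel."""
    return GENERATIONS_PER_MUTATION_PER_MARKER / marker_count


for marker_count in (12, 20, 44):
    print(marker_count, round(generations_per_observed_mutation(marker_count), 1))
# 12 41.7
# 20 25.0
# 44 11.4
```

The larger the panel, the sooner a mutation is expected to show up, which is why a 43 out of 44 match is compatible with a more distant relationship than a perfect 44 out of 44 match.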
What are Y-DNA Haplogroups?
DNA studies have shown that all people living today can trace their ancestry back to common roots in Africa approximately 150,000 years ago. Over time, man eventually journeyed out of Africa, and in many waves of migrations which spanned tens of thousands of years, eventually populated the rest of the world. During these ancient journeys, small mutations called “SNPs” occurred randomly in their Y-DNA. Each SNP acts as a “time-and-date stamp” which allows us to understand the approximate time and location in the journey our ancestors were when the SNP first occurred. Once a SNP occurs, it is passed down to all future generations and serves as a marker which allows us to approximate where our ancestors were at specific time points every few thousand years along the ancient migration out of Africa.
Today, our Y-DNA and mtDNA contain a rich collection of SNP markers, passed down to us from our ancient ancestors over thousands of years. Using SNPs found in our Y-DNA, all people living today can be plotted onto a paternal tree of mankind called the “Y-DNA Phylogenetic Tree”. The main branches of the tree are called “Y-DNA Haplogroups”. The finer sub-branches of the tree are called “Y-DNA Subclades”.
By testing the STR markers in your Y-DNA, you can predict which Y-DNA haplogroup you most likely descended from. By testing SNPs in your Y-DNA, you will be able to conclusively confirm your Y-DNA haplogroup.
Y-DNA Haplogroups are associated with different regions of the world
Once you find out which Y-DNA haplogroup you belong to, you can find out which general region of the world your paternal ancestors came from. Please note that haplogroups are NOT country specific. Y-DNA haplogroups are associated with specific migration paths leading to specific regions of the world, so once you know which Y-DNA haplogroup you belong to, you will know the general geographic location of the world your paternal ancestors came from, i.e. Asia, Europe, Americas (Native American), Africa, Middle East, Indigenous Australian etc.
Haplogroups are NOT country specific because there are no Y-DNA haplogroups which are exclusively found in only one country and not a neighboring country. Y-DNA haplogroups can be further classified into finer sub-branches called “Subclades”. Knowing your subclade may provide further geographical localization of your ancestry if published research on the geographical distribution of the subclade is available.
The following chart shows the Y-DNA haplogroups found in each region.
| Region/Population | Major Y-DNA haplogroups |
| --- | --- |
| Native Americans | C, Q |
| Oceanic and Aboriginal Australians | C, K, M, N, S |
| East Asian | C, D, N, O, Q |
| South Asian (i.e. India) | C, H, L |
| Europe and Middle East | I, J, R, T |
| Diverse | F, G, P |
| African | A, B, E |
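The chart can also be read in the other direction, starting from a haplogroup and listing the regions in which it appears. The sketch below copies the chart into a dictionary and performs that reverse lookup; the function name `regions_for` is hypothetical.

```python
# Data copied from the chart above.
REGION_HAPLOGROUPS = {
    "Native Americans": ["C", "Q"],
    "Oceanic and Aboriginal Australians": ["C", "K", "M", "N", "S"],
    "East Asian": ["C", "D", "N", "O", "Q"],
    "South Asian (i.e. India)": ["C", "H", "L"],
    "Europe and Middle East": ["I", "J", "R", "T"],
    "Diverse": ["F", "G", "P"],
    "African": ["A", "B", "E"],
}


def regions_for(haplogroup):
    """Return every region in the chart whose list includes the given haplogroup."""
    return [region for region, groups in REGION_HAPLOGROUPS.items() if haplogroup in groups]


print(regions_for("Q"))  # ['Native Americans', 'East Asian']
print(regions_for("R"))  # ['Europe and Middle East']
```

A haplogroup such as C appears under several regions, which illustrates why a haplogroup alone indicates only a broad geographic area.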
What are Y-DNA Subclades?
While Y-DNA haplogroups represent the main branches of the Y-DNA phylogenetic tree, Y-DNA subclades represent the finer sub-branches of each haplogroup. Once you find out which Y-DNA haplogroup you belong to, you can take a Y-DNA subclade test for your particular haplogroup to further narrow down your position in the Y-DNA phylogenetic tree. Please note that not all Y-DNA haplogroups have a subclade test available. At this time, subclade testing is only available for Y-DNA haplogroups E, G, I, J, L, O, Q, and R. Please also note that new Y-DNA subclades are discovered at a rapid rate. The subclade panels examine the most commonly known subclades of each Y-DNA Haplogroup, but there may be additional rare or undiscovered Y-DNA subclades which are not included in the existing Y-DNA subclade test panels.
Y-DNA haplogroups have an extremely broad distribution pattern and can only provide a very general idea of the region of the world that the haplogroup is found, such as Asia, Africa, Europe etc. Y-DNA subclades represent the finer sub-branches of Y-DNA haplogroups and theoretically will have a more localized distribution pattern than haplogroups. However, please remember that even if you find out which Y-DNA subclade you belong to, it is not guaranteed that population distribution information for your subclade is known yet in the scientific community.
A race against time
The scientific community is racing against time to test the DNA of indigenous populations from around the world in order to gain a better understanding of the distribution pattern of Y-DNA haplogroups and subclades found in different parts of the world. This type of study is ongoing and not all Y-DNA subclades currently have distribution data or some may only have limited data.
Since the population distribution studies are ongoing in the scientific community and new distribution data for haplogroups and subclades is published at a rapid rate, the database is updated routinely to reflect the latest known data. Even if very limited population distribution information is currently published for your subclade at the time that you take the subclade test, your results will be constantly updated online as new information pours in, so you will always see the latest interpretation and distribution information for your subclade when you login to your account.
The Y-DNA test involves testing a panel of STR and SNP markers in the Y-Chromosome.
*Important Reminder: Only males can take the Y-DNA Tests
The Y-DNA test can only be performed on males, since females do not carry the Y-Chromosome (the Y-DNA is passed down from a father to all of his sons). Females who wish to trace their paternal lineage must test a male relative and use their markers (e.g. brother, father, male cousin on paternal line, nephew on paternal line etc.).
The available Y-DNA test types are listed in the table below:
| Y-DNA Test Type | Prerequisite | Purpose | Description |
| --- | --- | --- | --- |
| Y-DNA STR Tests | none | For use in searches and comparisons with other individuals | This test is the starting point for paternal ancestry research. The options for STR testing include 20, 44, 67, 91 and 101 markers. The 20 marker test is sufficient to use most of the search and analysis features. We recommend initially testing 20 or 44 markers, and then adding more Y-DNA STR markers if you wish to narrow down the matches and TMRCA value. |
| Y-DNA Haplogroup Determination SNP Test | Y-DNA STR Test, minimum 20 Y-DNA STR markers | To confirm your Y-DNA haplogroup | The Y-DNA Haplogroup Determination SNP test analyzes a selection of SNP markers to conclusively confirm which Y-DNA haplogroup you belong to. This test also allows you to search for matches to indigenous populations. If you have a “strong” prediction for your Y-DNA haplogroup based on STR testing, your Y-DNA haplogroup may be determined based on STR markers alone. |
| Y-DNA Subclade Determination SNP Test (optional, only available for selected Haplogroups) | Y-DNA Haplogroup Determination SNP Test | To confirm your Y-DNA subclade | The Y-DNA Subclade Determination SNP test analyzes a selection of SNP markers within the Y-Chromosome in order to confirm your Y-DNA subclade. Subclade testing is currently only available for the following Y-DNA Haplogroups: E, G, I, J, L, O, Q, R. |
| Y-DNA Stand Alone SNP Test (optional, only available for selected subclades) | Y-DNA Subclade Determination SNP Test | To further refine your Y-DNA subclade | Over time, new SNPs are discovered and the Y-DNA phylogenetic tree will continue to expand and become more detailed as new discoveries are made. If a new SNP is discovered in your branch of the Y-DNA tree, the new SNP will automatically be offered to you for purchase as a “Stand Alone SNP” test. This allows you to continue to refine your results. |