When a result comes back from 23andMe, MyHeritage, or AncestryDNA announcing that you are “34% Scandinavian” or “12% Anatolian,” it is easy to read that number as a precise biological fact. It is not. It is a statistical estimate: probabilistic, reference-dependent, and shaped by algorithmic choices that differ from one company to the next. Understanding what these tests actually measure, where they are reliable, and where they must be treated with caution is essential to getting genuine value from your results. This article walks through the science behind commercial DNA testing: the admixture algorithm that produces ethnicity percentages, the reference panels that define what “Irish” or “Anatolian” means computationally, and the segment-based features (23andMe’s chromosome painting, MyHeritage’s Genetic Groups, AncestryDNA’s Communities) that add a genealogical layer on top of pure ancestry estimation.
1. The Engine Under the Hood: Admixture Estimation
Every ethnicity estimate from every major company rests on the same conceptual foundation: a statistical model that assumes your genome is a mixture of K ancestral populations and tries to find the proportions that best explain your observed genotypes. The most widely used implementation is the ADMIXTURE algorithm (Alexander, Novembre & Lange, 2009), though most companies use proprietary variants. The core logic is identical: given a panel of reference individuals whose ancestry is known and assumed to be “pure,” fit each test individual as a combination of those reference clusters.
This sounds straightforward, but the difficulty lies in four sources of ambiguity that make every result an estimate rather than a measurement.
| Source of Uncertainty | What It Means in Practice | Effect on Your Results |
|---|---|---|
| The reference panel | The algorithm can only “see” populations represented in its reference. Unrepresented ancestries are absorbed into the nearest proxy. | A Moroccan Berber may appear partly “Southern European” if no Berber-specific reference cluster exists. |
| The choice of K | How many ancestral clusters the model assumes. Too few: real diversity is collapsed. Too many: noise appears as signal. | Running the same data at K=8 vs K=18 produces meaningfully different percentages. |
| Modern vs. ancient reference | All major companies use modern reference populations. But the people alive today in Ireland or Turkey are not identical to those who lived there 500, 1000, or 2000 years ago. | Deep ancestry (>1000 years) is poorly captured; results reflect recent population structure, not Iron Age or Roman-era ancestry. |
| Probabilistic assignment | Percentages are posterior probability distributions compressed into a single number. Small percentages especially carry wide confidence intervals. | A result of “4% Scandinavian” is often indistinguishable from zero given the model’s uncertainty. |
Most big-box tests use a supervised variant of ADMIXTURE: the reference populations are fixed in advance (e.g., “Irish,” “Nigerian Yoruba,” “Han Chinese”), and the algorithm finds the best fit for your genome against those fixed anchors. Unsupervised ADMIXTURE, used in academic research, discovers clusters from scratch without preassigned labels. Supervised methods are more interpretable but entirely dependent on which reference populations were chosen and how well they represent global diversity.
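The supervised fit described above can be sketched in a few lines. This is a toy illustration, not any company's pipeline: it assumes the reference allele frequencies are known and fixed, treats every SNP as independent, and uses a plain EM update for the mixing proportions. The names `estimate_proportions` and `REF_FREQS` and all frequencies are invented for the example.

```python
def estimate_proportions(genotypes, ref_freqs, n_iter=200):
    """Fit mixing proportions q by EM, holding reference frequencies fixed.

    genotypes: list of 0/1/2 alternate-allele counts, one per SNP.
    ref_freqs: one list of alternate-allele frequencies per reference
               population; values are assumed to lie strictly between 0 and 1.
    """
    k = len(ref_freqs)
    n_snps = len(genotypes)
    q = [1.0 / k] * k  # start from equal proportions
    for _ in range(n_iter):
        new_q = [0.0] * k
        for j, g in enumerate(genotypes):
            # Expected alternate-allele frequency for this individual.
            p_alt = sum(q[a] * ref_freqs[a][j] for a in range(k))
            p_ref = 1.0 - p_alt
            for a in range(k):
                f = ref_freqs[a][j]
                if g > 0:  # credit for each alternate allele copy
                    new_q[a] += g * q[a] * f / p_alt
                if g < 2:  # credit for each reference allele copy
                    new_q[a] += (2 - g) * q[a] * (1.0 - f) / p_ref
        # Each SNP contributes two allele copies in total.
        q = [v / (2 * n_snps) for v in new_q]
    return q


# Two hypothetical reference populations with opposite frequency patterns.
REF_FREQS = [
    [0.9] * 10 + [0.1] * 10,  # "population A"
    [0.1] * 10 + [0.9] * 10,  # "population B"
]

looks_like_a = estimate_proportions([2] * 10 + [0] * 10, REF_FREQS)
all_heterozygous = estimate_proportions([1] * 20, REF_FREQS)
print([round(v, 3) for v in looks_like_a])
print([round(v, 3) for v in all_heterozygous])
```

The second call is worth a look. An individual from a third, unrepresented population whose allele frequencies sit near 0.5 would tend to produce mostly heterozygous genotypes like these, and the fit reports an even split between the two panel populations because it has no other way to explain the data.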
2. Reference Panels: The Foundation of Every Result
The single most important factor in how useful and accurate your ethnicity estimate is, and the main reason the same genome produces different results at different companies, is the reference panel: the database of individuals whose ancestry has been verified and who serve as the algorithm’s anchors for each ethnic or regional cluster. A larger, more geographically diverse reference panel generally produces finer-grained and more reliable results, but size alone is not sufficient: sampling strategy, verification quality, and the choice of which populations to include as distinct clusters all matter.
| Company | Reference Panel Size (approx.) | Number of Regions | Algorithm Type | Key Strength | Key Limitation |
|---|---|---|---|---|---|
| 23andMe | ~14,000 reference individuals | ~45 populations (grouped into ~2,000 sub-regions) | Supervised admixture + chromosome painting | Transparent methodology; chromosome-level resolution; strong in Europe and East Asia | Smaller raw panel; weaker in Central Asia, Oceania, indigenous Americas |
| AncestryDNA | >60,000 reference samples (as of current panel) | 1,400+ geographic regions | Supervised admixture + tree-informed network (Communities) | Largest panel; powerful recent-genealogy Communities; 1,400 region labels | Some regional labels are overconfident; results updated frequently (can change dramatically) |
| MyHeritage | ~5,000+ reference individuals | 42 ethnicities | Supervised admixture + DNA match clustering (Genetic Groups) | Strong Ashkenazi Jewish, Middle Eastern, and some European distinctions; Genetic Groups genealogically informative | Smallest panel; fewest ethnicity categories; less granular European breakdown |
If you test at all three companies, you will receive three different sets of percentages from the same DNA. This is expected and does not mean any company is “wrong.” It reflects differences in reference panel composition, regional category definitions, K values, and algorithm implementation. The broad continental proportions should agree reasonably well; subcontinental distinctions (e.g., “Northwestern European” vs. “British & Irish” vs. “French & German”) will often diverge because these populations are genetically very similar and category boundaries are partly arbitrary.
3. Three Companies, Three Approaches
3.1 23andMe: Chromosome Painting and Ancestry Composition
23andMe’s flagship ancestry feature, Ancestry Composition, is technically among the most sophisticated of the major consumer tests because it operates at the chromosome segment level rather than treating the whole genome as a single unit. Their algorithm (described in published methodology documents) performs local ancestry inference: it paints each segment of each chromosome a colour corresponding to one of their reference populations. The aggregate of these painted segments produces your overall percentages, but you can also inspect individual chromosomes to see where each ancestry appears.
This matters for one practical reason: phased chromosome painting is more informative than simple genome-wide admixture. If you carry a 10 cM segment on chromosome 7 that paints clearly as East Asian, that is a real signal from a real ancestor, not an admixture artefact. The chromosome view lets you distinguish genuine segments from background noise, though segments shorter than ~7 cM should still be treated with scepticism due to phase-switching errors and reference panel misassignment.
23andMe also offers a “Genetic Affinity” feature that compares your genome to specific present-day populations within their database, producing a more granular geographic estimate than the standard 45-category breakdown. Their confidence levels (“Speculative,” “Standard,” “Conservative”) reflect posterior probability thresholds and should be treated accordingly: Conservative mode suppresses weak or uncertain calls; Speculative mode includes assignments below the standard confidence threshold.
Use Standard confidence as your baseline; Speculative mode inflates small percentages and produces noisy sub-regional labels. The chromosome painting view is the most informative feature: long painted segments (>10 cM) in unexpected ancestries are worth investigating genealogically. Short segments (<5 cM), especially at Speculative confidence, are often misassigned noise. If an ancestry category is “Northwestern European” or similarly broad, that reflects genuine algorithmic ambiguity between similar populations, not imprecision in your DNA.
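To make the relationship between painted segments and headline percentages concrete, here is a minimal sketch. It assumes a simplified export format of `(chromosome, start_cM, end_cM, label)` tuples, which is a hypothetical structure rather than 23andMe's actual file layout, and it reuses the 10 cM and 5 cM rules of thumb given above as adjustable thresholds.

```python
from collections import defaultdict


def summarise_painting(segments, noise_cm=5.0, strong_cm=10.0):
    """Aggregate painted segments into percentages and flag them by length.

    segments: iterable of (chromosome, start_cM, end_cM, ancestry_label).
    Returns (percentages, strong_segments, doubtful_segments).
    """
    totals = defaultdict(float)
    strong, doubtful = [], []
    for chrom, start, end, label in segments:
        length = end - start
        totals[label] += length
        if length > strong_cm:
            strong.append((chrom, label, length))    # worth investigating
        elif length < noise_cm:
            doubtful.append((chrom, label, length))  # possibly misassigned
    painted = sum(totals.values())
    percentages = {label: 100.0 * cm / painted for label, cm in totals.items()}
    return percentages, strong, doubtful


# Invented segments on a single chromosome, for illustration only.
SEGMENTS = [
    ("chr7", 0.0, 84.0, "Northwestern European"),
    ("chr7", 84.0, 96.0, "East Asian"),
    ("chr7", 96.0, 100.0, "Southern European"),
]

percentages, strong, doubtful = summarise_painting(SEGMENTS)
print(percentages)
print("long segments:", strong)
print("short segments:", doubtful)
```

In this toy input the 12 cM East Asian segment is flagged as long enough to follow up, while the 4 cM Southern European segment still contributes to the percentages but lands in the doubtful list.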
3.2 MyHeritage: Ethnicity Estimates and Genetic Groups
MyHeritage combines a standard admixture-based Ethnicity Estimate with a distinctive layer called Genetic Groups. The ethnicity estimate operates on the same supervised-admixture principle as the other companies, with 42 reference populations. This is a more limited palette than 23andMe or AncestryDNA offers, which means European sub-regions are sometimes collapsed into broader categories.
The more genealogically useful feature is Genetic Groups. Rather than estimating ancestry from the algorithm alone, MyHeritage identifies clusters of DNA match users who share significant amounts of DNA with each other, implying recent shared ancestry, and assigns them to geographic groups based on where members of those clusters have documented family origins. If you are assigned to, say, the “Ashkenazi Jewish, Poland” genetic group, this means you share significant DNA with a cluster of people who have documented roots in Polish Jewish communities, not merely that your admixture percentages suggest Ashkenazi ancestry.
This segment-sharing approach makes Genetic Groups more reliable for recent genealogical ancestry (the last 5 to 8 generations) than for deep ancestry, and particularly powerful for endogamous communities (Ashkenazi Jewish, Sephardic Jewish, Romani, and certain island populations) where DNA match patterns are very distinctive.
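The idea of assigning a group from match clusters rather than from admixture percentages can be sketched as follows. This is not MyHeritage's actual algorithm, which is proprietary. It assumes a crude stand-in: people are linked when they share at least a threshold amount of DNA, clusters are the connected components of that graph, and a cluster takes the most common documented origin among its members. The 20 cM threshold, the names, and the data are all hypothetical.

```python
from collections import Counter


def cluster_matches(shared_cm, min_cm=20.0):
    """Group people into connected components of the match graph.

    shared_cm: dict mapping (person_a, person_b) to shared centimorgans.
    """
    neighbours = {}
    for (a, b), cm in shared_cm.items():
        neighbours.setdefault(a, set())
        neighbours.setdefault(b, set())
        if cm >= min_cm:  # only substantial sharing creates a link
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen, clusters = set(), []
    for person in neighbours:
        if person in seen:
            continue
        stack, members = [person], set()
        while stack:  # depth-first walk over linked matches
            current = stack.pop()
            if current in members:
                continue
            members.add(current)
            stack.extend(neighbours[current] - members)
        seen |= members
        clusters.append(members)
    return clusters


def label_cluster(members, documented_origin):
    """Return the most common documented origin in a cluster, or None."""
    origins = Counter(
        documented_origin[m] for m in members if m in documented_origin
    )
    return origins.most_common(1)[0][0] if origins else None


# Invented match data: "you" links strongly to p1, weakly to p4.
SHARED_CM = {
    ("you", "p1"): 45.0,
    ("p1", "p2"): 60.0,
    ("p2", "p3"): 38.0,
    ("you", "p4"): 8.0,
    ("p4", "p5"): 52.0,
}
DOCUMENTED_ORIGIN = {"p1": "Poland", "p2": "Poland", "p3": "Lithuania", "p5": "Ireland"}

my_cluster = next(c for c in cluster_matches(SHARED_CM) if "you" in c)
print(sorted(my_cluster), label_cluster(my_cluster, DOCUMENTED_ORIGIN))
```

Note that the label comes entirely from what other cluster members have documented. That is why this style of inference works best for recent ancestry and inherits any errors in those records.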
3.3 AncestryDNA: Ethnicity Regions and Communities
AncestryDNA’s reference panel is the largest of the three, and their result interface distinguishes between two layers: Ethnicity Regions (admixture-derived, comparable to 23andMe’s Ancestry Composition) and Communities, which represent a qualitatively different type of inference.
Communities are assigned not purely from your DNA, but from a combination of your DNA match network and the family trees attached to your matches. If thousands of your DNA matches have documented family trees placing their ancestors in, say, the Azores or County Cork or colonial Virginia, and you share enough DNA with them, AncestryDNA assigns you to that Community. This is not admixture estimation; it is genealogical network analysis informed by documented family history. As a result, Communities are often the most historically specific ancestry signal you can obtain from a consumer test, and they are frequently more informative about where your ancestors actually lived in the last 300 to 400 years than any percentage estimate.
The limitation is the same as its strength: Communities depend on the composition of the AncestryDNA user base and the quality of family trees attached to matches. Populations underrepresented in AncestryDNA’s customer base (many African, Asian, and indigenous populations) have fewer or no Communities. Matching-based inference also propagates errors in family trees.
AncestryDNA labels up to 1,400 geographic sub-regions. Some of these are genuine signals supported by both admixture patterns and match network data; others are algorithmically inferred labels with wide confidence intervals. A label of “Kildare and Wicklow” rather than simply “Ireland” sounds impressively specific, but should be treated with caution: county-level distinctions within Ireland are near the edge of what can be resolved from shared DNA alone. The broad regional assignments (“Ireland & Scotland,” “Germanic Europe,” “Nigeria”) are substantially more reliable than the sub-regional labels.
4. What Each Feature Actually Measures
The diagram below places the main features of each company along two axes: the time depth they reliably reach (from recent genealogical ancestry to deep pre-historic population structure) and their geographic resolution (from continental to county-level). Understanding this map helps you know which tool to reach for depending on your question.
5. How to Interpret Your Results Without Being Misled
The following principles apply regardless of which company you tested with. They reflect the genuine capabilities and limitations of admixture estimation as a technology, not any flaw specific to one company’s implementation.
| Principle | What to Do | What Not to Do |
|---|---|---|
| Trust continental proportions; be sceptical of sub-regional labels | Your proportion of European, African, East Asian, or Indigenous American ancestry is reliably estimated by all three companies. | Do not treat labels like “Scottish vs. Irish” or “Northern vs. Southern Italian” as precise measurements. These populations are genetically nearly indistinguishable. |
| Treat small percentages (<5%) as uncertain | Note them but do not build genealogical hypotheses around them until corroborated by DNA matches or documents. | Do not claim Native American, Jewish, or Romani ancestry based solely on a 2 to 3% estimate. Such a figure can be a reference panel artefact or a genuine signal; matching data is needed to distinguish them. |
| Use DNA matches, not just percentages | AncestryDNA Communities and MyHeritage Genetic Groups add genealogical context. Cross-reference estimated matches to find shared ancestors. | Do not rely on the ethnicity estimate alone for genealogical research. The match network is far more informative for the last 6 to 8 generations. |
| Understand that results can change | When a company updates its reference panel, your percentages may shift. This is normal and reflects improved data, not a different genome. | Do not treat a previous estimate as a fixed truth. A result of “18% Scandinavian” in 2018 may become “6% Scandinavian” in 2024 after panel improvements. Both are estimates; neither is wrong. |
| Remember the time depth limitation | These tests are best at ancestry from the last ~1,000 to 1,500 years. For pre-medieval population structure, academic tools (G25, qpAdm) are required. | Do not interpret “Anatolian” ancestry as meaning your ancestors came directly from Anatolia in recorded history. Most Anatolian ancestry in Europeans is Neolithic (~7,000 years old) and is present in virtually all Europeans at varying levels. |
| Compare across companies for context | If all three companies agree on a component, it is more likely real. If only one reports it, treat it sceptically. | Do not average the three estimates arithmetically. They use different reference populations, and the categories are not comparable. |
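The last principle in the table can be illustrated with a short sketch: rather than averaging percentages, roll each company's categories up to a shared continental level and check how many companies report each component. The category-to-continent mapping, the company names, and every percentage below are invented for the example; real category definitions differ and would need a mapping of your own.

```python
# Hypothetical roll-up from company-specific labels to broad components.
CONTINENT = {
    "British & Irish": "European",
    "French & German": "European",
    "Ireland & Scotland": "European",
    "Germanic Europe": "European",
    "Northwestern European": "European",
    "Scandinavian": "European",
    "Nigeria": "African",
}


def roll_up(estimate):
    """Sum one company's percentages by broad component."""
    totals = {}
    for category, pct in estimate.items():
        component = CONTINENT[category]
        totals[component] = totals.get(component, 0.0) + pct
    return totals


def compare(estimates):
    """For each broad component, report how many companies see it and the spread."""
    rolled = [roll_up(e) for e in estimates.values()]
    components = set()
    for totals in rolled:
        components.update(totals)
    report = {}
    for component in sorted(components):
        values = [totals.get(component, 0.0) for totals in rolled]
        report[component] = {
            "reported_by": sum(1 for v in values if v > 0),
            "spread": max(values) - min(values),
        }
    return report


# Made-up results for the same person at three unnamed companies.
ESTIMATES = {
    "company_a": {"British & Irish": 62.0, "French & German": 28.0,
                  "Scandinavian": 8.0, "Nigeria": 2.0},
    "company_b": {"Ireland & Scotland": 55.0, "Germanic Europe": 45.0},
    "company_c": {"Northwestern European": 97.0, "Scandinavian": 3.0},
}

for component, summary in compare(ESTIMATES).items():
    print(component, summary)
```

Here the European component is reported by all three hypothetical companies with a small spread, while the 2% African component appears at only one, which is the pattern the table suggests treating sceptically until matches or documents corroborate it.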
Admixture estimates are most reliable for individuals from endogamous communities (populations that have historically married within the group), such as Ashkenazi Jewish, Sephardic Jewish, Finnish, Sardinian, or certain island populations. This is because these communities have distinctive genomic signatures that are easy to identify even with moderate reference panels. Ironically, DNA matching is more complex for these populations: because everyone shares large amounts of DNA through many channels, distinguishing close genealogical relatives from population-level sharing becomes difficult. MyHeritage and AncestryDNA both have specific tools to handle Ashkenazi matching, for example.
6. Common Misconceptions Addressed
“23andMe is the most accurate; its results are more reliable than other companies.”
23andMe has one of the more transparent methodologies and strong chromosome-level resolution, but “most accurate” depends on what you are measuring. For recent genealogical ancestry in North America, AncestryDNA’s Communities often outperform any pure admixture estimate. For endogamous communities, MyHeritage is competitive. No company is uniformly superior across all ancestries and questions.
“If I have 2% Native American ancestry, I must have a Native American ancestor.”
A result of 2 to 3% for any ancestry, especially from under-represented reference populations, may reflect a real ancestor or may be a reference panel artefact. The threshold matters: 8 to 10% or more, especially confirmed by chromosome painting showing a long segment, is substantially more reliable. Small claims require corroboration from DNA matches who share the same segment, or from documented genealogy.
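One way to get a feel for why a small percentage is fragile is a simple bootstrap. The sketch below is a toy under strong assumptions: the genome is cut into 50 equally sized, independent windows (an arbitrary number chosen for illustration), each window carries a single ancestry label, and only resampling noise is modelled. It ignores reference panel misassignment entirely, so real uncertainty is likely to be larger. Companies do not publish intervals computed this way.

```python
import random


def bootstrap_interval(window_labels, ancestry, n_resamples=1000, seed=0):
    """Approximate 95% percentile interval for one ancestry's share (in %)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(window_labels)
    shares = []
    for _ in range(n_resamples):
        # Resample windows with replacement and recompute the share.
        sample = rng.choices(window_labels, k=n)
        shares.append(100.0 * sample.count(ancestry) / n)
    shares.sort()
    return shares[int(0.025 * n_resamples)], shares[int(0.975 * n_resamples)]


# Hypothetical painted genome: 2 of 50 windows labelled Scandinavian (4%).
WINDOWS = ["Scandinavian"] * 2 + ["Other"] * 48

point = 100.0 * WINDOWS.count("Scandinavian") / len(WINDOWS)
low, high = bootstrap_interval(WINDOWS, "Scandinavian")
print(f"point estimate {point:.1f}%, interval {low:.1f}% to {high:.1f}%")
```

When an ancestry rests on only a couple of windows, many resamples contain none of them at all, so the lower end of the interval sits at or near zero. A longer, corroborated segment or shared matches narrow that doubt in a way the percentage alone cannot.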
“My ethnicity estimate tells me exactly where my ancestors lived.”
Ethnicity estimates describe your DNA’s statistical similarity to modern reference populations, not to historical populations. “Germanic Europe” means your DNA resembles modern German or Austrian reference samples. It does not mean your ancestors lived within today’s borders of Germany. Population distributions have shifted dramatically over the last 1,000 years through migration, war, and border changes.
“My result changed between updates; the company must have made an error.”
Reference panel updates are improvements, not corrections of errors. When AncestryDNA or 23andMe adds thousands of new reference individuals or refines their regional categories, the algorithm’s resolution changes and results shift. A shift from “19% Irish” to “11% Irish, 8% Scottish” does not mean your ancestry changed; it means the company can now distinguish two previously collapsed categories.
“These tests tell me about my deep ancestry: Roman, Viking, or Celtic roots.”
Commercial tests use modern reference populations and are calibrated for recent ancestry. They cannot reliably detect Roman, Viking, or Celtic contributions specifically, because these historical populations were not genetically uniform and their descendants have blended into modern European populations that all companies use as reference anchors. Academic tools (G25, qpAdm) using ancient DNA are required for pre-medieval ancestry questions.
“AncestryDNA’s 1,400 regions are all equally reliable.”
The 1,400+ regions vary enormously in reliability. Broad regional categories (Ireland, West Africa, Japan) are robustly estimated. County-level distinctions within a single country (e.g., specific counties within Ireland or specific provinces within Spain) are at or beyond the edge of what genetic data can reliably resolve. These fine-grained labels are partly inferred from match network geography rather than pure genetic differentiation.