For fifteen years, population genetics has rested on a methodological triptych: ADMIXTURE for decomposing genomes into K ancestral components, G25 and its PCA ecosystem for visual modelling, qpADM for formally testing admixture hypotheses. These tools rewrote Eurasian migration history and gave the genetic genealogy community a shared vocabulary. They are also reaching their structural limits. A new generation of algorithms, built on identity by descent (IBD), ancestral recombination graphs (ARG), and deep learning, is emerging from labs at Oxford, Stanford, the Broad Institute, the Wellcome Sanger Institute, and from the R&D pipelines of Ancestry and 23andMe. Ancestry's SideView technology, shipped in 2022, is already the first consumer-scale application of the new paradigm: an IBD-based phasing and ancestry inference engine that bypasses ADMIXTURE entirely. ARG-Needle (2023), Threads (2024), SINGER (2025), Orchestra (2025), and ARGformer (2026) are the academic siblings of the same revolution. The five-year horizon could look as different from the G25 era as the G25 era looked from the pre-ancient-DNA period of 2010.
Key Points
- ADMIXTURE, G25 and qpADM all operate on allele frequencies and discard most of the haplotype information encoded in the lengths of shared genomic segments. Their assumption of K discrete ancestral populations is a useful fiction whose limits are now visible.
- The single most consequential failure mode of every frequency-based method is the midpoint effect: when a real ancestral component is missing from the reference panel, its signal is silently redistributed onto whichever references sit geometrically between the target and the missing component. The output looks clean, the residual looks small, but the historical labels can be entirely wrong. IBD and ARG-based methods escape this trap because they operate on physical segments and genealogical paths, not on geometric projections.
- IBD (identity by descent) captures a complementary and orthogonal dimension: segments physically inherited from a recent common ancestor, with lengths that directly date the admixture event through an exponential distribution governed by recombination rate.
- The ancestral recombination graph (ARG) is the theoretical limit of population genetic inference. It is a single graph-like structure that encodes coalescence, IBD, recombination and mutation, and from which every classical summary statistic can be derived.
- ARG inference now scales to the UK Biobank's nearly 500,000 genomes (Threads, 2024). A milestone that was technically impossible in 2020 has been crossed in four years, and the trend is accelerating.
- Ancestry shipped SideView in April 2022, an IBD-based phasing technology that achieves 95 percent accuracy for 90 percent of customers, without requiring either parent to be tested. The consumer industry is already past the inflection point.
- Deep learning methods (Gnomix, Orchestra, SIA, ARGformer) replace parametric models with learned representations and improve quarterly, often by margins that classical methods can no longer match. Orchestra (Nature Communications 2025) achieves above 95 percent precision and recall on 35 worldwide populations.
- The most likely near-term replacement for G25 is not the full ARG paradigm but a hybrid generation that combines IBD-derived haplotype information with PCA-style dimensionality reduction. ChromoPainter and FineSTRUCTURE have existed since 2012 and GLOBETROTTER since 2014, and the next generation of community calculators will almost certainly be built on painted haplotype copying matrices rather than on raw SNP-based PCA.
- The convergence is not a five-year vision. The components already exist in 2026 in separate codebases. Integration is a software engineering problem, not a scientific one.
- ADMIXTURE, G25, and qpADM will not disappear immediately. Their displacement will be gradual, and the ancient DNA reference panel that anchors them will remain absolutely central in the new paradigm.
1. Where we stand: the current triptych
To grasp the shift that is unfolding, you need to understand what ADMIXTURE, G25 and qpADM actually do, and what they share. All three operate on matrices of allele frequencies (or linear transformations of them) and all three assume, implicitly or explicitly, that ancestral populations can be treated as discrete entities contributing independently to a test individual's genome.
ADMIXTURE (Alexander, Novembre, Lange 2009) is a maximum likelihood model that decomposes each genome into K ancestral components. For roughly ten years it was the de facto standard, and it remains widely used. Its mathematical core is solid, but its base assumption is increasingly hard to defend: that K populations exist as real entities and can be characterised by a vector of allele frequencies. Haplotype structure, linkage disequilibrium, the temporal sequence of admixture events, all of this is invisible to the model. ADMIXTURE returns a static cross-section of ancestry, not a history of it.
G25 (Davidski, from 2019 onward) projects genomes onto 25 principal component axes built from a curated panel of modern and ancient samples. It has been extraordinarily productive for the community, but it is conceptually a compression of information: Euclidean distances in this 25-dimensional space approximate population distances, and Non-Negative Least Squares (NNLS) modelling lets you express a target as a non-negative linear combination of references. The results are remarkably coherent for most reasonable models, which is why the tool has become the lingua franca of amateur paleogenetics. But G25 is not an inference framework. It is a geometric heuristic. It does not quantify uncertainty, does not date admixture, and remains highly sensitive to the composition of the reference panel.
qpADM (Haak, Patterson et al. 2015) is the most rigorous of the three. It formally tests admixture models using f-statistics on carefully chosen outgroups. It returns a p-value, confidence intervals on mixture proportions, and the ability to reject candidate models. But it does not scale beyond a few dozen source populations, requires careful selection of right and left populations, and remains fundamentally based on second moments of allele frequencies (f-statistics are averaged products of allele frequency differences). All phase information, all segment length information, all haplotype structure is still discarded.
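The f-statistics that qpADM builds on are simple to state. The sketch below computes an f4 statistic from per-population allele frequencies on simulated data. The four populations, the drift magnitudes and the SNP count are illustrative assumptions, and the block-jackknife standard errors and rank test that real qpADM layers on top are not shown.

```python
import random

def f4(pa, pb, pc, pd):
    """f4(A, B; C, D): mean over SNPs of (pA - pB) * (pC - pD)."""
    return sum((a - b) * (c - d) for a, b, c, d in zip(pa, pb, pc, pd)) / len(pa)

def simulate_frequencies(n_snps=20000, seed=1):
    """Toy tree ((A, B), (C, D)): A and B share drift x, C and D share drift y.
    Frequencies are not clamped to [0, 1]; the noise is small enough here."""
    rng = random.Random(seed)
    pops = {"A": [], "B": [], "C": [], "D": []}
    for _ in range(n_snps):
        root = rng.uniform(0.3, 0.7)
        x = rng.gauss(0, 0.05)   # drift shared by A and B
        y = rng.gauss(0, 0.05)   # drift shared by C and D
        for name, shared in (("A", x), ("B", x), ("C", y), ("D", y)):
            pops[name].append(root + shared + rng.gauss(0, 0.03))
    return pops

freqs = simulate_frequencies()
f4_tree = f4(freqs["A"], freqs["B"], freqs["C"], freqs["D"])
f4_cross = f4(freqs["A"], freqs["C"], freqs["B"], freqs["D"])
print(f"f4(A,B;C,D) = {f4_tree:+.5f}   (close to zero: consistent with the tree)")
print(f"f4(A,C;B,D) = {f4_cross:+.5f}   (clearly positive: shared drift)")
```

Note that every SNP enters the average independently. Nothing in the statistic knows which SNPs are neighbours on a chromosome, which is exactly the information the haplotype-based methods exploit.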
The common thread: all three tools look at allele frequencies, not at segments of DNA. But the genome is not a soup of independent alleles. It is a mosaic of inherited segments, each with its own genealogical history. Discarding that mosaic structure is like trying to reconstruct a family tree by counting how often each surname appears in the phone book, instead of reading actual genealogies.
2. The Achilles heel of allele frequencies
Consider two European individuals, one Finnish, one Portuguese. ADMIXTURE and G25 will primarily see a combination of three components (Steppe, Early European Farmer, Western Hunter-Gatherer) in different proportions. That is correct, but it is a very coarse statistical summary. What is physically happening in their genomes is much richer.
Each chromosome is a mosaic of segments inherited from different ancestors, themselves living at different epochs. A 5 cM segment shared between two modern Sicilians might trace to a common ancestor 30 generations ago. A 12 cM segment between the same two individuals might trace to an ancestor only 12 generations ago. The segment lengths follow a known distribution (exponential, with rate proportional to the time to the most recent common ancestor and the local recombination rate), which means that segment lengths are a clock. Allele frequencies discard this clock entirely.
The same logic applies at population scale. Two populations that share 8 percent of their ancestry from a single admixture event 30 generations ago have a very different IBD-segment-length distribution than two populations that share 8 percent of their ancestry from a continuous trickle of migration over the last 200 generations. ADMIXTURE will report the same 8 percent in both cases. The IBD profile will look completely different.
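The contrast between a pulse and a trickle can be made concrete with a few lines of arithmetic. The sketch below assumes a deliberately simplified model: a segment that has been exposed to recombination for g generations has an exponentially distributed length with mean 100 / g cM, and in the continuous scenario every generation contributes the same number of segments. Both assumptions are illustrative, not a fitted demographic model.

```python
import math

def fraction_longer_than(threshold_cm, entry_generations):
    """Share of segments longer than threshold_cm when each segment has been
    recombining for g generations and its length is Exp(mean = 100 / g cM)."""
    survival = [math.exp(-g * threshold_cm / 100.0) for g in entry_generations]
    return sum(survival) / len(survival)

pulse = [30]                      # single admixture event 30 generations ago
trickle = list(range(1, 201))     # equal inflow in each of the last 200 generations

for threshold in (1, 5, 20):
    p = fraction_longer_than(threshold, pulse)
    t = fraction_longer_than(threshold, trickle)
    print(f"longer than {threshold:>2} cM   pulse: {p:.4f}   trickle: {t:.4f}")
```

Under these assumptions the trickle produces both more very short segments and several times more very long ones than the pulse, even though a frequency-based method would report the same overall proportion for both.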
The popular assumption
ADMIXTURE percentages tell you the demographic composition of your ancestors. A 30 percent steppe result means roughly 30 percent of your ancestors were steppe pastoralists.
The technical reality
ADMIXTURE percentages tell you which combination of K modelled source populations best matches your allele frequencies, treating each SNP as independent. They cannot distinguish 30 percent steppe ancestry inherited from a single recent admixture event from the same proportion arriving in three pulses across 3,000 years, and they collapse all temporal layering into a single point estimate without uncertainty bounds.
3. The midpoint trap: the structural weakness of frequency-based methods
If there is one specific failure mode that summarises the limits of ADMIXTURE, G25 and qpADM together, it is the midpoint effect. It is the single most important methodological pitfall in the current paradigm, and it is precisely the failure mode that the new generation of methods (IBD, ARG, deep learning) is designed to avoid.
The midpoint effect is simple to state. When a method operating on allele frequencies (or on a PCA projection of them) lacks a proper reference for a particular ancestral population, it does not refuse to provide an answer. It returns the closest linear combination of available references, which is geometrically the midpoint (or weighted average) of the surrounding reference populations. The model has produced an answer, the NNLS fit looks clean, the residual is small, and the user reads off a percentage breakdown that looks like real ancestry but is actually a geometric artifact of the reference panel's gaps.
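The effect is easy to reproduce in a toy setting. In the sketch below, the three reference coordinates are invented values in a 3-axis PCA-like space (they are not real G25 coordinates), and `nnls_small` is a hypothetical brute-force helper standing in for a proper NNLS solver. The target is built as 96 percent of one reference and 4 percent of another, then fitted once with the second reference missing and once with the full panel.

```python
from itertools import combinations
import numpy as np

def nnls_small(refs, target):
    """Brute-force NNLS for a handful of references.

    Solves ordinary least squares on every subset of references and keeps
    the best solution whose coefficients are all non-negative. Fine for a
    toy panel; a real calculator would use a dedicated NNLS solver.
    """
    n = refs.shape[0]
    best_w = np.zeros(n)
    best_res = float(np.linalg.norm(target))
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            idx = list(subset)
            coef, *_ = np.linalg.lstsq(refs[idx].T, target, rcond=None)
            if np.any(coef < 0):
                continue
            w = np.zeros(n)
            w[idx] = coef
            res = float(np.linalg.norm(refs.T @ w - target))
            if res < best_res - 1e-12:
                best_w, best_res = w, res
    return best_w, best_res

# Hypothetical coordinates (assumptions for illustration only).
europe      = np.array([1.0, 0.0, 0.0])
north_afr   = np.array([0.5, 0.5, 0.1])   # sits between the other two
sub_saharan = np.array([0.0, 1.0, 0.0])

# True composition of the target: 96% Europe, 4% Sub-Saharan, 0% North Africa.
target = 0.96 * europe + 0.04 * sub_saharan

gap_panel = np.vstack([europe, north_afr])                # Sub-Saharan missing
full_panel = np.vstack([europe, north_afr, sub_saharan])

w_gap, res_gap = nnls_small(gap_panel, target)
w_full, res_full = nnls_small(full_panel, target)
print("gap panel  [Europe, N.Africa]:     ", w_gap.round(3), "residual", round(res_gap, 4))
print("full panel [Europe, N.Africa, SSA]:", w_full.round(3), "residual", round(res_full, 4))
```

With the gap panel, the 4 percent Sub-Saharan signal comes back as roughly 8 percent of the intermediate reference, with a residual small enough to pass for a clean fit. With the full panel the intermediate reference drops to zero.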
The clearest case is the one documented in this site's analysis of the Migration Era Calculator. The Karl Hogstrom panel (Calculator 186 on ExploreYourDNA) returns a small but persistent "North Africa 150 CE" component for many western and central French regional populations. The historical fit is implausible: there is no documented Roman-era North African demographic input into Brittany, Normandy or Picardy at meaningful levels. What is actually happening is the midpoint effect in operation. The calculator lacks a clean Sub-Saharan African reference, and the small Sub-Saharan signal in a French sample, which is real and reflects continuous post-medieval contact, gets projected onto the "North Africa 150 CE" reference because that reference sits on the PCA path between the European cluster and the Sub-Saharan cluster. The user reads "North Africa", but the underlying signal is a midpoint between Europe and Sub-Saharan Africa, and the historical interpretation collapses.
The same effect appears with Mozabites used as a Berber reference. As Fregel et al. 2018 demonstrated, modern Mozabites carry roughly 50 percent European Neolithic ancestry (inherited from Cardial farmers who crossed the Strait of Gibraltar around 5500 BCE) on top of their endemic Iberomaurusian Maghrebi substrate. A modern Spaniard modelled against Mozabite as a "North African" reference will systematically overestimate North African ancestry, because half of the Mozabite signal in the fit is the European Neolithic in Mozabite catching the European Neolithic in the Spaniard. The NNLS sees a clean fit, the user reads "10 percent Mozabite", but the historically meaningful Andalusi proportion is closer to 5 percent. The remaining 5 percent is a midpoint artifact created by a contaminated reference.
The popular assumption
If the NNLS model returns 5 percent of population X, then your ancestor had 5 percent ancestry from population X, or from something genetically close to it. The fit is good, the residual is small, the answer is reliable.
The structural reality
NNLS finds the best linear combination in PCA space, not the historically true source. If a real ancestral component is missing from the reference panel, its signal is silently redistributed onto whichever references sit geometrically between the target and the missing component. The fit is clean, the residual is small, and the historical labels can be entirely wrong. The user has no way of knowing from the output alone.
This is not a bug that better software can fix. It is a structural property of any method that:
- Projects genomic data into a low-dimensional space (PCA, or the K-component simplex of ADMIXTURE)
- Models ancestry as a non-negative linear combination of available references
- Optimises a single least-squares or likelihood objective with no awareness of which references are historically plausible or which components are missing from the panel
ADMIXTURE, G25 plus NNLS, and even qpADM (in its handling of unmodelled gene flow) all suffer from this. qpADM is the most resistant of the three, because its outgroup-based f-statistics framework can detect that a model is misspecified, but only if the user has the methodological awareness to test for it and the right outgroups available. The amateur user running NNLS in Vahaduo has no such guardrail at all.
Deep learning methods sit in an intermediate position. Orchestra and Gnomix, trained on labelled haplotypes from well-defined populations, also largely avoid the midpoint trap when the training data covers the relevant ancestries. When the training data has gaps, deep networks can still produce midpoint-like artifacts (assigning unfamiliar haplotypes to whichever training class is closest in feature space). The decisive advantage is that the calibration step (Gnomix uses isotonic regression to map predicted probabilities to actual accuracy) flags low-confidence predictions explicitly, where classical NNLS returns no uncertainty at all.
The midpoint effect is, in this sense, the bridge between the old and new paradigms. The new methods are not unconditionally better than the old ones at every task. But they are specifically better at the task that broke G25 the most: distinguishing real ancestry from geometric artifact. Every claim of "small but consistent" admixture from a frequency-based calculator should now be cross-checked against an IBD-based or ARG-based query before being accepted as historically meaningful. The era when a clean NNLS fit was sufficient evidence is ending.
4. Five phases of methodological history
The history of population genetic methods can be divided into five phases that overlap rather than succeed each other cleanly. The fifth phase is being born before our eyes.
- ADMIXTURE, STRUCTURE, fastSTRUCTURE. K discrete ancestral components modelled by maximum likelihood. Slow modelling, no haplotype information, no dates, no uncertainty quantification. The 1000 Genomes Project provides the reference panel.
- G25, qpADM, qpGraph. Massive ancient DNA reference panel from the Reich Lab, Allentoft and Lazaridis. PCA becomes the common language of the community. NNLS becomes the dominant amateur modelling heuristic. The Yamnaya, EEF, WHG framework is codified.
- Ancestry Genetic Communities, SideView, 23andMe matching network, MyHeritage Genetic Groups. IBD-based phasing on millions of testers. Network structure becomes the substrate of inference. Ancestry's 2017 paper documents post-colonial North American population structure from 770,000 genomes.
- ChromoPainter, FineSTRUCTURE (Lawson 2012), GLOBETROTTER (Hellenthal 2014), SOURCEFIND (2018), HapNe (2023), FLARE (2023). Painted haplotype copying matrices replace SNP-based PCA as the input to clustering and NNLS modelling. The most likely near-term replacement for the G25 amateur workflow.
- Relate (Speidel 2019), tsinfer (Kelleher 2019), ARG-Needle (Zhang 2023), Threads (Gunnarsson 2024), SINGER (Deng 2025). Reconstruction of full genealogical histories. UK Biobank scale of nearly 500,000 genomes reached in 2024.
- Gnomix (Hilmarsson 2021), Orchestra (Comajoan Cara 2025), SIA (Hejase 2022), ARGformer (2026). Transformers and recurrent neural networks on genomes. Inference becomes representational learning. Fusion of IBD, allele frequencies, and dates into a single framework.
- An ARG inferred at biobank scale, with IBD read off the threading step, local ancestry assigned by a transformer, admixture dated from segment length distributions, and continuity with the ancient DNA reference panel. Components exist today; integration is a software engineering problem.
5. IBD, the genealogical memory of the genome
IBD (identity by descent) refers to genomic segments shared by two individuals because they inherited them from a recent common ancestor. The keyword is recent. An IBD segment is a physical, contiguous stretch of chromosome that has not been broken by recombination since the most recent common ancestor of the two individuals carrying it. Its length, in centimorgans, follows an exponential distribution with mean inversely proportional to the time to that ancestor.
This single fact changes everything. With one 8 cM segment, you can say with high confidence that two people share a common ancestor within roughly the last 8 to 15 generations. With many segments accumulated across the genome and across pairs of individuals, you can build up an entire genealogical structure: which populations share recent ancestors, how many generations ago, and with what intensity.
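A minimal sketch of that clock, under stated assumptions: the segment is taken to span a fixed site, so its length is the sum of two exponential distances to the nearest crossover (each with rate 2g per Morgan for an ancestor g generations back), and the prior on g is flat. The posterior on g is then a Gamma distribution with shape 3 and rate 2 x length in Morgans. Other priors (for example, one reflecting population size) would shift the numbers, so treat the output as order-of-magnitude guidance.

```python
import math

def segment_age_summary(length_cm):
    """Posterior mode and mean (in generations) for the age of one IBD segment.

    Toy model: Gamma(shape=3, rate=2 * length_in_Morgans) posterior, which
    gives mode = 100 / length_cm and mean = 150 / length_cm.
    """
    return 100.0 / length_cm, 150.0 / length_cm

def prob_age_at_most(length_cm, generations):
    """Posterior probability that the common ancestor lived within `generations`."""
    z = 2.0 * (length_cm / 100.0) * generations
    return 1.0 - math.exp(-z) * (1.0 + z + z * z / 2.0)   # Gamma(shape=3) CDF

for cm in (5, 8, 12):
    mode, mean = segment_age_summary(cm)
    print(f"{cm:>2} cM: most likely about {mode:.1f} generations, mean {mean:.1f}")
```

The posterior means reproduce the ballpark figures used earlier for 5 cM and 12 cM segments. `prob_age_at_most` can be used to see how wide the uncertainty on a single segment remains, which is why accumulating many segments matters.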
Allele frequencies see the time-averaged smear of all common ancestors going back to the population's effective coalescence time (often tens of thousands of years). IBD sees the specific, datable, geographically localisable common ancestors of the last few millennia. The two views are not redundant. They are complementary, and a complete inference framework should use both.
The methods to detect IBD at scale (hap-IBD, iLASH, IBDseq, FastIBD) emerged in parallel with the consumer DNA testing industry. By 2018 it was technically possible to detect every IBD segment longer than 5 cM in databases of millions of individuals, with high specificity. The consumer companies, with their massive proprietary databases, were the first to benefit. Academic IBD work, by contrast, was for a long time bottlenecked by access to large genomic datasets, and only the UK Biobank changed that picture.
6. Ancestry's industrial bet on IBD
Ancestry shipped SideView in April 2022. The technical achievement is impressive: using only the IBD shared between a test individual and the rest of Ancestry's database (around 20 million testers at the time), the algorithm assigns each genomic segment to one of two parental chromosomes, without requiring either parent to be tested. The phasing accuracy reaches 95 percent for 90 percent of customers, a figure that would have been unthinkable in 2015.
The underlying white paper, "Accurate Genome-Wide Phasing from IBD Data" (Wang et al. 2022, bioRxiv), describes how this works. The algorithm leverages the fact that long IBD segments must lie on a single parental chromosome (because they were inherited as an unbroken block from one parent). By identifying which segments are shared with which clusters of distant relatives, the algorithm can resolve the parental phase of the entire genome.
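The core constraint can be illustrated with a toy two-colouring. This is not Ancestry's algorithm, only a sketch of the underlying logic under simplifying assumptions: on one chromosome, two matches whose segments overlap and who also match each other across the overlap are taken to sit on the same parental haplotype, while overlapping matches who do not match each other are taken to sit on opposite haplotypes. All names and coordinates are hypothetical.

```python
from collections import deque

def overlaps(a, b):
    """True if two (start_cM, end_cM) intervals share any stretch."""
    return max(a[0], b[0]) < min(a[1], b[1])

def assign_sides(segments, mutual_matches):
    """Two-colour matches into parental sides on one chromosome.

    segments:       {match_id: (start_cM, end_cM)} shared with the customer
    mutual_matches: set of frozenset({id1, id2}) for matches who also share
                    IBD with each other across their overlap
    Returns {match_id: 0 or 1}; which label is maternal is unknown.
    """
    side = {}
    for root in segments:
        if root in side:
            continue
        side[root] = 0
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in segments:
                if v == u or not overlaps(segments[u], segments[v]):
                    continue
                same = frozenset((u, v)) in mutual_matches
                expected = side[u] if same else 1 - side[u]
                if v not in side:
                    side[v] = expected
                    queue.append(v)
                elif side[v] != expected:
                    raise ValueError(f"inconsistent evidence for {u} and {v}")
    return side

segments = {"cousin_A": (10, 40), "cousin_B": (30, 60), "cousin_C": (20, 50)}
mutual = {frozenset(("cousin_A", "cousin_B"))}
sides = assign_sides(segments, mutual)
print(sides)
```

Groups of matches that never overlap receive independent, arbitrary labels in this sketch. Linking those groups across the whole genome is the hard part, and it is where clusters of distant relatives become essential.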
The implication is profound. SideView did not invent a new ancestry component or refine the ADMIXTURE framework. It bypassed the framework entirely. By phasing the genome through IBD networks, it produces parent-of-origin ancestry estimates that no allele-frequency method can match in accuracy or interpretability. And critically, the underlying technology is general: once you have a phased genome and a dense IBD network, you can ask questions ADMIXTURE was never designed to answer.
Ancestry's Genetic Communities feature, which clusters customers into hundreds of fine-scale geographic communities, runs on the same substrate. The clustering is not based on allele frequencies. It is based on IBD network structure. Each community is a connected subgraph of the IBD network, and the geographic location is read off the cluster, not inferred from PCA distances. The 2017 Nature Communications paper by Han et al. ("Clustering of 770,000 genomes reveals post-colonial population structure of North America") is the foundational publication. By 2026, the technology has been deployed continuously for nearly a decade.
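To make the idea of reading communities off an IBD network concrete, here is a deliberately crude sketch. It uses plain connected components on a thresholded graph as a stand-in for the community detection used in production. The threshold, the names and the pairwise totals are all hypothetical.

```python
from collections import defaultdict

def ibd_communities(pairwise_cm, min_shared_cm=20.0):
    """Connected components of the graph whose edges are pairs sharing at
    least min_shared_cm of IBD. A crude stand-in for community detection."""
    graph = defaultdict(set)
    people = set()
    for (a, b), shared in pairwise_cm.items():
        people.update((a, b))
        if shared >= min_shared_cm:
            graph[a].add(b)
            graph[b].add(a)
    communities, seen = [], set()
    for person in sorted(people):
        if person in seen:
            continue
        stack, component = [person], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        communities.append(sorted(component))
    return communities

pairwise = {
    ("ann", "bob"): 45, ("bob", "cat"): 30, ("ann", "cat"): 12,
    ("dan", "eve"): 60, ("eve", "fay"): 25, ("cat", "dan"): 8,
}
communities = ibd_communities(pairwise)
print(communities)
```

No allele frequency appears anywhere in this computation. The only input is who shares long segments with whom.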
7. The ARG, theoretical unification
If IBD is one half of the story, the ancestral recombination graph (ARG) is the other half and the more theoretically ambitious one. An ARG is a complete graph-like data structure that encodes the genealogical history of a sample of genomes along every position of the genome. At each position, the ARG specifies the marginal tree relating the sampled chromosomes back to their most recent common ancestor. As you move along the chromosome, the tree changes at each historical recombination point.
The key insight: if you know the ARG, you know everything. IBD segments are simply paths through the ARG. Allele frequencies are summaries of mutation events placed on ARG branches. f-statistics, qpADM constraints, PCA axes, can all be computed from the ARG as derived quantities. The ARG is the maximal sufficient statistic of population genetic data.
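A hand-built toy shows how one derived quantity falls out of the structure. The sketch below stores three marginal trees over a 12 Mb region for three sampled chromosomes and recovers, for any pair, the stretches over which their most recent common ancestor stays the same node. Node ids, times and breakpoints are invented for illustration, and real tools use far more compact encodings than a list of parent dictionaries.

```python
# Node times in generations; nodes 0-2 are the sampled chromosomes.
NODE_TIME = {0: 0, 1: 0, 2: 0, 3: 12, 4: 300, 5: 900, 6: 1500}

# Each marginal tree: (left_bp, right_bp, {child: parent}).
TREES = [
    (0,         4_000_000,  {0: 3, 1: 3, 3: 5, 2: 5}),
    (4_000_000, 9_000_000,  {0: 3, 1: 3, 3: 6, 2: 6}),
    (9_000_000, 12_000_000, {0: 4, 2: 4, 4: 6, 1: 6}),
]

def mrca(parent, a, b):
    """Most recent common ancestor of samples a and b in one marginal tree."""
    ancestors = set()
    node = a
    while node is not None:
        ancestors.add(node)
        node = parent.get(node)
    node = b
    while node not in ancestors:
        node = parent[node]
    return node

def shared_segments(a, b):
    """Merge adjacent trees in which a and b keep the same MRCA node."""
    segments = []
    for left, right, parent in TREES:
        anc = mrca(parent, a, b)
        if segments and segments[-1][2] == anc and segments[-1][1] == left:
            segments[-1] = (segments[-1][0], right, anc)
        else:
            segments.append((left, right, anc))
    return segments

for left, right, anc in shared_segments(0, 1):
    print(f"{left:>10,} - {right:>10,} bp  MRCA node {anc}, {NODE_TIME[anc]} generations ago")
```

In this toy, samples 0 and 1 share one unbroken 9 Mb stretch through a recent ancestor, then switch to a much older one: an IBD segment and its age, read directly off the genealogy.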
For two decades, inferring ARGs from real data was computationally intractable for anything beyond a handful of genomes. ARGweaver (Rasmussen et al. 2014) was the first method to sample reasonably accurate ARGs but scaled only to dozens of genomes. That changed with a sequence of methodological breakthroughs:
- Relate: Speidel et al. (Nature Genetics 2019) introduced a heuristic Bayesian framework that scaled to thousands of genomes for the first time. Applied to the SGDP and 1000 Genomes data, Relate recovered known demographic events with high fidelity and opened the door to ARG-based inference at human population scale.
- tsinfer: Kelleher et al. (Nature Genetics 2019) introduced the tree-sequence representation, a compact data structure that stores an inferred ARG. Memory and compute requirements drop by orders of magnitude. The 2022 Wohns et al. paper (Science) used tsinfer to build a unified genealogy combining 3,601 modern and 8 ancient genomes spanning 215,000 years of human history.
- ARG-Needle: Zhang et al. (Nature Genetics 2023) introduced threading-based inference using genotype hashing. The method built genome-wide genealogies for 337,464 UK Biobank individuals from genotype arrays. Genealogy-based association testing detected more rare and ultra-rare signals than imputation from 65,000 sequenced haplotypes.
- Threads: Gunnarsson et al. (bioRxiv 2024) inferred the ARG of 487,409 UK Biobank genomes using 10 million imputed variants, with substantially reduced compute compared to ARG-Needle. ARG-based imputation strategies improved ultra-rare variant accuracy by 5 to 10 percent. The bottleneck of biobank-scale ARG inference is effectively gone.
- SINGER: Deng et al. (Nature Genetics 2025) introduced sampling from the full ARG posterior at two orders of magnitude higher speed than prior methods. For the first time, hundreds of whole-genome sequences can be analysed with proper Bayesian uncertainty quantification rather than single-point ARG estimates.
- ARGformer: Lewanski et al. (bioRxiv 2026) trained an encoder-only transformer on simulated and inferred ARGs from coalescent simulations and from ancient and present-day Homo sapiens genomes. Using only the learned embeddings, ARGformer recovers global population structure and supports unsupervised ancestry inference, without ever accessing the genotype matrix directly.
The Wohns et al. 2022 paper in Science, "A unified genealogy of modern and ancient genomes," is perhaps the most striking demonstration. The authors inferred a single ARG combining 3,601 modern genomes and 8 high-coverage ancient genomes, covering 215,000 years of human history in a single graph. The resulting tree sequence is small enough to fit on a laptop, fast enough to query interactively, and rich enough to recover essentially every major migration event documented by independent paleogenetic studies. This is what an integrated paleogenetic future looks like.
8. Deep learning enters the field
Until roughly 2020, population genetics inference was overwhelmingly model-based. You wrote down a generative model (a coalescent process, a Wright-Fisher model, an admixture graph) and you fitted its parameters. The hard part was always the model specification, not the data. The community accumulated extensive expertise in choosing the right outgroups, the right reference populations, the right hypothesis space.
Deep learning changes the trade-off. Instead of specifying a model, you specify an architecture and let the network learn its own representations from millions of simulated examples or from massive empirical datasets. The community resisted this shift longer than other fields, partly because population genetics has a strong tradition of mechanistic modelling, and partly because the data was for a long time too sparse for representation learning to work well. Both barriers have now fallen. The approach has produced concrete wins in several distinct subproblems.
Local ancestry inference (LAI): Gnomix (Hilmarsson et al. 2021) combines a set of base classifiers operating on genomic windows with a smoother (a small neural network that refines the window-level estimates). The result outperforms the previous gold standard (RFMix) on whole genome data at orders of magnitude faster runtime. Orchestra (Comajoan Cara et al. 2025, Nature Communications) trains on over 10,000 single-origin individuals from 35 worldwide populations and achieves precision and recall above 95 percent on the test set. It was used to resolve the long-debated origins of Ashkenazi Jews (confirming the South European heritage demonstrated by the Norwich and Erfurt ancient DNA work) and to identify trace ancestries in Latin Americans matching specific historical records.
Selective sweep detection: SIA (Hejase et al. 2022, MBE) uses a recurrent neural network operating on ARG features (rather than raw genotype matrices) to detect and quantify positive selection. Applied to the 1000 Genomes Northern European data, SIA detected novel signals at the MC1R and ABCC11 loci and recapitulated the classic LCT lactase sweep at higher resolution than allele-frequency-based methods. Critically, SIA infers the full allele frequency trajectory of a beneficial variant, not just a sweep score.
Variant dating: tsdate (2025) provides a scalable variational algorithm for dating mutations placed on inferred ARGs. Applied to 47,535 genomes from the Genomics England 100,000 Genomes Project, it inferred ages for 23.2 million variants, including 11.8 million singletons. The ages remain accurate under extreme sampling imbalance and reveal signatures of purifying selection at clinically relevant loci.
Representation learning on ARGs: ARGformer (2026) is the most striking development. A transformer trained on genealogies from coalescent simulations and from ancient and present-day human genomes, ARGformer recovers global population structure and supports unsupervised ancestry inference, using only its learned embeddings, without ever accessing genotype matrices directly. This is a fundamental shift: ancestry inference becomes a downstream task of a self-supervised foundation model, exactly the architecture that revolutionised natural language processing and computer vision in the late 2010s.
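The two-stage design described for local ancestry inference (window-level classifiers followed by a smoother) can be sketched in a few lines. This is not Gnomix itself: a nearest-reference rule stands in for the base classifiers and a majority vote stands in for the learned smoother, and the haplotypes are invented strings of 0/1 alleles.

```python
from collections import Counter

def classify_windows(query, references, window=4):
    """Base layer: label each window with the population of the closest
    reference haplotype (fewest mismatches inside that window)."""
    labels = []
    for start in range(0, len(query), window):
        chunk = query[start:start + window]

        def mismatches(item, start=start, chunk=chunk):
            _, hap = item
            return sum(q != r for q, r in zip(chunk, hap[start:start + window]))

        labels.append(min(references, key=mismatches)[0])
    return labels

def smooth(labels, radius=1):
    """Smoother: replace each label by the majority label of its neighbourhood."""
    out = []
    for i in range(len(labels)):
        neighbourhood = labels[max(0, i - radius): i + radius + 1]
        out.append(Counter(neighbourhood).most_common(1)[0][0])
    return out

# Hypothetical reference haplotypes and a query with one noisy window (the third).
references = [("POP_A", "0" * 32), ("POP_B", "1" * 32)]
query = "0000" "0000" "0111" "0000" "0000" "1111" "1111" "1111"

raw = classify_windows(query, references)
smoothed = smooth(raw)
print("raw:     ", raw)
print("smoothed:", smoothed)
```

The smoother removes the isolated miscall in the third window while keeping the real switch later on. The trade-off is visible too: a genuine ancestry tract only one window long would be smoothed away, which is one reason learned smoothers replace fixed rules.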
The pattern across these methods is consistent. Where classical methods plateaued at certain accuracy levels because of structural assumptions in their generative models, deep learning methods continue to improve as they are trained on more data, with larger networks, and with better simulated training distributions. The improvement curves do not yet show signs of saturation.
9. The hybrid generation: combining IBD and PCA
The opposition between "frequency-based methods" and "IBD-based methods" is convenient for exposition but slightly artificial. In practice, the most successful tools of the last decade have been hybrid methods that combine haplotype and IBD information with PCA-style dimensionality reduction. These methods are not yet the full ARG paradigm, but they are the bridge, and they are what most of the consumer industry and most working academic labs actually use right now. For the genetic genealogy community, the hybrid generation is almost certainly what will replace G25 first, before the deeper transition to ARG-based inference completes.
The foundational tools of this class are ChromoPainter and FineSTRUCTURE (Lawson, Hellenthal, Myers, Falush 2012). Instead of operating on the raw SNP matrix, ChromoPainter paints each test chromosome as a mosaic of donor haplotypes drawn from a reference panel. The output is a "copying matrix" where each row is a test individual and each column is a reference donor population, with cells giving the proportion of the test chromosome painted by haplotypes from each donor. This matrix captures haplotype and IBD-style information much more richly than a frequency-based PCA. FineSTRUCTURE then runs hierarchical clustering on this matrix to identify fine-scale population structure that no allele-frequency method can resolve.
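The shape of a copying matrix is easiest to see from a toy aggregation. The sketch below assumes the painting step has already been done and simply sums painted chunk lengths per donor population. The donor labels, individuals and chunk lengths are hypothetical, and the painting itself (the hard part, a hidden Markov model over donor haplotypes) is not shown.

```python
from collections import defaultdict

def copying_matrix(paintings, donors):
    """paintings: {individual: [(donor_population, chunk_length_cM), ...]}
    Returns {individual: [share of genome copied from each donor]},
    with columns in the order given by `donors`."""
    matrix = {}
    for individual, chunks in paintings.items():
        totals = defaultdict(float)
        for donor, length in chunks:
            totals[donor] += length
        genome = sum(totals.values())
        matrix[individual] = [totals[d] / genome for d in donors]
    return matrix

donors = ["donor_pop_1", "donor_pop_2", "donor_pop_3"]
paintings = {
    "test_1": [("donor_pop_1", 30), ("donor_pop_2", 10),
               ("donor_pop_1", 20), ("donor_pop_3", 40)],
    "test_2": [("donor_pop_2", 70), ("donor_pop_3", 30)],
}
matrix = copying_matrix(paintings, donors)
for individual, row in matrix.items():
    print(individual, [round(x, 2) for x in row])
```

Each row is a vector that can be fed to clustering or to NNLS in place of PCA coordinates, which is the substitution the hybrid generation relies on.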
The famous GLOBETROTTER method (Hellenthal et al. 2014, Science) builds on ChromoPainter to detect admixture events and date them through the decay of linkage disequilibrium in painted segments. The "Genetic Atlas of Human Admixture History" produced by that paper remains one of the most influential pieces of population genetics work of the 2010s. It identified specific admixture pulses (the Mongol expansion, the Arab slave trade, post-Roman migrations) with dates that match historical records to within decades, using a methodology that is fundamentally a hybrid of haplotype information and PCA-style linear modelling.
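The dating principle reduces to fitting an exponential decay. The sketch below is a noiseless toy, not GLOBETROTTER's actual fitting procedure: it assumes the baseline-subtracted signal follows A * exp(-T * d), with d the genetic distance in Morgans and T the number of generations since admixture, and recovers T by linear regression on the log scale. The amplitude, distance grid and 28-year generation time are assumptions.

```python
import math

def fit_decay_rate(distances_morgans, signal):
    """Least-squares slope of ln(signal) against distance, sign-flipped.
    Under signal = A * exp(-T * d) this returns T in generations."""
    xs = distances_morgans
    ys = [math.log(s) for s in signal]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx

true_generations = 25
distances = [0.01 * k for k in range(1, 31)]     # 1 to 30 cM, expressed in Morgans
signal = [0.2 * math.exp(-true_generations * d) for d in distances]

estimate = fit_decay_rate(distances, signal)
years_ago = estimate * 28                         # assumed years per generation
print(f"estimated {estimate:.1f} generations, roughly {years_ago:.0f} years ago")
```

Real curves are noisy, can be mixtures of several exponentials when there were multiple pulses, and need bootstrap confidence intervals, so the single clean fit here is the best case.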
Several subsequent methods follow the same logic:
- SOURCEFIND (2018): Uses ChromoPainter copying-matrix outputs to identify the most plausible historical source populations for an admixture event. Applied notably to the colonial admixture history of Latin America, where it disentangled the European, African, and indigenous source contributions far more precisely than ADMIXTURE.
- HapNe (2023): Infers effective population sizes and admixture timing from IBD segment length distributions, combining the IBD information with parametric coalescent models. Where ADMIXTURE gives a static cross-section, HapNe gives a demographic trajectory through time.
- FLARE (2023): Performs local ancestry inference using a haplotype copying model, essentially a modernised HAPMIX-style approach with sharply improved scalability. Compatible with reference panels of tens of thousands of haplotypes and well-suited to consumer-scale pipelines.
23andMe's consumer pipeline already combines a frequency-based ancestry composition layer (PCA-style continental signal) with an IBD-based Recent Ancestor Locations and Genetic Communities layer (IBD network clustering). The hybrid generation is already deployed at consumer scale, even if customers see it as two separate reports.
The most direct PCA-IBD hybrid is what could be called IBD-PCA: run PCA not on the genotype matrix, but on the matrix of pairwise IBD sharing across a database. This produces a low-dimensional embedding where Euclidean distance encodes recent genealogical relationships rather than deep allele-frequency similarity. The result is a PCA-like visualisation that captures the last few thousand years of demography, exactly the time scale that matters most for genetic genealogy and where allele-frequency PCA is least informative. Ancestry, MyHeritage and 23andMe have all explored variants of this approach internally, and academic implementations now exist for biobank-scale IBD networks.
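A minimal sketch of the IBD-PCA idea, assuming a small symmetric matrix of total pairwise IBD sharing is already available. The six individuals and their sharing values are invented, and the double-centring plus eigendecomposition used here is the textbook way to run PCA on a similarity matrix, not a description of any company's internal pipeline.

```python
import numpy as np

# Hypothetical total IBD sharing (cM) between six people: two communities of three.
ibd = np.array([
    [ 0, 40, 35,  2,  1,  3],
    [40,  0, 45,  1,  2,  2],
    [35, 45,  0,  3,  1,  1],
    [ 2,  1,  3,  0, 50, 38],
    [ 1,  2,  1, 50,  0, 42],
    [ 3,  2,  1, 38, 42,  0],
], dtype=float)

def ibd_pca(sharing, n_components=1):
    """PCA on a symmetric similarity matrix: double-centre it, then keep the
    leading eigenvectors scaled by the square root of their eigenvalues
    (negative eigenvalues are clipped to zero)."""
    n = sharing.shape[0]
    centring = np.eye(n) - np.ones((n, n)) / n
    centred = centring @ sharing @ centring
    eigenvalues, eigenvectors = np.linalg.eigh(centred)
    order = np.argsort(eigenvalues)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigenvalues[order], 0, None))
    return eigenvectors[:, order] * scale

coords = ibd_pca(ibd)
print(coords.round(2))
```

The first axis separates the two communities purely from who shares long segments with whom. The sign of the axis is arbitrary, as with any PCA.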
In the unified future, the ARG itself provides a natural generalisation of the hybrid approach. The eGRM (expected genetic relatedness matrix) computed from an ARG combines both deep and recent ancestry information in a single matrix, and you can run PCA on the eGRM to recover a unified embedding that contains both the G25-style continental signal and the SideView-style fine-scale community signal as different scales of the same representation. The Wohns et al. 2022 unified genealogy paper effectively demonstrated this at limited scale, and ARGformer (2026) generalises it through transformer-learned embeddings.
This hybrid generation matters for one specific reason that the genetic genealogy community will feel directly. ARG inference at biobank scale requires GPU clusters, substantial engineering, and careful methodological choices that most amateur tools cannot replicate. Hybrid IBD+PCA methods, by contrast, can run on a laptop with a reasonable reference panel. The next generation of community calculators (the post-G25 era) will almost certainly be built around painted haplotype copying matrices, NNLS modelling on those matrices instead of on PCA coordinates, and IBD-derived dating layered on top. The transition will be gradual, and G25 will probably remain useful as a coarse visualisation layer for years, but the substantive inference will move to the hybrid layer first, and only later to full ARG-based methods.
10. The convergence
The five phases described in section 2 are not stages in a sequence where each replaces the previous. They are converging. The future state of population genetics inference, on a horizon of perhaps three to seven years, will look something like the following.
A user uploads a phased whole genome (or a phased imputed array). The pipeline first inserts that genome into a global ARG inferred from a reference panel of millions of samples, using Threads-style threading parallelised on cloud GPUs. The IBD segments are read off the ARG as a side effect of the threading step, without needing a separate IBD-detection pass. The local ancestry of every segment is assigned by a deep learning module (Orchestra-style) that uses ARG context rather than raw genotype windows. The global ancestry estimate is the integral of the local estimates, with explicit uncertainty quantification. Admixture events are dated natively from segment length distributions, without a separate dating run in a tool such as DATES. Comparison with ancient DNA samples placed on the same ARG gives both broad continental signals and fine-scale community membership in a single coherent representation.
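The step where the global estimate is "the integral of the local estimates" is simple enough to show directly. The sketch below assumes a hypothetical local-ancestry output: a list of segments, each with a genetic length in cM and a posterior probability per ancestry label. The labels, numbers, and the function name `global_ancestry` are illustrative, not the output format of any tool named above.

```python
from collections import defaultdict

def global_ancestry(segments):
    """Length-weighted average of per-segment ancestry posteriors.

    segments: list of (length_cM, {label: posterior_probability}) tuples.
    Returns {label: genome-wide proportion}.
    """
    totals = defaultdict(float)
    genome_length = 0.0
    for length_cm, posterior in segments:
        genome_length += length_cm
        for label, prob in posterior.items():
            totals[label] += length_cm * prob
    return {label: total / genome_length for label, total in totals.items()}

# Hypothetical local-ancestry calls for three segments.
segments = [
    (40.0, {"Iberian": 0.9, "North African": 0.1}),
    (10.0, {"Iberian": 0.2, "North African": 0.8}),
    (50.0, {"Iberian": 1.0}),
]
proportions = global_ancestry(segments)
print(proportions)
```

Weighting by posterior probability rather than by the single most likely label carries per-segment uncertainty through to the genome-wide figure, so ambiguous segments do not inflate a label's share.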
This is not a speculative vision. The components already exist today, in 2026, in separate codebases. What remains is integration, which is a software engineering problem, not a scientific one.
What will be displaced
ADMIXTURE: replaced by deep learning local ancestry inference with explicit ARG context. The K ancestral components become learned latent representations, not fixed assumed populations.
G25 plus NNLS: replaced by direct queries on the ARG (which historical samples does this segment trace back to, and when did the divergence happen?). The 25-dimensional PCA compression becomes unnecessary when the full genealogical structure is available.
qpADM: replaced by ARG-based hypothesis testing. The admixture proportions, their dates, and their uncertainty all fall out of the same posterior in a single inference step.
What will be preserved
The ancient DNA reference panel. aDNA is what anchors the ARG to specific historical times and places. The datasets from the Reich Lab, Allentoft, Lazaridis, Olalde and Rodriguez-Varela remain central. Without them, an ARG is just a tree with arbitrary time calibration.
The interpretive expertise of the community. Reading an Iberian Cardial-to-Visigoth-to-Andalusi admixture history off an ARG still requires knowing what those labels mean and how to design the right query.
Calculator-style workflows for specific historical questions. Specialised reference panels (Migration Era, Modern World Regions, World Ancient by Joshua) will remain useful for targeted questions.
11. What this means for genetic genealogy
For the average ExploreYourDNA reader, the practical consequences will appear progressively over the next three to seven years. The customer-facing dashboards will change more slowly than the underlying methodology, but each of the following capabilities is already technically achievable in 2026.
Better parental phasing without parents tested. SideView is only the first generation. The next will deliver near-perfect phasing for any tester whose database has enough relatives, and the threshold of "enough relatives" will fall as databases grow. Three- and four-grandparent phasing (separating each of four grandparental components) is the obvious next step, and it has already been demonstrated in academic prototypes.
Better admixture dating. Estimating that your North African ancestor lived 6 to 8 generations ago, plus or minus one generation, will become standard, not exotic. Today this requires careful manual work in DATES, GLOBETROTTER or HapNe. Soon it will be a default report element. The Ibiza al-Andalus study (Rodriguez-Varela et al. 2026) is an early example: it dated the main admixture event in medieval Islamic Ibiza to 869 CE plus or minus 6 years, from segment length distributions alone.
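The logic behind dating from segment lengths can be illustrated with a deliberately simplified model. The sketch below assumes a single admixture pulse and treats the minority-ancestry tract lengths, measured in Morgans, as exponentially distributed with rate g × (1 − m), where g is the number of generations since admixture and m the minority proportion. It ignores the truncation of short tracts and all detection error, the tracts are simulated rather than real, and the function name is hypothetical. Tools such as DATES or GLOBETROTTER use considerably richer models.

```python
import math
import random

def estimate_generations(tract_lengths_cm, minor_fraction):
    """Maximum-likelihood generations since a single admixture pulse.

    Assumes tract lengths in Morgans are exponential with rate
    g * (1 - minor_fraction); short-tract truncation is ignored.
    Returns (estimate, approximate standard error).
    """
    n = len(tract_lengths_cm)
    mean_morgans = sum(tract_lengths_cm) / n / 100.0  # cM to Morgans
    g_hat = 1.0 / (mean_morgans * (1.0 - minor_fraction))
    return g_hat, g_hat / math.sqrt(n)

# Simulated tracts (an assumption for illustration): 7 generations, 10 percent minority ancestry.
rng = random.Random(42)
true_g, m = 7.0, 0.10
tract_lengths_cm = [100.0 * rng.expovariate(true_g * (1.0 - m)) for _ in range(2000)]

g_hat, se = estimate_generations(tract_lengths_cm, m)
print(f"estimated generations: {g_hat:.2f} (standard error {se:.2f})")
```

The standard error shrinks with the square root of the number of tracts, which is why precise dates come from pooling tracts across many individuals rather than from a single genome.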
Finer-grained communities. Where Ancestry today identifies a few hundred fine-scale communities, ARG-based community detection could identify thousands, with confidence intervals on each assignment. The Genetic Communities feature of 2017 will look in retrospect like a first prototype.
Honest uncertainty. The "South Italian: 23 percent (CI 18 to 28)" notation is coming. The era of single-point estimates with no error bars, which characterised the 2015 to 2022 industry standard, is ending. Customers will probably resist this at first (the simplicity of a single number is appealing) but the methodological pressure for proper uncertainty quantification is strong.
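One generic way to produce an interval like "23 percent (CI 18 to 28)" is a block bootstrap over the genome. The sketch below is an assumption about how such a figure could be computed, not a description of any vendor's pipeline: the 22 blocks, their lengths, and their ancestry fractions are invented, and the blocks are treated as independent.

```python
import random

def weighted_proportion(blocks):
    """Length-weighted ancestry proportion across genomic blocks."""
    total = sum(length for length, _ in blocks)
    return sum(length * frac for length, frac in blocks) / total

def bootstrap_ci(blocks, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval, resampling blocks with replacement."""
    rng = random.Random(seed)
    estimates = sorted(
        weighted_proportion([rng.choice(blocks) for _ in blocks])
        for _ in range(n_boot)
    )
    lo_index = int((1 - level) / 2 * n_boot)
    hi_index = int((1 + level) / 2 * n_boot) - 1
    return estimates[lo_index], estimates[hi_index]

# Hypothetical per-block fractions of one ancestry label (22 blocks).
fractions = [0.10, 0.35, 0.22, 0.18, 0.30, 0.25, 0.15, 0.28, 0.20, 0.40, 0.12,
             0.26, 0.24, 0.19, 0.33, 0.21, 0.17, 0.29, 0.23, 0.27, 0.14, 0.31]
blocks = [(100.0 + 5.0 * i, f) for i, f in enumerate(fractions)]

point = weighted_proportion(blocks)
lo, hi = bootstrap_ci(blocks)
print(f"{100 * point:.0f} percent (CI {100 * lo:.0f} to {100 * hi:.0f})")
```

Resampling whole blocks rather than individual SNPs respects linkage within a block. A model-based pipeline could instead report a posterior interval directly from its inference step.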
Continuity with ancient DNA. The same ARG that places you in the modern Iberian cluster will also place you, with quantified uncertainty, on the timeline of Bronze Age Iberia, Roman Hispania Baetica, and medieval al-Andalus. The relationship between modern ancestry estimates and ancient samples will be continuous and interpretable, not requiring separate manual modelling in Vahaduo for each historical question.
Faster integration of new ancient samples. Every new ancient genome published in Nature or Science currently requires the community to rebuild PCA models, run new qpADM analyses, retrain calculator panels. In an ARG-based world, a new ancient sample is simply inserted into the existing graph, and downstream queries update automatically. The latency between a Reich Lab publication and its absorption into community tools could drop from months to days.
12. Comparison: what each generation of tools captures
| Method generation | Information used | Captures | Midpoint risk | Scale |
|---|---|---|---|---|
| ADMIXTURE / STRUCTURE | Allele frequencies, SNP-by-SNP | Coarse continental ancestry as K components | High, redistributes missing components silently across K | Thousands of genomes |
| G25 / PCA + NNLS | Top 25 principal components of allele frequencies | Visual placement in shared PCA space, heuristic admixture modelling | Very high, geometric midpoint is the literal failure mode of NNLS | Tens of thousands of references |
| qpADM / qpGraph | f-statistics, allele frequency moments | Formal hypothesis tests on admixture, confidence intervals on proportions | Medium, detectable with the right outgroups but requires expertise | Tens to hundreds of source populations |
| Hybrid IBD + PCA (ChromoPainter, FineSTRUCTURE, GLOBETROTTER, SOURCEFIND, FLARE, HapNe) | Painted haplotype copying matrix, IBD segment lengths, plus dimensionality reduction | Fine-scale structure, admixture dating, source identification, all in a compatible NNLS workflow | Low to medium, painted haplotypes from different sources are not interchangeable | Thousands to tens of thousands |
| IBD-based (hap-IBD, SideView) | Shared IBD segments and their lengths | Recent admixture, parental phasing, community structure, fine-scale geography | Low, segments are physical, not geometric | Millions of consumer testers |
| Local ancestry inference (RFMix, Gnomix, Orchestra) | Haplotype context across windows, deep learning representations | Per-segment ancestry assignment, admixture chronology, fine-scale resolution | Low to medium, depends on training panel coverage | Tens of thousands per training run |
| ARG inference (Relate, tsinfer, ARG-Needle, Threads, SINGER) | Inferred genealogical histories along the genome | Coalescence times, IBD, recombination, mutation rates, all classical statistics | Very low, missing references produce explicit uncertainty, not artifacts | Up to 500,000 genomes (Threads, 2024) |
| ARG + deep learning (ARGformer, SIA, integrated future) | Learned embeddings on inferred ARGs | Unified ancestry, selection, demography, and admixture dating in a single framework | Very low, with explicit calibration of confidence | Biobank scale and beyond |
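The "Midpoint risk" column is easiest to understand with a toy example. In the sketch below, a target genuinely descends from a source M that sits between references A and B in a two-dimensional PCA-like space. All coordinates and names are hypothetical, and a brute-force search over mixture weights stands in for a real NNLS solver. When M is missing from the panel, the best fit is a clean-looking split between A and B with a small residual.

```python
import math

def simplex_grid(k, steps):
    """Yield every tuple of k non-negative integers that sums to `steps`."""
    if k == 1:
        yield (steps,)
        return
    for first in range(steps + 1):
        for rest in simplex_grid(k - 1, steps - first):
            yield (first,) + rest

def fit_mixture(target, sources, steps=100):
    """Best non-negative, sum-to-one mixture of sources, by grid search.

    Returns ({source_name: weight}, distance between mixture and target).
    """
    names = list(sources)
    best = None
    for counts in simplex_grid(len(names), steps):
        weights = [c / steps for c in counts]
        mix = [sum(w * sources[name][d] for w, name in zip(weights, names))
               for d in range(len(target))]
        dist = math.dist(mix, target)
        if best is None or dist < best[0]:
            best = (dist, dict(zip(names, weights)))
    return best[1], best[0]

# Hypothetical coordinates: the target is drawn entirely from M.
target = (5.0, 0.5)
panel_without_m = {"A": (0.0, 0.0), "B": (10.0, 0.0)}
panel_with_m = {"A": (0.0, 0.0), "B": (10.0, 0.0), "M": (5.0, 0.5)}

weights_ab, resid_ab = fit_mixture(target, panel_without_m)
weights_full, resid_full = fit_mixture(target, panel_with_m)
print(weights_ab, round(resid_ab, 3))
print(weights_full, round(resid_full, 3))
```

Without M the fit reports an even split between A and B, although the target has no ancestry from either. The small residual gives no warning, which is the failure mode the table rates as very high for G25-style NNLS.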
13. Caveats and the rate of change
First, the reference panel is still the bottleneck. The best ARG is only as good as the genomes feeding it. For Eurasian populations, this is no longer a serious constraint. For under-sampled regions (Sub-Saharan Africa, indigenous Americas, Oceania, large parts of South and Southeast Asia), the reference imbalance is severe and will not be fixed by methodological advances alone. It requires sustained sampling effort, which is a political and financial problem, not a technical one.
Second, the consumer industry has commercial incentives to oversimplify. The temptation to compress an ARG-derived ancestry estimate back into a percentage pie chart, because that is what customers expect, is going to remain strong. The methodological revolution may not be visible in customer-facing dashboards for years, even when it is fully deployed in the back-end pipeline.
Third, deep learning brings reproducibility challenges. A coalescent model has interpretable parameters with clear biological meaning. A 200-million-parameter transformer does not. The community will have to develop new norms for what counts as a validated finding when the underlying inference is partly a black box. The peer review system is not yet set up for this.
Fourth, ADMIXTURE, G25, and qpADM will not disappear immediately. They have specific niches (small custom reference panels, specific historical scenarios, hypothesis testing on aDNA) where they will remain the best tools for several more years. Their displacement will be gradual, not abrupt, and the transition period will see hybrid workflows that combine old and new approaches.
Fifth, AI does not eliminate the need for domain expertise. Every paleogenetics paper for the last decade has shown that the same dataset can support multiple plausible historical narratives depending on which references are chosen and which questions are asked. That epistemic situation does not change when the underlying inference engine improves. If anything, more powerful tools make domain expertise more important, not less, because they amplify whatever choices the user makes upstream.
14. Three transversal lessons
The methodological transition described in this article illustrates three general principles of population genetics inference that go beyond any specific tool.
First lesson: information richer than allele frequencies has always been there, waiting to be used. Haplotype structure, IBD segment lengths, the full genealogical history of the genome: none of this was hidden by nature. It was hidden by computational limits and by methodological inertia. The community used allele frequencies for fifteen years not because they were the right summary statistic, but because they were the most tractable one given the compute and the algorithms available. Now that the constraints have eased, the right summary statistic (the ARG, or its IBD-network equivalent) is becoming accessible. This pattern will repeat. The current methods, however sophisticated they look, are still summary statistics of an even richer underlying biology that future methods will exploit.
Second lesson: industry leads, academia formalises. Ancestry, 23andMe and MyHeritage operate at scales that academic labs cannot match. The IBD revolution started in their back-end pipelines, for engineering reasons (relative matching, customer phasing), and was only later formalised in academic publications. ARG-Needle, Threads and SINGER are, in part, the open-science instantiation of what the consumer DNA industry had been doing internally for five years. This pattern is now general in computational biology: the largest datasets are private, the most powerful methods are built on top of them, and academic methodology often follows.
Third lesson: the more powerful the tool, the more important the question. ADMIXTURE produces a single answer to a single fixed question. ARG-based inference can answer any question the user can frame. This is liberating but also dangerous. The choice of question, the choice of references, the choice of historical framing, all become more important, not less, when the inference engine becomes more capable. The next decade of population genetics will be a decade of better tools applied to better-framed questions, but also a decade in which methodological sophistication can no longer hide the importance of historical interpretation. The tools will improve faster than they did between 2010 and 2025. Whether the interpretations keep up depends on how seriously the community takes the qualitative, historical, archaeological context that surrounds every quantitative claim.
15. References
- Alexander, D. H., Novembre, J., Lange, K. (2009). Fast model-based estimation of ancestry in unrelated individuals. Genome Research, 19(9), 1655-1664. DOI: 10.1101/gr.094052.109 [ADMIXTURE Foundational]
- Haak, W., Lazaridis, I., Patterson, N., et al. (2015). Massive migration from the steppe was a source for Indo-European languages in Europe. Nature, 522(7555), 207-211. DOI: 10.1038/nature14317 [qpADM Steppe]
- Speidel, L., Forest, M., Shi, S., Myers, S. R. (2019). A method for genome-wide genealogy estimation for thousands of samples. Nature Genetics, 51, 1321-1329. DOI: 10.1038/s41588-019-0484-x [Relate ARG]
- Kelleher, J., Wong, Y., Wohns, A. W., Fadil, C., Albers, P. K., McVean, G. (2019). Inferring whole-genome histories in large population datasets. Nature Genetics, 51, 1330-1338. DOI: 10.1038/s41588-019-0483-y [tsinfer Tree sequence]
- Wohns, A. W., Wong, Y., Jeffery, B., et al. (2022). A unified genealogy of modern and ancient genomes. Science, 375(6583), eabi8264. DOI: 10.1126/science.abi8264 [Unified genealogy aDNA]
- Zhang, B. C., Biddanda, A., Gunnarsson, A. F., et al. (2023). Biobank-scale inference of ancestral recombination graphs enables genealogical analysis of complex traits. Nature Genetics, 55, 768-776. DOI: 10.1038/s41588-023-01379-x [ARG-Needle UK Biobank]
- Gunnarsson, A. F., Zhang, B. C., Biddanda, A., et al. (2024). A scalable approach for genome-wide inference of ancestral recombination graphs. bioRxiv preprint. DOI: 10.1101/2024.08.31.610248 [Threads 487k genomes]
- Deng, Y., Nielsen, R., Song, Y. S. (2025). Robust and accurate Bayesian inference of genome-wide genealogies for hundreds of genomes. Nature Genetics, 57, 2124-2135. DOI: 10.1038/s41588-025-02317-9 [SINGER Bayesian ARG]
- Hilmarsson, H., Kumar, A. S., Rastogi, R., Bustamante, C. D., Montserrat, D. M., Ioannidis, A. G. (2021). High Resolution Ancestry Deconvolution for Next Generation Genomic Data. bioRxiv preprint. DOI: 10.1101/2021.09.19.460980 [Gnomix Local ancestry]
- Comajoan Cara, M., Mas Montserrat, D., Bustamante, C. D., Ioannidis, A. G. (2025). Tracing human genetic histories and natural selection with precise local ancestry inference. Nature Communications, 16, 4576. DOI: 10.1038/s41467-025-59936-3 [Orchestra Deep learning]
- Hejase, H. A., Mo, Z., Campagna, L., Siepel, A. (2022). A Deep-Learning Approach for Inference of Selective Sweeps from the Ancestral Recombination Graph. Molecular Biology and Evolution, 39(1), msab332. DOI: 10.1093/molbev/msab332 [SIA Selection]
- Wang, J., Robinson, M. R., Skotte, L., et al. (2022). Accurate Genome-Wide Phasing from IBD Data. bioRxiv preprint. Underlying technology of Ancestry SideView. [SideView Industry]
- Han, E., Carbonetto, P., Curtis, R. E., et al. (2017). Clustering of 770,000 genomes reveals post-colonial population structure of North America. Nature Communications, 8, 14238. DOI: 10.1038/ncomms14238 [Genetic Communities Ancestry]
- Browning, S. R., Browning, B. L. (2023). Biobank-scale inference of identity-by-descent and applications. Annual Review of Genomics and Human Genetics. [IBD Methods]
- Pearson, A., Durbin, R. (2023). Local ancestry inference for complex population histories. bioRxiv preprint. DOI: 10.1101/2023.03.06.529121 [AncestralPaths Neural network]
- Lewanski, A. L., Grundler, M. C., Bradburd, G. S. (2026). ARGformer: learning on ancestral recombination graphs with transformers. bioRxiv preprint. DOI: 10.64898/2026.02.11.705405 [ARGformer Transformer]
- Wong, Y., Ignatieva, A., Koskela, J., Gorjanc, G., Wohns, A. W., Kelleher, J. (2025). Tracing the evolutionary histories of ultra-rare variants using variational dating of large ancestral recombination graphs. Genome Biology. [tsdate Variant dating]
- Rasmussen, M. D., Hubisz, M. J., Gronau, I., Siepel, A. (2014). Genome-wide inference of ancestral recombination graphs. PLoS Genetics, 10(5), e1004342. DOI: 10.1371/journal.pgen.1004342 [ARGweaver Foundational]
- Fregel, R., Mendez, F. L., Bokbot, Y., et al. (2018). Ancient genomes from North Africa evidence prehistoric migrations to the Maghreb from both the Levant and Europe. PNAS, 115(26), 6774-6779. DOI: 10.1073/pnas.1800851115 [Cross-Strait Mozabite midpoint]
- Patterson, N., Moorjani, P., Luo, Y., et al. (2012). Ancient admixture in human history. Genetics, 192(3), 1065-1093. DOI: 10.1534/genetics.112.145037 [f-statistics qpADM foundations]
- Lawson, D. J., Hellenthal, G., Myers, S., Falush, D. (2012). Inference of population structure using dense haplotype data. PLoS Genetics, 8(1), e1002453. DOI: 10.1371/journal.pgen.1002453 [ChromoPainter FineSTRUCTURE]
- Hellenthal, G., Busby, G. B. J., Band, G., et al. (2014). A genetic atlas of human admixture history. Science, 343(6172), 747-751. DOI: 10.1126/science.1243518 [GLOBETROTTER Atlas]
- Chacon-Duque, J. C., Adhikari, K., Fuentes-Guajardo, M., et al. (2018). Latin Americans show wide-spread Converso ancestry and imprint of local Native ancestry on physical appearance. Nature Communications, 9, 5388. DOI: 10.1038/s41467-018-07748-z [SOURCEFIND Latin America]
- Browning, S. R., Waples, R. K., Browning, B. L. (2023). Fast, accurate local ancestry inference with FLARE. American Journal of Human Genetics, 110(2), 326-335. DOI: 10.1016/j.ajhg.2022.12.010 [FLARE Haplotype copying]
- Fournier, R., Tsangalidou, Z., Reich, D., Palamara, P. F. (2023). Haplotype-based inference of recent effective population size in modern and ancient DNA samples. Nature Communications, 14, 7945. DOI: 10.1038/s41467-023-43522-6 [HapNe Effective size]