Around four thousand years ago, on the dry grassland east of the Ural mountains, a people learned to yoke horses to a light two-wheeled chariot. They buried their dead with those chariots, with horse sacrifices and fire rituals that read like pages torn from the much later Vedic hymns, and almost every man among them carried the same Y-chromosome: R1a-Z93. Within a few centuries their descendants and their language had crossed Central Asia, climbed the passes into the Iranian plateau, and pushed down through the Swat valley into the Indian subcontinent. Today that one paternal lineage is carried by hundreds of millions of men from the Volga to the Bay of Bengal, reaching thirty to seventy percent of the men in many North Indian groups. R1a-Z93 is the clearest single thread tying the steppe to South Asia, and it tells a story the rest of the genome only half tells, because the lineage travelled much further than the ancestry that carried it.
- R1a-Z93 is the Asian branch of the R1a Y-chromosome. It split from its European sister branch, R1a-Z282, on the steppe roughly 5,000 to 5,500 years ago. Z282 stayed and became the dominant male line of the Slavs and Balts; Z93 went east and south and became the male line of the Indo-Iranians.
- The lineage is anchored in ancient DNA. Nearly every male of the Sintashta culture (about 2100 to 1800 BCE) analysed to date carried R1a-Z93, and the same lineage dominates the related Andronovo and Srubnaya peoples. These are the chariot-building, horse-sacrificing steppe pastoralists most archaeologists identify with the early Indo-Iranians.
- Z93 was not in India before the Bronze Age. Ancient DNA from the Indus Valley Civilisation, including the Rakhigarhi individual, carries Iranian-farmer and indigenous ancestry but no R1a and no steppe ancestry at all. The lineage appears in South Asian remains only after about 2000 to 1500 BCE, exactly when the steppe ancestry arrives.
- The Swat valley records the moment of contact. Iron Age skeletons from northern Pakistan show steppe ancestry climbing over time, from low levels around 1200 BCE toward the modern pattern, the genetic fingerprint of the Indo-Aryan blend forming on the ground.
- Genome-wide, the steppe contribution to South Asians is a minority. A clean three-source Global25 model rebuilds modern South Asians as a mixture of steppe (Sintashta-like), Iranian-farmer-related ancestry, and the deep indigenous Ancient Ancestral South Indian (AASI) substrate. Steppe ancestry runs from roughly forty percent in the far northwest down to under twenty percent in the south.
- But the Y-chromosome outran the genome. In group after group the share of men carrying R1a-Z93 is far higher than the genome-wide steppe fraction, often roughly double it. The Indo-Aryan expansion into South Asia was strongly male-biased: steppe men, steppe language, but mostly indigenous mothers, and a paternal lineage that then expanded explosively.
- The lineage has internal geography. The Central Asian and Iranic carriers (Pashtuns, Kyrgyz, Scythians, Sarmatians) sit mostly on the Z2123 and Z2125 sub-branches; the great South Asian clade is L657, about 4,200 years old, the single largest R1a lineage in India.
- Language and a Y-line travelled together here, but ancestry did not follow in the same proportion. A North Indian Brahmin can be majority indigenous in the genome and still carry, on the father's father's line, an unbroken thread back to a chariot driver on the Bronze Age steppe.
1. One lineage, two roads
The R1a story begins long before the chariots, with a Y-chromosome haplogroup, R1a-M417, that diversified somewhere on or near the Pontic-Caspian steppe a little over five thousand years ago. From its node R1a-Z645 the lineage forked into two great branches that would never meet again. One, R1a-Z282, stayed in the west and rose to dominate Eastern Europe, where it is carried today by around half of all Russian, Polish and Ukrainian men. The other, R1a-Z93, turned east and south, and became the signature paternal lineage of the Indo-Iranian world. The two branches are close cousins, sharing an ancestor only five thousand-odd years back, which is why a Punjabi farmer and a Polish farmer can both be R1a and yet sit on opposite ends of a lineage that split at the very dawn of the Bronze Age.
The diagram below places Z93 on the tree and notes the ancient DNA that pins it down. The crucial facts are simple. Z93 is the eastern branch; it dominates the Sintashta and Andronovo steppe cultures; it is absent from pre-Bronze-Age India; and within it the Central Asian carriers and the South Asian carriers sit on different sub-branches, a fossil record of how the lineage fanned out.
Schematic of the R1a tree. R1a-M417, through its Z645 node, splits into the European Z282 and the Asian Z93 around 5,000 to 5,500 years ago. Z93 is the lineage of the Sintashta, Andronovo and Srubnaya steppe peoples, and of the modern Indo-Iranians; its South Asian carriers cluster on the L657 sub-branch, its Central Asian and Iranic carriers on Z2123 and Z2125. Dates follow the published Y-chromosome literature.
2. Born on the chariot steppe
The homeland of Z93 is not a guess. It is written into the graves of the Sintashta culture, a cluster of fortified settlements that appeared on the southern Ural steppe around 2100 BCE. Sintashta is famous for two things: it produced the earliest known spoke-wheeled chariots, and it buried its dead with horses, weapons and fire rituals whose parallels in the later Vedic and Avestan texts are close enough that most scholars connect these people to the early Indo-Iranians. When their genomes were read, the male line was startlingly uniform. Almost every Sintashta man tested carried R1a-Z93, and the same lineage dominated the sprawling Andronovo horizon that grew out of Sintashta and the Srubnaya culture to the west. A whole world of Bronze Age herders, spread across thousands of kilometres of grassland, shared very largely one paternal lineage.
Autosomally these steppe pastoralists were themselves a recent mixture. A Global25 model rebuilds Sintashta as roughly half Yamnaya steppe, a little under a third European Neolithic farmer, and a fifth Eastern hunter-gatherer, which is to say they were close kin of the Corded Ware people of Europe, carrying the same blend of Yamnaya and farmer ancestry. They were not an ancient, isolated stock; they were the cutting edge of the steppe Bronze Age, mobile, mounted, and about to move.
3. The road south
When the Andronovo world expanded, it pushed in several directions, and the diagram below traces the ones that matter for Z93. South into Central Asia, the steppe peoples met the settled oasis civilisation of the Bactria-Margiana complex, the BMAC, and pressed on. One stream carried the Iranic branch onto the Iranian plateau, where Z93 sub-lineages still mark Pashtuns, Persians and the descendants of the Scythians and Sarmatians. Another stream crossed the mountains into the Swat valley of northern Pakistan and on into the Indian subcontinent, carrying the Indo-Aryan languages and the L657 lineage that would become the most common R1a clade in India.
Schematic of the Sintashta homeland and its outflows. The steppe peoples expanded west as Srubnaya, south through the Andronovo horizon into Central Asia and the BMAC, then onward to the Iranian plateau (the Iranic branch) and over the passes into South Asia (the Indo-Aryan branch). Proportions and routes are schematic; the Y-lineage and the Indo-Iranian languages travelled along these roads together.
Crucially, this happened after the Indus Valley Civilisation. The mature Harappan world had Iranian-farmer-related and indigenous ancestry but, on the evidence so far, no steppe ancestry and no R1a. The steppe blend, and Z93 with it, arrives in the subcontinent only in the second millennium BCE, layered on top of a population that was already there. The Swat valley skeletons capture this directly: steppe ancestry that is low in the early Iron Age and climbs over the following centuries, the slow mathematics of newcomers marrying into a much larger resident population.
4. The autosomal picture: a steppe minority
If you set the Y-chromosome aside and read the whole genome, the steppe shrinks. Modern South Asians are best modelled as a three-way mixture: a steppe component (Sintashta-like), an Iranian-farmer-related component that is the single largest ingredient in most groups, and the deep indigenous AASI substrate, the ancestry of the people who reached India tens of thousands of years before any farmer or herder. The bars below show that mixture across a north-to-south transect, from the ancient Swat valley through the northwest and down to the Tamil south.
Non-negative least squares models on three deep sources: Sintashta (steppe), Iran Neolithic (Ganj Dareh), and Onge as a proxy for the deep AASI substrate. Steppe ancestry falls along a clear geographic gradient, highest in the northwest, lowest in the south; the Iranian-farmer share is the largest ingredient in most northern groups; the indigenous AASI share rises steadily toward the south and east. The split between the steppe and Iranian-farmer slices is the soft part of the model (see section 7), and the southern fits are looser because Onge is an imperfect AASI proxy. Read the figures as directions, not exact percentages.
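The three-source fit described in this caption can be sketched in a few lines. The version below is a minimal illustration, not the solver used for the bars: it swaps non-negative least squares for a brute-force grid search over the three weights so that it runs on the Python standard library alone, and it assumes the weights must be non-negative and sum to one. The function names `parse_rows` and `fit_mixture` and the `steps` parameter are hypothetical; the coordinate rows are copied from the list at the end of this article.

```python
import math

# Scaled Global25 rows copied from the coordinate list in this article.
G25_ROWS = """\
Punjabi_Jatt_Sikh,0.076519,0.006093,-0.097686,0.081063,-0.069065,0.048842,0.002274,0.003878,0.001227,-0.007848,-0.005285,-0.000672,-0.000849,-0.007987,0.012797,0.009080,-0.004690,0.000834,0.002733,-0.009202,-0.001550,-0.005505,0.001344,-0.001664,0.002464
Sintashta_MLBA,0.125277,0.115992,0.057570,0.078509,0.011656,0.028604,0.005493,0.004002,-0.017595,-0.027552,-0.002512,0.001531,-0.003424,-0.021762,0.022313,0.013408,-0.004074,0.000360,-0.000546,-0.000574,-0.006594,0.002206,0.002754,0.007245,-0.004274
Iran_N_GanjDareh,0.044066,0.066155,-0.156344,0.006875,-0.124067,0.022630,0.015679,-0.000395,-0.082102,-0.054827,-0.001021,-0.001713,0.004948,-0.008120,0.033833,0.055801,-0.006482,0.009284,0.009679,-0.035035,0.007540,-0.029800,-0.011973,-0.037320,0.022188
Onge_AASI_proxy,-0.022525,-0.244529,-0.132429,0.095965,0.029933,-0.004756,-0.007644,0.007579,0.054823,0.024439,0.023495,0.003218,-0.004061,0.008475,-0.012693,-0.011145,0.010918,-0.001620,-0.005981,0.028829,-0.003711,0.009690,-0.012824,-0.001123,0.004343
"""


def parse_rows(text):
    """Turn 'name,c1,...,c25' lines into {name: [floats]} (hypothetical helper)."""
    coords = {}
    for line in text.strip().splitlines():
        name, *values = line.split(",")
        coords[name] = [float(v) for v in values]
    return coords


def fit_mixture(target, sources, steps=100):
    """Grid-search the three-source simplex (a stand-in for NNLS).

    Returns (weights, distance): non-negative weights summing to 1 that
    minimise the Euclidean distance between the weighted source average
    and the target. The sum-to-one constraint is an assumption here.
    """
    a, b, c = sources
    best_weights, best_dist = None, math.inf
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            w = (i / steps, j / steps, (steps - i - j) / steps)
            mix = [w[0] * x + w[1] * y + w[2] * z for x, y, z in zip(a, b, c)]
            dist = math.dist(mix, target)
            if dist < best_dist:
                best_weights, best_dist = w, dist
    return best_weights, best_dist


coords = parse_rows(G25_ROWS)
source_names = ["Sintashta_MLBA", "Iran_N_GanjDareh", "Onge_AASI_proxy"]
weights, fit = fit_mixture(
    coords["Punjabi_Jatt_Sikh"], [coords[name] for name in source_names]
)
for name, weight in zip(source_names, weights):
    print(f"{name}: {weight:.0%}")
print(f"fit distance x1000: {fit * 1000:.1f}")
```

With `steps=100` the weights resolve to whole percentages, which is already finer than the model deserves given how soft the steppe and Iranian-farmer split is. A proper NNLS solver should land close to the same answer, but the grid makes the constraint explicit and easy to inspect.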
One pattern in those bars is worth a second look. The North Indian Brahmin here is not the most steppe-shifted group; it sits below the Punjabi Jatt and the Pashtun, and carries more of the indigenous AASI substrate than either. And yet, as the next section shows, North Indian Brahmins carry some of the highest R1a-Z93 frequencies anywhere. The genome and the Y-chromosome are telling two different stories, and the gap between them is the whole point of this lineage.
A distance chart says the same thing another way. The chart below measures how far a typical North Indian group, the Punjabi Jatt, sits from each ancestral pole. The closest is not the steppe at all but the Iranian-farmer base; the steppe sources come next, a real but secondary pull; and the group sits far from both the Anatolian farmers of the west and the deep AASI pole. A North Indian is, in the bulk of the genome, closer to an Iranian farmer than to a Sintashta charioteer, even while carrying that charioteer's Y-chromosome.
Scaled Global25 distance from the Punjabi Jatt average to each ancestral pole, multiplied by 1000. The nearest pole is the Iranian-farmer base; the Yamnaya and Sintashta steppe sources are the next nearest, a genuine but secondary signal; the Anatolian farmer and the AASI proxy are remote. The steppe is present, but it is not the core of the genome.
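The distances in this caption are plain Euclidean distances between 25-dimensional coordinate rows, multiplied by 1000. A minimal sketch of that calculation follows, using rows copied from the coordinate list at the end of this article; the variable names are illustrative only, and the printed values depend on these particular population averages.

```python
import math

# Scaled Global25 rows copied from the coordinate list in this article.
G25_ROWS = """\
Punjabi_Jatt_Sikh,0.076519,0.006093,-0.097686,0.081063,-0.069065,0.048842,0.002274,0.003878,0.001227,-0.007848,-0.005285,-0.000672,-0.000849,-0.007987,0.012797,0.009080,-0.004690,0.000834,0.002733,-0.009202,-0.001550,-0.005505,0.001344,-0.001664,0.002464
Iran_N_GanjDareh,0.044066,0.066155,-0.156344,0.006875,-0.124067,0.022630,0.015679,-0.000395,-0.082102,-0.054827,-0.001021,-0.001713,0.004948,-0.008120,0.033833,0.055801,-0.006482,0.009284,0.009679,-0.035035,0.007540,-0.029800,-0.011973,-0.037320,0.022188
Sintashta_MLBA,0.125277,0.115992,0.057570,0.078509,0.011656,0.028604,0.005493,0.004002,-0.017595,-0.027552,-0.002512,0.001531,-0.003424,-0.021762,0.022313,0.013408,-0.004074,0.000360,-0.000546,-0.000574,-0.006594,0.002206,0.002754,0.007245,-0.004274
Yamnaya_Samara,0.122654,0.088981,0.044110,0.114466,-0.027262,0.045546,0.004027,-0.002379,-0.054728,-0.074654,0.000974,-0.000548,-0.000974,-0.021706,0.036808,0.012134,-0.006488,-0.001747,-0.002514,0.010928,-0.003808,0.001262,0.009800,0.019886,-0.004480
Anatolia_Neolithic,0.118842,0.181041,0.003548,-0.100835,0.051744,-0.046397,-0.005191,-0.007321,0.036861,0.080971,0.009190,0.012064,-0.023279,0.000982,-0.041820,-0.009106,0.021590,0.000697,0.011753,-0.009470,-0.013102,0.006750,-0.004667,-0.003451,-0.005465
Onge_AASI_proxy,-0.022525,-0.244529,-0.132429,0.095965,0.029933,-0.004756,-0.007644,0.007579,0.054823,0.024439,0.023495,0.003218,-0.004061,0.008475,-0.012693,-0.011145,0.010918,-0.001620,-0.005981,0.028829,-0.003711,0.009690,-0.012824,-0.001123,0.004343
"""

# Parse each 'name,c1,...,c25' row into a name -> vector mapping.
coords = {}
for line in G25_ROWS.strip().splitlines():
    name, *values = line.split(",")
    coords[name] = [float(v) for v in values]

# Distance from the target group to every remaining pole, scaled by 1000.
target = coords.pop("Punjabi_Jatt_Sikh")
distances = {name: 1000 * math.dist(target, vec) for name, vec in coords.items()}

# Nearest pole first.
ranked = sorted(distances.items(), key=lambda item: item[1])
for name, value in ranked:
    print(f"{name:20s} {value:6.1f}")
```

Swapping the target row for any other group in the coordinate list gives that group's distance profile. The ranking is more informative than the raw numbers, which shift with the choice of population averages.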
5. The lineage that outran the genome
Here is the heart of the matter. A whole genome is inherited from thousands of ancestors at once, and it averages them out. A Y-chromosome is inherited from one unbroken line of fathers, and it can be amplified or erased by who had sons. In South Asia those two measures diverge sharply, and the divergence is the signature of how the Indo-Aryan expansion actually worked.
The chart below sets the two side by side. For each group it shows the genome-wide steppe ancestry, the modest minority from the bars above, against the share of men who carry R1a-Z93. Among the Pashtuns of the northwest, where some groups such as the Yusufzai reach around eighty percent R1a, the paternal lineage is roughly twice the autosomal steppe fraction. Among North Indian Brahmins, R1a-Z93 climbs toward seventy percent on a genome that is barely a third steppe. The lineage outruns the ancestry everywhere in the north, and only in the deep south do the two fall back toward each other, at low levels.
Genome-wide steppe ancestry (from the Global25 models above) against R1a-Z93 frequency (from the published Y-chromosome literature) for representative groups. In the northwest and among upper castes the Z93 paternal frequency runs far above the autosomal steppe share, the male-biased signature; in the south both fall to low levels. Frequencies are approximate and vary between studies and sub-groups. Since essentially all South Asian R1a is Z93, R1a frequency and Z93 frequency are read here as the same thing.
This is the same kind of asymmetry that marks colonial populations such as Brazil, where European fathers and indigenous and African mothers produced a genome-wide blend with a wildly skewed paternal line. The South Asian case is older and gentler in its details, but the genetics rhyme: a relatively small number of incoming men, marrying into a large resident population, can leave a modest mark on the autosomes and a towering one on the Y-chromosome, especially when their descendants go on to have more sons than average. The Z93 lineage did exactly that. It is not large because the steppe migration was large; it is large because, once it arrived, it expanded, generation after generation, along the male line.
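The arithmetic behind that asymmetry can be made explicit with a toy model. The sketch below assumes a single pulse of admixture followed by neutral inheritance, which is a simplification and not a reconstruction of what happened: autosomes come half from each parent, while the Y-chromosome comes from the father alone. The function name `single_pulse` and its two parameters are hypothetical, and the input shares are illustrative rather than estimates.

```python
def single_pulse(male_migrant_share, female_migrant_share):
    """Expected incoming ancestry after one admixture pulse, neutral thereafter.

    male_migrant_share   -- fraction of founding fathers who are incomers
    female_migrant_share -- fraction of founding mothers who are incomers
    Both parameters are hypothetical inputs for illustration.
    """
    # The Y-chromosome passes from father to son only.
    y_share = male_migrant_share
    # Autosomes are inherited half from the father, half from the mother.
    autosomal_share = (male_migrant_share + female_migrant_share) / 2
    return autosomal_share, y_share


# From a balanced migration to an all-male one.
for fathers, mothers in [(0.30, 0.30), (0.30, 0.10), (0.30, 0.00)]:
    autosomal, y = single_pulse(fathers, mothers)
    print(
        f"fathers {fathers:.0%}, mothers {mothers:.0%} -> "
        f"autosomal {autosomal:.0%}, Y {y:.0%}, ratio {y / autosomal:.2f}"
    )
```

In this model the Y share can be at most twice the autosomal share, and it reaches that ceiling only when every incomer is male. Ratios at or above two, as described for several northern groups, therefore point to something more than sex-biased arrival: the incoming paternal line must also have left more sons than average in later generations.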
6. The Iranic side of the family
South Asia takes the spotlight because the lineage is so common there, but Z93 is the Indo-Iranian lineage, and the Iranian half of that pair carries it too. The same Sintashta-Andronovo source that fed the Indo-Aryans fed the Iranic peoples, and Z93 sub-lineages run through Pashtuns, Persians, Tajiks, and the Iron Age Scythians and Sarmatians who ranged across the whole steppe from the Altai to the Danube. The Central Asian carriers sit largely on the Z2123 and Z2125 sub-branches, distinct from the South Asian L657, so the lineage's internal tree still preserves the split between the two arms of the Indo-Iranian world. When a Pashtun and a Bengali Brahmin both test R1a-Z93, they are reading the same Bronze Age chapter from two different pages.
7. When the proxy is soft
The usual caution applies, and here it is specific. The steppe source, Sintashta, is itself part Caucasus-hunter-gatherer through its Yamnaya ancestry, and the Iranian-farmer source carries closely related ancestry. Because the two sources share that component, a model will shuffle ancestry between them, and Global25 least squares in particular tends to assign a generous slice to the steppe. The genome-wide steppe figures in this article are therefore best read as estimates that lean high; the rigorous published studies, using formal statistical tests, generally land a little lower, with steppe ancestry in the mid-teens to high twenties for most groups rather than the high thirties. The headline frequencies for the Y-chromosome, by contrast, come straight from the Y-chromosome literature and are not affected by this softness.
What survives every caution is the shape of the thing, and it is not subtle. There is a real steppe ancestry cline that runs high in the northwest and fades to the south. There is a real, deep, dominant indigenous and Iranian-farmer substrate underneath it. And there is a Y-chromosome, R1a-Z93, that is far more common than the steppe ancestry that brought it, distributed exactly as a male-biased Bronze Age expansion would predict. Those three facts are confirmed by formal published genomics; the Global25 models here are a clear, reproducible picture of their shape, not a precise accounting to the last point.
Claim and reality
Claim: R1a-Z93 is native to India and spread outward from there.
Reality: Z93 is absent from pre-2000 BCE South Asian remains, including the Indus Valley. It is found in nearly every analysed Sintashta and Andronovo steppe male, and its diversity is older in Central Asia. The lineage entered the subcontinent from the steppe in the Bronze Age.
Claim: A high R1a-Z93 frequency means a population is mostly steppe in ancestry.
Reality: No. North Indian Brahmins can be majority indigenous and Iranian-farmer-related in the genome while reaching seventy percent Z93. A Y-chromosome is one ancestral line; it can be common without the genome-wide ancestry being large.
Claim: The steppe migration into India was a balanced movement of whole families.
Reality: It was strongly male-biased. The Z93 paternal line runs far above the genome-wide steppe ancestry, while steppe maternal lineages are rare. The pattern is incoming men marrying into a large resident population.
Claim: R1a in India and R1a in Europe are basically unrelated.
Reality: They are close cousins. The Asian Z93 and the European Z282 split from a single ancestor, R1a-Z645, only about 5,000 to 5,500 years ago. A Punjabi and a Pole on the R1a tree share a Bronze Age grandfather.
Claim: Carrying Z93 tells you what you look like or which culture you belong to.
Reality: It tells you one thing only: your direct paternal line traces to the Bronze Age steppe. Appearance, the rest of the genome, and culture are all independent of it. The lineage is a thread of descent, nothing more.
Reproduce it yourself
Paste these scaled Global25 coordinates into Vahaduo to reproduce the distances and models above. The modern South Asian groups and the ancient Swat valley sample are the targets; Sintashta, Andronovo and Yamnaya are the steppe sources, Iran Neolithic the Iranian-farmer source, Onge the AASI proxy, with Anatolian Neolithic, EHG and Yamnaya available for characterising the steppe itself. All coordinates are on the same Global25 scale.
Kalash,0.083883,0.024991,-0.084032,0.066594,-0.071679,0.040306,0.003116,0.002017,-0.030928,-0.025157,-0.005592,-0.000495,-0.002392,-0.010902,0.016611,0.008762,-0.013821,0.002104,0.000672,-0.012832,-0.003922,-0.005392,0.002862,-0.003426,0.003353
Pashtun_Yusufzai,0.080078,0.019624,-0.089322,0.059204,-0.065189,0.037347,0.003539,0.002694,-0.013968,-0.012397,-0.005020,0.000101,-0.000988,-0.009043,0.010642,0.013883,-0.000732,0.000529,0.001287,-0.008692,-0.003134,-0.005961,0.002508,-0.003438,0.003596
Punjabi_Jatt_Sikh,0.076519,0.006093,-0.097686,0.081063,-0.069065,0.048842,0.002274,0.003878,0.001227,-0.007848,-0.005285,-0.000672,-0.000849,-0.007987,0.012797,0.009080,-0.004690,0.000834,0.002733,-0.009202,-0.001550,-0.005505,0.001344,-0.001664,0.002464
Punjabi_Khatri,0.073925,0.005773,-0.104859,0.071213,-0.072515,0.045679,0.003339,0.004336,-0.008568,-0.009064,-0.005863,0.000868,0.000196,-0.008330,0.010193,0.011221,-0.002491,0.000673,0.000059,-0.012282,-0.001734,-0.007361,0.001952,-0.001953,0.001153
UP_Brahmin,0.065733,-0.058901,-0.108516,0.085393,-0.061319,0.051351,0.001792,0.009346,0.013831,0.004237,-0.015650,-0.000880,-0.000743,-0.002993,0.003614,-0.001143,-0.002673,-0.000364,-0.002765,-0.004487,0.000016,0.000371,0.002373,0.001973,-0.000464
Gujarati,0.055773,-0.050795,-0.151536,0.107511,-0.079972,0.062951,-0.000859,0.011672,0.031218,0.016974,-0.006462,0.000712,-0.001946,-0.000104,0.001233,0.000291,-0.001161,0.000097,0.000053,-0.005594,0.002318,-0.003944,0.001945,0.001035,-0.003704
Tamil_Brahmin_Iyer,0.053070,-0.045868,-0.143416,0.100547,-0.072385,0.057521,0.000940,0.010230,0.029426,0.016295,-0.003011,-0.001049,-0.000434,-0.001055,0.001691,0.002812,0.002255,-0.000465,0.000330,-0.002918,0.001061,-0.002107,0.000118,0.000838,-0.001881
Tamil_Nadu_Dalit,0.025917,-0.113739,-0.189402,0.128555,-0.076085,0.065647,-0.001645,0.015319,0.067603,0.046652,-0.004996,0.002017,-0.002413,0.011984,-0.012079,-0.009822,0.008615,0.000244,-0.001421,0.006474,0.005270,0.002378,-0.001356,0.007527,-0.002828
Pakistan_IA_Swat_Loebanr,0.067643,-0.000987,-0.111563,0.066520,-0.082512,0.043977,0.003243,0.001154,-0.018355,-0.014641,-0.005804,0.001302,-0.003993,-0.006681,0.012180,0.011573,-0.002712,0.000832,0.000927,-0.010212,-0.001993,-0.006342,0.000595,-0.006269,0.003873
Sintashta_MLBA,0.125277,0.115992,0.057570,0.078509,0.011656,0.028604,0.005493,0.004002,-0.017595,-0.027552,-0.002512,0.001531,-0.003424,-0.021762,0.022313,0.013408,-0.004074,0.000360,-0.000546,-0.000574,-0.006594,0.002206,0.002754,0.007245,-0.004274
Andronovo_MLBA,0.123742,0.119397,0.052258,0.074013,0.011694,0.029841,0.007419,0.004352,-0.013995,-0.025539,-0.002900,0.000471,-0.004651,-0.020958,0.024158,0.009546,-0.006184,-0.000815,-0.001167,0.001197,-0.005062,0.001660,-0.002923,0.003408,-0.004738
Yamnaya_Samara,0.122654,0.088981,0.044110,0.114466,-0.027262,0.045546,0.004027,-0.002379,-0.054728,-0.074654,0.000974,-0.000548,-0.000974,-0.021706,0.036808,0.012134,-0.006488,-0.001747,-0.002514,0.010928,-0.003808,0.001262,0.009800,0.019886,-0.004480
Iran_N_GanjDareh,0.044066,0.066155,-0.156344,0.006875,-0.124067,0.022630,0.015679,-0.000395,-0.082102,-0.054827,-0.001021,-0.001713,0.004948,-0.008120,0.033833,0.055801,-0.006482,0.009284,0.009679,-0.035035,0.007540,-0.029800,-0.011973,-0.037320,0.022188
Onge_AASI_proxy,-0.022525,-0.244529,-0.132429,0.095965,0.029933,-0.004756,-0.007644,0.007579,0.054823,0.024439,0.023495,0.003218,-0.004061,0.008475,-0.012693,-0.011145,0.010918,-0.001620,-0.005981,0.028829,-0.003711,0.009690,-0.012824,-0.001123,0.004343
Anatolia_Neolithic,0.118842,0.181041,0.003548,-0.100835,0.051744,-0.046397,-0.005191,-0.007321,0.036861,0.080971,0.009190,0.012064,-0.023279,0.000982,-0.041820,-0.009106,0.021590,0.000697,0.011753,-0.009470,-0.013102,0.006750,-0.004667,-0.003451,-0.005465
EHG_Samara,0.118945,0.044175,0.123507,0.201069,-0.003077,0.054245,-0.016216,-0.021229,-0.008795,-0.081186,0.012179,-0.015511,0.028320,-0.039911,0.023818,0.017502,-0.004302,-0.002217,-0.007919,0.014257,-0.007861,0.018363,0.007641,-0.024462,-0.005449
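For anyone who prefers a script to the Vahaduo interface, the rows above are simple comma-separated text: a population label followed by 25 coordinates. The sketch below shows one way to load them; `parse_g25` and its `dims` parameter are hypothetical names, and only two rows are embedded to keep the example short.

```python
def parse_g25(text, dims=25):
    """Parse Vahaduo-style rows ('name,c1,...,c25') into {name: [floats]}."""
    coords = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between rows
        name, *values = line.split(",")
        if len(values) != dims:
            raise ValueError(
                f"line {line_no}: expected {dims} coordinates, got {len(values)}"
            )
        coords[name] = [float(v) for v in values]
    return coords


# Two rows copied from the list above; paste the full list in the same way.
sample = """\
Sintashta_MLBA,0.125277,0.115992,0.057570,0.078509,0.011656,0.028604,0.005493,0.004002,-0.017595,-0.027552,-0.002512,0.001531,-0.003424,-0.021762,0.022313,0.013408,-0.004074,0.000360,-0.000546,-0.000574,-0.006594,0.002206,0.002754,0.007245,-0.004274
Onge_AASI_proxy,-0.022525,-0.244529,-0.132429,0.095965,0.029933,-0.004756,-0.007644,0.007579,0.054823,0.024439,0.023495,0.003218,-0.004061,0.008475,-0.012693,-0.011145,0.010918,-0.001620,-0.005981,0.028829,-0.003711,0.009690,-0.012824,-0.001123,0.004343
"""

coords = parse_g25(sample)
print(sorted(coords))
```

The length check is there because a dropped or doubled comma silently shifts every later dimension, and the resulting distances would still look plausible.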
References and sources
- 1 Narasimhan, V. M., Patterson, N., Moorjani, P., et al. The formation of human populations in South and Central Asia. Science 365, eaat7487 (2019). The landmark ancient-DNA study of the region: 523 ancient individuals, the three-way (AASI, Iranian-farmer, Steppe_MLBA) model of South Asians, the Swat valley steppe gradient, the absence of steppe ancestry in the Indus Valley, and the steppe Y-chromosome signal.
- 2 Underhill, P. A., Poznik, G. D., Rootsi, S., et al. The phylogenetic and geographic structure of Y-chromosome haplogroup R1a. European Journal of Human Genetics 23, 124-131 (2015). The reference phylogeny of R1a: the Z282 / Z93 split, the Z2123, Z2125 and M780 sub-branches, and the frequency map, including the very high R1a frequencies among Pashtun groups such as the Yusufzai.
- 3 Allentoft, M. E., Sikora, M., Sjögren, K.-G., et al. Population genomics of Bronze Age Eurasia. Nature 522, 167-172 (2015). Genome-wide data from Sintashta, Andronovo and Srubnaya, showing their Corded-Ware-like autosomal profile and their R1a steppe Y-lineages, and linking them to the later spread into Asia.
- 4 Silva, M., Oliveira, M., Vieira, D., et al. A genetic chronology for the Indian Subcontinent points to heavily sex-biased dispersals. BMC Evolutionary Biology 17, 88 (2017). Dates the R1a-Z93 and L657 expansions and argues that the Bronze Age dispersals into South Asia were strongly male-biased, with the paternal R1a signal far exceeding the maternal one.
- 5 Haak, W., Lazaridis, I., Patterson, N., et al. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522, 207-211 (2015). Establishes the steppe (Yamnaya) source of the R1a and R1b expansions and the Late Neolithic / Bronze Age rise of these lineages, the western half of the same process.
- 6 Global25 coordinates: Davidski (Eurogenes), with modern and ancient population averages from the public Global25 datasheets and the Moriopoulos 2025 collection. Global25 spreadsheet tooling: Vahaduo.
Modern and ancient Global25 coordinates: Davidski (Global25), with population averages from the public Global25 datasheets and the Moriopoulos 2025 collection. The South Asian points are named modern population averages; the ancient Swat point is the Loebanr Iron Age average; the steppe, farmer, hunter-gatherer and AASI points are named published averages, with Onge used as a deep proxy for the AASI substrate. Global25 spreadsheet tooling: Vahaduo. Analysis: scaled Global25 Euclidean distances and non-negative least squares modelling in Python. Ancestry fractions are proxy-dependent and best read as directions rather than exact percentages; the steppe-versus-Iranian-farmer split is the soft part of the modelling, because the two sources share Caucasus-hunter-gatherer ancestry, and Global25 tends to lean the steppe share high relative to the formal published figures. The R1a-Z93 frequencies are taken from the published Y-chromosome literature cited above. Y-chromosome lineages are single inherited lines and carry no information about appearance, character or worth.