The Roma, known as Gitanos in Spain, Tziganes in France, and Zigeuner in Germany, are among the most genetically distinct populations in Europe. Despite a thousand years of European presence, their autosomal DNA continues to carry an unmistakable signature from the Indian subcontinent. This article uses G25 coordinates from the Moriopoulos 2025 modern collection to model the ancestry of nine Roma diaspora populations across the Balkans, Central Europe, and the Iberian Peninsula, and to quantify both their shared South Asian heritage and the variable degree of local European admixture accumulated along their long journey west.

Key Findings

  • All Roma populations retain between 18% and 42% South Asian ancestry, detectable even in populations that have lived in Iberia for over 600 years.
  • Balkan Roma show the highest South Asian component (~39–42%), consistent with their position closest to the ancestral entry point into Europe.
  • Iberian Roma (Gitanos) show the strongest local admixture, with up to 49% Iberian ancestry, yet retain a clearly non-European signal absent in any other Spanish population.
  • A Middle Eastern / Iranian “passage layer” is visible in most populations, reflecting the historical route through Persia and the Caucasus before entering Byzantine territory.
  • The South Asian proxy most consistently selected by the NNLS model is Gujarati-like, likely reflecting populations from the northwest of the Indian subcontinent with higher Iranian-related ancestry relative to more eastern South Asian groups.
  • Y-DNA haplogroup H-M82 (H1a1a) and mtDNA haplogroup M5a1, both exclusively South Asian, are the most common lineages among Roma men and women respectively.

I. Historical Migration Route

The Roma are not a monolithic group but a cluster of related populations who share a common origin in northwestern India, most likely in the region spanning today's Punjab, Rajasthan and Sindh. The earliest genetic divergence from South Asian sister populations is estimated between the 5th and 11th centuries CE, with most genomic studies converging on a departure from the subcontinent around the 9th–10th century CE.

From the Indian subcontinent, ancestral Roma populations moved into Persia and the Caucasus, a passage clearly reflected in the autosomal signal. Byzantine Greek sources mention a group called the Atsingani in 9th–11th century Anatolia, widely considered the first historical reference to Roma in the Mediterranean world. By the 13th century they were documented in the Balkans, and by the 15th century they had spread across most of Western Europe, reaching Spain (1425 CE), England (~1500 CE) and Scandinavia.

Ancestral Migration Route

  • NW India (Punjab / Rajasthan): 9th–10th c. CE
  • Persia / Afghanistan: ~10th c. CE
  • Armenia / Caucasus: ~10th–11th c. CE
  • Byzantine Anatolia: ~11th c. CE
  • Balkans: 12th–14th c. CE
  • Western Europe: 14th–15th c. CE

This route is not merely a historical narrative: it leaves a layered genetic trace that remains partially visible in modern Roma populations and that the G25 model below attempts to decompose systematically.

  • Indian Departure (~9th–10th c.): Small founding group leaves NW India; major bottleneck explaining the unusual genetic homogeneity of modern Roma across continents.
  • Persian Passage (~10th–12th c.): Transit through Khorasan, Iran and the Caucasus; partial admixture with Iranian and Armenian populations leaves a detectable autosomal layer.
  • Byzantine Settlement (~12th–14th c.): Extended presence in the Byzantine Empire (Anatolia, Greece, Balkans) produces the strong Greek-Balkan admixture signal dominant in Balkan Roma today.
  • European Diaspora (14th–16th c.): Dispersal across Western Europe; each local group acquires a variable proportion of the host population's ancestry while retaining the South Asian core.

II. Haplogroup Signatures of South Asian Origin

Before examining the autosomal G25 results, it is worth noting the uniparental haplogroup evidence, which provides the clearest possible signal of South Asian ancestry and is entirely independent of admixture modeling.

  • Y-DNA: H-M82. Found in ~40–60% of Roma men across all diasporas. Haplogroup H1a1a is exclusive to South Asia and is the single most diagnostic genetic marker of Roma identity. It is virtually absent in non-Roma European populations.
  • mtDNA: M5a1. The most frequent Roma mitochondrial haplogroup. Like H-M82, M5a1 is a South Asian-specific lineage absent in West Eurasia and confirms a female founding population from the Indian subcontinent.
  • Founder event: ~300–1,000. Estimated effective size of the Roma founding population based on runs-of-homozygosity and haplotype sharing (Morar et al. 2004; Mendizabal et al. 2012). This extreme bottleneck explains the remarkable genetic cohesion of Roma worldwide.

III. G25 Admixture Results by Diaspora

The G25 coordinates below were extracted from the Moriopoulos 2025 Modern Population Collection (no-simulations averages). Admixture modeling was performed using NNLS (non-negative least squares) in the G25 scaled space, with five macro-level source categories:

  • South Asian (proxy: Gujarati, Jat_Haryana, Punjabi_Hindu_India, Sindhi), representing the ancestral Indian source population.
  • Balkan / SE European (Greek Peloponnese, Greek Macedonia, Bulgarian, Romanian, Macedonian), reflecting Byzantine and post-Byzantine host admixture.
  • Iberian (Spanish Andalucia, Extremadura, Cataluña), relevant for the Gitano populations of Spain and Portugal.
  • Iranian / MENA (Iranian Persian, Iranian Zoroastrian, Armenian), the Caucasus and Persian passage layer.
  • Turkish / Ottoman (Turkish Rumeli), Ottoman Balkan admixture visible in some groups.
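
The fitting step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the calculator's actual implementation: it swaps a dedicated NNLS solver for plain projected gradient descent onto the probability simplex, it assumes the weights are constrained to be non-negative and to sum to 100%, and the three source vectors are invented 3-dimensional toy values rather than real G25 coordinates. The function names (`project_to_simplex`, `mix`, `fit_mixture`) are hypothetical.

```python
import math

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) == 1}."""
    u = sorted(v, reverse=True)
    running, theta = 0.0, 0.0
    for i, u_i in enumerate(u, start=1):
        running += u_i
        candidate = (running - 1.0) / i
        if u_i - candidate > 0:
            theta = candidate  # the last index passing the test fixes the shift
    return [max(x - theta, 0.0) for x in v]

def mix(weights, sources):
    """Weighted average of the source vectors, dimension by dimension."""
    return [sum(w * s[d] for w, s in zip(weights, sources))
            for d in range(len(sources[0]))]

def fit_mixture(target, sources, iterations=2000):
    """Return (weights, distance) minimising |mix(weights) - target|."""
    k = len(sources)
    # Conservative step size: 1 / (2 * squared Frobenius norm) never exceeds
    # the inverse Lipschitz constant of the least-squares gradient.
    step = 1.0 / (2.0 * sum(x * x for s in sources for x in s))
    weights = [1.0 / k] * k
    for _ in range(iterations):
        resid = [m - t for m, t in zip(mix(weights, sources), target)]
        grad = [2.0 * sum(r * x for r, x in zip(resid, s)) for s in sources]
        weights = project_to_simplex(
            [w - step * g for w, g in zip(weights, grad)])
    resid = [m - t for m, t in zip(mix(weights, sources), target)]
    distance = math.sqrt(sum(r * r for r in resid))
    return weights, distance

# Toy 3-dimensional "sources" (invented values, NOT real G25 coordinates).
toy_sources = [
    [0.10, 0.02, 0.00],
    [0.01, 0.10, 0.03],
    [0.00, 0.02, 0.10],
]
true_weights = [0.6, 0.3, 0.1]
toy_target = mix(true_weights, toy_sources)  # an exact 60/30/10 blend

weights, distance = fit_mixture(toy_target, toy_sources)
print([round(100 * w, 1) for w in weights], round(distance, 5))
```

Because the toy target is built as an exact blend of the sources, the fit should recover roughly 60/30/10 with a distance near zero. Real targets never sit exactly on the span of their sources, which is why each result in this article is reported together with a non-zero fit distance.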

You can run your own G25 coordinates against these Roma population averages or model Roma populations on Calculator #186, The Migration Era Calculator (Age of the Huns, Scaled) available on this site.

Roma, Balkans (Bosnia-Herzegovina) n=3 • Distance: 0.00289
  • Balkan / SE European (Greek-Peloponnese, Greek-Macedonia): 58.9%
  • South Asian (Gujarati proxy): 39.0%
  • Iranian / MENA (Armenian passage): 2.1%

Roma, Czechia n=4 • Distance: 0.00134
  • Balkan / SE European (Greek-Peloponnese, Bulgarian): 41.6%
  • South Asian (Gujarati proxy): 37.5%
  • Iranian / MENA (Armenian): 8.3%
  • Turkish / Ottoman: 7.9%
  • Iberian: 4.8%

Roma, Serbia n=1 • Distance: 0.00411
  • Balkan / SE European (Bulgarian, Greek): 55.2%
  • Turkish / Ottoman: 23.1%
  • South Asian: 18.1%
  • Iberian: 3.7%

Roma, Turkey / Balkans n=1 • Distance: 0.00528
  • Balkan / SE European (Greek-Peloponnese): 45.8%
  • South Asian (Gujarati proxy): 41.8%
  • Iranian / MENA (Armenian passage): 12.4%

Note on the Serbian and Turkey/Balkans results: Both populations are based on a single individual (n=1), meaning individual-level variance may distort macro-component proportions. They should be interpreted with greater caution than the multi-individual averages.

Iberian Roma: The Gitanos

The Iberian Roma, collectively known as Gitanos, represent one of the most studied Roma communities in Europe. Their entry into Spain is historically documented to 1425 CE, when King Alfonso V of Aragon granted a safe-conduct to a group of Roma pilgrims. Over the following centuries they spread throughout the Iberian Peninsula. The genetic results below show that six centuries of Iberian residence have significantly increased their local admixture, yet their South Asian ancestry remains clearly detectable and sharply distinguishes them from all non-Roma Spanish populations.

Roma, Barcelona (Cataluña) n=6 • Distance: 0.00199
  • Iberian (Andalucia + Cataluña): 48.7%
  • Balkan / SE European (Greek): 27.1%
  • South Asian (Gujarati proxy): 20.1%
  • Turkish / Ottoman: 4.1%

Roma, Bilbao (Basque Country) n=8 • Distance: 0.00346
  • Balkan / SE European: 48.5%
  • South Asian: 34.5%
  • Iberian (Andalucia): 17.0%

Roma, Granada (Andalucia) n=7 • Distance: 0.00341
  • South Asian: 35.8%
  • Balkan / SE European: 32.5%
  • Iberian (Andalucia): 21.6%
  • Iranian / MENA (Armenian): 10.1%

Roma, Madrid n=4 • Distance: 0.00320
  • Balkan / SE European: 35.8%
  • South Asian: 33.6%
  • Iberian (Andalucia): 21.7%
  • Iranian / MENA (Armenian): 8.9%

Roma, Porto (Portugal) n=4 • Distance: 0.00365
  • Balkan / SE European: 45.7%
  • South Asian: 35.5%
  • Iberian (Andalucia + Extremadura): 16.3%
  • Iranian / MENA (Armenian): 2.5%

IV. Comparative Table: All Roma Populations

Population | n | South Asian | Balkan / SE Eur. | Iberian | Iranian / MENA | Turkish | Distance
Roma, Turkey / Balkans | 1 | 41.8% | 45.8% | - | 12.4% | - | 0.00528
Roma, Bosnia-Herzegovina | 3 | 39.0% | 58.9% | - | 2.1% | - | 0.00289
Roma, Czechia | 4 | 37.5% | 41.6% | 4.8% | 8.3% | 7.9% | 0.00134
Roma, Granada (Andalucia) | 7 | 35.8% | 32.5% | 21.6% | 10.1% | - | 0.00341
Roma, Porto (Portugal) | 4 | 35.5% | 45.7% | 16.3% | 2.5% | - | 0.00365
Roma, Bilbao (Basque Country) | 8 | 34.5% | 48.5% | 17.0% | - | - | 0.00346
Roma, Madrid | 4 | 33.6% | 35.8% | 21.7% | 8.9% | - | 0.00320
Roma, Barcelona | 6 | 20.1% | 27.1% | 48.7% | - | 4.1% | 0.00199
Roma, Serbia | 1 | 18.1% | 55.2% | 3.7% | - | 23.1% | 0.00411

A Note on Single-Sample Populations

Roma_Serbia and Roma_Turkey_Balkans are each based on a single sequenced individual (n=1). Individual Roma genomes vary considerably due to the combined effects of the historical bottleneck and variable admixture with local host populations. These two entries should be treated as indicative of the range of variation rather than as stable population averages. The populations with n ≥ 3 (Bosnia-Herzegovina, Czechia, Barcelona, Bilbao, Granada, Madrid, and Porto) provide more statistically robust estimates.
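
One way to see how much the two single-individual entries matter is to compare a sample-size-weighted mean of the South Asian component with and without them. The sketch below is an illustrative summary, not part of the original analysis: weighting by n is an assumption made here for demonstration, and the helper name `weighted_mean` is hypothetical. The percentages are copied from the comparative table.

```python
# (population, n, South Asian %) copied from the comparative table.
rows = [
    ("Turkey / Balkans", 1, 41.8),
    ("Bosnia-Herzegovina", 3, 39.0),
    ("Czechia", 4, 37.5),
    ("Granada", 7, 35.8),
    ("Porto", 4, 35.5),
    ("Bilbao", 8, 34.5),
    ("Madrid", 4, 33.6),
    ("Barcelona", 6, 20.1),
    ("Serbia", 1, 18.1),
]

def weighted_mean(selection):
    """Mean South Asian percentage, weighting each population by its n."""
    total_n = sum(n for _, n, _ in selection)
    return sum(n * pct for _, n, pct in selection) / total_n

multi_sample = [row for row in rows if row[1] > 1]

all_mean = weighted_mean(rows)
multi_mean = weighted_mean(multi_sample)
multi_range = (min(p for _, _, p in multi_sample),
               max(p for _, _, p in multi_sample))

print(f"all populations:   {all_mean:.1f}%")
print(f"multi-sample only: {multi_mean:.1f}% "
      f"(range {multi_range[0]}% to {multi_range[1]}%)")
```

Both means land near 33%, so the n=1 entries barely move the average. What they change is the range: the overall extremes of 18.1% and 41.8% both come from single individuals, while the multi-sample groups span 20.1% to 39.0%.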

V. Interpretation: What the Results Tell Us

The South Asian Persistence

The most striking result across all nine Roma populations is the consistent presence of a South Asian genetic component ranging from approximately 18% to 42%. This is remarkable for a population that has been resident in Europe for up to a millennium. For comparison, no other European population, including Sephardic Jewish communities whose presence in Iberia pre-dates the Roma, shows any detectable South Asian autosomal signal.

The NNLS model consistently selects a Gujarati-like proxy to represent the South Asian component rather than more easterly or more purely South Asian populations (Bengali, Dravidian). This is consistent with what population genetics research suggests about the ancestral Roma homeland: a Northwest Indian population with relatively high Iranian-related ancestry, as found today among communities from Punjab, Rajasthan, and Gujarat, particularly among upper-caste and Rajput-related groups. Some researchers have specifically proposed Rajasthan as the most likely departure point.

The Balkan Layer: Byzantine Admixture

Across almost all Roma populations, including most of those now living in Iberia or Central Europe, a strong Balkan / Greek component forms the bulk of the non-South-Asian ancestry; Barcelona, where the Iberian component is larger, is the one exception. This is not the result of recent mixing with Balkan populations, but rather reflects the extended period, roughly 200 to 400 years, that ancestral Roma spent within the Byzantine Empire before beginning their westward dispersal. The consistent selection of the Greek Peloponnese and Greek Macedonia proxies is highly coherent with this history.

The implication is significant: most of what is “European” in Roma ancestry is Byzantine-Greek-Balkan, not the ancestry of the western European host populations they later settled among. In the models above, every Iberian Roma group except Barcelona carries more Balkan-related ancestry than Iberian ancestry.
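
The claim can be checked directly against the comparative table by computing, for each population, the Balkan / SE European share of the non-South-Asian ancestry. The snippet below is a small illustrative calculation rather than part of the original modeling; the helper name `balkan_share` is hypothetical, and components the model did not select are entered as 0.0.

```python
# (population, South Asian, Balkan / SE Eur., Iberian, Iranian / MENA, Turkish)
# Percentages copied from the comparative table; 0.0 = component not selected.
rows = [
    ("Turkey / Balkans", 41.8, 45.8, 0.0, 12.4, 0.0),
    ("Bosnia-Herzegovina", 39.0, 58.9, 0.0, 2.1, 0.0),
    ("Czechia", 37.5, 41.6, 4.8, 8.3, 7.9),
    ("Granada", 35.8, 32.5, 21.6, 10.1, 0.0),
    ("Porto", 35.5, 45.7, 16.3, 2.5, 0.0),
    ("Bilbao", 34.5, 48.5, 17.0, 0.0, 0.0),
    ("Madrid", 33.6, 35.8, 21.7, 8.9, 0.0),
    ("Barcelona", 20.1, 27.1, 48.7, 0.0, 4.1),
    ("Serbia", 18.1, 55.2, 3.7, 0.0, 23.1),
]

def balkan_share(row):
    """Balkan fraction of all non-South-Asian components in one row."""
    _, _south_asian, balkan, iberian, iranian, turkish = row
    return balkan / (balkan + iberian + iranian + turkish)

shares = {row[0]: balkan_share(row) for row in rows}
below_half = [name for name, share in shares.items() if share < 0.5]

for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {100 * share:5.1f}%")
print("Balkan share below 50%:", below_half)
```

With these figures the Balkan component makes up more than half of the non-South-Asian ancestry in eight of the nine populations, narrowly so in Granada and Madrid, and falls below half only in Barcelona.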

The Iberian Gradient

Among Iberian Roma populations, a wide range of Iberian admixture emerges. The Roma of Barcelona show the highest Iberian absorption (~49%), while those of Madrid (~22%), Granada (~22%), Bilbao (~17%), and Porto (~16%) show lower but still significant Iberian components. This variation may reflect differences in the degree of endogamy practiced by different local Gitano communities, as well as possible differences in when or how they settled into particular regions.

Interestingly, despite being based in Andalucia, the region of Spain most associated historically with Gitano culture, Roma from Granada do not show the highest Iberian proportion. The Granada sample retains a notably large Iranian/MENA component (~10%), possibly reflecting greater preservation of the older pre-Iberian ancestry in this specific community.

The Iranian / Armenian Passage Layer

A Middle Eastern / Caucasian layer ranging from approximately 2% to 12% is detectable in most Roma populations. This “passage layer” is consistently captured by Armenian or Iranian Zoroastrian proxies, both of which genetically approximate the populations ancestral Roma would have encountered during their transit through the Caucasus and Persia. This component is most visible in Turkey / Balkans (12.4%), Granada (10.1%), Madrid (8.9%), and Czechia (8.3%).

The survival of this signal after so many generations of European residence suggests either that it was large enough in the founding Roma population to remain detectable, or that some ongoing admixture with MENA-related populations (including Ottoman Turks and Middle Eastern traders) reinforced it during the Balkan centuries.

Common Misconception

“Roma have lived in Europe so long that they are genetically European, their South Asian ancestry is negligible.”

Genetic Reality

Even after 600 years in Iberia, Gitanos retain 20–36% South Asian ancestry, more South Asian ancestry than any non-Roma population anywhere in Western Europe. Their Indian heritage is not “negligible” in any genomic sense.

Common Misconception

“Gitanos must be closely related to Egyptian or North African populations, hence the old name ‘Gypsies’.”

Genetic Reality

The term “Gypsy” derives from a medieval European misidentification of Roma as migrants from “Little Egypt.” Genetically, Roma have no special affinity with North African or Egyptian populations. Their ancestry traces to India via the Caucasus and the Byzantine world.

VI. G25 Coordinates: Roma Diaspora Populations

The following G25 coordinates are drawn from the Moriopoulos 2025 Modern Population Collection. Paste them as target populations in Vahaduo or use them as reference points when running your own analysis on Calculator #186.

Roma_Balkans (Bosnia-Herzegovina, n=3)
Roma_Balkans_Bosnia,0.091059,0.063640,-0.058831,0.024979,-0.018772,0.013108,0.000392,0.002539,0.016703,0.014882,0.000379,-0.000350,0.000942,0.008762,-0.010541,-0.005215,0.008866,0.003125,0.001508,0.001001,-0.002080,0.000124,-0.000657,-0.004016,-0.001277
Roma_Czechia (n=4)
Roma_Czechia,0.087419,0.059210,-0.056244,0.024590,-0.017033,0.013238,0.001951,0.004357,0.011212,0.012636,-0.000824,0.000102,-0.000492,0.003162,-0.006689,-0.001236,0.002694,0.001042,0.001030,-0.002616,-0.002766,0.000621,0.001211,0.002081,-0.000701
Roma_Serbia (n=1)
Roma_Serbia,0.101303,0.088351,-0.014331,0.017765,0.007386,0.006693,0.001175,0.004154,0.003068,0.012210,-0.003735,0.002248,-0.001041,0.005780,-0.012758,0.009016,0.018515,0.004181,0.005908,-0.002001,0.004367,-0.000371,0.004560,-0.000482,-0.005748
Roma_Turkey_Balkans (n=1)
Roma_Turkey_Balkans,0.091058,0.057885,-0.078818,0.019380,-0.022773,0.015897,-0.003290,0.003923,0.020248,0.020410,-0.000487,-0.001349,0.000892,0.012937,-0.013843,-0.001856,0.011735,0.001774,0.003142,0.007379,-0.002121,-0.003462,-0.001972,-0.003374,-0.000838
Roma_Spain_Barcelona (n=6)
Roma_Spain_Barcelona,0.101113,0.099691,-0.007417,0.014158,0.011336,0.009296,0.000235,0.003115,0.018476,0.026121,-0.000704,0.002897,-0.008399,-0.002454,-0.002217,-0.004044,0.003847,-0.002154,-0.001027,0.000021,-0.000416,-0.001443,-0.003410,-0.006005,-0.000040
Roma_Spain_Bilbao (n=8)
Roma_Spain_Bilbao,0.090490,0.068929,-0.045773,0.020268,-0.003000,0.017187,-0.000118,0.002308,0.019277,0.021595,-0.001401,-0.000019,0.000167,0.006967,-0.008500,-0.009331,-0.000896,0.000649,-0.000189,0.000516,-0.002090,0.001144,0.001017,-0.002094,0.000569
Roma_Spain_Granada (n=7)
Roma_Spain_Granada,0.082278,0.068040,-0.053551,0.018088,-0.010771,0.014343,0.001007,0.006066,0.021504,0.019421,-0.000394,0.000792,-0.005394,0.004561,-0.008647,-0.005398,0.002049,-0.003837,-0.002011,0.000143,-0.001604,-0.001696,-0.000933,-0.002995,0.004858
Roma_Spain_Madrid (n=4)
Roma_Spain_Madrid,0.089067,0.072610,-0.049780,0.020026,-0.006771,0.010389,0.005699,0.002077,0.014317,0.021778,0.003613,-0.004159,-0.004125,0.002306,-0.002171,-0.002983,0.003455,0.000570,0.000691,0.000532,0.000905,0.003926,0.004437,-0.001085,0.003113
Roma_Portugal_Porto (n=4)
Roma_Portugal_Porto,0.086506,0.071595,-0.050534,0.021883,-0.007463,0.018058,0.002585,0.003750,0.016311,0.021732,-0.000893,0.002323,-0.005426,0.005092,-0.010451,-0.008585,0.000163,-0.002376,0.000315,0.006628,-0.002122,-0.003586,0.001941,0.002591,-0.003114
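
For readers who prefer to work with these coordinates outside a web calculator, the sketch below parses three of the lines above and compares the populations by plain Euclidean distance in the 25-dimensional G25 space. It is an exploratory illustration only, not the article's admixture method, and the helper names `parse_g25` and `euclidean` are hypothetical.

```python
import math

def parse_g25(line):
    """Split 'Label,v1,...,v25' into (label, [25 floats])."""
    label, *values = line.strip().split(",")
    return label, [float(v) for v in values]

def euclidean(a, b):
    """Straight-line distance between two coordinate vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Three coordinate lines copied verbatim from the list above.
lines = [
    "Roma_Balkans_Bosnia,0.091059,0.063640,-0.058831,0.024979,-0.018772,0.013108,0.000392,0.002539,0.016703,0.014882,0.000379,-0.000350,0.000942,0.008762,-0.010541,-0.005215,0.008866,0.003125,0.001508,0.001001,-0.002080,0.000124,-0.000657,-0.004016,-0.001277",
    "Roma_Czechia,0.087419,0.059210,-0.056244,0.024590,-0.017033,0.013238,0.001951,0.004357,0.011212,0.012636,-0.000824,0.000102,-0.000492,0.003162,-0.006689,-0.001236,0.002694,0.001042,0.001030,-0.002616,-0.002766,0.000621,0.001211,0.002081,-0.000701",
    "Roma_Spain_Barcelona,0.101113,0.099691,-0.007417,0.014158,0.011336,0.009296,0.000235,0.003115,0.018476,0.026121,-0.000704,0.002897,-0.008399,-0.002454,-0.002217,-0.004044,0.003847,-0.002154,-0.001027,0.000021,-0.000416,-0.001443,-0.003410,-0.006005,-0.000040",
]
pops = dict(parse_g25(line) for line in lines)

d_bosnia_czechia = euclidean(pops["Roma_Balkans_Bosnia"], pops["Roma_Czechia"])
d_bosnia_barcelona = euclidean(pops["Roma_Balkans_Bosnia"],
                               pops["Roma_Spain_Barcelona"])

print(f"Bosnia vs Czechia:   {d_bosnia_czechia:.4f}")
print(f"Bosnia vs Barcelona: {d_bosnia_barcelona:.4f}")
```

The Bosnian and Czech averages sit much closer to each other than either does to Barcelona, in line with the admixture results, where Barcelona is the group with the largest Iberian component.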

VII. Key Takeaways

The genetic data paint a consistent and striking picture of Roma ancestry. Across all nine populations examined, spanning the Balkans, Central Europe, and the Iberian Peninsula, a set of converging signals emerges that is entirely coherent with the historical narrative of migration from South Asia through the Middle East into Byzantine Europe and then westward.

The South Asian component is the most diagnostic: present in every Roma population, ranging from ~18% in a single Serbian individual to ~42% in a single Turkey / Balkans individual (roughly 20% to 39% among the multi-sample groups), and pointing consistently to a Northwest Indian source population with affinities to modern Gujarati, Punjabi, and Sindhi communities. This is entirely in line with what haplogroup studies have shown for decades: Y-DNA H-M82 and mtDNA M5a1 leave no room for ambiguity about the Indian origin of the Roma people.

The Balkan / Byzantine layer, consistently the largest non-South-Asian component in almost every population, tells the story of a long pause in the Byzantine world before the westward dispersal. Roma arrived in Western Europe already carrying the genetic imprint of Greece and the Balkans, not as pristine descendants of the Indian subcontinent but as a population that had already absorbed significant Balkan admixture over several centuries.

Finally, the Iberian variation, from ~16% in Porto to ~49% in Barcelona, documents the ongoing, variable process of local admixture that continues to differentiate Roma communities even within a single country. Yet even at the high end of this range, the South Asian signal never disappears: Barcelona Roma are still ~20% South Asian, a proportion that exceeds what any admixture calculator would ever return for any non-Roma Western European population.

References

  1. Morar B. et al. (2004). Mutation History of the Roma/Gypsies. The American Journal of Human Genetics, 75(4), 596–609. DOI:10.1086/424759
  2. Mendizabal I. et al. (2012). Reconstructing the Population History of European Romani from Genome-wide Data. Current Biology, 22(24), 2342–2349. DOI:10.1016/j.cub.2012.10.039
  3. Rai N. et al. (2012). The Phylogeography of Y-Chromosome Haplogroup H1a1a-M82 Reveals the Likely Indian Origin of the European Romani Populations. PLOS ONE, 7(11), e48477. DOI:10.1371/journal.pone.0048477
  4. Moorjani P. et al. (2013). Genetic Evidence for Recent Population Mixture in India. The American Journal of Human Genetics, 93(3), 422–438. DOI:10.1016/j.ajhg.2013.07.006
  5. Olivieri A. et al. (2021). Mitogenome Diversity of Present-Day Romani People Points to a Single Founder Population from the Northwestern Indian Subcontinent. European Journal of Human Genetics, 29, 1835–1843. DOI:10.1038/s41431-021-00924-y
  6. Haak W. et al. (2015). Massive Migration from the Steppe was a Source for Indo-European Languages in Europe. Nature, 522, 207–211. DOI:10.1038/nature14317
  7. Moriopoulos Modern Population Collection 2025 (no simulations, averages). Available at: moriopoulos.com
  8. Davidski. Global25 PCA Modern Population Averages (Scaled). Eurogenes Blog. Available via: Vahaduo G25 Download