When a modern Croatian or Serbian individual runs their genome through an ancestry calculator, they encounter a result that seems paradoxical at first glance: a strong Slavic signal resting on a genetic background that ties them unmistakably to the pre-Slavic Balkan world. This is not a contradiction. It is the direct genomic reflection of a documented historical process: the Slavic migrations swept through the Balkans between the 6th and 7th centuries CE, blending, to very different degrees depending on the region, with the Illyrian, Thracian, and Greek populations who preceded them. Crucially, the pre-Slavic Balkan world was not genetically uniform: the western Balkans (Albania, Dalmatia, Adriatic coast) were dominated by Illyrian populations, while the eastern and central Balkans (Bulgaria, Romania, North Macedonia) preserved a distinct Thracian heritage. Understanding this east-west substrate difference is essential to interpreting modern Balkan genetic diversity correctly.
1. Two Pre-Slavic Worlds: Illyrian and Thracian
Before the Slavic migrations, the Balkan peninsula harboured at least two genetically distinguishable Iron Age populations, both descended from Neolithic Anatolian farmers (EEF) and Bronze Age Yamnaya steppe herders, but in different proportions and with different regional histories. Recognising this distinction is critical: using a single “proto-Balkan” source for all modern populations systematically misrepresents Bulgarian and Romanian ancestry.
| Pre-Slavic Component | Period | Core Area | G25 Proxy | Key Genetic Character |
|---|---|---|---|---|
| Illyrian / West Balkan | 1200 BCE–600 CE | Albania, Dalmatia, Adriatic coast, Epirus | Albania_Medieval; IllyrianCluster (qpADM) | Moderate EEF, moderate steppe; Adriatic-shifted on PCA (PC4 ~ −0.035) |
| Thracian / East Balkan | 1200 BCE–600 CE | Bulgaria (Thrace, Moesia), Romania, N. Macedonia (east) | Bulgaria_IA; Bulgaria_LIA (G25) | Higher EEF, more Aegean-pulled; PC4 ~ −0.053, clearly distinct from Illyrian |
| Aegean / Mycenaean | 3000 BCE–500 CE | Greece, Aegean islands, Crete, Anatolian coasts | Greece_BA_Mycenaean; Greece_Roman | EEF-dominant, low steppe, strong Anatolian affinity |
| Early Slavic | 550–900 CE | Migration wave from Pontic steppe; all of the Balkans | Czech_Medieval; Ukraine_Medieval.SG (G25); Russia_Sunghir_Medieval.SG (qpADM) | High steppe/EHG, low EEF; sharp PC3 shift relative to all Balkan pre-Slavic sources |
On the Global25 PCA, Bulgaria_IA (Thracian Iron Age) and Albania_Medieval (Illyrian) separate clearly on PC4: Thracians sit at approximately −0.053, Illyrians at −0.035. Thracians pull more strongly toward the Aegean/Anatolian pole, reflecting their geographic position between the Greek world and the Pontic steppe. Using an Illyrian cluster as the sole pre-Slavic source for Bulgarians or Romanians produces a structurally incorrect model. Both ancient populations are Iron Age Balkan, but they are not interchangeable.
The ancestry proportions in this article are drawn from qpADM analyses using actual ancient genomes (Lazaridis et al. 2022, Southern Arc; published community qpADM runs), cross-referenced with the academic literature. Simple G25 NNLS modelling, while useful for visualising population relationships, systematically overestimates Slavic ancestry in this context because it cannot distinguish Bronze Age Yamnaya steppe ancestry (already present in the Balkans since ~2500 BCE) from the genuinely medieval Slavic expansion signal. For Albanians, for example, raw NNLS produces ~21–30% Slavic while qpADM converges on ~15–17%. All population-level percentages below reflect qpADM-calibrated estimates.
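One way this kind of inflation can arise is sketched below as a toy two-source fit. Every coordinate is invented for illustration (two made-up axes, not real G25 data), and the helper `slavic_weight` is a hypothetical name. The idea is only that if the substrate proxy carries less steppe-like signal than the true substrate did, a least-squares fit hands the shortfall to the Slavic source.

```python
# Toy illustration only: every coordinate below is invented, not real G25 data.

def slavic_weight(target, substrate, slavic):
    """Least-squares weight of `slavic` in a two-source mix, clipped to [0, 1].

    Models target ~ (1 - a) * substrate + a * slavic and returns a.
    With two sources and weights summing to 1, a non-negative
    least-squares fit reduces to this clipped projection.
    """
    diff = [s - b for s, b in zip(slavic, substrate)]
    resid = [t - b for t, b in zip(target, substrate)]
    a = sum(r * d for r, d in zip(resid, diff)) / sum(d * d for d in diff)
    return min(1.0, max(0.0, a))

# Axes: (steppe-like, farmer-like). Hypothetical values.
true_substrate = (0.30, 0.70)  # substrate that already carries Bronze Age steppe
poor_proxy = (0.20, 0.80)      # substrate proxy with less steppe than the real one
slavic = (0.80, 0.20)          # high-steppe medieval source

# Build a target that is, by construction, 85% substrate + 15% Slavic.
target = tuple(0.85 * b + 0.15 * s for b, s in zip(true_substrate, slavic))

good_fit = slavic_weight(target, true_substrate, slavic)
inflated_fit = slavic_weight(target, poor_proxy, slavic)
print(f"well-matched substrate: {good_fit:.1%} Slavic")
print(f"steppe-poor proxy:      {inflated_fit:.1%} Slavic")
```

With these made-up numbers the well-matched fit recovers the 15% that was built in, while the steppe-poor proxy roughly doubles it. The real G25 case has 25 dimensions and many candidate sources, so this is a picture of the mechanism rather than a reproduction of the Albanian figures.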
2. G25 Coordinates: Modern Balkan Populations
The coordinates below are drawn from the Moriopoulos 2025 collection (scaled population averages). Paste this block directly into Vahaduo “Single” mode to explore population relationships and reproduce PCA plots.
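For readers who prefer to inspect such a block outside Vahaduo, here is a minimal sketch. It assumes the usual comma-separated G25 layout (a population name followed by its coordinate values) and uses placeholder rows with four invented values each; real G25 rows carry 25. The names `parse_g25` and `distance` are illustrative, not part of any tool.

```python
import math

def parse_g25(block):
    """Parse 'Name,v1,v2,...' lines into {name: [floats]}; blank lines are skipped."""
    rows = {}
    for line in block.strip().splitlines():
        if not line.strip():
            continue
        name, *values = line.strip().split(",")
        rows[name] = [float(v) for v in values]
    return rows

def distance(a, b):
    """Euclidean distance between two coordinate vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("coordinate vectors differ in length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Placeholder rows with invented values; not real population averages.
sample = """
PopA,0.10,0.20,0.00,-0.03
PopB,0.10,0.17,0.04,-0.03
PopC,0.40,0.60,0.00,-0.03
"""
coords = parse_g25(sample)
d_ab = distance(coords["PopA"], coords["PopB"])
d_ac = distance(coords["PopA"], coords["PopC"])
print(f"PopA-PopB: {d_ab:.3f}  PopA-PopC: {d_ac:.3f}")
```

A smaller distance means two averages sit closer together in the coordinate space; here the placeholder PopB is ten times closer to PopA than PopC is.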
3. The Slavic Gradient: qpADM-Calibrated Estimates
The table below presents Slavic ancestry proportions calibrated against published qpADM analyses and the academic literature. For each population, the dominant pre-Slavic substrate is identified, either Illyrian (western Balkans) or Thracian (eastern/central Balkans), alongside the secondary Aegean component where relevant. Slavic proportions represent the genuine post-550 CE migration signal, distinct from Bronze Age steppe ancestry already present in the Balkans millennia earlier.
| Population | Primary Pre-Slavic Substrate | Pre-Slavic Est. Range | Early Slavic Est. Range | Key Sources |
|---|---|---|---|---|
| Slovenian | Illyrian / Pannonian | 60–70% | 30–40% | Lazaridis 2022; Mathieson 2015 |
| Croatian | Illyrian (Dalmatian) | 65–75% | 25–35% | Lazaridis 2022; Olalde 2019 |
| Bosnian | Illyrian | 65–75% | 25–35% | Lazaridis 2022 |
| Serbian | Illyrian (Moesian) | 70–80% | 20–30% | Lazaridis 2022 |
| Montenegrin | Illyrian | 72–80% | 20–28% | Lazaridis 2022 |
| Albanian | Illyrian (~58%) + EastMed/Byzantine (~26%) | ~83–85% | 15–17% | qpADM direct (IllyrianCluster + Russia_Sunghir_Medieval + Turkey_Byzantine) |
| Bulgarian | Thracian (Bulgaria_IA / Bulgaria_LIA) | 80–85% | 15–20% | Lazaridis 2022; community qpADM |
| Macedonian | Thracian + Illyrian mixed | 78–82% | 18–25% | Lazaridis 2022 |
| Romanian | Thracian / Dacian | 80–85% | 15–20% | Lazaridis 2022; Mathieson 2015 |
| Greek (E. Macedonia & Thrace) | Thracian + Aegean | 85–92% | 8–15% | Lazaridis 2022 |
| Greek (Central Macedonia) | Aegean + Thracian | 88–94% | 6–12% | Lazaridis 2022 |
| Greek (Thessaly) | Aegean dominant | 92–97% | 3–8% | Lazaridis 2022 |
| Greek (Peloponnese) | Aegean dominant | 93–98% | 2–7% | Lazaridis 2022 |
| Greek (Deep Mani) | Aegean (very high) | 97–100% | 0–3% | Lazaridis 2022 |
| Greek (Crete) | Aegean / Minoan | ~100% | 0–2% | Lazaridis 2022 |
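Because the table models each population as pre-Slavic plus Early Slavic, the two ranges in a row should mirror each other: the low end of one plus the high end of the other should come to 100%. A small sanity check along those lines is sketched below, with the figures copied from the table; the function name `non_complementary` is illustrative, and Crete is left out because its pre-Slavic share is given only as "~100%".

```python
# (population, pre-Slavic low, pre-Slavic high, Early Slavic low, Early Slavic high),
# all in percent, copied from the table above.
ROWS = [
    ("Slovenian", 60, 70, 30, 40),
    ("Croatian", 65, 75, 25, 35),
    ("Bosnian", 65, 75, 25, 35),
    ("Serbian", 70, 80, 20, 30),
    ("Montenegrin", 72, 80, 20, 28),
    ("Albanian", 83, 85, 15, 17),
    ("Bulgarian", 80, 85, 15, 20),
    ("Macedonian", 78, 82, 18, 25),
    ("Romanian", 80, 85, 15, 20),
    ("Greek (E. Macedonia & Thrace)", 85, 92, 8, 15),
    ("Greek (Central Macedonia)", 88, 94, 6, 12),
    ("Greek (Thessaly)", 92, 97, 3, 8),
    ("Greek (Peloponnese)", 93, 98, 2, 7),
    ("Greek (Deep Mani)", 97, 100, 0, 3),
]

def non_complementary(rows):
    """Return names of rows whose two ranges do not pair up to 100%."""
    flagged = []
    for name, pre_lo, pre_hi, slav_lo, slav_hi in rows:
        if pre_lo + slav_hi != 100 or pre_hi + slav_lo != 100:
            flagged.append(name)
    return flagged

flagged = non_complementary(ROWS)
print(flagged)
```

As printed, every row pairs up except the Macedonian one, where 78% pre-Slavic plus 25% Early Slavic comes to 103%, so the outer bounds of that row are best read with some caution.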
The Albanian row above is directly supported by published qpADM modelling (see Figure 2): IllyrianCluster 58.3% ± 7.2%, Russia_Sunghir_Medieval.SG 15.3% ± 6.1% (Slavic proxy), Turkey_Southeast_Byzantine.DG 26.5% ± 6.6% (EastMed/Byzantine). The model fits with chisq = 6.16 (p = 0.629), confirming statistical adequacy. An independent Iron Age breakdown gives near-identical results: 58.8% Iron Age Illyrian, 17% Balto-Slavic, 24.2% Iron Age Anatolia. These converging results firmly place Albanian Slavic ancestry at ~15–17%, the lowest of any Slavic-influenced mainland Balkan population outside Greece.
4. Population Case Studies
4.1 Albanians: The Best-Preserved Illyrian Descendant Population
The qpADM results shown in Figure 2 settle a long-contested question with unusual precision. Albanians carry approximately 58% Illyrian Iron Age ancestry, ~26% EastMed/Byzantine ancestry (Turkey_Southeast_Byzantine.DG), and only ~15–17% Early Slavic. Outside Greece, this makes them the mainland Balkan population with the lowest Slavic admixture and the strongest documented continuity from a pre-Slavic Iron Age Balkan population. The Slavic layer is real and statistically well-supported (z = 2.52), but it is a minority component overlaid on a predominantly Illyrian genetic foundation.
The ~26% Byzantine/EastMed component reflects contacts with the Byzantine Greek world across the 1st millennium CE (trade, population movement, and assimilation within the Byzantine Empire) rather than a recent Greek admixture event. It is best interpreted as accumulated Aegean-shifted ancestry absorbed across centuries of Byzantine rule over the Adriatic hinterland.
Sources: IllyrianCluster (1200–500 BCE) = 58.3% ± 7.2% • Russia_Sunghir_Medieval.SG (500–1500 CE) = 15.3% ± 6.1% • Turkey_Southeast_Byzantine.DG (64 BCE–330 CE) = 26.5% ± 6.6%. Model fit: chisq = 6.16, p = 0.629, rank = 2 of 8 possible. An independent decomposition using Iron Age sources alone gives 58.8% Iron Age Illyrian, 17% Balto-Slavic, 24.2% Iron Age Anatolia, closely replicating the three-source result.
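The quoted fit statistics can be cross-checked with a few lines of arithmetic. The sketch below divides each estimate by its standard error to get a z-score, sums the three proportions, and converts the chi-square value into a p-value. The degrees of freedom are not stated in the text; 8 is assumed here because it reproduces the reported p, and the helper `chi2_sf_even_df` is a hypothetical name for a closed form that only holds for even degrees of freedom.

```python
import math

# Estimates and standard errors (percent) as quoted in the text.
ESTIMATES = {
    "IllyrianCluster": (58.3, 7.2),
    "Russia_Sunghir_Medieval.SG": (15.3, 6.1),
    "Turkey_Southeast_Byzantine.DG": (26.5, 6.6),
}

def chi2_sf_even_df(x, df):
    """Upper-tail p-value of a chi-square statistic; exact closed form for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k          # builds (x/2)^k / k!
        total += term
    return math.exp(-half) * total

z_scores = {src: est / se for src, (est, se) in ESTIMATES.items()}
total = sum(est for est, _ in ESTIMATES.values())
p_value = chi2_sf_even_df(6.16, 8)  # df = 8 is an assumption, not given in the text

for src, z in z_scores.items():
    print(f"{src}: z = {z:.2f}")
print(f"sum of proportions = {total:.1f}%")
print(f"p = {p_value:.3f}")
```

From the rounded inputs the Slavic proxy comes out at a z-score of about 2.5, in line with the quoted 2.52; the proportions sum to 100.1%, which is rounding; and the p-value is about 0.629 under the assumed 8 degrees of freedom.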
4.2 Croats and Serbs: A Strong Slavic Signal on an Illyrian Foundation
Croats and Serbs carry roughly 20–35% Early Slavic ancestry (about 25–35% for Croats and 20–30% for Serbs in the table above). This is a substantial minority component dating to the post-600 CE migrations, and it leaves an Illyrian substrate (~65–80%) that firmly roots them in the pre-Slavic Balkan world. The geographic asymmetry between them is real: Croatia and Slovenia sat at the front line of the 6th–7th century Slavic colonisation wave, and their higher Slavic proportion reflects that exposure. Central Serbia and Montenegro, more mountainous and less accessible, retained a somewhat larger pre-Slavic component.
The pre-Slavic substrate in both populations is predominantly Illyrian (Dalmatian, Pannonian, Moesian), consistent with the historical distribution of Illyrian-speaking peoples across the western Balkans before Romanisation. This Illyrian genetic layer is shared, in different proportions, with Albanians, providing a genomic link between these two neighbouring populations despite their very different languages and identities.
Genetically, Serbs and Croats are closer to each other than either is to their Romanian, Bulgarian, or Greek neighbours. Their shared Illyrian substrate and comparable (if not identical) Slavic proportions make them genomically the most similar pair in the Balkans. National identity cannot be read off genetic percentages: shared ancestry is not shared culture.
4.3 Bulgarians and Romanians: Thracian Heirs, Not Illyrian
This is where the most significant correction from earlier versions of this article applies. Bulgaria covers the historical territories of Thrace and Moesia, two Iron Age population zones genetically distinct from the Illyrian west. Bulgaria_IA and Bulgaria_LIA ancient samples sit clearly apart from Albania_Medieval on the G25 PCA (PC4: −0.053 vs −0.035), pulled more strongly toward the Aegean/Anatolian EEF pole. Modelling Bulgarians with an Illyrian source, as is sometimes done in simplified analyses, produces a structurally wrong decomposition.
When the appropriate Thracian ancient DNA source is used, the pre-Slavic component of modern Bulgarians rises to ~80–85%, with Early Slavic ancestry estimated at approximately 15–20%. This is comparable to the Romanian figure and significantly lower than the Slavic proportions in Croats or Bosnians. Similarly, Romania preserves a strong Dacian (northern Thracian) substrate, making it genetically closer to Bulgarians than to Croats or Serbs despite being north of the Danube.
North Macedonia occupies a transitional position, with a mixed Thracian-Illyrian substrate reflecting its geographic position at the interface of the two Iron Age population zones.
The traditional scholarly attribution of a Turkic (Oghur branch) language to the Proto-Bulgarians rests on a small number of fragmentary inscriptions in Greek script, comparative Turkic linguistic analysis, and references in Byzantine and Arab chronicles. However, no substantial text in the Bulgar language itself has been recovered, and a minority of scholars have argued for an Iranian (Sarmatian/Alanic) affiliation. The point stands regardless of language: modern Bulgarians show no detectable genetic contribution from Turkic or Inner Asian steppe populations in their population averages. The Proto-Bulgarian ruling elite provided a state structure, a dynastic tradition, and a name; they were genetically absorbed into the Thracian-Slavic majority within a few generations.
5. The Greek Exception: Continuity and Its Limits
Greece presents the most internally diverse pattern in the entire Balkan dataset. The Early Slavic proportion ranges from approximately 8–15% in Eastern Macedonia and Thrace, the region bordering Slavic settlement zones most directly, down to effectively zero in Crete and the Aegean islands. This north-to-south gradient is direct genomic confirmation of the Byzantine historical record: Slavic groups colonised the Balkan interior as far as Thessaly and reached the Peloponnese sporadically during the 7th–9th centuries, but Byzantine naval control kept the islands and peninsulas inviolate.
Importantly, northern Greek populations (Macedonia, Thrace) show a significant Thracian ancestry component in their pre-Slavic substrate, alongside the dominant Aegean/Mycenaean signal. This reflects the long historical coexistence of Greek, Thracian, and later Byzantine populations in the northern Aegean zone, which was also more exposed to Slavic contact.
| Greek Region | Pre-Slavic Substrate | Est. Early Slavic | Historical Context |
|---|---|---|---|
| E. Macedonia & Thrace | Thracian + Aegean | 8–15% | Direct border with Slavic settlement zones; contacts 7th–11th c. |
| Central Macedonia (Thessaloniki area) | Aegean + Thracian | 6–12% | Historical Sclaveni presence; major Byzantine city provided resistance |
| W. Macedonia | Aegean + Illyrian traces | 6–10% | Mountainous; prolonged contact with western Balkan Slavs |
| Thessaly | Aegean dominant | 3–8% | Real Slavic penetration; rapid Byzantine re-assimilation 10th–11th c. |
| Peloponnese | Aegean dominant | 2–7% | Slavic settlement 8th–9th c.; forced re-Hellenisation under Nikephoros I |
| Deep Mani | Aegean (very high) | 0–3% | Never subjugated by Slavs per Byzantine sources; isolated peninsula |
| Crete | Aegean / Minoan | 0–2% | Island unreachable by overland Slavic migrations; Minoan-Mycenaean substrate preserved |
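The gradient in the table above can be made explicit by reducing each range to its midpoint and checking that the values never rise as one moves down the rows. Using the midpoint as a summary is an assumption made here for illustration; the table itself gives only ranges.

```python
# (region, Early Slavic low %, high %) in the row order of the table above.
REGIONS = [
    ("E. Macedonia & Thrace", 8, 15),
    ("Central Macedonia", 6, 12),
    ("W. Macedonia", 6, 10),
    ("Thessaly", 3, 8),
    ("Peloponnese", 2, 7),
    ("Deep Mani", 0, 3),
    ("Crete", 0, 2),
]

# Midpoint of each range, kept in table order.
midpoints = [(name, (lo + hi) / 2) for name, lo, hi in REGIONS]

# True when no region has a higher midpoint than the row before it.
is_monotonic = all(a[1] >= b[1] for a, b in zip(midpoints, midpoints[1:]))

for name, mid in midpoints:
    print(f"{name}: {mid:.1f}%")
print("non-increasing down the table:", is_monotonic)
```

The midpoints run from 11.5% in Eastern Macedonia and Thrace down to 1.0% in Crete without ever increasing, which is the gradient the section describes.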
6. Common Misconceptions Addressed
“Albanians are the pure, direct descendants of the Illyrians, with no Slavic influence.”
Albanians carry ~15–17% Early Slavic ancestry (qpADM, directly measured). Their Illyrian foundation is strong and forms the majority (~58%), but a Slavic layer absorbed between the 7th and 9th centuries is clearly visible and statistically well-supported.
“Bulgarians descend primarily from Illyrians, the same pre-Slavic stock as Albanians and Croats.”
Bulgarians descend primarily from Thracians, not Illyrians. Bulgaria_IA and Bulgaria_LIA ancient samples are genetically distinct from the Illyrian cluster. Thracians (Thrace, Moesia, Odrysian Kingdom) had a different, more Aegean/EEF-shifted ancestral profile than their western Illyrian neighbours.
“Greeks have no Slavic ancestry: the Slavs never reached Greece.”
Most continental Greek regions carry 5–15% Early Slavic ancestry. The gradient from Eastern Macedonia and Thrace (~8–15%) to Deep Mani (0–3%) directly confirms Byzantine chronicles describing Slavic incursions across the Greek mainland. Only the islands and deep peninsulas escaped essentially untouched.
“Modern Bulgarians are significantly descended from Turkic Proto-Bulgarians.”
No detectable Inner Asian or Turkic steppe genetic component appears in modern Bulgarian population averages. The Proto-Bulgarians, whose language affiliation is itself debated (traditionally Oghur Turkic, based on fragmentary evidence), were genetically absorbed within a few generations. Modern Bulgarians are ~80–85% Thracian pre-Slavic and ~15–20% Early Slavic.
“Croats and Serbs are primarily Southern European, genetically close to Italians or Greeks.”
Croats and Serbs cluster closer to West Slavic populations (Poles, Czechs) than to Italians or Greeks on PCA. Their ~20–35% Slavic ancestry is a real and substantial signal. Their Illyrian substrate does provide a southern European affinity that distinguishes them from northern Slavs, but the Slavic vector is not cosmetic.