Biomolecular insights into North African-related ancestry, mobility and diet in eleventh-century Al-Andalus

Uniparental genetic background of the Segorbe Giant

We confirmed that the individual was genetically male (RY > 0.077; Supplementary Fig. S3), and both his uniparental markers point towards North African origins (Supplementary Table S2). He belongs to mtDNA haplogroup U6a1a1a (nomenclature according to Hernández et al.28). Although U6 in general, and U6a in particular, is present in higher frequencies in North and West Africa29,30, the complete mitochondrial genome dataset currently available is heavily biased towards Europe, and U6a1a1a, which dates to 3.5 thousand years ago (ka) (maximum-likelihood node estimation based on modern variation), appears to have a more southern European distribution (Fig. 1a; Supplementary Fig. S4). However, in our Iberian mitogenome dataset, U6a1a1a occurs only at 0.3%, whereas the HVS-I (hypervariable segment I) subclade U6a1a1, defined by a transition variant at position 16239, which nests U6a1a1a, is found at ~ 14% in Algerian Mozabite Berbers31.
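The RY-based sex assignment mentioned above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes RY is the fraction of sex-chromosome reads that map to the Y chromosome and that a normal-approximation 95% confidence interval is compared against the cut-offs. Only the male threshold (0.077) comes from the text; the female cut-off (`female_max`), the function name and the read counts in the usage note are hypothetical.

```python
import math


def ry_sex(n_x, n_y, male_min=0.077, female_max=0.016, z=1.96):
    """Return (ry, lower, upper, call) from X- and Y-mapped read counts.

    male_min is the threshold quoted in the text; female_max is an
    assumed illustrative cut-off, not taken from the article.
    """
    total = n_x + n_y
    if total == 0:
        raise ValueError("no reads mapped to the sex chromosomes")
    ry = n_y / total
    # Binomial standard error of a proportion (normal approximation).
    se = math.sqrt(ry * (1.0 - ry) / total)
    lower, upper = ry - z * se, ry + z * se
    if lower > male_min:
        call = "male"
    elif upper < female_max:
        call = "female"
    else:
        call = "undetermined"
    return ry, lower, upper, call
```

For example, with hypothetical counts of 9000 X-mapped and 1000 Y-mapped reads, `ry_sex(9000, 1000)` gives RY = 0.1 with a lower bound above 0.077, so the call is "male".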

Figure 1

UE2298/MS060 maternal lineage. (a) Phylogenetic tree of mtDNA lineage U6a1a1. ρ and maximum-likelihood (ML) node age estimates (in ka) shown on the branches (in italics and in bold, respectively); sequences are coloured according to geography, with ancient sequences circled in red (position of UE2298/MS060 sequence is indicated by the star); underlined samples are newly reported; mutations relative to rCRS are indicated on the branches. The complete and more detailed tree for haplogroup U6 is shown in Supplementary Fig. S4. Details of the sequences used to build the tree are in Supplementary Table S4. (b) Timeline showing occurrence of haplogroup U6 in the archaeological record of North Africa and Iberia through time2,6,13,14,15,32,33,34,35, and a map of the frequency distribution of U6a in present-day Iberia, with a point indicating the location of Segorbe city. Density maps of additional mtDNA lineages are shown in Supplementary Fig. S5.

Haplogroup U6a1 has been found in Moroccan Iberomaurusian remains dating to 14–15 ka32, as well as in Early Neolithic Morocco (i.e. the pre-agricultural Holocene)2 (Fig. 1b). Although U6 lineages have been retrieved from sixteenth century CE Islamic burials in Granada (Andalusia)6, to our knowledge, UE2298/MS060 (dating to the eleventh century CE) is the earliest documented finding of a U6 lineage in Iberia. Based on the results of our newly generated Iberian mitochondrial dataset (n = 1104: 1008 sequences from mainland Spain and the Balearic Islands, plus 96 from mainland Portugal), U6a can be found at a frequency of 1.6% in modern mainland Iberian populations, with a peak of 3.6% in the south of Spain (Fig. 1b). This pattern contrasts with most mitochondrial lineages today in Iberia, although a peak of frequency in the south of the Peninsula is also observed for typically sub-Saharan African L lineages (but not for the predominantly northeast African haplogroup M136) (Supplementary Fig. S5; Supplementary Table S5). UE2298/MS060 falls outside the modern geographic distribution of U6 lineages in Spain, suggesting that the present distribution might not reflect the medieval distribution of this haplogroup. A detailed phylogeographic analysis of U6 can be found in Supplementary Note 1.

We assigned UE2298/MS060 to the Y-chromosome haplogroup E1b1b1b1 (E–M310) (Supplementary Table S2), dating to ~ 13.9 [12.1–15.7] ka (YFull, v.6.06.15) and immediately basal to the clade nesting E–M81 (E1b1b1b1a) (Fig. 2; Supplementary Figs. S6 and S7). E1b1b is very frequent in contemporary North Africa and has been found in North African and Levantine remains2,32,33,37 (Supplementary Fig. S8). E–M81 (E1b1b1b1a), dating to ~ 2.8 ka (YFull, v.6.06.15), has been retrieved from early Islamic remains (seventh–eighth century CE) in southern France38, whereas the more derived E1b1b1b1a1 has been found in two individuals from an Islamic necropolis in the city of Valencia, dating to the twelfth–thirteenth century CE6. E–M81 is today predominantly found in the Maghreb (where its average frequency is > 40%) and peaks in modern Berber populations, with frequencies reaching > 80%39,40,41, being almost fixed in some groups, such as the southern Moroccan Tachlhit-speakers42 and the Chenini–Douiret and Jradou from Tunisia40. In Europe, it is found mostly in Iberia and Sicily at frequencies < 5%43.

Figure 2

PathPhynder tree showing the position of UE2298/MS060 paternal lineage. Neighbour-joining phylogenetic tree estimated with 256 Y-chromosome sequences from worldwide populations45,46. Coloured circles indicate the number of derived (green) or ancestral (red) branch defining markers identified in the ancient individual. The branches coloured in green indicate the path with greatest support for the inclusion of UE2298/MS060 within a clade containing present-day Spanish, Near Eastern and North African individuals belonging to the E–M310 (E1b1b1b1) Y-chromosome lineage (indicated by the star). Label for haplogroups (A, B and E) provided on the right-hand side of the figure. The complete Y-chromosome tree is shown in Supplementary Fig. S7.

Given that there are no reads covering any of its diagnostic positions, we cannot exclude the possibility that UE2298/MS060 could belong to the E–M81 lineage (Supplementary Fig. S6). Using pathPhynder44 to investigate his Y-chromosomal affinity with present-day populations, UE2298/MS060 was positioned in a branch that harbours Iberian and North African E–M310-derived lineages, but with no support for membership to a more downstream lineage within this clade (Fig. 2; Supplementary Fig. S7).

Genome-wide ancestry of the Segorbe Giant

We investigated the autosomal ancestry of our ancient individual by calling ~ 74,200 autosomal SNPs (~ 72,300 when using a different approach to deal with post-mortem damage (Supplementary Table S2)). The PCA (Fig. 3a; Supplementary Fig. S9) shows that UE2298/MS060 occupies an intermediate position between present-day and ancient North African and Iberian populations in PC1, close to other Iberian Islamic individuals. Some differentiation between the Islamic individuals from Valencia and those from Andalusia is visible in the PCA, with the Andalusians mostly falling closer to North Africans and UE2298/MS060 falling outside both the Valencian and Andalusian clusters (Fig. 3b). However, this difference between UE2298/MS060 and the other Islamic individuals is not detected with ADMIXTURE in supervised mode (K = 3), using Iberia_IA, Levant_BA and Morocco_LN/Guanches as reference populations (following the findings in Olalde et al.6) (Fig. 3c; Supplementary Fig. S10).

Figure 3

Overview of UE2298/MS060 autosomal ancestry. (a) PCA projecting 336 ancient samples on 702 modern individuals from North African, European, Near Eastern and Caucasian populations. (b) Zoom-in of PCA shown in (a) focussing on individuals from the Islamic period; individuals from Valencia and Andalusia (excluding two outliers that plot together with ancient North African individuals in (a)) within green and grey shapes, respectively. (c) Ternary plot showing supervised ADMIXTURE proportions (K = 3), using Iberia_IA, Morocco_LN and Levant_BA as reference populations. Abbreviations as follows: E/CHG, Eastern/Caucasus Hunter-Gatherers; Meso, Mesolithic; (E/M/L) N, (Early/Middle/Late) Neolithic; Chl, Chalcolithic; BA, Bronze Age; IA, Iron Age; c., centuries.

Outgroup-f3 runs using different outgroups (Mbuti, Ju_hoan_North and Ust_Ishim) consistently show a higher proportion of shared drift with Middle/Late Neolithic, Chalcolithic and Bronze Age Iberian populations, and with the Anatolian Neolithic (Supplementary Table S6), than with North African populations (although the proximity of North African groups, particularly Late Neolithic Morocco and the Guanches, to UE2298/MS060 changes when using Ust’-Ishim, a non-sub-Saharan African outgroup, suggesting that his genome may have some African-related ancestry). D-statistics consistently show UE2298/MS060 to be significantly closer to Iberian populations than to Iberomaurusians, Early Neolithic Morocco or the Guanches (Fig. 4; Supplementary Table S7). However, tests using Late Neolithic Morocco, in the form D(outgroup, UE2298/MS060; Morocco_LN, Iberian population), consistently generated results close to zero and non-significant (|Z| < 3), which might be an indicator that a population genetically close to Morocco_LN contributed to the ancestry of UE2298/MS060 in similar proportions to an Iberian source. We note that we did not observe any major differences in the patterns obtained for outgroup-f3 and D-statistics using different approaches to minimise the effects of post-mortem damage (“mapDamage --rescale” and “soft-clipping”) (Supplementary Tables S6 and S7), but additional qpAdm models are accepted using “mapDamage --rescale” (Supplementary Tables S8 and S9).

Figure 4

Detection of North African- and European-related ancestries in the genome of UE2298/MS060. D(Chimp, UE2298/MS060; Iberian population, North African population). A significant negative D-value indicates that UE2298/MS060 shares more genetic drift with the Iberian population; a significant positive D indicates more shared drift with the North African population. Non-significant D indicates that UE2298/MS060 is symmetrically close to both populations tested (shown in yellow, with labels in bold). Error bars correspond to 2 standard errors. Detailed output can be found in Supplementary Table S7. Abbreviations as follows: (E/M/L)N, (Early/Middle/Late) Neolithic; Chl, Chalcolithic; BA, Bronze Age; IA, Iron Age; c., centuries.
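The sign convention described in this caption can be illustrated with a short sketch. It assumes one common allele-frequency formulation of D(W, X; Y, Z), in which the numerator sums (w − x)(y − z) over sites, and it takes the standard error as given (in practice this would usually come from a block jackknife, which is not implemented here). The function names and the toy frequencies in the usage note are hypothetical, and the threshold of 3 follows the |Z| < 3 criterion used in the text.

```python
def d_statistic(w, x, y, z):
    """D(W, X; Y, Z) from per-site allele frequencies (lists of floats in [0, 1])."""
    num = sum((a - b) * (c - d) for a, b, c, d in zip(w, x, y, z))
    den = sum(
        (a + b - 2 * a * b) * (c + d - 2 * c * d)
        for a, b, c, d in zip(w, x, y, z)
    )
    if den == 0:
        raise ValueError("no informative sites")
    return num / den


def interpret_d(d, se, threshold=3.0):
    """Classify D(outgroup, X; Y, Z) by its Z-score (D divided by its standard error)."""
    z_score = d / se
    if abs(z_score) < threshold:
        return "symmetric"    # X is equally close to Y and Z
    # Negative D: X shares more drift with Y; positive D: with Z.
    return "closer to Y" if z_score < 0 else "closer to Z"
```

With W as the outgroup, X as UE2298/MS060, Y as an Iberian population and Z as a North African population, a single toy site where X and Y carry the derived allele (`d_statistic([0.0], [1.0], [1.0], [0.0])`) gives D = −1, matching the caption's reading of negative values.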

We tested different qpAdm 1-way scenarios using different proximal Iberian sources as left populations. Models using populations from Andalusia (Iberia_c.5-8CE and Iberia_c.3-4CE, which already displayed North African-related ancestry6) are accepted (p-values: 0.092 and 0.343, respectively), whereas models using populations from Catalonia, in the northeast of the Peninsula, are rejected (p-value < 0.05) (Supplementary Table S8). However, considering the genetic heterogeneity in different regions of Iberia through time, and given the complex history of population interactions in Iberia during the first millennium CE16,18, it is unlikely that UE2298/MS060 descends directly from Andalusian Visigothic populations and therefore we also explored 2-way admixture scenarios. Notably, 1-way qpAdm analysis was consistent with UE2298/MS060 descending from Islamic_Andalusia (p-value = 0.327) but not from Islamic_Valencia (p-value = 0.0005), in line with the position of UE2298/MS060 in the PCA (Fig. 3b) and highlighting regional genetic differences during this period.
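The accept/reject logic applied to the 1-way models above can be summarised in a few lines. This is only a sketch of the decision rule: the p-values are those quoted in this paragraph, the 0.05 cut-off is the rejection threshold stated in the text, and the dictionary and function names are hypothetical. It does not run qpAdm itself.

```python
# p-values for 1-way qpAdm models as reported in the text.
ONE_WAY_P_VALUES = {
    "Iberia_c.5-8CE": 0.092,
    "Iberia_c.3-4CE": 0.343,
    "Islamic_Andalusia": 0.327,
    "Islamic_Valencia": 0.0005,
}


def accepted_models(p_values, alpha=0.05):
    """Return the source populations whose model is not rejected at alpha."""
    return [name for name, p in p_values.items() if p >= alpha]


def rejected_models(p_values, alpha=0.05):
    """Return the source populations whose model is rejected at alpha."""
    return [name for name, p in p_values.items() if p < alpha]
```

Applied to the values above, only Islamic_Valencia is rejected. A non-rejected model is statistically compatible with the data rather than proven, which is why 2-way scenarios are also explored.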

Alternatively, UE2298/MS060 could be modelled using 2-way combinations of distal and proximal Iberian populations (showing varied proportions of North African-related ancestry6) and either the Guanches or Morocco_LN (Table 1; Supplementary Table S9). D-statistics comparing these two North African populations indicate that UE2298/MS060 is closer to Morocco_LN than to the Guanches (|Z| > 3) (Supplementary Table S7).

Table 1 Accepted 2-way qpAdm admixture models with standard errors (SE) and p-values. Models accepted using both datasets (“mapDamage --rescale” and “soft-clipping”) are shown in italics.

Mobility in Islamic Segorbe

To assess whether UE2298/MS060 was likely to have spent his childhood in the local region, we performed stable oxygen isotope analysis on eight individuals from Plaza del Almudín. Tooth enamel carbonate data are presented in Supplementary Table S10 and plotted in Fig. 5a. The δ18OVSMOW values for the Segorbe population (excluding outlier MS075) range from 26.2 to 27.6‰ (range = 1.4‰, n = 7), with a mean of 26.8 ± 0.5‰ (1σ). The converted δ18Odw values (mean –6.0‰, excluding MS075) fit with the meteoric water values for the eastern Iberian coast. The δ18OVSMOW values from both teeth sampled from UE2298/MS060 are consistent with the rest of the population, and the small difference in values between the different molars (M1/M2 and M3) provides no indication of movement between early childhood and adolescence. Overall, on the basis of his oxygen values, there is no evidence that UE2298/MS060 was an immigrant to eastern Spain.

Figure 5

Mobility and diet in Islamic Segorbe. (a) Mobility isotopes (oxygen and carbon) for UE2298/MS060 and other individuals from Plaza del Almudín. (b) Dietary isotopes (carbon and nitrogen) from Plaza del Almudín compared to other medieval Islamic and Christian sites from Gandía and Valencia51,52. (c) FRUITS model for UE2298/MS060; models for other individuals can be found in Supplementary Fig. S11.

By contrast, one other individual reported here (MS075) seems to be an outlier (δ18OVSMOW = 30.6‰; > 1.5 times the interquartile range above quartile 3)47, and is possibly a migrant from a warmer climate, with a δ18Odw value similar to those of Africa or the Near East48. Detailed results and discussion of oxygen analysis can be found in Supplementary Note 2.
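The outlier criterion used here (a value more than 1.5 times the interquartile range above the third quartile) can be sketched as below. The quartile method is an assumption, since the text does not say how quartiles were computed. The example values are illustrative placeholders, not the measurements in Supplementary Table S10: only 26.2, 27.6 and 30.6‰ are quoted in the text.

```python
import statistics


def upper_outliers(values, k=1.5):
    """Return values lying more than k * IQR above the third quartile."""
    # "inclusive" treats the data as the full population of interest;
    # other quartile conventions give slightly different fences.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    fence = q3 + k * (q3 - q1)
    return [v for v in values if v > fence]


# Hypothetical d18O (VSMOW, per mil) values for eight individuals.
illustrative_d18o = [26.2, 26.5, 26.6, 26.8, 26.9, 27.0, 27.6, 30.6]
```

With these placeholder values, `upper_outliers(illustrative_d18o)` flags only the 30.6‰ value, while 27.6‰ stays inside the fence.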

Diet patterns in Islamic Segorbe

The values for δ15N and δ13C dietary isotopes in the Islamic necropolis of Plaza del Almudín range from 10.7 to 13.2‰ and from –17.8 to –11‰, respectively, for the 13 individuals studied (Fig. 5b; Supplementary Table S11). UE2298/MS060 has a δ15N value of 11.3‰ and a δ13C value of –17.4‰, showing lower δ15N and a more negative δ13C than the majority of the humans sampled from this assemblage. Application of a Bayesian mixing model (BMM), FRUITS (Food Reconstruction Using Isotopic Transferred Signals)49, supports the observation that C4 plants likely played a substantial part in the diet of some individuals and that marine fish consumption was variable (Supplementary Fig. S11). UE2298/MS060 (Fig. 5c) seems to have consumed limited amounts of C4 plants (mean: 11.4 ± 6.5% or 4.8–17.9% of the diet) and marine protein (mean: 2.4 ± 2.4% or 0–4.8% of the diet) compared to the rest of the population analysed. On the other hand, he seems to have the highest levels of mammal and C3-plant consumption amongst the analysed individuals (Supplementary Fig. S11).

Individual MS075, identified as a possible migrant due to their oxygen value, displays the lowest probability (close to zero) of marine fish consumption amongst the individuals studied here (Supplementary Fig. S11), and shows signals of a mixed C3/C4 diet, which is also a possibility for Africa50. Detailed results and discussion of diet patterns inferred from individuals from the site of Plaza del Almudín can be found in Supplementary Note 2.


We analysed individual UE2298/MS060 excavated from the Islamic necropolis of Plaza del Almudín, in Segorbe, dating to the eleventh century CE. The archaeologists responsible for the excavation in 1999 considered this individual unusual due to his considerable height compared with other individuals found at the same site (despite periods of disease and/or malnutrition in childhood)27, and dubbed him the “Segorbe Giant”. The subsequent anthropological analysis suggested some African morphological features and a link was postulated to the Berber-speaking populations that settled in the region in medieval times26,27.

Analysis of the uniparental markers from UE2298/MS060 fits well with this assumption, pointing to an origin in the Maghreb, most likely from a Berber group. MtDNA lineage U6a is not only connected to modern Amazigh populations30, but has also been found in Moroccan remains associated with Iberomaurusian culture, and in the Moroccan Early Neolithic site of Ifri n’Amr or Moussa2,32 (Fig. 1b). He also carries the Y-chromosome E1b1b1b1 (E–M310) lineage. E1b1b is extremely common amongst extant North Africans and has been found in ancient North African and Levantine remains2,32,33,37 (Supplementary Fig. S7). Due to low coverage, we could only assign him to a basal position within E1b1b1b1, but he may belong to a more derived subclade. One possibility would be E1b1b1b1a (E–M81), which is the most common haplogroup amongst modern Berber males42,53, and has been linked to Islamic remains in southern France38. Another would be its descendant E1b1b1b1a1-M183 lineage, identified in three Guanche males, in two Islamic individuals from Granada, and in an earlier sixth century CE male from the Visigoth phase of Pla de l’Horta, in Catalonia6,33.

Although he carries both uniparental markers of North African origin, autosomal evidence paints a more complex picture. The individual is positioned in the PCA mid-way between modern/ancient Iberian populations, and Late Neolithic Moroccan, Guanches and modern North African individuals (Fig. 3a), and formal tests of admixture point to high proportions of Iberian-like ancestry (Fig. 4; Supplementary Table S7).

Considering the archaeological and historical records for this period in the region of Valencia, we envisage three possible scenarios to explain the observed ancestry in UE2298/MS060. One would be to assume that this individual is a direct migrant from North Africa (whose unique genetic composition has not yet been examined using aDNA), or derives from a population that moved into Iberia but retained its genetic identity. A second scenario is that he descends from pre-Islamic Iberian genetic diversity. Finally, the third scenario is that he is the result of admixture between Iberian and North African sources.

The first scenario would imply that pre-Islamic populations in North Africa would be genetically similar to UE2298/MS060 (or possibly to other contemporary individuals found in Spain6). The nearest temporal proxy available is the Guanches (from the seventh–eleventh centuries CE), who originated in the Maghreb but have been isolated in the Canary Islands since at least the early Iron Age. D-statistics, however, suggest that UE2298/MS060 is genetically closer to Morocco_LN than to the Guanches (Supplementary Table S7). In any case, qpAdm rejects the hypothesis that UE2298/MS060 directly descends from a population resembling either the Guanches or Morocco_LN (Supplementary Table S8). Additionally, the oxygen data for UE2298/MS060 (Supplementary Note 2) are consistent with someone who grew up in the region, and point towards low mobility between early childhood and adolescence. In contrast, another individual from the same necropolis (MS075) does look non-local (Supplementary Note 2), possibly a migrant from a warmer climate outside the Mediterranean, with oxygen values similar to those of Africa or the Near East48. Nevertheless, one should note that aDNA sampling in North Africa is sparse and limited to a few individuals from very specific sites and periods, and we cannot rule out that a population with a similar genetic composition to that of UE2298/MS060 existed in the region around this period.

Although North African-related ancestry in present-day Spain is present at low values (typically ~ 3–8%), with a slight southwest-to-northeast decline19,20, increased African-related ancestry has been present in southern Spain since the third century CE6. This North African influence is captured in our qpAdm analysis, with 1-way models using pre-Islamic Andalusian populations being accepted (Supplementary Table S8). However, it is unlikely that UE2298/MS060 descends directly from Andalusian Visigothic populations and ultimately these models, despite being statistically plausible, do not fully explain the ancestry of our individual. We note that there are no data available from or around the region of Valencia between the end of the Iron Age and the Islamic period, and post-Iron-Age genetic variation in Spain was most likely very heterogeneous across locations and centuries6. This heterogeneity is confirmed by our results showing that UE2298/MS060 forms a clade with Islamic_Andalusia, but not with Islamic_Valencia (Supplementary Table S8).

The third scenario would be that the genetic variation seen in UE2298/MS060 was a result of admixture between Amazigh people who migrated from North Africa to Iberia, and the local population inhabiting the Peninsula, at some point during either the Islamic conquest, the Caliphate period, or the Berber empires. This would explain UE2298/MS060’s intermediate position in the PCA and ternary plot (supervised ADMIXTURE) (Fig. 3). D-statistics support this scenario, with tests comparing Morocco Late Neolithic and Iberian populations from different periods not showing him to be significantly closer to one or the other (Fig. 4; Supplementary Table S7). We show that UE2298/MS060 can be modelled as admixture between Iberian and North African sources (either the Guanches from the Canary Islands or Late Neolithic Moroccans) (Table 1). The fact that he still carried both uniparental markers of North African origin suggests that the admixture may have happened only a few generations before his time, coinciding with the zenith of Berber power, rather than earlier during the conquest, in agreement with admixture dates inferred from modern Iberian genomes from Aragon and Catalonia20. However, we cannot rule out assortative mating, allowing these uniparental markers to be retained for longer, or the possibility that these lineages were common in some Iberian populations before the Islamic period. The date of the burial (eleventh century CE)27 fits the historical narrative of Berber settlement in the region of Sharq al-Andalus18. Considering the genetic evidence, together with the stable isotope results and the historical accounts of intermarriage between local individuals and the North African newcomers, and in agreement with recent aDNA evidence from Iberia6, this third scenario seems the most plausible to explain the ancestry patterns seen in his genome.

Nevertheless, the original source populations are difficult to pinpoint. Owing to the lack of sampling in North Africa for this specific period and the preceding centuries, the nearest proxies available for the North African source are the Guanches33 and the Late Neolithic Moroccan population from the Kelif el Boroud site2. There is high differentiation between present-day North African populations and the ancient North African individuals available to date (seen in PC3; Supplementary Fig. S9), which indicates that important population dynamics occurring after the Late Neolithic and/or Iron Age shaped extant genetic structure in the region. Modern North African populations show a signal of increased Levantine-related ancestry around the seventh century CE, as a result of movements from the Near East during the Islamic expansion into North Africa17; the impact of these movements was also seen in the Levant, as shown by the study of seventh–eighth century Islamic individuals in Syria54. Therefore, the North African source of UE2298/MS060 might have already displayed this increased Near Eastern-related ancestry. Similarly, the population of Valencia in the immediately preceding centuries has yet to be studied.

A study in modern South Americans detected North African ancestry introduced at the early stages of European colonization55. The presence of individuals in medieval Spain with a genetic background similar to that of UE2298/MS060 would explain the source of this ancestry in America, suggesting that admixture with North Africans had a wider impact on medieval Spanish genetic variation, before virtually disappearing in the following centuries.

We found no U6 in our present-day whole-mtDNA dataset from the region of Valencia (n = 54), or in a larger previously published HVS-I database (n = 123)56. This absence might be an echo of the brutality of the decree of expulsion of Moriscos (Muslims forcibly converted to Christianity), which may have effectively erased the population carrying North African-related ancestry that lived in the region in the preceding centuries. They were replaced by settlers from regions further north with little North African-related ancestry20. This is in sharp contrast with regions of the Crown of Castilla, where historical sources claim there was better integration of the Morisco identity into the general population, and where no mass deportations were recorded: the frequencies of U6, M1 and L lineages are higher in these regions today (present-day central and southern Spain) (Fig. 1b; Supplementary Fig. S5). This pattern is also visible at the genome-wide level20.

This study emphasises the importance of immigration during the Islamic period. In contrast to Andalusia, the region of Valencia is not geographically close to the Maghreb, and was under Islamic rule for a shorter time, but nonetheless developed strong links with the Arab–Berber world during the Islamic period57. A contemporary individual, MS075, is evidence of continued movement during Berber rule (Supplementary Note 2).

UE2298/MS060 is a single, low-coverage sample, and although the results cannot be extrapolated to the population as a whole, recently published results6 show a similar trend of admixture in Islamic Spain. The heterogeneity of genomic patterns that is now being uncovered by aDNA studies emphasises the need for much more detailed, fine-scale studies. More individuals and a wider diversity of sites across the Peninsula should be studied to explore the population dynamics during the Islamic period in more detail and to assess potential fine differences between geographical regions and periods, and between urban and rural societies.

