Clusters Galore analysis of West Eurasians

It's been a while since the last Clusters Galore analysis, so I've decided to run one on my recently assembled dataset, restricted to individuals who belong predominantly to the Six main West Eurasian components.

First, I identified 945 individuals in my set whose combined admixture proportions in the Six exceeded 95%. I then ran MDS on this subset, keeping 50 dimensions.
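The inclusion step can be sketched as follows. This is only an illustration: the component labels (`comp_A` to `comp_F`), the individual IDs, and the proportions are made-up placeholders, not the actual K12a output, and `select_individuals` is a hypothetical helper name.

```python
from typing import Dict, List


def select_individuals(
    admixture: Dict[str, Dict[str, float]],
    six: List[str],
    threshold: float = 0.95,
) -> List[str]:
    """Return IDs whose combined proportion in the `six` components exceeds `threshold`."""
    kept = []
    for ind_id, proportions in admixture.items():
        # Components missing from an individual's record count as zero.
        combined = sum(proportions.get(component, 0.0) for component in six)
        if combined > threshold:
            kept.append(ind_id)
    return kept


# Placeholder labels standing in for the Six components (assumption).
SIX = ["comp_A", "comp_B", "comp_C", "comp_D", "comp_E", "comp_F"]

# Made-up admixture proportions for three hypothetical individuals.
admixture = {
    "ind1": {"comp_A": 0.50, "comp_B": 0.48, "other": 0.02},  # 98% in the Six
    "ind2": {"comp_A": 0.60, "comp_C": 0.30, "other": 0.10},  # 90% in the Six
    "ind3": {"comp_D": 0.97, "other": 0.03},                  # 97% in the Six
}

kept = select_individuals(admixture, SIX)
print(kept)
```

Only the individuals returned by this filter would then go into the MDS step.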

One of the open issues in Clusters Galore analysis is how many MDS dimensions to retain. So far, I've applied a heuristic: choose the number of MDS dimensions that maximizes the number of clusters inferred by MCLUST. However, when I actually inspect the MDS plots, it often turns out that meaningful information seems to be present at even higher dimensions. As a result, I've decided to pick the number of dimensions in the following manner.

The main idea is that data points in uninformative MDS dimensions will look largely like Gaussian noise. So, we can use a test of normality (I've chosen the Shapiro-Wilk test) to detect dimensions that do not appear to be noise. Below are the p-values of this test for the different MDS dimensions:

Up to 22 dimensions, there is a strong non-Gaussian signal (all p-values less than 0.001). Hence, I used the first 22 dimensions in the MCLUST analysis. With these dimensions, MCLUST inferred 35 clusters. So, this is roughly a 6-fold increase in resolution over the Six components inferred by ADMIXTURE.
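One way to read this selection rule is: walk through the dimensions in order and keep them for as long as the normality test rejects at the 0.001 level. The sketch below assumes the per-dimension p-values have already been computed (for example with a Shapiro-Wilk routine such as `scipy.stats.shapiro`); the function name and the p-values shown are invented for illustration and are not the values from this analysis.

```python
from typing import Sequence


def count_informative_dimensions(p_values: Sequence[float], alpha: float = 0.001) -> int:
    """Count the leading dimensions whose normality-test p-value is below `alpha`.

    Counting stops at the first dimension that is compatible with Gaussian
    noise, so later dimensions are ignored even if they dip below `alpha` again.
    """
    n_keep = 0
    for p in p_values:
        if p >= alpha:
            break
        n_keep += 1
    return n_keep


# Made-up p-values, one per MDS dimension in order (assumption).
p_values = [1e-12, 1e-9, 1e-5, 4e-4, 0.03, 1e-4, 0.4]

n_keep = count_informative_dimensions(p_values)
print(n_keep)  # the first four dimensions pass, the fifth does not
```

Stopping at the first non-significant dimension is a design choice of this sketch; a variant could instead keep every dimension below the threshold regardless of position.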

The cluster totals for the different populations can be seen in the spreadsheet.

Important Caveat: Some populations (e.g., Finnish_D or Turkish_D) have a large number of individuals who do not meet the "95% in the Six" inclusion threshold. Hence, the results are not representative of these populations as a whole, and simply indicate the cluster assignments of the subsets that do meet the threshold. You can check whether individuals have been removed from the original dataset by comparing sample sizes in the Clusters Galore spreadsheet with those in the K12a one.
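The sample-size check described above amounts to a per-population subtraction. Here is a minimal sketch, assuming the two spreadsheets have been reduced to simple population-to-count mappings; the population labels and counts are invented, and `removed_counts` is a hypothetical helper name.

```python
from typing import Dict


def removed_counts(
    original_sizes: Dict[str, int],
    retained_sizes: Dict[str, int],
) -> Dict[str, int]:
    """Return {population: individuals removed} for populations that lost anyone."""
    removed = {}
    for population, n_original in original_sizes.items():
        # A population absent from the retained set lost all its individuals.
        n_retained = retained_sizes.get(population, 0)
        if n_retained < n_original:
            removed[population] = n_original - n_retained
    return removed


# Invented sample sizes: full dataset vs. the thresholded subset.
original_sizes = {"Pop_X": 20, "Pop_Y": 15, "Pop_Z": 8}
retained_sizes = {"Pop_X": 20, "Pop_Y": 6}

removed = removed_counts(original_sizes, retained_sizes)
print(removed)
```

Populations with large removed counts are the ones whose cluster totals should be read with the most caution.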

Here are some observations on the 35 clusters. I will mention the modal population (or region) for each one:

Ashkenazi
Scandinavian
French
British Isles
Armenian
S Italian/Sicilian
Kurd
Greek
Cypriot
Balto-Slavic
Hungarian
Balkan
Sephardic
Spanish
Iberian
North Italian/Tuscan
Morocco Jews (main)
Saudis
Georgian/Abkhazian
Basque
Bedouin
Druze #1
Druze #2
Druze (main)
Mozabite (main)
Mozabite #1
Orkney
Sardinian
Azerbaijan Jews
Iran/Iraq Jews
Lezgins
Morocco Jews #1
Samaritan
Yemen Jews
Abkhazian
