Demystifying Oracles: What Gedmatch’s Oracles Are and How They Work

Gedmatch’s Oracles are mathematical approaches that look for the N populations that best account for the percentages in a person’s results.

These approaches are not based on any logical rule, nor do they take into account origins, … Put simply, an Oracle compares the difference between your percentages and those of X sample populations. The population with the lowest result is the one that theoretically most resembles you.

Note: for didactic reasons, the Oracle calculation is simplified here, keeping only the essence of the process.

Imagine a very simple case of Oracle 1 population

Oracle 1 population

Your DNA equals a value of 10

The samples we have are:

Spanish: 9
French: 8
Portuguese: 8.5
German: 6
Chinese: 1

If we subtract each of these samples from your DNA, the result looks like this:

Spanish: 10 – 9 => 1
French: 10 – 8 => 2
Portuguese: 10 – 8.5 => 1.5
German: 10 – 6 => 4
Chinese: 10 – 1 => 9

The result is what Gedmatch calls Distance. As you probably know, the closer to 0, the better.

We sort the results from smallest to largest and get:

Spanish @ 1
Portuguese @ 1.5
French @ 2
German @ 4
Chinese @ 9

And with this you already have the Oracle 1 population.
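The Oracle 1 steps above can be sketched in a few lines of Python. This is only an illustration of the simplified, one-number example: the function name `oracle_1` is made up here, and taking the absolute value of the difference is an assumption so that samples larger than your own value would also give a positive distance.

```python
# Simplified one-number "profiles" from the example above (not real Gedmatch data).
SAMPLES = {"Spanish": 9, "French": 8, "Portuguese": 8.5, "German": 6, "Chinese": 1}


def oracle_1(your_dna, samples):
    """Return (population, distance) pairs sorted from closest to farthest."""
    # Assumption: distance is the absolute difference between the two values.
    distances = {name: abs(your_dna - value) for name, value in samples.items()}
    return sorted(distances.items(), key=lambda item: item[1])


for name, distance in oracle_1(10, SAMPLES):
    print(f"{name} @ {distance}")
```

With a value of 10 this prints the same order as the list above, starting with Spanish @ 1 and ending with Chinese @ 9.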

Oracle 2 populations

Oracle 2 populations looks for the two populations that, each weighted at 50%, together have the smallest distance to yours.

If your DNA is 10, the samples at 50% look like this:

Spanish 9 × 50% = 4.5
French 8 × 50% = 4
Portuguese 8.5 × 50% = 4.25
German 6 × 50% = 3
Chinese 1 × 50% = 0.5

Next we add the 50% values in pairs and compute the difference between your DNA and each sum:

10 – (50% Spanish (4.5) + 50% Spanish (4.5)) => 10 – 9 => 1
10 – (50% Spanish (4.5) + 50% French (4)) => 10 – 8.5 => 1.5
10 – (50% Spanish (4.5) + 50% Portuguese (4.25)) => 10 – 8.75 => 1.25
10 – (50% Spanish (4.5) + 50% German (3)) => 10 – 7.5 => 2.5
10 – (50% Spanish (4.5) + 50% Chinese (0.5)) => 10 – 5 => 5
… (and so on for all the combinations)
10 – (50% French (4) + 50% French (4)) => 10 – 8 => 2
10 – (50% French (4) + 50% Spanish (4.5)) => 10 – 8.5 => 1.5
10 – (50% French (4) + 50% Portuguese (4.25)) => 10 – 8.25 => 1.75
10 – (50% French (4) + 50% German (3)) => 10 – 7 => 3
10 – (50% French (4) + 50% Chinese (0.5)) => 10 – 4.5 => 5.5

We repeat the process for every combination and then sort the results. For the combinations that include Spanish, the order is:

50% Spanish + 50% Spanish @ 1
50% Spanish + 50% Portuguese @ 1.25
50% Spanish + 50% French @ 1.5
50% Spanish + 50% German @ 2.5
50% Spanish + 50% Chinese @ 5
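As a rough sketch of the Oracle 2 idea, the snippet below builds every pair of populations (a population may pair with itself), weights each at 50%, and sorts by distance. The helper name `oracle_mix` is hypothetical, the sample values are the simplified ones from this example, and the absolute difference is again an assumption.

```python
from itertools import combinations_with_replacement

# Simplified one-number "profiles" from the example above (not real Gedmatch data).
SAMPLES = {"Spanish": 9, "French": 8, "Portuguese": 8.5, "German": 6, "Chinese": 1}


def oracle_mix(your_dna, samples, n):
    """Rank every mix of n populations, each weighted 1/n, by distance."""
    ranking = []
    for combo in combinations_with_replacement(samples, n):
        blended = sum(samples[name] for name in combo) / n
        # Assumption: distance is the absolute difference between the two values.
        ranking.append((combo, abs(your_dna - blended)))
    return sorted(ranking, key=lambda item: item[1])


for combo, distance in oracle_mix(10, SAMPLES, 2)[:5]:
    print(" + ".join(f"50% {name}" for name in combo), "@", distance)
```

Five populations give fifteen possible pairs. Ranked over all of them, pairs without Spanish also appear near the top, for example Portuguese + Portuguese at 1.5, which ties with Spanish + French.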

Oracle 4 populations

Oracle 4 populations looks for the four populations that, each weighted at 25%, together have the smallest distance to yours.

If your DNA is 10, the samples at 25% look like this:

Spanish 9 × 25% = 2.25
French 8 × 25% = 2
Portuguese 8.5 × 25% = 2.125
German 6 × 25% = 1.5
Chinese 1 × 25% = 0.25

Note: to keep the example short, only the first few combinations are shown, but in reality every possible combination in the Oracle population set is computed:

10 – (25% Spanish (2.25) + 25% Spanish (2.25) + 25% Spanish (2.25) + 25% Spanish (2.25)) => 10 – 9 => 1
10 – (25% Spanish (2.25) + 25% Spanish (2.25) + 25% Spanish (2.25) + 25% French (2)) => 10 – 8.75 => 1.25
10 – (25% Spanish (2.25) + 25% Spanish (2.25) + 25% Spanish (2.25) + 25% Portuguese (2.125)) => 10 – 8.875 => 1.125
10 – (25% Spanish (2.25) + 25% Spanish (2.25) + 25% Spanish (2.25) + 25% German (1.5)) => 10 – 8.25 => 1.75
10 – (25% Spanish (2.25) + 25% Spanish (2.25) + 25% Spanish (2.25) + 25% Chinese (0.25)) => 10 – 7 => 3

We repeat the process for every combination (there are dozens of them …) and then sort the results. For the combinations shown above, the order is:

25% Spanish + 25% Spanish + 25% Spanish + 25% Spanish @ 1
25% Spanish + 25% Spanish + 25% Spanish + 25% Portuguese @ 1.125
25% Spanish + 25% Spanish + 25% Spanish + 25% French @ 1.25
25% Spanish + 25% Spanish + 25% Spanish + 25% German @ 1.75
25% Spanish + 25% Spanish + 25% Spanish + 25% Chinese @ 3
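The same calculation with four slots of 25% each can be sketched as follows. The names `oracle_4` and `describe` are invented for this illustration, the values are the simplified ones used in this example, and the absolute difference is an assumption.

```python
from collections import Counter
from itertools import combinations_with_replacement

# Simplified one-number "profiles" from the example above (not real Gedmatch data).
SAMPLES = {"Spanish": 9, "French": 8, "Portuguese": 8.5, "German": 6, "Chinese": 1}


def oracle_4(your_dna, samples):
    """Rank every mix of four populations, each weighted 25%, by distance."""
    ranking = []
    for combo in combinations_with_replacement(samples, 4):
        blended = sum(samples[name] * 0.25 for name in combo)
        # Assumption: distance is the absolute difference between the two values.
        ranking.append((combo, abs(your_dna - blended)))
    return sorted(ranking, key=lambda item: item[1])


def describe(combo):
    """Collapse repeated populations, e.g. three Spanish slots become 75% Spanish."""
    shares = Counter(combo)
    return " + ".join(f"{count * 25}% {name}" for name, count in shares.items())


ranking = oracle_4(10, SAMPLES)
print(len(ranking), "combinations")
for combo, distance in ranking[:3]:
    print(describe(combo), "@", distance)
```

With five populations there are 70 combinations in total, and the three closest are 100% Spanish @ 1.0, 75% Spanish + 25% Portuguese @ 1.125, and 75% Spanish + 25% French @ 1.25.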

As you can see, there is no magic. In the end it is pure mathematics, and the results are only as consistent as the interpretation we choose to give them.

There are several important factors to note:

The number of samples the Oracle has definitely matters. If the only sample we had were Chinese, then naturally that is the only combination it would give us ;).

The Oracles are able to detect continental differences, so if someone has one Chinese grandparent and the rest Spanish, that combination will show up in the Oracle.

Master populations are based on real people who claim to have 4 grandparents from the same area. Given this, the samples may be incorrect, mixed, … That is why we should not take the results as absolute truth.

Proxy Populations

A population is made up of several shared origins. The Ashkenazi, for example, are a mixture of Mediterranean and Levantine. If this population did not exist in the Oracle, an Ashkenazi would come out as, say, 50% Italian + 50% Lebanese, two populations that share that genetic similarity.

With this in mind, a Proxy Population is a population that shares a common origin with yours and so masks that background.

For example, if someone comes from an area of the Peninsula that has some Italian influence, it is quite possible that the combinations for their proportions will include 25% Ashkenazi. This is simply because an Ashkenazi is 50% “Italian”, so an Ashkenazi component weighted at 25% may well fit the result.
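The proxy effect can be illustrated with the same simplified arithmetic. The numbers below are purely hypothetical, chosen so that the Ashkenazi value sits exactly halfway between the Italian and Lebanese values, and `pair_distances` is a made-up helper name. With such a panel, a 50/50 mix of Italian and Lebanese is indistinguishable from the Ashkenazi sample itself.

```python
from itertools import combinations_with_replacement

# Hypothetical one-number profiles (illustrative values, not real data):
# Ashkenazi is placed exactly halfway between Italian and Lebanese.
PANEL = {"Italian": 8, "Lebanese": 4, "Ashkenazi": 6}


def pair_distances(your_dna, panel):
    """Distance from your value to every 50% + 50% pair in the panel."""
    return {
        combo: abs(your_dna - sum(panel[name] for name in combo) / 2)
        for combo in combinations_with_replacement(panel, 2)
    }


distances = pair_distances(6, PANEL)
print(distances[("Ashkenazi", "Ashkenazi")])  # 0.0
print(distances[("Italian", "Lebanese")])     # 0.0, the same fit
```

Because both combinations fit equally well, the calculation has no way of telling which one reflects the person's actual background.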

To take an extreme example: if the child of a Chinese parent and a German parent is entered into the Oracle as a German sample (by mistake, naturally), then when a Chinese person runs the Oracle, the result will show 50% “German”. The sample is incorrect and causes these errors.

Many populations are very similar, so the percentages in their pie charts differ only minimally, and in the Oracles they will alternate with one another. The difference between two Valencians from different areas, for instance, is very subtle, so it is very unlikely that an Oracle will ever pick it up. It can, however, tell a Basque from a Valencian, because there the difference is noticeable.
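A small sketch shows why near-identical populations alternate. The values and the two Valencian labels below are invented purely for illustration, and `closest` is a hypothetical helper. A tiny change in the person's value is enough to swap which Valencian sample comes first, while the Basque sample stays clearly apart.

```python
# Hypothetical one-number profiles (illustrative values, not real data).
PANEL = {"Valencian (north)": 9.0, "Valencian (south)": 9.1, "Basque": 7.0}


def closest(your_dna, panel):
    """Name of the population with the smallest absolute difference."""
    return min(panel, key=lambda name: abs(your_dna - panel[name]))


print(closest(9.04, PANEL))  # Valencian (north)
print(closest(9.06, PANEL))  # Valencian (south)
print(closest(7.2, PANEL))   # Basque
```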