10.19 Region-specific \(f_4\) ratio
Now we can re-calculate the \(f_4\)-ratio only within promoter regions.
# f4-ratio computed only on SNPs within promoters
f4_filtered <- f4ratio(data = new_snps_keep,
                       X = pops, A = "Altai", B = "Vindija", C = "Yoruba", O = "Chimp") %>%
  # convert the Z-score to a two-sided p-value
  mutate(p = 2 * pnorm(-abs(Zscore)))
f4_filtered
##       A       B           X      C     O     alpha   stderr Zscore          p
## 1 Altai Vindija      French Yoruba Chimp -0.005541 0.028515 -0.194 0.84617588
## 2 Altai Vindija   Sardinian Yoruba Chimp  0.002263 0.031027  0.073 0.94180612
## 3 Altai Vindija         Han Yoruba Chimp  0.066668 0.029767  2.240 0.02509092
## 4 Altai Vindija      Papuan Yoruba Chimp  0.010940 0.030057  0.364 0.71585801
## 5 Altai Vindija Khomani_San Yoruba Chimp  0.026367 0.031975  0.825 0.40937159
## 6 Altai Vindija       Mbuti Yoruba Chimp  0.005176 0.030194  0.171 0.86422377
## 7 Altai Vindija       Dinka Yoruba Chimp  0.008542 0.026768  0.319 0.74972651
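The `p` column can be reproduced by hand, which makes the conversion in the `mutate()` step explicit. The sketch below uses only base R and copies two rows of `alpha` and `stderr` from the output above. The data frame name `check` is purely illustrative and is not an admixr object. It assumes that `Zscore` is the estimate divided by its standard error and that it is approximately standard normal when the true proportion is zero.

```r
# Values copied from the f4-ratio output above
# (illustrative data frame, not produced by admixr)
check <- data.frame(
  X      = c("French", "Han"),
  alpha  = c(-0.005541, 0.066668),
  stderr = c(0.028515, 0.029767)
)

# Assumed definition: Z-score = estimate / standard error
check$Zscore <- check$alpha / check$stderr

# Two-sided p-value under a standard normal null distribution
check$p <- 2 * pnorm(-abs(check$Zscore))

check
```

Up to rounding of the printed Z-scores, the recomputed values should agree with the `Zscore` and `p` columns reported for French and Han.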
Plot the promoter-only \(f_4\)-ratios:
ggplot(f4_filtered,
       aes(x = X, y = alpha, color = p < 0.05)) +
  geom_point() +
  geom_errorbar(aes(ymin = alpha - 2 * stderr, ymax = alpha + 2 * stderr), width = 0.5) +
  geom_hline(yintercept = 0, linetype = 2) +
  labs(y = "Neanderthal ancestry proportion", x = "Present-day individual")
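To read the error bars without the figure, the same intervals can be tabulated directly. This is a minimal base R sketch with the `alpha` and `stderr` values copied from the output above. The data frame `est` and the column `excludes_zero` are hypothetical names introduced only for this illustration.

```r
# Values copied from the f4-ratio output above (illustrative, base R only)
est <- data.frame(
  X      = c("French", "Sardinian", "Han", "Papuan",
             "Khomani_San", "Mbuti", "Dinka"),
  alpha  = c(-0.005541, 0.002263, 0.066668, 0.010940,
             0.026367, 0.005176, 0.008542),
  stderr = c(0.028515, 0.031027, 0.029767, 0.030057,
             0.031975, 0.030194, 0.026768)
)

# Same bounds as the error bars in the plot: estimate +/- 2 standard errors
est$lower <- est$alpha - 2 * est$stderr
est$upper <- est$alpha + 2 * est$stderr

# TRUE when the whole interval lies on one side of zero
est$excludes_zero <- est$lower > 0 | est$upper < 0

est[, c("X", "lower", "upper", "excludes_zero")]
```

With these values, only the Han interval lies entirely above zero. Every other interval spans the dashed zero line drawn by `geom_hline()`.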
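Except for the Han population (alpha of about 0.067, p of about 0.025), none of the estimates differ significantly from zero, so we see almost no Neanderthal ancestry when we calculate the \(f_4\)-ratio within promoters. Keeping in mind that the standard errors (about 0.03) are large relative to the estimates, this is consistent with the idea that functionally important genomic regions are depleted for Neanderthal introgression.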