10.14 \(f_{4}\) statistic
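The \(\mathbf{f_{4}}\) statistic (not to be confused with the \(\mathrm{F_{ST}}\) from the previous week) is closely related to the \(D\) statistic: both test whether two populations share more alleles with a third population than expected under a simple tree. Its main advantage is that its value is proportional to the length of the branch separating the two pairs of populations, here \((W, X)\) and \((Y, Z)\), so it can be read as an amount of shared genetic drift rather than only as a test result.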
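For reference, the statistic is commonly written as the expected product of two allele-frequency differences. The text above does not state a formula, so take the following as the usual convention rather than as the exact definition used by the software:

\[
f_{4}(W, X; Y, Z) = \mathrm{E}\left[(p_{W} - p_{X})(p_{Y} - p_{Z})\right]
\]

Here \(p_{W}\) denotes the frequency of a given allele at a SNP in population \(W\) (and likewise for \(X\), \(Y\), \(Z\)), and the expectation is taken over SNPs. If \(W\) and \(X\) form a clade relative to \(Y\) and \(Z\) and there has been no gene flow between the pairs, the two differences are uncorrelated and \(f_{4}\) is expected to be zero. With one chromosome sampled per population, a BABA site contributes \((1 - 0)(1 - 0) = +1\) and an ABBA site contributes \((0 - 1)(1 - 0) = -1\), which is what links \(f_{4}\) to the site counts used by \(D\).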
Compute the \(f_{4}\) statistic for all populations using the code below:
f4_result <- f4(data = snps,
                W = pops, X = "Yoruba", Y = "Vindija", Z = "Chimp") %>%
  # convert the Z-score into a two-sided p-value
  mutate(p = 2 * pnorm(-abs(Zscore)))
f4_result
##             W      X       Y     Z        f4   stderr Zscore  BABA  ABBA  nsnps            p
## 1      French Yoruba Vindija Chimp  0.001965 0.000437  4.501 15802 14844 487753 6.763451e-06
## 2   Sardinian Yoruba Vindija Chimp  0.001798 0.000427  4.209 15729 14852 487646 2.565034e-05
## 3         Han Yoruba Vindija Chimp  0.001746 0.000418  4.178 15780 14928 487925 2.940837e-05
## 4      Papuan Yoruba Vindija Chimp  0.002890 0.000417  6.924 16131 14721 487694 4.390659e-12
## 5 Khomani_San Yoruba Vindija Chimp  0.000436 0.000415  1.051 16168 15955 487564 2.932586e-01
## 6       Mbuti Yoruba Vindija Chimp -0.000030 0.000410 -0.074 15751 15766 487642 9.410104e-01
## 7       Dinka Yoruba Vindija Chimp -0.000057 0.000380 -0.151 15131 15159 487667 8.799757e-01
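Note that the p-values are essentially the same as when we calculated the \(D\) statistic, but the actual \(f_4\) values are different. Both statistics are driven by the same excess of BABA over ABBA sites, so they agree on significance. They differ in scale because \(D\) is normalized by the number of BABA and ABBA sites, whereas \(f_4\) is averaged over all SNPs.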
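The relationship between the two statistics can be checked directly from the site counts in the output table. The sketch below copies the `f4`, `BABA`, `ABBA` and `nsnps` columns by hand into a small data frame (the name `res` and the two derived columns are made up for this illustration). It assumes that \(f_4\) equals the difference in counts divided by the total number of SNPs; this is inferred from the printed numbers rather than taken from the package documentation, and the printed counts are rounded, so the match is only approximate.

```r
# Values copied from the f4 output table (counts are rounded in the printout)
res <- data.frame(
  W     = c("French", "Sardinian", "Han", "Papuan",
            "Khomani_San", "Mbuti", "Dinka"),
  f4    = c(0.001965, 0.001798, 0.001746, 0.002890,
            0.000436, -0.000030, -0.000057),
  BABA  = c(15802, 15729, 15780, 16131, 16168, 15751, 15131),
  ABBA  = c(14844, 14852, 14928, 14721, 15955, 15766, 15159),
  nsnps = c(487753, 487646, 487925, 487694, 487564, 487642, 487667)
)

# Assumed relation: f4 is the BABA/ABBA excess averaged over all SNPs
res$f4_from_counts <- (res$BABA - res$ABBA) / res$nsnps

# D uses the same numerator but normalizes by the informative sites only
res$D_from_counts <- (res$BABA - res$ABBA) / (res$BABA + res$ABBA)

# Largest discrepancy between the reported and the recomputed f4
max(abs(res$f4 - res$f4_from_counts))

res[, c("W", "f4", "f4_from_counts", "D_from_counts")]
```

Because the numerator is shared, `D_from_counts` and `f4_from_counts` always have the same sign and differ only by the ratio of informative sites to all SNPs, which is why the two statistics lead to the same conclusions about significance.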