4.7 Measuring LD with \(D\)
If SNP1 and SNP2 are in linkage equilibrium, the probability of seeing an AC haplotype should be equal to the product of the allele frequencies of A and C. This is simply the probability of observing two events together if the events are independent.
Otherwise, for SNPs that are not independent of each other, we should see AC either more or less often than expected from the allele frequencies.
This intuition is summarized in \(\mathbf{D}\), a population genetics statistic for measuring LD between two SNPs.
\[ D = h_{12} - p_1*p_2 \]
- \(\mathbf{h_{12}}\) is the frequency of our haplotype of interest (AC).
- \(\mathbf{p_1*p_2}\) is the product of the frequencies of the two alleles on this haplotype (A at SNP1 and C at SNP2).
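As a quick worked example with made-up numbers (illustrative only, not taken from any dataset): suppose A has frequency 0.6 at SNP1, C has frequency 0.5 at SNP2, and the AC haplotype is observed at frequency 0.4. Then

\[ D = h_{12} - p_1*p_2 = 0.4 - 0.6*0.5 = 0.1 \]

so AC is seen more often than the 0.3 we would expect if the two SNPs were independent.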
How do we interpret \(D\)?
If two SNPs are in linkage equilibrium, \(h_{12}\) and \(p_1*p_2\) should be the same, and we should get \(D = 0\).
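If two SNPs are in linkage disequilibrium, \(h_{12}\) should be different from \(p_1*p_2\), so that \(D \neq 0\). A positive \(D\) means the haplotype occurs more often than expected under independence, and a negative \(D\) means it occurs less often.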
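To make the calculation concrete, here is a minimal Python sketch that computes \(D\) from a list of phased haplotypes. The function name `ld_d`, the two-character string encoding of haplotypes, and the sample data are all assumptions made for illustration; they do not come from this text or from any particular library.

```python
def ld_d(haplotypes, allele1, allele2):
    """Return D = h12 - p1*p2 for the haplotype allele1 + allele2.

    haplotypes: list of 2-character strings such as "AC", where the
    first character is the allele at SNP1 and the second the allele
    at SNP2 (a hypothetical encoding chosen for this sketch).
    """
    n = len(haplotypes)
    # h12: frequency of the haplotype of interest
    h12 = sum(1 for h in haplotypes if h == allele1 + allele2) / n
    # p1, p2: frequencies of each allele at its own SNP
    p1 = sum(1 for h in haplotypes if h[0] == allele1) / n
    p2 = sum(1 for h in haplotypes if h[1] == allele2) / n
    return h12 - p1 * p2


# Hypothetical sample in which AC is over-represented
linked = ["AC"] * 4 + ["AT"] * 2 + ["GC"] * 1 + ["GT"] * 3
# Hypothetical sample in which the two SNPs are independent
unlinked = ["AC"] * 3 + ["AT"] * 3 + ["GC"] * 2 + ["GT"] * 2

print(round(ld_d(linked, "A", "C"), 3))    # 0.1
print(round(ld_d(unlinked, "A", "C"), 3))  # 0.0
```

In both samples A has frequency 0.6 and C has frequency 0.5, so the expected AC frequency under independence is 0.3. Only the first sample, where AC is observed at 0.4, gives a nonzero \(D\).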