4.7 Measuring LD with \(D\)

If SNP1 and SNP2 are in linkage equilibrium, the probability of seeing an A C haplotype should equal the product of the allele frequencies of A and C. This is simply the probability of observing two events together when the events are independent.

Otherwise, for SNPs that are not independent of each other, we should see A C either more or less often than expected from the allele frequencies.

This intuition is summarized in \(\mathbf{D}\), a population genetics statistic for measuring LD between two SNPs.

\[ D = h_{12} - p_1*p_2 \]

  • \(\mathbf{h_{12}}\) is the frequency of our haplotype of interest (A C).
  • \(\mathbf{p_1*p_2}\) is the product of the frequencies of the two alleles on this haplotype (A at SNP1 and C at SNP2), which is the haplotype frequency we would expect if the SNPs were independent.
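
As a rough illustration of the formula, the sketch below computes \(D\) from a list of phased haplotypes. The function name `compute_d`, the tuple representation of a haplotype, and the toy sample are all assumptions made for this example; they are not taken from any particular dataset or package.

```python
def compute_d(haplotypes, allele1, allele2):
    """Return D = h12 - p1 * p2 for the haplotype (allele1, allele2).

    `haplotypes` is a list of (allele at SNP1, allele at SNP2) tuples,
    one tuple per sampled chromosome (hypothetical input format).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")

    # h12: observed frequency of the haplotype of interest
    h12 = sum(1 for a, b in haplotypes if a == allele1 and b == allele2) / n
    # p1: frequency of allele1 at SNP1; p2: frequency of allele2 at SNP2
    p1 = sum(1 for a, _ in haplotypes if a == allele1) / n
    p2 = sum(1 for _, b in haplotypes if b == allele2) / n

    return h12 - p1 * p2


# Toy sample of 10 chromosomes (made-up counts, for illustration only)
haplotypes = (
    [("A", "C")] * 4
    + [("A", "T")] * 1
    + [("G", "C")] * 1
    + [("G", "T")] * 4
)

d_value = compute_d(haplotypes, "A", "C")
print(round(d_value, 3))
```

In this toy sample, A and C each have frequency 0.5, so the expected A C frequency is 0.25, while the observed frequency is 0.4.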

How do we interpret \(D\)?

If two SNPs are in linkage equilibrium, \(h_{12}\) and \(p_1*p_2\) should be the same, and we should get \(D = 0\).

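If two SNPs are in linkage disequilibrium, \(p_1*p_2\) should be different from \(h_{12}\), so that \(D \neq 0\). The sign tells us the direction: \(D > 0\) means the A C haplotype occurs more often than expected from the allele frequencies, and \(D < 0\) means it occurs less often.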
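
To make this concrete, here is a small worked example with made-up frequencies. Suppose A has frequency \(p_1 = 0.6\) at SNP1 and C has frequency \(p_2 = 0.5\) at SNP2, so that \(p_1*p_2 = 0.30\). Three hypothetical values of \(h_{12}\) then give:

\[ h_{12} = 0.30: \quad D = 0.30 - 0.30 = 0 \]

\[ h_{12} = 0.45: \quad D = 0.45 - 0.30 = 0.15 \]

\[ h_{12} = 0.20: \quad D = 0.20 - 0.30 = -0.10 \]

Only the first case is consistent with linkage equilibrium.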