4.9 Measuring LD with \(D'\)

Aside from being nonzero, what does the value of \(D\) mean? It is surprisingly hard to interpret, because the minimum and maximum possible values of \(D\) are different for every pair of SNPs.

Why does the range of \(D\) change?

The possible values of \(D\) depend on the frequencies of the alleles at each SNP. For example:

If \(p_1 = 0.5\) and \(p_2 = 0.5\), then \(D\) lies in \([-0.25, 0.25]\).
If \(p_1 = 0.1\) and \(p_2 = 0.7\), then \(D\) lies in \([-0.07, 0.03]\).
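
These two ranges can be checked numerically. The sketch below assumes the bounds on \(D\) are the same two expressions that appear as denominators in the \(D'\) formula; `d_range` is a hypothetical helper name, not a function from this tutorial or from any package.

```r
# Hypothetical helper (not part of the original text): the smallest and
# largest values D can take, given allele frequencies p1 and p2.
d_range <- function(p1, p2) {
  c(
    lower = max(-p1 * p2, -(1 - p1) * (1 - p2)),
    upper = min(p1 * (1 - p2), p2 * (1 - p1))
  )
}

d_range(0.5, 0.5)  # lower = -0.25, upper = 0.25
d_range(0.1, 0.7)  # lower = -0.07, upper = 0.03
```

The range is symmetric only when both frequencies are 0.5, which is why a raw \(D\) value cannot be compared across SNP pairs.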

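The \(\mathbf{D'}\) statistic fixes this issue by dividing \(D\) by the most extreme value it could take in the same direction, given the allele frequencies. As defined below, a negative \(D\) is divided by a negative bound, so \(D'\) falls in \([0, 1]\), where values closer to 1 denote stronger LD.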

\[ D' = \begin{cases} \dfrac{D}{\max\left(-p_1 p_2,\; -(1-p_1)(1-p_2)\right)} & \text{for } D < 0 \\[1ex] \dfrac{D}{\min\left(p_1 (1-p_2),\; p_2 (1-p_1)\right)} & \text{for } D > 0 \end{cases} \]

Here, \(p_1\) and \(p_2\) are the allele frequencies at SNP1 and SNP2, respectively.

Use this formula to calculate \(D'\) for our two SNPs of interest.

Because \(D\) is positive, we use the second formula for \(D'\). First, we need to find the denominator, which is the minimum of \(p_1 (1-p_2)\) and \(p_2 (1-p_1)\).

p1 * (1-p2)

## [1] 0.3008145

p2 * (1-p1)

## [1] 0.1748161

The second value, p2 * (1-p1), is smaller, so we plug it into the \(D'\) formula:

Dprime <- D / (p2 * (1-p1))
Dprime

## [1] 0.8058206

This tells us that LD between these two SNPs is 80.6% of the theoretical maximum allowed by their allele frequencies.
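
The manual steps above can be wrapped into a single function so that the correct denominator is chosen automatically. This is only a minimal sketch: `d_prime` is a hypothetical helper name, the input values are made up for illustration (they are not the SNPs analyzed in this section), and returning 0 when \(D = 0\) is an assumption, since the formula only covers \(D < 0\) and \(D > 0\).

```r
# Hypothetical helper (not part of the original text): D' from D and the
# allele frequencies p1 and p2, following the two-case formula.
d_prime <- function(D, p1, p2) {
  if (D < 0) {
    # Negative D: divide by the (negative) lower bound of D
    D / max(-p1 * p2, -(1 - p1) * (1 - p2))
  } else if (D > 0) {
    # Positive D: divide by the upper bound of D
    D / min(p1 * (1 - p2), p2 * (1 - p1))
  } else {
    0  # assumption: D = 0 (no LD) is reported as D' = 0
  }
}

# Illustrative values only
d_prime(0.015, p1 = 0.1, p2 = 0.7)   # 0.015 / 0.03 = 0.5
d_prime(-0.035, p1 = 0.1, p2 = 0.7)  # -0.035 / -0.07 = 0.5
```

Both calls return 0.5 because each \(D\) is half of its own bound. With this formula a negative \(D\) is divided by a negative number, so the sign of \(D\) is not carried over into the result.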