4.2 Why do we care about LD?

As a result of linkage disequilibrium, knowledge of a genotype at one site in the genome can provide information about the genotype at another site, even if the second site was not actually genotyped. Using prior knowledge of LD to “fill in” missing genotype information is a process called imputation.
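The idea behind imputation can be sketched with a toy example. The snippet below is only an illustration under simplifying assumptions: the reference panel, the two-site haplotype strings, and the function name `impute_site_b` are all hypothetical, and real imputation software uses many sites and statistical haplotype models rather than a simple majority vote.

```python
from collections import Counter

# Hypothetical reference panel (made-up data): each string is one haplotype,
# with the allele at site A followed by the allele at site B.
reference_haplotypes = ["AG", "AG", "AG", "CT", "CT", "AG", "CT", "AT"]


def impute_site_b(allele_a, panel):
    """Guess the allele at site B given the observed allele at site A.

    Returns the most frequent B allele among panel haplotypes that carry
    allele_a, together with the fraction of those haplotypes supporting it.
    """
    matching = [hap[1] for hap in panel if hap[0] == allele_a]
    if not matching:
        return None, 0.0  # allele never seen in the panel, so no guess
    best_allele, count = Counter(matching).most_common(1)[0]
    return best_allele, count / len(matching)


# Site B was not genotyped, but the allele at site A was observed.
print(impute_site_b("A", reference_haplotypes))
print(impute_site_b("C", reference_haplotypes))
```

In this made-up panel, allele A at the first site usually travels with G at the second, so observing A makes G the best guess. The stronger the LD between the two sites, the more confident the guess.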

Linkage disequilibrium also means that a correlation between the genotype at a particular site and a phenotype (e.g., disease outcome) does not imply causation. Even ignoring other possible confounders, any variant in LD with that site on the same haplotype could be driving the association.
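A small numerical sketch makes this concrete. Everything here is an assumption made for illustration: the allele vectors are invented, individuals are treated as haploid, the phenotype is determined entirely by the causal allele, and `pearson_r` is a helper written for this example rather than a function from any genetics package.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Made-up alleles (0/1) for ten haploid individuals at two nearby sites.
causal_site = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
tag_site = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]  # differs in one individual only

# Assumption: the phenotype depends on the causal site and nothing else.
phenotype = list(causal_site)

# LD between the two sites, measured as the squared correlation.
ld_r2 = pearson_r(causal_site, tag_site) ** 2

print("LD (r^2) between sites:", round(ld_r2, 3))
print("causal site vs phenotype:", round(pearson_r(causal_site, phenotype), 3))
print("tag site vs phenotype:", round(pearson_r(tag_site, phenotype), 3))
```

The tag site has no effect on the phenotype in this setup, yet it is still clearly correlated with it, purely because it is in LD with the causal site. With real data, where the causal site is unknown, association alone cannot tell the two apart.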

Beyond mutation and recombination, other evolutionary forces such as gene flow, genetic drift, and natural selection can also influence patterns of LD observed in population genetic data. Measuring linkage disequilibrium is therefore important for both medical and evolutionary studies.

Fig. 2. LD can be used to impute missing genotypes, but it also complicates genetic association studies (such as finding variants that cause disease). Non-causal variants in strong LD with the causal variant tend to co-occur with it (perfectly so when LD is complete), making it difficult to determine which one is truly causal.