6.18 Conclusion

In this lab, we used genotype data from the 1000 Genomes Project to examine human population structure in two ways: the allele frequency spectrum and principal component analysis.

  • We explored the Geography of Genetic Variants browser, a useful resource for visualizing allele frequency differences between human populations.

  • Using genotype data from the 1000 Genomes Project, we plotted the allele frequency spectrum of variants in human populations.
    • We saw that humans carry an excess of rare variants, a pattern consistent with recent population expansion.

  • Finally, we used principal component analysis (PCA) to summarize each individual's genotypes in a few dimensions. Plotting individuals in PCA space allowed us to distinguish the five superpopulations of 1000 Genomes.
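
As a compact recap of the two analyses, the sketch below computes an allele frequency spectrum and PCA coordinates from a genotype matrix. It is a minimal illustration in Python with NumPy, not the code used in the lab: the function names `allele_frequency_spectrum` and `pca_scores` are hypothetical, and the genotypes are simulated toy data for two made-up populations rather than 1000 Genomes data.

```python
import numpy as np

def allele_frequency_spectrum(genotypes):
    """Count variants by their number of alternate allele copies.

    genotypes: (n_individuals, n_variants) array of 0/1/2 alternate allele counts.
    Returns an array of length 2 * n_individuals + 1 whose entry k is the
    number of variants carrying exactly k alternate alleles in the sample.
    """
    n_individuals = genotypes.shape[0]
    allele_counts = genotypes.sum(axis=0)
    return np.bincount(allele_counts, minlength=2 * n_individuals + 1)

def pca_scores(genotypes, n_components=2):
    """Project individuals onto the top principal components via SVD."""
    centered = genotypes - genotypes.mean(axis=0)  # center each variant
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Toy data (an assumption for illustration): two populations of 20 diploid
# individuals each, with independently drawn allele frequencies at 200 variants.
rng = np.random.default_rng(0)
freqs_a = rng.uniform(0.05, 0.95, size=200)
freqs_b = rng.uniform(0.05, 0.95, size=200)
pop_a = rng.binomial(2, freqs_a, size=(20, 200))
pop_b = rng.binomial(2, freqs_b, size=(20, 200))
toy = np.vstack([pop_a, pop_b])

spectrum = allele_frequency_spectrum(toy)  # one count per possible allele count
scores = pca_scores(toy)                   # one (PC1, PC2) row per individual
```

Because the two toy populations have unrelated allele frequencies, the first column of `scores` should separate them cleanly. Real populations are far less differentiated, so the separation seen in the lab depends on many more variants. The toy frequencies are also drawn uniformly, so this spectrum will not show the excess of rare variants seen in real human data.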