8.19 Conclusion
We used genotype data from the 1000 Genomes Project, together with simulated phenotype data, to perform a genome-wide association study (GWAS) for variants associated with drug \(\mathrm{IC_{50}}\).
- We first ran the association test “by hand” on a single variant from the VCF, fitting a linear regression model to test whether genotype is significantly associated with phenotype.
- We then used PLINK to perform this test on every SNP in the genome.
- We followed up on the top SNP from our GWAS by drawing boxplots of phenotype stratified by genotype.
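
As a compact reminder of what the single-variant test in the first step computes, here is a minimal Python sketch of an ordinary least squares regression of phenotype on genotype dosage. It is an illustration under stated assumptions, not the code used in this chapter: the function name `single_variant_test` is hypothetical, genotypes are assumed to be coded as 0, 1, or 2 copies of the alternate allele, and the toy data are randomly generated rather than taken from the 1000 Genomes files.

```python
import math
import random


def single_variant_test(genotypes, phenotypes):
    """Regress phenotype on genotype dosage (0/1/2) by ordinary least squares.

    Returns (slope, intercept, t_statistic) for the genotype effect.
    Hypothetical helper for illustration only.
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_p = sum(phenotypes) / n

    # Sums of squares and cross-products around the means
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    sxy = sum((g - mean_g) * (p - mean_p)
              for g, p in zip(genotypes, phenotypes))

    slope = sxy / sxx
    intercept = mean_p - slope * mean_g

    # Residual sum of squares and standard error of the slope
    rss = sum((p - (intercept + slope * g)) ** 2
              for g, p in zip(genotypes, phenotypes))
    se_slope = math.sqrt(rss / (n - 2) / sxx)

    return slope, intercept, slope / se_slope


# Toy data (assumed, not from the chapter): each alternate allele
# shifts the phenotype by 1.5 units, plus Gaussian noise.
rng = random.Random(0)
genotypes = [rng.choice([0, 1, 2]) for _ in range(200)]
phenotypes = [1.5 * g + rng.gauss(0, 1) for g in genotypes]

slope, intercept, t_stat = single_variant_test(genotypes, phenotypes)
print(f"slope={slope:.3f} intercept={intercept:.3f} t={t_stat:.2f}")
```

The p-value for the variant follows from comparing the t statistic to a t distribution with n - 2 degrees of freedom. A genome-wide scan repeats this same test once per SNP, which is why the resulting p-values need a multiple-testing-aware significance threshold.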