9.6 Data (for FST)
We’ll calculate \(\mathrm{F_{ST}}\) using genotype data from the 1000 Genomes Project. Read in the VCF using the vcfR
package:
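The chunk that produced the log below is not shown here. As a minimal sketch, assuming a hypothetical file name (`1000genomes_subset.vcf.gz`) and a hypothetical helper (`read_genotypes`), the read step might look like this. `read.vcfR()` and `extract.gt()` are vcfR functions; everything else is illustrative.

```r
# Hypothetical file name: substitute the VCF distributed with this chapter.
vcf_path <- "1000genomes_subset.vcf.gz"

# Hypothetical helper: read a VCF with vcfR, returning NULL when the
# package is not installed or the file does not exist.
read_genotypes <- function(path) {
  if (!requireNamespace("vcfR", quietly = TRUE) || !file.exists(path)) {
    return(NULL)
  }
  vcfR::read.vcfR(path)  # prints a progress log while reading
}

vcf <- read_genotypes(vcf_path)

if (!is.null(vcf)) {
  # Genotype calls as a character matrix:
  # one row per variant, one column per sample.
  gt <- vcfR::extract.gt(vcf, element = "GT")
  print(dim(gt))
}
```

The column count that vcfR reports covers the whole file, so the 2513 columns in the log would correspond to the 9 fixed VCF columns (CHROM through FORMAT) plus 2504 samples.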
## Scanning file to determine attributes.
## File attributes:
## meta lines: 19
## header_line: 20
## variant count: 9748
## column count: 2513
## Meta line 19 read in.
## All meta lines processed.
## gt matrix initialized.
## Character matrix gt created.
## Character matrix gt rows: 9748
## Character matrix gt cols: 2513
## skip: 0
## nrows: 9748
## row_num: 0
## Processed variant: 9748
## All variants processed
We’ll also read in a metadata table that records which population each sample comes from.
## sample pop superpop sex
## 1 HG00096 GBR EUR male
## 2 HG00097 GBR EUR female
## 3 HG00099 GBR EUR female
## 4 HG00100 GBR EUR female
## 5 HG00101 GBR EUR male
## 6 HG00102 GBR EUR female
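The preview above shows one row per sample, with population (`pop`), superpopulation (`superpop`), and sex columns. As a rough sketch of how such a table could be loaded and summarized, the example below parses only the six rows printed above as a stand-in for the full file; the variable names are illustrative, not the chapter's own.

```r
# The six rows printed above, used as a stand-in for the full metadata file.
metadata_text <- "sample pop superpop sex
HG00096 GBR EUR male
HG00097 GBR EUR female
HG00099 GBR EUR female
HG00100 GBR EUR female
HG00101 GBR EUR male
HG00102 GBR EUR female"

# Parse the whitespace-delimited table, keeping columns as character vectors.
metadata <- read.table(text = metadata_text, header = TRUE,
                       stringsAsFactors = FALSE)

# Count samples per population and per sex.
pop_counts <- table(metadata$pop)
sex_counts <- table(metadata$sex)
print(pop_counts)
print(sex_counts)
```

With the real file, the `text =` argument would be replaced by the path to the metadata table, and `head(metadata)` would give a preview like the one above.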