8.8 Counting allele dosage
We’re often interested in encoding genotypes as 0, 1, or 2, which you can think of as the dosage of the minor allele. This is an additive model: it assumes that each additional copy of the minor allele shifts the phenotype by the same amount, so the phenotype of the heterozygote is intermediate between those of the two homozygotes.
We can use the table function on the gt_GT_alleles column to quickly check how many individuals have each genotype.
##
## A/A A/G
## 66 20
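A call of the following form produces a tabulation like the one above. This is only a minimal sketch: toy_gt is a small hypothetical data frame with made-up values standing in for test_snp_gt, and the useNA argument is an optional addition that also reports missing genotype calls.

```r
# Hypothetical toy data standing in for test_snp_gt (values are made up)
toy_gt <- data.frame(
  Indiv = c("1001", "1002", "1003", "1004", "1005"),
  gt_GT_alleles = c("A/A", "A/A", "A/G", NA, "A/G"),
  stringsAsFactors = FALSE
)

# Count individuals per genotype; useNA = "ifany" adds an entry for missing calls
genotype_counts <- table(toy_gt$gt_GT_alleles, useNA = "ifany")
print(genotype_counts)
```

Checking for missing calls at this stage is useful because rows with missing values are dropped in the next step.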
Now we’ll use the mutate function to create a new column of the dataframe that counts the dosage of the minor allele (i.e., how many G’s each person has at that SNP):
# convert genotypes to counts (i.e., dosage) of the minor allele
test_snp_gt <- test_snp_gt %>%
  # count the number of Gs in each genotype string
  mutate(dosage = str_count(gt_GT_alleles, "G")) %>%
  # drop rows that contain any missing values
  drop_na()
head(test_snp_gt)
## # A tibble: 6 × 6
##   ChromKey    POS Indiv gt_GT gt_GT_alleles dosage
##      <int>  <int> <chr> <chr> <chr>          <int>
## 1        1 558185 1001  0/0   A/A                0
## 2        1 558185 1002  0/0   A/A                0
## 3        1 558185 1003  0/0   A/A                0
## 4        1 558185 1004  0/1   A/G                1
## 5        1 558185 1005  0/0   A/A                0
## 6        1 558185 1006  0/1   A/G                1
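This SNP has no G/G homozygotes, so the dosage column above never reaches 2. To see how the counting behaves in cases that do not appear in this data, here is a small sketch on a hypothetical vector of genotype strings (example_genotypes is made up for illustration, including a phased genotype written with `|` and a missing call):

```r
library(stringr)

# Hypothetical genotype strings, including cases absent from this SNP
example_genotypes <- c("A/A", "A/G", "G/G", "G|A", NA)

# str_count() returns the number of pattern matches in each string;
# a missing genotype stays missing
example_dosage <- str_count(example_genotypes, "G")
print(example_dosage)
## [1]  0  1  2  1 NA
```

Note that counting a letter only works as a dosage when the minor allele is a single base that cannot also appear inside the other allele. For multi-base alleles (e.g., indels), it would likely be safer to derive the dosage from the numeric gt_GT column instead.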
Checking our work with table
If we run table on the dosage column, we should get the same breakdown of genotypes as we got from the gt_GT_alleles column.
##
## 0 1
## 66 20
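A slightly stricter check is to cross-tabulate the two columns, which confirms that every genotype maps to exactly one dosage value rather than just matching totals. The sketch below uses a hypothetical toy data frame (toy_check, with made-up values) in place of test_snp_gt:

```r
# Hypothetical toy data with genotype strings and their dosages
toy_check <- data.frame(
  gt_GT_alleles = c("A/A", "A/A", "A/G", "A/G", "A/A"),
  dosage = c(0L, 0L, 1L, 1L, 0L)
)

# Two-way table: rows are genotypes, columns are dosage values
cross_tab <- table(toy_check$gt_GT_alleles, toy_check$dosage)
print(cross_tab)

# Each genotype (row) should have counts in exactly one dosage column
stopifnot(all(rowSums(cross_tab > 0) == 1))
```

If the encoding is correct, all counts fall in one cell per row, and the final check passes silently.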