8.5 VCF data

The data section of a VCF describes genetic variants.

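The first 9 columns are fixed. Columns 1 to 8 give information about the variant itself (its position, the reference/alternative alleles, etc.), and column 9 (FORMAT) lists the fields recorded for each sample. The rest of the columns are sample-specific, one per individual, and contain that individual’s genotype at the variant.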

1   558185  rs9699599   A   G   .   .   PR  GT  0/0 0/0 0/0 0/1 0/0 0/1 ./. 0/0 0/0 0/0 0/1 0/0 0/0 0/0 0/0 0/1 0/0 0/1 0/0 0/0 0/0 0/0 0/0 0/0 0/0 ./. 0/0 0/0 0/0 0/0 0/1 0/0 0/1 0/1 0/0 0/1 ./. 0/1 0/1 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/0 0/1 0/0 0/1 0/0 0/0 0/0 0/1 0/0 0/1 0/0 0/0 0/0 ./. 0/0 0/0 0/0 0/1 0/0 0/1 0/0 0/1 0/0 0/1 0/0 0/0 0/1 0/0 ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./. ./.
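To make the column layout concrete, here is a minimal Python sketch that splits one data line into its nine fixed fields and its sample columns. It assumes the line is tab-delimited, as the VCF specification requires (the record above is displayed with spaces). The function name `split_vcf_line` and the four-sample example record are hypothetical, made up for illustration.

```python
# Names of the nine fixed VCF columns, in order.
FIXED_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT",
                 "QUAL", "FILTER", "INFO", "FORMAT"]


def split_vcf_line(line):
    """Split one VCF data line into (fixed-field dict, list of sample fields).

    Assumes a tab-delimited data line (not a '#' header line).
    """
    fields = line.rstrip("\n").split("\t")
    fixed = dict(zip(FIXED_COLUMNS, fields[:9]))  # columns 1-9
    samples = fields[9:]                          # columns 10 onward
    return fixed, samples


# Shortened, hypothetical version of the record above (only four samples).
example = "\t".join(["1", "558185", "rs9699599", "A", "G", ".", ".", "PR", "GT",
                     "0/0", "0/1", "./.", "0/0"])

fixed, samples = split_vcf_line(example)
print(fixed["CHROM"], fixed["POS"], fixed["REF"], fixed["ALT"])
print(samples)
```

For real files, a dedicated VCF parser is usually a better choice than splitting lines by hand; the sketch is only meant to show which columns are which.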

How do you interpret VCF genotypes?
  • 0/0: homozygous reference (does not carry the variant)
  • 0/1 (or, in phased data, 0|1 or 1|0): heterozygous (one chromosome has the variant)
  • 1/1: homozygous alternate (both chromosomes have the variant)
  • ./.: missing genotype (could not be confidently called)
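
The rules above can be written as a small Python function. This is a sketch under two assumptions: the sample is diploid, and any nonzero allele index counts as "alternate". The name `classify_genotype` is hypothetical. Both the unphased separator (`/`) and the phased separator (`|`) are accepted.

```python
import re
from collections import Counter


def classify_genotype(gt):
    """Classify a diploid GT string such as '0/0', '0/1', '1|0', or './.'."""
    alleles = re.split(r"[/|]", gt)  # '/' = unphased, '|' = phased
    if len(alleles) != 2:
        raise ValueError(f"expected a diploid genotype, got {gt!r}")
    if "." in alleles:
        return "missing"
    if alleles[0] != alleles[1]:
        return "heterozygous"
    if alleles[0] == "0":
        return "homozygous reference"
    return "homozygous alternate"


# Tally genotype classes across a few example calls.
calls = ["0/0", "0/1", "1|0", "1/1", "./."]
counts = Counter(classify_genotype(g) for g in calls)
print(counts)
```

Applied to all sample columns of a record, a tally like this gives a quick view of how many individuals carry the variant and how many calls are missing.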

The sample-specific columns often include additional genotype information, like the number of sequencing reads from the individual that support the reference vs. alternative alleles. The included fields are specified in column 9 (FORMAT), which in this case just reads GT, for “genotype”.
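
When FORMAT lists more than one field, the keys in FORMAT and the values in each sample column are both colon-separated and appear in the same order. The sketch below pairs them up. The function name `parse_sample` is hypothetical, and the `GT:AD:DP` record is an invented example (AD and DP are the keys the VCF specification reserves for per-allele read depths and total read depth); it does not come from the file shown above.

```python
def parse_sample(format_field, sample_field):
    """Pair the colon-separated FORMAT keys with one sample's values."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    # zip stops at the shorter list, so dropped trailing values are tolerated.
    return dict(zip(keys, values))


# The record in this section: FORMAT is just GT.
print(parse_sample("GT", "0/1"))

# Invented example with read-depth fields alongside the genotype.
sample = parse_sample("GT:AD:DP", "0/1:12,9:21")
ref_reads, alt_reads = (int(n) for n in sample["AD"].split(","))
print(sample["GT"], ref_reads, alt_reads)
```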