3.3 Setup
In this module, we’ll use sequencing data from families to look at the relationship between DNMs, crossovers, and parental age.
3.3.1 R packages
We’re using the tidyverse, a collection of R packages, to analyze our data. You can load it by running:
library(tidyverse)
3.3.2 Data
Our data comes from the supplementary tables of this paper by Halldorsson et al., which performed whole-genome sequencing on “trios” (two parents and one child) in Iceland. We’ve pre-processed the data to make it easier to work with.
Load the pre-processed data by running the code chunk below.
# read data
dnm_by_age <- read.table("dnm_by_age_tidy_Halldorsson.tsv",
                         sep = "\t", header = TRUE)
# preview data
head(dnm_by_age)
##   Proband_id n_paternal_dnm n_maternal_dnm n_na_dnm Father_age Mother_age
## 1        675             51             19        0         31         36
## 2       1097             26             12        1         19         19
## 3       1230             42             12        3         30         28
## 4       1481             53             14        1         32         20
## 5       1806             61             11        6         38         34
## 6       2280             63              9        3         38         20
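Before going further, it can help to confirm that the table was parsed the way you expect. The sketch below is one possible check, not part of the original module. It writes the first two preview rows to a temporary file (a stand-in for the real TSV, so the chunk runs on its own), reads them back with the same `read.table()` arguments, and tests that every column came through as a number with no missing values. The names `tmp`, `check`, `all_numeric`, and `n_missing` are made up for this illustration.

```r
# Hypothetical stand-in for the real file: the first two preview rows,
# written to a temporary tab-separated file.
tmp <- tempfile(fileext = ".tsv")
writeLines(c(
  "Proband_id\tn_paternal_dnm\tn_maternal_dnm\tn_na_dnm\tFather_age\tMother_age",
  "675\t51\t19\t0\t31\t36",
  "1097\t26\t12\t1\t19\t19"
), tmp)

# Same arguments as the real load: tab-separated, first line is the header.
check <- read.table(tmp, sep = "\t", header = TRUE)

# Every column should be numeric, and no value should be missing.
all_numeric <- all(sapply(check, is.numeric))
n_missing <- sum(is.na(check))

# Show column names and types at a glance.
str(check)
```

To run the same checks on the real data, replace `check` with `dnm_by_age` in the last three statements.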
The columns in this table are:
Proband_id: ID of the child (i.e., “proband”)
n_paternal_dnm: Number of DNMs (carried by the child) that came from the father
n_maternal_dnm: Number of DNMs that came from the mother
n_na_dnm: Number of DNMs whose parental origin can’t be determined
Father_age: Father’s age at proband’s birth
Mother_age: Mother’s age at proband’s birth
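To see how the three DNM columns relate to each other, here is a minimal sketch that adds them up for each child. It assumes the three counts are mutually exclusive and together cover every DNM called in that child, which the column descriptions suggest but do not state outright. The data frame `preview` re-enters the six rows from the preview by hand so the chunk runs without the data file, and `total_dnm` is a hypothetical column name that does not exist in the original table. Base R is used here so that no packages are needed.

```r
# The six rows shown in the preview, re-entered by hand so that this
# sketch runs without the data file.
preview <- data.frame(
  Proband_id     = c(675, 1097, 1230, 1481, 1806, 2280),
  n_paternal_dnm = c(51, 26, 42, 53, 61, 63),
  n_maternal_dnm = c(19, 12, 12, 14, 11, 9),
  n_na_dnm       = c(0, 1, 3, 1, 6, 3),
  Father_age     = c(31, 19, 30, 32, 38, 38),
  Mother_age     = c(36, 19, 28, 20, 34, 20)
)

# total_dnm is a hypothetical helper column (an assumption, not in the source):
# paternal DNMs + maternal DNMs + DNMs of undetermined origin.
preview$total_dnm <- preview$n_paternal_dnm +
  preview$n_maternal_dnm +
  preview$n_na_dnm

# Show each child's ID next to their total.
preview[, c("Proband_id", "total_dnm")]
```

For the first proband (ID 675), this gives 51 + 19 + 0 = 70 DNMs. The same lines work on the full table if you replace `preview` with `dnm_by_age`.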