3.3 Setup

In this module, we’ll use sequencing data from families to look at the relationship between de novo mutations (DNMs), crossovers, and parental age.

3.3.1 R packages

We’ll use the tidyverse, a collection of R packages, to analyze our data. You can load it by running:

library(tidyverse)

3.3.2 Data

Our data come from the supplementary tables of a paper by Halldorsson et al., who performed whole-genome sequencing on “trios” (two parents and one child) in Iceland. We’ve pre-processed the data to make it easier to work with.

Load the pre-processed data by running the code chunk below.

# read the tab-separated file; header = TRUE takes column names from the first row
dnm_by_age <- read.table("dnm_by_age_tidy_Halldorsson.tsv",
                         sep = "\t", header = TRUE)
# preview the first six rows
head(dnm_by_age)
##   Proband_id n_paternal_dnm n_maternal_dnm n_na_dnm Father_age Mother_age
## 1        675             51             19        0         31         36
## 2       1097             26             12        1         19         19
## 3       1230             42             12        3         30         28
## 4       1481             53             14        1         32         20
## 5       1806             61             11        6         38         34
## 6       2280             63              9        3         38         20
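
To see what the `sep` and `header` arguments do without needing the data file, here is a minimal sketch that parses a small tab-separated string with the same `read.table()` settings. The names `tsv_text` and `toy` are made up for illustration, and the two rows are copied from the preview above.

```r
# A tiny tab-separated table held in a string (hypothetical stand-in for the file)
tsv_text <- paste(
  "Proband_id\tn_paternal_dnm\tn_maternal_dnm",
  "675\t51\t19",
  "1097\t26\t12",
  sep = "\n"
)

# text = parses the string instead of a file;
# sep = "\t" splits fields on tabs, header = TRUE uses the first line as column names
toy <- read.table(text = tsv_text, sep = "\t", header = TRUE)

# Inspect the column names and the type R inferred for each column
str(toy)
```

With `header = FALSE`, the first line would be read as data instead, and R would assign default column names.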

The columns in this table are:

  1. Proband_id: ID of the child (i.e., “proband”)
  2. n_paternal_dnm: Number of DNMs (carried by the child) that came from the father
  3. n_maternal_dnm: Number of DNMs that came from the mother
  4. n_na_dnm: Number of DNMs whose parental origin could not be determined
  5. Father_age: Father’s age at proband’s birth
  6. Mother_age: Mother’s age at proband’s birth
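
As a quick illustration of how the three DNM count columns relate, the sketch below adds them up to get each child’s total number of DNMs. It rebuilds the six preview rows by hand so that it runs without the data file, and it uses base R rather than the tidyverse to stay self-contained. The names `dnm_preview` and `total_dnm` are hypothetical and not part of the original table.

```r
# Rebuild the six preview rows by hand so the sketch runs without the TSV file
dnm_preview <- data.frame(
  Proband_id     = c(675, 1097, 1230, 1481, 1806, 2280),
  n_paternal_dnm = c(51, 26, 42, 53, 61, 63),
  n_maternal_dnm = c(19, 12, 12, 14, 11, 9),
  n_na_dnm       = c(0, 1, 3, 1, 6, 3),
  Father_age     = c(31, 19, 30, 32, 38, 38),
  Mother_age     = c(36, 19, 28, 20, 34, 20)
)

# total_dnm is a hypothetical helper column: paternal + maternal + undetermined
dnm_preview$total_dnm <- with(dnm_preview,
                              n_paternal_dnm + n_maternal_dnm + n_na_dnm)

# Show each child's ID next to their total DNM count
dnm_preview[, c("Proband_id", "total_dnm")]
```

The same line would work on the full `dnm_by_age` table once it is loaded, assuming its columns match the ones listed above.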