3.12 Optional homework

Assignment: Fit two linear models (one paternal, one maternal) to test whether the number of parental crossovers is associated with parental age. If there is an association, how is the number of crossovers predicted to change with each additional year of maternal or paternal age?


Solution
# fit the model with paternal age
fit_pat <- lm(data = crossovers,
              formula = n_pat_xover ~ Father_age)
summary(fit_pat)
## 
## Call:
## lm(formula = n_pat_xover ~ Father_age, data = crossovers)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -15.2173  -3.1880  -0.1997   2.8061  24.7652 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 26.369432   0.102736  256.67   <2e-16 ***
## Father_age  -0.005852   0.003462   -1.69    0.091 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.388 on 41090 degrees of freedom
## Multiple R-squared:  6.953e-05,  Adjusted R-squared:  4.519e-05 
## F-statistic: 2.857 on 1 and 41090 DF,  p-value: 0.09098

At the conventional 0.05 threshold, there isn’t a significant association between paternal age and the number of paternal crossovers (slope = -0.006, p = 0.091).
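To see where the t value and p-value in the coefficient table come from, they can be recomputed from the printed estimate and standard error. This is a minimal sketch that copies the numbers from the `Father_age` row of the summary above instead of re-fitting the model; the variable names (`est`, `se`, `df_resid`) are illustrative and not part of the original solution.

```r
# Values copied from the Father_age row of summary(fit_pat)
est      <- -0.005852   # estimated slope
se       <- 0.003462    # standard error of the slope
df_resid <- 41090       # residual degrees of freedom

# The t statistic is the estimate divided by its standard error
t_value <- est / se

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_value), df = df_resid)

round(t_value, 2)
round(p_value, 3)
```

The rounded results should be close to the `t value` and `Pr(>|t|)` columns printed by `summary(fit_pat)`.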

# fit the model with maternal age
fit_mat <- lm(data = crossovers,
              formula = n_mat_xover ~ Mother_age)
summary(fit_mat)
## 
## Call:
## lm(formula = n_mat_xover ~ Mother_age, data = crossovers)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -27.161  -6.095  -0.425   5.641  45.905 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 41.709271   0.206238  202.24   <2e-16 ***
## Mother_age   0.065989   0.007576    8.71   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.685 on 41090 degrees of freedom
## Multiple R-squared:  0.001843,   Adjusted R-squared:  0.001819 
## F-statistic: 75.87 on 1 and 41090 DF,  p-value: < 2.2e-16

Surprisingly, there is a significant association between maternal age and the number of maternal crossovers (p < 2e-16). For every additional year of maternal age, we expect the child to carry, on average, 0.07 additional maternal-origin crossovers.
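To put the maternal slope in context, it can help to attach an uncertainty range to it and to rescale it to a longer time span. The sketch below builds an approximate 95% confidence interval and the predicted change per decade, using only the numbers printed in the `Mother_age` row above; the variable names are illustrative.

```r
# Values copied from the Mother_age row of summary(fit_mat)
est      <- 0.065989   # estimated slope (crossovers per year of maternal age)
se       <- 0.007576   # standard error of the slope
df_resid <- 41090      # residual degrees of freedom

# 95% confidence interval: estimate +/- critical t value * standard error
t_crit <- qt(0.975, df = df_resid)
ci     <- est + c(-1, 1) * t_crit * se

# Predicted change in maternal crossovers over 10 years of maternal age
change_10yr <- est * 10

round(ci, 3)
round(change_10yr, 2)
```

With the fitted model object available, `confint(fit_mat)` should report the same interval for `Mother_age` without the manual arithmetic.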

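The maternal crossovers plot doesn’t look that impressive because the estimated slope of 0.07 is probably too small to distinguish visually. The model explains very little of the variation in crossover counts (R-squared = 0.0018), so the trend is statistically detectable mainly because the sample is so large.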
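A rough way to quantify why the trend is hard to see is to compare the predicted shift across a range of maternal ages with the scatter around the regression line. The sketch below uses the slope, residual standard error, and R-squared printed above; the 20-year age span is an assumption chosen only for illustration, not a value from the data.

```r
# Values copied from summary(fit_mat)
slope     <- 0.065989   # crossovers per year of maternal age
resid_se  <- 8.685      # residual standard error
r_squared <- 0.001843   # multiple R-squared

# Hypothetical comparison: two mothers whose ages differ by 20 years
age_span <- 20

# Predicted difference in maternal crossovers across that span
predicted_shift <- slope * age_span

# The same difference expressed in units of residual standard error
shift_in_sd <- predicted_shift / resid_se

# Percentage of variance in crossover count explained by maternal age
pct_var <- 100 * r_squared

round(predicted_shift, 2)
round(shift_in_sd, 2)
round(pct_var, 2)
```

Under these assumptions, the predicted shift is a small fraction of one residual standard error, which is consistent with a slope that is hard to spot in a scatter plot.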