Problem Set 3.
A. This problem is from the book Generalized Linear Models
by McCullagh and Nelder.
The data in eyes.dat on the
website were collected by Sir Francis Galton in an
investigation of eye color inheritance in human populations. Each row of the
table corresponds to a different family (with at least six children!). Associated
with each family is a combination of parents’ eyes (the P columns) and grandparents’ eyes (the G columns). The
last two columns give the total number of children and the total number of light-eyed
children in each family.
- Construct a graph to summarize this experiment.
- Set up a six-level factor, P, with one level corresponding to each discernible
combination of parental eye colors.
- Set up a factor, G, with one level corresponding to each discernible combination
of grandparental eye colors. How many levels does G have? What assumption do
this factor and the factor P above make?
- Fit a model that estimates the probability of obtaining light-eyed children as a
function of P. Construct a plot to summarize the fit of the model. Discard
points that clearly do not fit the model and refit.
- Construct a table that has one parent’s eye color on one axis and the other
parent’s eye color on the second axis, and fill in the associated probabilities
of having a light-eyed child.
(How would you add confidence intervals to the probabilities in this
table? Do so for one entry.)
- Add G to the model and refit.
- Construct a plot to summarize the fit of this model. Discard points that clearly do
not fit the model and refit.
- Does G improve the fit? Test this. If it does, then summarize the experiment with
a new table.
- In the question above on fitting the probability of light-eyed children as a
function of P, you had several choices for the type of model to fit. What are
a few of those choices? In terms of estimating
probabilities, does it matter which one you chose? Why or why not?
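As one possible starting point for the confidence-interval question above, the sketch below builds an approximate 95% interval for a single binomial probability on the logit scale and transforms it back. It assumes the counts for one eye color combination have been pooled across families; the function name `logit_wald_interval` and the counts 30 and 40 are hypothetical illustrations, not values from eyes.dat.

```python
import math


def logit_wald_interval(light, total, z=1.96):
    """Approximate interval for a binomial probability, built on the logit scale.

    light: number of light-eyed children (must satisfy 0 < light < total)
    total: total number of children
    z:     normal quantile (1.96 gives roughly 95% coverage)
    """
    if not 0 < light < total:
        raise ValueError("the logit-scale interval needs 0 < light < total")
    p_hat = light / total
    eta = math.log(p_hat / (1.0 - p_hat))               # estimated log-odds
    se = math.sqrt(1.0 / light + 1.0 / (total - light))  # large-sample SE of the log-odds
    lower_eta, upper_eta = eta - z * se, eta + z * se

    def expit(x):
        # Inverse of the logit transform.
        return 1.0 / (1.0 + math.exp(-x))

    return p_hat, expit(lower_eta), expit(upper_eta)


# Hypothetical pooled counts for one table entry (not taken from eyes.dat).
p_hat, lower, upper = logit_wald_interval(light=30, total=40)
print(f"estimate {p_hat:.3f}, approximate 95% interval ({lower:.3f}, {upper:.3f})")
```

Working on the logit scale keeps both endpoints inside (0, 1), which a symmetric interval on the probability scale does not guarantee.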
B. Suppose the families in the eyes data are representative
of a certain population.
- Build a model to assess whether parents with certain eye color combinations are
likely to have more children than others. (Hint: the response is a count.)
- Assess the fit of the model graphically and summarize the conclusions
graphically.
- Perform a test to see whether the parents’ eye color combination is a helpful covariate.
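One way to approach the test in the last item is a likelihood-ratio comparison of two Poisson models, one with a common mean and one with a separate mean per factor level. The sketch below is a minimal illustration under that assumption; the helper names and the toy data are made up and do not come from eyes.dat. It relies on the fact that, for a Poisson model containing only an intercept or only a single factor, the fitted means are the overall mean and the within-level means respectively.

```python
import math
from collections import defaultdict


def poisson_deviance(counts, means):
    """Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu)), with 0 * log(0) = 0."""
    total = 0.0
    for y, mu in zip(counts, means):
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - (y - mu))
    return total


def lr_statistic_for_factor(levels, counts):
    """Deviance drop from the common-mean model to the one-factor model.

    Returns (statistic, degrees_of_freedom).
    """
    overall_mean = sum(counts) / len(counts)
    sums, sizes = defaultdict(float), defaultdict(int)
    for level, y in zip(levels, counts):
        sums[level] += y
        sizes[level] += 1
    group_means = {level: sums[level] / sizes[level] for level in sums}

    d_null = poisson_deviance(counts, [overall_mean] * len(counts))
    d_factor = poisson_deviance(counts, [group_means[level] for level in levels])
    return d_null - d_factor, len(group_means) - 1


# Made-up toy data: two factor levels, three families each.
levels = ["a", "a", "a", "b", "b", "b"]
counts = [6, 7, 8, 9, 10, 11]
stat, df = lr_statistic_for_factor(levels, counts)
print(f"LR statistic {stat:.3f} on {df} degree(s) of freedom")
# Compare stat with a chi-square reference distribution on df degrees of
# freedom, for example with scipy.stats.chi2.sf(stat, df) if SciPy is available.
```

With the six-level factor P the same function would return five degrees of freedom.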