Problem Set 3.

 

A. This problem is from the book Generalized Linear Models by McCullagh and Nelder.

 

The data in eyes.dat on the website were collected by Sir Francis Galton in an investigation of eye color inheritance in human populations. Each row of the table corresponds to a different family (with at least six children!). Associated with each family is a combination of parents’ eye colors (the P columns) and grandparents’ eye colors (the G columns). The last two columns give the total number of children and the total number of light-eyed children in each family.

 

  1. Construct a graph to summarize this experiment.
  2. Set up a six-level factor, P, with one level for each discernible combination of parental eye colors.
  3. Set up a factor, G, with one level for each discernible combination of grandparent eye colors. How many levels does G have? What assumption do this factor and the one in 2 above make?
  4. Fit a model that estimates the probability of obtaining light-eyed children as a function of P. Construct a plot to summarize the fit of the model. Discard points that clearly do not fit the model and refit.
  5. Construct a table that has one parent’s eye color on one axis, the other parent’s on the second axis, and fill in the associated probabilities of having a light-eyed child.
    (How would you add confidence intervals to the probabilities in this table? Do so for one entry.)
  6. Add G to the model and refit.
  7. Construct a plot to summarize the fit of this model. Discard points that clearly do not fit the model and refit.
  8. Does G improve the fit? Test this. If it does, then summarize the experiment with a new table.
  9. In question 3, you had several choices for the type of model to fit. What are a few of those choices? In terms of estimating probabilities, does it matter which one you chose? Why or why not?
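
As a starting point for items 2, 4, and 5, the following is a minimal sketch in plain Python, not a prescribed solution. It assumes that the order of the two parents does not matter, so each pair is collapsed to an unordered label. It also relies on the fact that a binomial model with a single factor as its only covariate fits each level by its pooled proportion. The interval is an approximate 95% Wald interval computed on the logit scale, which is one of several reasonable choices. The rows, the color names, and the helper names (pair_level, pooled_fit, wald_logit_ci) are hypothetical and are not taken from eyes.dat.

```python
import math
from collections import defaultdict

# Hypothetical toy rows, NOT Galton's data:
# (parent 1 color, parent 2 color, total children, light-eyed children)
rows = [
    ("light", "light", 8, 7),
    ("light", "dark", 6, 3),
    ("dark", "light", 7, 4),
    ("hazel", "dark", 6, 1),
]


def pair_level(a, b):
    """Unordered pair label, so (light, dark) and (dark, light) share a level."""
    return "/".join(sorted((a, b)))


def pooled_fit(rows):
    """Pool counts within each factor level and return (light, children, p_hat)."""
    totals = defaultdict(lambda: [0, 0])  # level -> [light-eyed, children]
    for p1, p2, n, y in rows:
        cell = totals[pair_level(p1, p2)]
        cell[0] += y
        cell[1] += n
    return {lev: (y, n, y / n) for lev, (y, n) in totals.items()}


def wald_logit_ci(y, n, z=1.96):
    """Approximate 95% interval: Wald on the logit scale, mapped back.

    Requires 0 < y < n so that the logit and its standard error are finite.
    """
    eta = math.log(y / (n - y))
    se = math.sqrt(1 / y + 1 / (n - y))

    def expit(x):
        return 1 / (1 + math.exp(-x))

    return expit(eta - z * se), expit(eta + z * se)


fit = pooled_fit(rows)
for lev, (y, n, p) in sorted(fit.items()):
    lo, hi = wald_logit_ci(y, n)
    print(f"{lev}: p_hat={p:.3f}, approx. 95% CI=({lo:.3f}, {hi:.3f})")
```

Once G is added to the model, the fitted probabilities are generally no longer simple pooled proportions, so an iterative fit from a statistics package would be needed at that point.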

 

B. Suppose the families in the eyes data are representative of a certain population.

 

  1. Build a model to assess whether parents with certain eye color combinations tend to have more children than others. (Hint: the response is a count.)
  2. Assess the fit of the model graphically and summarize the conclusions graphically.
  3. Perform a test to see if parents’ eye color combination is a helpful covariate.
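
One way to approach item 3 is a likelihood ratio test. The sketch below assumes a Poisson model for the counts with the parental combination as the only factor. Under that assumption the fitted mean for each level is the level's sample mean, and the intercept-only fit is the overall mean. The statistic would then be compared with a chi-squared reference distribution on the stated degrees of freedom. The data and the function name poisson_lr_statistic are hypothetical; whether a Poisson model suits these families is part of what item 2 asks you to check.

```python
import math
from collections import defaultdict

# Hypothetical toy data, NOT from eyes.dat:
# (parental eye color combination, number of children in the family)
families = [
    ("light/light", 8), ("light/light", 6),
    ("dark/light", 7), ("dark/light", 9),
    ("dark/dark", 6), ("dark/dark", 6),
]


def poisson_lr_statistic(families):
    """Likelihood ratio statistic: one-factor Poisson model vs. intercept only."""
    by_level = defaultdict(list)
    for level, count in families:
        by_level[level].append(count)

    counts = [c for _, c in families]
    mu0 = sum(counts) / len(counts)  # fitted mean, intercept-only model
    mu1 = {lev: sum(cs) / len(cs) for lev, cs in by_level.items()}  # per level

    # Deviance difference; the (mu1 - mu0) terms cancel because both models
    # reproduce the overall total. Zero counts contribute nothing.
    stat = 2 * sum(c * math.log(mu1[lev] / mu0) for lev, c in families if c > 0)
    df = len(by_level) - 1
    return stat, df, mu0, mu1


stat, df, mu0, mu1 = poisson_lr_statistic(families)
print(f"LR statistic = {stat:.4f} on {df} degrees of freedom")
```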