ST505/ST697R, Regression Analysis and Regression Modelling. Fall 2012
NEWEST STUFF/REMINDERS
DEC. 7 HOMEWORK 9 SOLUTION POSTED (PROBLEMS 1 AND 3 COVER
RELEVANT MATERIAL FOR EVERYONE)
FINAL EXAM: Monday, Dec. 10 1:30 pm in LGRT 101.
Information on "final" exam.
SUNDAY, DEC. 9. 4 p.m. Open office hours
- not a prepared review. I'll answer questions
as long as people have them.
Monday, Dec. 10. Office hours 11-12.
Dec. 7: Added a couple of pages of closing notes.
Course
Description, policies and tentative Syllabus.
LISTING OF POTENTIAL IE PROJECTS (THIS INCLUDES INFO ON THE
SMSA DATA THAT ST697R STUDENTS ARE USING IN HW 7)
IE project description (for ST505 students).
IE project Step 2.
Updated: Nov. 9. IE project update (for ST505 students).
IE group assignments (ST505 students). Updated Oct. 19
IE Presentation days
Link to Chapters 1 and 2 of the book and student solution
manual
OFFICE HOURS
Week of Dec. 3
Mon: 1:30-3:30
Tue: 3 - 4
Thur: 2-3:30
And by appointment. Contact me if you
have questions and can't make office hours.
TA OFFICE HOURS
TA: Viet Nguyen, office: LGRT 1335L
Viet will handle computational questions.
Other questions can be directed to me in class
or in office hours.
Tuesday: 2:00-3:00
Wednesday: 4:00-5:00
Thursday: 2:00-3:00
LECTURE NOTES/EXAMPLES
Lecture notes: Pages 1-142
Dec. 2: Pages 136-142 added. Intro to nonparametric regression and serial correlation.
Plots for yield example.
Reading
Detailed itemization of upcoming reading/coverage.
READING
Nov. 29: Reading on additional diagnostics: Section 10.2, 10.3 and
10.4 (but just Cook's distance in section 10.4)
Nov. 13: Reading on model building (which we'll start in on Friday, Nov. 16).
Chapter 9. Skip PRESS and data splitting.
Previous reading.
Schedule
Monday. Dec. 3
- ST505 students working on IE projects
- Lecture for ST697R: Some additional diagnostics;
nonparametric regression (mainly as a diagnostic tool with one
variable). Section 3.10
Wed. Dec. 5
Autocorrelation in time series. The basic
problem and one correction technique.
Parts of Sections 12.1-12.4 (up to page 494)
See notes for coverage.
FRIDAY, DEC. 7
- Comments on homeworks 8 and 9
- Closing remarks
- evaluations
HOMEWORK/ASSIGNMENTS
Homework 1: Due Friday, Sept. 14
Homework 1 solution
Homework 2: Due Wednesday, Sept. 26
Homework 2 solution
Homework 2, SAS code and output for problem 1
Homework 2, R code and output for problem 1
Homework 3: Due Friday, Oct. 5 (start of class)
Cow carcass and crime data are below.
Homework 3 solution.
Homework 4: Due Friday, Oct. 12
Kishi data below.
Homework 4 solution.
Homework 5: Due Monday, Oct. 22, start of class (no exceptions)
Note: in problem 2 it should be testing beta0 = 0 and beta1 = 1.
An earlier version of this was linking to an unedited version.
Homework 5 solution.
Homework 6: Due Wed. Nov. 7 (by 4 p.m. sharp!)
Homework 6 solution.
Homework 7. Due: Wed. Nov 21
Homework 7 solution.
R code and output for weighted analysis of house data: hw 7
Homework 8. One problem is due Fri. Nov. 30 (start of class)
Homework 8 solution.
Homework 9. ST697R students only.
Homework 9 solution.
R code and output associated with homework 8
Exam and grading
Midterm Solution
(Note: a 25 got left off right at end of second version of answer to 2b in open book).
Grading information: summary of midterm scores and overall grading.
Final Solution
COMPUTING
General comments on computing
An introduction to both SAS and R (from 2011 ST597 notes)
This has A LOT more than what you need. See sections
1 - 3 (intro to SAS) and Sections 15.1 - 15.4 (intro to R).
We will demonstrate the key features in class.
An introduction to R in 3 parts. The first part is the
same as Chapter 15 in the link above.
SAS links
R links
R programs
Reading, listing and plotting data (with header names in the data file)
and fitting a simple linear regression model.
Reading data (with no header names in the data file)
Concise summary of how to run the R programs above and save things (also sent in email).
More analysis of the pH example (confidence intervals for coefficients,
residuals, etc).
( Updated Tues. Sept 18 to include confidence
intervals on E(Y) and prediction intervals)
Class handout of pH example output from R.
Simulating estimated coefficients and MSE for simple linear regression
Note: You don't need to understand the programming in the simulation
program; just how to run it (will be demoed in class Wed.)
Analyzing the kishi data with illustration of how to use
R as a calculator to get simultaneous intervals.
Class handout of Kishi example using R.
Illustrating computing and plotting confidence bands using Kishi data.
Illustrating correlation analysis using Brain data.
Puffin diagnostics. With histogram corrected and PIs and CIs dropped.
Esterase assay example; testing for constant variance.
Also shows you how to route multiple plots to a file.
Testing lack of fit using cholesterol data
Comparing means and assessing variance with grouped data (kishi example)
R code for Ramus example, page 65 of notes, and GPA example (page 66).
R code for weighted least squares of ester example; parts of pages 71-75 of notes.
R code for house price example in notes
R code for house price example in notes. Modified to handle
data with no missing values and do the overall F test in a different way.
Descriptive statistics in R
Analyzing qualitative variables; Brain data
Variable selection: surgery example
Additional diagnostics (Cook's distance, etc.) House example
SAS programs
Reading, listing and plotting data
and fitting a simple linear regression model.
Regression analysis of pH data. confidence intervals, etc.
Note that the last part shows how you could customize graphs.
For homework you can just use the plots out of proc reg.
Simulating estimated coefficients and MSE for simple linear
regression
( Modified, Wed. Sept 19 to remove shrinking of graph size).
Note: You don't need to understand the programming in the simulation
program; just how to run it (will be demoed in class Wed.)
SAS files for Kishi example. Pages 30 and following in notes
SAS file to do inverse prediction
SAS files to do regulation
SAS file for Brain size example on pages 42-44 of notes.
SAS file for Puffin example diagnostics, page 49 of notes.
Esterase assay example with residual analysis. Page 56 of notes
SAS file for lack of fit in Chol. example, page 62 of notes.
Kishi SAS file for assessing constant variance with
grouped data. Page 63 of notes
SAS file for Ramus example, page 65 of notes.
SAS file for GPA example, page 66 of notes.
SAS file for weighted least squares in Ester example;
parts of pages 71-75 of notes.
SAS file for house price example in notes.
SAS file for yield, moisture, temp example; page 97 and following in notes
Analyzing qualitative variables with SAS; Brain data
Variable selection: surgery example
Additional diagnostics (Cook's distance, etc.) House example
DATA
pH data file.
Breakage data from Ex. 1.21 (no names).
Breakage data from Ex. 1.21 (with names in first row).
Hardness data from Ex. 1.22 (for homework 2; note that Y is
first, then X).
Cow carcasses and pH
Cow carcasses and pH data (txt file with names in first row). HW 3.
Crime data for homework 3
Kishi nitrogen intake-balance data (no names).
Puffin data (no names)
Esterase assay data (no names)
Virus-survival data for homework 5
Insurance data: IE project
Sex discrimination data: IE project
1985 CPS wage data: IE project
SENIC data: Infections in hospital IE project
House Price data. Example in notes
House Price data. NO missing for price, sqft or tax. comma separated.
Patient data for homework 6.
Modified MCAS data
Reading MCAS data
SMSA data for ST697R. MODIFIED SAT. DEC. 1: DROPPED FT. WORTH, ADDED NAMES
Yield data
Brain data, no names.
Copier data for homework 8
Combines problem 8.15 and 1.20. Data in order: service, number, type.
MISCELLANEOUS
Who is Francis Galton and what does he have to do with regression?
U.S. weather extremes
Dallas marathon times plot. Analysis is part of homework 2
Cell phones and driving: What is the regression here doing?
Brain and Spinal Cord Interaction: Regression example.
So what's a puffin look like?
Florida 2000 election example