As always, feel free to leave anonymous feedback here: https://goo.gl/forms/fKjLeKItix2Djg5l2

Lecture review, notation/terminology clarification

Relevant reading: Parts of sections 11.3 and 11.4 in Fox

For #3 on the homework, you may use lm() to compute familiar quantities (residuals, fitted values), but please use them to manually compute the new quantities (std. residuals, pred. residuals, std. pred. residuals, Cook’s distance). You may use R functions to check that your computations are correct.
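A minimal sketch of the "compute by hand, then check" pattern, on toy data rather than the homework data. It assumes that "std. residuals" means the standardized residuals e_i / (s * sqrt(1 - h_i)); the seed, the toy data, and the variable names are made up for illustration.

```r
set.seed(1)                        # assumed seed, only so the sketch is reproducible
x <- c(1, 2, 3, 4, 5, 10)          # toy data, not the homework data
y <- 2 * x + 1 + rnorm(length(x), sd = 0.5)

fit <- lm(y ~ x)
e <- resid(fit)                    # familiar quantity: ordinary residuals
X <- model.matrix(fit)             # design matrix (intercept column and x)
h <- diag(X %*% solve(t(X) %*% X) %*% t(X))   # hat values, computed by hand
s <- sqrt(sum(e^2) / (length(y) - ncol(X)))   # residual standard error

std_res_manual <- e / (s * sqrt(1 - h))       # standardized residuals by hand

# Check against the built-in; unname() so only the numbers are compared.
check_std <- all.equal(unname(std_res_manual), unname(rstandard(fit)))
print(check_std)
```

The same pattern should work for the other quantities: hatvalues(), rstudent(), and cooks.distance() are built-in functions you can compare your own results against.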

Examples in simple regression

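x <- c((-4):4, 20)   # the last x is far from the rest (high leverage)
n <- length(x)
p <- 1               # number of predictors
y <- x + 1 + rnorm(n, sd=0.5)
y[n] <- 0            # move the high-leverage point far off the true line
MakePlots(x, y) # custom function that I wrote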

y[n] <- x[n] + 1 + rnorm(1, sd=0.5)   # same high-leverage point, now back on the true line
MakePlots(x, y)

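x <- c((-4):4)       # no high-leverage point this time
n <- length(x)
p <- 1
y <- x + 1 + rnorm(n, sd=0.5)
y[(n+1)/2] <- 10     # outlier at the middle x (index 5); n/2+1 is 5.5, which R truncates to 5
MakePlots(x, y)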
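If you want numbers to go with the plots, here is a rough sketch that tabulates the built-in diagnostics for the first example. SummarizeInfluence() is a hypothetical helper made up for this sketch; it is not the MakePlots() function used above, and the seed is an assumption added for reproducibility.

```r
# Hypothetical helper (not MakePlots()): tabulate built-in diagnostics.
SummarizeInfluence <- function(x, y) {
  fit <- lm(y ~ x)
  data.frame(
    x = x,
    y = y,
    leverage = hatvalues(fit),        # diagonal of the hat matrix
    std_resid = rstandard(fit),       # standardized residuals
    cooks_d = cooks.distance(fit)     # Cook's distance
  )
}

set.seed(1)                           # assumed seed, only for reproducibility
x <- c((-4):4, 20)
n <- length(x)
y <- x + 1 + rnorm(n, sd = 0.5)
y[n] <- 0                             # first example: high leverage, far off the line
diag1 <- SummarizeInfluence(x, y)
print(diag1)
```

In this example the point at x = 20 has leverage of about 0.87 and by far the largest Cook's distance, even though its ordinary residual need not be the largest, because it pulls the fitted line toward itself.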

Testing for outliers

Read 11.3.1 in Fox.
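A rough sketch of one common outlier test, assuming the test in that section is the t-test on the largest studentized residual with a Bonferroni adjustment for having looked at all n residuals. Check the reading for the exact procedure. The data mimic the third example above; the seed and the variable names are assumptions.

```r
set.seed(1)                           # assumed seed, only for reproducibility
x <- c((-4):4)
n <- length(x)
y <- x + 1 + rnorm(n, sd = 0.5)
y[(n + 1) / 2] <- 10                  # plant an outlier at the middle x

fit <- lm(y ~ x)
t_star <- rstudent(fit)               # studentized (deleted) residuals
i_max <- which.max(abs(t_star))       # most extreme observation
df <- df.residual(fit) - 1            # one fewer than the residual df of the full fit

# Two-sided p-value for the most extreme residual, before adjustment.
p_unadj <- 2 * pt(abs(t_star[i_max]), df = df, lower.tail = FALSE)

# Bonferroni: we effectively ran n tests, so multiply by n (capped at 1).
p_bonf <- min(1, n * p_unadj)
print(c(index = unname(i_max), p_unadj = unname(p_unadj), p_bonf = p_bonf))
```

The adjustment matters because the largest of n residuals will look extreme fairly often by chance alone, which is the multiple-testing problem.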

XKCD: Significant