---
title: "fruit flies"
author: "Chris Parrish"
date: "January 19, 2016"
output: pdf_document
---

fruit flies

reference:

- Cannon, et al., Stat2, chapter 05, examples 5.1-5.6, 5.10-5.11
- Cannon, R Manual, chapter 5

Import the data.

```{r}
# stringsAsFactors=TRUE keeps Treatment a factor under R >= 4.0
data <- read.csv("FruitFlies.csv", header=TRUE, stringsAsFactors=TRUE)
head(data, 4)
dim(data)
```

Scatterplot matrix.

```{r}
pairs(~ Longevity + Partners + Type + Thorax + Sleep + Treatment,
      data=data, col="darkred")
```

Variable $Treatment$ is a Factor.

```{r}
str(data)
```

Use Lattice graphics to view the data.

```{r}
library(lattice)
xyplot(Longevity ~ Treatment, data=data)
```

Group statistics.

```{r}
n <- with(data, tapply(Longevity, Treatment, length))
mean <- with(data, round(tapply(Longevity, Treatment, mean), 3))
sd <- with(data, round(tapply(Longevity, Treatment, sd), 3))
idx <- c(5, 1, 3, 2, 4) # idx orders the rows in the table
fruitfly.statistics <- cbind(n, mean, sd)[idx, ]
fruitfly.statistics
```

ANOVA with aov.

```{r}
fruitfly.aov <- aov(Longevity ~ Treatment, data=data)
fruitfly.aov
options(show.signif.stars=FALSE)
summary(fruitfly.aov)
```

SS and degrees of freedom for ANOVA.

$Observed - Grand\ Mean = Group\ Effect + Residual$

$df_{Total} = n - 1 = df_{Groups} + df_{Error} = (k - 1) + (n - k)$

Residuals.

```{r}
# normal quantile plot of the residuals
qqnorm(resid(fruitfly.aov), col="cadetblue")
qqline(resid(fruitfly.aov), col="orange")
# residuals versus fitted values
plot(predict(fruitfly.aov), resid(fruitfly.aov), pch=20, col="darkred")
# ratio of the largest to the smallest group standard deviation
std.dev <- fruitfly.statistics[ , 3]
std.dev
ratio <- max(std.dev) / min(std.dev)
ratio
```

Fisher's Least Significant Difference (= LSD).

$$LSD = t^* \sqrt{MSE\Big(\frac{1}{n_i} + \frac{1}{n_j}\Big)}$$

If the absolute difference of two group means is larger than the $LSD$ for those groups, then the associated CI does not contain 0, so the means are significantly different.

Test for the difference of two means.
$$CI: (\bar{y}_{8v} - \bar{y}_{none}) \pm t^* \sqrt{MSE\Big(\frac{1}{n_{8v}} + \frac{1}{n_{none}}\Big)}$$

```{r}
y.bar.8v <- fruitfly.statistics[5, 2]
y.bar.none <- fruitfly.statistics[1, 2]
point.estimate <- y.bar.8v - y.bar.none
alpha <- 0.05
df <- 120 # error df: n - k = 125 - 5
t.star <- qt(c(alpha/2, 1 - alpha/2), df=df)
mse <- 219.3 # mean square error from the ANOVA table
n.8v <- n.none <- 25
se <- sqrt(mse * (1 / n.8v + 1 / n.none))
ci <- point.estimate + t.star * se
ci
```

Note that the right endpoint of the CI reported in the text (p.254) is not correct.

Test all possible pairs of means.

Each sample size is 25, so the same value of Fisher's LSD applies to every pair. Compare the absolute value of each difference of means with Fisher's LSD. If the absolute difference is larger than Fisher's LSD, then the means are significantly different (because the associated CI will not contain 0). Which pairs of means are significantly different?

```{r}
LSD <- t.star[2] * se
LSD
# matrix of pairwise differences: row mean minus column mean
diffs <- outer(mean, mean, "-")
diffs
```

Effect plot: without interaction.

```{r message=FALSE}
fruitflies.lm1 <- lm(Longevity ~ Treatment, data=data)
library(alr4) # attaches the effects package, which provides allEffects()
plot(allEffects(fruitflies.lm1))
```
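
The pairwise comparison with Fisher's LSD can also be computed from a fitted model, without typing in the MSE and the error degrees of freedom by hand. The following is a minimal sketch, not taken from Stat2: `lsd.pairs` is a hypothetical helper name, and it assumes a one-way `aov` fit with a single grouping variable, such as `fruitfly.aov`.

```{r}
# Hypothetical helper (not from Stat2): Fisher's LSD for every pair of groups.
# Assumes a one-way aov fit with a single grouping variable.
lsd.pairs <- function(fit, alpha = 0.05) {
  mf <- model.frame(fit)             # response in column 1, group in column 2
  y <- mf[[1]]
  g <- mf[[2]]
  n <- tapply(y, g, length)          # group sizes
  y.bar <- tapply(y, g, mean)        # group means
  df.error <- df.residual(fit)       # n - k
  mse <- deviance(fit) / df.error    # SSE / df_Error
  t.star <- qt(1 - alpha / 2, df = df.error)
  diffs <- outer(y.bar, y.bar, "-")  # row mean minus column mean
  lsd <- t.star * sqrt(mse * outer(1 / n, 1 / n, "+"))
  list(diffs = diffs, lsd = lsd, significant = abs(diffs) > lsd)
}
```

Calling `lsd.pairs(fruitfly.aov)$significant` would give a logical matrix in which `TRUE` marks a pair of treatments whose means differ by more than their LSD. The LSD is returned as a matrix because it varies from pair to pair when the group sizes are unequal.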