## Statistics 375: Mathematical Statistics

Laura Chihara and Tim Hesterberg make available on their web site a comprehensive collection of files illustrating the use of the R programming language in support of their course. The following scripts take those files as their starting point and add a few pirouettes and swirls that I dreamed up while working through the material. An alert reader could separate the contributions of the book's authors from mine by comparing the original files, which are available online, line by line with the modified ones here. The bottom line, though, is that virtually all of the leadership was provided by the authors, and the following scripts are the product of an appreciative student working to absorb this material. These notes do not contain solutions to any of the exercises in their text.

- 1. Data and Case Studies (R script)
- 2. Exploratory Data Analysis (R script)
- 3. Hypothesis Testing (R script)
- 4. Sampling Distributions (R script)
- 5. The Bootstrap (R script)
- 6. Estimation (R script)
- 7. Classical Inference: Confidence Intervals (R script)
- 8. Classical Inference: Hypothesis Testing (R script)
- 9. Regression (R script)
- 10. Bayesian Methods (R script)
- 11. Additional Topics

Bradley Efron and Robert Tibshirani's *An Introduction to the Bootstrap* is a marvelous introduction to wide swaths of modern statistics. Fresh and engaging, the examples and discussions create energy and excitement in the exploring student. It is a fine bridge to the survey of much of contemporary statistics presented in Carl Morris and Robert Tibshirani's *The Science of Bradley Efron: Selected Papers*.

- 1. Introduction (R script, aspirin.r)
- 2. The Accuracy of a Sample Mean (R script, mouse.r)
- 3. Random Samples and Probabilities (R script)
- 4. The Empirical Distribution Function and the Plug-In Principle
- 5. Standard Errors and Estimated Standard Errors
- 6. The Bootstrap Estimate of Standard Error (R script)
- 7. Bootstrap Standard Errors: Some Examples (R script)
- 8. More Complicated Data Structures (R script)
- 9. Regression Models (R script)
- 10. Estimates of Bias (R script)
- 11. The Jackknife (R script)
- 12. Confidence Intervals Based on Bootstrap "Tables" (R script)
- 13. Confidence Intervals Based on Bootstrap Percentiles
- 14. Better Bootstrap Confidence Intervals
- 15. Permutation Tests
- 16. Hypothesis Testing with the Bootstrap
- 17. Cross-Validation and Other Estimates of Prediction Error
- 18. Adaptive Estimation and Calibration
- 19. Assessing the Error in Bootstrap Estimates
- 20. A Geometrical Representation for the Bootstrap and Jackknife
- 21. An Overview of Nonparametric and Parametric Inference
- 22. Further Topics in Bootstrap Confidence Intervals
- 23. Efficient Bootstrap Computations
- 24. Approximate Likelihoods
- 25. Bootstrap Bioequivalence
- 26. Discussion and Further Topics
- Appendix: Software for Bootstrap Computations
- References

The following notes take full advantage of the chapter-by-chapter transcripts available on the author's web site.

- Introduction (R script)
- Estimation (R script)
- Inference (R script)
- Diagnostics
- Problems with the Predictors
- Problems with the Error
- Transformation
- Variable Selection
- Shrinkage Methods
- Statistical Strategy and Model Uncertainty
- Insurance Redlining - A Complete Example
- Missing Data
- Analysis of Covariance
- One way ANOVA
- Factorial Designs
- Block Designs

- 11. Regression (R script)
- 12. The Analysis of Variance (R script, smokers.csv)
- 13. Randomized Block Designs

- 11. Linear Statistical Models (R script)

Efron, Bradley, 2005, Bayesians, Frequentists, and Scientists, Journal of the American Statistical Association. March 1, 2005, 100(469): 1-5.

Efron, Bradley, 2010, Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction, Institute of Mathematical Statistics Monographs. Cambridge University Press.

Efron, Bradley, Winter 2009/2010, Stats 329, Large-Scale Simultaneous Inference, and accompanying data sets and programs. Read the Foreword for a concise introduction to this Stanford statistics course taught by Bradley Efron.

Benjamini and Hochberg, 1995, Controlling the false discovery rate: a practical and powerful approach to multiple testing

Efron and Tibshirani, 2002, Empirical Bayes methods and false discovery rates for microarrays

Efron, Bradley, 1979, Bootstrap Methods: Another Look at the Jackknife, The Annals of Statistics, Volume 7, Number 1 (1979), 1-26. Open access. Click on "PDF file."

Efron, Bradley, 1987, The Jackknife, the Bootstrap, and Other Resampling Plans. CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics.

Efron, Bradley, and Robert J. Tibshirani, 1991, Statistical Data Analysis in the Computer Age, Science 12 July 1991: Vol. 253, no. 5018, pp. 390-395

Efron, Bradley, and Robert J. Tibshirani, 1994, An Introduction to the Bootstrap, Chapman & Hall/CRC.

- View the book on Google Books.
- Data files, errata, and software supporting this text are available from the StatLib archive at Carnegie Mellon University. Log into statlib with the username "statlib" and search for the file "bootstrap.funs." Follow the enclosed directions to unpack the file with the shell command "sh bootstrap.funs"
- The R package "bootstrap" in support of this book is available on CRAN, but note that it recommends that new projects use the package "boot".
- Bradley Efron's home page.
- Robert Tibshirani's home page.
- Some related references.

Hastie, Trevor, Robert Tibshirani, and Jerome Friedman, 2009, The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second Edition. Springer. A pdf of the book is available from this site, by arrangement with the publisher. Preface to first (2001) and second (2009) editions. Book review of the first edition.

Albert, Jim, 2009, Bayesian Computation with R (Use R!), 2nd ed., Springer. See the author's website.

Robert, Christian, and George Casella, 2009, Introducing Monte Carlo Methods with R (Use R!), Springer. Publisher's web page.

Hoff, Peter D., 2009, A First Course in Bayesian Statistical Methods, Springer Texts in Statistics. Author's web site and the book's web site.

Gelman, Andrew, John B. Carlin, Hal S. Stern, Donald Rubin, 2003, Bayesian Data Analysis, Second Edition, Chapman & Hall/CRC Texts in Statistical Science

Carlin, Bradley P., and Thomas A. Louis, 2008, Bayesian Methods for Data Analysis, Third Edition, Chapman & Hall/CRC Texts in Statistical Science.

Casella, George, 1985, An Introduction to Empirical Bayes Data Analysis, The American Statistician, Vol. 39, No. 2 (May, 1985), pp. 83-87

Faraway, Julian J., 2002, Practical Regression and Anova using R (available from CRAN)

Faraway, Julian J., 2005, Linear Models with R, Chapman & Hall/CRC. Author's web site. Publisher's web site

Faraway, Julian J., 2006, Extending the Linear Model with R. Generalized Linear, Mixed Effects and Nonparametric Regression Models, Chapman & Hall/CRC. Author's web site. Publisher's web site

Fox, John, and Sanford Weisberg, 2011, An R Companion to Applied Regression, 2nd Edition, Sage Publications, Inc.

Fisher, R. A., 1922, On the mathematical foundations of theoretical statistics, with commentary by Seymour Geisser, 1992, in Breakthroughs in Statistics: Foundations and basic theory, by Samuel Kotz and Norman Lloyd Johnson, Springer

R. A. Fisher -- A Guide to R. A. Fisher

Erich Lehmann -- In Memoriam; DeGroot, A conversation with Erich Lehmann (full text, open access; click on "PDF file"); Willem R. van Zwet, Remembering Erich Lehmann

Wickham, Hadley, 2009, ggplot2. Elegant Graphics for Data Analysis, Springer. Author's web site and the ggplot2 web site

Sarkar, Deepayan, 2008, Lattice. Multivariate Data Visualization with R, Springer. Author's web site and the book's web site

Murrell, Paul, 2011, R Graphics, CRC Press, Taylor and Francis Group, Chapman & Hall

Venables, W. N., and B. D. Ripley, 2010, Modern Applied Statistics with S, Fourth Edition, Statistics and Computing, Springer

Dalgaard, Peter, 2008, Introductory Statistics with R, Second Edition, Statistics and Computing, Springer

Verzani, John, 2005, Using R for Introductory Statistics, Chapman and Hall/CRC

MIT OpenCourseWare 18.443 Statistics for Applications, F2003

MIT OpenCourseWare 18.443 Statistics for Applications, F2006

MIT OpenCourseWare 18.443 Statistics for Applications, S2009

Recommended software for biostatistics at Vanderbilt

Khan Academy, Statistics course