Home page for EEB 581:
Advanced Topics in Biological Statistics



Course information

This is a lecture course covering various topics in statistical analysis (see below). I assume students have a modest background in statistics, and we build on it by discussing a number of topics. The goal is to give students a better feel for statistics and to leave them much less intimidated by methods of statistical analysis.

Course Objectives: We will introduce statistical distributions and the computation of statistical power for various designs, matrix algebra useful for statistics and the general linear model, maximum likelihood estimation and testing, Bayesian statistics, and various resampling and randomization methods. The focus is on obtaining a general understanding of these statistical tools rather than on which computer programs to use. Thus, the course will be somewhat more theoretical than applied, but students will leave with a much broader understanding than they would get from a course concerned with running various statistical packages.

Math/Stats background required: Some knowledge of calculus and a previous stats course (one that introduced covariance, regression, and ANOVA) are desirable.

Computer Programs: While the course focuses on basic statistical concepts, we will also introduce the R computing language. R is one of the most powerful and flexible statistical programs, with a very large (and growing) library. Bad news: it is a little hard to get started with. Good news: FREE!! (It is essentially S+, for those of you who have heard of that.) More details are given below.

Class textbooks/reading: There is no formal textbook for the class, although there will be extensive readings for most lectures (posted as pdf files below).

You also might wish to buy one or more of the following textbooks on using R

Meeting time and Place: Tuesday and Thursday, 9:30 a.m. - 10:45 a.m., LSS 340

Instructor: Bruce Walsh:

The R Statistical Programming Language

The R Project for Statistical Computing website

UA R users group website

US Mirror site for downloading R. Current versions for

An Introduction to R (Walsh notes)

  1. R as a basic statistical calculator for obtaining p values and plotting probability distributions (6 page pdf file).
  2. Power Calculations in R (4 page pdf file).
  3. Matrix Calculations in R (3 page pdf file).
  4. Bootstrap and jackknife in R (5 page pdf file).
  5. The Metropolis-Hastings Sampler in R (4 page pdf file).
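To give a rough flavor of the kind of material the first three notes cover, here is a minimal sketch using only base R functions (`pnorm`, `pchisq`, `power.t.test`, `solve`). The numbers are made up for illustration and are not taken from the notes themselves.

```r
# Illustrative values only; none of these numbers come from the course notes.

# 1. R as a calculator: two-sided p value for a z statistic of 1.96
p_two_sided <- 2 * pnorm(-abs(1.96))

# Upper-tail p value for a chi-square statistic of 3.84 with 1 df
p_chisq <- pchisq(3.84, df = 1, lower.tail = FALSE)

# 2. Power of a two-sample t test: n = 20 per group, true difference
#    of one standard deviation, alpha = 0.05
pw <- power.t.test(n = 20, delta = 1, sd = 1, sig.level = 0.05)

# 3. Matrix calculations: invert a 2 x 2 matrix and check that
#    A %*% solve(A) recovers the identity matrix
A <- matrix(c(2, 1, 1, 3), nrow = 2)
A_inv <- solve(A)
identity_check <- all.equal(A %*% A_inv, diag(2))

print(p_two_sided)
print(p_chisq)
print(pw$power)
print(identity_check)
```

Leaving `n` out of `power.t.test` and supplying `power` instead makes R solve for the sample size needed to reach that power.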

pdf files of the official R manuals

Lecture schedule

(VERY tentative, topics may be added/deleted per wishes of class)

| Date | Day | Lect. # | Topic | Handouts | Problem Sets |
|---|---|---|---|---|---|
| 12 Jan | Thursday | 1 | Overview: Probabilities and Probability Distributions | Univariate Distributions | |
| 17 Jan | Tuesday | 2 | Overview: Univariate distributions | | PS 1 |
| 19 Jan | Thursday | 3 | Overview: Bivariate distributions | Bivariate Distributions | |
| 24 Jan | Tuesday | 4 | Normal, t, Chi-square distributions | (1): Distributions of functions of normals; (2): R as a basic statistical calculator | PS 1 Solutions |
| 26 Jan | Thursday | 5 | F distributions | | PS 2 |
| 31 Jan | Tuesday | 6 | Power of tests 1: Normals | (1): Power; (2): Simple power calculations in R | PS 2 Solutions |
| 2 Feb | Thursday | 7 | Power of tests 2: Fixed Effects ANOVAs | | |
| 7 Feb | Tuesday | 8 | Power of tests 3: Random Effects ANOVAs | | PS 3 |
| 9 Feb | Thursday | | No class, Walsh at NIH | Central Limit theorem problem | PS 4 |
| 14 Feb | Tuesday | 9 | Matrix algebra 1: addition, multiplication | (1): Intro to Matrix Algebra and linear models; (2): Matrix Calculations in R | PS 3 due |
| 16 Feb | Thursday | 10 | Matrix algebra 2: Inversion and the Multivariate Normal | | PS 4 due |
| 21 Feb | Tuesday | 11 | Matrix algebra 3: The Multivariate Normal | | |
| 23 Feb | Thursday | 12 | Matrix algebra 3: The Multivariate Normal | | PS 5 |
| 28 Feb | Tuesday | 13 | General linear model (GLM) 1: OLS | General linear Model; Summary of GLM results | PS 5 due |
| 2 March | Thursday | 14 | GLM 2: Generalized inverses, systems of equations | Generalized inverses | PS 6 |
| 7 March | Tuesday | 15 | GLM 3: Geometry of matrices, PC | matrix Eigenstructure | PS 7; PS 6 due |
| 9 March | Thursday | 16 | | | PS 7 due |
| 14 March | Tuesday | | Spring Break | | |
| 16 March | Thursday | | Spring Break | | |
| 21 March | Tuesday | 17 | | | |
| 23 March | Thursday | 18 | | | PS 8 due |
| 28 March | Tuesday | 19 | Maximum Likelihood estimation, Likelihood ratio tests | MLEs | PS 9 |
| 30 March | Thursday | | No class, Walsh at UCSF | | |
| 4 April | Tuesday | 20 | Generalized Linear models | Generalized Linear models | PS 9 due |
| 6 April | Thursday | | No class, Walsh seminar at University of Florida | | |
| 11 April | Tuesday | 21 | Resampling methods 1: Randomization and the Jackknife | Resampling methods | |
| 13 April | Thursday | 22 | Resampling methods 2: The Bootstrap | Bootstrap and Jackknife in R | PS 10 |
| 18 April | Tuesday | 23 | Multiple comparisons 1: Sequential Bonferroni corrections and the False Discovery Rate | Multiple comparisons | |
| 20 April | Thursday | 24 | Multiple comparisons 2: the False Discovery Rate | | PS 10 due |
| 25 April | Tuesday | 25 | Bayesian methods: Introduction | Bayesian methods | |
| 27 April | Thursday | 26 | Bayesian methods: Advanced topics | | |
| 2 May | Tuesday | 27 | MCMC methods | MCMC and Gibbs Sampler; The Metropolis-Hastings Sampler in R | |

Problem Sets

| Problem set | Topic | Due date | Solutions |
|---|---|---|---|
| 1 | Regressions, covariances | 24 Jan | PS 1 Solutions |
| 2 | Confidence Intervals | 31 Jan | PS 2 Solutions |
| 3 | Power with z and t tests | 14 Feb | PS 3 Solutions |
| 4 | Power with F tests | 16 Feb | PS 4 Solutions |
| 5 | Basic Matrices, MVN | 28 Feb | PS 5 Solutions |
| 6 | Intro to GLM | 7 March | PS 6 Solutions |
| 7 | Generalized Inverses | 9 March | PS 7 Solutions |
| 8 | More GLM fun | 23 March | PS 8 Solutions |
| 9 | Matrix Eigenstructure | 4 April | PS 9 Solutions |
| 10 | Resampling Approaches | 20 April | PS 10 Solutions |
| 11 | MCMC | | PS 11 Solutions |
Data for PS 10!

data <- c(8.26, 6.33, 10.4, 5.27, 5.35, 5.61, 6.12, 6.19, 5.2, 7.01, 8.74, 7.78, 7.02, 6, 6.5, 5.8, 5.12, 7.41, 6.52, 6.21, 12.28, 5.6, 5.38, 6.6, 8.74)
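Since PS 10 covers resampling approaches, one way to get started with this vector is sketched below. The choice of statistic (the sample mean), the number of bootstrap replicates, and the seed are illustrative assumptions, not part of the assignment; replace `stat` with whatever the problem actually asks for.

```r
# The data vector as posted for PS 10
data <- c(8.26, 6.33, 10.4, 5.27, 5.35, 5.61, 6.12, 6.19, 5.2, 7.01,
          8.74, 7.78, 7.02, 6, 6.5, 5.8, 5.12, 7.41, 6.52, 6.21,
          12.28, 5.6, 5.38, 6.6, 8.74)
n <- length(data)

# Assumption: the statistic of interest is the mean; swap in any function of x
stat <- function(x) mean(x)

# Nonparametric bootstrap: resample n values with replacement, B times
set.seed(1)   # arbitrary seed, only for reproducibility
B <- 2000     # arbitrary number of replicates
boot_stats <- replicate(B, stat(sample(data, size = n, replace = TRUE)))
boot_se <- sd(boot_stats)                          # bootstrap standard error
boot_ci <- quantile(boot_stats, c(0.025, 0.975))   # percentile interval

# Delete-one jackknife: recompute the statistic leaving out each point in turn
jack_stats <- sapply(seq_len(n), function(i) stat(data[-i]))
jack_se <- sqrt((n - 1) / n * sum((jack_stats - mean(jack_stats))^2))

print(boot_se)
print(boot_ci)
print(jack_se)
```

When the statistic is the mean, the jackknife standard error reduces exactly to `sd(data) / sqrt(n)`, which makes a handy check that the resampling code is wired up correctly.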

Selected Statistics References

  1. Randomization, Bootstrap and Monte Carlo Methods in Biology (2nd ed). Bryan F. J. Manly (1997).

  2. Bayesian Hierarchical Modeling. David Draper. You can download a postscript file of the draft version from Draper's website.

  3. Generalized, Linear, and Mixed Models. Charles E. McCulloch and Shayle R. Searle. (2001).

  4. Categorical Data Analysis, (2nd Ed.). Alan Agresti. (2002).

  5. Multivariate Statistics: A Practical Approach. Bernhard Flury and Hans Riedwyl. (1988).

  6. Applied Nonparametric Statistical Methods. P. Sprent. (1989).

  7. Experiments: Planning, Analysis, and Parameter Design Optimization. C. F. Jeff Wu and Michael Hamada. (2000).

  8. Statistical Analysis with Missing Data. Roderick J. A. Little and Donald B. Rubin. (2002).

  9. Bayesian Statistics: An Introduction (2nd ed). Peter M. Lee (1997).

  10. Applying Generalized Linear Models. James K. Lindsey (1997).

  11. Tools for Statistical Inference: Methods for exploration of posterior distributions and likelihood functions (3rd ed). Martin Tanner (1996).

  12. Statistical Principles in Experimental Design (3rd ed). B. J. Winer, Donald R. Brown, and Kenneth M. Michels (1991).

  13. Intuitive Biostatistics. Harvey Motulsky.

  14. Statistics as Principled Argument. Robert Abelson.

  15. Markov chain Monte Carlo: Stochastic simulation for Bayesian inference. Dani Gamerman (1997).

  16. The Ecological Detective. Ray Hilborn and Marc Mangel (1997).

  17. Mathematical and Statistical Methods for Genetic Analysis. Kenneth Lange (1997).

  18. Statistical Data Analysis. Glen Cowan (1998).

  19. Design and Analysis of Ecological Experiments. Samuel Scheiner and Jessica Gurevitch, Eds (1993).

  20. Regression Modeling Strategies. Frank E. Harrell, Jr. (2001).