Statistics 122/290: Bayesian and Modern Data Analysis
  Spring 2012


Course Description

In this course, we focus on the principles of data analysis and computer-intensive, modern statistical modeling. Topics include Bayesian inference, prior and posterior distributions, regression modeling, hierarchical models, model checking and selection, missing data, and stochastic simulation by Markov chain Monte Carlo, including Gibbs sampling and Metropolis algorithms. We emphasize the use of high-level statistical software to write computer programs for performing data analysis.

Course Objectives

Logistics

Prerequisites

Undergraduates are expected to have taken STA 104, STA 114, and either STA 121 (preferred) or ECON 139. Graduate students are expected to have taken STA 213 or a similar class in mathematical statistics. Students who do not know about sampling models, probability density functions, or likelihood functions should be enrolled in STA 114 or STA 213 rather than STA 122/290. All students are expected to have taken a course in multivariable calculus, such as MTH 102 or MTH 103. Familiarity with Bayesian methods is not assumed. Students who are not sure about their preparation should discuss it with the instructor.

Readings

The required text is:

Hoff, P. D. (2009), A First Course in Bayesian Statistical Methods, Springer. ISBN 978-0-387-92299-7.

This book is available for purchase at the bookstore. It is also available for free as an online book through the Duke Library website.

A recommended but not required text is:

Adler, J. (2009), R in a Nutshell, O'Reilly Media. ISBN 978-0-596-80170-0.

Many other resources are useful for learning about Bayesian methods and software packages used in statistical research; see the link to suggestions for other readings. These are not required texts.

Computing

We will use the statistical software package R for analyzing data. It can be downloaded for free at http://www.r-project.org/. Alternatively, R is available on the public computers on campus. See the STA 122/290 computing resources page for useful links and tips on R. The page also contains information on the document preparation system LaTeX and the text editor Emacs.

The lab sections will be used for additional computation in R, review of programs used in class, and TA office hours related to computing assignments. Attendance is not required. Any lab is open to anyone.

Calculator

Students do not need a calculator for this course.

Schedule of Topics

We will cover the topics in the table below. The time spent on each topic may differ from what is shown, depending on the interests of the class participants.

Topic                                             Reading                     Lectures
Introduction to Bayesian inference                Hoff, Chapters 1 and 2      2
Bayesian inference for one-parameter models       Hoff, Chapters 3 and 4      3
Bayesian inference for the normal model           Hoff, Chapter 5             1
Gibbs sampling and MCMC convergence diagnostics   Hoff, Chapter 6             3
Bayesian finite population inference              Notes                       1
Multivariate normal distribution                  Hoff, Chapter 7             1
Hierarchical models                               Hoff, Chapter 8             3
Linear regression                                 Hoff, Chapter 9             2
Metropolis-Hastings algorithms                    Hoff, Chapter 10            2
Mixed effects models                              Hoff, Chapter 11            2
Missing data                                      Hoff, Chapter 7 and Notes   3


Graded work

Graded work for the course will consist of two term exams, methods assignments, and a final project.  Students' final grades will be determined as follows:
 
Methods Assignments   40%
Term Exams            40%
Final Project         20%

There are no make-ups for graded work except for medical or familial emergencies or for reasons approved by the instructor before the due date.  See the instructor in advance of relevant due dates to discuss possible alternatives. Students in STA 122 will be graded separately from those in STA 290.

Descriptions of graded work

Methods Assignments:

Methods assignments are posted on the Statistics 122/290 course web site on Sakai. Students turn in these assignments at the beginning of class on the due date. Students are permitted to work with others on the assignments, but each person must write up and turn in their own answers. The methods assignments are designed to build students' knowledge of the computational and mathematical aspects of Bayesian inference and data analysis. For assignments involving mathematical manipulations, students can write answers by hand provided the handwriting is neat; illegible answers will be marked as incorrect. For assignments requiring graphical displays, students must include the graphs in a typed document, prepared with, e.g., LaTeX or Word, along with typed explanations of the graphs; graphs without explanations will be marked as incorrect. For assignments requiring text responses, students are strongly encouraged, but not required, to type their answers.

Term Exams:

There will be two term exams. One will cover mathematical and conceptual aspects of Bayesian inference; the other will cover distributional theory for Markov chain Monte Carlo. Practice problems will be available later in the semester.

Final Project:

The final project will involve a Bayesian data analysis on a topic of your choosing. Further instructions, including due dates, can be found at this link.

Academic honesty

Students are expected to abide by Duke's Community Standard for all work for this course.  Violations of the Standard will result in a failing final grade for this course and will be reported to the Dean of Students for adjudication.  Ignorance of what constitutes academic dishonesty is not a justifiable excuse for violations.

For the exams, students are required to work alone. For the methods assignments, students may work with others, but each student must submit their own answers. For the project, students can choose to work alone or in pairs.