Statistics 218:  Fall 2007
Statistical Data Mining

course policies, office hours, and general information


Logistics

  • Lecture Times and Location:  Wednesdays and Fridays, 10:05 - 11:20 a.m., in 025 Old Chemistry Building.

  • Instructor:  Professor David Banks, 210A Old Chemistry Building, 684-3743, banks@stat.duke.edu

  • Instructor's Office Hours:  Wednesday, 3:00 - 5:00 p.m. (except when travelling).  Other times are available by appointment.


Readings

The primary text for the course is Hastie, Tibshirani, and Friedman’s The Elements of Statistical Learning.   We shall also read recent research papers.

Course lectures will be posted on the class website the day before the lecture.  You are encouraged to print these out and bring them to class, so that you can focus on the material rather than on taking notes.  To save trees, you can use some special print options:

  • For PCs:  (1) Print - ePrint; (2) Properties (just to the right of the printer selector): (a) "Print on both sides", (b) "Pages per sheet" (four pages per sheet works well for the lecture notes), (c) OK; (3) OK/Print.

  • For Macs:  (1) File - Print; (2) Click on the box that says "Copies and Pages"; (3) Select "Layout"; (4) Select "Two-sided Printing" (long-edge is usually preferred); (5) Properties - "Pages per sheet" (four pages per sheet works well).

Graded Work

Graded work for the course will consist of homework, a survey article, and a research project (write-up and presentation), weighted as follows:

 

  Homework                            25%
  Survey Article                      25%
  Research Project:  Write-Up         25%
  Research Project:  Presentation     25%

There will be several homework assignments over the course of the semester.  Some may require statistical computing.  Additionally, each student is expected to work with the instructor to identify an appropriate research topic and develop it.  That project will entail both a write-up of the work and a 30-minute oral presentation to the class at the end of the semester.  Finally, each student is expected to write a short survey piece (approximately 5 to 8 pages) on a data mining topic for submission to either Wikipedia or Statistical Surveys.

Each assignment will receive a letter grade. An 'A+' corresponds to a score of 12, an 'A' corresponds to 11, an 'A-' is a 10, a 'B+' is a 9, and so forth. The final grade in the course is determined by the weighted average (as per the table above) of these scores. Breakpoints for grades occur at the halfway points. For example, the lowest possible average that gives an 'A-' for the semester is 9.5.
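
As a rough illustration of the arithmetic described above, the following Python sketch computes a weighted average and converts it back to a letter grade. It is not an official grade calculator. The function name `final_grade` and the component keys are hypothetical. The scores for 'B' and below are an assumed continuation of "and so forth" (the syllabus states only 'A+' through 'B+'), and homework is treated as a single combined letter grade.

```python
import math

# Letters listed from lowest to highest; the list index is the numeric score.
# Only A+ = 12, A = 11, A- = 10, B+ = 9 are stated in the syllabus.
# Everything from B downward is an ASSUMED continuation of "and so forth".
SCALE = ["F", "D-", "D", "D+", "C-", "C", "C+",
         "B-", "B", "B+", "A-", "A", "A+"]
SCORE = {letter: value for value, letter in enumerate(SCALE)}

# Weights from the Graded Work table (keys are hypothetical names).
WEIGHTS = {
    "homework": 0.25,
    "survey": 0.25,
    "writeup": 0.25,
    "presentation": 0.25,
}


def final_grade(letters):
    """Return (weighted average, course letter) for a dict of letter grades."""
    average = sum(WEIGHTS[part] * SCORE[letters[part]] for part in WEIGHTS)
    # Breakpoints fall at the halfway points, so 9.5 already counts as A-.
    index = min(math.floor(average + 0.5), len(SCALE) - 1)
    return average, SCALE[index]


example = {"homework": "A-", "survey": "A-",
           "writeup": "B+", "presentation": "B+"}
print(final_grade(example))
```

In this example the four scores are 10, 10, 9, and 9, so the weighted average is 9.5, which sits exactly on the breakpoint and yields an 'A-' for the semester.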


Academic Honesty

You are expected to abide by Duke's Community Standard for all work for this course.  Students are allowed to discuss homework problem strategies with each other, but the solution write-ups must be their own work.