**Changes for 2020**: I'll post new notes in 2020 as this class will change. **There won't be a Spring 2020 class. This class will be offered in the Fall semester starting in 2020**. The new version will shorten the introductory part since our PhD students are now required to take a Biostatistics sequence during their first year (BIOS 6611, 6612). The second part of the class will now add methods for observational data: difference-in-differences, instrumental variables, regression discontinuity, and propensity scores. In addition, I'll cover two-part models and GLM models, focusing on the analysis of cost data.

Prior to 2020:

This class is the first of a two-class sequence on methods in health services research and policy evaluation. It emphasizes both statistical theory and its implementation. Topics are covered from different methodological traditions: econometrics, statistics, and epidemiology. There is a lot of "translation" from one discipline to another. For example, causality is covered using the new causal inference literature but also the way economists have traditionally understood causality (i.e., zero conditional mean assumption, selection on observables). The linear regression model is covered in the traditional way in econometrics (ordinary least squares, the Gauss-Markov theorem) but also in the way the "general" linear model is presented in analysis of experiments (ANOVA, etc). Maximum likelihood estimation is covered early on as a general framework for model selection using likelihood ratio tests. Models are interpreted in several ways, with emphasis on using analytical and numerical derivatives (that is, marginal effects; see Lecture 23). Simulations are used for every topic, including using the estimated model for Bayesian-like hypothesis testing. Bayesian methods are also introduced. Other topics include exploratory data analysis, logit/probit, Poisson regression, GLM models, model selection, and bootstrapping. This class is also an introduction to Stata, but we do compare Stata to R and SAS when relevant.
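To give a flavor of what "analytical and numerical derivatives" means for marginal effects, here is a minimal sketch in Python (not the Stata code used in the lectures). It assumes a logistic model with one predictor; the coefficients `b0` and `b1` and the data values in `xs` are made up for illustration and do not come from any lecture. The analytical marginal effect of x on the predicted probability p is b1 * p * (1 - p), and the numerical version approximates the same derivative with a central difference.

```python
import math


def predicted_prob(x, b0, b1):
    """Predicted probability from a logistic model with one predictor."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def analytical_marginal_effect(x, b0, b1):
    """Closed-form derivative of the predicted probability with respect to x."""
    p = predicted_prob(x, b0, b1)
    return b1 * p * (1.0 - p)


def numerical_marginal_effect(x, b0, b1, h=1e-5):
    """Central-difference approximation to the same derivative."""
    upper = predicted_prob(x + h, b0, b1)
    lower = predicted_prob(x - h, b0, b1)
    return (upper - lower) / (2.0 * h)


# Hypothetical coefficients and data, chosen only for illustration.
b0, b1 = -1.0, 0.5
xs = [0.0, 1.0, 2.0, 3.0, 4.0]

# Average marginal effect: average the derivative over the observed x values.
ame_analytical = sum(analytical_marginal_effect(x, b0, b1) for x in xs) / len(xs)
ame_numerical = sum(numerical_marginal_effect(x, b0, b1) for x in xs) / len(xs)

print("Average marginal effect (analytical):", ame_analytical)
print("Average marginal effect (numerical): ", ame_numerical)
```

The two averages should agree to many decimal places. The numerical approach is useful because it works the same way for models where the closed-form derivative is tedious to write down.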

Email me if you want the Stata code for lectures. Lectures are written in LaTeX. Code, problem sets, answer keys, and readings are available on Canvas for registered students.

Syllabus Spring 2019: [HSMP7607_spring2019_syllabus.pdf]

Lecture 1: Overview of regression analysis and class [week 1_intro.pdf]

Lecture 2: Introduction to Stata [week 1 stata.pdf]

Lectures 3 and 4: Review of probability and mathematical statistics [week 2 probability.pdf]

Lecture 5: Causal inference [week 3 causal.pdf]

Lecture 6: Simple linear regression [week 3 SLR.pdf]

Lecture 7: Simple linear regression (properties, testing) [week 4 SLR II.pdf]

Lecture 8: Simple linear regression (fit, confidence intervals, simulations) [week 4 SLR III.pdf]

Lecture 9: Multiple linear regression [week 5 MLR.pdf]

Lecture 10: Multiple linear regression (interpretation) [week 5 MLR II.pdf]

Lecture 11: Maximum likelihood estimation (MLE) [week 6 MLE.pdf]

Lecture 12: Regression assumptions diagnostics I [week 7 diagnosis.pdf]

Lecture 13: Regression diagnostics II [week 7 diagnosis II.pdf]

Lecture 14: Qualitative predictors (ANOVA, effects coding, etc) [week 8 qualitative.pdf]

Lecture 15: Modeling I [week 9 modeling.pdf]

Lecture 16: Modeling II (variable transformations, etc) [week 9 modeling II.pdf]

Lecture 17: Heteroskedasticity I [week 10 heteroskedasticity.pdf]

Lecture 18: Heteroskedasticity II [week 10 heteroskedasticity II.pdf]

Lecture 19: Collinearity [week 11 collinearity.pdf]

Lecture 20: Bias-variance, adjusting [week 11 interpretation.pdf]

Lecture 21: Linear probability model, logistic, probit [week 12 lpn logit.pdf]

Lecture 22: Logistic regression [week 12 logistic.pdf]

Lecture 23: Margins and marginal effects [week 13 margins.pdf] (updated, 2019)

Lecture 24: Probit, variable selection (AIC, BIC) [week 14 selection.pdf]

Lecture 25: Bootstrap and methods II [week 14 boostrap.pdf]