A Guide to Using Multilevel Models for the Evaluation of Program Impacts
wp-01-33.pdf — PDF document, 515 kB (527709 bytes)
Author(s): Angeles G, Mroz T A
The purpose of this essay is to help researchers investigating the impacts of health, family planning, and nutrition programs understand the importance and relevance of multilevel analysis in empirical evaluations of those programs' impacts. The discussion first defines what it means to have a multilevel model and then examines the statistical properties of estimators when the data have a hierarchical structure. Throughout the essay we illustrate the basic points with Monte Carlo experiments, in which we simulate data and outcomes according to known and exact rules. After simulating the data, we use a variety of estimation approaches to estimate the underlying relationships. Because we know the "true" way the "world" operates in these experimental settings, the Monte Carlo experiments allow us to evaluate how well particular statistical procedures uncover the "true" form of the statistical relationship. Based on these experiments and some direct comparisons of the statistical properties of the estimators we consider, we present a set of recommended approaches for using multilevel data to assess the overall effectiveness of programs.
We focus our analysis on simple multilevel models in which the effects of observed covariates are fixed and do not vary across units of the hierarchical structure; the residual term of the linear regression model, however, may have a simple hierarchical structure. Our primary concern is how well various estimators measure the impacts of observed covariates on outcomes of interest. We assess the unbiasedness of the point estimators, the precision of the estimators, and the ability of the point and standard error estimators to provide unbiased hypothesis tests. Our evaluations are restricted to simple linear regression models with continuous outcomes estimated by ordinary least squares (OLS) and to simple, two-level models estimated by maximum likelihood.
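One way to write down the kind of model described above is a two-level linear model with a random intercept. The notation below is an illustrative assumption and is not taken from the essay: i indexes individuals, j indexes communities, x is an individual-level covariate, and z is a community-level (program) covariate.

```latex
y_{ij} = \beta_0 + \beta_1 x_{ij} + \beta_2 z_j + \varepsilon_{ij},
\qquad
\varepsilon_{ij} = u_j + e_{ij},
\qquad
u_j \sim (0, \sigma_u^2), \quad e_{ij} \sim (0, \sigma_e^2)

\operatorname{Corr}(\varepsilon_{ij}, \varepsilon_{i'j})
  = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
  \quad \text{for } i \neq i'
```

In this sketch the coefficients are fixed across communities, and the hierarchical structure enters only through the shared community component u_j of the residual, which makes residuals within the same community correlated.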
Given this scope, the essay reaches three main conclusions. First, if the data do have a multilevel error structure and one fails to account for it when estimating standard errors, one can dramatically overstate the significance of the estimated statistical relationships. In particular, a researcher who fails to use procedures that adjust estimated standard errors for the multilevel error structure can "uncover" statistically significant relationships that do not exist. To obtain correct statistical inferences, one need not use complete multilevel modeling approaches; statistical procedures that account ex post for the clustering in the data when calculating standard errors will provide correct standard errors. Second, there typically is little efficiency loss in estimating the impact of a community-level variable on individual-level outcomes if one ignores the multilevel error structure and uses ordinary least squares (OLS) to estimate the impacts of covariates on the outcome of interest. Accounting for the multilevel structure can, however, yield sizable increases in efficiency for estimators of the impacts of individual-level variables, but these effects are typically of less interest in program evaluation studies. The third conclusion is more tentative than the first two. It concerns problems that can arise in multilevel models when one assumes a simple linear relationship but the true relationship is nonlinear. In particular, if one incorrectly imposes a simple linear specification for the observed regressors when a more complex function actually describes the mean effects, it is possible to "uncover" a multilevel error structure that does not exist.
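The first conclusion can be illustrated with a small Monte Carlo experiment in the spirit of the essay. The sketch below is not the authors' code: the sample sizes, variance components, function names, and the particular cluster-robust (sandwich) variance formula with a common small-sample correction are all assumptions chosen for illustration. The true effect of the community-level variable is set to zero, so every rejection of the null hypothesis is a false positive.

```python
import numpy as np

def simulate(rng, n_clusters=40, n_per=25, sigma_u=1.0, sigma_e=1.0):
    """Simulate one data set; the community-level covariate z has NO true effect."""
    cluster = np.repeat(np.arange(n_clusters), n_per)
    z = rng.normal(size=n_clusters)                  # community-level covariate
    u = rng.normal(scale=sigma_u, size=n_clusters)   # shared community error
    e = rng.normal(scale=sigma_e, size=cluster.size) # individual error
    y = 1.0 + 0.0 * z[cluster] + u[cluster] + e
    X = np.column_stack([np.ones(cluster.size), z[cluster]])
    return y, X, cluster

def ols_with_ses(y, X, cluster):
    """OLS estimates with naive and cluster-robust standard errors."""
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta

    # Naive standard errors: assume independent, homoskedastic errors.
    sigma2 = resid @ resid / (n - k)
    se_naive = np.sqrt(np.diag(sigma2 * xtx_inv))

    # Cluster-robust standard errors: sum the scores within each cluster.
    n_clusters = cluster.max() + 1
    cluster_scores = np.zeros((n_clusters, k))
    np.add.at(cluster_scores, cluster, X * resid[:, None])
    meat = cluster_scores.T @ cluster_scores
    correction = (n_clusters / (n_clusters - 1)) * ((n - 1) / (n - k))
    se_cluster = np.sqrt(np.diag(correction * xtx_inv @ meat @ xtx_inv))
    return beta, se_naive, se_cluster

def rejection_rates(n_reps=500, seed=0, crit=1.96):
    """Share of replications rejecting the (true) null of no effect of z."""
    rng = np.random.default_rng(seed)
    reject_naive = 0
    reject_cluster = 0
    for _ in range(n_reps):
        y, X, cluster = simulate(rng)
        beta, se_naive, se_cluster = ols_with_ses(y, X, cluster)
        reject_naive += int(abs(beta[1] / se_naive[1]) > crit)
        reject_cluster += int(abs(beta[1] / se_cluster[1]) > crit)
    return reject_naive / n_reps, reject_cluster / n_reps

if __name__ == "__main__":
    naive, clustered = rejection_rates()
    print(f"false-positive rate, naive SEs:          {naive:.3f}")
    print(f"false-positive rate, cluster-robust SEs: {clustered:.3f}")
```

Under these assumed settings, half of the residual variance is shared within communities, so the naive test should reject far more often than the nominal 5 percent, while the cluster-adjusted test should stay much closer to it. The point estimates are identical in both cases; only the standard errors differ.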
Taken as a whole, these conclusions suggest that a fruitful estimation approach in practice is to rely on simple estimation procedures such as ordinary least squares, adjust the estimated standard errors to account for possible multilevel error structures, and examine whether nonlinear relationships describe the data better than simple linear effects. After a thorough examination of the empirical relationship with simple models and adjusted standard errors, one could then use more detailed multilevel models to obtain more precise estimators.
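The check for nonlinearity recommended above matters because a misspecified mean function can mimic community-level error correlation. The following sketch is a hypothetical illustration, not the essay's experiment: the data are generated with a quadratic effect of a community-level covariate and with no community error component at all, and the intraclass correlation of the OLS residuals is estimated with a standard one-way ANOVA estimator for balanced clusters. All names and parameter values are assumptions.

```python
import numpy as np

def residual_icc(resid, cluster, n_per):
    """One-way ANOVA estimate of the intraclass correlation (balanced clusters)."""
    n_clusters = cluster.max() + 1
    means = np.bincount(cluster, weights=resid) / n_per
    grand_mean = resid.mean()
    ms_between = n_per * np.sum((means - grand_mean) ** 2) / (n_clusters - 1)
    ms_within = np.sum((resid - means[cluster]) ** 2) / (n_clusters * (n_per - 1))
    return (ms_between - ms_within) / (ms_between + (n_per - 1) * ms_within)

def ols_residuals(y, X):
    """Residuals from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def spurious_icc_demo(n_clusters=200, n_per=20, seed=1):
    """Residual ICC under a linear fit vs. a quadratic fit of the same data."""
    rng = np.random.default_rng(seed)
    cluster = np.repeat(np.arange(n_clusters), n_per)
    z = rng.normal(size=n_clusters)[cluster]   # community-level covariate
    # True model: quadratic in z, independent errors, NO community effect.
    y = 1.0 + z ** 2 + rng.normal(size=cluster.size)

    ones = np.ones_like(z)
    resid_linear = ols_residuals(y, np.column_stack([ones, z]))
    resid_quad = ols_residuals(y, np.column_stack([ones, z, z ** 2]))
    return (residual_icc(resid_linear, cluster, n_per),
            residual_icc(resid_quad, cluster, n_per))

if __name__ == "__main__":
    icc_linear, icc_quad = spurious_icc_demo()
    print(f"residual ICC, linear specification:    {icc_linear:.3f}")
    print(f"residual ICC, quadratic specification: {icc_quad:.3f}")
```

In this assumed setup the linear specification leaves the omitted quadratic term in the residual, and because that term is constant within a community it shows up as apparent within-community correlation. Once the quadratic term is included, the estimated residual correlation should be close to zero.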
This document is not available in print from MEASURE Evaluation.