Thursday, June 26, 2014

Review questions RAND report’s attempt to evaluate the effectiveness of a principal preparation program using value-added measures of students’ test scores

An evaluation of the New Leaders principal preparation program concludes that the program has a slightly positive effect on student test scores, though only for certain grade levels, subject areas, and districts. But a review published today cautions that the evaluation, even with such tepid conclusions, overreaches.

Reviewer Edward J. Fuller of Penn State University notes that the evaluation, conducted by researchers at the RAND Corporation, found effect sizes that were quite small where they existed at all, and that “the study’s results are more mixed than its bottom-line conclusion would suggest.” More importantly, Fuller explains that the research concerning principal effectiveness expressly warns against using value-added estimates—of the sort used by RAND—to attempt to capture such effectiveness in a high-stakes context.

Professor Fuller reviewed Preparing Principals to Raise Student Achievement: Implementation and Effects of the New Leaders Program in Ten Districts for the Think Twice think tank review project. The review is published by the National Education Policy Center, housed at the University of Colorado Boulder School of Education. Fuller is an associate professor and executive director of the Penn State Center for Evaluation and Education Policy Analysis. His research examines a variety of areas relating to educator preparation, quality, and career pathways, as well as school improvement, evaluation, and charter schools.

The report was written by a team of researchers for RAND led by Susan Gates. The evaluation was sponsored by the New Leaders principal preparation program, a non-profit organization founded in 2000 and based in New York City, which describes its mission as improving student outcomes by preparing effective leaders and improving working conditions of school principals.

The RAND evaluation attempts to determine the New Leaders program’s impact on student test scores at primary and secondary grades, although the data allow for only weak analyses at the upper grades.

At the lower grades, the effect sizes associated with principals prepared through the New Leaders program are less than two percentile points. Overall, Fuller adds, “the study’s results are more mixed than its bottom-line conclusion would suggest” – with most results finding no statistically significant impact of New Leaders principals on test scores, and nearly as many negative findings as positive ones.

Because of the general weaknesses of the value-added approach, and because of the additional causal difficulties of attributing some defined aspect of students’ test score growth to their principals, Fuller points out that the evaluation’s findings have weak validity. This does not mean that New Leaders principals are poorly prepared; they might be extremely professional and effective. But the RAND evaluation is wrong to attempt to make causal claims.

Fuller observes that the study does provide “a rich description of a thoughtful approach” to evaluating the effectiveness of principals and of principal preparation programs. But, the reviewer warns, it is also potentially harmful if it leads policymakers to conclude that the evaluation approaches employed offer a valid basis for high-stakes accountability systems for principals or preparation programs.

“Current research is very clear about this—the estimates presented in the study do not accurately capture principal effectiveness and should not be used to make high-stakes decisions about individuals or programs,” Fuller concludes.

“Thus, ultimately, this study misses a very important opportunity to discuss these issues and inform policymakers about the problems and prospects of using the strategies it employs.”
