Can short-term measures of teacher effects indicate long-term effectiveness?

A recent, highly publicized report, The Long-Term Impacts of Teachers: Teacher Value-Added and Student Outcomes in Adulthood, concludes that elementary school teachers’ so-called value-added effects on their students show up years later in those students’ teenage pregnancy rates, college success and career earnings. A new review of that study, conducted for the National Education Policy Center (NEPC), found that while the study is impressive in many ways, more evidence is needed to prove a key part of its case.

The report was written by three economists from Harvard University and Columbia University: Raj Chetty, John Friedman, and Jonah Rockoff.

Dale Ballou, an economist and professor of public policy and education at Vanderbilt University, reviewed The Long-Term Impacts of Teachers for NEPC’s Think Twice think tank review project.

The Chetty, Friedman and Rockoff study was covered extensively in major newspapers and was cited in President Obama’s State of the Union address. It has quickly become an important part of national discussions about how we should evaluate teachers. In particular, some policy makers believe that decisions about teacher pay, retention and other matters should be based on how a teacher’s students perform on standardized tests, as measured by “value-added” approaches that attempt to isolate the effects of individual teachers on student growth.

Past examinations of the value-added methodology have found that the results of these analyses can vary widely with even modest changes to the model’s assumptions or to the assessment used. Teachers have also expressed concern that overreliance on test scores in teacher evaluations leads to a narrowing of the curriculum and to teaching to the test.

The new study merged IRS data with archival test-score data from a large urban district and used value-added modeling to estimate, for teachers in grades 4 through 8, their students’ short-term test-score growth in reading/language arts and mathematics. It then looked at certain longer-term outcomes for those students. The researchers concluded that teachers who proved to be effective at raising test scores in either of those subjects also had positive effects on other student outcomes, even years later.

According to Ballou, “The report claims that these positive outcomes are the result of having had higher value-added teachers, above and beyond any association that might arise for other reasons.”

The three economists support their conclusions regarding short-term effects with strong tests for systematic bias in the assignment of students to teachers. Those tests found that teacher value added was estimated without bias: family background factors that predict high test scores were not distributed in a way that systematically favored teachers with high value-added estimates.

But Ballou explains that a similar set of tests is needed to establish that the estimated effect of high-value-added teachers on long-term outcomes is also free of bias. He writes:

What is needed now is to test whether family background factors that predict long-term success in employment, college attendance, etc., are distributed in such a way as to favor high-value-added teachers. … [T]he tests for bias need to be run again… The fact that similar tests have already validated the estimates of teacher value added does not imply that they are not needed to validate inferences about the impact of teachers on long-term outcomes. Indeed, it is probably even more important that these tests be conducted with respect to outcomes like earnings and the avoidance of teen pregnancy. Data available from tax returns are not likely to distinguish well between families that nurture the development of character and families that are much less successful in this task. Unobserved differences between families are likely to be very important. The way to test whether high-value-added teachers have been systematically assigned more students whose families are richer with respect to these unobservable factors is to conduct the quasi-experimental tests described above. Absent that, we will not know whether such differences were present and were the underlying reason for the observed association between teacher value-added and students’ long-term success.

“On this key point the report falls short,” Ballou writes. “While some of these tests have been reported, much more evidence could have been presented to support this claim. The same kind of tests conducted to establish that value added was estimated free of bias could have been applied to test the larger and more significant claim of this report: that high-value-added teachers improve life outcomes many years after students have left their classrooms. In the absence of such evidence, it is premature to conclude that the report’s central conclusions are correct.”

