by Diane Ravitch
In his post, “Getting Value-Added Right,” Robert raises excellent questions, and his restaurant metaphor is apt. The value-added growth model, as Dan Willingham notes in the comments section and in his post on the Britannica Blog, is not ready for prime time. There are too many intervening variables to hold teachers solely accountable for the test-score growth of every student. High rates of student mobility mean that the population of a school fluctuates widely from year to year. As Thomas J. Kane and Douglas O. Staiger point out in one of their papers, the inherent volatility of test scores makes them a poor basis for an accountability system.
The imprecision of test score measures arises from two sources. The first is sampling variation, which is a particularly striking problem in elementary schools. With the average elementary school containing only sixty-eight students per grade level, the amount of variation stemming from the idiosyncrasies of the particular sample of students being tested is often large relative to the total amount of variation observed between schools. The second arises from one-time factors that are not sensitive to the size of the sample; for example, a dog barking in the playground on the day of the test, a severe flu season, a disruptive student in a class, or favorable chemistry between a group of students and their teacher. Both small samples and other one-time factors can add considerable volatility to test score measures.
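The sampling-variation point can be illustrated with a small simulation. The sketch below is not taken from Kane and Staiger's paper. It assumes, purely for illustration, that student scores are standard normal draws and that every school is equally effective, so any difference between school averages is sampling noise. Only the cohort size of sixty-eight comes from the passage above; the function name, the number of simulated schools, and the comparison cohort of 680 are hypothetical.

```python
import random
import statistics


def cohort_mean_spread(cohort_size, n_schools=1000, seed=0):
    """Spread (standard deviation) of school average scores when every
    simulated school has the same true mean of 0.

    Assumption: each student's score is an independent draw from a
    normal distribution with mean 0 and standard deviation 1.
    """
    rng = random.Random(seed)
    school_means = [
        statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(cohort_size))
        for _ in range(n_schools)
    ]
    return statistics.pstdev(school_means)


# Cohort size from the passage, and a hypothetical cohort ten times larger.
spread_68 = cohort_mean_spread(68)
spread_680 = cohort_mean_spread(680)

# Textbook standard error of a mean of 68 independent unit-variance scores.
theory_68 = 1 / 68 ** 0.5

print(f"spread of school averages, 68 students:  {spread_68:.3f}")
print(f"spread of school averages, 680 students: {spread_680:.3f}")
print(f"theoretical value for 68 students:       {theory_68:.3f}")
```

Under these assumptions, identical schools with sixty-eight students per grade still differ in their average scores by roughly 0.12 student-level standard deviations through chance alone, and the spread shrinks as the cohort grows. The simulation says nothing about the second source of imprecision, the one-time factors, which do not shrink with sample size.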
There are many, many reasons why one-year changes in scores are not reliable, and just as many why it is hard to assign credit or blame for students’ test score gains and losses from year to year. Until we have better tests and have ironed out many of the confounding variables, we cannot make credible inferences about teacher performance from test scores, and it is unfair to use such data to dispense rewards and punishments.
There is another reason to worry about value-added growth models that determine a teacher’s fate and compensation. If we turn teaching into an activity whose sole purpose is to produce gains on tests that we know are mainly low-level and dumbed-down, we will not make education better. We may succeed in destroying it altogether. We had better find ways to emphasize the quality of curriculum (think Core Knowledge) and to de-emphasize the number of times that kids are asked to check off a box on standardized tests in the course of a month. Otherwise, our education system will be far worse than ever.
Diane blogs on education at Bridging Differences — ed.