UVA professor and Core Knowledge board member Dan Willingham, who routinely graces this blog with his observations, is now blogging over at Britannica Blog. His first post is up today, and it’s a barn burner: How NOT to Evaluate Teachers. Plans to evaluate teachers based on standardized test scores are “fatally flawed,” he writes.
Obviously, the measure cannot be based on a one-time test score, because a student’s achievement is a product of (at least) his home environment, neighborhood, and prior schooling. So you must try to assess how much the student learns over the course of the year. But these “value added” measures bring lots of thorny statistical problems. For example, suppose your plan is to administer a test in the Autumn and one in the Spring, and to compare them to see how much students have gained. Well, some Autumn test-takers will have moved by the Spring. Can’t you just ignore those scores? No, because low-income students are more likely to move than high-income students, and low-income students tend to score lower. So if you ignore missing data, you’re biasing the estimate.
Dan lists other problems that he says are old stuff to statisticians, and concludes, “there’s nothing wrong with using value-added measures in research, with all the caveats of the method understood, as one in an array of tools to address a research question. But using it as a measure of an individual teacher’s efficacy is foolish.”
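For readers who want to see the missing-data problem in numbers, here is a rough sketch in Python. Everything in it is invented for illustration and none of it comes from Dan's post: the ten-student roster, the scores, the moving rates, and especially the assumption that low-income students gain fewer points over the year than high-income students. Under those assumptions, dropping the students who moved makes the class's average gain look bigger than it would have been had everyone stayed.

```python
from statistics import mean

# Hypothetical roster: (low_income, autumn_score, gain_by_spring, moved_before_spring).
# All numbers are invented for illustration. The differing gains (6 vs. 12)
# are an assumption of this sketch, not a claim made in the post.
ROSTER = (
    [(True, 40, 6, True)] * 3        # low-income students who moved
    + [(True, 40, 6, False)] * 2     # low-income students who stayed
    + [(False, 60, 12, True)] * 1    # high-income student who moved
    + [(False, 60, 12, False)] * 4   # high-income students who stayed
)


def mean_gain(students):
    """Average autumn-to-spring gain for the given students."""
    return mean(gain for _, _, gain, _ in students)


def stayers(students):
    """Students who were still enrolled for the spring test."""
    return [s for s in students if not s[3]]


# Gain we would measure if every autumn test-taker also sat the spring test.
full_gain = mean_gain(ROSTER)

# Gain we actually measure if we simply ignore the students who moved.
observed_gain = mean_gain(stayers(ROSTER))

print("gain if nobody had moved:", full_gain)
print("gain among stayers only: ", observed_gain)
```

The teacher is the same in both calculations. Only the mix of students who remain to be counted has changed, which is the kind of bias the quoted passage describes.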

