Tag Archive for 'teacher evaluation'

“An Unavoidable Element of Subjectivity”

Schools need much more than merit pay to recruit and retain good teachers, argues Kevin Carey at the Quick and the Ed.  “They need strong leadership, good facilities, safe working conditions, and the right kind of organizational culture,” he writes. “You can’t paper over the lack of those things by simply tacking on a salary bonus, even a big one, to the existing steps-and-lanes pay scale.”

Carey’s reasoned (and reasonable) take on merit pay feels like a welcome departure from the teacher-quality-and-test-scores über alles refrain more commonly sung by accountability hawks, especially in his recognition that “we need to build schools great people want to teach in, and that means fully recognizing their value in all ways, including pay.”

The great schools of the future will be professional meritocracies in a way today’s public schools are not, but not by adding test scores to the mechanistic logic of an industrial-age salary scale. Rather, they’ll spend a great deal of energy on getting the conditions and culture right, and then negotiate substantially higher and substantially more variable salaries with individual teachers. It will be an expensive, time-consuming, imperfect process with an unavoidable element of subjectivity. It will also be much, much better than what most schools use today.

Agreed.  I’d also wager there isn’t one teacher in a thousand who wouldn’t welcome merit pay in a school that spent “a great deal of energy on getting the conditions and culture right.” 

The phrase “unavoidable element of subjectivity” also strikes me as a recognition of the infinite complexity teachers face in working with our most disadvantaged students (any attempt to move past mindless “teachers fear accountability” sloganeering is a welcome development). Guest-blogging over at Joanne Jacobs, the always insightful Diana Senechal captures the dilemma of nuance-averse accountability well. “With dumbed-down tests, vapid literacy programs, an overwhelming focus on test prep at the exclusion of essential subjects, and unreliable rating systems, we end up taking a yardstick to a void–and declaring miracles whenever we please,” she wrote. The flip side of that, the thing that teachers reasonably fear, is that it is too easy to declare failure whenever we please, and to hold teachers solely responsible when they are too often reduced to foot soldiers with no control over what or even how they teach.

This cannot be said often enough: teachers are not by nature accountability-averse.  They are, however, sensibly averse to having an extraordinarily difficult and complex task measured by crude and simplistic tools.

Update:  John Thompson, a vocal teacher advocate who also viewed Carey’s post favorably, takes up a similar theme at This Week in Education.  “I’ve never understood why ‘reformers,’ who are angered by the terrible results of policies set by principals and central offices, respond by attacking teachers who do not set those policies. But the answer, which the New Teacher Center makes clear, is not to attack principals but to use ‘contextual data’ to enhance teacher and principal quality and create a learning culture which attracts and retains educators.”

Diane Ravitch on Teacher Evaluation and Value-Added

In his post, “Getting Value-Added Right,” Robert raises excellent questions, and his restaurant metaphor is apt. The value-added growth model, as Dan Willingham notes in the comments section and his post on the Britannica Blog, is not ready for prime time. There are too many intervening variables to hold teachers solely accountable for the test-score growth of every student. Given high rates of mobility, there is a large fluctuation in the student population in schools. As Thomas J. Kane and Douglas O. Staiger point out in one of their papers, the inherent volatility of test scores makes them a poor basis for an accountability system.

The imprecision of test score measures arises from two sources. The first is sampling variation, which is a particularly striking problem in elementary schools. With the average elementary school containing only sixty-eight students per grade level, the amount of variation stemming from the idiosyncrasies of the particular sample of students being tested is often large relative to the total amount of variation observed between schools. The second arises from one-time factors that are not sensitive to the size of the sample; for example, a dog barking in the playground on the day of the test, a severe flu season, a disruptive student in a class, or favorable chemistry between a group of students and their teacher. Both small samples and other one-time factors can add considerable volatility to test score measures.
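To get a feel for the first source of imprecision, here is a rough sketch in Python. It is a simulation under stated assumptions, not anything taken from the Kane and Staiger paper: the only number from the passage is the sixty-eight students per grade, while the score scale (mean 0, standard deviation 1), the count of 1,000 schools, and the function name `simulate_grade_means` are hypothetical choices made for illustration. Every simulated school is equally good by construction, so any spread in the grade averages comes purely from which students happened to be tested.

```python
import random
import statistics


def simulate_grade_means(n_students=68, n_schools=1000,
                         true_mean=0.0, student_sd=1.0, seed=0):
    """Return the average score of one grade in each simulated school.

    All schools share the same true mean, so differences between the
    returned averages reflect sampling variation only.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(n_schools):
        scores = [rng.gauss(true_mean, student_sd) for _ in range(n_students)]
        means.append(statistics.fmean(scores))
    return means


means = simulate_grade_means()

# Spread of grade averages across schools that are identical by design.
observed_spread = statistics.pstdev(means)

# Textbook standard error of a mean: student_sd / sqrt(n_students).
expected_spread = 1.0 / 68 ** 0.5

print(f"observed spread of school averages: {observed_spread:.3f}")
print(f"expected from sampling alone:       {expected_spread:.3f}")
```

Under these assumptions the averages differ by roughly a tenth of a student-level standard deviation even though no school is better than any other. Whether that counts as large depends on how far apart real schools actually are, which the sketch does not model.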

There are many, many reasons why one-year changes in scores are not reliable. There are many reasons why it is hard to give credit or blame for students’ test score gains and losses from year to year. Until we have better tests and have ironed out many of the confounding variables, we cannot draw credible inferences about teacher performance from test scores, and it would be unfair to use such data to dispense rewards and punishments.

There is another reason to worry about value-added growth models that determine a teacher’s fate and compensation. If we turn teaching into an activity whose sole purpose is to produce gains on tests that we know are mainly low-level and dumbed-down, we will not make education better. We may succeed in destroying it altogether. We had better find ways to emphasize the quality of curriculum (think Core Knowledge) and to de-emphasize the number of times that kids are asked to check off a box on standardized tests in the course of a month. Or our education system will be far worse than ever.

Diane blogs on education at Bridging Differences — ed.

How Not to Evaluate Teachers

UVA professor and Core Knowledge board member Dan Willingham, who routinely graces this blog with his observations, is now blogging over at Britannica Blog.  His first post is up today, and it’s a barn burner: How NOT to Evaluate Teachers.  Plans to evaluate teachers based on standardized test scores are “fatally flawed,” he writes.

Obviously, the measure cannot be based on a one-time test score, because a student’s achievement is a product of (at least) his home environment, neighborhood, and prior schooling. So you must try to assess how much the student learns over the course of the year. But these “value added” measures bring lots of thorny statistical problems. For example, suppose your plan is to administer a test in the Autumn and one in the Spring, and to compare them to see how much students have gained. Well, some Autumn test-takers will have moved by the Spring.  Can’t you just ignore those scores? No, because low-income students are more likely to move than high-income students, and low-income students tend to score lower. So if you ignore missing data, you’re biasing the estimate.
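The arithmetic behind the missing-data point can be sketched in a few lines of Python. Every number here is made up for illustration (group sizes, average scores, and how many students are still enrolled in the Spring), and the names `Group` and `average_score` are hypothetical. The sketch assumes only what the passage states: the lower-scoring group is also the group more likely to move.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    n_autumn: int      # students tested in the Autumn (hypothetical)
    n_spring: int      # of those, students still enrolled in the Spring
    mean_score: float  # average Autumn score for the group (hypothetical)


def average_score(groups, stayers_only):
    """Class average, weighted by the full roster or only by the stayers."""
    total_students = 0
    total_points = 0.0
    for g in groups:
        n = g.n_spring if stayers_only else g.n_autumn
        total_students += n
        total_points += n * g.mean_score
    return total_points / total_students


groups = [
    Group("low-income", n_autumn=50, n_spring=30, mean_score=40.0),
    Group("high-income", n_autumn=50, n_spring=45, mean_score=60.0),
]

full_roster = average_score(groups, stayers_only=False)  # 50.0
stayers = average_score(groups, stayers_only=True)       # 52.0

print(f"average over everyone tested in Autumn: {full_roster:.1f}")
print(f"average after dropping the movers:      {stayers:.1f}")
```

Dropping the movers shifts the class average upward without any change in teaching, simply because the students who left were not a random slice of the class. How far that distortion carries into a growth estimate depends on the model being used, which this sketch does not attempt to represent.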

Dan lists other problems that he says are old stuff to statisticians, and concludes “there’s nothing wrong with using value-added measures in research, with all the caveats of the method understood, as one in an array of tools to address a research question. But using it as a measure of an individual teacher’s efficacy is foolish.”

Hiring and Firing

Jay Mathews, the dean of education reporters, takes a strong stand on teacher retention, arguing that giving principals the unfettered power to hire and fire teachers is “crucial” to closing the achievement gap.

This is a difficult choice and a hard time for D.C. teachers. They are fine people who have chosen a tough profession and put their hearts into their work. Many fear being judged by principals who were not skillful teachers themselves and have little clue as to what helps kids learn and what doesn’t. But I don’t see any way the city’s children are going to get the instruction they deserve — the imaginative, fun-loving, firm teaching found at schools like KEY — unless principals are given the power to hire and fire teachers based on demonstrated skill and improved learning in class.

Mathews cites the example of the KIPP DC:KEY Academy, where principal Sarah Hayes dismissed two teachers who were not cutting it, despite efforts to improve.  “If KEY were a traditional school, Hayes’s only reasonable option would have been to mentor the teachers, note her dissatisfaction on their evaluations and recommend that they not be kept after a two-year probation,” he writes.  “That is the way it goes in most school systems. Staffing rules, tenure agreements and low expectations tend to favor weak teachers unless they do something awful.”