Researchers at Vanderbilt University failed to find any connection between merit pay and higher test scores after a three-year study of Nashville teachers.
Perhaps they would have found it had they been properly incentivized.
The 4th annual Education Next poll shows a sharp divide between teachers and the general public on merit pay, teacher tenure, Race to the Top, and a host of other hot-button education issues. The poll, which was conducted by researchers at Harvard, shows
“most Americans support merit pay for teachers, while teachers oppose the policy by a large margin; there is strong opposition among the public to teacher tenure, while teachers favor it; and teachers are significantly more opposed to the federal RttT program than the broader public.”
No surprises here. Teacher tenure will never make sense to those who don’t enjoy that kind of job security. And merit pay will always have an intuitive appeal. Who can begrudge the idea that the standouts in any field deserve more?
Here’s a poll question I’d like to see asked:
In general, do you feel your child’s teachers spend too much time, too little time, or the right amount of time preparing students for standardized state tests?
Or this one:
Please indicate whether you strongly agree, somewhat agree, somewhat disagree or strongly disagree with the following statement: my child’s school places too much emphasis on standardized tests.
Just a hunch, but I suspect a majority of Americans would express reservations about the amount of test prep their children endure–at least those with kids in the prime testing grades 3 through 8–and the degree to which testing dominates elementary education. If so, this might skewer Ed Next’s finding that “support for ‘basing a teacher’s salary, in part, on his or her students’ academic progress on state tests’ jumped five percentage points in one year, increasing from 44 percent in 2007 to 49 percent in 2010.”
Another figure that jumped out at me: everyone “knows” that teachers are the weakest link in the chain and that attacking teacher unions is a political winner. Maybe not. More people believe teacher unions are “blocking school reform rather than helping it,” but the margin is slim, 33 to 28 percent. “But 39 percent take no position at all,” says Ed Next.
Other interesting data points in the Ed Next poll:
“When it comes to school choice, charters and learning on the Internet are ‘in,’ while vouchers are ‘out,’” notes Harvard’s Paul E. Peterson, the editor-in-chief of Education Next.
My humble request for my friends at Ed Next. How about a few questions next year on curriculum? It would be intriguing to learn what Americans think about the content of their children’s education and how they feel it compares to their own.
Over on Twitter, my friend Stephanie Germeraad, who is nearly as passionate about sports as she is about education, suggests education ought to steal a page from baseball when it comes to teacher seniority. Commenting on the decline of legendary closer Trevor Hoffman, she tweets a quote from Alan J. Borsuk: “Schools can learn from baseball. Brewers wouldn’t start Hoffman just because he’s been pitching longer.” The point is that seniority is no guarantee of quality. Fair enough. But here’s a sobering truth: We are far more capable of measuring the effectiveness of relief pitchers like Hoffman than classroom teachers.
If you’re a casual baseball fan, you might know a few “facts” about the pitchers on your favorite team: their won-loss record, their ERA (the number of “earned runs” allowed per nine innings), or their WHIP (walks and hits per inning pitched). To an expert, such statistics scratch the surface at best, and may even be irrelevant. Wins are a function of a team’s offense, for example, as much as a pitcher’s effectiveness, while ERA and WHIP are strongly influenced by the defensive ability of the other eight men on the field. An outfielder with greater range, for example, will record an out on a ball that a lesser defender lets fall for a hit. Same pitch, same swing, different outcome.
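For readers who like to see the arithmetic, here is a minimal sketch of the two statistics as defined above. The function names and the sample season are made up for illustration, not drawn from any real pitcher. It shows how the same pitcher’s WHIP shifts when a weaker defense lets a few more balls fall for hits.

```python
def era(earned_runs: int, innings_pitched: float) -> float:
    """Earned runs allowed per nine innings pitched."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9 * earned_runs / innings_pitched


def whip(walks: int, hits: int, innings_pitched: float) -> float:
    """Walks plus hits allowed per inning pitched."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return (walks + hits) / innings_pitched


# Hypothetical season: 180 innings, 60 earned runs, 50 walks, 166 hits.
season_era = era(earned_runs=60, innings_pitched=180)
strong_defense_whip = whip(walks=50, hits=166, innings_pitched=180)

# Same pitcher, same pitches, but assume a weaker defense turns
# 20 would-be outs into hits (an illustrative assumption).
weak_defense_whip = whip(walks=50, hits=186, innings_pitched=180)

print(f"ERA: {season_era:.2f}")
print(f"WHIP behind a strong defense: {strong_defense_whip:.2f}")
print(f"WHIP behind a weak defense:   {weak_defense_whip:.2f}")
```

Nothing about the pitcher changed between the last two numbers, which is exactly why the experts treat these figures with caution.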
Among baseball geeks, you often hear discussions of fielding independent pitching, or “FIP,” a measure of the things a pitcher is directly responsible for, such as strikeouts, home runs and walks. FIP helps you understand how well a pitcher pitched, regardless of how well the team played behind him. Data even helps teams decide what kind of pitchers are best suited to their stadiums through analysis of “park effects.” A fly ball pitcher (yes, they keep track of fly balls, line drives and ground balls hit off every pitcher) might prosper in a big stadium like New York’s Citi Field, but allow lots of home runs in a bandbox like Philadelphia’s Citizens Bank Park. A pitcher who “pitches to contact” (i.e., doesn’t strike out a lot of hitters) is fine if your team’s defense is strong. If not, you might spend more to sign pitchers who are strikeout artists. Data even helps spot problems as they occur. Fans of the New York Mets are concerned that all-star pitcher Johan Santana’s fastball is topping out below 90 miles an hour of late, making his changeup, a slow-speed pitch, less likely to fool hitters expecting the fastball.
To a baseball fan, statistics are a revelation. The granularity and specificity are illuminating. You can see, if you’re so inclined, a pitcher’s FIP, ERA, strikeouts, and his strikeout-to-walk ratio. The percentage of batted balls that were hit on the ground, in the air, or for line drives can speak volumes about a pitcher’s effectiveness. When a player’s agent goes to negotiate his contract, he can even discuss his “Wins Above Replacement” (WAR), a statistic that measures the total value of a player over a given season compared to a replacement-level player.
If these kinds of numbers thrill you, adding depth and nuance to your love of baseball, thank Bill James. It is no overstatement to say that no one has had a greater impact on baseball in the last 25 years than James, who pioneered and named the field of sabermetrics, the use of detailed statistics to analyze baseball team and player performance. James has made a career of demonstrating the factors that lead to teams scoring runs and winning games, and how the efforts of individual players contribute to wins. Some of his insights have been legendary and have overthrown time-honored beliefs about the game–why RBIs matter less than on-base percentage, for example. Or why stolen base attempts tend to hurt a team’s offense. Before Bill James, baseball was all batting averages, bromides and intangibles–a century of baseball men who knew what they knew based on experience, instinct and rudimentary data.
We are in the test scores, bromides and intangibles era of measuring teacher quality. If you’re a principal, wouldn’t you love to know the “school effects” of teacher performance when it came time to make hiring decisions? Would it change your perception of merit pay if there were a classroom equivalent of FIP–the factors directly under a teacher’s control? What if we could compensate teachers based on their replacement value compared to an average first-year teacher?
“‘It’s far more than win/loss/ERA/WHIP’ is the clubhouse mantra,” Stephanie tweeted, defending her assertion that education can profit from baseball’s example. “Difference is, baseball doesn’t say they therefore can’t do it,” she wrote. Not quite right. In baseball there is data–lots of it–to measure effectiveness clearly and fairly. Difference is, “it’s far more than test scores” is not a mantra in ed reform.
Education awaits its Bill James.
“Who needs merit pay when you have 3000+ applicants for seven jobs?” asks Michael E. Lopez at Joanne Jacobs’ blog. A New York Times report notes teachers may be facing “the worst job market since the Great Depression.” Pelham Memorial High School in Westchester County, New York, has received applications from candidates as far away as California for one of its seven advertised openings.
Over at The Answer Sheet, Patricia Duffey, an eighth-grade ELA teacher from North Carolina, suggests moving the idea of merit pay “from the teacher column into the parent column.” She floats the idea on the same day that we learn New York City is scuttling a plan aimed at doing precisely that.
“I truly believe that we need to provide more incentive for parents of low-performing students to follow-through in helping their child succeed,” Duffey writes.
“Teachers are giving, in many cases, more than 100 percent in the classroom, but there truly is no replacement for hard work on the part of all who are invested in education: student-parent-teacher-school administration. I cannot over-emphasize the effect that parental involvement has in student success, and how difficult it is for teachers to bridge that gap when it is missing.”
No argument from me about the importance of parental involvement. My classroom experience tells me that parental involvement and good results are not axiomatic, but it is the way to bet. So what about merit pay for parents? A few years ago New York City launched a privately funded program to pay low-income parents to do things like attend parent-teacher conferences or take kids for routine medical and dental checkups. Mayor Bloomberg talked at the time about seeking government funding for the program if it proved successful.
Well, it hasn’t.
The program to “encourage good behavior and self-sufficiency has so far had only modest effects on their lives and economic situation,” the New York Times reports today. “While payments to the families will end in August, researchers will continue to monitor them for three more years, to see if any behavior encouraged by the initial payments will continue. A final report will be issued in 2013.”
“You always hope that you’ve come across a magic silver bullet and you never do,” Bloomberg said. Truer words…
A suggestion by Claus Von Zastrow of Public School Insights that pundits like Jonathan Alter who write about education be subject to performance pay attracted the notice of Alter, who has been mixing it up with commenters to the post. It started when Von Zastrow took issue with Alter’s KIPP cheerleading and broad brush take on reform.
What do we make of Alter’s suggestion that only charter schools and merit pay are “real reform?” Well what about better staff development? Better curriculum? Stronger ties between schools and communities? Much, much better assessments? Are those phony reforms? All in all, Alter gets an unsatisfactory rating, so no performance bonus this year. In fact, his failure to improve since last summer puts him at risk of termination.
That was apparently too much for the Newsweek pundit, who showed up on the blog’s comments to defend himself and do a little advocacy work. “With the president’s support, the pool of reformers is growing,” Alter wrote. “Come on in, guys. The water’s warm.”
Alter gets points for showing up and opening himself up for further abuse. The highlight of the thread so far: One anonymous wit who wickedly applies Alter’s take on merit pay to his own columns:
I’m glad you’ve accepted Claus’ merit pay proposal. The formula is clear. Since your job is to inform the public, we’re going to measure your readers’ knowledge. Then, a year from now, we’re going to measure it again. If they’re smarter, you’ll get a substantial bonus. If not, we’ll put you on a 90-day plan of review, support, and, if your readers don’t get smarter, we’ll have to regretfully let you go. Sorry, but it’s all about the readers, not the writers.
Tough crowd.
Football fans see it time and again: It’s 4th down and short yardage. An official standing 30 or 40 feet away from the play sees a running back hurl himself full throttle into a forest of 300-pound linemen and disappear beneath a collapsing pile of players, a football buried somewhere against his body. Chaos everywhere, yet the official, with unquestioned authority, places the ball he lost sight of on the exact spot on the ground where forward momentum stopped and calls for the chains. Play stops and the fans grow quiet as a team of officials runs in from the sidelines and takes a precise-to-the-inch measurement of the ball’s location. If any part of the ball is beyond the plane of the outstretched chain, a first down is awarded. The crowd goes wild.

Never mind that the linesman is merely estimating the ball’s position. Never mind that the ten-yard length of chain was placed based on an eyeball approximation of where the series of downs began three plays ago. Never mind that every play in the series of downs begins and ends with a best guess (the wide receiver was knocked out of bounds at about the 35-yard line). When it’s time to determine whether or not a first down is to be awarded, football is suddenly a game of inches.
Games, playoff hopes, bowl bids and careers turn on a guess–or a series of guesses. But no one seems to question it. Call for the chains! If you stop and think about it, this doesn’t make a lot of sense. The answer, however, is simple: Don’t think about it.
Here are a few more things not to think about:
We know this. We see it all around us, but like the football fan caught up in the arbitrary kabuki dance of the moving of the chains, we accept it, applaud it or moan about lousy spots, but the game goes on.
“There must be a better way,” Pat Summerall, an N.F.L. veteran and broadcaster, said in a recent New York Times article. “Because games are decided, careers are decided, on those measurements.” He was talking about measuring for first downs. “There’s a certain amount of drama that is involved with the chains,” said New York Giants president John Mara in the same article. “Yes, it is subject to human error, just like anything else is. But I think it’s one of the traditions that we have in the game, and I don’t think any of us have felt a real compelling need to make a change.”
“With national standards will come national standardized tests, so it’s an especially good time to rethink how these exams are scored, and by whom,” Dana Goldstein sensibly observes at The American Prospect’s Tapped blog. “Perhaps teachers and principals should be scoring tests, not $8 an hour part-timers. In that case it would be important, especially with the push for merit pay, to make sure teachers aren’t grading their own students’ tests, to decrease the temptation to engage in foul play.”
Like the theatrical measurement of a first down in football, we want to rely on precise measurements of an imprecise process to make high stakes decisions on everything from federal funding to merit pay to whether a teacher keeps his or her job at all. “I understand that tests are far from perfect and that it is unfair to reduce the complex, nuanced work of teaching to a simple multiple choice exam,” Education Secretary Arne Duncan recently observed.
Right. It’s way more complicated than that.
But it’s 4th down! Call for the chains! Take a measurement. How else are we going to know?
Schools need much more than merit pay to recruit and retain good teachers, argues Kevin Carey at the Quick and the Ed. “They need strong leadership, good facilities, safe working conditions, and the right kind of organizational culture,” he writes. “You can’t paper over the lack of those things by simply tacking on a salary bonus, even a big one, to the existing steps-and-lanes pay scale.”
Carey’s reasoned (and reasonable) take on merit pay feels like a welcome departure from the teacher-quality-and-test-scores über alles refrain more commonly sung by accountability hawks. Especially in his recognition that “we need to build schools great people want to teach in, and that means fully recognizing their value in all ways, including pay.”
The great schools of the future will be professional meritocracies in a way today’s public schools are not, but not by adding test scores to the mechanistic logic of an industrial-age salary scale. Rather, they’ll spend a great deal of energy on getting the conditions and culture right, and then negotiate substantially higher and substantially more variable salaries with individual teachers. It will be an expensive, time-consuming, imperfect process with an unavoidable element of subjectivity. It will also be much, much better than what most schools use today.
Agreed. I’d also wager there isn’t one teacher in a thousand who wouldn’t welcome merit pay in a school that spent “a great deal of energy on getting the conditions and culture right.”
The phrase “unavoidable element of subjectivity” also strikes me as a recognition of the infinite complexity teachers face in working with our most disadvantaged students (any attempt to move past mindless “teachers fear accountability” sloganeering is a welcome development). Guest-blogging over at Joanne Jacobs, the always insightful Diana Senechal captures the dilemma of nuance-averse accountability well. “With dumbed-down tests, vapid literacy programs, an overwhelming focus on test prep at the exclusion of essential subjects, and unreliable rating systems, we end up taking a yardstick to a void–and declaring miracles whenever we please,” she wrote. The flip side of that, the thing that teachers reasonably fear, is that it is too easy to declare failure whenever we please and to hold teachers solely responsible when they are too often reduced to foot soldiers with no control over what or even how they teach.
This cannot be said often enough: teachers are not by nature accountability-averse. They are, however, sensibly averse to having an extraordinarily difficult and complex task measured by crude and simplistic tools.
Update: John Thompson, a vocal teacher advocate who also viewed Carey’s post favorably, takes up a similar theme at This Week in Education. “I’ve never understood why ‘reformers,’ who are angered by the terrible results of policies set by principals and central offices, respond by attacking teachers who do not set those policies. But the answer, which the New Teacher Center makes clear, is not to attack principals but to use ‘contextual data’ to enhance teacher and principal quality and create a learning culture which attracts and retains educators.”
Near the end of World War I, President Woodrow Wilson sought to reassure Americans that what was known at the time as “The Great War” was a just cause. In a speech to Congress, he outlined America’s war aims in “Fourteen Points” that were as broad as ensuring freedom of navigation on international waters and fair trade, and as specific as redrawing the borders of several European nations and restoring their pre-war populations. French Prime Minister Georges Clemenceau, in one of history’s finer bon mots, quipped, “Fourteen points? Why, God Almighty has only Ten!”
Secretary of Education Arne Duncan goes Wilson one better. Five, actually. He has Nineteen Points. God has fallen nine back, well off the pace.
According to detailed guidelines being released today in Washington, states that hope for a piece of the $4.35 billion Race to the Top Fund will have to abide by 19 detailed criteria on academic standards, data-tracking, teacher recruitment and retention, and turning around low-performing schools. “You can’t pick or choose here,” Duncan tells USA Today.
EdWeek’s Michele McNeil notes the guidelines “send a strong message that any state hoping to land a grant must allow student test scores to be used in decisions about teacher compensation and evaluation.” While opposition to that will be summarily dismissed as the product of accountability-averse teachers unions, Dan Willingham has cogently described why this particular reform is not ready for prime time. Still, states like New York and California, which currently forbid by law using test data to evaluate teachers, will not be eligible for Race to the Top funds, as McNeil points out:
Being able to link teacher and student data is “absolutely fundamental—it’s a building block,” U.S. Secretary of Education Arne Duncan said in an interview. “We believe great teachers matter tremendously. When you’re reluctant or scared to make that link, you do a grave disservice to the teaching profession and to our nation’s children.”
To be sure, there is much to like about this Ed Reform Early Christmas, and the sense of urgency is welcome and laudable. But let’s be clear: No Child Left Behind, however well-intentioned, did little to advance the idea that children benefit from a robust, well-rounded curriculum. It did much to advance the idea that children must be taught whatever might appear on a year-end test. If time was limited, anything that did not contribute to this near-term payoff was jettisoned. Thus, aggressive accountability measures actively worked against the patient, steady development of background knowledge that creates both well-educated children and, ultimately, higher test scores. It strains credulity to think that using data to hold individual teachers directly responsible for student gains will result in a sudden outbreak of big picture thinking in classrooms across the country.
The idea that reading comprehension is a function of background knowledge has not taken deep hold in America’s classrooms. And what teacher — especially the new, young and relatively inexperienced teachers who disproportionately fill struggling urban schools — will have the wherewithal to insist on the steady buildup of knowledge across the curriculum? Indeed, if we are to have 19 points, why not round up to 20 and insist that a Race to the Top cannot happen without attending to a well-rounded curriculum? Instead we are almost certain to have more — much more — of the deleterious effects of our data-driven, muscular accountability age: endless focus on reading strategies that have limited impact, mind-numbing test prep, and no attention to the essential long-range development of background knowledge that will make reading gains possible years down the road.
“Language comprehension is a slow-growing plant,” observes E.D. Hirsch. “Even with a coherent curriculum, the buildup of knowledge and vocabulary is a gradual, multiyear process that occurs at an almost imperceptible rate. The results show up later.”
This is clear, this is obvious, and this is certain. But there is simply no room for this kind of thinking in an accountability system that insists–for every good reason under the sun–on results right now and encourages individual teachers to compete instead of cooperate.
Fast-forward. It is 2016. After years of holding teachers accountable for short-term gains, and creating incentives that actively work against the buildup of knowledge, with disappointing results, we wake up and realize we are going about this the wrong way. A few look back and say we should have listened to our Cassandras. But other energetic, well-meaning reformers see it another way. Instead of realizing we have fatally neglected a robust curriculum, that we are reaping what we have sown, they will conclude that as a nation we simply have no good 8th grade reading teachers. Aggressive, immediate action is needed.
Because after all, the data doesn’t lie, does it?
President Obama loves merit pay. So does Arne Duncan. Editorial writers from coast to coast support the idea proposed by Gov. Arnold Schwarzenegger that “teacher employment be tied to performance, not to just showing up.” Dan Willingham wanders into the fray with his latest video, “Merit Pay, Teacher Pay and Value-Added Measures,” and offers six reasons why “value added measures sound fair, but they are not.”
The political winds certainly seem to be very much at the back of merit pay plans. Months or years hence, there may be a temptation to describe the “unintended consequences” of such plans. Call them unintended, but not unanticipated.