Value-Added: When Being Right Isn’t Enough

by Robert Pondiscio
August 17th, 2010

The Los Angeles teachers union may be correct to fight the publication of individual teachers’ value-added test scores. But they’re on the wrong side of history, writes Dan Willingham, who has long made a compelling, fair and utterly dispassionate case (he has no skin in the fight) that using value-added measures to evaluate teacher quality is “not ready for prime time.” He’s explained the problems in blog posts, and even in a YouTube video.

Clearly, to no avail. For those who have just arrived on the planet this morning, the Los Angeles Times over the weekend produced a blockbuster piece of reporting, based on years of test scores for 3rd through 5th grade teachers, enlisting a statistician to rate the effectiveness of individual teachers by name. The data, says the paper, tell “which ones have the classroom magic that makes students learn and which ones annually let their students down.” Blogging at the Washington Post’s Answer Sheet, Willingham responds:

“The writers of the Times article are either uninformed or disingenuous about the status of the value-added measures. They write ‘Though controversial among teachers and others, the method has been increasingly embraced by education leaders and policymakers across the country, including the Obama administration.’  The ‘others’ include most researchers looking into the matter.”

The L.A. teachers union is calling for a boycott of the paper. Good luck with that, is Willingham’s response. “When it comes to value-added measures, teachers and unions are right. The models aren’t reliable enough to evaluate individual teachers,” he observes.

“But right now that doesn’t matter much. The mood today is that something has to be done about incompetent teachers. We’ve seen that mood in districts in New York City and Washington D.C. and now we’re seeing it in Los Angeles.  We’re also seeing it at the federal level. Education Secretary Arne Duncan said that the publishing of the individual teacher’s scores is just fine. The people who feel that something must be done are right. In most districts there is not a mechanism by which to ensure that incompetent teachers are not teaching.”

This is the time for the teachers’ unions to make teacher evaluation their top priority, Willingham concludes. “If they don’t, others will.”

Others have.

28 Comments »

  1. The saddest part of the LAT story is that it appears the LAUSD teacher union is on the RIGHT side of history on this matter:
    http://gd.is/LbVm

    It sounds like they are participating in a pretty thoughtful process that will take into account many relevant factors to develop a teacher evaluation system. Seems like the wrong city/union to pick on.

    Comment by Jason — August 17, 2010 @ 12:45 pm

  2. [...] This post was mentioned on Twitter by Emily Alpert, Robert Pondiscio. Robert Pondiscio said: Willingham: LA Times wrong on value-added. But if unions don't make evaluation a priority, others will. http://bit.ly/cBk5gX [...]

    Pingback by Tweets that mention Value-Added: When Being Right Isn’t Enough « The Core Knowledge Blog, The Core Knowledge Blog -- Topsy.com — August 17, 2010 @ 3:26 pm

  3. Robert, you write that Dan writes: “The ‘others’ include most researchers looking into the matter.”

    Is that true?

    It does seem like a bunch of economists do embrace value-add over the alternative (i.e., no concrete measurement, just supervisor observations; or simply no teacher evaluation at all).

    Kane, Staiger, Fryer, Angrist, Rockoff, Hanushek, et al.

    Comment by MG — August 17, 2010 @ 4:54 pm

  4. In Massachusetts, the results of the MCAS test get returned to each school by the state with an item analysis of every question on the test.

    If too many kids in Ms Smith’s class get the wrong answer on question number 24 in the third grade math test (matching fractions to percentages), the teacher learns she needs to do a more thorough job teaching this concept for this year’s class (tests are given in the spring and the results don’t come back until the following autumn). If it happens again, the teacher needs to be mentored on the concept or have it reinforced how important each portion of the math curriculum is to the development of each student.

    Value-added or not, the contents of the test need to be taught/learned, period. Willie Sanders aside, this is not a sophisticated statistical algorithm we’re dealing with. It’s objective data that can easily be scrutinized for success or failure. The teacher has either taught it and most kids in the class have grasped the concept, or not. If they have – GOOD. If they haven’t, Houston, we have (a bit of) a problem.

    This is objective data on every teacher districts should have at their fingertips.

    Help these teachers if needed. After that, decisions have to be made.

    Comment by Paul Hoss — August 17, 2010 @ 5:07 pm

  5. If we have a significant number of inadequate teachers, what does that say about the teachers colleges that educate and CERTIFY them? If we are going for accountability, let’s hold accountable the institutions that certified these people as teachers in the first place.

    Comment by Homeschooling Granny — August 17, 2010 @ 5:50 pm

  6. If ed reform ultimately is only about negotiating politically feasible reforms, student learning is unlikely to ever change.

    While it may be counter to the mood in DC, the more people advocate for effective reforms (Yearly Value-Added Assessment at the teacher level is not on that list) the better the possibility that the mood might shift.

    Homeschooling Granny, now that would be an interesting idea. Rate the ed schools by the test scores of their graduating teachers. At least this plan would have enough data points to even have the possibility of being statistically valid.

    Comment by Erin Johnson — August 17, 2010 @ 8:31 pm

  7. Again, it seems they ignore curriculum.

    In related news, the city has required that all hammers be replaced by zucchinis. Since no nails are being driven anymore, it is obvious that the cause must be incompetent workers.

    Comment by Obi-Wandreas — August 17, 2010 @ 8:53 pm

  8. MG – Willingham’s use of the word “most” is ambiguous, it’s true. And I’m not an expert in the fields of statistics and psychometrics. However, the largest organizational bodies that deal with these matters – APA, AERA, NCME, and the National Academies – all stand against the use of standardized tests for teacher evaluation purposes. I could counter your list with this one (borrowed from Gerald Bracey) representing the opposite view : Henry Braun, Howard Wainer, Dan McCaffrey, Dale Ballou, J. R. Lockwood, Haggai Kupermintz – and I would add Dan Willingham, Jesse Rothstein, Linda Darling-Hammond.

    Paul – I assume you’ve seen the tests you’re talking about. I also assume there are some decent ones out there. The ones that I’m most familiar with are the California STAR tests for reading/language arts, and they are rather inadequate for the needs you describe. First of all, many of the reading comprehension questions don’t require any real reading – they require locating information. They rely too often on background knowledge (a point Robert and friends make convincingly). They are not instructionally sensitive – it’s easy for many students to answer many questions regardless of what was taught. Many questions do not allow valid inferences because there are multiple ways to arrive at correct answers. Example: The preceding passage was a biography because it a). told someone’s life story, b). explained biology, c). featured dramatic moments. So, if my son can answer that question (an actual sample from the state), is it because he can read the passage or because he already knows what a biography is? Sure, we want students to be able to answer that question regardless, but the question does not allow any diagnosis of teaching or learning. Furthermore, are you bothered by the fact that one study (need some time to find it) found that if you switched the test, you inverted the results of which teachers are “good” or “bad”? What about the study that showed VAM-based evaluations of teachers were not steady – add another year of data, and some of these L.A. teachers will look better or worse, and it will be tempting – but impossible – to know why.

    Comment by David B. Cohen — August 17, 2010 @ 10:56 pm

  9. Massachusetts standards, coupled with our MCAS tests, their history, and the level of success they’ve exemplified, speak for themselves.

    Yes, of course I’m familiar with the MCAS tests. Every teacher in Massachusetts who teaches math and/or ELA is familiar with them. Again, we received an item analysis of each test each year showing where our students excelled and where they had problems.

    Don’t know enough about California’s test to comment on them. If you have problems with the California Star test(s), perhaps the state DOE has an opportunity for public input into the process. Sometimes this avenue can be productive, especially if a number of teachers join the chorus.

    Comment by Paul Hoss — August 18, 2010 @ 6:17 pm

  10. [...] Value-Added: When Being Right Isn’t Enough « The Core Knowledge Blog Filed under: education — coopmike48 @ 8:52 pm Value-Added: When Being Right Isn’t Enough « The Core Knowledge Blog. [...]

    Pingback by Value-Added: When Being Right Isn’t Enough « The Core Knowledge Blog « Parents 4 democratic Schools — August 18, 2010 @ 11:52 pm

  11. The six reasons in Dan’s YouTube video explaining the “unfairness” of using value-added measures appear to make sense.

    Could these six excuses be defused if schools had random placements of students and examined the results over time? I believe they could.

    In addition, is it really fair to give Mr. X the major discipline problems every year because he can purportedly “handle” them? I don’t think so. The same case can be made for any marginal cohort of students (LD, needy, low readers, etc.). After all, is Mr. X being paid more for this more challenging case load every year? He is not.

    Taxpayers, and particularly parents, have a right to know which teachers are and which teachers are not getting the job done. US public schools simply cannot continue to use subjective administrative evaluations. They have proven to be too unreliable and often of no value.

    Random student placements would be a relatively major paradigm shift. So? Our schools need one to at least be headed in the direction of becoming more reputable (NYC, Georgia, etc.). If jointly administered by the local union and district administrators, random placements could be a major step toward making value-added information meaningful. Again, a major change, but one sorely needed.

    Comment by Paul Hoss — August 19, 2010 @ 7:20 am

  12. @phoss As a teacher who every year received students (and once, someone else’s entire class) because he could purportedly handle them, I wonder. If Gabriel is terrorizing Miss A’s class, and Juan is forever disrupting Miss B, you want to say, “We can’t put them both in Mr. P’s room because it would mess up our random distribution”? Seriously, Paul?

    Comment by Robert Pondiscio — August 19, 2010 @ 7:42 am

  13. Mr. Pondiscio,

    Why do you constantly defend the teachers? The teachers are the ones delivering the content and if they don’t do it correctly, they should have their records exposed and be disciplined or fired. You wrote over on Flypaper that Julian was wrong because posting of scores will “drive good teachers out of the profession”. If the teachers are so great, then why the worry over letting parents know what the teacher’s qualifications are?

    It seems to me that you care more about protecting bad teachers than about making sure that all kids are taught rigorous content, even if that means coming down on the teachers.

    Comment by AlexB — August 19, 2010 @ 4:07 pm

  14. @AlexB “Constantly defend teachers” is an overstatement. I am concerned, certainly, that there is more than a little “ready, fire, aim” in our rush to label teachers “good” or “bad,” and that using test scores alone to make those judgements is fraught with difficulties.

    And I’m in no way concerned with teachers’ “qualifications.” I’m “qualified” to teach, but I can safely say very little that I learned in the process of earning my “qualification” was of use in the classroom. And that’s one of the biggest overlooked problems in the teacher quality debate. We confuse ends and means virtually as a matter of policy. To wit: With approximately three million teachers needed to staff U.S. schools, there are not now—nor will there ever be—enough “great teachers” to go around. By all means, let’s get rid of bad teachers. But it should surprise no one when another bad teacher takes his or her place. Measuring teacher performance does nothing whatsoever to address how we ended up with so many bad teachers in the first place. I’ve said this so often I can now repeat it verbatim: The typical teacher in a low-performing school was poorly trained, has no say over curriculum (and as often as not, no curriculum whatsoever), little leverage on disciplinary issues, and often has to prepare and deliver lessons in a manner explicitly prescribed by administrators. Professional development typically adds nothing of value, and meaningful feedback is rare to nonexistent.

    This is less a method of producing good teachers than a hazing ritual. I’m not suggesting we’re producing bad teachers on purpose. But if we were, it might look an awful lot like what we’re doing. I’m a little more interested in fixing that — including curriculum and content, naturally — than running around like the Queen of Hearts screaming “Off With Their Heads!”

    Comment by Robert Pondiscio — August 19, 2010 @ 4:38 pm

  15. One other point, Alex: I suspect reading scores in elementary school, given equal implementation, are at least as sensitive to curriculum quality as teacher quality. Perhaps we should fire bad curricula?

    Comment by Robert Pondiscio — August 19, 2010 @ 5:11 pm

  16. Robert,

    Not sure what you’re suggesting. What I’m saying is Gabriel and Juan should be randomly assigned to the three or four classes at that grade level and not automatically placed in Mr. P’s class because he can handle them. That’s sexual discrimination, isn’t it? Is Mr. P being additionally remunerated for having these kids in his class?

    Over several years, these kinds of placements should even themselves out so that at-risk students are evenly distributed, again, over time.

    Comment by Paul Hoss — August 19, 2010 @ 8:53 pm

  17. @Paul Hoss I’m wondering, Paul, if you truly believe that we should exercise pure random distribution of students so that we can have better data on who’s a more effective teacher. In other words, it sounds like you’re suggesting that getting the cleanest data on teacher effectiveness is a greater good than exercising professional judgement on what might work for particular children.

    Comment by Robert Pondiscio — August 20, 2010 @ 12:07 am

  18. Robert, If we were looking at what would be best for any particular child then schools would be embracing CK and other reforms that enabled children to learn well. So far, this has not been on the radar of most major ed reform advocates.

    Given that the current popular ed reform is value-added assessment of teaching then, yes, we should make that data as reliable and valid as possible. How to do so has yet to be elucidated; but perhaps we should all just blindly follow the LA Times’ lead and ignore all the technical problems with these analyses. Feeling like we help children is always more important than actually doing so.

    Comment by Erin Johnson — August 20, 2010 @ 1:56 am

  19. Robert,

    Your example above of Gabriel and Juan, for me is fuzzy. Are you suggesting Miss A and/or Miss B should not have to tolerate these two disruptive boys because they can’t handle them as well as Mr. P might? Not fair, not fair at all.

    If Miss A and/or B can’t handle nudgey or disruptive kids, I don’t believe they should be in the classroom. Same for Miss X who has trouble teaching kids with learning disabilities or Miss Y who is not a strong math instructor. If they cannot handle the whole enchilada, what are they doing in the classroom? Beyond that, why should these “deficiencies” exclude them from any form of at-risk students? NOT FAIR. Again, does Mr. P get paid more for his ability to handle problem students? Is there some covert salary schedule out there the public doesn’t know about?

    As well, for me, many of these “perceptions” of teachers are hearsay and innuendo, especially from the viewpoint of parents. Too many parents request a teacher without any first hand knowledge of what goes on in that teacher’s classroom. They’ve never spent a minute in the teacher’s room, often times never even having met the teacher, but insisting that teacher would be best for their child. I can’t tell you how many parents insisted their child be in my room but they didn’t know me from Adam, probably couldn’t even pick me out of a police line-up. It really boggled my mind. And the other teachers at that grade level are somehow perceived as chopped liver? Wow, talk about a screwed up way to run an auto parts store.

    And AJ, that is one outstanding suggestion. I absolutely love it. Parents, because of political correctness, are NEVER called into question about the failure of their child(ren) in school, NEVER. But we as teachers know all too well, the job of a parent is the most important job on the planet. There are no prerequisites for these openings, and no license required. As I’ve stated before, it’s easier and less expensive to become a parent than it is to obtain a fishing license.

    Comment by Paul Hoss — August 20, 2010 @ 6:39 am

  20. Mr. Pondiscio,

    “But it should surprise no one when another bad teacher takes his or her place. Measuring teacher performance does nothing whatsoever to address how we ended up with so many bad teachers in the first place.”

    Are you saying that it’s always the fault of having a questionable curriculum? Should we dumb down the curriculum to ensure that all the teachers are always successful? Isn’t letting teachers know when they’ve proven not to have taught the content better than just letting the teacher continue to teach?

    “has no say over curriculum (and as often as not, no curriculum whatsoever)”

    Many teachers do in fact have control over the curriculum. That’s precisely why we’re having this issue in the first place. Low test scores are evidence that the teacher hasn’t taught the material. Sanctions against the teacher need to be applied in order to increase performance. You assume that teachers will feel too ashamed to want to improve. The assumption that it’s the result of systematic issues like curriculum and discipline presupposes that the teacher is always the unfortunate child in a divorce case and that all the problems are due to the fighting parents when, in reality, the teacher is more like one of the kids fighting with the other in a tug of war and laughs when the other kid gets knocked to the ground.

    It seems that your teaching experience leads you to side with the teachers’ unions rather than the kids.

    Comment by AlexB — August 20, 2010 @ 12:39 pm

  21. Another thing: Are you saying that content should be “developmentally appropriate?”

    Comment by AlexB — August 20, 2010 @ 12:42 pm

  22. @Alex The fastest way to get me to completely lose interest in a discussion about education is to start throwing around sophomoric formulations like “siding with the teachers’ unions rather than the kids.” Fair warning. I also don’t care much for the word “always” in such discussions, e.g. “are you saying that it’s always the fault of having a questionable curriculum?”

    To be clear, what I’m saying is that I find it curious — irresponsible, frankly — that teacher quality debates tend to assume that all other things are not only equal, but functional: that we have in place good curriculum, training, professional development, ed schools, etc. And since all of these are in fine working order, when there’s a breakdown, it must be on the teacher. There are lots of bad teachers as you’d expect if you had three million people doing any task. The point that no one seems terribly interested in is how many of them fail despite doing exactly what they’ve been trained to do with the resources and materials they’ve been given.

    I also disagree with the idea that many teachers have control over their curriculum. Low test scores are NOT evidence that the teacher hasn’t taught the material. Take reading, for example. What “material” should be taught? The alphabet? Phonics? Fine if you’re teaching kindergarten, but what — what exact “material” do you suggest a 4th grade teacher should be held accountable for teaching? And how do you propose to assess it?

    I’m going to suggest that you take a look at an article E.D. Hirsch and I wrote recently for the American Prospect which looks at some of these issues.

    http://www.prospect.org/cs/articles?article=theres_no_such_thing_as_a_reading_test

    The problem is that reading comprehension is not “material” to be covered but rather a reflection of the lifetime sum of a student’s general knowledge. But we mistakenly conceive of it as a generic, all-purpose and transferable skill. I’m fond of saying “teaching content is teaching reading.” This is in no way reflected in the pedagogy or curriculum of most U.S. schools, and it is certainly not reflected in the way most schools of education train and support teachers. Most of all, it’s not even the way the vast majority of reading teachers were taught to teach reading.

    So how then, does it make sense to hold teachers exclusively responsible for the field’s general lack of understanding about how reading and reading assessment works?

    Comment by Robert Pondiscio — August 20, 2010 @ 1:26 pm

  23. Mr. Pondiscio,

    I know for a fact that state departments of education do in fact define content at each grade level. Take a look at the math content at each grade level in Kentucky- http://www.education.ky.gov/KDE/Instructional+Resources/
    Curriculum/+Documents+and+Resources/Teaching+Tools/
    Combined+Curriculum+Documents or the math standards for California- http://www.cde.ca.gov/be/st/ss/documents/mathstandard.pdf

    Your argument that it’s the fault of teacher training schools and state departments of education for not defining core content by grade level and that explains teacher ineffectiveness doesn’t impress me.

    “So how then, does it make sense to hold teachers exclusively responsible for the field’s general lack of understanding about how reading and reading assessment works?”

    If all teachers don’t know how reading and assessing it works, then they shouldn’t be teaching in the first place!
    The task is to get teachers who will follow the state standards, which, at this juncture, unfortunately doesn’t seem very likely.

    Comment by AlexB — August 20, 2010 @ 2:56 pm

  24. The actual address to get to Kentucky’s core content is-

    http://www.education.ky.gov/KDE/Instructional+Resources/Curriculum/+Documents+and+Resources/Teaching+Tools/Combined+Curriculum+Documents

    Teachers know what to teach. The problem is that they don’t want to be held accountable when they don’t.

    Comment by AlexB — August 20, 2010 @ 3:02 pm

  25. @AlexB. We’ve been talking — at least I have been talking (and so was the LA Times, chiefly) — about reading. Not Math or other areas of instruction.

    <<< Your argument that it’s the fault of teacher training schools and state departments of education for not defining core content by grade level and that explains teacher ineffectiveness doesn’t impress me.

    Happily, my goal is not to impress you.

    <<< If all teachers don’t know how reading and assessing it works, then they shouldn’t be teaching in the first place!

    By what agency, I wonder, would you expect teachers to gain this superior knowledge, when their ed schools, professional development, feedback and formal assessment do not account for it? This is precisely my point, which you consistently or willfully ignore: Our education system is constructed on incorrect and demonstrably false assumptions about the nature of reading (cf. E.D. Hirsch’s entire body of work over the last 25 years) which can be summarized as follows: garbage in, garbage out. If you still insist that teachers should bear the lion’s share of the blame for the garbage, or that they should function as alchemists, turning garbage into gold, I’ll not try to dissuade (or impress) you. But do me the kindness of getting in touch in, say, 20 years’ time to let me know how the teacher-quality-is-all-that-matters reform worked out. I’ll be laboring in other vineyards and eager to hear the news.

    Comment by Robert Pondiscio — August 20, 2010 @ 3:06 pm

  26. [...] In a strikingly simple U-tube  cognitive scientist Dan Willingham shows us six reasons why “value-added” scores are NOT valid and reliable teacher assessment tools. [...]

    Pingback by Why Are Teachers Suspicious of Test Score Driven Assessments? Why We Can’t De-Link Evaluation from Supervisory/Peer Judgements? | Ed In The Apple — August 21, 2010 @ 6:59 pm

  27. @AnthonyGuzzaldo,

    Look at the ‘Goals’ at the bottom of the Illinois Social Science (not studies as you claim) Standards Page. You’ll see that in Goal 16 for history, it lays out exactly what teachers should teach by listing the historical eras in world history and emphasizes American history in the early grades. If that isn’t good enough for you, you shouldn’t be teaching at all. You’re like many teachers I’ve met: lazy and inaccurate.

    Comment by AlexB — August 23, 2010 @ 12:40 pm

  28. Mr. Pondiscio,

    Let’s suppose that you’re correct: our educational system doesn’t understand how to teach reading. Perform the following thought experiment: If we suddenly knew tomorrow how to teach every child the rudiments of reading and writing and their test scores increased, what would be your reaction? Do you seriously want every child to score at the proficient level?

    Look at it this way: If reading test scores went up, that would be an indication that the kids were learning something because all states do have content standards for what kids should read already. A national curriculum is unnecessary because it would dumb down what states are already doing. Massachusetts is a good example. The reason why that isn’t the case today is because of the teaching profession and how it resists reform because if every child were successful, the teachers would get accused of dumbing down instruction so that every kid could pass. The teaching profession hasn’t been willing to abide by and defend the state standards because they are more worried about their political chances with an ignorant public than doing right by kids. They do know what to teach because if they didn’t, they never would have survived college and gone into the teaching profession in the first place.

    I agree with Hirsch that content is important. I disagree with his assertion that we don’t have any.

    Comment by AlexB — August 23, 2010 @ 12:54 pm
