An Inconvenient Truth About Teacher Quality

by Robert Pondiscio
December 5th, 2011

If teacher quality is the most important school-based factor in student outcomes, then why are math scores rising, while reading scores stay flat?  Do we just happen to have really good math teachers and really lousy reading teachers?  That can’t be: in the case of 4th grade teachers, the exact same teachers are responsible for both subjects.

Or maybe it’s not the teachers. Could it be the curriculum?

That’s the question posed by Dan Willingham and David Grissmer in an op-ed in the New York Daily News this morning. They point out intriguing data from the National Assessment of Educational Progress that has been hiding in plain sight:

“Reading scores over the last 20 years have been flat. But in math, scores have increased markedly. A fourth-grader at the 50th percentile in 1990 would score at about the 25th percentile compared to the kids taking the test in 2009. That’s an enormous improvement.

“This raises an uncomfortable question for teacher quality advocates: If teachers are so vitally important, why have fourth-grade math scores dramatically improved, but reading scores have flatlined, given that, at least at the elementary level, the same teachers are responsible for each?”

Perhaps the secret sauce is not who’s teaching but what’s being taught.  It’s a lot easier to align standards, curriculum and assessment in math. “There is little controversy as to the subject matter to be covered, and the order in which one ought to tackle subjects is more obvious,” Willingham and Grissmer write.  “Indeed, substantial effort has been made over the last 25 years to develop coherent math standards and curricula from K-8.”

In reading? Not so much.

As we’ve discussed many times on this blog, there’s no direct connection between the subject matter that gets taught and what gets tested in reading. We teach random, incoherent content that bears no relation to the passages children ultimately encounter on their reading tests. We insist on teaching and testing the “skill” of reading comprehension when it’s clearly not a skill at all. Willingham and Grissmer conclude:

“Yes, overall teaching quality would improve with a more sensible method to usher hapless teachers out of the profession. Better teacher training would help too. But in addition to these longer-term goals, policymakers ought to focus on ensuring that the unglamorous but vital work of curriculum design is done properly. The popular perception is that America’s teachers are largely ineffective compared to international peers. But the data show that when given a clear, cogent curriculum to work with, they’re a lot stronger than we think.”

Education Week

by Guest Blogger
November 21st, 2011

by Jessica Lahey

Last Friday, the Illinois State Board of Education proposed new rules that will link teacher performance to their students’ performance on assessments. Up to thirty percent of teacher evaluations will be based on how students perform on tests, and while I understand the value of student progress in evaluating teachers, it’s certainly not the main thing that determines success in education. My mind has been on assessments lately because I just came out of a week defined by what I initially labeled a colossal assessment failure. I gave unit tests to cap off a couple of weeks in Latin and English grammar, and things did not go well. My students failed, failed, failed, and as teachers are wont to do, I used the transitive property and concluded that I had failed, failed, failed.

I spent the following weekend going over the assessments, my preparation, my teaching, and the students’ homework scores, and found that the week of failure was much more complicated than one faulty assessment or a failure to teach some critical aspect of the lesson. As I could not go back and re-do the previous month of teaching, I decided to move forward and figure out how to turn failure into a learning experience. Once some time had passed, and I’d gained the benefit of hindsight, I wrote about the solution I came up with in my blog, Coming of Age in the Middle. I wrote about my teaching methods, but mostly, I wrote about how I had managed to make it through the week without tucking my tail between my legs and quitting my job.

A writer friend of mine liked the post, one thing led to another, and the next thing I knew, my failure was in the Gray Lady herself. When K.J. Dell’Antonia wrote her piece on my blog, titled “What Good Teachers Do When Kids Fail,” in the New York Times’ parenting blog Motherlode, the comments fell into two distinct camps: Parents who wished their teachers had more time to address student failure and teachers who lamented that they had no time to address student failure. A few teachers wrote about the time they took for re-writes and remedy, but for the most part, the message from educators was one of regret and frustration with a testing-centric schedule that did not allow for reflection.

The solution I came up with for my students required humility on both sides of the classroom – I had to admit I had failed my students and my students had to admit that they had not held up their end of the pedagogical bargain – but mostly, it took time. Time that, according to the comments after the article, most teachers just don’t have. I handed out blank tests and asked the students to re-take the assessment as an open book exercise. They were asked to work in pairs I had strategically assigned, and teach each other the material on the test. They were required to not only find the correct answer, but to show why all of the other answers were wrong. This process ate up two classes, and as I only see my Latin students twice a week, this one remedial exercise burned an entire week of the school year. Clearly, this is simply not an option in many classrooms. Maria, from Baltimore, MD, wrote:

“I am a public high school math teacher. It’s only November, and I’m already 10 days behind schedule in one class, 3 days behind in another. And this is without me taking any sick days, no snow days, just a few days away from class for . . . you guessed it, administering the No Child Left Behind tests. I would love to have students retake their tests and learn from mistakes, but thanks to NCLB, and curricula that are an inch deep and a mile wide, we need to press on to the next topic.”

Many comments stressed the vital role that failure plays in education. Dr. Kim, from Ithaca, NY, wrote:

“We need to allow students opportunities to fail. Too often our kids are afraid of failure. If we don’t fail, we’re not pushing our limits–we’re not challenging ourselves. I have a friend who is an amazing skier who says “if you don’t fall, you’re not pushing yourself hard enough.” This is true. Plus, we learn much more from failure. Our brains are programmed to remember those things with strong emotional attachments — positive or negative. Failures are memorable.”

I completely agree that some of the best lessons are learned from failure. Failure can shock a student out of complacency, particularly among those students who are smart enough to do well on a bare minimum of effort. Middle school is the ideal time for this type of shock; the stakes are still low(ish) and the potential for growth is huge. I’m not one for sports quotes, but in this case, baseball player and coach Vernon Law had it right: “Experience is a hard teacher because she gives the test first, the lesson afterwards.” It would have been much easier to teach the lessons first and give the test after, but in the end, I think the experience taught all of us a greater lesson. Everyone has to admit to failure – teacher and student. As a result of this failure, I grew as a teacher and they grew as students. Crossroads Academy was built on a core virtues curriculum as well as a core knowledge curriculum, so our journey through this week of failure became an important part of the students’ character education. That’s where commenter T. Zinner of Boston hits the nail on the head:

“This article goes to the heart of our goal as parents and the ideal of teachers: creating individuals with strength of character. The happiest and most successful people seem to be the individuals who take their talents and face obstacles either directly with perseverance or creatively so that the obstacles are no longer viewed as challenges. This is the case for the most exceptional physicians I work with, the patients who live fully despite illness and friends and neighbors who create lives of joy and depth in the face of unexpected loss or change in circumstance.”

That’s the kind of teaching I love to do, teaching that helps students become better people, teaching that takes into account the unpredictability inherent in teaching adolescents.

But this sort of teaching is increasingly not what is valued today, and it’s certainly not what counts as quality teaching or a gauge of student progress. Failure makes people nervous because in order to find anything of value in the situation, everyone has to face their role in the failure. It would have been much easier for me to fail the students and move on, or curve the exam so much that the failure got lost in a sea of amended numbers. The grades would have looked good, the students would have felt good, and everyone would have been satisfied with my performance. But lurking under this neat and tidy appearance, my students would know. They would know they had not really learned the material, that I had swept something under the rug. Worse, I would know that somewhere down the line that gap in their education would come back to haunt them.

Assessments are often blunt instruments, and deciding a teacher’s worth based on student testing captures just one small fraction of the learning that goes on in the classroom. This one assessment failure taught me valuable lessons about my teaching methods, the quality of my assessments, and the courage of my students. Two of my students summed up our week perfectly as they handed in their remedy exams: “I think I learned more from that one failing grade than from any A,” and “You know, now that we have gone through every question, that test really wasn’t that hard.”

My sentiments exactly.

Neither Good Nor Bad

by Robert Pondiscio
September 13th, 2010

Assessment is nearly a constant feature of a decent classroom.  Every time a teacher asks a question in class, leads a discussion, conferences with a child about his work, looks at homework, or glances over a student’s shoulder while she is writing, he or she is assessing–making a judgement about what the child knows, can do, and needs help with.  A baseline idea in education is “assessment drives instruction” — in order to meet a child where he or she is, you have to know where exactly that is. 

For a teacher, this is among the blandest, most obvious statements imaginable. So why bring it up? The New York Times, as it is wont to do, has discovered that young children in China are tested constantly, from “mad minute” math quizzes to science exams. Elisabeth Rosenthal writes that for her two young children attending elementary school in China, “taking tests was as much a part of the rhythm of their school day as tag at recess or listening to stories at circle time.”

“In Asia, such a march of tests for young children was regarded as normal, and not evil or particularly anxiety provoking. That made for some interesting culture clashes. I remember nearly constant tension between the Asian parents, who wanted still more tests and homework, and the Western parents, who were more concerned with whether their kids were having fun, and wanted less.”

Point taken.  Another recent New York Times piece described the benefits of testing as a learning tool.  “The process of retrieving an idea is not like pulling a book from a shelf; it seems to fundamentally alter the way the information is subsequently stored, making it far more accessible in the future,” observed the Times’ Benedict Carey. 

So assessment is fundamental to teaching, and testing is not only a form of assessment, but a potentially powerful learning tool. Still, I worry that the wrong takeaways will result from these pieces. As with, well, everything in education, the risk of oversimplification here is great. Surely, there is a difference between the constant assessment, formal and informal, that takes place in nearly every good classroom and drives instruction, and the annual ritual of high-stakes testing that now dominates elementary and middle schools. Likewise there is a difference between studying and mastering a body of material for, say, a biology or geometry test, and preparing for a state reading test. There is no body of knowledge to study for a reading test. Test-taking skills and reading strategies that might provide a short-term boost are deleterious in the long run. Countless hours of test prep and strategy sessions are educationally unproductive. And it would be naive in the extreme to suggest high-stakes tests are not materially different from a workaday math quiz in the anxiety they produce.

The bottom line, as always: it’s complicated. These issues are not simply about “testing good” or “testing bad.” With the exception of a few anti-testing zealots (and they are few indeed), it is the rare educator who is opposed to all testing. Indeed, it’s almost impossible to teach at all without assessing your students on a nearly constant basis, formally and informally. Yes, lots of kids love to compete against themselves and classmates on “mad minute” math drills. Yes, classroom tests focus the mind and efforts of students to master material. Unfortunately, none of these things is true of high-stakes reading and math tests. They don’t drive instruction because months go by before you get the results. No bragging rights or competitive juices are fired by them. And reading tests are impossible to study for since they are constructed on a mistaken notion of reading as a transferable skill. It’s possible to be a firm believer in testing, even high-stakes testing, yet have misgivings about its impact on education.

Diane Ravitch on Teacher Evaluation and Value-Added

by Guest Blogger
November 18th, 2008

by Diane Ravitch

In his post, “Getting Value-Added Right,” Robert raises excellent questions, and his restaurant metaphor is apt. The value-added growth model, as Dan Willingham notes in the comments section and his post on the Britannica Blog, is not ready for prime time. There are too many intervening variables to hold teachers solely accountable for the test-score growth of every student. Given high rates of mobility, there is a large fluctuation in the student population in schools. As Thomas J. Kane and Douglas O. Staiger point out in one of their papers, the inherent volatility of test scores makes them a poor basis for an accountability system.

“The imprecision of test score measures arises from two sources. The first is sampling variation, which is a particularly striking problem in elementary schools. With the average elementary school containing only sixty-eight students per grade level, the amount of variation stemming from the idiosyncrasies of the particular sample of students being tested is often large relative to the total amount of variation observed between schools. The second arises from one-time factors that are not sensitive to the size of the sample; for example, a dog barking in the playground on the day of the test, a severe flu season, a disruptive student in a class, or favorable chemistry between a group of students and their teacher. Both small samples and other one-time factors can add considerable volatility to test score measures.”
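
The sampling-variation point can be made concrete with a rough sketch. It uses the textbook standard error of a mean (standard deviation divided by the square root of the sample size), not Kane and Staiger's own method. The spread of 30 scale points and the comparison group of 680 students are assumptions chosen purely for illustration; only the figure of sixty-eight students per grade comes from the passage.

```python
import math


def standard_error(score_sd, n_students):
    """Standard error of a grade's mean score for a sample of n_students."""
    return score_sd / math.sqrt(n_students)


# Hypothetical spread of individual student scores (an assumption, not a
# figure from the passage).
ASSUMED_SCORE_SD = 30.0

# Sixty-eight students per grade is the average cited in the passage; the
# tenfold larger group is a made-up comparison.
se_one_school = standard_error(ASSUMED_SCORE_SD, 68)
se_ten_schools = standard_error(ASSUMED_SCORE_SD, 680)

print(f"Uncertainty in the mean, 68 students:  {se_one_school:.2f} points")
print(f"Uncertainty in the mean, 680 students: {se_ten_schools:.2f} points")
```

Under these assumptions, the noise in a single grade's average shrinks only with the square root of the number of students tested, so a small school's year-to-year swings can easily be mistaken for real change.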

There are many, many reasons why one-year changes in scores are not reliable. There are many reasons why it is hard to give credit or blame for students’ test score gains and losses from year to year. Until we have better tests and have ironed out many of the confounding variables, we cannot draw credible inferences about teacher performance from test scores, and it is unfair to use such data to dispense rewards and punishments.

There is another reason to worry about value-added growth models that determine a teacher’s fate and compensation. If we turn teaching into an activity whose sole purpose is to produce gains on tests that we know are mainly low-level and dumbed-down, we will not make education better. We may succeed in destroying it altogether. We had better find ways to emphasize the quality of curriculum (think Core Knowledge) and to de-emphasize the number of times that kids are asked to check off a box on standardized tests in the course of a month. Otherwise our education system will be far worse than ever.

Diane blogs on education at Bridging Differences — ed.

Low-End Grade Inflation

by Robert Pondiscio
June 10th, 2008

USA Today

Not content with making 50 the new zero, one North Carolina school district is considering imposing a lowest possible grade of 61 for tests or assignments.

Proponents of eliminating zeroes as grades for work not submitted point out that A, B, C, and D letter grades typically signify increments of ten (an A is 100 to 91, a B is 90 to 81, etc.), but there is a 60-point spread between D and F, which makes it mathematically impossible for some failing students to ever catch up.
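
The arithmetic behind that claim can be sketched in a few lines. This is a minimal illustration, assuming grades are a simple unweighted average and that 61 is the lowest passing score (the bottom of the D range implied by the scale above); the function names and the six-assignment scenario are hypothetical, not the district's actual policy.

```python
def required_average(scores_so_far, remaining, passing=61.0):
    """Average needed on the remaining assignments to finish at `passing`."""
    total_assignments = len(scores_so_far) + remaining
    points_needed = passing * total_assignments - sum(scores_so_far)
    return points_needed / remaining


def apply_floor(scores, floor):
    """Raise every score below `floor` up to `floor`."""
    return [max(score, floor) for score in scores]


# Hypothetical student: three of six assignments never submitted.
missed = [0, 0, 0]
remaining = 3

with_zeroes = required_average(missed, remaining)
with_floor_50 = required_average(apply_floor(missed, 50), remaining)
with_floor_61 = required_average(apply_floor(missed, 61), remaining)

print(f"Zeroes stand: {with_zeroes:.1f}")
print(f"Floor of 50:  {with_floor_50:.1f}")
print(f"Floor of 61:  {with_floor_61:.1f}")
```

In this scenario the student would need to average more than 100 on the remaining work if the zeroes stand, which is the sense in which catching up becomes mathematically impossible; with a floor in place, the required average falls back within reach.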

“There is little or no evidence that repeated failure makes people more responsible,” Sherri Martin, the Chapel Hill-Carrboro district’s director of high school programming, said at a Board of Education meeting last week. “The threat of a low grade is more likely to motivate high-achieving students than low-achieving students.”

Many parents and teachers disagree. “The system for years had talked about raising expectations for all children in the district, and I don’t feel that demonstrates raised expectations for everybody,” said Beth Ann Ghio, whose son is an East Chapel Hill High junior. “I don’t think that’s fair for children who actually submit the work — even if it’s not passing quality — that they receive the same grade as a student who doesn’t submit anything.”

There’s No “A” In Whole Child

by Robert Pondiscio
February 15th, 2008

Writer and parent Maura J. Casey complains in the New York Times (So Is That Like An A?) about report cards in Hartford, Connecticut. The reports, clearly not cards, are up to seven pages long and grade a child on how he or she “establishes and maintains a healthy lifestyle by avoiding risk-taking behavior” and 57 other academic, social and behavioral criteria. In music class, for example, students are being graded on how they make “connections between music and other disciplines through evaluation and analysis of compositions and performances.”

It’s no mere rant. Casey points out that the academic measurements, which are designed to grade areas of student performance that are also measured on state standardized tests, seem more likely to confuse than illuminate. “I confess that as a parent, I’ve always focused on the basics. I want my children to be curious, enjoy learning, to read for pleasure, to be polite, to do their homework and to try not to hate school. If my kids got A’s or B’s, I got a pretty good sense that they were mastering the necessary skills. If they did much worse, I knew that it was time to call their teachers,” Casey writes.

In cities like Hartford, where many students come from non-English speaking homes, Casey points out that educational jargon like “uses numeracy and literacy skills to describe, analyze and present scientific content, data and ideas” seems destined to confuse, not clarify. “If report cards are weighed down with educational jargon that even native English speakers have to struggle to understand,” she concludes, “it is fair to ask who the administrators are really reporting to: students and their families or the educational bureaucracy?”