The Work of a Great Test Scientist Helps Explain the Failure of No Child Left Behind

by E. D. Hirsch, Jr.
January 10th, 2013

In Praise of Samuel Messick 1931–1998, Part II

In a prior post I described Messick’s unified theory of test validity, which judged a test not to be valid if its practical effects were null or deleterious. His epoch-making insight was that the validity of a test must be judged both internally for accuracy and externally for ethical and social effects. That combined judgment, he argued, is the only proper and adequate way of grading a test.

In the era of the No Child Left Behind law (2001), the looming specter of tests has been the chief determiner of classroom practice. This led me to the following chain of inferences: Since 2001, tests have been the chief determiners of educational practices. But these tests have failed to induce practices that have worked. Hence, according to the Messick principle, the tests that we have been using must not be valid. Might it be that a new, more Messick-infused approach to testing would yield far better results?

First, some details about the failure of NCLB. Despite its name and admirable impulses it has continued to leave many children behind:

 

NCLB has also failed to raise verbal scores. The average verbal level of school leavers stood at 288 when the law went into effect, dropped to 283 in 2004, and stood at 286 in 2008.

Yet this graph shows an interesting exception to this pattern of failure, and it will prove to be highly informative under Messick’s principle. Among 4th graders (age 9) the test-regimen of NCLB did have a positive impact.

Moreover, NCLB also had positive effects in math:

This contrast between the NCLB effects in math and reading is even more striking if we look at the SAT, where the test takers are trying their best:

So let’s recap the argument. Under NCLB, testing in both math and reading has guided school practices. Those practices were more successful in math and in early reading than in later reading. According to the Messick principle, therefore, reading tests after grade 4 had deleterious effects and cannot have been valid tests. How can we make these reading tests more valid?

A good answer to that question will help determine the future progress of American education. Tune in.

An Inconvenient Truth About Teacher Quality

by Robert Pondiscio
December 5th, 2011

If teacher quality is the most important school-based factor in student outcomes, then why are math scores rising, while reading scores stay flat?  Do we just happen to have really good math teachers and really lousy reading teachers?  That can’t be: in the case of 4th grade teachers, the exact same teachers are responsible for both subjects.

Or maybe it’s not the teachers. Could it be the curriculum?

That’s the question posed by Dan Willingham and David Grissmer in an op-ed in the New York Daily News this morning. They point out intriguing data from the National Assessment of Educational Progress that has been hiding in plain sight:

“Reading scores over the last 20 years have been flat. But in math, scores have increased markedly. A fourth-grader at the 50th percentile in 1990 would score at about the 25th percentile compared to the kids taking the test in 2009. That’s an enormous improvement.

“This raises an uncomfortable question for teacher quality advocates: If teachers are so vitally important, why have fourth-grade math scores dramatically improved, but reading scores have flatlined, given that — at least at the elementary level — the same teachers are responsible for each?”

Perhaps the secret sauce is not who’s teaching but what’s being taught.  It’s a lot easier to align standards, curriculum and assessment in math. “There is little controversy as to the subject matter to be covered, and the order in which one ought to tackle subjects is more obvious,” Willingham and Grissmer write.  “Indeed, substantial effort has been made over the last 25 years to develop coherent math standards and curricula from K-8.”

In reading? Not so much.

As we’ve discussed many times on this blog, there’s no direct correlation between the subject matter that gets taught and tested in reading.  We teach random, incoherent content that bears no relation to the passages children ultimately encounter on their reading tests.  We insist on teaching and testing the “skill” of reading comprehension when it’s clearly not a skill at all.  Willingham and Grissmer conclude:

“Yes, overall teaching quality would improve with a more sensible method to usher hapless teachers out of the profession. Better teacher training would help too. But in addition to these longer-term goals, policymakers ought to focus on ensuring that the unglamorous but vital work of curriculum design is done properly. The popular perception is that America’s teachers are largely ineffective compared to international peers. But the data show that when given a clear, cogent curriculum to work with, they’re a lot stronger than we think.”

NAEP: Proof of Education Insanity

by Robert Pondiscio
November 7th, 2011

The following post by Lynne Munson appeared originally on the blog of Common Core, a Washington, DC-based organization that works to promote a liberal arts, core curriculum in U.S. schools.  Munson is Common Core’s executive director and a former deputy chairman of the National Endowment for the Humanities — rp.

I challenge anyone to think of a nation that works as hard as we do to find silver linings in its educational failures. On Tuesday morning NAEP reported that, in the course of two years, our nation’s 4th and 8th graders improved a single point (on a 500-point scale) in three of four reading and math assessments, and flatlined on the fourth. If you look at figures plotting NAEP scores over the last 30 years, any upward slope in the data is nearly undetectable to the naked eye. Analysts have spent the last few days slicing and dicing this data and making unconvincing arguments that some positive trends can be detected.

But the reality is that these results are appalling—particularly if you consider the massive federal funding increases, intense reform debates, and the incessant promises of new technologies that have dominated the education discussion for nearly two decades. We have spent a great deal and worked very hard but gotten unimpressive results. And this is in reading and math where, to the detriment of so many other core subjects, we’ve aimed nearly all of our firepower.

Einstein* defined “insanity” as “doing the same thing over and over but expecting different results.” Well, my bet is that Einstein would have deemed NAEP data absolute proof of America’s educational insanity.

We’ve spent the last twenty years attempting to make what, on the surface, appears to be a diverse, creative, and wide-ranging series of reforms to public education. We’ve tried to bring market pressures to bear through charters and choice. We’ve attempted to set high standards and given high-stakes tests. We’ve experimented with shrinking school and class sizes. We’ve focused on “21st century skills” and used the latest technologies. We’ve collected and analyzed data on an unprecedented scale. We’ve experimented with a seemingly endless array of “strategies” for teaching reading and math and have tried to “differentiate” for every imaginable “type” of student. And we’ve paid dearly in tax dollars and in other ways for each of these “reforms.”

Interestingly, all of these reforms have one thing in common (aside from their failure to improve student performance except in isolated instances): None deals directly with the content of what we teach our students.

Maybe we need to give content a chance. What I mean by “content” is the actual knowledge that is imbedded in quality curricula. Knowledge of things like standard algorithms, poetry, America’s past, foreign languages, great painters, chemistry, our form of government, and much more. There are a few widely used curricula (e.g. International Baccalaureate, Latin schools curricula, Core Knowledge) that effectively incorporate much of this knowledge base. And performance data strongly suggests that these curricula work for ALL students.

So let’s draw on such successes and, sure, conduct more research, do more experiments, and spend more money. But let’s do it to build a shared understanding of what our students need to learn: the content they need to learn. Then let’s use the best technology available and make the kind of investments we need in professional development to teach that content effectively. In light of the poor results other approaches have yielded, is there any other sane course?

Et tu, Yglesias?

by Robert Pondiscio
June 16th, 2011

Oh no he didn’t!

American Progress pundit Matthew Yglesias commits the unpardonable sin of repeating what has to be the most wrong-headed idea in all of education: that teaching kids content can wait until they’ve learned the “skill” of reading. He wraps up a column on the just-released NAEP history scores with this jaw-dropper:

“What we’re seeing, in particular, is that trying to teach history to kids who can’t read is a fool’s errand. Focusing more clearly on making sure that kids aren’t falling behind in their core skills is helping the worst-off kids do better across the board even at history.”

Teacher/blogger Rachel Levy sets Yglesias straight, so I don’t have to.

A Curious Takeaway

by Robert Pondiscio
July 15th, 2010

An interesting experiment shows 12th graders’ scores on the National Assessment of Educational Progress (NAEP) go up when students are paid for correct answers on the exam.  The study by researchers from ETS and Boston College has reignited the debate (the hardiest of perennials) about intrinsic motivation and paying kids.

“Though the testing program is considered a national barometer of student achievement, there really isn’t much of an incentive, after all, for students to do well,” Edweek’s Debra Viadero writes. “Scores from NAEP assessments don’t show up on a report card or count toward graduation requirements. Likewise, colleges never see NAEP scores when students apply for admission.” The study purports to show “credible evidence” that NAEP underestimates the reading abilities of students enrolled in 12th grade. “Responsible officials should take this into account as they plan changes to the NAEP reading framework and expand the scope of the 12th-grade assessment survey,” the authors conclude.

In other words, things aren’t as bad as they seem?   To my mind the salient point about 12th grade NAEP is that it has read like a dead man’s EKG for 40 years.  Unless you’re ready to suggest that high school seniors were more motivated to do well in years gone by (and show evidence) then NAEP has consistently underestimated reading abilities that entire time.   Under the same testing conditions (no pay) over several decades, there has been zero change in outcome.  

Or am I missing something?

NAEP and the 4th Grade Fall Off

by Robert Pondiscio
April 7th, 2010

Dan Willingham sorts through NAEP data and points out that Chad Aldeman is right that 4th grade reading scores look much better when the data are disaggregated by race. But the bigger story is the lack of progress among all groups in 8th and 12th grade reading. If 4th grade scores are improving, he asks, why are the gains evaporating just four years later?

“Fourth grade scores have been improving because we’ve gotten better at teaching kids how to decode–that is, how to translate letters into sounds. In the fourth grade some kids are good decoders and some are not, so differences in reading scores are largely differences in decoding.”

In the later grades, reading comprehension is largely a function of background knowledge. The problem is not, as one NAEP board member said, that we’re not asking kids to read enough. “In fact, Americans are reading more text than they ever have before,” Willingham writes at the Washington Post’s Answer Sheet blog. “And kids in lower elementary already spend half their time on language arts, and less than ten percent of their time on social studies and science, combined.” There’s the rub.

The belief that kids will be better readers if we simply get them to read more is rooted in the belief that reading comprehension is a transferable skill that, once mastered, applies to any text. That’s true of decoding, but not of comprehension.  What’s needed is a substantial knowledge base. Knowledge of the content they are likely to encounter when reading the sorts of materials we expect them to read confidently: newspapers, magazines, and serious books.

“Until we start paying more attention to content, expect flat reading scores,” Willingham concludes.

This is as good a time as any to remind those who haven’t seen it to check out Willingham’s YouTube video, Teaching Content is Teaching Reading, which has been viewed over 32,000 times.  I recommended it recently to a friend in California who is a first-year teacher.  She said she’d already seen it in her ed school classes. 

Slowly, slowly…

TUDA Mathematics Results Out Today

by Robert Pondiscio
December 8th, 2009

New NAEP numbers, the Trial Urban District Assessment or “TUDA” for math, are out this morning, looking at 4th and 8th grade samples from 18 urban school districts. From the IES news release:

  • In comparison to 2007, scores improved in two districts at each grade in 2009.  Scores did not change for the remaining nine districts that participated in 2007.
  • Five districts at both grade 4 and grade 8 had higher scores than large cities nationally in 2009. Ten districts had scores lower than large cities at both grades.
  • When compared to 2003, the 2009 average mathematics scores at grade 4 were higher in eight out of ten participating districts, and in nine out of ten participating districts at grade 8.
  • Average mathematics scores in 2009 were higher for Hispanic fourth-graders in seven out of ten participating districts, when compared to 2003. Over the same period, White and Black fourth-graders achieved higher scores in five districts each.

The full report is available at http://nationsreportcard.gov

Flatline! Call a Code Blue!

by Robert Pondiscio
October 14th, 2009

Reactions to today’s dispiriting NAEP scores….

“The trend is flat; it’s a plateau. Scores are not going anywhere, at least nowhere important. That means that eight years after enactment of No Child Left Behind, the problems it set out to solve are not being solved, and now we’re five years from the deadline and we’re still far, far from the goal.” (Chester E. Finn, Jr., Thomas B. Fordham Institute)

“Had we had 19 years of flat results and one year of increases in one subject, we wouldn’t celebrate. Similarly, we shouldn’t press the panic button over one year of stalled growth in one subject…this is far from convincing evidence that NCLB failed or education reform is doomed.” (Andy Smarick @ Flypaper)

“It’s clear from the data at both grade levels that we still have a long way to go to effectively prepare all of our elementary and middle school students for the world that awaits them in high school and beyond.” (Kati Haycock, President of The Education Trust)

“Supporters of the No Child Left Behind Act–and I’ve generally been one of them–hoped that the law would catalyze a major upward move in student achievement. That hasn’t happened.” (Kevin Carey @ The Quick and The Ed)

“Seeing stuff flat-line is not what we want as a country — seeing achievement gaps that are unacceptably large.  The status quo isn’t good enough. We have to get dramatically better.”  (Secretary of Education Arne Duncan)

“We’re losing ground to our international competitors every year.  It’s a situation that calls for dramatic improvement. Unfortunately there seems to be apathy across the country.” (David P. Driscoll, chairman of the National Assessment Governing Board)

“The current system is producing school teachers who do not have a strong background in math themselves and may even be ‘afraid’ to teach math to pre-K students…if we want to improve students’ proficiency in math, we have to improve teachers’ proficiency too.” (Lisa Guernsey @ Early Ed Watch)

NAEP Math Scores Flat for 4th Graders; Up in 8th

by Robert Pondiscio
October 14th, 2009

New NAEP scores are out this morning:  No increase for 4th graders from 2007 to 2009; 8th graders are up two points.  From the IES release:

For the first time since the assessment began, 4th graders showed no overall increase at the national level, although they scored significantly higher in 2009 than when the assessment began in 1990. For 8th graders, scores in 2009 were higher when compared to both 2007 and 1990. These nationwide patterns also held for most student subgroups.

The report is here.  EdWeek’s Sean Cavanagh is first out of the box with analysis here.

Grading the Common Core Standards

by Robert Pondiscio
October 8th, 2009

A new report from the Fordham Foundation gives a grade of “B” to the draft of the proposed “Common Core” standards in ELA and Math.

Fordham’s report, Stars by Which to Navigate: Scanning National and International Standards in 2009, asked subject-matter experts to review the “content, rigor, and clarity of the first public drafts of the ‘Common Core’ standards” as well as the reading, writing and mathematics frameworks of NAEP, TIMSS, and PISA.  How’d they do?

Common Core Reading/Writing/Speaking & Listening: B
Common Core Math: B
NAEP Reading/Writing: B
NAEP Math: C
TIMSS Math: A
PISA Reading: D
PISA Math: D

The executive summary (I have not read the full report, which was just released this morning) makes a couple of important points, explaining and justifying the “B” grade for the common standards:

The document properly acknowledges that essential communication skills must be embraced and addressed beyond the English classroom….These skill-centric standards do not, however, suffice to frame a complete English or language arts curriculum. Proper standards for English must also provide enough content guidance to help teachers instill not just useful skills, but also imagination, wonder, and a deep appreciation for our literary heritage. Despite their many virtues, these skills-based competencies cannot serve as a strong framework for the robust liberal arts curricula that will prepare young Americans to thrive as citizens in a free society. States adopting these standards must, therefore, be very careful about how they supplement them so as to achieve that goal.

Hard to disagree with any of that, and the B grade sounds fair. “The Common Core standards are off to a good start,” says Fordham’s Checker Finn, “though there’s room for improvement—and a sound English curriculum will require plenty more than the valuable skills set forth here.”