If He’s So Important, Why Haven’t We Heard of Him?

by E. D. Hirsch, Jr.
January 9th, 2013

In Praise of Samuel Messick 1931–1998

Everyone who is anyone in the field of testing actually has heard of Samuel Messick. The American Psychological Association has instituted a prestigious annual scientific award in his name, honoring his important work in the theory of test validity. I want to devote this, my first-ever blog post, to one of his seminal insights about testing. It’s arguable that his insight is critical for the future effectiveness of American education.

My logic goes this way:   Every knowledgeable teacher and policy maker knows that tests, not standards, have the greater influence on what principals and teachers do in the classroom.   My colleagues in Massachusetts—the state that has the most effective tests and standards—assure me that it’s the demanding, content-rich MCAS tests that determine what is taught in the schools.  How could it be otherwise?  The tests determine whether a student graduates or whether a school gets a high ranking.  The standards do vaguely guide the contents of the tests, but the tests are the de facto standards.

It has been and will continue to be a lively blog topic to argue the pros and cons of the new Common Core State Standards in English Language Arts. But so far these arguments are more theological than empirical, since any number of future curricula (some good, some less so) can fulfill the requirements of the standards. I’m sure the debates over these not-yet-existent curricula will continue, so it won’t be spoiling anyone’s fun if I observe that these heated debates resemble what was called in the Middle Ages the Odium Theologicum over unseen and unknown entities. Ultimately these arguments will need to get tied down to tests. Tests will decide the actual educational effects of the Common Core Standards.

But Samuel Messick has enunciated some key principles that will need to be heeded by everyone involved in testing if our schools are to improve in quality and equity, not only in the forty-plus states that have agreed to use the Common Core Standards but also in those states that have not. In all fifty states, tests will continue to determine classroom practice and hence the future effectiveness of American education.

In this post, I’ll sketch out one of Messick’s insights about test validity.   In a second post, I’ll show how ignoring those insights has had deleterious effects in the era of NCLB.  And in a third, and last on this topic, I’ll suggest policy principles to avoid ignoring the scientific acumen and practical wisdom of Samuel Messick in the era of the Common Core Standards.


Messick’s most distinctive observation shook up the testing world, and still does. He said that it was not a sufficient validation of a test to show that it exhibits “construct validity.” This term of art means that the test really does accurately estimate what it claims to estimate. No, said Messick, that is a purely technical criterion. Accurate estimates are not the only or chief function of tests in a society. In fact, accurate estimates can have unintended negative effects. In the world of work they can unfairly exclude people from jobs that they are well suited to perform. In the schools “valid” tests may actually cause a decline in the achievement being tested – a paradoxical outcome that I will stress in the three blogs devoted to Messick.

Messick called this real-world attribute of tests “consequential validity.”    He proposed that test validity be conceived as a unitary quality comprising both construct validity and consequential validity—both the technical and the ethical-social dimension.   What shall it profit a test if it reaches an accurate conclusion yet injures the social goal it was trying to serve?

Many years ago I experienced the force of Messick’s observation before I knew that he was the source of it. It was in the early 1980s, and I had published a book on the valid testing of student writing (The Philosophy of Composition). At the time, Messick was the chief scientist at the Educational Testing Service, and under him a definitive study had been conducted to determine the most valid way to measure a person’s writing ability. Actual scoring of writing samples was notoriously inconsistent, and hence unfair. Even when graded by specially socialized groups of readers (the current system) there was a good deal of variance in the scoring.

ETS devised a test that probed writing ability less directly and far more reliably.   It consisted of a few multiple-choice items concerned with general vocabulary and editorial acumen.     This test proved to be not only far shorter and cheaper, it was also more reliable and valid.    That is, it better predicted elaborately determined expert judgment of writing ability than did the writing samples.

There was just one trouble with this newly devised test. As it was used over time, student writing ability began to decline. The most plausible explanation was that although the test had construct validity it lacked consequential validity. It accurately predicted writing skill, but it encouraged classroom activity which diminished writing skill, a perfect illustration of Messick’s insight.

Under his intellectual influence there is now, again, an actual writing sample to be found on the verbal SAT.   The purely indirect test which dispensed with that writing sample had had the unfortunate consequence of reducing the amount of student writing assigned in the schools, and hence reducing the writing abilities of students.  A shame: the earlier test was not just more accurately predictive as an estimate, it was fairer, shorter, and cheaper.  But ETS has made the right decision to value consequential validity above accuracy and elegance.

Next time: Consequential Validity and the Era of No Child Left Behind

The PIRLS Reading Result–Better than You May Realize

by Dan Willingham
December 17th, 2012

This was written by cognitive scientist Daniel Willingham, professor of psychology at the University of Virginia and author of  “When Can You Trust The Experts? How to tell good science from bad in education.” This appeared on his Science and Education blog.

The PIRLS results are better than you may realize.

Last week, the results of the 2011 Progress in International Reading Literacy Study (PIRLS) were published. This test compared reading ability in 4th grade children.

U.S. fourth-graders ranked 6th among 45 participating countries. Even better, U.S. kids scored significantly better than they did in 2006, the last time the test was administered.

There’s a small but decisive factor that is often forgotten in these discussions: differences in orthography across languages.

Lots of factors go into learning to read. The most obvious is learning to decode–learning the relationship between letters and (in most languages) sounds. Decode is an apt term. The correspondence of letters and sound is a code that must be cracked.

In some languages the correspondence is relatively straightforward, meaning that a given letter or combination of letters reliably corresponds to a given sound. Such languages are said to have a shallow orthography. Examples include Finnish, Italian, and Spanish.

In other languages, the correspondence is less consistent. English is one such language. Consider the letter sequence “ough.” How should that be pronounced? It depends on whether it’s part of the word “cough,” “through,” “although,” or “plough.” In these languages, there are more multi-letter sound units, more context-dependent rules and more out and out quirks.

Another factor is syllabic structure. Syllables in languages with simple structures typically (or exclusively) have the form CV (i.e., a consonant, then a vowel, as in “ba”) or VC (as in “ab”). Slightly more complex forms include CVC (“bat”) and CCV (“pla”). As the number of permissible combinations of vowels and consonants that may form a single syllable increases, so does the complexity. In English, it’s not uncommon to see forms like CCCVCC (e.g., “splint”).

Here’s a figure (Seymour et al., 2003) showing the relative orthographic depth of 13 languages, as well as the complexity of their syllabic structure.

From Seymour et al. (2003)

Orthographic depth correlates with incidence of dyslexia (e.g., Wolf et al., 1994) and with word and nonword reading in typically developing children (Seymour et al., 2003). Syllabic complexity correlates with word decoding (Seymour et al., 2003).

This highlights two points, in my mind.

First, when people trumpet the fact that Finland doesn’t begin reading instruction until age 7 we should bear in mind that the task confronting Finnish children is easier than that confronting English-speaking children. The late start might be just fine for Finnish children; it’s not obvious it would work well for English-speakers.

Of course, a shallow orthography doesn’t guarantee excellent reading performance, at least as measured by the PIRLS. Children in Greece, Italy, and Spain had mediocre scores, on average. Good instruction is obviously still important.

But good instruction is more difficult in languages with deep orthography, and that’s the second point. The conclusion from the PIRLS should not just be “Early elementary teachers in the US are doing a good job with reading.” It should be “Early elementary teachers in the US are doing a good job with reading despite teaching reading in a language that is difficult to learn.”


Seymour, P. H. K., Aro, M., & Erskine, J. M. (2003). Foundation literacy acquisition in European orthographies. British Journal of Psychology, 94, 143-174.

Wolf, M., Pfeil, C., Lotz, R., & Biddle, K. (1994). Towards a more universal understanding of the developmental dyslexias: The contribution of orthographic factors. In V. W. Berninger (Ed.), The varieties of orthographic knowledge, 1: Theoretical and developmental issues. Neuropsychology and cognition, Vol. 8 (pp. 137-171). New York, NY, US: Kluwer.

Words Get in the Way

by Robert Pondiscio
November 30th, 2012

This blog has long kvetched about the tendency to use terms like standards (what proficiencies kids should be able to demonstrate) and curriculum (the material that gets taught in class) interchangeably. Michael Goldstein, founder of Boston’s MATCH school, observes that education lacks a common vocabulary, which makes life harder for teachers. “They get bombarded all the time with new products, websites, software that all claim they can get students to ‘deeper learning.’ But without a common understanding of what actually qualifies, it’s hard to know if X even purports to get your kids where you want them to go,” he writes.

Goldstein compares education to medicine where there is broad agreement, for example, on the five stages of cancer–and that makes it easier for medical professionals and patients to work together. “When scientists come up with treatments,” he notes, “they often find them to be effective for cancers only in certain stages. So when they tell doctors: ‘treatment only effective for X cancer in stage two,’ everybody knows what that means.”

In education, no such common vocabulary exists.

“Our sector talks a lot of “Deeper Learning.” Or “Higher-Order Skills.”

“But what does that mean? There’s not a commonly-accepted terminology or taxonomy. Instead, there are tons of competing terms and ladders.

“In math, for example, here’s language that the US Gov’t uses for the NAEP test. Low, middle, and high complexity. I suppose they might characterize the “high” as “deeper learning.”

“Here’s Costa’s approach, a different 3 levels. Text explicit, text implicit, and activate prior knowledge. Again, perhaps the last is “deeper learning.”

“Here’s another take, more general than math-specific, from Hewlett.

“A software like MathScore has its own complexity ratings.

“And so on. You could find 10 more in 10 minutes of Googling.”

Goldstein posts a question from Massachusetts’ MCAS tests, a perimeter question that shows four different rectangles and asks, “Which of these has a perimeter of 12 feet?”

“First you need to know what perimeter means. Second you need to know that you need to fill in the “missing sides.” Third you need to know what to fill in, because you understand “rectangle.” Finally you need to add those 4 numbers. If you only understand 3 of the 4 ideas, you’ll get the question wrong.

“Does this question probe “deeper learning” for a 3rd grader? Who the heck knows?”

If this strikes you as mere semantics, think again.  A lack of an agreed vocabulary — what is a “basic skill?”  What is “higher order thinking?” — is not merely irritating, it can lead to bad practice and misplaced priorities.   A third-grade teacher looking to remediate a lack of basic skills might seek help from a software product but she would have “no real idea on how ‘deep’ they go, or how ‘shallow’ they start,” Goldstein notes.  “No common language for ‘Depth’ or ‘Complexity.’”

I would add that the problem is more fundamental than that.  If a teacher is told “teach higher-order thinking” she might incorrectly assume that time spent on basic knowledge, math skills or fluency is a waste of time.  Or, in the worst case scenario, that reading comprehension or higher order thinking can be directly taught.  

In reality, without the basic skills and knowledge firmly in place, there’s no such thing as higher order anything and never will be. Yet terms like “higher order thinking” and “complexity” are held up as the gold standard we should be teaching toward. Basic knowledge and prerequisite skills are the unlovely companions of “drill and kill” rather than, say, “fluency” or “automaticity.” Mischief and misplaced priorities are the inevitable result.

A common vocabulary of diagnosis and treatment would help. 






Second Thoughts on Pineapplegate

by Robert Pondiscio
May 4th, 2012

Writing in his TIME Magazine column, Andy “Eduwonk” Rotherham offers up a largely exculpatory take on Pineapplegate.  The media jumped all over a bowdlerized version of the test passage, he notes.  New York state officials should have been clearer in explaining that nothing makes its way onto standardized tests by accident.  And in the end, Andy writes, what is needed is “a more substantive conversation rather than a firestorm” over testing.

Very well, let’s have one.

In the unlikely event you haven’t heard, a minor media frenzy was ignited a few weeks back when the New York Daily News got hold of a surreal fable, loosely modeled on the familiar tale of the Tortoise and the Hare, which appeared on the just-administered New York State 8th grade reading test.  In the test passage, a talking pineapple challenges a hare to a foot race in front of a group of woodland creatures, loses the race (the pineapple’s lack of legs proving to be a fatal competitive disadvantage)  and gets eaten by the other animals.

Rotherham points out that the passage picked up by the paper was not the actual test passage, but a second-hand version plucked from an anti-testing website. “The passage the paper ran was so poorly written that it would indeed have been inexcusable,” he wrote.  Perhaps, but the correct passage wasn’t exactly a model of clarity and coherence either.  Indeed, the fable’s author mocked the decision by the testing company, Pearson, to create multiple choice questions about his story on a state test.  “As far as I am able to ascertain from my own work, there isn’t necessarily a specifically assigned meaning in anything,” Daniel Pinkwater told the Wall Street Journal. “That really is why it’s hilarious on the face of it that anybody creating a test would use a passage of mine, because I’m an advocate of nonsense. I believe that things mean things but they don’t have assigned meanings.”

Ultimately the real version of the test passage was released by the state to quiet the controversy.  But it did little to reverse the impression that this was a questionable measure of students’ ability.  Rotherham’s big “get” in Time is a memo from Pearson to New York State officials detailing the question’s review process as well as its use on other states’ tests as far back as 2004.  The message:  nothing to see here, folks.  Show’s over.  Go on back to your schools, sharpen those No. 2 pencils and get ready for more tests.

“Standardized tests are neither as bad as their critics make them out to be nor as good as they should be,” Rotherham concludes. Perhaps, but they’re bad enough. The principal problem, which Pineapplegate underscores vividly, is that we continue to insist on drawing conclusions about students’ reading ability based on a random, incoherent collection of largely meaningless passages concocted by test-makers utterly disconnected from what kids actually learn in school all day. This actively incentivizes a form of educational malpractice, since reading tests reinforce the mistaken notion that reading comprehension is a transferable skill and that the subject matter is disconnected from comprehension. But we know this is not the case, as E.D. Hirsch and Dan Willingham have pointed out time and again, and as we have discussed on this blog repeatedly.

So this is not a simple case of an uproar based on bad information and sloppy damage control. What Rotherham misses in a somewhat strident defense of standardized tests and testing is that we are suffering generally from a case of test fatigue. The entire edifice of reform rests on testing, and while the principle of accountability remains sound, the effects of testing on schools have proven to be deleterious, to be charitable. Thus the conditions were ripe for people to overreact to perceived absurdity in the tests. And that’s exactly what happened here.

Was the story blown out of proportion by some people playing fast and loose with the facts? Perhaps. But the facts, once they became clear, were more than bad enough.

Did You Hear the One About the Talking Pineapple…

by Robert Pondiscio
April 20th, 2012

“It’s clearly an allegory. The pineapple is the Department of Education. The hare is the student who is eagerly taking the test,” said E.D. Hirsch. “The joke is supposed to be on the hare, because the questions are post-modern unanswerable,” he said. “But in fact the joke is on the pineapple, because the New York Daily News is going to eat it up.”

I’d explain what he’s talking about, but some things are beyond explanation….

Update:  At EdWeek Teacher, Anthony Cody asks the question that needs to be asked:  Would YOU want to be judged based on an 8th grader’s ability to make sense of this bizarre little story?

A Place in the World

by Guest Blogger
March 2nd, 2012

by Jessica Lahey

In the wake of last week’s release of New York City Teacher Data Reports, educators and administrators are debating what exactly the value in a high value-added teacher looks like. Even teachers who scored high marks on the Teacher Data Reports question the value of tests that cannot possibly evaluate every aspect of what it means to be a great teacher, and the value that teacher imparts to his or her students.

The new feature-length documentary A Place in the World, directed by Adam Maurer and William Reddington, addresses the question of teacher value and the role of a school in building community. The documentary chronicles two years at The International Community School (ICS), a K-6 charter school in DeKalb County, Georgia. DeKalb County is the largest refugee resettlement area in the country and the most diverse county in the state of Georgia. Half the students at ICS are recent immigrants and refugees from war zones, and half are local children from DeKalb County.

The film focuses on two educators: Drew Whitelegg (Mr. Drew to his students), a first-year teacher, and Dr. Laurent Dittman, Principal of ICS. Mr. Drew, formerly a post-doctoral Fellow at Emory University, speaks honestly about how tiring his job as a fourth-grade teacher is, how difficult it is to avoid being consumed by the challenges inherent in teaching a population of barely English-literate, emotionally and physically terrorized children how to function as educated members of American society. “Teaching at a university was a dawdle compared to teaching here. I mean it really was. And there’s a sense that you are in this for the long haul. But the rewards – the rewards here are absolutely endless. And they don’t come from all the great moments, they come from the small moments.”

According to Mr. Drew, the education gap that divides the American and refugee students in his fourth-grade classroom at ICS is created by language deficits. Mr. Drew is not talking about language deficits in terms of the ability to hold a basic conversation; he’s talking about cultural vocabulary, the connotation words carry in American culture that help proficient readers understand context and relevance. Mr. Drew gives an example in the film: The math problem 1/2 + 1/4 written numerically, as a math problem, is something his students can do. But ask this same problem as a word problem, with one kid baking cakes and giving half away to friends and then deciding to give another quarter away to another friend, “then it’s not a test of math, it’s a test of language ability.” Many of Mr. Drew’s students come to his classroom with no knowledge of English, and some students, such as Bashir, who was born in a refugee camp in Ethiopia, have no understanding of the concept of school. Bashir spent his first days at ICS wandering the halls, walking in and out of classrooms, calling out for his father. Principal Laurent Dittman recounts the story of a girl from the refugee camps in the Sudan who spent her first weeks at ICS huddled under a table, hiding from whatever dangers she had survived in the Sudanese refugee camp.

Dr. Dittman, himself an immigrant and the child of Holocaust survivors, believes in school as a refuge from his students’ unsettled home lives. He understands his students’ impulse to hide under tables in order to escape. “The first thing I learned from my parents was how to hide. When something bad happens, or is about to happen, you hide. I see that in many of the kids at the school.” Dr. Dittman views his school as a refuge for his students, a place to come out of hiding and learn. Dr. Dittman says of his own upbringing in an immigrant family in France, “I really liked school. It was a safe place. My parents were refugees and things at home were not always a lot of fun, and I saw school clearly as a refuge.”

When Dr. Dittman is asked about the standards his students are expected to meet under No Child Left Behind (NCLB), his outlook is not quite as hopeful. “According to NCLB 2014, all students – 100% – will be proficient in all subject matters. What’s the old Garrison Keillor, everybody is above average? That doesn’t make any sense. My guess is that in a few years, all those standards, all those compulsory standardized tests will be a bad memory. I think that the pendulum is going to swing back the other way and return to a more rational, less ideological approach to education.”

ICS did not make Adequate Yearly Progress (AYP) in 2011 under NCLB. Dr. Dittman and Mr. Drew, who educate malnourished, traumatized, impoverished and previously uneducated children, must cover core subjects such as math, science, and history while helping their students find a place in American society. They are not simply teaching American history, they are teaching their students how to be Americans. The making of Americans is currently not a category in the Teacher Data Reports’ calculation of a teacher’s value-added assessments.

For validation on that front, Dr. Dittman and Mr. Drew do not look to test scores and value-added assessments; they look to their students. Dr. Dittman thinks back to that one Sudanese girl, hiding under the classroom table. His voice breaks as he recounts the ending to her story. The girl refused to come out until one day her teacher crawled under the table and joined her there. Once her teacher had gained the girl’s trust, she felt safe enough to crawl out from under the table and join the class. According to Mr. Drew, “I don’t think teachers should blow their own trumpets or credit themselves overtly, but I think that you can go home at the end of the day and say, you know what, I’ve made a difference, you know, and the world is actually a better place from what I did today.”

As teachers and administrators move forward and continue to do the job of teaching this country’s students, it is important to remember that not all value is quantifiable. The Teacher Data Reports, in all their margins of error and fuzzy logic, can never get at the real value of this country’s teachers.

Jessica Potts Lahey is a teacher of English, Latin, and composition at Crossroads Academy, an independent Core Knowledge K-8 school in Lyme, New Hampshire. Jessica’s blog on middle school education, Coming of Age in the Middle, where this piece also appears, can be found at http://jessicalahey.com.

What is the Value in a High Value-Added Teacher?

by Guest Blogger
January 12th, 2012

by Jessica Lahey

Great news emerged this week for elementary- and middle-school teachers who make gains in their students’ test scores. While the teachers themselves may not be pulling down big salaries, their efforts result in increased earnings for their students. In a study that tracked 2.5 million students for over 20 years, researchers found that good teachers have a long-lasting positive effect on their students’ lives, including those higher salaries, lower teen-pregnancy rates, and higher college matriculation rates.

I’m a practical person.  I understand that we spend billions of dollars educating our children and that the taxpayer deserves some assurance that the money is not being squandered.  Accountability matters.  I get it.  Still, as a teacher, it’s hard not to feel a little bit wistful, perhaps even wince a little, reading this study.

It’s important to remember that its authors, Raj Chetty, John N. Friedman, and Jonah E. Rockoff, are all economists. Their study measures tangible, economic outcomes from what they call high versus low “value-added” teachers. This “value-added” approach, which is defined as “the average test-score gain for his or her students, adjusted for differences across classrooms in student characteristics such as prior scores,” may work for gauging such measurable outcomes as future earnings, but it misses so much of the point of education.
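
For readers who want to see what a definition like that looks like in practice, here is a minimal sketch in Python. It is not the study's actual model. It assumes the only adjustment is for each student's prior score, made with a single straight-line fit across all students, and the function name value_added and the sample numbers are invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def value_added(records):
    """Estimate a simple value-added score per teacher.

    records: list of (teacher, prior_score, current_score) tuples.
    Assumption for illustration: prior score is the only adjustment,
    applied through one least-squares line fitted to all students.
    """
    priors = [prior for _, prior, _ in records]
    currents = [current for _, _, current in records]
    prior_mean, current_mean = mean(priors), mean(currents)

    # Fit current = intercept + slope * prior by ordinary least squares.
    sxx = sum((p - prior_mean) ** 2 for p in priors)
    if sxx == 0:
        raise ValueError("prior scores must vary to fit a line")
    sxy = sum((p - prior_mean) * (c - current_mean)
              for p, c in zip(priors, currents))
    slope = sxy / sxx
    intercept = current_mean - slope * prior_mean

    # A student's residual is the actual score minus the score the line
    # predicts from the prior score. A teacher's value-added here is the
    # average residual across that teacher's students.
    residuals = defaultdict(list)
    for teacher, prior, current in records:
        residuals[teacher].append(current - (intercept + slope * prior))
    return {teacher: mean(vals) for teacher, vals in residuals.items()}


# Hypothetical data: (teacher, last year's score, this year's score).
sample = [("A", 50, 60), ("A", 70, 80), ("B", 50, 50), ("B", 70, 70)]
print(value_added(sample))
```

In this made-up sample, teacher A's students land five points above what their prior scores predict and teacher B's land five points below. That one number per teacher is everything the measure reports.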

I asked my Uncle Michael, a professor of law and economics, what he thought of the study, and he compared the proponents of the study’s mathematical economic approach to education to acolytes of The Who’s Tommy, pinball wizards who “sought to isolate themselves from the world so as to improve their perception of a very narrow sliver of that world. The entire ‘assessment’ enterprise defiles education as that word once meant.”

He attempted to explain his feelings about the study in terms of mathematical equations – something to do with linear regression thinking and educational outcomes, but I got lost in the Y = a + bX + errors of it all.

Tim Ogburn, a 5th-grade teacher in California, phrases the debate a bit more simply: Why are we educating children?

His answer goes like this: Until fairly recently, teachers would have answered that they were educating children to become good Americans or good citizens, but now we seem to teach only to prepare elementary- and middle-school children for high-paying jobs. When money figures into the goal, we lose so much along the way, such as curiosity, a love of learning for its own sake, and an awareness that many of the most worthwhile endeavors (both personally and socially) are not those with the highest monetary rewards.

To which I reply: Hear, hear. If economic gain is the measure of our success, we have lost sight of our goals in education.

In order to round out the definition of “value” as defined by Chetty’s study, I conducted my own research project. Sure, my sample was smaller – about thirty versus Chetty’s 2.5 million, and the duration of my study was three days rather than 20 years…and of course there might just have been a wee bit of selection bias in my Facebook sampling. Oh, and I chose not to apply Uncle Michael’s formulas because they gave me a headache.

The goal of my study was to find out what some of the other, less measurable benefits of good teaching are. I asked people to write in with examples of good teaching, teaching that has resulted in positive outcomes in their lives. Who were their “high value-added” teachers?

Sarah Pinneo, a writer from New Hampshire, recalled her third grade teacher, who took her aside one day and said, “You are going to be a writer. Here’s your portfolio. Every poem you finish, we’re going to save it in here.” Sarah’s first novel will be released on February first, and she still has that poetry portfolio.

Carol Blymire, a food writer and public relations executive in Washington, D.C., recalled her kindergarten teacher “who taught me that letters make words and words make sentences…and is the reason I love to write today.” She counts among her low value-added teachers, “Every other teacher reprimanded me for asking questions that came across as challenging them, even though it was really my way of wanting to know more and understand the bigger picture.”

My favorite example came from Dr. Jeffrey Fast, an English teacher in Massachusetts.

“One morning, when I was a senior, we were discussing Maxwell Anderson’s Winterset. While I can no longer remember exactly what I said, it was something about the interaction among the characters. Immediately after I spoke, [my teacher] responded by saying – for all to hear: ‘I like you!’ His response, of course, was coded language to identify and mark – for both me and my peers – something insightful. I felt enormously rewarded. That was the benchmark that I tried to replicate in dealing with literature ever afterwards. That was 50 years ago. He never knew that those three words catapulted me – to a Ph.D. and a career as an English teacher!”

While the studies of economists may add to the discussion about what makes teachers valuable in our lives, I believe that if we reduce teachers’ value to dollars and cents, we run the risk of becoming, in Oscar Wilde’s phrase, “the kind of people who know the price of everything, but the value of nothing.”



Poking the Sacred Cow

by Guest Blogger
December 30th, 2011

by Jessica Lahey

It’s day six of my holiday break and I have finally acknowledged the large stack of paper on the floor next to my desk. I had been ignoring it, hoping it would magically grade itself, but alas, this has not been the case. It’s still there, still huge, still daunting. In the meantime, I have cleaned the entire house, gone to the dump twice, moved our furniture around, stacked another cord of wood, winterized the chicken tractor, and killed seven mice in the attic, but now, it’s time. Time to grade the mid-year writing assessments.

While I was completing all of these other acts of procrastination, I was mentally composing another essay for an upcoming deadline, a piece that has been freaking me out, both as a writer and a teacher. In order to be successful in this piece, I must come clean about my homework practices. For non-teachers, that may sound like an easy task, but it’s not. Homework is a time-honored tradition among teachers, a sacred cow best left undisturbed to chew its cud in the median. We go about our daily business in its shadow, so used to its presence right there in the middle of things that we don’t even see it anymore. Even discussed delicately, teacher-to-teacher, it elicits fight-or-flight defensiveness in some and outright anger in others.

But it’s good to sharpen your Ticonderoga #2 and poke that cow from time to time, isn’t it? Otherwise, how do you know if it’s just resting or if it’s been dead for a while and you just had not noticed?

As I am writing about homework elsewhere, I am taking on another sacred cow at my school over here – the writing assessment. These assessments make up the giant pile of menace stacked next to my desk, and as I don’t want to get around to grading them, I thought I’d spend some time poking them with a proverbial stick.

Twice a year, we give the students a prompt, two days to prepare an outline, and two class periods to write a four-paragraph essay. Based on the responses I have read so far, this year’s questions went fairly well, and I actually like reading these essays once I am into the groove, but it’s an endless task. So, if I have to question why I give homework, I also have to question why I spend four full days a year of class time, plus hours of grading at home, on these writing assessments.

The students don’t enjoy writing them, I hate grading them…so what’s the point?

In order to answer that question, I went over to my office and pulled out a couple of my students’ files. Because we give these assessments every year from the third grade on up, I can spread a student’s entire writing education out in one place. I can see how handwriting, vocabulary, and syntax evolve over the entire length of one student’s education. Most importantly, I can see their individual voices evolve as thinking becomes more complex, more sophisticated. It’s fun to pull these files out when a student is frustrated with the slow pace of his or her learning, or an apparent backsliding in skills, and show them how far they have come in such a short time.

One of my favorite things about my job is the strategizing I get to do behind the scenes. As I teach my students for three straight years in Latin and/or English, I have the opportunity to do some real long-term planning for the future. I taught high school English before I moved to middle school, so I know what will be expected of them in a few short years. Many of them will go on to attend the very school I used to teach in, so I have very specific goals about where they need to be in terms of independence, organization and self-advocacy by the time they head off to high school.

In sixth grade, we coddle them as we ease them into the relative chaos of middle school class transitions and increased homework load. In seventh grade, however, I ease off a bit. I give them a little bit more rope and see what happens when they are expected to plan ahead or stay on top of a long-range assignment. In eighth grade, I really let them have their heads, and expect that they will know how to take charge of their education when no one else is looking out for them. Writing assessments are part of that process. I hand them the prompt and directions, and they are expected to prepare their notes or outline, find supporting evidence and plan their writing. I give them no other guidance than the prompt itself. Timed writing assignments will become a fact of life for them in the coming years, and it’s fascinating to see their progress as they master the task.

When I was first hired at my school, I was informed that the writing assessment was simply a part of what I did in English class, and I was too overwhelmed with the details of my new position (including my first year teaching Latin, twenty years since I last cracked open a Latin text) to question any reasoning behind the tradition. But now, long settled-in and armed with perspective and experience, I think it’s good to question why I do the things I do. This week’s re-evaluation of my homework practices has been really enlightening - I have dropped some of the less effective assignments and shored up my reasoning behind the better ones. So much of what I do, particularly the most subjective aspects such as grading and assessments, leaves me feeling uneasy at times, unsure of my standards, perspective, or reasoning.

In the end, some of those cows were long dead and really needed to get rolled out of the road, but I am quite fond of the ones that remain. When I return to school in the New Year, the students will notice a change. I will be more confident in my choices, and the road ahead will be much less congested. True, the writing assessments will remain, lying placidly in the middle of that road, but at least I will be able to explain why they are there.

A Little More Text, A Little Less Self

by Robert Pondiscio
December 19th, 2011

When studying a story or an essay, is it possible to be too concerned with what the author is saying? In an opinion piece in Education Week, Maja Wilson and Thomas Newkirk complain that the publishers’ criteria for the Common Core State Standards are overly “text dependent,” discouraging students from bringing their own knowledge and opinions to bear on their reading.

Wilson, a former high school English teacher, and Newkirk, a University of New Hampshire English professor, applaud the guidelines’ “focus on deep sustained reading—and rereading.” However, they pronounce themselves “distressed” by the insistence that students should focus on the “text itself.”

“There is a distrust of reader response in this view; while the personal connections and judgments of the reader may enter in later, they should do so only after students demonstrate ‘a clear understanding of what they read.’ Publishers are enjoined to pose ‘text-dependent questions [that] can only be answered by careful scrutiny of the text … and do not require information or evidence from outside the text or texts.’ In case there is any question about how much focus on the text is enough, ‘80 to 90 percent of the Reading Standards in each grade require text-dependent analysis; accordingly, aligned curriculum materials should have a similar percentage of text-dependent questions.’”

Consider me undistressed. If this means less reliance on the creaky crutch that is “reader response” in ELA classrooms, then I’m very nearly overjoyed.

The very worst that can be said about an over-reliance on text-dependent questions is that it’s an overdue market correction. As any teacher can tell you, it’s quite easy to glom on to an inconsequential moment in a text and produce reams of empty “text-to-self” meandering using the text as nothing more than a jumping off point for a personal narrative. The skill, common to most state standards, of “producing a personal response to literature” does little to demonstrate a student’s ability to read with clarity, depth and comprehension.

Indeed, educator, author and occasional Core Knowledge Blog contributor Katharine Beals points out in a response to the piece that Wilson and Newkirk have it precisely backwards: research from cognitive science suggests that making external associations during reading can actually worsen comprehension. She cites a paper by Courtenay Frazier Norbury and Dorothy Bishop which found that “poor readers drew inferences that were distorted by associations from their personal lives. For example, when asked, in reference to a scene at the seashore with a clock on a pier, ‘Where is the clock?’ many children replied, ‘In her bedroom.’”

“Norbury and Bishop propose that these errors may arise when the child fails to suppress stereotypical information about clock locations based on his/her own experience. As Norbury and Bishop explain it: ‘As we listen to a story, we are constantly making associations between what we hear and our experiences in the world. When we hear “clock,” representations of different clocks may be activated, including alarm clocks. If the irrelevant representation is not quickly suppressed, individuals may not take in the information presented in the story about the clock being on the pier. They would therefore not update the mental representation of the story to include references to the seaside which would in turn lead to further comprehension errors.’”

Struggling readers in particular would benefit from a lot more text and a lot less self. As Beals explains, “Text-to-self connections, in other words, may be the default reading mode (emphasis mine) and not something that needs to be taught. What needs to be taught instead, at least where poor readers are concerned, is how not to make text-to-self connections.”

Wilson and Newkirk illustrate their concern about over-reliance on text by describing their preferred way of teaching Nicholas Carr’s 2008 essay from The Atlantic, “Is Google Making Us Stupid?”

“Before assigning the essay, we would have students log their media use for a day (texts, emails, video games, TV, reading, surfing the Internet) and share this 24-hour profile with classmates. We might ask students to free-write and perhaps debate the question: “What advantages or disadvantages do you see in this pattern of media use?” This ‘gateway’ activity would prepare students to think about Carr’s argument. As they read, they’d be mentally comparing their own position with Carr’s. Surely, we want them to understand Carr’s argument, but we’d help them do that by making use of their experiences and opinions.”

It’s critical to understand that this approach to teaching Carr’s essay would not be verboten under CCSS publishing guidelines, which have nothing whatsoever to say about teaching methods. In fact, there’s much to recommend Wilson and Newkirk’s approach. But the test of whether the students understand Carr’s line of argument has nothing to do with the “gateway” activity, which serves mostly as an engaging hook to draw students into Carr’s thesis. Students cannot be said to have understood the piece—or any piece—of writing without the ability to show internal evidence.

Thus, if publishers are “enjoined to pose text-dependent questions [that] can only be answered by careful scrutiny of the text,” that is at heart not a teaching question; it’s an assessment question that probes whether or not the student understands the text.

All those connections, to our own experience and to other works of literature, make the study of literature thrilling and rewarding. But deep and meaningful connections require more than the superficial, paper-thin ones that too often pass for “personal response.”

What often gets lost in our rush to engage young readers and make their reading personally relevant is the simple fact that text has communicative value. When someone commits words to print, they mean to communicate facts, ideas, imagery or opinions. They should expect, if they’ve done their job well, to be understood. Might the reader have a response? Let’s hope so. But unless they have understood the author’s words and intent clearly, any response they make is less than satisfying and may not be particularly relevant as a “response.”

The bottom line: Demonstrating comprehension based on what a text says is not a problem. It’s a baseline skill for any literate human being.

A Critical Look at the Critical Lens Essay

by Guest Blogger
December 14th, 2011

by Diana Senechal

On standardized high school English examinations in New York, students must write what is often called a “critical lens” essay. They are given a quotation (the “lens”) and must interpret it, state whether they agree or disagree with it, and substantiate their position with examples from literary texts of their choice. This task has logical flaws and encourages poor reasoning and writing. The problem is largely due to the lack of a literature curriculum; when there are no common texts, essay questions on state tests become vague and diffuse. The test question needs an overhaul, and New York State needs a literature curriculum with some common texts and ample room for choice.

One flaw of the “critical lens” task is that students must interpret the quotation out of context. Students may or may not have read the source of the quotation; they are allowed to make it mean whatever they want it to mean (within reason). The test-taker must provide a “valid” interpretation of the quote, but without a context, “valid” simply means free of egregious error. When it comes to analysis, this is not good practice; the student latches onto the interpretation that comes to mind instead of searching for the most fitting one.

A sample New York Regents English examination illustrates how this might play out. (I discuss this example in my book, Republic of Noise.)  Here the quotation is from The Little Prince by Antoine de Saint-Exupéry: “It is only with the heart that one can see rightly.” (See p. 21 of the PDF file.) This quotation can mean many things, but it has particular meaning in The Little Prince. It is the fox who speaks these words, after befriending the prince and being tamed by him. They have been meeting, day by day, at the same time and place; the regularity of the ritual allows the fox to prepare his heart for the prince’s arrival. Seeing with the heart in this case has to do with caring for another, spending time with another, honoring rituals together. But students are more likely to take the quotation as a comment on romantic attraction (and some of the sample responses do precisely that). Then they agree or disagree with the quotation on the basis of this incorrect interpretation.

Another flaw in the “critical lens” task is that it hinges on the student’s opinion (about a statement that may apply to a range of situations). The opinion may be hasty or superficial, yet it is unassailable. It would make more sense to ask the student to explain how a particular literary work affirms the quotation in some ways and negates it in others, and to decide whether the affirmation or the negation is ultimately stronger. That would require careful, thoughtful analysis and examination of a work and would leave room for the student’s ideas and judgment. At the very least, the prompt could ask the students to show how a literary work addresses or touches on the idea in the quotation. That runs the risk of reducing literature to ideas and themes, but at least it keeps the focus on the literature.

A third flaw is that students must cite examples from literature in support of their opinion. It is possible to do this, but one must do so cautiously. Literature is not a direct reflection of life; often its messages are oblique and contradictory. So, for instance, if one looks to Romeo and Juliet for examples of people blinded by love (not seeing rightly with the heart), one will find them, but one will also miss the point. In the play, love has both delusion and illumination and is part of a larger scheme. Help and harm intermingle, as Friar Laurence suggests in his monologue:

O, mickle is the powerful grace that lies
In herbs, plants, stones, and their true qualities:
For nought so vile that on the earth doth live
But to the earth some special good doth give,
Nor aught so good but strain’d from that fair use
Revolts from true birth, stumbling on abuse:
Virtue itself turns vice, being misapplied;
And vice sometimes by action dignified.


The play does not pass judgment on the lovers’ passion; rather, it shows the playing out of passions, feuds, and good intentions, where no one grasps the full situation until the end. But students who ignore this can get a high score on the essay. One can even ignore key details of plot and get a high score. A sample student response with the highest score (on p. 58) states that “if Romeo had not used his heart, he would have seen rightly. He could have stayed with Rosaline, and saved both the Montagues and Capulets from enduring his reckless, love-inspired antics.” The student neglects the fact that Rosaline has sworn herself to chastity, that the Montagues and Capulets have antics of their own (the play begins with a fight that escalates), and that it is the lovers’ deaths that bring an end, finally, to the warring of the two families. This is at least partly the fault of the essay question; by requiring students to cite literary examples to support their opinion, it encourages (or at least does not penalize) shallow interpretations of these examples.

In short, the “critical lens” task rewards poor writing and thinking, precisely because it can rely on no common knowledge. There is no check on the student’s opinion; nothing  challenges the student to examine the quotation or the works closely. The student who follows the directions does well. He may provide a flawed interpretation of the literary examples and quotation, yet receive a top score. He may even get basic plot details wrong without losing any points. It would not be surprising if some students made up the details and still passed. To fight this absurdity, we should have a few texts—just a few—that everybody reads, including those scoring the tests. The essay question could then pertain to the works themselves. This would allow for coherent, probing essays and would take students out of opinion’s muddier puddles.