Archive for the 'Assessment and Testing' Category

Civics and Sanskrit

Only 3.5% of Arizona public school students got six or more questions correct on a version of the United States Citizenship Test.  Matthew Ladner of Jay Greene’s blog thought that was pretty pathetic–new immigrants to the U.S. have to answer six or more correct–until they gave the same test to kids in Oklahoma.  The results were not OK.

Perhaps I ought not to have been so hard on Arizona students. After all, they passed at a rate that was 25% higher than their peers in Oklahoma!  That’s right: the passing rate for Oklahoma high school students was 2.8%. They somehow underperformed Arizona’s already abysmally pathetic performance.

“These kids wouldn’t do much worse if the pollster asked them questions in Sanskrit instead of English,” Ladner concludes.  Over at Joanne Jacobs, guest blogger Diana Senechal says Ladner’s right.  “According to a binomial distribution calculator, the chances of getting at least 6 out of 10 questions correct (where each question has 4 options) is about 2 percent. So, no, they wouldn’t do much worse in Sanskrit,” she writes.
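For anyone who wants to check Senechal's arithmetic without hunting down a calculator, here is a minimal sketch of the same binomial tail calculation. It assumes, as her quote does, 10 questions with 4 options each, and adds the assumption that a blind guesser picks every option with equal probability. The function name `prob_at_least` is made up for illustration.

```python
from math import comb


def prob_at_least(k, n, p):
    """Chance of k or more successes in n independent tries, each succeeding with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumptions: 10 questions, 4 equally likely options each, 6 correct needed to pass.
chance = prob_at_least(6, 10, 0.25)
print(f"{chance:.2%}")  # prints 1.97%
```

That lands on the "about 2 percent" Senechal cites.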

“I have an empty metal coffee pot in my office marked ‘Sweden Civics Survey Fund,’” Ladner writes.  “Please drop by and give what you can afford. Once it gets to a couple of thousand bucks, I’ll retain the pollster to give this exact same survey on AMERICAN civics to high school students in Sweden.”

Great idea.  I’ve got a ten-spot in my hand, Matthew.  What’s the address?

“An Unavoidable Element of Subjectivity”

Schools need much more than merit pay to recruit and retain good teachers, argues Kevin Carey at the Quick and the Ed.  “They need strong leadership, good facilities, safe working conditions, and the right kind of organizational culture,” he writes. “You can’t paper over the lack of those things by simply tacking on a salary bonus, even a big one, to the existing steps-and-lanes pay scale.”

Carey’s reasoned (and reasonable) take on merit pay feels like a welcome departure from the teacher-quality-and-test-scores über alles refrain more commonly sung by accountability hawks, especially in his recognition that “we need to build schools great people want to teach in, and that means fully recognizing their value in all ways, including pay.”

The great schools of the future will be professional meritocracies in a way today’s public schools are not, but not by adding test scores to the mechanistic logic of an industrial-age salary scale. Rather, they’ll spend a great deal of energy on getting the conditions and culture right, and then negotiate substantially higher and substantially more variable salaries with individual teachers. It will be an expensive, time-consuming, imperfect process with an unavoidable element of subjectivity. It will also be much, much better than what most schools use today.

Agreed.  I’d also wager there isn’t one teacher in a thousand who wouldn’t welcome merit pay in a school that spent “a great deal of energy on getting the conditions and culture right.” 

The phrase “unavoidable element of subjectivity” also strikes me as a recognition of the infinite complexity teachers face in working with our most disadvantaged students (any attempt to move past mindless “teachers fear accountability” sloganeering is a welcome development).  Guest-blogging over at Joanne Jacobs, the always insightful Diana Senechal captures the dilemma of nuance-averse accountability well.  “With dumbed-down tests, vapid literacy programs, an overwhelming focus on test prep at the exclusion of essential subjects, and unreliable rating systems, we end up taking a yardstick to a void–and declaring miracles whenever we please,” she wrote.  The flip side of that, the thing that teachers reasonably fear, is that it is too easy to declare failure whenever we please, and to hold teachers solely responsible when they are too often reduced to foot soldiers with no control over what or even how they teach.

This cannot be said often enough: teachers are not by nature accountability-averse.  They are, however, sensibly averse to having an extraordinarily difficult and complex task measured by crude and simplistic tools.

Update:  John Thompson, a vocal teacher advocate who also viewed Carey’s post favorably, takes up a similar theme at This Week in Education.  “I’ve never understood why ‘reformers,’ who are angered by the terrible results of policies set by principals and central offices, respond by attacking teachers who do not set those policies. But the answer, which the New Teacher Center makes clear, is not to attack principals but to use ‘contextual data’ to enhance teacher and principal quality and create a learning culture which attracts and retains educators.”

I’m Confused

When 97% of New York City schools get As and Bs on their report cards, it’s proof accountability works.  When 98% of teachers get satisfactory ratings, it’s proof there’s no accountability.

Good Teachers Improve Their Peers

Having good teachers for colleagues helps other teachers improve.  Common sense, right?  This new study documents some pretty dramatic peer effects.  EdWeek’s Debra Viadero breaks it down for you here.

Attention: Wendy Kopp and Teach For America:  Maybe this wasn’t such a crazy idea after all?

Observations on Observations

If you’re a teacher, would you rather be judged by a 200-page list of indicators of highly skilled teaching, or by a principal who shares your philosophy of teaching and learning, supports your approach and pretty much leaves you alone–but has the power to fire you at will? 

This question occurred to me after reading a long and excellent post by John Merrow over at Learning Matters on teacher observations. He concludes that the observation process is “changing for the better in some places, but that, unfortunately, it’s still mostly useless.”

In the old days, teachers closed their doors and did their thing, for better or for worse. As long as things were quiet, administrators [rarely] bothered to open the door to see what was going on, and teachers never watched each other at work. That’s changing, sometimes for better, sometimes for worse. In some schools today, teachers are actually expected to watch their peers teach, after which they share their analysis. In other schools, however, principals armed with lists sit in the back of the class checking off ‘behaviors’ and later give the teacher a ‘scorecard’ with her ‘batting average.’

“Whether these observations are diagnostic in nature and therefore designed to help teachers improve or a ‘gotcha’ game is the essential question,” Merrow perceptively observes.  Teacher observations, like test scores, will undoubtedly loom ever larger as the issue of teacher quality bubbles to the top of the nation’s education agenda.  Like test scores, there’s a lot to learn from observations.  And like test scores, we’re equally likely to learn the wrong lessons.

Of all the “best practices” the business world has to offer education, the one that didn’t make the trip is the idea that good managers hire excellent people, empower them with real decision-making authority, then get out of their way.  The closest thing to that in education is “close your door and do your thing,” as Merrow puts it.  That goes against the grain in the Age of Accountability, but it is undeniable that for many excellent and experienced teachers and their students, it works perfectly.  And while that approach is endangered, it has not disappeared.  Nor should it.  The point of any accountability system should be to help bad schools and teachers look and act like good schools and teachers, not the opposite.  Our schools still have plenty of brilliant iconoclasts who do things their own way to great effect.

For such teachers nothing could be worse than “observation by checklist,” where the administration wants to see what it wants to see: aim and standard on the board?  Check.  Students sitting in groups?  Check.  Updated work on the bulletin board?  Check. A “print rich” environment in “kid-friendly language?” Check.  Ask why these items are important and you’ll invariably hear that it’s what the principal’s supervisor expects to see.  What they are indicative of is lost.  The consummate irony is that this kind of evaluation seems rigorous, but it is more likely, much more likely, to create a civil service mentality than to foster excellence.  It’s another variation of the Cargo Cult Education phenomenon.  Teachers and administrators spend all their energy manufacturing the visible markers of learning, often not knowing (and after a while no longer caring) what the “indicators” indicate.

Indeed, this is the thing that every teacher knows and every armchair expert does not: it is simple (but time-consuming) to create an environment that gives all the appearances of being a high-functioning classroom and still be a lousy teacher.  Among the very first survival skills a new teacher learns, either through the advice of a kindly colleague or through a series of administrative reprimands, is the art of the dog and pony show.  In some schools, it’s the quid pro quo that earns you the right to close your door and practice your craft.  In more punitive environments, it’s the tail that wags the dog.  But the aim of observation-by-checklist is not great teaching, it’s plausible deniability–and it’s the enemy of accountability, for both teachers and administrators.  Miss Jones’ classroom demonstrates a high degree of student engagement and all of the indicators of high quality teaching, but her students are still not making progress.  Why? Miss Jones’ energy is misdirected.  She’s learning to play the game, not become a great teacher.  After a few years, she gets tired of it and quits.  Mediocrity wins again.

The bottom line is that great teaching is like Potter Stewart’s definition of hard-core pornography.  It’s hard to define but you know it when you see it.  Unfortunately, that’s never going to cut it in our data-mad, accountability-obsessed age. 

So which would you rather?  Find a school and work with a principal who shares your philosophy and approach, trusts you and supports you, but has the power to fire you at will?  Or a school where your duties are codified to the letter, where you know what’s on the checklist and spend all of your time ‘working to rule’ and playing “gotcha”?  Where are you going to be happiest and most productive?

Am I the only one who thinks this is what the teacher quality debate is really all about?

SAT Down and Cried Today

The Class of 2009, who were in 5th grade when No Child Left Behind became the law of the land, and were not yet born when A Nation at Risk ushered in the era of education reform, have posted SAT scores that summon to mind a flatlined EKG.  Math unchanged at 515.  Writing down a point to 493.  Critical reading, down a point to 494.  The results are of a piece with last week’s ACT scores, which showed that fewer than one in four high school graduates is prepared to do C level college work in English, math, reading and science.

“Completing a core curriculum remains strongly related to SAT scores,” the College Board notes in a news release.  “Students in the class of 2009 who took core curricula scored an average of 46 points higher on the critical reading section, 44 points higher on the mathematics section, and 45 points higher on the writing section than those who did not.”

“The College Board, as always, hung a smiley face on it, but the latest SAT results are a real bummer,” writes Checker Finn at Fordham’s Flypaper blog.  Looking at years of stagnant NAEP results, last week’s dispiriting ACT scores and flat high school graduation rates, Finn says “please sing out if you’ve spotted any good news regarding the readiness of American adolescents to face successfully the challenges of higher education, the workforce, adulthood and citizenship. I can’t find it.”

Let me add a few verses to Checker’s refrain:  Please sing out if you see elementary schools creating a path to college readiness by favoring a rich, robust curriculum over the deadening pabulum of test prep and ineffective reading strategies.  Please sing out too, if you can explain how changing the operative definition of well educated to “reads on or near grade level” has done anything other than cement in place this march of mediocrity.

There’s no guarantee that a patient buildup of knowledge and language proficiency that pays dividends over time will show up in a single year’s standardized testing snapshot, so please explain too how any school or teacher can afford  to take the necessary long view, when we have essentially declared that a little bit of success every year is more important–and measurable–than great success over time. 

Please sing out if you see something–anything–that is going to change this dispiriting trend in the foreseeable future.  I can’t find it.

Ready, Fire, Aim

At The Quick and the Ed, Kevin Carey attempts to take on Diane Ravitch’s criticism of Race to the Top, accusing her of…well, I’m not sure exactly. But his criticism of Ravitch’s take on tying teacher evaluations to test scores is noteworthy. 

No state has ever really tied teacher evaluations to test scores in a methodologically valid way and made those evaluations meaningful in terms of compensation, hiring, tenure, and other things people care about. So Ravitch is just engaging in garden variety chicken-and-egg obstructionism: you can’t prove X works because nobody’s ever tried it; you can’t try X because nobody’s ever proved it works.

Well, no.  It’s not that it’s never been tried.  It’s that there is not a way to evaluate teachers fairly by using test scores.  I guess I’m obstructionist too, since like Ravitch I don’t see the benefit of coming to vast conclusions based on half-vast data.  Commenter Ceolaf nails the problem precisely: 

“It is not merely a case of banning a practice or allowing it. Rather, it is a case of mandating it. Require — or pressuring very strongly — states to adopt policies that are unproven is the issue. We knew that seat belts save lives, so requiring states to adopt seatbelt laws made sense. We knew that lowering speed limits saved gas, so requiring states to lower theirs to 55mph made sense. But that is not the case here.”

Just so.  But argue that this well-intentioned idea has too many problems to be taken seriously and you’re immediately a status quo loving, running dog lackey of the teachers unions, or as Carey describes Ravitch, the “go-to name-brand anti-Obama quote on K–12 issues.”

Oy.

Maybe we can make this simple and unambiguous:  Accountability?  Good.  Figuring out if a teacher is competent or incompetent? Very good. Using tests to determine the difference?  Not very good.  In fact, not possible.  Forcing states to do it anyway? Not very smart.  Being incurious about the impact such a move will have on education?  Unforgivable. 

When did “not very good but it’s the best we can do” become a way of making policy?  When did suggesting we can do better become heresy?

Oh Say Can You C?

More than three out of four college-bound high school graduates are unprepared to earn a “C” or higher in first-year college courses in English, math, reading and science.

That’s the news from 1.5 million ACT tests taken by the class of 2009, but curiously it’s not the lede.  A press release from the Iowa City-based ACT frames the results in the opposite manner, noting “the percentage of graduates ready to earn at least a “C” or higher in first-year college courses in all four subject areas tested on the ACT increased from 22 percent in 2008 to 23 percent in 2009.”  USA Today, the New York Times and lots of others repeat the 23% figure or otherwise lead with the “slight improvement” in scores over 2008 results. The Wall Street Journal alone among major papers seems to catch the obvious story.  “Only about a quarter of the 2009 high school graduates taking the ACT admissions test have the skills to succeed in college,” the paper notes.

In other news, 254 million Americans have health insurance.

Update:  EdWeek weighs in and gets the headline right: “ACT Scores Show Most Students Aren’t Ready for College.”  Catherine Gewertz’s piece also features a great quote from FairTest’s Robert Schaeffer on the failure of NCLB to improve college readiness: “Politicians can make all the claims they want that it is raising achievement, but even when there are improvements in state test scores, they don’t show up in college-admissions test data, or on [the National Assessment of Educational Progress].  So where is the beef?”

Social Promotion? Easy as A, B, C!

Can you earn a promotion to the next grade in New York by simply guessing the answers on state tests?   It’s easy as A, B, C according to a provocative experiment by former Core Knowledge teacher Diana Senechal.   

In a call for tougher tests in the New York Post last week, Diane Ravitch revealed that the points needed to earn a “Level 2,” the lowest “passing” score on the state’s tests, have dropped dramatically.  On the 6th grade English language-arts test, for example, the cutoff for a Level 2 dropped from 41 percent of the points in 2006 to just 17.9 percent in 2009.  “Ending social promotion, as the [New York City] rightly wants to do, is thus meaningless, because students can reach Level 2 by just guessing,” Ravitch concluded.

Struck by Ravitch’s observation, Senechal tried an experiment to see if it’s possible to pass the test by simply guessing.  She posts the results over at Gotham Schools:

I first tried my experiment with the sixth grade ELA test. I “guessed” all the answers on the multiple-choice portion and left the written portions blank. Or, rather, I didn’t “guess,” but filled in the answers as follows: A, B, C, D, A, B, C, D, and so on, all the way through the 26 questions. I didn’t read one of them.

Naturally, Senechal got a zero on the written portion of the test.  But her multiple choice guesswork earned 12 out of 39 “raw points” and a scale score of 622–a rock-solid “2” on the state’s four-point system.  A “2” is described as “approaching grade level” and good enough to earn promotion to the next grade.  “I got a 2 without looking at a single test question or writing a single word,” she writes.  Repeating the experiment with the 7th grade math test, Senechal also scored a 2 “without solving a single math problem, or even looking at one.”

While this approach does not result in a 2 for all the tests, it comes a bit too close for comfort, and another guessing system might work. A fifth grader told me that his father had told him, “Just mark ‘C’ for all of the answers, and you will pass.” On the fifth grade ELA test, this would indeed have resulted in a 2.  Yes, it is possible to guess your way to promotion. You may not even have to look at the questions or write a word on the written sections.

“It may not be called social promotion, but it amounts to the same thing,” concludes Senechal, a frequent contributor to the Core Knowledge Blog.  “You do not need to know or understand much to move along.”
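The numbers in this post are enough for a back-of-the-envelope check on Ravitch's claim. What follows is a rough sketch, not a model of the state's actual scoring: it assumes each of the 26 multiple-choice questions is worth one raw point, that each has four options with the answer key spread evenly across them (so a fixed A, B, C, D pattern behaves like a blind guess), and that the 17.9 percent cutoff applies directly to the 39 raw points. The names below are illustrative only.

```python
from math import ceil, comb

RAW_POINTS = 39        # total raw points on the 6th grade ELA test, per the post
MC_QUESTIONS = 26      # multiple-choice questions, per the post
CUTOFF_SHARE = 0.179   # share of points needed for Level 2 in 2009, per Ravitch
P_GUESS = 0.25         # assumption: four options, answers evenly distributed


def prob_at_least(k, n, p):
    """Chance of k or more successes in n independent tries, each succeeding with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumption: one raw point per multiple-choice question.
points_needed = ceil(CUTOFF_SHARE * RAW_POINTS)   # 6.981 rounds up to 7
expected_correct = MC_QUESTIONS * P_GUESS          # 6.5 correct guesses on average

chance = prob_at_least(points_needed, MC_QUESTIONS, P_GUESS)
print(points_needed, expected_correct, f"{chance:.1%}")
```

Under those assumptions a guesser averages 6.5 points against a threshold of 7, with the written sections left entirely blank.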

Test Data Plan Personally Approved by Obama

The requirement that states allow the use of standardized test data to evaluate teachers, the signature feature of the Race to the Top fund initiative, was personally approved by President Obama, according to EdWeek’s Michele McNeil.

When Education Department staff members finally settled on the data firewall rule, which would effectively knock out two states with giant student populations and powerful Congressional delegations, I’m told that education staffers took it up to those above their pay grades.  To Obama Chief of Staff Rahm Emanuel, and eventually, to the president himself. And Obama, apparently, didn’t need much convincing.

Meanwhile, in a detailed analysis of the guidelines, Stephen Sawchuk says the data firewall issue is not just about performance pay.

States receiving Race to the Top funds must commit to using their teacher-effectiveness data for everything from evaluating teachers to determining the type of professional development they get to making decisions about granting tenure and pursuing dismissals. And, they will also be expected to track graduates of their education schools into classrooms to help institutions figure out which pathways and courses produce the best teachers.

The issue is not the use of the data, but the value of the data.  Is it possible to make good decisions with bad data?  Perhaps it doesn’t matter. One possibility raised by Fordham’s Mike Petrilli is that states will “superficially swear allegiance to these reform ideas but implement them half-heartedly down the road.”