Data-Driven…Off a Cliff

by Robert Pondiscio
October 20th, 2010

Miami English teacher Roxanna Elden makes a compelling case for how “data-driven instruction” can be misleading and self-defeating. Writing at Education Next, Elden describes a nonfiction passage about owls on a practice test for the state’s FCAT: Which of the owls’ names is the most misleading? Is it the screech owl “because its call rarely approximates a screech?” Or is it the long-eared owl, “because its real ears are behind its eyes and covered by feathers?”

Each question on the practice test supposedly corresponds to a specific reading skill or benchmark. “Teachers are supposed to discuss test results in afterschool ‘data chats’ and then review weak skills in class,” Elden writes.  Like so:

First Teacher: Well, it looks like my students need some extra work on benchmark LA.910.6.2.2: The student will organize, synthesize, analyze, and evaluate the validity and reliability of information from multiple sources (including primary and secondary sources) to draw conclusions using a variety of techniques, and correctly use standardized citations.

Second Teacher: Mine, too! Now let’s work as a team to help students better understand this benchmark in time for next month’s assessment.

Third Teacher: I am glad we are having this “chat.”

Forget for a moment that people only speak like this after they fall asleep next to a pod.   Here’s how Elden’s actual “data chat” went:

First Teacher: My students’ lowest area was supposedly synthesizing information, but that benchmark was only tested by two questions. One was the last question on the test, and a lot of my students didn’t have time to finish. The other question was that one about the screech owl having the misleading name, and I thought it was kind of confusing.

Second Teacher: We read that question in class and most of my students didn’t know what approximates meant, so it really became more of a vocabulary question.

Third Teacher: Wait … I thought the long-eared owl was the one with the misleading name.

Language arts teachers, Elden points out, “know that answering comprehension questions correctly does not rest on just one benchmark.”  That may work for math, but, she correctly observes, “reading is different.”

“After students have mastered basics like decoding, reading cannot be taught through repeated practice of isolated skills. Students must understand enough of a passage to utilize all the intricately linked skills that together comprise comprehension. The owl question, for example, tests skills not learned from isolated reading practice but from processing information on the varying characteristics of animal species. (The correct answer, by the way, is the screech owl.)”

Data-driven instruction says teach the skill?  Well, data-driven instruction is wrong.  Reading is not a transferable skill with components that can be separated like an egg yolk from the egg white. Comprehension is a function of interwoven skill, prior knowledge and vocabulary.   Expecting teachers to tease out a specific skill from the question Elden cites is like asking them to separate the yolk from a scrambled egg.

“Unfortunately, strict adherence to data-driven instruction can lead schools to push aside science and social studies to drill students on isolated reading benchmarks. Compare and contrast, for example, is covered year after year in creative lessons using Venn diagrams. The result is students who can produce Venn diagrams comparing cans of soda, and act out Venn diagrams with Hula-hoops, but are still lost a few paragraphs into a passage about owls. When they do poorly on reading assessments, we pull them again from subjects that give them content knowledge for more review of Venn diagrams. Many students learn to associate reading with failure and boredom.”

The expectation that teachers should use data in a way that belies what we know about reading is a prime example of what Rick Hess called The New Stupid – “a reflexive and unsophisticated reliance on a few simple metrics.”

“It’s impossible to teach kids to read well while denying them the knowledge they need to make sense of complex material,” Elden concludes. “Following the data often forces teachers to do just that.”


  1. Thanks for highlighting Roxanna’s piece. I was surprised to see it at Education Next, frankly–it’s such a refreshing burst of reality and entirely practice-based. Teachers have always used “data” to inform their teaching–the good teachers, at least. To presume that we need special psychometric skills to analyze our students’ work or that we simply don’t pay attention and roll merrily along is part of the bad teacher myth.

    I recently got on an elevator with a colleague, at a conference, and remarked to her “If I hear the phrase ‘data-driven anything’ once more, I’m going to throw up.” Two women standing in the back of the elevator looked at each other and smugly rolled their eyes. They were wearing “Presenter” ribbons and carrying stacks of notebooks labeled “Data-driven Results in _____ County!”

    Comment by Nancy Flanagan — October 20, 2010 @ 11:30 am

  2. [...] Data-Driven…Off a Cliff « The Core Knowledge Blog. [...]

    Pingback by Data-Driven…Off a Cliff « The Core Knowledge Blog « Parents 4 democratic Schools — October 20, 2010 @ 3:25 pm

  3. [...] This post was mentioned on Twitter by Sheila Stewart, Robert Pondiscio. Robert Pondiscio said: Prime example of how data driven instruction forces bad teaching. #education [...]

    Pingback by Tweets that mention Data-Driven…Off a Cliff « The Core Knowledge Blog, The Core Knowledge Blog -- — October 20, 2010 @ 5:05 pm

  4. The New Stupid is endemic to the Common Core Standards in English Language Arts. The Standards are what the New Stupid Tests will be referenced to. The tests are what teacher evaluations will be partially based on. And these pillars/assurances are going to keep the US internationally competitive and deliver all kids who graduate from high school college or career ready by the year 2020.

    Does anyone see anything wrong with that narrative?

    Comment by Dick Schutz — October 20, 2010 @ 6:08 pm

  5. Robert, I’d love to provide many, many, many more examples of stupid assessment perpetrated by the state testing system – but I’m required to sign a non-disclosure form. What passes for “reading” in these assessments is so fraudulent. It’s everything but reading – it’s background knowledge testing, it’s logic and reasoning testing, it’s cultural assumption testing…

    Comment by David B. Cohen — October 20, 2010 @ 6:16 pm

  6. There’s a lot to the example above. Poor assessment literacy, misunderstanding of how learning occurs, etc. But I’d encourage you to slow down before you throw the whole “using data” baby away. As noted above, the best teachers have always used information about their students to help them improve instruction — and they know that more and better information can lead to even better results. Yet, unlike for almost all other professionals who perform complex, demanding work, the information tools available to teachers have been remarkably limited. This should not be an either/or.

    Moreover, when we think about teachers collaborating in teams, [the right] data becomes central to those conversations. I’m assuming you want to avoid overly simplistic and flawed approaches — not the use of data entirely.

    And, if we had better data to understand the impact of instructional programs and practices, curricula, etc. then perhaps that situation might be better too.

    Comment by Bill Tucker — October 20, 2010 @ 7:21 pm

  7. @Bill Tucker I don’t think anyone’s arguing against the use of data. I’m certainly not. But the post’s point about data vis-à-vis reading is pretty hard to argue. It promotes the fallacious idea of reading comprehension as a transferable skill, which if I had to choose Public Enemy #1 in education would be a pretty strong contender. As a teacher, I would regularly get progress reports that would detail the “skill” at which my kids were weak, as if those skills existed in a vacuum. In the owl example in the original post, there are at least three reasons kids might get the question wrong: 1) lack of relevant knowledge of science/biology/animals/owls; 2) vocabulary; 3) the skill of comparing and contrasting. My guess is that far more kids would boot the question for reasons #1 and #2. But as a practical matter, the only one teachers are expected to address is #3 — the least efficacious and least likely to succeed in future tests.

    Comment by Robert Pondiscio — October 20, 2010 @ 8:04 pm

  8. To me, the funniest comment is the PLC-driven one:

    “Each question on the practice test supposedly corresponds to a specific reading skill or benchmark. “Teachers are supposed to discuss test results in afterschool ‘data chats’ and then review weak skills in class,” Elden writes.”

    Rats routinely beat humans in tests where people think data has a pattern, but the data is random. Alleged experts pressing for ‘data chats’ are arguing against research! Administrators mimic researchers. So it goes.

    Leonard Mlodinow, associate of Hawking, wrote about this in “The Drunkard’s Walk.”

    Comment by Dennis Ashendorf — October 20, 2010 @ 9:00 pm

  9. Roxanna’s argument seems to be against “data-driven” instruction, not against data-driven instruction. As you point out, a truly research-based instructional paradigm acknowledges that reading (beyond decoding) is not a transferable skill.

    Why is this distinction important? Because otherwise, you get equally facile arguments against the use of any type of data, even when the measurement of skills might be legitimate, as in math. An example from the EdNext comments:

    “This should make anyone wary about programs like the “School of One,” which uses software to determine students’ skill needs. The software gathers “data” during the day (from students’ answers to multiple-choice questions) and then chooses lesson plans for teachers to deliver the following day”

    Comment by Hainish — October 21, 2010 @ 9:06 am

  10. “To presume that … we simply don’t pay attention and roll merrily along is part of the bad teacher myth.”

    This isn’t all teachers, but it’s a spot-on description of the cooperating teacher/mentor I had during my student teaching.

    Comment by Hainish — October 21, 2010 @ 9:18 am

  11. Hainish, I was not arguing against the use of “any kind of data.” I was arguing against the reliance on software to diagnose students’ needs on the basis of data. It is not a facile argument; it is an important one.

    The similarity between the School of One model and Elden’s situation is as follows: in Elden’s case, the teachers were supposed to use test results to determine students’ learning needs, skill by skill, and they found out quickly that the skills can’t be isolated in that way. They were able to exercise judgment, but they were still constrained by the assumptions behind these tests.

    In the School of One model, the software determines the students’ needs on the basis of student performance during the day. There is even less room for human judgment. The teacher receives the next day’s lesson plans in the evening–different lessons for different groups. How much are you going to change that at the last minute? Keep in mind that if you decide to change the grouping, you will throw off other groups and other teachers. So here, to an even greater degree, the teachers are constrained by a narrow interpretation of data.

    Moreover, math is not just a collection of skills. In many ways math is like literature; it requires insight, understanding, ingenuity, and deep knowledge. Plato’s discussion of math in book VII of the Republic is fascinating; he distinguishes between those things that “summon thought” through contradictions and opposites, and those that don’t. (He gives less credit to poetry, but that’s another matter.)

    I may be partly wrong about the School of One. I have not seen the computer program, and I have not seen the classes in action. But I raise these concerns because they seem obvious and because all I hear about the School of One is hype, hype, hype.

    Comment by Diana Senechal — October 21, 2010 @ 10:22 am

  12. Correction: the fourth sentence of the second paragraph should read: “How much are you going to change at the last minute?”

    Comment by Diana Senechal — October 21, 2010 @ 10:40 am

  13. I meant the third paragraph.

    Comment by Diana Senechal — October 21, 2010 @ 10:41 am

  14. Diana, The difference between your situation and Elden’s is that the assumptions behind the data are not equally valid (or invalid). I think you’re missing that point from Elden’s article. It is not that teachers are constrained by technology. It is that data is misused, and is therefore ineffective in driving instruction.

    Is it equally misused in math? IOW, are the assumptions behind the data also invalid? It certainly might be the case, and if you wanted to assert as much it would be relevant to Elden’s point, but that is not the argument you are making.

    If you want to argue against the use of technology in teaching, then that is a different argument altogether.

    Comment by Hainish — October 21, 2010 @ 11:29 am

  15. Hainish, you are missing my point. I didn’t say that the teachers were constrained by technology in Elden’s situation. I said they were constrained by the assumptions behind the tests. That is, the test questions are supposed to measure students’ mastery of a given skill. They don’t do this accurately. Indeed, the data is misused. What’s more, even though there is some room for human mediation, the teachers have to wrestle with the flawed assumption that ELA test questions can measure mastery of isolated skills.

    Now, granted, in math it may be easier to measure mastery of isolated skills. But similar problems arise, as most math problems draw on a combination of skills and knowledge. Moreover, the more advanced the topics, the more lesson time they require. A single theorem can take up a whole lesson, and it may take much longer to sink in. The “School of One” model seems to rely on the assumption that math skills and knowledge can be taught in bits. From what I gather, the data is used to determine which “bits” each student needs. This, to me, is not an adequate approach to mathematics.

    Don’t get me wrong–I am all for the thorough teaching of math knowledge and skills–drills, memorization, discussion, and all. Students need intensive practice, and if technology can help with that, so much the better. But I am wary of reducing math to a succession of skills and breaking lessons into mini-lessons. And when teachers have limited control over this, when the groups and lessons are determined for them every day, it seems even more problematic.

    Comment by Diana Senechal — October 21, 2010 @ 12:23 pm

  16. The math equivalent of the question “Which of these four owls’ names is the most misleading?”, I guess, would be “Which of these four proofs is the most incorrect?…”

    Comment by andrei radulescu-banu — October 21, 2010 @ 12:51 pm

  17. Well, how about this one, from the 2010 New York State math test, grade 7:

    Based on Rudy’s baseball statistics, the probability that he will pitch a curveball is 1/4.

    If Rudy throws 20 pitches, how many pitches most likely will be curveballs?
    A. 1
    B. 2
    C. 5
    D. 10

    This problem has a clear right answer–but, according to the answer key, it supposedly tests the student’s ability to “predict the outcome of an experiment.” So, if the student gets this wrong, does he or she need more practice predicting outcomes of experiments? Or might this signal a problem with basic calculation? This is an example of a question supposedly testing mastery of a particular skill but in fact testing a number of things at once.

    On the same test, there is also a question with two possible answers:

    A rectangular pyramid is shown below. Which shape could be the base of the pyramid?

    A. square
    B. pentagon
    C. triangle
    D. trapezoid

    The answer is A, but it could also be D, as there are two valid definitions of a trapezoid. There is the exclusive definition: a trapezoid is a quadrilateral with only one pair of parallel sides. Then there’s the inclusive definition: a trapezoid is a quadrilateral with at least one pair of parallel sides. Granted, the textbook probably gives the exclusive definition, but both definitions are in use.

    Comment by Diana Senechal — October 21, 2010 @ 1:54 pm

  18. Well, there are two things wrong with that owl question. First, there are two plausible answers. Second, it doesn’t really test the skill it purports to test.

    The second issue is more common than the first on math tests, but both can be found. Here’s an example of a problem that doesn’t really test the skill it purports to test (from the 2010 New York State seventh-grade math test):

    Based on Rudy’s baseball statistics, the probability that he will pitch a curveball is 1/4.

    If Rudy throws 20 pitches, how many pitches most likely will be curveballs?

    A. 1
    B. 2
    C. 5
    D. 10

    According to the answer key, this problem supposedly tests a student’s ability to “predict the outcome of an experiment.” Yet a student who gets it wrong is just as likely to be having trouble with arithmetic.

    As for the first issue–that of two possible answers–here’s a question from the same test:

    A rectangular pyramid is shown below. Which shape could be the base of the pyramid?

    A. square
    B. pentagon
    C. triangle
    D. trapezoid

    There are actually two definitions of a trapezoid, both currently in use. There is the exclusive definition: a trapezoid is a quadrilateral with exactly one pair of parallel sides. Then there is the inclusive definition: a trapezoid is a quadrilateral with at least one pair of parallel sides. Granted, the textbook probably gives the exclusive definition. But there are reasons to prefer the inclusive definition: in particular, anything that is true of a trapezoid is also true of a parallelogram. Under this definition, a parallelogram is a type of trapezoid, a rectangle is a type of parallelogram, and a square is a type of rectangle.

    Comment by Diana Senechal — October 21, 2010 @ 2:08 pm

  19. By definition, all squares are trapezoids, and if a textbook says otherwise, it is not worth the paper it is printed on.

    Now I don’t have to look at the picture of the rectangular pyramid that came with the New York State seventh-grade question to tell there may be another problem with it – from a 3 dimensional drawing, it is impossible to tell if a parallelogram is a square or not…

    Comment by andrei radulescu-banu — October 21, 2010 @ 3:11 pm

  20. I am having commenting trouble today… My first comment regarding the test questions seemed to disappear when I posted it, so I rewrote it. Sorry for the redundancy.

    Anyway, Andrei, thanks for your good points. I believe that the word “could” is supposed to take care of it. That is, according to the test makers’ understanding, the base is not necessarily a square, but the square is the only possibility. Clearly they are assuming the “exclusive” definition of trapezoid (which you indicate is incorrect).

    Comment by Diana Senechal — October 21, 2010 @ 3:44 pm

  21. You’re right – they do say ‘could’. It could be a square, and it is a trapezoid…

    Comment by andrei radulescu-banu — October 21, 2010 @ 4:18 pm

  22. The standards movement has unleashed the worst of the left and the worst of the right. From the right we have pugilistic demands for instant, measurable results, even in the humanities, which depend so much on nuance and messy human variables. On the left, it’s obnoxious progressives forever contriving to make Education, with a capital E, a quasi-scientific priesthood that rules over all the other academic subjects. Both reason and education–with a little e–are being squeezed out.

    Now, can anyone get me some hard data on that?

    Comment by James O'Keeffe — October 21, 2010 @ 4:38 pm

  23. For what it’s worth, there’s another interpretation to the 25%-probability-of-a-curveball problem. Is the first throw most likely to be a curveball? No, the first throw has a 25% chance of being a curveball and a 75% chance of not being a curveball. Therefore it is most likely that the first throw will not be a curveball. The same could be said of the second throw. It will most likely not be a curveball. And the third, and the fourth, and so on. So how many throws are most likely to be curveballs? None of them! They are all most likely to not be curveballs. In each case there is a 75% probability of not being a curveball.

    When I first read this problem I immediately thought “five”, and almost as immediately thought, “No, it’s a trap, a trap that will catch anyone who is not thinking precisely about probability.” So in order to discover the trap I looked for another interpretation, the one I described in the previous paragraph, and it was quick to come.

    In the world of fuzzy math there is a call for teaching some concepts of probability in elementary school. So if we envision teaching the topic precisely and analytically, then I would think good students would expect to approach this problem rigorously. And it seems to me that if we are going to use language precisely, the correct answer is 0. None of the pitches are most likely to be curve balls.

    And there is yet another possible interpretation, and this interpretation would depend on knowing something about baseball, which I don’t. Perhaps there are a half dozen categories of pitches. Perhaps the curveball is the most likely with a probability of 25%. The next most likely, let us say, has a probability of 22%. And the third most likely has a probability of 21%, and so on. If this were the case, it would be true that the most likely pitch on the first throw would be a curveball. The first throw is not likely to be a curveball, to be sure, but it is more likely to be a curveball than to be any of the other five types with lesser probabilities. By this interpretation the answer is 20. Every throw is most likely to be a curveball, most likely out of the six identified possibilities.

    I’m not sure all this has much to do with the main idea of this discussion, but it seems worth bringing up.

    Comment by Brian Rude — October 22, 2010 @ 4:07 pm

  24. Brian, thanks–excellent and very interesting points. I was bothered by the wording of the question but had trouble coming up with an alternative. Perhaps: “What is the likeliest total number of curveballs that Rudy will pitch?” But then the students would have to be able to compare the likelihood of, say, pitching four curveballs to the likelihood of pitching five.

    I thought perhaps “most likely” could be taken to modify the entire sentence, since it is adverbial here–but it is a pesky phrase, and “how many pitches will be curveballs, most likely?” doesn’t sound quite right. In any case, it is ambiguous, and your interpretations are at least as sound as the intended one.

    There are still more possibilities. Perhaps Rudy rarely throws curveballs, but on the days that he does, he throws one after another. If he throws more than one, he is likely to throw curveballs throughout the whole game. Over the course of, say, twelve games, one-quarter of his total pitches are curveballs. In that case a total of five curveballs in a given game would be very unlikely; the numbers at the outer ends would be likelier. (Granted, the additional information would have to be mentioned in the problem itself.)

    I believe this does have something to do with the main idea of the discussion. If the question is poorly worded in the first place, an incorrect answer does not necessarily reflect a problem with any skill. As for your point that students should learn to approach problems rigorously, I heartily agree.

    Comment by Diana Senechal — October 22, 2010 @ 6:47 pm

  25. I think everyone here is getting all hung up on the “finer points” and overlooking the overall concept of the probability of 25% of throwing a curve ball. While I agree there needs to be precision in mathematics, remember this is a test for 7th grade students testing a general concept in probability, not a graduate course! Certainly this kind of precision and rigor at this level is inappropriate, and to expect any middle school student to formulate all these possibilities on a time-limited test is silly.

    Consider this: Would you teach the “finer points” of the game to a novice golfer, or would you just teach him the basics of grip, stance, swing, keep-your-eye-on-the-ball, etc., then tell him to go out and play? Then after a period of time, if the novice shows some talent and seriousness, you could teach him the “finer points” of the game.

    Comment by e.g.e. — October 23, 2010 @ 9:57 am

  26. I’d have to agree with e.g.e. Grip it and rip it has worked for many good golfers. The finer points of the game can be incorporated anytime after the basics have been established.

    Comment by Paul Hoss — October 23, 2010 @ 2:33 pm

  27. What’s wrong with a little excursion into logic and language? It’s fun–but also useful. It is very important to teach students to spot ambiguities in language. Seventh graders are not too young for it. Moreover, teachers and test makers should convey the finer points by example at the very least. Do you think it’s acceptable for a test maker to say, “aw, come on, everyone knows what I was trying to say, who cares if I didn’t quite get the wording right?”

    Also, certain kinds of fine points are essential for beginners. If you’re learning an instrument, you shouldn’t get in the habit of playing out of tune. If you’re learning a language, you should pronounce it as precisely as possible and get the intonation right. In many cases the “basics” are indeed fine points–and it is those fine points that illuminate the subject.

    In any case, none of this detracts or distracts from the earlier points that (a) test questions do not necessarily test the skills that they purportedly test, (b) test questions are sometimes ambiguous or poorly worded, and (c) many topics, problems, and texts draw on a number of skills at once. Of course tests are needed and informative, but they do not always show directly or precisely what students need to learn.

    Comment by Diana Senechal — October 23, 2010 @ 3:29 pm

  28. I quite agree with you Diana. After reading the post and going through all the comments that have been posted here, I feel there’s nothing wrong in a chirpy discussion about logic and language. Thanks to Robert Pondiscio, the post was helpful.

    Comment by ILEAD India — October 25, 2010 @ 4:06 am
