The August 29, 2007 issue of Education Week reports the release of the What Works Clearinghouse (WWC) review of beginning reading programs. Of the nearly 900 studies reviewed, only 51 met the WWC standards, an average of about two studies for each reading program included in the review. (120 other reading programs were examined in 850 studies deemed methodologically unacceptable.) The article, written by Kathleen Kennedy Manzo, notes that the major textbook offerings, on which districts spend hundreds of millions of dollars, did not have acceptable research available. Bob Slavin, an accomplished researcher and founder of the Success for All program (which got a middling rating on the WWC scale), also noted that the programs reviewed were mostly supplementary and smaller intervention programs, rather than the more comprehensive school-wide programs.
Why is there this apparent bias in what is covered in WWC reviews? Is it in the research base or in the approach that the WWC takes to reviews? It is a bit of both. First, it is easier to find an impact of a program when it is supplemental and is being compared to classrooms that do not have that supplement. This is especially true where the intervention is intense and targeted to a subset of the students. In contrast, consider trying to test a basal reading program. What does the control group have? Probably the prior version of the same basal or some other basal. Both programs may be good tools for helping teachers teach students to read, but the difference between the two is very hard to measure. In such an experiment, the “treatment” program would likely show “no discernible effect” (the WWC category for no measurable impact). Unlike a medical experiment where the control group gets a placebo, we can’t find a control group that has no reading program at all. Second, and probably the major reason there is so little rigorous research on textbook programs, districts usually have no choice: they have to buy one or another. Research on supplementary programs, in contrast, can inform a discretionary decision and so has more value to the decision-maker.
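To make the measurement problem concrete, here is a rough sketch of how the number of students needed grows as the true difference between two programs shrinks. It uses the standard normal-approximation formula for comparing two group means. The effect sizes (0.40 and 0.05 standard deviations) are hypothetical values chosen for illustration, not figures from the WWC or from any study, and the function name `students_per_group` is our own. The calculation also assumes students are assigned individually; experiments that assign whole classrooms or schools generally need more.

```python
import math
from statistics import NormalDist


def students_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate students needed in EACH group to detect a standardized
    difference in means (effect_size, in standard deviation units) with a
    two-sided test. Normal approximation; ignores classroom clustering."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # value for the desired power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


if __name__ == "__main__":
    # Hypothetical effect sizes, for illustration only.
    scenarios = [
        ("supplement vs. no supplement", 0.40),
        ("one basal vs. another basal", 0.05),
    ]
    for label, d in scenarios:
        n = students_per_group(d)
        print(f"{label}: effect size {d:.2f} -> about {n} students per group")
```

Because the required sample grows with the inverse square of the effect size, halving the true difference roughly quadruples the number of students needed, which is one way to see why a comparison between two similar basals so often comes up empty.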
While it may be hard to answer whether one textbook program is more effective than another, a better question may be whether one works better for specific populations, such as inexperienced teachers or English learners. That is a useful question if you are deciding on a text for your particular district, but it is not a question that WWC reviews address.
Another characteristic of WWC reviews is that the metric of impact is the same whether it is a small experiment on a highly defined intervention or a very large experiment on a comprehensive intervention. As researchers, we know that it is easier to show a large impact in a small targeted experiment. It is difficult to test something like Success for All, which requires school-wide commitment. At Empirical Education, we suggest to educators that the WWC is a good starting point for finding out what research has been conducted on interventions of interest. But the WWC reviews are not a substitute for trying out the intervention in your own district. In a local experimental pilot, the control group is your current program. Your research question is whether the intervention is sufficiently more effective than your current program, for the teachers or students of interest, to make it worth the investment. —DN