Question: What do we learn from a study that shows a technique or technology likely has affected an educational outcome?
Answer: Not nearly enough.
Despite widespread criticism, the field of education research continues to emphasize statistical significance—rejecting the conclusion that chance is a plausible explanation for an observed effect—while largely neglecting questions of precision and practical importance. Sure, a study may show that an intervention likely has an effect on learning, but so what? Even researchers’ recent efforts to estimate the size of an effect don’t answer key questions. What is the real-world impact on learners? How precisely is the effect estimated? Is the effect credible and reliable?
Yet it's the practical significance of research findings that educators, administrators, parents and students really care about when it comes to evaluating educational interventions. This has led to what Russ Whitehurst has called a “mismatch between what education decision makers want from the education research and what the education research community is providing.”
Unfortunately, education researchers are not expected to interpret the practical significance of their findings or acknowledge the often embarrassingly large degree of uncertainty associated with their observations. As a result, the education research literature is filled with results that are almost always statistically significant but rarely informative.
Early evidence suggests that many edtech companies are following the same path. But we believe that they have the opportunity to change course and adopt more meaningful ways of interpreting and communicating research that will provide education decision makers with the information they need to help learners succeed.
Admitting What You Don’t Know
For educational research to be more meaningful, researchers will have to acknowledge its limits. Although published research often projects a sense of objectivity and certainty about study findings, accepting subjectivity and uncertainty is a critical element of the scientific process.
On the positive side, some researchers have begun to report what are known as standardized effect sizes, calculations that help compare outcomes in different groups on a common scale. But researchers rarely interpret the meaning of these figures. And the figures can be confusing. A ‘large’ effect actually may be quite small when compared to available alternatives or when factoring in the length of treatment, and a ‘small’ effect may be highly impactful because it is simple to implement or cumulative in nature.
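For readers who want the arithmetic, one widely used standardized effect size is the standardized mean difference, often called Cohen's d. Studies differ in which measure they report, so treat this as an illustrative assumption rather than the definition behind any particular published figure; the subscripts T (treatment) and C (comparison) are notation introduced here.

```latex
d = \frac{\bar{x}_T - \bar{x}_C}{s_p},
\qquad
s_p = \sqrt{\frac{(n_T - 1)\,s_T^2 + (n_C - 1)\,s_C^2}{n_T + n_C - 2}}
```

Because d is expressed in standard deviation units, a value of +0.30 would mean only that the treatment group's average score sat 0.30 pooled standard deviations above the comparison group's. By itself it says nothing about cost, duration, or the alternatives available.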
Confused? Imagine the plight of a teacher trying to decide which products to use based on evidence, an issue of increased importance now that the Every Student Succeeds Act (ESSA) promotes the use of federal funds for certain programs based upon evidence of effectiveness. The newly launched Evidence for ESSA admirably tries to support that process, complementing the What Works Clearinghouse and pointing to programs that have been deemed “effective.” But when that teacher starts comparing products, say Math in Focus (effect size: +0.18) and Pirate Math (effect size: +0.37), the best choice isn’t readily apparent.
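One way to make such figures more tangible is to translate an effect size into the percentile at which the average treated student would be expected to fall within the comparison group. The sketch below does this for the two effect sizes quoted above. It assumes that both figures are standardized mean differences and that scores are normally distributed with equal spread in both groups; neither assumption is confirmed by the sources, and the function name `expected_percentile` is hypothetical.

```python
from statistics import NormalDist


def expected_percentile(effect_size: float) -> float:
    """Percentile (0-100) of the comparison-group distribution at which the
    average treated student would fall.

    Assumption made for this sketch: scores are normally distributed with
    equal spread in both groups, and effect_size is a standardized mean
    difference.
    """
    return 100 * NormalDist().cdf(effect_size)


# Effect sizes quoted in the text above.
for name, es in [("Math in Focus", 0.18), ("Pirate Math", 0.37)]:
    print(f"{name}: effect size {es:+.2f} -> roughly the "
          f"{expected_percentile(es):.0f}th percentile")
```

Under these assumptions the two figures correspond to roughly the 57th and 64th percentiles, compared with the 50th percentile for no effect. Even so, the translation does not settle which product is the better choice: it ignores cost, length of treatment, the students studied, and how precisely each effect was estimated.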
It’s also important to note that every intervention’s observed “effect” is associated with a quantifiable degree of uncertainty. By glossing over this fact, researchers risk promoting a false sense of precision and making it harder to craft useful data-driven solutions. While acknowledging uncertainty is likely to temper excitement about many research findings, in the end it will support more honest evaluations of an intervention’s likely effectiveness.
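To illustrate how much that uncertainty can matter, the sketch below attaches an approximate 95% confidence interval to a standardized mean difference using a common large-sample formula for its standard error. The sample sizes are hypothetical and chosen only for illustration, as is the function name `effect_size_interval`; this is not the method of any study mentioned here.

```python
from math import sqrt
from statistics import NormalDist


def effect_size_interval(d, n_treat, n_comp, level=0.95):
    """Approximate confidence interval for a standardized mean difference d.

    Uses the common large-sample approximation
        SE(d) = sqrt((n1 + n2) / (n1 * n2) + d^2 / (2 * (n1 + n2)))
    together with a normal critical value.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se = sqrt((n_treat + n_comp) / (n_treat * n_comp)
              + d ** 2 / (2 * (n_treat + n_comp)))
    return d - z * se, d + z * se


# Hypothetical scenarios: the same observed effect from a small study
# and from a large one.
for n in (30, 1000):
    low, high = effect_size_interval(0.37, n, n)
    print(f"n = {n} per group: d = +0.37, "
          f"95% CI about ({low:+.2f}, {high:+.2f})")
```

In this hypothetical, the small study's interval includes zero, so the same +0.37 is compatible with no effect at all, while the large study's interval stays well above zero. Reporting the interval alongside the point estimate makes that difference visible to a reader.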
Communicate Better, Not Just More
In addition to faithfully describing the practical significance and uncertainty around a finding, researchers need to communicate research quality clearly, in ways that are accessible to non-specialists. The broader educational research community has been notably unwilling to tackle the challenge of distinguishing high-quality research from quackery for educators and other non-specialists. It is long overdue for educational researchers to be forthcoming about the quality and reliability of interventions in ways that educational practitioners can understand and trust.
Trust is the key. Whatever issues might surround the reporting of research results, educators are suspicious of people who have never been in the classroom. If a result or debunked academic fad (e.g., learning styles) doesn’t match their experience, they will be tempted to dismiss it. As education research becomes more rigorous, relevant, and understandable, we hope that trust will grow. Even simply categorizing research as either “replicated” or “unchallenged” would be a powerful initial filtering technique given the paucity of replication research in education. The alternative is to leave educators and policy-makers intellectually adrift, susceptible to whatever educational fad is popular at the moment.
At the same time, we have to improve our understanding of how consumers of education research interpret research claims. For instance, surveys reveal that even academic researchers often misinterpret basic concepts like statistical significance and confidence intervals. As a result, there is a pressing need to understand how those involved in education interpret (rightly or wrongly) common statistical ideas and decipher research claims.
A Blueprint For Change
So, how can the education technology community help address these issues?
Despite the money and time edtech companies spend conducting efficacy studies on their products, surveys reveal that research often plays a minor role in edtech consumer purchasing decisions. The opacity and perceived irrelevance of edtech research studies, which mirror the reporting conventions typically found in academia, no doubt contribute to this unfortunate fact. Educators and administrators rarely possess the research and statistical literacy to interpret the meaning and implications of research that focuses on claims of statistical significance and measures indirect proxies for learning. This might help explain why even well-meaning educators fall victim to “learning myths.”
And when nearly every edtech company is amassing troves of research studies, all ostensibly supporting the efficacy of their products (with the quality and reliability of this research varying widely), it is understandable that edtech consumers treat them all with equal incredulity.
So, if the current edtech emphasis on efficacy is going to amount to more than a passing fad and avoid devolving into a costly marketing scheme, edtech companies might start by taking the following actions:
- Edtech researchers should interpret the practical significance and uncertainty associated with their study findings. The researchers conducting an experiment are best qualified to answer interpretive questions around the real-world value of study findings, and we should expect them to make an effort to do so.
- As an industry, edtech needs to work toward adopting standardized ways to communicate the quality and strength of evidence as it relates to efficacy research. The What Works Clearinghouse has taken important steps, but it is critical that relevant information be brought to the point of decision for educators. This work could resemble something like food labels for edtech products.
- Researchers should increasingly use data visualizations to make complex findings more intuitive while making additional efforts to understand how non-specialists interpret and understand frequently reported statistical ideas.
- Finally, researchers should employ direct measures of learning whenever possible rather than relying on misleading proxies (e.g., grades or student perceptions of learning) to ensure that the findings reflect what educators really care about. This also includes using validated assessments and focusing on long-term learning gains rather than short-term performance improvement.