Statistical significance doesn’t always warrant a miracle headline

Interpreting studies is a dicey thing. Often I find that a statistically significant result gets translated into a headline that misses the nuance of the study or its results.

Take these three for example:

  1. “Pine bark extract improves severe perimenopausal symptoms”
  2. “Two weeks of antibiotic therapy relieves IBS (irritable bowel syndrome)”
  3. “Study: ‘Female viagra flibanserin’ works”

The first line of the last article: “Need a boost to your sex life. The magic could be in a little pill?”

Let’s look at the studies referenced by these three headlines.

French maritime pine bark extract, in the dose studied, improved hot flashes for 35% of women who took the drug versus 29% of women who took placebo. Insomnia and sleep problems improved by 28% for the bark extract versus 21% for placebo. (In the 3 domains tested the bark extract “worked”, but it barely reached statistical significance.) Statistically, more women who took the drug did better, but is 6 or 7 women out of a hundred getting benefit for hot flashes and insomnia from a daily medication a clinically significant or meaningful benefit?

The same goes for the IBS drug, an antibiotic called rifaximin. In the article a physician says that many participants “say they are 80% improved, 90% improved, that kind of results …”

Looking at the studies, TARGET 1 and TARGET 2, we see that 41% of the drug group were responders versus 32% of the placebo group (which is pretty poor overall if you ask me, as placebo is known to improve the symptoms of IBS for 59% of patients)! Might a few of the responders have felt dramatically better? Is that “many”? I guess it depends on your perspective, but understanding that only 9% of people were true responders puts the findings in a different light.

And flibanserin? Again, the same type of numbers. The initial studies quoted as “magic” actually showed an improvement for 30-40% of women who took flibanserin versus 15-30% for placebo. Overall, there was an increase of 1 to 1.8 satisfying sexual encounters a month. In my very unscientific study (a poll from a couple of weeks ago) it appears that 78% of people didn’t find those numbers clinically meaningful either.
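To make the arithmetic behind those comparisons explicit, here is a minimal Python sketch. It treats the quoted percentages as responder rates, as this post does, and works out the absolute difference between drug and placebo, the relative increase over placebo, and the number needed to treat (100 divided by the absolute difference in percentage points). The number needed to treat is a commonly used summary, but it is not something the studies or headlines above report, and the function names are made up for illustration. Flibanserin is left out because only ranges are quoted for it.

```python
def absolute_difference(treated_pct, placebo_pct):
    """Percentage-point gap between the treated and placebo responder rates."""
    return treated_pct - placebo_pct


def relative_increase(treated_pct, placebo_pct):
    """Gap expressed as a percentage of the placebo responder rate."""
    return (treated_pct - placebo_pct) / placebo_pct * 100


def number_needed_to_treat(treated_pct, placebo_pct):
    """Roughly how many people must take the drug for one extra responder."""
    diff = absolute_difference(treated_pct, placebo_pct)
    if diff <= 0:
        raise ValueError("treated rate must exceed placebo rate")
    return 100 / diff


# Responder rates (drug %, placebo %) as quoted in the post.
quoted_results = {
    "pine bark extract, hot flashes": (35, 29),
    "pine bark extract, sleep problems": (28, 21),
    "rifaximin, IBS": (41, 32),
}

for name, (treated, placebo) in quoted_results.items():
    print(
        f"{name}: "
        f"absolute difference {absolute_difference(treated, placebo)} points, "
        f"relative increase {relative_increase(treated, placebo):.0f}%, "
        f"number needed to treat about {number_needed_to_treat(treated, placebo):.0f}"
    )
```

The relative increase always looks bigger than the absolute difference, which is one reason the same result can read as a miracle in a headline and as a modest benefit in the clinic.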

There is no doubt that each one of these studies has statistical significance. When studying any therapy, that is obviously the first step. However, statistical significance is often parlayed into miracle headlines, and the promise that a drug is truly helpful is tempered somewhat when you compare it with the placebo response rate.

It behooves everyone not to get caught up in the hype and to put the results in perspective. When a medication helps 65% of people versus 25% for placebo the medical decisions tend to be easier (and by the way, that is more along the lines of what I’d call a miracle response rate, not 40% versus 32%).

If no previous therapy has worked for a condition, then 7 responders out of 100 might truly be a miracle. If a patient has not responded to any previous therapy, then 7 responders might be worthwhile. If the medication is extremely low-cost and has minimal side effects, then 7 responders out of 100 might seem reasonable for many people. However, if the medication is expensive and/or has side effects, then obviously the enthusiasm needs to be tempered. The response rate to a medication is part of the risk-benefit ratio, and for every patient that ratio will be different, shaped by variables such as severity of illness, response to previous treatments, impact of the condition on their life, and previous experience with side effects and cost.

Statistical significance doesn’t mean something works for everyone and it doesn’t mean a therapy is a miracle, so reporters and health care providers need to stop intimating that it does. Statistical significance simply means that the observed effect is unlikely to have arisen by chance alone. The next and far more important step is interpreting those results and applying them in a clinically meaningful way, but those kinds of discussions probably don’t generate sexy, amazing headlines.

Jennifer Gunter is an obstetrician-gynecologist and author of The Preemie Primer. She blogs at her self-titled site, Dr. Jen Gunter.

  • Lisa

    Great article!

    I think anyone who is diagnosed with a serious illness should be offered a medical statistics course. Relative versus absolute probability should be discussed. As a cancer patient, it drives me nuts when I read articles that don’t differentiate between the two. How can I make decisions without understanding this concept?