What is likely, although we cannot of course know for sure, is that the doctor looked at more than disease severity (the "existing clinical trait") as a separate classifier, hunting for subgroups for which the p-value indicated a promising trend.
The principle behind the proscription against multiple comparisons is well-known to statisticians. If we treat a 1-in-20 chance result as statistically significant, then by chance alone, 20 "trials" will on average yield one statistically significant result.
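To see that concretely, here is a minimal Python sketch (my illustration, not anything from the article): run 20 comparisons where the null hypothesis is true by construction, and on average one of them comes back "significant."

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    alpha, n_tests = 0.05, 20

    # Each "trial" compares two samples drawn from the SAME distribution,
    # so the null is true and any significant result is a false positive.
    pvals = [stats.ttest_ind(rng.normal(size=50), rng.normal(size=50)).pvalue
             for _ in range(n_tests)]
    print(sum(p < alpha for p in pvals))  # expected count: 20 * 0.05 = 1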
By dividing the patients into disease severity subgroups, Dr. Harkonen increased the number of "trials" from 1 to 4, thereby elevating the likelihood of yielding an effect that appeared to be statistically significant. If he also examined other subgroups in his quest to find a positive result, then he elevated the likelihood of finding a positive result toward certainty.
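To put numbers on "toward certainty": assuming the subgroup looks were roughly independent (my assumption), the chance of at least one spurious hit across k tests at alpha = 0.05 is 1 - 0.95^k. A few lines of arithmetic:

    # Family-wise error rate for k independent tests at alpha = 0.05,
    # i.e. the probability of at least one false positive.
    alpha = 0.05
    for k in (1, 4, 14, 60):
        print(k, round(1 - (1 - alpha) ** k, 2))
    # -> 1: 0.05, 4: 0.19, 14: 0.51, 60: 0.95

So even the four severity subgroups alone nearly quadruple the false-positive risk, and each additional subgroup examined pushes it higher.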
Our desire to find patterns and see cause and effect makes us prone to confirmation bias. We can guard against this bias with care, including the use of statistics. It was not a surprise that a subsequent study looking only at the "mild-to-moderate" group did not demonstrate any benefit of the treatment. The belief that the treatment would benefit "mild-to-moderate" patients was speciously derived.
(Even if they had corrected for multiple tests, I think the subgroup might still have been significant. The uncorrected p-value of 0.004 is what's sticking in my head.)
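For what it's worth, that hunch checks out under a plain Bonferroni correction, assuming four subgroup tests (my assumption, not something stated in the article):

    # Bonferroni: multiply the raw p-value by the number of tests m
    # (equivalently, compare p against alpha / m).
    p_raw, m, alpha = 0.004, 4, 0.05
    p_adj = min(p_raw * m, 1.0)
    print(p_adj, p_adj < alpha)  # 0.016 True -> survives the correction

Of course, if he fished through many more than four subgroups, the effective m is larger and the adjusted p-value climbs accordingly.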
But the point is: just because he's bad at statistics, does that make it fraud? Based on what we know from the article, I'd argue no. People are allowed to be wrong and make mistakes in their analysis. They just aren't allowed to knowingly make those mistakes. And this is what we don't know... what he knew and what he thought at the time.
Sure, but when people deliberately lie in order to gain millions of dollars, I can't get that upset when they get 6 months stuck in their house.
He, and the statisticians at his company, knew or should have known that what he was doing was wrong. This stuff is covered in the first inferential statistics course an undergrad takes.