I’ve just been pointed to a fantastically bad article about statistical significance on the blog of one Dr John Briffa, former natural health columnist for the Daily Mail, no less. He says:
This tells us, supposedly, whether there’s some real effect or change going on, or it’s merely something that’s most likely to be due to chance. Statistical significance in scientific studies is denoted by what is known as the P (or probability) value. A value of less than 0.05 is generally regarded as denoting ‘statistical significance’.
Sounds fine so far. Except, I do feel compelled to point out that the choice of 0.05 as a cut-off is utterly arbitrary. It’s a value that the scientific community agree on. It’s a consensus – it’s not carved in stone like some irrefutable scientific truth. If the scientific community decided that 0.01 was going to be a cut-off, then less things would be ‘statistically significant’. If the limit was set at 0.1 then many more things would be deemed significant. When we understand this, we begin to see just how arbitrary a lot of scientific ‘findings’ really are.
And that’s all true, although the fact that p<0.05 is a totally arbitrary choice isn’t exactly a secret. We all know it. Often, people will demand p<0.001. That’s why we quote p values rather than just printing “yes” or “no”. It’s a bit like saying “People over 6’2″ are considered ‘tall’. Except, I do feel compelled to point out that the choice of 6’2″ as a cut-off is utterly arbitrary” and then saying that therefore tallness isn’t a useful concept in real life.
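That’s also why a p-value is reported as a number rather than a verdict. A toy illustration (mine, not Briffa’s, and nothing to do with his road-crossing example): suppose a coin lands heads 9 times out of 10 flips, and you want to know how surprising that would be if the coin were actually fair. A stdlib-only Python sketch:

```python
from math import comb

def binom_p_value(heads, flips):
    """Two-sided p-value: the probability of a result at least this far
    from 50/50 if the coin is actually fair (the null hypothesis)."""
    extreme = min(heads, flips - heads)
    tail = sum(comb(flips, k) for k in range(extreme + 1)) / 2 ** flips
    return min(1.0, 2 * tail)

print(binom_p_value(9, 10))  # ~0.021: below 0.05, above 0.001
print(binom_p_value(6, 10))  # ~0.754: entirely unremarkable
```

The number itself carries the information; whether 0.021 clears your threshold depends entirely on where you chose to draw the line, which is the (well-known) point Briffa is making such a fuss about.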
One commenter said:
yes totally agree
‘science’.. statistics.. are black and white
life is grey, a vague misty kind of grey
so it’s clear that his plan is working. Medical science is anything but “black and white”. It’s all about measuring and weighing risks and benefits. About p-values and error bars. Confidence intervals and placebo effects.
There ends Briffa’s explanation of p-values. He hasn’t bothered to explain what they actually are and doesn’t plan to. (In fact, a p-value is roughly the probability of getting a result at least as impressive as yours if there’s no real effect there at all: a high p-value means your result is unsurprising under pure chance, so there’s really no reason to think your hypothesis is true.) Instead, he charges straight into a badly thought-out analogy:
Let’s imagine someone decided to do a big study on road safety. Let’s say they counted up the number of times someone, somewhere, crossed the road. And now, let’s imagine, they also count up the number of times someone gets run over (and hurt or killed) as a result of crossing the road. Now, I’m writing this on a plane and can’t even check if these statistics exist. But I think it’s reasonable to assume, that compared to the total number of road crossings, the number of people being knocked down is likely to be very small indeed.
Now imagine we applied some statistical ‘wizardry’ to this (with that arbitrary P value, remember) It’s not too difficult to imagine that one would turn up a result which shows: ‘crossing the road is not associated with a statistically significant increased risk of getting run over.’ Now, many doctors and scientists would interpret this finding as evidence that crossing the road is ‘safe’. However, we all know that while most of the time it is, sometimes it’s not.
He seems to be trying to implicitly conflate the probability of an event occurring with statistical significance. He’s implying that because accidents are rare, the results of a study will be statistically insignificant. In fact, if a prospective study looked at people crossing the road (the test group), and the same number of other people sitting at home for the same length of time (the controls), then if even six of the test subjects got run over your p-value would be less than 0.05 (assuming the controls all survived). If eleven of the test group were hit by cars then you’d get p<0.001 level significance. You can test for very unlikely events and get good p-values; you just need a large enough sample. In this case, you need to look at enough people that six die.

That’s probably a much bigger number than is at all feasible, so in practice you’d do a retrospective case-control study instead. You’d locate a group of people who were killed by being hit by cars, find out how many of them were crossing the road at the time, and compare that to the proportion of the general population crossing the road at any given time. I suspect you’d find a big (and statistically significant) difference there. That’s how we test for rare events. His analogy proves nothing at all.
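As a sanity check on those numbers, here’s a rough sketch. The group size of 100,000 is an arbitrary assumption for illustration (the exact value barely matters once it’s large); the calculation is the one-sided Fisher exact p-value for six or eleven deaths among the crossers versus none among the controls:

```python
from math import comb

def fisher_one_sided_p(a, b, n):
    """One-sided Fisher exact p-value: given a+b total events split
    between two equal groups of size n, the probability (under the null
    of no difference) that at least `a` of them land in the first group.
    This is a hypergeometric tail sum."""
    total = a + b
    tail = sum(comb(n, k) * comb(n, total - k) for k in range(a, total + 1))
    return tail / comb(2 * n, total)

n = 100_000  # assumed group size, for illustration only
print(fisher_one_sided_p(6, 0, n))   # ~0.0156  -> p < 0.05
print(fisher_one_sided_p(11, 0, n))  # ~0.00049 -> p < 0.001
```

Intuitively: conditional on six deaths having happened among people who, under the null, are all at equal risk, the chance that all six fall in the crossing group is about (1/2)⁶ ≈ 0.016, which is why six deaths and no control deaths already clears the 0.05 bar.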
That said, he’s right that one could do a study that shows no link. That’s pretty easy: we all know that crossing the road actually is very safe considering how many times most people do it. The point is that that study would not say “crossing the road isn’t linked to death by car accident”; it would say “we have not found a link between crossing the road and death in car accidents”. It certainly wouldn’t prove that no such link exists, and no competent scientist would ever claim it did.
The point that he is attempting to prove by this is as follows:
An example of where statistical significance appears to have got in the way of a constructive debate on the subject is vaccination. Our Government here in the UK, most doctors (I suspect) and many commentators would have us believe that vaccination, including the measles, mumps and rubella vaccination (MMR) is ‘safe’. Many will not even entertain the thought that there may be a problem with MMR.
That’s totally untrue. Of course injecting pathogens into people has associated risks. Nobody with any relevant knowledge is claiming otherwise. The claim is simply that the risks are tiny compared to the benefits, and that autism isn’t one of the risks.
Science hasn’t proven that there’s no risk. Science can’t do that: you can’t prove a negative. It’s possible that there’s exactly one person in the world whose body is set up in such a way that the MMR jab would cause them to become autistic. In that case, there would be a risk, but it would be impossible to detect it unless (indeed, even if) your study contained that one person (in which case the risk would go away when the study was done anyway). So we’re stuck? No, not really.
We know that all the ‘evidence’ that MMR causes autism is, for want of a better word, shit. It’s rubbish. It is insignificant. We also know that a number of better studies have found no risk at all. So yes, it’s still theoretically possible, but why should it be true? Crossing the road causes car accidents? That figures. We don’t need epidemiology to convince ourselves that that’s pretty likely. But “MMR causes autism”? Why not “MMR causes tallness” or “MMR causes hair loss” or “hats cause kidney stones”? None of those have been absolutely disproven either. There’s absolutely no reason for it to be true, except that many people irrationally believe it to be true, and their beliefs can all be traced back to a load of scary stories published together in a right-wing, anti-scientific propaganda tome. And in Dr Briffa, it even has people standing up for it by harping on about “the limitations of science”. (He has form on this.)
What we have here, then, is a religion. Classic case.