If you were to look back at my previous postings on this blog, you would find that I am often motivated by topics that arise from teaching on our Masters course in Medical Statistics. Well, a new cohort of students started recently and in the first couple of weeks we give them an overview of basic statistics to make sure that everyone is up to the same level. This year, as part of that introduction, one of my colleagues asked the students to discuss the fact that the psychology journal, Basic and Applied Social Psychology, has banned p-values from the articles that it publishes.
This is a Bayesian blog so, as you might expect, I have a great deal of sympathy with anyone who has a problem with p-values, but is a ban the right reaction?
Here is a question for you. What have the following in common?
(a) the charge of the Light Brigade
(b) the election of Jeremy Corbyn to lead the UK Labour party
(c) the banning of p-values by Basic and Applied Social Psychology
My answer is that all three are admirable because they were inspired by sincerely held, well-founded beliefs, but in each case it was clear from the outset that the venture was going to end in tears.
Let me steer clear of history and politics and concentrate on the statistics.
Anyone who knows anything about the fundamentals of statistics will know that the logic of null hypotheses and p-values has very little of relevance to say about scientific investigation. This is not the place to repeat those arguments but there are so many well-known problems with p-values that they definitely deserve a place in the rubbish bin of history.
If that were not enough, we can add to the list of problems with p-values that so many scientists misunderstand and misuse them. In my experience, few people outside of the specialist statistics community have any real understanding of what a p-value actually is. So most people use p-values as a black-box method, and even then there are problems because, due to a historical accident, conclusions are usually based on whether p<0.05. If we were to reset the threshold today, we would surely choose a stricter cut-off.
So, good for Basic and Applied Social Psychology, they have done the scientific world a great favour and I’m sure that people will look back on the banning of p-values as an important step forward.
The journal’s ban extends to confidence intervals, as logically it must once p-values have been dropped, and the editors even express doubts about Bayesian methods, so they end up with something that is very close to an attack on statistics, masquerading as an attack on p-values.
In fact, their doubts about Bayesian statistics are, in themselves, quite interesting. The editors attack the Laplacian assumption that ignorance can be expressed by equal probabilities. This is effectively an attack on the adoption of flat priors, which is so common in poor quality Bayesian analyses. So I am even in sympathy with those views.
Where the ban turns into glorious farce is that the editors have so little that is positive to offer as an alternative.
For instance, they say “we encourage the use of larger sample sizes than is typical in much psychological research”. I would certainly give my students a hard time if they said something like that. How large is large? The statement means nothing unless you have a method for defining the necessary sample size. And how do you do that without a calculation that comes very close to depending on the standard error, which in turn is linked to the p-value?
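To make the point concrete, here is a minimal sketch of the textbook sample-size calculation for comparing two group means, using the normal approximation. The function name, the standard deviation of 10 and the target difference of 5 are illustrative assumptions of mine, not values taken from the journal. Notice that the significance level, in other words a p-value threshold, enters the formula directly.

```python
import math
from statistics import NormalDist


def n_per_group(sd, delta, alpha=0.05, power=0.8):
    """Approximate sample size per group for a two-sided comparison of two means.

    Normal-approximation formula:
        n = 2 * (z_{1 - alpha/2} + z_{power})^2 * sd^2 / delta^2
    sd    : assumed common standard deviation of the scores
    delta : smallest difference in means worth detecting
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # quantile for the significance level
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = 2 * (z_alpha + z_power) ** 2 * sd ** 2 / delta ** 2
    return math.ceil(n)


# Hypothetical inputs: SD of 10 points, difference of 5 points worth detecting
print(n_per_group(sd=10, delta=5))
print(n_per_group(sd=10, delta=5, alpha=0.01))  # a stricter threshold needs more people
```

Whatever name you give it, "large" only acquires a meaning once you have specified the variability, the size of effect that matters and how sure you want to be.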
The editors’ other requirement is for “strong descriptive statistics, including effect sizes”. So I do a “large” psychological experiment and I find a 6% difference between the average scores of men and women. What do you conclude? Without some consideration of sampling variation you can conclude nothing. So I tell you that men scored between 30 and 80 and women between 28 and 85. Descriptive but still not enough. Already you will probably be saying to yourself: a range of 50, so the standard deviation must have been just over 10, and if I knew the sample size I could calculate the standard error of the mean.
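That back-of-the-envelope reasoning can be sketched in a few lines. This is only a rough illustration: the divisor of 5 used to turn a range into a standard deviation is my assumption (rules of thumb vary from about 4 to 6 depending on sample size), the function names are made up for the example, and the sample sizes are hypothetical because the example never states them.

```python
import math


def sd_from_range(low, high, divisor=5.0):
    """Crude guess at a standard deviation from the observed range.

    The divisor is an assumption; values of roughly 4 to 6 are commonly used.
    """
    return (high - low) / divisor


def se_of_difference(sd_a, sd_b, n_a, n_b):
    """Standard error of the difference between two independent sample means."""
    return math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)


sd_men = sd_from_range(30, 80)    # range of 50
sd_women = sd_from_range(28, 85)  # range of 57

# Hypothetical sample sizes per group
for n in (25, 100, 400):
    se = se_of_difference(sd_men, sd_women, n, n)
    print(f"n per group = {n}: standard error of the difference is about {se:.2f}")
```

The standard error halves each time the sample size is quadrupled, which is exactly the information a reader needs before deciding whether the observed difference is worth taking seriously.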
Drop p-values by all means but you cannot escape from a consideration of random variation.
Banning p-values and then leaving the authors without firm guidance is gloriously daft. It is a pity that the editors missed the opportunity to advocate subjective Bayesian analysis or pure likelihood analysis, but I am really pleased that they have taken a step in the right direction. Their journal will suffer, but science will gain.