Field of Science

Statistics ≠ Checking Your Brain at the Door

As part of an endeavor to improve undergraduate writing, I was involved in a day-long session of reading senior-level writing assignments. Basically, a group of us had a list of criteria and marked each assignment as meeting or not meeting them. We were not grading or assessing the worth of the assignments, simply noting whether a given criteria was met or not. I learned a few things during the 7 hours of reading 16 assignments (~20-30 pages each), one of which I want to touch on here.


Now, I am not a statistician, nor do I have any real expertise in statistical analysis. In fact, I turn to statisticians when I need statistical analyses beyond Student's t-tests or analysis of variance. However, I think I know enough about statistics not to make the error of the p-value.

The p-value essentially tells you the probability that some event, data, occurrence is due to chance (generally referred to as the null hypothesis). So if you are hoping that the effect you are looking at is not due to chance, you want a small p-value. The question, of course, then becomes 'how small?' The scientific community has generally agreed that a p-value of < 0.05 is a rigorous cut-off. A p-value > 0.05 is considered reasonable odds that your effect may be due to chance. However, this cut-off of 0.05 is arbitrary, and indeed higher and lower cut-offs are used in some fields.

To be clear, p-values can range from 0 to 1, so you can consider a p-value of 0.05 to be analogous to a 5% chance that the effect you are looking at is due to chance. That also means that a p-value of 0.06 means there is a 6% chance your data are due to chance, and that is too high for most scientists to consider your data significant.
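One concrete way to see what the 0.05 cut-off controls is a quick simulation (a sketch, not any particular study's analysis): generate pure-noise data sets where the null hypothesis is true by construction, run a simple known-variance z-test on each, and count how often p dips below 0.05. The test function and the sample size of 30 here are my own illustrative choices.

```python
import math
import random

def z_test_p(sample, mu=0.0, sigma=1.0):
    """Two-sided p-value for H0: population mean == mu,
    assuming a known standard deviation `sigma` (a simple z-test)."""
    n = len(sample)
    z = (sum(sample) / n - mu) / (sigma / math.sqrt(n))
    # two-sided tail probability of the standard normal: P(|Z| >= |z|)
    return math.erfc(abs(z) / math.sqrt(2))

random.seed(42)
trials = 10_000
# Every data set is drawn from N(0, 1), so the null hypothesis is true.
false_positives = sum(
    z_test_p([random.gauss(0, 1) for _ in range(30)]) < 0.05
    for _ in range(trials)
)
print(false_positives / trials)  # hovers near 0.05
```

When the null is true, roughly 5% of tests still come back "significant" at the 0.05 level; that is what the cut-off actually guarantees.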

xkcd's take
Now we come to the error of the p-value. By way of example, you should check out xkcd's acute take on the problem (well, several problems, but the one we care about is central). If we look at a lot of different data sets under a given condition, then we should expect a data set to show a p-value < 0.05 strictly due to chance on average 1/20 times (5%). This does not mean that we should discount a result that comes with a p-value of 0.05. It means there is only a 5% chance the result is due to chance. However, if there is additional data to back this result up, we can increase our confidence even more. If there is not, hopefully the scientists (aka senior undergraduate students) will at least acknowledge the limitation of testing many data points. Sadly, both of these were lacking in a couple of cases that I observed, although I admit I do not know whether this represents a statistically significant (p < 0.05) result.
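The multiple-comparisons problem in the xkcd strip comes down to one line of arithmetic: if each of k independent tests on pure-noise data has a 5% false-positive rate, the chance that at least one of them comes up "significant" is 1 − 0.95^k. A minimal sketch:

```python
# Family-wise false-positive rate for k independent tests at alpha = 0.05:
# P(at least one p < 0.05 | all nulls true) = 1 - (1 - 0.05)**k
for k in (1, 5, 20, 100):
    print(k, round(1 - 0.95**k, 4))
```

With 20 tests (one per jelly-bean color) the chance of at least one spurious "significant" result is already about 64%, which is why an unacknowledged pile of comparisons undermines a lone p < 0.05.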


Anonymous said...

I think this highlights the point that at least elementary statistics should be a requirement in the high school or undergraduate science curriculum. Biology is advancing rapidly enough that statistics (oftentimes more advanced than the elementary) is a requirement, I feel.

Also, I find that biologists' misconceptions about the relevance and limitations of the p-value by itself are quite widespread, but you explained it well. More sample size = more statistical power.
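The "more sample size = more statistical power" point can be illustrated with a small simulation (my own sketch, with an assumed true effect of 0.3 standard deviations and a simple known-variance z-test): draw data sets where the effect is real, and watch how often the test detects it at p < 0.05 as n grows.

```python
import math
import random

def z_test_p(sample, sigma=1.0):
    """Two-sided z-test p-value for H0: population mean == 0,
    assuming the standard deviation `sigma` is known."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2))

random.seed(1)
effect = 0.3  # hypothetical true shift in the mean (an assumption for the demo)
powers = []
for n in (10, 50, 200):
    hits = sum(
        z_test_p([random.gauss(effect, 1) for _ in range(n)]) < 0.05
        for _ in range(2000)
    )
    powers.append(hits / 2000)
    print(n, powers[-1])
# power (the fraction of runs that detect the real effect) climbs with n
```

The same true effect that is nearly invisible at n = 10 is detected almost every time at n = 200.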

The Defective Brain said...

I've seen statistical errors at all levels of academia.
If I see another paper that has *p<0.05, **p<0.01, ***p<0.001 in the SAME figure, I will start frothing at the mouth. There can only be one p-value cut-off that's meaningful in any one experiment.

The Lorax said...

My personal pet peeve is when authors use the term 'very significant' or, in the case of those who know that 'very' is a useless word, 'highly significant'.

Anonymous said...

The singular of "criteria" is "criterion".

efrique said...

The p-value essentially tells you the probability that some event, data, occurrence is due to chance (generally referred to as the null hypothesis).

Well, no, I'm sorry, that's wrong.

The p-value is the probability of a result at least as unusual as the one you got *given* no difference. That's not the same thing as what you said at all, which basically has the conditioning the wrong way around.

What you said is what it would be handy to know - 'the probability that the null hypothesis is true given a result at least this big', but that's not what the p-value gives you.

[To convert from one to the other (swap the conditioning around), you would normally use Bayes' rule but in the case of hypothesis tests, you can't because you don't know the denominator.]
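The point about the conditioning running the wrong way can be made concrete with hypothetical numbers (the prior of 0.9 and power of 0.8 below are invented for illustration; with real hypothesis tests you do not know the prior, which is exactly the unknown-denominator problem):

```python
# P(result at least this extreme | H0 true) vs. P(H0 true | such a result).
# All three inputs are assumed values for the sake of the arithmetic.
p_h0 = 0.9    # assumed prior probability that the null is true
alpha = 0.05  # P(significant result | H0 true) -- the p-value cut-off
power = 0.8   # assumed P(significant result | H0 false)

# Bayes' rule, swapping the conditioning around:
p_h0_given_sig = (alpha * p_h0) / (alpha * p_h0 + power * (1 - p_h0))
print(p_h0_given_sig)  # 0.36
```

Even with a "significant" result at alpha = 0.05, the null here is still 36% likely to be true, because true effects were rare to begin with. The two conditional probabilities are simply not the same quantity.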