An Ode to Statistics – Significance

In my, albeit biased, opinion, statistics is one of the most important fields in any physical, biological or social science. It forms the backbone of accurate, replicable and logical science, and allows us to clear the mist surrounding the workings of our world. For me, the power of stats is not in the complex formulae and theorems that are at the fringe of the subject, but rather in the simple statistical concepts. The concepts that are often taught in any university science degree, but seldom taught in an accessible and applicable way.

As a result, I wanted to dedicate a couple of posts to some of the simplest, yet most important and pervasive, concepts in statistics. In my personal experience, statistics is not taught in a pupil-friendly way. Statistics evokes an image of complex mathematical equations and boring applications that immediately put the learner on the back foot. But I don’t believe that’s what statistics should be. Stats (at an undergraduate or secondary school level) should be about applicability and about understanding, not learning how to carry out an ANOVA by hand. So that’s what we’re going to focus on: simplicity and applicability.

The very first concept I learnt in my psychology statistics modules, and perhaps the most important concept in current scientific research, is the idea of significance. “Significant” in common parlance simply means “something memorable, or of note or interest”, but in science, “significance” holds a very specific meaning. In statistics and science, if a difference is “significant”, it means that we are relatively confident that the observed difference is due to a real effect. For example, imagine someone claimed that shorter people live longer than tall people. If the difference were significant, we would be relatively sure that this difference in lifespan was not due to chance.

Let’s demonstrate with an example. Imagine one of your friends claimed that they could predict the outcome of a coin flip. You don’t believe them, so, naturally, you test it out. Your friend makes 10 predictions, you flip the coin 10 times, and you compare the results. This is what you get.

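[Table: your friend’s prediction and the actual result for each of the 10 flips, numbered 1 to 10. 8 of the 10 predictions were correct.]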

Is this enough to say that your friend can predict the future? Is 8 out of 10 significant? To find this out, let’s first figure out how many we would have expected your friend to get right just by chance. We know that she has a 50% chance of being right on any given flip. That means that, on average, we would expect her to get around 5 out of 10. To demonstrate this, if we simulate the 10 coin flips (and the 10 predictions) 10,000 times, these are the numbers of correct predictions we observe:

[Figure: distribution of the number of correct predictions across the 10,000 simulated runs]

As you can see, the most frequent number of correct predictions is 5, and as the number of correct predictions increases or decreases, the probability decreases*.
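If you would like to reproduce that simulation yourself, here is a minimal Python sketch of one way to do it. It is not the code behind the original graph: the function name, the fixed seed and the text-based bar chart are all my own assumptions, chosen only to keep the example short and repeatable.

```python
import random
from collections import Counter


def simulate_correct_counts(n_experiments=10_000, n_flips=10, seed=0):
    """Count how often each number of correct predictions occurs.

    Hypothetical helper: every experiment is 10 random predictions
    compared against 10 random coin flips.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter()
    for _ in range(n_experiments):
        # A prediction is correct when it matches the flip.
        correct = sum(
            rng.choice("HT") == rng.choice("HT") for _ in range(n_flips)
        )
        counts[correct] += 1
    return counts


counts = simulate_correct_counts()
for k in range(11):
    # One '#' per 50 experiments gives a rough sideways bar chart.
    print(f"{k:2d} correct: {counts[k]:5d}  {'#' * (counts[k] // 50)}")
```

The exact counts will shift a little with a different seed, but the overall shape, a peak at 5 that falls away on both sides, should stay the same.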

But your friend got more than 5 correct: she got 8. According to our graph, this is quite unlikely. So let’s calculate the probability that she would get exactly 8 right by chance:
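One way to carry out that calculation is with the binomial formula: count the number of ways to get 8 of the 10 flips right, then multiply by the chance of any one such sequence. The Python sketch below does this under the assumption of a fair coin and pure guessing; the function names are my own. It also prints the “at least 8” probability mentioned in the footnote, which is the figure formal tests usually report as the p value and which is a little larger.

```python
from math import comb


def prob_exactly(k, n=10, p=0.5):
    """Binomial probability of exactly k correct guesses out of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least(k, n=10, p=0.5):
    """Probability of k or more correct guesses out of n."""
    return sum(prob_exactly(i, n, p) for i in range(k, n + 1))


p_exact_8 = prob_exactly(8)
p_at_least_8 = prob_at_least(8)

print(f"exactly 8 of 10:  {p_exact_8:.2%}")     # prints 4.39%
print(f"at least 8 of 10: {p_at_least_8:.2%}")  # prints 5.47%
```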

Of the 1,024 equally likely ways that 10 guesses can turn out right or wrong, 45 contain exactly 8 correct guesses. That means that we would only expect someone to correctly guess exactly 8 out of the 10 flips completely by chance about 4.4% of the time (45 in 1,024)!

Because the chance of someone guessing that many just by chance is so small, we are relatively sure these results are significant; in other words, we are fairly sure that the results did not occur by chance!

But the conclusions we can draw from this are limited. These results, for example, do not prove that your friend can predict the future. In other words, the fact that we would only expect to see these results about 4.4% of the time if it were down to chance does not mean that we are 95.6% sure that our hypothesis (that your friend can predict the future) is true. There are a number of other explanations for why we’ve seen these results. Therefore, the only information this probability (the p value) gives us is how likely the observation is to occur simply by chance.

As you could imagine, people have very different ideas of what the cut-off point for significance should be. In modern science, we use a p value cut-off of 0.05, or 1 in 20, with anything less likely to occur by chance than that being counted as significant. Applying this to our coin flip example, any number of correct guesses above 7 (that is, 8, 9 or 10) would constitute significant results.
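To see where that cut-off falls in the coin flip example, here is a small Python sketch that follows this post’s simplification of comparing the chance of each exact number of correct guesses against 0.05. The constant and function names are my own, and it again assumes a fair coin and pure guessing.

```python
from math import comb

ALPHA = 0.05   # the conventional cut-off, 1 in 20
N_FLIPS = 10


def prob_exactly(k, n=N_FLIPS):
    """Chance of exactly k correct guesses out of n fair coin flips."""
    return comb(n, k) / 2**n


# Only look at scores better than the 5 we expect by chance.
significant_counts = []
for k in range(6, N_FLIPS + 1):
    p = prob_exactly(k)
    if p < ALPHA:
        significant_counts.append(k)
    print(f"{k:2d} correct: {p:.4f}  {'significant' if p < ALPHA else ''}")

print("Counts below the cut-off:", significant_counts)  # [8, 9, 10]
```

Changing `ALPHA` is an easy way to see how a stricter or looser cut-off moves the line.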

In truth, however, this is really just an arbitrary number that the scientific community has settled on for the sake of progress. But an analysis of whether 0.05 is an effective cut-off point is for another post.

So there we have it. That is the simple concept of significance. It just means results that we would expect to see less than 5% of the time if only chance were at work.

Of course, this isn’t the entirety of what significance is in science, but that’s not what we’re going for. We’re aiming for an applicable and useful understanding: a basic level of understanding without the requirement of qualifiers or complex formulae. And I hope this has provided you with that. Next time, we’ll look at some of the problems with significance, and power.

*The reason that the probability also decreases with fewer correct answers is that we’re counting the exact number of correct predictions (only 1 out of 10, or only 2 out of 10), not “at least that many correct predictions”, in which case the graph would look very different.