Level of Significance
A predetermined level of significance provides the threshold against which the null hypothesis is either rejected or not rejected. The significance level most widely used in academic research is 0.05, which is often reported as ‘α = 0.05’ (or, as a threshold on the p-value, ‘p < 0.05’). The null hypothesis is rejected in favour of the alternative hypothesis if the calculated p-value is less than the predetermined level of significance. For instance, if you were to analyse a set of data on reaction times following caffeine consumption and obtained a significance value of p = 0.03, you could reject the null hypothesis in favour of the alternative hypothesis, provided that all assumptions of the statistical model were met. This is because the smaller the p-value, the greater the statistical incompatibility of the data with the null hypothesis. In other words, the smaller the p-value, the more unusual the data would be if every single assumption were correct.
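The decision rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the reaction times are made up, the function names `permutation_p_value` and `decide` are hypothetical, and an exact permutation test on the difference in means is used as a stand-in for whichever statistical model a real study would justify.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))

    as_extreme = 0
    n_splits = 0
    # Enumerate every way of relabelling the pooled data into two groups.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            as_extreme += 1
        n_splits += 1
    return as_extreme / n_splits


def decide(p_value, alpha=0.05):
    """Apply the predetermined significance level to a p-value."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Hypothetical reaction times in milliseconds (invented for illustration).
caffeine = [245, 250, 238, 242, 255, 240]
placebo = [262, 258, 270, 249, 266, 259]

p = permutation_p_value(caffeine, placebo)
print(f"p = {p:.4f} -> {decide(p)}")
```

Note that `decide` returns "fail to reject H0" rather than "accept H0": a p-value at or above α does not show that the null hypothesis is true.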
There is a common misconception that lower p-values indicate a stronger treatment effect than higher p-values. For example, a p-value of 0.01 is often interpreted as showing a stronger treatment effect than a p-value of 0.05. Whilst a smaller p-value does indicate greater incompatibility with the null hypothesis if we can be certain that every assumption was met, it does not tell us which assumption, if any, is incorrect. For example, the p-value may be very small because the targeted hypothesis is indeed false; however, it may instead be very small because the study protocols were violated. As a result, the p-value tells us nothing specifically related to the hypothesis unless we are absolutely positive that every other assumption used for its computation is correct. In other words, a lower p-value is not synonymous with importance. Therefore, we must take caution when rejecting or failing to reject the null hypothesis, and a rejection should not be taken as proof that the alternative is indeed valid.
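One way to see that a lower p-value is not synonymous with a stronger effect is to hold the effect size fixed and change only the sample size. The sketch below is an illustration, not a method taken from the text: it assumes a two-sample z-test with equal group sizes and a known common standard deviation, and the function name `two_sample_z_p` and the chosen numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_p(standardised_diff, n_per_group):
    """Two-sided p-value of a two-sample z-test with equal group sizes.

    standardised_diff is the mean difference divided by the (assumed known)
    common standard deviation.
    """
    z = abs(standardised_diff) * sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(z))


# The same small standardised effect (0.2) at two sample sizes.
p_small_n = two_sample_z_p(0.2, 50)
p_large_n = two_sample_z_p(0.2, 500)

print(f"n = 50 per group:  p = {p_small_n:.4f}")
print(f"n = 500 per group: p = {p_large_n:.4f}")
```

Under these assumptions the effect is identical in both cases, yet only the larger study falls below 0.05, so the p-value alone cannot be read as a measure of how large or important an effect is.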
Although the p-value is a widely used statistical measure, sole reliance on statistical significance and its misinterpretation have led to considerable misuse of the statistic, and as a result some scientific journals discourage the use of p-values. For instance, null hypothesis significance testing (NHST) and p-values should not lead us to think that conclusions can be reduced to a simple, dichotomous decision (i.e. reject vs not reject). A conclusion does not simply become “true” on one side of the divide and “false” on the other. In fact, many contextual factors (e.g. study design, data collection, the validity of assumptions, and research judgement) should contribute to scientific inference, rather than statistical significance alone [9,12]. Despite these criticisms, the recommendation is not that clinical researchers discard significance testing, but rather that they incorporate additional information to supplement their findings. That said, it is important that statistical significance is interpreted correctly to avoid further misuse.
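As one possible form of such additional information, an effect size and a confidence interval can be reported alongside the p-value. The sketch below is illustrative only: the data are invented, the helper names `cohens_d` and `mean_diff_ci` are hypothetical, and the interval uses a normal approximation, which is rough for samples this small (a t-based interval would usually be preferred in practice).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def cohens_d(a, b):
    """Standardised mean difference using the pooled standard deviation."""
    pooled_var = (
        (len(a) - 1) * stdev(a) ** 2 + (len(b) - 1) * stdev(b) ** 2
    ) / (len(a) + len(b) - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def mean_diff_ci(a, b, level=0.95):
    """Normal-approximation confidence interval for the mean difference."""
    diff = mean(a) - mean(b)
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * se, diff + z * se


# Hypothetical reaction times in milliseconds (invented for illustration).
caffeine = [245, 250, 238, 242, 255, 240]
placebo = [262, 258, 270, 249, 266, 259]

d = cohens_d(caffeine, placebo)
low, high = mean_diff_ci(caffeine, placebo)
print(f"Cohen's d = {d:.2f}")
print(f"95% CI for the mean difference: ({low:.1f}, {high:.1f}) ms")
```

Reporting the size of the difference and its uncertainty gives readers more to work with than a reject or not-reject verdict on its own.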