Significance Testing: What Happens If We Are Wrong?

What happens if we are wrong? It is a critical question in risk assessment. A wrong decision has implications, sometimes small and inconsequential, sometimes large and disastrous. A wise decision maker always considers the implications of making the wrong choice and factors them into the final decision.

What happens if we are wrong? As market researchers we should be asking this question every time we conduct a statistical significance test. Significance testing is based on probability. It tells you how likely you would be to see a difference as large as the one you observed from sampling and other sources of random error alone, as opposed to a real treatment effect. You can be wrong in two ways. You can decide that the effect is real when the difference is actually due to random error (a false positive), or you can decide the observed difference is due to random error when the treatment actually had an effect (a missed effect, or false negative).
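For readers who want to see those two errors in action, here is a minimal simulation sketch in Python. The effect size, group size, and confidence level are made-up illustration values, not figures from any real study.

```python
# Illustrative sketch only: the two ways a significance test can be wrong.
# All inputs (effect size, group size, confidence level) are made-up numbers.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_per_group, alpha, n_sims = 100, 0.05, 5000

def rejection_rate(true_effect):
    """Share of simulated studies in which the t-test comes back 'significant'."""
    rejections = 0
    for _ in range(n_sims):
        control = rng.normal(0.0, 1.0, n_per_group)
        test = rng.normal(true_effect, 1.0, n_per_group)
        _, p_value = stats.ttest_ind(test, control)
        if p_value < alpha:
            rejections += 1
    return rejections / n_sims

# No real effect: every rejection is a false positive (happens roughly alpha of the time).
print("False positives with no real effect:", rejection_rate(true_effect=0.0))
# Real but small effect: every non-rejection is a missed effect (false negative).
print("Detection rate for a small real effect:", rejection_rate(true_effect=0.2))
```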

The dual threat of “getting it wrong” can be demonstrated with a simple example. Imagine that a major medical journal has just published an article showing that your key competitor’s mechanism of action adversely impacts patient mortality. Your company’s drug is in the same class but has a different mechanism and was not implicated in the article. Marketing asks you to assess the impact of the article on your brand. If it is adversely impacted, marketing is prepared to invest in a campaign to counteract the article’s impact. If you decide that the article will not impact your brand but it actually does, your company could lose a lot of money as brand share erodes. Alternatively, if you decide that the article will hurt share and it does not, your company will waste money investing in an unnecessary campaign.

When faced with situations like these, many researchers focus on increasing sample size to help ensure they get the right answer. Increasing sample size makes results more precise and increases the power to detect an effect if it exists. However, increasing sample size can be very costly and may not be practical due to budget or sample availability.
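To get a feel for how costly that can be, here is a rough power-calculation sketch using the statsmodels library. The small 0.2 effect size and the power targets are illustrative assumptions, not figures from the example above.

```python
# Rough sketch: respondents needed per group to detect a small effect
# (Cohen's d = 0.2) at 95% confidence. Effect size and power targets are
# illustrative assumptions, not figures from any real study.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for target_power in (0.70, 0.80, 0.90):
    n_per_group = analysis.solve_power(effect_size=0.2, alpha=0.05,
                                       power=target_power,
                                       alternative='two-sided')
    print(f"{target_power:.0%} power: about {n_per_group:.0f} respondents per group")
```

Even modest gains in power can require hundreds of additional respondents per group when the effect you are trying to detect is small.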

An alternative, or complementary, approach is to adjust the level of confidence in your significance test to reflect the risk. Requiring 95% confidence is a substantial hurdle for accepting that an effect exists. It is used in medicine and theoretical research because the downside of a false positive typically far outweighs the downside of missing a real effect. Selecting a lower level of confidence will make you more likely to conclude an effect exists when it does not, but you will be less likely to miss a real effect if it exists. In situations where the cost of missing an effect is greater than the cost of a false positive, this is a simple and inexpensive way to reduce your risk.
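The sketch below puts some illustrative numbers on that trade-off, again with an assumed small effect and a fixed sample of 200 respondents per group.

```python
# Rough sketch: at a fixed sample size, relaxing the required confidence
# (raising alpha) increases the chance of detecting a real effect.
# The 0.2 effect size and 200 respondents per group are assumed for illustration.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for confidence in (0.95, 0.90, 0.80):
    power = analysis.solve_power(effect_size=0.2, nobs1=200,
                                 alpha=1 - confidence,
                                 alternative='two-sided')
    print(f"{confidence:.0%} confidence: {power:.0%} chance of detecting the effect")
```

In this illustration, dropping from 95% to 90% confidence buys a noticeably better chance of catching a real effect without adding a single respondent; the cost is accepting a somewhat higher false-positive risk.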

Lowering the threshold for deciding a result is “significant” is outside the comfort zone of many researchers. If you recommend it, expect some strong and vocal opposition. However, our job as researchers is to provide information that promotes effective decision making. Helping decision makers appreciate the risk inherent in statistical analyses and its potential impact on outcomes is a critical part of this job.

Would you like to hear more about what you should be asking yourself when conducting a statistical significance test? Feel free to contact me at bduncan@thinkrga.com.