Eston Martz and Carly Barry, 2013-02-24
It’s all too easy to make mistakes involving statistics. Software such as Minitab removes much of the difficulty surrounding statistical calculations, reducing the risk of mathematical errors. But correctly interpreting the results of an analysis can be even more challenging. No one knows that better than Minitab’s technical trainers, who spend most of the year traveling around the world to help people learn to analyze data for quality improvement. Based on that experience, they compiled a list of common statistical mistakes they encounter over and over again. Four of the most common involve drawing an incorrect conclusion from the results of an analysis. Have you ever made any of these?

Mistake 1: Not Distinguishing Between Statistical and Practical Significance

Using statistics, we can find a “statistically significant” difference that has no discernible effect in the real world. In other words, just because a difference exists doesn’t make it important. And you can waste a lot of time and money trying to “correct” a statistically significant difference that doesn’t matter.

Let’s say the Tastee-O’s cereal factory produces 18,000 boxes per shift, and the company randomly samples boxes to ensure they meet a target fill weight of 360 grams. The quality manager can use statistics to detect a shift of 0.06 grams in the mean fill weight 90% of the time. But just because that 0.06-gram shift is statistically significant doesn’t make it practically significant: a 0.06-gram difference probably amounts to two or three Tastee-O’s, not enough for the customer to notice and not likely to affect the total cost of materials.

In most hypothesis tests, we know that the null hypothesis is not exactly true. The quality manager doesn’t expect the mean fill weight to be precisely 360 grams; rather, he wants to know whether there is a meaningful difference between the actual mean fill weight and the target.
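The trainers’ point is easy to reproduce. The short simulation below (not from the article; the numbers are illustrative, loosely based on the Tastee-O’s example, and the 0.5-gram process standard deviation is an assumption) runs a one-sample test against the 360-gram target on a large sample whose true mean is shifted by just 0.06 grams. The p-value is tiny, yet the shift is practically meaningless:

```python
import math
import random

random.seed(7)

target = 360.0      # target fill weight in grams
true_mean = 360.06  # actual mean: a 0.06 g shift
sigma = 0.5         # assumed process standard deviation (illustrative)
n = 20_000

# Simulated fill weights for a large random sample of boxes
weights = [random.gauss(true_mean, sigma) for _ in range(n)]

mean = sum(weights) / n
var = sum((w - mean) ** 2 for w in weights) / (n - 1)
se = math.sqrt(var / n)

# Test statistic for H0: mean = target; with n this large the
# normal approximation to the t distribution is fine
z = (mean - target) / se
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

print(f"observed shift: {mean - target:.3f} g")
print(f"p-value:        {p_value:.2e}")
```

The test screams “significant,” but the decision about whether a few hundredths of a gram matters is a business judgment, not a statistical one.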
Instead of a hypothesis test, he could use a confidence interval to see the range of likely values for the mean fill weight of the cereal boxes produced, and then decide whether action is needed.

Mistake 2: “Proving” the Null Hypothesis

In a hypothesis test, you pose a null hypothesis (H0) and an alternative hypothesis (H1), then collect data and use statistics to assess the evidence against H0. The key statistic is the p-value: the probability of getting results at least as extreme as those observed if the null hypothesis is true. For many experiments, if that probability is less than 5% (a p-value below 0.05), statisticians will reject the null. Hence the saying, “If the p-value is low, the null must go.”

But a p-value greater than 0.05 doesn’t mean you’ve proven the null hypothesis; it only means you don’t have enough evidence to reject it. The null may or may not be true. It’s like “innocent until proven guilty” in court: the data analyst is the judge, the hypothesis test is the trial, and the null hypothesis is the defendant. If the prosecution doesn’t prove the defendant’s guilt, that doesn’t make the defendant innocent; perhaps the prosecution simply didn’t collect enough evidence (data) to prove guilt. That’s why the verdict is “not guilty” rather than “innocent.” Similarly, a p-value greater than 0.05 does not make the null hypothesis true, so we can only say we “fail to reject” it.

Mistake 3: Thinking Correlation = Causation

Correlation is an association between two variables. It’s tempting to observe a linear relationship between two variables and conclude that a change in one causes a change in the other, but that’s not necessarily so. For example, data analysis has shown a strong correlation between ice cream sales and murder rates: when ice cream sales are low, the murder rate is low, and when ice cream sales are high, the murder rate is high. But it’s silly to conclude that ice cream sales lead to murder.
Look deeper, and you find that in summer months both are high, while in winter both are low. The data (above left) suggest not that the murder rate and ice cream sales affect each other, but rather that both are affected by another factor: the weather. If you’ve ever misinterpreted the significance of a correlation between variables, you’ve got company: news stories that equate correlation and causation abound, especially when it comes to the effects of diet, exercise, and other factors on our health!

Mistake 4: Misinterpreting Overlapping Confidence Intervals

When comparing multiple means, quality practitioners are sometimes advised to look at the confidence intervals and check whether they overlap. When the 95% confidence intervals for the means of two independent populations don’t overlap, there will indeed be a statistically significant difference between the means (at the 0.05 level of significance). However, the opposite is not necessarily true: even if confidence intervals overlap, there may still be a statistically significant difference between the means. The graph (above right) shows considerable overlap in the confidence intervals, but what is the t-test p-value? In this case, the p-value is less than 0.05, telling us that there is a statistically significant difference between the means.

A Good Way to Avoid Statistical Mistakes

For most quality improvement professionals, data analysis is one of many responsibilities, not necessarily a daily activity. But making a mistake or misinterpreting your results is even easier if you’re not performing data analysis consistently. If it’s been a while since your last round of data analysis, check out the free trial of Minitab 16 Statistical Software (www.minitab16.com). It can help you with basic graphical analysis, capability analysis, measurement system analysis, hypothesis tests, regression, and control charts.
Minitab’s Assistant helps you choose the right tool and walks you through your analysis step-by-step. The Assistant even provides interpretation of your output and comprehensive reports you can use to easily and effectively present your results. You can be confident you’re analyzing your data appropriately with Minitab Statistical Software.
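As a closing illustration of Mistake 4, the arithmetic below (illustrative summary statistics, not the article’s data; any statistics package would give the same answer) shows two 95% confidence intervals that overlap even though the test on the difference in means is significant at the 0.05 level:

```python
import math

# Illustrative summary statistics for two independent groups
mean_a, se_a = 10.0, 1.0
mean_b, se_b = 13.0, 1.0

# 95% confidence interval for each mean (normal approximation)
ci_a = (mean_a - 1.96 * se_a, mean_a + 1.96 * se_a)  # (8.04, 11.96)
ci_b = (mean_b - 1.96 * se_b, mean_b + 1.96 * se_b)  # (11.04, 14.96)
overlap = ci_a[1] > ci_b[0] and ci_b[1] > ci_a[0]    # True: they overlap

# Test on the difference in means: its standard error is
# sqrt(se_a^2 + se_b^2), smaller than the sum of the half-widths
z = (mean_b - mean_a) / math.sqrt(se_a**2 + se_b**2)
p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

print(f"intervals overlap: {overlap}, p-value: {p:.3f}")
```

The intervals overlap because each one is built from a single group’s standard error, while the test of the difference pools the two; that is exactly why overlap alone cannot settle the comparison.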
Published by Quality Magazine.