Statistics: Hypothesis Testing and Correlation

Statistics Aids Business Decision-Making

Introduction

Several statistical methods can aid business decision-making. This article concentrates on two of them: hypothesis testing and correlation.

Hypothesis Testing

Hypothesis testing is a method for drawing conclusions about a population from data obtained from a sample. It is therefore classified under inferential statistics.

A hypothesis is a statement or claim regarding a characteristic of one or more populations that can be evaluated with a statistical test. The null hypothesis (Ho) is the statement or claim that will be tested, usually a statement of no change or no difference. The alternative hypothesis (Ha) is the claim that contradicts the null hypothesis. Hypothesis testing is the statistical procedure used to decide between them. A test does not prove either hypothesis; it only indicates whether the sample provides enough evidence to reject Ho. There are four steps to hypothesis testing:

1) Formulate the null hypothesis

2) Identify a test statistic that can be used to measure the truth of the null hypothesis

3) Determine the P-value (a small P-value is evidence against the null hypothesis)

4) Compare the P-value to an acceptable significance value (the level of significance is the probability of making a Type I error); if the P-value is at or below that value, Ho is rejected

(Weisstein, 1999)
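The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scenario (two versions of a sales page), the visitor counts, the function name two_proportion_z_test, and the 0.05 significance level are all hypothetical, and a two-proportion z-test is just one of many test statistics that could be chosen in step 2.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test of Ho: both groups share the same success rate."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled rate: the best estimate of the common rate if Ho is true.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    # Step 2: the test statistic.
    z = (p_a - p_b) / std_err
    # Step 3: two-sided P-value, the chance of a |z| at least this large if Ho holds.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Step 1: Ho says page A and page B convert visitors at the same rate.
# Hypothetical counts: 120 of 1000 visitors bought on page A, 160 of 1000 on page B.
alpha = 0.05  # assumed significance level
z, p_value = two_proportion_z_test(120, 1000, 160, 1000)

# Step 4: compare the P-value to the significance level.
reject_null = p_value < alpha
print(f"z = {z:.3f}, P-value = {p_value:.4f}, reject Ho: {reject_null}")
```

With these made-up counts the P-value falls below 0.05, so Ho would be rejected. Different data or a different significance level could lead to the opposite decision.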

There are two correct results that may come from hypothesis testing: Ho is rejected when Ha is true; and Ho is not rejected when Ho is true. There are also two incorrect results, known as errors, that may come from hypothesis testing: Ho is rejected when Ho is true, which is called a Type I error; and Ho is not rejected when Ha is true, which is called a Type II error (Sullivan III, p. 524).
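A small simulation can make the two error types concrete. The sketch below is an illustration, not a result from the cited sources: it assumes normally distributed data with a known standard deviation of 1, a sample size of 30, a significance level of 0.05, and a hypothetical true mean of 0.3 for the case where Ha is true. The function names are invented for the example.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def rejects_null(sample, mu0, sigma, alpha):
    """Two-sided z-test of Ho: population mean equals mu0 (sigma assumed known)."""
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def rejection_rate(true_mean, trials=2000, n=30, alpha=0.05, seed=1):
    """Fraction of simulated samples in which Ho: mean = 0 is rejected."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if rejects_null(sample, mu0=0.0, sigma=1.0, alpha=alpha):
            rejections += 1
    return rejections / trials


# Ho is true (the true mean really is 0): every rejection is a Type I error.
type_1_rate = rejection_rate(true_mean=0.0)
# Ha is true (the true mean is 0.3): every failure to reject is a Type II error.
type_2_rate = 1 - rejection_rate(true_mean=0.3)
print(f"Type I rate ~ {type_1_rate:.3f}, Type II rate ~ {type_2_rate:.3f}")
```

When Ho is true, the rejection rate should land close to the chosen significance level, which is why that level is described as the probability of a Type I error. The Type II rate depends on how far the truth is from Ho and on the sample size.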

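Hypothesis testing can be helpful in business situations. Organizations often need to choose between two or more options, and hypothesis testing can indicate whether the data support one option over another. For example, a retailer might test whether a new store layout raises average sales per customer before adopting it in every location.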

Correlation

Correlation quantifies the strength and direction of the linear relationship between two variables. Correlation can be positive, which means the variables move together in the same direction, or negative, which means they move in opposite directions. The correlation coefficient ranges from -1 to +1, with -1 representing a perfect negative correlation, 0 representing no linear correlation, and +1 representing a perfect positive correlation.
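As a minimal sketch of how such a value is obtained, the function below computes the Pearson correlation coefficient, the most common correlation measure. The data are hypothetical and were chosen so that the two extreme values appear; the variable names are illustrative only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly figures.
ad_spend = [1, 2, 3, 4, 5]
sales = [12, 14, 16, 18, 20]         # rises in lockstep with ad_spend
unsold_stock = [50, 40, 30, 20, 10]  # falls in lockstep with ad_spend

r_sales = pearson_r(ad_spend, sales)
r_stock = pearson_r(ad_spend, unsold_stock)
print(r_sales)  # perfect positive correlation
print(r_stock)  # perfect negative correlation
```

Real business data rarely line up this neatly, so observed coefficients usually fall somewhere between the extremes.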

It is important to note that although correlation indicates a relationship between variables, it does not necessarily indicate causation. For example, there may be a positive correlation between the high school drop-out rate in Pennsylvania and the number of cats dropped off at the York, PA Humane Society. However, this does not mean that reducing the drop-out rate will reduce the number of cats dropped off at the Humane Society. This situation is known as the third variable problem: a third variable that was not measured may drive both of the others. In this example, the third variable may be the Pennsylvania economy. When the economy is poor, students may quit high school to begin working to help with family finances, and families may no longer be able to afford to keep pets. In that case it is the economy that affects both the drop-out rate and the number of cats taken to the Humane Society. A correlation may therefore hint at causation, but it cannot establish causation on its own.
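The third variable problem can be illustrated with synthetic data. In the sketch below, every number is simulated rather than drawn from real Pennsylvania figures, and the variable names are hypothetical. An "economy" index drives both outcomes, and neither outcome affects the other. The partial correlation formula, a standard technique that this article does not otherwise cover, is used to show what remains once the economy is held fixed.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def partial_r(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(0)  # fixed seed so the run is repeatable
# Synthetic data, not real figures: an economy index for 500 hypothetical months.
economy = [rng.gauss(0, 1) for _ in range(500)]
# A weak economy raises both outcomes; neither outcome influences the other.
dropout_rate = [-e + rng.gauss(0, 0.5) for e in economy]
cats_surrendered = [-e + rng.gauss(0, 0.5) for e in economy]

raw_r = pearson_r(dropout_rate, cats_surrendered)
controlled_r = partial_r(dropout_rate, cats_surrendered, economy)
print(f"raw correlation: {raw_r:.2f}")
print(f"correlation with the economy held fixed: {controlled_r:.2f}")
```

In this simulated setup the raw correlation is strongly positive, while the correlation with the economy held fixed is close to zero, which is the pattern expected when a third variable accounts for the relationship.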

Correlations can be helpful in business. Once a correlation is identified, an organization can investigate whether it reflects causation. If it does, the organization can act on the causal variable to move the related outcome in its favor.

References

Sullivan, III, M. "Statistics: Informed Decisions Using Data." Upper Saddle River, NJ: Prentice Hall.

Weisstein, E. W. "Hypothesis Testing." MathWorld-A Wolfram Web Resource website. URL: http://mathworld.wolfram.com/HypothesisTesting.html

Published by Melissa Bushman

Melissa Bushman is a freelance writer living in Clark, Wyoming with her husband, two dogs, and six cats. She graduated Magna Cum Laude with a BS in accounting.
