Association and Causality

# Association and Causality


Published by: ClassOf1.com on Sep 04, 2013

Subject: Statistics

A common pitfall in data interpretation is to mistake association for causality. If two variables are shown to be correlated, this does not in itself establish that there is a causal relationship between them. It may, however, suggest that such a relationship exists and motivate scientists to try to identify a mechanism for it. Nevertheless, statistical data analysis in itself cannot establish causality.

For example, suppose that after a party some of the participants fall seriously ill. A doctor interviews all of the people who attended the party and finds out how ill they are and how much wine they consumed. Analysis of these data reveals a positive correlation between the level of illness and the amount of wine consumed. Does this prove that the wine was responsible for the illness? The data analysis certainly suggests that the wine may be responsible for the illness, or at least may be a contributing factor.

However, there are other possibilities. Suppose that the salted peanuts at the party are the real cause of the illness. Consequently, the more peanuts a person consumed, the more ill that person is likely to be, so that there is a causal relationship between peanuts and illness with a consequent positive association. Also, suppose that the more peanuts a person consumed, the thirstier the person is and so the greater is the person’s wine consumption. Consequently, there is also a positive association between peanut consumption and wine consumption.

This scenario explains the positive association between wine consumption and illness, even though wine consumption in itself has nothing to do with the illness. In fact, conditional on the amount of peanuts consumed, wine consumption has nothing to do with illness. Therefore, even though there is a positive correlation between wine consumption and illness, it would be incorrect to use this result to infer that the wine consumption caused the illness.

This illustrates how the positive correlation exhibited between wine consumption and illness can be explained by the presence of a third variable, in this case peanuts, which is positively correlated with both wine consumption and illness.

The sample correlation coefficient is a convenient way of summarizing the degree of association between two variables, whereas a more detailed regression analysis actually forms a linear model relating the two variables. Notice that the sample correlation coefficient is unchanged if the x and y variables are interchanged and relabeled the y and x variables. This is in contrast to a regression analysis, which requires one of the variables to be designated the dependent variable and one the explanatory variable. The closer the sample correlation coefficient is to 1 or -1, the stronger is the linear association.
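
The peanut scenario can be illustrated with a small simulation. The sketch below is only an illustration under assumed conditions: the number of guests, the 0.8 effect sizes, the Gaussian noise, and the helper names `pearson_r` and `residuals` are all hypothetical choices, not taken from any real data. Wine and illness are each generated from peanut consumption alone, and the "conditional on peanuts" statement is checked by correlating what remains of wine and illness after a straight-line fit on peanuts (a partial correlation).

```python
import math
import random


def pearson_r(x, y):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def residuals(y, x):
    """Residuals of y after a least-squares straight-line fit on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]


rng = random.Random(0)   # fixed seed so the sketch is reproducible
n_guests = 5000          # hypothetical number of party guests

# Assumed data-generating process: peanuts drive both wine and illness,
# and wine has no effect of its own on illness.
peanuts = [rng.gauss(0, 1) for _ in range(n_guests)]
wine = [0.8 * p + rng.gauss(0, 1) for p in peanuts]
illness = [0.8 * p + rng.gauss(0, 1) for p in peanuts]

# Marginal association between wine and illness.
r_marginal = pearson_r(wine, illness)

# Association left over once the peanut effect is removed from both.
r_partial = pearson_r(residuals(wine, peanuts), residuals(illness, peanuts))

print(f"wine vs illness:               r = {r_marginal:.3f}")
print(f"wine vs illness given peanuts: r = {r_partial:.3f}")
```

Under these assumptions the first correlation should come out clearly positive while the second should be close to zero, which mirrors the argument in the text: the wine and illness association arises entirely through the third variable.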

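The contrast between correlation and regression when x and y are interchanged can also be checked numerically. The following is a minimal sketch on made-up data; the sample size, the slope of 2.0, and the helper names `centered_sums`, `pearson_r`, and `ls_slope` are assumptions introduced for illustration.

```python
import math
import random


def centered_sums(x, y):
    """Return (Sxx, Syy, Sxy), the centered sums of squares and products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxx, syy, sxy


def pearson_r(x, y):
    """Sample correlation coefficient: symmetric in x and y."""
    sxx, syy, sxy = centered_sums(x, y)
    return sxy / math.sqrt(sxx * syy)


def ls_slope(x, y):
    """Least-squares slope with y as dependent and x as explanatory variable."""
    sxx, _, sxy = centered_sums(x, y)
    return sxy / sxx


rng = random.Random(1)                          # fixed seed for reproducibility
x = [rng.gauss(0, 1) for _ in range(200)]       # hypothetical explanatory data
y = [2.0 * a + rng.gauss(0, 1) for a in x]      # assumed linear relation plus noise

r_xy = pearson_r(x, y)
r_yx = pearson_r(y, x)          # same value: the roles of x and y do not matter

slope_y_on_x = ls_slope(x, y)   # regression of y on x
slope_x_on_y = ls_slope(y, x)   # regression of x on y: a different line

print(f"r(x, y) = {r_xy:.4f}, r(y, x) = {r_yx:.4f}")
print(f"slope of y on x = {slope_y_on_x:.4f}, slope of x on y = {slope_x_on_y:.4f}")
```

The two correlation values agree, whereas the two fitted slopes differ because each regression minimizes errors in a different variable. The slopes are still linked to the correlation: their product equals the square of the sample correlation coefficient.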
