# MAT130 Lecture Module 4


04/20/2013


MAT 130 Module Four
This module introduces what are probably some of the most important statistical concepts for businesses and governments: scatterplots, correlation, and regression. The U.S. military relies on linear and nonlinear regression analysis to predict equipment and program costs, often paying high consulting fees to those who can come up with the best regression equations. Major corporations also rely on regression analysis to predict future trends in costs, profits, and other parameters.

Just being able to plot data and show trends can boost careers and enhance presentations, so correlation and regression, along with their applications, are important tools to learn. However, one must remember that correlation, even high correlation, does not necessarily imply a cause-and-effect relationship between variables. There may be a third, hidden variable, or the correlation may be spurious. Statisticians usually require controlled experiments to definitively declare cause and effect, but correlation and regression can still be useful for making predictions. Think about the businesses you know and what correlations and regression predictions would be useful in growing and improving those businesses.
Scatter Diagram
A scatter diagram, or scatterplot, is a two-dimensional xy-coordinate graph that shows the relationship between two quantities. The focus here is the correlation between the $x$ and $y$ variables and developing the best-fit linear equation to represent the relationship between the variables.

Two variables can be positively associated or negatively associated. To measure the strength of the correlation, use the linear correlation coefficient, $r$. The linear correlation coefficient is always between $-1$ and $1$, inclusive, and the closer it is to $-1$ or $1$, the stronger the linear relationship between the variables. The equation for computing the sample linear correlation coefficient is

$$r = \frac{\sum \left(\dfrac{x_i - \bar{x}}{s_x}\right)\left(\dfrac{y_i - \bar{y}}{s_y}\right)}{n - 1}, \qquad \text{(Eqn. 1)}$$

where $\bar{x}$ is the sample mean of the explanatory (independent) variable, $s_x$ is the sample standard deviation of the explanatory variable, $\bar{y}$ is the sample mean of the response (dependent) variable, $s_y$ is the sample standard deviation of the response variable, and $n$ is the number of data items in the sample. As you can see, the equation involves many computations. This is where application tools help with the computations, allowing a greater focus on interpreting the value of $r$.
Least-Squares Regression
Given two variables, one can compute linear equations that relate the variables. These linear equations, each of the form y = mx + b, can be used to make predictions for values of y. The ultimate goal is to find the linear equation that best matches the points. The least-squares regression line is the line that minimizes the sum of the squared errors and, in that sense, fits the data better than any other line. The equation of the least-squares regression line is

$$\hat{y} = b_1 x + b_0, \qquad \text{(Eqn. 2)}$$

where $b_1$ is the slope and $b_0$ is the y-intercept.

The following scatter diagram, generated using Excel, shows club-head speed and the distance a golf ball travels. The scatter diagram includes the data points and a trendline. The difference between the observed value of y and the predicted value of y from the trendline is known as the residual.
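
The least-squares line and its residuals can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the module's own procedure: it uses the standard closed-form results for the slope, $b_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2$, and the intercept, $b_0 = \bar{y} - b_1 \bar{x}$. The function names and the speed and distance values are hypothetical and are not the data behind the module's figure.

```python
from statistics import mean


def least_squares_line(xs, ys):
    """Return (b1, b0) for the least-squares line y_hat = b1 * x + b0."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # y-intercept
    return b1, b0


def residuals(xs, ys, b1, b0):
    """Observed y minus predicted y for each data point."""
    return [y - (b1 * x + b0) for x, y in zip(xs, ys)]


# Hypothetical club-head speeds (mph) and distances (yards), invented for illustration.
speeds = [95, 100, 105, 110]
distances = [240, 255, 268, 285]

b1, b0 = least_squares_line(speeds, distances)
res = residuals(speeds, distances, b1, b0)
print(f"y_hat = {b1:.3f} x + {b0:.3f}")
print("residuals:", [round(e, 3) for e in res])
```

A positive residual means the observed point lies above the trendline, and a negative residual means it lies below. For a least-squares line with an intercept, the residuals sum to zero up to rounding error.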