Regression Notes
We use ML for estimating relationships.
Problem statement: Prepare a marketing plan for a car company (the manufacturers of the car).

Making a marketing plan — recommend which info about the car to focus on:
· Engine size of the car
· Horsepower of the car
· Which info contributes to sales? → can be represented with a graph / line.
· How accurately can we estimate the effect of each info on sales?
· How accurately can we predict the sales?
· Is the relationship between sales and the info linear? → clear from the graph.
Geometrical understanding — building a regression model:
We try to find a straight line which best fits the data.
y = mx + b is the mathematical equation of a straight line, where m is the slope of the line and b is the intercept. You have already gone through this in your earlier classes. We try to find the best "m" and "b".
This line has its own slope (m) and its own intercept (where it intersects the y-axis). We also call these the weights or parameters of the model.

Dummy example — Horsepower vs. Sales (values from the notes' small table, e.g. (225, 39), (150, 20), (200, 18), …).

We need to find the best-fit line which fits the data. Let's assume slope m = 3 and intercept b = 4 (dummy values). So with f(x) = mx + b we build a function that maps "x" (Horsepower) to "y" (Sales); simply put, plugging in x gives us our prediction, which is the sales.
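A tiny sketch of this mapping, using the dummy slope and intercept assumed above (m = 3, b = 4); the function name is just for illustration.

```python
# Minimal sketch: f(x) = m*x + b, mapping horsepower (x) to predicted sales (y).
# m = 3 and b = 4 are the dummy values assumed in the notes above.
def predict_sales(horsepower, m=3.0, b=4.0):
    return m * horsepower + b

print(predict_sales(150))  # 3 * 150 + 4 = 454.0
```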
How do we find the best m & b?
· One possible way: hit and trial — try different m and b values and see what works well. But it will be hectic, isn't it? And even impossible to do by hand.
· We have an optimization algorithm which does this searching for us.
· How to see what works well? Graphically at first → then just generalize the graphical view to a numerical representation.
Numerical representation:
f(x) = mx + b : input value x → model-predicted (approximated) value.
Residual = error between the correct value & the approximated value, e.g. for individual points:
(y₂ − ŷ₂) = (44 − 45) = −1
(y₁ − ŷ₁) = (39 − 43) = −4
…and so on for all data points, giving a sum of error terms.
Improving the error measure
We have negative signs; we can get rid of them by:
· squaring, or
· taking the absolute value,
and then averaging.
Now, our error formula becomes the cost function (Mean Squared Error):
J(m, b) = 1/(2n) Σᵢ (ŷᵢ − yᵢ)²   (n = number of data points; the '2' is added by convention — the real reason for it shows up later on)
Why is it useful:
· It's a way to evaluate the best fit line.
· It's a way to measure the performance — how well your model performs on the data.
Worked example of MSE
With the line ŷ = 0.8x + 9.2 and a data point (horsepower = 47, sales = 44):
ŷ = 0.8(47) + 9.2 = 46.8
error = 44 − 46.8 = −2.8
Squaring all the errors, summing the terms and averaging (with the 0.5 factor) gives 6.08 (Ans).
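A minimal sketch of the cost function above; the names are placeholders, and n is used for the number of points to avoid clashing with the slope m.

```python
# Mean squared error cost: J(m, b) = (1 / (2n)) * sum((m*x + b - y)^2)
def cost(m, b, xs, ys):
    n = len(xs)
    return sum((m * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * n)

# e.g. the single residual worked above: (46.8 - 44)^2 = 7.84, halved = 3.92
print(cost(0.8, 9.2, [47], [44]))
```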
Types of relationships
· Deterministic relationship: F = (9/5)·C + 32. If you know the value of C, you know the value of F exactly. Other examples: circumference = π × diameter. These are deterministic relationships — the equation exactly describes the relationship.
· We are interested in statistical relationships, in which the relationship between the variables is not perfect.
Example — a scatter plot with the fitted line y = 389.2 − 5.98x:
· At higher latitudes (e.g. the northern U.S.), the less exposed you'd be to the sun.
· But the relationship isn't perfect — we don't get an exact value.
· It shows some trend.
· It also shows scatter around the relationship.

Statistical relationship
We are interested in finding statistical relationships.
Estimating the coefficients
Given: (x₁, y₁), (x₂, y₂), (x₃, y₃), …, (xₙ, yₙ), we try to find the best-fit line based on the given data.
Why "the best fitting line"? The line which best fits the data has errors that are as small as possible in some overall sense. We have a way to achieve this: the least squares criterion.
b₁ = Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ) / Σᵢ₌₁ⁿ (xᵢ − x̄)²   … WTH!!!
where x̄ and ȳ are the mean values of the variables in the sample.
Example — Given: (1,1), (2,3), (4,5), (5,7)
x̄ = 3 & ȳ = 4

β₁ = [(1−3)(1−4) + (2−3)(3−4) + (4−3)(5−4) + (5−3)(7−4)] / [(1−3)² + (2−3)² + (4−3)² + (5−3)²]
   = (6 + 1 + 1 + 6) / (4 + 1 + 1 + 4)
   = 14 / 10
   = 1.4

β₀ = ȳ − β₁x̄ = 4 − (1.4)(3) = 4 − 4.2 = −0.2

h(x) = β₀ + β₁x
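The same arithmetic as a quick sketch (pure Python, no libraries), reproducing β₁ = 1.4 and β₀ = −0.2 from the example above:

```python
xs, ys = [1, 2, 4, 5], [1, 3, 5, 7]
x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)       # 3.0 and 4.0

# least-squares slope and intercept
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
      / sum((x - x_bar) ** 2 for x in xs))                # 14 / 10 = 1.4
b0 = y_bar - b1 * x_bar                                   # 4 - 4.2 = -0.2
print(b1, b0)
```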
Interpreting β₁:
· If β₁ > 0: the independent variable & the response variable have a '+' relationship — as x increases, y increases.
· If β₁ < 0: the independent variable & the response variable have a '−' relationship.
(x: predictor / independent variable, y: dependent / response variable.)
Interpreting β₀:
h(0) = β₁·0 + β₀ = β₀
→ the average value of y when x is zero. (More on this in the assignment.)
Recap — what we have so far:
· hypothesis: h(x) = β₀ + β₁x
· MSE: J(β₀, β₁) = 1/(2m) Σᵢ (h(x⁽ⁱ⁾) − y⁽ⁱ⁾)²   (m = number of training examples)
· Goal: minimise the MSE in order to find β₀ & β₁.
We will see an approach that is more common for finding the best values of the parameters β₀ & β₁.
So, a simple question: how to find β₀ & β₁?
Idea #1: Try out several values of β₀, β₁ and see whether, with that particular pair of parameters, the cost function is decreasing or not.
· Initialize β₀ = 0 & β₁ = 0; if you evaluate there, the value of the cost function will be super high. Right?
· And keep on doing this until & unless we get a low error.
What do we want? A strategy!
We want a system which automatically tries out values every iteration in such a way that the parameter's value at the nth iteration is better than at the previous iteration.
What is that strategy?
What is the plot of x²? A parabola (here: error vs. β₁).
So, since J = 1/(2m) Σ(…)², the plot of the cost function against β₁ will also look like a parabola.
Using the concept of differentiation:
· dy/dx — how much does y change when x changes?
· dJ(β₀)/dβ₀ tells us the same thing here: how much J changes when β₀ changes (and likewise for β₁).
In a nutshell, we are looking for exactly this strategy!!!
β₀ := β₀ − α · ∂J(β₀, β₁)/∂β₀

Gradient descent steps are:
· Initialize β₀ & β₁.
· Calculate the derivative ∂J(βⱼ)/∂βⱼ (for j = 0, 1).
· Update βⱼ := βⱼ − α · ∂J(βⱼ)/∂βⱼ and repeat until some condition is met.
For how long can we repeat? Effectively until:
→ the max. no. of iterations is reached, or
→ the step size is smaller than the tolerance.

βⱼ := βⱼ − α (h(x) − y) xⱼ   is the update rule — this is for a single training example!!

We can do it for all the parameters over the whole data set. Repeat until some condition {
  βⱼ := βⱼ − α · (1/m) Σᵢ (h(x⁽ⁱ⁾) − y⁽ⁱ⁾) xⱼ⁽ⁱ⁾   (for every j)
}
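A hedged sketch of the batch update above for h(x) = β₀ + β₁x; the learning rate, iteration count and data here are assumptions for illustration, not values from the notes.

```python
def gradient_descent(xs, ys, alpha=0.01, iters=10_000):
    b0, b1 = 0.0, 0.0                                    # initialize β0 & β1
    n = len(xs)
    for _ in range(iters):                               # repeat until some condition
        errs = [b0 + b1 * x - y for x, y in zip(xs, ys)]       # h(x) - y for each point
        b0 -= alpha * sum(errs) / n                            # β0 := β0 - α·(1/m)·Σ(h(x)-y)
        b1 -= alpha * sum(e * x for e, x in zip(errs, xs)) / n # β1 := β1 - α·(1/m)·Σ(h(x)-y)·x
    return b0, b1

print(gradient_descent([1, 2, 4, 5], [1, 3, 5, 7]))  # ≈ (-0.2, 1.4), matching least squares
```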
Story behind α:
· α is the learning rate.
· The derivative gives the direction to move in; α decides how big a step to take in that direction.
· If the slope is large, the update to your β is correspondingly larger.
Hand-worked example: look at the reference Google Sheets provided in the resources.
Types of Regression
Univariate linear regression: you have only one input feature variable, and based on that we map it to the target variable — f(x) = y with only one input feature.
Multivariate linear regression: say you want to build a house-price prediction model with features such as:
· size of the house (X₁)
· … further features X₂, …, Xₚ
Yᵢ = β₀ + β₁Xᵢ₁ + β₂Xᵢ₂ + … + βₚXᵢₚ
In matrix form:

y = [y₁, y₂, …, yₙ]ᵀ — the n×1 response vector (n = number of rows / training examples).
β = [β₀, β₁, …, βₚ]ᵀ — the (p+1)×1 vector of regression coefficients.
X = the n×(p+1) design matrix (also called the data matrix): row i is [1, Xᵢ₁, Xᵢ₂, …, Xᵢₚ], so the first column of 1s (X₀) multiplies the intercept β₀.

hypothesis: h(X) = Xβ — row i of Xβ is exactly ŷᵢ = β₀ + β₁Xᵢ₁ + β₂Xᵢ₂ + … + βₚXᵢₚ.
cost: J(β) = 1/(2n) (Xβ − y)ᵀ(Xβ − y).
J(β) = 1/(2n) Σᵢ (h(x⁽ⁱ⁾) − y⁽ⁱ⁾)²

Let's break it down. With X as above,

Xβ − y = [ β₀ + β₁X₁₁ + … + βₚX₁ₚ − y₁
           β₀ + β₁X₂₁ + … + βₚX₂ₚ − y₂
           ⋮
           β₀ + β₁Xₙ₁ + … + βₚXₙₚ − yₙ ]

i.e. an n×1 vector whose i-th entry is h(x⁽ⁱ⁾) − y⁽ⁱ⁾, so (Xβ − y)ᵀ(Xβ − y) is exactly the sum of squared errors and J(β) = 1/(2n) (Xβ − y)ᵀ(Xβ − y).
Multiple Linear Regression

Hypothesis: h(X) = Xβ (X is the input data matrix, β the parameter vector).

Optimization: J(β) = 1/(2n) (Xβ − y)ᵀ(Xβ − y); the goal is to minimise this cost function, min_β J(β).

Solve the optimization problem:
a. Initialize the parameters β to 0.
b. Repeat the gradient-descent update to optimize it:
   β_new := β_old − (α/n) Xᵀ(Xβ_old − y)   — vectorized gradient descent!! (Deriving this gradient is left as H.W.)
A NumPy sketch of this update follows.
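A hedged NumPy sketch of the vectorized update, assuming X already has the leading column of 1s; the learning rate, iteration count and data are placeholders.

```python
import numpy as np

def fit_gd(X, y, alpha=0.01, iters=5000):
    """Vectorized gradient descent: β := β - (α/n)·Xᵀ(Xβ - y)."""
    n, d = X.shape                      # n examples, d = p + 1 columns
    beta = np.zeros(d)                  # a. initialize the parameters to 0
    for _ in range(iters):              # b. repeat the update
        beta -= (alpha / n) * (X.T @ (X @ beta - y))
    return beta

# usage: X is the n×(p+1) design matrix whose first column is all 1s
X = np.array([[1.0, 1], [1, 2], [1, 4], [1, 5]])
y = np.array([1.0, 3, 5, 7])
print(fit_gd(X, y))                     # ≈ [-0.2, 1.4]
```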
Analytical solution of Linear Regression:
β = (XᵀX)⁻¹ Xᵀ y
What if we obtain β by simply plugging X and y into this formula?

Gradient descent:
→ Requires multiple iterations.
→ Need to choose α.
→ Works well even when n is large; can support incremental learning.

Closed-form solution:
→ Non-iterative; no need to choose α.
→ Problem: XᵀX may not be invertible — you may need to remove extra (redundant) features.
→ Slow if n is large: computing (XᵀX)⁻¹ is roughly O(n³).

Worked example — given a small table of (x, y) pairs with x = 10, 20, 30, 40, 50:
Step 1: Form the 5×2 design matrix X (a column of 1s plus the x column) and compute XᵀX (a 2×2 matrix).
Step 2: Compute the determinant |XᵀX| and the adjoint Adj(XᵀX); then (XᵀX)⁻¹ = Adj(XᵀX) / |XᵀX|.
Step 3: Compute Xᵀy (a 2×1 vector).
Step 4: β = (XᵀX)⁻¹ Xᵀ y, giving the fitted intercept β₀ and slope β₁.
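A hedged NumPy sketch of the same four steps. The y values below are made up for illustration (the exact table values from the sheet aren't reproduced), and np.linalg.pinv is the safer stand-in when XᵀX is not invertible.

```python
import numpy as np

x = np.array([10, 20, 30, 40, 50], dtype=float)
y = np.array([15, 17, 19, 21, 23], dtype=float)     # assumed illustrative targets

X = np.column_stack([np.ones_like(x), x])           # 5×2 design matrix
XtX = X.T @ X                                       # step 1
XtX_inv = np.linalg.inv(XtX)                        # step 2 (det + adjoint, done for us)
Xty = X.T @ y                                       # step 3
beta = XtX_inv @ Xty                                # step 4: β = (XᵀX)⁻¹ Xᵀ y
print(beta)                                         # [intercept, slope]
```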
ŷ = β₀ + β₁X₁

· Say you want to predict y without a corresponding x value: we can simply predict the average of all y values. Doing that gives an error of 41.1879; using the least-squares line instead gives 13.7627. But how much better is that?
→ Total reduction = 41.1879 − 13.7627 = 27.4252
→ 27.4252 / 41.1879 = 66.59% → we can represent it as R².
· It tells us what % of the prediction error in the y variable is eliminated when we use least-squares regression on the x variable (the Khan Academy definition).

R² = variance explained by the model / total variance

0% means the model doesn't explain any of the variation in the response variable around its mean; 100% means the model explains all of the variation in the response variable around its mean.
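A small sketch of the same calculation, written as "1 − (error left after regression / error when predicting the mean)"; the numbers here are placeholders.

```python
def r_squared(y_true, y_pred):
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred))  # error left after regression
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)                # error when predicting the mean
    return 1 - ss_res / ss_tot                                    # fraction of error eliminated

print(r_squared([1, 3, 5, 7], [1.2, 2.6, 5.4, 6.8]))  # 0.98
```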
Multiple Linear Regression
Q1. Is at least one of the predictors X₁, X₂, …, Xₚ useful in predicting the response?
→ Is there any relationship between the response and a predictor?
· Correlation is used to measure the strength of the relationship between two variables.
· The correlation coefficient r takes a value between −1 and 1, inclusive:
r = Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ) / √[ Σᵢ (xᵢ − x̄)² · Σᵢ (yᵢ − ȳ)² ]
→ The sign indicates the direction of the linear relationship: positively or negatively related.
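A quick NumPy sketch of r as written above (names are placeholders):

```python
import numpy as np

def pearson_r(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    dx, dy = x - x.mean(), y - y.mean()
    return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

print(pearson_r([1, 2, 4, 5], [1, 3, 5, 7]))  # ≈ 0.99, a strong positive relationship
```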
1. Formulate a hypothesis about the relationship between the predictor and response variables.
· The null hypothesis is that there is no relationship between the response & the predictor variables — hence the coefficient for each feature is 0.
· The alternative hypothesis states that the model fits the data better than the intercept-only model.
If y = β₀ + β₁X₁ + β₂X₂, then under the null hypothesis y = β₀ + 0·X₁ + 0·X₂.

"Test for significance of regression":
· t-test — "this test checks the significance of an individual regression coefficient".
· F-test — "this test can be used to simultaneously check the significance of a number of regression coefficients".
h(x) = β₀ + β₁X₁
Hypothesis formulation for the F-test
· The hypothesis test is done around the claim that there is a linear regression model representing the response variable & all the predictor variables.
→ The null hypothesis is that the L.R. model does not exist, which means that the value of all the coefficients is equal to 0:
H₀: β₁ = β₂ = … = βₚ = 0
P-value:
→ A p-value is a statistical measurement used to validate a hypothesis against observed data.
· A p-value measures the probability of obtaining the observed results assuming the null hypothesis is true. (Research this value further by yourself!)
Hypothesis Testing Readings

There are several ways to test the significance of a regression model.

The t-test is calculated as t = bᵢ / SE(bᵢ), where bᵢ is the coefficient for the i-th predictor variable and SE(bᵢ) is the standard error of that coefficient. The regression equation for the example model is ŷ = b₀ + b₁x₁ + b₂x₂.

The F-test is calculated as F = MSR / MSE, where MSR is the mean square of regression, MSE is the mean square error, and p is the number of predictor variables.

The SSR is used to evaluate the improvement in the model's fit compared to using the mean of the response values as the predicted value for all observations. It is calculated as SSR = Σ(ŷᵢ − ȳ)², where ŷᵢ is the predicted response value for the i-th observation and ȳ is the mean of the observed response values. The SSR can be used in conjunction with the sum of squares total (SST). In the small example here, the fitted equation ŷ = 1 + 2x₁ + 3x₂ is applied to four observations (5, 7, 9, 11, whose mean is (5 + 7 + 9 + 11)/4 = 8), and the SSR for this regression model comes out to 37; this indicates that the regression model explains a significant amount of the variation in the response variable compared to using the mean of the response values as the predicted value for all observations.

For example, if a data set contains 10 observations and a regression model has two parameters (the intercept and the slope), then the degrees of freedom are 10 − 2 = 8. This means the sum of squares is calculated from the variation in the response variable for the 8 remaining observations, after the effect of the regression model has been removed. These degrees of freedom are used in calculating the mean square error (MSE), which is the average of the squared differences between the observed and predicted values.

The t-statistic is then compared to a critical value from the t-distribution, with a specified level of significance, to determine whether the coefficient is significantly different from zero. For example, if the level of significance is 0.05 and the statistic for a predictor variable is greater than 2 (or less than −2), then the coefficient for that variable is considered to be significantly different from zero. In summary, the t-test is used to test the hypothesis that the regression coefficient for each predictor variable is significantly different from zero.

In a linear regression model, the critical value is the value of the test statistic that is used to determine whether a coefficient is significantly different from zero. It is determined by the level of significance chosen for the hypothesis test (e.g. 0.05), the degrees of freedom for the test, and the type of alternative hypothesis (one-sided or two-sided). In summary, a critical value is the value of a test statistic that is used to determine the significance of the observed results in a hypothesis test; it is compared to the calculated value of the test statistic to determine whether the null hypothesis should be rejected in favor of the alternative hypothesis.

In summary, the F-test, t-test and p-value are all used to evaluate the significance of a regression model, and to determine whether the predictor variables can be used to accurately predict the response variable.
The standard errors for the coefficients can be calculated using the following formulas:

SE(b₀) = sqrt( SSE / (n − p − 1) × (1/n + x̄² / SSxx) )
SE(b₁) = sqrt( SSE / (n − p − 1) × (1 / SSxx) )

where SSE is the sum of squares error, n is the total number of observations, p is the number of predictors, x̄ is the mean of the observed x values and SSxx = Σ(xᵢ − x̄)².

The t-statistics for the coefficients can then be calculated as t = b / SE(b), and compared against the critical value of the t-statistic for a significance level of 0.05 and n − p − 1 degrees of freedom (which differs for a one-sided versus a two-sided alternative hypothesis).
What are one-sided and two-sided alternative hypotheses?
In a hypothesis test, the alternative hypothesis is a statement that is tested against the null hypothesis. The alternative hypothesis is usually the opposite of the null hypothesis, and is typically used to express the research hypothesis of interest.
The alternative hypothesis can be one-sided or two-sided. A one-sided alternative hypothesis is a statement that specifies the direction of the difference or relationship between the variables, while a two-sided alternative hypothesis is a statement that does not specify the direction of the difference or relationship.

If the p-value is less than a pre-determined significance level (e.g., 0.05), then the null hypothesis can be rejected and it can be concluded that the observed results are statistically significant. This means that the sample statistic is unlikely to have occurred by chance and that (in the drug example) the drug is likely to be effective in treating the condition. On the other hand, if the p-value is greater than the significance level, then the null hypothesis cannot be rejected and it cannot be concluded that the observed results are statistically significant.
In summary, the p-value is a measure of the statistical significance of the observed results in a hypothesis test. It is the probability of obtaining a sample statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. If the p-value is less than the pre-determined significance level, then the null hypothesis can be rejected and it can be concluded that the observed results are statistically significant.
In regression analysis, the F-statistic can be used to evaluate the overall significance of the regression model. This is done by conducting an F-test, which is a hypothesis test that compares the mean square of the regression (MSR) to the mean square error (MSE). The null hypothesis for the F-test is that all of the predictor variables in the model are not significantly related to the response variable.
To summarize, the null hypothesis for the F-statistic in regression analysis is that all of the predictor variables are not significantly related to the response variable, while the alternative hypothesis is that at least one of the predictor variables is significantly related to the response variable. The F-statistic is used to evaluate the overall significance of the regression model by comparing the mean square of the regression to the mean square error. If the F-statistic is large, it indicates that the model is significantly related to the response variable and the null hypothesis can be rejected in favor of the alternative hypothesis.
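One convenient way to see the per-coefficient t-statistics and p-values together with the overall F-statistic is an OLS summary, e.g. with statsmodels; this is a hedged sketch on synthetic data, not part of the original notes.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                        # two predictors
y = 1 + 2 * X[:, 0] + 3 * X[:, 1] + rng.normal(scale=0.5, size=100)

model = sm.OLS(y, sm.add_constant(X)).fit()
print(model.summary())   # per-coefficient t and p-values, plus the overall F-statistic
```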
For example, suppose you want to model the relationship between a person's age (the independent variable) and their income (the dependent variable). If you plot the data on a graph and the points form a straight line, then you can use linear regression to model the relationship between age and income. However, if the points do not form a straight line, then the relationship between age and income is non-linear, and you might need to use a different type of statistical model.
In summary, the linearity assumption in linear regression refers to the assumption that the relationship between the dependent variable and the independent variables is linear. If this assumption is not met, then a different type of statistical model might be more appropriate.
Suppose we have a small table of X and Y values (X running from 1 to 5).
Assumption tests (linearity)
1. Visual inspection: one approach is to create a scatterplot of the predictor variables and the response variable, and visually inspect the plots to see if there is a linear pattern.
2. Correlation matrix: another approach is to use the Pearson correlation coefficient to measure the strength of the linear relationship between each predictor and the response.
3. Linearity test: you can also use a statistical test, such as the Goldfeld-Quandt test, to formally test the linearity assumption.
A small sketch of the first two checks follows.
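A hedged sketch of the visual and correlation checks; the tiny DataFrame is made up for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({"x1": [1, 2, 3, 4, 5],
                   "y":  [2.1, 3.9, 6.2, 8.1, 9.8]})

df.plot.scatter(x="x1", y="y")       # 1. visual inspection: does the cloud look linear?
plt.show()

print(df.corr(method="pearson"))     # 2. correlation matrix (Pearson r)
```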
Remedies
· transformation of variables · higher-order / interaction terms · using a different model
Transforming the predictor variables: if the relationship between the predictor variables and the response variable appears to be non-linear, you can try transforming the predictor variables to see if it improves the linearity assumption — for example, try taking the log or the square root.
Adding higher-order terms or interaction terms: you can try adding higher-order terms or interaction terms to your model to see if it improves the linearity assumption, e.g. if you have a predictor variable that appears to have a non-linear relationship with the response.
Using a different model: if the linearity assumption cannot be satisfied using the above methods, you may need to consider using a different type of model, such as a nonlinear regression model or a generalized linear model.
Violating the independence assumption can affect the validity of the model and the conclusions drawn from it. If the errors are not independent, it can result in biased estimates of the model coefficients and inflated standard errors, which can affect the precision and accuracy of the model.
One way to test for independence is to plot the residuals (the differences between the predicted values and the observed values) against the predicted values. If the residuals are randomly scattered around a horizontal line, it suggests that the errors are independent of the predicted values. This can provide some evidence that the independence assumption has not been violated.
Another way to test for independence is to use a statistical test, such as the Durbin-Watson test. This test compares the autocorrelation of the residuals to a known value. If the test statistic is close to the known value, it suggests that the errors are independent.
It's important to note that these tests are not foolproof and it is still possible for the independence assumption to be violated even if the tests do not detect any issues. It is always a good idea to carefully examine the data and consider the context in which the data was collected to ensure that the independence assumption has not been violated.
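A hedged sketch of the Durbin-Watson check on synthetic data; a statistic near 2 is usually read as little autocorrelation in the residuals.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = 2 + 3 * x + rng.normal(scale=0.3, size=50)

model = sm.OLS(y, sm.add_constant(x)).fit()
print(durbin_watson(model.resid))    # values near 2 suggest independent errors
```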
Another example might be a study examining the relationship between the price of a product and the number of units sold. If the data was collected over a period of time, and the price of the product was changed during that period, the errors (differences between the predicted number of units sold and the actual number of units sold) would not be independent. This is because the number of units sold would be influenced by the price of the product, which changed over time.
In both of these examples, violating the independence assumption could lead to incorrect conclusions about the relationship between the predictor and outcome variables, and could result in inaccurate predictions or estimates. It is important to carefully consider the context in which the data was collected and to check for violations of the independence assumption before drawing conclusions from the data.
Homoscedasticity
Homoscedasticity is a statistical assumption that states that the variance of the error term (also known as the residuals) in a regression model is constant across all values of the predictor variables. In other words, it assumes that the amount of error in the model is the same regardless of the value of the predictor variables.
This is an important assumption because it allows us to use statistical techniques that rely on the assumption of constant variance, such as t-tests and F-tests, to test hypotheses about the relationships between the predictor and response variables in the model.
In layman's terms, homoscedasticity can be thought of as the assumption that the spread or dispersion of the residuals (errors) is constant across all values of the predictor variables. If the variance of the residuals is not constant, it is said to be heteroscedastic, which can lead to problems when using certain statistical techniques to analyze the model.
It's important to note that homoscedasticity is not always present in real-world data, and it is often necessary to check for and correct for heteroscedasticity in order to properly analyze a regression model. There are various techniques that can be used to check for and address heteroscedasticity, such as transforming the data or using weighted least squares regression.
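One common visual check (not spelled out in the notes) is a residual-vs-fitted plot: a roughly constant vertical spread across fitted values is consistent with homoscedasticity. Hedged sketch on synthetic data:

```python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 100)
y = 1 + 2 * x + rng.normal(scale=1.0, size=100)

model = sm.OLS(y, sm.add_constant(x)).fit()
plt.scatter(model.fittedvalues, model.resid)   # look for an even band around zero
plt.axhline(0, color="gray")
plt.xlabel("fitted values")
plt.ylabel("residuals")
plt.show()
```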
Normality Assumption
The normality assumption in linear regression refers to the assumption that the errors (the differences between the predicted values and the observed values) are normally distributed, with most errors near the mean and fewer errors at the extreme ends of the distribution.
One way to test for normality is to plot the residuals (the differences between the predicted values and the observed values) and look for a bell-shaped curve.
Another way to test for normality is to use a statistical test, such as the Anderson-Darling test or the Shapiro-Wilk test. These tests compare the distribution of the residuals to a normal distribution and provide a p-value, which can be used to determine whether the normality assumption has been violated.
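A hedged sketch of the Shapiro-Wilk check on the residuals (SciPy); the data is synthetic.

```python
import numpy as np
from scipy import stats
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = rng.normal(size=80)
y = 5 + 1.5 * x + rng.normal(scale=0.4, size=80)

model = sm.OLS(y, sm.add_constant(x)).fit()
stat, p_value = stats.shapiro(model.resid)
print(p_value)   # a small p-value (e.g. < 0.05) suggests the normality assumption is violated
```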
No Multicollinearity Assumption
In the context of multiple linear regression, the no-multicollinearity assumption refers to the assumption that there is no strong linear relationship between the predictor variables. In other words, it assumes that the predictor variables are not highly correlated with each other. This is an important assumption because it allows us to interpret the coefficients of the predictor variables in the regression model as measures of the unique effect of each predictor on the response variable, holding all other predictors constant.
In layman's terms, the no-multicollinearity assumption can be thought of as the assumption that the predictor variables are independent of each other, or that they do not have a strong linear relationship. If the predictor variables are highly correlated, it can be difficult to disentangle their individual effects on the response variable, which can make it difficult to interpret the results of the regression analysis.
It's important to note that multicollinearity is a common problem in regression analysis, and it is often necessary to check for and address multicollinearity in order to properly analyze a multiple linear regression model. There are various techniques that can be used to check for and address multicollinearity, such as examining the variance inflation factor (VIF) or using ridge regression or LASSO regression.
No Multicollinearity Assumption — Example
Suppose you are a marketing research analyst and you want to understand the factors that influence sales of a particular product. You collect data on several potential predictor variables, including the price of the product, the advertising budget for the product, and the number of stores that carry the product. You then fit a multiple linear regression model to predict sales as a function of these predictor variables.
In this case, the no-multicollinearity assumption would be important because you want to be able to interpret the coefficients of the predictor variables as measures of their unique effect on sales, holding all other predictors constant. If there is a strong linear relationship between the predictor variables (e.g., if the advertising budget is highly correlated with the number of stores that carry the product), it can be difficult to disentangle their individual effects on sales, which can make it difficult to interpret the results of the regression analysis.
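A hedged VIF sketch loosely matching this example (price, advertising budget, number of stores); the data is synthetic, and "stores" is deliberately made to track the ad budget so that the inflation shows up.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(4)
X = pd.DataFrame({"price": rng.normal(size=100),
                  "ad_budget": rng.normal(size=100)})
X["stores"] = 0.9 * X["ad_budget"] + rng.normal(scale=0.1, size=100)  # strongly correlated on purpose

Xc = sm.add_constant(X)
for i, col in enumerate(Xc.columns):
    print(col, variance_inflation_factor(Xc.values, i))  # VIF well above ~5-10 flags multicollinearity
```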