0 Up votes0 Down votes

8 views30 pagescheck

Jul 13, 2016

© © All Rights Reserved

PPT, PDF, TXT or read online from Scribd

check

© All Rights Reserved

8 views

check

© All Rights Reserved

- Excel Cheatsheet
- 523-Business Mathematics & Statistics
- Regression
- Lecture - 7 Overview of or , LPP Graphical , Simplex
- Correlational Research 2
- Sample Conceptual Framework & Paradigm and Chapter 3
- Optimal Risk Portfolio
- Stein - Discussion of 'an Economic Analysis of Audit and Nonaudit Services
- Ansys Dynamics Tutorial Instriuctions 1 (9)
- Discussion Week 7R.docx
- 8
- S1 June 2011
- Regression
- Modlin 3 Questions
- Omv Bias Note
- rbp statistik
- Navidi Ch07 4e Linear Regression
- Ethical Behavior Self-coworker
- Factors Affecting Customer Satisfaction
- Cleveland State University Lab 1

You are on page 1of 30

Regression

Davina Bristow &

Angela Quayle

Topics Covered:

What is the strength of this relationship

x?

Regression

Pearsons r

t test

Relevance to SPM

GLM

Correlation: is there a relationship between 2

variables?

Regression: how well a certain independent

variable predict dependent variable?

CORRELATION CAUSATION

In

variable and observe effect on dependent variable

Scattergrams

Y

Positive correlation

Negative correlation

No correlation

Variance vs Covariance

If youre wishing to assume that your sample is

representative of the general population (RANDOM

EFFECTS MODEL), use the degrees of freedom (n 1)

in your calculations of variance or covariance.

But if youre simply wanting to assess your current

sample (FIXED EFFECTS MODEL), substitute n for

the degrees of freedom.

Variance vs Covariance

n

Variance:

Gives information on variability of a

single variable.

S

2

x

2

(

x

x

)

i

i 1

n 1

Covariance:

Gives information on the degree to

which two variables vary together.

Note how similar the covariance is

to variance: the equation simply

multiplies xs error scores by ys error

scores as opposed to squaring xs

error scores.

cov( x, y )

(x

i 1

x)( yi y )

n 1

Covariance

n

cov( x, y )

(x

i 1

x)( yi y )

n 1

When X and Y : cov (x,y) = neg.

When no constant relationship: cov (x,y) = 0

Example Covariance

cov( x, y )

( x x)( y

i 1

n 1

y ))

xi x

yi y

0

2

3

4

6

3

2

4

0

6

-3

-1

0

1

3

0

-1

1

-3

3

x3

y3

7

1.75

4

( xi x )( yi y )

0

1

0

-3

9

7

number tell us?

the datas standard deviations: if large, the value will be

greater than if small even if the relationship between x and y

is exactly the same in the large versus small standard

deviation datasets.

relies on variance

High variance data

Subject

x error * y

error

X error * y

error

101

100

2500

54

53

81

80

900

53

52

61

60

100

52

51

51

50

51

50

41

40

100

50

49

21

20

900

49

48

2500

48

47

Mean

51

50

51

50

7000

28

Covariance:

1166.67

Covariance:

4.67

Solution: Pearsons r

Solution: standardise this measure

Divides the covariance by the multiplied standard deviations of X and Y:

rxy

cov( x, y )

sx s y

Pearsons R continued

cov( x, y )

( x x)( y

i 1

n 1

y)

rxy

( x x)( y

i 1

Z

i 1

y)

(n 1) s x s y

rxy

xi

* Z yi

n 1

Limitations of r

When r = 1 or r = -1:

We can predict y from x with certainty

all data points are on a straight line: y = ax + b

r is actually r

r

= estimate of r based on data

5

4

3

2

1

0

0

Regression

between x and y but it doesnt describe the

relationship or allow you to predict one

variable from the other.

Best-fit Line

gives best prediction of y for any value of x

minimises distance between

data and fitted line, i.e.

the residuals

= ax + b

slope

intercept

= , predicted value

= y i , true value

= residual error

the squares of the residuals (the vertical distances

from the data points to our line)

Model line: = ax + b

a = slope, b = intercept

Residual () = y -

Sum of squares of residuals = (y )2

(y )2

Finding b

sum of squares

shifting the line up and down the scatter plot

Finding a

sum of squares

b

changing the slope of the line, while b stays

constant

= ax + b

so need to minimise:

(y - ax - b)2

If we plot the sums of squares

for all different values of a and b

we get a parabola, because it is a

squared term

So the min sum of squares is at

the bottom of the curve, where

the gradient is zero.

Gradient = 0

min S

Values of a and b

where the gradient = 0

by taking partial derivatives of (y - ax - b)2 with

respect to a and b separately

and b that give the min sum of squares

The solution

a=

r sy

sx

sy = standard deviation of y

sx = standard deviation of x

A low correlation coefficient gives a flatter slope (small value of

a)

Large spread of y, i.e. high standard deviation, results in a

steeper slope (high value of a)

Large spread of x, i.e. high standard deviation, results in a flatter

slope (high value of a)

Our model equation is = ax + b

This line must pass through the mean so:

y = ax + b

b = y ax

r = correlation coefficient of x and y

r sy

s = standard deviation of y

x

b=y- s

s = standard deviation of x

x

y

x

intercept is to the mean of y

a

r sy

x+y = ax + b =

sx

a

Rearranges to:

r sy

x

sx

r sy

(x x) + y

=

sx

If the correlation is zero, we will simply predict the mean of y for every

value of x, and our regression line is just a flat straight line crossing the

x-axis at y

We can calculate the regression line for any data, but the important

question is how well does this line fit the data, or how good is it at

predicting y from x

(y y)2

Total variance of y:

s2 =

( y)2

n-1

sy2 =

SSpred

df

Error variance:

serror =

2

(y )2

n-2

SSer

dfer

n-1

SSy

dfy

explained by our

regression model

This is the variance of the error

between our predicted y values and

the actual y values, and thus is the

variance in y that is NOT explained

by the regression model

How

good is our model cont.

sy2 = s2 + ser2

s2 = r2 sy2

r2 = s2 / sy2

our regression model

= sy2 (1 r2)

the smaller the error variance, so the better our

prediction

from our regression equation than by just predicting

the mean?

F-statistic:

F(df ,df ) =

er

s2

ser2

r (n - 2)

2)

t(n-2) =

(because F = t

1 r2

complicated

rearranging

r2 (n - 2)2

=......=

1 r2

So all we need to

know are r and n

Linear regression is actually a form of the

General Linear Model where the parameters

are a, the slope of the line, and b, the intercept.

y = ax + b +

A General Linear Model is just any model that

describes the data in terms of a straight line

Multiple regression

of independent variables, x1, x2, x3 etc, on a single dependent

variable, y

The different x variables are combined in a linear way and

each has its own regression coefficient:

y = a1x1+ a2x2 +..+ anxn + b +

independent variable, x, to the value of the dependent variable,

y.

i.e. the amount of variance in y that is accounted for by each x

variable after all the other x variables have been accounted for

SPM

independent variable, x, on ONE dependent variable, y

variables, x1, x2 etc, on ONE dependent variable, y

independent x variables on several dependent variables, y1, y2,

y3 etc, in a linear combination

This is what SPM does and all will be explained next week!

- Excel CheatsheetUploaded byPeeyush Thakur
- 523-Business Mathematics & StatisticsUploaded bymubbasherashraf
- RegressionUploaded byhanik i
- Lecture - 7 Overview of or , LPP Graphical , SimplexUploaded bypresentationsite
- Correlational Research 2Uploaded byCarlo Mabini Bayo
- Sample Conceptual Framework & Paradigm and Chapter 3Uploaded bymark_torreon
- Optimal Risk PortfolioUploaded byKarthik Kota
- Stein - Discussion of 'an Economic Analysis of Audit and Nonaudit ServicesUploaded byRoshni Bachoe
- Ansys Dynamics Tutorial Instriuctions 1 (9)Uploaded byShahrukh roshan
- Discussion Week 7R.docxUploaded byPandurang Thatkar
- 8Uploaded byHoras Ronald Indrayanto Siahaan
- S1 June 2011Uploaded byjuliako_1997
- RegressionUploaded byHeza Mutaqin
- Modlin 3 QuestionsUploaded byDavid Lim
- Omv Bias NoteUploaded byGabrielito
- rbp statistikUploaded byapi-305009900
- Navidi Ch07 4e Linear RegressionUploaded byAmin Zaquan
- Ethical Behavior Self-coworkerUploaded bymansoormani201
- Factors Affecting Customer SatisfactionUploaded byplaycharles89
- Cleveland State University Lab 1Uploaded byRenzo Jose Canro Calderon
- sdwUploaded bypisal
- D 15 Paper II Management.pdfUploaded byVINOD
- file_2Uploaded byMichael Baguyo
- 8.3 Data Assignment.docxUploaded bymohammad_samhan
- 376 hw 5Uploaded byapi-451560413
- APPENDICES A.docxUploaded byJezaJoyMarquezMorada
- Application of Hedonic Housing prices as indicator of Spatial Economic InequalityUploaded byme hahaha
- Scholes 1977Uploaded byRomina Gavancho Valderrama
- 1-s2.0-S1877042812024676-mainUploaded byMaria Cecilia Bermudez
- Use and Effectiveness of Banana Peel as Water FillerUploaded byICE Spades Movies

- MetaData APIUploaded byHeather
- The Herbal Alchemists HandbookUploaded byJahanzeb Khan
- Apps - Forms Personalization in Oracle HRMSUploaded byHacene Lamraoui
- An Exploratory Study About Culture and MarketingUploaded byjelena79
- 36.331 3gpp SpecsUploaded bySwayam Sambit
- Strategy CourseUploaded byMohan Rao
- cell phones rule settingUploaded byapi-284171435
- Peer Group SupervisionUploaded byGandac Ionut Daniel
- week 7 2 2f25-3 2f1 honors americanUploaded byapi-363845691
- GATE Syllabus for Electronics and Communication Engineering (ECE) _ All About EducationUploaded bySai Sarath
- UG Lubrajel Hydrogels - Lubrasil Microemulsions BrochureUploaded bymikocorpus
- Scottish National Heritage - The Nature of Scotland Spring 2010Uploaded byThomasEngst
- 1897 1935 Economia Politica Curso Popular A_BogdanoffUploaded bySaulo Josef
- resume of ian peers pdf-3-3Uploaded byapi-246237326
- Language Learning StrategiesUploaded byLebiram Mabz
- Battlefield Earth, Space Jazz and the Sounds of ScientologyUploaded byMickey Champion
- Audio Precision System One - User ManualUploaded byStefanoViganó
- Abr_rules Small BookUploaded byParas Sirwaiya
- Development of a State MindfulnessUploaded bySkidzo Analysis
- 20060214 DoD Directive JIEDDOUploaded byAleš Zelenko
- An Introduction to the Supply Chain Council's SCOR MethodologyUploaded byLuis Andres Clavel Diaz
- Why PDF “reading order” is irrelevant to accessibilityUploaded byvijay123in
- Bild Auf KopfUploaded byzoki
- Animal ReincarnationUploaded byrandhir.sinha1592
- Technology for TransparencyUploaded byDaniel Roberto Otzoy García
- Ehp3 for Utilities CrmUploaded byDeep Akg
- A i Report FinalUploaded byJinita Bhatt
- Globalization EssayUploaded byJeegar
- Scientific American - Febuary 2016Uploaded byVu Nguyen
- Dmitry Paranyushkin Curriculum VitaeUploaded byDmitry Paranyushkin

## Much more than documents.

Discover everything Scribd has to offer, including books and audiobooks from major publishers.

Cancel anytime.