JHU SON, Spring 2017 course
Biostatistics Notes, Essential Statistics, Instructor’s Copy, Moore/Notz/Fligner, 2nd edition
Transcribed course lecture by Janna Stephens, PhD, RN
Correlation and Simple Linear Regression


Response variable - A variable that measures an outcome of a study. (Dependent variable, or outcome variable)

Explanatory variable - A variable that may explain or influence changes in a response variable. (Independent variable, or predictor)

*Some studies aim to show that a change in one or more explanatory variables can CAUSE a change in the response variable.

Ex. To study the correlation between the amount of time spent studying biostatistics and the final exam grade: the explanatory variable is the amount of time spent studying; the final grade is the response variable.

An explanatory variable may help predict the response variable, but there are other variables involved that should be accounted for.

Ex. Does the amount of time spent studying cause a good grade on the exam? It may help, but other factors also play a part, such as previous exposure to math, comfort with math, years since the last math course, hours of sleep the previous night, etc.

The most useful graph for displaying the relationship between two quantitative variables is a scatterplot.

Scatterplot - Shows the relationship between two quantitative variables measured on the same individuals. If one of the variables is an explanatory variable, it should be represented on the horizontal axis (x-axis).

1. Decide which variable goes on each axis (explanatory variable on the x-axis).

2. Label and scale the axes.

3. Plot the individual values.

Look for the overall pattern and for important departures from that pattern. Describe the overall pattern by the direction, form, and strength of the relationship. Look for outliers. An outlier is an individual value that falls outside the overall pattern of the relationship.

Direction:

Positive association - Description for two variables when above-average values of one tend to accompany above-average values of the other, and below-average values also tend to occur together.

Negative association - Description for two variables when above-average values of one tend to accompany below-average values of the other, and vice versa.

Correlation - Denoted by r. Measures the direction and strength of the linear relationship between two quantitative variables.

r ranges from -1 (perfect negative relationship) to 1 (perfect positive relationship). It is very rare to get a perfectly positive or negative r, that is, a perfect linear relationship; you can get close by measuring constructs that theoretically hang together.

r>0 indicates positive association

r<0 indicates negative association

values of r near 0 indicate a very weak linear relationship.

The strength of the linear relationship increases as r moves away from 0 toward -1 or 1.

The extreme values r = -1 and r = 1 occur only in the case of a perfect linear relationship.

Notes on Correlation:

1. Correlation makes no distinction between explanatory and response variables (it doesn't matter which variable you call x or y).

2. r has no units and does not change when we change the units of measurement of x, y, or both. (Ex. You can measure weight in pounds or kilograms, or height in cm or inches, and it won't change the correlation between height and weight.)

3. Positive r indicates positive association between the variables, and negative r indicates negative association.

4. The correlation r is always a number between -1 and 1.
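Notes 1, 2, and 4 can be checked numerically. The sketch below is only an illustration: the height and weight values are made up, and the helper name `correlation` is our own. It computes r as the average product of standardized values, then confirms that swapping x and y or changing units leaves r unchanged.

```python
from statistics import mean, stdev

def correlation(x, y):
    """Sample correlation r: sum of products of standardized values over n - 1."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)
    return sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
               for xi, yi in zip(x, y)) / (n - 1)

# Made-up illustrative data: heights in inches, weights in pounds.
height_in = [60, 62, 65, 68, 70, 72]
weight_lb = [115, 120, 140, 155, 160, 180]

r = correlation(height_in, weight_lb)

# Note 1: it does not matter which variable is called x or y.
r_swapped = correlation(weight_lb, height_in)

# Note 2: converting inches to cm and pounds to kg does not change r.
height_cm = [h * 2.54 for h in height_in]
weight_kg = [w * 0.45359237 for w in weight_lb]
r_metric = correlation(height_cm, weight_kg)

print(round(r, 4), round(r_swapped, 4), round(r_metric, 4))
```

All three printed values should agree, and each lies between -1 and 1 (note 4).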

Regression describes the relationship between an explanatory variable (x) and a response variable (y).

Use a regression line to predict values of (y) for values of (x).

Regression line - A straight line that describes how a response variable y changes as an explanatory variable x changes. You can use a regression line to predict the value of y for a given value of x.

The regression line has the form y-hat = a + bx, where:

x is the value of the explanatory variable

y-hat is the predicted value of the response variable for a given value of x.

b is the slope:

o Slope - Denoted by b in the straight line equation of the form y = a + bx, the amount by which y changes when x increases by one unit.

a is the intercept:

o Intercept - Denoted by a in the straight line equation of the form y = a + bx, the value of y when x = 0.

Since we are trying to predict y, we want the regression line to be as close as possible to the data points in the vertical (y) direction.

Least-squares regression line (LSRL) - The line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible. Its equation is y-hat = a + bx, with slope b = r(sy/sx) and intercept a = y-bar - b(x-bar), where x-bar and y-bar are the means and sx and sy are the standard deviations of the two variables, and r is their correlation.
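As a rough sketch of how the least-squares line can be computed from r, sx, and sy, the example below uses made-up study-hours and exam-grade data (not from the lecture). The helper names `correlation`, `least_squares_line`, and `sum_squared_errors` are our own. It also checks that nudging the intercept or slope away from the fitted values makes the sum of squared vertical distances larger.

```python
from statistics import mean, stdev

def correlation(x, y):
    """Sample correlation r from standardized values."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)
    return sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
               for xi, yi in zip(x, y)) / (n - 1)

def least_squares_line(x, y):
    """Return (a, b) for y-hat = a + bx using b = r * sy / sx, a = y-bar - b * x-bar."""
    b = correlation(x, y) * stdev(y) / stdev(x)
    a = mean(y) - b * mean(x)
    return a, b

def sum_squared_errors(a, b, x, y):
    """Sum of squared vertical distances from the points to the line a + bx."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Made-up illustrative data: hours studied and final exam grade.
hours_studied = [1, 2, 3, 4, 5]
exam_grade = [62, 70, 71, 80, 87]

a, b = least_squares_line(hours_studied, exam_grade)
best = sum_squared_errors(a, b, hours_studied, exam_grade)

# Any other line should give a larger sum of squares.
shifted = sum_squared_errors(a + 1, b, hours_studied, exam_grade)
tilted = sum_squared_errors(a, b + 0.5, hours_studied, exam_grade)

print(f"y-hat = {a:.2f} + {b:.2f}x")
print(best < shifted, best < tilted)
```

The slope here reads as the predicted change in grade for each additional hour studied, in line with the definition of slope given earlier.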

Residual - The difference between an observed value of the response variable and the value predicted by the regression line.

So, the residual = observed y - predicted y OR = y - y-hat

We want to see how tightly the points are grouped around the regression line. So we look at each data point and draw a vertical line from the data point to the regression line. These vertical distances are the residuals. Then, we plot that information on a residual plot.

Residual plot - A scatterplot of the regression residuals against the explanatory variable.
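The residual calculation can be sketched in a few lines. The data are made up for illustration and the names (`fit_line`, `residual_plot_points`) are our own. The slope is computed with the sum-of-products form, which is algebraically the same as r(sy/sx).

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares intercept a and slope b for y-hat = a + bx."""
    x_bar, y_bar = mean(x), mean(y)
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    return a, b

# Made-up illustrative data: hours studied and final exam grade.
hours_studied = [1, 2, 3, 4, 5]
exam_grade = [62, 70, 71, 80, 87]

a, b = fit_line(hours_studied, exam_grade)

# residual = observed y - predicted y (y - y-hat)
predicted = [a + b * x for x in hours_studied]
residuals = [y - y_hat for y, y_hat in zip(exam_grade, predicted)]

# A residual plot would place these (x, residual) pairs on a scatterplot.
residual_plot_points = list(zip(hours_studied, residuals))

for x, e in residual_plot_points:
    print(x, round(e, 2))
```

For a least-squares line the residuals sum to zero (up to rounding), so a residual plot is centered on the horizontal line at 0.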

Influential - Description given to an observation for a statistical calculation if removing it would markedly change the result of the calculation.

Recall that an outlier is an observation that lies far away from the other observations.

Outliers in the y direction have large residuals.

Outliers in the x direction are often influential for the least-squares regression line, meaning that the removal of such points would markedly change the equation of the line.
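To illustrate what "influential" means in practice, the sketch below fits a least-squares line with and without a single point that is an outlier in the x direction. The numbers are invented for the demonstration and `fit_line` is our own helper name.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares intercept a and slope b for y-hat = a + bx."""
    x_bar, y_bar = mean(x), mean(y)
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    return a, b

# Made-up data with a clear upward trend.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# The same data plus one point far out in the x direction.
x_with_outlier = x + [15]
y_with_outlier = y + [3]

a_without, b_without = fit_line(x, y)
a_with, b_with = fit_line(x_with_outlier, y_with_outlier)

print(f"without outlier: y-hat = {a_without:.2f} + {b_without:.2f}x")
print(f"with outlier:    y-hat = {a_with:.2f} + {b_with:.2f}x")
```

In this made-up case a single added point changes the sign of the slope, which is why it helps to refit without a suspect point and compare the two lines.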

Also, we discussed previously how the correlation (r) describes the strength of a straight-line relationship. In the regression setting, this description takes the form of r^2: the square of the correlation is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x.
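The "fraction of variation explained" reading of r^2 can be verified directly. In this sketch (made-up data, our own variable names), r^2 computed from the correlation is compared with 1 minus the ratio of the residual sum of squares to the total sum of squares of y about its mean.

```python
from math import sqrt
from statistics import mean

# Made-up illustrative data: hours studied and final exam grade.
x = [1, 2, 3, 4, 5]
y = [62, 70, 71, 80, 87]

x_bar, y_bar = mean(x), mean(y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)   # total variation in y

# Correlation and least-squares line.
r = s_xy / sqrt(s_xx * s_yy)
b = s_xy / s_xx
a = y_bar - b * x_bar

# Variation left unexplained by the line (sum of squared residuals).
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

r_squared = r ** 2
fraction_explained = 1 - sse / s_yy

print(round(r_squared, 4), round(fraction_explained, 4))
```

The two printed numbers should match, since both express the share of the variation in y accounted for by the regression on x.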

Cautions about correlation and regression:

1. Both describe linear relationships (they require that the scatterplot show a linear pattern).

2. Both are affected by outliers (susceptible to influential observations).

3. Always plot the data before interpreting.

4. Beware of extrapolation.

Extrapolation - The use of a regression line for prediction far outside the range of values of the explanatory variable x that you used to obtain the line. Such predictions are often not accurate.

5. Beware of lurking variables

Lurking variable - A variable that is not among the explanatory or response variables in a study and yet may influence the interpretation of relationships among those variables.

Correlation does not imply causation!
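The caution about extrapolation can be made concrete with a small sketch. The data are made up, the exam is assumed to be graded out of 100, and the helpers `fit_line`, `predict`, and `is_extrapolation` are hypothetical names of our own. A line fitted on 1 to 5 hours of study is used to predict at 10 hours.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares intercept a and slope b for y-hat = a + bx."""
    x_bar, y_bar = mean(x), mean(y)
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    return a, b

# Made-up illustrative data: hours studied and exam grade (assumed out of 100).
hours_studied = [1, 2, 3, 4, 5]
exam_grade = [62, 70, 71, 80, 87]

a, b = fit_line(hours_studied, exam_grade)

def predict(hours):
    """Predicted grade y-hat for a given number of hours."""
    return a + b * hours

def is_extrapolation(hours, observed_x):
    """True when the x value lies outside the range used to fit the line."""
    return hours < min(observed_x) or hours > max(observed_x)

inside = predict(3)     # within the observed range of x
outside = predict(10)   # far beyond the observed range of x

print(round(inside, 1), is_extrapolation(3, hours_studied))
print(round(outside, 1), is_extrapolation(10, hours_studied))
```

With these numbers the prediction at 10 hours comes out above 100, a grade that cannot occur on the assumed scale, which shows how a line can fail outside the range of x used to fit it.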
