May 19, 2013

© Attribution Non-Commercial (BY-NC)

ols

us 4/28/2006

CHAPTER 1

Introduction

This method goes by many names, but it is one of the first regression models that people learn in STAT 101. It is a technique for understanding the relationship between an outcome variable (also called a dependent variable) and predictor variables (also called independent variables).

Let's start

We use the example of a taxi fare. We could have more than one predictor variable, but for simplicity we use just one: distance of travel.

Distance of travel: X
Taxi fare: Y

Imagine that you are trying to invent the OLS regression model. Nobody in the world knows yet what OLS regression is. Now the question is: how do you go about finding out the strength of the relationship between X and Y here? The easiest thing to do is to call a cab company and obtain detailed information about the cost of taking a cab. But what if they don't tell you? You have to take cabs many times yourself and try to figure it out from your data. Let's do that. For this experiment you took a cab three times. You enter your observations into an Excel sheet:

            Distance (miles)   Fare ($)
1st time    5                  6
2nd time    7                  8
3rd time    9                  12

You graph your observations. This is one way to see whether distance has anything to do with fare. (By the way, the Excel sheet I used for this presentation can be downloaded at www.estat.us/sas/OLS.xls )

[Figure: scatter plot of the observations, Fare ($) against Distance (mile)]

What about drawing a line on the graph to express the relationship better? I used Microsoft Paint to draw a line on the graph. I drew this line carefully, following my intuition.

Something is not right. Let's use a straight line instead, so it looks better:

How do I know I drew the line correctly? Actually, I don't know if it is a correct line. After all, I just drew a line that looked right to me. Is there a mathematical way to draw a correct line? Before thinking like a mathematician, let's cheat a little bit here. We are still inventors trying to invent the OLS regression model, so please don't forget that spirit. Let's use Excel to draw a line. Right-click on the dots and choose ADD TREND LINE.

Choose LINEAR.

Also click on Options. Click on "Display equation on chart". Also click on "Display R-squared value on chart". Click OK.

[Figure: "Cab ride: Distance and Fare", Fare ($) against Distance (mile), with the linear trend line added by Excel]

What is y = 1.5x - 1.83333? (Ignore R-squared for now.) This equation is usually written this way: y = -1.83333 + 1.5x

This equation is showing you the relationship between X and Y. To understand OLS regression, you need to know how we obtained this equation y = -1.83333 + 1.5x. Where did -1.83333 come from? It is called an intercept. Where did 1.5 come from? It is usually called the effect of X. It is also called a slope for X. To be able to say that the relationship between X and Y can be expressed in such a tight mathematical expression is neat. It is better than using a lousy graph like this:

Now we have established why we need a regression model like OLS regression. We need it because an alternative like the graph above is just way too intuitive and imprecise. When I next have a chance, I will write about the questions I raised: Where did -1.83333 come from? Where did 1.5 come from? Later I will also write about the standard errors we can derive for these estimates. By estimates I am referring to the numbers above, -1.83333 and 1.5. What I wrote about here is usually referred to as parameter estimation.

DISCUSSION TOPICS

Q1. Neither of the graphs below is great. I just drew the line on the left graph by hand. For the graph on the right, I just used a straight line without thinking too much about it. But why do we feel that one is better than the other?

Q2. Compare the two graphs below. On the left, I have a graph where I drew a straight line just by intuition. For the graph on the right, I used Excel to add a line. Talk about the differences in intelligent ways. (I know I haven't covered what algorithm Excel uses to draw this line, but please do your best.)

[Figure: two graphs of Fare ($) against Distance (mile), with the hand-drawn straight line on the left and the Excel trend line on the right]

Q3. Why do you think we want to draw a line on a graph? Why is it useful? Does it help you to predict anything?

Q4. What kind of algorithm do you think Excel is using to determine the line? In other words, what kind of mathematical expression might do the trick? Can you guess at all? HINT: if you have to use your intuition to draw a line without relying on a mathematical algorithm, what kind of intuition are you using?

CHAPTER 2

[Figure: "Cab ride: Distance and Fare", Fare ($) against Distance (mile), with the Excel trend line]

How does Excel compute 1.5 as the slope of the line? How does Excel compute -1.83333 as the value of the intercept? Excel uses an algorithm called OLS (Ordinary Least Squares). Intuitively, it does what you would do if you had to draw this line by hand: you somehow try to draw a line that goes through the data. For example, you feel the LEFT one is better than the RIGHT one. I did both of them by guessing.

Why? Because the line has to be somehow close to all the observations on the graph.

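In fact, a mathematically derived line (the line drawn by Excel) is the line that MINIMIZES the distance between the observations and the line. (Strictly speaking, what OLS minimizes is the sum of the squared vertical distances, which is where the name "least squares" comes from.)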

So again this graph done by Excel has a line that minimizes the distance between the observations and the line.
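To make "minimizes the distance" concrete, here is a small Python sketch. It assumes that "distance" means the vertical gap between an observed fare and the line, and that the gaps are squared before they are added up, which is the usual least-squares convention. The second line, y = 1 + x, is a made-up stand-in for a hand-drawn guess, not one of the lines from the graphs, and the function name is my own.

```python
# The three cab rides: distance in miles (X) and fare in dollars (Y).
distance = [5, 7, 9]
fare = [6, 8, 12]


def sum_of_squared_gaps(intercept, slope):
    """Add up the squared vertical gaps between each observed fare and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(distance, fare))


# Excel's line, y = -1.83333 + 1.5x (the intercept written exactly as -11/6).
excel_line = sum_of_squared_gaps(-11 / 6, 1.5)

# A hypothetical hand-drawn line, y = 1 + x, used only for comparison.
guessed_line = sum_of_squared_gaps(1.0, 1.0)

print(f"Excel line:   {excel_line:.4f}")
print(f"Guessed line: {guessed_line:.4f}")
```

The guessed line passes exactly through two of the three points, yet its total is larger than the total for Excel's line. Being close to all observations at once matters more than hitting a few of them.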

[Figure: "Cab ride: Distance and Fare", Fare ($) against Distance (mile), with the Excel trend line]

I want to make one more point about such a line. Imagine the data points are cookies and you are holding all these cookies on a plate (and the plate doesn't weigh anything, for some mysterious reason). You have a straight stick. Try to put the stick underneath the plate, placing it against the bottom of the plate so that the plate balances perfectly (meaning the cookies don't fall). If you manage to place the stick underneath the plate in a way that the cookies don't fall, then you are creating a perfect line that minimizes the distance between the stick and each cookie.

Now please go ahead and figure out what kind of algorithm would allow this to happen. What kind of algorithm allows you to obtain numbers like 1.5 and -1.83333, which together allow you to construct an equation?

Y = -1.83333 + 1.5*X

By the way, what is an algorithm? It is like a black box. You feed in X and Y and you get an intercept and a slope (the coefficient for X). What kind of box will get you 1.5 and -1.83333 when you enter this data?

            X: Distance (miles)   Y: Fare ($)
1st time    5                     6
2nd time    7                     8
3rd time    9                     12
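If you want to check your guess about the box, here is one possible version of it, written as a Python sketch. It uses the standard textbook least-squares formulas for a single predictor (slope = sum of cross-deviations divided by sum of squared X deviations, intercept = mean of Y minus slope times mean of X). I am assuming this matches what Excel does for a linear trend line; the function name `ols_black_box` is made up for this illustration.

```python
# The three cab rides: distance in miles (X) and fare in dollars (Y).
distance = [5, 7, 9]
fare = [6, 8, 12]


def ols_black_box(x, y):
    """Return (intercept, slope) of the least-squares line for one predictor."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # How X and Y move together around their means.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    # How much X varies around its own mean.
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


intercept, slope = ols_black_box(distance, fare)

# Vertical gaps between each observed fare and the fitted line.
residuals = [y - (intercept + slope * x) for x, y in zip(distance, fare)]

print(f"Y = {intercept:.5f} + {slope:.1f}*X")
print(f"Sum of gaps: {sum(residuals):.5f}")
```

The gaps above and below the line add up to zero (up to rounding), which is one way to read the cookie plate balancing on the stick.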
