
Uploaded by Shyaam Prasadh R

Attribution Non-Commercial (BY-NC)


Learning Objectives

Understanding:
What is regression analysis?
Where is it used?

Regression

Regression provides a conceptually simple method for investigating functional relationships between one or more independent (explanatory) variables, also called factors, and a dependent variable (the outcome of interest).

Equivalently, it is a model connecting the response (dependent) variable with one or more explanatory (predictor, independent) variables.

Regression in Business

Prediction tasks and the decisions they support:
Predict the future joint distribution of asset returns. Construct an optimal portfolio (choose weights).
Estimate the effect of price and advertisement on sales. Decide the optimal price and ad campaign.
Predict the future probability of default using known characteristics of the borrower. Decide whether or not to lend (and if so, how much).

Other applications:
Sales volume and market movement (ice cream, houses)
Customer complaints over time
Key product specialization
Predicting the demographics and types of future workforce for large companies
Estimating training impact

Questions regression can help answer:
What price should I charge for my car?
What will the interest rate be next month?
Will this person like that movie?
Does your income increase if you complete this course?
Will tax incentives change purchasing behaviour?
Is my advertising campaign working?

Where to start?

Linear Prediction

Example: predicting house price

Problem: Predict the market price based on observed characteristics.

Solution: Look at property-related data where we know the price and some observed characteristics. Build a decision rule that predicts price as a function of the observed characteristics.

What characteristics do we use?

Size
Number of rooms
Attached baths
Garage space, UPS facility, neighbourhood, etc.

Variables like price and size are easy to quantify, but what about variables like aesthetics or workmanship?

To keep things simple, let's focus only on size.

The value that we seek to predict is called the dependent (or explained) variable, and we denote it Y.

The variable that we use to guide prediction is the independent (or explanatory) variable, and it is labelled X.

Linear prediction

Recall that the equation of a line is: Y = b0 + b1X
We add the random residual term u: Y = b0 + b1X + u

The intercept is in units of Y (Rs.1,00,000).
The slope is in units of Y per unit of X (Rs.1,00,000 per 1,000 sq ft).

Y = b0 + b1X + u

Intercept b0: when X = 0, the predicted Y is b0, so the intercept is the best predictor of Y when X = 0.
Slope b1: when X increases by 1 unit (1,000 sq ft), the predicted Y increases by b1 units (each unit being Rs.1,00,000).
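As a minimal sketch of how the line Y = b0 + b1X turns a size into a predicted price, the snippet below evaluates the rule for a few sizes. The coefficient values (b0 = 5.0, b1 = 20.0) and the function name are hypothetical, chosen only for illustration; they are not estimates from these slides.

```python
# Hypothetical coefficients for illustration only (not taken from the slides).
B0 = 5.0   # intercept, in units of Rs.1,00,000
B1 = 20.0  # slope, in Rs.1,00,000 per 1,000 sq ft


def predict_price(size_thousand_sqft: float) -> float:
    """Return the predicted price (in Rs.1,00,000) for a size given in 1,000 sq ft."""
    return B0 + B1 * size_thousand_sqft


if __name__ == "__main__":
    for size in (0.0, 1.0, 1.5):
        print(f"size = {size} -> predicted price = {predict_price(size)}")
```

At size 0 the prediction equals the intercept, and each extra unit of size adds exactly b1 to the prediction.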

Linear Prediction


We can now predict the price of a house when we know only the size

Y = dependent variable
X1, X2, X3, ..., Xp = independent variables
The linear relationship is written as:

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + u$

Estimating this model requires statistical tools better than simple graphical methods.

Least Squares Method

A reasonable way to fit a line is to minimize the amount by which the fitted values differ from the actual values. For each observation, this amount is called the residual, or error.

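Model: $Y_i = \beta_0 + \beta_1 X_i + u_i$

Fitted value: $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$

Here $\beta_0$ and $\beta_1$ play the roles of the intercept b0 and slope b1 above.

What is the fitted value?

The dots are the observed values and the line represents our fitted values, given by $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$.

What is the residual for the ith observation?

$\hat{u}_i = Y_i - \hat{Y}_i$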

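Why not simply minimize the total of the residuals? The total may be small even when the individual residuals are widely scattered, because positive residuals can cancel negative ones.

OLS Criteria

Ordinary least squares (OLS) instead chooses the coefficients that minimize the sum of squared residuals, $\sum_i \hat{u}_i^2$, so that residuals of opposite sign cannot cancel.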
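The least-squares idea for a single predictor can be sketched as follows. This is an illustrative sketch, not the slides' own computation: it uses the standard closed-form solution (slope = Sxy / Sxx, intercept = mean of Y minus slope times mean of X), and the size and price figures are hypothetical.

```python
def ols_fit(xs, ys):
    """Return (b0, b1) minimizing the sum of squared residuals for y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data: size in 1,000 sq ft, price in Rs.1,00,000.
sizes = [1.0, 1.5, 2.0, 2.5, 3.0]
prices = [25.0, 33.0, 47.0, 52.0, 68.0]

b0, b1 = ols_fit(sizes, prices)
fitted = [b0 + b1 * x for x in sizes]
residuals = [y - f for y, f in zip(prices, fitted)]

if __name__ == "__main__":
    print("intercept:", b0, "slope:", b1)
    print("sum of residuals:", sum(residuals))
    print("sum of squared residuals:", sum(r ** 2 for r in residuals))
```

For a fitted line with an intercept the residuals sum to zero, which is why their plain total is not a useful criterion while the sum of squares is.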

How well does the sample regression line fit the data? We want to know what proportion of the variation in Y our model explains.

$r^2$: measures the goodness of fit

[Figure: Ballentine view of $r^2$, with panels for $r^2 = 0$ and $r^2 = 1$]
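The two extremes of goodness of fit can be reproduced with a short sketch. It assumes the usual definition, r-squared = 1 - (residual sum of squares) / (total sum of squares); the observed and fitted values are hypothetical and the function name is illustrative.

```python
def r_squared(observed, fitted):
    """Proportion of the variation in the observed values explained by the fitted values."""
    mean_y = sum(observed) / len(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed values and fitted values from some regression line.
observed = [25.0, 33.0, 47.0, 52.0, 68.0]
fitted = [24.0, 34.5, 45.0, 55.5, 66.0]
mean_only = [sum(observed) / len(observed)] * len(observed)

if __name__ == "__main__":
    print("fitted line:", r_squared(observed, fitted))
    print("perfect fit:", r_squared(observed, observed))       # every point on the line
    print("mean only:", r_squared(observed, mean_only))        # X explains nothing
```

A perfect fit leaves no residual variation (r-squared of 1), while predicting the mean of Y for every observation explains none of it (r-squared of 0).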

Sam wants to predict the sales of a compact cassette tape recorder across stores using advertisement and price data, where:
Sales = number of units sold
Advertisement = number of times the product is advertised within the store
Price = price in dollars

Predict the sales of the compact cassette tape recorder if Advertisement = 7 and Price = $132.

$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + u_i$

$\text{Sales} = \beta_0 + \beta_1 (\text{Advertisement}) + \beta_2 (\text{Price}) + \text{Error}$

Coefficients (a)

                           Unstandardized Coefficients   Standardized Coefficients
Model 1                    B          Std. Error         Beta                        t        Sig.
(Constant)                 219.231    86.242                                         2.542    .085
Number of Advertisement    6.381      2.180              .847                        2.927    .061
Price in Dollars           -1.671     .684               -.706                       -2.441   .092

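Estimated Equation

Sales = 219.231 + 6.381 (Advertisement) - 1.671 (Price)

Interpretation
Constant ($\beta_0$): when Advertisement and Price are both zero, average sales = 219.231 units.
Slopes:
$\beta_1$: if Advertisement increases by 1, sales increase by about 6.4 units, holding Price fixed.
$\beta_2$: if Price increases by 1 dollar, sales decrease by about 1.67 units, holding Advertisement fixed.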

Prediction

Predict Sales when Advertisement = 7 and Price = $132:
Sales = 219.231 + 6.381 x 7 - 1.671 x 132
= 219.231 + 44.667 - 220.572
= 43.326 units of sale
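The prediction can be checked with a short sketch that plugs the reported coefficients into the estimated equation. Evaluated term by term, 6.381 x 7 = 44.667 and 1.671 x 132 = 220.572, so the predicted sales come to 43.326 units. The function and constant names are illustrative, not part of the SPSS output.

```python
# Coefficients as reported in the SPSS Coefficients table.
B_CONSTANT = 219.231
B_ADVERTISEMENT = 6.381
B_PRICE = -1.671


def predict_sales(advertisement: float, price: float) -> float:
    """Predicted units sold for a given advertisement count and price in dollars."""
    return B_CONSTANT + B_ADVERTISEMENT * advertisement + B_PRICE * price


if __name__ == "__main__":
    print(round(predict_sales(7, 132), 3))
```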

R-Square

SPSS output

Model Summary

Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .884 (a)   .782       .637                16.108

R-Square = 0.782 indicates that the model explains 78.2% of the variation in the Y variable.
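The adjusted R-Square in the output can be reproduced from R-Square with the standard adjustment 1 - (1 - R²)(n - 1)/(n - k). The sketch below assumes n = 6 observations and k = 3 estimated coefficients (constant plus two slopes); these counts are inferred from the degrees of freedom in the ANOVA table, not stated directly in the slides.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_params: int) -> float:
    """Adjust R-squared for the number of estimated coefficients (including the constant)."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_params)


# Assumed sample size and parameter count, inferred from the ANOVA df column.
N_OBS = 6
N_PARAMS = 3

if __name__ == "__main__":
    print(round(adjusted_r_squared(0.782, N_OBS, N_PARAMS), 3))
```

With so few observations the adjustment is large, which is why the adjusted value (.637) sits well below the raw R-Square (.782).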

Hypothesis testing

$H_1: \beta_2 \neq \beta_2^*$ (where $\beta_2^*$ is the value of $\beta_2$ stated under $H_0$)

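For df = n - k (k being the number of estimated coefficients, including the constant) and the chosen level of significance, read the critical value from the t table.
Decision rule: if the calculated |t| exceeds the critical t value, reject H0.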

Hypothesis testing

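Test whether each of the slope coefficients has an impact on the Y variable at a significance level of 0.05.

$H_0: \beta_1 = 0$
$H_1: \beta_1 \neq 0$

Significance level for Advertisement (SPSS output): 0.061. Since 0.061 > 0.05, do not reject H0: Advertisement has no significant impact on Sales at the 5% level. By the same rule, the significance level for Price is 0.092 > 0.05, so Price has no significant impact on Sales either.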

Coefficients (a)

                           Unstandardized Coefficients   Standardized Coefficients
Model 1                    B          Std. Error         Beta                        t        Sig.
(Constant)                 219.231    86.242                                         2.542    .085
Number of Advertisement    6.381      2.180              .847                        2.927    .061
Price in Dollars           -1.671     .684               -.706                       -2.441   .092
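The t column and the decision rule can be sketched as below. Each t ratio is the coefficient divided by its standard error (for a null value of zero). The critical value 3.182 is the two-sided 5% value from a standard t table for df = 3, which assumes n = 6 and k = 3 as implied by the ANOVA degrees of freedom; the names in the code are illustrative.

```python
# (estimate, standard error) pairs as reported in the Coefficients table.
COEFFICIENTS = {
    "(Constant)": (219.231, 86.242),
    "Number of Advertisement": (6.381, 2.180),
    "Price in Dollars": (-1.671, 0.684),
}

# Two-sided 5% critical value for df = n - k = 3 (assumed n = 6, k = 3).
T_CRITICAL = 3.182


def t_ratio(estimate: float, std_error: float) -> float:
    """t statistic for H0: coefficient = 0."""
    return estimate / std_error


def reject_h0(t_value: float, critical: float = T_CRITICAL) -> bool:
    """Decision rule: reject H0 when |t| exceeds the critical value."""
    return abs(t_value) > critical


if __name__ == "__main__":
    for name, (b, se) in COEFFICIENTS.items():
        t = t_ratio(b, se)
        print(f"{name}: t = {t:.3f}, reject H0: {reject_h0(t)}")
```

The ratios agree with the reported t column up to rounding of the inputs, and none exceeds the critical value, which matches the Sig. values all being above 0.05.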

Is the regression as a whole significant? Test whether at least one X variable has an impact on Y.

$H_0: \beta_1 = \beta_2 = 0$ (Y does not depend on any X)
$H_1$: at least one $\beta_i \neq 0$ (Y depends on at least one X)

Statistic used: the F statistic, given in the ANOVA table of the SPSS output.
Decision rule at a significance level of 0.05: if Sig. < 0.05, reject H0.

ANOVA (b)

Model 1       Sum of Squares   df   Mean Square   F   Sig.
Regression                     2
Residual                       3
Total                          5

a. Predictors: (Constant), Price in Dollars, Number of Advertisement
b. Dependent Variable: Sales (units sold)

Sig. = 0.102 > 0.05. Hence do not reject H0: at the 5% level there is no evidence that Y depends on any of the X variables.
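As a rough cross-check of the overall test, the F statistic can be recovered from R-Square using the standard identity F = (R² / (k - 1)) / ((1 - R²) / (n - k)). The sketch assumes n = 6 and k = 3, read off the df column (2, 3, 5), and uses 9.55 as the 5% critical value of F with (2, 3) degrees of freedom from a standard F table. The F value it prints is derived here, not copied from the SPSS output.

```python
def f_from_r_squared(r_squared: float, n_obs: int, n_params: int) -> float:
    """Overall F statistic implied by R-squared, n observations and k estimated coefficients."""
    df_regression = n_params - 1
    df_residual = n_obs - n_params
    return (r_squared / df_regression) / ((1.0 - r_squared) / df_residual)


# 5% critical value of F with (2, 3) degrees of freedom, from a standard F table.
F_CRITICAL = 9.55

f_value = f_from_r_squared(0.782, n_obs=6, n_params=3)

if __name__ == "__main__":
    print(f"F = {f_value:.2f}, reject H0: {f_value > F_CRITICAL}")
```

The implied F is below the critical value, which agrees with the reported Sig. of 0.102 being above 0.05.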
