
http://blog.minitab.com/blog/adventures-in-statistics/regression-analysis...


http://blog.minitab.com/blog/adventures-in-statistics/regression-analysis-how-to-interpret-the-constant-y-intercept

The constant term in linear regression analysis seems to be such a simple thing. Also known as the y intercept, it is simply the value at which the fitted line crosses the y-axis.


However, there is a lot of confusion about interpreting the constant. That's not surprising because the value of the constant term is almost always meaningless!

Paradoxically, while the value is generally meaningless, it is crucial to include the constant term in most regression models!

In this post, I'll show you everything you need to know about the constant in linear regression analysis.

I'll use fitted line plots to illustrate the concepts because they really bring the math to life. However, a 2D fitted line plot can only display the results from simple regression, which has one predictor variable and the response. The concepts hold true for multiple linear regression, but I can't graph the higher dimensions that are required.

Impossible

1 of 9

12/12/2014 11:48 AM


I've often seen the constant described as the mean response value when all predictor variables are set to zero. Mathematically, that's correct. However, a zero setting for all predictors in a model is often an impossible or nonsensical combination, as it is in the following example.

In my last post about the interpretation of regression p-values and coefficients (http://blog.minitab.com/blog/adventures-in-statistics/how-to-interpret-regression-analysis-results-p-values-and-coefficients), I used a fitted line plot to illustrate a weight-by-height regression analysis. Below, I've changed the scale of the y-axis on that fitted line plot, but the regression results are the same as before.

If you follow the blue fitted line down to where it intercepts the y-axis, it is a fairly negative value. From the regression equation, we see that the intercept value is -114.3. If height is zero, the regression equation predicts that weight is -114.3 kilograms!

Clearly this constant is meaningless and you shouldn't even try to give it meaning. No human can have zero height or a negative weight!
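To make the arithmetic concrete, here is a minimal sketch in Python that fits a least-squares line and then evaluates it at zero height. The heights and weights are made-up values chosen only for illustration (they are not the data behind the fitted line plot), and `fit_line` is a hypothetical helper, so the numbers will not match the -114.3 quoted above.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical heights (m) and weights (kg), for illustration only.
heights = [1.30, 1.35, 1.40, 1.45, 1.50, 1.55, 1.60]
weights = [25.0, 30.0, 35.5, 40.0, 46.0, 51.0, 56.5]

b0, b1 = fit_line(heights, weights)

# The constant is nothing more than the fitted value at height = 0.
predicted_at_zero = b0 + b1 * 0.0

print(f"constant = {b0:.1f} kg, slope = {b1:.1f} kg per m")
print(f"predicted weight at zero height = {predicted_at_zero:.1f} kg")
```

With data like these, the fitted constant comes out well below zero even though every observed weight is positive, because zero height lies far to the left of anything that was measured.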

Now imagine a multiple regression analysis with many predictors. It becomes even more unlikely that ALL of the predictors can realistically be set to zero.

If the predictors can't all be zero at the same time, it is impossible to interpret the value of the constant. Don't even try!

Outside the Data Range


Even if it's possible for all of the predictor variables to equal zero, that data point might be outside the range of the observed data.

You should never use a regression model to make a prediction for a point that is outside the range of your data because the relationship between the variables might change. The value of the constant is a prediction for the response value when all predictors equal zero. If you didn't collect data in this all-zero range, you can't trust the value of the constant.

The weight-by-height example illustrates this concept. These data are from middle school girls, and we can't estimate the relationship between the variables outside of the observed weight and height range. However, we can get a sense that the relationship changes by marking the average weight and height for a newborn baby on the graph. That's not quite zero height, but it's as close as we can get.

I drew the red circle near the origin to approximate the newborn's average height and weight. You can clearly see that the relationship must change as you extend the data range!

So the relationship we see for the observed data is locally linear, but it changes beyond that. That's why you shouldn't predict outside the range of your data... and another reason why the regression constant can be meaningless.
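The "locally linear" idea can be sketched with a small simulation. Everything here is an assumption made for illustration: the curved relationship in `true_weight`, the observed height range, and the 0.5 m "newborn" height are invented, not taken from the data in this post.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def true_weight(height_m):
    """Assumed curved relationship (illustration only)."""
    return 18.0 * height_m ** 2


# Data are "collected" only in a narrow height range.
observed_heights = [1.30, 1.35, 1.40, 1.45, 1.50, 1.55, 1.60]
observed_weights = [true_weight(h) for h in observed_heights]

b0, b1 = fit_line(observed_heights, observed_weights)


def predict(height_m):
    return b0 + b1 * height_m


# Inside the observed range the straight line tracks the curve closely.
in_range_error = max(
    abs(true_weight(h) - predict(h)) for h in observed_heights
)

# Far outside the range, the same line misses badly.
newborn_height = 0.5  # hypothetical value, well below the observed range
extrapolation_error = abs(true_weight(newborn_height) - predict(newborn_height))

print(f"largest error inside the data range: {in_range_error:.2f} kg")
print(f"error at {newborn_height} m: {extrapolation_error:.2f} kg")
```

The line is a fine summary where the data were collected, yet it predicts a negative weight at the hypothetical newborn height. The constant is an even longer extrapolation than that.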

Regression Model

Even if a zero setting for all predictors is a plausible scenario, and even if you collect data within that all-zero range, the constant might still be meaningless!

The value of the constant is determined in part by the predictors that are omitted from a regression analysis. In essence, it serves as a garbage bin for any bias that is not accounted for by the terms in the model. You can picture this by imagining that the regression line floats up and down (by adjusting the constant) to a point where the mean of the residuals is zero, which is a key assumption for residual analysis (http://blog.minitab.com/blog/adventures-in-statistics/why-you-need-to-check-your-residual-plots-for-regression-analysis). This floating is not based on what makes sense for the constant, but rather what works mathematically to produce that zero mean.

The constant guarantees that the residuals don't have an overall positive or negative bias, but absorbing that bias also makes the value of the constant harder to interpret.
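Here is a small sketch of the "garbage bin" effect, under invented assumptions: the response is generated from two predictors with a made-up equation, and the second predictor is then left out of the model. The values of `x2` were picked so that it is uncorrelated with `x1`, which keeps the slope unchanged and isolates what happens to the constant.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


# Hypothetical process: y = 2 + 3*x1 + 4*x2 (the "real" constant is 2).
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [6.0, 4.0, 4.0, 6.0]  # mean 5, uncorrelated with x1 by construction
y = [2.0 + 3.0 * a + 4.0 * b for a, b in zip(x1, x2)]

# Fit with x2 omitted from the model.
b0, b1 = fit_line(x1, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x1, y)]

print(f"fitted constant = {b0}, fitted slope = {b1}")
print(f"mean of residuals = {mean(residuals)}")
```

In this setup the fitted constant is 22 rather than 2: it has absorbed the average contribution of the omitted predictor (4 times the mean of `x2`), which is exactly what brings the mean of the residuals to zero.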

Regression Model?

Immediately above, we saw a key reason why you should include the constant in your regression model. It guarantees that your residuals have a mean of zero.

Additionally, if you don't include the constant, the regression line is forced to go through the origin. This means that all of the predictors and the response variable must equal zero at that point. If your fitted line doesn't naturally go through the origin, your regression coefficients and predictions will be biased if you don't include the constant.

I'll use the height and weight regression example to illustrate this concept. First, I'll use General Regression in Minitab statistical software (http://www.minitab.com/en-us/products/minitab/) to fit the model without the constant. In the output below, you can see that there is no constant, just a coefficient for height.

Next, I'll overlay the line for this equation on the previous fitted line plot so we can compare the model with and without the constant.


The blue line is the fitted line for the regression model with the constant, while the green line is for the model without the constant. Clearly, the green line just doesn't fit. The slope is way off and the predicted values are biased. For the model without the constant, the weight predictions tend to be too high for shorter subjects and too low for taller subjects.
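The same comparison can be sketched outside Minitab. The code below is an illustration in plain Python, not the General Regression output from this post: the heights and weights are made-up values, and both fitting helpers are hypothetical names.

```python
from statistics import mean


def fit_with_constant(xs, ys):
    """Least squares for y = b0 + b1 * x; returns (b0, b1)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def fit_through_origin(xs, ys):
    """Least squares for y = b * x (no constant); returns b."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Hypothetical heights (m) and weights (kg), sorted shortest to tallest.
heights = [1.30, 1.35, 1.40, 1.45, 1.50, 1.55, 1.60]
weights = [25.0, 30.0, 35.5, 40.0, 46.0, 51.0, 56.5]

b0, b1 = fit_with_constant(heights, weights)
slope_only = fit_through_origin(heights, weights)

# Residual = observed value minus fitted value.
resid_with = [w - (b0 + b1 * h) for h, w in zip(heights, weights)]
resid_without = [w - slope_only * h for h, w in zip(heights, weights)]

print(f"with constant:    slope = {b1:.1f}, mean residual = {mean(resid_with):.3f}")
print(f"without constant: slope = {slope_only:.1f}, mean residual = {mean(resid_without):.3f}")
print(f"no-constant residual, shortest subject: {resid_without[0]:.1f}")
print(f"no-constant residual, tallest subject:  {resid_without[-1]:.1f}")
```

For these made-up data, the no-constant model has a much flatter slope, a negative residual for the shortest subject (prediction too high), and a positive residual for the tallest subject (prediction too low), which is the pattern described for the green line.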

In closing, the regression constant is generally not worth interpreting. Despite this, it is almost always a good idea to include the constant in your regression analysis. In the end, the real value of a regression model is the ability to understand how the response variable changes when you change the values of the predictor variables. Don't worry too much about the constant!

If you're learning about regression, read my regression tutorial (http://blog.minitab.com/blog/adventures-in-statistics/regression-analysis-tutorial-and-examples)!

How to Interpret Regression Analysis Results: P-values and Coefficients (http://blog.minitab.com/blog/adventures-in-statistics/how-to-interpret-regression-analysis-results-p-values-and-coefficients)

Regression Analysis Tutorial and Examples (http://blog.minitab.com/blog/adventures-in-statistics/regression-analysis-tutorial-and-examples)

How to Predict with Minitab: Using BMI to Predict the Body Fat Percentage, Part 1 (http://blog.minitab.com/blog/adventures-in-statistics/how-to-predict-with-minitab-using-bmi-to-predict-the-body-fat-percentage-part-1)

Regression Analysis: How Do I Interpret R-squared and Assess the Goodness-of-Fit? (http://blog.minitab.com/blog/adventures-in-statistics/regression-analysis-how-do-i-interpret-r-squared-and-assess-the-goodness-of-fit)

Comments

Name: Tim McDaniel Sunday, September 15, 2013

Very nice. It is amazing, and I think understandable, how desperately new-to-regression students want to attach a substantively meaningful interpretation to the intercept term. I tell students that one could interpret the intercept as a "correction factor" when using particular values of the x's to predict y.

I'm studying empirical economic research in Germany and the lecture notes did not explain this parameter; it was just there. Thank you very much for explaining this with graphics!

This is an excellent explanation, particularly for a negative constant in regression analysis. Thanks.

Great! Very helpful material to me.


5 Comments

Regression line drawn as Y=c+1075x. When x was 2, Y was 239, given that the Y intercept was 11. Calculate the residual.

Mod

The residual equals the observed value minus the fitted value. So, let's figure out both of those.

You state that the observed value for Y is 239.

We'll plug your values into the equation to figure out the fitted value: Y = 11 + 1075*2. So, the fitted value equals 2161.

So, the residual is 239 - 2161 = -1922.

Jim
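For anyone who wants to check the arithmetic, the calculation in this reply can be written as a few lines of Python. The numbers come straight from the question; the variable names are only illustrative.

```python
# Values stated in the question above.
intercept = 11
slope = 1075
x = 2
observed_y = 239

# Fitted value from the regression equation Y = intercept + slope * x.
fitted_y = intercept + slope * x

# Residual = observed value minus fitted value.
residual = observed_y - fitted_y

print(fitted_y, residual)  # prints: 2161 -1922
```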

John K.

Jim, can you elaborate on the purpose and meaning of assessing the significance of a constant? The significance measure is included in regression results and occasionally is way above .05 (in my example: .559). Thank you!

Mod

Hi John,

The strict technical meaning of the p-value for the constant is that it measures how compatible your data are with the null hypothesis that the constant equals zero. If you have a sufficiently low p-value for the constant, you can reject the null hypothesis and conclude that the constant does not equal zero. In other words, the regression line does not go through the origin.

Your higher p-value indicates that you cannot reject the null that the constant equals zero. Your constant could be zero.

However because the value of the constant is generally meaningless determining
