Linear Regression Analysis

In its simplest form, regression analysis is the search for a line, or linear model, that best approximates the relationship between two quantities, as seen in Figure 1. A line is perhaps the simplest geometric figure that can be used to describe a relationship, not only because many existing relationships are linear in nature, but also because a line often provides a close approximation to more complicated relationships (Perlez & Sullivan, 1969).

Figure 1. Simple bivariate linear regression

In linear regression analysis, one variable is expressed as a function of another; that is, one variable is explained through the other. A line is described by a linear equation of the form:

$$y = a + bx + \varepsilon$$


Here, $y$ is the dependent (explained) variable and $x$ is the independent (explanatory) variable. The numerical constants $a$ and $b$ are the intercept and the slope; once they are known, the line can be drawn quickly by substituting values of $x$. The error term $\varepsilon$, the rightmost term of the equation, represents the influence of other factors that are not in the model, errors of measurement, and the randomness of events. In this simple bivariate case, the line shows at a glance whether the two quantities tend to rise together, and its slope indicates how much one variable changes with the other.

Typically, after plotting two quantities as ordered pairs, one can draw by eye a line that, in one's own judgement, best describes the underlying relationship. However, a subjectively sketched line makes it difficult to assess the merit of the approximated relationship. To overcome this, a criterion is needed by which a single line that best represents the data can be identified.

The Method of Least-Squares

To determine the line that best represents the relationship in the data, the least-squares method is commonly preferred. According to Walpole (1982), of all possible lines that one might draw on a scatter diagram, the least-squares method selects the line that minimizes the sum of squares of the error term $\varepsilon$. As seen in Figure 2, the error is the vertical distance between an observed point and the regression line at the corresponding value of the independent variable. The line provided by this method is said to be the line that best fits the given data.
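The least-squares criterion can be illustrated with a minimal sketch. It assumes the standard closed-form solution for a line with an intercept (slope equal to the sum of cross-deviations divided by the sum of squared deviations of x); the function names and the data points are hypothetical and are not taken from the study.

```python
def fit_line(xs, ys):
    """Return (a, b) minimizing sum((y - a - b*x)**2) over the data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of squared deviations of x and sum of cross-deviations of x and y.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope
    a = y_mean - b * x_mean    # intercept
    return a, b


def sum_squared_errors(xs, ys, a, b):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = fit_line(xs, ys)
sse_best = sum_squared_errors(xs, ys, a, b)
# Any other line, such as one with a shifted intercept, has a larger sum.
sse_shifted = sum_squared_errors(xs, ys, a + 0.5, b)

print(f"a = {a:.3f}, b = {b:.3f}")
print(f"SSE of fitted line:  {sse_best:.4f}")
print(f"SSE of shifted line: {sse_shifted:.4f}")
```

Comparing the two sums shows the criterion at work: the fitted line is the one for which the sum of squared errors cannot be reduced by moving the line.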

Figure 2. Least-squares illustration

Multiple Linear Regression

Although there are many situations in which the results of simple bivariate linear regression are satisfactory, it stands to reason that better insight can be gained if additional information is included in the study. Multiple linear regression analysis takes advantage of this idea by extending the simple linear regression model with one or more additional independent variables. The linear model now has the form:

$$y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \cdots + \varepsilon$$


Here, $x_1, x_2, x_3, \ldots$ are the known independent variables and $y$ is the dependent variable. Such a model gives an estimate of the individual relationship of each independent variable to the dependent variable, as well as of their relationship as a whole.
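The same least-squares idea carries over to several independent variables. The sketch below assumes the usual normal-equations approach (solving $X^{\top}X\,\hat{\beta} = X^{\top}y$, where $X$ has a leading column of ones for the constant). The helper names and the noise-free data are hypothetical; the code is only an illustration and not the software used in the study.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Work on an augmented copy so the inputs are not modified.
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_multiple(rows, ys):
    """Least-squares fit; returns [constant, coefficient_1, ..., coefficient_k]."""
    # Design matrix with a leading 1 in each row for the constant term.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(X, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical noise-free data generated from y = 1 + 2*x1 + 3*x2.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

coefficients = fit_multiple(rows, ys)
print([round(c, 6) for c in coefficients])
```

Because the illustrative data contain no error term, the fit should recover the generating constants up to floating-point rounding; with real data the estimates would only approximate the underlying relationship.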

Results and Discussions

In this study, five linear regression models were tested for validity, each representing a hypothesized relation. Each model was tested and shown to comply with all the assumptions of a linear regression model, assuring that all estimates are best linear unbiased estimates (BLUE). The first model expresses academic performance (Grade) as a linear function of Intelligence Quotient (IQ) and Emotional Quotient (EQ), and has the form:

$$\text{Grade} = \alpha + (\beta_1 \times \text{IQ}) + (\beta_2 \times \text{EQ}) + \varepsilon.$$

Here, $\alpha$, $\beta_1$ and $\beta_2$ are constant coefficients that will be estimated using the least-squares method. This model tests the relationship between academic performance and both IQ and EQ. The second model expresses academic performance as a linear function of Intelligence Quotient only, in the form:

$$\text{Grade} = \alpha + (\beta \times \text{IQ}) + \varepsilon,$$

where $\alpha$ and $\beta$ are constants estimated by least squares. This model tests the relationship between academic performance and IQ only. The third model expresses academic performance as a linear function of Emotional Quotient only, and has the form:

$$\text{Grade} = \alpha + (\beta \times \text{EQ}) + \varepsilon,$$

where $\alpha$ and $\beta$ are constants estimated by least squares. This model tests the relationship between academic performance and EQ. All three models above are specified with academic performance as the dependent variable in order to investigate whether IQ and EQ can be indicators of academic performance, individually and collectively. The last two models express IQ as a linear function of EQ and vice versa, and have the forms:

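$$\text{IQ} = \alpha + (\beta \times \text{EQ}) + \varepsilon \tag{5}$$

and

$$\text{EQ} = \alpha + (\beta \times \text{IQ}) + \varepsilon \tag{6}$$

respectively. These two models explore whether a relationship exists between IQ and EQ. Both directions were fitted as a cross-check.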
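
Table 1. Regression model comparison

| Model specification | ANOVA (P-value) | Explained variation (R-squared) | Significant variable/s | Model evaluation criterion (AIC) |
|---|---|---|---|---|
| $\text{Grade} = \alpha + (\beta_1 \times \text{IQ}) + (\beta_2 \times \text{EQ}) + \varepsilon$ | 0.007* | 0.2622 | all | 15.7100 |
| $\text{Grade} = \alpha + (\beta \times \text{IQ}) + \varepsilon$ | 0.025* | 0.1394 | all | 17.3360 |
| $\text{Grade} = \alpha + (\beta \times \text{EQ}) + \varepsilon$ | 0.015* | 0.1612 | all | 16.8960 |
| $\text{IQ} = \alpha + (\beta \times \text{EQ}) + \varepsilon$ | 0.3930 | 0.0215 | constant | 115.3600 |
| $\text{EQ} = \alpha + (\beta \times \text{IQ}) + \varepsilon$ | 0.3930 | 0.0215 | constant | 158.6800 |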

Referring to Table 1, the analysis of variance indicates that all models are significant except the last two, which have p-values greater than 0.05. This means that the data failed to confirm any linear relationship between IQ and EQ. In the case of academic performance, however, IQ and EQ are shown to be statistically significant predictors, both individually and collectively (with p-values less than 0.05). Looking at the R-squared values, we see that, collectively, IQ and EQ explain the largest share of the variation in academic performance, approximately 26%. Individually, IQ and EQ explain approximately 14% and 16% of the variation in academic performance, respectively. A word of warning: the above discussion may suggest correlation among the variables, but it does not imply causation. That is, correlation between a dependent and an independent variable does not mean that the independent variable causes the variability in the dependent variable; it only confirms that a pattern is observed between the two variables.
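Table 1 reports the same R-squared (0.0215) and the same p-value (0.3930) for the last two models. This is expected rather than coincidental: with a single independent variable and a constant, R-squared equals the squared sample correlation, which does not depend on which variable is treated as dependent. The sketch below illustrates this with hypothetical scores (not the study's data) and an illustrative `r_squared` helper.

```python
def r_squared(xs, ys):
    """R-squared of the least-squares line of ys on xs (with a constant)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)   # total variation in ys
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_mean - b * x_mean
    # Unexplained variation: sum of squared errors around the fitted line.
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - sse / syy


# Hypothetical scores, for illustration only.
first_scores = [95.0, 100.0, 105.0, 110.0, 120.0, 130.0]
second_scores = [88.0, 102.0, 97.0, 110.0, 99.0, 105.0]

r2_second_on_first = r_squared(first_scores, second_scores)
r2_first_on_second = r_squared(second_scores, first_scores)

print(f"second on first: {r2_second_on_first:.4f}")
print(f"first on second: {r2_first_on_second:.4f}")
```

The two printed values agree up to floating-point rounding, which mirrors the matching entries for the IQ-on-EQ and EQ-on-IQ models. The fitted coefficients and the AIC values, by contrast, do depend on which variable is the dependent one.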

Table 2. Model one regression coefficients

| Variable name | Estimated coefficient | Standard error | T-ratio (33 df) | P-value |
|---|---|---|---|---|
| IQ* | 0.12380 | 0.06240 | 2.125 | 0.041 |
| EQ* | 0.12488 | 0.05327 | 2.344 | 0.025 |
| Constant* | 57.93910 | 7.12750 | 8.052 | 0.000 |

Table 3. Model two regression coefficients

| Variable name | Estimated coefficient | Standard error | T-ratio (33 df) | P-value |
|---|---|---|---|---|
| IQ* | 0.1543 | 0.06575 | 2.346 | 0.025 |
| Constant* | 66.8980 | 6.23650 | 10.727 | 0.000 |

Table 4. Model three regression coefficients

| Variable name | Estimated coefficient | Standard error | T-ratio (33 df) | P-value |
|---|---|---|---|---|
| EQ* | 0.1415 | 0.05535 | 2.556 | 0.015 |
| Constant* | 68.3760 | 5.15600 | 13.261 | 0.000 |

Table 2, Table 3 and Table 4 show the results of the least-squares estimation of the three significant models, namely models one, two and three. Notice that all estimated coefficients are positive and significant (having p-values less than 0.05). This indicates that academic performance has a positive relationship with IQ and EQ. Specifically, an increase in IQ score, or an increase in EQ score, is associated with an increase in academic performance, assuming that all other factors are held constant.
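The T-ratio column can be read as the estimated coefficient divided by its standard error, which is the usual statistic for testing whether a coefficient differs from zero. As a minimal sketch under that assumption, the following recomputes the ratios from the values reported in Tables 3 and 4; the dictionary layout and the function name are illustrative only.

```python
# (estimated coefficient, standard error, reported T-ratio),
# copied from Table 3 (model two) and Table 4 (model three).
reported = {
    "Model two, IQ": (0.1543, 0.06575, 2.346),
    "Model two, constant": (66.8980, 6.23650, 10.727),
    "Model three, EQ": (0.1415, 0.05535, 2.556),
    "Model three, constant": (68.3760, 5.15600, 13.261),
}


def t_ratio(estimate, standard_error):
    """T-ratio for the hypothesis that the true coefficient is zero."""
    return estimate / standard_error


t_ratios = {
    name: t_ratio(estimate, standard_error)
    for name, (estimate, standard_error, _) in reported.items()
}

for name, (_, _, reported_t) in reported.items():
    print(f"{name}: computed {t_ratios[name]:.3f}, reported {reported_t:.3f}")
```

The recomputed ratios agree with the tabulated ones up to the rounding of the published coefficients and standard errors. The p-value then follows from comparing each ratio with a t distribution on the stated 33 degrees of freedom.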

References

Perlez, B., & Sullivan, C. (1969). Modern Business Statistics. Englewood Cliffs, N.J.: Prentice-Hall, Inc.

Walpole, R. E. (1982). Introduction to Statistics (3rd ed.). Singapore: Pearson Education South Asia Pte Ltd.