
Assignment 5 (Dummy Variable)

Group 1

Cross Section Data


We use literacy rate data for 32 regions of India (states and union territories) for the year 2001-02. The dummy variable is coded as:

1: Northern and Eastern states
0: Southern and Western states

Literacy rates are also examined with respect to secondary school enrollments (the independent variable) for the two groups. Data source: indiastat.com

A Priori Reasoning

Literacy rates are generally perceived to be higher in the southern and western parts of India than in the eastern and northern regions. Literacy rates may also depend on secondary school enrollments in these two groups of regions.

Step Function

Hypothesis: There is no significant difference between the literacy rates of the two categories of zones considered.

Process: Literacy rate was regressed on the dummy variable as the only independent variable.

Results
Model Summary

Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .345   .119       .089                10.99198

Predictors: (Constant), Dummy

Coefficients (Dependent Variable: LiteracyRates)

Model 1       B        Std. Error   Beta    t        Sig.
(Constant)    62.806   2.458                25.553   .000
Dummy         8.070    4.014        .345    2.011    .053

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.

Equation: Literacy Rate = 62.806 + (8.070 * Dummy)
Significance level for Dummy: 0.053
R Square: 0.119

Conclusion

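The dummy coefficient has a significance level of 0.053, so it is significant at the 10% level and only narrowly misses the 5% level. On that basis the literacy rates of the two zones are taken to be different.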
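When the dummy is the only regressor, the fitted constant equals the mean of the group coded 0 and the dummy coefficient equals the difference between the two group means. With the reported coefficients, that would put the mean literacy rate at 62.806 for the group coded 0 and at 62.806 + 8.070 = 70.876 for the group coded 1. The sketch below illustrates the property on made-up numbers; the data and the helper name `simple_ols` are hypothetical and are not the assignment's data or software.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical literacy rates for six regions (NOT the assignment's data).
literacy = [60.0, 64.0, 66.0, 70.0, 72.0, 74.0]
dummy = [0, 0, 0, 1, 1, 1]

intercept, dummy_coef = simple_ols(dummy, literacy)

# Group means computed directly, for comparison with the regression output.
group0 = [y for y, d in zip(literacy, dummy) if d == 0]
group1 = [y for y, d in zip(literacy, dummy) if d == 1]
mean0 = sum(group0) / len(group0)
mean1 = sum(group1) / len(group1)

print(intercept, mean0)           # constant vs. mean of the group coded 0
print(dummy_coef, mean1 - mean0)  # dummy coefficient vs. difference in means
```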

The Intercept Method

Null Hypothesis: There is no significant difference between the average literacy rates of the two groups when literacy rate is a function of the enrollments from the regions.

Process: Literacy rate was regressed on enrollments and the dummy as independent variables.

Results
Model Summary

Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .480   .231       .178                10.44510

Predictors: (Constant), d, x

Coefficients (Dependent Variable: y)

Model 1       B        Std. Error   Beta     t        Sig.
(Constant)    65.605   2.704                 24.264   .000
x             -.005    .003         -.339    -2.055   .049
d             9.262    3.858        .396     2.401    .023

Here y is the literacy rate, x is enrollments and d is the dummy. B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.

Equation
Literacy Rate = 65.605 - (0.005 * Enrollments) + (9.262 * Dummy)
Significance levels: Enrollments: 0.049; Dummy: 0.023

Conclusion

There is a statistically significant difference in the average literacy rates of the two groups of regions at the 5% level (Dummy: 0.023). The null hypothesis is rejected.
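In the intercept method the dummy shifts the regression line up or down without changing its slope, so the two groups get parallel lines. A minimal sketch of that reading, using the values from the reported coefficients table; the function name `predict_literacy` and the enrollment figure are hypothetical, and the source does not state the units of enrollments.

```python
# Coefficients as reported in the coefficients table for the intercept method.
CONSTANT = 65.605
B_ENROLLMENTS = -0.005
B_DUMMY = 9.262

def predict_literacy(enrollments, dummy):
    """Fitted literacy rate; dummy is 1 (Northern/Eastern) or 0 (Southern/Western)."""
    return CONSTANT + B_ENROLLMENTS * enrollments + B_DUMMY * dummy

# Hypothetical enrollment figure, chosen only for illustration.
enrollments = 1000

# At any fixed enrollment level the two fitted values differ by the dummy coefficient.
gap = predict_literacy(enrollments, 1) - predict_literacy(enrollments, 0)
print(round(gap, 3))
```

The gap does not depend on the enrollment level chosen, which is what "difference in intercept only" means here.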

Slope Intercept method


Null Hypothesis

There is no significant difference between the two regions (groups) in either the average literacy level or the rate of change of the literacy rate as a function of enrollments.

Process: Literacy rate was regressed on enrollments, the dummy variable, and the product of enrollments and the dummy variable as independent variables.

Results
Model Summary

Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .490   .240       .159                10.56623

Predictors: (Constant), xd, d, x

Coefficients (Dependent Variable: y)

Model 1       B        Std. Error   Beta     t        Sig.
(Constant)    66.432   3.082                 21.556   .000
x             -.007    .004         -.439    -1.833   .078
d             7.325    5.128        .313     1.429    .164
xd            .003     .005         .169     .582     .565

Here y is the literacy rate, x is enrollments, d is the dummy and xd is their product. B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.

Equation
Literacy Rate = 66.432 - (0.007 * Enrollments) + (7.325 * Dummy) + (0.003 * Dummy * Enrollments)
Significance levels: Enrollments: 0.078; Dummy: 0.164; Slope dummy: 0.565

Conclusion

The slope dummy is statistically insignificant (0.565), and so is the intercept dummy in this specification (0.164), so we fail to reject the null hypothesis. There is no statistically significant difference between the regions in how the literacy rate changes with enrollments.
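The slope-intercept equation can be read as two separate lines, one per group: the dummy coefficient shifts the intercept and the product term shifts the slope. A short sketch under that reading, using the reported coefficients; the helper name `group_line` is hypothetical.

```python
def group_line(constant, b_x, b_d, b_xd, dummy):
    """Return (intercept, slope) of the fitted line for the group with this dummy value."""
    intercept = constant + b_d * dummy
    slope = b_x + b_xd * dummy
    return intercept, slope

# Coefficients as reported for the slope-intercept regression.
reported = dict(constant=66.432, b_x=-0.007, b_d=7.325, b_xd=0.003)

line_group0 = group_line(dummy=0, **reported)  # Southern and Western states
line_group1 = group_line(dummy=1, **reported)  # Northern and Eastern states

print(line_group0)
print(line_group1)
```

For the group coded 1 this gives an intercept of 66.432 + 7.325 = 73.757 and a slope of -0.007 + 0.003 = -0.004, but since neither dummy term is significant these differences cannot be distinguished from zero.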

Time Series Data

Data for Tata Motors Ltd., India, over a period of 21 years (1989-2009).
Dependent Variable: PAT (Profit after Tax)
Independent Variable: Time
Dummy Variable:

1989-2000: 0
2001-2009: 1

Intercept Test

The dummy variable is not significant, so there is no evidence of a structural change.

Slope-Intercept Test

The year-dummy variable is not significant, so there is no evidence of a structural change.


Time Series Data

Data for Tata Motors Ltd., India, over a period of 21 years (1989-2009).
Dependent Variable: PAT
Independent Variables: Time, Expenses
Dummy Variable:

1989-2000: 0
2001-2009: 1

Null hypothesis

There is no structural change in the data (i.e., in Profit after Tax).

Full Data Regression

Equation: PAT = -179.723 + (0.062 * Expenses)

MAPE 13.28558
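The slides do not state how the MAPE was computed. A minimal sketch, assuming the usual definition (mean absolute percentage error, expressed in percent); the function name `mape` and the sample numbers are hypothetical and are not the Tata Motors data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical actual and fitted values, for illustration only.
actual = [100.0, 200.0, 400.0]
fitted = [110.0, 180.0, 400.0]

print(mape(actual, fitted))
```

Because each error is divided by the actual value, years in which PAT is close to zero can dominate the average, which is worth keeping in mind when comparing MAPE values across models.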

[Figure: PAT (in crores) v/s Time Span (years). x-axis: Time Periods (1 to 21); y-axis: PAT, from -1000 to 2500]

Regression for 1989-2000:

Equation: PAT = -26.924 + (0.048 * Expenses); R square = 0.382

Regression for 2001-2009:

Equation: PAT = -741.857 + (0.084 * Expenses); R square = 0.825


CHOW Test

                RSS            df
RSS1            327147.291     10
RSS2            1058290.395    7
RSS3 (RSSr)     1967296.235    19
RSSur           1385437.686    17

F = 3.569845

For (2, 17) degrees of freedom, the table value of F at the 1% significance level is 6.11. Since F critical (6.11) > F (3.57), we fail to reject the null hypothesis that there is no structural change.
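The reported F value can be reproduced from the residual sums of squares. The sketch below assumes the standard Chow statistic with k = 2 parameters per regression (intercept and Expenses slope) and n = 21 observations; both are inferred from the reported degrees of freedom rather than stated in the slides, and the function name `chow_f` is hypothetical.

```python
def chow_f(rss_restricted, rss_1, rss_2, n_obs, k_params):
    """Chow F statistic and its denominator degrees of freedom.

    rss_restricted: RSS of the pooled (full-data) regression.
    rss_1, rss_2:   RSS of the two sub-period regressions.
    """
    rss_unrestricted = rss_1 + rss_2
    df_denominator = n_obs - 2 * k_params
    numerator = (rss_restricted - rss_unrestricted) / k_params
    denominator = rss_unrestricted / df_denominator
    return numerator / denominator, df_denominator

# Values taken from the CHOW Test results; n_obs and k_params are assumptions.
f_stat, df_den = chow_f(
    rss_restricted=1967296.235,
    rss_1=327147.291,
    rss_2=1058290.395,
    n_obs=21,
    k_params=2,
)

F_CRITICAL_1PCT = 6.11  # table value quoted in the text
print(round(f_stat, 4), df_den, f_stat < F_CRITICAL_1PCT)
```

The unrestricted RSS is the sum of the two sub-period values (327147.291 + 1058290.395 = 1385437.686), which matches the reported RSSur.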

Regression with dummy

Dummy:

0: 1989-2000
1: 2001-2009

[Figure: Scatter plot of Profit after tax. y-axis: Profit after tax, from -1000 to 2500; x-axis: 0 to 40000]

Regression with dummy

MAPE 10.2507

The dummy is significant, so there is a change in the intercept but not in the rate of change (slope). Including the slope and intercept dummies in the updated equation lowers the MAPE considerably, from 13.28558 for the full-data regression to 10.2507.

Thank You
