ordinal, nominal)

Size_Family Nominal

Education Nominal

Region Ordinal

Lifestyle Ordinal

Cars Nominal

Credit Ordinal

Stataion_Wagon Ordinal

Foriegn_Car Ordinal

Van Ordinal

Other Ordinal

2. Carry out the data cleaning steps, identify the possible outlier or extreme values and

remove them if necessary.

The data were inspected and contain no missing values. No extreme values of the variables listed below were observed either.

                                                Cases
                                                Valid            Missing          Total
                                                N     Percent    N     Percent    N     Percent
Size of the Family                              100   100.0%     0     0.0%       100   100.0%
Years of Education of the head of the family    100   100.0%     0     0.0%       100   100.0%
Area (Northeren or Southeren)                   100   100.0%     0     0.0%       100   100.0%
Life Style                                      100   100.0%     0     0.0%       100   100.0%
Number of the cars in possesion                 100   100.0%     0     0.0%       100   100.0%
Buy cars on credit                              100   100.0%     0     0.0%       100   100.0%
Have a stataion Vagon                           100   100.0%     0     0.0%       100   100.0%
Have a foriegn economic care?                   100   100.0%     0     0.0%       100   100.0%
Have a Van?                                     100   100.0%     0     0.0%       100   100.0%
Have another type of a car?                     100   100.0%     0     0.0%       100   100.0%
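One common way to screen a numeric variable for extreme values is the 1.5 x IQR rule. The sketch below is only an illustration of that rule, not the procedure used in the report; the function name `iqr_outliers` and the income values are hypothetical and are not the survey data.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical incomes, made up for illustration only.
incomes = [12000, 25000, 31000, 35400, 42000, 56000, 61000, 250000]

flagged = iqr_outliers(incomes)
print("Flagged as possible outliers:", flagged)
```

A flagged value is a candidate for inspection, not automatically an error; whether to remove it is a separate judgment.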

3. Examine the distribution of income. Is income normally distributed? If not, which

transformation would you recommend to make it normal?

The normality of family income is checked with the Kolmogorov-Smirnov and Shapiro-Wilk tests (Table 3), together with the descriptive statistics (Table 2).

Interpretations

There are 100 respondents. Income has a mean of 43105 and a standard deviation of 37918.172 (Table 2). The 95% confidence interval for the mean runs from 35581.21 to 50628.79. A one-sample t statistic of 11.368 (df 99, Sig. .000) only shows that the mean income differs from zero; it does not test the shape of the distribution. The evidence on normality is the strong positive skewness (3.843) and kurtosis (22.614) in Table 2 and the Sig. value of .000 for both tests in Table 3. These values show that the income of the family is not normally distributed.

                        Cases
                        Valid            Missing          Total
                        N     Percent    N     Percent    N     Percent
Income of the family    100   100.0%     0     0.0%       100   100.0%

Table 1: Case Processing Summary

Income of the family                               Statistic         Std. Error
Mean                                               43105.00          3791.817
95% Confidence Interval for Mean    Lower Bound    35581.21
                                    Upper Bound    50628.79
5% Trimmed Mean                                    38484.44
Median                                             35400.00
Variance                                           1437787752.525
Std. Deviation                                     37918.172
Minimum                                            0
Maximum                                            304200
Range                                              304200
Interquartile Range                                37350
Skewness                                           3.843             .241
Kurtosis                                           22.614            .478

Table 2: Descriptives

                        Kolmogorov-Smirnov a           Shapiro-Wilk
                        Statistic   df    Sig.         Statistic   df    Sig.
Income of the family    .201        100   .000         .675        100   .000

Table 3: Tests of Normality
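For a strongly right-skewed variable such as this one, a logarithmic transformation is a common candidate. The sketch below is an illustration under assumptions, not a result from the report: the `skewness` helper and the sample incomes are hypothetical, and only the skewness statistic and its standard error are taken from Table 2. Because the reported minimum income is 0, `log1p` (the log of 1 + x) is used so that a zero does not raise an error, although a zero can itself become a low outlier after the transformation.

```python
import math

def skewness(xs):
    """Skewness as the third standardized moment (population form)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

# Skewness divided by its standard error, using the values in Table 2.
# A ratio far above about 2 points to a clear departure from symmetry.
skew_z = 3.843 / 0.241

# Hypothetical right-skewed incomes, made up for illustration only.
incomes = [8000, 12000, 18000, 25000, 31000, 35400, 42000, 56000, 90000, 304200]
logged = [math.log1p(x) for x in incomes]

print("skewness / std. error from Table 2:", round(skew_z, 2))
print("skewness before transform:", round(skewness(incomes), 2))
print("skewness after log1p:", round(skewness(logged), 2))
```

Whether the transformed variable is close enough to normal would still need to be checked by rerunning the normality tests on the transformed values.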

4. After preparing the data, apply the cross tabulation and chi-square test

of independence.

To describe the relationship between two categorical variables, we use a special type of table

called a cross-tabulation. This type of table is also known as a Crosstab. In a crosstab, the

categories of one variable determine the rows of the table, and the categories of the other variable

determine the columns.
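The construction of a crosstab described above can be sketched in a few lines. This is a minimal illustration, not the SPSS procedure; the function name `crosstab` and the eight coded answers are hypothetical.

```python
from collections import Counter

def crosstab(row_values, col_values):
    """Build a contingency table of counts from two paired categorical lists."""
    counts = Counter(zip(row_values, col_values))
    row_levels = sorted(set(row_values))
    col_levels = sorted(set(col_values))
    table = [[counts[(r, c)] for c in col_levels] for r in row_levels]
    return row_levels, col_levels, table

# Hypothetical coded answers for eight families (codes 1 and 2, as in the report).
area = [1, 1, 1, 1, 2, 2, 2, 2]
van = [1, 1, 1, 2, 1, 2, 2, 2]

row_levels, col_levels, table = crosstab(area, van)
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]

# One printed line per row category, then the column totals.
for level, row, total in zip(row_levels, table, row_totals):
    print(level, row, total)
print("Total", col_totals, sum(row_totals))
```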

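A cross-tabulation and chi-square test is applied between Area and each of the other variables under the following null hypothesis.

H0: There is no relation between Area and the other variable.

a) Area (Northeren or Southeren) * Income of the family

In the table below, the p-value (.558) is greater than the significance level of .05, so H0 is not rejected. All 178 cells have expected counts below 5, so the chi-square approximation is unreliable for this table.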

Chi-Square Tests

                                Value     df   Asymp. Sig. (2-sided)
Pearson Chi-Square              85.417a   88   .558
Likelihood Ratio                115.194   88   .027
Linear-by-Linear Association    .074      1    .786
N of Valid Cases                100

a. 178 cells (100.0%) have expected count less than 5. The minimum expected count is .40.

b) Area (Northeren or Southeren) * Size of the Family

In the table below, the p-value (.882) is greater than the significance level of .05, so H0 is not rejected.

Chi-Square Tests

                                Value     df   Asymp. Sig. (2-sided)
Pearson Chi-Square              4.418a    9    .882
Likelihood Ratio                5.114     9    .824
Linear-by-Linear Association    .000      1    1.000
N of Valid Cases                100

a. 14 cells (70.0%) have expected count less than 5. The minimum expected count is .40.

c) Area (Northeren or Southeren) * Years of Education of the head of the family

In the table below, the p-value (.498) is greater than the significance level of .05, so H0 is not rejected.

Chi-Square Tests

                                Value     df   Asymp. Sig. (2-sided)
Pearson Chi-Square              10.368a   11   .498
Likelihood Ratio                12.844    11   .304
Linear-by-Linear Association    .764      1    .382
N of Valid Cases                100

a. 19 cells (79.2%) have expected count less than 5. The minimum expected count is .40.

d) Area (Northeren or Southeren) * Life Style

In the table below, the p-value (.040) is smaller than the significance level of .05, so H0 is rejected.

Chi-Square Tests

                                Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                               (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square              4.209a    1    .040
Continuity Correction b         3.409     1    .065
Likelihood Ratio                4.220     1    .040
Fisher's Exact Test                                          .064         .032
Linear-by-Linear Association    4.167     1    .041
N of Valid Cases                100

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 18.00.
b. Computed only for a 2x2 table

e) Area (Northeren or Southeren) * Number of the cars in possesion

In the table below, the p-value (.819) is greater than the significance level of .05, so H0 is not rejected.

Chi-Square Tests

                                Value     df   Asymp. Sig. (2-sided)
Pearson Chi-Square              .400a     2    .819
Likelihood Ratio                .402      2    .818
Linear-by-Linear Association    .111      1    .739
N of Valid Cases                100

a. 2 cells (33.3%) have expected count less than 5. The minimum expected count is .80.

f) Area (Northeren or Southeren) * buy cars on credit

In the table below, the p-value (.181) is greater than the significance level of .05, so H0 is not rejected.

Chi-Square Tests

                                Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                               (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square              1.786a    1    .181
Continuity Correction b         1.240     1    .265
Likelihood Ratio                1.826     1    .177
Fisher's Exact Test                                          .265         .132
Linear-by-Linear Association    1.768     1    .184
N of Valid Cases                100

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 12.00.
b. Computed only for a 2x2 table

g) Area (Northeren or Southeren) * Have a stataion Vagon

In the table below, the p-value (.755) is greater than the significance level of .05, so H0 is not rejected.

Chi-Square Tests

                                Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                               (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square              .097a     1    .755
Continuity Correction b         .003      1    .959
Likelihood Ratio                .098      1    .754
Fisher's Exact Test                                          .801         .484
Linear-by-Linear Association    .096      1    .756
N of Valid Cases                100

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 7.60.
b. Computed only for a 2x2 table

h) Area (Northeren or Southeren) * Have a foriegn economic care?

In the table below, the p-value (.117) is greater than the significance level of .05, so H0 is not rejected.

Chi-Square Tests

                                Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                               (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square              2.451a    1    .117
Continuity Correction b         1.536     1    .215
Likelihood Ratio                2.697     1    .101
Fisher's Exact Test                                          .192         .105
Linear-by-Linear Association    2.427     1    .119
N of Valid Cases                100

a. 1 cells (25.0%) have expected count less than 5. The minimum expected count is 4.40.
b. Computed only for a 2x2 table

i) Area (Northeren or Southeren) * Have a Van?

In the table below, the p-value (.000) is smaller than the significance level of .05, so H0 is rejected in favour of H1.

Chi-Square Tests

                                Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                               (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square              21.094a   1    .000
Continuity Correction b         18.815    1    .000
Likelihood Ratio                21.710    1    .000
Fisher's Exact Test                                          .000         .000
Linear-by-Linear Association    20.883    1    .000
N of Valid Cases                100

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 8.00.
b. Computed only for a 2x2 table

j) Area (Northeren or Southeren) * Have another type of a car?

In the table below, the p-value (.002) is smaller than the significance level of .05, so H0 is rejected in favour of H1.

Chi-Square Tests

                                Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                               (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square              9.557a    1    .002
Continuity Correction b         8.203     1    .004
Likelihood Ratio                9.472     1    .002
Fisher's Exact Test                                          .003         .002
Linear-by-Linear Association    9.461     1    .002

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 11.20.
b. Computed only for a 2x2 table

5. After analyzing the data, try to make a simplified table to present all the
results to the managers (professionals).

Variable          Pearson Chi-Square p-value (Asymp. Sig., 2-sided)
Size_Family       0.882
Education         0.498
Lifestyle         0.040
Cars              0.819
Credit            0.181
Stataion_Wagon    0.755
Foriegn_Car       0.117
Van               0.000
Other             0.002
Salary            0.558
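A summary for managers is easier to read when each p-value comes with its decision. The sketch below applies the .05 significance level to the Pearson chi-square p-values reported in question 4; the names `p_values`, `summarize` and `ALPHA` are illustrative and not part of the original analysis.

```python
ALPHA = 0.05

# Pearson chi-square p-values for Area against each variable, as reported in question 4.
# SPSS prints .000 for Van, which means the p-value is below .0005.
p_values = {
    "Size_Family": 0.882,
    "Education": 0.498,
    "Lifestyle": 0.040,
    "Cars": 0.819,
    "Credit": 0.181,
    "Stataion_Wagon": 0.755,
    "Foriegn_Car": 0.117,
    "Van": 0.000,
    "Other": 0.002,
    "Salary": 0.558,
}

def summarize(p_values, alpha=ALPHA):
    """Pair each variable with its p-value and the test decision at level alpha."""
    return [
        (name, p, "reject H0" if p < alpha else "do not reject H0")
        for name, p in p_values.items()
    ]

rows = summarize(p_values)
significant = [name for name, _, decision in rows if decision == "reject H0"]

for name, p, decision in rows:
    print(f"{name:<16}{p:>6.3f}  {decision}")
print("Related to Area at the .05 level:", significant)
```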

6. Does the number of cars possessed depend more on the income of the

family or the size of the family? Does there exist an interaction between

these two factors?

The dependence of the number of cars possessed on income and on size of the family is analyzed by regression analysis.

The first regression analysis is between car possession and income of the family, and the following results have been observed.

Model Summary b

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .245a   .060       .050                .477

a. Predictors: (Constant), Income of the family
b. Dependent Variable: Number of the cars in possesion

ANOVA a

Model            Sum of Squares   df   Mean Square   F       Sig.
1   Regression   1.422            1    1.422         6.254   .014b
    Residual     22.288           98   .227
    Total        23.710           99

a. Dependent Variable: Number of the cars in possesion
b. Predictors: (Constant), Income of the family

Coefficients a

                               95.0% Confidence Interval for B
Model                          Lower Bound    Upper Bound
1   (Constant)                 .990           1.277
    Income of the family       .000           .000

a. Dependent Variable: Number of the cars in possesion

Residuals Statistics a

                        Minimum   Maximum   Mean   Std. Deviation   N
Predicted Value         1.13      2.10      1.27   .120             100
Residual                -1.095    1.553     .000   .474             100
Std. Predicted Value    -1.137    6.886     .000   1.000            100
Std. Residual           -2.297    3.257     .000   .995             100

a. Dependent Variable: Number of the cars in possesion
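The R Square and F values in the output above follow directly from the sums of squares in the ANOVA table. A minimal sketch of that arithmetic, using the reported figures for the income model (the function name `anova_summary` is illustrative):

```python
def anova_summary(ss_regression, ss_residual, df_regression, df_residual):
    """Return (R Square, F) computed from regression ANOVA sums of squares."""
    ss_total = ss_regression + ss_residual
    r_square = ss_regression / ss_total
    f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)
    return r_square, f_stat

# Sums of squares and degrees of freedom from the ANOVA table for income.
income_r2, income_f = anova_summary(1.422, 22.288, 1, 98)

print("R Square:", round(income_r2, 3))
print("F:", round(income_f, 3))
```

The F value computed this way differs from the printed 6.254 in the third decimal because the sums of squares in the table are themselves rounded.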

The second regression analysis is between car possession and size of the family, and the following results have been observed.

Variables Entered/Removed a

Model   Variables Entered      Variables Removed   Method
1       Size of the Family b   .                   Enter

a. Dependent Variable: Number of the cars in possesion
b. All requested variables entered.

Model Summary b

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .587a   .344       .338                .398

a. Predictors: (Constant), Size of the Family
b. Dependent Variable: Number of the cars in possesion

ANOVA a

Model            Sum of Squares   df   Mean Square   F        Sig.
1   Regression   8.161            1    8.161         51.438   .000b
    Residual     15.549           98   .159
    Total        23.710           99

a. Dependent Variable: Number of the cars in possesion
b. Predictors: (Constant), Size of the Family

Coefficients a

                               95.0% Confidence Interval for B
Model                          Lower Bound    Upper Bound
1   (Constant)                 .481           .851
    Size of the Family         .111           .195

a. Dependent Variable: Number of the cars in possesion

Residuals Statistics a

                        Minimum   Maximum   Mean   Std. Deviation   N
Predicted Value         .97       2.35      1.27   .287             100
Residual                -.737     1.569     .000   .396             100
Std. Predicted Value    -1.039    3.756     .000   1.000            100
Std. Residual           -1.849    3.940     .000   .995             100

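a. Dependent Variable: Number of the cars in possesion

Comparing the two models, size of the family explains more of the variation in the number of cars (R Square .344, F = 51.438, Sig. .000) than income does (R Square .060, F = 6.254, Sig. .014). Neither model contains an interaction term, so these results alone do not show whether the two factors interact.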
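The interaction part of the question would need a model that contains a product term, cars = b0 + b1*income + b2*size + b3*income*size. The sketch below is one assumed way to fit such a model by ordinary least squares; it is not the SPSS procedure used above. The function names and the data are hypothetical: the eight cases are constructed exactly from known coefficients so that the fit can be verified, and income is expressed in units of 10,000.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def fit_with_interaction(income, size, cars):
    """Least-squares fit of cars = b0 + b1*income + b2*size + b3*income*size."""
    design = [[1.0, inc, sz, inc * sz] for inc, sz in zip(income, size)]
    k = 4
    # Normal equations: (X'X) b = X'y.
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(design, cars)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical cases generated from known coefficients (not the survey data).
true_coefs = [0.5, 0.1, 0.15, 0.02]
income = [2.0, 3.5, 5.0, 6.5, 8.0, 9.5, 3.0, 6.0]
size = [2, 3, 4, 5, 2, 3, 6, 4]
cars = [
    true_coefs[0] + true_coefs[1] * i + true_coefs[2] * s + true_coefs[3] * i * s
    for i, s in zip(income, size)
]

coefs = fit_with_interaction(income, size, cars)
print("b0, b1, b2, b3 =", [round(c, 4) for c in coefs])
```

With real data, an interaction would be indicated by a coefficient b3 whose confidence interval excludes zero; the sketch does not compute standard errors.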

7. Does there exist an association between the fact of having a Van and the

life style? What happens to this association when the area is considered?

Finally, the possession of a Van is related to the area or the life style?

Does there exist an association between the fact of having a Van

and the life style?

To find out whether having a van is related to life style, a cross-tabulation analysis was carried out and the following results have been observed. The tables below show a p-value (.315) greater than .05, so there is no evidence of an association between having a van and life style.

                            Cases
                            Valid            Missing          Total
                            N     Percent    N     Percent    N     Percent
Have a Van? * Life Style    100   100.0%     0     0.0%       100   100.0%

Have a Van? * Life Style Crosstabulation

Count
                      Life Style
                      1      2      Total
Have a Van?    1      46     34     80
               2      9      11     20
Total                 55     45     100

Chi-Square Tests

                                Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                               (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square              1.010a    1    .315
Continuity Correction b         .568      1    .451
Likelihood Ratio                1.005     1    .316
Fisher's Exact Test                                          .329         .225
Linear-by-Linear Association    1.000     1    .317
N of Valid Cases                100

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 9.00.
b. Computed only for a 2x2 table

To find out whether having a van is related to area, a cross-tabulation analysis was carried out and the following results have been observed. The tables below show a p-value (.000) less than .05, so there is an association between having a van and area.

                                               Cases
                                               Valid            Missing          Total
                                               N     Percent    N     Percent    N     Percent
Have a Van? * Area (Northeren or Southeren)    100   100.0%     0     0.0%       100   100.0%

Have a Van? * Area (Northeren or Southeren) Crosstabulation

Count
                      Area (Northeren or Southeren)
                      1      2      Total
Have a Van?    1      57     23     80
               2      3      17     20
Total                 60     40     100

Chi-Square Tests

                                Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                               (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square              21.094a   1    .000
Continuity Correction b         18.815    1    .000
Likelihood Ratio                21.710    1    .000
Fisher's Exact Test                                          .000         .000
Linear-by-Linear Association    20.883    1    .000
N of Valid Cases                100

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 8.00.
b. Computed only for a 2x2 table
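The two Pearson chi-square values in this question can be reproduced from the crosstab counts. The sketch below uses the shortcut formula for a 2x2 table and the fact that, for one degree of freedom, the upper-tail probability equals erfc(sqrt(chi2 / 2)); the function name `chi_square_2x2` is illustrative, and no continuity correction is applied.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (uncorrected) and p-value for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Counts taken from the two crosstabs in this question.
van_by_lifestyle = chi_square_2x2(46, 34, 9, 11)
van_by_area = chi_square_2x2(57, 23, 3, 17)

print("Van * Life Style: chi2 = %.3f, p = %.3f" % van_by_lifestyle)
print("Van * Area:       chi2 = %.3f, p = %.6f" % van_by_area)
```

These two tests look at each factor separately. Examining the van and life style association within each area would need the three-way counts, which the output above does not list.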

8. Does the possession of economic foreign cars depend on the size of the

family?

The possession of economic foreign cars depends on the size of the family (Pearson Chi-Square p-value .003), as shown in the following chi-square table. However, 16 of the 20 cells have expected counts below 5 and the likelihood ratio test gives .051, so this result should be read with caution.

Size of the Family * Have a foriegn economic care? Crosstabulation

Count
                            Have a foriegn economic care?
                            1      2      Total
Size of the Family    2     17     2      19
                      3     28     2      30
                      4     28     1      29
                      5     3      2      5
                      6     6      1      7
                      7     3      0      3
                      8     2      1      3
                      9     2      0      2
                      10    0      1      1
                      11    0      1      1
Total                       89     11     100

Chi-Square Tests

                                Value     df   Asymp. Sig. (2-sided)
Pearson Chi-Square              24.970a   9    .003
Likelihood Ratio                16.830    9    .051
Linear-by-Linear Association    7.011     1    .008
N of Valid Cases                100

a. 16 cells (80.0%) have expected count less than 5. The minimum expected count is .11.

9. Does the possession of a station-wagon have any association with the size
of the family? Does this conclusion remain when one adds the effect of the
income?

H0: There is no association between having a station wagon and size of the family.

In the table below, the p-value (.000) is less than the significance level of .05, so H0 is rejected in favour of H1.

Chi-Square Tests

                                Value     df   Asymp. Sig. (2-sided)
Pearson Chi-Square              59.589a   9    .000
Likelihood Ratio                53.811    9    .000
Linear-by-Linear Association    37.268    1    .000
N of Valid Cases                100

a. 14 cells (70.0%) have expected count less than 5. The minimum expected count is .19.
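The footnote above, like the one under the foreign car table in question 8, reports how many cells have an expected count below 5. A sketch of how that check and the Pearson statistic can be computed is given here; because only the question 8 crosstab (Size of the Family by foreign economic car) is listed cell by cell, the sketch uses those counts. The function names are illustrative.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def pearson_chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Rows are family sizes 2 to 11; columns are the two answer codes (1, 2),
# copied from the crosstab in question 8.
foreign_car_by_size = [
    [17, 2], [28, 2], [28, 1], [3, 2], [6, 1],
    [3, 0], [2, 1], [2, 0], [0, 1], [0, 1],
]

expected = expected_counts(foreign_car_by_size)
small_cells = sum(1 for row in expected for e in row if e < 5)
min_expected = min(e for row in expected for e in row)
chi2 = pearson_chi_square(foreign_car_by_size)

print("Cells with expected count < 5:", small_cells)
print("Minimum expected count:", round(min_expected, 2))
print("Pearson chi-square:", round(chi2, 3))
```

When most cells fall below 5, as here, merging sparse categories (for example the largest family sizes) or using an exact test is a common alternative to the asymptotic p-value.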

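When income is added alongside size of the family in a regression, the overall model remains significant (F = 29.841, Sig. .000), so the association between having a station wagon and size of the family is retained after adding the effect of income.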

Descriptive Statistics

                         Mean       Std. Deviation   N
have a stataion Vagon    1.19       .394             100
Size of the Family       3.95       1.877            100
Income of the family     43105.00   37918.172        100

ANOVA a

Model            Sum of Squares   df   Mean Square   F        Sig.
1   Regression   5.862            2    2.931         29.841   .000b
    Residual     9.528            97   .098
    Total        15.390           99

a. Dependent Variable: have a stataion Vagon
b. Predictors: (Constant), Income of the family, Size of the Family

10. Can the use of credit for the purchase of a car be explained by the
level of education? What happens if one considers simultaneously
the effect of the income?

In the table below, the p-value (.181) is greater than the significance level of .05, so H0 is not rejected: there is no evidence of a relationship between these two variables.

Chi-Square Tests

                                Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                               (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square              1.786a    1    .181
Continuity Correction b         1.240     1    .265
Likelihood Ratio                1.826     1    .177
Fisher's Exact Test                                          .265         .132
Linear-by-Linear Association    1.768     1    .184
N of Valid Cases                100

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 12.00.
b. Computed only for a 2x2 table

When the effect of income is added, the following results are observed. The regression is not significant (F = 1.873, Sig. .159), so the conclusion remains the same.

ANOVA a

Model            Sum of Squares   df   Mean Square   F       Sig.
1   Regression   .781             2    .391          1.873   .159b
    Residual     20.219           97   .208
    Total        21.000           99

a. Dependent Variable: buy cars on credit
b. Predictors: (Constant), Income of the family, Years of Education of the head of the family

11. Is the possession of a station-wagon affected by the area in which the

family lives? Is this conclusion upheld when one adds the size of the

family?

A cross-tabulation and chi-square test is applied to station wagon possession and area, with H0: there is no association between the two.

In the table below, the p-value (.755) is greater than the significance level of .05, so H0 is not rejected.

Case Processing Summary

                                                         Cases
                                                         Valid            Missing          Total
                                                         N     Percent    N     Percent    N     Percent
have a stataion Vagon * Area (Northeren or Southeren)    100   100.0%     0     0.0%       100   100.0%

Chi-Square Tests

                                Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                               (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square              .097a     1    .755
Continuity Correction b         .003      1    .959
Likelihood Ratio                .098      1    .754
Fisher's Exact Test                                          .801         .484
Linear-by-Linear Association    .096      1    .756
N of Valid Cases                100

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 7.60.
b. Computed only for a 2x2 table

When the size of the family is added, a regression of station wagon possession on area and size of the family gives the following results. The overall model is significant (F = 29.402, Sig. .000), so the possession of a station wagon depends on area and size of the family taken together.

ANOVA a

Model            Sum of Squares   df   Mean Square   F        Sig.
1   Regression   5.809            2    2.904         29.402   .000b
    Residual     9.581            97   .099
    Total        15.390           99

a. Dependent Variable: have a stataion Vagon
b. Predictors: (Constant), Area (Northeren or Southeren), Size of the Family

have a stataion Vagon * Area (Northeren or Southeren) Crosstabulation

Count
                                Area (Northeren or Southeren)
                                1      2      Total
have a stataion Vagon    1      48     33     81
                         2      12     7      19
Total                           60     40     100

