
Question : 1

Perform the following tasks in SPSS.


a. Use the catalog.sav sample data file to fit a multiple linear regression model
that predicts “Sales of Men’s Clothing” from the variables “Number of Catalogs
Mailed”, “Number of Pages in Catalog”, “Number of Phone Lines Open for
Ordering”, “Amount Spent on Print Advertising” and “Number of Customer
Service Representatives”. Use the forward selection, backward elimination and
enter methods. Interpret your results in each case.
Solution :
Forward Selection Method :

Variables Entered/Removed (a)

Model  Variables Entered                        Variables Removed  Method
1      Number of Catalogs Mailed                .                  Forward (Criterion: Probability-of-F-to-enter <= .050)
2      Number of Phone Lines Open for Ordering  .                  Forward (Criterion: Probability-of-F-to-enter <= .050)
3      Amount Spent on Print Advertising        .                  Forward (Criterion: Probability-of-F-to-enter <= .050)
4      Number of Pages in Catalog               .                  Forward (Criterion: Probability-of-F-to-enter <= .050)

a. Dependent Variable: Sales of Men's Clothing

Model Summary (e)

Model  R         R Square  Adjusted R Square  Std. Error of the Estimate
1      .803 (a)  .645      .642               3785.49685
2      .877 (b)  .770      .766               3061.36064
3      .885 (c)  .784      .778               2980.12178
4      .891 (d)  .794      .787               2919.90929

a. Predictors: (Constant), Number of Catalogs Mailed
b. Predictors: (Constant), Number of Catalogs Mailed, Number of Phone Lines Open for Ordering
c. Predictors: (Constant), Number of Catalogs Mailed, Number of Phone Lines Open for Ordering, Amount Spent on
Print Advertising
d. Predictors: (Constant), Number of Catalogs Mailed, Number of Phone Lines Open for Ordering, Amount Spent on
Print Advertising, Number of Pages in Catalog
e. Dependent Variable: Sales of Men's Clothing

ANOVA (a)

Model          Sum of Squares   df   Mean Square      F        Sig.
1  Regression  3069712621.002     1  3069712621.002   214.216  .000 (b)
   Residual    1690938397.841   118    14329986.422
   Total       4760651018.843   119
2  Regression  3664135333.036     2  1832067666.518   195.485  .000 (c)
   Residual    1096515685.807   117     9371928.939
   Total       4760651018.843   119
3  Regression  3730440424.702     3  1243480141.567   140.014  .000 (d)
   Residual    1030210594.141   116     8881125.812
   Total       4760651018.843   119
4  Regression  3780175938.309     4   945043984.577   110.844  .000 (e)
   Residual     980475080.535   115     8525870.266
   Total       4760651018.843   119

a. Dependent Variable: Sales of Men's Clothing
b. Predictors: (Constant), Number of Catalogs Mailed
c. Predictors: (Constant), Number of Catalogs Mailed, Number of Phone Lines Open for Ordering
d. Predictors: (Constant), Number of Catalogs Mailed, Number of Phone Lines Open for Ordering, Amount Spent on
Print Advertising
e. Predictors: (Constant), Number of Catalogs Mailed, Number of Phone Lines Open for Ordering, Amount Spent on
Print Advertising, Number of Pages in Catalog
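
The rows of the ANOVA table are tied to the Model Summary by a few identities, which makes it possible to cross-check the output. The following is a minimal Python sketch, not SPSS output; the function name anova_summary is hypothetical, and the inputs are the Model 4 sums of squares and degrees of freedom from the ANOVA table above.

```python
import math

def anova_summary(ss_regression, ss_residual, df_regression, df_residual):
    """Recompute Model Summary figures from one model's ANOVA rows."""
    ss_total = ss_regression + ss_residual
    df_total = df_regression + df_residual
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return {
        # Share of the total variation explained by the regression.
        "r_square": ss_regression / ss_total,
        # Same idea, but penalised for the number of predictors.
        "adjusted_r_square": 1 - ms_residual / (ss_total / df_total),
        # Std. Error of the Estimate is the root of the residual mean square.
        "std_error": math.sqrt(ms_residual),
        # Overall F statistic of the model.
        "f": ms_regression / ms_residual,
    }

# Model 4 of the forward selection run.
model_4 = anova_summary(3780175938.309, 980475080.535, 4, 115)
for name, value in model_4.items():
    print(f"{name}: {value:.3f}")
```

The printed values should agree, up to rounding, with the Model 4 rows of the Model Summary and ANOVA tables.
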

Coefficients (a)

                                            Unstandardized Coefficients  Standardized Coefficients
Model                                       B            Std. Error      Beta                       t       Sig.
1  (Constant)                               -14064.614   2099.365                                   -6.699  .000
   Number of Catalogs Mailed                     2.991       .204        .803                       14.636  .000
2  (Constant)                               -15361.047   1705.559                                   -9.006  .000
   Number of Catalogs Mailed                     1.971       .209        .529                        9.424  .000
   Number of Phone Lines Open for Ordering     334.103     41.952        .447                        7.964  .000
3  (Constant)                               -20665.869   2554.586                                   -8.090  .000
   Number of Catalogs Mailed                     1.862       .207        .500                        8.977  .000
   Number of Phone Lines Open for Ordering     339.159     40.880        .454                        8.296  .000
   Amount Spent on Print Advertising              .218       .080        .121                        2.732  .007
4  (Constant)                               -23898.558   2838.361                                   -8.420  .000
   Number of Catalogs Mailed                     1.847       .203        .496                        9.083  .000
   Number of Phone Lines Open for Ordering     327.802     40.329        .439                        8.128  .000
   Amount Spent on Print Advertising              .208       .078        .115                        2.656  .009
   Number of Pages in Catalog                   50.508     20.912        .104                        2.415  .017

a. Dependent Variable: Sales of Men's Clothing
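
The unstandardized coefficients (B) of the final model define the prediction equation. As a hedged illustration, the Python sketch below evaluates that equation using the Model 4 values from the Coefficients table above. The function name predict_sales, the dictionary keys and the input values in the example call are hypothetical and are not taken from catalog.sav.

```python
# Unstandardized coefficients (B) of Model 4, copied from the Coefficients table.
MODEL_4 = {
    "intercept": -23898.558,
    "catalogs_mailed": 1.847,
    "phone_lines": 327.802,
    "print_advertising": 0.208,
    "catalog_pages": 50.508,
}

def predict_sales(catalogs_mailed, phone_lines, print_advertising, catalog_pages):
    """Predicted Sales of Men's Clothing under Model 4."""
    return (
        MODEL_4["intercept"]
        + MODEL_4["catalogs_mailed"] * catalogs_mailed
        + MODEL_4["phone_lines"] * phone_lines
        + MODEL_4["print_advertising"] * print_advertising
        + MODEL_4["catalog_pages"] * catalog_pages
    )

# Hypothetical inputs, chosen only to show how the equation is used.
base = predict_sales(10000, 30, 25000, 80)
one_more_line = predict_sales(10000, 31, 25000, 80)
print(round(base, 3))
print(round(one_more_line - base, 3))  # effect of one extra phone line
```

The difference between the two predictions equals the B value for phone lines, which is the usual reading of a slope: the expected change in sales for a one-unit change in that predictor with the others held fixed.
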

Excluded Variables (a)

                                                                                 Partial      Collinearity Statistics
Model                                          Beta In    t       Sig.  Correlation  Tolerance
1  Number of Pages in Catalog                   .149 (b)   2.773  .006   .248         .980
   Number of Phone Lines Open for Ordering      .447 (b)   7.964  .000   .593         .625
   Amount Spent on Print Advertising            .104 (b)   1.877  .063   .171         .957
   Number of Customer Service Representatives   .153 (b)   1.997  .048   .182         .501
2  Number of Pages in Catalog                   .110 (c)   2.496  .014   .226         .968
   Amount Spent on Print Advertising            .121 (c)   2.732  .007   .246         .955
   Number of Customer Service Representatives  -.064 (c)   -.933  .353  -.086         .416
3  Number of Pages in Catalog                   .104 (d)   2.415  .017   .220         .965
   Number of Customer Service Representatives  -.079 (d)  -1.183  .239  -.110         .413
4  Number of Customer Service Representatives  -.072 (e)  -1.096  .275  -.102         .412

a. Dependent Variable: Sales of Men's Clothing
b. Predictors in the Model: (Constant), Number of Catalogs Mailed
c. Predictors in the Model: (Constant), Number of Catalogs Mailed, Number of Phone Lines Open for Ordering
d. Predictors in the Model: (Constant), Number of Catalogs Mailed, Number of Phone Lines Open for Ordering,
Amount Spent on Print Advertising
e. Predictors in the Model: (Constant), Number of Catalogs Mailed, Number of Phone Lines Open for Ordering,
Amount Spent on Print Advertising, Number of Pages in Catalog
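
The Excluded Variables table shows why the predictors entered in the order they did: at each step the candidate with the smallest Sig. enters, provided it meets the .050 entry criterion. The Python sketch below replays that rule on the Sig. values from the table. It is a simplification under stated assumptions: SPSS works with unrounded probabilities, whereas this uses the rounded values as printed, and the names EXCLUDED_SIG and next_variable are hypothetical.

```python
ENTRY_ALPHA = 0.05  # Probability-of-F-to-enter criterion used in the run above

# Sig. of each excluded variable, keyed by the model it was excluded from.
EXCLUDED_SIG = {
    1: {
        "Number of Pages in Catalog": 0.006,
        "Number of Phone Lines Open for Ordering": 0.000,
        "Amount Spent on Print Advertising": 0.063,
        "Number of Customer Service Representatives": 0.048,
    },
    2: {
        "Number of Pages in Catalog": 0.014,
        "Amount Spent on Print Advertising": 0.007,
        "Number of Customer Service Representatives": 0.353,
    },
    3: {
        "Number of Pages in Catalog": 0.017,
        "Number of Customer Service Representatives": 0.239,
    },
    4: {
        "Number of Customer Service Representatives": 0.275,
    },
}

def next_variable(candidates, alpha=ENTRY_ALPHA):
    """Return the candidate with the smallest Sig., or None if it fails the criterion."""
    best = min(candidates, key=candidates.get)
    return best if candidates[best] <= alpha else None

for model, candidates in EXCLUDED_SIG.items():
    print(f"after model {model}: enter {next_variable(candidates)}")
```

After Model 4 the only remaining candidate has Sig. = .275, so nothing more enters and the procedure stops.
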

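Interpretation :
Forward selection entered Number of Catalogs Mailed, Number of Phone Lines Open for Ordering, Amount Spent on Print Advertising and Number of Pages in Catalog, in that order. Number of Customer Service Representatives was never entered (Sig. = .275 at the final step, above the .050 entry criterion). The final model explains 79.4% of the variation in sales (R Square = .794, Adjusted R Square = .787) and is significant overall (F = 110.844, Sig. = .000). All four slopes are positive and significant at the 5% level.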

Backward Elimination Method :

Variables Entered/Removed (a)

Model  Variables Entered                            Variables Removed            Method
1      Number of Customer Service Representatives,  .                            Enter
       Number of Pages in Catalog,
       Amount Spent on Print Advertising,
       Number of Phone Lines Open for Ordering,
       Number of Catalogs Mailed (b)
2      .                                            Number of Customer Service   Backward (criterion: Probability
                                                    Representatives              of F-to-remove >= .100).

a. Dependent Variable: Sales of Men's Clothing
b. All requested variables entered.

Model Summary (c)

                                                                          Change Statistics
Model  R         R Square  Adjusted R Square  Std. Error of the Estimate  R Square Change  F Change  df1  df2  Sig. F Change
1      .892 (a)  .796      .787               2917.35290                   .796            89.071    5    114  0.000
2      .891 (b)  .794      .787               2919.90929                  -.002             1.202    1    114  0.275

a. Predictors: (Constant), Number of Customer Service Representatives, Number of Pages in Catalog, Amount Spent on Print
Advertising, Number of Phone Lines Open for Ordering, Number of Catalogs Mailed
b. Predictors: (Constant), Number of Pages in Catalog, Amount Spent on Print Advertising, Number of Phone Lines Open for
Ordering, Number of Catalogs Mailed
c. Dependent Variable: Sales of Men's Clothing

ANOVA (a)

Model          Sum of Squares   df   Mean Square     F        Sig.
1  Regression  3790402952.211     5  758080590.442    89.071  .000 (b)
   Residual     970248066.632   114    8510947.953
   Total       4760651018.843   119
2  Regression  3780175938.309     4  945043984.577   110.844  .000 (c)
   Residual     980475080.535   115    8525870.266
   Total       4760651018.843   119

a. Dependent Variable: Sales of Men's Clothing
b. Predictors: (Constant), Number of Customer Service Representatives, Number of Pages in
Catalog, Amount Spent on Print Advertising, Number of Phone Lines Open for Ordering, Number of
Catalogs Mailed
c. Predictors: (Constant), Number of Pages in Catalog, Amount Spent on Print Advertising, Number
of Phone Lines Open for Ordering, Number of Catalogs Mailed
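
The F Change reported for Model 2 can be reproduced from the residual sums of squares of the two models. The sketch below is a minimal Python illustration of the usual partial F test for dropping predictors. The function name f_change is hypothetical, and the inputs are the residual rows of the ANOVA table above.

```python
def f_change(ss_residual_reduced, ss_residual_full, n_removed, df_residual_full):
    """Partial F statistic for removing n_removed predictors from the full model."""
    ms_residual_full = ss_residual_full / df_residual_full
    extra_ss = ss_residual_reduced - ss_residual_full  # fit lost by the removal
    return (extra_ss / n_removed) / ms_residual_full

# Model 2 (four predictors) against Model 1 (all five predictors).
f = f_change(980475080.535, 970248066.632, 1, 114)
print(round(f, 3))

# With a single predictor removed, F equals the square of that predictor's
# t statistic in the full model (t = -1.096), up to rounding of t.
print(round((-1.096) ** 2, 3))
```

Because only one predictor is removed, the test is equivalent to the t test for Number of Customer Service Representatives in the full model, which is why both carry the same Sig. of .275.
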

Coefficients (a)

                                               Unstandardized Coefficients  Standardized Coefficients
Model                                          B            Std. Error      Beta                       t       Sig.
1  (Constant)                                  -24498.305   2888.172                                   -8.482  .000
   Number of Catalogs Mailed                        1.973       .233         .530                       8.454  .000
   Number of Pages in Catalog                      49.453     20.916         .102                       2.364  .020
   Number of Phone Lines Open for Ordering        348.158     44.367         .466                       7.847  .000
   Amount Spent on Print Advertising                 .215       .079         .119                       2.740  .007
   Number of Customer Service Representatives     -41.720     38.059        -.072                      -1.096  .275
2  (Constant)                                  -23898.558   2838.361                                   -8.420  .000
   Number of Catalogs Mailed                        1.847       .203         .496                       9.083  .000
   Number of Pages in Catalog                      50.508     20.912         .104                       2.415  .017
   Number of Phone Lines Open for Ordering        327.802     40.329         .439                       8.128  .000
   Amount Spent on Print Advertising                 .208       .078         .115                       2.656  .009

a. Dependent Variable: Sales of Men's Clothing

Excluded Variables (a)

                                                                                 Partial      Collinearity Statistics
Model                                          Beta In    t       Sig.  Correlation  Tolerance
2  Number of Customer Service Representatives  -.072 (b)  -1.096  .275  -.102         .412

a. Dependent Variable: Sales of Men's Clothing
b. Predictors in the Model: (Constant), Number of Pages in Catalog, Amount Spent on Print Advertising, Number
of Phone Lines Open for Ordering, Number of Catalogs Mailed

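Interpretation :
Backward elimination started from the full five-predictor model and removed only Number of Customer Service Representatives (Sig. = .275, above the .100 removal criterion). Removing it lowered R Square only from .796 to .794, and the change was not significant (F Change = 1.202, Sig. = .275). The retained model has the same four predictors and coefficients as Model 4 of the forward selection run.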

Enter Method :
Variables Entered/Removed (a)

Model  Variables Entered                            Variables Removed  Method
1      Number of Customer Service Representatives,  .                  Enter
       Number of Pages in Catalog,
       Amount Spent on Print Advertising,
       Number of Phone Lines Open for Ordering,
       Number of Catalogs Mailed (b)

a. Dependent Variable: Sales of Men's Clothing
b. All requested variables entered.

Model Summary (b)

Model  R         R Square  Adjusted R Square  Std. Error of the Estimate
1      .892 (a)  .796      .787               2917.35290

a. Predictors: (Constant), Number of Customer Service Representatives, Number of Pages in Catalog,
Amount Spent on Print Advertising, Number of Phone Lines Open for Ordering, Number of Catalogs
Mailed
b. Dependent Variable: Sales of Men's Clothing

ANOVA (a)

Model          Sum of Squares   df   Mean Square     F       Sig.
1  Regression  3790402952.211     5  758080590.442   89.071  .000 (b)
   Residual     970248066.632   114    8510947.953
   Total       4760651018.843   119

a. Dependent Variable: Sales of Men's Clothing
b. Predictors: (Constant), Number of Customer Service Representatives, Number of Pages in Catalog, Amount Spent on
Print Advertising, Number of Phone Lines Open for Ordering, Number of Catalogs Mailed

Coefficients (a)

                                               Unstandardized Coefficients  Standardized Coefficients
Model                                          B            Std. Error      Beta                       t       Sig.
1  (Constant)                                  -24498.305   2888.172                                   -8.482  .000
   Number of Catalogs Mailed                        1.973       .233         .530                       8.454  .000
   Number of Pages in Catalog                      49.453     20.916         .102                       2.364  .020
   Number of Phone Lines Open for Ordering        348.158     44.367         .466                       7.847  .000
   Amount Spent on Print Advertising                 .215       .079         .119                       2.740  .007
   Number of Customer Service Representatives     -41.720     38.059        -.072                      -1.096  .275

a. Dependent Variable: Sales of Men's Clothing

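Interpretation :
With all five predictors entered together, the model is significant overall (F = 89.071, Sig. = .000) and explains 79.6% of the variation in sales (R Square = .796, Adjusted R Square = .787). Four predictors are significant at the 5% level. Number of Customer Service Representatives is not (t = -1.096, Sig. = .275), which is consistent with its exclusion under the forward and backward methods.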

Question : 1
b. Use the bankloan.sav sample data file to fit a binary logistic regression model
to predict default on the basis of variables age, ed, income, debtinc,
creddebt, and othdebt. Interpret your results.
Logistic Regression :
Case Processing Summary

Unweighted Cases (a)                      N    Percent
Selected Cases    Included in Analysis    700   82.4
                  Missing Cases           150   17.6
                  Total                   850  100.0
Unselected Cases                            0     .0
Total                                     850  100.0

a. If weight is in effect, see classification table for the total number of cases.

Dependent Variable Encoding

Original Value Internal Value

No 0
Yes 1

Classification Table (a, b)

                                     Predicted
                                     Previously defaulted  Percentage
        Observed                     No     Yes            Correct
Step 0  Previously defaulted  No     517    0              100.0
                              Yes    183    0                 .0
        Overall Percentage                                  73.9

a. Constant is included in the model.
b. The cut value is .500

Variables in the Equation

B S.E. Wald df Sig. Exp(B)

Step 0 Constant -1.039 .086 145.782 1 .000 .354
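
The Step 0 row can be recovered directly from the observed counts, since a constant-only logistic model simply reproduces the overall odds of the outcome. The sketch below is a minimal Python check using the counts 517 (No) and 183 (Yes) from the Step 0 classification table. The variable names are hypothetical.

```python
import math

n_no, n_yes = 517, 183  # observed counts from the Step 0 classification table

odds = n_yes / n_no                       # Exp(B) of the constant
b0 = math.log(odds)                       # B of the constant (log odds)
se = math.sqrt(1 / n_yes + 1 / n_no)      # S.E. of the log odds
wald = (b0 / se) ** 2                     # Wald statistic
baseline_accuracy = 100 * max(n_no, n_yes) / (n_no + n_yes)

print(round(b0, 3), round(se, 3), round(wald, 3), round(odds, 3))
print(round(baseline_accuracy, 1))
```

The baseline accuracy is what the model achieves by predicting the more frequent category for every case, which is the benchmark the Step 1 model has to beat.
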

Variables not in the Equation

Score df Sig.

Step 0 Variables age 13.265 1 .000

ed 9.205 1 .002
income 3.526 1 .060

debtinc 106.238 1 .000

creddebt 41.928 1 .000

othdebt 14.863 1 .000

Overall Statistics 148.310 6 .000

Omnibus Tests of Model Coefficients

Chi-square df Sig.

Step 1 Step 153.662 6 .000

Block 153.662 6 .000

Model 153.662 6 .000

Model Summary

Step  -2 Log likelihood  Cox & Snell R Square  Nagelkerke R Square
1     650.702 (a)        .197                  .289

a. Estimation terminated at iteration number 5 because parameter estimates changed by less than .001.
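
Both pseudo R Square values follow from the -2 Log likelihood and the model chi-square. The sketch below is a minimal Python illustration using the standard Cox & Snell and Nagelkerke definitions. The function name pseudo_r_squares is hypothetical, and the inputs are the -2 Log likelihood (650.702), the model chi-square from the Omnibus Tests table (153.662) and the 700 cases included in the analysis.

```python
import math

def pseudo_r_squares(neg2ll_model, model_chi_square, n_cases):
    """Cox & Snell and Nagelkerke R Square from likelihood statistics."""
    # The model chi-square is the drop in -2LL from the constant-only model.
    neg2ll_null = neg2ll_model + model_chi_square
    cox_snell = 1 - math.exp(-model_chi_square / n_cases)
    # Cox & Snell cannot reach 1; Nagelkerke rescales it by its maximum.
    max_cox_snell = 1 - math.exp(-neg2ll_null / n_cases)
    return cox_snell, cox_snell / max_cox_snell

cox_snell, nagelkerke = pseudo_r_squares(650.702, 153.662, 700)
print(round(cox_snell, 3), round(nagelkerke, 3))
```
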

Classification Table (a)

                                     Predicted
                                     Previously defaulted  Percentage
        Observed                     No     Yes            Correct
Step 1  Previously defaulted  No     483    34             93.4
                              Yes    122    61             33.3
        Overall Percentage                                 77.7

a. The cut value is .500
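
The percentages in the classification table are simple ratios of its four counts. The sketch below recomputes them in Python; the function name classification_rates and the dictionary keys are hypothetical labels for the standard quantities.

```python
def classification_rates(true_neg, false_pos, false_neg, true_pos):
    """Percentage correct per observed category and overall."""
    total = true_neg + false_pos + false_neg + true_pos
    return {
        # Observed "No" cases that were predicted "No".
        "specificity": 100 * true_neg / (true_neg + false_pos),
        # Observed "Yes" cases that were predicted "Yes".
        "sensitivity": 100 * true_pos / (true_pos + false_neg),
        "overall": 100 * (true_neg + true_pos) / total,
    }

# Counts from the Step 1 classification table.
rates = classification_rates(true_neg=483, false_pos=34, false_neg=122, true_pos=61)
for name, value in rates.items():
    print(f"{name}: {value:.1f}")
```

The low sensitivity shows that the overall percentage is driven mostly by the larger "No" group.
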

Variables in the Equation

                                                                 95% C.I. for EXP(B)
                      B        S.E.   Wald    df  Sig.  Exp(B)   Lower   Upper
Step 1 (a)  age        -.047   .014   10.632  1   .001   .954     .927    .981
            ed          .392   .105   13.802  1   .000  1.480    1.203   1.820
            income     -.013   .007    3.077  1   .079   .987     .974   1.001
            debtinc     .111   .027   16.986  1   .000  1.117    1.060   1.178
            creddebt    .341   .088   15.186  1   .000  1.407    1.185   1.671
            othdebt    -.069   .062    1.221  1   .269   .933     .826   1.055
            Constant  -1.198   .563    4.523  1   .033   .302

a. Variable(s) entered on step 1: age, ed, income, debtinc, creddebt, othdebt.
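
Exp(B) and its confidence interval are transformations of B and S.E., and the fitted equation turns a set of predictor values into a probability. The sketch below is a minimal Python illustration using the rounded coefficients from the table above, so its results match the table only up to rounding. The names odds_ratio_ci and predicted_probability are hypothetical, and the applicant values in the example are hypothetical rather than taken from bankloan.sav.

```python
import math

# B values from the Step 1 "Variables in the Equation" table.
B = {
    "age": -0.047, "ed": 0.392, "income": -0.013,
    "debtinc": 0.111, "creddebt": 0.341, "othdebt": -0.069,
}
CONSTANT = -1.198

def odds_ratio_ci(b, se, z=1.96):
    """Exp(B) with an approximate 95% confidence interval."""
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

def predicted_probability(values):
    """Probability of the outcome coded 1 ("Yes") for a dict of predictor values."""
    logit = CONSTANT + sum(B[name] * values.get(name, 0.0) for name in B)
    return 1 / (1 + math.exp(-logit))

or_ed, lower_ed, upper_ed = odds_ratio_ci(0.392, 0.105)
print(round(or_ed, 3), round(lower_ed, 3), round(upper_ed, 3))

# Hypothetical applicant, for illustration only.
applicant = {"age": 35, "ed": 2, "income": 40, "debtinc": 10, "creddebt": 1.5, "othdebt": 3}
p = predicted_probability(applicant)
print(round(p, 3), "Yes" if p >= 0.5 else "No")  # .500 cut value as in the table
```

An Exp(B) above 1 means the odds of the outcome rise as the predictor increases, and an interval that contains 1 corresponds to a predictor that is not significant at the 5% level.
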

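Interpretation :
Of the 850 cases, 700 are included in the analysis and 150 are missing. The model improves significantly on the constant-only model (chi-square = 153.662, df = 6, Sig. = .000), with Nagelkerke R Square = .289. At the .500 cut value it classifies 77.7% of cases correctly, against 73.9% for the constant-only model, but it identifies only 33.3% of the observed "Yes" cases. Holding the other predictors fixed, age (Exp(B) = .954), ed (1.480), debtinc (1.117) and creddebt (1.407) are significant at the 5% level. Income (Sig. = .079) and othdebt (Sig. = .269) are not, and their 95% confidence intervals for Exp(B) include 1.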

Question : 1
b. Find the frequency distribution of “Preferred breakfast” for those senior
citizens who are also living an active life (use the sample dataset
cereal.sav).

solution yet to be solved

Question : 2
a. Carry out a test of independence of attributes “Preferred breakfast” and
“Marital status”. (Use cereal.sav)

Case Processing Summary

                                      Cases
                                      Valid           Missing         Total
                                      N    Percent    N    Percent    N    Percent
Preferred breakfast * Marital status  880  100.0%     0    0.0%       880  100.0%

Preferred breakfast * Marital status Crosstabulation

                                                   Marital status
Preferred breakfast                                Unmarried  Married  Total
Breakfast Bar  Count                               108        123      231
               Expected Count                      79.5       151.5    231.0
               % within Preferred breakfast        46.8%      53.2%    100.0%
Oatmeal        Count                               95         215      310
               Expected Count                      106.7      203.3    310.0
               % within Preferred breakfast        30.6%      69.4%    100.0%
Cereal         Count                               100        239      339
               Expected Count                      116.7      222.3    339.0
               % within Preferred breakfast        29.5%      70.5%    100.0%
Total          Count                               303        577      880
               Expected Count                      303.0      577.0    880.0
               % within Preferred breakfast        34.4%      65.6%    100.0%

Chi-Square Tests

                              Value       df  Asymp. Sig. (2-sided)
Pearson Chi-Square            21.157 (a)  2   .000
Likelihood Ratio              20.623      2   .000
Linear-by-Linear Association  16.226      1   .000
N of Valid Cases              880

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 79.54.

Symmetric Measures

Value Approx. Sig.

Nominal by Nominal Contingency Coefficient .153 .000


N of Valid Cases 880
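
The Pearson chi-square and the contingency coefficient can both be recomputed from the observed counts in the crosstabulation. The sketch below is a minimal Python illustration; the function name pearson_chi_square is hypothetical, and the counts are the Unmarried and Married columns for Breakfast Bar, Oatmeal and Cereal.

```python
import math

def pearson_chi_square(observed):
    """Pearson chi-square, degrees of freedom and N for a two-way table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi_square, df, n

# Rows: Breakfast Bar, Oatmeal, Cereal. Columns: Unmarried, Married.
observed = [[108, 123], [95, 215], [100, 239]]
chi_square, df, n = pearson_chi_square(observed)
contingency = math.sqrt(chi_square / (chi_square + n))
print(round(chi_square, 3), df, round(contingency, 3))
```

The contingency coefficient rescales the chi-square by the sample size, so it describes the strength of the association rather than only its significance.
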

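Interpretation :
Pearson chi-square = 21.157 with 2 df and Sig. = .000, so the hypothesis that preferred breakfast and marital status are independent is rejected at the 5% level. No cell has an expected count below 5, so the chi-square approximation is adequate. The contingency coefficient of .153 indicates a weak association: unmarried respondents choose a breakfast bar more often than expected under independence (108 observed against 79.5 expected).
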
Question : 2
b. Carry out a test of independence of attributes “Preferred breakfast” and
“Age category”. (Use cereal.sav)

Case Processing Summary

                                    Cases
                                    Valid           Missing         Total
                                    N    Percent    N    Percent    N    Percent
Preferred breakfast * Age category  880  100.0%     0    0.0%       880  100.0%

Preferred breakfast * Age category Crosstabulation

                                                   Age category
Preferred breakfast                                Under 31  31-45   46-60   Over 60  Total
Breakfast Bar  Count                               84        90      39      18       231
               Expected Count                      47.5      54.1    60.6    68.8     231.0
               % within Preferred breakfast        36.4%     39.0%   16.9%   7.8%     100.0%
Oatmeal        Count                               4         24      97      185      310
               Expected Count                      63.8      72.6    81.4    92.3     310.0
               % within Preferred breakfast        1.3%      7.7%    31.3%   59.7%    100.0%
Cereal         Count                               93        92      95      59       339
               Expected Count                      69.7      79.4    89.0    100.9    339.0
               % within Preferred breakfast        27.4%     27.1%   28.0%   17.4%    100.0%
Total          Count                               181       206     231     262      880
               Expected Count                      181.0     206.0   231.0   262.0    880.0
               % within Preferred breakfast        20.6%     23.4%   26.3%   29.8%    100.0%

Chi-Square Tests

                              Value        df  Asymp. Sig. (2-sided)
Pearson Chi-Square            309.336 (a)  6   .000
Likelihood Ratio              350.688      6   .000
Linear-by-Linear Association  4.986        1   .026
N of Valid Cases              880

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 47.51.

Symmetric Measures
Value Approx. Sig.

Nominal by Nominal Contingency Coefficient .510 .000


N of Valid Cases 880
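
The Likelihood Ratio row and the Sig. column can be checked in the same spirit. The sketch below is a minimal Python illustration with hypothetical function names. It computes the likelihood ratio statistic from the observed counts and evaluates the chi-square upper tail probability with the closed form that holds when the degrees of freedom are even, as they are here (df = 6).

```python
import math

def likelihood_ratio(observed):
    """Likelihood ratio statistic: 2 * sum of O * ln(O / E) over all cells."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    g2 = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            if count > 0:  # an empty cell contributes nothing
                expected = row_totals[i] * col_totals[j] / n
                g2 += count * math.log(count / expected)
    return 2 * g2

def chi_square_sf_even_df(x, df):
    """P(chi-square > x) for even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    term = total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Rows: Breakfast Bar, Oatmeal, Cereal. Columns: Under 31, 31-45, 46-60, Over 60.
observed_age = [[84, 90, 39, 18], [4, 24, 97, 185], [93, 92, 95, 59]]
g2 = likelihood_ratio(observed_age)
p_pearson = chi_square_sf_even_df(309.336, 6)
print(round(g2, 3), p_pearson < 0.0005)
```

A tail probability below .0005 is what the table displays as .000 after rounding to three decimals.
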

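Interpretation :
Pearson chi-square = 309.336 with 6 df and Sig. = .000, so the hypothesis that preferred breakfast and age category are independent is rejected at the 5% level. No cell has an expected count below 5. The contingency coefficient of .510 indicates a substantial association. Oatmeal is chosen mainly by respondents over 60 (185 observed against 92.3 expected), while breakfast bars are chosen mainly by respondents aged 45 and under.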

Question : 2
c. Use the grocery_coupons.sav sample data file to test whether the mean amount
spent is equal to 105. Find a 90% confidence interval for the mean amount
spent.
Also test whether the mean amount spent is the same for male and female
customers. What would you say about the equality of mean amount spent across
stores of different sizes?
