# PPE 6044 Statistik Gunaan

TAKE HOME TEST
Question 1
Using Data File: staffsurvey.sav
a. Follow the procedures to generate appropriate descriptive statistics to answer the following
questions:

i. What percentage of the staff in this organization are permanent employees (use the
variable employstatus)?

ii. What is the average length of service for staff in the organization (use the variable
service)?

iii. What percentage of respondents would recommend the organization to others as a good
place to work (use the variable recommend)?

b. Assess the distribution of scores on the Total Staff Satisfaction scale (totsatis) for employees who
are permanent versus casual (employstatus).

i. Are there any outliers on this scale that you would be concerned about?

ii. Are scores normally distributed for each group?
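As a cross-check outside SPSS, the percentages, mean, and outlier screening asked for in Question 1 can be sketched in plain Python. The records below are invented for illustration (they are not taken from staffsurvey.sav), the helper names are hypothetical, and the 1.5 × IQR fences are the common boxplot convention; SPSS may compute quartiles slightly differently, so the flagged cases need not match its output exactly.

```python
from collections import Counter
from statistics import mean, quantiles

# Hypothetical records: (employstatus, service in years, totsatis score).
# These values are made up; load the real columns from staffsurvey.sav instead.
staff = [
    ("permanent", 12, 35), ("permanent", 5, 38), ("permanent", 8, 33),
    ("permanent", 20, 36), ("permanent", 3, 10), ("permanent", 7, 37),
    ("casual", 1, 30), ("casual", 2, 32), ("casual", 4, 29), ("casual", 1, 31),
]

def percentages(values):
    """Return {category: percentage of cases} for a categorical variable."""
    counts = Counter(values)
    total = sum(counts.values())
    return {k: 100 * c / total for k, c in counts.items()}

def iqr_outliers(scores):
    """Return scores lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [s for s in scores if s < low or s > high]

# Percentage in each employment category and mean length of service.
status_pct = percentages(status for status, _, _ in staff)
mean_service = mean(years for _, years, _ in staff)

# Screen totsatis for outliers separately within each employment group.
outliers = {
    group: iqr_outliers([t for status, _, t in staff if status == group])
    for group in ("permanent", "casual")
}

print(status_pct)
print(mean_service)
print(outliers)
```

The same `percentages` helper would apply to the recommend variable. Normality still has to be judged from the histograms, normal Q-Q plots, and tests of normality that SPSS produces for each group.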

Question 2
Using Data File: sleep.sav
a. Generate a histogram to explore the distribution of scores on the Epworth Sleepiness Scale
(ess).

b. Generate a bar graph to compare scores on the Sleepiness and Associated Sensations Scale
(totSAS) across three age groups (agegp3) for males and females (gender).

c. Generate a scatterplot to explore the relationship between scores on the Epworth Sleepiness
Scale (ess) and the Sleepiness and Associated Sensations Scale (totSAS). Ask for different
markers for males and females (gender).

d. Generate a boxplot to explore the distribution of scores on the Sleepiness and Associated
Sensations Scale (totSAS) for people who report that they do/do not have a problem with their
sleep (problem).

e. Generate a line graph to compare scores on the Sleepiness and Associated Sensations Scale
(totSAS) across the different age groups (use the agegp3 variable) for males and females
(gender).
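The bar graph in part b and the line graph in part e both display one mean per age group and gender. A minimal sketch of that underlying table of cell means follows, assuming invented scores and made-up age-group labels (neither comes from sleep.sav); `cell_means` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cases: (agegp3, gender, totSAS). Values and labels are invented.
cases = [
    ("<=37", "male", 30), ("<=37", "male", 34), ("<=37", "female", 36),
    ("38-50", "male", 28), ("38-50", "female", 33), ("38-50", "female", 35),
    ("51+", "male", 25), ("51+", "female", 31), ("51+", "female", 29),
]

def cell_means(rows):
    """Group scores by (age group, gender) and return the mean of each cell."""
    groups = defaultdict(list)
    for age, gender, score in rows:
        groups[(age, gender)].append(score)
    return {key: mean(values) for key, values in groups.items()}

means = cell_means(cases)

# Each printed row corresponds to one bar (or one point on a line) in the graph.
for (age, gender), value in sorted(means.items()):
    print(f"{age:>6} {gender:<6} mean totSAS = {value:.1f}")
```

Comparing these means with the heights of the bars in the SPSS chart is a quick way to confirm that the correct variables were placed on the category axis and in the cluster or line definition.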

Question 3
Using Data File: Data Exam
a. Start with a visual inspection of the data: draw a scatter plot with the variable Preparation on the
x-axis and Mark on the y-axis. Does the graph indicate any association between the two
variables?

b. To obtain a quantitative measure of the degree of association between the two variables, select
Correlate > Bivariate from the Analyze pull-down menu. In the dialogue box, move the two
variables Preparation and Mark into the Variables frame. Make sure that Pearson is selected in
the Correlation Coefficients frame and that Two-tailed is selected in the Test of Significance
frame. Click OK. Take a look at the output table: What is the correlation between the two
variables? Is it significant?

c. To perform a linear regression analysis, select Regression > Linear from the Analyze pull-down
menu. In the Linear Regression dialogue box select Mark as Dependent and Preparation as
Independent. To get some graphs for the model validation, click the Plots button and select
*SDRESID as Y and *ZPRED as X. Select Histogram and Normal probability plot in the
Standardized Residual Plots frame. Click Continue and then OK in the Linear Regression box
to perform the analysis. Take a look at the output:

i. Compare R in the Model Summary table with the correlation coefficient that you obtained
above.

ii. From the results presented in the ANOVA table, what can you say about the overall linear
regression model: is it significant?

iii. What are the intercept and slope of the estimated regression line, and are they both
significantly different from zero? Also compare the sign of the estimated slope with the sign of
the correlation coefficient: should they have equal or opposite signs?

iv. The ANOVA and the t-tests used in the analysis assume that the errors are normally
distributed and have constant variance. Are there any signs in the plots of the residuals that
these assumptions are not met?
d. To insert a regression line in the scatter plot of the data, right-click on the graph and choose Edit
Content In Separate Window from the pop-up menu. In the Chart Editor select Fit Line at
Total from the Elements pull-down menu. Make sure that Linear is selected in the Fit Method
box in the Fit Line tab. Then click Close to proceed. The least squares linear regression line
should now be visible in the graph. Close the Chart Editor.

e. Perform the corresponding analysis with the variable Pub (which represents hours spent at the
pub instead of preparing for the exam) as independent variable and Mark as dependent variable.

f. Finally, do the same with the variable Commuting (which represents hours spent commuting to
and from the university) as independent variable and Mark as dependent variable.
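To see how the quantities in the SPSS correlation and regression output relate to one another, the sketch below computes Pearson's r, the least squares slope and intercept, and the t statistic for testing r against zero. The data values are invented (they are not from the Data Exam file) and `pearson_and_line` is a hypothetical helper name.

```python
from math import sqrt

# Hypothetical data: hours of preparation and exam mark (invented values).
preparation = [2, 4, 5, 7, 8, 10, 12, 15]
mark = [35, 42, 50, 55, 58, 66, 70, 85]

def pearson_and_line(x, y):
    """Return (r, slope, intercept, t) for a simple linear regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / sqrt(sxx * syy)             # Pearson correlation coefficient
    slope = sxy / sxx                     # least squares slope
    intercept = mean_y - slope * mean_x   # line passes through the means
    t = r * sqrt((n - 2) / (1 - r ** 2))  # test statistic with n - 2 df
    return r, slope, intercept, t

r, slope, intercept, t = pearson_and_line(preparation, mark)
print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}, t = {t:.2f}")
```

Both r and the slope are built from the same cross-product sum `sxy`, which is worth keeping in mind when comparing the Correlations table with the Coefficients table. The same function can be reused with Pub or Commuting as the x variable for parts e and f.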

Question 4

The accompanying data give y = profit margin of savings and loan companies in a given year, x1 = net revenues in
that year, and x2 = number of savings and loan branch offices.

x1     x2     y        x1     x2     y
3.92   7298   0.75     3.66   6546   0.78
3.61   6855   0.71     3.97   7115   0.70
3.32   6636   0.66     3.82   6890   0.79
3.07   6506   0.61     4.07   7327   0.68
3.06   6450   0.70     4.25   7546   0.72
3.11   6402   0.72     4.41   7931   0.55
3.21   6368   0.77     4.49   8097   0.63
3.26   6340   0.74     4.70   8468   0.56
3.42   6349   0.90     4.58   8717   0.41
3.42   6352   0.82     4.69   8991   0.51
3.45   6361   0.75     4.71   9179   0.47
3.58   6369   0.77     4.78   9318   0.32
3.78   6672   0.84

a. Determine the multiple regression equation for the data.

b. Compute and interpret the coefficient of multiple determination, R².

c. At the 5% significance level, determine if the model is useful for predicting the response.

d. Create scatterplots to check Assumption 1 as well as to identify potential outliers and potential influential
observations.

e. At the 5% significance level, does it appear that any of the predictor variables can be removed from the full
model as unnecessary?
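The calculations behind parts a, b, c, and e can be sketched without SPSS using the closed-form least squares solution for two predictors. This is only an illustrative cross-check, not the required SPSS procedure: the 25 observations are transcribed from the table above, while `fit_two_predictors` and the dictionary keys are hypothetical names. The F and t statistics still have to be compared with critical values (or p-values) at the 5% level.

```python
from math import sqrt

# Observations transcribed from the Question 4 table: (x1, x2, y).
data = [
    (3.92, 7298, 0.75), (3.61, 6855, 0.71), (3.32, 6636, 0.66),
    (3.07, 6506, 0.61), (3.06, 6450, 0.70), (3.11, 6402, 0.72),
    (3.21, 6368, 0.77), (3.26, 6340, 0.74), (3.42, 6349, 0.90),
    (3.42, 6352, 0.82), (3.45, 6361, 0.75), (3.58, 6369, 0.77),
    (3.78, 6672, 0.84), (3.66, 6546, 0.78), (3.97, 7115, 0.70),
    (3.82, 6890, 0.79), (4.07, 7327, 0.68), (4.25, 7546, 0.72),
    (4.41, 7931, 0.55), (4.49, 8097, 0.63), (4.70, 8468, 0.56),
    (4.58, 8717, 0.41), (4.69, 8991, 0.51), (4.71, 9179, 0.47),
    (4.78, 9318, 0.32),
]

def fit_two_predictors(rows):
    """Least squares fit of y = b0 + b1*x1 + b2*x2 using centered sums."""
    n = len(rows)
    m1 = sum(a for a, _, _ in rows) / n
    m2 = sum(b for _, b, _ in rows) / n
    my = sum(y for _, _, y in rows) / n
    # Centered sums of squares and cross-products.
    s11 = sum((a - m1) ** 2 for a, _, _ in rows)
    s22 = sum((b - m2) ** 2 for _, b, _ in rows)
    s12 = sum((a - m1) * (b - m2) for a, b, _ in rows)
    s1y = sum((a - m1) * (y - my) for a, _, y in rows)
    s2y = sum((b - m2) * (y - my) for _, b, y in rows)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    resid = [y - (b0 + b1 * a + b2 * b) for a, b, y in rows]
    sse = sum(e ** 2 for e in resid)             # error sum of squares
    sst = sum((y - my) ** 2 for _, _, y in rows)  # total sum of squares
    df_error = n - 3                             # n minus 3 estimated coefficients
    mse = sse / df_error
    r2 = 1 - sse / sst
    return {
        "n": n, "b0": b0, "b1": b1, "b2": b2, "r2": r2,
        "sse": sse, "sst": sst, "mse": mse, "resid": resid,
        "f": (r2 / 2) / ((1 - r2) / df_error),   # overall F with (2, n - 3) df
        "t1": b1 / sqrt(mse * s22 / det),        # t for x1, n - 3 df
        "t2": b2 / sqrt(mse * s11 / det),        # t for x2, n - 3 df
    }

fit = fit_two_predictors(data)
print(f"y-hat = {fit['b0']:.5f} + {fit['b1']:.5f} x1 + {fit['b2']:.8f} x2")
print(f"R^2 = {fit['r2']:.4f}, F = {fit['f']:.2f}, "
      f"t1 = {fit['t1']:.2f}, t2 = {fit['t2']:.2f}")
```

Centering the predictors before solving keeps the arithmetic well conditioned even though x2 is measured in thousands. The printed values should agree with the Coefficients, Model Summary, and ANOVA tables in SPSS up to rounding.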

Hint:
1. Type in Arial, font size 10.
2. Paste the SPSS output into Word using Paste Special [Picture (Enhanced Metafile)].
3. The report must be your own work (you may refer to statistics books); copying from others is strictly
prohibited.
4. Include your name, programme, matric number, mobile phone number, passport photo, and e-mail address (compulsory).
5. Due date: 5 June 2014
6. Address: Jamal @ Nordin Yunus, Fakulti Pengurusan Dan Ekonomi, UPSI.

THE END