
Business Analytics

Tuesday + Thursday Lectures: 11:30am-12:50pm

Hello.

To clarify on:

Question 3: Here we should do an Analysis of Variance (ANOVA).

● Line 2996 of the HR database has the value "product_mng" in the Position column. It should read
"Production"

Question 4: Use the t-test: Two-sample

Question 5: Use the Paired Two Sample for Means

If there are other questions about the test, I will update this announcement.

Thank you,

Cris.

November 23rd 11:59pm


Instructions:
▪ The test is made up of one Excel worksheet: (Second Exam Fall 2020.xls)
▪ Download and save the worksheet on your hard drive. You will answer the test in the
worksheet, and upload the worksheet to the Dropbox folder called “Second Exam” on
Courselink.
▪ For every question, create a new tab in Excel and name the tab according to the question
number.
▪ This is an open-book, open-note, open-source exam. However, you may not consult with
your classmates.
▪ Only one file may be uploaded, so please be careful about the file you choose to upload.
▪ Also, give yourself enough time to upload the file, in case of internet failures. All delays will
be penalized with 20% off your test score.
▪ Due Date: Monday, November 23 at 23:59
▪ Total Possible points: 60 points (15% of your total grade)

Individual Questions
1. In a random sample of 321 senior citizens, 61 were found to own a desktop
computer. Based on this sample, what is the 95% confidence interval for the proportion of
desktop owners among senior citizens? Input your answer in the Excel worksheet, under a
tab named “1”. Show your calculation of the standard error, lower value and upper value (3
pts)
2. In a survey of 53 randomly selected patrons of a shopping mall, the mean amount
of currency carried is $42 with a standard deviation of $78. A) What is the 95% confidence
interval for the mean amount of currency carried by mall patrons? Show your calculation of
the standard error, lower value and upper value (3 pts) B) Develop a 95% prediction interval
for the amount of currency carried by mall patrons. (3 pts). C) Explain the difference between
a confidence interval and a prediction interval. Input your answer in the Excel worksheet,
under a tab named “2” (2 pts).
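
The interval calculations in questions 1 and 2 can be sketched in Python as a cross-check on the Excel work. This is only an illustration, under assumptions: the function names are hypothetical, the proportion interval uses the normal (z) approximation, and the mean intervals use a t critical value for 52 degrees of freedom (about 2.0066, which Excel returns from T.INV.2T(0.05, 52)). The course may expect a different critical value or rounding.

```python
import math

Z_95 = 1.959964   # two-sided 95% critical value of the standard normal
T_95_DF52 = 2.0066  # approximate two-sided 95% t critical value, 52 df (assumption)


def proportion_ci(successes, n, z=Z_95):
    """Return (p_hat, standard error, lower, upper) for a sample proportion."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se, p_hat - z * se, p_hat + z * se


def mean_ci(mean, sd, n, crit):
    """Return (standard error, lower, upper) for a sample mean."""
    se = sd / math.sqrt(n)
    return se, mean - crit * se, mean + crit * se


def prediction_interval(mean, sd, n, crit):
    """Return (lower, upper) for a single new observation."""
    half_width = crit * sd * math.sqrt(1 + 1 / n)
    return mean - half_width, mean + half_width


if __name__ == "__main__":
    print(proportion_ci(61, 321))                       # question 1
    print(mean_ci(42, 78, 53, T_95_DF52))               # question 2A
    print(prediction_interval(42, 78, 53, T_95_DF52))   # question 2B
```

The prediction interval is wider than the confidence interval because it covers the spread of an individual patron, not only the uncertainty in the estimated mean.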

Human Resources Database

Use the HR tab in the worksheet to answer questions 3 – 5. Create a tab in Excel for each
question.

3. At your workplace, there is a big debate among the department managers. Each
manager claims that their department puts in more hours of work than the rest. You are called to
settle this matter. Determine if the mean average monthly hours differ by position.
• State the null and alternative hypothesis (1pt)
• Show the summary output (1pt)
• State your conclusion (follow the format/wording we have used in class) (1pt)

4. The General Manager has heard a rumor that the employees with a “low” salary
are working more hours (on average) than the employees with a “high” salary. The manager
asks you to find out whether average working hours are significantly higher for
“low” salary employees than for “high” salary employees.
• State the null and alternative hypothesis (1pt)
• Show the summary output (1pt)
• State your conclusion (follow the format/wording we have used in class) (1pt)

5. During 2020, the organization has been implementing some measures to
improve overall employee satisfaction. Using the HR tab, test the hypothesis that the
mean Satisfaction Level 2020 differs from the mean Satisfaction Level 2019.
Use the paired-sample procedure.
• State the null and alternative hypothesis (1pt)
• Show the summary output (1pt)
• State your conclusion (follow the format/wording we have used in class) (1pt)

The Kahana Hotel

6. The Kahana Hotel has 500 guest rooms. Over the last 3 years, average yearly
occupancy has varied a good deal. Their record low was 250 rooms occupied, and their high
was 484. That range of variation makes staff planning difficult.

Leo, the owner, would like to be able to predict the Kahana's occupancy one month
in advance. So far, he has been making educated guesses about the level of occupancy
one month in the future based on the number of advance bookings. Leo takes the number of
bookings and adds another 50% to be on the safe side. Obviously, he has not done very well
with that method.
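
Leo's current rule of thumb can be written as a one-line function, which makes it easy to compare against a regression forecast later. This is a minimal sketch; the function name and the `markup` parameter are hypothetical.

```python
def leo_guess(advance_bookings, markup=0.50):
    """Leo's rule of thumb: advance bookings plus another 50%."""
    return advance_bookings * (1 + markup)


if __name__ == "__main__":
    print(leo_guess(175))  # 262.5
```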

You need to find out more precisely what the relationship between advance
bookings and occupancy is. Run a regression analysis on Leo's occupancy and advance
bookings data. (20 points)
• Run a regression using advance bookings to predict occupancy. (Show the
summary output) (1pt)
• What is the correlation between advance bookings and the Kahana occupancy
and what does it mean? (2pts)

• How much of the variation in occupancy is explained by the variation in the number
of advance bookings? (State what the variation is and what it means in one sentence) (2 pts)

• Based on the data, how can you be 95% confident that this relationship is
statistically significant? How did you know, and which value did you use? (2 pts)

• For every 100 additional bookings, how many additional guests can Leo expect? (2 pts)

• What is the formula that would help Leo forecast the Kahana occupancy? (3 pts)

• If Leo has 175 advance bookings, how much occupancy can he expect? (2 pts)

• If the true population parameters (mean) are the extremes of the confidence
intervals, the estimate might be as low as _____ and as high as ____ (3 pts)

• Using the formula, calculate how many advance bookings we should have to
reach the maximum occupancy of 500. (You will have to play with the numbers. Make sure
your number is closest to 500, without going over 500) (3 pts)
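
The mechanics behind the bullets above (fitting the line, reading the correlation, forecasting, and searching for the largest booking count that stays at or under capacity) can be sketched in Python. The data below is made up purely for illustration and is NOT the Kahana data from the worksheet; the function names are hypothetical, and the exam itself expects the Excel regression output.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x; also returns r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # correlation; r ** 2 is R-square
    return intercept, slope, r


def predict(intercept, slope, bookings):
    """Forecast occupancy from the fitted line."""
    return intercept + slope * bookings


def max_bookings_under(intercept, slope, capacity=500):
    """Largest whole number of bookings whose forecast does not exceed capacity.

    Assumes a positive slope.
    """
    bookings = math.floor((capacity - intercept) / slope)
    while predict(intercept, slope, bookings) > capacity:
        bookings -= 1  # guard against floating-point rounding
    return bookings


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    bookings = [100, 150, 200, 250, 300]
    occupancy = [260, 310, 350, 410, 450]
    b0, b1, r = fit_line(bookings, occupancy)
    print("intercept, slope, r, R-square:", b0, b1, r, r ** 2)
    print("extra occupancy per 100 bookings:", 100 * b1)
    print("forecast for 175 bookings:", predict(b0, b1, 175))
    print("bookings to stay at or under 500:", max_bookings_under(b0, b1))
```

Solving the fitted equation for bookings and rounding down replaces the trial-and-error search, since any larger whole number would push the forecast over 500.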

7. Leo was very pleased with your model, but there is still an important amount of
variability unexplained. You want to take a look at the relationship between arrivals on Kauai
and the Kahana's occupancy numbers. (7 points)
• Run a regression using advance bookings and arrivals to the island to predict the
occupancy. (Show the summary output) (1pt)
• What is the multiple coefficient of determination in this question and what does it
mean? (2 pts)
• Which number tells you that the dependent variable has a linear relationship with at
least one independent variable? (2 pts)
• For December 2020, if Leo has 175 advance bookings and Kauai is expecting
2,100 arrivals, how much occupancy can he expect? (2 pts)
8. You also want to investigate the impact of the Kahana's competition, The
Excelsior. Excelsior's manager is always offering special promotions that undercut Leo's
room prices! (7 points)
• Run a regression using advance bookings, arrivals to Kauai and Excelsior's
promotions to predict occupancy. (Show the summary output) (1pt)
• How much of the variation is explained by this model? (Answer with a full sentence)
(2 pts)
• Does the dependent variable have a linear relationship with at least one
independent variable? How do you know? (Answer with a full sentence) (2 pts)
• For December 2020, if Leo has 175 advance bookings and Kauai is expecting
2,100 arrivals, and we know that the Excelsior always has a promotion in December, how
much occupancy can he expect? (2 pts)

Multiple regression with categorical variables

9. Using the data in the HR tab, find the best multiple regression model for estimating
the “Satisfaction Level 2020”. To build the model you may use all the variables, EXCEPT
“Position”. (6 points).
• Show the summary outputs of your work (1pt)
• What is the R-square and adjusted R-square of your final model? (1 pt)
• Which variable(s) were not significant? (1 pt)
• Write down the formula for your model (3pt)
