BUS105 Self-Practices
(a) The marketing director at Ace Realty Company has collected selling price
information on the houses sold in the last month for his study on the
market trend. Selling price is reported in thousands of dollars and the
following chart was constructed.
[Histogram: Number of homes sold (vertical axis, 0 to 30) against Selling Price
(Thousands S$) (horizontal axis), with bins <150, [150,175), [175,200),
[200,225), [225,250), [250,275), [275,300) and >=300. Bar labels shown: 26, 20,
16, 12, 13, 7, 7 and 4.]
ii. Explain the meaning of measure of dispersion and state the values of
three (3) measures of variation shown in the above table.
(a) The following table shows a recent study on the relationship between
gender and interest in coffee:
Likes Coffee    Does Not Like Coffee    TOTAL
i. is male,
i. Compute the probability that Mary can sell at least two houses on a
typical week.
ii. How many houses on average can Mary sell on a typical week?
(c) The weight of a bag of rice is normally distributed with mean of 500g and a
standard deviation of 25g.
ii. A box contains 10 such bags of rice. What is the probability that the
mean weight is between 490g and 515g?
Question 3
4.15, 4.3, 4.25, 4.3, 4.3, 4.5, 4.5, 4.25, 4.45, 4.2, 4.1, 4.55
Lifespan
Mean 4.320833
Standard Error 0.042399
Median 4.3
Mode 4.3
Standard Deviation 0.146874
Sample Variance 0.021572
Kurtosis -1.12413
Skewness 0.256023
Range 0.45
Minimum 4.1
Maximum 4.55
Sum 51.85
Count 12
Confidence Level(95.0%) 0.093319
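The Excel summary above can be reproduced from the raw lifespans. The following is a minimal sketch using only the Python standard library. The critical value 2.201 (two-tailed, 95%, 11 degrees of freedom) is an assumption read from a t-table, since it is not printed in the Excel output.

```python
import math
import statistics

# Lifespan observations listed in Question 3.
lifespans = [4.15, 4.3, 4.25, 4.3, 4.3, 4.5, 4.5, 4.25, 4.45, 4.2, 4.1, 4.55]

n = len(lifespans)
mean = statistics.mean(lifespans)
std_dev = statistics.stdev(lifespans)   # sample standard deviation (divides by n - 1)
std_error = std_dev / math.sqrt(n)      # Excel's "Standard Error"

# Assumption: t critical value for 95% confidence with n - 1 = 11 degrees
# of freedom, taken from a t-table rather than from the Excel output.
T_CRITICAL = 2.201
margin = T_CRITICAL * std_error         # Excel's "Confidence Level(95.0%)"

lower, upper = mean - margin, mean + margin
print(f"mean = {mean:.6f}, s = {std_dev:.6f}, SE = {std_error:.6f}")
print(f"margin = {margin:.6f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The interval is the sample mean plus or minus the margin, which is how the "Confidence Level(95.0%)" figure in the table is meant to be used.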
iii. What are the assumptions required for the construction of the
confidence interval, and how could we verify them?
(b) The Wawa University wanted to investigate whether it is true that its
engineering graduates earn more than its business graduates. As such,
the University conducted a graduate survey on its recently graduated
students. Unfortunately, due to poor response, only ten engineering and
eight business graduates responded. These students provided information
on their starting salary and Wawa proceeded with its analysis by
generating the following Excel table.
Engineering Business
Mean 30000 29000
Variance 4000000 2285714.286
Observations 10 8
Pooled Variance 3250000
Hypothesized Mean Difference 0
df 16
t Stat 1.169410692
P(T<=t) one-tail 0.129682486
t Critical one-tail 1.745883676
P(T<=t) two-tail 0.259364971
t Critical two-tail 2.119905299
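The pooled variance and t statistic in the Excel table follow directly from the summary figures. The sketch below recomputes them with the Python standard library. The one-tail critical value is copied from the table rather than computed, and the variable names are illustrative only.

```python
import math

# Summary statistics copied from the Excel table.
mean_eng, var_eng, n_eng = 30000, 4000000, 10
mean_bus, var_bus, n_bus = 29000, 2285714.286, 8

# Pooled variance: weighted average of the two sample variances.
df = n_eng + n_bus - 2
pooled_var = ((n_eng - 1) * var_eng + (n_bus - 1) * var_bus) / df

# t statistic for a hypothesized mean difference of 0.
std_err_diff = math.sqrt(pooled_var * (1 / n_eng + 1 / n_bus))
t_stat = (mean_eng - mean_bus) / std_err_diff

# One-tail critical value at alpha = 0.05, as printed in the Excel table.
T_CRITICAL_ONE_TAIL = 1.745883676
reject_h0 = t_stat > T_CRITICAL_ONE_TAIL

print(f"df = {df}, pooled variance = {pooled_var:.0f}, t = {t_stat:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Comparing the t statistic with the critical value gives the same decision as comparing the one-tail p-value with 0.05.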
ii. Discuss two (2) statistical concerns you might have in conducting
this statistical analysis.
Question 4
A major portion of Cranberry Food Delivery Pte Ltd involves delivering lunch
boxes to its customers. In order to schedule the deliveries efficiently, the
managers needed to estimate the total travel time for the drivers of the company
on assignments which are carried out in the day.
It is deemed that the daily total travel time is dependent on the number of
delivery stops and distance travelled (in km). Data was thereby collected and
displayed in the following diagram:
Route   Total Travel Time (hours)   Number of Deliveries   Distance Travelled (km)
1       16.46                       3                      100
2       15.26                       5                      70
3       16.48                       4                      100
4       19.17                       4                      100
5       12.37                       2                      80
6       6.5                         1                      40
7       12.56                       4                      50
8       13.06                       4                      70
9       10.07                       3                      55
10      17.46                       5                      75
(a) Describe the linear relationship underlying the data by writing down the linear
equation that relates the Total Travel Time to the Number of Deliveries and
Distance Travelled. Interpret the coefficients obtained as they relate to the
managers’ scheduling problem.
(c) A new route is planned on which there will be 3 delivery stops and the length
of the route is 85 km. Estimate the total travel time in hours.
(f) What are the assumptions made when the method of regression is applied to
study data? Discuss these assumptions in the context of the delivery scheduling
problem of Cranberry Food Delivery Pte Ltd.
Solutions to BUS105 Self-Practices
Question 1
a)
iii. From the chart, we can observe that the distribution is positively skewed
(or right-skewed) since it has a longer tail to the right.
iv. The above is a histogram showing the selling prices of 105 houses sold by
Ace Realty Company during the last month. Generally, most houses were sold
in the range of $150,000 to $275,000. The most common price range was
$175,000 to $200,000. Nevertheless, a substantial number of houses were
also sold at prices above $275,000. Very few houses were sold at a price
below $150,000. (The answer to this question may vary.)
b)
iii. In this case, the mean would be the most appropriate measure of location
since the standard deviation is small (about 20% of the mean).
Question 2
a)
(i) P(male) = 300/500 = 3/5 = 0.6
(ii) P(female and likes coffee) = 110/500 = 11/50 = 0.22
(iii) (230/500) / (300/500) = 230/300 = 23/30 ≈ 0.767
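The fraction arithmetic in these probabilities can be checked with exact rational numbers. This is a small sketch using Python's `fractions` module. The counts 300, 110, 230 and 500 are taken from the working above. Treating 230/500 as a joint probability involving males is an assumption, because the contingency table itself is not reproduced here.

```python
from fractions import Fraction

TOTAL = 500  # total respondents, from the denominators in the working

p_male = Fraction(300, TOTAL)
p_female_and_likes = Fraction(110, TOTAL)

# Assumption: 230/500 is the joint probability of "male and <event>";
# dividing by P(male) then gives the conditional probability given male.
p_joint_with_male = Fraction(230, TOTAL)
p_given_male = p_joint_with_male / p_male

for label, p in [("P(male)", p_male),
                 ("P(female and likes coffee)", p_female_and_likes),
                 ("conditional given male", p_given_male)]:
    print(f"{label} = {p} = {float(p):.3f}")
```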
b)
µ = 500, σ = 25
(i) P(X > 510) = P(Z > (510 - 500)/25)
= P(Z > 0.4) = 0.5 - 0.1554 = 0.3446
(ii) For the mean of 10 bags, the standard error is 25/√10 ≈ 7.91, so
P(490 < X̄ < 515) = P(-1.26 < Z < 1.90)
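Both normal probabilities for the rice bags can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library, with the mean of 500g and standard deviation of 25g stated in the question. It works with unrounded z-values, so the second result may differ slightly from a table lookup at z = -1.26 and z = 1.90.

```python
import math
from statistics import NormalDist

MU, SIGMA, BAGS_PER_BOX = 500, 25, 10

# (i) Probability that a single bag weighs more than 510g.
single_bag = NormalDist(MU, SIGMA)
p_over_510 = 1 - single_bag.cdf(510)

# (ii) The mean of 10 bags is normal with standard error sigma / sqrt(n).
std_error = SIGMA / math.sqrt(BAGS_PER_BOX)
box_mean = NormalDist(MU, std_error)
p_mean_between = box_mean.cdf(515) - box_mean.cdf(490)

z_low = (490 - MU) / std_error
z_high = (515 - MU) / std_error
print(f"P(X > 510) = {p_over_510:.4f}")
print(f"z from {z_low:.2f} to {z_high:.2f}: P = {p_mean_between:.4f}")
```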
a)
H0: μ1 - μ2 ≤ 0
H1: μ1 - μ2 > 0 (claim)
(right-tailed)
The two samples were independent. The sample variances were close enough to
infer equal population variances. Assuming that both populations were normally
distributed, we can perform a pooled t-test.
From the Excel table, the test statistic is 1.17, which gives a one-tail
p-value of 0.13.
Conclusion:
Do not reject H0, since p-value = 0.13 > 0.05.
Therefore, there is insufficient evidence to conclude that engineering
graduates earn more than business graduates.
1. As the sample sizes were small, there is a high chance that the samples are
biased. Besides, those who replied may be those who are doing well and are
hence more forthcoming in responding to the survey. The samples may not be
representative of the cohort.
(a)
Estimated Mean Travel Time
= -0.19 + 1.53 (No. of Deliveries) + 0.12 (Distance Travelled)
Interpretation of Coefficients:
For every additional delivery, the estimated mean travel time increases by 1.53
hours, holding the distance travelled constant.
For every additional km travelled, the estimated mean travel time increases by
0.12 hour, holding the number of deliveries constant.
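The coefficients above can be reproduced from the ten observations with an ordinary least-squares fit. The sketch below solves the two-predictor case with centred sums in plain Python. Reading the data columns as travel time (hours), number of deliveries and distance (km) is an assumption based on the question text, and `fit_two_predictors` is a helper written for this illustration. The last lines apply the rounded equation to the new route in part (c), with 3 stops and 85 km.

```python
# Assumed column order from the data listing:
# travel time (hours), number of deliveries, distance travelled (km).
travel_time = [16.46, 15.26, 16.48, 19.17, 12.37, 6.5, 12.56, 13.06, 10.07, 17.46]
deliveries = [3, 5, 4, 4, 2, 1, 4, 4, 3, 5]
distance = [100, 70, 100, 100, 80, 40, 50, 70, 55, 75]


def fit_two_predictors(y, x1, x2):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 using centred sums."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (v - my) for a, v in zip(x1, y))
    s2y = sum((b - m2) * (v - my) for b, v in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


b0, b1, b2 = fit_two_predictors(travel_time, deliveries, distance)
print(f"Estimated travel time = {b0:.2f} + {b1:.2f} (deliveries) + {b2:.2f} (km)")

# New route: 3 delivery stops, 85 km, using the rounded equation.
estimate_rounded = -0.19 + 1.53 * 3 + 0.12 * 85
estimate_unrounded = b0 + b1 * 3 + b2 * 85
print(f"rounded: {estimate_rounded:.1f} h, unrounded: {estimate_unrounded:.2f} h")
```

The rounded equation gives about 14.6 hours, while the unrounded coefficients give a slightly lower figure, so the difference is due to rounding only.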
(b) The coefficient of determination (R²) is 0.939 and the adjusted coefficient
of determination (adjusted R²) is 0.922.
This means 93.9% (or 92.2% after adjusting for the number of independent
variables) of the variation in travel time can be explained by the independent
variables, Number of Deliveries and Distance Travelled.
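The adjusted figure follows from the unadjusted one through the usual adjustment for sample size and number of predictors. A minimal sketch, assuming n = 10 observations and k = 2 independent variables as in this problem:

```python
# Values from the solution: R-squared, sample size, number of predictors.
r_squared = 0.939
n_obs = 10
k_predictors = 2

# Adjusted R-squared penalises the fit for each additional predictor.
adjusted_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - k_predictors - 1)
print(f"adjusted R-squared = {adjusted_r_squared:.3f}")
```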
H0: β1 = β2 = 0
H1: Not all β's equal 0
α = 0.05
According to the statistical report, the test statistic is F = 54.0, which
results in a p-value of 5.56 × 10^-5.
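The global F statistic can be recomputed from the data as the ratio of the regression mean square to the error mean square. The sketch below is plain Python and reads the data columns as travel time, deliveries and distance, which is an assumption. The p-value line uses a closed form that holds only when the numerator degrees of freedom equal 2, as they do here. It is not a general F-distribution routine.

```python
# Assumed column order: travel time (hours), deliveries, distance (km).
y = [16.46, 15.26, 16.48, 19.17, 12.37, 6.5, 12.56, 13.06, 10.07, 17.46]
x1 = [3, 5, 4, 4, 2, 1, 4, 4, 3, 5]
x2 = [100, 70, 100, 100, 80, 40, 50, 70, 55, 75]

n, k = len(y), 2
my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n

# Centred sums of squares and cross-products.
s11 = sum((a - m1) ** 2 for a in x1)
s22 = sum((b - m2) ** 2 for b in x2)
s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
s1y = sum((a - m1) * (v - my) for a, v in zip(x1, y))
s2y = sum((b - m2) * (v - my) for b, v in zip(x2, y))

# Least-squares slopes for the two predictors.
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det

# Analysis of variance decomposition.
sst = sum((v - my) ** 2 for v in y)      # total sum of squares
ssr = b1 * s1y + b2 * s2y                # regression sum of squares
sse = sst - ssr                          # error sum of squares
df_error = n - k - 1

f_stat = (ssr / k) / (sse / df_error)
r_squared = ssr / sst

# Special case: with 2 numerator degrees of freedom the upper-tail
# probability of F has this closed form.
p_value = (1 + k * f_stat / df_error) ** (-df_error / 2)

print(f"R-squared = {r_squared:.3f}, F = {f_stat:.1f}, p-value = {p_value:.2e}")
```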
H0: β1 = 0        H0: β2 = 0
H1: β1 ≠ 0        H1: β2 ≠ 0
α = 0.05
Since both p-values are less than 0.05, we reject H0 and accept H1 for both
hypotheses.
Therefore, the independent variables Number of Delivery Stops and Distance
Travelled are both significant and do not need to be removed.
(f) Assumptions for the method of regression:
1. There is a linear relationship between the dependent variable and the set of
independent variables.
>> We can observe this through the scatter plots of Travel Time vs No. of
Deliveries and Travel Time vs km Travelled.
2. The variation in the residuals is the same for both large and small values of
the estimated dependent variable.
>> This can be determined by plotting Residual vs Estimated Mean Travel Time.
These residuals should be scattered randomly in an even, horizontal band
around 0 and show no obvious pattern.
3. The residuals follow the normal probability distribution.
>> We can plot a histogram of the residuals. We should observe that the
histogram fits the normal probability distribution reasonably well.
4. The independent variables are not highly correlated with one another (no
multicollinearity).
>> We can compute the VIF for each variable. If all the VIFs are less than 10,
we conclude that multicollinearity is not a concern for this model.
5. The residuals are independent of one another.
>> The independence of the residuals is subject to the design of the study and
the way the data have been collected. For this case, if the 10 routes in the
dataset were randomly selected and are independent, we may conclude that the
assumption is satisfied.
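The VIF check mentioned above is easy to carry out when there are only two independent variables, because both share the same VIF, 1 / (1 - r²), where r is the correlation between them. The sketch below is plain Python and assumes the third and fourth data columns are number of deliveries and distance in km.

```python
# Assumed predictor columns from the data listing.
deliveries = [3, 5, 4, 4, 2, 1, 4, 4, 3, 5]
distance = [100, 70, 100, 100, 80, 40, 50, 70, 55, 75]

n = len(deliveries)
m1, m2 = sum(deliveries) / n, sum(distance) / n

# Squared correlation between the two predictors.
s11 = sum((a - m1) ** 2 for a in deliveries)
s22 = sum((b - m2) ** 2 for b in distance)
s12 = sum((a - m1) * (b - m2) for a, b in zip(deliveries, distance))
r_squared_between = s12 ** 2 / (s11 * s22)

# With two predictors, each one's VIF is 1 / (1 - r^2).
vif = 1 / (1 - r_squared_between)
print(f"r^2 between predictors = {r_squared_between:.3f}, VIF = {vif:.2f}")
print("Multicollinearity is not a concern" if vif < 10 else "Check multicollinearity")
```

A VIF this close to 1 indicates the two predictors are only weakly correlated in this dataset.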