
Business Statistics

WORKSHOP 3 SOLUTIONS

Q3.1
A study of Melbourne’s climate compared annual rainfall values for the 75 years from 1861
to 1935 (“Historical”) with annual rainfall values for the following 75 years 1936 to 2010
(“Recent”).

Using the exhibits below, compare the distribution of yearly rainfall totals for the “Historical”
period with the distribution for the “Recent” period by making appropriate references to the
(i) measures of central location,
(ii) measures of variability, and
(iii) shapes
of the distributions.


SOLUTION:

Measures of central location (L)

• Mean Historical 653.6 vs Recent 642.1 mm


• Median Historical 647.3 vs Recent 625.7 mm

• Modal class 600-700 mm for both periods.



Overall, with a lower mean and median, we can conclude that RECENT average annual
rainfall is lower than HISTORICAL average annual rainfall.

(The modal class depends on the bins (class intervals) that are used. For a numeric variable
with many possible values, the mode itself is of little interest, so we usually refer to the
modal class instead.)

Measures of variability (V)

• Range Historical 570.4 vs Recent 542.6 mm

HISTORICAL rainfall range is larger than the RECENT rainfall range.

(Historical has a higher minimum and a higher maximum. This fact does not, by itself, ensure
that the Historical range will be larger.
For example, A: min 1, max 19, range = 18; B: min 4, max 20, range = 16. B has a higher max
and min, but a smaller range.)

• Standard deviation Historical 127.7 vs Recent 138.1 mm.


The annual rainfall values are more spread out in RECENT times compared to HISTORICAL
times.

• Interquartile range Historical 159.2 vs Recent 243.0 mm.


The spread of the middle 50% of annual rainfall is larger (more variable) for RECENT times
compared to HISTORICAL times.

• Coefficient of variation Historical 19.5% vs Recent 21.5%.


The coefficient of variation measures relative variability (standard deviation as a percentage
of the mean) and is higher for RECENT rainfall than for HISTORICAL.
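The coefficient of variation figures can be reproduced from the means and standard deviations quoted in this solution. The following is a minimal Python sketch under that assumption; the function name `coefficient_of_variation` is illustrative and does not come from the workshop materials.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the standard deviation as a percentage of the mean."""
    return 100 * std_dev / mean


# Summary values quoted in the solution above (annual rainfall, mm).
historical_cv = coefficient_of_variation(127.7, 653.6)
recent_cv = coefficient_of_variation(138.1, 642.1)

print(f"Historical CV: {historical_cv:.1f}%")
print(f"Recent CV: {recent_cv:.1f}%")
```

Because the coefficient of variation divides by the mean, it allows the spread of two distributions with different typical values to be compared on a common percentage scale.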

Overall then, the above evidence shows that whilst HISTORICAL annual rainfall has a larger
range, RECENT annual rainfall has greater variability.

Shape (S)

Both distributions are unimodal. They are either very close to symmetrical (mean ≈ median)
or only slightly positively skewed, with the mean slightly greater than the median. The
gap between the mean and median is more pronounced for RECENT (16.4 mm vs 6.3 mm for
HISTORICAL), which suggests that a few unusually high annual totals pulled the RECENT mean upwards.

Video Solution for Q3.1

Q3.2
PlanFinan Pty Ltd

PlanFinan is a financial planning organisation. The data in PlanFinan.xlsx was obtained from
PlanFinan’s database of a particular group of clients. Definitions are given for the following
variables:
• Sex: 1 if the client is male; 0 if female
• EducLevel: 1 = high school incomplete; 2 = high school complete;
  3 = undergraduate degree; 4 = postgraduate degree
• Salary: annual salary ($,000)

Use the following pivot table to analyse the data and report on how sex and educational
level are associated with annual salary for this group of clients.

Exhibit 1

Average of Salary ($,000) by Sex and EducLevel

Sex            EducLevel 1-2   EducLevel 3-4   Grand Total
Female         $90.07          $109.17         $102.35
Male           $83.98          $98.91          $92.69
Grand Total    $87.17          $104.92         $98.12


SOLUTION
Step One:
Identify the dependent and independent variables
Dependent variable (DV): Salary ($,000)
Independent variables (IV): Sex (IV1), EducLevel (IV2)

For all answers in each step, refer to Exhibit 1


Step Two:
Describe any relevant overall features of Salary.

Describe the overall behaviour of mean Salary.

The average salary of clients is $98,120.

Step Three:
Describe the overall relationship between Salary and Sex (IV1).

Salary and Sex (IV1)


The mean Salary is greater for female clients ($102,350) than for male clients ($92,690).

This is also true for each level of education. Female clients with a high level of education
have a mean salary ($109,170) which is greater than their male counterparts ($98,910).

Female clients with a lower level of education have a mean salary ($90,070) which is
greater than their male counterparts ($83,980).

Step Four:
Describe the overall relationship between Salary and Education Level (IV2).

Salary and Education Level (IV2)


The mean Salary is greater for clients with a high level of education ($104,920) than for
clients with a lower level of education ($87,170).

This is also true for each sex. Female clients with a high level of education have a mean
salary ($109,170) which is greater than the female clients who have a lower level of
education ($90,070).

Male clients with a high level of education have a mean salary ($98,910) which is
greater than the male clients who have a lower level of education ($83,980).
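The within-group comparisons in Steps Three and Four can be checked directly against the cell means in Exhibit 1. The sketch below is one possible way to do this in Python; the dictionary layout and the name `mean_salary` are assumptions made for illustration, and only the four cell means from the exhibit are used.

```python
# Mean salary ($,000) from Exhibit 1, keyed by (sex, education band).
mean_salary = {
    ("Female", "1-2"): 90.07,
    ("Female", "3-4"): 109.17,
    ("Male", "1-2"): 83.98,
    ("Male", "3-4"): 98.91,
}

# Step Three: compare female and male means within each education band.
for band in ("1-2", "3-4"):
    gap = mean_salary[("Female", band)] - mean_salary[("Male", band)]
    print(f"EducLevel {band}: female minus male = {gap:.2f} ($,000)")

# Step Four: compare higher and lower education means within each sex.
for sex in ("Female", "Male"):
    gap = mean_salary[(sex, "3-4")] - mean_salary[(sex, "1-2")]
    print(f"{sex}: EducLevel 3-4 minus 1-2 = {gap:.2f} ($,000)")
```

A positive gap in every row confirms that each overall relationship also holds within every level of the other variable.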


Overall Conclusion:
• Female clients with a high level of education have a higher mean salary ($109,170)
than any other client group.
• Male clients with a low level of education have the lowest mean salary ($83,980) of
all client groups.
Video solution for Q3.2

Q3.3
Hytex Company

Hytex Company is a direct marketer of electronic equipment and wants to investigate the
efficacy of catalogue mailings to its 1,000 mail order customers. Hytex is currently
sending catalogues to customers who are not married and are renting a house. Catalogue
Marketing.xlsx contains customer demographic attributes including the following:

• OwnHome: 1 if the customer owns their home; 0 if renting
• Marital Status: 1 if the customer is married; 0 otherwise
• AmountSpent: amount spent ($) for most recent transaction

Use the following pivot table to analyse the data and report on how AmountSpent is related
to these demographic attributes. Is Hytex sending catalogues to the right customers? If not,
to whom should the catalogues be sent?

Exhibit 2

Average of AmountSpent ($) by OwnHome and Marital Status

OwnHome        Not married   Married       Grand Total
Renting        $597.72       $1,339.02     $868.82
OwnHome        $1,015.12     $1,853.45     $1,543.14
Grand Total    $757.81       $1,672.07     $1,216.77


SOLUTION
Step One:
Identify the dependent and independent variables
Dependent variable (DV): AmountSpent ($)
Independent variables (IV): OwnHome (IV1), Marital Status (IV2)

For all answers in each step, refer to Exhibit 2

Step Two:
Describe any relevant overall features of AmountSpent.

Describe the overall behaviour of mean AmountSpent.

The average amount spent by customers is $1,216.77.

Step Three:
Describe the overall relationship between AmountSpent and OwnHome (IV1).

AmountSpent and OwnHome (IV1)


The mean AmountSpent is greater for homeowners ($1,543.14) than for renters ($868.82).

This is also true for each marital status. Married homeowners have a mean AmountSpent
($1,853.45) which is greater than married renters ($1,339.02).

Homeowners who are not married have a mean AmountSpent ($1,015.12) which is greater than
renters who are not married ($597.72).

Step Four:
Describe the overall relationship between AmountSpent and Marital Status (IV2).

AmountSpent and Marital Status (IV2)


The mean AmountSpent is greater for married customers ($1,672.07) than for
customers who are not married ($757.81).

This is also true for each level of OwnHome. Married homeowners have a mean
AmountSpent ($1,853.45) which is greater than homeowners who are not married
($1,015.12).

Married renters have a mean AmountSpent ($1,339.02) which is greater than renters
who are not married ($597.72).


Now, it is important to address Hytex’s question. Is it sending the catalogues to the right
customers? If not, to whom should Hytex send the catalogues?

Overall Conclusion:

• Hytex is not sending catalogues to the right segment of customers.

• Married customers who own their own home have a higher mean spend
($1,853.45) than any other customer group.
• Married customers who rent have the next highest mean spend ($1,339.02).
• So, it seems to make good sense for Hytex to send catalogues to these customer
segments instead of sending them to unmarried customers who rent a home, who have the
lowest mean spend ($597.72).
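The ranking behind this conclusion can be sketched by sorting the four customer segments by their mean spend in Exhibit 2. This is an illustrative Python sketch only; the names `mean_spend` and `ranked` are hypothetical, and the figures are the cell means from the exhibit.

```python
# Mean AmountSpent ($) from Exhibit 2, keyed by (home status, marital status).
mean_spend = {
    ("Renting", "Not married"): 597.72,
    ("Renting", "Married"): 1339.02,
    ("OwnHome", "Not married"): 1015.12,
    ("OwnHome", "Married"): 1853.45,
}

# Sort segments from highest to lowest mean spend.
ranked = sorted(mean_spend.items(), key=lambda item: item[1], reverse=True)

for (home, marital), spend in ranked:
    print(f"{home:8s} {marital:12s} ${spend:,.2f}")
```

The segment Hytex currently targets (unmarried renters) appears last in this ordering, while both married segments appear first.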

Video Solution for Q3.3


FURTHER PRACTICE QUESTIONS


Q3.4

As part of an Australian Household Expenditure Survey (1988-89), the following data was
collected for 1000 households:

INCOME = Weekly household income (in dollars)


CONSUME = Consume alcohol (1 = yes, 0 = no)
The variable income was studied for the two groups: “Consume alcohol”, and “Do not
consume alcohol”, and the following graphs and summary statistics were obtained.

Exhibit 1: Percentage frequency for income of households that consume alcohol
[Chart not reproduced: vertical axis percentage frequency, 0.0% to 50.0%; horizontal axis weekly income ($)]

Exhibit 2: Percentage frequency for income of households that DO NOT consume alcohol
[Chart not reproduced: vertical axis percentage frequency, 0.0% to 50.0%; horizontal axis weekly income ($)]


Exhibit 3: Summary Statistics for Weekly Income ($)

                           DO NOT consume alcohol   Consume alcohol
Mean                       456.9                    708.4
Median                     353                      638.5
Modal class                $0-$250                  $500-$750
Standard deviation         403.0                    461.3
Coefficient of variation   88.2%                    65.1%
Minimum                    12                       12
Maximum                    3846                     3696
Range                      3834                     3684
Lower quartile             173.75                   356.75
Upper quartile             632.25                   936
Interquartile range        458.5                    579.25
Count                      234                      766

Using the above results, compare the distribution of the variable “Income” for the two groups,
discussing typical values (i.e. “central tendency”), how spread out the values are (“variability”),
and the shape of the distributions.

Comment on what this tells us about the association between income and the consumption
of alcohol.

SOLUTION:

Measures of central location (L) - refer to Exhibit 3

• Mean $708.4 Consume vs $456.9 Do Not Consume

• Median $638.5 Consume vs $353 Do Not Consume

• Modal class $500-$750 Consume vs $0-$250 Do Not Consume


With a lower mean and median, we can conclude that the average weekly household
income for the group that does not consume alcohol is much lower than that of the group
that consumes alcohol.

The group that does not consume alcohol has a clear modal class of $0-$250, whereas the
group that does consume alcohol has no clearly dominant class; $500-$750 is the modal class
reported in Exhibit 3.

Measures of variability (V) – refer to Exhibit 3

• Range $3684 Consume vs $3834 Do Not Consume

The income range of the group that does not consume alcohol is $150 larger than that of the
group that consumes alcohol.

(Both minimums are the same, but Do Not Consume has a higher maximum, hence a larger
range.)

• Interquartile range $579.25 Consume vs $458.5 Do Not Consume

The spread of the middle 50% of incomes is larger (more variable) for the group that
consumes alcohol than for the group that does not consume alcohol.

• Standard deviation $461.3 Consume vs $403 Do Not Consume

The distribution of incomes for households that do not consume alcohol has a lower standard
deviation and interquartile range than the distribution for those that do consume alcohol.

• Coefficient of variation 65.1% Consume vs 88.2% Do Not Consume

The coefficient of variation measures relative variability (standard deviation as a percentage
of the mean) and is higher for the group that does not consume alcohol. This means that
incomes in this group are more variable relative to their mean.


Overall, although the measures of absolute variability (interquartile range and standard
deviation) are higher for the households that consume alcohol, the relative variability as
measured by the coefficient of variation is considerably higher for the income distribution of
households that do not consume alcohol. For these households the standard deviation is about
88% of the mean, whereas for the income distribution of alcohol-consuming households the
standard deviation is only about 65% of the mean.
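The derived rows of Exhibit 3 (range, interquartile range and coefficient of variation) follow from the other rows, which makes the contrast between absolute and relative variability easy to verify. The sketch below is a hedged illustration in Python; the helper `derived_measures` and the dictionary keys are hypothetical names, and the inputs are the values printed in Exhibit 3.

```python
# Values from Exhibit 3 (weekly income, $).
groups = {
    "Do not consume": {"mean": 456.9, "sd": 403.0, "min": 12, "max": 3846,
                       "q1": 173.75, "q3": 632.25},
    "Consume": {"mean": 708.4, "sd": 461.3, "min": 12, "max": 3696,
                "q1": 356.75, "q3": 936},
}


def derived_measures(stats):
    """Compute range, interquartile range and coefficient of variation (%)."""
    return {
        "range": stats["max"] - stats["min"],
        "iqr": stats["q3"] - stats["q1"],
        "cv_pct": 100 * stats["sd"] / stats["mean"],
    }


results = {name: derived_measures(stats) for name, stats in groups.items()}

for name, measures in results.items():
    print(f"{name}: range={measures['range']}, IQR={measures['iqr']}, "
          f"CV={measures['cv_pct']:.1f}%")
```

The group with the smaller standard deviation and interquartile range still has the larger coefficient of variation, because its mean is much lower.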

Shape (S) – refer to Exhibits 1, 2 and 3.

The distribution of weekly income is skewed to the right and unimodal for both groups. This
means that in both cases, the mean is greater than the median. This suggests that there are
a “few” very large incomes.

Among those who do not consume alcohol, the distribution is more strongly skewed given
that the difference between mean and median is larger.
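The comparison of skewness here rests on the gap between the mean and the median in each group. A small Python sketch of that arithmetic, using only the means and medians from Exhibit 3, is given below; the variable names are illustrative, and the gap is only a rough indicator of skew rather than a formal skewness statistic.

```python
# (mean, median) of weekly income ($) from Exhibit 3.
summary = {
    "Do not consume": (456.9, 353),
    "Consume": (708.4, 638.5),
}

# A positive mean-minus-median gap is consistent with a right-skewed distribution.
gaps = {name: mean - median for name, (mean, median) in summary.items()}

for name, gap in gaps.items():
    print(f"{name}: mean minus median = {gap:.1f}")
```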

Overall then, the above evidence shows that the group that does not consume alcohol has a
larger range and greater relative variability in weekly income.


Q3.5
The side by side boxplots below show the distribution of age at marriage of 45 married men
and 38 married women.

a. Compare the two distributions in terms of:
(i) measures of central location,
(ii) measures of variability, and
(iii) shape

b. Comment on how the age at marriage of men compares to women for the data.

Solution:
a.
i. Mean: Female 22 years vs Male 25 years
Median: Female 21 years vs Male 23 years
Both the mean and the median age at marriage are greater for men than for women. Hence,
the overall average age at marriage is greater for men than for women.
ii. The range of age at marriage is greater for men (R = 26 years) than for women (R = 22
years).


The IQR is also greater for men (IQR= 11 years) than women (IQR= 8 years). This means
that the middle 50% of men get married between the ages of 20 – 31 years (IQR = 11
years), compared to the middle 50% of women who get married between the ages of
19 – 27 years (IQR = 8 years), which is both younger and less varied.

iii. The distributions of age at marriage are positively skewed for both men and women,
as shown by mean > median. This suggests that, for both men and women, a small
number married at a much older age, dragging the mean upwards so that it is less
representative of the typical age at marriage than the median.

b. The men, on average, married at an older age and the age at which they married is
more variable, i.e. spread across a wider range of ages.
