EPGDIB HYBRID -2019-20

ASSIGNMENT

TOPIC
Business Statistics

By Prasoon Sinha
Roll No -36
INDIAN INSTITUTE OF FOREIGN TRADE
EPGDIB Hybrid (19-20)
SUBJECT: BUSINESS STATISTICS
ASSIGNMENT-1 (Date of Submission: 30/11/2019) (Individual Submission)

Descriptive Statistics
i. A manufacturer of dog food was planning to survey households in
India to determine the purchasing habits of dog owners. Among the
variables to be collected are:
a. The primary place of purchase for dog food
b. The number of dogs living in the household
c. Whether the dog is pedigreed
For each of the three variables listed, indicate whether the variable is
categorical or numerical. If it is numerical, is it discrete or continuous?
Give reasons.

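Ans.
a. The primary place of purchase for dog food - Categorical variable (the
values are names or types of outlet, not quantities)
b. The number of dogs living in the household - Numerical variable, discrete
(it is a count that takes only whole-number values)
c. Whether the dog is pedigreed - Categorical variable (the answer is yes or no)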

ii. In which level of measurement can each of these variables be
expressed? Give reasons.
a. The bank account number - Nominal level of measurement - the number is
only a label; it cannot be ranked or used in arithmetic
b. The position of a ship at sea, in longitude - Interval - differences in
longitude are meaningful, but the zero (the prime meridian) is an arbitrary
reference point rather than a true zero
c. The colour of a karate belt - Ordinal - the colours represent a ranking
d. Sales revenue of a company - Ratio - it is measurable and has a true zero

iii. The following is the information about the settlement of an industrial
dispute in a factory. Comment on the losses and gains from the point of view
of workers and that of management.

                          Before    After
No. of workers            3000      2900
Mean wages (Rs)           2200      2300
Median wages (Rs)         2500      2400
Standard deviation (Rs)   300       260
Ans.
The gains and losses from the points of view of both workers and management
are as follows.

Total wage bill

Before - 3000*2200 = Rs 6,600,000
After - 2900*2300 = Rs 6,670,000

The total wage bill has increased after the settlement of the dispute, even
though 100 fewer workers are retained than before. After the settlement, the
workers as a group are better off in monetary terms. If the workers' efficiency
remains the same, this is against the interest of the management. But if the
workers feel motivated, resulting in increased efficiency, then management can
achieve higher productivity, which would be an indirect gain to the management.
Since fewer workers are retained after the settlement than were employed
before, it is against the interest of the workers.

Median wages
The median wage after the settlement of the dispute has come down from
Rs 2500 to Rs 2400. This indicates that before the settlement 50 percent of the
workers were getting wages above Rs 2500, but after the settlement the
corresponding 50 percent mark is only Rs 2400. This has certainly gone against
the interest of the workers.

Uniformity in the wage structure

The relative uniformity of the wage structure before and after the
settlement can be determined by comparing the coefficients of variation (CV)
as follows:

                           Before                    After
Coefficient of variation   300/2200 * 100 = 13.64    260/2300 * 100 = 11.30

Since the CV has decreased after the settlement from 13.64 to 11.30, the
distribution of wages is more uniform after the settlement, that is, there is
now comparatively less disparity in the wages received by the workers. Such a
position is good for both the workers and the management in maintaining a
cordial work environment.

Pattern in Work Structure
The nature and pattern of the work structure before and after the settlement can
be determined by comparing Pearson's coefficient of skewness,
SKp = 3(Mean - Median)/Standard deviation:

Before: SKp = 3(2200 - 2500)/300 = -3
After: SKp = 3(2300 - 2400)/260 = -1.15

Since the coefficient of skewness is negative and has increased (from -3 to
-1.15) after the settlement, it suggests that the number of workers getting low
wages has increased and that of workers getting high wages has decreased after
the settlement.
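
The three comparisons in this answer (total wage bill, coefficient of variation and Pearson's coefficient of skewness) can be reproduced with a short Python sketch. The dictionary layout and the function name `summarize` are illustrative choices, not part of the assignment.

```python
# Summary figures taken from the settlement table in the question.
before = {"workers": 3000, "mean": 2200, "median": 2500, "sd": 300}
after = {"workers": 2900, "mean": 2300, "median": 2400, "sd": 260}


def summarize(s):
    """Return (total wage bill, CV in percent, Pearson skewness SKp)."""
    wage_bill = s["workers"] * s["mean"]
    cv = s["sd"] / s["mean"] * 100
    skew = 3 * (s["mean"] - s["median"]) / s["sd"]
    return wage_bill, cv, skew


for label, s in (("Before", before), ("After", after)):
    bill, cv, skew = summarize(s)
    print(f"{label}: wage bill = {bill}, CV = {cv:.2f}%, SKp = {skew:.2f}")
```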

iv. Samples of light bulbs were bought from two suppliers and were
subjected to a destruction test in the lab. The following data were collected
on the life.

Life in hours   700-800   800-900   900-1000   1000-1100   Total
Supplier A      14        74        29         13          130
Supplier B      12        58        32         18          120

a. Which supplier provides greater average life?
b. Which supplier provides uniform quality?
c. Which supplier would you prefer?

Ans.

Life in hours   Supplier A   Supplier B
700-800         14           12
800-900         74           58
900-1000        29           32
1000-1100       13           18
Total           130          120

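For A

Life in hours   Class mark (x)   Supplier A (f)   x*f
700-800         750              14               10500
800-900         850              74               62900
900-1000        950              29               27550
1000-1100       1050             13               13650
Total                            130              114600

Mean = Sum(x*f)/Sum(f) = 114600/130 = 881.538

Median = L + ((n/2 - cf)/fm)*c
= 800 + ((65 - 14)/74)*100
= 868.9189
where
n = 130, so n/2 = 65 and the median class is 800-900
L = 800 (lower limit of the median class)
fm = 74 (frequency of the median class)
cf = 14 (cumulative frequency before the median class)
c = 100 (class width)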

For B

Life in hours   Class mark (x)   Supplier B (f)   x*f
700-800         750              12               9000
800-900         850              58               49300
900-1000        950              32               30400
1000-1100       1050             18               18900
Total                            120              107600

Mean = Sum(x*f)/Sum(f) = 107600/120 = 896.67

Median = L + ((n/2 - cf)/fm)*c
= 800 + ((60 - 12)/58)*100
= 882.7586
where
n = 120, so n/2 = 60 and the median class is 800-900
L = 800
fm = 58
cf = 12
c = 100

a) Since the mean life for supplier B is greater than that for supplier A
(896.67 > 881.54), supplier B provides the greater average life.
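
As a cross-check on the grouped mean and median, the following Python sketch applies the same class-mark and interpolation formulas to both suppliers. The function names are hypothetical, and the sketch assumes equal class widths, as in the table.

```python
def grouped_mean(lowers, width, freqs):
    """Mean of grouped data, using class marks (midpoints)."""
    mids = [lo + width / 2 for lo in lowers]
    return sum(m * f for m, f in zip(mids, freqs)) / sum(freqs)


def grouped_median(lowers, width, freqs):
    """Median by interpolation: L + ((n/2 - cf) / fm) * c."""
    half = sum(freqs) / 2
    cum = 0  # cumulative frequency before the current class
    for lo, f in zip(lowers, freqs):
        if cum + f >= half:
            return lo + (half - cum) / f * width
        cum += f
    raise ValueError("frequencies must contain at least one positive value")


lowers = [700, 800, 900, 1000]  # lower class limits in hours
supplier_a = [14, 74, 29, 13]
supplier_b = [12, 58, 32, 18]

for name, freqs in (("A", supplier_a), ("B", supplier_b)):
    mean = grouped_mean(lowers, 100, freqs)
    median = grouped_median(lowers, 100, freqs)
    print(f"Supplier {name}: mean = {mean:.3f}, median = {median:.4f}")
```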
b)

For A
xbar = 881.538

Life in hours   Class mark (x)   Supplier A (f)   x*f      (x-xbar)   (x-xbar)^2   f*(x-xbar)^2
700-800         750              14               10500    -131.538   17302.367    242233.136
800-900         850              74               62900    -31.538    994.675      73605.917
900-1000        950              29               27550    68.462     4686.982     135922.485
1000-1100       1050             13               13650    168.462    28379.290    368930.769
Total                            130              114600                           820692.308

Variance = 820692.308/130 = 6313.018
Standard deviation = sqrt(6313.018) = 79.455

For B
xbar = 896.667

Life in hours   Class mark (x)   Supplier B (f)   x*f      (x-xbar)   (x-xbar)^2   f*(x-xbar)^2
700-800         750              12               9000     -146.667   21511.111    258133.333
800-900         850              58               49300    -46.667    2177.778     126311.111
900-1000        950              32               30400    53.333     2844.444     91022.222
1000-1100       1050             18               18900    153.333    23511.111    423200.000
Total                            120              107600                           898666.667

Variance = 898666.667/120 = 7488.889
Standard deviation = sqrt(7488.889) = 86.538

Since the standard deviation of A (about 79.5 hours) is less than that of B
(about 86.5 hours), supplier A provides the more uniform quality.
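
The standard deviations can be checked with the sketch below, which uses the population form of the variance (dividing by the total frequency), as the tables do. The coefficient of variation is an extra comparison that the answer does not compute; it is included here only as an assumption about how one might compare spread relative to the mean.

```python
from math import sqrt


def grouped_mean_sd(mids, freqs):
    """Mean and population standard deviation of grouped data."""
    n = sum(freqs)
    mean = sum(m * f for m, f in zip(mids, freqs)) / n
    variance = sum(f * (m - mean) ** 2 for m, f in zip(mids, freqs)) / n
    return mean, sqrt(variance)


mids = [750, 850, 950, 1050]  # class marks in hours
mean_a, sd_a = grouped_mean_sd(mids, [14, 74, 29, 13])
mean_b, sd_b = grouped_mean_sd(mids, [12, 58, 32, 18])

# Coefficient of variation in percent (not part of the original answer).
cv_a = sd_a / mean_a * 100
cv_b = sd_b / mean_b * 100

print(f"A: sd = {sd_a:.3f}, CV = {cv_a:.2f}%")
print(f"B: sd = {sd_b:.3f}, CV = {cv_b:.2f}%")
```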
Random Variable

i. The daily world price of refined sugar in cents per pound in April
2004 can be inferred to have the following distribution:

X      7      8      9      10     11     12
P(x)   0.05   0.10   0.25   0.40   0.15   0.05

a. Show that P(x) is a valid probability distribution.
b. What is the probability that the price on a given day during this period
will be at least 9 cents per pound?
c. What is the probability that the price on a given day during this period
will be less than 11 cents per pound?
Ans.
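a) To show that P(x) is a valid probability distribution, every P(x) must lie
between 0 and 1 and the probabilities must sum to 1.
Proof:
Each of the six probabilities lies between 0 and 1, and
Sum of P(x) = P(7)+P(8)+P(9)+P(10)+P(11)+P(12)
= 0.05+0.10+0.25+0.40+0.15+0.05
= 1
Hence proved.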

b) P(x>=9) = P(x=9)+P(x=10)+P(x=11)+P(x=12)
= 0.25+0.40+0.15+0.05
= 0.85, i.e. there is an 85% chance

c) P(x<11) = P(x=7)+P(x=8)+P(x=9)+P(x=10)
= 0.05+0.10+0.25+0.40
= 0.80, i.e. there is an 80% chance
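
The validity check and the two event probabilities can be sketched in Python as follows. The helper names `is_valid` and `prob` are hypothetical, and a small tolerance is used because decimal probabilities are not stored exactly in floating point.

```python
# Price (cents per pound) -> probability, from the table in the question.
dist = {7: 0.05, 8: 0.10, 9: 0.25, 10: 0.40, 11: 0.15, 12: 0.05}


def is_valid(dist, tol=1e-9):
    """Each probability lies in [0, 1] and the probabilities sum to 1."""
    in_range = all(0 <= p <= 1 for p in dist.values())
    return in_range and abs(sum(dist.values()) - 1) <= tol


def prob(dist, condition):
    """Probability of the event described by condition(x)."""
    return sum(p for x, p in dist.items() if condition(x))


print(is_valid(dist))
print(round(prob(dist, lambda x: x >= 9), 2))  # at least 9 cents
print(round(prob(dist, lambda x: x < 11), 2))  # less than 11 cents
```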

ii. Returns on a certain business venture, to the nearest $1000, are
known to follow the probability distribution

X      -2000   -1000   0     1000   2000   3000
P(x)   0.1     0.1     0.2   0.2    0.3    0.1

(a) Is the venture likely to be successful? Explain.
(b) What is the long-term average earning of a business venture of this
kind? Explain.
(c) What is a good measure of risk involved in a venture of this kind?
Compute this measure.

Ans.
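
No worked answer appears at this point in the source. As a sketch only, the quantities the question asks about can be computed directly from the table: the probability of a positive return, the expected return as the long-term average, and the standard deviation, which is assumed here to be the intended measure of risk.

```python
from math import sqrt

# Return in dollars -> probability, from the table in the question.
returns = {-2000: 0.1, -1000: 0.1, 0: 0.2, 1000: 0.2, 2000: 0.3, 3000: 0.1}

# (a) Probability that the venture earns a positive return.
p_positive = sum(p for x, p in returns.items() if x > 0)

# (b) Expected value: the long-term average return.
expected = sum(x * p for x, p in returns.items())

# (c) Standard deviation of the return, assumed as the risk measure.
variance = sum(p * (x - expected) ** 2 for x, p in returns.items())
std_dev = sqrt(variance)

print(f"P(X > 0) = {p_positive:.2f}")
print(f"E(X) = {expected:.2f}")
print(f"SD(X) = {std_dev:.2f}")
```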
Probability Distribution

i. The annual return of a well-known mutual fund has historically had
a mean of about 10% and a standard deviation of 21%. Suppose the
return for the following year follows a normal distribution, with the
historical mean and standard deviation. What is the probability that
you will lose money in the next year by investing in this fund?

Ans.

1) To compute the probability, we first convert the value of zero to its
corresponding standard normal value (z).
z = (X - mean)/sd = (0 - 0.10)/0.21 = -0.476, or about -0.48

2) Refer to the z-table to find the area below z = -0.48.
P(Z < -0.48) = 0.3156

3) The probability that you will lose money in the given fund is 31.56%.
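
The table lookup can be checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). Because the sketch uses the unrounded z value instead of -0.48, its result (roughly 0.317) differs slightly from the table figure.

```python
from statistics import NormalDist

# Annual return modelled as normal with mean 10% and SD 21%.
fund = NormalDist(mu=0.10, sigma=0.21)

z = (0 - fund.mean) / fund.stdev  # standardised value of a zero return
p_loss = fund.cdf(0)              # P(return < 0)

print(f"z = {z:.3f}, P(loss) = {p_loss:.4f}")
```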

ii. A machine produces steel rods. The lengths of the rods are normally
distributed with a mean of 26 cm and an SD of 1 cm. Rods that are
longer than 27 cm or shorter than 24 cm have to be discarded. The
machine produces 500 rods per shift. How many rods per shift have
to be discarded?
Ans.
First find P(24 cm < x < 27 cm), the probability that a rod is acceptable.

For x = 24
Z = (24 - 26)/1 = -2
From the Z table, P(Z < -2) = 0.0228

For x = 27
Z = (27 - 26)/1 = 1
From the Z table, P(Z < 1) = 0.8413

P(24 cm < x < 27 cm) = 0.8413 - 0.0228
= 0.8185, or 81.85%

The probability that a rod is between 24 cm and 27 cm long is 81.85%, so the
probability that a rod falls outside this range and is discarded is
100 - 81.85 = 18.15%.
Since the machine produces 500 rods per shift, the number of rods discarded
per shift = 500*(18.15/100) = 90.75, or about 91 rods.
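
A quick numerical check of the discard count, again using `statistics.NormalDist`; the variable names are illustrative. Multiplying the discard probability by 500 gives the expected number of discarded rods per shift, which comes out at about 91.

```python
from statistics import NormalDist

rods = NormalDist(mu=26, sigma=1)  # rod length in cm

p_ok = rods.cdf(27) - rods.cdf(24)  # P(24 < length < 27)
p_discard = 1 - p_ok                # too short or too long
expected_discards = 500 * p_discard

print(f"P(acceptable) = {p_ok:.4f}")
print(f"Expected discards per shift = {expected_discards:.1f}")
```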

iii. Fluctuations in the prices of precious metals such as gold have been
empirically shown to be well approximated by a normal distribution
when observed over short intervals of time. In May 1995, the daily
price of gold (1 troy ounce) was believed to have a mean of $383 and an
S.D. of $12. A broker, working under these assumptions, wanted to
find the probability that the price of gold the next day would be between
$394 and $399 per troy ounce. In this eventuality, the broker had an
order from a client to sell the gold in the client's portfolio. What is
the probability that the client's gold will be sold the next day?

Ans.
Using the normal distribution, the z-scores for 394 and 399 are:

z1 = (394 - 383) / 12 = 0.9167

z2 = (399 - 383) / 12 = 1.3333

The probability that the price of gold will be between 394 and 399 is:

P(394 < x < 399) = P(0.9167 < z < 1.3333) = P(z < 1.3333) - P(z <
0.9167) = 0.9088 - 0.8203 = 0.0885.

The probability that the client's gold will be sold the next day is
therefore also 0.0885, or about 8.85%.

(Technically, this is the probability that the client's gold will
be offered for sale. The assumption is that if the gold is offered at the
current price, it will be sold, which may or may not turn out to be the
case. Also, the question seems to ignore the possibility that the price
could go higher than $399. Why would the client not want to sell if the
price went even higher? So the realistic probability of the client's gold
being sold is probably higher than the value noted above.)

Sampling Distribution

i. International mutual funds reported weak returns in 2008. The
population of international mutual funds earned a mean return of
-43.95% in 2008 (data extracted from The Wall Street Journal,
January 2, 2009, p. R6). Assume that the returns for international
mutual funds are distributed as a normal random variable, with a mean
of -43.95 and a standard deviation of 20. If you selected a random
sample of 10 funds from this population, what is the probability that
the sample would have a mean return
a. less than -10
b. between -20 and 0

Ans.

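a. Less than -10

Population mean (µ) = -43.95
Population standard deviation (σ) = 20
Sample size (n) = 10
Standard error of the mean = σ/sqrt(n) = 20/sqrt(10) = 6.32
Need to find P(xbar < -10)
= P(Z < (-10 - µ)/(σ/sqrt(n)))
= P(Z < (-10 + 43.95)/6.32)
= P(Z < 5.37)
Since 5.37 lies beyond the largest z value in the table, the probability is
more than 0.999 and is effectively 1.
It is almost certain that the sample mean return will be less than -10.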

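b. Between -20 and 0

Need to find P(-20 < xbar < 0)
= P((-20 - µ)/(σ/sqrt(n)) < Z < (0 - µ)/(σ/sqrt(n)))
= P((-20 + 43.95)/(20/sqrt(10)) < Z < (0 + 43.95)/(20/sqrt(10)))
= P(3.79 < Z < 6.95)
= P(Z < 6.95) - P(Z < 3.79)
Both cumulative probabilities are very close to 1 (P(Z < 3.79) = 0.9999 from
the z table), so their difference is about 0.0001 or less.
It is therefore very unlikely that the sample mean return will lie between
-20 and 0.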
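
Because both z values fall outside most printed tables, a numerical check is useful here. The sketch below models the sample mean as normal with standard error σ/sqrt(n), using `statistics.NormalDist`; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = -43.95, 20.0, 10

# Sampling distribution of the mean: N(mu, sigma / sqrt(n)).
xbar_dist = NormalDist(mu, sigma / sqrt(n))

p_below_minus10 = xbar_dist.cdf(-10)               # part a
p_between = xbar_dist.cdf(0) - xbar_dist.cdf(-20)  # part b

print(f"P(xbar < -10)     = {p_below_minus10:.6f}")
print(f"P(-20 < xbar < 0) = {p_between:.6f}")
```

Part a comes out indistinguishable from 1 at six decimal places, and part b is below 0.0001.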

ii. According to Business Week, profits in the energy sector have been
rising, with one company averaging $3.42 monthly per share. Assume
this is an average from a population with SD of $1.5. If a random
sample of 30 months is selected, what is the probability that its
average will exceed $4?

Ans.

Given population mean = $3.42
Population SD (sigma) = $1.5
Sample size n = 30
Need to find P(xbar > 4)
Since n = 30, the central limit theorem states that the sample mean is
approximately normal:
xbar ~ N(3.42, 1.5/sqrt(30))
Z = (xbar - mean)/(standard deviation/sqrt(n))
P(xbar > 4) = P(Z > (4 - 3.42)/(1.5/sqrt(30)))
= P(Z > 0.58/(1.5/5.477))
= P(Z > 0.58/0.2739)
= P(Z > 2.12)
= 1 - 0.9830 (as per z table)
= 0.0170, or about 1.7%
Thus the probability that its average will exceed $4 is about 1.7%.
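
The upper-tail probability can be verified as follows. This is a minimal check that assumes the normal approximation for the sample mean holds at n = 30; it gives z of about 2.12 and a tail probability of about 0.017.

```python
from math import sqrt
from statistics import NormalDist

# Sample mean of 30 months: approximately N(3.42, 1.5 / sqrt(30)).
xbar_dist = NormalDist(3.42, 1.5 / sqrt(30))

z = (4 - xbar_dist.mean) / xbar_dist.stdev
p_exceeds_4 = 1 - xbar_dist.cdf(4)  # upper-tail area beyond $4

print(f"z = {z:.2f}, P(xbar > 4) = {p_exceeds_4:.4f}")
```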
Confidence Interval Estimation

i. The U.S. Department of Transportation requires tire manufacturers
to provide tire performance information on the sidewall of a tire to
better inform prospective customers as they make purchasing
decisions. One very important measure of tire performance is the
tread wear index, with a tire graded with a base of 100. A tire with a
grade of 200 should last twice as long, on average, as a tire graded
with a base of 100. A consumer organization wants to estimate the
actual tread wear index of a brand name of tires that claims "graded
200" on the sidewall of the tire. A random sample of n = 18 indicates a
sample mean tread wear index of 195.3 and a sample standard
deviation of 21.4.
a. Assuming that the population of tread wear indexes is normally
distributed, construct a 95% confidence interval estimate for the
population mean tread wear index for tires produced by this
manufacturer under this brand name.
b. Do you think that the consumer organization should accuse the
manufacturer of producing tires that do not meet the performance
information provided on the sidewall of the tire? Explain.

Ans.
Ho: μ = 200
HA: μ ≠ 200

n = 18
x-bar = 195.3
s = 21.4
df = 17

a.
Standard Error (SE) = s/√n = 21.4/√18 = 5.044
tα/2, df=17 = 2.11 (from a t-table)
Margin of Error (ME) = ±(tα/2, df=17 × SE) = ±(2.11 × 5.044) = ±10.64
Lower Bound of 95% CI = x-bar – ME = 195.3 – 10.64 = 184.66
Upper Bound of 95% CI = x-bar + ME = 195.3 + 10.64 = 205.94

b.
No, there are no grounds to allege wrongdoing, because the 95% confidence
interval (184.66, 205.94) contains 200, which means that the sample mean is not
statistically significantly different from the claimed tread wear index.

The test statistic for this sample is:
t = (195.3 - 200) / (21.4 / √18) = -0.9317
The critical value for a left-tailed test with alpha = 0.05 and 17 degrees of freedom is t = -1.74.
Since the test statistic is greater than the critical value, the difference between the sample mean
and the value provided on the sidewall of the tire is not statistically significant.
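
The interval and the test statistic can be reproduced with the sketch below. The Python standard library has no Student t quantile function, so the critical value 2.110 is hard-coded from a t-table (two-tailed 5%, 17 degrees of freedom); that constant is an assumption of the sketch, not a computed value.

```python
from math import sqrt

n, xbar, s = 18, 195.3, 21.4
claimed = 200

# Two-tailed 5% critical value for df = 17, read from a t-table (assumed).
t_crit = 2.110

se = s / sqrt(n)            # standard error of the mean
margin = t_crit * se        # margin of error
lower, upper = xbar - margin, xbar + margin
t_stat = (xbar - claimed) / se

contains_claim = lower <= claimed <= upper
print(f"95% CI = ({lower:.2f}, {upper:.2f}), t = {t_stat:.4f}")
print("Interval contains 200:", contains_claim)
```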

ii. An advertising agency that serves a major radio station wants to
estimate the mean amount of time that the station's audience spends
listening to the radio daily. From past studies, the standard deviation
is estimated as 45 minutes.
a. What sample size is needed if the agency wants to be 90% confident
of being correct to within ±5 minutes?
b. If 99% confidence is desired, how many listeners need to be
selected?
Ans. a.
n = [z(α/2) * σ / e]^2
n = [z(0.10/2) * 45/5]^2
n = [z(0.05) * 45/5]^2
n = [(1.645 * 45)/5]^2
n = (1.645*9)^2
n = 219.188025
Rounding up, n = 220
A sample size of n = 220 is needed if the agency wants to be 90% confident of being
correct to within 5 minutes.

Ans. b.
Given that σ = 45 minutes and e = ±5 minutes. For a 99% confidence level,
the corresponding Z critical value is 2.576.
n = Z^2 σ^2/e^2 = (2.576^2 x 45^2)/5^2 = 537.5
Rounding up, a sample size of n = 538 is needed if the agency wants to be 99%
confident of being correct to within 5 minutes.
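
The sample-size formula, including the rounding-up step, can be written as a small helper. The function name `sample_size` is hypothetical; the critical value comes from `NormalDist().inv_cdf` rather than a table, so the intermediate figures differ slightly from the hand calculation while the rounded results agree.

```python
from math import ceil
from statistics import NormalDist


def sample_size(sigma, margin, confidence):
    """Smallest n such that z * sigma / sqrt(n) <= margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)  # always round up


print(sample_size(45, 5, 0.90))  # part a
print(sample_size(45, 5, 0.99))  # part b
```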

iii. According to a survey, the average cost of a fast food meal (quarter
pound cheese burger, large fries, and medium soft drink excluding
taxes) is $4.82. Suppose this figure was based on a sample of 27
different establishments & the SD was $0.37.
a. Construct a 95% confidence interval for all fast food meals. Assume
that, the costs of fast food meal are normally distributed.
b. Interpret the interval constructed in (a).
c. Using the interval as a guide, is it likely that population mean is really
$4.50? Why, or why not?
Ans.
A confidence interval gives a region in which we are 100 * ( 1 - α )%
confident that the true value of the parameter lies.
For the confidence interval to be valid, the data must come from a normal
distribution, at least for the method used here. If the data are not normal,
this type of confidence interval is not valid.
To clarify the notation used here: "t" is the critical value and "t_(n-1)" is a
Student t random variable with n - 1 degrees of freedom, e.g. a Student t random
variable with 18 degrees of freedom is denoted as t_18. For small-sample confidence
intervals about the mean you have:
xBar ± t * sx / sqrt(n)
where xBar is the sample mean,
t is the t-score with n - 1 degrees of freedom such that a total probability of
α lies in the two tails, i.e., P( |t_(n-1)| > t) = α,
sx is the sample standard deviation, and
n is the sample size.
The sample mean xbar = 4.82
The sample standard deviation sx = 0.37
The sample size n = 27
The t score for a 0.95 confidence interval is the t score with 26 degrees of
freedom such that 0.025 is in each tail.
t = 2.055529
a. The confidence interval is:
( xbar - t * sx / sqrt( n ) , xbar + t * sx / sqrt( n ) )
( 4.673633 , 4.966367 )
b. We are 95% confident that the population mean cost of a fast food meal lies
between about $4.67 and $4.97.
c. It is unlikely that the population mean is really $4.50, because $4.50 lies
below the lower limit of the interval.

iv. A survey is planned to determine the mean annual family medical
expenses of a large company. The management of the company
wishes to be 95% confident that the sample mean is correct to
within ±$50 of the population mean annual family medical expenses.
A previous study indicates that the standard deviation is
approximately $400.

a. How large a sample is necessary?
b. If management wants to be correct to within ±$25, how many
employees need to be selected?
Ans.
I'm assuming the value of $400 is the population standard deviation, σ.
Recall the formula for determining the endpoints of a confidence interval for a mean,
given the population standard deviation:
xbar ± z* (σ / √n)
If we want to be correct within ±$50, then the margin of error z* (σ / √n) must
be less than or equal to $50.
z* = 1.96 for a 95% CI and σ = $400 as given.
Therefore,
1.96 ($400 / √n) <= $50
n >= 245.9
a. Note the inequality: the sample size must be greater than or equal to 245.9,
so we round up to 246.
b. For ±$25 accuracy, the value of z* doesn't change (still a 95%
CI), only the error does, so the inequality we need to solve is:
1.96 ($400 / √n) <= $25
n >= 983.4
It is tempting to use n = 983, but that would be incorrect. The inequality
says that n must be greater than or equal to 983.4, so we must
use n = 984. This makes sense conceptually as well: if n = 983.4 gives an
interval with error ±$25, lowering the sample size would increase
the error beyond ±$25.

v. A stationery store wants to estimate the mean retail value of
greeting cards that it has in its inventory. A random sample of 100
greeting cards indicates a mean value of $2.55 and a standard
deviation of $0.44. Assuming a normal distribution, construct a 95%
confidence interval estimate for the mean value of all greeting cards
in the store's inventory.

Ans.
n = 100
x-bar = 2.55
s = 0.44
Confidence level = 95%
Standard Error, SE = s/√n = 0.44/√100 = 0.044
z-score = 1.959963985
Margin of error = z * SE = 1.959963985 * 0.044 = 0.086238415
Lower limit of the confidence interval = x-bar - margin of error = 2.55 - 0.086238415 = 2.463761585
Upper limit of the confidence interval = x-bar + margin of error = 2.55 + 0.086238415 = 2.636238415
The 95% confidence interval is [$2.46, $2.64]
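
A large-sample z interval like this one can be computed with a short helper. The function name `z_interval` is hypothetical, and the sketch follows the answer in using the z critical value with the sample standard deviation, which is reasonable for n = 100.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, s, n, confidence=0.95):
    """Confidence interval for a mean using the normal critical value."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / sqrt(n)
    return xbar - margin, xbar + margin


lower, upper = z_interval(2.55, 0.44, 100)
print(f"95% CI = (${lower:.2f}, ${upper:.2f})")
```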
