
AKGIM/EXM/FM/02

Ajay Kumar Garg Institute of Management, Ghaziabad


Pre-University Test
Model Solution
Course: MBA Semester: I
Session: 2021-22 Section: 1 & 2
Subject: Business Statistics and Analytics Sub. Code: KMBN104
Marks: 100 Time: 3 Hours

OBE Remarks:
Q.No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
CO No. CO4 CO4 CO5 CO5 CO4 CO4 CO5 CO5

Note: Answer all the sections.

Section-A
A. Attempt all the questions. (10x2 =20)

1. Differentiate between descriptive and inferential statistics.


Sol.

Basis: Descriptive Statistics vs. Inferential Statistics

Meaning
  Descriptive Statistics: A set of methods used to describe data that has been collected, i.e.
  summarization of data.
  Inferential Statistics: A set of methods used to make a generalization, estimate, predict or take a
  decision when we want to draw conclusions about a distribution.

What it does?
  Descriptive Statistics: Organizes, analyzes and presents data in a meaningful way. A distinction is
  made between univariate, bivariate and multivariate analysis.
  Inferential Statistics: Compares, tests and predicts data.

Form of final result
  Descriptive Statistics: Charts, graphs and tables. Frequency distribution, measures of central
  tendency, measures of dispersion and skewness are used.
  Inferential Statistics: Probability.

Usage
  Descriptive Statistics: To summarize the population data by describing what was observed in the
  sample numerically or graphically.
  Inferential Statistics: To generalize the results obtained from a random sample back to the
  population from which the sample was drawn.

Function
  Descriptive Statistics: It explains the data, which is already known, to summarize the sample.
  Inferential Statistics: It attempts to reach a conclusion about the population that extends beyond
  the data available.

2. Differentiate between skewness and kurtosis.


Sol. Skewness - measures the degree and direction of asymmetry of a distribution. A normal or
symmetrical distribution has a skewness of zero (0); for a symmetric unimodal distribution,
mean = median = mode. An asymmetric distribution may be positively skewed (skewed to the right; longer
tail to the right; represented by a positive value) or negatively skewed (skewed to the left; longer
tail to the left; represented by a negative value).

Kurtosis - measures how peaked a distribution is and the lightness or heaviness of the tails of the
distribution. In other words, how much of the distribution is actually located in the tails? A normal
distribution has an excess kurtosis of zero (0), i.e. a kurtosis of 3, and is said to be mesokurtic. A
positive excess kurtosis means that the tails are heavier than those of a normal distribution, and the
distribution is said to be leptokurtic (with a higher, more acute "peak"). A negative excess kurtosis
means that the tails are lighter than those of a normal distribution, and the distribution is said to
be platykurtic (with a lower, flatter "peak").
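
As a rough illustration of the two measures, the sketch below computes moment-based skewness and excess kurtosis. The two data sets are invented for the example, and dividing by n (population moments) is only one of several conventions in use.

```python
def skewness_and_excess_kurtosis(data):
    """Return (skewness, excess kurtosis) using population moments."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3  # zero for a normal distribution
    return skew, excess_kurt


# Hypothetical data sets, chosen only to show the sign of each measure.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 10]

sym_skew, sym_kurt = skewness_and_excess_kurtosis(symmetric)
right_skew, right_kurt = skewness_and_excess_kurtosis(right_tailed)
print(sym_skew, sym_kurt)
print(right_skew, right_kurt)
```

The symmetric set gives zero skewness and a negative excess kurtosis (flatter than normal), while the set with one large value gives positive skewness.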

3. Differentiate between population and sample.


Sol.

4. Differentiate between Fixed base index and chain based index.


Sol.

5. Discuss partition values such as quartiles, decile and percentile.


Sol. Partition values divide an ordered set of observations into a number of equal parts.

Median – It is that value of the variable which divides the group into two equal parts, one
part comprising all values greater than the median and the other part all values less than it.
Quartiles are those values that divide the observations into four equal parts, so there are
three quartiles. A quartile is a type of quantile. The first quartile (Q1) is defined as the
middle number between the smallest number and the median of the data set. The second quartile
(Q2) is the median of the data. The third quartile (Q3) is the middle value between the median
and the highest value of the data set.
Deciles are those values that divide a given set of observations into ten equal parts.
Therefore, there are nine deciles, represented as D1, D2, D3, D4, ……… D9.
Percentiles are those values that divide a given set of observations into 100 equal parts.
There are 99 percentiles, represented as P1, P2, P3, P4, ……… P99.
Partition values Division Notation
Median 2 Med
Quartiles 4 Q1 to Q3
Deciles 10 D1 to D9
Percentiles 100 P1 to P99
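
A small sketch of how these partition values could be computed with Python's standard library. The data set (the integers 1 to 99) is hypothetical and chosen only so that the cut points are easy to read; `statistics.quantiles` uses its default "exclusive" interpolation method, which may differ slightly from textbook hand formulas on other data.

```python
from statistics import median, quantiles

# Hypothetical observations: the integers 1..99.
data = list(range(1, 100))

quartiles = quantiles(data, n=4)      # Q1, Q2, Q3
deciles = quantiles(data, n=10)       # D1 .. D9
percentiles = quantiles(data, n=100)  # P1 .. P99

print("Quartiles:", quartiles)
print("Number of deciles:", len(deciles))
print("Number of percentiles:", len(percentiles))
print("Median:", median(data))
```

Q2, D5 and P50 all coincide with the median, which is why the median appears as the first row of the table above.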

6. Regression equation of two variables X and Y are as follows:


2Y - X = 50 and 3Y – 2X = 10
Find mean of X & Y and coefficient of correlation between X and Y.

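Sol. Let the first equation be the regression line of Y on X:

Y = X/2 + 25, i.e. byx = 1/2

Let the second equation be the regression line of X on Y:

X = (3/2)Y - 5, i.e. bxy = 3/2

Since byx × bxy = 3/4 does not exceed 1, this choice is valid. Both coefficients are positive, so r is positive:

r = √(byx × bxy) = √0.75 = 0.866

Both regression lines pass through the point of means, so the means of X and Y are found by solving the two equations simultaneously.

We get mean of X = 130 and mean of Y = 90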
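
The same working can be checked with a short sketch. It writes each line as a·X + b·Y = c, solves the pair by Cramer's rule and takes r as the square root of the product of the two regression coefficients. The variable names are illustrative, and the assignment of the first line to "Y on X" is the assumption made in the solution above.

```python
from fractions import Fraction
from math import sqrt

# Each line as a*X + b*Y = c
a1, b1, c1 = Fraction(-1), Fraction(2), Fraction(50)  # 2Y - X = 50  (Y on X)
a2, b2, c2 = Fraction(-2), Fraction(3), Fraction(10)  # 3Y - 2X = 10 (X on Y)

# Both lines pass through (mean of X, mean of Y): solve by Cramer's rule.
det = a1 * b2 - a2 * b1
mean_x = (c1 * b2 - c2 * b1) / det
mean_y = (a1 * c2 - a2 * c1) / det

byx = -a1 / b1  # slope when the first line is solved for Y
bxy = -b2 / a2  # slope when the second line is solved for X

# The product must not exceed 1, otherwise the roles should be swapped.
assert byx * bxy <= 1
r = sqrt(byx * bxy)  # positive because both coefficients are positive

print(mean_x, mean_y, byx, bxy, round(r, 3))
```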

7. What is the difference between mutually exclusive and independent events?


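Sol. If A and B are mutually exclusive, they cannot occur together: A ∩ B = ∅, so
P(A ∩ B) = 0

If A and B are independent events, the occurrence of one does not change the probability of the other:


P(A ∩ B) = P(A).P(B)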

8. Discuss addition and multiplication theorem of probability.


Sol. Addition theorem
The addition theorem of probability gives the probability that either event ‘A’ or event ‘B’
occurs, or both occur. This combined event is denoted by '∪' and pronounced as Union.

The addition theorem is generally written using set notation:
P (A ∪ B) = P(A) + P(B) – P(A ∩ B)
Multiplication theorem
The multiplication law of probability applies to a combination of events. When the events have to occur
together, we make use of the multiplication law of probability. Two cases arise: the events are
either independent or dependent.
Multiplication or Conditional Probability
• The probability of an event B when it is known that the event A has already occurred:
P(B/A) = P(A∩B) / P(A), if P(A) > 0
i.e. P(A∩B) = P(A).P(B/A)
• If A and B are independent events:
P(A∩B) = P(A).P(B)

9. Differentiate between Type I and Type II error?


Sol. Broadly there are two types of errors, Type I and Type II, defined as follows:

               Accept H0           Reject H0

H0 (true)      Correct decision    Type I error

H0 (false)     Type II error       Correct decision

A Type I error is committed when the null hypothesis is rejected although it is true. A Type II
error is committed when the null hypothesis is accepted although it is false. The probability of a
Type I error is denoted by α and the probability of a Type II error by ß.
(1-ß) is called the power of the test.

10. Discuss the need of business analytics in today’s scenario.


Sol. Business Analytics is the use of data, information technology, statistical analysis,
quantitative methods, and mathematical or computer-based models to help managers gain
improved insight about their business operations and make better, fact-based decisions.
Types of Business Analytics
i. Prescriptive analytics
ii. Predictive analytics
iii. Diagnostic analytics
iv. Descriptive analytics

Section-B

B. Attempt any three questions. (3x10 = 30)

11. In a hotel total of 500 bulbs were installed simultaneously and their failure over time was
observed as given below. You are required to calculate mean life of bulbs:

End of Week                     1    2    3     4     5     6     7
No. of failures (cumulative)    12   40   108   242   346   428   500
Sol.
End of    Cumulative    CI     F      Mid Point    FX
Week      failures                    (X)
1         12            0-1    12     0.5          6
2         40            1-2    28     1.5          42
3         108           2-3    68     2.5          170
4         242           3-4    134    3.5          469
5         346           4-5    104    4.5          468
6         428           5-6    82     5.5          451
7         500           6-7    72     6.5          468
Total                          500                 2074

Mean Life of Bulb = 2074 / 500 = 4.148 weeks.
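
The table can be reproduced with a few lines of Python. This is a sketch that assumes, as the solution does, that the given figures are cumulative failures and that failures within a week occur on average at its mid-point.

```python
# Cumulative failures at the end of weeks 1..7 (from the question).
cumulative = [12, 40, 108, 242, 346, 428, 500]

# Failures within each week = difference of successive cumulative counts.
failures = [c - prev for c, prev in zip(cumulative, [0] + cumulative[:-1])]

# Mid-point of each weekly class interval: 0.5, 1.5, ..., 6.5.
midpoints = [week - 0.5 for week in range(1, len(cumulative) + 1)]

total_fx = sum(f * x for f, x in zip(failures, midpoints))
mean_life = total_fx / sum(failures)

print(failures)
print(total_fx, mean_life)
```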

12. Given the figures of production (in thousand tonnes) of a fertilizer factory:
Year 2002 2003 2004 2005 2006 2007
Production 10 12 15 16 18 19
Fit a straight line trend by least square method and estimate the trend for 2008. Also find monthly
rate of growth.
Sol.
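X        Y     x = 2(X-2004.5)    x²    xY
2002     10    -5                 25    -50
2003     12    -3                 9     -36
2004     15    -1                 1     -15
2005     16    1                  1     16
2006     18    3                  9     54
2007     19    5                  25    95
Total    90    0                  70    64
Using the method of least squares
Trend Equation: Y = a + bx
∑Y = Na + b∑x
∑xY = a∑x + b∑x²
Since ∑x = 0, these normal equations give
a = ∑Y/N = 90/6 = 15 and b = ∑xY/∑x² = 64/70 = 0.914, so Y = 15 + 0.914x
For the year 2008, x = 2(2008 - 2004.5) = 7 and Y = 15 + 0.914 × 7 = 21.4 thousand tonnes
Monthly rate of growth: x is measured in half-year units, so the annual increase is 2b = 1.83
thousand tonnes and the monthly increase is 2b/12 = 0.152 thousand tonnes.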
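
A minimal sketch of the least-squares fit, using the same coding x = 2(X - 2004.5) so that ∑x = 0. Treating the monthly rate of growth as 2b/12 relies on x being in half-year units; small differences from hand-worked figures come from rounding b.

```python
years = [2002, 2003, 2004, 2005, 2006, 2007]
production = [10, 12, 15, 16, 18, 19]  # thousand tonnes

origin = sum(years) / len(years)        # 2004.5, the middle of the period
x = [2 * (yr - origin) for yr in years]  # -5, -3, -1, 1, 3, 5

# With sum(x) == 0 the normal equations separate:
a = sum(production) / len(production)
b = sum(xi * yi for xi, yi in zip(x, production)) / sum(xi ** 2 for xi in x)

trend_2008 = a + b * 2 * (2008 - origin)
monthly_growth = 2 * b / 12  # x is in half-years, so yearly change is 2b

print(a, round(b, 3), round(trend_2008, 2), round(monthly_growth, 3))
```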

13. “Our managers can improve managerial decisions to a great extent, if they are adequately
familiar with the basic tools of statistics.” Explain and illustrate.

Sol. Definitions of statistics:


• The systematic and scientific treatment of quantitative measurement is precisely known as
statistics. – Horace Secrist
• Statistics may be called the science of counting / averages. – Bowley A.L.
• Statistics is concerned with the collection, classification (or organization), presentation and
analysis of data which are measurable in numerical terms. – Croxton & Cowden
Applications of statistics can be found in the following areas:
1. Marketing: Statistical analysis is frequently used to provide information for decision making,
based on historical data or on data collected for the purpose, to find out what can be sold and then
to evolve a suitable strategy.
2. Production: In the field of production, statistical data and methods play a very important role.
Decisions related to quality, predictions, inventory and variation need to be addressed through
statistical methods.
3. Finance: Financial organizations depend very heavily on statistical analysis to discharge their
finance function effectively.
4. Investment: Statistics greatly assists investors in making clear and valued judgments in their
investment decisions, in selecting securities which are safe and have the best prospects of yielding a
good income.
5. Human Resource: Statistics may be used to handle the data generated by the human resource function
for planning, organizing and staffing.
Tools in Statistics:
Statistical tools used in management decision making
• Time Series – Used to analyze the trend in data and make predictions based on that trend.
• Probability – Used to find the chance of success or failure of any project.
• Measure of Central Tendency – Used to represent the data by a single typical value, such as the
mean, median or mode.
• Measure of Dispersion – Used to understand the variation in the data.
• Index Number – Used to understand changes in commodity prices and inflation.
• Correlation – Used to understand the relation between variables.
• Regression – Used for forecasting, prediction and estimation.
• Hypothesis Testing – Used for research in management.
• Decision Theory – Used to help in decision making.

14. The height (in cm) and weight (in Kg) of 10 basketball players of a team are:
Player 1 2 3 4 5 6 7 8 9 10
Height (X) 186 189 190 192 193 193 198 201 203 205
Weight (Y) 85 85 86 90 87 91 93 103 100 101
Calculate:
a) The coefficient of correlation between X and Y.
b) The regression line of Y on X
c) The estimated weight of a player whose height measure as 208 cm.

Sol.
X Y X² Y² XY
186 85 34596 7225 15810
189 85 35721 7225 16065
190 86 36100 7396 16340
192 90 36864 8100 17280
193 87 37249 7569 16791
193 91 37249 8281 17563
198 93 39204 8649 18414
201 103 40401 10609 20703
203 100 41209 10000 20300
205 101 42025 10201 20705
1950 921 380618 85255 179971
∑X ∑Y ∑X² ∑Y² ∑XY

Coefficient of correlation:
r = (n∑XY - ∑X∑Y) / √[(n∑X² - (∑X)²)(n∑Y² - (∑Y)²)]
  = (10 × 179971 - 1950 × 921) / √[(10 × 380618 - 1950²)(10 × 85255 - 921²)]
  = 3760 / √(3680 × 4309) = 0.94

Regression coefficients:
                  Slope    Intercept
X on Y (bxy)      0.87     114.63
Y on X (byx)      1.02     -107.14

Regression line of Y on X
Y - 92.1 = 1.022 (X - 195)
Estimating Y for X = 208: put X = 208 in the above equation.
Y = 92.1 + 1.022 × 13 = 105.4 kg
The estimated weight of a player whose height measures 208 cm is about 105.4 kg.
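
The sums in the table and the results derived from them can be re-checked with the sketch below. It uses the raw-sum formulas for r and for the slope of Y on X; depending on how far the slope is rounded, the last digit of the estimate can shift slightly.

```python
from math import sqrt

heights = [186, 189, 190, 192, 193, 193, 198, 201, 203, 205]  # X, cm
weights = [85, 85, 86, 90, 87, 91, 93, 103, 100, 101]          # Y, kg

n = len(heights)
sx, sy = sum(heights), sum(weights)
sxx = sum(x * x for x in heights)
syy = sum(y * y for y in weights)
sxy = sum(x * y for x, y in zip(heights, weights))

cov_term = n * sxy - sx * sy
r = cov_term / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

byx = cov_term / (n * sxx - sx ** 2)  # slope of Y on X
intercept = sy / n - byx * sx / n     # line passes through the means
weight_at_208 = intercept + byx * 208

print(sx, sy, sxx, syy, sxy)
print(round(r, 2), round(byx, 2), round(intercept, 2), round(weight_at_208, 1))
```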

15. Discuss the concept of Business analytics with its meaning, types and applications in various
functions of management.

Sol. Business Analytics is the use of data, information technology, statistical analysis, quantitative
methods, and mathematical or computer-based models to help managers gain improved insight about
their business operations and make better, fact-based decisions.

Types of Business Analytics


Prescriptive analytics is really valuable, but largely not used. Where big data analytics in general
sheds light on a subject, prescriptive analytics gives you a laser-like focus on specific questions.
For example, in the health care industry, you can better manage the patient population by using
prescriptive analytics to measure the number of patients who are clinically obese, then add filters for
factors like diabetes and LDL cholesterol. A prescriptive model can be applied to almost any industry,
target group or problem.

Predictive analytics uses big data to identify past patterns and predict the future. For example, some
companies are using predictive analytics for sales lead scoring. Some companies have gone one step
further and use predictive analytics for the entire sales process, analyzing lead source, number of
communications, types of communications, social media, documents, CRM data, etc. Predictive analytics
can be used to support sales, marketing, or other types of complex forecasts.

Diagnostic analytics is used for discovery, or to determine why something happened. For example,
for a social media marketing campaign, you can assess the number of posts, mentions, followers, fans,
page views, reviews, pins, etc. There can be thousands of online mentions that can be distilled into a
single view to see what worked in your past campaigns and what didn’t.

Descriptive analytics or data mining is at the bottom of the big data value chain, but it can be
valuable for uncovering patterns that offer insight. A simple example of descriptive analytics would
be assessing credit risk using past financial performance. Descriptive analytics can be useful in the
sales cycle, for example, to categorize customers by their likely product preferences and sales cycle.

Applications of Business Analytics

• Finance
Business analytics is of utmost importance to the finance sector. Data scientists are in high demand
in investment banking, portfolio management, financial planning, budgeting, forecasting, etc.
• Marketing
It helps in studying the buying patterns of consumers, analysing trends, identifying the target
audience, employing advertising techniques that can appeal to consumers, forecasting supply
requirements, etc.
• HR Professional
HR professionals can make use of data to find information about the educational background of high
performing candidates, employee attrition rate, number of years of service of employees, age,
gender, etc. This information can play a pivotal role in the selection procedure of a candidate.
• CRM
It helps one analyse the key performance indicators, which further helps in decision making and
in making strategies to boost the relationship with consumers. The demographics, and data about
other socio-economic factors, purchasing patterns, lifestyle, etc., are of prime importance among
the data available.
• Manufacturing
It can help in supply chain management, inventory management, measuring performance against
targets, risk mitigation plans, improving efficiency on the basis of product data, etc.
• Credit Card Companies
Credit card transactions of a customer can reveal many factors: financial health, lifestyle,
purchase preferences, behavioral trends, etc.

Section-C
C. Attempt all the questions. (5x10 = 50)

16. Attempt any one.

a. The local authorities in a certain city install 10,000 electric lamps in the streets of the city. If
the average life of a lamp is 1000 burning hours with a SD of 200 hours, what number of lamps might be
expected to fail
i) In first 800 burning hours
ii) Between 800 and 1200 burning hours
iii) After 1400 burning hours
Given the area under the standard normal curve between z = 0 to z:
Z 0.5 1.0 1.5 2.0
Area 0.1915 0.3413 0.4332 0.4772

Sol. Using properties of the normal distribution


Given Mean = 1,000 and SD = 200, so Z = (X - 1000)/200

i) Number of lamps failing in the first 800 burning hours


We need the area under the curve to the left of X = 800 (Z = -1):
P (X < 800) = 0.5 - P (800 < X < 1000) = 0.5 - P (-1 < Z < 0) = 0.5 – 0.3413
The required probability is 0.1587
Number of lamps failing before 800 burning hours = 10,000 × 0.1587 = 1587
ii) Number of lamps failing between 800 and 1200 burning hours
We need the area between X = 800 (Z = -1) and X = 1200 (Z = 1):
P (800 < X < 1200) = P (-1 < Z < 1) = 2*0.3413
The required probability is 0.6826
Number of lamps failing between 800 and 1200 burning hours = 10,000 × 0.6826 = 6826
iii) Number of lamps failing after 1400 burning hours
We need the area to the right of X = 1400 (Z = 2):
P (X > 1400) = 0.5 - P (1000 < X < 1400) = 0.5 - P (0 < Z < 2) = 0.5 – 0.4772
The required probability is 0.0228
Number of lamps failing after 1400 burning hours = 10,000 × 0.0228 = 228
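
As a cross-check, the three areas can be computed with `statistics.NormalDist` from the Python standard library instead of the printed table. This is a sketch only: because it uses the exact normal CDF rather than four-decimal table values, the middle count can differ from the table-based answer by one lamp.

```python
from statistics import NormalDist

lamp_life = NormalDist(mu=1000, sigma=200)  # burning hours
lamps = 10_000

p_before_800 = lamp_life.cdf(800)
p_800_to_1200 = lamp_life.cdf(1200) - lamp_life.cdf(800)
p_after_1400 = 1 - lamp_life.cdf(1400)

for label, p in [("before 800 h", p_before_800),
                 ("800 to 1200 h", p_800_to_1200),
                 ("after 1400 h", p_after_1400)]:
    print(f"{label}: probability {p:.4f}, expected lamps {lamps * p:.0f}")
```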

b. Define the probability distribution. Explain the salient features of Binomial, Poisson and Normal
distribution.
Sol.
Basis                Binomial Distribution           Poisson Distribution
Formula              p(x) = nCx p^x q^(n-x),         p(x) = e^(-λ) λ^x / x!,
                     x = 0, 1, 2, …, n               x = 0, 1, 2, …, where λ = np
Property             Statistical independence        Statistical independence
                     Dichotomy                       Dichotomy
                     Constant Probability            Constant Probability
                     Identical Condition             Identical Condition
Parameter            n, p                            λ
Mean                 np                              λ
Variance             npq                             λ
Special Condition    ----                            When n is large and p is small

The properties of Normal distribution are following:


• Mean (µ) and SD (σ) are known as the parameters of the distribution.
• The curve is asymptotic to the X-axis.
• Problems related to the Normal distribution can be solved by using the properties of the Normal
Curve.
• The random variable X should be transformed to the Standard Normal Variable ‘Z’ using
Z = (X-µ)/σ
• After the transformation, the probability (or area) can be found using the Normal distribution table.
• The total area under the normal curve is 1, which is divided into two equal halves by the
vertical axis through the mean.

17. Attempt any one.


a) In a bolt factory, three machines M1, M2, and M3 manufacture 2000, 2500, and 4000 bolts every day.
Of their output 3%, 4%, and 2.5% are defective bolts. One bolt is drawn at random from a day’s
production and is found to be defective. What is the probability that it was produced by machine
M2?
Sol. Using Bayes' Theorem, where E = the bolt is defective and Ei = the bolt was produced by machine Mi:

Machine    P(Ei)                  P(E/Ei)    P(Ei).P(E/Ei)
M1         2000/8500 = 0.235      0.03       60/8500 = 0.0071
M2         2500/8500 = 0.294      0.04       100/8500 = 0.0118
M3         4000/8500 = 0.471      0.025      100/8500 = 0.0118
Total                                        260/8500 = 0.0306

Required Probability = P(E2/E) = P(E2).P(E/E2) / ∑ P(Ei).P(E/Ei) = 100/260 = 0.385
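
The Bayes calculation can be sketched with exact fractions, which avoids the rounding that creeps in when three-decimal probabilities are divided. The dictionary names below are illustrative only.

```python
from fractions import Fraction

daily_output = {"M1": 2000, "M2": 2500, "M3": 4000}
defect_rate = {
    "M1": Fraction(3, 100),
    "M2": Fraction(4, 100),
    "M3": Fraction(25, 1000),
}

total_output = sum(daily_output.values())

# P(Ei) * P(E | Ei): probability a bolt comes from Mi AND is defective.
joint = {
    m: Fraction(daily_output[m], total_output) * defect_rate[m]
    for m in daily_output
}

p_defective = sum(joint.values())            # total probability of a defect
posterior_m2 = joint["M2"] / p_defective     # Bayes' theorem

print(posterior_m2, round(float(posterior_m2), 3))
```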

b) A random variable X has the following probability distribution:


x 0 1 2 3 4 5
p(x) 0.20 0.25 0.10 0.15 0.20 0.10
Find the variance.
Sol.
x        p(x)    x.p(x)    x².p(x)
0        0.20    0         0
1        0.25    0.25      0.25
2        0.10    0.2       0.4
3        0.15    0.45      1.35
4        0.20    0.8       3.2
5        0.10    0.5       2.5
Total    1.00    2.2       7.7

E(x) = ∑ x.p(x) = 2.2 and E(x²) = ∑ x².p(x) = 7.7

Variance = E(x²) – {E(x)}²
Variance = 7.7 – (2.2)²
Variance = 2.86
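
A short sketch of the same computation, using exact fractions so the result is not affected by floating-point rounding:

```python
from fractions import Fraction

xs = [0, 1, 2, 3, 4, 5]
ps = [Fraction(s) for s in ("0.20", "0.25", "0.10", "0.15", "0.20", "0.10")]

assert sum(ps) == 1  # a valid probability distribution sums to 1

mean = sum(x * p for x, p in zip(xs, ps))             # E(x)
mean_square = sum(x * x * p for x, p in zip(xs, ps))  # E(x^2)
variance = mean_square - mean ** 2

print(float(mean), float(mean_square), float(variance))
```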

18. Attempt any one.

a) A drug is said to be useful for treatment of cold. In an experiment carried out on 160 persons suffering
from cold, half of the persons were treated with drug and rest of half with sugar pills. The effect of
treatment is described in the following table:
Helped Harmful No Effect
Drug 52 10 18
Sugar Pills 44 10 26
Test the effectiveness of the drug.
[For 2 df the value of chi-square is 5.99 at 5% level of significance]
Sol.

H0: The drug is no more effective than sugar pills, i.e. the effect is independent of the treatment.
H1: The effect depends on the treatment, i.e. the drug and sugar pills differ in effectiveness.

               Helped    Harmful    No Effect    Total
Drug           52        10         18           80
Sugar Pills    44        10         26           80
Total          96        20         44           160

Expected frequency E = (row total × column total) / grand total, e.g. for Drug and Helped,
E = 80 × 96 / 160 = 48.

O        E     (O-E)    (O-E)²    (O-E)² / E
52       48    4        16        0.333
10       10    0        0         0.000
18       22    -4       16        0.727
44       48    -4       16        0.333
10       10    0        0         0.000
26       22    4        16        0.727
Total                             2.12

Chi Square value = ∑ (O-E)² / E

Calculated value of chi-square = 2.12


Degree of freedom (df) = (r-1) *(c-1) = (2-1)*(3-1) = 2
Table value of chi-square at 2 df and 5% level of significance = 5.99

Since the calculated value of chi-square is less than the table value, H0 is not rejected. It is
concluded that the drug is not significantly more effective than sugar pills in treating cold.
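
The expected frequencies and the chi-square statistic can be recomputed with the sketch below. The critical value 5.99 is taken from the question rather than computed.

```python
observed = [
    [52, 10, 18],  # Drug: helped, harmful, no effect
    [44, 10, 26],  # Sugar pills
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
expected = []
for i, row in enumerate(observed):
    expected_row = []
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        expected_row.append(e)
        chi_square += (o - e) ** 2 / e
    expected.append(expected_row)

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 5.99  # given in the question for 2 df at the 5% level
reject_h0 = chi_square > critical_value

print(expected)
print(round(chi_square, 2), df, reject_h0)
```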

b) Define null hypothesis, alternate hypothesis, critical region and two sided test, used in testing of
statistical hypothesis.
Sol.
Null Hypothesis: In statistical inference of observed data of a scientific experiment, the null
hypothesis refers to a general or default position: that there is no relationship between two measured
phenomena, or that a potential medical treatment has no effect. Rejecting or disproving the
null hypothesis – and thus concluding that there are grounds for believing that there is a relationship
between two phenomena or that a potential treatment has a measurable effect – is a central task in the
modern practice of science, and gives a precise sense in which a claim is capable of being proven false.
Example
Given the test scores of two random samples of men and women, does one group differ from the other?
A possible null hypothesis is that the mean male score is the same as the mean female score:
H0: μ1 = μ2
where:
H0 = the null hypothesis
μ1 = the mean of population 1, and
μ2 = the mean of population 2.
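A stronger null hypothesis is that the two samples are drawn from the same population, such that the
variance and shape of the distributions are also equal.

Alternate Hypothesis: The alternate hypothesis (H1) is the statement that is accepted when the null
hypothesis is rejected. In the example above, the two-sided alternate hypothesis is H1: μ1 ≠ μ2.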

Critical Region

The critical region CR, or rejection region RR, is a set of values of the test statistic for which the null
hypothesis is rejected in a hypothesis test. That is, the sample space for the test statistic is partitioned into
two regions; one region (the critical region) will lead us to reject the null hypothesis H0, the other will
not. So, if the observed value of the test statistic is a member of the critical region, we conclude "Reject
H0"; if it is not a member of the critical region then we conclude "Do not reject H0".

Two-sided (Two-tailed) Test
In statistical significance testing, a one-tailed test and a two-tailed test are alternative ways of
computing the statistical significance of a data set in terms of a test statistic, depending on whether
only one direction is considered extreme (and unlikely) or both directions are considered extreme.
Alternative names are one-sided and two-sided tests; the term "tail" is used because the extreme
portions of distributions are often small, as in the normal distribution or "bell curve".
If the test statistic is always positive (or zero), only the one-tailed test is generally applicable,
while if the test statistic can assume positive and negative values, both the one-tailed and two-tailed
tests are of use.

Figure: A two-tailed test corresponds to both extreme negative and extreme positive directions of the test
statistic, here the normal distribution.

19. Attempt any one.


a. The following figures relate to the number of units of an item produced per shift by two workers A
and B respectively:
A 19 22 24 27 24 18 20 19 25
B 26 37 40 35 30 30 40 26 30 35 45
Can we infer that both workers have a similar level of stability in terms of production, using the
F-test at 5% level of significance? (Critical value at 5% level of significance is F (10, 8) = 3.35)
Sol.
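Mean of A = 198/9 = 22 and mean of B = 374/11 = 34

A (x)    (x-mean)    (x-mean)²    B (y)    (y-mean)    (y-mean)²
19       -3          9            26       -8          64
22       0           0            37       3           9
24       2           4            40       6           36
27       5           25           35       1           1
24       2           4            30       -4          16
18       -4          16           30       -4          16
20       -2          4            40       6           36
19       -3          9            26       -8          64
25       3           9            30       -4          16
                                  35       1           1
                                  45       11          121
Total                80                                380

H0: Both workers have the same variability in production. H1: Their variability differs.

S1² = 80 / (9 - 1) = 10 and S2² = 380 / (11 - 1) = 38

F = S2² / S1² (larger variance in the numerator, with (10, 8) degrees of freedom)
Fcal = 3.8
Ftab = 3.35
Since Fcal > Ftab, H0 is rejected: the two workers do not have a similar level of stability in production.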
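
A minimal sketch of the variance-ratio calculation, using `statistics.variance` (sample variance with n - 1 in the denominator). The critical value 3.35 is the one supplied in the question.

```python
from statistics import variance

worker_a = [19, 22, 24, 27, 24, 18, 20, 19, 25]
worker_b = [26, 37, 40, 35, 30, 30, 40, 26, 30, 35, 45]

var_a = variance(worker_a)  # sample variance, divisor n - 1
var_b = variance(worker_b)

# Larger variance goes in the numerator.
f_calculated = max(var_a, var_b) / min(var_a, var_b)
f_critical = 3.35  # F(10, 8) at the 5% level, as given in the question
reject_h0 = f_calculated > f_critical

print(var_a, var_b, f_calculated, reject_h0)
```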

b. The sales manager of a large company conducted a sample survey to examine the daily sales
performance of its salesmen posted in two states, Uttar Pradesh and Madhya Pradesh. The result of the
survey is shown in the following table:

States            Sample Size    Mean Sales (Rs)    SD (Rs)
Uttar Pradesh     400            5200               50
Madhya Pradesh    400            5600               40
Test, at 5% significance, whether the mean daily sales of the salesmen posted in the two states were
the same or differ significantly. [Z value at 5% level is 1.96]

Sol.

H0: There is no significant difference between the average sales in the two states.
H1: There is a significant difference between the average sales in the two states.
Substituting the values in the formula
Z = (Mean1 - Mean2) / √(SD1²/n1 + SD2²/n2) = (5200 - 5600) / √(50²/400 + 40²/400) = -400 / √10.25
|Z| = 124.9, which is more than the table value (1.96)
Hence H0 is rejected, and it is concluded that there is a significant difference between the
average sales in the two states.
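
The test statistic can be recomputed directly from the figures in the table. The sketch below assumes the usual large-sample Z statistic for the difference of two means, with the sample standard deviations used in place of the population values.

```python
from math import sqrt

# (sample size, mean sales, standard deviation) from the question
n_up, mean_up, sd_up = 400, 5200, 50  # Uttar Pradesh
n_mp, mean_mp, sd_mp = 400, 5600, 40  # Madhya Pradesh

standard_error = sqrt(sd_up ** 2 / n_up + sd_mp ** 2 / n_mp)
z = (mean_up - mean_mp) / standard_error

z_critical = 1.96  # two-sided test at the 5% level, as given
reject_h0 = abs(z) > z_critical

print(round(standard_error, 4), round(z, 1), reject_h0)
```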

20. Attempt any one.
a) Calculate price index numbers for the year 2015 with 2014 as the base year from the following data
using:

i) Fisher’s Ideal Index


ii) Marshall-Edgeworth Index
iii) Show that Fisher’s Ideal Index satisfies Factor Reversal Test
2014 2015
Commodity Price Value Price Value
A 10 100 12 144
B 15 75 20 120
C 8 80 10 110
D 20 60 25 50
E 50 500 60 540

Sol.
Commodity p0 q0 p1 q1 p0q0 p0q1 p1q0 p1q1
A 10 10 12 12 100 120 120 144
B 15 5 20 6 75 90 100 120
C 8 10 10 11 80 88 100 110
D 20 3 25 2 60 40 75 50
E 50 10 60 9 500 450 600 540
Total 815 788 995 964

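i) Fisher’s Ideal Index = √[(∑p1q0 / ∑p0q0) × (∑p1q1 / ∑p0q1)] × 100
   = √[(995/815) × (964/788)] × 100

Answer = 122.2

ii) Marshall-Edgeworth Index = [(∑p1q0 + ∑p1q1) / (∑p0q0 + ∑p0q1)] × 100 = (1959/1603) × 100
Answer = 122.2

iii) According to the factor reversal test, if the factors, i.e. price and quantity, in a price index
formula are interchanged so that a quantity index formula is obtained, then the product of the two
formulae should equal the value index number:

P01 × Q01 = ∑p1q1 / ∑p0q0

For Fisher's formula, Q01 = √[(∑q1p0 / ∑q0p0) × (∑q1p1 / ∑q0p1)] = √[(788/815) × (964/995)], so
P01 × Q01 = √[(995/815) × (964/788) × (788/815) × (964/995)] = 964/815 = ∑p1q1 / ∑p0q0.
Hence Fisher's Ideal Index satisfies the factor reversal test.

These tests are used to check which method is best for calculating an index number and to check the
accuracy of index numbers obtained from different sources. They help to compare the values of index
numbers.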
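
The index numbers and the factor reversal check can be sketched as follows. Quantities are derived as value ÷ price, as in the solution table; the helper name `total` is illustrative only.

```python
from math import isclose, sqrt

p0 = [10, 15, 8, 20, 50]  # 2014 prices
q0 = [10, 5, 10, 3, 10]   # 2014 quantities (value / price)
p1 = [12, 20, 10, 25, 60]  # 2015 prices
q1 = [12, 6, 11, 2, 9]     # 2015 quantities (value / price)


def total(prices, quantities):
    """Sum of price x quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))


s00, s01 = total(p0, q0), total(p0, q1)
s10, s11 = total(p1, q0), total(p1, q1)

fisher_price = sqrt((s10 / s00) * (s11 / s01))
fisher_quantity = sqrt((s01 / s00) * (s11 / s10))  # p and q interchanged
marshall_edgeworth = (s10 + s11) / (s00 + s01)

value_ratio = s11 / s00
factor_reversal_holds = isclose(fisher_price * fisher_quantity, value_ratio)

print(round(fisher_price * 100, 1), round(marshall_edgeworth * 100, 1))
print(factor_reversal_holds)
```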

b. What are time series? Explain the different components of time series. Discuss any two
methods of forecasting the trend.
Sol. A series of observations recorded over time is known as a time series. Time series models use past
history to predict the future. The components of a time series are the following:
• Secular trend: the tendency of the time series data to increase, decrease or stagnate over a
long period of time. For example: population.
• Seasonal component: the variability in the behavioral pattern during different seasons in a year.
For example: sale of ACs and fans.
• Cyclical component: almost synonymous with the business cycle, reflecting the upswing and
downswing of the data over extended periods of time. For example: recession.
• Random or irregular component: irregular variations caused by random factors and sporadic
causes like strikes, natural disasters and so on.
