Introduction
• Index numbers are commonly used in economics, business, policy, etc
• An Index number is a summary measure that states a relative comparison between groups of
related items.
• In its simplest form, an index number is nothing more than a price relative: a percentage figure that
expresses the relationship between two numbers, with one of the numbers used as the base
• Example:
• Consumer Price Index
• Index of Industrial Production
• Stock price index
• Cost of living index across cities
• Index numbers are generally constructed by the Government
• Private agencies also publish index numbers
Why Index numbers?
• To study the relative changes of a group of related items in an easily
comprehensible manner
• Example: Prices of Food items, production of food items etc
• To study the average of changes among a set of variables with respect
to a benchmark
Introduction
• For instance, IIP (2011-12 = 100)
• CPI for 2015-16 = 124.7, which means that, compared to 2011-12, the CPI has increased 24.7%
• CPI for 2019-20 = 146.3. Compared with 2011-12, the CPI increase is 46.3%

Year | CPI (2012=100)
2011-12 | 93.3
2012-13 | 102.5

Month | 2020-21 | 2021-22
April | 54.0 | 126.7
May | 90.2 | 116.0
June | 107.9 | 122.6
July | 117.9 |
$I_t = \frac{\sum P_t Q_0}{\sum P_0 Q_0} \times 100$
Paasche Index
• If the index is calculated with current-year weights instead of base-year
weights, it is called the Paasche Index
$I_t = \frac{\sum P_t Q_t}{\sum P_0 Q_t} \times 100$
• Paasche Index has the advantage of using current usage, but it has
some problems as well
• Weights have to be obtained afresh every year, which is more expensive
• Every time new usage data comes in, the whole series of index numbers (for the
past years) has to be recalculated
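The two weighting schemes can be compared side by side with a short sketch. The two-item basket below is hypothetical and is there only to show how the base-year and current-year weights enter the two formulas; the function names are illustrative.

```python
def laspeyres(p0, pt, q0):
    """Price index using base-period quantities q0 as weights."""
    return sum(p * q for p, q in zip(pt, q0)) / sum(p * q for p, q in zip(p0, q0)) * 100


def paasche(p0, pt, qt):
    """Price index using current-period quantities qt as weights."""
    return sum(p * q for p, q in zip(pt, qt)) / sum(p * q for p, q in zip(p0, qt)) * 100


# Hypothetical two-item basket (illustrative numbers, not from the slides)
p0, q0 = [10.0, 20.0], [5, 2]   # base-period prices and quantities
pt, qt = [12.0, 25.0], [4, 3]   # current-period prices and quantities

laspeyres_index = laspeyres(p0, pt, q0)   # (60 + 50) / (50 + 40) * 100
paasche_index = paasche(p0, pt, qt)       # (48 + 75) / (40 + 60) * 100
print(round(laspeyres_index, 2), round(paasche_index, 2))
```

Only the Paasche function needs the current quantities, which is why its weights must be collected again for every new period.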
Aggregate Index from Relative Prices
• Many times, individual relative prices are available
• One can construct an aggregate index from relative prices
$I_t = \frac{\sum \frac{P_t}{P_0} w}{\sum w} \times 100$
• Where w is the weight of each item in the index
$w = P_0 Q_0$

Automotive Operating Expenses Index - Relative Price Method
Item | 1990 price | 2011 price | Relative price | Typical usage per year | Weight | Relative price × weight
Gallon of gasoline | 1.30 | 3.52 | 2.71 | 1000 | 1300 | 3520
Quart of oil | 2.10 | 6.25 | 2.98 | 15 | 31.5 | 93.75
Tire | 130.00 | 145.00 | 1.12 | 2 | 260 | 290
Insurance policy | 820.00 | 1040.00 | 1.27 | 1 | 820 | 1040
Total | | | | | 2411.5 | 4943.75
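As a check on the automotive operating expenses table, the relative-price formula can be evaluated directly. This is a minimal sketch; the dictionary layout and the function name are illustrative choices, and the prices and usage figures are the ones given in the table.

```python
items = {
    # name: (1990 price, 2011 price, typical usage per year)
    "gallon of gasoline": (1.30, 3.52, 1000),
    "quart of oil": (2.10, 6.25, 15),
    "tire": (130.00, 145.00, 2),
    "insurance policy": (820.00, 1040.00, 1),
}


def relative_price_index(items):
    """Weighted mean of price relatives with weights w = P0 * Q0."""
    weighted_relatives = 0.0
    total_weight = 0.0
    for p0, pt, q0 in items.values():
        w = p0 * q0
        weighted_relatives += (pt / p0) * w
        total_weight += w
    return weighted_relatives / total_weight * 100


index_2011 = relative_price_index(items)
print(round(index_2011, 1))  # 4943.75 / 2411.5 * 100, about 205.0
```

With base-period value weights, each term (P_t / P_0) × P_0 Q_0 reduces to P_t Q_0, so this index coincides with the Laspeyres index.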
Quantity Index Numbers
• Just like price index numbers, quantity index numbers can also be
constructed
• The only difference is that the roles of P and Q are interchanged
• Q becomes the variable of interest, whereas P becomes the weight
• Laspeyres Quantity Index
$I_t = \frac{\sum Q_t P_0}{\sum Q_0 P_0} \times 100$
$I_t = \sqrt{\frac{\sum P_t Q_0}{\sum P_0 Q_0} \times \frac{\sum P_t Q_t}{\sum P_0 Q_t}} \times 100$
Exercise
A security analyst wishes to construct an index to reflect the changes in the prices of
five stocks in a certain portfolio. Average prices and quantities of his purchases
during the year are given in the table.

Stock | 2019 P0 | 2019 Q0 | 2020 P1 | 2020 Q1 | 2021 P2 | 2021 Q2
A | 20 | 100 | 30 | 200 | 25 | 200
B | 40 | 200 | 35 | 150 | 40 | 100
C | 50 | 150 | 70 | 250 | 60 | 100
D | 35 | 300 | 40 | 250 | 45 | 200
E | 20 | 200 | 15 | 400 | 25 | 300

• Construct an unweighted aggregates index using 2019 as the base. What are the
disadvantages of this type of index?
• Construct an unweighted relative price aggregate index using 2019 as the base. Discuss
its disadvantages.
• Construct a Laspeyres index using 2019 as the base year. What are the disadvantages of this
type of index?
• Construct a Paasche index using 2019 as the base year. What are the disadvantages of this type
of index?
• Construct a Fisher's Ideal Index using 2019 as the base year
Exercise
• A small electric fan company produces household fans.
The average unit selling price and number of units sold in
2019 and 2020 are given.
• Calculate the price index of fans for 2020 on base year
2019. Use the weighted arithmetic mean of relative price
method with base period weights.

Model | 2019 Price | 2019 Units sold | 2020 Price | 2020 Units sold
High-end | 3560 | 20 | 3750 | 30
Economy | 2500 | 45 | 2250 | 42
Analysis
Inference
Business decision making depends upon the perception of underlying population
What is the average daily productivity of a unit?
Is the supply of acceptable quality?
Are the two populations significantly different?
Inference has wide applications in Business, Government, Academics, Medical
field and many other fields
Inference Methods
Inferences are drawn on population based on analysis of samples
Broad contents of process of Inference:
Theory of Sampling and sampling errors
Theory of probability for studying issues of uncertainty
Probability based Estimation of population parameters
Test of Hypothesis, Interval estimation
Conceptual Approaches to Probability
There are three different types of conceptual approaches to probability
1. Classical Probability
2. Relative Frequency of Occurrence
3. Subjective Probability
Classical Probability
If there are a possible outcomes favourable to the occurrence of an event A and b
possible outcomes unfavourable to A, and if all outcomes are equally likely and
mutually exclusive, then the probability that A will occur P(A) is
$P(A) = \frac{a}{a+b}$
Example:
Financial Analysts assigning probability for a rate cut
The higher the likelihood of occurrence of an event, the closer the assigned probability will be to 1
Basic Concepts
Probability
Numbers assigned to elements in the sample space such that
They are greater than or equal to 0
They must add up to 1
S = {A, B, C}
P(A) ≥ 0
P(B) ≥ 0
P(C) ≥ 0
$P(A) + P(\bar{A}) = 1$
Counting Principles: Permutations
Consider a group of n objects. In how many ways can x objects (x < n) be selected and arranged?
${}^nP_x = n \times (n-1) \times (n-2) \times \dots \times (n-x+1)$
${}^nP_x = \frac{n!}{(n-x)!}$
Example:
If there are 5 people competing for three positions,
viz., GM, GDM and AGM, in how many ways is the
selection possible?
n = 5, x = 3
${}^nP_x = \frac{n!}{(n-x)!} = \frac{5!}{2!} = 60$
Counting principles: Combinations
In permutations, the order of elements is important. If such order is not important, then we
have a situation of ‘combinations’
$\binom{n}{x} = {}^nC_x = \frac{{}^nP_x}{x!} = \frac{n!}{x! \times (n-x)!}$
Example: Suppose the three positions in the earlier example are the
same (say AGM). In how many ways can the five competitors get selected?
$\binom{n}{x} = \frac{5!}{3! \times 2!} = 10$
Assuming A, B, C, D and E are the candidates:
Combination | Candidates
1 | ABC
2 | ABD
3 | ABE
4 | ACD
5 | ACE
6 | ADE
7 | BCD
8 | BCE
9 | BDE
10 | CDE
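The two counting results for the five candidates can be reproduced with Python's standard library. This is a small sketch; the variable names are illustrative, and `math.perm` and `math.comb` require Python 3.8 or later.

```python
import math
from itertools import combinations

n, x = 5, 3

# Order matters (GM, GDM, AGM are distinct positions): n! / (n - x)!
ordered_selections = math.perm(n, x)

# Order does not matter (three identical AGM positions): n! / (x! (n - x)!)
unordered_selections = math.comb(n, x)

# Enumerate the unordered selections of candidates A to E
teams = ["".join(team) for team in combinations("ABCDE", x)]

print(ordered_selections, unordered_selections)
print(teams)
```

Each unordered team of three can be arranged in 3! = 6 ways, which is why the permutation count is six times the combination count.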
Exercises
A panel of consumers is given 4 different package designs to rank them in preference order.
How many different rankings could the panel have given?
A salesperson has shown sarees of 12 different colours to her customer, who wants to choose 4
sarees to purchase. If one of the selected colours is azure, how many combinations of choice are
there for the customer?
A committee consists of eight union members and six non-union members. If a sub committee
of six members is to be formed with 3 from union and 3 from non-union members, how many
combinations are possible?
Two dice are tossed in the air. In how many outcomes does at least one of them show ‘3’?
5 coins are tossed in the air. In how many outcomes can exactly 5 heads occur?
5 coins are tossed in the air. In how many outcomes can exactly 3 heads occur?
Random Variable
A random variable (RV) is any quantity that takes different (numerical) values by
chance
For instance, number of heads in an experiment of tossing 10 coins
Number of accidents that can occur on a road during one year
Number of marks that a student may obtain
$P(X = x) = ?$
Probability Distribution/ Probability Function
Daily Wage of 100 workers
X = x | f(x)
-1 | 1/10
0 | 2/10
1 | 3/10
2 | 4/10

What is f(x)?
x | F(x) | f(x)
4 | 16/25 | 7/25
5 | 25/25 | 9/25
Probability Distribution
X stands for the number of minutes it takes to drain a water tank. The
probability distribution
$f(x) = \frac{x}{15}$, x = 1, 2, 3, 4, 5
1. Prove that 𝑓 𝑥 is a probability distribution function
2. What is the probability that it will take exactly three minutes to drain
the tank?
3. What is the probability that it will take at least two minutes but not
more than four minutes?
4. Calculate and furnish cumulative distribution in tabular form
5. What is the probability that the draining process will take at most
three minutes?
6. What is the probability that it will take more than two minutes?
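One way to check work on questions of this kind is to tabulate f(x) = x/15 with exact fractions. The sketch below is only an illustration of the mechanics (validity check, cumulative table, interval probabilities); the variable names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

support = [1, 2, 3, 4, 5]
f = {x: Fraction(x, 15) for x in support}

# A valid probability function is non-negative and sums to 1
is_valid = all(p >= 0 for p in f.values()) and sum(f.values()) == 1

# Cumulative distribution F(x) = P(X <= x)
cdf = dict(zip(support, accumulate(f[x] for x in support)))

p_exactly_3 = f[3]
p_2_to_4 = f[2] + f[3] + f[4]
p_at_most_3 = cdf[3]
p_more_than_2 = 1 - cdf[2]

for x in support:
    print(x, f[x], cdf[x])
```

Using `Fraction` avoids rounding, so the check that the probabilities sum to exactly 1 is reliable.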
Bernoulli Processes / Binomial Distribution
Bernoulli process consists of trials wherein the following
characteristics apply
Each trial results in either of two mutually exclusive outcomes
referred to as “success” and “failure”
Probability of success, denoted by ‘p’, remains constant from trial
to trial. The probability of failure ‘q’ is equal to 1-p
The trials are independent
Probability of x successes in n trials of a Bernoulli process is
called Binomial Distribution
Binomial Distribution - Example
Question Solution
What is the probability of getting 2 heads in an experiment of tossing 5 coins?
Tossing a coin is a Bernoulli trial
The said event of 2 heads can come up in any order in the 5 trials
Supposing it comes in the order (H, T, H, T, T), the probability of the event is:
$P(H, T, H, T, T) = p \times q \times p \times q \times q = p^2 q^3$

The experiment is a Bernoulli process.
The random variable X = number of ‘6’s
Number of trials = 2
Event is getting one 6
Defining X as the number of 6s in the experiment, X can take values 0, 1, 2.
The probability of getting one ‘6’ on a die is p = 1/6 and q = 5/6
$f(1) = \binom{2}{1} \left(\frac{1}{6}\right)^1 \left(\frac{5}{6}\right)^1 = 0.278$
$f(2) = \binom{2}{2} \left(\frac{1}{6}\right)^2 \left(\frac{5}{6}\right)^0 = 0.028$
Probability of at least one six = f(1) + f(2)
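The binomial probabilities in these examples can be verified with a few lines of code. This is a minimal sketch using the standard binomial formula C(n, x) p^x q^(n-x); the function name is illustrative.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for x successes in n independent Bernoulli trials."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)


# Two heads in five tosses of a fair coin: C(5, 2) orderings, each with p^2 q^3
two_heads = binomial_pmf(2, 5, 0.5)

# Number of sixes in two throws of a die
one_six = binomial_pmf(1, 2, 1 / 6)
two_sixes = binomial_pmf(2, 2, 1 / 6)
at_least_one_six = one_six + two_sixes

print(two_heads, round(one_six, 3), round(two_sixes, 3), round(at_least_one_six, 3))
```

The factor C(n, x) counts the orderings in which the x successes can appear, which is the step that turns the single-order probability p²q³ into the full binomial probability.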
Exercise 1
Five fair dice are tossed. Would you agree that
the probability of obtaining no 1s is the same
as that of obtaining five 1s? If not what are
the exact probabilities of these two events?
Exercise 2
Political analysts estimate that 40% of a particular candidate’s supporters are
inclined to vote for other candidate in a municipal election. What is the
probability that from a random sample of two voters selected from among the
first candidate’s supporters,
a) two vote for other candidate;
Variance is given by
$\sigma^2 = E[(X - E(X))^2]$
Variance can also be written as
$\sigma^2 = E(X^2) - [E(X)]^2$
where $E(X^2) = \sum x^2 f(x)$
Poisson Distribution
Poisson Distribution is widely used in business applications
Poisson Distribution is applied when the average rate of occurrence of an event is known
It approximates the binomial when a large number of trials is involved and the probability of
success ‘p’ is small
If X is a RV representing number of times an event occurs and X can take values x = 0,1,2,3,
………….. (ad infinitum)
$f(x) = \frac{\mu^x \times e^{-\mu}}{x!}$
$F(X \le c) = \sum_{x=0}^{c} \frac{\mu^x \times e^{-\mu}}{x!}$
Mean = µ = $E(X)$ = 1
Variance = $\sigma^2 = E(X^2) - [E(X)]^2$ = 1
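The Poisson formulas can be checked numerically. The sketch below assumes a rate of µ = 1 purely for illustration; the function names are illustrative, and the mean and variance are approximated by truncating the infinite sum at 60 terms.

```python
from math import exp, factorial


def poisson_pmf(x, mu):
    """f(x) = mu^x * e^(-mu) / x!"""
    return mu**x * exp(-mu) / factorial(x)


def poisson_cdf(c, mu):
    """F(X <= c) = sum of f(x) for x = 0..c"""
    return sum(poisson_pmf(x, mu) for x in range(c + 1))


mu = 1.0  # assumed rate, for illustration only

p_zero = poisson_pmf(0, mu)
p_at_most_one = poisson_cdf(1, mu)

# Mean and variance from the definitions E(X) and E(X^2) - E(X)^2
mean = sum(x * poisson_pmf(x, mu) for x in range(60))
second_moment = sum(x**2 * poisson_pmf(x, mu) for x in range(60))
variance = second_moment - mean**2

print(round(p_zero, 4), round(p_at_most_one, 4), round(mean, 4), round(variance, 4))
```

For a Poisson variable the mean and the variance both equal µ, which the truncated sums reproduce to many decimal places.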
Mean and Variance of Binomial Distribution
Assume that a restaurant has determined that there is a probability of
20% that a customer will order Blitz beer. If at a particular time, there
are 5 customers in the restaurant, what is the expected value, variance
and standard deviation of the number of customers who will order
Blitz beer?
Random Variable: X = number of people ordering beer
Mean = $E(X) = np$
Variance = $\sigma^2 = npq$
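Applying these two formulas to the restaurant example (n = 5 customers, p = 0.20) is a one-line calculation each. The sketch below simply evaluates np, npq and the square root of npq; the variable names are illustrative.

```python
from math import sqrt

n = 5      # customers in the restaurant
p = 0.20   # probability that a customer orders Blitz beer
q = 1 - p

mean = n * p            # expected number of customers ordering
variance = n * p * q    # npq
std_dev = sqrt(variance)

print(round(mean, 3), round(variance, 3), round(std_dev, 3))
```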
Mean and Variance of Proportions
Assume that a restaurant has determined that
there is a probability of 20% that a customer
will order Blitz beer. If at a particular time,
there are 5 customers in the restaurant, what
is the expected value, variance and standard
deviation of the proportion of customers who
will order Blitz beer?
Mean = $E(\bar{p}) = p$
Variance = $\sigma_{\bar{p}}^2 = \frac{pq}{n}$
Recap
Probability definition and types of measurement
Random Variable (Discrete RV and Continuous RV)
Bernoulli Process and Binomial Distribution
Mean of RV following BD: E(X) = np
Variance σ² = npq
Poisson Distribution
Mean E(X) = µ
Variance σ² = µ
Exercise
Manufactured items that do not pass inspection are often sold as
“seconds” in an outlet at reduced price. Often these seconds may have
only minor flaws and do not affect their performance. Past testing of a
manufacturer’s coffee maker has shown that 90% of all ‘seconds’
perform as well as ‘firsts’. A random sample of 25 ‘seconds’ from this
particular manufacturer is to be selected. We record X as the number of
items in the sample that perform as well as ‘firsts’
Find P(X ≥ 20)
Find P(18≤ X < 23)
Compute µ and σ of X
Continuous Probability Distributions
Unlike discrete variables, which take distinct values, continuous variables
are measured over a range between two numbers.
For instance, the weight of a piece of soap in a manufacturing unit will lie
between 19 and 21 grams.
The probability function of continuous variable, therefore, also refers to
probability of variable taking value in between two values
Probability of car travelling between 5 to 20 kilometers in a day
Probability of continuous RV taking values between x1 and x2 is given
by the area covered under the curve between the two values
Probability of continuous RV taking a specific value x is zero
Therefore,
P(a ≤ x ≤ b) is same thing as P(a < x < b)
Normal Probability Distribution
The normal distribution is one of the most commonly applied distributions in many
real-life problems involving continuous RVs
Normal Distribution is also widely used in test of hypothesis
$f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{1}{2} \left(\frac{x-\mu}{\sigma}\right)^2}$
µ and σ are the parameters of the distribution
µ = mean of random variable X
σ = standard deviation of X
Properties of Normal Probability Distribution
1. Normal distribution depends upon two parameters: µ and σ
2. The probability curve is symmetric about the mean µ (skewness = 0)
3. The tails extend to infinity on both sides
4. Given same mean µ, variable with
[Figure: normal probability curve centred at µ; y-axis: probability]
[Figure: normal probability curves; x-axis: Random Variable (45 to 54), y-axis: Probability]
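$Z = \frac{x-\mu}{\sigma}$
Z is also a normal variable with parameters µ = 0 and σ = 1
If X ~ N($\mu_x$, $\sigma_x$) and Y ~ N($\mu_y$, $\sigma_y$) are two normal variables, then both
$\frac{x-\mu_x}{\sigma_x}$ and $\frac{y-\mu_y}{\sigma_y}$ follow the same standard normal distribution Z ~ N(0,1)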
Probability Areas under normal curve
[Figure: normal curve with the area for P(X > 40,000) to be found]
P(X > 40,000) = ?
Conversion of Z to X
Given the value of standard normal variable Z, one can find the value
of X using the following
X=Zσ+µ
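The conversion between X and Z, and the lookup of an area under the normal curve, can be sketched with Python's `statistics.NormalDist`. The parameters µ = 50 and σ = 2 below are hypothetical and serve only to illustrate the two formulas.

```python
from statistics import NormalDist

mu, sigma = 50.0, 2.0   # hypothetical parameters, for illustration only


def to_z(x, mu, sigma):
    """Standardise: Z = (X - mu) / sigma"""
    return (x - mu) / sigma


def to_x(z, mu, sigma):
    """Convert back: X = Z * sigma + mu"""
    return z * sigma + mu


z = to_z(53.0, mu, sigma)
x_back = to_x(z, mu, sigma)

# Right-tail area P(X > 53) equals P(Z > z) under the standard normal curve
p_right_tail = 1 - NormalDist().cdf(z)

print(z, x_back, round(p_right_tail, 4))
```

Because any normal variable standardises to the same N(0, 1) curve, a single table (or a single `cdf` call) serves every choice of µ and σ.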
estimate the population parameters
$E(\bar{x}) = \mu$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
The effect of sample size n
Sampling distribution with varying sample size
For larger sample size, the sampling distribution would have lesser dispersion ($\sigma / \sqrt{n}$)
[Figure: sampling distributions for n = 10 and n = 50; x-axis from 997 to 1003]
Standard Error
A sample mean $\bar{x}$ is an estimate of $\mu$
For a sample, $(\bar{x} - \mu)$ is considered as the sampling error
A measure of average sampling error is given by $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$, which is also
called the Standard Error
$\sigma_{\bar{x}}$ also gives a measure of the precision with which $\mu$ is estimated
Important property of standard error:
The larger the sample size (n), the smaller $\sigma_{\bar{x}}$ will be
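The effect of sample size on the standard error is easy to see numerically. The sketch below assumes a population standard deviation of 10, a hypothetical value chosen only for illustration, and compares n = 10 with n = 50.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)"""
    return sigma / sqrt(n)


sigma = 10.0  # hypothetical population standard deviation

se_n10 = standard_error(sigma, 10)
se_n50 = standard_error(sigma, 50)

print(round(se_n10, 3), round(se_n50, 3))
```

Because n enters through a square root, cutting the standard error in half requires four times as many observations.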
Sampling Distribution of Non-Normal Variables -
Central Limit Theorem
Sampling Distributions of samples from non-normal probability distributions
also tend to be normally distributed, due to Central Limit Theorem
[Figure: interval around $\bar{x}$ with limits $\bar{x}_L = \bar{x} - \text{EM}$ and $\bar{x}_H = \bar{x} + \text{EM}$]
Sampling Distribution
Consider a case of random samples of size n over a
population random variable X whose mean is µ and s.d. is σ
[Figure: the sampling distribution of $\bar{x}$ for random samples of size n, with $\bar{x}_L$, $\bar{x}$ and $\bar{x}_H$ marked]
The unknown population mean is likely to
figure in the interval $(\bar{x}_L, \bar{x}_H)$ with (1-α)
confidence
Error Margin is $z_{\alpha/2} \, \sigma_{\bar{x}}$
Example
Over the sample of 100 customers,
Lloyd’s Store found the mean of
expenditure per shopping trip as $82.
Assuming population σ as 20,
what are the upper and lower
limits within which the population
mean is likely to be present 95% of
the time?
What is the error margin?
When σ is known
A tire company wants to estimate 90% confidence intervals for the running life of tires
produced by it. A sample of 100 tires is drawn and tested for life. The sample mean
was obtained as 36,000 kilometers. Assuming the population s.d. is 5000 kilometers, what is
the confidence interval for the population running life?
What is the error margin?

Sampling mean = $\bar{x}$ = 36,000 km
Sampling sd: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{5000}{\sqrt{100}} = 500$
Confidence level (1-α) = 90%
$z_{0.05}$ = norm.s.inv(0.95) = 1.64
Lower limit: $\bar{x} - z_{0.05} \, \sigma_{\bar{x}}$ = 36000 − 1.64 × 500 = 35,180
Upper limit: $\bar{x} + z_{0.05} \, \sigma_{\bar{x}}$ = 36000 + 1.64 × 500 = 36,820
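The tire calculation can be reproduced in code. This sketch uses `statistics.NormalDist().inv_cdf` in place of the spreadsheet function norm.s.inv; the function name `z_interval` is illustrative. Because it uses the unrounded z value (about 1.645 rather than 1.64), the limits differ slightly from the hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for the mean when the population sd is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    margin = z * sigma / sqrt(n)              # error margin
    return sample_mean - margin, sample_mean + margin, margin


lower, upper, margin = z_interval(36000, 5000, 100, 0.90)
print(round(lower, 1), round(upper, 1), round(margin, 1))
```

The interval is symmetric around the sample mean, so reporting the mean and the error margin carries the same information as reporting the two limits.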
$\bar{x}_L = \bar{x} - t_{\alpha/2} \frac{s}{\sqrt{n}}$
$\bar{x}_H = \bar{x} + t_{\alpha/2} \frac{s}{\sqrt{n}}$
and Error Margin: $t_{\alpha/2} \frac{s}{\sqrt{n}}$
When σ is unknown
A group of students working on a summer project took a simple random sample of 30
families from the population of a “low income area” of a large city. The sample estimated
the annual income of a family as $\bar{x}$ = 6810 and s = 780. What would be the 95% confidence
interval for the mean income of all the families in this area?
What is the error margin?

Sample Mean = $\bar{x}$ = 6810
Sampling sd = $\frac{s}{\sqrt{n}} = \frac{780}{\sqrt{30}} = 142.41$
$t_{0.025}$ (29 df) = 2.045
Lower limit: 6810 − 2.045 × 142.41 ≈ 6518.8
Upper limit: 6810 + 2.045 × 142.41 ≈ 7101.2
EM = 2.045 × 142.41 ≈ 291.2
$t_{\alpha/2}$ (n − 1 d.f.) = t.inv.2t(p, df)
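The same interval can be computed in a short sketch. Python's standard library has no t-quantile function, so the critical value 2.045 (29 degrees of freedom, 2.5% in each tail) is passed in as read from the t table; the function name `t_interval` is illustrative.

```python
from math import sqrt


def t_interval(sample_mean, s, n, t_critical):
    """Confidence interval for the mean when sigma is estimated by s."""
    standard_error = s / sqrt(n)
    margin = t_critical * standard_error
    return sample_mean - margin, sample_mean + margin, margin


# t_critical taken from the t table for n - 1 = 29 degrees of freedom
lower, upper, margin = t_interval(6810, 780, 30, 2.045)
print(round(lower, 1), round(upper, 1), round(margin, 1))
```

Compared with the known-σ case, the only changes are that s replaces σ and the t value replaces the z value, which widens the interval for small samples.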
Exercise
In a town, out of 800 sampled automobile owners, 480 said they
would like to see the size of automobiles reduced. What are the 95%
confidence limits for the proportion of all automobile owners in the town who
would like to see car size reduced?

Sample Mean = $\bar{p} = \frac{480}{800}$ = 60%
Result: In large samples, the sampling distribution of
proportions follows the normal distribution with
sampling mean $E(\bar{p}) = p$ and
$\sigma_{\bar{p}} = \sqrt{\frac{\bar{p}\bar{q}}{n}}$
Therefore, given the sample mean $\bar{p}$ = 0.6,
the confidence interval for 95% is
$\bar{p} \pm 2 \times \sqrt{\frac{\bar{p}\bar{q}}{n}}$
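The limits for the automobile-owner proportion follow from the formula directly. The sketch below uses the multiplier 2 given in the slide (an approximation to the 95% value 1.96); the function name is illustrative.

```python
from math import sqrt


def proportion_interval(successes, n, z):
    """Large-sample confidence interval for a population proportion."""
    p_bar = successes / n
    q_bar = 1 - p_bar
    standard_error = sqrt(p_bar * q_bar / n)
    return p_bar - z * standard_error, p_bar + z * standard_error


lower, upper = proportion_interval(480, 800, 2)
print(round(lower, 4), round(upper, 4))
```

With 800 respondents the standard error is about 0.017, so the interval is roughly 60% plus or minus 3.5 percentage points.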
When a test is done, the Type I and Type II errors that may be
committed are as below:
Type I error: H0 is true, but it was rejected
Example: Deciding that the fuel efficiency is more than 64 kmpl when actually it is not
Type II error: H0 is not true, but it was not rejected
Example: Deciding that the fuel efficiency is less than 64 kmpl when actually it is above
Procedure of Test of Hypothesis
The Test of Hypothesis procedure is simply a decision
rule that specifies whether the null hypothesis H0
should be rejected or retained for every possible value
of a statistic observable in a simple random sample of
size n
Important Concepts
Level of Significance
Test statistic z (Calculated z)
‘p’ value
Critical value zc
Level of Significance
The level of significance refers to the tolerance
probability level that the researcher is willing
to accept of committing Type I error (that is
rejecting null hypothesis when it is actually
true)
Level of significance is always specified by the
researcher
Level of significance is generally denoted by α
and is commonly chosen as 0.05 or 0.01
Test Statistic
If µ is the population mean, σ is the population
standard deviation, and n is the sample size,
the test statistic for sample mean $\bar{x}$ is given as:
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
[Figure: sampling distribution centred at µ = 3,65,000 (z = 0), with the Reject H0 region marked at µ + 2.33σ (z = 2.33); y-axis: probability]
Exercise 1(cont..)
Given Sample Mean
$\bar{x}$ = 3,75,000
Test Statistic $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{375000 - 365000}{5000} = 2$
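The comparison of the calculated z with the critical value can be sketched as follows. The code assumes a right-tailed test at α = 0.01, which is inferred from the critical value 2.33 marked in the exercise rather than stated explicitly; the function name is illustrative.

```python
from statistics import NormalDist


def z_statistic(sample_mean, mu0, standard_error):
    """Test statistic: (x_bar - mu0) / (sigma / sqrt(n))"""
    return (sample_mean - mu0) / standard_error


z = z_statistic(375000, 365000, 5000)

alpha = 0.01                                 # assumed level of significance
z_critical = NormalDist().inv_cdf(1 - alpha)  # about 2.33 for a right-tailed test
reject_h0 = z > z_critical

print(z, round(z_critical, 2), reject_h0)
```

Under these assumptions the calculated z of 2 falls short of the critical value, so H0 would be retained.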
[Figure: two-tailed test with Reject H0 regions in both tails; y-axis: probability]
Population mean at A = µ1
Population mean at B = µ2
Population sd
For region A σ1= s1 = 4.00
For region B σ2 = s2 = 4.50
Exercise
The problem is to test whether the
mean of Y is 0 (that is, µ = 0)
Hypotheses
Null H0: µ = 0
Alt H1: µ ≠ 0
Samples
Sample mean of A = $\bar{x}_1$ = 300.01
Sample mean of B = $\bar{x}_2$ = 295.21
Therefore, the sample mean of Y
$\bar{y} = \bar{x}_1 - \bar{x}_2 = 4.8$
Sampling Distribution
Sampling mean $\mu_{\bar{y}} = \mu = 0$
Sd = $s_{\bar{y}} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{4^2}{100} + \frac{4.5^2}{200}} = 0.51$
Exercise
Test Statistic at sample mean $\bar{y}$ = 4.8
$z = \frac{\bar{y} - \mu}{s_{\bar{y}}} = \frac{4.8 - 0}{0.51} = 9.41$
Z (critical)
Given that α = 0.02
As the test is two tail test, z value at α/2 = 2.33
As z calculated 9.41 is higher than z (critical) 2.33, H0 is to be rejected
Conclusion: The average wage rates of A and B are significantly
different
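The two-region wage comparison can be checked with a short sketch. The function name is illustrative. The code keeps the standard error unrounded, so the statistic comes out near 9.39 rather than the 9.41 obtained by hand with the rounded value 0.51; the conclusion is the same.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, s1, s2, n1, n2):
    """z statistic for H0: mu1 - mu2 = 0, using large-sample standard deviations."""
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    return (mean1 - mean2) / standard_error, standard_error


z, standard_error = two_sample_z(300.01, 295.21, 4.00, 4.50, 100, 200)

alpha = 0.02
z_critical = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed, about 2.33
reject_h0 = abs(z) > z_critical

print(round(standard_error, 2), round(z, 2), round(z_critical, 2), reject_h0)
```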
Sampling Distribution in Small Sample cases
When the sample size is small (generally n < 30), the statistic $z = \frac{\bar{x} - \mu}{s_{\bar{x}}}$, where
$s_{\bar{x}}$ is the estimated standard error, does not approximate to the normal
distribution
The statistic in such cases is found to follow the ‘t’ distribution
The table values for t are available in the ‘t’ tables
The test statistic $t = \frac{\bar{x} - \mu}{s_{\bar{x}}}$ would be associated with degrees of freedom
of (n-1)
Example
The personnel department of a company developed an aptitude test for a certain type
of semiskilled worker. The individual test scores were assumed to be normally
distributed, with mean 100. It was agreed that this hypothesis would be subjected to a
two-tailed test at the 5% level of significance. The aptitude test was given to a simple random
sample of 16 semiskilled workers with the following results:
$\bar{x} = 94$; $s = 5$; $n = 16$

Hypotheses
H0: µ = 100
H1: µ ≠ 100
Sampling Distribution
$s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{5}{4} = 1.25$
Test statistic
$t = \frac{\bar{x} - \mu}{s_{\bar{x}}} = \frac{94 - 100}{1.25} = -4.8$
The t table value $t_{5\%/2}$ for significance at 15 df is 2.131
Since |t| = 4.8 exceeds 2.131, H0 cannot be maintained
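The aptitude-test decision can be reproduced in a few lines. Python's standard library has no t-quantile function, so the critical value 2.131 (15 degrees of freedom, two-tailed 5%) is entered as read from the t table; the function name is illustrative.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, s, n):
    """t = (x_bar - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom"""
    return (sample_mean - mu0) / (s / sqrt(n))


t = t_statistic(94, 100, 5, 16)

t_critical = 2.131            # from the t table: 15 df, 2.5% in each tail
reject_h0 = abs(t) > t_critical

print(round(t, 2), reject_h0)
```

The sign of t only indicates the direction of the difference; for a two-tailed test the decision depends on its absolute value.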