Basic Statistic

Business Analytics
EduPristine www.edupristine.com
Agenda
Introduction
Data
Basic Statistics
3. Basic Statistics
I. Probability
II. Random variables
III. Probability distribution
IV. The Central Limit Theorem
V. Sampling and statistical inference
VI. Confidence intervals
VII. Hypothesis testing
3.a. Probability
Probability is a numerical way of describing how likely something is to happen.
One of the fundamental methods of calculating probability is by using set theory.
A set is defined as a collection of objects and each individual object is called an element of that set.
Example: in the credit card data, the distinct numbers of credit cards owned form a set:
# Cards = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
The numbers on a die form a set:
Die = {1, 2, 3, 4, 5, 6}
The sample space (S) is the set of all possible outcomes that might be observed for an experiment.
If all elements of the sample space are equally likely, then we can define the probability of
event A as:
P(A) = (# elements in A)/(# elements in sample space)
e.g. P(# Cards = 1) = (# of customers having 1 card)/(Total number of customers) = 100/1000 = 0.10 = 10%
e.g. Probability of rolling an even number on a die
Sample space (S) = {1, 2, 3, 4, 5, 6}
Event (A) = {2, 4, 6}
P(A) = 3/6 = 0.5 = 50%
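The counting rule above can be sketched in a few lines of Python. This is only an illustration under the equally-likely assumption; the helper name `probability` is hypothetical and not part of the slides.

```python
from fractions import Fraction

def probability(event, sample_space):
    """P(A) = (# elements in A) / (# elements in S), assuming equally likely outcomes."""
    event = set(event) & set(sample_space)  # ignore anything outside the sample space
    return Fraction(len(event), len(sample_space))

die = {1, 2, 3, 4, 5, 6}                      # sample space S
even = {x for x in die if x % 2 == 0}         # event A: an even number is rolled
p_even = probability(even, die)

print(sorted(even), p_even, float(p_even))
```

Using `Fraction` keeps the ratio exact, which makes it easy to compare against the hand calculation.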
Why is it important from an analytics perspective?

What we do: analyze historical data to find patterns, under the assumption that the future will resemble the past.
Using probability theory, we predict the future from historical patterns.
3.a. Probability- Other topics

Set operations
  Union ($A \cup B$)
  Intersection ($A \cap B$)
Venn diagrams
  Basic operations on Venn diagrams
Basic probability axioms
  1. $P(S) = 1$
  2. $P(A) \ge 0$ for all $A \subseteq S$
  3. $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
Conditional probability
  $P(A|B) = P(A \cap B) / P(B)$, for $P(B) > 0$
Bayes theorem
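As a small illustration of the addition rule and of conditional probability, the sketch below reuses the die example. The events A (even number) and B (number greater than 3) are chosen here purely for illustration and do not come from the slides.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
a = {x for x in sample_space if x % 2 == 0}   # A: even number
b = {x for x in sample_space if x > 3}        # B: number greater than 3

def prob(event):
    """Probability of an event, assuming equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

# Addition rule: P(A u B) = P(A) + P(B) - P(A n B)
p_union = prob(a | b)
addition_rule_holds = p_union == prob(a) + prob(b) - prob(a & b)

# Conditional probability: P(A|B) = P(A n B) / P(B)
p_a_given_b = prob(a & b) / prob(b)

print(p_union, addition_rule_holds, p_a_given_b)
```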
3.b. Random variables

I. Definition
II. Types of Random Variables
   1. Discrete
   2. Continuous
III. Distribution and Probability Density functions of Random Variables
IV. Expected value (or Mean) of Random Variables
V. Variance of Random Variables
VI. Coefficient of skewness of Random Variables
3.b. Random variables- Definition

A random variable is a function or a rule which maps each outcome in a sample space to a real
number.
X(w) = x
[Figure: the random variable X maps the outcomes w1, w2, w3, ... of the sample space S to the values x1, x2, x3, ... in the set of real numbers.]
So, if w is an element of the sample space S (i.e. w is one of the possible outcomes of the
experiment concerned) and the number x is associated with this outcome, then X(w) = x .
Convention:
Denote random variable by capital letter X
Denote the outcome or possible values by small letter x i.e. X(w) = x
3.b. Random variables- Definition

Example:
Suppose there are 8 balls in a bag. The random variable X is the weight, in kg, of a ball
selected at random. Balls 1, 2 and 3 weigh 0.1 kg, balls 4 and 5 weigh 0.15 kg and balls 6, 7
and 8 weigh 0.2 kg. Using the notation above, write down this information.
Solution:
X(b1) = 0.10 kg, X(b2) = 0.10 kg, X(b3) = 0.10 kg
X(b4) = 0.15 kg, X(b5) = 0.15 kg
X(b6) = 0.20 kg, X(b7) = 0.20 kg, X(b8) = 0.20 kg
[Figure: the weight X(bi) = x maps the sample space S of individual balls b1, ..., b8 to the set of real numbers {0.10, 0.15, 0.20} (weights in kg).]
3.b. Types of Random variables

There are two types of Random Variables
1. Discrete Random Variables
2. Continuous Random Variables
3.b. Discrete Random variables

Definition:
The set of all possible values of the outcome (x) is discrete (countable)
e.g.
Outcome of rolling a die = {1, 2, 3, 4, 5, 6}
Or # credit cards owned by an individual = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
Probabilities:
Probabilities are defined on events (subsets of the sample space S).
So what is meant by P(X = x)?
Suppose the sample space consists of eight outcomes {s1, s2, s3, s4, s5, s6, s7, s8}
Let the outcomes in
E1 = {s1, s2, s3} be associated with the number x1
E2 = {s4, s5} be associated with the number x2
E3 = {s6, s7, s8} be associated with the number x3
Then P(X = x1) means P(E1), and similarly for x2 and x3
3.b. Discrete Random variables

Probability functions
The function $f_X(x) = P(X = x)$ for each x in the range of X is the probability function (PF) of X
It specifies how the total probability of 1 is divided up amongst the possible values of X
and thus gives the probability distribution of X.
Also known as the probability distribution function (pdf); for a discrete variable it is more commonly
called the probability mass function (PMF)
Following are the requirements for a function to qualify as the probability function of a discrete
random variable:
$f_X(x) \ge 0$ for all x within the range of X
$\sum_x f_X(x) = 1$
Cumulative distribution functions

Gives the probability that X assumes a value that does not exceed x.
Denoted as $F_X(x) = P(X \le x)$, where $\max F_X(x) = 1$
3.b. Discrete Random variables- Probability

Example:
Suppose there are 8 balls in a bag. The random variable X is the weight, in kg, of a ball
selected at random. Balls 1, 2 and 3 weigh 0.1 kg, balls 4 and 5 weigh 0.15 kg and balls 6, 7
and 8 weigh 0.2 kg. Write down the different probability distribution functions.
Solution:
fX(0.10) = P(X=0.10) = probability the ball b1 or b2 or b3 is selected out of 8 balls = 3/8
fX(0.15) = P(X=0.15) = probability the ball b4 or b5 is selected out of 8 balls = 2/8
fX(0.20) = P(X=0.20) = probability the ball b6 or b7 or b8 is selected out of 8 balls = 3/8
FX(0.10) = P(X <= 0.10) = P(X=0.10) = 3/8
FX(0.15) = P(X <= 0.15) = P(X=0.10) + P(X=0.15) = 3/8 + 2/8 = 5/8
FX(0.20) = P(X <= 0.20) = P(X=0.10) + P(X=0.15) + P(X=0.20) = 3/8 + 2/8 + 3/8 = 8/8 = 1
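The probability function and the cumulative distribution function of the ball example can be tabulated with a short Python sketch. The variable names (`weights`, `pf`, `cdf`) are illustrative only.

```python
from collections import Counter
from fractions import Fraction

# Weights in kg of balls b1..b8, as given in the example.
weights = [0.10, 0.10, 0.10, 0.15, 0.15, 0.20, 0.20, 0.20]
n = len(weights)

# Probability function f_X(x) = P(X = x): share of balls with each weight.
counts = Counter(weights)
pf = {x: Fraction(c, n) for x, c in sorted(counts.items())}

# Cumulative distribution function F_X(x) = P(X <= x): running sum of f_X.
cdf = {}
running = Fraction(0)
for x, p in pf.items():
    running += p
    cdf[x] = running

for x in pf:
    print(f"x = {x:.2f}  f_X(x) = {pf[x]}  F_X(x) = {cdf[x]}")
```

Note that `Fraction` reduces 2/8 to 1/4 when printing; the values are otherwise the same as in the solution.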
3.b. Continuous Random variables

Definition:
The set of possible values taken by a continuous random variable falls in an interval (or a collection of
intervals) on the real line:
e.g. Salary of a set of individuals
Mathematical examples: $\{x : x > 0\}$ or $\{x : -\infty < x < \infty\}$ or $\{x : 0 < x < 1\}$
Probability Density Function
First define the range or the interval in which the probability has to be determined,
say (a, b).
The probability associated is represented as $P(a < X < b)$ or $P(a \le X \le b)$.
It is the area under the curve of the probability density function (PDF) from a to b.
So probabilities can be evaluated by integrating the PDF $f_X(x)$.
This relationship defines the PDF.
Mathematically
$P(a < X < b) = \int_a^b f_X(x)\,dx$
The conditions for a function to serve as a PDF are
$f_X(x) \ge 0$ for all $x$
$\int_{-\infty}^{\infty} f_X(x)\,dx = 1$
3.b. Continuous Random variables

Cumulative distribution function:
The cumulative distribution function (CDF) is defined to be the function:
$F_X(x) = P(X \le x)$
For a continuous random variable, $F_X(x)$ is a continuous, non-decreasing function, defined for all
real values of x.
$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$
3.b. Random variables- Expected values

Definition:
Expected values are numerical summaries of important characteristics of the distributions of
random variables.
The expected value of a random variable X is denoted as E[X]
Important expected values are
Mean
Variance and Standard deviation
Mean:
E[X] is a measure of central location
For the discrete case, calculated as $E[X] = \sum_i x_i P_i$ or $E[X] = \sum_x x f_X(x)$
For the continuous case, calculated as $E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx$
Usually denoted by $\mu$
Variance:
$\mathrm{Var}[X] = E[(X - E[X])^2]$
$\mathrm{Var}[X] = E[X^2] - (E[X])^2$
3.b. Random variables- Expected values

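Example:
Suppose there are 8 balls in a bag. The random variable X is the weight, in kg, of a ball
selected at random. Balls 1, 2 and 3 weigh 0.1 kg, balls 4 and 5 weigh 0.15 kg and balls 6, 7
and 8 weigh 0.2 kg. Find the mean and variance of the weight.
Solution:
fX(0.10) = P(X=0.10) = 3/8
fX(0.15) = P(X=0.15) = 2/8
fX(0.20) = P(X=0.20) = 3/8
E[X] = sum of Pi * xi = 3/8 * 0.10 + 2/8 * 0.15 + 3/8 * 0.20 = 1.2/8 = 0.15 kg
E[X^2] = sum of Pi * xi^2 = 3/8 * 0.10^2 + 2/8 * 0.15^2 + 3/8 * 0.20^2 = 0.024375 kg^2
Var[X] = E[X^2] - (E[X])^2 = 0.024375 - 0.0225 = 0.001875 kg^2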
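The mean and variance above can be checked with a few lines of Python. This is a sketch using exact fractions; the dictionary `pf` simply restates the probability function from the solution.

```python
from fractions import Fraction

# Probability function of the ball weights (kg), taken from the example.
pf = {
    Fraction("0.10"): Fraction(3, 8),
    Fraction("0.15"): Fraction(2, 8),
    Fraction("0.20"): Fraction(3, 8),
}

mean = sum(x * p for x, p in pf.items())              # E[X]
second_moment = sum(x**2 * p for x, p in pf.items())  # E[X^2]
variance = second_moment - mean**2                    # Var[X] = E[X^2] - (E[X])^2

print(float(mean), float(second_moment), float(variance))
```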
3.c. Discrete Probability distributions

I. Define and describe Discrete Probability distributions

1) Uniform
2) Bernoulli
3) Binomial
4) Poisson
5) Negative Binomial
3.c. Discrete PDF- Uniform distribution

Sample space $S = \{1, 2, 3, \ldots, k\}$.
Probability measure:
equal assignment (1/k) to all outcomes, i.e. all outcomes are equally likely.
Random variable X defined by X(i) = i, (i = 1, 2, 3, ..., k).

Distribution: $P(X = x) = 1/k$
where x = 1, 2, 3, 4, ..., k
Expected values:
Mean, $\mu = (k + 1)/2$
Variance, $\sigma^2 = (k^2 - 1)/12$
Example: Assigning equal probability of default to a portfolio of credit card holders.

[Figure: Uniform distribution, probability for k = 10; each value x = 1, ..., 10 has probability 0.1.]
3.c. Discrete PDF- Bernoulli distribution

A Bernoulli trial is an experiment which has only two possible outcomes: s ("success") and f
("failure").
"Success" and "failure" are mere labels and should not be taken literally. Instead we could have
"yes" and "no" or "true" and "false"
Sample space S = {s, f}.
Probability measure:
P({s}) = p, P({f}) = 1 - p, 0 < p < 1
Random variable X defined by
X(s) = 1, X(f) = 0.
Distribution: $P(X = x) = p^x (1-p)^{1-x}$, x = 0, 1; 0 < p < 1
Expected values:
Mean, $\mu = p$
Variance, $\sigma^2 = p(1 - p)$
Examples:
Tossing of a coin. Head corresponds to success and Tail corresponds to failure.
Defaulting on a home loan. Default corresponds to success and Non-default corresponds to failure.
Auto insurance policy. No claim corresponds to success and Claim corresponds to failure.
3.c. Discrete PDF- Bernoulli distribution

Bernoulli distribution with probability of success (p) = 0.25.
[Figure: Bernoulli distribution, p = 0.25; P(X = 1) = 0.25 for s and P(X = 0) = 0.75 for f.]
3.c. Discrete PDF- Binomial distribution

Assumptions
Each trial has only 2 possible outcomes, "success" and "failure".
The trials are identical and their number is fixed (usually denoted by n)
The probability of success p is constant (0 <= p <= 1).
All trials are independent
Example
Number of borrowers that may default during a time period, if we know the total number of
borrowers and a constant PD for all borrowers, assuming default independence
Number of claims on insurance policies from total policy holders
Binomial Distribution (One-Parameter Distribution)
The probability of getting exactly k successes in n trials is given by the probability mass
function:
$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$ for k = 0, 1, 2, ..., n, where
$\binom{n}{k} = n!/\{(n-k)!\,k!\}$
Mean(X) = n x p
Variance(X) = n x p x q, where q = 1 - p
Discrete (Counting) Distribution Useful for Modeling Frequency of Losses

Variance < Mean, useful if variance of operational loss frequency is less than mean
3.c. Case study for binomial distribution

Each operational loss is assumed to be insured, independently, with probability 60% in a BL
(the 60% is derived from historical data as number of insured losses / total number
of losses in the BL over the last 36 months). This implies that the annual number of insured losses
is the sum of Bernoulli trial results and follows a binomial distribution. If, during a
particular year, 20 losses happen in the BL, what is the probability that the Bank would have
insurance in 10 or fewer cases?
Answer: Refer sheet: Ex-Binomial
What is the exact probability of getting insurance benefit in 18 out of 20 cases?
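The referenced spreadsheet is not reproduced here. As a rough stand-in, the two questions can be sketched in Python using the standard binomial probability mass function with n = 20 and p = 0.6; the function name `binomial_pmf` is illustrative, not something from the slides.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): nCk * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 20, 0.60   # 20 losses in the year, each insured with probability 60%

# Probability that 10 or fewer of the 20 losses are insured: P(X <= 10)
p_at_most_10 = sum(binomial_pmf(k, n, p) for k in range(0, 11))

# Probability that exactly 18 of the 20 losses are insured: P(X = 18)
p_exactly_18 = binomial_pmf(18, n, p)

print(f"P(X <= 10) = {p_at_most_10:.4f}")
print(f"P(X = 18)  = {p_exactly_18:.6f}")
```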
3.c. Discrete PDF- Poisson distribution

Expresses the probability of a number of events occurring in a fixed period of time if these events
occur with a known average rate and independently of the time since the last event
Assumptions
Constant mean $\lambda$ (number of events in a prespecified time interval)
The interval length between two consecutive events follows an exponential distribution ($\lambda$)
Sum of independent Poisson variables is also Poisson:
$\lambda$ for a 12M period may be taken as 4 x $\lambda$ for a 3M period
Example
For instance, the occurrence of operational risk losses or credit defaults during a time period, if
individual events are independent
Poisson Distribution (One-Parameter Distribution)
If the expected number of occurrences in the interval is $\lambda$, the probability that there are
exactly k occurrences is equal to
$P(X = k) = \lambda^k e^{-\lambda} / k!$
k is the number of occurrences of an event and is a non-negative integer, k = 0, 1, 2, ...
k! is the factorial of k
e is the base of the natural logarithm (e = 2.718...)
$\lambda$ is a positive real number, equal to the expected number of occurrences during the interval
Mean(X) = Variance(X) = $\lambda$
Discrete (Counting) Distribution Most Popular for Modeling Frequency of Losses

Variance = Mean, useful if variance of operational loss frequency is equal to mean
3.c. Case study for Poisson distribution

Annual mean Damage to Physical Assets frequency in Agency Services is 5.9 events p.a.
Find the probability of recording 0, 1, 2, 3, 4, ..., 20 losses over the next 12M.
The Bank actually records 10 such events over the next 12M. The management feels that it is a
1-in-100-years scenario. Verify this hypothesis.
Answer: Refer sheet Ex-Poisson
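The spreadsheet itself is not included, so the following Python sketch shows one way to work the case study. It assumes the annual loss count is Poisson with mean 5.9 and interprets "1 out of 100 years" as the probability of recording 10 or more events being at most 1%; that interpretation is an assumption made here, not something the slides state.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam): lam^k * e^(-lam) / k!."""
    return lam**k * exp(-lam) / factorial(k)

lam = 5.9  # annual mean frequency from the case study

# Probabilities of recording 0, 1, ..., 20 losses over the next 12 months.
pmf = {k: poisson_pmf(k, lam) for k in range(21)}
for k, prob in pmf.items():
    print(f"P(X = {k:2d}) = {prob:.5f}")

# Probability of a year with 10 or more events (assumed reading of the hypothesis).
p_at_least_10 = 1.0 - sum(pmf[k] for k in range(10))
is_one_in_100 = p_at_least_10 <= 0.01

print(f"P(X >= 10) = {p_at_least_10:.4f}; 1-in-100-year event? {is_one_in_100}")
```

Under these assumptions the tail probability comes out well above 1%, so the sketch does not support the management's view.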
3.c. Discrete PDF- Negative- Binomial distribution

Discrete probability distribution of the number of failures (r)
in a sequence of Bernoulli trials before a specified (nonrandom) number k of successes occurs
Generalization of the Poisson distribution
Intensity rate ($\lambda$) is no longer taken to be constant
(assumed to follow a Gamma distribution)
Negative Binomial Distribution (Two-Parameter Distribution)
The probability of getting exactly r failures before k successes is given
by the probability mass function:
$P(X = r) = \binom{r+k-1}{r} p^k (1-p)^r$ for r = 0, 1, 2, ...,
where $\binom{n}{m} = n!/\{(n-m)!\,m!\}$ and p is the probability of success in each trial
Two-parameter distribution
Provides additional flexibility in fitting data
Parameter uncertainty may be high with few data points
(typical of scenarios where there may be only 3-6 annual frequency data points)
Advantages
Allows modelling of frequency dependence, under the
assumption that the occurrence of operational losses may be
affected by some external factor
Discrete (Counting) Distribution Popular for Modeling Frequency of Losses

Variance > Mean, useful if variance of operational loss frequency is greater than mean
3.c. Continuous Probability distributions

I. Define and describe Continuous Probability distributions

1) Uniform
2) Normal
3) Lognormal
4) Exponential
5) Gamma
6) Chi-square
7) t-distribution
8) F-distribution
3.c. Continuous PDF- Uniform distribution

Assigns equal probability density to all values between its minimum and maximum values.
Random variable X takes a value between two numbers a and b (say).
Denoted as X ~ U(a, b)
Probability density function:
$f_X(x) = 1/(b-a)$, a < x < b
Expected values:
Mean, $\mu = (a + b)/2$
Variance, $\sigma^2 = (b - a)^2/12$
Example: Assigning equal probability of default to a portfolio of credit card holders.

[Figure: Uniform distribution, probability density on (0, 10); constant at 0.1.]
3.c. Continuous PDF- Gamma distribution

The Gamma family of distributions is a positively skewed family described by two parameters, $\alpha$ and
$\lambda$ (say).
It is bounded at zero and can take various shapes depending on the values of the parameters.
Random variable X takes positive values.
$f_X(x) = \lambda^{\alpha} x^{\alpha-1} e^{-\lambda x} / \Gamma(\alpha)$, x > 0
Denoted as X ~ Gamma($\alpha$, $\lambda$)
Expected values:
Mean, $\mu = \alpha/\lambda$
Variance, $\sigma^2 = \alpha/\lambda^2$
Special cases:
Exponential distribution when $\alpha = 1$: $f_X(x) = \lambda e^{-\lambda x}$, x > 0
Chi-square distribution with $\nu$ degrees of freedom when $\alpha = \nu/2$ ($\nu$ any positive integer) and $\lambda = 1/2$
Example:
Used to predict claim amounts in auto insurance.
Used to predict loss amounts in bank loan defaults.
3.c. Continuous PDF- Gamma distribution

Plotting PDFs for different Gamma distributions using MS Excel.

X     Ga(2, 3)   Ga(1, 4)   Ga(20, 0.5)
1      7.96%     19.47%      0.00%
2     11.41%     15.16%      0.00%
3     12.26%     11.81%      0.00%
4     11.72%      9.20%      0.08%
5     10.49%      7.16%      0.75%
6      9.02%      5.58%      3.23%
7      7.54%      4.34%      8.17%
8      6.18%      3.38%     13.98%
9      4.98%      2.63%     17.73%
10     3.96%      2.05%     17.77%
11     3.12%      1.60%     14.71%
12     2.44%      1.24%     10.40%
13     1.90%      0.97%      6.44%
14     1.46%      0.75%      3.56%
15     1.12%      0.59%      1.79%
16     0.86%      0.46%      0.82%
17     0.65%      0.36%      0.35%
18     0.50%      0.28%      0.14%
19     0.37%      0.22%      0.05%
20     0.28%      0.17%      0.02%

The tabulated values correspond to a shape parameter followed by a scale parameter (the convention of
Excel's GAMMADIST), i.e. the second parameter is the reciprocal of the rate parameter used in the density
formula.
[Figure: Gamma distribution; probability against the random variable (X) for Ga(2, 3), Ga(1, 4) and Ga(20, 0.5).]
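The plotted values can be reproduced outside Excel with a short Python sketch. It assumes the Ga(a, b) labels mean shape a and scale b, which is what the tabulated numbers are consistent with; the function name `gamma_pdf` is illustrative.

```python
from math import exp, gamma

def gamma_pdf(x, shape, scale):
    """Gamma density with shape alpha and scale beta (rate lambda = 1 / beta)."""
    return x ** (shape - 1) * exp(-x / scale) / (gamma(shape) * scale**shape)

# (shape, scale) pairs assumed for the three curves in the plot.
curves = {"Ga(2, 3)": (2, 3), "Ga(1, 4)": (1, 4), "Ga(20, 0.5)": (20, 0.5)}

print("X   " + "  ".join(f"{name:>12}" for name in curves))
for x in range(1, 21):
    row = "  ".join(f"{gamma_pdf(x, a, b):12.2%}" for a, b in curves.values())
    print(f"{x:<3} {row}")
```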
3.c. Continuous PDF- Normal distribution

A symmetrical distribution with a bell-shaped pdf curve.
Widely used to model naturally occurring variables, e.g. height, weight, exam scores etc.
Has two parameters: mean ($\mu$) and variance ($\sigma^2$).
Random variable X can take any real value.
$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right]$, $-\infty < x < \infty$
Denoted as X ~ N($\mu$, $\sigma^2$)
It provides good approximations to various other distributions (Central Limit Theorem)
The transformation $z = (x - \mu)/\sigma$ has the N(0, 1) distribution.
The probability is then found by looking it up in the standard normal probability distribution
table for the N(0, 1) distribution.

Plotting PDFs for different Normal distributions using MS Excel.

x     N(0,1)   N(0,1.6)
-5     0.0%     0.2%
-4     0.0%     1.1%
-3     0.4%     4.3%
-2     5.4%    11.4%
-1    24.2%    20.5%
0     39.9%    24.9%
1     24.2%    20.5%
2      5.4%    11.4%
3      0.4%     4.3%
4      0.0%     1.1%
5      0.0%     0.2%

The tabulated N(0,1.6) values correspond to a standard deviation (not a variance) of 1.6.
[Figure: Normal density curves for N(0,1) and N(0,1.6), plotted for x from -5 to 5.]

Problem:
If X ~ N(25, 36), by making use of the standard normal probability distribution table, find:
(i) P(X < 28)
(ii) P(X > 30)
(iii) P(X < 20)
Solution:
(i) P(X < 28) = P(Z < (28-25)/sqrt(36)) = P(Z < 3/6) = P(Z < 0.5) = 0.69146
(ii) P(X > 30) = P(Z > (30-25)/6) = P(Z > 0.833) = 1 - P(Z < 0.833) = 1 - 0.79758 = 0.20242
(iii) P(X < 20) = P(Z < (20-25)/6) = P(Z < -0.833) = 1 - P(Z < 0.833) = 1 - 0.79758 = 0.20242
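The table lookups can be cross-checked with Python's standard library. This is a sketch using `statistics.NormalDist`, where N(25, 36) is entered with standard deviation 6 because the second parameter in the problem is the variance.

```python
from math import sqrt
from statistics import NormalDist

# X ~ N(25, 36): mean 25, variance 36, so standard deviation sqrt(36) = 6.
x_dist = NormalDist(mu=25, sigma=sqrt(36))

p_less_28 = x_dist.cdf(28)          # (i)   P(X < 28)
p_more_30 = 1 - x_dist.cdf(30)      # (ii)  P(X > 30)
p_less_20 = x_dist.cdf(20)          # (iii) P(X < 20)

print(f"P(X < 28) = {p_less_28:.5f}")
print(f"P(X > 30) = {p_more_30:.5f}")
print(f"P(X < 20) = {p_less_20:.5f}")
```

Small differences from the table values in the last digits are expected, since the table uses z rounded to 0.833.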
3.c. Continuous PDF- Lognormal distribution

A positively skewed distribution.
If a random variable X has a lognormal distribution then Y = log(X) is normally distributed.
Random variable X is bounded at zero and used to model variables taking positive values.
Defined by two parameters $\mu$ and $\sigma^2$ and denoted as X ~ logN($\mu$, $\sigma^2$)
$f_X(x) = \frac{1}{x\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{\log x-\mu}{\sigma}\right)^2\right]$, 0 < x
Expected values:
Mean, $E[X] = \exp(\mu + \sigma^2/2)$
Variance, $\mathrm{var}(X) = \exp(2\mu + \sigma^2)\,(\exp(\sigma^2) - 1)$
Example:
Used to predict claim amounts in auto insurance.
Used to predict loss amounts in bank loan defaults.
3.c. Continuous PDF- Lognormal distribution

Plotting the PDF of the lognormal (0,1) distribution using MS Excel.
[Figure: logN(0,1) density plotted for x from 0 to 20; vertical axis from 0.00 to 0.45.]
3.d. The Central Limit Theorem

Introduction:
It is perhaps one of the most important results in statistics
It provides the basis for large-sample inference about a population mean when the population
distribution is unknown.
It also provides the basis for large-sample inference about a population proportion, for example, in
opinion polls and surveys.
Definition:
If $X_1, X_2, \ldots, X_n$ is a sequence of independent, identically distributed (iid) random variables with finite
mean $\mu$ and finite (non-zero) variance $\sigma^2$, then the distribution of $(\bar{X} - \mu)/(\sigma/\sqrt{n})$ approaches the
standard normal distribution, N(0,1), as $n \to \infty$
$\mu$ is the mean of the population from which $X_1, X_2, \ldots, X_n$ have been drawn.
$\bar{X}$ is the sample mean, calculated as $\bar{X} = (1/n) \sum_{i=1}^{n} X_i$
For large n, $(\bar{X} - \mu)/(\sigma/\sqrt{n})$ and $(\sum X_i - n\mu)/\sqrt{n\sigma^2}$ have approximately the N(0, 1) distribution
OR
$\bar{X} \sim N(\mu, \sigma^2/n)$
$\sum X_i \sim N(n\mu, n\sigma^2)$
3.d. The Central Limit Theorem

Example:
It is assumed that the number of claims arriving at an insurance company per working day has
a mean of 40 and a standard deviation of 12. A survey was conducted over 50 working days.
Find the probability that the sample mean number of claims arriving per working day was less
than 35.
Solution:
We have $\mu = 40$, $\sigma = 12$, n = 50.
The central limit theorem states that, approximately, $\bar{X} \sim N(40, 12^2/50)$.

We want $P(\bar{X} < 35)$:
$P(\bar{X} < 35) = P(Z < (35-40)/\sqrt{12^2/50})$
= P(Z < -2.946) = 1 - P(Z < 2.946)
= 1 - 0.9984 = 0.0016
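The same calculation can be sketched in Python with the standard library, treating the sample mean as approximately normal with standard error sigma / sqrt(n). Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 40, 12, 50          # population mean, standard deviation, sample size
standard_error = sigma / sqrt(n)   # standard deviation of the sample mean

z = (35 - mu) / standard_error     # standardised value of the threshold 35
p_mean_below_35 = NormalDist().cdf(z)

print(f"z = {z:.3f}, P(sample mean < 35) = {p_mean_below_35:.4f}")
```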
3.e. Sampling and Statistical Inference

I. Introduction
II. Random samples
III. Sample Mean
IV. Sample variance
V. The t-result
VI. The F-result
3.e. Sampling and Statistical Inference

Introduction:
When a sample is taken from a population the sample information can be used to infer
certain things about the population.
For example, a population quantity could be its mean or variance.
If we were to keep taking samples from the same population and calculating the mean and
variance for each of the samples, we would find that the mean and variance results form
distributions as well.
The distributions of the sample mean and sample variance are called sampling
distributions.
Need for Sampling:
The physical impossibility of checking all items in the population.
The cost of studying all the items in a population.
The sample results are usually adequate.
Contacting the whole population would often be time-consuming.
The destructive nature of certain tests.
3.e. Sampling and Statistical Inference- Normal distribution

The sample mean:
Mean, $\bar{X} = (1/n) \sum X_i$
Distribution:
$(\bar{X} - \mu)/(\sigma/\sqrt{n}) \sim N(0,1)$
$\bar{X} \sim N(\mu, \sigma^2/n)$
$\mu$ is the population mean about which we are trying to draw inference
The sample variance:
Variance, $S^2 = \sum (X_i - \bar{X})^2/(n-1)$
Distribution:
$(n-1) S^2/\sigma^2 \sim \chi^2_{n-1}$
$\sigma^2$ is the population variance about which we are trying to draw inference
Point to be noted: the distribution of the sample mean is symmetrical (Normal), whereas that of the
sample variance is positively skewed (chi-square) for small n but somewhat symmetrical for large n.
Expected value of $S^2$:
$E[(n-1) S^2/\sigma^2] = E[\chi^2_{n-1}] = n - 1$
(the mean and variance of $\chi^2_k$ are k and 2k, respectively)
$E[S^2] = \sigma^2$
i.e. the sample variance is an unbiased estimator of the population variance

The t result:
Distribution:
$(\bar{X} - \mu)/(\sigma/\sqrt{n}) \sim N(0,1)$ is used to draw inference about $\mu$ when the population variance $\sigma^2$ is known.
But for a population $\sigma^2$ is usually not known.
We combine $(\bar{X} - \mu)/(\sigma/\sqrt{n}) \sim N(0,1)$ and $(n-1) S^2/\sigma^2 \sim \chi^2_{n-1}$ to solve this problem.
$(\bar{X} - \mu)/(S/\sqrt{n}) = N(0,1)/\sqrt{\chi^2_{n-1}/(n-1)} \sim t_{n-1}$
As $N(0,1)/\sqrt{\chi^2_k/k} \sim t_k$
Example:
State the distribution of $(\bar{X} - 100)/(S/\sqrt{5})$ for a random sample of 5 values taken from a N(100, $\sigma^2$)
population. What is the probability that this quantity will exceed 1.533?
Solution:
Distribution: $(\bar{X} - 100)/(S/\sqrt{5}) \sim t_4$
From the t-distribution table, the probability that this quantity will exceed 1.533 is 10%.

The F result:
If independent random samples of size $n_1$ and $n_2$ respectively are taken from normal populations
with variances $\sigma_1^2$ and $\sigma_2^2$, then
$(S_1^2/\sigma_1^2) / (S_2^2/\sigma_2^2) \sim F_{n_1-1,\,n_2-1}$
The F distribution gives us the distribution of the variance ratio for two normal populations.
3.e. The F-test

Example: William Waugh is examining the earnings for two different industries. He suspects that the
earnings for chemical industry are more divergent than those of petroleum industry. To confirm, he
took a sample of 35 chemical manufacturers & a sample of 45 petroleum companies. He measured
the sample standard deviation of earnings across the chemical industry to be $3.5 & that of
petroleum industry to be $3.00. Determine if the earnings of the chemical industry have greater
standard deviation than those of the petroleum industry.
Solution
1. State the hypothesis: $H_0: \sigma_1^2 \le \sigma_2^2$, $H_a: \sigma_1^2 > \sigma_2^2$
where $\sigma_1^2$ = variance of earnings for the chemical industry
$\sigma_2^2$ = variance of earnings for the petroleum industry
2. Select the appropriate test statistic: $F = s_1^2 / s_2^2$
3. Specify the level of significance: take it as 5% here
4. State the decision rule regarding the hypothesis: Reject H0 if F > 1.74
5. Collect the sample & calculate the sample statistics:
Using the information provided, the F-statistic can be computed as:
$F = S_1^2 / S_2^2 = 3.50^2 / 3.00^2 = 12.25 / 9.00 = 1.3611 < 1.74$ (hence there is not sufficient evidence to reject H0)
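The arithmetic of the test can be sketched in Python. The critical value 1.74 is simply taken from the decision rule on the slide rather than computed, and the variable names are illustrative.

```python
# Sample standard deviations of earnings, from the example.
s_chemical = 3.50
s_petroleum = 3.00

# Test statistic: ratio of the sample variances.
f_statistic = s_chemical**2 / s_petroleum**2

# Critical value at the 5% level, as stated in the decision rule (not computed here).
f_critical = 1.74
reject_h0 = f_statistic > f_critical

print(f"F = {f_statistic:.4f}, reject H0: {reject_h0}")
```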
3.f. Point Estimate & Confidence Intervals

Point estimates: These are the single (sample) values used to estimate population parameters
Confidence interval: It is a range of values in which the population parameter is expected to lie
The confidence interval takes on the following form, where N >= 30
CI = m ± Z*sx
True for a population distribution where

m is the mean of the population
sx is the standard deviation of the population
For a sample mean,
Point estimate ± (reliability factor * standard error)
CI = x̄ ± Z*(Sx/√n)

Where x̄ is the mean of the sample
Sx is the standard deviation of the population (estimated by the sample standard deviation when unknown)
3g. Hypothesis Testing

A statistical hypothesis test is a method of making statistical decisions from and about
experimental data.
Null-hypothesis testing answers the question:
"How well do the findings fit the possibility that chance factors alone might be responsible?"
Example: Does your score of 6/10 imply that I am a good teacher?
3g. Key steps in Hypothesis Testing

Null Hypothesis (H0): The hypothesis that the researcher wants to reject
Alternate Hypothesis (Ha): The hypothesis which is concluded if there is sufficient evidence to
reject the null hypothesis
Test Statistic
Rejection/Critical Region
Conclusion
3g. Launching a niche course for MBA students?

Sam, a brand manager for a leading financial training center, wants to introduce a new niche finance course for MBA
students. He met some industry stalwarts and found that, with the skills acquired by attending such a course, the
students would be able to land a good job.
He meets a random sample of 100 students and discovers the following characteristics of the market:
Mean household income = $20,000
Interest level in students = high
Current knowledge of students for the niche concepts = low
Sam strongly believes the course would be adequately profitable if the students have the buying power for the
course. They would be able to afford the course only if the mean household income is greater than $19,000.
Would you advise Sam to introduce the course?
What should be the hypothesis?
o Hint: What is the point at which the decision changes (19,000 or 20,000)?
o What about the alternate hypothesis?
What other information do you need to ensure that the right decision is arrived at?
o Hint: confidence intervals / significance levels?
o Hint: Is there any other factor apart from the mean which is important? How do I move from population
parameters to standard errors?
What is the risk still remaining when you take this decision?
o Hint: Type-I/II errors?
o Hint: P-value
3g. Criterion for Decision Making

To reach a final decision, Sam has to make a general inference (about the population) from the
sample data.
Criterion: Mean income across all households in the market area under consideration.
If the mean population household income is greater than $19,000, then Sam should introduce
the course.
Sam's decision making is equivalent to either accepting or rejecting the hypothesis:
The population mean household income in the new market area is greater than $19,000
The term one-tailed signifies that all z-values that would cause Sam to reject H0 are in just one
tail of the sampling distribution
m -> Population Mean
H0: m <= $19,000
Ha: m > $19,000
3g. Identifying the Critical Sample Mean Value - Sampling Distribution

[Figure: sampling distribution of the sample mean centered on $19,000, with the critical value (Xc) marked in the right tail.]
Sample mean values greater than $19,000, that is, x-values on the right-hand side of the sampling
distribution centered on m = $19,000, suggest that H0 may be false.
More importantly, the farther to the right x is, the stronger is the evidence against H0
Reject H0 if the sample mean exceeds Xc
3g. Computing the Criterion Value
The standard deviation for the sample of 100 households is $4,000. The standard error of the mean
(sx) is given by:
sx = s/√n = $4,000/√100 = $400
The critical mean household income xc is found through the following two steps:

Determine the critical z-value, zc. For α = 0.05:
zc = 1.645.
Substitute the values of zc, sx, and m (under the assumption that H0 is "just" true)
Critical Value xc
xc = m + zc*sx = $19,000 + 1.645 * $400 = $19,658.
In this case, since the observed sample statistic ($20,000) is greater than the critical value ($19,658),
the null hypothesis is rejected.
Decision Rule
If the sample mean household income is greater than $19,658, reject the null hypothesis and introduce the
new course
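The two steps can be sketched in Python with the standard library, using `statistics.NormalDist().inv_cdf` to obtain the critical z-value instead of a table. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

m_0 = 19_000           # hypothesised population mean under H0
sample_mean = 20_000   # observed sample mean
sample_sd = 4_000      # sample standard deviation
n = 100                # sample size
alpha = 0.05           # significance level (one-tailed test)

standard_error = sample_sd / sqrt(n)          # standard error of the mean
z_critical = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical z-value
x_critical = m_0 + z_critical * standard_error

reject_h0 = sample_mean > x_critical
print(f"s_x = {standard_error:.0f}, z_c = {z_critical:.3f}, "
      f"x_c = {x_critical:.0f}, reject H0: {reject_h0}")
```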
3g. Test Statistic
The value of the test statistic is simply the z-value corresponding to x = $20,000:
z = (x - m)/sx = ($20,000 - $19,000)/$400 = 2.5
Here, sx is the standard error
There is a significant difference between the hypothesized population
parameter and the observed sample statistic =>
Mean income > 19,000 =>
Launch the course
[Figure: sampling distribution centered on m = $19,000 (Z = 0) with α = 0.05; the critical value $19,658 (Z = 1.645) separates "Do not Reject H0" from "Reject H0", and the observed x = $20,000 (Z = 2.5) lies in the rejection region.]
3g. Errors in Estimation

Please note: You are inferring for a population, based only on a sample
This is no proof that your decision is correct
It's just a hypothesis
There is still a chance that your inference is wrong
How do I quantify the prob. of error in inference?
Type I and Type II Errors:
Type I error occurs if the null hypothesis is rejected when it is true
Type II error occurs if the null hypothesis is not rejected when it is false

Inference \ Actual   H0 is True                                    H0 is False
H0 is True           Correct Decision (Confidence Level = 1 - α)   Type-II Error (P(Type II Error) = β)
H0 is False          Type-I Error (Significance Level = α)         Correct Decision (Power = 1 - β)

Significance Level:
α -> significance level: the upper-bound probability of a Type I error
1 - α -> confidence level: the complement of the significance level
The power of a test is the probability of correctly rejecting the null.
3g. P-Value - Actual Significance Level
The p-value is the smallest level of significance at which the null hypothesis can
be rejected.
P-value
The probability of obtaining an observed value of x (from the sample) as high as
$20,000 or more when the actual population mean (m) is only $19,000 = 0.00621
Calculated probability of rejecting the null hypothesis (H0) when that hypothesis
(H0) is true (Type I error)
The actual significance level of 0.00621 in this case means that the odds are about
62 out of 10,000 that a sample mean income of $20,000 or more would have occurred
entirely due to chance (when the population mean income is $19,000)
[Figure: sampling distribution centered on m = $19,000 (Z = 0) with α = 0.05; the p-value of 0.00621 is the area in the "Reject H0" region beyond the observed sample mean.]
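The p-value can be reproduced with a short Python sketch: it is the upper-tail area of the standard normal distribution beyond the observed z-value. The inputs restate the figures from the case study.

```python
from statistics import NormalDist

m_0 = 19_000           # population mean under H0
sample_mean = 20_000   # observed sample mean
standard_error = 400   # standard error of the mean (4,000 / sqrt(100))

z_observed = (sample_mean - m_0) / standard_error
p_value = 1 - NormalDist().cdf(z_observed)   # P(Z >= z_observed), one-tailed

print(f"z = {z_observed:.2f}, p-value = {p_value:.5f}")
```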
3g. Some variations in the Z-Test
What if Sam surveyed the market and found that the student behavior is estimated to be:
They would find the training too expensive if their household income is < US$19,000 and
hence would not have the buying power for the course?
They would perceive the training to be of inferior quality if their household income is >
US$19,000 and hence not buy the training?
How would the decision criteria change? What should be the testing strategy?
Hint: From the question wording infer: two-tailed testing
Appropriately modify the significance value and other parameters
Use the Z-test
Appropriate change in the decision-making and testing process:
Students will not attend the course if:
The household income is > $19,000 and the students perceive the course to be inferior
The household income is < $19,000
This becomes a two-tailed test wherein the student will join the course only when the
household income lies within particular bounds, i.e. the household income should be neither
very high nor very low
Two-Tailed Test
Now the test is modified to a two-tailed test,
which signifies that the z-values that would
cause Sam to reject H0 are in both tails of
the sampling distribution
m -> Population Mean
H0: m = $19,000
Ha: m ≠ $19,000
Since we are checking for a significant
difference at both ends, it is a two-tailed
test with α/2 = 0.025 in each tail
The lower boundary = m - Z(α/2) * sx = 19,000 - 1.96 * 400 = $18,216
The upper boundary = m + Z(α/2) * sx = 19,000 + 1.96 * 400 = $19,784
Conclusion: If the household income lies
between $18,216 and $19,784 then the
student will attend the course at 95%
confidence
[Figure: sampling distribution centered on m = $19,000 (Z = 0) with α/2 = 0.025 in each tail; "Reject H0" regions in both tails and "Do not Reject H0" between them.]
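The two boundaries can be recomputed with a brief Python sketch, splitting the 5% significance level equally between the two tails. Variable names are illustrative.

```python
from statistics import NormalDist

m_0 = 19_000           # hypothesised population mean
standard_error = 400   # standard error of the mean
alpha = 0.05           # total significance level, split across both tails

z_half_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96
lower_bound = m_0 - z_half_alpha * standard_error
upper_bound = m_0 + z_half_alpha * standard_error

print(f"z = {z_half_alpha:.2f}, lower = {lower_bound:.0f}, upper = {upper_bound:.0f}")
```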
Thank you!
EduPristine
702, Raaj Chambers, Old Nagardas Road, Andheri (E), Mumbai-400 069. INDIA
www.edupristine.com
Ph. +91 22 3215 6191