Excel
By Dr. Shailaja Rego
Introduction to statistics
Definition
1. Statistical Data
By Statistics we mean aggregates of facts affected to a
marked extent by multiplicity of causes, numerically
expressed, enumerated or estimated according to
reasonable standards of accuracy, collected in a
systematic manner for a predetermined purpose and
placed in relation to each other.
2. Statistical Methods
Statistics may be defined as the science of collection,
organisation, presentation, analysis and interpretation of
numerical data.
Applications of Statistics in
Various Areas
• Marketing
• Economics
• Finance
• Insurance
• Operations
• Human Resource Management or Development
• Information Systems
• Data Mining
Illustrative List of Statistical Techniques and their Applications
Sr No | Statistical Technique | Field | Specific Application
1 | Binomial Distribution | Quality Assurance | Sampling Inspection
2 | Cluster Analysis | Marketing | Target Marketing, Customer Profiling
3 | Cluster Analysis | Planning and Management | Identifying Similar Groups
4 | Control Chart | Production Engineering | Quality Control
5 | Correlation and Regression Analysis | Financial Risk | Hedging of Investments
Illustrative List of Statistical Techniques and their Applications (continued)
Sr No | Statistical Technique | Field | Specific Application
6 | Correlation and Regression Analysis | Marketing | Cross-Market Analysis
7 | Decision Theory | Finance | Investments, Portfolio Selection, Mergers and Acquisitions
8 | Discriminant Analysis | Finance | Credit Risk Analysis
9 | Discriminant Analysis | Marketing | Customer Profiling
10 | Forecasting | Banking | Business Forecasting
Illustrative List of Statistical Techniques and their Applications (continued)
Sr No | Statistical Technique | Field | Specific Application
11 | Forecasting | Finance | Pricing of Financial Products, Return on Investment
12 | Forecasting | HRD | Manpower Planning
Sr No | Statistical Technique | Field | Specific Application
16 | Logistic Regression | Finance | Credit Risk Analysis
17 | Normal Distribution | Equity Research | EPS
18 | Normal Distribution | Finance | Risk Management
19 | Normal Distribution | Finance | Yield Curve
20 | Normal Distribution | HRD | Performance Appraisal
21 | Normal Distribution | Production | Six Sigma
Illustrative List of Statistical Techniques and their Applications (continued)
Sr No | Statistical Technique | Field | Specific Application
22 | Normal Distribution | Production Engineering | Statistical Quality Control
23 | Normal Distribution | Project Management | PERT / CPM
24 | Percentiles | Education | Relative Ranking
25 | Percentiles | HR | Formulating Compensation Strategies
26 | Rank Correlation | Rankings | Rankings in Contests With Multiple Judges
27 | Rank Correlation | Rankings | Rankings with Multiple Criteria
28 | Sampling | Election | Opinion/Exit Polls
Illustrative List of Statistical Techniques and their Applications (continued)
Sr No | Statistical Technique | Field | Specific Application
29 | Sampling | Market Research | Consumer Survey
30 | Sampling | Production Engineering | Inspection and Quality Control
31 | Testing of Hypothesis | Agriculture/Chemical | Testing a Pesticide on Field
32 | Testing of Hypothesis | Paramedical-Pharmaceutical | Testing a Drug on Clinical Trial
33 | Weighted Average | Finance | Sensex, NIFTY, Wholesale Price and Consumer Price Indices
34 | Weighted Average | Finance | WACC (Weighted Average Cost of Capital) and EVA (Economic Value Added)
Illustrative List of Decision Situations and Corresponding
Statistical Techniques
Area | Decision Situation | Statistical Techniques Applicable
• Descriptive Analysis
• Inferential Analysis
• Differences Analysis (Test of analysis)
• Associative Analysis
• Predictive Analysis
Type | Description | Example | Statistical Concepts
Descriptive | Data reduction | Describes the typical respondents; describes how similar respondents are to the typical respondent | Mean, mode, median, standard deviation, range, frequency distribution
Inferential | Determine population parameters; test hypothesis | Estimate population values | Standard error, null hypothesis
Differences | Determine if differences exist between groups | Evaluate statistical significance of difference in the means of two groups in a sample | Z-test and t-test of differences, analysis of variance
Type | Description | Example | Statistical Concepts
Associative | Determine associations | Determine if two variables are related in a systematic way | Correlation, cross tabulation
Predictive | Forecast based on a statistical model | Estimate the level of Y, given the amount of X | Time series analysis, regression
When to use a particular descriptive measure?
$$ i = \frac{\text{Range}}{1 + 3.322 \log N} $$
where i is the class interval and N is the total number of observations
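The class-interval formula above can be tried out with a short sketch. It assumes the usual constant 3.322 (that is, 1 / log10 2) and a base-10 logarithm; the function name and the marks are hypothetical, chosen only for illustration.

```python
import math

def sturges_class_interval(values):
    """Suggested class-interval width: i = Range / (1 + 3.322 * log10(N))."""
    n = len(values)
    data_range = max(values) - min(values)
    return data_range / (1 + 3.322 * math.log10(n))

# Hypothetical marks of 20 students (illustrative data, not from the slides).
marks = [12, 25, 33, 38, 41, 44, 47, 52, 55, 58,
         61, 63, 66, 68, 72, 75, 79, 83, 88, 95]

width = sturges_class_interval(marks)
print(round(width, 2))
```

In practice the result is usually rounded to a convenient width such as 10 or 15 before forming the classes.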
Diagrammatic or graphical Representation
Types of Graphs
One Dimensional
Two Dimensional
Three Dimensional
One Dimensional or Bar Diagrams
Types of Bar Diagram
Simple
Multiple
Deviation
Sub-Divided
Percentage
Birth Rate per thousand for different countries
Country | Birth Rate
India | 33
Germany | 16
U.K. | 20
China | 40
New Zealand | 30
Sweden | 15
[Figure: bar diagram "Birth Rates", with Birth Rate on the vertical axis and Countries on the horizontal axis]
Sub-divided bar diagram
Year | Marine | Inland | Total
1991-92 | 5.34 | 2.18 | 7.52
1992-93 | 8.8 | 2.8 | 11.6
1993-94 | 10.86 | 6.7 | 17.56
1994-95 | 15.55 | 8.87 | 24.42
1995-96 | 16.98 | 11.03 | 28.01
1996-97 | 17.16 | 11.6 | 28.76
1997-98 | 12.47 | 8.42 | 20.89
[Figure: sub-divided bar diagram with Marine and Inland stacked for each year, 1991-92 to 1997-98]
Multiple Bars
Year | West | North | East | South | Centre
1996 | 78.4 | 88.9 | 83.7 | 89.9 | 86.5
1997 | 75.6 | 62.5 | 103.6 | 75.5 | 77.4
1998 | 121.2 | 116.5 | 107.6 | 123.9 | 90.3
[Figure: Zonewise Rainfall]
Rupee Comes From
[Figure: pie chart with slices for Excise, Customs, Internal Borrowing, Non-Tax Revenue, Deficit, Other Capital Receipts, Corporation Tax, Income Tax, External Assistance and Other Taxes]
Graphs of frequency distribution
Histogram
Frequency polygon
Smoothed frequency curve
Ogives or cumulative frequency curves
Histogram
[Figure: histogram of No of Students (vertical axis, 0 to 70) against Marks (horizontal axis, class intervals 0-10 to 90-100)]
[Figure: No of Students (vertical axis, 0 to 70) plotted over the class intervals 0-10 to 90-100]
Scatter Diagrams
x | 2 | 3 | 5 | 6 | 8 | 9
y | 6 | 5 | 7 | 8 | 12 | 11
[Figure: scatter diagram of y (0 to 14) against x (0 to 10)]
Requisites of a good average
Easy to understand
Simple to compute
Based on all items
Not be unduly affected by extreme observations
Rigidly defined
Capable of further algebraic treatment
Sampling stability
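Types of averages
Arithmetic Mean- Simple
$$ \bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_N}{N} $$
OR
$$ \bar{X} = \frac{\sum X}{N} $$
Arithmetic Mean- Discrete Series
$$ \bar{X} = \frac{\sum fX}{N} $$
X | f | fX
20 | 8 | 160
30 | 12 | 360
40 | 20 | 800
50 | 10 | 500
60 | 6 | 360
70 | 4 | 280
N = 60, $\sum fX = 2460$
$$ \bar{X} = \frac{2460}{60} = 41 $$
Arithmetic Mean- Continuous Series
$$ \bar{X} = \frac{\sum fm}{N} $$
Where m is midpoint of various classes
f is frequency of each class
N = total frequency
Class | m | f | fm
0-10 | 5 | 5 | 25
10-20 | 15 | 10 | 150
20-30 | 25 | 25 | 625
30-40 | 35 | 30 | 1050
40-50 | 45 | 20 | 900
50-60 | 55 | 10 | 550
N = 100, $\sum fm = 3300$
$$ \bar{X} = \frac{3300}{100} = 33 $$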
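The two worked examples for the arithmetic mean can be checked with a short sketch. The helper name `mean_from_frequencies` is hypothetical; the figures are the ones given in the slides, with class mid-points standing in for the classes of the continuous series.

```python
def mean_from_frequencies(values, freqs):
    """Arithmetic mean of a frequency distribution: sum(f * X) / sum(f)."""
    total_f = sum(freqs)
    return sum(f * x for x, f in zip(values, freqs)) / total_f

# Discrete series: values X with their frequencies f.
discrete_mean = mean_from_frequencies([20, 30, 40, 50, 60, 70],
                                      [8, 12, 20, 10, 6, 4])

# Continuous series: each class is represented by its mid-point m.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
midpoints = [(lower + upper) / 2 for lower, upper in classes]
continuous_mean = mean_from_frequencies(midpoints, [5, 10, 25, 30, 20, 10])

print(discrete_mean, continuous_mean)  # 41.0 33.0
```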
Combined Mean of two groups
$$ \bar{X}_{12} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2}{N_1 + N_2} $$
Weighted Mean
$$ \bar{X}_w = \frac{\sum WX}{\sum W} $$
For frequency Distribution
$$ \bar{X}_w = \frac{\sum W(fX)}{\sum W} $$
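A minimal sketch of the combined mean of two groups and of the simple weighted mean follows. The slides give no data for these formulas, so the group sizes, means, values and weights below are hypothetical, and the function names are illustrative only.

```python
def combined_mean(n1, mean1, n2, mean2):
    """Combined mean of two groups: (N1*X1 + N2*X2) / (N1 + N2)."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)

def weighted_mean(values, weights):
    """Weighted mean: sum(W * X) / sum(W)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Hypothetical groups: 40 items with mean 50 and 60 items with mean 60.
group_mean = combined_mean(40, 50, 60, 60)

# Hypothetical scores 70, 80, 90 carrying weights 1, 2, 3.
score_mean = weighted_mean([70, 80, 90], [1, 2, 3])

print(group_mean, round(score_mean, 2))
```

The combined mean is itself a weighted mean in which each group mean is weighted by its group size.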
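Median
If N is odd: value of the $\left(\frac{N+1}{2}\right)^{\text{th}}$ observation
If N is even: average value of the $\left(\frac{N}{2}\right)^{\text{th}}$ and $\left(\frac{N}{2}+1\right)^{\text{th}}$ observations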
Median – Discrete Series
Size of the $\left(\frac{N+1}{2}\right)^{\text{th}}$ item
Ex- From the following data find the value of the median
$$ \text{Median} = L + \frac{N/2 - c.f.}{f} \times i $$
L = Lower limit of median class
c.f. = cumulative frequency of the class preceding the median class
f = simple frequency of median class
i = class interval of median class
Calculate the median for the following frequency distribution
Marks | No. of students
45-50 | 10
40-45 | 15
35-40 | 26
30-35 | 30
25-30 | 42
20-25 | 31
15-20 | 24
10-15 | 15
5-10 | 7
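One way to work this exercise is sketched below. The classes are first arranged in ascending order, then the formula Median = L + ((N/2 - c.f.) / f) x i is applied. The function name `grouped_median` is hypothetical.

```python
def grouped_median(classes, freqs):
    """Median = L + ((N/2 - c.f.) / f) * i for classes in ascending order."""
    n = sum(freqs)
    cumulative = 0  # c.f. of the classes before the current one
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cumulative + f >= n / 2:
            # This is the median class: it contains the N/2-th item.
            return lower + (n / 2 - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("empty frequency distribution")

# Marks distribution from the exercise, rearranged in ascending order.
classes = [(5, 10), (10, 15), (15, 20), (20, 25), (25, 30),
           (30, 35), (35, 40), (40, 45), (45, 50)]
freqs = [7, 15, 24, 31, 42, 30, 26, 15, 10]

median = grouped_median(classes, freqs)
print(round(median, 2))  # 27.74
```

Here N = 200, so N/2 = 100; the cumulative frequency reaches 77 before the class 25-30, which is therefore the median class with f = 42 and i = 5.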
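Calculation of Mode: Frequency Distribution
$$ \text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i $$
L = Lower limit of the modal class
$\Delta_1$ = Difference between the frequency of the modal class and the frequency of the pre-modal class
$\Delta_2$ = Difference between the frequency of the modal class and the frequency of the post-modal class
i = class interval of the modal class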
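As an illustration, the mode formula can be applied to the marks distribution used in the median exercise. This is a sketch that assumes a single, clearly highest class frequency; the function name `grouped_mode` is hypothetical.

```python
def grouped_mode(classes, freqs):
    """Mode = L + d1 / (d1 + d2) * i, with classes in ascending order."""
    k = max(range(len(freqs)), key=lambda j: freqs[j])  # index of modal class
    f_prev = freqs[k - 1] if k > 0 else 0               # pre-modal frequency
    f_next = freqs[k + 1] if k < len(freqs) - 1 else 0  # post-modal frequency
    d1 = freqs[k] - f_prev
    d2 = freqs[k] - f_next
    lower, upper = classes[k]
    return lower + d1 / (d1 + d2) * (upper - lower)

# Marks distribution from the median exercise, in ascending order.
classes = [(5, 10), (10, 15), (15, 20), (20, 25), (25, 30),
           (30, 35), (35, 40), (40, 45), (45, 50)]
freqs = [7, 15, 24, 31, 42, 30, 26, 15, 10]

mode = grouped_mode(classes, freqs)
print(round(mode, 2))  # 27.39
```

The modal class is 25-30 (frequency 42), so the first difference is 42 - 31 = 11 and the second is 42 - 30 = 12.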
[Figure: distributions with different mean but with same dispersion, centred at x1 and x2]
[Figure: distributions with different mean and different dispersion, centred at x1 and x2]
Significance of Measuring variation
1. The Range
2. Mean deviation
3. The Standard Deviation
Range
$$ \text{Range} = L - S $$
where L is the largest item and S is the smallest item
Coefficient of Range
$$ \text{Coefficient of Range} = \frac{L - S}{L + S} $$
Series A 46 6 46 46 46 46 46 46
Series B 6 10 6 6 46 46 46 46
Series C 6 6 15 25 30 32 40 46
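A quick check of the three series shows why the range is a crude measure: it looks only at the two extreme items. The sketch below uses the data above; the function names are hypothetical.

```python
def value_range(values):
    """Range = L - S (largest item minus smallest item)."""
    return max(values) - min(values)

def coefficient_of_range(values):
    """Coefficient of Range = (L - S) / (L + S)."""
    return (max(values) - min(values)) / (max(values) + min(values))

series = {
    "A": [46, 6, 46, 46, 46, 46, 46, 46],
    "B": [6, 10, 6, 6, 46, 46, 46, 46],
    "C": [6, 6, 15, 25, 30, 32, 40, 46],
}

for name, values in series.items():
    print(name, value_range(values), round(coefficient_of_range(values), 3))
```

All three series give a range of 40 and the same coefficient, although the items are distributed very differently between 6 and 46.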
Despite serious limitations, Range is useful in the
following cases
1. Quality Control
2. Fluctuations in share prices
3. Weather forecast
4. Everyday Life
Mean Deviation
Mean Deviation – Individual Observations
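$$ \text{M.D.} = \frac{1}{N} \sum |X - A| = \frac{1}{N} \sum |D| \quad \text{or} \quad \frac{\sum |D|}{N} $$
Where |D| = |X - A|
$$ \text{Coefficient of M.D.} = \frac{\text{M.D.}}{\text{Median}} $$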
Ex- Calculate the M.D. and its coefficient for the two income
groups of 5 and 7
Group 1 | Deviation from Median 4400 | Group 2 | Deviation from Median 4400
4000 | 400 | 3000 | 1400
4200 | 200 | 4000 | 400
4400 | 0 | 4200 | 200
4600 | 200 | 4400 | 0
4800 | 400 | 4600 | 200
 | | 4800 | 400
 | | 5800 | 1400
Group 1: N = 5, $\sum |D| = 1200$
Group 2: N = 7, $\sum |D| = 4000$
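The exercise can be completed along the following lines. This sketch takes deviations from the median, as in the table; the function name `mean_deviation` is hypothetical.

```python
import statistics

def mean_deviation(values):
    """Return (M.D. about the median, coefficient of M.D.)."""
    med = statistics.median(values)
    md = sum(abs(x - med) for x in values) / len(values)
    return md, md / med

group1 = [4000, 4200, 4400, 4600, 4800]
group2 = [3000, 4000, 4200, 4400, 4600, 4800, 5800]

md1, coeff1 = mean_deviation(group1)
md2, coeff2 = mean_deviation(group2)
print(round(md1, 2), round(coeff1, 3))
print(round(md2, 2), round(coeff2, 3))
```

Both groups have the same median of 4400, but the second group has the larger mean deviation and is therefore the more variable of the two.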
Mean Deviation – Discrete series
$$ \text{M.D.} = \frac{\sum f|D|}{N} $$
Where | D | denotes deviation from median
ignoring signs
X | 10 | 11 | 12 | 13 | 14
F | 3 | 12 | 18 | 12 | 3
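For the discrete series above, a minimal sketch is given below. It locates the median as the size of the (N + 1)/2-th item by running down the cumulative frequencies, which is a simplification when that position falls exactly between two different values. The function names are hypothetical.

```python
def discrete_median(values, freqs):
    """Size of the (N + 1) / 2-th item; values must be in ascending order."""
    position = (sum(freqs) + 1) / 2
    cumulative = 0
    for x, f in zip(values, freqs):
        cumulative += f
        if cumulative >= position:
            return x
    raise ValueError("empty frequency distribution")

def discrete_mean_deviation(values, freqs):
    """M.D. = sum(f * |D|) / N, with |D| measured from the median."""
    med = discrete_median(values, freqs)
    return sum(f * abs(x - med) for x, f in zip(values, freqs)) / sum(freqs)

x = [10, 11, 12, 13, 14]
f = [3, 12, 18, 12, 3]

print(discrete_median(x, f), discrete_mean_deviation(x, f))  # 12 0.75
```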
Mean Deviation – Continuous series
$$ \text{M.D.} = \frac{\sum f|D|}{N} $$
Where |D| = |m - Median| and m is the mid-point of each class
Ex- Find the median and mean deviation of the following data
Size Frequency
0-10 7
10-20 12
20-30 18
30-40 25
40-50 16
50-60 14
60-70 8
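A possible working of this example is sketched below. The median comes from the grouped-data formula Median = L + ((N/2 - c.f.) / f) x i, and each class is then represented by its mid-point when taking deviations. The function name is hypothetical.

```python
def grouped_median(classes, freqs):
    """Median = L + ((N/2 - c.f.) / f) * i for classes in ascending order."""
    n = sum(freqs)
    cumulative = 0
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cumulative + f >= n / 2:
            return lower + (n / 2 - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("empty frequency distribution")

classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
freqs = [7, 12, 18, 25, 16, 14, 8]

median = grouped_median(classes, freqs)
n = sum(freqs)
# |D| is the distance of each class mid-point from the median.
md = sum(f * abs((lower + upper) / 2 - median)
         for (lower, upper), f in zip(classes, freqs)) / n

print(round(median, 2), round(md, 3))
```

With N = 100 the median class is 30-40 (c.f. = 37, f = 25), giving a median of 35.2.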
The Standard Deviation
Introduced by Karl Pearson
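$$ \sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} $$
where d = X - A is the deviation from an assumed mean A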
Ex- 240, 260, 290, 245, 255, 288, 272, 263, 277, 251
$\sum X = 2641$, $\sum d^2 = 2689$
$$ \sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} = 16.398 $$
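The figures in this example can be reproduced with the short sketch below. The assumed mean A = 264 is not stated in the slides; it is an assumption chosen here because it reproduces the quoted sum of squared deviations, 2689.

```python
import math

data = [240, 260, 290, 245, 255, 288, 272, 263, 277, 251]
A = 264  # assumed mean (an assumption, see the note above)

n = len(data)
d = [x - A for x in data]          # deviations from the assumed mean
sum_d = sum(d)
sum_d2 = sum(v * v for v in d)

# Shortcut formula: sigma = sqrt(sum(d^2)/N - (sum(d)/N)^2)
sigma = math.sqrt(sum_d2 / n - (sum_d / n) ** 2)

print(sum(data), sum_d, sum_d2, round(sigma, 3))  # 2641 1 2689 16.398
```

The result does not depend on the choice of A; a different assumed mean changes the two sums but leaves the standard deviation unchanged.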
Calculation of SD – Discrete series
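$$ \sigma = \sqrt{\frac{\sum f x^2}{N}} $$
where $x^2 = (X - \bar{X})^2$
$$ \sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} $$
Where d = X - A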
Calculation of SD – Continuous series
$$ \sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} \times i $$
Where $d = \frac{m - A}{i}$
i = class interval
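Combined SD
$$ \sigma_{12} = \sqrt{\frac{N_1 \sigma_1^2 + N_2 \sigma_2^2 + N_1 D_1^2 + N_2 D_2^2}{N_1 + N_2}} $$
where $D_1 = \bar{X}_1 - \bar{X}_{12}$ and $D_2 = \bar{X}_2 - \bar{X}_{12}$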
Coefficient of variation
$$ CV = \frac{\sigma}{\bar{X}} \times 100 $$
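As an illustration, the coefficient of variation can be computed for the ten observations used in the standard deviation example. The sketch uses the population standard deviation (divisor N), in line with the formulas in these slides.

```python
import statistics

data = [240, 260, 290, 245, 255, 288, 272, 263, 277, 251]

mean = statistics.mean(data)
sigma = statistics.pstdev(data)  # population SD, divisor N
cv = sigma / mean * 100

print(round(mean, 1), round(sigma, 3), round(cv, 2))
```

Because it is a pure ratio expressed as a percentage, the CV allows the variability of series measured in different units or at different levels to be compared.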
Normal distribution: inflection points and standard deviations
Skewness
“Skewness” refers to deviations from symmetry
with respect to a location measure. The quantity
commonly used as a measure of asymmetry is
often referred to as b1.
Karl Pearson Coefficient of Skewness
$$ Sk_p = \frac{\text{Mean} - \text{Mode}}{\sigma} $$
In a moderately skewed distribution the averages have the
following relationship
Mode = 3 Median – 2 Mean
$$ Sk_p = \frac{3(\bar{X} - \text{Median})}{\sigma} $$
Bowley’s Coefficient of Skewness
$$ Sk_b = \frac{(Q_3 - \text{Median}) - (\text{Median} - Q_1)}{(Q_3 - \text{Median}) + (\text{Median} - Q_1)} $$
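Both coefficients can be illustrated on the ten observations from the standard deviation example. This is a sketch under two assumptions: the median form of Karl Pearson's coefficient is used, and the quartiles are located at the (N + 1)/4-th and 3(N + 1)/4-th items, which is what the default "exclusive" method of Python's `statistics.quantiles` does. Other quartile conventions give slightly different values.

```python
import statistics

data = [240, 260, 290, 245, 255, 288, 272, 263, 277, 251]

mean = statistics.mean(data)
median = statistics.median(data)
sigma = statistics.pstdev(data)  # population SD, divisor N

# Karl Pearson's coefficient, median form: 3 * (Mean - Median) / sigma
sk_p = 3 * (mean - median) / sigma

# Bowley's coefficient from the quartiles (default "exclusive" method).
q1, q2, q3 = statistics.quantiles(data, n=4)
sk_b = ((q3 - q2) - (q2 - q1)) / ((q3 - q2) + (q2 - q1))

print(round(sk_p, 3), round(sk_b, 3))
```

Both values are positive here, which indicates a longer tail on the side of the larger observations.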