
Pabna University of Science and Technology

Department of Electrical and Electronic Engineering

EEE 3104
Numerical Methods and Statistics Laboratory
Experiment No 5
Statistical Measurements Using Excel

Introduction:
Statistics is concerned with the scientific methods for collecting, organizing, summarizing,
presenting, and analyzing data as well as with drawing valid conclusions and making reasonable
decisions on the basis of such analysis.
Measurement of Central Tendency:
An average is a value that is typical, or representative, of a set of data. Since such typical values
tend to lie centrally within a set of data arranged according to magnitude, averages are also called
measures of central tendency. The most common are the arithmetic mean, the median, the mode,
the geometric mean, and the harmonic mean. Each has advantages and disadvantages, depending
on the data and the intended purpose.
The Arithmetic mean: the arithmetic mean, or briefly the mean, of a set of N numbers $X_1, X_2,
X_3, \ldots, X_N$ is denoted by $\bar{X}$ and defined as

$$\bar{X} = \frac{X_1 + X_2 + X_3 + \cdots + X_N}{N} = \frac{\sum_{j=1}^{N} X_j}{N} = \frac{\sum X}{N}$$
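For grouped data, where $f_j$ is the frequency of the class with class mark $X_j$, the mean is

$$\bar{X} = \frac{f_1 X_1 + f_2 X_2 + f_3 X_3 + \cdots + f_K X_K}{f_1 + f_2 + f_3 + \cdots + f_K} = \frac{\sum_{j=1}^{K} f_j X_j}{\sum_{j=1}^{K} f_j} = \frac{\sum fX}{\sum f}$$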

The Median: the median of a set of numbers arranged in order of magnitude is either the middle
value or the arithmetic mean of the two middle values. Thus, the median is a positional average.
For grouped data, the median is given by

$$\text{Median} = L_1 + \left( \frac{\frac{N}{2} - (\sum f)_1}{f_{\text{median}}} \right) c$$

Where $L_1$ = lower class boundary of the median class
N = number of items in the data (total frequency)
$(\sum f)_1$ = sum of frequencies of all classes lower than the median class
$f_{\text{median}}$ = frequency of the median class
c = size of the median class interval
The mode: the mode of a set of numbers is that value which occurs with the greatest frequency;
that is, it is the most common value. For grouped data, the mode is calculated with the formula

$$\text{Mode} = L_1 + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) c$$

Where $L_1$ = lower class boundary of the modal class
$\Delta_1$ = excess of modal frequency over frequency of next-lower class
$\Delta_2$ = excess of modal frequency over frequency of next-higher class
c = size of the modal class interval

Example 1: The monthly power consumptions of 20 houses in an area were 35, 105, 49, 225, 50,
30, 125, 65, 40, 145, 55, 125, 52, 76, 155, 48, 325, 47, 125, and 60 MW. Find the mean, mode,
and median.
Method 1: Use the ‘SUM’ function and apply the mean formula. Then sort the data (DATA tab >>
Sort) and find the mode and median from the sorted list.
Method 2: Use the built-in Excel functions ‘AVERAGE’, ‘MODE.SNGL’, and ‘MEDIAN’.
Ans: mean = 96.85, mode = 125, median = 62.5
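
The Excel results of Example 1 can be cross-checked outside the spreadsheet. The following is a minimal sketch in Python, assuming only the standard library; the variable names are illustrative and are not part of the experiment sheet.

```python
import statistics

# Monthly consumption values from Example 1
data = [35, 105, 49, 225, 50, 30, 125, 65, 40, 145,
        55, 125, 52, 76, 155, 48, 325, 47, 125, 60]

mean = sum(data) / len(data)        # sum divided by N, like Excel AVERAGE
median = statistics.median(data)    # mean of the two middle values for even N
mode = statistics.mode(data)        # most frequent value, like MODE.SNGL

print(f"mean = {mean:.2f}, mode = {mode}, median = {median}")
```

Because N = 20 is even, the median is the arithmetic mean of the 10th and 11th sorted values (60 and 65).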

Example 2: Find the mean, mode, and median of the grouped data below.

Weight (lb)    Frequency
118-126        3
127-135        5
136-144        9
145-153        12
154-162        5
163-171        4
172-180        2

Ans: mean = 146.975, mode = 147.2, median = 146.75 (using the class boundary L1 = 144.5 and class size c = 9)
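
The grouped-data formulas can be applied step by step to the table of Example 2. The sketch below is one possible way to do it in Python, under the assumption that weights are recorded to the nearest pound, so that class boundaries lie 0.5 lb outside the printed class limits (for example 144.5 to 153.5, giving c = 9). All variable names are illustrative.

```python
# Class limits and frequencies from the Example 2 table
classes = [(118, 126), (127, 135), (136, 144), (145, 153),
           (154, 162), (163, 171), (172, 180)]
freqs = [3, 5, 9, 12, 5, 4, 2]

# Assumption: boundaries are 0.5 lb outside the stated class limits
half_gap = 0.5
marks = [(lo + hi) / 2 for lo, hi in classes]   # class marks X_j
c = (classes[0][1] + half_gap) - (classes[0][0] - half_gap)  # class size
N = sum(freqs)

# Grouped mean: sum(f * X) / sum(f)
mean = sum(f * x for f, x in zip(freqs, marks)) / N

# Median class: first class whose cumulative frequency reaches N/2
cum = 0  # sum of frequencies below the median class
for k, f in enumerate(freqs):
    if cum + f >= N / 2:
        break
    cum += f
median = (classes[k][0] - half_gap) + (N / 2 - cum) / freqs[k] * c

# Modal class: class with the largest frequency
m = freqs.index(max(freqs))
d1 = freqs[m] - (freqs[m - 1] if m > 0 else 0)               # Delta_1
d2 = freqs[m] - (freqs[m + 1] if m < len(freqs) - 1 else 0)  # Delta_2
mode = (classes[m][0] - half_gap) + d1 / (d1 + d2) * c

print(f"mean = {mean:.3f}, median = {median:.2f}, mode = {mode:.1f}")
```

The median and mode depend on the choice of L1 and c. With class boundaries (L1 = 144.5, c = 9) the formulas give a median of 146.75 and a mode of 147.2; substituting the class limits instead (L1 = 145, c = 8) gives 147 and 147.4.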


Measurement of Dispersion:
The degree to which numerical data tend to spread about an average value is called the dispersion,
or the variation of the data.
The Mean Deviation: the mean deviation, or the average deviation, of a set of N numbers $X_1, X_2,
X_3, \ldots, X_N$ is abbreviated MD and defined as

$$\text{Mean deviation (MD)} = \frac{\sum_{j=1}^{N} |X_j - \bar{X}|}{N} = \frac{\sum |X - \bar{X}|}{N} = \overline{|X - \bar{X}|}$$
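Mean deviation for grouped data, where the sum runs over the K classes and $N = \sum f$ is the total frequency:

$$MD = \frac{\sum_{j=1}^{K} f_j |X_j - \bar{X}|}{N} = \frac{\sum f |X - \bar{X}|}{N}$$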
The Standard Deviation: the standard deviation of a set of N numbers $X_1, X_2, X_3, \ldots, X_N$ is
denoted s and defined as

$$s = \sqrt{\frac{\sum_{j=1}^{N} (X_j - \bar{X})^2}{N}} = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$$

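Standard deviation for grouped data, where the sum runs over the K classes and $N = \sum f$:

$$s = \sqrt{\frac{\sum_{j=1}^{K} f_j (X_j - \bar{X})^2}{N}} = \sqrt{\frac{\sum f (X - \bar{X})^2}{N}}$$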

The Variance: the variance of a set of data is defined as the square of the standard deviation and is
thus given by

$$s^2 = \frac{\sum_{j=1}^{N} (X_j - \bar{X})^2}{N} = \frac{\sum (X - \bar{X})^2}{N}$$
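
The three dispersion measures defined above differ only in how each deviation from the mean is treated before averaging. The following minimal Python sketch applies them to the 20 values of Example 1; the variable names are illustrative, and every formula divides by N as in the definitions above.

```python
import math

# Monthly consumption values from Example 1
data = [35, 105, 49, 225, 50, 30, 125, 65, 40, 145,
        55, 125, 52, 76, 155, 48, 325, 47, 125, 60]

N = len(data)
mean = sum(data) / N

# Mean deviation: average absolute deviation from the mean
md = sum(abs(x - mean) for x in data) / N

# Variance: average squared deviation from the mean (divide by N)
variance = sum((x - mean) ** 2 for x in data) / N

# Standard deviation: square root of the variance
s = math.sqrt(variance)

print(f"MD = {md:.2f}, s = {s:.3f}, s^2 = {variance:.4f}")
```

In Excel, ‘AVEDEV’, ‘STDEV.P’, and ‘VAR.P’ use the same divide-by-N definitions, whereas ‘STDEV.S’ and ‘VAR.S’ divide by N - 1 and therefore return slightly larger values.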

The Moments
If $X_1, X_2, X_3, \ldots, X_N$ are the N values assumed by the variable X, the quantity

$$\overline{X^r} = \frac{X_1^r + X_2^r + \cdots + X_N^r}{N} = \frac{\sum_{j=1}^{N} X_j^r}{N} = \frac{\sum X^r}{N}$$

is called the rth moment. The first moment, with r = 1, is the arithmetic mean $\bar{X}$.
The rth moment about the mean $\bar{X}$ is defined as

$$m_r = \frac{\sum_{j=1}^{N} (X_j - \bar{X})^r}{N} = \frac{\sum (X - \bar{X})^r}{N}$$
Skewness
Skewness is the degree of asymmetry, or departure from symmetry, of a distribution. If the
frequency curve (smooth frequency polygon) of a distribution has a longer tail to the right of the
central maximum than to the left, the distribution is said to be skewed to the right, or to have
positive skewness. If the reverse is true, it is said to be skewed to the left, or to have negative
skewness.
The moment coefficient of skewness is denoted by $a_3$ and defined as

$$a_3 = \frac{m_3}{s^3} = \frac{m_3}{(\sqrt{m_2})^3} = \frac{m_3}{\sqrt{m_2^3}}$$

Kurtosis
Kurtosis is the degree of peakedness of a distribution, usually taken relative to a normal
distribution. A distribution having a relatively high peak is called leptokurtic, while one which is
flat-topped is called platykurtic. A normal distribution, which is neither very peaked nor very
flat-topped, is called mesokurtic.
The moment coefficient of kurtosis is denoted by $a_4$ and defined as

$$a_4 = \frac{m_4}{s^4} = \frac{m_4}{m_2^2}$$
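
Both coefficients are ratios of moments about the mean, so a single helper function covers them. The sketch below is a plain-Python illustration using the Example 1 data; the function names `central_moment`, `moment_skewness`, and `moment_kurtosis` are hypothetical helpers introduced here, not Excel or library functions.

```python
def central_moment(values, r):
    """r-th moment about the mean, m_r, dividing by N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n

def moment_skewness(values):
    """a3 = m3 / (sqrt(m2))^3"""
    m2 = central_moment(values, 2)
    return central_moment(values, 3) / m2 ** 1.5

def moment_kurtosis(values):
    """a4 = m4 / m2^2"""
    m2 = central_moment(values, 2)
    return central_moment(values, 4) / m2 ** 2

# Monthly consumption values from Example 1
data = [35, 105, 49, 225, 50, 30, 125, 65, 40, 145,
        55, 125, 52, 76, 155, 48, 325, 47, 125, 60]

a3 = moment_skewness(data)
a4 = moment_kurtosis(data)
print(f"a3 = {a3:.4f}, a4 = {a4:.4f}")
```

When comparing with Excel, note that ‘SKEW’ and ‘KURT’ apply sample-size corrections (and ‘KURT’ reports kurtosis relative to the normal distribution), so they generally do not reproduce $a_3$ and $a_4$ exactly; ‘SKEW.P’ corresponds to the population formula for $a_3$.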

Example 3: Find the mean deviation, standard deviation, variance, skewness, and kurtosis for the
data given in Example 1.
Example 4: Find the mean deviation, standard deviation, variance, skewness, and kurtosis for the
data given in Example 2.
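
For grouped data such as Example 2, each class mark is weighted by its class frequency. The following Python sketch shows one way to set this up, assuming class marks taken as the midpoints of the class limits (122, 131, ..., 176); `grouped_moment` is a hypothetical helper name.

```python
import math

# Class marks (midpoints of the class limits) and frequencies from Example 2
marks = [122, 131, 140, 149, 158, 167, 176]
freqs = [3, 5, 9, 12, 5, 4, 2]

N = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, marks)) / N

def grouped_moment(r):
    """r-th moment about the mean for grouped data: sum(f * (X - mean)^r) / N."""
    return sum(f * (x - mean) ** r for f, x in zip(freqs, marks)) / N

# Mean deviation: frequency-weighted average absolute deviation
md = sum(f * abs(x - mean) for f, x in zip(freqs, marks)) / N

variance = grouped_moment(2)
s = math.sqrt(variance)
a3 = grouped_moment(3) / s ** 3   # moment coefficient of skewness
a4 = grouped_moment(4) / s ** 4   # moment coefficient of kurtosis

print(f"MD = {md:.3f}, s = {s:.3f}, s^2 = {variance:.3f}, a3 = {a3:.3f}, a4 = {a4:.3f}")
```

In Excel the same weighting can be done with ‘SUMPRODUCT’ over the frequency column and a column of deviations raised to the required power.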

Linear Correlation Coefficient:


Correlation means the degree of relationship between variables, which seeks to determine how
well a linear or other equation describes or explains the relationship between variables.
The linear correlation coefficient is calculated with the product-moment formula. If X and Y are
two variables, the correlation coefficient is denoted by r and defined as

$$r = \frac{\sum xy}{\sqrt{(\sum x^2)(\sum y^2)}}$$

where $x = X - \bar{X}$ and $y = Y - \bar{Y}$.
Example 5: Draw the scatter plot and find the correlation coefficient for the data below.

TV hours/week    GPA
20               2.35
5                3.8
8                3.5
10               2.75
13               3.25
7                3.4
13               2.9
5                3.5
25               2.25
14               2.75

Hint: use the formula above, and also the built-in ‘CORREL’ function.
Ans: -0.90972
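
The product-moment formula can also be evaluated directly as a check on ‘CORREL’. The sketch below is a minimal Python version for the Example 5 data; `pearson_r` is a hypothetical helper name, and only the standard library is assumed.

```python
import math

# Data from Example 5
tv_hours = [20, 5, 8, 10, 13, 7, 13, 5, 25, 14]
gpa = [2.35, 3.8, 3.5, 2.75, 3.25, 3.4, 2.9, 3.5, 2.25, 2.75]

def pearson_r(xs, ys):
    """Product-moment correlation: sum(xy) / sqrt(sum(x^2) * sum(y^2)),
    where x and y are deviations from the respective means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_x2 = sum(a * a for a in dx)
    sum_y2 = sum(b * b for b in dy)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)

r = pearson_r(tv_hours, gpa)
print(f"r = {r:.5f}")
```

A value of r close to -1 indicates a strong negative linear relationship: in this data set, more TV hours per week go with a lower GPA.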

Prepared by
Tonmoy Ghosh
Lecturer, EEE, PUST
