3.1.1 Mean ($\bar{x}$)
- the average value of the data
- calculated by dividing the sum of all data values in a distribution by the number of
data values
- a population mean is an example of a parameter
- a population mean is denoted by $\mu$
- a sample mean is an example of a statistic
- a sample mean is denoted by $\bar{x}$
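Ungrouped data
Population mean: $\mu = \frac{\sum X}{N} = \frac{X_1 + X_2 + X_3 + \dots + X_N}{N}$
Sample mean: $\bar{x} = \frac{\sum x}{n} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$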
Grouped data
Each class interval is represented by the midpoint of the interval, $x_i$.
$\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$
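Example 2: The following data shows the prices of vehicles sold last month at KAR Company.

Selling price (RM thousand)   No. of vehicles, $f_i$   Midpoint, $x_i$   $f_i x_i$
12 and less than 15           8
15 and less than 18           23
18 and less than 21           17
21 and less than 24           18
24 and less than 27           8
27 and less than 30           4
30 and less than 33           2

Calculate the mean price of the vehicles sold last month and explain its meaning.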
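The grouped-data formula can be checked with a short Python sketch. It assumes the class intervals given for this data set (12 and less than 15, up to 30 and less than 33) and uses each class midpoint as $x_i$; the variable names are illustrative only.

```python
# Grouped mean for the KAR Company data: x_bar = sum(f_i * x_i) / sum(f_i)
# Class limits are in RM thousand; frequencies are the numbers of vehicles.
classes = [(12, 15), (15, 18), (18, 21), (21, 24), (24, 27), (27, 30), (30, 33)]
freqs = [8, 23, 17, 18, 8, 4, 2]

# Midpoint of each class interval represents the whole class
midpoints = [(lo + hi) / 2 for lo, hi in classes]

sum_fx = sum(f * x for f, x in zip(freqs, midpoints))  # sum of f_i * x_i
n = sum(freqs)                                         # sum of f_i
grouped_mean = sum_fx / n

print(n, sum_fx, grouped_mean)
```

Under these assumptions the mean comes to about RM 20.06 thousand, which can be read as the average selling price per vehicle last month.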
Example 3: Calculate the mean number of vehicles owned by residents and explain the
meaning of the mean value obtained.

Vehicles owned, $x$   No. of households, $f$   $fx$
0                     5
1                     25
2                     10
3                     4
4                     2
5                     1
3.1.2 Median ($\tilde{x}$)
- the middle value of an ordered array of data
- 50% of the data are greater than or equal to this value and the other 50% of the data are
smaller than or equal to this value
- Advantages:
  - unaffected by the magnitude of extreme values
  - often the best measure of location when the data contain extreme values
Ungrouped data
1) Arrange the data in ascending order.
2) If the number of observations is odd, the median is the number in the middle
of the ordered list. If the number of observations is even, the median is the
average of the 2 middle values.
3) After arranging the data, find the position of the median.
Position of median = $\frac{n + 1}{2}$
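The steps above can be sketched in Python. The function name `median_ungrouped` is hypothetical, and the data used are the Johor Bahru accident counts from the exercise later in these notes.

```python
def median_ungrouped(data):
    """Median of ungrouped data using the position (n + 1) / 2."""
    ordered = sorted(data)        # step 1: arrange in ascending order
    n = len(ordered)
    position = (n + 1) / 2        # 1-based position of the median
    if n % 2 == 1:
        # odd n: the position is a whole number, take that value
        return ordered[int(position) - 1]
    # even n: the position falls between two values, average them
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2

accidents = [26, 24, 29, 32, 28, 35, 24, 23, 30, 37, 27, 31]
print(median_ungrouped(accidents))
```

With 12 observations the position is 6.5, so the median is the average of the 6th and 7th ordered values.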
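Grouped data
1) To find the median, create a column for the cumulative frequency (CF).
2) Determine the position of the median.
Position of median = $\frac{n}{2}$
3) The median is calculated as follows:
Median, $\tilde{x} = L_m + \left( \frac{\frac{n}{2} - \sum f_{m-1}}{f_m} \right) C$
where
$L_m$ = lower boundary of the median class
$\sum f_{m-1}$ = cumulative frequency of the class before the median class
$f_m$ = frequency of the median class
$C$ = size (width) of the median class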
Example 2: The following data shows the prices of vehicles sold last month at KAR Company.

Selling price (RM thousand)   No. of vehicles, $f$   Class boundary   Cumulative frequency
12 and less than 15           8
15 and less than 18           23
18 and less than 21           17
21 and less than 24           18
24 and less than 27           8
27 and less than 30           4
30 and less than 33           2

(i) Calculate the median and explain the meaning of the value obtained.
(ii) Draw an ogive and find the median from the ogive.
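A minimal Python sketch of the grouped-median calculation for part (i) follows. It assumes that the class boundaries coincide with the stated class limits (the classes are of the form "a and less than b") and that all classes have width 3; the function name `grouped_median` is hypothetical.

```python
def grouped_median(lower_bounds, freqs, width):
    """Median = L_m + ((n/2 - CF before median class) / f_m) * C."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("frequency table is empty")
    position = n / 2
    cumulative = 0  # cumulative frequency before the current class
    for lower, f in zip(lower_bounds, freqs):
        if cumulative + f >= position:
            # this is the median class
            return lower + (position - cumulative) / f * width
        cumulative += f
    raise ValueError("median class not found")

lower_bounds = [12, 15, 18, 21, 24, 27, 30]   # lower class boundaries (RM thousand)
freqs = [8, 23, 17, 18, 8, 4, 2]
median_price = grouped_median(lower_bounds, freqs, 3)
print(round(median_price, 2))
```

Here the position of the median is 40, which falls in the class 18 and less than 21, so the result lies between RM 18 thousand and RM 21 thousand. About half of the vehicles sold for this value or less.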
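3.1.3 Mode ($\hat{x}$)
- Disadvantage: not unique (a set of data may have more than one mode)
Ungrouped data
- the value that occurs most frequently in the data set
Grouped data
Mode, $\hat{x} = L + \left( \frac{f_0 - f_1}{(f_0 - f_1) + (f_0 - f_2)} \right) C$
where
$L$ = lower boundary of the modal class
$f_0$ = frequency of the modal class
$f_1$ = frequency of the class before the modal class
$f_2$ = frequency of the class after the modal class
$C$ = size (width) of the modal class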
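As an illustration, the grouped-mode formula can be applied to the KAR Company price data from the earlier example. This is a sketch under assumptions: the function name `grouped_mode` is hypothetical, a missing neighbouring class is treated as having frequency 0, and if several classes tie for the highest frequency only the first is used.

```python
def grouped_mode(lower_bounds, freqs, width):
    """Mode = L + ((f0 - f1) / ((f0 - f1) + (f0 - f2))) * C."""
    # index of the modal class (first class with the highest frequency)
    m = max(range(len(freqs)), key=lambda i: freqs[i])
    f0 = freqs[m]
    f1 = freqs[m - 1] if m > 0 else 0               # class before (assumed 0 if none)
    f2 = freqs[m + 1] if m < len(freqs) - 1 else 0  # class after (assumed 0 if none)
    return lower_bounds[m] + (f0 - f1) / ((f0 - f1) + (f0 - f2)) * width

lower_bounds = [12, 15, 18, 21, 24, 27, 30]   # lower class boundaries (RM thousand)
freqs = [8, 23, 17, 18, 8, 4, 2]
mode_price = grouped_mode(lower_bounds, freqs, 3)
print(round(mode_price, 2))
```

The modal class is 15 and less than 18 (frequency 23), so the mode falls inside that interval.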
Class boundary   Frequency
                 2
                 7
                 9
                 14
                 6
                 2

Mean                                   Median                                 Mode
Involves all data values               Does not involve all data values       Does not involve all data values
Influenced by extreme values           Not influenced by extreme values       Might be influenced by extreme values
Only one mean for a data set           Only one median for a data set         Can be more than one mode for a data set
Applicable to quantitative data only   Applicable to quantitative data only   Applicable to quantitative and qualitative data
3.1.4 EXERCISE
The data below shows the amounts of the electricity bills charged to houses at Taman Rimbun.

Amount (RM)   Frequency
50 - 69       4
70 - 89       16
90 - 109      20
110 - 139     25
140 - 149     10
150 - 179     5

Calculate the mean, median and mode of the bill amounts. Explain the meaning of each
value obtained.
3.2 MEASURES OF DISPERSION
Measures of dispersion: variance, standard deviation and coefficient of variation.
The more spread out or dispersed the data, the larger the measures of dispersion.
If the data are more concentrated or homogeneous, the measures of dispersion are
smaller.
If all the observations are the same, the measures of dispersion will be zero.
Thus, these measures of dispersion are never negative.
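3.2.1 Variance and Standard Deviation
Ungrouped data
Population variance: $\sigma^2 = \frac{1}{N} \left[ \sum X^2 - \frac{(\sum X)^2}{N} \right]$
Population standard deviation: $\sigma = \sqrt{\sigma^2}$
Sample variance: $s^2 = \frac{1}{n - 1} \left[ \sum x^2 - \frac{(\sum x)^2}{n} \right]$
Sample standard deviation: $s = \sqrt{s^2}$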
EXAMPLE
1) The following are the 2005 earnings (in thousands of dollars) before taxes for all six
employees of a small company.
48.50 38.40 65.50 22.60 79.80 54.60

$X$     $X^2$
48.50
38.40
65.50
22.60
79.80
54.60

Since the data include the earnings of all employees of this company, use the population formulas
to compute the variance and standard deviation.
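A short Python sketch of this computation, using the population shortcut formula $\sigma^2 = \frac{1}{N} \left[ \sum X^2 - \frac{(\sum X)^2}{N} \right]$ on the six earnings above. The variable names are illustrative only.

```python
import math

# 2005 earnings (thousands of dollars) of all six employees: a population
earnings = [48.50, 38.40, 65.50, 22.60, 79.80, 54.60]

N = len(earnings)
sum_x = sum(earnings)                   # sum of X
sum_x2 = sum(x * x for x in earnings)   # sum of X^2

# Population variance: divide by N, not N - 1
pop_variance = (sum_x2 - sum_x ** 2 / N) / N
pop_sd = math.sqrt(pop_variance)

print(round(pop_variance, 4), round(pop_sd, 3))
```

The standard deviation comes out at roughly 18.36 thousand dollars, in the same units as the data.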
2) The following table, based on Forbes magazine's list of the wealthiest people in the
world, gives the total wealth (in billions of dollars) of five persons (USA TODAY, March
11, 2005).

Billionaire            Total wealth (billions of dollars)
Bill Gates             46.5
Helen Walton           18.0
Michael Dell           16.0
Keith Rupert Murdoch   7.8
George Soros           7.2

$x$     $x^2$
46.5
18.0
16.0
7.8
7.2
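Assuming the five persons are treated as a sample of the wealthiest people, the sample formula $s^2 = \frac{1}{n - 1} \left[ \sum x^2 - \frac{(\sum x)^2}{n} \right]$ applies. The sketch below computes it by hand and cross-checks against `statistics.variance` from the Python standard library, which also divides by $n - 1$.

```python
import math
import statistics

# Total wealth (billions of dollars) of the five persons, treated as a sample
wealth = [46.5, 18.0, 16.0, 7.8, 7.2]

n = len(wealth)
sum_x = sum(wealth)                   # sum of x
sum_x2 = sum(x * x for x in wealth)   # sum of x^2

# Sample variance: divide by n - 1
sample_variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
sample_sd = math.sqrt(sample_variance)

# Cross-check with the standard library (sample variance)
library_variance = statistics.variance(wealth)

print(round(sample_variance, 2), round(sample_sd, 2))
```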
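Grouped data
Population variance: $\sigma^2 = \frac{1}{N} \left[ \sum fX^2 - \frac{(\sum fX)^2}{N} \right]$
Sample variance: $s^2 = \frac{1}{n - 1} \left[ \sum fx^2 - \frac{(\sum fx)^2}{n} \right]$
Sample standard deviation: $s = \sqrt{s^2}$
Here $X$ (or $x$) is the midpoint of each class and $f$ is the class frequency.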
EXAMPLE
1) The following table gives the frequency distribution of the daily commuting times (in minutes) from
home to work for all 25 employees of a company.

Daily commuting time (minutes)   Number of employees
0 to less than 10                4
10 to less than 20               9
20 to less than 30               6
30 to less than 40               4
40 to less than 50               2

Daily commuting time (minutes)   Number of employees, $f$   $X$ (midpoint)   $fX$   $fX^2$
0 to less than 10                4
10 to less than 20               9
20 to less than 30               6
30 to less than 40               4
40 to less than 50               2

$\sum fX =$
$\sum fX^2 =$
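The grouped-data variance can be sketched in Python as below. The function name `grouped_variance` and its `sample` flag are hypothetical; the midpoints 5, 15, 25, 35 and 45 are taken as the centres of the commuting-time classes. Because the data cover all 25 employees, the population form (divide by $N$) is used here.

```python
import math

def grouped_variance(midpoints, freqs, sample=False):
    """Shortcut formula: (sum(f x^2) - (sum(f x))^2 / n) / divisor."""
    n = sum(freqs)
    sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
    sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))
    divisor = n - 1 if sample else n   # n - 1 for a sample, n for a population
    return (sum_fx2 - sum_fx ** 2 / n) / divisor

commute_mid = [5, 15, 25, 35, 45]   # class midpoints (minutes)
commute_f = [4, 9, 6, 4, 2]         # number of employees per class

commute_var = grouped_variance(commute_mid, commute_f)
commute_sd = math.sqrt(commute_var)
print(round(commute_var, 2), round(commute_sd, 2))
```

For a frequency table that represents a sample, the same function can be called with `sample=True` so that the divisor becomes $n - 1$.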
2) The following table gives the frequency distribution of the number of orders received each day
during the past 50 days at the office of a mail-order company.

Number of orders   $f$
10 - 12            4
13 - 15            12
16 - 18            20
19 - 21            14

Number of orders   $f$   $x$ (midpoint)   $fx$   $fx^2$
10 - 12            4
13 - 15            12
16 - 18            20
19 - 21            14

$\sum fx =$
$\sum fx^2 =$
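3.2.2 Coefficient of Variation
CV = $\frac{s}{\bar{x}} \times 100$
A: $\bar{x} = 60$, $s = 5$
B: $\bar{x} = 70$, $s = 6$
A: CV = $\frac{5}{60} \times 100$ = 8.33%
B: CV = $\frac{6}{70} \times 100$ = 8.57%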
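The two CV values can be reproduced with a few lines of Python. The function name `coefficient_of_variation` is hypothetical; the inputs are the standard deviations and means used in the calculations above (5 and 60, 6 and 70).

```python
def coefficient_of_variation(sd, mean):
    """CV = (standard deviation / mean) * 100, expressed as a percentage."""
    return sd / mean * 100

cv_a = coefficient_of_variation(5, 60)
cv_b = coefficient_of_variation(6, 70)

print(round(cv_a, 2), round(cv_b, 2))
```

A smaller CV indicates less variation relative to the mean, so the data set with the smaller CV is the more consistent one.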
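3.2.3 Skewness
Measure of skewness
It is based on the difference between the mean and the mode of the distribution.
If (mean - mode) = +ve (positive)
The distribution is skewed to the right or positively skewed.
If (mean - mode) = 0
The distribution is symmetrical.
If (mean - mode) = -ve (negative)
The distribution is skewed to the left or negatively skewed.
Pearson Coefficient of Skewness
To measure the degree of skewness (to determine the shape of the distribution)
PCS = $\frac{\text{mean} - \text{mode}}{\text{standard deviation}}$
OR
PCS = $\frac{3(\text{mean} - \text{median})}{\text{standard deviation}}$
If PCS = 0, the distribution is symmetrical.
If PCS = +ve value, the distribution is positively skewed (skewed to the right).
If PCS = -ve value, the distribution is negatively skewed (skewed to the left).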
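Both forms of the coefficient can be sketched in Python. The function names are hypothetical, the sample standard deviation from the `statistics` module is assumed for the denominator, and the data are the Johor Bahru accident counts from the exercise that follows.

```python
import statistics

def pcs_from_mode(mean, mode, sd):
    """PCS = (mean - mode) / standard deviation."""
    return (mean - mode) / sd

def pcs_from_median(mean, median, sd):
    """PCS = 3 * (mean - median) / standard deviation."""
    return 3 * (mean - median) / sd

def shape(pcs):
    """Interpret the sign of the coefficient."""
    if pcs > 0:
        return "positively skewed"
    if pcs < 0:
        return "negatively skewed"
    return "symmetrical"

accidents = [26, 24, 29, 32, 28, 35, 24, 23, 30, 37, 27, 31]
mean = statistics.mean(accidents)
median = statistics.median(accidents)
sd = statistics.stdev(accidents)   # sample standard deviation (assumed here)

pcs = pcs_from_median(mean, median, sd)
print(shape(pcs))
```

The two forms need not give the same number, but for a clearly skewed distribution they are expected to agree in sign.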
EXERCISE
1. The following table shows the Efficiency Test scores of 100 new factory operators.

Test score   No. of operators, $f$   $fx$   $fx^2$
             6
             11
             18
             25
             17
             13
             7
             3

a) Calculate the mean and standard deviation of the scores obtained by the operators. Then,
explain the meaning of the mean value obtained.
d) Determine the shape of the distribution by calculating Pearson's Coefficient of Skewness.
e) Given that the mean and the variance of the scores of the senior operators were 55 and 144
respectively, which group of operators obtained more consistent scores?
2a) The data below shows the number of accidents in Johor Bahru for the year 2012.
26 24 29 32 28 35 24 23 30 37 27 31
i) Calculate the mean, mode, and median for the above data. Hence, determine the
shape of the distribution.
2b) Carlos Company tested 20 types of engine to see how many hours they would last. The data
are shown below:

Hours                 Frequency
5 to less than 10     6
10 to less than 15    4
15 to less than 20    3
20 to less than 25    2
25 to less than 30    5

i) Find the mean and standard deviation for the above distribution.
ii) Draw an ogive for the data and estimate the median value.
iii) Determine the skewness of the above data using the Pearson Coefficient of
Skewness.
iv) It is found that Kenji Company also tested the same engines and obtained a mean and
a variance of 19 and 36 respectively. Which company has more consistent hours in
testing the engines?
3a) The number of low-birth-weight infants for the past 12 years from NCB specialist is as
follows:
35 37 30 41 32 24 46 52 27 49 51 32
i)
ii)
iii)
b) The following table shows the distribution of the time to repair an electronic gadget for 50
gadgets chosen at random from Samsung Repair Shop.
i)
ii) It is reported that Sony Repair Shop has a mean repair time of 25.05 minutes and a standard
deviation of 4.17 minutes for repairing an electronic gadget. Determine which
repair shop is more consistent in repair time.
4)
The following data are the number of cupcakes sold weekdays in February 2013 by Balqis
Shop.
48
48
58
50
35
47
75
46
39
35
56
66
33
43
65
37
i)
17
60
52
67
68