
CHAPTER 1 DESCRIPTIVE STATISTICS



Introduction

Raw data: data recorded in the sequence in which they are collected, before they are processed or ranked.
Array data: raw data arranged in ascending or descending order.


Example 1 Quantitative raw data

Example 2 Qualitative raw data

These data are also called ungrouped data.

Organizing and Graphing Qualitative Data

Frequency Distributions / Table
Relative Frequency and Percentage Distribution
Graphical Presentation of Qualitative Data

Frequency Distributions / Table
A frequency distribution for qualitative data lists all categories and the number of elements that belong to each category. It shows how the frequencies are distributed over the various categories. It is also called a frequency distribution table or simply a frequency table.
The number of elements (for example, students) that belong to a certain category is called the frequency of that category.

Relative Frequency and Percentage Distribution
A relative frequency distribution lists all categories along with their relative frequencies (given as proportions or percentages). It is common to give the frequency and relative frequency distributions together.

Calculating the relative frequency and percentage of a category:

$$\text{Relative frequency of a category} = \frac{\text{Frequency of that category}}{\text{Sum of all frequencies}}$$

$$\text{Percentage} = (\text{Relative frequency}) \times 100$$



Example 3 A sample of UUM staff-owned vehicles produced by Proton was identified and the make of each noted. The resulting sample follows (W = Wira, Is = Iswara, Wj = Waja, St = Satria, P = Perdana, Sv = Savvy):
W Is Wj Wj St W W Is Sv W P W Wj W W Is Wj Sv Is W Is Is W P W P W W Sv St Is W W Wj St W Is Wj Wj P St W St W Wj Wj Wj W W Sv

Construct a frequency distribution table for these data with their relative frequency and percentage.

Solution:

Category   Frequency   Relative Frequency   Percentage (%)
Wira       19          19/50 = 0.38         0.38*100 = 38
Iswara      8          0.16                 0.16*100 = 16
Perdana     4          0.08                 0.08*100 = 8
Waja       10          0.20                 0.20*100 = 20
Satria      5          0.10                 0.10*100 = 10
Savvy       4          0.08                 0.08*100 = 8
Total      50          1.00                 100
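The counting in Example 3 can be checked with a short script. This is only a sketch: the function name `frequency_table` and the variable names are illustrative choices, not part of the course material.

```python
from collections import Counter

# Sample from Example 3, using the abbreviations given in the text.
sample = ("W Is Wj Wj St W W Is Sv W P W Wj W W Is Wj Sv Is W Is Is W P W "
          "P W W Sv St Is W W Wj St W Is Wj Wj P St W St W Wj Wj Wj W W Sv").split()

def frequency_table(values):
    """Return {category: (frequency, relative frequency, percentage)}."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: (f, f / total, 100 * f / total) for cat, f in counts.items()}

table = frequency_table(sample)
for cat, (f, rel, pct) in table.items():
    # One row per category: frequency, relative frequency, percentage.
    print(f"{cat:<3} {f:>3} {rel:.2f} {pct:.0f}")
```

The relative frequencies should sum to 1 and the percentages to 100, which is a quick check on any hand-built table.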

Graphical Presentation of Qualitative Data

Bar Graphs
A bar graph is a graph made of bars whose heights represent the frequencies of the respective categories. Such a graph is most helpful when you have many categories to represent. Notice that a gap is inserted between each of the bars. The types of bar chart are:
=> simple/vertical bar chart
=> horizontal bar chart
=> component bar chart
=> multiple bar chart

Simple/Vertical Bar Chart
To construct a vertical bar chart, mark the various categories on the horizontal axis and mark the frequencies on the vertical axis. Refer to Figure 2.1 and Figure 2.2.

Figure 2.1

Figure 2.2


Horizontal Bar Chart
To construct a horizontal bar chart, mark the various categories on the vertical axis and mark the frequencies on the horizontal axis.

Example 4: Refer to Example 3.

Figure 2.3: UUM Staff-owned Vehicles Produced by Proton (horizontal bar chart; vertical axis: types of vehicle; horizontal axis: frequency, 0 to 20)


Another example of a horizontal bar chart:

Figure 2.4: Number of students at Diversity College who are immigrants, by last country of permanent residence


Component Bar Chart
To construct a component bar chart, draw one bar for each group and divide every bar into components, one for each category. The height of each component should tally with the frequency it represents.

Example 5
Suppose we want to illustrate the information below, representing the number of people participating in the activities offered by an outdoor pursuits centre during June of three consecutive years.

            2004   2005   2006
Climbing      21     34     36
Caving        10     12     21
Walking       75     85    100
Sailing       36     36     40
Total        142    167    197

Solution:

Figure 2.5: Activities Breakdown (June) (component bar chart; horizontal axis: year, 2004 to 2006; vertical axis: number of participants, 0 to 200; components: climbing, caving, walking, sailing)


Multiple Bar Chart
To construct a multiple bar chart, the bars that represent the categories are gathered in groups. The height of each bar represents the frequency of its category. This chart is useful for making comparisons (of two or more values).

Example 6: Refer to Example 5.

Figure 2.6: Activities Breakdown (June) (multiple bar chart; horizontal axis: year, 2004 to 2006; vertical axis: number of participants, 0 to 120; bars: climbing, caving, walking, sailing)


Another example of horizontal bar chart: Figure 2.7

Figure 2.7: Preferred snack choices of students at UUM



Pie Chart
A circle divided into portions that represent the relative frequencies or percentages of a population or a sample belonging to different categories. An alternative to the bar chart and useful for summarizing a single categorical variable if there are not too many categories. The chart makes it easy to compare relative sizes of each class/category.

The whole pie represents the total sample or population. The pie is divided into portions that represent the different categories. To construct a pie chart, we multiply 360° by the relative frequency of each category to obtain the degree measure, or size of the angle, for the corresponding category.

Example 7 (Table 2.6 and Figure 2.8):

Figure 2.8

Example 8 (Table 2.7 and Figure 2.9):

Movie Genres      Frequency   Relative Frequency   Angle Size
Comedy            54          0.27                 360*0.27 = 97.2°
Action            36          0.18                 360*0.18 = 64.8°
Romance           28          0.14                 360*0.14 = 50.4°
Drama             28          0.14                 360*0.14 = 50.4°
Horror            22          0.11                 360*0.11 = 39.6°
Foreign           16          0.08                 360*0.08 = 28.8°
Science Fiction   16          0.08                 360*0.08 = 28.8°
Total             200         1.00                 360°

Figure 2.9
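The angle sizes of Example 8 follow directly from the rule "360° times the relative frequency". A minimal sketch, with `pie_angles` as an illustrative (hypothetical) helper name:

```python
# Frequencies from Example 8 (Table 2.7).
genres = {"Comedy": 54, "Action": 36, "Romance": 28, "Drama": 28,
          "Horror": 22, "Foreign": 16, "Science Fiction": 16}

def pie_angles(freqs):
    """Angle (in degrees) of each pie slice: 360 * relative frequency."""
    total = sum(freqs.values())
    return {cat: 360 * f / total for cat, f in freqs.items()}

angles = pie_angles(genres)
for cat, angle in angles.items():
    print(f"{cat:<16} {angle:6.1f}")
```

The angles should add up to 360°, just as the relative frequencies add up to 1.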

Line Graph/Time Series Graph
A line graph represents data that occur over a specific period of time. Line graphs are widely used because they reveal data trends clearly and are easy to create. When analyzing the graph, look for a trend or pattern that occurs over the time period.

For example, is the line ascending (indicating an increase over time) or descending (indicating a decrease over time)? Another thing to look for is the slope, or steepness, of the line. A line that is steep over a specific time period indicates a rapid increase or decrease over that period. Two data sets can be compared on the same graph (called a compound time series graph) if two lines are used. Data collected on the same element for the same variable at different points in time or for different periods of time are called time series data.

A line graph is a visual comparison of how two variables, shown on the x- and y-axes, are related or vary with each other. It shows related information by drawing a continuous line between all the points on a grid. Line graphs compare two variables: one is plotted along the x-axis (horizontal) and the other along the y-axis (vertical). The y-axis in a line graph usually indicates quantity (e.g., RM, number of sales, litres) or percentage, while the horizontal x-axis often measures units of time. As a result, the line graph is often viewed as a time series graph.

Example 9
A transit manager wishes to use the following data for a presentation showing how Port Authority Transit ridership has changed over the years. Draw a time series graph for the data and summarize the findings.

Year   Ridership (in millions)
1990   88.0
1991   85.0
1992   75.7
1993   76.6
1994   75.4

Solution:
[Time series graph: horizontal axis: year, 1990 to 1994; vertical axis: ridership (in millions), 75 to 89]

The graph shows a decline in ridership through 1992 and then leveling off for the years 1993 and 1994.
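The trend read off the graph can be backed up numerically by looking at the change from one year to the next. The sketch below assumes this simple year-over-year difference is an acceptable summary; `yearly_changes` is an illustrative name.

```python
# Ridership data from Example 9 (in millions).
ridership = {1990: 88.0, 1991: 85.0, 1992: 75.7, 1993: 76.6, 1994: 75.4}

def yearly_changes(series):
    """Change from each year to the next, rounded to one decimal place."""
    years = sorted(series)
    return {later: round(series[later] - series[earlier], 1)
            for earlier, later in zip(years, years[1:])}

changes = yearly_changes(ridership)
for year, change in changes.items():
    print(year, f"{change:+.1f}")
```

Large negative changes up to 1992 followed by small changes in 1993 and 1994 correspond to the steep decline and the levelling off seen in the graph.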

Exercise 1
1. The following data show the method of payment by 16 customers in a supermarket checkout line. Here, C = cash, CK = check, CC = credit card, D = debit and O = other.
C CK CK CC CK D C CC CC C D CK O CK C CC

a. Construct a frequency distribution table.
b. Calculate the relative frequencies and percentages for all categories.
c. Draw a pie chart for the percentage distribution.

1. a) and b) Frequency distribution table, relative frequencies, percentages and angle sizes of all categories.

Method of payment   Frequency, f   Relative frequency   Percentage (%)   Angle size (°)
Cash                 4             0.2500                25.00            90
Check                5             0.3125                31.25           112.5
Credit Card          4             0.2500                25.00            90
Debit                2             0.1250                12.50            45
Other                1             0.0625                 6.25            22.5
Total               16             1                    100              360

c) Pie chart
[Pie chart: Cash 25%, Check 31%, Credit Card 25%, Debit 13%, Other 6%]


Exercise 2: The frequency distribution table represents the sales of certain products at ZeeZee Company. For each product, the frequency of sales in a certain period is given. Find the relative frequency and the percentage of each product. Then, construct a pie chart using the information obtained.

Solution: Frequency distribution table, relative frequencies, percentages and angle sizes of all categories.

Type of product   Frequency   Relative Frequency   Percentage (%)   Angle Size (°)
A                 13          0.26                 26               93.6
B                 12          0.24                 24               86.4
C                  5          0.10                 10               36.0
D                  9          0.18                 18               64.8
E                 11          0.22                 22               79.2
Total             50          1.00                100              360

2.3 ORGANIZING AND GRAPHING QUANTITATIVE DATA
2.3.1 Stem and Leaf Display
2.3.2 Frequency Distribution
2.3.3 Relative Frequency and Percentage Distributions
2.3.4 Graphing Grouped Data
2.3.5 Shapes of Histogram
2.3.6 Cumulative Frequency Distributions


Stem-and-Leaf Display
In a stem-and-leaf display of quantitative data, each value is divided into two portions: a stem and a leaf. The leaves for each stem are then shown separately in a display. The display gives information about the pattern of the data and can reveal which values are repeated frequently.

Example 10
25 36 14 12 13 41 9 11 38 10 12 44 5 31 13 12 28 22 23 37 18 7 6 19

Solution:
0 | 9 5 7 6
1 | 4 2 3 1 0 2 3 2 8 9
2 | 5 8 2 3
3 | 6 8 1 7
4 | 1 4
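The display in Example 10 can be reproduced by splitting each value into its tens digit (stem) and units digit (leaf). This is a sketch for two-digit, non-negative integers only; the function name and the `ordered` flag are illustrative assumptions.

```python
from collections import defaultdict

# Data from Example 10.
data = [25, 36, 14, 12, 13, 41, 9, 11, 38, 10, 12, 44,
        5, 31, 13, 12, 28, 22, 23, 37, 18, 7, 6, 19]

def stem_and_leaf(values, ordered=False):
    """Group units digits (leaves) under their tens digit (stem)."""
    stems = defaultdict(list)
    for v in values:
        stems[v // 10].append(v % 10)
    # Stems are always sorted; leaves are sorted only when ordered=True.
    return {s: sorted(leaves) if ordered else leaves
            for s, leaves in sorted(stems.items())}

display = stem_and_leaf(data)
for stem, leaves in display.items():
    print(stem, "|", " ".join(map(str, leaves)))
```

Calling `stem_and_leaf(data, ordered=True)` gives the same display with the leaves of each stem ranked in increasing order.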

Exercise 3: Queen Bakery is a famous cake shop in Town J. The operations manager of Queen Bakery is interested in studying the time that customers queue before being served at the cashier counter. Below are the data (in minutes) for 20 customers. Construct a stem-and-leaf display.
5 1 8 1 3 3 10 7 15 16 10 2 9 3 12 6 11 16 14 8

Solution (leaves ranked in increasing order):
0 | 1 1 2 3 3 3 5 6 7 8 8 9
1 | 0 0 1 2 4 5 6 6


Frequency Distributions
A frequency distribution for quantitative data lists all the classes and the number of values that belong to each class. Data presented in the form of a frequency distribution are called grouped data.


The class boundary is the midpoint of the upper limit of one class and the lower limit of the next class. It is also called the real class limit. To find the boundary between the first class and the second class, we divide the sum of the upper limit of the first class and the lower limit of the second class by 2.

e.g.:
$$\text{Class boundary} = \frac{400 + 401}{2} = 400.5$$

Class Width (class size)
$$\text{Class width} = \text{Upper boundary} - \text{Lower boundary}$$
e.g.: Width of the first class = 600.5 - 400.5 = 200

Class Midpoint or Mark
$$\text{Class midpoint or mark} = \frac{\text{Lower limit} + \text{Upper limit}}{2}$$
e.g.:
$$\text{Midpoint of the 1st class} = \frac{401 + 600}{2} = 500.5$$

Constructing Frequency Distribution Tables
1. To decide the number of classes, we use Sturges' formula,
$$c = 1 + 3.3 \log n$$
where c is the number of classes and n is the number of observations in the data set.
2. Class width,
$$i > \frac{\text{Largest value} - \text{Smallest value}}{\text{Number of classes}} = \frac{\text{Range}}{c}$$
This class width is rounded to a convenient number.


3. Lower Limit of the First Class or the Starting Point
Use the smallest value in the data set.

Example 11
The following data give the total home runs hit by all players of each of the 30 Major League Baseball teams during the 2004 season.


Solution:
i. Number of classes, c = 1 + 3.3 log 30 = 1 + 3.3(1.48) = 5.88 ≈ 6 classes
ii. Class width,
$$i > \frac{242 - 135}{6} = 17.8 \approx 18$$
iii. Starting point = 135

Table 2.10 Frequency Distribution for Data of Table 2.9

Total Home Runs   Tally          f
135-152           ||||| |||||    10
153-170           ||              2
171-188           |||||           5
189-206           ||||| |         6
207-224           |||             3
225-242           ||||            4
                          Σf = 30


Exercise 4: The following data show the serving time (in minutes) for 40 customers in a post office:
2.0 4.5 2.5 2.9 4.2 2.9 3.5 2.8 3.2 2.1
4.6 2.7 2.9 3.1 2.8 3.9 4.0 3.6 5.1 2.9
3.0 4.3 2.7 2.9 3.8 4.7 2.6 2.5 2.5 2.6
4.4 3.7 2.3 4.1 3.5 3.3 3.5 3.1 3.0 2.4

Construct a frequency distribution table.

Solution:
i. Number of classes, c = 1 + 3.3 log 40 = 1 + 5.29 = 6.29 ≈ 6 classes
ii. Class width,
$$i > \frac{5.1 - 2.0}{6} = 0.52 \approx 0.6$$
iii. Starting point = 2.0

Time class   Tally                f
2.0-2.5      ||||| ||              7
2.6-3.1      ||||| ||||| |||||    15
3.2-3.7      ||||| ||              7
3.8-4.3      ||||| |               6
4.4-4.9      ||||                  4
5.0-5.5      |                     1
                             Σf = 40
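The three steps (number of classes, class width, starting point) can be traced in code for the Exercise 4 data. This is a sketch under stated assumptions: the class width 0.6 is the rounded value chosen in the solution, values are recorded to one decimal place, and the helper names are illustrative.

```python
import math

# Serving times from Exercise 4 (minutes).
times = [2.0, 4.5, 2.5, 2.9, 4.2, 2.9, 3.5, 2.8, 3.2, 2.1,
         4.6, 2.7, 2.9, 3.1, 2.8, 3.9, 4.0, 3.6, 5.1, 2.9,
         3.0, 4.3, 2.7, 2.9, 3.8, 4.7, 2.6, 2.5, 2.5, 2.6,
         4.4, 3.7, 2.3, 4.1, 3.5, 3.3, 3.5, 3.1, 3.0, 2.4]

def sturges_classes(n):
    """Number of classes from c = 1 + 3.3 log10(n), rounded to the nearest integer."""
    return round(1 + 3.3 * math.log10(n))

def frequency_distribution(values, start, width, classes, step=0.1):
    """Count the values in each class.

    Values are converted to whole multiples of `step` (tenths here)
    so that class membership is decided with integer arithmetic.
    """
    scale = round(1 / step)
    lo = round(start * scale)
    w = round(width * scale)
    counts = [0] * classes
    for v in values:
        counts[(round(v * scale) - lo) // w] += 1
    return counts

c = sturges_classes(len(times))                      # step i
raw_width = (max(times) - min(times)) / c            # step ii, before rounding up
counts = frequency_distribution(times, start=min(times), width=0.6, classes=c)
print(c, round(raw_width, 2), counts)
```

The counts agree with the tally column above, and their sum equals the number of observations.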

Relative Frequency and Percentage Distribution

$$\text{Relative frequency of a class} = \frac{\text{Frequency of that class}}{\text{Sum of all frequencies}} = \frac{f}{\sum f}$$

$$\text{Percentage} = (\text{Relative frequency}) \times 100$$

Example 12 (Refer to Example 11)
Table 2.11: Relative Frequency and Percentage Distributions

Total Home Runs   Class Boundaries             f    Relative Frequency   %
135-152           134.5 to less than 152.5     10   0.3333               33.33
153-170           152.5 to less than 170.5      2   0.0667                6.67
171-188           170.5 to less than 188.5      5   0.1667               16.67
189-206           188.5 to less than 206.5      6   0.2                  20
207-224           206.5 to less than 224.5      3   0.1                  10
225-242           224.5 to less than 242.5      4   0.1333               13.33
Sum                                            30   1.0                 100%

Exercise 5: Refer to Exercise 4 and construct the relative frequency distribution for the data.

Time class   f    Relative frequency
2.0-2.5       7   0.175
2.6-3.1      15   0.375
3.2-3.7       7   0.175
3.8-4.3       6   0.150
4.4-4.9       4   0.100
5.0-5.5       1   0.025
Total        40   1.000


Graphing Grouped Data
Histograms
A histogram is a graph in which the class boundaries are marked on the horizontal axis and either the frequencies, relative frequencies, or percentages are marked on the vertical axis. The frequencies, relative frequencies or percentages are represented by the heights of the bars. In a histogram, the bars are drawn adjacent to each other and there is a space between the y-axis and the first bar.

Example 13 (Refer to Example 11)

Figure 2.10: Frequency histogram for Table 2.10 (horizontal axis: total home runs, class boundaries 134.5 to 242.5; vertical axis: frequency, 0 to 12)

Polygon
A graph formed by joining the midpoints of the tops of successive bars in a histogram with straight lines is called a polygon.

Example 13

Figure 2.11: Frequency polygon for Table 2.10 (horizontal axis: total home runs, class boundaries 134.5 to 242.5; vertical axis: frequency, 0 to 12)


For a very large data set, as the number of classes is increased (and the width of classes is decreased), the frequency polygon eventually becomes a smooth curve called a frequency distribution curve or simply a frequency curve.

Figure 2.12: Frequency distribution curve

Shape of Histogram
The shapes described here for histograms apply to polygons and frequency curves as well.

The most common shapes are:
(i) Symmetric

Figure 2.13 & 2.14: Symmetric histograms

(ii) Right skewed and (iii) Left skewed

Figure 2.15 & 2.16: Right skewed and left skewed histograms

Cumulative Frequency Distributions
A cumulative frequency distribution gives the total number of values that fall below the upper boundary of each class.

Example 14: Using the frequency distribution of Table 2.11,

Total Home Runs   Class Boundaries             f    Cumulative Frequency
135-152           134.5 to less than 152.5     10   10
153-170           152.5 to less than 170.5      2   10+2 = 12
171-188           170.5 to less than 188.5      5   10+2+5 = 17
189-206           188.5 to less than 206.5      6   10+2+5+6 = 23
207-224           206.5 to less than 224.5      3   10+2+5+6+3 = 26
225-242           224.5 to less than 242.5      4   10+2+5+6+3+4 = 30
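The running totals in Example 14 are exactly what `itertools.accumulate` from the Python standard library produces. A minimal sketch (the variable names are illustrative):

```python
from itertools import accumulate

# Class frequencies and upper class boundaries from Table 2.11.
freqs = [10, 2, 5, 6, 3, 4]
upper_boundaries = [152.5, 170.5, 188.5, 206.5, 224.5, 242.5]

# Running total of the frequencies: values below each upper boundary.
cumulative = list(accumulate(freqs))

for boundary, total in zip(upper_boundaries, cumulative):
    print(f"less than {boundary}: {total}")
```

The last cumulative frequency always equals the total number of observations, here 30.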

Ogive
An ogive is a curve drawn for the cumulative frequency distribution by joining with straight lines the dots marked above the upper boundaries of classes at heights equal to the cumulative frequencies of the respective classes.
Two types of ogive:
(i) ogive less than
(ii) ogive greater than

First, build a table of cumulative frequency.

Example 15 (Ogive Less Than)

Earnings (RM)   Number of students (f)
30-39            5
40-49            6
50-59            6
60-69            3
70-79            3
80-89            7
Total           30

Earnings (RM)     Cumulative Frequency (F)
Less than 29.5     0
Less than 39.5     5
Less than 49.5    11
Less than 59.5    17
Less than 69.5    20
Less than 79.5    23
Less than 89.5    30

[Figure: Ogive Less Than (horizontal axis: earnings (RM), less than 29.5 to less than 89.5; vertical axis: cumulative frequency, 0 to 35)]


Example 16 (Ogive Greater Than)

Earnings (RM)   Number of students (f)
30-39            5
40-49            6
50-59            6
60-69            3
70-79            3
80-89            7
Total           30

Earnings (RM)     Cumulative Frequency (F)
More than 29.5    30
More than 39.5    25
More than 49.5    19
More than 59.5    13
More than 69.5    10
More than 79.5     7
More than 89.5     0

[Figure: Ogive Greater Than (horizontal axis: earnings (RM), more than 29.5 to more than 89.5; vertical axis: cumulative frequency, 0 to 35)]


Exercise 6: Using the frequency table that you constructed in Exercises 4 and 5, build an ogive less than and an ogive greater than.

Time class   f
2.0-2.5       7
2.6-3.1      15
3.2-3.7       7
3.8-4.3       6
4.4-4.9       4
5.0-5.5       1
Total        40


Solution: (ogive less than)

Class boundaries   Cumulative frequency
Less than 1.95      0
Less than 2.55      7
Less than 3.15     22
Less than 3.75     29
Less than 4.35     35
Less than 4.95     39
Less than 5.55     40

Solution: (ogive greater than)

Class boundaries   Cumulative frequency
More than 1.95     40
More than 2.55     33
More than 3.15     18
More than 3.75     11
More than 4.35      5
More than 4.95      1
More than 5.55      0
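Both cumulative tables for Exercise 6 can be generated from the class boundaries and frequencies. In this sketch, `ogive_tables` is an illustrative helper; it assumes the list of boundaries has one more entry than the list of frequencies.

```python
from itertools import accumulate

def ogive_tables(boundaries, freqs):
    """Return the 'less than' and 'more than' cumulative tables.

    boundaries: class boundaries, one more entry than freqs.
    Each table is a list of (boundary, cumulative frequency) pairs.
    """
    running = [0] + list(accumulate(freqs))
    total = running[-1]
    less_than = list(zip(boundaries, running))
    more_than = [(b, total - c) for b, c in less_than]
    return less_than, more_than

# Class boundaries and frequencies from Exercise 6.
boundaries = [1.95, 2.55, 3.15, 3.75, 4.35, 4.95, 5.55]
freqs = [7, 15, 7, 6, 4, 1]

less_than, more_than = ogive_tables(boundaries, freqs)
for (b, less), (_, more) in zip(less_than, more_than):
    print(f"{b:.2f}  less than: {less:>2}  more than: {more:>2}")
```

At every boundary the two cumulative frequencies add up to the total of 40, which is why the two ogives mirror each other.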

Exercise 7: Given the following frequency table:

Amount ($)   Number of Responses   Cumulative frequency
0-99         2                     25
100-199      2                     23
200-299      6
300-399      9
400-499      4
500-999      2                      2

i. Construct the suitable ogive for the data.


Solution

Amount ($)   f
0-99         2
100-199      2
200-299      6
300-399      9
400-499      4
500-999      2

Class Boundaries    Cumulative frequency
More than -0.5      25
More than 99.5      23
More than 199.5     21
More than 299.5     15
More than 399.5      6
More than 499.5      2
More than 999.5      0

[Figure: Ogive More Than (horizontal axis: class boundaries, more than -0.5 to more than 999.5; vertical axis: cumulative frequency, 0 to 30)]

Box-Plot
A box-plot describes the data graphically using five measurements: the smallest value, the first quartile (K1), the second quartile (median or K2), the third quartile (K3) and the largest value.

For symmetric data
[Box-plot figure, marked: smallest value, K1, median, K3, largest value]

For left skewed data
[Box-plot figure, marked: smallest value, K1, median, K3, largest value]

For right skewed data
[Box-plot figure, marked: smallest value, K1, median, K3, largest value]


Measures of Central Tendency
Ungrouped Data: (1) Mean (2) Median (3) Mode
Grouped Data: (1) Mean (2) Median (3) Mode


Ungrouped Data

Mean
Mean for population data:
$$\mu = \frac{\sum x}{N}$$
Mean for sample data:
$$\bar{x} = \frac{\sum x}{n}$$
where:
$\sum x$ = the sum of all values
N = the population size
n = the sample size
$\mu$ = the population mean
$\bar{x}$ = the sample mean


Example 17
The following data give the prices (rounded to the nearest thousand RM) of five homes sold recently in Sekayang.
158 189 265 127 191
Find the mean sale price for these homes.

Solution:
$$\bar{x} = \frac{\sum x}{n} = \frac{158 + 189 + 265 + 127 + 191}{5} = \frac{930}{5} = 186$$
Thus, these five homes were sold for an average price of RM186 thousand, that is, RM186 000.
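As a quick check of Example 17, the sample mean is the sum of the values divided by their count. A minimal sketch (`sample_mean` is an illustrative name; the standard library function `statistics.mean` gives the same result):

```python
# Home prices from Example 17, in thousands of RM.
prices = [158, 189, 265, 127, 191]

def sample_mean(values):
    """Sample mean: sum of all values divided by the sample size n."""
    return sum(values) / len(values)

mean_price = sample_mean(prices)
print(sum(prices), mean_price)
```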

Median
The median is the value of the middle term in a data set that has been ranked in increasing order.

Procedure for finding the median:
Step 1: Rank the data set in increasing order.
Step 2: Determine the depth (position or location) of the median.
$$\text{Depth of median} = \frac{n + 1}{2}$$
Step 3: Determine the value of the median.

Example 19
Find the median for the following data:
10 5 19 8 3

Solution:
(1) Rank the data in increasing order:
3 5 8 10 19
(2) Determine the depth of the median:
$$\text{Depth of median} = \frac{n + 1}{2} = \frac{5 + 1}{2} = 3$$
(3) Determine the value of the median. The median is located in the third position of the data set:
3 5 8 10 19
Hence, the median for the above data = 8


Example 20
Find the median for the following data:
10 5 19 8 3 15

Solution:
(1) Rank the data in increasing order:
3 5 8 10 15 19
(2) Determine the depth of the median:
$$\text{Depth of median} = \frac{n + 1}{2} = \frac{6 + 1}{2} = 3.5$$
(3) Determine the value of the median. The median is located midway between the 3rd position and the 4th position of the data set:
$$\text{Median} = \frac{8 + 10}{2} = 9$$
Hence, the median for the above data = 9

- The median gives the center of a histogram, with half of the data values to the left of (or, less than) the median and half to the right of (or, more than) the median.
- The advantage of using the median is that it is not influenced by outliers.
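The three-step procedure (rank, find the depth, read off the value) can be written out as follows. This is a sketch; `median_by_depth` is an illustrative name, and it returns both the depth and the median so that each step can be compared with Examples 19 and 20.

```python
import math

def median_by_depth(values):
    """Median via the depth (n + 1) / 2 of the ranked data."""
    ranked = sorted(values)                    # Step 1: rank the data
    depth = (len(ranked) + 1) / 2              # Step 2: depth of the median
    # Step 3: a whole-number depth points at one value; a depth ending
    # in .5 lies midway between the two neighbouring positions.
    lower = ranked[math.floor(depth) - 1]
    upper = ranked[math.ceil(depth) - 1]
    return depth, (lower + upper) / 2

print(median_by_depth([10, 5, 19, 8, 3]))       # Example 19
print(median_by_depth([10, 5, 19, 8, 3, 15]))   # Example 20
```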

Mode
The mode is the value that occurs with the highest frequency in a data set.

Example 21
1. What is the mode for the given data?
77 69 74 81 71 68 74 73
Solution: Mode = 74 (this number occurs twice): Unimodal

2. What is the mode for the given data?
77 69 68 74 81 71 68 74 73
Solution: Mode = 68 and 74: Bimodal

A major shortcoming of the mode is that a data set may have none or may have more than one mode. One advantage of the mode is that it can be calculated for both kinds of data, quantitative and qualitative.
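Finding the mode amounts to counting occurrences and keeping the values with the highest count. The sketch below adopts one convention as an assumption: when every value occurs only once, it reports no mode (an empty list). The name `modes` is illustrative.

```python
from collections import Counter

def modes(values):
    """Return the list of modes, or [] when no value repeats (assumed convention)."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []          # every value occurs once: treated here as "no mode"
    return sorted(v for v, c in counts.items() if c == top)

print(modes([77, 69, 74, 81, 71, 68, 74, 73]))        # unimodal case of Example 21
print(modes([77, 69, 68, 74, 81, 71, 68, 74, 73]))    # bimodal case of Example 21
```

Because it only counts occurrences, the same function works for qualitative data such as a list of category labels.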

Grouped Data

1. Mean
Mean for population data:
$$\mu = \frac{\sum fx}{N}$$
Mean for sample data:
$$\bar{x} = \frac{\sum fx}{n}$$
where x is the midpoint and f is the frequency of a class.


Example 22
The following table gives the frequency distribution of the number of orders received each day during the past 50 days at the office of a mail-order company. Calculate the mean.

Number of orders   f
10-12               4
13-15              12
16-18              20
19-21              14
                   n = 50

Solution:
Because the data set includes only 50 days, it represents a sample. The value of $\sum fx$ is calculated in the following table:

Number of orders   f    x    fx
10-12               4   11    44
13-15              12   14   168
16-18              20   17   340
19-21              14   20   280
                   n = 50    Σfx = 832

The value of the sample mean is:
$$\bar{x} = \frac{\sum fx}{n} = \frac{832}{50} = 16.64$$

Thus, this mail-order company received an average of 16.64 orders per day during these 50 days.
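The grouped-data mean of Example 22 can be reproduced by computing each class midpoint and weighting it by the class frequency. A sketch, with `grouped_mean` as an illustrative name:

```python
# Class limits and frequencies from Example 22.
class_limits = [(10, 12), (13, 15), (16, 18), (19, 21)]
freqs = [4, 12, 20, 14]

def grouped_mean(class_limits, freqs):
    """Mean of grouped data: sum of f * x over n, with x the class midpoint."""
    midpoints = [(lo + hi) / 2 for lo, hi in class_limits]
    return sum(f * x for f, x in zip(freqs, midpoints)) / sum(freqs)

mean_orders = grouped_mean(class_limits, freqs)
print(round(mean_orders, 2))
```

Applied to the dentist-visit data of Exercise 8 (classes 0-4, 5-9, 10-14, 15-19 with frequencies 16, 25, 48, 11), the same function gives 970/100 = 9.7.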

Exercise 8: A survey research company asks 100 people how many times they have been to the dentist in the last five years. Their grouped responses appear below.

Number of Visits   Number of Responses
0-4                16
5-9                25
10-14              48
15-19              11

What is the mean of the data?

Solution:

Number of Visits   Number of Responses, f   x    fx
0-4                16                        2    32
5-9                25                        7   175
10-14              48                       12   576
15-19              11                       17   187
Total              100                           970

The value of the sample mean is:
$$\bar{x} = \frac{\sum fx}{n} = \frac{\sum fx}{\sum f} = \frac{970}{100} = 9.70$$

Thus, on average, these people have been to the dentist 9.7 times in the last five years.


Median
Step 1: Construct the cumulative frequency distribution.
Step 2: Decide which class contains the median. The median class is the first class whose cumulative frequency is at least n/2.
Step 3: Find the median by using the following formula:

$$\text{Median} = L_m + \left(\frac{\frac{n}{2} - F}{f_m}\right) i$$

Where:
n = the total frequency
F = the total frequency before the median class
i = the class width
$f_m$ = the frequency of the median class
$L_m$ = the lower boundary of the median class


Example 23
Based on the grouped data below, find the median:

Time to travel to work   Frequency
1-10                      8
11-20                    14
21-30                    12
31-40                     9
41-50                     7

Solution:
1st Step: Construct the cumulative frequency distribution.

Time to travel to work   Frequency   Cumulative Frequency
1-10                      8            8
11-20                    14           22
21-30                    12           34
31-40                     9           43
41-50                     7           50

$$\frac{n}{2} = \frac{50}{2} = 25$$
The median class is the 3rd class.
So, F = 22, $f_m$ = 12, $L_m$ = 20.5 and i = 10.
Therefore,
$$\text{Median} = L_m + \left(\frac{\frac{n}{2} - F}{f_m}\right) i = 20.5 + \left(\frac{\frac{50}{2} - 22}{12}\right) 10 = 23$$

Thus, 25 persons take less than 23 minutes to travel to work and another 25 persons take more than 23 minutes to travel to work.
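The steps of the grouped-data median can be traced in code. This sketch assumes integer class limits with a gap of 1 between classes, so the lower boundary is the lower limit minus 0.5 and the width is upper limit minus lower limit plus 1; `grouped_median` is an illustrative name.

```python
def grouped_median(class_limits, freqs):
    """Median = L_m + ((n/2 - F) / f_m) * i for grouped data.

    Assumes integer class limits such as (21, 30), so that the
    lower boundary is lo - 0.5 and the class width is hi - lo + 1.
    """
    n = sum(freqs)
    cumulative = 0                       # F: total frequency before the current class
    for (lo, hi), f in zip(class_limits, freqs):
        if cumulative + f >= n / 2:      # first class reaching n/2 is the median class
            lower_boundary = lo - 0.5
            width = hi - lo + 1
            return lower_boundary + (n / 2 - cumulative) / f * width
        cumulative += f
    raise ValueError("no frequencies given")

# Travel-time data from Example 23.
travel_classes = [(1, 10), (11, 20), (21, 30), (31, 40), (41, 50)]
travel_freqs = [8, 14, 12, 9, 7]
print(grouped_median(travel_classes, travel_freqs))
```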

Exercise 9: A survey research company asks 100 people how many times they have been to the dentist in the last five years. Their grouped responses appear below.

Number of Visits    Number of Responses
0 - 4               16
5 - 9               25
10 - 14             48
15 - 19             11

What is the median of the data?

Solution:

Number of Visits    Number of Responses    Cumulative frequency
0 - 4               16                     16
5 - 9               25                     41
10 - 14             48                     89
15 - 19             11                     100

\[ \frac{n}{2} = \frac{100}{2} = 50 \]

The median class is the 3rd class.

So, F = 41, \( f_m \) = 48, \( L_m \) = 9.5 and i = 5. Therefore,

\[ \text{Median} = L_m + \left( \frac{\frac{n}{2} - F}{f_m} \right) i = 9.5 + \left( \frac{50 - 41}{48} \right) 5 = 10.4375 \]

Thus, 50 people went to the dentist fewer than 10.4375 times in the last five years and another 50 people went more than 10.4375 times.

Mode

The mode is the value that has the highest frequency in a data set. For grouped data, the class mode (or modal class) is the class with the highest frequency. To find the mode for grouped data, use the following formula:

\[ \text{Mode} = L_{mo} + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) i \]

Where:
\( \Delta_1 \) is the difference between the frequency of the class mode and the frequency of the class before the class mode
\( \Delta_2 \) is the difference between the frequency of the class mode and the frequency of the class after the class mode
i is the class width
\( L_{mo} \) is the lower boundary of the class mode

Example 24
Based on the grouped data below, find the mode:

Time to travel to work    Frequency
1 - 10                    8
11 - 20                   14
21 - 30                   12
31 - 40                   9
41 - 50                   7

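Solution:
Based on the table, the class mode is 11 - 20, so

\( L_{mo} \) = 10.5, \( \Delta_1 \) = (14 - 8) = 6, \( \Delta_2 \) = (14 - 12) = 2 and i = 10

\[ \text{Mode} = L_{mo} + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) i = 10.5 + \left( \frac{6}{6 + 2} \right) 10 = 18 \]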



We can also obtain the mode by using the histogram;

Figure 2.19

Exercise 10: The following table gives the distribution of the share prices for ABC Company, which was listed in BSKL in 2005.

Price (RM)    Frequency
12 - 14       5
15 - 17       14
18 - 20       25
21 - 23       7
24 - 26       6
27 - 29       3

Find the mode for this data using the formula and the histogram.

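Solution:
The class mode is 18 - 20, so \( L_{mo} \) = 17.5, \( \Delta_1 \) = (25 - 14) = 11, \( \Delta_2 \) = (25 - 7) = 18 and i = 3.

\[ \text{Mode} = L_{mo} + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) i = 17.5 + \left( \frac{11}{11 + 18} \right) 3 = 18.64 \]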
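The grouped-mode formula can be checked with a short Python sketch. The function name `grouped_mode` is hypothetical, the lower boundary is assumed to be the lower class limit minus 0.5, and a missing neighbouring class (when the modal class is first or last) is treated as having frequency 0, which is an assumption the notes do not address.

```python
def grouped_mode(lower_limits, width, freqs):
    """Mode of grouped data: L_mo + (d1 / (d1 + d2)) * i."""
    # Index of the first class with the highest frequency (the modal class).
    k = max(range(len(freqs)), key=freqs.__getitem__)
    before = freqs[k - 1] if k > 0 else 0               # assumption: 0 if absent
    after = freqs[k + 1] if k < len(freqs) - 1 else 0   # assumption: 0 if absent
    d1 = freqs[k] - before
    d2 = freqs[k] - after
    boundary = lower_limits[k] - 0.5  # assumes integer class limits
    return boundary + d1 / (d1 + d2) * width


# Share-price data from Exercise 10 (classes 12-14, 15-17, ..., 27-29).
price_mode = grouped_mode([12, 15, 18, 21, 24, 27], 3, [5, 14, 25, 7, 6, 3])
print(round(price_mode, 2))
```

For the share-price table the result rounds to 18.64, in line with the formula-based answer.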

[Figure: histogram of frequency against class boundaries (11.5, 14.5, 17.5, 20.5, 23.5, 26.5, 29.5); mode read from the histogram = 18.5]

Median using ogive: Example 1: example 15 Example 2: example 16

Relationship among mean, median & mode
As discussed in the previous topic, a histogram or a frequency distribution curve can be either skewed or symmetric. Knowing the values of the mean, median and mode gives us some idea of the shape of the frequency curve.



Figure 2.20: Mean, median, and mode for a symmetric histogram and frequency distribution curve

Figure 2.21: Mean, median, and mode for a histogram and frequency distribution curve skewed to the right



Figure 2.22: Mean, median, and mode for a histogram and frequency distribution curve skewed to the left

Exercise 11: For each of the following situations, state whether the distribution is symmetric, skewed to the right or skewed to the left.

1. Mean = 10, median = 15, mode = 20: Left skewed
2. Mean = 15, median = 10, mode = 7: Right skewed
3. Mean = 10, median = 10, mode = 11: Approximately symmetric
4. Mean = 11, median = 12, mode = 12: Approximately symmetric

Dispersion Measurement

Measures of central tendency such as the mean, median and mode do not reveal the whole picture of the distribution of a data set. Two data sets with the same mean may have completely different spreads. The variation among the values of observations for one data set may be much larger or smaller than for the other data set.

Ungrouped Data

1. Range

RANGE = Largest value - Smallest value

Example 25: Find the range of production for this data set,

Solution:
Range = Largest value - Smallest value = 267 277 - 49 651 = 217 626

Disadvantages:
- It is influenced by outliers.
- It is based on two values only. All other values in the data set are ignored.

Variance and Standard Deviation
Standard deviation is the most widely used measure of dispersion. The value of the standard deviation tells how closely the values of a data set are clustered around the mean. A lower value of the standard deviation indicates that the values of the data set are spread over a relatively smaller range around the mean. A larger value of the standard deviation indicates that the values of the data set are spread over a relatively larger range around the mean (far from the mean).

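The standard deviation is obtained as the positive square root of the variance:

Population:
\[ \sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{\sum x^2 - \frac{(\sum x)^2}{N}}{N}, \qquad \sigma = \sqrt{\sigma^2} \]

Sample:
\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}, \qquad s = \sqrt{s^2} \]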

Example 26
Let x denote the total production (in units) of a company.

Company    Production
A          62
B          93
C          126
D          75
E          34

Find the variance and standard deviation.

Solution:

Company    Production (x)    x^2
A          62                3844
B          93                8649
C          126               15876
D          75                5625
E          34                1156
Total      390               35150

\[ s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{35150 - \frac{(390)^2}{5}}{5 - 1} = 1182.50 \]

Since \( s^2 = 1182.50 \), therefore

\[ s = \sqrt{1182.50} = 34.3875 \]
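The calculation above can be reproduced with a small Python sketch. The helper name `sample_variance` is an assumption made for illustration; the comparison with `statistics.variance` from the Python standard library, which also divides by n - 1, is included only as a sanity check.

```python
import math
import statistics

# Production figures for companies A to E from Example 26.
production = [62, 93, 126, 75, 34]


def sample_variance(values):
    """Sample variance via the shortcut formula (sum x^2 - (sum x)^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


s2 = sample_variance(production)
s = math.sqrt(s2)
print(s2, round(s, 4))

# The standard library's sample variance should agree with the shortcut formula.
print(statistics.variance(production))
```

Both approaches give a variance of 1182.5 for this data set.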

The properties of variance and standard deviation:
- The standard deviation is a measure of variation of all values from the mean.
- The values of the variance and the standard deviation are never negative. Also, larger values of variance or standard deviation indicate greater amounts of variation.
- The value of s can increase dramatically with the inclusion of one or more outliers.

Grouped Data

Range = Upper boundary of last class - Lower boundary of first class

Class       Frequency
41 - 50     1
51 - 60     3
61 - 70     7
71 - 80     13
81 - 90     10
91 - 100    6
Total       40

Upper boundary of last class = 100.5
Lower boundary of first class = 40.5
Range = 100.5 - 40.5 = 60

Variance and Standard Deviation

Population:
\[ \sigma^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{N}}{N}, \qquad \sigma = \sqrt{\sigma^2} \]

Sample:
\[ s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1}, \qquad s = \sqrt{s^2} \]

Example 27
Find the variance and standard deviation for the following data:

No. of order    f
10 - 12         4
13 - 15         12
16 - 18         20
19 - 21         14
Total           n = 50

Solution:

No. of order    f         x     fx     fx^2
10 - 12         4         11    44     484
13 - 15         12        14    168    2352
16 - 18         20        17    340    5780
19 - 21         14        20    280    5600
Total           n = 50          832    14216

Variance,

\[ s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1} = \frac{14216 - \frac{(832)^2}{50}}{50 - 1} = 7.5820 \]

Standard Deviation,

\[ s = \sqrt{s^2} = \sqrt{7.5820} = 2.75 \]

Thus, the standard deviation of the number of orders received at the office of this mail-order company during the past 50 days is 2.75.
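A short Python sketch of the grouped-data calculation follows. It is an illustration under stated assumptions: the function name `grouped_sample_variance` is hypothetical, and each class is represented by its midpoint x, as in the solution table.

```python
import math


def grouped_sample_variance(midpoints, freqs):
    """Sample variance of grouped data:
    (sum f*x^2 - (sum f*x)^2 / n) / (n - 1), with n = sum f."""
    n = sum(freqs)
    sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
    sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)


# Midpoints and frequencies for the order classes 10-12, 13-15, 16-18, 19-21.
order_variance = grouped_sample_variance([11, 14, 17, 20], [4, 12, 20, 14])
order_sd = math.sqrt(order_variance)
print(round(order_variance, 4), round(order_sd, 2))
```

The sums computed inside the function are 832 for fx and 14216 for fx^2, the same values used in the hand calculation.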

Exercise 13: Refer to Exercise 10, find the variance and standard deviation for the data.

Price (RM)    Frequency
12 - 14       5
15 - 17       14
18 - 20       25
21 - 23       7
24 - 26       6
27 - 29       3

Solution

Price (RM)    f     x     fx      x^2    fx^2
12 - 14       5     13    65      169    845
15 - 17       14    16    224     256    3584
18 - 20       25    19    475     361    9025
21 - 23       7     22    154     484    3388
24 - 26       6     25    150     625    3750
27 - 29       3     28    84      784    2352
Total         60          1152           22944

Variance:

\[ s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1} = \frac{22944 - \frac{(1152)^2}{60}}{59} = \frac{825.6}{59} = 13.9932 \]

Standard deviation: since \( s^2 = 13.9932 \), therefore

\[ s = \sqrt{s^2} = \sqrt{13.9932} = 3.7407 \]

Relative Dispersion Measurement

Used to compare the dispersion of two or more distributions that have different units, OR to compare two or more distributions that have the same unit but very different means.

Also called modified coefficient or coefficient of variation, CV.

FORMULA

\[ CV = \frac{s}{\bar{x}} \times 100\% \quad (\text{sample}) \]
\[ CV = \frac{\sigma}{\mu} \times 100\% \quad (\text{population}) \]

Example 26

The mean and standard deviation of monthly salary for two groups of workers at ABC Company are given as follows. Group 1: 700 & 20 and Group 2: 1070 & 20. Find the CV for each group and determine which group is more dispersed.

Solution:

\[ CV_1 = \frac{20}{700} \times 100\% = 2.86\% \]
\[ CV_2 = \frac{20}{1070} \times 100\% = 1.87\% \]

The monthly salary for group 1 workers is more dispersed compared to group 2.
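The comparison can be sketched in Python as below. The function name `coefficient_of_variation` is an assumption made for this illustration; it simply applies the sample formula to the two salary groups.

```python
def coefficient_of_variation(mean, std_dev):
    """Coefficient of variation as a percentage: (s / mean) * 100."""
    return std_dev / mean * 100


cv_group1 = coefficient_of_variation(700, 20)
cv_group2 = coefficient_of_variation(1070, 20)
print(round(cv_group1, 2), round(cv_group2, 2))

# The group with the larger CV is relatively more dispersed.
more_dispersed = "Group 1" if cv_group1 > cv_group2 else "Group 2"
print(more_dispersed)
```

Although both groups have the same standard deviation, the smaller mean of group 1 gives it the larger relative dispersion.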

MEASURE OF POSITION
QUARTILE
INTERQUARTILE RANGE

A measure of position determines the position of a single value in relation to other values in a sample or a population data set.

Quartiles
Quartiles are three summary measures that divide a ranked data set into four equal parts.

- The 1st quartile is denoted as Q1.

FORMULA
\[ \text{Depth of } Q_1 = \frac{n + 1}{4} \]

- The 2nd quartile is the median of the data set, or Q2.
- The 3rd quartile is denoted as Q3.

FORMULA
\[ \text{Depth of } Q_3 = \frac{3(n + 1)}{4} \]

Example 27
The table below lists the total revenue for the 11 top tourism companies in Malaysia:

109.7 79.4 79.9 89.3 121.2 76.4 80.2 98.0 103.5 86.8 82.1

Solution:
Step 1: Arrange the data in increasing order

76.4 79.4 79.9 80.2 82.1 86.8 89.3 98.0 103.5 109.7 121.2

Step 2: Determine the depth for Q1 and Q3

\[ \text{Depth of } Q_1 = \frac{n + 1}{4} = \frac{11 + 1}{4} = 3 \]
\[ \text{Depth of } Q_3 = \frac{3(n + 1)}{4} = \frac{3(11 + 1)}{4} = 9 \]

Step 3: Determine the Q1 and Q3

76.4 79.4 79.9 80.2 82.1 86.8 89.3 98.0 103.5 109.7 121.2

Q1 is the 3rd value and Q3 is the 9th value, so Q1 = 79.9 ; Q3 = 103.5

Interquartile Range
The difference between the third quartile and the first quartile for a data set.

FORMULA
\[ IQR = Q_3 - Q_1 \]

Example 28
The table below lists the total revenue for the 12 top tourism companies in Malaysia:

109.7 82.1 79.9 79.4 74.1 121.2 89.3 98.0 76.4 103.5 80.2 86.8

Determine the IQR of the data.

Solution:
Step 1: Arrange the data in increasing order

74.1 76.4 79.4 79.9 80.2 82.1 86.8 89.3 98.0 103.5 109.7 121.2

Step 2: Determine the depth for Q1 and Q3

\[ \text{Depth of } Q_1 = \frac{n + 1}{4} = \frac{12 + 1}{4} = 3.25 \]
\[ \text{Depth of } Q_3 = \frac{3(n + 1)}{4} = \frac{3(12 + 1)}{4} = 9.75 \]

Step 3: Determine the Q1 and Q3

Q1 lies between the 3rd and 4th values, and Q3 lies between the 9th and 10th values:

Q1 = 79.4 + 0.25 (79.9 - 79.4) = 79.525
Q3 = 98.0 + 0.75 (103.5 - 98.0) = 102.125

Therefore, IQR = Q3 - Q1 = 102.125 - 79.525 = 22.6
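The depth-based procedure can be sketched in Python as follows. This is an illustrative sketch only: the function names `value_at_depth` and `quartiles` are assumptions, depths are treated as 1-based positions in the sorted data, fractional depths are resolved by linear interpolation as in Step 3, and the data set is assumed to have at least three values.

```python
import math


def value_at_depth(sorted_values, depth):
    """Value at a 1-based (possibly fractional) depth, interpolating linearly."""
    whole = math.floor(depth)
    frac = depth - whole
    lower = sorted_values[whole - 1]
    if frac == 0:
        return lower
    upper = sorted_values[whole]
    return lower + frac * (upper - lower)


def quartiles(values):
    """Return (Q1, Q3) using depths (n + 1)/4 and 3(n + 1)/4; assumes n >= 3."""
    data = sorted(values)
    n = len(data)
    q1 = value_at_depth(data, (n + 1) / 4)
    q3 = value_at_depth(data, 3 * (n + 1) / 4)
    return q1, q3


# Revenue of the 12 tourism companies from Example 28.
revenue = [109.7, 82.1, 79.9, 79.4, 74.1, 121.2,
           89.3, 98.0, 76.4, 103.5, 80.2, 86.8]
q1, q3 = quartiles(revenue)
iqr = q3 - q1
print(round(q1, 3), round(q3, 3), round(iqr, 3))
```

The same function handles whole-number depths, such as the 11-company data set where the depths are 3 and 9.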

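Quartile For Grouped Data
By analogy with the median formula, the equations for Q1 and Q3 are as follows:

FORMULA
\[ Q_1 = L_{Q_1} + \left( \frac{\frac{n}{4} - F}{f_{Q_1}} \right) i \]
\[ Q_3 = L_{Q_3} + \left( \frac{\frac{3n}{4} - F}{f_{Q_3}} \right) i \]

Here L is the lower boundary of the quartile class, F is the cumulative frequency before that class, f is the frequency of that class and i is the class width.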

Example: Find Q1 and Q3 for the following data

Time to travel to work    Frequency
1 - 10                    8
11 - 20                   14
21 - 30                   12
31 - 40                   9
41 - 50                   7

Solution:

Time to travel to work    Frequency    Cumulative Frequency
1 - 10                    8            8
11 - 20                   14           22
21 - 30                   12           34
31 - 40                   9            43
41 - 50                   7            50

\[ \text{Class } Q_1: \quad \frac{n}{4} = \frac{50}{4} = 12.5 \]

Class Q1 is the 2nd class. Therefore,

\[ Q_1 = L_{Q_1} + \left( \frac{\frac{n}{4} - F}{f_{Q_1}} \right) i = 10.5 + \left( \frac{12.5 - 8}{14} \right) 10 = 13.7143 \]

\[ \text{Class } Q_3: \quad \frac{3n}{4} = \frac{3(50)}{4} = 37.5 \]

Class Q3 is the 4th class. Therefore,

\[ Q_3 = L_{Q_3} + \left( \frac{\frac{3n}{4} - F}{f_{Q_3}} \right) i = 30.5 + \left( \frac{37.5 - 34}{9} \right) 10 = 34.3889 \]
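Since Q1 and Q3 follow the same pattern as the grouped median, one hedged way to sketch them in Python is a single function that takes the fraction of the data (0.25 for Q1, 0.75 for Q3). The name `grouped_quantile` and its parameters are assumptions for illustration, and the lower boundary is taken as the lower class limit minus 0.5, which assumes integer class limits.

```python
def grouped_quantile(lower_limits, width, freqs, fraction):
    """Quantile of grouped data: L + ((fraction * n - F) / f) * i."""
    n = sum(freqs)
    target = fraction * n  # n/4 for Q1, 3n/4 for Q3
    cumulative = 0         # F: cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        # The quantile class is the first class whose cumulative
        # frequency reaches the target position.
        if cumulative + f >= target:
            boundary = lower - 0.5  # assumes integer class limits
            return boundary + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("frequency table is empty")


# Travel-time data (classes 1-10, 11-20, ..., 41-50).
limits = [1, 11, 21, 31, 41]
freqs = [8, 14, 12, 9, 7]
q1 = grouped_quantile(limits, 10, freqs, 0.25)
q3 = grouped_quantile(limits, 10, freqs, 0.75)
print(round(q1, 4), round(q3, 4))
```

Calling the function with a fraction of 0.5 gives the grouped median for the same table.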

Exercise: Find Q1 and Q3 for the following data

Price (RM)    Frequency
12 - 14       5
15 - 17       14
18 - 20       25
21 - 23       7
24 - 26       6
27 - 29       3

Answer:

Price (RM)    Frequency    Cumulative freq
12 - 14       5            5
15 - 17       14           19
18 - 20       25           44
21 - 23       7            51
24 - 26       6            57
27 - 29       3            60

\[ \text{Class } Q_1: \quad \frac{n}{4} = \frac{60}{4} = 15 \]

Class Q1 is the 2nd class. Therefore,

\[ Q_1 = L_{Q_1} + \left( \frac{\frac{n}{4} - F}{f_{Q_1}} \right) i = 14.5 + \left( \frac{15 - 5}{14} \right) 3 = 16.6429 \]

\[ \text{Class } Q_3: \quad \frac{3n}{4} = \frac{3(60)}{4} = 45 \]

Class Q3 is the 4th class. Therefore,

\[ Q_3 = L_{Q_3} + \left( \frac{\frac{3n}{4} - F}{f_{Q_3}} \right) i = 20.5 + \left( \frac{45 - 44}{7} \right) 3 = 20.9286 \]

MEASURE OF SKEWNESS

Used to determine the skewness of data (symmetric, left skewed, right skewed). Also called the Skewness Coefficient or Pearson Coefficient of Skewness.

FORMULA
\[ S_k = \frac{\text{Mean} - \text{Mode}}{s} \quad \text{OR} \quad S_k = \frac{3(\text{Mean} - \text{Median})}{s} \]

If Sk is positive, the distribution is right skewed.
If Sk is negative, the distribution is left skewed.
If Sk = 0, the distribution is symmetric.

If Sk takes a value in (-0.9999, -0.0001) or (0.0001, 0.9999), the distribution is approximately symmetric.

Example 32
The duration of stay of cancer patients warded in Hospital Seberang Jaya is recorded in a frequency distribution. From the record, the mean is 28 days, the median is 25 days and the mode is 23 days. The standard deviation is 4.2 days.
What is the type of distribution? Find the skewness coefficient.

Solution:
This distribution is right skewed because the mean is the largest value.

\[ S_k = \frac{\text{Mean} - \text{Mode}}{s} = \frac{28 - 23}{4.2} = 1.1905 \]

OR

\[ S_k = \frac{3(\text{Mean} - \text{Median})}{s} = \frac{3(28 - 25)}{4.2} = 2.1429 \]

So, from the Sk value this distribution is right skewed.
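Both versions of the coefficient can be checked with a short Python sketch. The function names `skewness_from_mode`, `skewness_from_median` and `describe_skewness` are hypothetical, and the classification only follows the sign rule (positive, negative, zero) stated in these notes.

```python
def skewness_from_mode(mean, mode, std_dev):
    """Pearson coefficient of skewness: (mean - mode) / s."""
    return (mean - mode) / std_dev


def skewness_from_median(mean, median, std_dev):
    """Pearson coefficient of skewness: 3 * (mean - median) / s."""
    return 3 * (mean - median) / std_dev


def describe_skewness(sk):
    """Classify by the sign of Sk, as in the rule given in the notes."""
    if sk > 0:
        return "right skewed"
    if sk < 0:
        return "left skewed"
    return "symmetric"


# Hospital stay figures: mean 28, median 25, mode 23, s = 4.2 days.
sk_mode = skewness_from_mode(28, 23, 4.2)
sk_median = skewness_from_median(28, 25, 4.2)
print(round(sk_mode, 4), round(sk_median, 4), describe_skewness(sk_mode))
```

The two formulas give different magnitudes here, but both are positive and so point to the same direction of skew.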

EXERCISE
1. A student wants to study the level of satisfaction with the price of a product at Queen Supermarket. She takes a simple random sample of 100 customers and asks them whether they are very satisfied, satisfied, not sure, not satisfied, or very not satisfied. State:

Population: All customers at Queen Supermarket
Sample: 100 customers at Queen Supermarket
Variable: Satisfaction
Type of variable: Qualitative variable
Data value: Very satisfied / satisfied / not sure / not satisfied / very not satisfied
Type of data collection in the survey: Face to face