
SUMMARIZING AND ORGANIZING DATA

Brig ® Naila Azam


Learning Objectives

Students will be able to


 Understand data cleaning
 Describe various methods of data organization
 Make a frequency distribution table
Treatment of Data

 Data Cleaning
 Summarization and presentation
 Data Analysis
Data Handling
 Data Collection
 Data Storage
 Data Protection
 Data Retention
 Data Analysis
 Data Sharing
 Data Reporting

WHAT SENSE CAN BE MADE OUT OF
THIS DATA?

143 142 140 144 142 141


142 143 144 144 145 145
141 142 143 143 141 141
142 144 144 142 143 143
143 146 143 144 144 145

(Sodium concentration mEq/l)


Summarization & Organization of data

 Done to bring out important points clearly and strikingly

 To display information in a form that is easy to read and comprehend


Organizing Data

 Two ways to organize and present data

 Tables

 Graphs, charts and diagrams


Methods for Organizing Data
 Ordering data
   Tallies
   Stem and leaf displays
 Grouping data
   Frequency distributions
 Summarizing data
   Measures of central tendency
   Measures of dispersion
   Box-and-whiskers plots
 Displaying data
   Tables
   Histograms, bar diagrams
   Pie charts
   Scatterplots
   Graphs
Ordering Data
 Example: ages of graduate students (n=10)
 Suppose the unordered data were:
 35, 40, 52, 27, 31, 42, 43, 28, 50, 35
 Data could be ordered by hand:
 27, 28, 31, 35, 35, 40, 42, 43, 50, 52
 Ordering data by hand can be tedious, especially
when there is a large number of observations

 Alternatives to this method are:


 Tallies
 Stem and leaf displays
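
For a data set too large to order by hand, the same ordering can be done by machine. A minimal sketch in Python, using the ten ages from the example above (the variable names are illustrative only):

```python
# Ages of the ten graduate students, in the order they were recorded
ages = [35, 40, 52, 27, 31, 42, 43, 28, 50, 35]

# sorted() returns a new list ordered from smallest to largest
ordered = sorted(ages)
print(ordered)
```

The result should match the hand-ordered list shown on this slide.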
Tallies of Data

 Advantage
 Provide information regarding the frequency of
observations in groups or categories
 Disadvantage
 The actual values of observations within groups are not retained
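
A tally can be sketched in a few lines of Python. The example below reuses the ten graduate-student ages and tallies them by decade; the choice of ten-year groups and the helper name decade_label are assumptions made for illustration, not part of the original slides.

```python
from collections import Counter

ages = [35, 40, 52, 27, 31, 42, 43, 28, 50, 35]

def decade_label(age):
    """Return the ten-year group an age falls in, e.g. 27 -> '20-29'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

# Count how many observations fall in each group
tally = Counter(decade_label(a) for a in ages)

# Print one tally mark per observation, followed by the count
for group in sorted(tally):
    print(f"{group}  {'|' * tally[group]}  ({tally[group]})")
```

Note that the output keeps only the counts per group: once tallied, the individual ages (27 versus 28, say) can no longer be read off, which is the disadvantage listed above.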
Stem-and-Leaf Displays

 If you have a set of observations, there are a number of ways to order those observations
 Example: Ages of Graduate Certificate Students
35, 40, 52, 27, 31, 42, 43, 28, 50, 35
 You could order the observations by hand
 Alternatively, you could use a stem and leaf display to record and order your observations
Ordering Data with Stem-and-Leaf Displays
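
One way to build such a display is sketched below in Python for the ten ages listed earlier. It assumes two-digit values, so the tens digit serves as the stem and the units digit as the leaf; other data would need a different split.

```python
from collections import defaultdict

ages = [35, 40, 52, 27, 31, 42, 43, 28, 50, 35]

# Map each stem (tens digit) to its list of leaves (units digits).
# Sorting first means the leaves arrive in ascending order.
stems = defaultdict(list)
for age in sorted(ages):
    stems[age // 10].append(age % 10)

# One row per stem: "stem | leaves"
for stem in sorted(stems):
    leaves = " ".join(str(leaf) for leaf in stems[stem])
    print(f"{stem} | {leaves}")
```

Unlike a tally, every original value can be recovered from the display by joining a stem with each of its leaves.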
SIMPLE TABLE

 Systematic arrangement of data in columns and rows
 First step before data are used for analysis & interpretation
 Should be numbered & titled (title brief and self-explanatory)
 Headings of rows & columns should be clear & concise
 Footnotes and explanatory notes may be given
 Heading goes above the table and the data source below it
 Typically used to display numerical data

Table 1: Population of different cities of Pakistan*

City          Population
Rawalpindi    100,000
Islamabad     80,000
Lahore        120,000

*Source: Federal Bureau of Statistics


Let's Recap…
Methods of Organizing/Ordering Data
 Tallies

 Stem and Leaf Displays

 Simple table
FREQUENCY DISTRIBUTION

 It is the process by which a long series of observations is systematically arranged & recorded to enable analysis & interpretation
 Most convenient way of summarizing data
 The range of all values is divided into ordered classes & the number of observations that fall in each class is determined
 When the number of distinct values is small (i.e. the range of the data is not wide), the numerical values themselves may be used to define the classes
FREQUENCY DISTRIBUTION

Value Frequency
140 1
141 4
142 6
143 8
144 7
145 3
146 1
Total 30
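
The table above can be reproduced from the 30 sodium concentrations (mEq/l) listed earlier. The following is a small sketch using Python's collections.Counter; because the range is narrow, each distinct value is used as its own class.

```python
from collections import Counter

# The 30 sodium concentrations (mEq/l) from the earlier slide
sodium = [
    143, 142, 140, 144, 142, 141,
    142, 143, 144, 144, 145, 145,
    141, 142, 143, 143, 141, 141,
    142, 144, 144, 142, 143, 143,
    143, 146, 143, 144, 144, 145,
]

# Count how often each distinct value occurs
freq = Counter(sodium)

print("Value  Frequency")
for value in sorted(freq):
    print(f"{value:>5}  {freq[value]:>9}")
print(f"Total  {sum(freq.values()):>9}")
```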
ANOTHER EXAMPLE

68 63 42 27 30 36 43
28 32 79 27 22 23 49
24 25 44 65 43 25 12
74 51 36 42 28 31
28 25 45 12 57 51
12 32 49 38 42 27
19 31 50 38 21 16 24
69 47 23 22 43 27
49 28 23 19 46 30
STEPS
Data Arrangement
o Arrange the data into an array of observations from smallest to largest
o The range of values is divided into ordered class intervals (CI) & the observations falling in each are counted as frequencies (f)
Class Intervals
o A class interval is a division of the range into a number of arbitrary but usually equal & non-overlapping segments

 The first class interval should contain the smallest observation and the last one the largest observation
 The class intervals should be equal, i.e. of the same width
 Class intervals should be mutually exclusive, i.e. non-overlapping
 Determining the number and width of class intervals?
 The number of classes may range from 5 to 15
 Calculate the width of the class intervals by dividing the range by the number of classes required
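
As a rough illustration of the width calculation, the sketch below uses the smallest (12) and largest (79) values in the example data. The choice of 7 classes is an assumption made here for illustration; any number from 5 to 15 could be tried.

```python
import math

smallest, largest = 12, 79   # extremes of the example data
n_classes = 7                # assumed number of classes (5 to 15 is acceptable)

# Range = largest observation - smallest observation
data_range = largest - smallest

# Width = range / number of classes, rounded up to a whole number
width = math.ceil(data_range / n_classes)

print(f"Range = {data_range}, class width = {width}")
```

A width of 10 is convenient here because the classes can then start at a round number such as 10.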
STEPS
Class Interval   Frequency
10 – 19          5
20 – 29          19
30 – 39          10
40 – 49          13
50 – 59          4
60 – 69          4
70 – 79          2
Total            57
STEPS
Interval Width
o Number of units between the upper & lower real limits of a class (e.g. 19.5 – 9.5 = 10)
o Two types of limits:
 Class limits/apparent limits (10 – 19, 20 – 29)
 Class boundaries/real limits (9.5 – 19.5, 19.5 – 29.5)
o Class Boundaries/Real limits
 Determine the point at which the lowest class interval should begin
 Can include values not present in the data
 Do not leave intervals with zero observations
 Record the total number of observations at the foot of the frequency column
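
The relation between apparent limits, real limits and interval width can be checked with a short sketch. It assumes whole-number data, so each real limit lies half a unit beyond the apparent limit; data recorded to one decimal place would use 0.05 instead.

```python
# Apparent class limits from the grouped table
class_limits = [(10, 19), (20, 29), (30, 39), (40, 49),
                (50, 59), (60, 69), (70, 79)]

# Real limits (class boundaries): half a unit below and above (assumes integer data)
boundaries = [(low - 0.5, high + 0.5) for low, high in class_limits]

# Interval width = upper real limit - lower real limit
widths = [upper - lower for lower, upper in boundaries]

for (low, high), (lower, upper), w in zip(class_limits, boundaries, widths):
    print(f"{low}-{high}: real limits {lower} to {upper}, width {w}")
```

Because adjacent classes share a boundary (19.5 ends one class and begins the next), no gap is left between intervals.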
ANOTHER TERMINOLOGY

Class Mark

o Mid point

o Denoted by X

 (UL + LL)/2, where UL and LL are the upper and lower limits of the class
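
A quick sketch of the class mark calculation for the class intervals used in the grouped table (10 – 19 up to 70 – 79):

```python
class_limits = [(10, 19), (20, 29), (30, 39), (40, 49),
                (50, 59), (60, 69), (70, 79)]

# Class mark X = (upper limit + lower limit) / 2
class_marks = [(ll + ul) / 2 for ll, ul in class_limits]
print(class_marks)
```

The same mid points are obtained from the real limits, since (9.5 + 19.5)/2 is also 14.5.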

MERITS OF FREQUENCY DISTRIBUTION
 It shows at a glance how many individual observations are in a group & where the main concentration lies
 Shows the range & shape of the distribution
 It shows the proportion of the population or sample with certain characteristics
DEMERITS

 Precision in the resulting statistical calculations is lost, because these calculations assume that the frequencies are evenly distributed through the range of each interval, which might not be the case
 Different groupings can result from the same frequency distribution
 Likewise, different frequency distributions can have similar groupings
Time to Recap…
Methods of Ordering Data

 Tallies

 Stem and leaf display

 Simple Table

 Frequency Distribution table


OTHER FEATURES OF FREQUENCY
DISTRIBUTION

Relative Frequency
 It is the fraction of items (out of the total number) belonging to that class
 Divide the absolute frequency of that class by the total number of observations
 Can be used for comparing the frequency distributions of two or more groups of individuals
 The sum of all the relative frequencies will always be ….. ?
RELATIVE FREQUENCY

Class Interval   Frequency   Relative Frequency
10 – 19          5           0.087 (5/57)
20 – 29          19          0.333 (19/57)
30 – 39          10          0.175 (10/57)
40 – 49          13          0.228 (13/57)
50 – 59          4           0.07 (4/57)
60 – 69          4           0.07 (4/57)
70 – 79          2           0.035 (2/57)
Total            57          0.998 (equals 1 apart from rounding)
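
The relative frequencies can be computed directly from the class frequencies. A minimal sketch, with the frequencies typed in from the table above:

```python
intervals = ["10-19", "20-29", "30-39", "40-49", "50-59", "60-69", "70-79"]
frequencies = [5, 19, 10, 13, 4, 4, 2]

total = sum(frequencies)

# Relative frequency = class frequency / total number of observations
relative = [f / total for f in frequencies]

for ci, f, r in zip(intervals, frequencies, relative):
    print(f"{ci}  {f:>2}  {r:.3f}")
print(f"Total  {total}  {sum(relative):.3f}")
```

Working with unrounded values, the column adds up to 1; a printed total slightly below 1 only reflects the rounding of individual entries.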
OTHER FEATURES OF FREQUENCY
DISTRIBUTION

Percentage Frequency
 It is the percentage of items (out of the total number) belonging to that class (%age relative to total cases)
 Obtained by multiplying the relative frequency by 100
 The sum of all the percentage frequencies is always 100
 Also shows in which group 1/3rd or 1/4th of the cases lie
PERCENTAGE DISTRIBUTION

Class Int Freq Rel Freq %age Freq


10 – 19 5 0.087 8.7
20 – 29 19 0.333 33.3
30 – 39 10 0.175 17.5
40 – 49 13 0.228 22.8
50 – 59 4 0.07 7
60 – 69 4 0.07 7
70 – 79 2 0.035 3.5
Total 57 0.998 99.8
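
Percentage frequencies follow from the same class frequencies. The sketch below also picks out the class holding the largest share of cases, as one illustrative way of seeing where roughly one third of the cases lie.

```python
intervals = ["10-19", "20-29", "30-39", "40-49", "50-59", "60-69", "70-79"]
frequencies = [5, 19, 10, 13, 4, 4, 2]

total = sum(frequencies)

# Percentage frequency = relative frequency x 100
percentages = [100 * f / total for f in frequencies]

for ci, p in zip(intervals, percentages):
    print(f"{ci}  {p:.1f}%")

# Class with the largest share of cases
largest_class = intervals[percentages.index(max(percentages))]
print("Largest share:", largest_class)
```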
OTHER FEATURES OF FREQUENCY
DISTRIBUTION

Cumulative Frequency
 It is obtained by adding the frequency of each class to the frequencies of all preceding classes
 The first cumulative frequency is the same as the first class frequency; each subsequent one is obtained by adding the frequency of the next class
 It can be tabulated for absolute frequency, relative frequency & percentage frequency
 The last cumulative frequency will be equal to the total number, i.e. the absolute frequency total
CUMULATIVE FREQUENCY
Class Int   Freq   Cum Freq   Rel Freq   Cum Rel Freq   %age Freq   Cum %age Freq
10 – 19     5      5          0.0877     0.0877         8.7         8.7
20 – 29     19     24         0.3333     0.421          33.3        42.1
30 – 39     10     34         0.1754     0.5964         17.5        59.64
40 – 49     13     47         0.2281     0.8245         22.8        82.45
50 – 59     4      51         0.0702     0.8947         7           89.47
60 – 69     4      55         0.0702     0.9649         7           96.49
70 – 79     2      57         0.0351     1.0000         3.5         100
Total       57                1.0000
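
The cumulative columns can be generated with a running total. A short sketch using itertools.accumulate from the Python standard library, with the class frequencies taken from the table:

```python
from itertools import accumulate

frequencies = [5, 19, 10, 13, 4, 4, 2]
total = sum(frequencies)

# Running total: each entry adds the current class to all preceding classes
cum_freq = list(accumulate(frequencies))

# Cumulative relative and cumulative percentage frequencies
cum_rel = [c / total for c in cum_freq]
cum_pct = [100 * r for r in cum_rel]

for f, c, r, p in zip(frequencies, cum_freq, cum_rel, cum_pct):
    print(f"{f:>2}  {c:>2}  {r:.4f}  {p:.2f}")
```

The last entry of each cumulative column is the total: 57 observations, a relative frequency of 1, and 100 percent.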
Frequency Distribution
Example
In a survey of 20 patients who smoked, the
following data were obtained. Each value
represents the number of cigarettes the
patient smoked per day. Construct a
frequency distribution using six classes.
(The data is given on the next slide)
 Step 1: Find the highest and lowest
values: H = 22 and L = 5.
 Step 2: Find the range:
R = H – L = 22 – 5 = 17.
 Step 3: Select the number of classes
desired. In this case it is equal to 6.

 Step 4: Find the class width by dividing the range by the number of classes. Width = 17/6 = 2.83. This value is rounded up to 3.
 Step 5: Select a starting point for the lowest class limit. For convenience, this value is chosen to be 5, the smallest data value.
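
Steps 1 to 5 can be traced in a few lines of Python. The sketch below uses only the figures quoted in the steps (H = 22, L = 5, six classes); the class limits it prints assume whole-number data, so each class covers three consecutive values.

```python
import math

highest, lowest = 22, 5   # Step 1: highest and lowest values
n_classes = 6             # Step 3: number of classes desired

# Step 2: range
data_range = highest - lowest

# Step 4: class width, rounded up to a whole number
width = math.ceil(data_range / n_classes)

# Step 5: start at the smallest value and build the class limits
# (assumes integer data: each class spans `width` consecutive values)
classes = [(lowest + i * width, lowest + (i + 1) * width - 1)
           for i in range(n_classes)]

print(f"Range = {data_range}, width = {width}")
for low, high in classes:
    print(f"{low} - {high}")
```

With these choices the last class ends exactly at 22, so the highest observation is included.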
ASSIGNMENT 1

 Thirty AA batteries were tested to determine how long they would last. The results, to the nearest minute, were recorded as follows:
 423, 369, 387, 411, 393, 394, 371, 377, 389, 409, 392, 408, 431, 401, 363, 391, 405, 382, 400, 381, 399, 415, 428, 422, 396, 372, 410, 419, 386, 390
 Use the steps to construct a frequency distribution table with frequency, cumulative frequency, relative frequency, percentage frequency, cumulative relative frequency
ASSIGNMENT 2

 Ages in years of 25 students in a nursing class were noted as under in 2017:
19, 20, 23, 18, 21, 19, 19, 18, 20, 18, 20, 23, 24, 20, 22, 20, 25, 18, 20, 19, 18, 17, 20, 19, 20
 Use the steps to construct a frequency distribution table with frequency, cumulative frequency, relative frequency, percentage frequency, cumulative relative frequency
ASSIGNMENT 3

 There are 2000 households in a village. 400 use well water, 300 water from a river, 800 from a pond and 500 from hand pumps.
 Construct a pie chart to present the above data.

THANK YOU
