
TYPES OF DATA AND GRAPHICAL / TABULAR REPRESENTATION

NIHAR RANJAN PANDA


INTRODUCTION
• Statistics may be defined as the science that deals
with the collection, presentation, analysis and
interpretation of numerical data

DESCRIPTIVE STATISTICS INFERENTIAL STATISTICS

DATA INFORMATION PRESENTATION


A set of values recorded on one or more
observational units is called data

Data should be processed

Data depiction, data summarization and data transformation

INFORMATION
Collected data should be
• Accurate (i.e. measures the true value of what is
under study)
• Valid (i.e. measures only what it is supposed to
measure)
• Precise (i.e. gives adequate detail of the
measurement)
• Reliable (i.e. repeatable)
Types of DATA
• Qualitative/ Quantitative
• Discrete/ Continuous/ Interval/ Ratio
• Primary/ Secondary
• Nominal/ Ordinal
Quantitative data:
• Also called measurement data
• Can be expressed as a number with or without a unit of measurement
• Eg: Height in cm, Hb in gm%, BP in mm of Hg, Weight in kg

Qualitative data:
• Represents a particular quality or attribute
• Expressed as numbers without units of measurement
• Eg: Religion, Sex, Blood group etc.
Discrete data:
• Here we always get a whole number.
• Ex: Number of beds in a hospital, number of malaria cases

Continuous data:
• It can take any value that is possible to measure,
including fractions
• Ex: Hb level, Ht, Wt.

WHAT IS IMPORTANT???
Interval:
• Has equal intervals between values, and those
intervals are meaningful. For example, a thermometer
might be marked in intervals of ten degrees
• Ex: Celsius temperature, IQ (intelligence scale)

Ratio:
• Exactly the same as the interval scale except that
zero on the scale means the quantity does not exist
(a true zero)
• Ex: Age, Weight, Height
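The practical difference between the interval and ratio scales can be checked with a little arithmetic. The sketch below is only an illustration: the temperatures and weights are made-up values, and the Kelvin conversion is brought in (it is not part of the slides) because Kelvin has a true zero.

```python
# Illustrative values only (not from the slides).
def celsius_to_kelvin(celsius):
    """Convert Celsius (interval scale) to Kelvin (true zero)."""
    return celsius + 273.15

t1_c, t2_c = 10.0, 20.0

# Differences are meaningful on an interval scale.
difference_c = t2_c - t1_c            # 10 degrees warmer

# A ratio of Celsius readings is NOT meaningful: 20 °C is not "twice as hot".
naive_ratio = t2_c / t1_c             # 2.0, but misleading
kelvin_ratio = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)

# Weight is on a ratio scale, so the ratio can be interpreted directly.
w1_kg, w2_kg = 40.0, 80.0
weight_ratio = w2_kg / w1_kg          # 80 kg really is twice 40 kg

print(f"Celsius ratio: {naive_ratio:.3f}, Kelvin ratio: {kelvin_ratio:.3f}")
print(f"Weight ratio:  {weight_ratio:.3f}")
```

The same two temperatures give a ratio of 2 in Celsius but only about 1.035 in Kelvin, which is why ratios should be reserved for scales whose zero means "none".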
Primary data:
• Data collected by the investigator himself/ herself for a
specific purpose
• Ex: Data collected by a student for his/her thesis or
research project
• Advantages:
– The investigator collects data specific to the problem
under study.
– There is no doubt about the quality of the data collected
(for the investigator).
– If required, it may be possible to obtain additional data
during the study period.
Secondary data:
• Data collected by someone else for some other purpose
(but being utilized by the investigator for another
purpose)
• Ex: Census data being used to analyze the impact of
education on career choice and earning
• Advantages of using Secondary data:
– The data are already there: no hassles of data collection
– It is less expensive
– The investigator is not personally responsible for the
quality of data (“I didn’t do it”)
Nominal data:
• The information or data fits into one of the
categories, but the categories cannot be ordered
• Categories without order
• Ex: Colour of eyes, Race, Gender

Ordinal data:
• A rank or order
• Here the categories can be ordered, but the
space or class interval between two categories
may not be the same
• Ex: Ranking in the class or exam, socioeconomic status (SES)
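A short sketch of what the nominal/ordinal distinction allows in practice. The eye-colour and SES values below are made up for illustration, and the three-level SES ordering is an assumption, not a classification taken from the slides.

```python
from collections import Counter

# Nominal: categories have no order, so counting and the mode are
# the only sensible summaries. (Made-up observations.)
eye_colour = ["Brown", "Blue", "Brown", "Green", "Brown"]
colour_counts = Counter(eye_colour)
most_common_colour = colour_counts.most_common(1)[0][0]

# Ordinal: categories can be ranked, but the gaps between them are
# not necessarily equal. The ordering here is hypothetical.
SES_ORDER = ["Lower", "Middle", "Upper"]
ses = ["Middle", "Lower", "Upper", "Middle", "Lower"]

# Sort by rank (position in SES_ORDER), not alphabetically.
ranked = sorted(ses, key=SES_ORDER.index)
median_category = ranked[len(ranked) // 2]   # middle observation (odd n)

print(most_common_colour)   # mode of the nominal variable
print(ranked)
print(median_category)      # a median is meaningful only because order exists
```

A median category can be reported for ordinal data, but a mean cannot, because the spacing between categories is undefined.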
QUESTION

A person's highest educational level is which type of variable?


• Continuous
• Discrete
• Ordinal
• Nominal

The number of motor-vehicle accidents on a particular stretch
of the national highway in a week is which type of variable?
• Continuous
• Discrete
• Nominal
• Ordinal
DATA

Quantitative: Discrete, Continuous, Interval, Ratio
Qualitative: Nominal, Ordinal


REPRESENTATION OF DATA
• Tabular
• Graphic
• Numeric
When to use Tables
• When you wish to show how a single category of
information varies when measured at different points
• When the dataset contains relatively few numbers
• When the precise value is crucial to your argument and
a graph would not convey the same level of precision
• For example: when it is important that the reader
knows that the result was 2.48 and not 2.45
• When you don’t wish the presence of one or two very
high or low numbers to detract from the message
contained in the rest of the dataset
Tabular Presentation
1. Each table must be numbered
2. A brief and self-explanatory title must be given to each table
3. The headings of columns and rows must be clear, sufficient, concise and
fully defined
4. The data must be presented according to size or importance,
chronologically, alphabetically or geographically
5. The table should not be too large
6. The classes should be fully defined and should not lead to any ambiguity
7. The classes should be exhaustive, i.e. should include all the given values
8. The classes should be mutually exclusive and non-overlapping
9. The classes should be of equal width, i.e. the class interval should be the same
10. The number of classes should be neither too large nor too small
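Rules 7 to 9 can be checked mechanically. The following is a minimal sketch, assuming each class is written as a half-open interval (lower ≤ x < upper); the function name `check_classes` and the sample values are hypothetical.

```python
def check_classes(classes, values):
    """Check class-interval rules for a frequency table.

    classes: list of (lower, upper) pairs, read as lower <= x < upper.
    values:  the observations that the table must accommodate.
    """
    ordered = sorted(classes)

    # Rule 8: no class may start before the previous one ends.
    mutually_exclusive = all(
        prev[1] <= nxt[0] for prev, nxt in zip(ordered, ordered[1:])
    )

    # Rule 9: every class has the same width.
    widths = {upper - lower for lower, upper in ordered}
    equal_width = len(widths) == 1

    # Rule 7: every observation falls into some class.
    exhaustive = all(
        any(lower <= v < upper for lower, upper in ordered) for v in values
    )

    return {
        "mutually_exclusive": mutually_exclusive,
        "equal_width": equal_width,
        "exhaustive": exhaustive,
    }


# Made-up example values.
good = check_classes([(120, 130), (130, 140), (140, 150)], [121, 129.5, 130, 149])
bad = check_classes([(0, 10), (5, 20)], [3, 25])
print(good)
print(bad)
```

Writing classes as half-open intervals is one way to make a boundary value such as 130 belong to exactly one class.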
Normal Range → 18.5 ≤ x < 25
Frequency distribution table with quantitative data:
Table 1: Fasting blood glucose level in diabetics at the time of diagnosis (n=78)

Fasting Glucose     n
120-129            12
130-139             8
140-149            10
150-159            10
160-169            15
170-179            18
180-189             5
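A frequency table is often easier to read with a percentage column. The sketch below uses the counts from Table 1; the percentage column itself is an addition for illustration and does not appear in the original table.

```python
# Class labels and counts copied from Table 1 (fasting blood glucose).
table = [
    ("120-129", 12),
    ("130-139", 8),
    ("140-149", 10),
    ("150-159", 10),
    ("160-169", 15),
    ("170-179", 18),
    ("180-189", 5),
]

total = sum(n for _, n in table)   # should match the stated n=78

# Relative frequency of each class, as a percentage rounded to 1 decimal.
rows = [(label, n, round(100 * n / total, 1)) for label, n in table]

print(f"{'Class':<10}{'n':>4}{'%':>8}")
for label, n, pct in rows:
    print(f"{label:<10}{n:>4}{pct:>8.1f}")
print(f"{'Total':<10}{total:>4}{100.0:>8.1f}")
```

Because each percentage is rounded separately, the printed column may not add up to exactly 100.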
Cross- Tabulation
• Table 2: Fasting blood glucose level in
diabetics at the time of diagnosis (n=78)
Frequency distribution table with
qualitative data:
Table 1: Cases of malaria in adults and children in the months of
June and July 2010 in Nair Hospital (n=389)
EXAMPLE
This is a poor example because:
• The table lacks a title
• The source of the information is not provided
• Row titles overlap two lines
• The alphabetical listing of regions results in a non-numerical ordering
of data down the columns
EXAMPLE

This is a better example because:
• The table has a title
• The source of the information is provided
• Row titles fit on one line
• The alphabetical listing of regions results in a numerical
ordering of data down the columns
• Numbers are aligned
Graphical Presentation
• A graphical representation is a visual display of
data and statistical results. It is often more
effective than presenting data in tabular form
• Graphical representation helps to quantify, sort
and present data in a way that is
understandable to a wide variety of audiences
• Graphs also help in studying both time
series and frequency distributions, as they give
a clear account and picture of the problem
• Graphs are also easy to understand and
eye-catching
General Principles of Graphic
Presentation
• In a graph there are two lines called coordinate axes
• One is vertical, known as the Y axis, and the other is
horizontal, called the X axis
• These two lines are perpendicular to each other.
The point where they intersect is called ‘0’
or the Origin
• On the X axis, distances to the right of the origin have
a positive value and distances to the left of the origin have
a negative value
• On the Y axis, distances above the origin have a positive
value and distances below the origin have a negative value
• A graph should have a title, a legend and labelled axes
VARIOUS CHARTS AND DIAGRAMS

• Bar Diagram
• Histogram
• Frequency polygon
• Cumulative frequency curve/ Ogive
• Scatter diagram
• Line diagram
• Pie diagram
• Pictogram
• Stem and Leaf Plot
BAR DIAGRAM
• Bar charts are used for qualitative variables. The categories of the variable
are plotted as bars along the X-axis (horizontal), and the height of each bar
equals the percentage or frequency, which is plotted along the Y-axis (vertical).
• The width of the bars is kept constant for all the categories
• The space between the bars also remains constant throughout.
• The number of subjects, with the percentage in brackets, is written on
top of each bar
• Types:
– Simple
– Compound
– Component
SIMPLE BAR CHART
• When we draw a bar chart with only one
variable or a single group, it is called a simple
bar chart
COMPOUND BAR CHART
• When two variables or two groups are considered, it is called a multiple/
compound bar chart
• In a multiple bar chart, the two bars representing the two variables are drawn
adjacent to each other and an equal bar width is maintained
COMPONENT BAR CHART
• Bar chart wherein we have two qualitative variables which are further
segregated into different categories or components is called component
bar chart
• In this the total height of the bar corresponding to one variable is further
sub-divided into different components or categories of the other variable
HISTOGRAM
• A histogram is used for quantitative continuous
data. The class intervals are plotted on the X-axis
and the frequencies on the Y-axis
• It is very similar to the bar chart, with the
difference that the rectangles or bars are
adjacent (without gaps)
• It is used for presenting a class frequency table
(continuous data)
• It is a diagram consisting of rectangles whose area is
proportional to the frequency of a variable and
whose width is equal to the class interval
EXERCISE

Distribution of the subjects by Cholesterol level

Serum Cholesterol (mg/dl)    No. of Subjects    Percentage (%)
175-200                             3                 30
200-225                             3                 30
225-250                             2                 20
250-275                             1                 10
275-300                             1                 10
Total                              10                100
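One possible way to approach this exercise is sketched below as a text histogram built from the table above. The bar height is computed as frequency divided by class width, so that each rectangle's area equals its frequency; the function name `histogram_rows` is hypothetical, and a plotting library could be used instead of the text bars.

```python
# (lower limit, upper limit, number of subjects) from the cholesterol table.
classes = [
    (175, 200, 3),
    (200, 225, 3),
    (225, 250, 2),
    (250, 275, 1),
    (275, 300, 1),
]


def histogram_rows(classes):
    """Return (label, frequency, height) for each class interval.

    height = frequency / class width, so that height * width = frequency.
    """
    rows = []
    for lower, upper, freq in classes:
        width = upper - lower
        rows.append((f"{lower}-{upper}", freq, freq / width))
    return rows


rows = histogram_rows(classes)

# Text rendering: one '#' per subject, bars listed without gaps.
for label, freq, height in rows:
    print(f"{label:<9}|{'#' * freq} ({freq})")

# Area of each rectangle recovers the class frequency.
areas = [height * (upper - lower)
         for (lower, upper, _), (_, _, height) in zip(classes, rows)]
```

With equal class widths, as here, plotting the raw frequencies gives a histogram of the same shape; dividing by the width matters only when class widths differ.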
EXERCISE
FREQUENCY POLYGON AND CURVE
• Plot the variable along the X-axis and the
frequencies along the Y-axis
• It is derived from a histogram by connecting the
midpoints of the tops of the rectangles in the
histogram
• The line connecting the centres of the tops of the
histogram rectangles is called a frequency polygon
• If we draw a smooth freehand curve
passing through these points, the curve is
known as a frequency curve
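The points of a frequency polygon can be computed directly from a grouped table. This sketch reuses the serum cholesterol classes from the histogram exercise; closing the polygon on the X-axis, one class width beyond each end, is a common convention assumed here rather than something stated in the slides.

```python
# (lower limit, upper limit, frequency) from the cholesterol exercise.
classes = [
    (175, 200, 3),
    (200, 225, 3),
    (225, 250, 2),
    (250, 275, 1),
    (275, 300, 1),
]

# Each vertex sits above the class midpoint, at the class frequency.
points = [((lower + upper) / 2, freq) for lower, upper, freq in classes]

# Optional (assumed convention): anchor the polygon to the X-axis at the
# midpoints of the empty classes just before and just after the data.
width = classes[0][1] - classes[0][0]
closed = [(points[0][0] - width, 0)] + points + [(points[-1][0] + width, 0)]

for x, y in closed:
    print(f"midpoint {x:6.1f} -> frequency {y}")
```

Joining these points with straight lines gives the frequency polygon; smoothing the line by hand gives the frequency curve.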
(n=37)
CUMULATIVE FREQUENCY DIAGRAM
One can tell the number of patients that lie above or below a certain level
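As a sketch of how such a diagram is built, the running totals below are computed from the fasting blood glucose counts in Table 1 (n=78). The helper name `patients_up_to` is hypothetical, and it only answers for limits that coincide with a class boundary.

```python
from itertools import accumulate

# Upper class limits and frequencies from Table 1 (fasting blood glucose).
upper_limits = [129, 139, 149, 159, 169, 179, 189]
frequencies = [12, 8, 10, 10, 15, 18, 5]

# "Less than or equal to" cumulative frequencies: the Y values of the ogive.
cumulative = list(accumulate(frequencies))
ogive = dict(zip(upper_limits, cumulative))


def patients_up_to(limit):
    """Number of patients at or below a class upper limit."""
    return ogive[limit]


total = cumulative[-1]
below = patients_up_to(149)    # patients with glucose of 149 or less
above = total - below          # patients with glucose of 150 or more

print(cumulative)
print(f"At or below 149: {below}, above 149: {above}")
```

Plotting each upper class limit against its cumulative frequency and joining the points gives the ogive; reading between class limits would require interpolation, which this sketch does not attempt.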
Exercise
