
UNIT II

TABULATION AND PRESENTATION OF DATA

DATA:

• Data is a set of values of qualitative or quantitative variables.
• Information in its raw, unorganized form is called data.
• Data are known facts or things from which conclusions may be drawn.

COLLECTION OF DATA

The process of counting or enumerating units and recording the results systematically is called the
“Collection of Data”.

• Investigator/Researcher – the person who collects the data.
• Respondents/Informants – the persons who provide the data or respond to the
questions.

STATISTICAL DATA

1. Primary Data
2. Secondary Data

Based on measurement it can be classified into:

a. Qualitative
b. Quantitative

PRIMARY DATA

Data collected for the first time by the investigator/researcher.

Methods of Primary Data Collection:

1. Observation method: investigators go to the field personally, observe, enquire and
collect the information from the respondents.
Observation can be further divided into participative and non-participative.
2. Direct oral interview: the enumerator interviews the respondents face to face, by
telephone, etc., and records the answers.
3. Information through agencies: local agencies or correspondents in different parts of
the area collect the information and send their reports periodically to the investigators.
4. Mail questionnaire: a set of questions sent to the respondents by mail.
5. Schedules: a set of questions filled in through an enumerator.
Enumerator: a person who collects the data on behalf of the researcher and who should
be trained for the purpose.
Ex: Census

SECONDARY DATA

Readily available data that has already been collected by some other investigator.

1. Published Sources
• International publications
• Official publications (Central Government, State Government)
• Semi-official publications (e.g. IIM B)
• Publications of commercial and financial institutions
• Publications of research institutions
• Committee reports
• Newspapers and journals
2. Unpublished Sources
• Not published by agencies. Such data is not meant for the public and is instead used for
internal purposes.
• Ex: private institutions (HR policy etc.).

SAMPLING DESIGN

Types of Population

a) Finite Population – the number of units is countable.
b) Infinite Population – the number of units is uncountable.

1. Census Method: involves every unit of the population.

Merits of Census:

i. Data is collected from each unit of the population.
ii. Intensive study is possible.
iii. The data can be used for other purposes.

Demerits of Census:

i. It involves huge costs.
ii. Because of the expense, it is usually conducted by the government and not by individuals.
iii. More money and labour are involved.
iv. Training charges are incurred.

2. Sample Method: a sample that represents the whole population is selected, for example at
random, and data is collected only from that sample.

Methods of Sampling:

1. Probability Sampling
2. Non – Probability Sampling
1. Probability Sampling: Probability sampling is defined as a sampling technique in
which the researcher chooses samples from a larger population using a method based
on the theory of probability.
i. Simple Random Sampling: units are selected purely at random, with no other
criteria or technique.
• Lottery System: every unit is written on a slip and the slips are picked at random.
• Table of random numbers: the units are numbered, and the units whose numbers are
read off a table of random numbers are selected.
ii. Stratified Random Sampling: the population is classified into small homogeneous groups and
a sample is picked from each group. The homogeneous groups are called strata.
iii. Systematic Random Sampling: a starting unit is selected at random and the remaining
units are picked by following a fixed system. Eg: the 4th house from each of 10
streets.

iv. Cluster Sampling: In cluster sampling, researchers divide a population into
smaller groups known as clusters. They then randomly select among these
clusters to form a sample.

Cluster sampling is a method of probability sampling that is often used to
study large populations, particularly those that are widely geographically
dispersed. Researchers usually use pre-existing units such as schools or cities
as their clusters.
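The four probability sampling methods above can be sketched in Python. This is only an illustration: the population of 100 numbered houses, the "urban"/"rural" strata, the street clusters and all function names are assumptions made for the example, not part of the notes.

```python
import random

def simple_random_sample(units, n, rng):
    """Pick n units purely at random, without replacement (like a lottery)."""
    return rng.sample(units, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Pick n_per_stratum units at random from every stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def systematic_sample(units, step, rng):
    """Pick a random start among the first `step` units, then every step-th unit."""
    start = rng.randrange(step)
    return units[start::step]

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole clusters at random and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

rng = random.Random(42)            # fixed seed so the sketch is repeatable
houses = list(range(1, 101))       # hypothetical population: houses 1..100
strata = {"urban": houses[:60], "rural": houses[60:]}
clusters = {f"street_{i}": houses[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(houses, 10, rng)
strat = stratified_sample(strata, 5, rng)
syst = systematic_sample(houses, 10, rng)
clus = cluster_sample(clusters, 2, rng)
```

Note that the stratified version samples from every group, while the cluster version keeps only a few groups but takes all of their units.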

2. Non – probability sampling
Not every unit is given an equal opportunity of being selected. It is typically used when the
population is infinite or cannot be listed completely.
i. Judgement/Purposive sampling: Judgmental sampling, also called purposive
sampling or authoritative sampling, is a non-probability sampling technique in
which the sample members are chosen only based on the researcher’s
knowledge and judgment.
ii. Quota Sampling: This is one of the most common forms of non-probability
sampling. Sampling is done until a specific number of units (quotas) for
various sub-populations have been selected. Since there are no rules as to how
these quotas are to be filled, quota sampling is a means for satisfying sample
size objectives for certain sub-populations.
Quota sampling is somewhat similar to stratified sampling in that similar units
are grouped together. However, it differs in how the units are selected.
In probability sampling, the units are selected randomly while in quota
sampling it is usually left up to the interviewer to decide who is sampled.
iii. Convenience Sampling: Convenience sampling is defined as a method adopted
by researchers where they collect market research data from a conveniently
available pool of respondents. It is the most commonly used sampling
technique as it’s incredibly prompt, uncomplicated, and economical. In many
cases, members are readily approachable to be a part of the sample.
iv. Snowball Sampling: It helps researchers find a sample when subjects are difficult
to locate. Researchers use this technique when the sample size is small and subjects are not
easily available. This sampling system works like a referral program. Once
the researchers find suitable subjects, they ask them for assistance in finding
similar subjects to form a sample of reasonably good size.

Non-probability sampling examples

Here are three simple examples of non-probability sampling.

1. An example of convenience sampling would be using student volunteers known to the
researcher. Researchers can send the survey to students belonging to a particular school,
college, or university, who then act as a sample.
2. In an organization, for studying the career goals of 500 employees, the sample
selected should technically have proportionate numbers of males and females, which means there
should be 250 males and 250 females. Since this is unlikely, the researcher selects the
groups or strata using quota sampling.
3. Researchers also use snowball sampling to conduct research involving a particular
illness or a rare disease. Researchers can ask subjects to refer them to
other subjects suffering from the same ailment to form a sample to carry out the
study.

Merits of Non- probability Sampling:

1. Non-probability sampling techniques are a more conducive and practical method for
researchers deploying surveys in the real world.

2. Getting responses using non-probability sampling is faster and more cost-effective
than probability sampling because the sample is known to the researcher. The
respondents respond quickly compared to people selected at random, as they have a
high motivation level to participate.

De-merits of Non – probability Sampling:

1. An unknown proportion of the entire population is not included in the sample group,
i.e. the sample lacks representation of the entire population.
2. A lower level of generalization of research findings compared to probability
sampling.
3. Difficulties in estimating sampling variability and identifying possible bias.

CLASSIFICATION OF DATA

It is the process of arranging the available facts into homogeneous groups or classes according
to resemblance or similarities.

Definition: According to Secrist, “classification is the process of arranging data into sequences
and groups according to their common characteristics or separating them into different related
parts.”

Objectives of Classification:

1. To present the complex, scattered data in a concise, logical, and understandable form.
2. To make a comparative study possible.
3. To remove irrelevant details from the data and make possible the tabulation of data and
its further analysis.
4. To make possible generalizations of the data.
5. To pinpoint the most significant features of the data at a glance.
6. To know similarities and dissimilarities.
7. To find out relationships.
8. To facilitate statistical treatment.

Characteristic/Essentials of classification

1. Exhaustive: the classification must be exhaustive so that every unit of the distribution
may find a place in one group or another.
2. Suitability: classification must conform to the objectives of the investigation.
3. Homogeneity: all the items constituting a group must be homogeneous.
4. Flexibility: classification should be flexible so that new facts and figures may be
easily adjusted.
5. Mutually exclusive: the classes must not overlap. Each item of the data must be found in
one class only.

Methods of Classification:

1. Chronological classification: refers to the classification of data based on time. For
instance, the sales of a firm in different years would be classified as under:

Year    Sales (lakh tons)
1996    90
1997    80
1998    150

2. Geographical Classification: it is based on the location differences of data. For
instance, data relating to the number of firms producing sewing machines in India
would be classified as under:

Place               Number of Firms
Punjab              150
Haryana             100
Himachal Pradesh    50

3. Conditional classification: here classification is done based on some condition
such as gender, literacy, marital status, etc. For eg: the number of students under
different faculties in the university:

Faculties     No. of Students
Humanities    100
Languages     150
Commerce      120
Management    80
Science       110

4. Classification according to attributes
a. Simple Classification
When the data is classified by the presence or absence of a single attribute, it is
known as simple classification. For eg: if a population is under study and
one wants to find how many females there are in the group,
two classes are formed: one possessing the attribute and the other
lacking it.
b. Manifold Classification
It means the classification of data based on more than one attribute. For
example, we may divide the population into male and female based on the
attribute of gender; each class may be further subdivided into literate and
illiterate based on the attribute of literacy, and so on.

Population
    Male
        Literate
        Illiterate
    Female
        Literate
        Illiterate

5. Quantitative Classification
When the data are classified based on a characteristic that can be measured, such
as age, income, height, production, etc., it is called quantitative classification.
Methods: there are two stages of quantitative classification
a. Raw Data
When the investigator has collected the data and not systematically arranged
the same, it is called raw data or unorganized data.
Ex: an investigator has collected the data regarding the weight of 20 workers
in a factory and his findings are shown in the table below:

40 90 64 53
70 62 88 59
47 90 35 36
73 34 10 84
40 59 12 20
In the raw form, the data is scattered, and even after careful study the
details are not easily understood. Presentation of data in its raw form
does not give any useful information.
b. Statistical Series:
Statistical series refers to data that is presented in some order and sequence. It
is an arrangement of data in different classes according to a given order.

Statistical Series:

Types of Series
    Individual Series
    Frequency Distribution Series
        Discrete Series
        Continuous Series

a. Individual Series: under this method the values of all the units are shown separately.
Ex: the data on the workers' weights can be arranged in two forms:
• According to the code number of the workers.
• According to the magnitude of the weights of the workers (ascending or descending).

When the individual units are arranged in ascending or descending order, it is
called an array, or arraying of the data or figures.

The presentation, though better than the raw data, does not reduce the volume of
the data.

b. Frequency Distribution:
It is a summary presentation of the values of the variable according to their
magnitude, individually or in groups.
Tally bars are small vertical bars scored parallel to each other and put opposite to a
particular value or group of values to facilitate the counting of the frequencies.

(a) Discrete Series is a statistical series in which all the observations are listed
out along with their corresponding frequency in the form of a table. All the
observations may not have the same frequency.
(b) Continuous Series: Continuous Series is a statistical series in which all the
class intervals along with their corresponding frequency are listed out in
the form of a table. All the class intervals may not have the same
frequency.

Frequency    Cumulative Frequency    More than CF
4            4                       50
11           15                      46
17           32                      35
13           45                      18
5            50                      5
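As a minimal sketch, both cumulative columns in the table above can be rebuilt from the frequency column with running totals. Python and the variable names are assumptions made for illustration.

```python
from itertools import accumulate

frequencies = [4, 11, 17, 13, 5]

# Less than CF: running total from the first class downward.
less_than_cf = list(accumulate(frequencies))

# More than CF: running total from the last class upward.
more_than_cf = list(accumulate(reversed(frequencies)))[::-1]

print(less_than_cf)
print(more_than_cf)
```

The last less than value and the first more than value both equal the total frequency, which is a quick check on the arithmetic.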

CONTINUOUS SERIES (GROUPED SERIES)

Important points:

• Class interval – each class or group into which the values of the variable are
classified to condense the data. It begins with a lower limit and ends with an upper
limit.
• Class limits – the two end values of a class interval. The smaller limit is called the
lower limit and the larger limit is called the upper limit.
• Inclusive class interval – a class in which both class limits are included while
counting the frequencies.
• Exclusive class interval – a class in which the lower limit is included and the
upper limit is excluded while counting the frequencies.
• Magnitude of class interval – the difference between the lower limit and the upper
limit of the class interval.
• Mid value (mid point/class mark) – the center point of the class interval, which is
exactly in the middle of its two extreme limits or boundaries.
• Class frequency – the number of observations corresponding to a specific class, i.e.
the rate of occurrence of particular events or values relating to that class.
• Cumulative frequency – the running total of all the frequencies up to and including
the respective class interval, when the class intervals are in ascending or descending
order of values.
• Less than cumulative frequency – the running totals of the frequencies taken downward,
starting from the first frequency.
• More than cumulative frequency – the running totals of the frequencies taken upward,
starting from the last frequency.
• Empty class interval – a class that does not have any frequency.
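A few of these terms can be made concrete with a short Python sketch. The function names and the sample class 10-20 are hypothetical, chosen only for illustration.

```python
def magnitude(lower, upper):
    """Magnitude of a class interval: upper limit minus lower limit."""
    return upper - lower

def mid_value(lower, upper):
    """Mid value (class mark): the point halfway between the two limits."""
    return (lower + upper) / 2

def in_exclusive_class(value, lower, upper):
    """Exclusive method: the lower limit is counted, the upper limit is not."""
    return lower <= value < upper

def in_inclusive_class(value, lower, upper):
    """Inclusive method: both limits are counted."""
    return lower <= value <= upper

width = magnitude(10, 20)
mark = mid_value(10, 20)
# Under the exclusive method a value of 20 belongs to 20-30, not to 10-20.
twenty_in_10_20 = in_exclusive_class(20, 10, 20)
twenty_in_20_30 = in_exclusive_class(20, 20, 30)
# Under the inclusive method a value of 20 belongs to 11-20.
twenty_in_11_20 = in_inclusive_class(20, 11, 20)
```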

FREQUENCY DISTRIBUTION TABLE

A frequency distribution table is a chart that summarizes values and their frequency. It's a
useful way to organize data if you have a list of numbers that represent the frequency of a
certain outcome in a sample. A frequency distribution table has two columns. The first
column lists all the various outcomes that occur in the data, and the second column lists the
frequency of each outcome. Putting this kind of data into a table helps make it simpler to
understand and analyze.

Two types of frequency distribution table

1. Univariate Frequency Distribution: Univariate analysis involves analyzing one variable at
a time. Frequency distributions show the numbers and percentages of people or items that
fall into different categories.

2. Bivariate Frequency Distribution: A bivariate frequency distribution
shows how the values of two variables occur together, presented through a table or graph.

TABULATION

Meaning: Tabulation refers to the systematic arrangement of information in rows and
columns. Columns represent the vertical arrangement and rows represent the horizontal arrangement
of data.

Definition:

Tabulation involves the orderly and systematic presentation of numerical data in a form
devised to elucidate the problem under consideration.

Objectives of Tabulation

1. Simplification
2. Comparison
3. Provides a bird's-eye view of the data
4. Quick location of required data
5. Easy analysis of the data

Difference between tabulation of Data and Classification

1. Classification refers to the process of grouping the data, whereas tabulation refers to the
process of placing the classified data in columns and rows.
2. Classification deals with grouping the data into classes, whereas tabulation is carried out to
prepare the data for further statistical analysis.

Format of table

Title:
Table No:                                        Head note:

STUB HEADING      CAPTION HEADING
                  Column Entries    Column Entries
Stub Entries      Body Of
Stub Entries      The Table

Source:
Foot Note:

Parts of a Table

1. Table Number: A table should be numbered for identification and for future reference
especially when there are a large number of tables in a study.
2. Title: every table must be given a suitable title which describes the contents of the table.
The title should be clear and brief; it should be carefully worded and capable of clear
interpretation.
3. Date: the date of preparing a table should be written so that the reader can identify the
chronology (order of tables according to time) of the tables prepared.
4. Stub: stubs are the designations of the rows, i.e. the row headings. They are at the extreme left of
the table, explaining what the horizontal items represent.
5. Captions: they refer to column headings. They explain what the column items represent.
Under a caption there may be sub-captions.
6. Body of the table: the actual data are arranged in the body part of the table. It is the most
important part of the table.
7. Head note: it is a brief explanatory statement applying to all or a major part of the data in
the table. For eg: the unit of measurement, like Rs in crores, is written as a head note.
It is presented at the top right corner of a table.
8. Source: a note at the bottom of the table indicating the sources from which the data
contained in the table are collected.
9. Foot note: when there are irregularities in a table, when anything in it has
not been adequately explained, or when abbreviations are used, an
explanatory note is given at the bottom of the table.

Types of tables

1. Simple table: also called a one-way table, it shows only one characteristic of the data.
2. Complex or Manifold Table: when two or more characteristics are shown simultaneously
in a table, it is called a complex table or manifold table.
3. General purpose tables: tables which provide information for general use or reference.
They usually contain detailed information.
4. Special purpose tables: provide information for a particular purpose. They serve the
purpose of the particular group for which they have been prepared. They are brief in
nature and are targeted towards a particular objective.

Problems on frequency distribution

1. Marks scored by 30 students are given below:

41 55 48 47 53 48 33 32 42 55
44 38 60 65 71 80 41 53 47 48
55 20 31 34 42 51 35 35 26 25

a) Arrange the marks in ascending order
Ans: 20, 25, 26, 31, 32, 33, 34, 35, 35, 38, 41, 41, 42, 42, 44, 47, 47, 48, 48, 48, 51,
53, 53, 55, 55, 55, 60, 65, 71, 80.
b) Arrange the marks in descending order
Ans: 80, 71, 65, 60, 55, 55, 55, 53, 53, 51, 48, 48, 48, 47, 47, 44, 42, 42, 41, 41, 38,
35, 35, 34, 33, 32, 31, 26, 25, 20.
c) Convert the marks into a continuous series with class intervals of 10.

2. Marks of 60 students are given below. Prepare a frequency distribution table using the
inclusive method of class intervals with a class width of 5. Also calculate the less
than and more than cumulative frequencies.

48 27 38 13 10 5 49 35 26 1
25 33 47 9 19 46 22 17 35 20
3 8 31 45 25 19 40 19 45 18
20 41 39 15 9 40 15 37 29 30
47 16 48 30 40 10 25 20 37 47
12 5 44 32 16 20 2 45 17 34

Ans:

Class Interval    Frequency    Less than Cumulative    More than Cumulative
                               Frequency               Frequency
1 – 5             5            5                       60
6 – 10            5            10                      55
11 – 15           4            14                      50
16 – 20           12           26                      46
21 – 25           4            30                      34
26 – 30           5            35                      30
31 – 35           6            41                      25
36 – 40           7            48                      19
41 – 45           5            53                      12
46 – 50           7            60                      7
Total             60
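The table for problem 2 can be cross-checked with a short Python sketch of the inclusive method. The function name and its parameters are assumptions made for illustration; in the notes the counting is done by hand.

```python
from itertools import accumulate

marks = [
    48, 27, 38, 13, 10, 5, 49, 35, 26, 1,
    25, 33, 47, 9, 19, 46, 22, 17, 35, 20,
    3, 8, 31, 45, 25, 19, 40, 19, 45, 18,
    20, 41, 39, 15, 9, 40, 15, 37, 29, 30,
    47, 16, 48, 30, 40, 10, 25, 20, 37, 47,
    12, 5, 44, 32, 16, 20, 2, 45, 17, 34,
]

def inclusive_frequency_table(values, start, width, n_classes):
    """Build rows of (class limits, frequency, less than CF, more than CF)."""
    # Inclusive classes of the given width: 1-5, 6-10, ... (both limits counted).
    classes = [(start + i * width, start + (i + 1) * width - 1)
               for i in range(n_classes)]
    freqs = [sum(lo <= v <= hi for v in values) for lo, hi in classes]
    less_than = list(accumulate(freqs))
    more_than = list(accumulate(reversed(freqs)))[::-1]
    return list(zip(classes, freqs, less_than, more_than))

table = inclusive_frequency_table(marks, start=1, width=5, n_classes=10)
for (lo, hi), f, lt, mt in table:
    print(f"{lo:>2} - {hi:<2}  {f:>3}  {lt:>3}  {mt:>3}")
```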

3. The marks obtained by 50 students in an examination are given below

30 45 48 55 39 32 31 22 21 18
54 59 61 33 34 44 10 38 19 62
74 43 73 41 46 43 51 37 85 85
71 29 22 62 29 58 55 63 64 44
43 27 32 43 52 31 47 64 18 51
Prepare a frequency distribution table and calculate the cumulative frequency.

4. Form a continuous frequency table from the following data, using class intervals of 40-50,
50-60, etc.

90 78 86 51 96 104 51 78 50 72
68 106 79 76 49 77 92 84 76 42
74 70 69 65 80 54 79 73 58 91
65 60 77 78 67 50 84 76 110 53
74 40 60 42 82 41 61 75 115 81
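For problem 4, the class intervals 40-50, 50-60, etc. share their limits, so the exclusive method applies: a value of 50 is counted in 50-60, not in 40-50. A minimal Python sketch of that counting rule follows; the function name and the choice of eight classes (enough to cover the largest value, 115) are assumptions made for illustration.

```python
values = [
    90, 78, 86, 51, 96, 104, 51, 78, 50, 72,
    68, 106, 79, 76, 49, 77, 92, 84, 76, 42,
    74, 70, 69, 65, 80, 54, 79, 73, 58, 91,
    65, 60, 77, 78, 67, 50, 84, 76, 110, 53,
    74, 40, 60, 42, 82, 41, 61, 75, 115, 81,
]

def exclusive_frequency_table(data, start, width, n_classes):
    """Count values per class, including the lower limit and excluding the upper."""
    table = []
    for i in range(n_classes):
        lower = start + i * width
        upper = lower + width
        frequency = sum(lower <= v < upper for v in data)
        table.append(((lower, upper), frequency))
    return table

continuous_table = exclusive_frequency_table(values, start=40, width=10, n_classes=8)
for (lower, upper), frequency in continuous_table:
    print(f"{lower:>3} - {upper:<3}  {frequency:>2}")
```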
5. Marks of students in two subjects are given; prepare univariate and bivariate frequency
distribution tables.
Marks in accountancy: 24, 22, 21, 25, 23, 26, 21, 22, 23, 24, 25, 22.
Marks in statistics: 12, 18, 17, 14, 11, 15, 13, 16, 12, 13, 16, 18.

Ans:

Univariate distribution table: Marks in Accountancy

Marks    Tally Bars    Frequency
21       ||            2
22       |||           3
23       ||            2
24       ||            2
25       ||            2
26       |             1
Total                  12

Univariate distribution table: Marks in Statistics

Marks    Tally Bars    Frequency
11       |             1
12       ||            2
13       ||            2
14       |             1
15       |             1
16       ||            2
17       |             1
18       ||            2
Total                  12

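Bivariate distribution table (tally bars; a dash marks an empty cell)

Marks of        Marks of Statistics
Accountancy     11   12   13   14   15   16   17   18   Total
21              -    -    |    -    -    -    |    -    2
22              -    -    -    -    -    |    -    ||   3
23              |    |    -    -    -    -    -    -    2
24              -    |    |    -    -    -    -    -    2
25              -    -    -    |    -    |    -    -    2
26              -    -    -    -    |    -    -    -    1
Total           1    2    2    1    1    2    1    2    12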
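The univariate and bivariate counts for problem 5 can be reproduced with a short Python sketch. It assumes the two lists are paired student by student, in the order given; the variable names are illustrative.

```python
from collections import Counter

accountancy = [24, 22, 21, 25, 23, 26, 21, 22, 23, 24, 25, 22]
stat_marks = [12, 18, 17, 14, 11, 15, 13, 16, 12, 13, 16, 18]

# Univariate: count each variable on its own.
acc_freq = Counter(accountancy)
stat_freq = Counter(stat_marks)

# Bivariate: count each (accountancy, statistics) pair of marks.
joint_freq = Counter(zip(accountancy, stat_marks))

# Print the two-way table; missing pairs count as zero in a Counter.
cols = sorted(stat_freq)
print("Acc  " + " ".join(f"{c:>3}" for c in cols) + "  Total")
for a in sorted(acc_freq):
    cells = " ".join(f"{joint_freq[(a, c)]:>3}" for c in cols)
    print(f"{a:<4} {cells}  {acc_freq[a]:>5}")
```

Each row total of the bivariate table equals the univariate frequency for Accountancy, and each column total equals the univariate frequency for Statistics.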

Problems on tabulation:

6. Draw a blank table to show the gender of candidates appearing for the first year, second year
and third year exams of a university in the faculties of Arts, Science and Commerce in a
certain year.

Ans:
a) Gender: Male, Female
b) Year: I, II and III
c) Faculty: Arts, Science and Commerce

Table showing gender-wise distribution of candidates appearing in University Examination

            Gender
Faculty     Male                                    Female                                  Total
            I year   II year   III year   Total     I year   II year   III year   Total
Arts
Science
Commerce
Total

7. In the Lok Sabha, 600 members were present during the discussion on a
resolution put to vote, and 400 voted in favour of the resolution. The government members in
the house numbered 380, and 65 members belonging to the opposition voted in favour of the
resolution. Every member belonged to one of the 2 groups and there were no
absentees. Tabulate the information.

Ans:
a. Members: Ruling Party and Opposition Party
b. Vote: in favour and against

Voting pattern on a resolution in Lok Sabha

             Members
Vote         Ruling Party    Opposition Party    Total
In favour    335             65                  400
Against      45              155                 200
Total        380             220                 600
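Only four figures are given in problem 7; the rest of the table follows by subtraction. A minimal Python sketch of that arithmetic, with variable names chosen for illustration:

```python
# Figures given in the problem.
present = 600
in_favour = 400
ruling = 380
opposition_in_favour = 65

# Figures derived by subtraction.
opposition = present - ruling                          # members not in the ruling party
against = present - in_favour                          # everyone present voted
ruling_in_favour = in_favour - opposition_in_favour
ruling_against = ruling - ruling_in_favour
opposition_against = opposition - opposition_in_favour

print(ruling_in_favour, opposition_in_favour, in_favour)
print(ruling_against, opposition_against, against)
print(ruling, opposition, present)
```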

8. Present the following information in a suitable tabular form. In 2018, out of 2000 workers
in a factory 1550 were members of a trade union. The number of women workers
employed was 250, out of which 200 did not belong to any trade union. In 2019, the
number of union workers was 1725 of which 1600 were men. The number of non – union
workers was 380, among whom 155 were women.

Ans:
1. Trade Union: members and non – members
2. Year : 2018 and 2019
3. Gender: Male and Female

Table showing composition of Trade Union in a factory in 2018 and 2019

                 Year
Trade Union      2018                       2019
                 Male    Female    Total    Male    Female    Total
Members          1500    50        1550     1600    125       1725
Non – Members    250     200       450      225     155       380
Total            1750    250       2000     1825    280       2105

9. In a sample survey about coffee habits in two towns, the following information was received:
Town A: Females were 40%, total coffee drinkers were 45% and male non-coffee
drinkers were 20%.
Town B: Males were 55%, male non-coffee drinkers were 30% and female coffee
drinkers were 15%.

Ans:
a) Towns: Town A and Town B
b) Gender: Male and Female
c) Coffee Habits: Coffee Drinkers and Non – Coffee Drinkers

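Table showing coffee habits in Town A and Town B

(Figures in percentage)

                       Town
Coffee Habits          Town A                     Town B
                       Male    Female    Total    Male    Female    Total
Coffee Drinkers        40      5         45       25      15        40
Non – Coffee Drinkers  20      35        55       30      30        60
Total                  60      40        100      55      45        100

Note: In Town A, females are 40%, so males are 60%. Male non-coffee drinkers are 20%, so male
coffee drinkers are 60 - 20 = 40% and female coffee drinkers are 45 - 40 = 5%.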

10. Present the following information in a suitable form, supplying the figures not directly
given. In 2018, out of a total of 4000 workers in a factory, 3300 were members of a trade
union. The number of women workers employed was 500, out of which 400 did not belong
to any union.
In 2019, the number of workers in the union was 3450, of which 3200 were men. The
number of non-union workers was 760, of which 330 were women.
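The notes give no answer for problem 10. One way to supply the missing figures is the same subtraction used in problem 8, sketched here in Python; the function and variable names are assumptions made for illustration.

```python
def two_way_table(men_members, women_members, men_non_members, women_non_members):
    """Return every cell of the gender by membership table, totals included."""
    return {
        "members": (men_members, women_members, men_members + women_members),
        "non_members": (men_non_members, women_non_members,
                        men_non_members + women_non_members),
        "total": (men_members + men_non_members,
                  women_members + women_non_members,
                  men_members + women_members + men_non_members + women_non_members),
    }

# 2018: total workers, union members, women, and non-union women are given.
total_2018, members_2018, women_2018, women_non_2018 = 4000, 3300, 500, 400
women_members_2018 = women_2018 - women_non_2018
men_members_2018 = members_2018 - women_members_2018
men_non_2018 = (total_2018 - members_2018) - women_non_2018
table_2018 = two_way_table(men_members_2018, women_members_2018,
                           men_non_2018, women_non_2018)

# 2019: union members, union men, non-union workers, and non-union women are given.
members_2019, men_members_2019, non_2019, women_non_2019 = 3450, 3200, 760, 330
women_members_2019 = members_2019 - men_members_2019
men_non_2019 = non_2019 - women_non_2019
table_2019 = two_way_table(men_members_2019, women_members_2019,
                           men_non_2019, women_non_2019)

for year, table in (("2018", table_2018), ("2019", table_2019)):
    for row, (men, women, total) in table.items():
        print(year, row, men, women, total)
```

Each tuple is ordered (Male, Female, Total), matching the column layout used for problem 8.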
