Chapter 3

Data Collection
– Primary vs. secondary data
– Collection of primary data
– Observation method
– Interview
– Information through agencies
– Mailed questionnaires
– Schedules sent through enumerators
– Collection of secondary data

USE OF WORDS 'DATA' & 'INFORMATION'
DATUM (singular) or DATA (plural) refers to raw
numbers or other measures, usually discrete, that give
objective facts about events.

INFORMATION refers to what emerges when data
are processed, analyzed, interpreted and presented.

Information is data transformed (contextualized,
categorized, corrected, calculated, condensed) into a
message.
Categories of data
Any statistical data can be classified under two categories
depending upon the sources utilized:

1. Primary data
2. Secondary data

Categories of data: Primary
Primary data are data collected by the investigator
specifically for the purpose of a particular inquiry or
study.

Such data are original in character and are generated by
surveys conducted by individuals, research institutions
or other organisations.
Categories of data: Secondary
Secondary data are data that have already been
collected and analysed by some earlier agency for its own
use and are later used by a different agency.
Primary Data: Methods of Collection
– Observation method
– Interview
– Information through agencies
– Mailed questionnaires
– Schedules sent through enumerators

1. Observation
When data are collected by observation, the
investigator asks no questions.
Instead, the investigator observes the objects or
actions of interest.

Sometimes individuals make the observations; on
other occasions, mechanical devices observe and
record the desired information.

2. Interview
Personal interviews are those in which an
interviewer obtains information from respondents in
face-to-face meetings.

The information obtained by this method is likely to
be more accurate because the interviewer can clear
up doubts, can cross-examine the informants and
thereby obtain correct information.
3. Information from correspondents:
The investigator appoints local agents or
correspondents in different places and compiles
the information sent by them.

Newspapers and some Government departments obtain
information by this method. The advantage of this
method is that it is cheap and suited to extensive
investigations. However, it may not ensure accurate results
because the correspondents may be negligent,
prejudiced or biased. This method is adopted in those
cases where information is to be collected periodically
from a wide area over a long time.

4. Mailed questionnaire method:
Under this method a list of questions is prepared and sent
to all the informants by post. The list of questions is
technically called a questionnaire.

A covering letter accompanying the questionnaire explains the
purpose of the investigation and the importance of correct
information, and requests the informants to fill in the blank spaces
provided and to return the form within a specified time.

This method is appropriate in those cases where the informants are
literate and are spread over a wide area.

5. Schedules sent through Enumerators:
If the informants are largely uneducated and non-responsive,
data cannot be collected by the mailed
questionnaire method. In such cases, the schedule method is
used to collect data.
Here the questionnaires are taken to the informants by
enumerators, who are persons appointed
by the investigator for the purpose. They meet
the informants directly with the questionnaire, explain the
scope and objective of the enquiry and
solicit their cooperation. The enumerators put the questions
to the informants, record their answers in the
questionnaire and compile them.
Secondary Data
Secondary data can be collected from a number of
sources, which can broadly be classified into two
categories:

i) Published sources

ii) Unpublished sources
Secondary data are mostly collected from published sources. Some
important sources of published data are the following.

1. Published reports of Governments and local bodies.
2. Statistical abstracts, census reports and other reports published by
different ministries of the Government.
3. Official publications of foreign Governments.
4. Reports and Publications of trade associations, chambers of
commerce, financial institutions etc.
5. Journals, Magazines and periodicals.
6. Periodic Publications of Government organizations.
7. Reports submitted by Economists, Research Scholars, Bureaus etc.
8. Published works of research institutions and Universities etc.
Statistical data can also be collected from various unpublished
sources. Some of the important unpublished sources from which
secondary data can be collected are:

1. The research works carried out by scholars, teachers and
professionals.

2. The records maintained by private firms and business enterprises,
which may not wish to publish the information because they regard it
as a business secret.

3. Records and statistics maintained by various departments and
offices of the Governments, Corporations, Undertakings etc.
D1 Planning and collecting data
KS3 Mathematics
Contents
D1.1 Planning a statistical enquiry
D1.2 Collecting data
D1.3 Organizing data
D1.4 Writing a statistical report
D1.1 Planning a statistical enquiry
Specifying the problem
The first step in planning a statistical enquiry is to decide
what problem you want to explore.
This can be done by asking questions that you want your
data to answer and by stating a hypothesis.
For example, suppose we wish to investigate the lengths
of words used in newspapers.
We could ask:
“Do different types of newspaper
use different length words?”
A hypothesis is a statement of something that you
believe to be true but do not have any evidence to
support.
Specifying the problem
Related questions could include:
“Is there a link between the lengths of the words used
and the lengths of the sentences for a particular
newspaper?”
“Is there a difference between the
use of two- and three-letter words?”
A possible hypothesis could be:
“Tabloid newspapers use shorter words
to appeal to a wider audience.”
Deciding on the data
The next step is to decide what data is needed and where
it can be collected from.
Data can be collected from a primary
source or a secondary source.
Data from a primary source is data that you have
collected yourself, for example:
From a survey or questionnaire of a group of people.
From an experiment involving observation,
counting or measuring.
Data from a secondary source is data that you have
taken from somewhere else, such as the Internet,
reference books or newspapers.
Choosing the sample
When collecting data it is usually impractical to include
every member of the group that is being investigated.
A sample is therefore chosen to represent the group.
How big should a sample be?
The sample should be as large as possible.
This will depend on the time and resources available.
If the sample size is too small, then the results may be
unrepresentative.
Choosing the sample
It is important that the sample is representative of the
group that is being investigated.
Suppose, for example, that you wish to investigate the
favourite sports of 11 to 15 year-olds.
Would it be reasonable to question a sample of
people outside a football ground following a game?
Can you suggest a better sample?
You would have to make sure that you ask equal numbers
of girls and boys and that the sample is spread out across
all age groups in the range.
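One way to get equal numbers of girls and boys from every age group is to pick at random within each group. The sketch below is only an illustration: the pupil list is made up, and the names `stratified_sample`, `strata_key` and `per_stratum` are assumptions, not part of the original material.

```python
import random
from collections import defaultdict

def stratified_sample(population, strata_key, per_stratum, seed=0):
    """Pick `per_stratum` members at random from every group (stratum)."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    groups = defaultdict(list)
    for person in population:
        groups[strata_key(person)].append(person)
    sample = []
    for key in sorted(groups):
        members = groups[key]
        if len(members) < per_stratum:
            raise ValueError(f"group {key} has only {len(members)} members")
        sample.extend(rng.sample(members, per_stratum))
    return sample

# Hypothetical school roll: 20 pupils for each age (11 to 15) and gender.
pupils = [
    {"id": f"{age}-{gender}-{n}", "age": age, "gender": gender}
    for age in range(11, 16)
    for gender in ("girl", "boy")
    for n in range(20)
]

# Four pupils from each of the 10 age/gender groups.
sample = stratified_sample(pupils, lambda p: (p["age"], p["gender"]), per_stratum=4)
print(len(sample))
```

Sampling inside each age and gender group, rather than from the whole list at once, is what guarantees the equal spread.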
Choosing units
If your statistical investigation involves measurement
then you must decide what units to use and to what
degree of accuracy.
Suppose, for example, that you wish to investigate the
relationship between age and height.
How will you measure age?
In weeks? In months? In years? In years and months?
How will you measure height?
In metres? In centimetres? In inches?
Planning a statistical enquiry
Once you have decided on
the purpose of the enquiry,
the type of data that will be collected and where it will
come from,
the sample size and type,
you can start the next stage which is to design a data
collection sheet or questionnaire.
D1.2 Collecting data
Collecting data
Data can be collected using a questionnaire or a data
collection sheet.
A questionnaire is used when you wish to ask
a sample of people a series of structured
questions relevant to your line of enquiry.
A data collection sheet or observation sheet is
used when recording results involving
counting, measuring or observing. It can also
be used to collect the answers to a few simple
questions.
Data can also be collected from secondary sources such
as the Internet, newspapers or reference books.
Designing a questionnaire
It is important to design a questionnaire so that:
People will co-operate and answer the questions honestly.
The answers to the questions can be analysed and
presented.
When designing your own questionnaire you should try to
follow these rules:
1) Provide an introduction, so that the person filling in
the questionnaire knows the purpose of your enquiry.
2) Write questions in a sensible order, putting easier
questions first.
Designing a questionnaire
3) Make sure that questions are not embarrassing or
personal.
For example, you need to think carefully about
questions asking about age or income.
Do not ask:
How old are you?
A better question is:
Tick one box for your age group.
15-20 21-25 26-30 31+
Designing a questionnaire
4) If possible, write questions so that they have a
specific answer.
For example:
Did you see the Olympics on TV?
People could answer:
Yes
No
Not much
Only the best bits
Once a day
Sometimes
Designing a questionnaire
A better question would be:
How much of the Olympics coverage did you watch?
Tick one box only.
None
Less than 1 hour a day
Between 1 and 2 hours a day
More than 2 hours a day
Every eventuality has been accounted for and the
person answering the question cannot give another
choice.
Designing a questionnaire
A scale can be used when asking for an opinion.
For example,
How would you rate the leisure facilities available in
your local area? Tick one box only.
Poor Unsatisfactory Satisfactory Good Excellent
Designing a questionnaire
5) Do not ask leading questions.
For example, this question conveys a particular opinion:
Don't you agree that football is the best sport?
A better question is:
Which one of the following sports do you like the best?
football rugby tennis golf cricket boxing
Suggest a better question
How much do you weigh?
This is too personal. Also, some people don't know
their weight.
A better question would be:
Would you consider yourself to be:
Underweight Average weight Overweight
Suggest a better question
Most people use a deodorant, do you?
This is a leading question and may offend
people.
A more useful question would be:
Which make of deodorant do you use?
Please circle any that apply.
Male: Lynx Adidas Slazenger Other None
Female: Sure Impulse Dove Other None
Suggest a better question
How many books did you read last month?
0-2 2-4 4-6
The intervals given overlap. Also, if a person has read
more than 6 books there is nowhere to tick.
A better question would be:
How many books did you read last month?
Tick one box.
0-2 3-5 6-8 9+
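The two checks made on this question (no overlapping boxes, and a box for every possible answer) can be sketched in code. This is only an illustration for whole-number answers; the function name `check_options` and the way each box is written as a `(low, high)` pair are assumptions. The last box of the improved set is written as "9 or more" so that it does not share the value 8 with the 6-8 box.

```python
def check_options(options):
    """Return a list of problems found in numeric tick-box options.

    Each option is a pair (low, high); high=None means "low or more".
    Answers are assumed to be whole numbers.
    """
    problems = []
    ordered = sorted(options, key=lambda option: option[0])
    for (low1, high1), (low2, _) in zip(ordered, ordered[1:]):
        if high1 is None or low2 <= high1:
            problems.append(f"overlap around {low2}")
        elif low2 > high1 + 1:
            problems.append(f"gap between {high1} and {low2}")
    if ordered[-1][1] is not None:
        problems.append(f"no box for more than {ordered[-1][1]}")
    return problems

original = [(0, 2), (2, 4), (4, 6)]
improved = [(0, 2), (3, 5), (6, 8), (9, None)]

print(check_options(original))
print(check_options(improved))
```

The original options are reported as overlapping at 2 and at 4 and as having no box above 6; the improved options produce an empty list.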
Trialling a questionnaire
Once you have written a questionnaire it is a good idea to
try it out on a small sample of people.
This is called a pilot survey.
Note down their responses and use these to refine any
questions that are causing difficulty.
"Do I use a tick or a cross to show the box I want?"
"What does this question mean?"
"I don't want to answer this question because it's too personal."
"There isn't a box to cover my answer."
Designing a data collection sheet
A data collection sheet can be used to record data that
comes from counting, observing or measuring.
It can also be used to record responses to specific
questions.
For example, to investigate a claim that the amount of TV
watched has an impact on weight we can use the
following:
age | gender | height (cm) | weight (kg) | hours of TV watched per week
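A data collection sheet like this can also be kept as a spreadsheet or CSV file, with one row per person. The sketch below is a possible layout only: the column names are shortened versions of the headings above, and the two rows are made-up values that are not survey results.

```python
import csv
import io

# Column names are assumptions based on the sheet headings above.
FIELDS = ["age", "gender", "height_cm", "weight_kg", "tv_hours_per_week"]

def make_sheet(rows):
    """Write the rows as CSV text, one line per person surveyed."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()

# Made-up responses, included only to show the layout.
rows = [
    {"age": 12, "gender": "F", "height_cm": 151, "weight_kg": 42, "tv_hours_per_week": 10},
    {"age": 13, "gender": "M", "height_cm": 158, "weight_kg": 47, "tv_hours_per_week": 14},
]

sheet = make_sheet(rows)
print(sheet)
```

Putting the unit in each column name (cm, kg, hours per week) keeps the decision about units visible to whoever records the data.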
Using a tally chart
When collecting data that involves counting something we
often use a tally chart.
For example, this tally chart can be used to record people’s
favourite snacks.
favourite snack   tally   frequency
crisps                    13
fruit                     6
nuts                      3
sweets                    8
The tally marks are recorded, as responses are collected,
and the frequencies are then filled in.
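Counting responses and drawing tally marks can also be done with a short program. The sketch below is an illustration: the list of responses is invented so that it matches the frequencies in the chart (13, 6, 3 and 8), and `tally_marks` is a hypothetical helper that draws each group of five as five plain strokes.

```python
from collections import Counter

def tally_marks(count):
    """Draw a count as groups of five strokes, e.g. 7 -> '||||| ||'."""
    fives, rest = divmod(count, 5)
    groups = ["|||||"] * fives
    if rest:
        groups.append("|" * rest)
    return " ".join(groups)

# Invented responses chosen to match the frequencies in the chart.
responses = ["crisps"] * 13 + ["fruit"] * 6 + ["nuts"] * 3 + ["sweets"] * 8
frequencies = Counter(responses)

for snack, count in frequencies.items():
    print(f"{snack:<8}{tally_marks(count):<20}{count}")
```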
D1.3 Organizing data
Categorical data
Categorical data is data that is non-numerical.
For example,
favourite football team,
eye colour,
birth place.
Sometimes categorical data can contain numbers.
For example,
favourite number,
last digit in your telephone number,
most used bus route.
Discrete and continuous data
Numerical data can be discrete or continuous.
Discrete data can only take certain values.
For example,
shoe sizes,
the number of children in a class,
the number of sweets in a packet.
Continuous data comes from measuring and can
take any value within a given range.
For example,
the weight of a banana,
the time it takes for pupils to get to school,
the height of 13 year-olds.
Using a frequency table
Once data has been collected it is often organized into a
frequency table.
For example, this frequency table shows the favourite
take-away meals of a group of pupils:
Favourite take-away   Frequency
Pizza                 11
Fish and chips        7
Burgers               8
Indian                5
Chinese               8
Grouping discrete data
A group of 20 people were asked how much change they
were carrying in their wallets. These were their responses:
34p, £1.72, 83p, £6.36, £4.07,
£2.97, £3.53, 6p, £9.54, 34p,
£1.68, 50p, 82p, £7.54, £1.09,
£2.81, £2.43, 46p, £1.70, £1.29
Each amount of money is different and the values cover a
large range.
This type of data is usually grouped into equal class
intervals.
Choosing appropriate class intervals
When choosing class intervals it is important that they
include every value without overlapping and are of equal
size.
For the data above, we can use class sizes of £1:
£0.01 - £1.00, £1.01 - £2.00, £2.01 - £3.00, £3.01 - £4.00,
£4.01 - £5.00, Over £5.
"Over £5" is an open class interval.
Choosing appropriate class intervals
Complete the following frequency table for the data above:
Amount of money (£)   Frequency
0.01 - 1.00           7
1.01 - 2.00           5
2.01 - 3.00           3
3.01 - 4.00           1
4.01 - 5.00           1
Over 5.00             3
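The grouping can be checked with a short program. The sketch below is one possible way to do it: the amounts are the 20 responses converted to pence so that only whole numbers are compared, and the names `class_label`, `CLASS_WIDTH` and `OPEN_FROM` are assumptions introduced for the example.

```python
# The 20 responses, converted to pence.
amounts = [34, 172, 83, 636, 407, 297, 353, 6, 954, 34,
           168, 50, 82, 754, 109, 281, 243, 46, 170, 129]

CLASS_WIDTH = 100   # £1 expressed in pence
OPEN_FROM = 500     # amounts over £5 share one open class

def class_label(pence):
    """Return the class interval label for an amount in pence."""
    if pence < 1:
        raise ValueError("amounts are assumed to be at least 1p")
    if pence > OPEN_FROM:
        return "Over 5.00"
    index = (pence - 1) // CLASS_WIDTH   # 1..100 -> 0, 101..200 -> 1, ...
    low = index * CLASS_WIDTH + 1
    high = (index + 1) * CLASS_WIDTH
    return f"{low / 100:.2f} - {high / 100:.2f}"

labels = ["0.01 - 1.00", "1.01 - 2.00", "2.01 - 3.00",
          "3.01 - 4.00", "4.01 - 5.00", "Over 5.00"]
table = {label: 0 for label in labels}
for amount in amounts:
    table[class_label(amount)] += 1

for label, frequency in table.items():
    print(f"{label:<14}{frequency}")
```

Every amount lands in exactly one class, and the six frequencies add up to 20, the number of people asked.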
Choosing appropriate class intervals
The size of the class intervals depends on the range of
the data and the number of intervals required.
For the data above:
Explain why class sizes of £5 would be inappropriate.
Could we use a class size of 20p?
Grouping continuous data
Continuous data is usually grouped into equal class intervals.
What is wrong with the class intervals in this grouped
frequency table showing lengths?
Length (cm)         Frequency
0 ≤ length ≤ 10
10 ≤ length ≤ 20
20 ≤ length ≤ 30
30 ≤ length
The intervals overlap: a length of exactly 10 cm, 20 cm or 30 cm fits two classes.
The class intervals are written using the symbols ≤ and <.
Length (cm)         Frequency
0 ≤ length < 10
10 ≤ length < 20
20 ≤ length < 30
30 ≤ length
The last class, 30 ≤ length, is an open class interval.
Grouping continuous data
Continuous data is usually grouped into equal class intervals.
What is wrong with the class intervals in this grouped
frequency table showing weights?
Weight (g)          Frequency
0 < weight < 10
10 < weight < 20
20 < weight < 30
30 < weight
A weight of exactly 10 g, 20 g or 30 g does not fit any class.
A better table would be:
Weight (g)          Frequency
0 ≤ weight < 10
10 ≤ weight < 20
20 ≤ weight < 30
30 ≤ weight
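Intervals of the form low ≤ value < high place every measurement in exactly one class. The sketch below shows one way to find the class for a value using Python's `bisect` module; the function name `class_index` and the list `boundaries` are assumptions made for the example.

```python
from bisect import bisect_right

def class_index(value, boundaries):
    """Return the position of the class that contains value.

    boundaries = [0, 10, 20, 30] describes the classes
    0 <= x < 10, 10 <= x < 20, 20 <= x < 30 and the open class 30 <= x.
    """
    if value < boundaries[0]:
        raise ValueError("value is below the first class")
    # bisect_right counts the boundaries that are <= value.
    return bisect_right(boundaries, value) - 1

boundaries = [0, 10, 20, 30]

for weight in (0, 9.99, 10, 29.5, 30, 45):
    print(weight, "-> class", class_index(weight, boundaries))
```

A weight of exactly 10 g goes into the second class only, so boundary values are neither counted twice nor left out.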
Using two-way tables
A two-way table can be used to organize two sets of data.
For example, pupils from Years 7, 8 and 9 were asked
what they usually did during their lunch break. This two-
way table shows the results:
                     Year 7   Year 8   Year 9
Eat school dinners   35       29       38
Eat a packed lunch   42       34       32
Eat at home          19       22       18
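A two-way table is often extended with row and column totals. The sketch below stores the lunch-break results in a nested dictionary and adds them up; the variable names are assumptions, and the totals are computed from the table rather than given in the original.

```python
# Lunch-break results: one inner dictionary per row of the table.
counts = {
    "Eat school dinners": {"Year 7": 35, "Year 8": 29, "Year 9": 38},
    "Eat a packed lunch": {"Year 7": 42, "Year 8": 34, "Year 9": 32},
    "Eat at home":        {"Year 7": 19, "Year 8": 22, "Year 9": 18},
}
years = ["Year 7", "Year 8", "Year 9"]

# Total for each activity (across the rows).
row_totals = {activity: sum(by_year.values()) for activity, by_year in counts.items()}

# Total for each year group (down the columns).
column_totals = {year: sum(by_year[year] for by_year in counts.values()) for year in years}

grand_total = sum(row_totals.values())

print(row_totals)
print(column_totals)
print(grand_total)
```

The row totals and the column totals should both add up to the same grand total, which is a useful check that no value was copied wrongly.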
D1.4 Writing a statistical report
The data collection cycle
The following diagram shows the stages needed to
conduct a statistical enquiry.
1. Specify the problem and plan
2. Collect the data from a variety of sources
3. Process and display the data
4. Interpret and discuss the results
Writing a statistical report
Once you have planned, collected and processed data
relevant to a statistical enquiry you will often have to
communicate your findings in the form of a report.
A report should contain the following:
An introduction stating the purpose of the survey and
any initial conjectures which you plan to investigate.
A description of what sources were used including a
justification of the type and size of any samples used.
Tables or graphs of the results, using ICT as
appropriate. (Remember to justify your choice of what is
presented.)
Calculations, such as the mean, median and mode, to
give an overall picture of the data.
Problems or ambiguities that arose during the course of
the investigation and how you dealt with them.
A summary of the conclusions shown by the data, not
forgetting to refer back to your initial hypothesis.
Sometimes your data will give results that you did not
expect. These will lead to new lines of enquiry which
you should investigate if possible.
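The calculations for a report can be done with Python's built-in `statistics` module. The sketch below is an illustration that reuses the 20 amounts of change from D1.3, converted to pence; the variable names are assumptions.

```python
from statistics import mean, median, mode

# The 20 amounts of change from D1.3, in pence.
amounts = [34, 172, 83, 636, 407, 297, 353, 6, 954, 34,
           168, 50, 82, 754, 109, 281, 243, 46, 170, 129]

average = mean(amounts)        # total divided by the number of values
middle = median(amounts)       # halfway between the 10th and 11th values in order
most_common = mode(amounts)    # the value that appears most often

print(f"mean   = {average:.1f}p")
print(f"median = {middle}p")
print(f"mode   = {most_common}p")
```

Here the mean (about £2.50) is well above the median (£1.69) because a few large amounts pull it up, which is the kind of observation worth discussing in a report.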
Writing a statistical report
Collect the relevant data and write a
statistical report investigating one of the
following:
The types of sports young people take part in outside of
school hours.
How pupils travel to school.
The difference in word lengths used in men's and
women's magazines.
Use of mobile phones among teenagers.
The relationship between hand span and foot length.
Question
1. What is data? Write the categories of data.
2. Explain the methods of primary data collection.
3. Write down the different sources of secondary data.
4. Write short notes on:
a) Observation method
b) Interview
c) Mailed questionnaires
d) Primary data vs. Secondary data
