
UNIT 3

DATA COLLECTION
• Types of data – Primary vs. secondary data – Methods of primary data collection – Survey vs. observation – Experiments – Construction of questionnaire and instrument – Validation of questionnaire.

• Sampling plan – Sample size – Determinants of optimal sample size – Sampling techniques – Probability vs. non-probability sampling methods.
DATA COLLECTION
WHAT IS DATA?

“Data is a collection of facts.”

• Such as numbers, words, measurements, observations or even just descriptions of things.

• Data are the facts and figures collected, summarized, analyzed, and interpreted.

• The data collected in a particular study are referred to as the data set.
• Data collection is a term used to describe the process of preparing and collecting data.

• It is the systematic gathering of data for a particular purpose from various sources; the data are systematically observed, recorded and organized.

• Data are the basic inputs to any decision-making process in business.
PURPOSE OF DATA COLLECTION
• The purpose of data collection is to obtain information:
– to keep on record
– to make decisions about important issues
– to pass information on to others

WHAT DOES A STATISTICIAN DO?


• Collects numbers or data
• Systematically organizes or arranges the data
• Analyzes the data…extracts relevant information to
provide a complete numerical description
• Infers general conclusions about the problem using
this numerical description
CLASSIFICATION OF DATA

1. Primary Data
– Data collected first-hand by the researcher for his or her own research

2. Secondary Data
– Data that have already been collected by someone else

– It is ready-made data
PRIMARY DATA
• Data collected from the field under the control and supervision of an investigator

• Primary data means original data that has been collected specially for the purpose in mind

• This type of data is generally fresh and collected for the first time

• It is useful for current studies as well as for future studies

Example: data gathered through a questionnaire.
Quantitative and Qualitative
• Quantitative – based on numbers. E.g. "56% of 18-year-olds drink alcohol at least four times a week" – doesn't tell you why, when or how.
Numerical, statistically reliable, projectable to a broader population

• Qualitative – more detail – tells you why, when and how!
In-depth, insight-generating, non-numerical, 'directional'


Oil Painting
Qualitative data:
•blue/green color, gold frame
•smells old and musty
•texture shows brush strokes of oil paint
•peaceful scene of the country
•masterful brush strokes
Quantitative data:
•picture is 10" by 14"
•with frame 14" by 18"
•weighs 8.5 pounds
•surface area of painting is 140 sq. in.
•cost $300
Coffee
Qualitative data:
•robust aroma
•frothy appearance
•strong taste
•burgundy cup
Quantitative data:
•12 ounces of latte
•serving temperature 150 °F
•serving cup 7 inches in height
SECONDARY DATA
• Data gathered and recorded by someone else prior to
and for a purpose other than the current project
• Secondary data is data that has been collected for
another purpose.
• It involves less cost, time and effort

• Secondary data is data that is being reused, usually in a different context.

• Example: data from a book.


SOURCES
INTERNAL SOURCES
Internal sources of secondary data are usually for marketing application:
• Sales records
• Marketing activity
• Cost information
• Distributor reports and feedback
• Customer feedback

EXTERNAL SOURCES
External sources of secondary data are usually for financial application:
• Journals
• Books
• Magazines
• Newspapers
• Libraries
• The Internet
Advantages & Disadvantages of Primary Data
Advantages
• Targeted Issues are addressed
• Data interpretation is better
• Efficient Spending for Information
• Decency of Data
• Proprietary Issues
• Addresses Specific Research Issues
• Greater Control
Disadvantages
• High Cost
• Time Consuming
• Inaccurate Feedback
• More resources are required
Advantages & Disadvantages of Secondary Data
Advantages
• Ease of Access
• Low Cost to Acquire
• Clarification of Research Question
• May Answer Research Question
Disadvantages
• Quality of Research
• Not Specific to Researcher’s Needs
• Incomplete Information
• Not Timely
OBSERVATION
• Used in studies relating to behavioral science
• Information is sought by way of the investigator’s own direct observations
• The respondent is not asked questions or communicated with.
E.g. observing the brand of wristwatch a person wears
• Willingness of the respondent to respond is not necessary
• Less demanding of active cooperation
Limitations:
1. Expensive
2. Limited Data
3. Can’t observe what is going on in mind
4. Some people are rarely accessible
Structured Vs. Unstructured observations

Structured observation is characterized by:

• Careful definition of units to be observed

• Style of recording observed information

• Standardized conditions of observation

Structured >>> Descriptive studies

Unstructured >>> Exploratory studies


Participant and non-participant

• Researcher is a member of the group that he observes > Participant
Ex: Going into a slum and living the life of its residents

• Researcher is detached from the group and observes > Non-participant
Ex: Studying slum residents without being a part or member of the group
Controlled and Uncontrolled

• Observation in natural setting > Uncontrolled

Ex. Study of consumers in a mall

• Observation in predefined environment > Controlled

Ex. Marshmallow Test – delayed gratification


INTERVIEW
• Presentation of oral-verbal stimuli and reply in terms of
oral-verbal responses.
1. Personal Interview:
• Two persons (interviewer and interviewee)
• Face-to-face contact
• Direct or indirect interview
• Structured (descriptive study) or unstructured (exploratory study) interview
• Pre-determined questions
• Standardized technique of recording
Example: Investigation, documentary, exit interview etc.
Advantages of personal interview
1. More information, in greater depth
2. Interviewer can overcome the resistance by his skills
3. Greater flexibility
4. Observation method can also be applied
5. Personal information can be easily obtained
6. Greater response
7. Catch spontaneous reaction
8. Language can be adapted
9. Can also collect supplementary information
Disadvantages of personal interview:
1. Expensive

2. Possibility of bias

3. Certain respondents may not be approachable

4. More time consuming

5. Field staff must be selected and trained

6. Proper rapport with respondents is required


2. Telephonic Interview:
• Faster method
• Suitable for long distances
• Cheaper than personal interview
• Simple and economical
• Higher rate of response
• Replies can be easily recorded
• No field staff is required
Demerits:
• Interview duration cannot be too long
• Restricted to only those having a telephone facility
• Questions have to be short and to the point
• Non-verbal responses cannot be judged
3. Group Interview:

• A number of individuals with a common interest are interviewed

• Free discussion is encouraged

• Information may be obtained through a questionnaire

• E.g. People’s reactions to public amenities, health projects, welfare schemes, movie reviews etc.
PREREQUISITES OF INTERVIEW
1. Interviewers should be carefully selected, trained and briefed.

2. Honest, sincere, hardworking, impartial, unbiased

3. Must possess technical competence

4. Practical experience

5. Create friendly atmosphere of trust and confidence

6. Recording responses accurately and completely

7. Interviewer should not show surprise or disapproval

8. Should not argue

9. Keep things on track


QUESTIONNAIRE
WHAT IS QUESTIONNAIRE?

• “A document containing a set of questions logically related to the problem under study.”

• If the answers are filled in by the respondents, it is called a ‘Questionnaire’

• If they are filled in by the interviewer, it is called a ‘Schedule’


QUESTIONNAIRE
ADVANTAGES
• Free from bias of interviewer
• Respondents have adequate time
• Large samples can be used to get more dependable and
reliable data
DISADVANTAGES
• Can be used only when respondents are educated and cooperative
• Control over questionnaire is lost once it is sent
• Inflexibility
• Time and cost
QUESTION CONSTRUCTION
1. Question Relevance

• Should be relevant to research objectives

• Question should be able to answer the research problem

• A single question may not be able to answer the problem or attain the objective

• Respondents should know the answer

• Question should not test respondents' recall ability

• Should be easily understandable and specific


2. Question Wording
• Vocabulary : Use common vocabulary
• Exactness : "Do you usually go to the gym?" vs. "How many days in a week do you go to the gym?"

• Simplicity : Simple words and sentences, avoid jargon

• Neutrality : Should not cause undue influence.
E.g. (to avoid) "You prefer Bru over Nescafe, right?"
• Presumption : Shouldn’t presume anything about the respondent
E.g. (to avoid) "How many times a day do you drink coffee?" presumes the respondent drinks coffee

• Hypothetical Questions : Avoid. E.g. What would you do if …

• Embarrassing Questions : Avoid personal questions


3. Types of Questions
i. Open Ended
– Free Scope for respondents to answer
– Used to explore more and in depth information
– Difficult to analyze: E.g. What are your career plans after MBA?
ii. Closed Ended
Dichotomous
• Can be answered with 2 responses
• E.g. Do you like Hindi movies? Yes or No
Multiple Choice Questions (MCQ)
• More than 2 alternatives for one question
• E.g. Which brand of jeans do you prefer? Lee, Denim, Apollo, Basics, Celio.
• MCQ must contain all the possible choices
• Should not contain overlapping choices
• Alternatives should be reasonable
4. Question Order or Sequence
• One question should follow another in a logical sequence

• Consecutive questions should be related to each other

E.g. What is the volume of your trading? How many trades do you make in a week?


Effective Questionnaire
1. Prepare questions (formulate and choose types of questions, order them, write instructions, make copies)

2. Select your respondents (random/selected)

3. Administer the questionnaire (date, venue, time)

4. Tabulate data collected

5. Analyze and interpret data collected


Service Quality Using SERVQUAL Questionnaire
Tangibles
XYZ bank has modern looking equipment.
XYZ Bank’s physical facilities are visually appealing.
XYZ Bank’s reception desk employees are neat appearing.
Materials associated with the service (such as pamphlets or statements) are visually appealing at XYZ bank.
Reliability
When XYZ bank promises to do something by a certain time, it does so.
When you have a problem, XYZ bank shows a sincere interest in solving it.
XYZ bank performs the service right the first time.
XYZ bank provides its service at the time it promises to do so.
XYZ bank insists on error-free records.
Responsiveness
Employees in XYZ bank tell you exactly when services will be performed.
Employees in XYZ bank give you prompt service.
Employees in XYZ bank are always willing to help you.
Employees in XYZ bank are never too busy to respond to your request.
Assurance
The behavior of employees in XYZ bank instills confidence in you.
You feel safe in your transactions with XYZ bank.
Employees in XYZ bank are consistently courteous with you.
Employees in XYZ bank have the knowledge to answer your questions.
Empathy
XYZ bank gives you individual attention.
XYZ bank has operating hours convenient to all its customers.
XYZ bank has employees who give you personal attention.
XYZ bank has your best interest at heart.
The employees of XYZ bank understand your specific needs.
Validation of questionnaire
FACE VALIDITY
• Evaluate in terms of………
CONTENT VALIDITY
How do experts evaluate validity?
Method 1: Average Congruency Percentage (ACP) [Popham, 1978]

• Each expert computes the percentage of questions that he or she deems relevant

• Take the average across all experts

• If the value is 90% or higher, the questionnaire is considered valid

• E.g. 2 experts (Expert 1: 100%, Expert 2: 80%)

• Then ACP = (100% + 80%) / 2 = 90%
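The ACP arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name `average_congruency_percentage` is a hypothetical helper, and the input is assumed to be each expert's own percentage of questions judged relevant.

```python
def average_congruency_percentage(expert_percentages):
    """Mean of the per-expert percentages of questions judged relevant."""
    if not expert_percentages:
        raise ValueError("at least one expert percentage is required")
    return sum(expert_percentages) / len(expert_percentages)


# Worked example from the slide: Expert 1 = 100%, Expert 2 = 80%.
acp = average_congruency_percentage([100, 80])
print(acp)  # 90.0
```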
Method 2: Content validity index [Martuza 1977]
• Content validity index for individual items (I-CVI)
A panel of content experts is asked to review the relevance of each question on a 4-point Likert scale (minimum 3, maximum 10 experts)
1 = not relevant
2 = somewhat relevant
3 = relevant
4 = very relevant

• Then, for each question, the number of experts giving a score of 3 or 4 is counted (3, 4 – relevant; 1, 2 – not relevant)
• The proportion of experts rating the question as relevant is calculated

E.g. If 4/5 experts give a score of 3 or 4: I-CVI = 0.80
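A minimal Python sketch of the I-CVI calculation, assuming ratings are integers on the 1 to 4 scale described above. The function name `item_cvi` and the sample ratings are hypothetical, chosen to reproduce the 4-out-of-5 example.

```python
def item_cvi(ratings):
    """Proportion of experts who rated one question 3 or 4 (relevant)."""
    if not ratings:
        raise ValueError("at least one expert rating is required")
    if any(r not in (1, 2, 3, 4) for r in ratings):
        raise ValueError("ratings must be on the 4-point scale (1-4)")
    relevant = sum(1 for r in ratings if r >= 3)
    return relevant / len(ratings)


# Hypothetical ratings from 5 experts for one question: 4 of them give 3 or 4.
print(item_cvi([4, 3, 3, 4, 2]))  # 0.8
```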


Content Validity Index for the scale (S-CVI)

• The proportion of items on an instrument that achieved a rating of 3 or 4 by the content experts

Two approaches:
S-CVI/UA – Universal agreement: the proportion of items rated 3 or 4 by all the experts

S-CVI/Ave – Average: the mean of the I-CVIs across all items
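The two approaches can be compared with a short Python sketch. It assumes the commonly used definitions: S-CVI/UA is the share of items that every expert rated 3 or 4, and S-CVI/Ave is the mean of the item-level I-CVIs. The function name `scale_cvi` and the ratings are hypothetical.

```python
def scale_cvi(ratings_by_item):
    """Return (S-CVI/UA, S-CVI/Ave) from per-item lists of expert ratings (1-4)."""
    if not ratings_by_item or any(not item for item in ratings_by_item):
        raise ValueError("every item needs at least one expert rating")
    # I-CVI per item: share of experts rating the item 3 or 4.
    i_cvis = [sum(1 for r in item if r >= 3) / len(item) for item in ratings_by_item]
    # Universal agreement: share of items that every expert rated 3 or 4.
    universal = sum(1 for item in ratings_by_item if all(r >= 3 for r in item))
    s_cvi_ua = universal / len(ratings_by_item)
    # Average: mean of the item-level I-CVIs.
    s_cvi_ave = sum(i_cvis) / len(i_cvis)
    return s_cvi_ua, s_cvi_ave


# Hypothetical instrument: 4 questions, each rated by the same 5 experts.
ratings = [
    [4, 4, 3, 4, 3],  # all experts say relevant, I-CVI = 1.0
    [4, 3, 3, 4, 2],  # I-CVI = 0.8
    [3, 4, 4, 4, 4],  # I-CVI = 1.0
    [2, 3, 4, 1, 3],  # I-CVI = 0.6
]
ua, ave = scale_cvi(ratings)
print(round(ua, 2), round(ave, 2))  # 0.5 0.85
```

The universal-agreement figure is the stricter of the two, since a single dissenting expert removes an item from the count.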
CONSTRUCT VALIDITY
Method: Factor analysis

• To examine empirically the interrelationship among items and to identify clusters of items that share sufficient variation to justify their existence as a factor or construct to be measured by the instrument

• Various items are gathered into common factors

• Common factors are synthesized into fewer factors and then the relation between each item and factor is measured

• Unrelated items are eliminated


RELIABILITY
• It is the ability of an instrument to produce reproducible results

• Each time it is used, similar scores should be obtained

• A questionnaire is said to be reliable if we get the same or similar answers repeatedly

• Though it cannot be calculated exactly, it can be measured by estimating correlation coefficients
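One way to estimate such a correlation coefficient is a test-retest check: the same respondents answer the questionnaire twice and the two sets of scores are correlated. The sketch below assumes that design (the slide does not name a specific method) and uses Pearson's r; the function name and the scores are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for the correlation to be defined")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical total scores of six respondents on two administrations.
first_round = [12, 15, 9, 20, 17, 11]
second_round = [13, 14, 10, 19, 18, 11]
r = pearson_r(first_round, second_round)
print(round(r, 3))
```

A value of r close to 1 suggests that respondents score similarly on both occasions, which is what the slide means by reproducible results.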
• A population is the set of all possible measurements (or observations) of individuals or items.
Example:

• the heights of all the students in a college

• the lifetimes of all the light bulbs produced by XYZ.


• Sampling population
– The population to be studied, to which the investigator wants to generalize his results

• Sampling unit
– Smallest unit from which a sample can be selected

• Sampling frame
– List of all the sampling units from which the sample is drawn

• Sampling scheme
– Method of selecting sampling units from the sampling frame

• Sampling fraction
– Ratio between sample size and population size
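As a quick illustration of the last definition, the sampling fraction is simply n / N. The function name and the numbers below are hypothetical.

```python
def sampling_fraction(sample_size, population_size):
    """Ratio between sample size (n) and population size (N)."""
    if population_size <= 0:
        raise ValueError("population size must be positive")
    if not 0 <= sample_size <= population_size:
        raise ValueError("sample size must be between 0 and the population size")
    return sample_size / population_size


# Hypothetical study: 50 students sampled from a college of 1000.
print(sampling_fraction(50, 1000))  # 0.05
```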


Sampling
•The process of selecting a number of individuals for a study
in such a way that the individuals represent the larger group
from which they were selected.
Simple random sampling
• All subsets of the frame of a given size are given an equal probability of selection.

• Random number generators can be used to make the selection

• Subjects are selected so that all members of a population have an equal and independent chance of being selected
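A minimal sketch of simple random sampling with Python's standard `random` module, which draws without replacement so that every subset of the requested size is equally likely. The sampling frame of student IDs and the helper name `simple_random_sample` are hypothetical.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame, each subset equally likely."""
    if not 0 <= n <= len(frame):
        raise ValueError("sample size must be between 0 and the frame size")
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, n)


# Hypothetical sampling frame: 100 student IDs, S001 to S100.
frame = [f"S{i:03d}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```

Fixing the seed is optional; it is used here only so that the same sample can be drawn again when the procedure has to be documented.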
Table of random numbers
684257954125632140
582032154785962024
362333254789120325
985263017424503686
Advantages
1. Easy to conduct
2. High probability of achieving a representative sample
3. Meets assumptions of many statistical procedures
Disadvantages
1. Identification of all members of the population can be difficult
2. Contacting all members of the sample can be difficult
Stratified random sampling
• The population is divided into two or more groups called strata, according to some criterion, such as geographic location, grade level, age, or income, and subsamples are randomly selected from each stratum.
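The procedure can be sketched in Python as follows. The sketch assumes proportional allocation (the same fraction is drawn from every stratum), which is one common choice and not something the slide specifies; the helper name `stratified_sample`, the strata and the sizes are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Randomly select the same fraction of units from every stratum."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")
    rng = random.Random(seed)
    # Step 1: divide the population into strata.
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    # Step 2: draw a simple random subsample within each stratum.
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))  # at least one unit per stratum
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical population stratified by geographic location.
units = (
    [("urban", i) for i in range(60)]
    + [("suburban", i) for i in range(30)]
    + [("rural", i) for i in range(10)]
)
sample = stratified_sample(units, stratum_of=lambda u: u[0], fraction=0.1, seed=7)
print({name: len(chosen) for name, chosen in sample.items()})
# {'urban': 6, 'suburban': 3, 'rural': 1}
```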
