
PROCESSING AND ANALYSIS OF DATA

• The survey data collected from the field needs to be processed.
• The data, after collection, has to be prepared for analysis.
• Collected data is raw and must be converted into a form suitable for the required analysis; impurities and discrepancies in the collected data must be removed.
• It has to be arranged as per the research design.
• The results of the analysis are affected a great deal by the form of the data.
• Make the data suitable for analysis as per the statistical tools to be applied.
• So, proper data preparation is a must to get reliable results. After the collection of data from primary or secondary sources, arrangement is done so that the same may be analyzed and interpreted with the help of statistical tools. Software such as MS Excel, SPSS (Statistical Package for the Social Sciences), Google Docs etc. may be used.
• The term analysis refers to the computation of certain measures along with searching for patterns of relationship that exist among data groups.
• Some, however, do not distinguish between processing and analysis; they opine that analysis of data, in a general way, covers the operations performed with the purpose of summarising the collected data.
• We shall, however, prefer to observe the difference between the two terms.
IMPORTANT STEPS
PROCESSING THE DATA

• Editing: Field Editing and Central Editing
• Coding
• Classification
• Tabulation
• Graphing
DATA EDITING
• Data editing is a process by which collected data is examined to detect any errors or omissions
and further these are corrected as much as possible, so as to ensure Legibility, Compactness,
Consistency and Accuracy before proceeding further.
• The recorded data must be legible so that it can be coded later. An illegible response may be corrected by getting in touch with the people who recorded it, or alternatively it may be inferred from other parts of the questionnaire.
• Completeness requires that all the items in the questionnaire be fully answered.
• Editing is of two types :
 Field Editing: This type of editing relates to abbreviated or illegibly written forms of gathered data. It is carried out simultaneously with, or soon after, the collection of data. Such editing is most effective when done on the same day or the very next day after the interview. The investigator must not jump to conclusions while doing field editing.

 Central Editing: This type of editing relates to the time when the entire data collection process has been completed. Here a single or common editor corrects errors such as an entry in the wrong place or in the wrong unit. As a rule, all wrong answers should be dropped from the final results.
BENEFITS OF DATA EDITING

• The data obtained is complete in all respects.


• It is accurate in terms of information recorded and responses
sought.
• The response format is in the form that was instructed.
• The data is structured in a manner such that entering the information will not be a problem.
DATA CODING
• The process of assigning numerals, alphabetical symbols, or both to the responses given by respondents is called coding.
• Coding is necessary for efficient analysis; through it, the several replies may be reduced to a small number of classes which contain the critical information required for the analysis.
• It can be done at the questionnaire-design stage itself, so that the likely responses to the questionnaire are pre-coded.
• It simplifies computer tabulation of the data for further analysis.
• E.g.: What is your gender? Male – 0; Female – 1
• What is your monthly income?
  Less than 5,000 – 1
  5,001 – 9,000 – 2
  9,001 – 10,000 – 3
  Above 10,000 – 4
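The coding scheme above can be sketched in Python. A hedged note: the code values follow the slide's example, the function and variable names are illustrative, and the 9,001–10,000 band is assumed to continue the listed income classes.

```python
# Illustrative pre-coding of survey responses (codes from the slide's example;
# helper names such as code_income are made up for this sketch).

GENDER_CODES = {"Male": 0, "Female": 1}

def code_income(amount):
    """Map a monthly income figure to its coded class."""
    if amount < 5000:
        return 1
    elif amount <= 9000:
        return 2
    elif amount <= 10000:
        return 3
    else:
        return 4

responses = [("Male", 4500), ("Female", 9500), ("Female", 12000)]
coded = [(GENDER_CODES[g], code_income(inc)) for g, inc in responses]
print(coded)  # [(0, 1), (1, 3), (1, 4)]
```

Pre-coding like this means each reply is stored as a small number, which simplifies computer tabulation later.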
Data coding example
 Sample record: Excel sheet for two-wheeler owners

Unit      Occupation  Vehicle   Km/day    Marital status  Family size
(Col. 1)  (Col. 2)    (Col. 3)  (Col. 4)  (Col. 5)        (Col. 6)
1         4           1         20        1               3
2         3           2         25        2               1
3         5           1         25        1               4
4         2           1         15        2               2
5         4           2         20        2               4
6         5           2         35        2               6
7         1           1         40        1               3
8         5           2         20        2               4
CLASSIFICATION OF DATA
• Classification of data implies that the collected raw data is categorized into groups having common features.
• Data having common characteristics are placed in a common or homogenous group.
• The entire data collected is categorized into various groups or classes, which convey a
meaning to the researcher.
 Classification is done in two ways:
• Classification according to attributes: only the presence or absence of an attribute in individual items can be noticed. Here the data is classified on the basis of common characteristics that are descriptive, such as literacy, sex, honesty, marital status etc.
• Descriptive features are qualitative in nature and cannot be measured quantitatively, but they are nevertheless considered while making an analysis.
• Classification according to class intervals: here the range of a variable is divided into classes, each of a certain size.
• Numerical features of data can be measured quantitatively and analyzed with the help of statistical measures; data relating to income, production, age, weight etc. come under this category. Such data are known as statistics of variables and are classified by way of class intervals.
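As a rough sketch of classification by class intervals, the following bins hypothetical age data into classes; the values and class limits are made up for illustration.

```python
# Classify a hypothetical age variable into class intervals (made-up data).
from bisect import bisect_left
from collections import Counter

ages = [12, 25, 33, 41, 25, 19, 38, 47, 22, 30]
upper_bounds = [20, 30, 40, 50]                 # inclusive upper limit of each class
labels = ["0-20", "21-30", "31-40", "41-50"]

# bisect_left finds the first class whose upper limit is >= the value
counts = Counter(labels[bisect_left(upper_bounds, a)] for a in ages)
print(dict(counts))  # {'0-20': 2, '21-30': 4, '31-40': 2, '41-50': 2}
```

Each value falls into exactly one interval, giving the homogeneous groups the slide describes.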
CLASSIFICATION
• Classification of data is the process of arranging data in groups or classes on the basis of common characteristics.

 Attributes: only their presence or absence in individual items can be noticed.
 Class-intervals: the size of each class into which the range of a variable is divided.
TABULATION
• Tabulation is the process of summarizing raw data and displaying it in a compact form (i.e., in statistical tables and graphs) for further analysis.
• Tabulation is an orderly arrangement of data in columns and rows.
• Tabulation summarizes the raw data and displays it in the form of statistical tables.
• Simple tabulation results in a one-way table, which can be used to answer questions related to one characteristic of the data.
• Complex tabulation results in a two-way table, which gives information about two related characteristics of the data.
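A minimal sketch of simple (one-way) and complex (two-way) tabulation on coded records; the rows below are made up, each being a (gender code, marital-status code) pair.

```python
# One-way and two-way tabulation of coded records (hypothetical data).
from collections import Counter

records = [(0, 1), (0, 2), (1, 1), (1, 1), (0, 1), (1, 2)]

one_way = Counter(gender for gender, _ in records)  # one characteristic: gender
two_way = Counter(records)                          # two related characteristics
print(one_way[0], one_way[1])   # frequencies by gender
print(two_way[(1, 1)])          # joint frequency of one cell
```

The one-way table answers questions about a single characteristic; the two-way table is the cross-tabulation of two characteristics at once.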
GRAPHING OF DATA
• Visual representation of data
• Data are presented as absolute numbers or percentages
• The most informative graphs are simple and self-explanatory
GRAPHICAL REPRESENTATION
• Graphs help to understand the data easily.
• Most common graphs are bar charts and pie charts.
 In a bar chart, a bar shows each category, the length of which represents the amount,
frequency or percentage of values falling into a category
 The pie chart is a circle broken up into slices that represent categories. The size of each slice
of the pie varies according to the percentage in each category
GRAPHICAL REPRESENTATION
• Histogram :A graph of the data in a frequency distribution is called a
histogram
• Polygon : A percentage polygon is formed by having the midpoint of
each class represent the data in that class and then connecting the
sequence of midpoints at their respective class percentages.
DATA CLEANING
• Checking the data for consistency and treating missing values.
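One common treatment for missing values, sketched below, is mean imputation: replace each blank with the mean of the observed values. The source does not prescribe this particular treatment, and the data here is hypothetical.

```python
# Mean imputation for missing values (one common treatment; made-up data).
from statistics import mean

km_per_day = [20, 25, None, 15, 20, None, 40]   # None marks a missing response

observed = [v for v in km_per_day if v is not None]
fill = mean(observed)                           # mean of the observed values
cleaned = [v if v is not None else fill for v in km_per_day]
```

Other treatments (dropping the record, or imputing the median or mode) may be preferable depending on the variable and the amount of missingness.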
DATA ADJUSTING
• Data adjusting is not always necessary, but it can sometimes improve the quality of the analysis.
Problems in Processing of Data in Research
Methodology

• Two problems arise in processing the data for analytical purposes:
• The problem concerning "Don't know" (or DK) responses
• The use of percentages
DK Responses
• When the DK response group is small, it is of little significance.
• But when it is relatively big, it becomes a matter of major concern in which case the question
arises: Is the question which provoked DK response useless?
• The answer depends on two points, viz.:
• the respondent may actually not know the answer, or
• the researcher may have failed to obtain the appropriate information.
• In the first case the concerned question is said to be all right, and the DK response is taken as a genuine DK response.
• In the second case, the DK response is more likely a failure of the questioning process.
Use of Percentages
• Percentages are often used in data presentation, for they simplify numbers, reducing them all to a 0-to-100 range.
• Through the use of percentages, the data are reduced to a standard form with a base equal to 100, which facilitates relative comparisons.
• While using percentages, the following rules should be kept in view by researchers:

1. Two or more percentages must not be averaged unless each is weighted by the group size
from which it has been derived.
2. Use of too large percentages should be avoided, since a large percentage is difficult to
understand and tends to confuse, defeating the very purpose for which percentages are
used.
Use of Percentages (contd.)
3. Percentages hide the base from which they have been computed. If this is not kept in view,
the real differences may not be correctly read.
4. Percentage decreases can never exceed 100 per cent and as such for calculating the
percentage of decrease, the higher figure should invariably be taken as the base.
5. Percentages should generally be worked out in the direction of the causal-factor in case of
two-dimension tables and for this purpose we must select the more significant factor out
of the two given factors as the causal factor.
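Rule 1 above can be illustrated with a small worked example: two hypothetical groups, 40% of 50 respondents and 80% of 150 respondents. The group sizes and percentages are made up.

```python
# Averaging percentages weighted by group size (rule 1; hypothetical figures).
sizes = [50, 150]
percents = [40.0, 80.0]

naive = sum(percents) / len(percents)           # unweighted average: misleading
weighted = sum(p * n for p, n in zip(percents, sizes)) / sum(sizes)
print(naive, weighted)  # 60.0 70.0
```

The unweighted average (60%) overstates the small group's contribution; weighting by group size gives the correct overall rate (70%, i.e., 140 of 200 respondents).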
ANALYSIS OF DATA
• Analysis means computation of certain indices or measures
along with searching for patterns of relationships that exists
among the data groups
DESCRIPTIVE ANALYSIS
• The study of the distribution of variables is termed descriptive analysis. If we study one variable, it is termed uni-variate analysis; with two variables, bi-variate analysis; and with three or more variables, multi-variate analysis.
UNI-VARIATE ANALYSIS
• Univariate analysis refers to the analysis of one variable at a time. The commonest
approaches are as follows:
• Frequency tables
• Measures of central tendency
 Arithmetic mean
 Median
 Mode
Measures of dispersion
 Range
 Mean deviation
 Standard deviation
• Diagrams:
o Bar charts
o Pie charts
o Histogram
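The univariate measures listed above can be sketched with Python's statistics module; the sample values are made up for illustration.

```python
# Univariate measures of central tendency and dispersion (hypothetical sample).
from statistics import mean, median, mode, stdev

data = [2, 4, 4, 5, 7, 9, 4]

central = {"mean": mean(data), "median": median(data), "mode": mode(data)}
spread = {
    "range": max(data) - min(data),
    "mean_deviation": mean(abs(x - mean(data)) for x in data),
    "standard_deviation": stdev(data),
}
print(central)
```

Each measure summarizes the same distribution from a different angle: where its centre lies, and how widely the values scatter around it.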
BIVARIATE ANALYSIS
• Bivariate analysis is concerned with the analysis of two
variables at a time in order to uncover whether the two
variables are related
Main types:
 Simple Correlation
 Simple Regression
 Two-Way ANOVA( Analysis of Variance)
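As a sketch of simple correlation, the Pearson coefficient can be computed from its definition with the standard library alone; the two variables below are hypothetical.

```python
# Pearson correlation between two hypothetical variables, from the formula
# r = cov(x, y) / (sqrt(var(x)) * sqrt(var(y))).
from math import sqrt

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
r = cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
print(round(r, 3))  # 0.775
```

A value near +1 or -1 suggests the two variables are related; a value near 0 suggests little linear relationship.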
MULTI-VARIATE ANALYSIS
• Multivariate analysis entails the simultaneous analysis of three or more variables.
 Main Types
 Multiple Correlation
 Multiple Regression
 Multi- ANOVA (Analysis of variance)
CAUSAL ANALYSIS
• Causal analysis is concerned with the study of how one or more variables affect changes in another variable.
INFERENTIAL ANALYSIS
• Inferential analysis is concerned with testing hypotheses and estimating population values based on sample values.
PARAMETRIC TESTS
• These tests depend upon assumptions, typically that the population(s) from which the data are randomly sampled have a normal distribution. Types of parametric tests are:
• t-test
• z-test
• F-test
• χ² (chi-square) test
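As an illustration of a parametric test statistic, the one-sample t statistic, t = (x̄ − μ₀) / (s / √n), can be computed by hand. The sample values and hypothesised mean below are made up, and this sketch computes only the statistic, not its p-value.

```python
# One-sample t statistic on hypothetical data (assumes a normal population).
from statistics import mean, stdev
from math import sqrt

sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3]
mu0 = 12.0                                  # hypothesised population mean
n = len(sample)

t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
```

The statistic would then be compared against the t distribution with n − 1 degrees of freedom to decide whether to reject the hypothesis.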
NON PARAMETRIC TEST
• Do not involve population parameters
• Example: probability distributions, independence
• Data measured on any scale (ratio or interval, ordinal or nominal)
Types of analysis in Research
Methodology
Multiple regression analysis:
• This analysis is adopted when the researcher has one dependent variable which is presumed to
be a function of two or more independent variables.
Multiple discriminant analysis:
• This analysis is appropriate when the researcher has a single dependent variable that cannot be
measured, but can be classified into two or more groups on the basis of some attribute.
Multivariate analysis of variance (or multi-ANOVA):
• This analysis is an extension of two-way ANOVA, wherein the ratio of among-group variance to within-group variance is worked out on a set of variables.
Canonical analysis:
• This analysis can be used in case of both measurable and non-measurable variables for the
purpose of simultaneously predicting a set of dependent variables from their joint covariance
with a set of independent variables.
Inferential analysis:
• is concerned with the various tests of significance for testing hypotheses in order to determine
with what validity data can be said to indicate some conclusion or conclusions.
• It is also concerned with the estimation of population values.
• It is mainly on the basis of inferential analysis that the task of interpretation (i.e., the task of
drawing inferences and conclusions) is performed.
STATISTICS IN RESEARCH - Research
Methodology

• The role of statistics in research is to function as a tool in designing research, analyzing its data
and drawing conclusions therefrom.
• Most research studies result in a large volume of raw data which must be suitably reduced so that
the same can be read easily and can be used for further analysis.
• The important statistical measures that are used to summarize the survey/research data are:
• measures of central tendency or statistical averages;
• measures of dispersion;
• measures of asymmetry (skewness);
• measures of relationship; and
• other measures.
SIMPLE REGRESSION ANALYSIS - Research
Methodology
• Regression Analysis is the determination of a statistical relationship between two or more
variables.
• In simple regression, we have only two variables, one variable (defined as independent) is the
cause of the behaviour of another one (defined as dependent variable).
• Regression can only interpret what exists physically i.e., there must be a physical way in which
independent variable X can affect dependent variable Y.
• The basic relationship between X and Y is given by
Ŷ = a + bX
where the symbol Ŷ denotes the estimated value of Y for a given value of X.
• This equation is known as the regression equation of Y on X (it also represents the regression line of Y on X when drawn on a graph). It means that each unit change in X produces a change of b in Y,
• b being positive for direct and negative for inverse relationships.
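The coefficients a and b are usually found by least squares: b is the ratio of the covariance of X and Y to the variance of X, and a = ȳ − b·x̄. A minimal sketch on hypothetical data (chosen to lie exactly on a line for a clean check):

```python
# Least-squares fit of the regression line Y = a + bX (hypothetical data).
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]            # exactly y = 1 + 2x

n = len(x)
mx, my = sum(x) / n, sum(y) / n
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
a = my - b * mx
print(a, b)  # 1.0 2.0
```

With real data the points scatter around the line, and a and b give the line that minimises the sum of squared vertical deviations.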
Thanks
Good luck and All the Best
FINAL EXAM – 40 MARKS

 TOPICS:
• Unit II – Writing Research Proposal
• Unit IV – Oral and Poster Presentation
• Unit V – Processing and Analysis of Data
