Editing
Field Editing Central Editing
Coding
Classification
Tabulation
Graphing
DATA EDITING
• Data editing is the process of examining collected data to detect errors and omissions and
correcting them as far as possible, so as to ensure legibility, completeness, consistency and
accuracy before proceeding further.
• The recorded data must be legible so that it can be coded later. An illegible response may be
corrected by contacting the person who recorded it, or it may be inferred from other parts of
the questionnaire.
• Completeness means that every item in the questionnaire is fully answered.
• Editing is of two types:
Field Editing: This type of editing deals with abbreviated or illegible written forms of the
gathered data. It is carried out alongside the collection of data and is most effective when done
on the same day as the interview or the very next day. The investigator must not jump to
conclusions while doing field editing.
Central Editing: This type of editing takes place once the whole data collection process has
been completed. Here a single or common editor corrects errors such as an entry in the wrong
place or an entry in the wrong unit. As a rule, all wrong answers should be dropped from the
final results.
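The completeness and consistency checks described above can be sketched in a few lines of Python. This is only an illustration: the records, the field names and the rule that family size must be at least 1 are assumptions made for the example, not part of the original notes.

```python
# Hypothetical questionnaire records; None marks an item left blank.
records = [
    {"unit": 1, "km_per_day": 20, "family_size": 3},
    {"unit": 2, "km_per_day": None, "family_size": 1},
    {"unit": 3, "km_per_day": 25, "family_size": -4},
]


def edit_checks(record):
    """Return a list of problems found in one record."""
    problems = []
    # Completeness: every item must be filled in.
    for field, value in record.items():
        if value is None:
            problems.append(f"{field}: missing")
    # Consistency: an assumed rule that a family has at least one member.
    size = record.get("family_size")
    if size is not None and size < 1:
        problems.append("family_size: must be at least 1")
    return problems


# Keep only the records that need the editor's attention.
flagged = {}
for record in records:
    problems = edit_checks(record)
    if problems:
        flagged[record["unit"]] = problems

print(flagged)
```

Records that pass every check are left alone; the flagged ones go back to the editor for correction or are dropped.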
BENEFITS OF DATA EDITING
Column 1  Column 2    Column 3  Column 4  Column 5        Column 6
Unit      Occupation  Vehicle   Km/day    Marital status  Family size
1         4           1         20        1               3
2         3           2         25        2               1
3         5           1         25        1               4
4         2           1         15        2               2
5         4           2         20        2               4
6         5           2         35        2               6
7         1           1         40        1               3
8         5           2         20        2               4
CLASSIFICATION OF DATA
• Classification of data means that the collected raw data is categorized into groups that
share a common feature.
• Data having common characteristics are placed in a common or homogenous group.
• The entire data collected is categorized into various groups or classes, which convey a
meaning to the researcher.
Classification is done in two ways:
• Classification according to attributes: only the presence or absence of an attribute in an
individual item can be noticed. Here the data is classified on the basis of common descriptive
characteristics such as literacy, sex, honesty, marital status etc.
• Descriptive features are qualitative in nature and cannot be measured quantitatively, but they
are still taken into account while making an analysis.
• Classification according to class intervals: the range of a variable is divided into classes,
each of a given size.
• Numerical features of data can be measured quantitatively and analyzed with the help of
some statistical unit; data relating to income, production, age, weight etc. come under
this category. This type of data is known as statistics of variables, and the data is classified
by way of intervals.
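The two ways of classifying can be illustrated with a short Python sketch. The values are the Km/day and Marital status columns of the coded table earlier in these notes, as read from that table. The class width of 10 starting at 10 is an assumption chosen for the example, and the meaning of the marital status codes is not given in the notes.

```python
from collections import Counter

km_per_day = [20, 25, 25, 15, 20, 35, 40, 20]
marital_status = [1, 2, 1, 2, 2, 2, 1, 2]  # coded attribute, meaning of codes not given

# Classification according to attributes: count the items that carry each code.
by_attribute = Counter(marital_status)


def class_interval(value, start=10, width=10):
    """Return the (lower, upper) class limits that contain value (assumed classes)."""
    lower = start + ((value - start) // width) * width
    return (lower, lower + width - 1)


# Classification according to class intervals: place each value in its class.
by_interval = Counter(class_interval(v) for v in km_per_day)

print(dict(by_attribute))
for limits in sorted(by_interval):
    print(limits, by_interval[limits])
```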
CLASSIFICATION
• Classification of data is the process of arranging data in groups or classes on the basis of
common characteristics.
Attributes: only their presence or absence in individual items can be noticed.
Class-intervals: size of each class into which a range of a variable is divided.
TABULATION
• Tabulation is the process of summarizing raw data and displaying it in compact
form (i.e., in the form of statistical tables) for further analysis.
• Tabulation is an orderly arrangement of data in columns and rows.
• Tabulation summarizes the raw data and displays it in the form of statistical
tables.
• Simple tabulation results in a one-way table, which can be used to answer questions related to
one characteristic of the data.
• Complex tabulation results in a two-way table, which gives information about two related
characteristics of the data.
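A one-way and a two-way table can be built with a minimal Python sketch like the one below. It uses the Vehicle and Marital status columns of the coded table earlier in these notes, as read from that table. Pairing these two columns is only an illustrative choice.

```python
from collections import Counter

vehicle = [1, 2, 1, 1, 2, 2, 1, 2]
marital_status = [1, 2, 1, 2, 2, 2, 1, 2]

# Simple tabulation: one characteristic, one-way table.
one_way = Counter(vehicle)

# Complex tabulation: two related characteristics, two-way table.
two_way = Counter(zip(vehicle, marital_status))

print("Vehicle  Count")
for code in sorted(one_way):
    print(f"{code:<8} {one_way[code]}")

print()
print("Vehicle  Marital=1  Marital=2")
for v in sorted(set(vehicle)):
    print(f"{v:<8} {two_way[(v, 1)]:<10} {two_way[(v, 2)]}")
```

A Counter returns 0 for a combination that never occurs, so empty cells of the two-way table need no special handling.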
GRAPHING OF DATA
• Visual representation of data
• Data are presented as absolute numbers or percentages
• The most informative graphs are simple and self-explanatory
GRAPHICAL REPRESENTATION
• Graphs help to understand the data easily.
• Most common graphs are bar charts and pie charts.
In a bar chart, a bar shows each category, the length of which represents the amount,
frequency or percentage of values falling into a category
The pie chart is a circle broken up into slices that represent categories. The size of each slice
of the pie varies according to the percentage in each category
GRAPHICAL REPRESENTATION
• Histogram: A graph of the data in a frequency distribution is called a
histogram.
• Polygon: A percentage polygon is formed by letting the midpoint of
each class represent the data in that class and then connecting the
sequence of midpoints at their respective class percentages.
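The numbers behind a histogram and a percentage polygon can be worked out as in the following sketch. It uses the Km/day column of the coded table earlier in these notes. The class limits are an assumption for the example (lower limit included, upper limit excluded), and the histogram is drawn as text so that no plotting library is needed.

```python
km_per_day = [20, 25, 25, 15, 20, 35, 40, 20]

# Assumed class limits: a value v belongs to a class when lower <= v < upper.
class_limits = [(10, 20), (20, 30), (30, 40), (40, 50)]

# Frequency distribution: how many values fall in each class.
frequencies = [
    sum(1 for v in km_per_day if lower <= v < upper)
    for lower, upper in class_limits
]

# Percentage polygon: class midpoint paired with the class percentage.
midpoints = [(lower + upper) / 2 for lower, upper in class_limits]
percentages = [100 * f / len(km_per_day) for f in frequencies]
polygon_points = list(zip(midpoints, percentages))

# Text histogram: one '#' per observation in the class.
for (lower, upper), f in zip(class_limits, frequencies):
    print(f"{lower}-{upper} | {'#' * f}")

print(polygon_points)
```

Joining the points in polygon_points with straight lines gives the percentage polygon.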
DATA CLEANING
• Checking the data for consistency and treating missing values.
DATA ADJUSTING
• Data adjusting is not always necessary but it may improve the quality of analysis sometimes.
Problems in Processing of Data in Research Methodology
• The following two problems arise when processing data for analytical purposes:
• The problem concerning “Don’t know” (or DK) responses
• Use of percentages
DK Responses
• When the DK response group is small, it is of little significance.
• But when it is relatively big, it becomes a matter of major concern in which case the question
arises: Is the question which provoked DK response useless?
• The answer depends on two points viz.,
• the respondent actually may not know the answer or
• the researcher may fail in obtaining the appropriate information.
• In the first case the question concerned is said to be all right and the DK response is taken
as a genuine DK response.
• But in the second case, the DK response is more likely to be a failure of the questioning process.
Use of percentages
• Percentages are often used in data presentation for they simplify numbers, reducing all of them
to a 0 to 100 range.
• Through the use of percentages, the data are reduced to a standard form with base equal to
100, which facilitates relative comparisons.
• While using percentages, the following rules should be kept in view by researchers:
1. Two or more percentages must not be averaged unless each is weighted by the group size
from which it has been derived.
2. Use of too large percentages should be avoided, since a large percentage is difficult to
understand and tends to confuse, defeating the very purpose for which percentages are
used.
3. Percentages hide the base from which they have been computed. If this is not kept in view,
the real differences may not be correctly read.
4. Percentage decreases can never exceed 100 per cent and as such for calculating the
percentage of decrease, the higher figure should invariably be taken as the base.
5. Percentages should generally be worked out in the direction of the causal-factor in case of
two-dimension tables and for this purpose we must select the more significant factor out
of the two given factors as the causal factor.
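Rules 1 and 4 can be checked with a little arithmetic. The sketch below uses made-up figures (two groups of 50 and 150 respondents, and a fall from 200 to 150); none of these numbers come from the original notes.

```python
# Rule 1: percentages from groups of different size must be weighted.
# Each entry is (percentage, group size); the figures are hypothetical.
groups = [(40.0, 50), (60.0, 150)]

simple_average = sum(p for p, _ in groups) / len(groups)
weighted_average = sum(p * n for p, n in groups) / sum(n for _, n in groups)


# Rule 4: for a decrease, the higher figure is taken as the base.
def percentage_decrease(higher, lower):
    """Percentage fall from the higher figure to the lower one."""
    return 100 * (higher - lower) / higher


print(simple_average)                 # unweighted, misleading here
print(weighted_average)               # weighted by group size
print(percentage_decrease(200, 150))
```

With these figures the unweighted average is 50 per cent while the weighted one is 55 per cent, which is the overall percentage for the 200 respondents taken together. Because the base is the higher figure, the decrease can never come out above 100 per cent.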
ANALYSIS OF DATA
• Analysis means the computation of certain indices or measures
along with searching for patterns of relationship that exist
among the data groups
DESCRIPTIVE ANALYSIS
• The study of the distribution of variables is termed descriptive
analysis. If we are studying one variable it is termed
uni-variate analysis, in the case of two variables bi-variate analysis,
and in the case of three or more variables multi-variate
analysis.
UNI-VARIATE ANALYSIS
• Univariate analysis refers to the analysis of one variable at a time. The commonest
approaches are as follows:
• Frequency tables
• Measures of central tendency:
o Arithmetic mean
o Median
o Mode
• Measures of dispersion:
o Range
o Mean deviation
o Standard deviation
• Diagrams:
o Bar charts
o Pie charts
o Histogram
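The measures listed above can be computed with Python's standard statistics module, as in this sketch. It uses the Km/day column of the coded table earlier in these notes. Two choices here are assumptions: the mean deviation is taken about the arithmetic mean, and the standard deviation uses the population formula (divide by n).

```python
import statistics

km_per_day = [20, 25, 25, 15, 20, 35, 40, 20]

# Measures of central tendency
mean = statistics.mean(km_per_day)
median = statistics.median(km_per_day)
mode = statistics.mode(km_per_day)

# Measures of dispersion
value_range = max(km_per_day) - min(km_per_day)
mean_deviation = sum(abs(v - mean) for v in km_per_day) / len(km_per_day)
std_deviation = statistics.pstdev(km_per_day)  # population standard deviation

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("range:", value_range)
print("mean deviation:", mean_deviation)
print("standard deviation:", round(std_deviation, 2))
```

If the eight units are treated as a sample rather than the whole population, statistics.stdev (divide by n - 1) would be used instead of statistics.pstdev.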
BIVARIATE ANALYSIS
• Bivariate analysis is concerned with the analysis of two
variables at a time in order to uncover whether the two
variables are related
Main types:
Simple Correlation
Simple Regression
Two-Way ANOVA( Analysis of Variance)
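Simple correlation between two variables can be sketched as below, using Pearson's coefficient r. The data are the Km/day and Family size columns of the coded table earlier in these notes. Pairing these two variables is only an illustrative choice, and a correlation by itself says nothing about cause.

```python
import math

km_per_day = [20, 25, 25, 15, 20, 35, 40, 20]
family_size = [3, 1, 4, 2, 4, 6, 3, 4]


def pearson_r(x, y):
    """Pearson's coefficient of correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products and squares of deviations from the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(km_per_day, family_size)
print(round(r, 3))
```

The value of r always lies between -1 and +1. A positive value indicates that the two variables tend to move in the same direction.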
MULTI-VARIATE ANALYSIS
• Multivariate analysis entails the simultaneous analysis of three
or more variables
Main Types
Multiple Correlation
Multiple Regression
Multi-ANOVA (Analysis of Variance)
CAUSAL ANALYSIS
• Causal analysis is concerned with the study of how one or
more variables affect changes in another variable
INFERENTIAL ANALYSIS
• Inferential analysis is concerned with testing
hypotheses and estimating population values based on
sample values.
PARAMETRIC TESTS
• These tests depend upon assumptions, typically that the
population(s) from which the data are randomly sampled have a
normal distribution. Types of parametric tests are:
• t-test
• z-test
• F-test
• χ²-test
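As one illustration of the tests named above, the sketch below computes the statistic for a one-sample t-test. The notes only name the test, so the formula used, t = (sample mean - hypothesized mean) / (s / √n), is the standard one rather than something taken from these slides. The hypothesized mean of 20 is an assumption, and the data are the Km/day column of the coded table earlier in these notes.

```python
import math
import statistics

km_per_day = [20, 25, 25, 15, 20, 35, 40, 20]
hypothesized_mean = 20  # hypothetical value to test against

n = len(km_per_day)
sample_mean = statistics.mean(km_per_day)
sample_sd = statistics.stdev(km_per_day)  # sample standard deviation (n - 1)

# t = (sample mean - hypothesized mean) / standard error of the mean
standard_error = sample_sd / math.sqrt(n)
t_statistic = (sample_mean - hypothesized_mean) / standard_error
degrees_of_freedom = n - 1

print("t =", round(t_statistic, 3), "with", degrees_of_freedom, "degrees of freedom")
```

The computed t is then compared with the tabulated value of t for the same degrees of freedom at the chosen level of significance. That comparison is not carried out here.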
NON PARAMETRIC TEST
• Do not involve population parameters
• Examples: probability distributions, independence
• The role of statistics in research is to function as a tool in designing research, analyzing its data
and drawing conclusions therefrom.
• Most research studies result in a large volume of raw data which must be suitably reduced so that
the same can be read easily and can be used for further analysis.
• The important statistical measures that are used to summarize the survey/research data are:
• measures of central tendency or statistical averages;
• measures of dispersion;
• measures of asymmetry (skewness);
• measures of relationship; and
• other measures.
SIMPLE REGRESSION ANALYSIS
• Regression Analysis is the determination of a statistical relationship between two or more
variables.
• In simple regression, we have only two variables, one variable (defined as independent) is the
cause of the behaviour of another one (defined as dependent variable).
• Regression can only interpret what exists physically i.e., there must be a physical way in which
independent variable X can affect dependent variable Y.
• The basic relationship between X and Y is given by
Ŷ = a + bX
where the symbol Ŷ denotes the estimated value
of Y for a given value of X.
• This equation is known as the regression equation of Y on X (it also represents the regression
line of Y on X when drawn on a graph), which means that each unit change in X produces a change
of b in Y, which is positive for direct and negative for inverse relationships.
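The constants a and b are usually found by the method of least squares, which the notes do not spell out. The sketch below assumes that method, with b = Σ(X - X̄)(Y - Ȳ) / Σ(X - X̄)² and a = Ȳ - bX̄. The five (X, Y) pairs are invented for the example.

```python
# Hypothetical observations of the independent (X) and dependent (Y) variable.
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]


def fit_simple_regression(x, y):
    """Least-squares estimates (a, b) for the line Y-hat = a + b*X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = s_xy / s_xx           # change in Y per unit change in X
    a = mean_y - b * mean_x   # intercept: value of Y-hat when X = 0
    return a, b


a, b = fit_simple_regression(x_values, y_values)
estimate_at_6 = a + b * 6  # estimated Y for X = 6

print(round(a, 2), round(b, 2), round(estimate_at_6, 2))
```

Here b comes out positive, so the fitted relationship is a direct one: Ŷ rises by b for each unit increase in X.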
Thanks
Good luck and All the Best
FINAL EXAM – 40 MARKS
TOPICS :
• Unit II - Writing Research Proposal
• Unit IV - Oral and Poster Presentation
• Unit V - Processing and Analysis of Data