

Analyzing quantitative data

I. Descriptive statistics
The first step in data analysis is to describe or summarize the data using descriptive statistics. The
purpose of descriptive statistics is to enable the researcher to meaningfully describe a distribution of
scores or measurements using a few indices or statistics. The statistics or indices used depend on the
type of variables in the study and the scale of measurement used (ratio, interval, ordinal, or nominal)
(Mugenda and Mugenda, 2003).
Techniques of quantitative data analysis
1.1 Tabulation
When a mass of data has been assembled, it becomes necessary for the researcher to arrange the same
in some kind of concise and logical order. This procedure is referred to as tabulation. Thus tabulation is
the process of summarizing raw data and displaying the same in compact orderly form for further
analysis. Tabulation is essential for the following reasons:
a. It conserves space and reduces explanatory and descriptive statements to a minimum.
b. It facilitates the process of comparison
c. It facilitates the summation of items and the detection of errors and omissions.
d. It provides a basis for various statistical comparisons
Tabulation can be done by hand or by mechanical or electronic devices. Hand tabulation is usually
preferred for small inquiries, where the questionnaires are few and relatively short. It may be done
using the direct tally, the list-and-tally, or the card-sort-and-count method.
1.2 Graphical representation
A graph should be well labeled on both the vertical and horizontal axes. A graph should also have a title.
There are three types of graphs commonly used to present data in research reports. These are
histograms, frequency polygons and bar charts.

1.2.1 Histograms
A histogram comprises a series of adjacent bars whose heights (y-axis) represent the number of subjects
obtaining a particular score or number of respondents belonging to a particular category. The scores are
usually on the horizontal axis (x-axis).

1.2.2 Frequency Polygons
A polygon is a many-sided figure, hence the name frequency polygon. In plotting a
frequency polygon, one must establish the midpoint of the class interval. The midpoint is
established by summing up the lower and the upper class limits of each class interval and
dividing by two. e.g.
Lower class limit 20.5
Upper class limit 25.5
Midpoint = (20.5 + 25.5) ÷ 2 = 23

The midpoints are then plotted against the frequencies and the points joined using straight lines.
A frequency polygon is a closed figure, so its ends meet the horizontal axis one unit below the
lowest score and one unit above the highest score.

1.2.3 Bar charts
Bar charts are preferred when data are discrete or categorical, or when the scale is nominal or non-ordered.
This is mainly because the categories in a nominal scale do not imply any order. The bar
chart is very much like the histogram except that spaces are left between the bars to signify a
lack of continuity or flow between categories.
[Figure: bar chart titled "Commuting Distance of Employees"; axis: Number of Employees]

1.2.4 Pie chart
A pie chart can also be used to represent the data. It represents the data as proportions of the degrees
in a circle; at a glance, one can tell which group dominates the representation.

1.2.5 Scatter diagrams
They are used to represent the relationship between two variables on a graph.
[Figure: bar chart of marital status with categories single, married, divorced, and separated]

1.3.1 Measures of central tendency
Measures of central tendency are numbers that define the location of a distribution. For example if we
regard all measurements as being attempts to give us the true value of a particular phenomenon, we
can regard the center of the distribution of a set of measurements as an estimate of that true value.
They include the mean, the mode, and the median.

The mean
To obtain it we add up the scores and divide by the number of scores. The more common term is the
average, the more technical term, used here, is the mean. Two features of the mean should be
mentioned. The first is technical: it is the point in a distribution about which the sum of the squared
deviations is at a minimum. This makes it important for estimating variance, and for least squares
analysis. The second feature is that the mean is a very effective statistic where scores within a
distribution do not vary too much, but it is not so effective when there is great variance. Therefore it is
important to know how much spread or variability there is in a set of scores in order to interpret the
mean correctly (Keith, 2003). It is the average of a set of scores or measurements (Mugenda and
Mugenda, 2003). According to Kothari (2011), the mean, also known as the arithmetic average, is the most
common measure of central tendency and may be defined as the value obtained by dividing the
total of the values of the various items in a series by the total number of items. The researcher uses
the mean as a general value that represents the phenomenon.

The median
It is the value of the middle item of a series when it is arranged in ascending or descending order of
magnitude. It divides the series into two halves: in one half all items have values less than the median,
whereas in the other half all items have values higher than the median (Kothari, 2011).
In other words the median is the point below and above which 50% of the scores fall.
The median can also be used as the value that represents the data in the research.

The mode
It is the most commonly or frequently occurring value in a series. The mode in a distribution is the item
around which there is maximum concentration. In general, mode is the size of the item which has the
maximum frequency. Like median, mode is a positional average and is not affected by the values of the
extreme items. It is therefore useful in all situations where we want to eliminate the effects of the
extreme items. Mode is particularly useful in the study of popular sizes; e.g., a manufacturer of shoes is
usually interested in finding out the size most in demand so that he may manufacture a larger quantity
of that size.
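The three measures described above can be computed with Python's standard-library statistics module. The scores below are hypothetical, used only to illustrate the calculations:

```python
# Mean, median, and mode for a hypothetical set of scores,
# using Python's standard-library statistics module.
import statistics

scores = [62, 70, 70, 75, 80, 85, 90]  # hypothetical scores

mean = statistics.mean(scores)      # sum of scores / number of scores
median = statistics.median(scores)  # middle value of the ordered series
mode = statistics.mode(scores)      # most frequently occurring value

print(mean, median, mode)  # 76.0, 75, 70
```

Note that the mean (76.0) is pulled toward the higher scores, while the mode (70) simply reports the most repeated value, illustrating why the three measures can disagree.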
1.3.4 Measures of variability/dispersion
Variability is the spread of scores around a particular central score or value.
Purpose of variability
Measures of variability help the researcher to see how spread out the scores or measures of each
variable are (Mugenda and Mugenda, 2011). The various ways of describing this spread are called
measures of dispersion; these measures quantify the variability of the distribution. They consist of:

Range
Range is the simplest possible measure of dispersion and is defined as the difference between the
values of the extreme items of a series.

Thus range =highest value of an item in a series-lowest value of an item in a series
Scores: 78,79,80,81,82,85
The range is 85 − 78 = 7
A big weakness of the range as a measure of variability is that it only involves two scores that is the
highest and the lowest scores, it is therefore not sensitive to the total distribution.
When the range is small, the items of the distribution are said to be homogeneous. When the range is
large, the items of the distribution are not uniform.

Mean deviation
It is the average of the differences of the values of items from some average of the series. Such a
difference is technically described as a deviation.

Standard deviation and variance
The standard deviation is defined as the extent to which scores in a distribution deviate from their mean
or average. The standard deviation, therefore involves subtracting the mean from each score to obtain
the deviation. If we square each deviation, sum the squared deviations and then divide this total by the
degrees of freedom, we have a measure of variability called variance. If the value is small, it implies that
the variance is small. This means that the scores are close together. If the value is large, it implies large
variance and therefore the scores are more spread out.
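The procedure just described (subtract the mean, square the deviations, sum them, divide by the degrees of freedom) can be sketched in Python. The scores are hypothetical, and taking the square root of the variance then yields the standard deviation:

```python
# Sample variance and standard deviation, following the steps described:
# subtract the mean from each score, square the deviations, sum them,
# and divide by the degrees of freedom (n - 1). Scores are hypothetical.
import math

scores = [78, 79, 80, 81, 82, 85]
n = len(scores)
mean = sum(scores) / n

squared_deviations = [(x - mean) ** 2 for x in scores]
variance = sum(squared_deviations) / (n - 1)   # degrees of freedom = n - 1
std_dev = math.sqrt(variance)                  # square root of the variance

print(variance, std_dev)
```

A small result here would indicate scores clustered close together; a large one, scores that are more spread out.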
By taking the square root of the variance, one gets the standard deviation. The bigger the value of the
standard deviation, the larger the deviations from the mean, denoting greater variability. A small
standard deviation denotes less variability of scores in the distribution.

Frequency distributions
Simple frequency distributions are a useful way to summarize and understand data (Keith, 2003). The
individual scores in the distribution are tabulated according to how many respondents achieved each
score or gave each response, or fell into each category. They also help the researcher to stay close to the

data, at least in the initial stages of the analysis. There is great benefit to getting a hands-on feel of the
data, especially when the availability of computerized programs makes it so easy for the researcher to
be removed from the data. According to Mugenda and Mugenda (2004), in social science research, frequency may
also refer to the number of subjects in a given category. For example, a frequency distribution of the
variable marital status
Scores        Frequency (f)
Single        30
Married       60
Divorced      20
Separated     10
              n = 120
Frequency distribution of marital status

Grouped Frequency
When scores are grouped into class intervals, the result is referred to as a grouped frequency distribution.
The number of class intervals for each distribution depends on the sample size and the range of scores
obtained. However, class intervals should be between 10 and 15 in number.
Score limits and exact limits
Score limits refer to situations where whole numbers are used to define the limits of class intervals.
Example: Newspapers
These are the numbers of newspapers sold at a local shop over the last 10 days:
22, 20, 18, 23, 20, 25, 22, 20, 18, 20
Let us count how many of each number there is:

Papers Sold Frequency
18 2
19 0
20 4
21 0
22 2
23 1
24 0
25 1
It is also possible to group the values. Here they are grouped in 5s:
Papers Sold Frequency
15-19 2
20-24 7
25-29 1
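The two tallies above can be reproduced with collections.Counter from Python's standard library, counting each value directly for the simple distribution and mapping each value into an interval of width 5 for the grouped one:

```python
# Tallying the newspaper data into a simple frequency distribution
# and a grouped frequency distribution (intervals of width 5),
# using collections.Counter from the standard library.
from collections import Counter

papers_sold = [22, 20, 18, 23, 20, 25, 22, 20, 18, 20]

# Simple frequency distribution: value -> number of days it occurred
freq = Counter(papers_sold)
for value in sorted(freq):
    print(value, freq[value])

# Grouped frequency distribution: interval width 5, starting at 15
grouped = Counter((x - 15) // 5 for x in papers_sold)
for k in sorted(grouped):
    lower = 15 + 5 * k
    print(f"{lower}-{lower + 4}", grouped[k])
```

The output matches the tables above: 20 papers occurs four times, and the 20-24 interval holds seven of the ten days.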
Principles governing conversion of raw scores into grouped frequencies
i. All intervals should be the same width. Unequal class intervals cause problems when advanced
statistical work is needed.
ii. Intervals should be continuous throughout the distribution. That is even if there are no scores in
a particular class interval, that class interval should be retained and a frequency of zero
indicated against it.
iii. Too few class intervals lead to loss of accuracy, and too many class intervals result in
inconvenience. Normally the number of class intervals should range between 10 and 15.


2. Inferential statistics
The ultimate purpose of research is to be able to generalize the results from samples to populations. We
use hypothesis testing technique to generalize from the sample to the population. These techniques are
often referred to as inferential statistics. Inferential statistics are concerned with determining how likely
it is for the results obtained from a sample to be similar to results expected from the entire population.
a. Choice of a test of hypothesis
There are various statistical procedures used in testing hypotheses, and the choice of
procedure depends on the following factors:
b. Size of the sample
In testing hypotheses, some data analysis procedures cannot be used if the sample size is too small.
c. Type of variable and measurement scale
The type of data analysis procedure used sometimes depends on whether the variables are continuous or
discrete. Similarly, the measurement scale (ratio, interval, ordinal, nominal) will determine the procedure
one should use to analyze the data.
d. Types of research design
Statistical data analysis procedures also differ depending on the research design; e.g., data from an
experimental study that compares differences between two or more groups is best analyzed using
analysis of variance (ANOVA). Relationships and predictions among variables are best determined using
correlation and regression techniques.
Statistical procedures used in inferential statistics

Correlation
Correlation tells us the direction and strength of relationships between variables: both how the variables
are related and how much they are related.

Simple correlation
The most widely-used type of correlation coefficient is Pearson r, also called linear or product-
moment correlation. Pearson correlation assumes that the two variables are measured on at least
interval scales and it determines the extent to which values of the two variables are "proportional" to
each other. The value of correlation (i.e., correlation coefficient) does not depend on the specific
measurement units used; for example, the correlation between height and weight will be identical
regardless of whether inches and pounds, or centimeters and kilograms are used as measurement units.
Proportional means linearly related; that is, the correlation is high if it can be "summarized" by a straight
line (sloped upwards or downwards).
Correlation coefficient
The computation of a correlation coefficient yields a statistic that ranges from -1 to +1. This statistic is
called a correlation coefficient. The correlation coefficient tells the researcher:
1. The magnitude of the relationship between two variables. The bigger the coefficient (absolute
value) the stronger the association between the two variables.
2. The direction of the relationship between the two variables. If the correlation coefficient is
positive (+), it means that there is a positive relationship between the two variables. A positive
relationship means that as variable x1 increases, variable x2 increases as well or as variable x1
decreases, variable x2 decreases. A negative relationship (-) means that as variable x1 decreases,
variable x2 increases and vice versa.
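The magnitude and direction just described can be seen by computing Pearson r directly. The sketch below uses hypothetical height and weight figures, and also illustrates the point made above that the coefficient does not depend on the measurement units:

```python
# Pearson product-moment correlation for two hypothetical variables.
# r ranges from -1 to +1: the sign gives the direction of the
# relationship, the absolute value gives its magnitude.
import math

def pearson_r(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Numerator: sum of the products of paired deviations
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Denominator: square root of the product of the sums of squared deviations
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x) *
                    sum((b - mean_y) ** 2 for b in y))
    return num / den

height = [150, 160, 165, 170, 180]  # hypothetical heights (cm)
weight = [50, 58, 62, 66, 75]       # hypothetical weights (kg)

print(pearson_r(height, weight))    # close to +1: strong positive relationship

# Converting to inches and pounds leaves r unchanged (unit independence)
print(pearson_r([h / 2.54 for h in height], [w * 2.2 for w in weight]))
```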
The importance of correlation
1. Correlation analysis takes one a step further by examining how various variables are related.
2. Determining the strength and direction of the association between two variables is very
important because this piece of information forms the basis for selecting variables for further
statistical analysis, e.g., regression analysis.


Regression analysis is a type of analysis used when a researcher is interested in finding out whether an
independent variable predicts a given dependent variable. Regression analysis could be categorized into:
I. Simple regression
II. Multiple regression
Simple regression is used when the researcher is dealing with only one independent variable and one
dependent variable.
A researcher might be interested in finding out whether education predicts the financial status of
households. In this example, education is the independent variable and financial status is the dependent
variable.

Multiple regression
Multiple regression attempts to determine whether a group of variables together predict a given
dependent variable.
A researcher might be interested in finding out whether age, education, household size, and marital
status influence the financial status of households. The four independent variables are considered
together as predictors of financial status.
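A minimal sketch of simple regression by least squares, using hypothetical education and financial-status figures (multiple regression, with several predictors, would normally be run with statistical software rather than by hand):

```python
# Least-squares simple regression: one independent variable (x)
# predicting one dependent variable (y). Data are hypothetical,
# e.g., years of education vs. a financial-status score.

def simple_regression(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope b = sum of deviation products / sum of squared x deviations
    b = (sum((a - mean_x) * (c - mean_y) for a, c in zip(x, y)) /
         sum((a - mean_x) ** 2 for a in x))
    a = mean_y - b * mean_x  # intercept
    return a, b

education = [8, 10, 12, 14, 16]  # hypothetical years of schooling
status = [20, 28, 33, 40, 49]    # hypothetical financial-status scores

intercept, slope = simple_regression(education, status)
predicted = intercept + slope * 13  # predicted status for 13 years of schooling
print(intercept, slope, predicted)
```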
Chi-square test
The chi-square (χ²) test is a statistical technique which attempts to establish a relationship between
two variables, both of which are categorical in nature. For example, we may want to test the hypothesis
that there is a relationship between gender and the number of road accidents caused by drivers. The
variable gender is categorized as male and female. The variable number of accidents is categorized as
none, few or many.

The chi-square technique therefore applies to counts occurring in two or more mutually exclusive
categories. The technique compares the proportion observed in each category with what would be
expected under the assumption of independence between the two variables.
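The comparison of observed and expected counts can be sketched directly. The counts below are hypothetical, and the expected count for each cell is computed under independence as (row total × column total) / grand total:

```python
# Chi-square statistic for a contingency table of two categorical
# variables: gender (rows) vs. number of accidents (columns).
# Counts are hypothetical. Expected counts assume independence:
# expected = (row total * column total) / grand total.

observed = [
    [20, 30, 50],   # male:   none, few, many
    [40, 30, 30],   # female: none, few, many
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

print(chi_square)  # compared against a chi-square distribution with
                   # (rows - 1) * (columns - 1) degrees of freedom
```

A large statistic means the observed proportions depart substantially from what independence would predict.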
The Analysis of Variance (ANOVA)
Analysis of variance is a data analysis procedure that is used to determine whether there are significant
differences between two or more groups or samples at a selected probability level. The questions to be
answered by analysis of variance are: what is the probability that the variation among a group of sample
means has occurred as a result of randomly selecting the samples from a common population? Are the
differences among the groups due to the treatments given or to chance?
One way analysis of variance
This refers to analysis of variance where groups are being compared on only one variable but at
different levels. In other words, there is only one independent variable, measured at either the ordinal or
nominal level. The dependent variable is measured at either the ratio or interval scale.
A researcher might be interested in finding out whether teaching methods influence performance in
class. For this purpose, a class might be randomly divided into three groups and a different method of
teaching used for each group.
Group 1: Lecture method
Group 2: Discussion method
Group 3: Individual study
In this example, the independent variable (teaching method) is measured at the nominal scale and the
dependent variable (performance) is measured at the interval scale. The hypothesis being tested here is
whether type of teaching method makes a difference in performance among students.
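The F statistic for this design compares variation between the group means with variation within the groups. A sketch with hypothetical scores for the three teaching-method groups:

```python
# One-way ANOVA F statistic for three hypothetical teaching-method
# groups. F = between-groups mean square / within-groups mean square.

groups = [
    [70, 75, 80, 85],   # Group 1: lecture method (hypothetical scores)
    [60, 65, 70, 65],   # Group 2: discussion method
    [55, 60, 50, 55],   # Group 3: individual study
]

all_scores = [s for g in groups for s in g]
grand_mean = sum(all_scores) / len(all_scores)
k = len(groups)       # number of groups
n = len(all_scores)   # total number of subjects

# Between-groups sum of squares (numerator, k - 1 degrees of freedom)
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
# Within-groups sum of squares (denominator, n - k degrees of freedom)
ss_within = sum(sum((s - sum(g) / len(g)) ** 2 for s in g) for g in groups)

f_statistic = (ss_between / (k - 1)) / (ss_within / (n - k))
print(f_statistic)
```

The resulting F is then compared against the F distribution with (k − 1, n − k) degrees of freedom at the chosen probability level.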

Two way analysis of variance
Often, researchers are interested in comparing two or more groups on more than one variable. For
example, a researcher might want to compare mastery of the history content of standard eight pupils.
Comparisons of mean scores for males and females and also performances of different schools could be
included in the same study.
School              Gender
                    Mean score    Mean score
Hospital Hill       60            70
St George           80            80
Muthaiga Primary    50            70
Nairobi Primary     60            90

Two-way analysis of variance enables the researcher to make three types of comparisons. In the above
example, the mean score of females and males are compared keeping the schools they come from
constant. The same analysis of variance yields a comparison of the mean scores of subjects from
different schools keeping the gender constant. We can also obtain more information by comparing
subjects on the two variables, namely gender and schools, and determining whether the interaction
between the two variables is statistically significant.
The t-test
It is a special case of ANOVA which is used to test whether there are significant differences between
two means derived from two samples or groups at a specified probability level. For example, a
researcher might want to compare IQ performance from rural and urban children.
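Such a comparison of two means can be sketched with a pooled-variance t statistic. All scores below are hypothetical:

```python
# Independent-samples t statistic comparing two hypothetical groups,
# e.g., IQ scores of rural vs. urban children, using a pooled variance.
import math

rural = [95, 100, 105, 98, 102]    # hypothetical IQ scores
urban = [104, 110, 108, 101, 107]  # hypothetical IQ scores

def mean(xs):
    return sum(xs) / len(xs)

def ss(xs):
    # Sum of squared deviations from the mean
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

n1, n2 = len(rural), len(urban)
# Pooled variance: combined sums of squares over combined degrees of freedom
pooled_var = (ss(rural) + ss(urban)) / (n1 + n2 - 2)
t = (mean(rural) - mean(urban)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

print(t)  # compared against a t distribution with n1 + n2 - 2 df
```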

Analysis of variance yields the F-statistics. The researcher must decide on the probability level desired to
determine whether the calculated F-statistic is significant or not. It should also be noted that there are
two types of degrees of freedom associated with the F-statistic. These are the numerator and the
denominator degrees of freedom.
The ANOVA statistic is reported like this:
The results show that sucking lollipops significantly increases the IQ of college men, F(3,17) = 2.87, p = .007.
F - the test statistic
3 - the model degrees of freedom (numerator)
17 - residual degrees of freedom (denominator)
2.87 - value of F, the test statistic
.007 - the p value (the probability of obtaining such a result if H0 were true)
Tests of significance
Once a researcher has obtained the results from various data analysis procedures, he or she must decide
whether the results are statistically significant. The F-test and the t-test are commonly used to evaluate
the significance of various statistical indices. The regression coefficient b, the chi-square statistic χ², the
correlation coefficient r, and the coefficient of determination R² must all be tested for statistical significance. The
researcher must select and apply the appropriate test of significance. The test of significance helps us to
decide whether we can reject the null hypothesis. In general terms, a test of significance helps the
researcher to determine whether obtained results truly hold at a given confidence level.

Steps of testing the hypothesis
Step A: Null and alternative hypotheses
The first step of statistical testing is to convert the research question into null and alternative
forms. We use the notation H0 to represent the null hypothesis and H1 (or Ha) to denote the
alternative hypothesis. H0 is a statement of no difference. This is the hypothesis that the
researcher hopes to reject. H1 opposes H0. We retain the premise of the null hypothesis until
proved otherwise. This has a basis in [quasi-]deduction and is analogous to the presumption of
innocence in a criminal trial.
Step B: Error threshold (α)
If we wish to reach a yes-or-no decision, fixed-level testing must be
pursued. To pursue fixed-level testing, we set an error threshold for the decision. The error
threshold, called alpha (α), is the probability the researcher is willing to take of incorrectly
rejecting a true H0. For example, the researcher may be willing to take a 1% chance of
incorrectly rejecting a true H0.
Step C: Test Statistic
A test statistic is calculated. There are different test statistics depending on the data being
tested and question being asked. In this chapter, we introduce tests of single means. For single
means tests, the null hypothesis is H0: μ = some value and the test statistic is either a zstat or a
tstat. These statistics are introduced below.
Step D: Conclusion
We convert the test statistic to a p value by placing the test statistic on its appropriate
probability distribution and determine the area under the curve beyond the test statistic. With
fixed-level testing, the p value is compared to the α level and this simple decision rule is applied: if the
p value is less than or equal to α, H0 is rejected; otherwise, H0 is retained.

With flexible significance testing, the p value answers the question: if H0 were true, how likely is a result
at least as extreme as the one observed? Thus, the smaller the p value, the stronger the evidence against
H0. As an initial rule of thumb, we might say that we ought to take note of any p value approaching .05
(or less). In the parlance of statistics, such findings denote statistical significance.
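The four steps can be walked through for a single-mean z test, assuming a known population standard deviation (all numbers below are hypothetical):

```python
# Fixed-level test of a single mean when the population standard
# deviation is known: compute a z statistic (Step C), convert it to a
# two-sided p value, and compare p with alpha (Step D). All numbers
# are hypothetical.
import math

def normal_cdf(z):
    # Standard normal cumulative distribution via the error function
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

sample_mean = 104.0   # hypothetical sample mean
mu0 = 100.0           # value stated by H0 (Step A)
sigma = 15.0          # known population standard deviation (assumed)
n = 64                # sample size
alpha = 0.05          # error threshold chosen in Step B

z = (sample_mean - mu0) / (sigma / math.sqrt(n))   # Step C: zstat
p = 2 * (1 - normal_cdf(abs(z)))                   # Step D: two-sided p value
print(z, p, "reject H0" if p <= alpha else "retain H0")
```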

Analyzing qualitative data
Data analysis refers to examining what has been collected in a survey or experiment and making
deductions or inferences. It involves uncovering underlying structures; extracting important variables,
detecting any anomalies and testing any underlying assumptions.
It involves scrutinizing the acquired information and making inferences.
Qualitative research concentrates on the study of social life in natural settings. Its richness and
complexity mean that there are different ways of looking at and analyzing social life, and therefore
multiple perspectives and practices in the analysis of qualitative data.
There is variety in techniques because there are different questions to be addressed and different
versions of social reality that can be elaborated (Coffey and Atkinson, 1996: 14)
The variety and diversity in approaches underscores the point that there is no single right way to do
qualitative data analysis- no single methodological framework. For example, Miles and Huberman
(1994:9) suggest a fairly classic set of six moves common across different types of analysis. Likewise,
Tesch (1990) while concluding that no characteristics are common to all types of analysis, nevertheless
identifies ten principles and practices which hold true for most types of qualitative analysis.
Much depends on the purposes of the research, and it is important that the method of analysis is
integrated from the start with other parts of the research, rather than being an afterthought. The
researcher should decide how to analyze data before going to the field as this will determine the
recording technique that will be used during the data collecting exercise.
Methods for the analysis of data need to be systematic, disciplined and able to be seen and described. A
key question in assessing a piece of research is: how did the researcher get to these conclusions from
these data? An answer should be given in order to have confidence in the findings put forward.

The analysis will vary with the purposes of the research, the complexity of the research design and the
extent to which conclusions can be reached easily (Orodho and Kombo, 2002: 116)
Analytical induction was developed by Znaniecki (1934), and was originally identified with the search for
universals in social life.
Currently it is often used to refer to the systematic examination of similarities between cases to develop
concepts or ideas.
Hammersley and Atkinson (1995) describe it using the following steps;
An initial definition of the phenomenon to be explained is formulated
Some cases of this phenomenon are investigated, documenting potential explanatory features
A hypothetical explanation is framed on the basis of the analysis of data, designed to identify
common factors across the cases.
Further cases are investigated to test the hypothesis.
If the hypothesis does not fit the facts from these new cases, either the hypothesis is
reformulated or the phenomenon to be explained is redefined so that negative cases are
excluded.
This procedure of examining cases, reformulating the hypothesis, and/or redefining the
phenomenon is continued until new cases continually confirm the validity of the hypothesis, at
which point it may be concluded that the hypothesis is correct, though not with absolute
certainty.
In qualitative research, data can be analyzed by a quick impressionist summary. This involves:
Summarizing key findings. E.g. in FGDs, the researcher notes down the different responses of
the participants on various issues.
Interpretation and conclusion.
This rapid data analysis technique is mainly used in situations that require urgent information to make
decisions for a program, for example in places where there is an outbreak of cholera and vital
information is needed for intervention.
This technique can also be used when the results already generated are obvious, making further analysis
of data unwarranted. For example, if a researcher finds that 80% of respondents give similar answers
about what caused a fire outbreak, doing further analysis is unwarranted.

This form of analysis does not require data transcription as the researcher records key issues of the
discussions with respondents. A narrative report is written enriched with quotations from key
informants and other respondents.

Grounded theory is a research strategy whose purpose is to generate theory inductively from data. This
analysis aims directly at generating abstract theory to explain what is central in the data.
At the heart of grounded theory analysis is coding: open coding, axial coding and selective coding
Open coding
Open coding constitutes a first level of conceptual analysis of the data. The analyst begins by
breaking open the data, that is, opening up the theoretical possibilities in the data. The purpose is to use
the data to generate abstract conceptual categories- more abstract than the data they describe- for
later use in theory building. These are substantive codes- the initial conceptual categories in the data.
Open coding involves a close examination of some of the data, identifying conceptual categories to
account for the data being studied.
Open coding is the part of analysis that pertains specifically to the naming and categorizing of
phenomena through close examination of data. At this point, data are broken down into discrete parts,
closely examined, compared for similarities and differences, and questions are asked about the
phenomena as reflected in the data.
Axial (theoretical) coding
Axial coding is the name given to the second stage, where the main categories which have emerged
from open coding of the data are interconnected with each other.
Open coding breaks the data apart, or runs the data open (Glaser, 1978), in order to expose their
theoretical possibilities and categories; axial coding then puts the categories back together again, but in
conceptually different ways. This is about inter-relating the substantive categories which open coding
has developed.
Strauss and Corbin (1990) write about the inter-actionist coding paradigm. This identifies causal
conditions, phenomenon, context, intervening conditions, action/interaction strategies, and
consequences as a way of interrelating categories in the data. Thus if the inter-actionist paradigm is

used, the outcome of axial coding is an understanding of the central phenomenon in the data in terms
of the conditions which give rise to it, the context in which it is embedded, the action/interaction
strategies by which it is handled, managed or carried out, and the consequences of those strategies.

Selective coding
Selective coding builds on the propositions produced by axial coding. The objective here is to integrate
and pull together the developing analysis.
The theory to be developed must have a central focus around which it is integrated. This will be the core
category of the theory and must be a central theme in the data, and should also be seen as central by
the participants whose behavior is being studied.
In order to integrate the other categories in the data, the core category will have to be at a higher level
of abstraction. Selective coding will then aim at developing the abstract, condensed, integrated and
grounded picture of the data. This helps move from a description point of view to a more conceptual
category abstract enough to encompass what has been described.
Limitation of grounded theory
It faces a dilemma on how to be subjective, interpretive and scientific at the same time.

Themes refer to topics or major subjects that come up in discussions. This analysis categorizes related
topics, and major concepts or themes are identified.
In this form of analysis, the researcher does the following:
Peruses the collected data and identifies information that is relevant to the research questions
and objectives.
Develops a coding system based on samples of collected data.
Classifies major issues or topics covered.
Rereads the text and highlights key quotations/ insights and interpretations.
Indicates the major themes in the margins.
Places the coded materials under the major themes or topics identified. All materials relevant to
a topic are placed together.
Develops a summary report identifying major themes and the association between them.
Uses graphics and direct quotations to present the findings.

Reports the intensity, which refers to the number of times certain words or phrases or
descriptions are used in the discussion.
The frequency with which an idea, word or description appears is used to interpret the
importance, attention or emphasis.

Weakness: The thematic method tends to rely heavily on the judgment of a single analyst. This may lead
to high levels of subjectivity and bias.
It may be necessary to have two or more analysts code the transcripts independently and compare
their results.
In narrative analysis, form and content can be studied together, and a concern with narrative can
illuminate how informants use language to convey particular meanings and experiences.
Coffey and Atkinson (1996) show how analysis can explore participants' use of imagery, and how such
devices as metaphors reveal shared meanings and understandings.
People use metaphors as a way of making sense of experiences, and of expressing and conveying their
meaning. Qualitative analysts will often do the same thing in making sense of data.
Metaphors are one important form of figurative language. They are a major type of literary device
(trope), comparing two things using their similarities but ignoring their differences. Others can be irony
(the view from the opposite or paradoxical), synecdoche (linking instances to a larger concept) and
metonymy (representing a whole in terms of one of its parts).

Content analysis examines the intensity with which certain words have been used. It systematically
describes the form or content of written and/ or spoken material.
In content analysis a classification system is developed to record the information. In interpreting results,
the frequency with which a symbol or idea appears may be interpreted as a measure of importance,
attention or emphasis. The relative balance of favorable attributes regarding a symbol or an idea may be
interpreted as a measure of direction or bias.
In content analysis, the first step is to select the data source to be studied, and then to develop a classification system to record the information.
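The frequency-counting step described above can be sketched in code. The following is a minimal, illustrative Python example; the function name and the classification scheme are invented for illustration and are not part of any standard content-analysis toolkit:

```python
from collections import Counter
import re

def count_designations(text, categories):
    """Count how often each category of terms is mentioned in a text.

    `categories` maps a category label to a list of terms belonging to it.
    Matching is case-insensitive and word-based -- a simplifying assumption.
    """
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(words)
    return {label: sum(freq[term] for term in terms)
            for label, terms in categories.items()}

# Hypothetical data: a short transcript and a two-category scheme.
transcript = ("The manager praised the team. The team thanked the manager, "
              "and the manager promised the team a bonus.")
scheme = {"manager": ["manager"], "team": ["team"]}
print(count_designations(transcript, scheme))  # {'manager': 3, 'team': 3}
```

The resulting counts per category could then be interpreted, as the text notes, as a rough measure of importance, attention or emphasis.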

There are various forms of content analysis as follows:
Pragmatic content analysis: classifies signs according to their probable causes and effects. The emphasis is on why something is said; this could be used to understand people's perceptions.
Semantic content analysis: classifies signs according to their meanings.
Designation analysis: determines the frequency with which certain objects or persons, institutions or
concepts are mentioned. This is a simple counting exercise.
Attribution analysis: examines the frequency with which certain characterizations or descriptions are used. The emphasis is on adjectives, verbs, descriptive phrases and qualifiers.
Assertion analysis: provides the frequency with which certain objects are characterized in a particular way. Such an analysis often takes the form of a matrix, with objects as columns and descriptors as rows.
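The object-by-descriptor matrix described for assertion analysis could be tallied along the following lines. This is a crude sketch that assumes sentence-level co-occurrence counts as a characterization; all names and example sentences are invented:

```python
import re

def assertion_matrix(sentences, objects, descriptors):
    # Rows are descriptors, columns are objects, matching the layout above.
    # Co-occurrence within one sentence is used as a rough proxy for
    # "object characterized by descriptor" -- a simplifying assumption.
    matrix = {d: {o: 0 for o in objects} for d in descriptors}
    for sentence in sentences:
        words = set(re.findall(r"[a-z]+", sentence.lower()))
        for d in descriptors:
            for o in objects:
                if d in words and o in words:
                    matrix[d][o] += 1
    return matrix

sents = ["The new policy is fair.",
         "Critics call the policy harsh.",
         "The director was fair in the debate."]
print(assertion_matrix(sents, ["policy", "director"], ["fair", "harsh"]))
# {'fair': {'policy': 1, 'director': 1}, 'harsh': {'policy': 1, 'director': 0}}
```

A real assertion analysis would need a more careful linking of descriptors to objects than simple co-occurrence, but the matrix shape is the same.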

The fundamental assumption of ethnomethodology is that people within a culture have procedures for making sense of their daily life. The primary focus is on how the central features of a culture, its shared meanings and social norms, are developed, maintained and changed, rather than on the content of those meanings and norms.
Conversation analysis becomes a central concern, as ethnomethodologists seek to understand people's methods for producing orderly social interaction.
The general purpose of such study is to understand the social organization of ordinary, naturally occurring human conduct, in which talk is a primary vehicle for the production and intelligibility of human action. When talk is analyzed, verbatim transcripts of actual conversations are used.
Silverman (1993: 125) gives an account of three fundamental assumptions of conversation analysis. They concern the structural organization of talk, the sequential organization of talk, and the need for the empirical grounding of the analysis. Following these assumptions, conversation analysis studies the situated production and organization of talk, developing a bottom-up understanding of how context influences participants' production of the social reality.
Conversation analysis generates significant implications from the analysis of previously unnoticed
interactional forms. Heath and Luff (1996: 324) conclude that the naturalistic analysis of conversation
and interaction has developed a substantial body of findings which delineate the interlocking social
organization of a wide range of ordinary social actions and activities.


Discourse refers to the general framework or perspective within which ideas are formulated. Discourse analysis focuses attention on the way language is used, what it is used for, and the social context in which it is used.
Analysts see speech as a performance; it performs an action rather than describing a specific state of affairs or a specific state of mind. Much of this analysis is intuitive and reflective, but it may also involve some form of counting, such as counting instances of turn-taking and their influence on the conversation and on the way in which people speak to others.
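The kind of counting mentioned here, such as tallying speaking turns from a verbatim transcript, might look like the following sketch. The "Speaker: utterance" line format is an assumption about how the transcript is laid out, not something prescribed by conversation analysis itself:

```python
from collections import Counter

def count_turns(transcript_lines):
    """Count speaking turns per speaker in a verbatim transcript.

    A turn is a maximal run of consecutive lines by the same speaker;
    lines without a 'Speaker:' label (e.g. annotations) are skipped.
    """
    turns = Counter()
    previous = None
    for line in transcript_lines:
        if ":" not in line:
            continue  # not a labeled utterance
        speaker = line.split(":", 1)[0].strip()
        if speaker != previous:  # a new turn begins when the speaker changes
            turns[speaker] += 1
            previous = speaker
    return turns

lines = ["A: Hello.", "B: Hi there.", "B: How are you?", "A: Fine, thanks."]
print(count_turns(lines))  # Counter({'A': 2, 'B': 1})
```

Such counts would only supplement, not replace, the intuitive and reflective reading of the transcript that the text describes.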
Features of discourse analysis
It is concerned with talk and texts as social practices; as such, it pays close attention to features which traditionally would be classed as linguistic content (meanings and topics) as well as to features of linguistic form such as grammar and cohesion.
It has a triple concern with action, construction and variability (Potter and Wetherell, 1987). People
perform actions of different kinds through their talk and writing, and they accomplish the nature of
these actions partly through constructing their discourse out of a range of styles, linguistic resources and
rhetorical devices.
It is concerned with the rhetorical or argumentative organization of talk and texts.

Discourse analysis is sensitive to how spoken and written language is used, and how accounts and
descriptions are constructed. At the microscopic level, it shares much in common with conversation
analysis. In a more macroscopic perspective, it emphasizes the interrelationships between accounts and
hierarchies. At this level, it is similar to deconstruction, in dismantling constructed accounts to show
connections with power and ideology.

Semiotics, or the science of signs, lays out assumptions, concepts and methods for the analysis of sign
systems. Eco (1976) points out that semiotics is concerned with everything that can be taken as a sign. It
is based squarely on language, in line with the view that human linguistic communication can be seen as
a display of signs, or a text to be read.
Semiotics can be used in the analysis of texts and also of narrative structures. With its focus on linguistic structures and categories, it can be used to develop a theory of texts and their constituent elements. It locates deeper meaning in the system of rules that structures the text as a whole. It is this underlying structure, and the rules it embodies, that can tell the researcher what the text's cultural and social message is.


There is richness in documentary data for social research. The analysis of such data shares
characteristics with the approaches described but also has distinctive themes.
One theme focuses on the social production of the document, starting with how the document came
into being. All documentary sources are the result of human activity, produced on the basis of certain
ideas, theories and principles.
Documents and texts studied in isolation from their social context are deprived of their real meaning. Thus an understanding of the social production and context of a document affects its interpretation.
A second, related theme is the social organization of the document. It asks questions such as: How are documents written? For what purposes? What is recorded and what is omitted? What does the writer seem to take for granted about the readers? What do readers need to know in order to make sense of the document? These questions are used to study the social organization of documents, irrespective of their truth or error.
A third theme concerns the more direct analysis of the text for meaning, this time including questions of truth and error. It can focus on the surface or literal meaning, or on the deeper meaning.
Methods used range from interpretive understanding to more structural approaches.
A fourth theme would be the application of different theoretical perspectives to the analysis of texts and documents. This can also incorporate deconstruction as used in discourse analysis.

Using this method, data from different people are compared and contrasted, and the process continues until the researcher is satisfied that no new issues are arising. The researcher moves backwards and forwards between transcripts, memos, notes and the research literature.
Comparative analysis examines similarities and differences in events during different time periods.

This utilizes historical parallels, past trends, and sequences of events to suggest the past, present and
future of the topic being researched.

Findings would be used to develop a theory or philosophy of leisure. For example, an analysis of public
recreation agency goals and objectives of previous eras can be used to describe the future in the context
of social, political, economic, technological, and cultural changes in society.

The researcher should ensure the following:
Understand the assumptions of their statistical procedures.
Be sure to use the best measurement tools available. If measures have errors, then that fact
should be considered.
Beware of multiple comparisons. If one has to do many tests, replication or cross-validation should be done to verify the results.
Keep in mind what one is trying to discover. One should look at the magnitude of effects rather than only their statistical significance.
Use numerical notation in a rational way. One should not confuse precision with accuracy.
Be sure to understand the conditions for causal inferences. If one needs to make inference, then
he/she should try to use random assignment.
Be sure the graphs are accurate and reflect the data variation clearly.
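The warning about multiple comparisons above can be made concrete. One standard safeguard, named here purely as an illustration (the text itself recommends replication or cross-validation), is the Bonferroni correction, which divides the significance level by the number of tests:

```python
def bonferroni(p_values, alpha=0.05):
    # With m tests, each p-value is compared against alpha / m, so the
    # chance of at least one false positive across all tests stays <= alpha.
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

# Hypothetical p-values from four tests: only the first survives the
# corrected threshold of 0.05 / 4 = 0.0125.
print(bonferroni([0.001, 0.02, 0.04, 0.30]))  # [True, False, False, False]
```

Note that 0.02 and 0.04 would each look "significant" at the uncorrected 0.05 level; the correction guards against exactly that over-claiming.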

In data analysis, a researcher should maintain integrity. This applies particularly to the application of statistical skills to problems where private interests may inappropriately affect the development or application of statistical knowledge. For these reasons, researchers should:
Present their findings and interpretations honestly and objectively.
Avoid untrue, deceptive, or doctored results.
Disclose any financial or other interests that may affect, or appear to affect, their analysis.
Delineate the boundaries of the inquiry as well as the boundaries of the statistical inferences
which can be derived from it.
Make the data available for analysis by other responsible parties with appropriate safeguards
for privacy concerns.
Recognize that selection of a statistical procedure may to some extent be a matter of personal
judgment and that other statisticians may select alternative procedures.
Direct any criticism of a statistical inquiry to the inquiry itself and not to the individuals
conducting it.
Apply statistical procedures without concern for a favorable outcome.

In data analysis, the researcher has, according to Cohen (1993), to be sure of the following:
Be sure the analysis sample is representative of the population in which the researcher is interested.
Be sure to understand the assumptions of the statistical procedures, and be sure they are clearly met.
Be sure to use the best measurement tools available. If measures have errors, then that fact
should be taken into account.
Be clear about what he or she is trying to discover.
Be sure the graphs are accurate and reflect the data variation clearly.

Dawson, C. (2002). Practical Research Methods: A User-friendly Guide to Mastering Research. Oxford: How To Books Ltd.
Kombo, D.A. and Tromp, D. L. A. (2006). Proposal and Thesis Writing: An Introduction. Nairobi: Paulines
Publications Africa.

Orodho, A. J. and Kombo, D. K. (2002). Research Methods. Nairobi: Kenyatta University, Institute of
Open Learning.

Punch, K. F. (1998). Introduction to Social Research: Quantitative and Qualitative Approaches. London:
Sage Publications Ltd.

Mugenda, O. M. and Mugenda, A. G. (2003). Research Methods: Quantitative and Qualitative Approaches. Nairobi: Laba Graphic Services Ltd.

Kothari, C. R. (2011). Research Methodology: Methods and Techniques. New Delhi: New Age International (P) Ltd.