
Chapter Five: Analyses and Interpretation of Data

07/31/2021 Admas University 1


Data Processing and Analysis
Data Processing
 It implies editing, coding, classification, and
tabulation of collected data.
A. Editing:
 Editing is the process of examining the collected data to detect
errors and omissions and to correct them where possible.
 It makes the data ready for coding and transfer to data storage.
 It is done to ensure that the data are accurate, consistent
with other facts gathered, and uniformly entered.
Field editing: Consists of the investigator reviewing the
reporting forms to complete what was written in abbreviated
and/or illegible form at the time of recording the
respondents’ responses.
This sort of editing should be done as soon as
possible after the interview or observation.
Central editing: It will take place at the research
office. Its objective is to correct errors such as
entry in the wrong place.
B. Coding
 Refers to the process of assigning numbers to answers
so that responses can be put into a limited
number of categories.
 Coding is needed when the researcher uses a computer to
analyze the data; otherwise it can be avoided.
E.g.
1. Closed-ended question: 1 [ ] Yes 2 [ ] No
2. Likert scale
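The coding step can be sketched in Python as follows. This is only an illustration: the codes 1 = Yes and 2 = No come from the example above, while the five-point Likert labels, their codes, and the sample answers are assumptions made for the sketch.

```python
# Code book: maps each answer category to a numeric code.
YES_NO_CODES = {"Yes": 1, "No": 2}  # codes taken from the closed-ended example

# Assumed five-point Likert scale; labels and codes are illustrative only.
LIKERT_CODES = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}


def code_responses(responses, code_book):
    """Replace each raw answer with its numeric code (None if it is not in the code book)."""
    return [code_book.get(answer) for answer in responses]


raw_answers = ["Yes", "No", "Yes", "Yes"]  # hypothetical responses
coded_answers = code_responses(raw_answers, YES_NO_CODES)
print(coded_answers)
```

Answers that are missing from the code book come back as None, which makes them easy to spot during editing.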
C. Classification
 Data classification implies the processes of
arranging data in groups or classes on the basis of
common characteristics.
 Data having common characteristics are placed in one
class, and in this way the entire data set is divided into
a number of groups or classes.
 Classification can be one of the following two
types, depending upon the nature of the
phenomenon involved:
(a) Classification according to attributes
(b) Classification according to class-intervals
(a) Classification according to attributes: data are classified
on the basis of common characteristics which can either be
descriptive (such as literacy, sex, honesty, etc.) or
numerical (such as weight, height, income, etc.).
 Descriptive characteristics refer to qualitative
phenomena which cannot be measured quantitatively.
 Data obtained this way on the basis of certain attributes are
known as statistics of attributes and their classification is
said to be classification according to attributes.
(b) Classification according to class-intervals: the
numerical characteristics refer to quantitative
phenomena which can be measured through some
statistical units.
 Data relating to income, production, age, weight, etc.
come under this category.
 Such data are known as statistics of variables and are
classified on the basis of class intervals.
◦ For example, individuals whose incomes are, say,
within 1001-1500 Birr can form one group, those
whose incomes are within 500-1000 Birr form another
group, and so on.
 In this way the entire data may be divided into a
number of groups or classes, which are usually called
class intervals.
 Each class interval thus has an upper as well as a lower
limit; these are known as the class limits. The difference
between the two class limits is known as the class
magnitude.
 The number of items that fall in a given class is known
as the frequency of the given class.
◦ All the classes with their respective frequencies,
taken together and put in the form of a table, are
described as a grouped frequency distribution or simply
a frequency distribution.
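Classification by class intervals can be sketched in Python as below. The first two intervals come from the income example; the third interval and the list of incomes are hypothetical values added only to make the sketch runnable.

```python
# Class intervals as (lower limit, upper limit) in Birr.
# The first two come from the income example; the third is an assumed extension.
CLASS_INTERVALS = [(500, 1000), (1001, 1500), (1501, 2000)]


def frequency_distribution(values, intervals):
    """Count how many values fall inside each class interval (limits inclusive)."""
    counts = {interval: 0 for interval in intervals}
    for value in values:
        for lower, upper in intervals:
            if lower <= value <= upper:
                counts[(lower, upper)] += 1
                break  # each value belongs to at most one class
    return counts


incomes = [650, 1200, 980, 1450, 1700, 1001, 500]  # hypothetical incomes in Birr
table = frequency_distribution(incomes, CLASS_INTERVALS)
for (lower, upper), frequency in table.items():
    print(f"{lower}-{upper} Birr: {frequency}")
```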
D. Tabulation:
 Tabulation is the process of summarizing raw
data and displaying them in table form for further
analysis.
Generally accepted principles of tabulation
 Every table should have a clear, concise and
adequate title so as to make the table intelligible
without reference to the text and this title should
always be placed just above the body of the
table.
 Every table should be given a distinct number
to facilitate easy reference.
 The column headings (captions) and the row headings
(stubs) of the table should be clear and brief.
 The units of measurement under each heading or sub-
heading must always be indicated.
 Source or sources from where the data in the table have
been obtained must be indicated just below the table.
 Those columns whose data are to be compared should be
kept side by side. Similarly, percentages and/or averages
must also be kept close to the data.
Data analysis

 Involves estimating the values of unknown parameters of
the population and testing hypotheses in order to draw
inferences.
 Refers to the computation of certain measures in order to
search for relationships that exist in the data.
Analysis can be categorized as:
(i) Descriptive Analysis, and
(ii) Inferential Analysis
(i) Descriptive Analysis
Refers to transformation of raw data into a
form that will make them easy to understand
and interpret.
Descriptive statistics are a set of techniques that
organize, summarize and provide a general
overview of data.
The most common forms of describing the processed data are:
◦ Tabulation
◦ Percentage
◦ Measure of central tendency
◦ Measure of dispersion

Measurement of central tendency


 Measure of central tendency is a sort of average or
typical value of the items in the series.
 Its function is to summarize the series in terms of
this average value.
The most common measures of central tendency are:
(i) Arithmetic mean or mean
(ii) Median, and
(iii) Mode
Mean: the sum of all measurements divided by the
number of observations in the data set.
Median: the middle value when the numbers in a set of
data are arranged in ascending or descending order (with
an even number of values, the average of the two middle values).
Mode: the value that occurs most frequently in a
set of data.
◦ This is the only central tendency measure that can
be used with nominal data, which have purely
qualitative category assignments.
Example: Measures of Central Tendency (Arithmetic Mean)
 The arithmetic mean is the average of all the values
under consideration

Branch Revenue
1 50,000,000
2 150,000,000
3 40,000,000
4 60,000,000
Total = 300,000,000

Arithmetic Mean = 300,000,000 / 4 = 75,000,000
Example: Measures of Central Tendency (Median)
 The median is the midpoint of the distribution of values
under consideration

Salesperson Number of Sales Calls
1 4
2 3
3 2
4 5
5 3
6 3
7 1
8 5

Ordered values: 1 2 3 3 3 4 5 5
Median = 3
Example: Measures of Central Tendency (Mode)
 The mode is the value that occurs most frequently in
the distribution of values under consideration

Salesperson Number of Sales Calls
1 4
2 3
3 2
4 5
5 3
6 3
7 1
8 5

Mode = 3
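The three worked examples can be checked with a short Python sketch that uses the standard-library statistics module. The data are the branch revenues and sales calls from the tables above; the variable names are chosen here only for illustration.

```python
from statistics import mean, median, mode

# Data copied from the worked examples.
branch_revenue = [50_000_000, 150_000_000, 40_000_000, 60_000_000]
sales_calls = [4, 3, 2, 5, 3, 3, 1, 5]

revenue_mean = mean(branch_revenue)  # sum of the values divided by their count
calls_median = median(sales_calls)   # middle of the ordered values 1 2 3 3 3 4 5 5
calls_mode = mode(sales_calls)       # most frequent value

print(revenue_mean, calls_median, calls_mode)
```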
Measures of asymmetry (skewness): these describe
the shape of the distribution.
The shape of the distribution is said to be
symmetric if the observations are balanced, or
evenly distributed, about the center.
[Figure: Symmetric Distribution. Histogram of frequency (0-10) against the values 1-9.]
 The shape of the distribution is said to be skewed if the
observations are not symmetrically distributed around
the center.
 A positively skewed distribution (skewed to the
right) has a tail that extends to the right in the direction
of positive values.
[Figure: Positively Skewed Distribution. Histogram of frequency against the values 1-9.]
 A negatively skewed distribution (skewed to the
left) has a tail that extends to the left in the direction of
negative values.
[Figure: Negatively Skewed Distribution. Histogram of frequency against the values 1-9.]
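The direction of skew can also be read from a number. The sketch below uses the moment coefficient of skewness (third central moment divided by the 1.5 power of the second), which is one common choice and an assumption here, since the slides do not give a formula. The three data sets are hypothetical.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (assumed formula)."""
    n = len(values)
    average = sum(values) / n
    m2 = sum((x - average) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - average) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]                  # hypothetical data
right_tailed = [1, 1, 1, 2, 2, 3, 4, 6, 9]   # long tail towards large values
left_tailed = [10 - x for x in right_tailed]  # mirror image of right_tailed

for name, data in [("symmetric", symmetric),
                   ("right-tailed", right_tailed),
                   ("left-tailed", left_tailed)]:
    print(name, round(skewness(data), 3))
```

A value near zero indicates a symmetric distribution, a positive value a positively skewed one, and a negative value a negatively skewed one.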
Measurement of Dispersion
 Dispersion measures how the values of the items are
scattered around the average.
 It is a measurement of how far the value of the
variable is from the average value.
Important measures of dispersion are:
 Range: the difference between the maximum and minimum
values of an observed variable.
 Mean deviation: the average of the absolute deviations of
the observations from the mean value.
 Standard deviation: the square root
of the average of the squared deviations from the mean.
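The three measures of dispersion can be sketched in Python as follows. The data are the sales calls from the earlier median example. Dividing by n in the standard deviation follows the "average of squares" wording and is an assumption; many packages divide by n - 1 for sample data.

```python
from math import sqrt


def value_range(values):
    """Difference between the maximum and minimum values."""
    return max(values) - min(values)


def mean_deviation(values):
    """Average absolute deviation of the observations from the mean."""
    average = sum(values) / len(values)
    return sum(abs(x - average) for x in values) / len(values)


def standard_deviation(values):
    """Square root of the average squared deviation from the mean (divides by n)."""
    average = sum(values) / len(values)
    return sqrt(sum((x - average) ** 2 for x in values) / len(values))


sales_calls = [4, 3, 2, 5, 3, 3, 1, 5]  # data from the median example
print(value_range(sales_calls))
print(mean_deviation(sales_calls))
print(round(standard_deviation(sales_calls), 3))
```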
When the distribution of item in a series
happens to be perfectly symmetrical
Data Presentation
• Data in raw form are usually not easy to use for decision
making
• Some type of organization is needed
• Table
• Graph
• Data presentation: The process of transforming a mass of raw
data into tables and charts as a part of making sense of the data.
• Refers to the preparation of data in a manner that could be used
by a general audience
• Tables:
◦ They can be used with just about all types of numerical data.
• Graphs:
◦ The type of graph to use depends on the variable being
summarized
Data presentation: The Frequency Distribution Table
Summarize data by category
Example: Hospital Patients by Unit

Hospital Unit Number of Patients
Cardiac Care 1,052
Emergency 2,245
Intensive Care 340
Maternity 552
Surgery 4,630

(Variables are categorical)
Person Mode of travel Person Mode of travel
1 car 11 car
2 car 12 bus
3 bus 13 walk
4 car 14 car
5 walk 15 train
6 cycle 16 bus
7 car 17 car
8 cycle 18 cycle
9 bus 19 car
10 train 20 car

• How would you classify this data?
• This data is categorical (nominal) since mode of travel does not
have a numerical value. This information would be better
displayed as a frequency table:
• Table: Frequency of mode of travel
Mode of travel Frequency Relative frequency (%)
Car 9 45
Bus 4 20
Cycle 3 15
Walk 2 10
Train 2 10
Total 20 100

• Frequency: the number of times each category appeared
• Ordering by descending size of frequency makes comparison clearer
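The frequency table above can be reproduced with a few lines of Python. The list holds the 20 observations from the mode-of-travel data; the variable names are illustrative.

```python
from collections import Counter

# The 20 observations from the mode-of-travel data.
modes_of_travel = [
    "car", "car", "bus", "car", "walk", "cycle", "car", "cycle", "bus", "train",  # persons 1-10
    "car", "bus", "walk", "car", "train", "bus", "car", "cycle", "car", "car",    # persons 11-20
]

frequency = Counter(modes_of_travel)  # number of times each category appears
total = sum(frequency.values())

# most_common() lists the categories in descending order of frequency.
for category, count in frequency.most_common():
    relative = 100 * count / total
    print(f"{category:<6} {count:>2} {relative:>5.0f}%")
```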
 Diagrammatic representation of data (bar charts, pie
charts, histograms, line graphs, frequency polygons)
 Bar charts:
A bar chart is a graph that shows the frequency
distribution of a variable.
 They can be used with nominal and with discrete data
 Bars should be of equal width, with the height of the bars
representing the frequency (height of the bar is
proportional to frequency) or the amount for each separate
category.
 For each category a vertical bar is drawn
 There is a gap between each bar.
Types of bar charts: simple bar chart, multiple bar
chart, component bar chart
 A simple bar chart shows the total of each
category.
 A multiple bar chart is used when you are interested
in changes in the components but the totals are of
no interest.
 A component bar chart helps to compare totals
and to see how the totals are made up.
Simple Bar Chart Example: Hospital Patients by Unit
[Figure: simple bar chart of the number of patients per year (axis 0-5000) for Cardiac Care, Emergency, Intensive Care, Maternity and Surgery]
A Multiple Bar Chart Example
 Sales by quarter for three sales territories:
[Figure: multiple bar chart of sales (axis 0-60) for East, West and North in the 1st, 2nd, 3rd and 4th quarters]
A Component bar Chart Example
• Sales by quarter for three sales territories:
Pie chart:
 Presents data as segments of the whole pie.
 Each category is represented by a segment of a
circle.
 The segments are presented in terms of percentages.
 The size of each segment reflects the frequency of
that category and can be represented as an angle.
Hospital Patients by Unit
Hospital Unit Number of Patients % of Total
Cardiac Care 1,052 11.93
Emergency 2,245 25.46
Intensive Care 340 3.86
Maternity 552 6.26
Surgery 4,630 52.50

[Figure: pie chart with segments Cardiac Care 12%, Emergency 25%, Intensive Care 4%, Maternity 6%, Surgery 53%]
(Percentages are rounded to the nearest percent)
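The percentage and the angle of each pie segment follow directly from the frequencies. A minimal Python sketch, using the hospital data from the table above (the variable names are illustrative):

```python
# Number of patients per hospital unit, from the table above.
patients_by_unit = {
    "Cardiac Care": 1052,
    "Emergency": 2245,
    "Intensive Care": 340,
    "Maternity": 552,
    "Surgery": 4630,
}

total_patients = sum(patients_by_unit.values())

# Each segment's share of the whole, as a percentage and as an angle in degrees.
percentages = {unit: 100 * count / total_patients for unit, count in patients_by_unit.items()}
angles = {unit: 360 * count / total_patients for unit, count in patients_by_unit.items()}

for unit in patients_by_unit:
    print(f"{unit:<15} {percentages[unit]:6.2f}% {angles[unit]:7.1f} degrees")
```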
Inferential Analysis
 Inferential analysis involves using quantitative
data to draw conclusions or inferences about a
complete population.
 Seeks to determine the relationship between
variables and to test statistical significance.
Two questions should be answered to determine the
relationship between variables:
(i) Is there an association between the two (or more)
variables? If yes, of what degree?
 This question is answered by the use of the correlation
technique.
In the case of a bivariate population, correlation can be found using
◦ Karl Pearson’s coefficient of correlation: this is the simple
correlation coefficient and is commonly used
◦ Charles Spearman’s coefficient of correlation
Measures of relationship:
Need to determine whether there is a
relationship between variables
Correlation

• Magnitude
• Direction
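Magnitude and direction can both be read from Pearson's coefficient of correlation: its absolute value gives the strength of the association and its sign gives the direction. The sketch below computes it from its textbook definition; the advertising and sales figures are hypothetical.

```python
from math import sqrt


def pearson_r(x_values, y_values):
    """Pearson's coefficient of correlation between two equally long series."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    syy = sum((y - mean_y) ** 2 for y in y_values)
    return sxy / sqrt(sxx * syy)


advertising = [10, 12, 15, 18, 20]  # hypothetical spending
sales = [40, 44, 52, 60, 63]        # hypothetical sales

r = pearson_r(advertising, sales)
direction = "positive" if r > 0 else "negative"
print(f"r = {r:.3f} ({direction} association)")
```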
Tests of significance
 There are two general classes of significance tests:
Parametric hypothesis testing
• Used when the data are interval- or ratio-scaled (e.g. gross national
product, industry sales volume) and the sample size is large
• Assumes that the data in the study are drawn from populations
with normal (bell-shaped) distributions and/or normal
sampling distributions
• Examples: z-test, t-test
Non-parametric hypothesis testing
• Used when the data are either ordinal or nominal
• Examples: Chi-square, Kolmogorov-Smirnov test
(ii) Is there any cause and effect relationship
between the two variables? If yes, of what degree
and in which direction?
 This is answered by the technique of
regression.
There are different techniques of regression.
◦ In the case of a bivariate population, the cause and effect
relationship can be studied through simple
regression.
◦ In the case of a multivariate population, the causal
relationship can be studied through multiple
regression analysis.
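Simple regression for a bivariate population can be sketched with the ordinary least-squares formulas, as below. The function name and the data are hypothetical. Note that a fitted slope describes the degree and direction of the relationship; whether it is causal depends on the design of the study.

```python
def fit_line(x_values, y_values):
    """Least-squares slope and intercept of y = intercept + slope * x."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


working_hours = [4, 6, 8, 10]     # hypothetical independent variable
units_produced = [21, 29, 41, 49]  # hypothetical dependent variable

slope, intercept = fit_line(working_hours, units_produced)
print(f"units = {intercept:.2f} + {slope:.2f} * hours")
```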
Testing of Hypothesis
Hypothesis testing: A procedure, based on sample
evidence and probability theory, used to determine
whether the hypothesis is a reasonable statement and
should not be rejected, or is unreasonable and
should be rejected.
Procedure for deciding if a null hypothesis should
be accepted or rejected in favor of an alternate
hypothesis.
A statistic is computed from a survey or test result
and is analyzed to determine if it falls within a
preset acceptance region.
• A hypothesis is a logical supposition, a reasonable guess:
– It provides a tentative explanation for a phenomenon under
investigation.
– Hypothesis links the variables of interest indicating the
expected relationship between them.
– Often takes the form of a statement of how changes in the
value of one variable will affect the value of the other variable.
◦ It is important that any hypothesis you develop be testable with empirical
research.
◦ Hypothesis provides a direction to proceed in order to acquire information
◦ It directs you to possible sources of information
◦ You search information to determine which hypothesis to accept
◦ In research, hypotheses are rarely proved or disproved; instead they are
either supported or not supported by the data.
Steps In Statistical Hypothesis Testing
Step 1: State the null hypothesis, H0, and the alternative
hypothesis, Ha.
The alternative hypothesis represents what the researcher
is trying to prove.
The null hypothesis represents the negation of what the
researcher is trying to prove.
◦ (In a criminal trial in the Ethiopian justice system, the
null hypothesis is that the defendant is innocent; the
alternative is that the defendant is guilty; either the jury
rejects the null hypothesis if they find that the
prosecution has presented convincing evidence, or the
jury fails to reject the null hypothesis if they find that
the prosecution has not presented convincing evidence).
Step 2: State the size(s) of the sample(s).
This represents the amount of evidence that is being
used to make a decision.
State the significance level, α, for the test.
The significance level is the probability of making a
Type I error.
◦ A Type I error is a decision in favor of the
alternative hypothesis when, in fact, the null
hypothesis is true.
◦ A Type II error is a decision to fail to reject the null
hypothesis when, in fact, the null hypothesis is false.
Step 3: State the test statistic that will be used to
conduct the hypothesis test
◦ Statistical Inference for Values of Population
Parameter.
The following statement should appear in this
step:
◦ The test statistic is _________ , which under H0
has a _____________ probability distribution
(with _____ degrees of freedom).
Step 4: Find the critical value for the test.
This value represents the cutoff point for the test
statistic.
Step 5: Calculate the value of the test statistic,
using the sample data.
◦ (If you are using SPSS, Excel or SAS, or some
similar computer package, you will calculate
the value of the test statistic, along with a p-value.)
Step 6: Decide, based on a comparison of the
calculated value of the test statistic and the critical
value of the test, whether to reject the null
hypothesis in favor of the alternative.
◦ (If you have a calculated p-value, then decide
based on a comparison of the p-value with α. If
the p-value is less than α, reject H0. Otherwise,
fail to reject H0.)
If the decision is to reject H0, the statement of the
conclusion should read as follows:
◦ We reject H0 at the (value of α) level of
significance. There is sufficient evidence to
conclude that (statement of the alternative
hypothesis).
If the decision is to fail to reject H0, the statement
of the conclusion should read as follows:
◦ We fail to reject H0 at the (value of α) level of
significance.
◦ There is not sufficient evidence to conclude that
(statement of the alternative hypothesis).
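The six steps can be traced in a short Python sketch. It assumes a two-sided one-sample z-test with a known population standard deviation, which is only one of many possible tests; all the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: H0: population mean = 50, Ha: population mean != 50 (hypothetical values).
null_mean = 50.0
# Step 2: sample size and significance level.
sample_size = 36
alpha = 0.05
# Step 3: test statistic z, which under H0 has a standard normal distribution
# (assuming the population standard deviation is known).
population_sd = 6.0
sample_mean = 52.0  # hypothetical sample result

# Step 4: critical value for a two-sided test.
critical_value = NormalDist().inv_cdf(1 - alpha / 2)
# Step 5: value of the test statistic and its p-value.
z = (sample_mean - null_mean) / (population_sd / sqrt(sample_size))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
# Step 6: decision.
reject_h0 = p_value < alpha

if reject_h0:
    print(f"We reject H0 at the {alpha} level of significance.")
else:
    print(f"We fail to reject H0 at the {alpha} level of significance.")
```

Comparing the p-value with the significance level and comparing |z| with the critical value lead to the same decision.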
Why do we need statistics?
To enable us to test experimental hypotheses
◦ H0 = null hypothesis
◦ H1 = experimental hypothesis
Example:
◦ Null = no difference in brain activation between
these 2 conditions
◦ Exp = there is a difference in brain activation
between these 2 conditions
T-test
A t-test helps to compare whether two groups have
different average values
◦ for example, whether men and women have different
average heights.
This analysis is appropriate whenever you want to
compare the means of two groups.

 Figure 1. Idealized distributions for treated and comparison group posttest values.
 One-sample t-test
Tests whether the mean of a single sample differs from a
known or hypothesized value
◦ E.g. whether the average weekly sales of the
store differ from a target figure
 Paired samples
Paired samples t-tests typically consist of a sample
of matched pairs of similar units, or one group of
units that has been tested twice (a "repeated
measures" t-test)
◦ The paired sample t-test is used in ‘before-after’ studies.
 A typical example of the repeated measures t-test would
be where subjects are tested prior to a treatment, say for
high blood pressure, and the same subjects are tested
again after treatment with a blood-pressure lowering
medication.
 By comparing the same patient's numbers before and
after treatment, each patient effectively serves as his or her own control.
 Independent/unpaired samples/The two-sample
t-test
The independent samples t-test is used when two
separate sets of independent and identically
distributed samples are obtained, one from each of
the two populations being compared.
It tests for significant differences in the means of
two distinct populations.
◦ For example, we can use this test to see if there are
significant differences in how men and women score the
new concept
◦ E.g. two teachers who teach the same course in
different sections
◦ Test the effect of the teacher on students’ grades/scores
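The test statistics for the paired and the independent designs can be sketched in Python as below. The independent version assumes equal variances (the pooled-variance form), which is an assumption of this sketch; the blood-pressure readings and the scores are hypothetical. The resulting t value would then be compared with a critical value from the t distribution for the stated degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(before, after):
    """t statistic and degrees of freedom for a paired (before-after) design."""
    differences = [b - a for b, a in zip(before, after)]
    n = len(differences)
    t = mean(differences) / (stdev(differences) / sqrt(n))
    return t, n - 1


def independent_t(sample_1, sample_2):
    """Pooled-variance t statistic and degrees of freedom for two independent samples."""
    n1, n2 = len(sample_1), len(sample_2)
    pooled_variance = ((n1 - 1) * variance(sample_1)
                       + (n2 - 1) * variance(sample_2)) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = (mean(sample_1) - mean(sample_2)) / standard_error
    return t, n1 + n2 - 2


# Hypothetical blood-pressure readings for the same four patients.
pressure_before = [150, 160, 145, 155]
pressure_after = [140, 150, 140, 150]
print(paired_t(pressure_before, pressure_after))

# Hypothetical scores from two sections taught by different teachers.
scores_teacher_a = [70, 75, 80]
scores_teacher_b = [60, 65, 70]
print(independent_t(scores_teacher_a, scores_teacher_b))
```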
Analysis of variance/ANOVA
ANOVA is a technique from statistical
inference that allows us to deal with several
populations.
A hypothesis test to compare the means of more
than two populations
The ANOVA test assumes three things:
◦ The populations from which the samples are drawn
must be normally distributed
◦ The observations must be independent in each
sample
◦ Homogeneity: the variances of the groups should be
approximately equal.
These assumptions can be tested using statistical
software.
◦ The assumption of homogeneity of variance can be tested
using tests such as Levene’s test or the Brown-Forsythe test.
◦ Normality of the distribution of the population can be tested
using plots, the values of skewness and kurtosis, or
tests such as Shapiro-Wilk or Kolmogorov-Smirnov.
◦ The assumption of independence can be determined from the
design of the study.
One-way ANOVA
 One-factor ANOVA, also called one-way ANOVA,
is used when we compare more than two groups
based on one factor (independent variable).
◦ Example 1: we might look at average test
scores for students exposed to one of three
different teaching techniques (three levels of a
single independent variable).
◦ Example 2: a manufacturing company wants to
compare the productivity of three or more
employees based on working hours.
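The F statistic of a one-way ANOVA compares the variation between the group means with the variation within the groups. A minimal Python sketch, with hypothetical test scores for three teaching techniques:

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [x for group in groups for x in group]
    n = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, k - 1, n - k


# Hypothetical test scores under three teaching techniques.
technique_scores = [
    [62, 70, 68, 65],
    [75, 78, 72, 80],
    [66, 64, 71, 69],
]
f_value, df_between, df_within = one_way_anova_f(technique_scores)
print(f"F({df_between}, {df_within}) = {f_value:.2f}")
```

The F value would then be compared with the critical value of the F distribution for the two degrees of freedom.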
Two-way ANOVA
When a company wants to compare employee
productivity based on two factors (2 independent
variables), the analysis is said to be two-way (factorial)
ANOVA.
◦ For example, if a company wants to compare employee
productivity based on working hours and
working conditions, it can do so
through two-way ANOVA.
Multivariate analysis of variance
(MANOVA)
• An extension of ANOVA to more than one
dependent variable
• Used to test the significance of differences between
the means of two or more groups on two or more
dependent variables, considered simultaneously.
– E.g. the effect of two methods of exercise
treatment on both diastolic and systolic blood
pressure
 Example: 500 patients
◦ 4 different diets (A1, A2, A3, A4): 4 levels for diet
◦ 3 different exercise plans (B1, B2, B3): 3 levels for exercise
plans
Interested in measuring patients' blood sugar
◦ Therefore:
 Two factors: diet and exercise plans
 Response variable: blood sugar
 Experimental unit: patient
 Levels of diet: A1, A2, A3, A4
 Levels of exercise plans: B1, B2, B3
 Treatments (combinations of the two factors): A1B1,
A2B2, ..
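The full set of treatments is every combination of a diet level with an exercise-plan level. A short Python sketch that lists them (the variable names are illustrative):

```python
from itertools import product

diet_levels = ["A1", "A2", "A3", "A4"]  # 4 levels of the diet factor
exercise_levels = ["B1", "B2", "B3"]    # 3 levels of the exercise-plan factor

# Each treatment is one diet level combined with one exercise-plan level.
treatments = [diet + plan for diet, plan in product(diet_levels, exercise_levels)]

print(len(treatments))
print(treatments)
```

With 4 diet levels and 3 exercise plans there are 4 x 3 = 12 treatments.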
Example 2
500 patients
Diet: only one factor
◦ Has different levels: A1, A2, A3, A4
Response variable: blood pressure
Experimental unit: patients
Factor: diet, i.e. the variable which has an effect on the response
variable
Example 3: Exercise
Let’s say we are doing a study to compare the
effectiveness of two teaching methods, “inductive”
and “deductive”. 100 students will be randomly
assigned to one of the two teaching methods and
the SAT scores of all of them will be recorded.
Questions:
1. What is the response variable?
2. What are the experimental units?
3. What are the factor(s)?
4. What are the treatments?
CHAPTER SIX
SCIENTIFIC RESEARCH REPORT
WRITING AND PRESENTATION

• The ultimate purpose of any research project is
to communicate solutions of a research problem.
• A communication system has two inseparable
parties: the sender and the receiver.
 Types Of Scientific Research Reports
 Can be classified into four: Journal Articles; Conference
Papers; Senior Essays, Theses and Dissertations; and Books.
◦ Journal Articles are in turn classified based on the reviewing
process or the source of data the research uses.
◦ Based on the review process, Journal Articles are categorized into
Peer Reviewed Articles and Double Blind Reviewed Articles.
◦ A Peer Reviewed Article is a research paper that is examined by
peers to check whether the research work maintains standards,
thereby increasing the quality and credibility of the paper.
◦ The process of reviewing a research paper is known as refereeing.
◦ An article is a Double Blind Reviewed Article if the identities of both
the researcher and the referee are not known to each other.
◦ That is, the referees review the research article without knowing
personal information about the writer, and the writer does not know
who is reviewing her paper.
Conference Papers are papers submitted and/or presented
at conferences, workshops, seminars and other forums.
Conference Papers may be classified as Paper with the
Respondent, Panel Presentation, Roundtable, and Poster.
In the Paper with Respondent type of Conference Paper a
speaker submits a thirty-minute paper and a respondent
responds to the paper for about fifteen minutes and the
speaker gives response to the respondent for about fifteen
minutes.
A Panel Presentation paper is prepared so that it can be
presented in a setting involving panel sessions led by 3-4
panel speakers, each of whom talks for about fifteen minutes.
The panel may involve individual or group
discussants that give responses in reaction to the
panel speakers.
Roundtable Papers are submitted so as to be
presented in a setting that involves five or more
speakers who speak about ten minutes each.
Poster Papers are visual presentations of
research findings such as posting a hypothesis
and an outline of the findings.
Senior Essay, Thesis, and Dissertation are research works
that are done in partial fulfillment of Bachelor, Masters,
and PhD degrees respectively.
Senior Essay, Thesis, and Dissertation are structurally the
same but differ in the degree of complexity of the research
problem on the one hand and degree of reliability and
validity of the research methodology adopted on the other.
Books, as distinct from textbooks (i.e. manuals of
instruction in any field of study), are compilations of
different research works about one specific field of study.
Alternatively, a Book can also be one research work
divided into several chapters.
Such books are usually known as monographs.
Deciding To Publish Your Research Work
To publish means to make any content available to the
public, and the act of publishing is called publication.
If you decide to publish your research paper, the first
question you need to ask is what to publish. You can
publish either an abstract or a full report from your
research paper.
The next thing you need to consider is the choice of
publication outlet.
Depending on the quality of the research, you may look for
Journals ranging from the least reputed up to the most
reputed ones.
Alternatively, you may publish your research paper in
book form, or else you may extract a Conference Paper from it.
Writing A Scientific Research Report
Main purpose of a scientific research report is to
communicate research results to specific set of
audience.
Before writing your scientific research report you
need to know your audience.
In general, you may categorize your potential
readers or audience into three groups: readers with
a deep understanding of your research topic,
readers with a reasonable understanding of
your research topic, and readers who do not have a
reasonable understanding of your research
topic.
 Sections Of A Scientific Research Report
A typical survey research report aimed at
publication in a Journal has the following parts:
◦ Abstract,
◦ Introduction,
◦ Literature Review,
◦ Results and Discussion,
◦ Conclusion,
◦ Acknowledgement,
◦ References, and
◦ Appendix (if any).
Types of presentations
Presentations can be classified into informative,
instructional, arousing, and persuasive types.
Informative presentation has the purpose of informing
the audience about an idea, object, phenomenon, etc.
using a concise and to-the-point presentation.
◦ Informative presentation has to exclude complications
because it is aimed at conveying the basic facts.
◦ An informative presenter has to be concerned with
how to make the presentation brief and to the point.
◦ Informative presentations are common in events such
as seminars, workshops, and conferences where the
presenter and the audience do not have a similar level
of understanding of the subject matter.
Instructional presentation has the purpose of
acquainting the audience with new knowledge and
skill.
◦ Instructional presentation gives a detailed account
of a topic.
◦ Instructional presentation is very common in a
class room teaching, public lectures; seminars,
workshops, and conferences on technical issues
of a discipline.
◦ In instructional presentation, the presenter has
more knowledge and skill on the topic than the
audience.
Arousing presentation has the purpose of making the
audience think about a problem, an event, an object,
an idea, etc and thereby attaching value onto the
event, the object, the idea, etc.
Arousing presentation is common when the audience
is composed of decision makers such as the members
of state council.
 Persuasive presentation has the purpose of convincing the
audience.
◦ The best example of persuasive presentation is defending
your research proposal.
◦ In a persuasive presentation you are required to present
sufficient logic in order to convince the audience to take
your view.
◦ For example, research proposal presentations by
graduating students of Rift Valley University are
persuasive presentations.
◦ In general, a good persuasive presentation may have three
parts, namely; gap reflection or problem statement, need
for intervention, and your proposed intervention.
