
WOLKITE UNIVERSITY

COLLEGE OF COMPUTING AND INFORMATICS

DEPARTMENT OF INFORMATION SYSTEM

COMPILED BY: ISAYAS W. (M.Sc.)

CHAPTER FIVE

ANALYSIS AND PRESENTATION OF DATA

Quotes
• Data will talk to you if you are willing to listen. ‘Bergeson’

• Information is the oil of the 21st century, and analytics is the combustion engine. ‘Peter’
Analysis and presentation of data
• After collection, the data has to be processed and analyzed in accordance with the outline laid down for that purpose when the research plan was developed.

This is essential for

a scientific study and

for ensuring that we have all the relevant data for making the contemplated (expected) comparisons and analysis.


Cont…
• The term analysis refers to the computation of certain measures along with searching for patterns (designs) of relationship that exist among data groups.

Computation means the procedure of calculating: determining something by mathematical or logical methods.

Thus, “in the process of analysis, relationships or differences supporting or conflicting with original or new hypotheses should be subjected to statistical tests of significance to determine with what validity data can be said to indicate any conclusions.”
Data analysis
• Data Analysis is the process of organizing, displaying, summarizing, and asking
questions about data.
• Put another way, data analysis is the process of developing answers to questions through the examination and interpretation of data.

The basic steps in the analytic process consist of:

identifying issues,

determining the availability of suitable data,

deciding on which methods are appropriate for answering the questions of interest,

applying the methods and evaluating,

summarizing and communicating the results. 


Cont…
 
• The procedure followed for analyzing the collected data is discussed first, after which the presentation of the data follows.

• The data must be analyzed according to the research questions posed (modelled) earlier in the study.

• Data analysis: an attempt by the researcher to summarize the collected data.
• Data interpretation: an attempt to find meaning in the data.
Cont…….
Data can be analyzed manually or by using software.

Examples

1. Data Preparator (for preprocessing data)

2. Weka

3. Orange Data Mining

4. Stata

5. Minitab

6. MATLAB
PROCESSING OPERATIONS

Processing and analysis involve the following operations:

1. Editing: Editing of data is the process of examining the collected raw data (especially in surveys) to detect errors and omissions and to correct these when possible.

Editing is done to ensure that the data are accurate, consistent (reliable) with other facts gathered, uniformly entered, as complete as possible, and well arranged to facilitate coding and tabulation.
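
As a rough illustration of editing, the sketch below scans a few raw survey records for omissions, implausible values and non-uniform entries. The records, field names and the 0 to 120 age range are assumptions made up for this example, not data from the chapter.

```python
# Hypothetical survey records; None marks an omitted answer.
records = [
    {"id": 1, "age": 34, "gender": "F"},
    {"id": 2, "age": None, "gender": "M"},   # omission
    {"id": 3, "age": 340, "gender": "F"},    # likely a keying error
    {"id": 4, "age": 27, "gender": "m"},     # not uniformly entered
]


def edit_record(record):
    """Return a list of problems found in one raw record."""
    problems = []
    if record["age"] is None:
        problems.append("age missing")
    elif not 0 <= record["age"] <= 120:      # assumed plausible range
        problems.append("age out of range")
    if record["gender"] not in ("F", "M"):   # assumed uniform codes
        problems.append("gender not uniformly entered")
    return problems


# Keep only the records that need attention, keyed by record id.
issues = {}
for record in records:
    problems = edit_record(record)
    if problems:
        issues[record["id"]] = problems

for record_id, problems in issues.items():
    print(record_id, problems)
```

Flagging rather than silently correcting keeps the decision with the researcher, who can then correct each entry "when possible".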
2. Coding: Coding refers to the process of assigning numerals or other symbols to answers so that responses can be put into a limited number of categories or classes.

• Such classes should be appropriate to the research problem under consideration.

3. Classification: Most research studies result in a large volume of raw data, which must be reduced into homogeneous groups if we are to get meaningful relationships.

Classification is the process of arranging data in groups or classes on the basis of common characteristics.
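
Coding and classification can be sketched in a few lines. The code book, the answers and the three groups below are hypothetical and only show the idea of mapping answers to numerals and then grouping them by a common characteristic.

```python
from collections import defaultdict

# Hypothetical code book: each answer category is assigned a numeral.
CODE_BOOK = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}

# Hypothetical raw answers to one questionnaire item.
responses = ["agree", "neutral", "strongly agree", "agree", "disagree"]

# Coding: replace every answer with its numeral.
coded = [CODE_BOOK[answer] for answer in responses]


def classify(code):
    """Classification: place a coded answer into a broader class."""
    if code <= 2:
        return "negative"
    if code == 3:
        return "neutral"
    return "positive"


groups = defaultdict(list)
for code in coded:
    groups[classify(code)].append(code)

print(coded)
print(dict(groups))
```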
4. Tabulation: When a mass of data has been assembled, the researcher needs to arrange it in some kind of concise and logical order. This procedure is referred to as tabulation.

Tabulation is the process of summarizing raw data and displaying it in compact form (i.e., in the form of statistical tables) for further analysis.

In a broader sense, tabulation is an orderly arrangement of data in columns and rows.
Tabulation is essential for the following reasons:

1. It conserves (saves) space and reduces explanatory and descriptive statements to a minimum.

2. It facilitates the process of comparison.

3. It facilitates the summation of items and the detection of errors and omissions.

4. It provides a basis for various statistical computations.
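
A minimal sketch of tabulation, assuming a small made-up list of coded responses: the raw values are summarized into a frequency table with one row per value and columns for the count and the percentage.

```python
from collections import Counter

# Hypothetical coded responses (for example, answers on a 1-5 scale).
coded_responses = [4, 3, 5, 4, 2, 4, 1, 3, 5, 4]

frequency = Counter(coded_responses)
total = sum(frequency.values())

# One row per distinct value: (value, frequency, percentage of total).
rows = [
    (value, frequency[value], 100 * frequency[value] / total)
    for value in sorted(frequency)
]

print(f"{'Value':>5} {'Freq':>5} {'Percent':>8}")
for value, count, percent in rows:
    print(f"{value:>5} {count:>5} {percent:>8.1f}")
print(f"{'Total':>5} {total:>5} {100.0:>8.1f}")
```

The total row makes summation and error detection easier: if the counts do not add up to the number of questionnaires, something was omitted.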


Data presentation and description
• Once data has been collected, it has to be classified and organized so that it becomes easily readable and interpretable, that is, converted to information.

• Before calculating descriptive statistics, it is sometimes a good idea to present the data as tables, charts, diagrams or graphs.

• Most people find ‘pictures’ much more helpful than ‘numbers’ because, in their opinion, pictures present data more meaningfully.
The following are various possible types of data presentation, with the justification for their use in given situations.

1. TABULAR FORMS

Data of this type occurs as individual observations, usually as a table or array of disorderly values (an array being a matrix of rows and columns of numbers).

These observations must first be arranged in some order (ascending or descending if they are numerical) or simply grouped together in the form of a frequency table before proper presentation on diagrams is possible.
Cont…

We can easily verify the following:

1. Minimum = 2
2. Maximum = 68
3. Number of observations = 25
4. Mode = 19
5. Median = 24
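
The table these figures were read from is not reproduced in this text. As a sketch of how such a summary is obtained, the example below sorts a small hypothetical set of observations (not the 25 values from the slide) and computes the same five quantities with Python's standard `statistics` module.

```python
import statistics

# Hypothetical observations, deliberately in disorderly form.
observations = [12, 7, 19, 3, 19, 25, 8, 19, 31, 14, 25]

# Arrange the values in ascending order first.
ordered = sorted(observations)

summary = {
    "minimum": ordered[0],
    "maximum": ordered[-1],
    "number of observations": len(ordered),
    "mode": statistics.mode(ordered),      # most frequent value
    "median": statistics.median(ordered),  # middle value of the ordered data
}

for name, value in summary.items():
    print(f"{name} = {value}")
```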
2. Line Graphs

A line graph is usually meant for showing the frequencies for various values of a variable.

i. Single line graph

It displays information concerning one variable only, in terms of its frequencies.

ii. Multiple line graph

• Multiple line graphs illustrate information on several variables so that comparison is possible between them.
3. Pie Charts

• A pie chart or circular diagram is one which essentially displays the relative
figures (proportions or percentages) of classes or strata of a given sample or
population
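
To make the idea of relative figures concrete, the sketch below converts class counts into the percentage and the central angle of each pie slice (share of the total times 360 degrees). The class names and counts are hypothetical.

```python
# Hypothetical counts for three classes (strata) of a sample.
class_counts = {"A": 10, "B": 25, "C": 15}

total = sum(class_counts.values())

# For each class: its percentage of the total and its slice angle in degrees.
slices = {
    name: {"percent": 100 * count / total, "angle": 360 * count / total}
    for name, count in class_counts.items()
}

for name, part in slices.items():
    print(f"{name}: {part['percent']:.1f}% -> {part['angle']:.1f} degrees")
```

The percentages add up to 100 and the angles to 360, which is a quick check that no class was left out.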
4. Bar Charts

The bar chart is one of the most common methods of presenting data in a visual
form.

Its main purpose is to display quantities in the form of bars.

 A bar chart consists of a set of bars whose heights are proportional to the
frequencies that they represent.
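
A bar chart can be sketched even without a plotting library. The example below, using hypothetical frequencies, prints one horizontal bar per category; the bars are drawn sideways, so it is their length rather than their height that is proportional to the frequency.

```python
# Hypothetical frequencies for three answer categories.
frequencies = {"Agree": 6, "Neutral": 9, "Disagree": 3}


def text_bar_chart(freqs, symbol="#"):
    """Return one text line per category, with a bar as long as its frequency."""
    width = max(len(label) for label in freqs)
    return [
        f"{label:<{width}} | {symbol * count} {count}"
        for label, count in freqs.items()
    ]


chart = text_bar_chart(frequencies)
for line in chart:
    print(line)
```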
Exploring, Displaying and examining data
• Data exploration is the first step in data analysis and typically involves summarizing the main characteristics of a dataset (a collection of data, usually in digital form).

• It is commonly conducted using visual analytics tools, but can also be done in more advanced statistical software, such as R.

• The open-source programming language R has long been popular (particularly in academia) for data processing and statistical analysis.
Hypothesis Testing
• WHAT IS A HYPOTHESIS?

In ordinary usage, a hypothesis is a mere (simple) assumption or some supposition to be proved or disproved.

But for a researcher, a hypothesis is a formal question that he or she intends to resolve.

Thus, a hypothesis may be defined as a proposition or a set of propositions set forth as an explanation for the occurrence of some specified group of phenomena, either asserted merely as a provisional conjecture to guide some investigation or accepted as highly probable in the light of established facts.
Cont…
• A research hypothesis is a predictive statement, capable of being tested by scientific methods, that relates an independent variable to some dependent variable.

For example, consider statements like the following:

“Students who receive counselling will show a greater increase in creativity than students not receiving counselling” or

“Automobile A is performing as well as automobile B.”


Characteristics of hypothesis:

(i) Hypothesis should be clear and precise

(ii) Hypothesis should be capable of being tested

(iii) Hypothesis should state relationship between variables

(iv) Hypothesis should be limited in scope and must be specific.

(v) Hypothesis should be stated as far as possible in most simple terms

(vi) Hypothesis should be consistent with most known facts

(vii) Hypothesis should be amenable to testing within a reasonable time


BASIC CONCEPTS CONCERNING TESTING OF HYPOTHESES

(a) Null hypothesis and alternative hypothesis:

• In the context of statistical analysis, we often talk about the null hypothesis and the alternative hypothesis.

• If we are to compare method A with method B about its superiority and we proceed on the assumption that both methods are equally good, then this assumption is termed the null hypothesis (H0).

• As against this, if we think that method A is superior or that method B is inferior, we are then stating what is termed the alternative hypothesis (Ha or H1).
Cont….

(b) The level of significance: The significance level is the maximum probability of rejecting H0 when it is true, and it is usually determined in advance, before testing the hypothesis.

It is always some percentage (usually 5%), which should be chosen with great care, thought and reason.

The 5 percent level of significance means that the researcher is willing to take as much as a 5 percent risk of rejecting the null hypothesis when it (H0) happens to be true.
Cont….

(c) Decision rule or test of hypothesis: Given a hypothesis H0 and an alternative hypothesis Ha, we make a rule, known as the decision rule, according to which we accept H0 (i.e., reject Ha) or reject H0 (i.e., accept Ha).
Hypothesis testing

• In everyday life, we often have to make decisions based on incomplete information.

• These may be decisions that are important to us, such as, "Will I improve my programming grades if I spend more time studying C++?"

• Hypothesis testing is a kind of statistical inference that involves asking a question, collecting data, and then examining what the data tells us about how to proceed.
Developing Null and Alternative Hypothesis

There are always two hypotheses:

1. Null hypothesis
2. Alternative hypothesis

• For example, if we were to test the hypothesis that college freshmen study 20
hours per week, we would express our null hypothesis as:

H0: µ = 20
Cont…………

We test the null hypothesis against an alternative hypothesis, which is given
the symbol Ha.

The alternative hypothesis is often the hypothesis that you believe yourself!

It includes the outcomes not covered by the null hypothesis.

In this example, our alternative hypothesis would express that freshmen do not
study 20 hours per week:

Ha: µ ≠ 20
Example A

• We have a medicine that is being manufactured and each pill (tablet) is supposed
to have 14 milligrams of the active ingredient. What are our null and alternative
hypotheses?
Solution
H0: µ = 14
Ha: µ ≠ 14

Our null hypothesis states that the population has a mean equal to 14 milligrams.

Our alternative hypothesis states that the population has a mean that is different
from 14 milligrams.
Example B

• The school principal wants to test whether what teachers say is true: that high school juniors use the computer an average of 3.2 hours a day. What are our null and alternative hypotheses?

H0: µ = 3.2

Ha: µ ≠ 3.2

Our null hypothesis states that the population has a mean equal to 3.2 hours.

Our alternative hypothesis states that the population has a mean that differs
from 3.2 hours.
Deciding Whether to Reject the Null Hypothesis

• The alternative hypothesis can be supported only by rejecting the null hypothesis.
Hypothesis test on one sample mean, when population parameters are known (Z-score test)
• The z-score is used to identify the critical (rejection) region.
• It is used when you know the population standard deviation and the population mean assumed under H0.
Formula for z-score
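
The formula itself is not reproduced in this text. For a test on one sample mean with a known population standard deviation, the standard z statistic is given below; the symbols are the usual textbook notation and are assumed here: the sample mean, the population mean under H0, the population standard deviation, and the sample size n.

```latex
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}
```

The denominator is the standard error of the mean, so z measures how many standard errors the sample mean lies from the value stated in H0.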
Example
• The researcher thinks that the average number of Facebook users has increased. Twenty years ago the average was 145 users per second, with a standard deviation of 20. The researcher takes a random sample of 200 users and finds that the average number of Facebook users is 147. Are there more Facebook users today than there were before? Use the 0.05 significance level to evaluate the null and alternative hypotheses.
Solution
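
The worked solution is not reproduced in this text. The sketch below is one way to carry it out, assuming a one-tailed test (H0: µ = 145 against Ha: µ > 145) because the question asks whether usage has increased, and using the standard statistic z = (sample mean − µ0) / (σ / √n). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Figures taken from the example above.
mu_0 = 145         # population mean under H0
sigma = 20         # known population standard deviation
n = 200            # sample size
sample_mean = 147  # observed sample mean
alpha = 0.05       # significance level

# Test statistic: distance of the sample mean from mu_0 in standard errors.
z = (sample_mean - mu_0) / (sigma / sqrt(n))

# One-tailed (right-tail) critical value and p-value from the standard normal.
z_critical = NormalDist().inv_cdf(1 - alpha)
p_value = 1 - NormalDist().cdf(z)

# Decision rule: reject H0 only if z falls in the rejection region.
reject_h0 = z > z_critical

print(f"z = {z:.3f}, critical value = {z_critical:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

With these figures z is about 1.41, which is below the one-tailed critical value of about 1.645, so under this assumed setup H0 is not rejected at the 0.05 level: the sample does not give sufficient evidence that usage has increased.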
Measures of Association
Before discussing measures of association, we need to talk about independent and dependent variables.

A concept which can take on different quantitative values is called a variable. Concepts like weight, height and income are all examples of variables.

If one variable depends upon or is a consequence of the other variable, it is termed a dependent variable, and the variable that is antecedent (prior) to the dependent variable is termed an independent variable.
cont…
The dependent variable is whatever you are trying to explain.

For example, let’s say we want to find out why some people think they will eventually graduate from a four-year college while others don’t. That expectation is the dependent variable.

The independent variable is some variable that you think might help you answer this question.

Perhaps we decide to use their grades in high school as our independent variable.
Examples
For instance, if we say that height depends upon age, then height is a dependent variable and age is an independent variable.

Saving and income: income is the independent variable and saving is the dependent variable.

Price and quantity demanded: quantity demanded depends on price, so price is the independent variable and quantity demanded is the dependent variable.


Cont…………
• A measure of association is a numerical value that tells us how strongly related two variables are. There are several characteristics of a good measure of association.

• In absolute value, they range from 0 (i.e., no relationship) to 1 (i.e., the strongest possible relationship).

• For variables that have an underlying order from low to high, they can be positive or negative. A positive value indicates that as one variable increases, the other variable also increases. A negative value indicates that as one variable increases, the other variable decreases.
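
The chapter does not name a particular measure. As one common example, the sketch below computes Pearson's correlation coefficient, which runs from -1 to +1, for two hypothetical pairs of variables: income with saving (expected to be positive) and price with quantity demanded (expected to be negative). All figures are made up for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    co_dev = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sum((a - mean_x) ** 2 for a in x)
    dev_y = sum((b - mean_y) ** 2 for b in y)
    return co_dev / sqrt(dev_x * dev_y)


# Hypothetical data: saving tends to rise with income.
income = [10, 20, 30, 40, 50]
saving = [1, 3, 4, 7, 9]

# Hypothetical data: quantity demanded falls as price rises.
price = [1, 2, 3, 4, 5]
quantity = [50, 40, 30, 20, 10]

r_income_saving = pearson_r(income, saving)
r_price_quantity = pearson_r(price, quantity)

print(f"income vs saving:  r = {r_income_saving:.3f}")
print(f"price vs quantity: r = {r_price_quantity:.3f}")
```

The sign gives the direction of the relationship and the absolute value gives its strength.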


Cont…..
Some measures specify which variable is dependent and which is independent.

The independent variable is some variable that you think might help explain the variation in the dependent variable.

For example, if your two variables were education and voting, you might choose education as the independent variable and voting as your dependent variable because you think that education will help you explain why some people vote Democrat and others vote Republican.


Report Writing: Presenting insights (understandings) and findings

• Today, one of the most basic means of communication in our professional life is the written presentation, such as a scientific paper, technical report, assignment report, abstract, thesis, conference report, etc.

• Written presentations have one striking characteristic that distinguishes them from verbal presentations: written presentations are exposed to readers.

• The communication between author(s) and readers is indirect.


The fundamental elements of good writing

The following guidelines and tips will improve your writing skills sufficiently to serve a purpose.
The main elements of good writing are:

1. Purpose: A specific type of written presentation has to meet a specific need, which depends on the purpose of the writing.

Writing, like any other human activity, is driven by a purpose.

The purpose of writing a scientific report is to communicate an idea or set of ideas to people who want to understand the level of scientific progress in a specific area of specialization, and often to carry the idea(s) further.
2. The Target Audience

• Once an idea has been identified or formulated, the effort will be to present this idea in the best possible way to the target audience.

• This brings us to the question, “Who is the relevant target audience?”

• Unfortunately, while writing their thesis, most graduate students focus on their advisor, or at most the graduate examining committee, as their target reader.
Cont…
• Your thesis, your seminar report, etc. should be written for all interested
current and future researchers.

• Not properly identifying your reader usually leads to some mistakes in writing (such as use of abbreviations).

• Having properly identified the relevant reader of the scientific paper, we need
to understand this audience
Cont..
• Anyone who picks up your writing is interested either in acquiring new information or in achieving a better understanding.

• Therefore, in order to serve the reader, your paper should contain pertinent information.

• The information you would like to convey must be arranged so that the reader will not spend an inordinate (excessive) amount of time extracting it.
The Organization

• The organization of the paper refers to the structure, i.e. the sequence in
which you present each type of information

• The scientific report should have distinctive and clearly evident component
parts.

• It is always desirable to create an outline of the paper based on the component parts and to fill in the major points you want to cover in each part.
Cont…..
- Title

- Acknowledgement

- Abstract

- Introduction

- Materials and methods

- Results

- Discussion/Conclusion

- References

- Appendices, where applicable


End of the

Chapter
