I. Statistics
Statistics is a branch of mathematics dealing with the collection, analysis, interpretation, and
presentation of masses of numerical data. Two main statistical methods are used in data
analysis: descriptive statistics and inferential statistics.
1.1 Data
Data (singular: datum) are individual pieces of factual information recorded and used for the
purpose of analysis. They are the raw information from which statistics are created. E.g. the
height of maize plants in a plot, the number of pests that attack and destroy a plant, the
number of fruits, the number of flowers, the number of leaves etc.
1.2 Variable
A variable is something that varies, meaning that it can take on a number of values.
Individual characteristics, such as age, length, leaf area index, height, diameter are variables
because different plants have different values or scores on these characteristics.
1.3 Population
A population is the collection of all individuals or items under consideration in a set of data,
e.g. all the students in Level 200.
1.4 Sample
A sample consists of one or more observations drawn from the population. More than one sample
can be derived from the same population. The number of observations in a sample is called the
sample size. E.g. the Tilapia in a fish pond containing various kinds of fish.
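
The relationship between a population, a sample and the sample size can be sketched in a few lines of Python. This is only an illustration: the plant heights are made-up values, and `draw_sample` is a hypothetical helper name, not part of the course material.

```python
import random

# Hypothetical population: heights (cm) of all 20 maize plants in a plot.
population = [150, 162, 158, 171, 149, 165, 160, 155, 168, 172,
              157, 163, 159, 166, 154, 170, 161, 156, 164, 169]


def draw_sample(items, sample_size, seed):
    """Draw a simple random sample without replacement."""
    rng = random.Random(seed)  # seeded so the same draw can be repeated
    return rng.sample(items, sample_size)


# More than one sample can be derived from the same population.
sample_a = draw_sample(population, 5, seed=1)
sample_b = draw_sample(population, 5, seed=2)

print("Population size:", len(population))
print("Sample A:", sample_a, "| sample size:", len(sample_a))
print("Sample B:", sample_b, "| sample size:", len(sample_b))
```

Every value in each sample comes from the population, and the sample size is simply the number of observations drawn (5 here).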
1.5 Parameter
Data Analysis is the process of systematically applying statistical and/or logical techniques to
describe and illustrate, condense and recap, and evaluate data. Various analytic procedures
provide a way of drawing inductive inferences from data and distinguishing the phenomenon of
interest from the statistical fluctuations present in the data. Data can be analysed by quantitative
methods (descriptive and inferential statistics) and qualitative methods.
The two most commonly used quantitative data analysis methods are descriptive statistics and
inferential statistics.
Descriptive statistics involves analysing data in ways that help describe, summarize and
present them in tables, graphics etc. Descriptive statistics provide absolute numbers without
explaining the reasoning behind those numbers, so conclusions cannot be drawn beyond the
analysed data. They help researchers summarize the data (mostly of a single variable) and
find patterns. E.g. mean, median, mode, percentage, frequency, range, variance, standard
deviation, bar charts, pie charts, line graphs etc. Since descriptive analysis is mostly used
for analysing a single variable, it is often called univariate analysis.
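
As a minimal sketch of the descriptive measures listed above, the following uses Python's standard `statistics` module on a single variable. The fruit counts are invented for illustration.

```python
import statistics

# Hypothetical data: number of fruits counted on 8 plants (one variable).
fruits = [4, 7, 7, 5, 9, 6, 7, 3]

mean = statistics.mean(fruits)          # arithmetic average
median = statistics.median(fruits)      # middle value of the sorted data
mode = statistics.mode(fruits)          # most frequent value
data_range = max(fruits) - min(fruits)  # largest minus smallest value
variance = statistics.variance(fruits)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(fruits)      # sample standard deviation

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("range:", data_range)
print("variance:", round(variance, 3))
print("standard deviation:", round(std_dev, 3))
```

Note that `statistics.variance` and `statistics.stdev` treat the data as a sample; `statistics.pvariance` and `statistics.pstdev` are the population versions.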
Inferential statistics involves methods for drawing conclusions and making predictions
about a population from sample data, using probability-based estimation. These more complex
analyses examine the relationships between multiple variables in order to generalise results.
E.g. correlation, regression, analysis of variance (ANOVA), t-test etc.
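
Two of the techniques named above, correlation and regression, can be sketched with plain Python. The fertiliser and height figures are invented, and `pearson_r` and `least_squares_line` are hypothetical helper names. The sketch only computes the coefficients; a full inferential analysis would also test whether they are statistically significant.

```python
import math

# Hypothetical paired observations for 5 plants:
# fertiliser applied (kg) and resulting plant height (cm).
fertiliser = [1.0, 2.0, 3.0, 4.0, 5.0]
height = [52.0, 55.0, 61.0, 64.0, 68.0]


def _sums_of_squares(x, y):
    """Return mean_x, mean_y and the sums Sxx, Syy, Sxy of deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return mean_x, mean_y, sxx, syy, sxy


def pearson_r(x, y):
    """Pearson correlation coefficient between two variables."""
    _, _, sxx, syy, sxy = _sums_of_squares(x, y)
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(x, y):
    """Slope and intercept of the simple linear regression of y on x."""
    mean_x, mean_y, sxx, _, sxy = _sums_of_squares(x, y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


r = pearson_r(fertiliser, height)
slope, intercept = least_squares_line(fertiliser, height)

print("correlation r:", round(r, 4))
print("fitted line: height =", round(intercept, 2), "+", round(slope, 2), "* fertiliser")
```

Unlike the descriptive measures, both results involve two variables at once, which is what allows a prediction of height for a fertiliser amount that was not observed.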
Qualitative data refers to non-numeric information (data that are not easily reduced to
numbers), such as interview transcripts, notes, video and audio recordings, images and text
documents. Such data can be analysed and presented with, e.g., word clouds and word
frequency analysis.
Figure. Presentation of qualitative statistics; word cloud (left), word frequency analysis
(right)
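
A word frequency analysis of the kind shown in the figure can be approximated as follows. This is a rough sketch: the transcript is an invented sentence, `word_frequencies` is a hypothetical helper, and real analyses usually also remove common words such as "the" before counting.

```python
import re
from collections import Counter

# Hypothetical excerpt from an interview transcript.
transcript = (
    "The maize grew well. The pests destroyed the maize leaves, "
    "and the farmers sprayed the maize."
)


def word_frequencies(text):
    """Count how often each word occurs, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


freq = word_frequencies(transcript)

# List the most frequent words first, as a word frequency chart would.
for word, count in freq.most_common(3):
    print(word, count)
```

A word cloud presents the same counts visually, drawing more frequent words in larger type.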
Data collection is the process of gathering and measuring information on variables of interest,
in an established systematic fashion that enables one to answer stated research questions, test
hypotheses, and evaluate outcomes.
Any research is only as good as the data that drive it, so choosing the right technique of data
collection can make all the difference. Some data collection techniques are:
1.8.1.1 Observation
Making direct observations of simple phenomena can be a very quick and effective way of
collecting data with minimal intrusion. All that is needed is the right mechanism for making
the observation.
1.8.1.2 Questionnaire
Questionnaires are stand-alone instruments of data collection that are administered to the
sample subjects directly or by mail, phone or online. They have long been one of the most
popular data collection techniques.
1.8.1.3 Interview
Conducting interviews can help you overcome most of the shortfalls of the previous two data
collection techniques by allowing you to build a deeper understanding of the thinking behind
the respondents' answers.
1.8.1.4 Focus group discussion
Focus group discussions take the interactive benefits of an interview to the next level by bringing
a carefully chosen group together for a moderated discussion on the subject of the survey.
2. Types of Data
Discrete data usually involve counting a number of items, such as the number of books,
computers, people, etc. Typically, they are integers. E.g. the number of children in a household,
the number of languages a person speaks, the number of people sleeping in statistics class etc.
Continuous data can take any value, including integers and decimals. E.g. the height of children,
the weight of cars, the time taken to wake up in the morning, the speed of a train.
Qualitative data cannot actually be measured, and hence cannot be expressed as numbers, but
may be represented by a name, symbol, or number code. E.g. the softness of your skin, the
colour of the sky and the colour of your eyes.
Quantitative data are anything that can be expressed as a number, or quantified. E.g. the
age of your car, the number of children, how much you earn, the number of coins in your pocket
etc. Quantitative data lend themselves most directly to statistical analysis, and thus more
rigorous assessments of the data are possible.
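
The four types of data can be told apart, very roughly, by how a value is recorded. The sketch below is a rule of thumb only, and `classify_value` is a hypothetical helper. It assumes that whole numbers are counts and decimals are measurements, so it would wrongly label a number code that stands for a category (for example 1 = blue eyes) as discrete.

```python
def classify_value(value):
    """Guess the type of data from how a single value is recorded.

    Assumption: integers are counts, floats are measurements,
    and everything else (text, True/False) is a category.
    """
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so check it first.
        return "qualitative"
    if isinstance(value, int):
        return "quantitative (discrete)"
    if isinstance(value, float):
        return "quantitative (continuous)"
    return "qualitative"


# Hypothetical observations taken from the examples in this section.
observations = {
    "number of children in a household": 3,
    "height of a child (m)": 1.32,
    "colour of your eyes": "brown",
}

for name, value in observations.items():
    print(name, "->", classify_value(value))
```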