Research Methods
Topic f: Significance
i. Univariate Statistics
Recap
Population definition: all members that meet a set of specifications or a
specified criterion
Other Definitions:
• The entire set of members in a group. EXAMPLES: All Malawian
citizens; all PLU Students.
• All values of a variable in a definable group (e.g. Catholic, Protestant,
Muslim)
• The set of all values of interest
Sample: A subset of a population, that is, only some of the elements
selected from the population.
Other Concepts to Know
Population Parameter: a measurable characteristic of a population (the
mean, median, and mode are some of the simpler examples of population
parameters)
Sample Statistic: a measurable characteristic of a sample, used to
estimate the corresponding population parameter
Levels of Measurement
Level of measurement refers to the coding scheme, or the meaning of the numbers
associated with each variable.
There are four levels of measurement:
• Nominal: Each value represents a category. There is no inherent order to the
categories.
• Ordinal: Each value is a category, but there is a meaningful order or rank to the
categories. However, with ordinal data there is no measurable distance between
categories.
• Interval: The distance between each number is the same. For example, the distance
between 1 and 2 is the same as the distance between 15 and 16. With interval
measurement, we can determine not only that a person ranks higher but how much
higher they rank. You can do addition and subtraction with interval level measures,
but not multiplication and division.
• Ratio: Has all the properties of interval variables, with the addition
of a true zero point representing the complete absence of the
property being measured. Multiplication and division are therefore
meaningful at this level.
These four levels of measurement are often combined into two main
types:
• Categorical: Nominal and ordinal measurement levels
• Continuous (or scale): Interval and ratio measurement levels
SUMMARY MEASURES OF CENTRAL TENDENCY
The most common way to summarize variables is to use measures of
central tendency and variability.
Central tendency: One number that is often used to summarize
the distribution of a variable.
• Typically thought of as the “average” value.
There are three main measures of central tendency:
• Mode: The category or value that contains the most cases. This
measure is typically used on nominal or ordinal data and can easily
be determined by examining a frequency table.
• Median: The midpoint of a distribution; it is the 50th percentile. If
all the cases for a variable are arranged in order according to their
value, the median is the value that splits the data into two equally
sized groups.
• Mean: The mathematical average of all the values in the distribution
(that is, the sum of the values of all cases divided by the total
number of cases).
Formula For Finding Mean
\[ \mu = \frac{\sum_{i=1}^{N} X_i}{N} \]
Hypothetical Example
Consider the following hypothetical values for weekly income
from a population of 10 people:
Xi
$100
$150
$200
$250
$250
$250
$250
$325
$325
$400
Σ 2,500
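Find the mode, median, and mean.
Mode: The category or value that contains the most cases. In this
case, $250 (it occurs four times).
Median: The value that splits the data into two equal parts. With 10
ordered values, this is the average of the 5th and 6th values, both of
which are $250, so the median is again $250.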
Mean:
\[ \mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{\$100+\$150+\$200+\$250+\$250+\$250+\$250+\$325+\$325+\$400}{10} = \$250 \]
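The three results above can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the variable name `weekly_income` is an assumption introduced here for illustration.

```python
import statistics

# Weekly incomes (in dollars) for the hypothetical population of 10 people.
weekly_income = [100, 150, 200, 250, 250, 250, 250, 325, 325, 400]

mode = statistics.mode(weekly_income)      # most frequent value
median = statistics.median(weekly_income)  # average of the two middle values when N is even
mean = statistics.mean(weekly_income)      # sum of the values divided by N

print("Mode:", mode)
print("Median:", median)
print("Mean:", mean)
```

All three measures coincide at $250 for this data set, which will not generally be the case for skewed distributions.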
In Terms of a Frequency Distribution
Xi      fi    Xi * fi
$100    1     $100
$150    1     $150
$200    1     $200
$250    4     $1,000
$325    2     $650
$400    1     $400
Σ       10    $2,500
What is the mean?
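One way to answer this from the frequency table is to divide the sum of the Xi * fi column by the sum of the fi column. The sketch below assumes the table is stored as a dictionary; the names `frequency`, `n`, and `weighted_total` are hypothetical.

```python
# Frequency distribution from the table: value Xi -> frequency fi.
frequency = {100: 1, 150: 1, 200: 1, 250: 4, 325: 2, 400: 1}

n = sum(frequency.values())                                # sum of fi (number of cases)
weighted_total = sum(x * f for x, f in frequency.items())  # sum of Xi * fi

mean_from_frequencies = weighted_total / n

print("N:", n)
print("Sum of Xi * fi:", weighted_total)
print("Mean:", mean_from_frequencies)
```

The result should match the mean obtained from the raw list of values, since the frequency table is only a more compact way of writing the same data.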
Measures of Dispersion
Variability (Dispersion): The amount of spread or dispersion around
the measure of central tendency. There are a number of measures of
variability:
• Maximum: The highest value for a variable.
• Minimum: The lowest value in the distribution.
• Range: The difference between the maximum and minimum values.
• Variance: Provides information about the amount of spread
around the mean value. It is an overall measure of how clustered
data values are around the mean.
For a sample, it is calculated by summing the squared differences
between each value and the mean and dividing this quantity by the
number of cases minus one. For a complete population, the sum is
divided by the number of cases (N) instead.
In general terms, the larger the variance, the more spread
there is in the data; the smaller the variance, the more the
data values are clustered around the mean.
• Standard deviation: The square root of the variance. The variance
measure is expressed in the units of the variable squared. This can
cause difficulty in interpretation, so more often, the standard
deviation is used. The standard deviation restores the value of
variability to the units of measurement of the original variable.
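The measures listed above can be illustrated on the weekly income values from the earlier hypothetical example. This is a sketch using Python's standard `statistics` module, in which `pvariance` and `pstdev` divide by N (population) and `variance` divides by n - 1 (sample); the variable names are assumptions made for illustration.

```python
import statistics

# Weekly incomes (in dollars) from the earlier hypothetical example.
weekly_income = [100, 150, 200, 250, 250, 250, 250, 325, 325, 400]

maximum = max(weekly_income)
minimum = min(weekly_income)
value_range = maximum - minimum  # difference between maximum and minimum

# Population variance: sum of squared deviations divided by N.
population_variance = statistics.pvariance(weekly_income)
# Sample variance: sum of squared deviations divided by n - 1.
sample_variance = statistics.variance(weekly_income)

# Standard deviation: square root of the (population) variance.
population_sd = statistics.pstdev(weekly_income)

print("Range:", value_range)
print("Population variance:", population_variance)
print("Sample variance:", sample_variance)
print("Population standard deviation:", population_sd)
```

The sample variance is always somewhat larger than the population variance for the same values, because the same sum of squared deviations is divided by a smaller number.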
Variance
Xi        (Xi − μ)²    Xi²
$100      22,500       10,000
$150      10,000       22,500
$200      2,500        40,000
$250      0            62,500
$250      0            62,500
$250      0            62,500
$250      0            62,500
$325      5,625        105,625
$325      5,625        105,625
$400      22,500       160,000
Σ 2,500   68,750       693,750
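\[ \text{Variance} = \frac{\sum X_i^2}{N} - \left( \frac{\sum X_i}{N} \right)^2 \]
\[ \sigma^2 = \frac{\sum X_i^2}{N} - \mu^2 \]
\[ \sigma^2 = \frac{693{,}750}{10} - (250)^2 \]
\[ \sigma^2 = 69{,}375 - 62{,}500 \]
\[ \sigma^2 = 6{,}875 \]
\[ \text{Standard Deviation} = \sqrt{\sigma^2} = 82.9156 \]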
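The table carries two column totals, and each leads to the same population variance. The sketch below, with hypothetical variable names, computes the variance both from the squared deviations (the definitional form) and from the squared values (the computational form used above), dividing by N in both cases.

```python
import math

# Weekly incomes (in dollars) for the hypothetical population of 10 people.
weekly_income = [100, 150, 200, 250, 250, 250, 250, 325, 325, 400]

n = len(weekly_income)
mu = sum(weekly_income) / n  # population mean

# Definitional form: mean of the squared deviations from the mean.
sum_sq_dev = sum((x - mu) ** 2 for x in weekly_income)
var_definitional = sum_sq_dev / n

# Computational form: mean of the squared values minus the squared mean.
sum_sq = sum(x ** 2 for x in weekly_income)
var_computational = sum_sq / n - mu ** 2

std_dev = math.sqrt(var_computational)

print("Sum of squared deviations:", sum_sq_dev)
print("Sum of squared values:", sum_sq)
print("Variance (definitional):", var_definitional)
print("Variance (computational):", var_computational)
print("Standard deviation:", round(std_dev, 4))
```

The computational form is convenient for hand calculation because it needs only the totals of Xi and Xi², not a separate deviation for every case.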
Bivariate