distribution is said to be
‘Mesokurtic’
Platykurtic
Median
• Ordinal
• Quantitative (skewed)
Mode
• Nominal
• Ordinal
• Quantitative
CENTRAL TENDENCY - CBL
• Scenario
• The table below gives the number of accidents each year at a
particular road junction:
Work out the mean, median and mode for the values.
Example:
• The diastolic blood pressure of 10 individuals was as follows:
83, 75, 81, 79, 71, 95, 75, 77, 84 and 90.
• Find the mean deviation.
Mean deviation is the average of the deviations from the arithmetic mean.
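The calculation for the blood pressure example can be sketched as follows. This is a minimal illustration, assuming the usual convention that the deviations are taken as absolute values (signed deviations from the mean always sum to zero). The variable names are illustrative only.

```python
from statistics import fmean

# Diastolic blood pressure readings from the example above
pressures = [83, 75, 81, 79, 71, 95, 75, 77, 84, 90]

# Step 1: arithmetic mean of the readings
mean_bp = fmean(pressures)

# Step 2: absolute deviation of each reading from the mean
abs_deviations = [abs(x - mean_bp) for x in pressures]

# Step 3: mean deviation = average of the absolute deviations
mean_deviation = fmean(abs_deviations)

print(f"Mean = {mean_bp}, mean deviation = {mean_deviation:.1f}")
```

For these ten readings the mean is 81 and the absolute deviations sum to 56, so the mean deviation is 56/10 = 5.6.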
3. VARIANCE
• Variance in statistics is a measurement of the spread between
numbers in a data set or how far a set of numbers is spread out.
• Variance describes how much a random variable differs from its
expected value (e.g., the mean). That is, it measures how far each number
is from the mean and therefore from every other number in the set.
A large variance indicates that numbers in the
set are far from the mean and from each
other, while a small variance indicates the
opposite.
Large variance
Small variance
DEFINITION OF VARIANCE
• The variance is defined as the average of the squares of the differences
between each individual value and the mean of the data set.
• A variance is always zero or positive. It can be zero (when all values
are identical) but can never be negative.
Formulae of variance
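The standard textbook forms of the two formulae are given here as a reference. The symbols are assumptions, since the original notation is not shown: μ and N are the population mean and size, x̅ and n are the sample mean and size.

```latex
\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}
\qquad\text{(population variance)}

s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
\qquad\text{(sample variance)}
```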
Population Variance vs. Sample Variance
• For a large population, it is usually impossible to collect all the
data, so we take a sample and calculate its variance.
• The formula for sample variance is a slight modification of the
population variance formula.
• In sample variance, the divisor is reduced by 1 (n − 1 instead of n),
so that the variance will be slightly bigger (e.g., 10/5 = 2
while 10/4 = 2.5).
• The aim is not simply to get a larger variance; the idea is to get a
realistic estimate.
• This is reasonable: if we use the population variance formula
for sample data, the variance tends to be underestimated.
• That is why, for sample variance, we divide by the sample
size minus 1.
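The difference between the two divisors can be seen in a short sketch. It reuses the ten blood pressure readings from the mean deviation example purely for illustration, and relies on Python's standard `statistics` module, where `pvariance` divides by n and `variance` divides by n − 1.

```python
from statistics import pvariance, variance

# Illustrative data: the ten diastolic readings used earlier
data = [83, 75, 81, 79, 71, 95, 75, 77, 84, 90]

# Population formula: sum of squared deviations divided by n
population_var = pvariance(data)

# Sample formula: sum of squared deviations divided by n - 1
sample_var = variance(data)

print(f"Divide by n:     {population_var:.2f}")
print(f"Divide by n - 1: {sample_var:.2f}")
```

The sum of squared deviations is 482, so dividing by 10 gives 48.2 and dividing by 9 gives about 53.56. The sample formula always gives the larger value.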
Advantages of Variance
• x̅ ± 1 SD = 68%
∵ (x̅ + 1 SD = 34% and x̅ − 1 SD = 34%)
• x̅ ± 2 SD = 95%
• x̅ ± 3 SD = 99.7%
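These percentages can be checked against the normal curve. The sketch below assumes the data follow a normal distribution and uses `NormalDist` from Python's standard `statistics` module. The rounded slide values (68%, 95%, 99.7%) are close approximations of the computed ones.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, SD 1)
z = NormalDist()

# Proportion of values lying within k SDs on either side of the mean
coverage = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, p in coverage.items():
    print(f"mean ± {k} SD covers {p:.1%} of values")
```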
• C.V = (10/80) × 100 = 12.5%
If we compare the results, we can conclude that:
• The data for 25-year-olds (sample 1) has less variability, OR
• The data for 11-year-olds (sample 2) has more variability, OR
• Sample 1 is more precise (consistent) than sample 2.
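The coefficient of variation calculation shown above (SD of 10, mean of 80) can be sketched as a small helper. The function name is illustrative, and only the figures visible in the text are used.

```python
def coefficient_of_variation(sd, mean):
    """Return the coefficient of variation as a percentage: (SD / mean) x 100."""
    return sd / mean * 100

# Figures from the calculation above
cv = coefficient_of_variation(sd=10, mean=80)
print(f"C.V = {cv}%")
```

Because the C.V is a ratio with no units, it allows samples with different means or units to be compared: the sample with the smaller C.V is the more consistent one.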
MORE ABOUT STATISTICAL DISTRIBUTION
STANDARD ERROR OF MEAN
• The standard error is considered part of inferential statistics.
• In statistics, a sample mean deviates from the actual mean of a population;
this deviation is the standard error of the mean.
• In cases where multiple samples are collected, the mean of each sample
may vary slightly from the others, creating a spread. This spread is most
often measured as the standard error, accounting for the differences
between the means across the datasets.
• When the standard error is small, the data is said to be more
representative of the true population mean.
• Therefore, the standard error (SE) of a statistic is the standard
deviation of its sampling distribution, estimated from the sample.
Standard Error of Mean (SEM)
• The standard error of the mean (SEM) measures how far the sample
mean of the data is likely to be from the true population mean.
• SEM is the SD of the theoretical distribution of the sample means (the
sampling distribution).
• SEM is calculated by taking the standard deviation and dividing it by
the square root of the sample size.
SEM=SD/√n
• The SEM is always smaller than the SD (for any sample with n > 1).
Why divide by √n?
• By dividing by the square root of n, you are paying a “penalty” for
using a sample instead of the entire population.
• Sampling allows us to make guesses or inferences about a population.
• The smaller the sample, the less confidence you might have in those
inferences; that’s the origin of the “penalty”.
Relationship between sample size and SEM
• As your sample size increases toward the size of the entire
population, the difference between the population mean and sample
mean becomes smaller and smaller.
• Therefore, the larger the sample size, the smaller the standard error
of the mean.
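The effect of sample size on the SEM can be illustrated with a short sketch. The SD of 2 and the three sample sizes are assumed values chosen only for illustration.

```python
import math

# Assumed standard deviation, held fixed while the sample size changes
sd = 2.0

# SEM = SD / sqrt(n) for increasing sample sizes
sem_by_n = {n: sd / math.sqrt(n) for n in (25, 100, 400)}

for n, sem in sem_by_n.items():
    print(f"n = {n:>3}: SEM = {sem:.2f}")
```

Because of the square root, the sample size must be multiplied by four to cut the SEM in half.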
Calculate SEM
• To determine the prevalence of anemia in pregnancy, the hemoglobin levels of
100 females were recorded. The mean hemoglobin level was found to be 12 ± 2
g/dl. Calculate the standard error of the mean. (Annual 2013)
Solution
• SEM = SD /√n
• SEM = 2/ √100
• SEM = 2/ 10
• SEM = 0.2 g/dl
CONFIDENCE INTERVALS
• A confidence interval in statistics is a range between two values,
calculated from sample data, that is expected to contain the
population parameter (e.g., mean) with a stated level of confidence.
• The two set values are generally defined by the lower and upper
bounds or limits.
• The confidence level is expressed as a percentage (the most
frequently quoted percentages are 90%, 95%, and 99%).
CONFIDENCE LIMITS
• Confidence limits are the numbers at the upper and lower end of a
confidence interval;
• For example,
if your mean is 102.86
with confidence limits of 99.29 and 106.43,
your confidence interval is 99.29 to 106.43.
Solution:
N = 100, Mean = 12 g/dl,
SD = 2 g/dl, 95% CI = ?
SEM = SD/√N = 2/√100 = 0.2
95% CI = Mean ± 2 × SEM (2 is the rounded critical value 1.96)
95% CI = 12 ± 2(0.2)
95% CI = 12 ± 0.4 (lower limit → 12 − 0.4, upper limit → 12 + 0.4)
95% CI = 11.6 to 12.4 g/dl
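The same interval can be computed with the unrounded critical value. This is a minimal sketch, assuming a normal sampling distribution. It takes the critical value from `NormalDist.inv_cdf` in Python's standard `statistics` module instead of the rounded multiplier 2 used in the solution.

```python
import math
from statistics import NormalDist

# Values from the hemoglobin example
n, mean, sd = 100, 12.0, 2.0

# Standard error of the mean: SD / sqrt(n)
sem = sd / math.sqrt(n)

# Two-sided 95% critical value (about 1.96): 2.5% is left in each tail
z = NormalDist().inv_cdf(0.975)

lower = mean - z * sem
upper = mean + z * sem
print(f"SEM = {sem}, 95% CI = {lower:.2f} to {upper:.2f} g/dl")
```

With 1.96 the limits are about 11.61 and 12.39 g/dl, which agrees with the hand calculation of 11.6 to 12.4 once the multiplier is rounded to 2.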
Various confidence levels and their critical values
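The critical values for the commonly quoted confidence levels can be derived from the standard normal distribution. The sketch below assumes two-sided intervals based on the normal (z) distribution. The helper name `critical_value` is illustrative.

```python
from statistics import NormalDist

def critical_value(confidence_level):
    """Two-sided z critical value for a confidence level such as 0.95."""
    alpha = 1 - confidence_level          # total area left in both tails
    return NormalDist().inv_cdf(1 - alpha / 2)

critical_values = {level: critical_value(level) for level in (0.90, 0.95, 0.99)}

for level, z in critical_values.items():
    print(f"{level:.0%} confidence level: z = {z:.3f}")
```

These come out at approximately 1.645, 1.960 and 2.576 for the 90%, 95% and 99% levels respectively.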