Finals. Fmch. Measure of Central Tendency Shape of The Distribution of Dispe
Semester
MEASURE OF CENTRAL TENDENCY SHAPE OF THE
DISTRIBUTION OF DISPERSION
Realendar Jr., MD | 11 112017
MEASURE OF CENTRAL TENDENCY
• Describes the center position of the data
• A number that describes what is average or typical of a distribution
• The goal is to come up with one single number that best describes a distribution of scores
• Mean, Median, Mode
MEAN
• Most common measure of central tendency
• The arithmetic mean is defined as the sum of all the observed values, divided by the number of observations
• Best for making predictions
• Used when the distribution is more or less normal (symmetrical)
• It is calculated rather than determined
• Affected by extreme values (outliers) (picture)
Finding the mean
• X̄ = (∑X)/N
MEDIAN
• 50% of observations are above the median, 50% are below it
• It is the value that divides the distribution of values into two equal parts
• The difference in magnitude between the observations does not matter
• Therefore, it is not sensitive to outliers
• It is determined rather than calculated
• The observations are ranked in order from smallest to largest
• The way of computing the median depends on whether the number of scores is odd or even
- First, if you have an odd number of scores, pick the middle score, the one at position (N+1)/2
- Second, if you have an even number of scores, take the average of the middle two, the ones at positions N/2 and (N/2)+1
• Not affected by extreme values
• The median is computed when data are highly skewed
Outlier
• A number that is extremely large or small in comparison to the rest of the set of data
• Can greatly affect the measures of central tendency
- Mean: use when the data has no outliers
- Median: use when the data has outliers
MODE
• The value that occurs most frequently in a set of observations
• If no value is repeated within the set of observations, then there is no mode
• The mode is determined rather than calculated; it is found by counting the number of times each individual value occurs
• Not affected by extreme values
• Used for either numerical or categorical data
• The mode is not a very useful measure of central tendency
• It is insensitive to large changes in the data set
• That is, two data sets that are very different from each other can have the same mode
• A distribution may have more than one mode
Multimodal distributions
• A distribution that has more than 2 modes
CONSIDERATION FOR CHOOSING A MEASURE OF CENTRAL TENDENCY
• For a nominal variable, the mode is the only measure that can be used
• For ordinal variables, the mode and median may be used.
- The median provides more information (taking into account the ranking of categories)
• For interval-ratio variables, the mode, median, and mean may all be calculated. The mean provides the most information about the distribution, but the median is preferred if the distribution is skewed.
MEAN - AVERAGE
MEDIAN - MIDDLE
MODE - MOST
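As a minimal sketch of the mean, median and mode described above, the Python below follows the rules in these notes: sum divided by N for the mean, ranking and picking the middle position for the median, and counting occurrences for the mode. The function names and the sample scores are made up for illustration and do not come from the lecture.

```python
from collections import Counter

def mean(values):
    # Sum of all observed values divided by the number of observations.
    return sum(values) / len(values)

def median(values):
    ranked = sorted(values)              # rank from smallest to largest
    n = len(ranked)
    if n % 2 == 1:
        # Odd N: the score at position (N+1)/2 (1-based).
        return ranked[(n + 1) // 2 - 1]
    # Even N: average of the scores at positions N/2 and N/2 + 1.
    return (ranked[n // 2 - 1] + ranked[n // 2]) / 2

def modes(values):
    # The mode is found by counting how often each value occurs.
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []                        # no repeated value, so no mode
    return [value for value, count in counts.items() if count == highest]

scores = [2, 3, 3, 4, 5, 7]              # hypothetical scores
with_outlier = scores + [60]             # same scores plus one outlier

print(mean(scores), median(scores), modes(scores))                     # 4.0 3.5 [3]
print(mean(with_outlier), median(with_outlier), modes(with_outlier))   # 12.0 4 [3]
```

In this made-up example the single outlier triples the mean, while the median and mode barely move, which is why the median is preferred when outliers are present.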
Family Medicine and Community Health I
To the left: a negatively skewed distribution
To the right: a positively skewed distribution
*Have you contributed to the trans system? It would be embarrassing if you haven't, right? So many are working hard and you aren't :P
SYMMETRY IN DATA SETS
The analysis of a data set often depends on whether the distribution is symmetric or non-symmetric.
SYMMETRIC DISTRIBUTION: the pattern of frequencies from a central point is the same (or nearly so) from the left and right
*If the distribution is normal
- Mean is the best measure of central tendency
- Most scores are "bunched up" in the middle
NON-SYMMETRIC DISTRIBUTION: the patterns from a central point from the left and right are different
Skewed to the left: tail extends out to the left
Skewed to the right: tail extends out to the right
RELATION BETWEEN THE MEASURES OF CENTRAL TENDENCY
• In symmetrical distributions, the median and mean are equal
- For normal distributions, MEAN = MEDIAN = MODE
• In positively skewed distributions, the mean is greater than the median
• In negatively skewed distributions, the mean is smaller than the median
MEASURES OF DISPERSION
• Central tendency doesn't tell us everything
• Dispersion/deviation/spread tells us a lot about how a variable is distributed
• Dispersion shows us how much these figures/variables differ from the average
The concept of dispersion: (examples)
✓ Typically, a large city will have more diversity than a small town
✓ Metro Manila is more racially diverse than other cities (Cebu, Davao)
✓ Some students are more consistent than others
Measures of dispersion are descriptive statistics that describe how similar a set of scores are to each other
• The more similar the scores are to each other, the lower the measure of dispersion will be
• The less similar the scores are to each other, the higher the measure of dispersion will be
• In general, the more spread out a distribution is, the larger the measured dispersion will be.
• Measures of dispersion give information on the spread or variability of the data values
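To illustrate why central tendency alone "doesn't tell us everything", here is a small sketch with two made-up sets of scores that share the same mean but differ in spread. The average distance from the mean used here is only an illustrative stand-in for "dispersion"; it is an assumption of this sketch, not one of the measures defined in these notes.

```python
def mean(values):
    return sum(values) / len(values)

def average_distance_from_mean(values):
    # Illustrative stand-in for spread: how far, on average,
    # each score lies from the mean (ignoring direction).
    center = mean(values)
    return sum(abs(x - center) for x in values) / len(values)

consistent_student = [78, 80, 80, 82]    # hypothetical scores
erratic_student = [60, 70, 90, 100]      # hypothetical scores

for name, scores in [("consistent", consistent_student),
                     ("erratic", erratic_student)]:
    print(name, mean(scores), average_distance_from_mean(scores))
# consistent 80.0 1.0
# erratic 80.0 15.0
```

Both students average 80, yet the second student's scores are far less similar to each other, so the measured dispersion is larger.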
RANGE
• Difference between the largest and the smallest observations:
Range = Xlargest – Xsmallest
• Ignores how the data are distributed
• Sensitive to extreme observations
• Gives an idea of the variability very quickly
• Suffers from a serious drawback: it considers only 2 values and neglects all the other values of the series
WHEN TO USE THE RANGE:
• The range is used when:
- You are presenting your results to people with little or no knowledge of statistics
• The range is rarely used in scientific work as it is fairly insensitive
• Usually used for daily temperature fluctuations or price movements
COEFFICIENT OF VARIATION
• Is a measure of relative variability used to:
- Measure changes that have occurred in a population over time
- Compare variability of two populations that are expressed in different units of measurement
• Expressed as a percentage rather than in terms of the units of the particular data
• It indicates the spread of values around the mean by a percentage
Coefficient of variation = standard deviation x 100 / mean
*The higher the coefficient of variation, the more widely spread the values are around the mean
*The purpose of the coefficient of variation is to compare values between different data sets
VARIANCE
• Variance is defined as the average of the squared deviations from the mean
- Important measure of variation
- Shows variation about the mean
- Sensitive to extreme observations
Properties
- It has squared units, which leads to defining the standard deviation.
- It is always non-negative, and equals zero if and only if all the observations are identical.
- The larger the variance is, the more the scores deviate, on average, away from the mean
- The smaller the variance is, the less the scores deviate, on average, from the mean
STANDARD DEVIATION
• Most important measure of variation
• Shows variation about the mean
• Has the same units as the original data
• It is simply the square root of the variance
WHAT DOES IT MEASURE?
• It measures the dispersion (or spread) of figures around the mean.
• A large number for the standard deviation means there is a wide spread of values around the mean, whereas a small number means the values are grouped close together around the mean
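The dispersion measures above (range, variance, standard deviation, coefficient of variation) can be sketched in a few lines of Python. This sketch assumes the "average of the squared deviations" means dividing by N (the population form); a sample variance would divide by N-1 instead. The data and function names are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def value_range(values):
    # Largest observation minus smallest observation.
    return max(values) - min(values)

def variance(values):
    # Average of the squared deviations from the mean (divides by N,
    # an assumption; use N - 1 for a sample variance).
    center = mean(values)
    return sum((x - center) ** 2 for x in values) / len(values)

def standard_deviation(values):
    # Square root of the variance, so it is back in the original units.
    return math.sqrt(variance(values))

def coefficient_of_variation(values):
    # Standard deviation x 100 / mean, expressed as a percentage.
    return standard_deviation(values) * 100 / mean(values)

lengths_cm = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical measurements
lengths_mm = [x * 10 for x in lengths_cm]      # same data in other units

print(value_range(lengths_cm))                 # 7
print(variance(lengths_cm))                    # 4.0
print(standard_deviation(lengths_cm))          # 2.0
print(coefficient_of_variation(lengths_cm))    # 40.0
print(coefficient_of_variation(lengths_mm))    # 40.0
```

Changing the units multiplies the standard deviation by 10 but leaves the coefficient of variation unchanged, which is what makes it usable for comparing data sets expressed in different units.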
Standard deviation
• Standard deviation tells us a lot about a
distribution, particularly if that distribution is
normally distributed.
• It tells us that about 68% of all values will fall within 1 SD of the mean, 95% fall within 2 SDs and 99.7% fall within 3 SDs
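The percentages in the bullet above can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module (Python 3.8+); the helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

def share_within(k, mean=0.0, sd=1.0):
    # Proportion of a normal distribution lying within k SDs of the mean.
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
# 1 68.3
# 2 95.4
# 3 99.7
```

The result does not depend on the particular mean and SD passed in, only on how many SDs away from the mean the cutoffs are.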
THE NORMAL CURVE
The properties of the normal distribution:
1. It is bell shaped and unimodal.
2. It's symmetrical around the mean. Data values concentrate around the mean.
3. The mean, median and mode all have the same value.
Measures of relative standing
A. Percentiles
B. Z scores
• Measures of relative standing tell us something about a given score by reporting how it relates to other scores.
PERCENTILES
- The distribution is divided into 100 equal parts with the median at the 50th percentile.
- The 50th percentile is that observation or number that has 50% of the observations below it and the other 50% above it: this is simply the "middle" observation when the set of observations is arranged in order of magnitude
- The most commonly used percentiles are the 25th, 50th, and 75th percentiles
Example
In a set of 200 observations, if a number X is larger than 150 of the observations, then X is the 75th percentile (150/200 = 75%)
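The percentile example above can be reproduced with a short sketch. It assumes the simple definition used in the example (the percentage of observations that X is larger than); other percentile conventions exist, and the function name and data are hypothetical.

```python
def percentile_rank(x, observations):
    # Percentage of observations that are strictly smaller than x.
    below = sum(1 for value in observations if value < x)
    return 100 * below / len(observations)

observations = list(range(1, 201))   # 200 made-up observations: 1..200
x = 151                              # larger than 150 of them

print(percentile_rank(x, observations))   # 75.0
```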
Relationship between the normal curve and the standard deviation: (picture)
THE Z-SCORE
• Each group has a distribution, but in their original form the groups are not comparable
• Each original score can be converted to a z-score, which is a standard score that can be compared across groups
Z = (X − Mean)/S
Z = z-score
X = score
Mean = mean of the distribution
S = standard deviation of the distribution
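A minimal sketch of the z-score formula above, used to compare scores from two groups whose raw scores are not directly comparable. The exam figures are invented for illustration.

```python
def z_score(x, mean, sd):
    # Distance of a score from its group mean, in standard deviation units.
    return (x - mean) / sd

# Hypothetical exams with different means and standard deviations.
student_a = z_score(85, mean=75, sd=5)     # scored 85 on exam A
student_b = z_score(90, mean=80, sd=10)    # scored 90 on exam B

print(student_a)   # 2.0
print(student_b)   # 1.0
```

Student B has the higher raw score, but student A stands further above their own group's mean once both scores are put on the same standard scale.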
• A measure of an observation’s distance from the
mean
• The distance is measured in standard deviation
units
- If a z score is zero, it’s on the mean
- If a z score is positive, it’s above the mean
- If a z score is negative, it's below the mean
- If a z score is 1, it’s 1 SD above the mean
- If a z score is -2, it’s 2 SDs below the mean
Characteristics, continued
- Given the standard distribution of scores within a
normal curve, the following statements are true:
o 84% of the scores fall below z-score of 1
o 16% of the scores fall above z-score of 1
- The more extreme the z-score, the farther it is from the mean
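The 84% and 16% figures above follow from the normal curve and can be checked with `NormalDist` from Python's standard `statistics` module (Python 3.8+). The helper name `percent_below` is an assumption of this sketch.

```python
from statistics import NormalDist

def percent_below(z):
    # Percentage of scores in a normal distribution below a given z-score.
    return NormalDist().cdf(z) * 100

print(round(percent_below(1)))          # 84
print(round(100 - percent_below(1)))    # 16
print(round(percent_below(0)))          # 50
```

A z-score of 0 sits on the mean, so half of the scores fall below it, consistent with the mean and median coinciding in a normal distribution.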