Measures of Central Tendency

- The purpose of a measure of central tendency
(also called a measure of center or central
location) is to describe a whole set of data with
a single value that represents the middle or
center of its distribution.
- It is a way to describe the center of a data set.
- It lets us know what is normal or 'average' for a
set of data.
- condenses the data set down to one
representative value, which is useful when you
are working with large amounts of data.

Mean
- the sum of all the values in a dataset divided
by the total number of observations. This is
also known as the arithmetic average.
- can be used for both continuous and discrete
numeric data, but not for categorical data, as
the values cannot be summed.
- applicable to ratio and interval data.
Mode
- can be found for both numerical and categorical
(non-numerical) data.
- most commonly occurring value in a
distribution.
- There can be more than one mode for the same
distribution of data (bi-modal or multi-modal).
- The distribution may have no mode at all (i.e., if
all values are different).

Median
- is considered the physical middle point of a
distribution because it is located at the center
position when the values are arranged in
ascending or descending order.
- the middle value.
- If the number of observations is even, the
median is the mean (average) of the two middle
values.
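
As a minimal sketch of the three measures above, the snippet below uses Python's standard `statistics` module. The `scores` and `colors` lists are hypothetical data made up for illustration; they are not from the notes.

```python
from statistics import mean, median, multimode

# Hypothetical data (an even number of values, so the median
# is the average of the two middle values, 5 and 6).
scores = [3, 5, 5, 6, 8, 9]

scores_mean = mean(scores)        # sum of values / number of values
scores_median = median(scores)    # middle of the sorted values
scores_modes = multimode(scores)  # list of the most frequent value(s)

# The mode also works for categorical (non-numerical) data.
colors = ["red", "blue", "red", "green"]
color_modes = multimode(colors)

print(scores_mean, scores_median, scores_modes, color_modes)
```

Note that `multimode` returns every value when all values are different, which corresponds to the "no mode at all" case described in the notes.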

--------------------------------------------------------------------

Measures of Variation

- Range
- Variance
- Mean Absolute Deviation
- Standard Deviation
- Quartile and Interquartile Deviation

- Events of nature vary from time to time. People
keep on changing their location, motion, physical
appearance, skin reaction to different chemicals,
height, weight, hair color, eye color, ideas, and even
values in life. Usually, the heights of a group of
people of the same race tend to converge to a
certain common value.
- The measures of variation will enable you to
know how varied the observations are, whether
there are extreme values in the distribution, or
whether the values are very close to each other.
- If the measure of variation is zero, it means that
there is no variation at all and that the
observations are all alike, or homogeneous.
Otherwise, they are heterogeneous.
- The common measures of variation are:
range, mean absolute deviation, variance,
standard deviation, and quartile deviation &
interquartile deviation.

Range

- is the simplest form of measuring the variation
of a distribution.
- simple to compute and useful when you wish
to evaluate the whole of a dataset.
- useful for showing the spread within a dataset
and for comparing the spread between similar
datasets.
- Range (R) = HIGHEST OBSERVATION – LOWEST
OBSERVATION

Mean Absolute Deviation (MAD)

- The mean absolute deviation of a set of data is
the average distance between each data value
and the mean.

Ungrouped data:

1. Find the mean.
2. Find the distance between each data value and
the mean. That is, find the absolute value of the
difference between each data value and the
mean.
3. Find the average of those differences.
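
The range (highest observation minus lowest observation) and the three mean absolute deviation steps (find the mean, take the absolute differences from it, average them) can be sketched as follows. The function names and the `data` list are hypothetical and chosen only for illustration.

```python
def value_range(values):
    """Range (R) = highest observation - lowest observation."""
    return max(values) - min(values)


def mean_absolute_deviation(values):
    """Average distance between each data value and the mean."""
    n = len(values)
    mean = sum(values) / n                       # step 1: find the mean
    distances = [abs(x - mean) for x in values]  # step 2: absolute differences
    return sum(distances) / n                    # step 3: average them


data = [2, 4, 6, 8, 10]  # hypothetical ungrouped data
print(value_range(data))              # 10 - 2
print(mean_absolute_deviation(data))  # (4 + 2 + 0 + 2 + 4) / 5
```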
Variance

- is another measure of variation, which can be
used instead of the range.
- The variance considers the deviation of each
observation from the mean.
- To obtain the variance of a distribution, first
square the deviation of each raw score from the
mean and add the squares together.
- Then, divide the resulting sum by N, the total
number of cases.
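
A minimal sketch of this procedure, dividing by N as the notes describe (that is, the population variance). The function name and `data` are hypothetical; the result is cross-checked against `statistics.pvariance` from the standard library.

```python
from statistics import pvariance


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / n


data = [2, 4, 6, 8, 10]  # hypothetical raw scores
print(population_variance(data))  # (16 + 4 + 0 + 4 + 16) / 5
print(pvariance(data))            # standard-library equivalent
```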

Standard Deviation

- The standard deviation, (σ) for a population and
(s) for a sample, is the square root of the value
of the variance. In symbols and formula for
grouped and ungrouped data:
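
For ungrouped data, a rough sketch of both versions is shown below. It assumes the usual convention, which the notes do not spell out, that the population form divides by N while the sample form divides by n - 1. The function names and `data` are hypothetical.

```python
from math import sqrt
from statistics import pstdev, stdev


def population_sd(values):
    """sigma: square root of the variance computed with divisor N."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / n)


def sample_sd(values):
    """s: square root of the variance computed with divisor n - 1 (assumed convention)."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


data = [2, 4, 6, 8, 10]  # hypothetical ungrouped data
print(round(population_sd(data), 4), round(pstdev(data), 4))
print(round(sample_sd(data), 4), round(stdev(data), 4))
```

The sample value is slightly larger than the population value because of the smaller divisor.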

Quartile Deviation (QD)

- is another way of determining the spread of a
distribution in terms of quartiles. The quartile
deviation formula is shown below:

Interquartile Deviation (IQD)

*an answer with a decimal is relative to the next.
*always round up
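
The sketch below assumes the usual textbook definitions, which are not stated in the notes: the quartile deviation as half the distance between the third and first quartiles, QD = (Q3 - Q1) / 2, and the interquartile spread as Q3 - Q1. It uses `statistics.quantiles`, which interpolates between values, so its quartiles may differ from a hand method that rounds positions up. The `data` list is hypothetical.

```python
from statistics import quantiles

data = [2, 4, 6, 8, 10, 12, 14]  # hypothetical ordered observations

# n=4 asks for the three cut points Q1, Q2, Q3.
q1, q2, q3 = quantiles(data, n=4)

interquartile_spread = q3 - q1      # assumed definition: Q3 - Q1
quartile_deviation = (q3 - q1) / 2  # assumed definition: (Q3 - Q1) / 2

print(q1, q3, interquartile_spread, quartile_deviation)
```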

-------------------------------------------------------------------------

Quantiles

- are cut points in a score distribution where the
scores are divided into equal parts. There are
three kinds of quantiles, namely: Quartiles,
Deciles, and Percentiles.

Quartile

- A measure of position that divides the ordered
observations or score distribution into 4 equal
parts.

Decile

- A measure of position that divides the ordered
observations or score distribution into 10 equal
parts.

Percentile

- A measure of position that divides the ordered
observations or score distribution into 100
equal parts.
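
As an illustration, the three kinds of quantiles can be obtained with `statistics.quantiles` by changing the number of parts `n`. Dividing into n equal parts takes n - 1 cut points. The `scores` list is hypothetical, and the function interpolates between values, which is only one of several conventions for locating a quantile.

```python
from statistics import quantiles

scores = list(range(1, 101))  # hypothetical scores 1, 2, ..., 100

quartiles = quantiles(scores, n=4)      # 3 cut points -> 4 equal parts
deciles = quantiles(scores, n=10)       # 9 cut points -> 10 equal parts
percentiles = quantiles(scores, n=100)  # 99 cut points -> 100 equal parts

print(len(quartiles), len(deciles), len(percentiles))
print(quartiles[1])  # the second quartile is the median
```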

-------------------------------------------------------------------------

Normal Distribution

- A normal distribution is a continuous,
symmetric, bell-shaped distribution of a
variable. The known characteristics of the
normal curve make it possible to estimate the
probability of occurrence of any value of a
normally distributed variable.
- Most scientific and business data and natural
relationships, such as weight, height, etc., when
displayed using a histogram or frequency curve,
are bell-shaped and symmetrical. This shape is
known as the Normal Distribution (size of things
produced by machines, errors in measurements,
heights of people, blood pressure, scores on a
test).

Properties of a Normal Distribution

1.) The distribution is bell-shaped.
2.) The mean, median, and mode are equal and are
located at the center of the distribution.
3.) The normal distribution is unimodal.
4.) The normal distribution curve is symmetric
about the mean (the shape is the same on both
sides).
5.) The normal distribution is continuous.
6.) The normal curve is asymptotic (it never
touches the x-axis).
7.) The total area under the normal distribution
curve is 1.00 or 100%.
8.) The area under the part of a normal curve that
lies within 1 standard deviation of the mean is
about 68%; within 2 standard deviations, about
95%; and within 3 standard deviations, about
99.7%.

The z-value

- A normal distribution can be converted into a
standard normal distribution by obtaining the z
value. A z value is the signed distance between a
selected value, designated x, and the mean,
divided by the standard deviation. It is also
called the z score, the z statistic, the standard
normal deviate, or the standard normal value. In
terms of formula:
z = (x − μ) / σ, where μ is the mean and σ is the
standard deviation.
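
A short sketch of the z value (the signed distance of x from the mean, divided by the standard deviation) and of the 68%, 95%, and 99.7% areas, using `statistics.NormalDist` from the standard library. The function names and the example numbers (x = 85, mean 70, standard deviation 10) are hypothetical.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Signed distance between x and the mean, in standard deviations."""
    return (x - mean) / sd


standard_normal = NormalDist(mu=0, sigma=1)


def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


print(z_score(85, mean=70, sd=10))  # 1.5 standard deviations above the mean
for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```

The rounded areas come out near 0.68, 0.95, and 0.997, matching the percentages quoted for 1, 2, and 3 standard deviations.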
------------------------------------------------------------------------

Correlation

- finding the relationship between two
quantitative variables without being able to
infer causal relationships. It is a statistical
technique used to determine the degree to
which two variables are related.

Scatter Diagram

- rectangular coordinates
- 2 quantitative variables
- One variable is called independent (X) and the
second is called dependent (Y)
- Points are not joined
- No frequency table

CORRELATION

Positive Correlation

- Situation: both variables increase

Negative Correlation

- Situation: one variable increases while the other
variable decreases

No Correlation

- Situation: one variable neither increases nor
decreases while the other variable increases

RELATIONSHIP OF VARIABLES

Direct Relationship

- If the line is slanting upward to the right

Inverse Relationship

- If the line is slanting upward to the left

Pearson Product-Moment Correlation

- In Statistics, correlation is the interdependence
between two variable quantities.
- Pearson product-moment correlation is the
most widely used in statistics to measure the
degree of the relationship between linearly
related variables.
- The Pearson r correlation would require both
variables to be normally distributed. Correlation
refers to the departure of two variables from
independence.
- For example, in the stock market, if we want to
measure how two products are related to each
other, Pearson r correlation is used to measure
the degree of relationship between the two
products.

Spearman Rank Correlation

- Spearman rank correlation (or Spearman's rho)
is a nonparametric test that is used to measure
the degree of association between two
variables.
- It is often denoted by ρ (rho) or as r_s. Spearman
rank correlation is the counterpart of Pearson
Product-Moment Correlation in parametric
statistics.
- It is calculated by converting each variable to
ranks and calculating the Pearson Product-
Moment Correlation between the two sets of
ranks.
- For small sample sizes, the observed correlation
coefficient is compared to what would result if
the ranks of the X-values and Y-values were
random permutations of the integers 1 to n
(sample size). The following formula is used to
calculate the Spearman rank correlation:
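
Both coefficients can be sketched in a few lines. The Spearman version below follows the description in the notes (convert each variable to ranks, then take the Pearson correlation of the ranks). For comparison it also evaluates the shortcut commonly given for untied ranks, r_s = 1 - 6Σd² / (n(n² - 1)), where d is the difference between the two ranks of an observation; this is assumed to be the formula the notes refer to. All function names and the `hours` and `scores` data are hypothetical, and the simple ranking helper assumes there are no tied values.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Rank 1 = smallest value. Assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(xs, ys):
    """Pearson correlation computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


def spearman_rho_shortcut(xs, ys):
    """1 - 6 * sum(d^2) / (n * (n^2 - 1)); valid only without ties."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


hours = [1, 2, 3, 4, 5]        # hypothetical X values
scores = [52, 60, 57, 70, 85]  # hypothetical Y values

print(round(pearson_r(hours, scores), 3))
print(round(spearman_rho(hours, scores), 3))
print(round(spearman_rho_shortcut(hours, scores), 3))
```

With no ties, the rank-based calculation and the shortcut give the same value.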


Regression

- Uses a variable (x) to predict some outcome
variable (y)
- Tells you how values in y change as a function of
changes in values of x
- Linear means “straight line”
- Regression tells us how to draw the straight
line described by the correlation
- It calculates the “best-fit” line for a certain set of
data.
- The regression line makes the sum of the
squares of the residuals smaller than for any
other line.
- Regression minimizes residuals.
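
The "best-fit" idea above can be sketched with the ordinary least-squares formulas for a straight line, y = intercept + slope * x. The function names and the small data set are hypothetical. The last lines compare the sum of squared residuals of the fitted line with that of one arbitrarily chosen other line.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Sum of squared differences between observed and predicted y."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


xs = [1, 2, 3, 4, 5]  # hypothetical predictor values
ys = [2, 4, 5, 4, 5]  # hypothetical outcome values

slope, intercept = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, slope, intercept)
other = sum_squared_residuals(xs, ys, slope=0.5, intercept=2.5)

print(round(slope, 3), round(intercept, 3))
print(best < other)  # the fitted line has the smaller sum of squares
```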

Regression Analysis

- Regression: a technique concerned with
predicting some variables by knowing others
- The process of predicting variable Y using
variable X
