
The Five Number Summary

The five numbers that help describe the center, spread, and shape of data are:
• Xsmallest
• First Quartile (Q1)
• Median (Q2)
• Third Quartile (Q3)
• Xlargest
Relationships among the five-number summary and distribution shape

Comparison                                     Left-Skewed   Symmetric   Right-Skewed
(Median – Xsmallest) vs. (Xlargest – Median)        >            ≈            <
(Q1 – Xsmallest) vs. (Xlargest – Q3)                >            ≈            <
(Median – Q1) vs. (Q3 – Median)                     >            ≈            <



Five Number Summary and the Boxplot

• The Boxplot: a graphical display of the data based on the five-number summary:
  Xsmallest -- Q1 -- Median -- Q3 -- Xlargest

Example:
[Figure: boxplot with Xsmallest, Q1, Median, Q3, and Xlargest marked; each of the four segments between consecutive markers contains 25% of the data.]


Five Number Summary: Shape of Boxplots
• If data are symmetric around the median, then the box and central line are centered between the endpoints.
[Figure: symmetric boxplot with Xsmallest, Q1, Median, Q3, and Xlargest marked.]
• A boxplot can be shown in either a vertical or horizontal orientation.
Distribution Shape and Box-and-Whisker Plot

[Figure: three box-and-whisker plots, labeled Left-Skewed, Symmetric, and Right-Skewed, each with Q1, Q2, and Q3 marked.]
Box-and-Whisker Plot Example

• Below is a box-and-whisker plot for the following data:

  0 2 2 2 3 3 4 5 5 10 27

  Min   Q1   Q2   Q3   Max
   0     2    3    5    27

[Figure: box-and-whisker plot of the data above.]
• These data are strongly right-skewed, as the plot shows.
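The five values in the example can be checked with a short script. This is only a sketch: the slides do not say which quartile convention they use, so the code assumes the "exclusive" rule (quartiles at ranks (n + 1) × p), which happens to reproduce the values shown. The function name `five_number_summary` is made up for illustration.

```python
from statistics import quantiles

# Data from the box-and-whisker example.
data = [0, 2, 2, 2, 3, 3, 4, 5, 5, 10, 27]


def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest) for a list of numbers."""
    ordered = sorted(values)
    # Assumption: method="exclusive" puts the quartiles at ranks (n + 1) * p.
    q1, median, q3 = quantiles(ordered, n=4, method="exclusive")
    return ordered[0], q1, median, q3, ordered[-1]


summary = five_number_summary(data)
smallest, q1, median, q3, largest = summary

# Shape check from the five-number summary: a longer upper tail
# (Xlargest - Median > Median - Xsmallest) points to right skew.
right_skewed = (largest - median) > (median - smallest)

print(summary)       # (0, 2.0, 3.0, 5.0, 27)
print(right_skewed)  # True
```

Other quartile conventions can give slightly different Q1 and Q3 for the same data, so results from spreadsheet or statistics packages may not match exactly.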
Methods for Detecting Outliers
• Outlier – an observation that is unusually large or small relative to the data values being described
• Causes
  • Invalid measurement or misclassified measurement
  • A rare event
• Two detection methods
  • Box plots
  • Z-scores
Box Plots
• A box plot allows you to:
  • Graphically display the distribution of a data set.
  • Compare two or more distributions.
  • Identify outliers in a data set.

[Figure: box plot with the box, the whiskers, and the outliers (marked **) labeled.]
Box Plots
• Based on quartiles
• Lower quartile QL – 25th percentile
• Middle quartile – median
• Upper quartile QU – 75th percentile
• Interquartile range (IQR) = QU – QL
Box Plots (cont.)
[Figure: vertical box plot, annotated from top to bottom as follows.]
• Outer fence = Q3 + 3.0 × IQR
• Outlier (marked *)
• Inner fence = Q3 + 1.5 × IQR
• Whisker
• Q3 (75th percentile)
• Median
• Q1 (25th percentile)
• Whisker
• Inner fence = Q1 – 1.5 × IQR
• Outer fence = Q1 – 3.0 × IQR

Interquartile range: IQR = Q3 – Q1


Box Plots

• The box plot displays 5 summary values:
  • S = smallest value
  • L = largest value
  • Q1 = first quartile = 25th percentile
  • Q2 = median = second quartile = 50th percentile
  • Q3 = third quartile = 75th percentile
Aggregate Price Indexes
• An aggregate index is used to measure the rate of change from a base period for a group of items.

[Diagram: aggregate price indexes divide into the unweighted aggregate price index and the weighted aggregate price indexes; the weighted indexes are the Paasche index and the Laspeyres index.]



Paasche Index

$$P = \frac{\sum p_t q_t}{\sum p_0 q_t} \times 100$$

Where
$P$ is the price index
$p_t$ is the current price
$p_0$ is the price of the base period
$q_t$ is the quantity used in the current period
$q_0$ is the quantity used in the base period

Advantages: Because it uses quantities from the current period, it reflects current buying habits.

Disadvantages: It requires quantity data for the current year. Because different quantities are used each year, it is impossible to attribute changes in the index to changes in price alone. It tends to overweight the goods whose prices have declined. It requires the prices to be recomputed each year.
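Laspeyres Index

$$P = \frac{\sum p_t q_0}{\sum p_0 q_0} \times 100$$

Here $P$ is the price index, $p_t$ and $p_0$ are the current and base-period prices, and $q_0$ is the quantity used in the base period.

Advantages: Requires quantity data from only the base period. This allows a more meaningful comparison over time. The changes in the index can be attributed to changes in the price.

Disadvantages: Does not reflect changes in buying patterns over time. Also, it may overweight goods whose prices increase.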
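To make the difference between the two weightings concrete, here is a minimal sketch that computes both indexes for a made-up three-item basket. The item names, prices, and quantities are hypothetical and do not come from the slides; the formulas assume the standard definitions (current-period quantities for Paasche, base-period quantities for Laspeyres).

```python
# Hypothetical basket: (base price p0, current price pt, base quantity q0, current quantity qt)
basket = {
    "bread": (2.0, 3.0, 10, 8),
    "milk": (1.0, 1.5, 20, 18),
    "eggs": (3.0, 3.0, 5, 6),
}


def laspeyres(items):
    """Weighted aggregate index using base-period quantities q0."""
    current_cost = sum(pt * q0 for p0, pt, q0, qt in items.values())
    base_cost = sum(p0 * q0 for p0, pt, q0, qt in items.values())
    return current_cost / base_cost * 100


def paasche(items):
    """Weighted aggregate index using current-period quantities qt."""
    current_cost = sum(pt * qt for p0, pt, q0, qt in items.values())
    base_cost = sum(p0 * qt for p0, pt, q0, qt in items.values())
    return current_cost / base_cost * 100


laspeyres_index = laspeyres(basket)
paasche_index = paasche(basket)
print(round(laspeyres_index, 2))  # 136.36
print(round(paasche_index, 2))    # 132.69
```

In this invented basket, buyers shift away from the items whose prices rose, so the Paasche index comes out below the Laspeyres index.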
Changing Consumption Patterns
• The Laspeyres and Paasche methods provide similar results if the time periods being compared are not too far apart.

• Over time, consumers tend to adjust their consumption patterns. As a result, the Paasche index will tend to produce a lower estimate than the Laspeyres index if prices are rising, and a higher estimate than the Laspeyres index if they are falling.

• However, since the Paasche index requires weights to be updated each year, in practice the Laspeyres index is more widely used.
Nominal versus Real Values
CPI Uses - Formulas
CPI and Real Income
CPI is used to determine real disposable personal income, to deflate sales or other variables, to find the purchasing power of the dollar, and to establish cost-of-living increases.
CPI and Real Income
The Consumer Price Index is also used to determine the purchasing power of the dollar.

Suppose the Consumer Price Index this month is 200.0 (1982–84 = 100). What is the purchasing power of the dollar?
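One way to work this out, assuming the usual definition of purchasing power as the base-period index value divided by the current CPI (the helper name `purchasing_power` is illustrative):

```python
def purchasing_power(cpi, base=100.0):
    """Value of one dollar today, measured in base-period dollars."""
    return base / cpi


value = purchasing_power(200.0)
print(value)  # 0.5
```

Under that definition, a dollar this month buys what $0.50 bought in the 1982–84 base period.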
Inflation Rate
Example
• The CPI values for 2006, 2007, and 2008 were reported by the Bureau of Labor Statistics as 201.59, 207.34, and 215.30, respectively.

• Let's use these values to compute the inflation rates for 2007 and 2008:
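A short sketch of the calculation, assuming the inflation rate is defined as the percentage change in the CPI from the previous year (the function name is illustrative):

```python
# CPI values quoted in the example.
cpi = {2006: 201.59, 2007: 207.34, 2008: 215.30}


def inflation_rate(cpi_by_year, year):
    """Percentage change in the CPI relative to the previous year."""
    previous = cpi_by_year[year - 1]
    return (cpi_by_year[year] - previous) / previous * 100


rate_2007 = inflation_rate(cpi, 2007)
rate_2008 = inflation_rate(cpi, 2008)
print(round(rate_2007, 2))  # 2.85
print(round(rate_2008, 2))  # 3.84
```

So prices rose by about 2.85% in 2007 and about 3.84% in 2008 on this measure.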
Conditional Probability

• A conditional probability is the probability of a particular event occurring, given that another event has occurred.

• The probability of the event A given that the event B has occurred is written P(A|B).
Conditional Probability Example

• Of the cars on a used car lot, 70% have air conditioning (AC) and 40% have a CD player (CD). 20% of the cars have both.

• What is the probability that a car has a CD player, given that it has AC?

  i.e., we want to find P(CD | AC)



Conditional Probability Example (continued)
• Of the cars on a used car lot, 70% have air conditioning (AC) and 40% have a CD player (CD). 20% of the cars have both.

          CD    No CD   Total
AC        .2     .5      .7
No AC     .2     .1      .3
Total     .4     .6     1.0

$$P(CD \mid AC) = \frac{P(CD \text{ and } AC)}{P(AC)} = \frac{.2}{.7} = .2857$$
Conditional Probability Example (continued)
• Given AC, we only consider the top row (70% of the cars). The cars in that row that also have a CD player make up 20% of the whole lot, so the share of AC cars with a CD player is .2 / .7, or about 28.57%.

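The same calculation can be written as a lookup on the joint probability table. This is a small sketch; the dictionary layout and function names are illustrative choices.

```python
# Joint probabilities from the used car lot table: (AC status, CD status) -> probability.
joint = {
    ("AC", "CD"): 0.2,
    ("AC", "No CD"): 0.5,
    ("No AC", "CD"): 0.2,
    ("No AC", "No CD"): 0.1,
}


def marginal_ac(ac_status):
    """Row total: probability of the given AC status."""
    return sum(p for (ac, _cd), p in joint.items() if ac == ac_status)


def conditional(cd_status, ac_status):
    """P(cd_status | ac_status) = joint probability / row total."""
    return joint[(ac_status, cd_status)] / marginal_ac(ac_status)


p_cd_given_ac = conditional("CD", "AC")
print(round(p_cd_given_ac, 4))  # 0.2857
```

Dividing by the row total is what "only consider the top row" means in practice: the AC row is rescaled so that its entries sum to 1.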
Tree Diagrams

A tree diagram is useful for portraying conditional and joint probabilities. It is particularly useful for analyzing business decisions involving several stages.

A tree diagram is a graph that is helpful in organizing calculations that involve several stages. Each segment in the tree is one stage of the problem. The branches of a tree diagram are weighted by probabilities.
Elementary Events
• An automobile consultant records fuel type and vehicle type for a sample of vehicles.
  2 fuel types: Gasoline, Diesel
  3 vehicle types: Truck, Car, SUV
  6 possible elementary events:
    e1  Gasoline, Truck
    e2  Gasoline, Car
    e3  Gasoline, SUV
    e4  Diesel, Truck
    e5  Diesel, Car
    e6  Diesel, SUV

[Figure: tree diagram branching first by fuel type and then by vehicle type, with leaves e1 through e6.]
Tree Diagram Example

Fuel type               Vehicle type             Joint probability
Gasoline: P(E1) = 0.8   Truck: P(E3|E1) = 0.2    P(E1 and E3) = 0.8 x 0.2 = 0.16
                        Car:   P(E4|E1) = 0.5    P(E1 and E4) = 0.8 x 0.5 = 0.40
                        SUV:   P(E5|E1) = 0.3    P(E1 and E5) = 0.8 x 0.3 = 0.24
Diesel:   P(E2) = 0.2   Truck: P(E3|E2) = 0.6    P(E2 and E3) = 0.2 x 0.6 = 0.12
                        Car:   P(E4|E2) = 0.1    P(E2 and E4) = 0.2 x 0.1 = 0.02
                        SUV:   P(E5|E2) = 0.3    P(E2 and E5) = 0.2 x 0.3 = 0.06

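The branch products in the tree can be reproduced programmatically. In this sketch, E1 and E2 are the fuel-type events and E3, E4, E5 are the vehicle-type events; the conditional probabilities are the second factors of the products shown in the tree. The dictionary layout is an illustrative choice.

```python
import math

# First-stage probabilities (fuel type).
p_fuel = {"E1": 0.8, "E2": 0.2}

# Second-stage conditional probabilities P(vehicle | fuel).
p_vehicle_given_fuel = {
    "E1": {"E3": 0.2, "E4": 0.5, "E5": 0.3},
    "E2": {"E3": 0.6, "E4": 0.1, "E5": 0.3},
}

# Joint probability of each path = product of the branch weights along it.
joint = {
    (fuel, vehicle): p_fuel[fuel] * p_cond
    for fuel, branches in p_vehicle_given_fuel.items()
    for vehicle, p_cond in branches.items()
}

# Marginal probability of E4 (car): add the joint probabilities of both car paths.
p_e4 = joint[("E1", "E4")] + joint[("E2", "E4")]

for path, p in joint.items():
    print(path, round(p, 2))
print("P(E4) =", round(p_e4, 2))  # P(E4) = 0.42
```

The six joint probabilities sum to 1, which is a quick consistency check for any tree diagram.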


Bayes’ Theorem
• A procedure for updating probabilities based on new information.
• Prior probability is the original (unconditional) probability (e.g., P(B)).
• Posterior probability is the updated (conditional) probability (e.g., P(B | A)).
Bayes’ Theorem
• Given a set of prior probabilities for an event and some new information, the rule for updating the probability of the event is called Bayes’ theorem.

$$P(B \mid A) = \frac{P(A \cap B)}{P(A \cap B) + P(A \cap B^c)}$$

or

$$P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A \mid B)\,P(B) + P(A \mid B^c)\,P(B^c)}$$
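As a quick illustration of the second form, the sketch below applies it to the numbers from the used car lot example earlier in these slides, taking B = "has AC" and A = "has a CD player". That assignment of roles is chosen here for illustration and is not the worked example from the slides; the function name is also illustrative.

```python
import math


def bayes_posterior(p_a_given_b, p_b, p_a_given_not_b):
    """P(B | A) from the prior P(B) and the two conditional probabilities of A."""
    p_not_b = 1 - p_b
    numerator = p_a_given_b * p_b
    return numerator / (numerator + p_a_given_not_b * p_not_b)


# Used car lot: P(AC) = 0.7, P(CD and AC) = 0.2, P(CD and No AC) = 0.2.
p_ac = 0.7
p_cd_given_ac = 0.2 / 0.7     # P(A | B)
p_cd_given_no_ac = 0.2 / 0.3  # P(A | B^c)

p_ac_given_cd = bayes_posterior(p_cd_given_ac, p_ac, p_cd_given_no_ac)
print(round(p_ac_given_cd, 4))  # 0.5
```

The prior probability of AC is 0.7; after learning that the car has a CD player, the posterior drops to 0.5, which matches reading the CD column of the table directly (.2 / .4).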
Bayes Theorem – Example 1
Binomial Distribution Formula

$$P(x) = \frac{n!}{x!\,(n - x)!}\, p^{x} q^{\,n - x}$$

P(x) = probability of x successes in n trials, with probability of success p on each trial

x = number of ‘successes’ in sample (x = 0, 1, 2, ..., n)
n = number of trials (sample size)
p = probability of “success” per trial
q = probability of “failure” = (1 – p)

Example: Flip a coin four times, let x = # heads:
n = 4
p = 0.5
q = (1 – .5) = .5
x = 0, 1, 2, 3, 4
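The formula translates almost directly into code. This sketch evaluates it for the coin-flip example (n = 4, p = 0.5); the function name `binomial_pmf` is illustrative.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(x) = n! / (x! (n - x)!) * p**x * q**(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q ** (n - x)


# Coin-flip example: four flips, x = number of heads.
n, p = 4, 0.5
distribution = {x: binomial_pmf(x, n, p) for x in range(n + 1)}
print(distribution)
# {0: 0.0625, 1: 0.25, 2: 0.375, 3: 0.25, 4: 0.0625}
```

Two heads is the most likely outcome (0.375), and the five probabilities sum to 1.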
Binomial Probability Distribution
• A binomial random variable is defined as the number of successes achieved in the n trials of a Bernoulli process.
• A Bernoulli process consists of a series of n independent and identical trials of an experiment such that on each trial:
  • There are only two possible outcomes:
    p = probability of a success
    1 − p = q = probability of a failure
  • Each time the trial is repeated, the probabilities of success and failure remain the same.
Cumulative Binomial Probability Distributions - Example

A study by the Illinois Department of Transportation concluded that 76.2 percent of front seat occupants used seat belts. A sample of 12 vehicles is selected. What is the probability the front seat occupants in at least 7 of the 12 vehicles are wearing seat belts?
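One way to answer this is to sum the binomial probabilities for x = 7 through x = 12. The sketch below does that, assuming the 12 vehicles behave as independent trials with p = 0.762; the function names are illustrative.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def prob_at_least(k, n, p):
    """P(X >= k): sum the probabilities of k, k + 1, ..., n successes."""
    return sum(binomial_pmf(x, n, p) for x in range(k, n + 1))


# Seat belt example: n = 12 vehicles, p = 0.762, at least 7 belted.
p_at_least_7 = prob_at_least(7, 12, 0.762)
print(round(p_at_least_7, 3))  # roughly 0.956
```

The same number can be obtained as 1 minus the cumulative probability of 6 or fewer, which is how a cumulative binomial table would typically be used.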
Expected Probability
• Expected return.
  • Given a portfolio with two assets, Asset A and Asset B, the expected return of the portfolio E(Rp) is computed as:

$$E(R_p) = w_A E(R_A) + w_B E(R_B)$$

  • where
    wA and wB are the portfolio weights (the proportion of funds invested in each asset), with wA + wB = 1
    E(RA) and E(RB) are the expected returns on assets A and B, respectively.
Portfolio Returns
• Expected return, variance, and standard deviation of portfolio returns.
  • Using the covariance or the correlation coefficient of the two returns, the portfolio variance of return is:

$$\mathrm{Var}(R_p) = w_A^2 \sigma_A^2 + w_B^2 \sigma_B^2 + 2 w_A w_B \rho_{AB} \sigma_A \sigma_B$$

where
$\sigma_A^2$ and $\sigma_B^2$ are the variances of the returns for Asset A and Asset B, respectively,
$\sigma_{AB} = \rho_{AB} \sigma_A \sigma_B$ is the covariance between the returns for Assets A and B,
$\rho_{AB}$ is the correlation coefficient between the returns for Asset A and Asset B.
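A minimal sketch of both portfolio formulas follows. All input numbers (weights, expected returns, standard deviations, correlation) are hypothetical values chosen for illustration and do not come from the slides; the function names are likewise illustrative.

```python
import math


def portfolio_expected_return(w_a, w_b, er_a, er_b):
    """E(Rp) = wA * E(RA) + wB * E(RB)."""
    return w_a * er_a + w_b * er_b


def portfolio_variance(w_a, w_b, sd_a, sd_b, rho_ab):
    """Var(Rp) = wA^2 sdA^2 + wB^2 sdB^2 + 2 wA wB rho sdA sdB."""
    covariance = rho_ab * sd_a * sd_b
    return w_a**2 * sd_a**2 + w_b**2 * sd_b**2 + 2 * w_a * w_b * covariance


# Hypothetical inputs: 60% in Asset A, 40% in Asset B.
w_a, w_b = 0.6, 0.4
er_a, er_b = 0.10, 0.05  # expected returns
sd_a, sd_b = 0.20, 0.10  # standard deviations of returns
rho_ab = 0.3             # correlation between the two returns

expected = portfolio_expected_return(w_a, w_b, er_a, er_b)
variance = portfolio_variance(w_a, w_b, sd_a, sd_b, rho_ab)
std_dev = math.sqrt(variance)

print(round(expected, 4))  # 0.08
print(round(variance, 5))  # 0.01888
print(round(std_dev, 4))
```

With these made-up inputs the portfolio standard deviation is lower than the weighted average of the two asset standard deviations (0.16), which is the diversification effect of a correlation below 1.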
Solution
Year           2004   2005   2006   2007   2008   2009   2010   2011   2012   2013

Consumer Price Index (CPI), 2002 = 100
City ABC      114.3  127.3  141.7  126.8
City XYZ      113.9  124.0  137.4  146.7

Consumer Price Index (CPI), 2007 = 100
City ABC       90.1  100.4  111.8  100.0  110.8  116.7  122.3  127.6  132.7  142.5
City XYZ       77.6   84.5   93.7  100.0  110.9  115.6  121.8  128.1  135.7  145.5

Nominal wage
City ABC       1121                                                            2261
City XYZ        557                                                            1424

Real wage (nominal wage / CPI x 100, with CPI 2007 = 100)
City ABC       1244                                                            1586
City XYZ        718                                                             979

Growth (%)
City ABC       21.6
City XYZ       26.7

Purchasing power
City ABC      0.702
City XYZ      0.687

Nominal wage needed for equivalence
City ABC       2261
City XYZ       3289
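The rebasing and deflating steps behind these figures can be sketched as follows. Two things are assumed rather than stated in the table: that the four 2002-based CPI values cover 2004 to 2007, and that the first wage figure for each city belongs to 2004. Both assumptions are consistent with the rebased values. The function names are illustrative.

```python
# CPI with 2002 = 100; assumed to cover 2004-2007 (not stated explicitly in the table).
YEARS = [2004, 2005, 2006, 2007]
cpi_2002_base = {
    "ABC": [114.3, 127.3, 141.7, 126.8],
    "XYZ": [113.9, 124.0, 137.4, 146.7],
}


def rebase(series, base_value):
    """Re-express an index so that the chosen base value becomes 100."""
    return [round(value / base_value * 100, 1) for value in series]


def real_value(nominal, cpi):
    """Deflate a nominal amount by a CPI whose base period equals 100."""
    return nominal / cpi * 100


def purchasing_power(cpi):
    """Value of one currency unit relative to the CPI base period."""
    return 100 / cpi


base_position = YEARS.index(2007)
cpi_2007_base = {
    city: rebase(series, series[base_position])
    for city, series in cpi_2002_base.items()
}
print(cpi_2007_base["ABC"])  # [90.1, 100.4, 111.8, 100.0]
print(cpi_2007_base["XYZ"])  # [77.6, 84.5, 93.7, 100.0]

# Assumed 2004 nominal wages deflated by the 2004 CPI (2007 = 100).
print(round(real_value(1121, 90.1)))  # 1244
print(round(real_value(557, 77.6)))   # 718

# Purchasing power at the 2013 CPI levels.
print(round(purchasing_power(142.5), 3))  # 0.702
print(round(purchasing_power(145.5), 3))  # 0.687
```

Rebasing only rescales the index, so year-to-year percentage changes are the same under either base year.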
