
MATH2901 - Higher Theory of Statistics

Libo, Li

May 28, 2020

Week 1 - Lecture 1

Consultation Hours: TBA


Location: Blackboard Collaborate Rooms
Email: libo.li@unsw.edu.au

Requests from me
Please communicate using the university email.
Try your best to come to consultation hours.
Mute your microphone.

Assessments
Online Quiz - Week 4, Weighting 5%
Midterm test - Week 7, Weighting 20%
Assignment - Week 9, Weighting 15%

Statistics

Watch Youtube videos


How statistics can be misleading - Mark Liddell
https://www.youtube.com/watch?v=sxYrzzy3cq8
Chocolate, correlation and cat’s whiskers
https://www.youtube.com/watch?v=ZeCr3Jgh8r0

Interesting websites
Misleading Statistics - https://www.datapine.com/blog/misleading-statistics-and-data/
Correlation and Causation - http://www.tylervigen.com/spurious-correlations

Probability and Statistics

Probability: The formal study of probability in the West started
in the 17th century with Blaise Pascal, Pierre de Fermat and the
Dutchman Christiaan Huygens. Pascal’s triangle was known
long before in Iranian and Chinese culture.

Statistics: The birth of statistics is often dated to 1662, when
John Graunt, along with William Petty, developed early human
statistical and census methods that provided a framework for
modern demography. He produced the first life table, giving
probabilities of survival to each age, and gave the first
statistically based estimation of the population of London.

History

17th Century: Through the collaboration of Blaise Pascal,
Pierre de Fermat and the Dutchman Christiaan Huygens,
probability theory was given a mathematical treatment.

[Portraits: Blaise Pascal, Pierre de Fermat]

History

18th Century: Jacob Bernoulli and Abraham de Moivre put
probability on a sound mathematical footing. Bernoulli proved a
version of the law of large numbers, which states
that in a large number of trials, the average of the outcomes is
likely to be very close to the expected value.

19th Century: Probabilistic methods were used to correct
error-prone observations in astronomy. Carl Friedrich Gauss
determined the orbit of Ceres from a few observations. A
normal distribution of errors was used to determine the most
likely true value.

[Portraits: Gauss, Laplace]

Laplace’s Demon

"We may regard the present state of the universe as the effect
of its past and the cause of its future. An intellect which at a
certain moment would know all forces that set nature in motion,
and all positions of all items of which nature is composed, if this
intellect were also vast enough to submit these data to
analysis, it would embrace in a single formula the movements
of the greatest bodies of the universe and those of the tiniest
atom; for such an intellect nothing would be uncertain and the
future just like the past would be present before its eyes."

Pierre Simon Laplace, A Philosophical Essay on Probabilities

20th century

Statistics: Fisher and Neyman developed hypothesis testing, which is
now widely applied in biological and psychological experiments,
in clinical trials of drugs, and in economics.

Probability: The theory of stochastic processes broadened into
such areas as Markov processes and Brownian motion, which are used
to model the random movement of tiny particles suspended in a
fluid and fluctuations in stock markets.

In conclusion.
Probability: Deductive? From the model, deduce the
probability.
Statistics: Inductive? From the data, induce the behaviour of
the black-box model.
Example: From Pascal’s triangle to the bell curve.

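n <- 1000
k <- 0:n
# Row n of Pascal's triangle divided by 2^n: the Binomial(n, 1/2) probabilities
plot(k, choose(n, k) / 2^n)
# The scaled row sums to 1
sum(choose(n, k) / 2^n)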
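
To make the link to the bell curve explicit, one possible sketch overlays a normal density on the scaled row of Pascal's triangle. The comparison curve (mean n/2 and standard deviation sqrt(n)/2, the values for a Binomial(n, 1/2) distribution) and the names `binom_probs`, `normal_dens` and `max_gap` are assumptions made for this illustration and do not appear in the slides.

```r
n <- 1000
k <- 0:n

# Row n of Pascal's triangle divided by 2^n: Binomial(n, 1/2) probabilities
binom_probs <- choose(n, k) / 2^n

# Normal density with matching mean and standard deviation (assumed comparison)
normal_dens <- dnorm(k, mean = n / 2, sd = sqrt(n) / 2)

plot(k, binom_probs, xlab = "k", ylab = "probability")
lines(k, normal_dens, col = "red")

# Largest pointwise gap between the two curves
max_gap <- max(abs(binom_probs - normal_dens))
max_gap
```

The gap is small relative to the peak height of the curve, which is what the plot suggests visually.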

Examples

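library(cluster)  # provides clusplot()
head(iris)
# k-means with 3 clusters on the four numeric measurements
fit <- kmeans(iris[1:4], 3)
# Plot the clusters using the same numeric columns that were clustered
clusplot(iris[1:4], fit$cluster, color = TRUE, shade = TRUE, labels = 4, lines = 0)
points(1.4, 0.3)

library(nnet)  # provides multinom()
# Multinomial logistic regression of species on two measurements
model <- multinom(Species ~ Sepal.Length + Petal.Width, data = iris)

# A grid of new observations, and a single new observation
expanded <- expand.grid(Sepal.Length = c(1.3, 3, 7.5),
                        Petal.Width = c(0.3, 1, 1.6))
new_obs <- data.frame(Sepal.Length = 1.4, Petal.Width = 0.3)

expanded
# Predicted class probabilities for the grid and for the single observation
predicted_grid <- predict(model, expanded, type = "probs")
predicted_new <- predict(model, new_obs, type = "probs")

predicted_grid
predicted_new
points(expanded, col = 'red', cex = 3)
points(new_obs, col = 'red', cex = 3)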

Experiments, Sample space and Events
Definition
An experiment is any process leading to recorded observations.

Example
Some examples:
Tossing a coin.
Measuring the lifetime of a machine.
Counting the number of calls arriving at a telephone
exchange.

Probability Space

Definition
An outcome is a possible result of an experiment and the set of
all possible outcomes is called the sample space which is
denoted by Ω.

Example
The following are some examples of sample spaces
Cast two dice consecutively. The sample space is
Ω = {(1, 1), (1, 2), . . . , (1, 6), (2, 1), . . . , (6, 6)}.
The number of arriving calls. The sample space is
Ω = {0, 1, 2, . . .} = N0
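
As a small illustration, the two-dice sample space can be enumerated in R. The data frame name `omega` and its column names are hypothetical choices for this sketch.

```r
# All ordered pairs (first throw, second throw) for two six-sided dice
omega <- expand.grid(first = 1:6, second = 1:6)

nrow(omega)   # number of outcomes in the sample space
head(omega)
```

Each row is one outcome, so the number of rows is the size of Ω.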

Probability Space

Definition
An event is a set of outcomes, i.e. a subset of Ω.

Example
The event that the sum of two dice throws is ten or more is

A = {(4, 6), (5, 5), (5, 6), (6, 4), (6, 5), (6, 6)}

Definition
Events are mutually exclusive (disjoint) if they have no
outcomes in common.
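
Events can be represented in R as logical indicators over the outcomes. The sketch below lists the outcomes where the sum of two dice is ten or more and checks that this event is mutually exclusive with a second event; the second event `B` (sum of three or less) and all variable names are assumptions for illustration.

```r
omega <- expand.grid(first = 1:6, second = 1:6)
total <- omega$first + omega$second

# Events as logical indicators over the rows (outcomes) of omega
A <- total >= 10   # sum is ten or more
B <- total <= 3    # sum is three or less (hypothetical second event)

omega[A, ]         # the outcomes belonging to A

# Mutually exclusive: no outcome lies in both events
disjoint <- !any(A & B)
disjoint
```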

Revision in Set Operations

Lemma
(The associative law) If A, B, C are sets then

(A ∪ B) ∪ C = A ∪ (B ∪ C)
(A ∩ B) ∩ C = A ∩ (B ∩ C)

(Distributive Law) If A, B, C are sets then

A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
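
These identities can be spot-checked on small sets with base R's `union`, `intersect` and `setequal`. The three sets below are arbitrary examples, and a check on one example is of course not a proof.

```r
A <- c(1, 2, 3, 4)
B <- c(3, 4, 5)
C <- c(4, 5, 6, 7)

# Associative laws
assoc_union <- setequal(union(union(A, B), C), union(A, union(B, C)))
assoc_inter <- setequal(intersect(intersect(A, B), C),
                        intersect(A, intersect(B, C)))

# Distributive laws
dist_1 <- setequal(intersect(A, union(B, C)),
                   union(intersect(A, B), intersect(A, C)))
dist_2 <- setequal(union(A, intersect(B, C)),
                   intersect(union(A, B), union(A, C)))

c(assoc_union, assoc_inter, dist_1, dist_2)
```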

Remark
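If you have trouble remembering the above rules, replace ∩ by
multiplication and ∪ by addition: the first distributive law then
reads a(b + c) = ab + ac. This is only a mnemonic; the second
distributive law has no counterpart in ordinary arithmetic.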

σ-algebra

In order to define a probability function on the sample
space, we require the concept of a σ-algebra, which is
beyond the scope of this course.
One can think of the σ-algebra as the family of all possible
events associated with Ω. In the case where the sample
space Ω is finite or countably infinite, the σ-algebra can be
taken to be the power set of Ω.

Probability

Definition
A probability is a set function, usually denoted by P, that
maps events from the σ-algebra to [0, 1] and satisfies certain
properties.

Example
Consider the coin toss experiment. The sample space is given
by Ω = {T, H} and the σ-algebra is A = {∅, Ω, {T}, {H}}.
We can define a probability P on the σ-algebra A by setting

P(∅) = 0, P(Ω) = 1, P({H}) = p, P({T}) = 1 − p.

Given the probability space (Ω, A, P), the probability
function P must satisfy:
1 For every set A ∈ A, P(A) ≥ 0
2 P(Ω) = 1
3 (Countably additive) If the sets (A_i)_{i∈N} are
mutually exclusive, then

$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$$
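
For a finite sample space these requirements can be checked directly. The sketch below does so for the coin toss, where additivity only needs to be checked for the disjoint sets {T} and {H}; the value p = 0.3 and the helper `prob` are assumptions chosen for illustration.

```r
p <- 0.3  # assumed probability of heads, for illustration only

# P defined on the singletons of the sample space {T, H}
singleton_prob <- c("T" = 1 - p, "H" = p)

# P(E) for an event E given as a character vector of outcomes
prob <- function(event) sum(singleton_prob[event])

prob(character(0))    # the empty set
prob("H")
prob(c("T", "H"))     # the whole sample space

# Additivity for the disjoint events {T} and {H}
additive <- isTRUE(all.equal(prob(c("T", "H")), prob("T") + prob("H")))
additive
```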

Week 1 - Lecture 2

We have introduced the probability space (Ω, A, P), where
Ω is the sample space.
A is the σ-algebra.
P is a probability function.

The axioms that the probability function P satisfies are
1 For every set A ∈ A, P(A) ≥ 0
2 P(Ω) = 1
3 Given a mutually exclusive family of sets (A_i)_{i∈N},

$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$$

Lemma
1 Given a family of disjoint sets (A_i)_{i=1,...,k},

$$P\left(\bigcup_{i=1}^{k} A_i\right) = \sum_{i=1}^{k} P(A_i)$$

2 P(∅) = 0
3 For any A ∈ A, P(A) ≤ 1 and P(A^c) = 1 − P(A)
4 Suppose B, A ∈ A and A ⊆ B, then P(A) ≤ P(B).
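
The slides do not record a proof. One possible derivation of items 3 and 4 from finite additivity (item 1) and the axioms is sketched below; it is an editorial sketch and may differ from the argument given in the lecture.

```latex
% Item 3: A and A^c are disjoint and A \cup A^c = \Omega
1 = P(\Omega) = P(A \cup A^c) = P(A) + P(A^c)
\quad\Longrightarrow\quad
P(A^c) = 1 - P(A), \qquad P(A) = 1 - P(A^c) \le 1 .

% Item 4: if A \subseteq B, then B = A \cup (B \cap A^c) is a disjoint union
P(B) = P(A) + P(B \cap A^c) \ge P(A) .
```

Both steps use only that probabilities are non-negative and add over disjoint sets.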

Example
(Tossing two fair dice consecutively) The sample space is

Ω = {(1, 1), (1, 2), . . . , (1, 6), (2, 1), . . . , (6, 6)}.

Let A be the power set of Ω. It is sufficient to define the
probability function P on the singletons, since every event in A
can be written as a disjoint union of singletons.
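
The remark that P only needs to be defined on singletons can be sketched in R as follows. The helper `prob` and the chosen event (sum of ten or more) are illustrative assumptions.

```r
omega <- expand.grid(first = 1:6, second = 1:6)

# Fair dice: every singleton {(i, j)} gets probability 1/36
singleton_prob <- rep(1 / nrow(omega), nrow(omega))

# P(E) is the sum of the singleton probabilities over the outcomes in E,
# where E is a logical indicator over the rows of omega
prob <- function(event) sum(singleton_prob[event])

sum_at_least_10 <- omega$first + omega$second >= 10
p_event <- prob(sum_at_least_10)
p_event
```

Any other event is handled the same way, by changing the logical indicator passed to `prob`.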

Theorem
(Continuity from below) Given an increasing sequence of
events A_1 ⊂ A_2 ⊂ . . . , we have

$$P\left(\bigcup_{n=1}^{\infty} A_n\right) = \lim_{n\to\infty} P(A_n)$$

(Continuity from above) Given a decreasing sequence of
events A_1 ⊃ A_2 ⊃ . . . , we have

$$P\left(\bigcap_{n=1}^{\infty} A_n\right) = \lim_{n\to\infty} P(A_n)$$

Proof.
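We prove continuity from above. By De Morgan’s law,
⋂_{n=1}^{∞} A_n = (⋃_{n=1}^{∞} A_n^c)^c. Since (A_n) is decreasing,
the complements (A_n^c) are increasing, so continuity from below
applies to them:

$$
\begin{aligned}
P\left(\bigcap_{n=1}^{\infty} A_n\right)
  &= P\left(\Big(\bigcup_{n=1}^{\infty} A_n^c\Big)^c\right) \\
  &= 1 - P\left(\bigcup_{n=1}^{\infty} A_n^c\right) \\
  &= 1 - \lim_{n\to\infty} P(A_n^c) \\
  &= \lim_{n\to\infty} \left(1 - P(A_n^c)\right) = \lim_{n\to\infty} P(A_n)
\end{aligned}
$$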

Conditional Probability and Independence

Definition
The conditional probability that an event A occurs given that an
event B has occurred is

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B) > 0$$

Definition
Events A and B are independent if P(A ∩ B) = P(A)P(B).
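
Both definitions can be tried out on the two-dice sample space. In the sketch below the events `A`, `B` and `D`, and the assumption of fair dice (so that a probability is the proportion of outcomes in the event), are choices made for illustration.

```r
omega <- expand.grid(first = 1:6, second = 1:6)

# Equally likely outcomes (fair dice assumed): P(E) = proportion of outcomes in E
prob <- function(event) mean(event)

A <- omega$first == 6                     # first throw is a six
B <- omega$first + omega$second >= 10     # sum is ten or more
D <- omega$second == 6                    # second throw is a six

# Conditional probability P(A | B) = P(A and B) / P(B)
p_A_given_B <- prob(A & B) / prob(B)
p_A_given_B

# Independence: compare P(A and D) with P(A) P(D), and likewise for A and B
indep_A_D <- isTRUE(all.equal(prob(A & D), prob(A) * prob(D)))
indep_A_B <- isTRUE(all.equal(prob(A & B), prob(A) * prob(B)))
c(indep_A_D, indep_A_B)
```

Knowing that the sum is at least ten changes the probability of a six on the first throw, so `A` and `B` are not independent, whereas the two throws themselves are.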

Conditional Probability and Independence

Lemma
Given two events A and B with P(A) > 0 and P(B) > 0,
P(A|B) = P(A) if and only if P(B|A) = P(B).

Proof:
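
The proof is not recorded in the slides. One possible argument, assuming P(A) > 0 and P(B) > 0 so that both conditional probabilities are defined, runs through the product form of independence:

```latex
% Assumes P(A) > 0 and P(B) > 0
P(A \mid B) = P(A)
\iff \frac{P(A \cap B)}{P(B)} = P(A)
\iff P(A \cap B) = P(A)\,P(B)
\iff \frac{P(A \cap B)}{P(A)} = P(B)
\iff P(B \mid A) = P(B) .
```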

Conditional Probability and Independence

Definition
1 A countable sequence of events (A_i)_{i∈N} is pairwise
independent if P(A_i ∩ A_j) = P(A_i)P(A_j) for all i ≠ j.
2 A countable sequence of events (A_i)_{i∈N} is independent if
for any finite sub-collection A_{i_1}, . . . , A_{i_n} we have

$$P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_n}) = \prod_{j=1}^{n} P(A_{i_j})$$

Remark
Independence implies pairwise independence, but pairwise
independence does not imply independence.
Example
A ball is drawn at random from 4 balls labelled 1, 2, 3, 4. The
sample space is Ω = {1, 2, 3, 4} and we take P({i}) = 1/4.
Consider the events

A = {1} ∪ {2}, B = {1} ∪ {3}, C = {1} ∪ {4}.

We see that P(A ∩ B) = P({1}) = 1/4 and P(A) = P(B) = 1/2,
which implies P(A)P(B) = P(A ∩ B). That is, A and B are
independent. By similar arguments, we can show that the sets
A, B, C are pairwise independent; however,

P(A ∩ B ∩ C) ≠ P(A)P(B)P(C)

since P(A ∩ B ∩ C) = P({1}) = 1/4 and P(A)P(B)P(C) = 1/2^3 = 1/8.
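
The four-ball example is small enough to verify by direct computation. The helper `prob` below assumes the four balls are equally likely, as in the example; the variable names are illustrative.

```r
# Four equally likely balls: P(E) = |E| / 4
prob <- function(event) length(event) / 4

A <- c(1, 2)
B <- c(1, 3)
C <- c(1, 4)

# Pairwise independence: P(X and Y) equals P(X) P(Y) for each pair
pairwise <- c(
  AB = prob(intersect(A, B)) == prob(A) * prob(B),
  AC = prob(intersect(A, C)) == prob(A) * prob(C),
  BC = prob(intersect(B, C)) == prob(B) * prob(C)
)
pairwise

# Triple intersection versus the product of the three probabilities
triple_lhs <- prob(Reduce(intersect, list(A, B, C)))
triple_rhs <- prob(A) * prob(B) * prob(C)
c(triple_lhs, triple_rhs)
```

All three pairwise checks hold, while the triple intersection has probability 1/4 rather than 1/8.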
Week 1 - Lecture 3

Lemma
1 The multiplicative law: given events A and B with P(B) > 0,

P(A ∩ B) = P(A|B)P(B),

and similarly, given events A1, A2, A3,

P(A1 ∩ A2 ∩ A3) = P(A3|A2 ∩ A1)P(A2|A1)P(A1)

2 The additive law: Let A and B be events then

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
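
Both laws can be checked on a small enumeration. The sketch below uses a hypothetical urn with three red and two blue balls, from which two balls are drawn without replacement; the urn and all names are assumptions, not part of the slides. The product of conditional probabilities along the "red, then red" branch is compared with a direct count.

```r
balls <- c("red", "red", "red", "blue", "blue")   # hypothetical urn

# All ordered draws of two distinct balls (by index), equally likely
draws <- expand.grid(first = 1:5, second = 1:5)
draws <- draws[draws$first != draws$second, ]

R1 <- balls[draws$first] == "red"    # first ball is red
R2 <- balls[draws$second] == "red"   # second ball is red

# Multiplicative law: P(R1 and R2) = P(R2 | R1) P(R1)
tree_product <- (2 / 4) * (3 / 5)
p_both_red <- mean(R1 & R2)
c(tree_product, p_both_red)

# Additive law: P(R1 or R2) = P(R1) + P(R2) - P(R1 and R2)
p_either <- mean(R1 | R2)
additive_rhs <- mean(R1) + mean(R2) - mean(R1 & R2)
c(p_either, additive_rhs)
```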

Remark
The RHS of the multiplicative law is exactly multiplication down
the tree diagram.

Proof.

Law of Total Probability

Lemma
Suppose (A_i)_{i=1,...,k} are mutually exclusive and exhaustive of Ω,
that is ⋃_{i=1}^{k} A_i = Ω. Then for any event B, we have

$$P(B) = \sum_{i=1}^{k} P(B \mid A_i)\,P(A_i)$$
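
As a numerical illustration, consider a hypothetical urn with three red and two blue balls and two draws without replacement (an assumed example, not from the slides). Partitioning on the colour of the first ball gives the probability that the second ball is red, which can be compared with a direct count.

```r
balls <- c("red", "red", "red", "blue", "blue")   # hypothetical urn

# All ordered draws of two distinct balls (by index), equally likely
draws <- expand.grid(first = 1:5, second = 1:5)
draws <- draws[draws$first != draws$second, ]

second_red <- balls[draws$second] == "red"

# Partition: A1 = first ball red, A2 = first ball blue
# P(second red) = P(second red | A1) P(A1) + P(second red | A2) P(A2)
total_prob <- (2 / 4) * (3 / 5) + (3 / 4) * (2 / 5)

# Direct count over the enumerated sample space
direct <- mean(second_red)
c(total_prob, direct)
```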

Proof.
It is easy to see that B = B ∩ Ω, and by using the fact that
(A_i)_{i=1,...,k} is exhaustive of Ω, we can write

$$B = B \cap \Omega = B \cap \bigcup_{i=1}^{k} A_i = \bigcup_{i=1}^{k} (B \cap A_i)$$

Then by noticing that (B ∩ A_i)_{i=1,...,k} are again disjoint sets and
using the definition of conditional probability, we have

$$P(B) = \sum_{i=1}^{k} P(B \cap A_i) = \sum_{i=1}^{k} P(B \mid A_i)\,P(A_i)$$

Lemma
(Bayes Formula) Given events A and B and a family of disjoint and
exhaustive sets (A_i)_{i=1,...,k}, we have

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{\sum_{i=1}^{k} P(B \mid A_i)\,P(A_i)}$$

Proof.
From the definition of conditional probability,

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B \mid A)\,P(A)}{P(B)}$$

Then by applying the law of total probability to P(B) in the
denominator, we have

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{\sum_{i=1}^{k} P(B \mid A_i)\,P(A_i)}$$

and this gives us the formula.

Example
(Applications of Bayes Formula) A diagnostic test for a certain
disease claims to be 90% accurate in the following sense.
If the patient has the disease, then the test will show
positive with probability 0.9.
If the patient does not have the disease, then the test will show
negative with probability 0.9.
Also we know that 1% of the population has the disease.
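
The slides do not state the question or record the solution. A natural question, assumed here, is the probability that a patient who tests positive actually has the disease. A minimal sketch of that calculation with Bayes formula follows; the variable names are illustrative.

```r
sensitivity <- 0.9    # P(positive | disease)
specificity <- 0.9    # P(negative | no disease)
prevalence  <- 0.01   # P(disease)

# Law of total probability: P(positive)
p_positive <- sensitivity * prevalence +
  (1 - specificity) * (1 - prevalence)

# Bayes formula: P(disease | positive)
p_disease_given_positive <- sensitivity * prevalence / p_positive
p_disease_given_positive
```

Under these numbers the answer is 0.009 / 0.108 = 1/12, so a positive result still leaves the disease fairly unlikely, because the disease is rare and false positives from the healthy majority dominate.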

Descriptive Statistics + R

Steps to data analysis

What is the research question? How can statistics provide
insight into the question?
What are the properties of the variable of interest? Different
variable types require different analysis.

Categorical - Data can be sorted into a finite set of (unordered)
categories, e.g. Gender.
Quantitative - Responses are measured on some sort of
scale, e.g. Weight.

Numerical summaries of the quantitative data

Given observations x = (x_1, . . . , x_n):

The sample mean (estimated mean) or average is given by

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

The sample variance (estimated variance) is given by

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$
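
Both summaries can be computed from the formulas and compared with R's built-in `mean` and `var`, which uses the same n − 1 divisor. The data vector below is made up for illustration.

```r
x <- c(2.1, 3.4, 1.9, 5.0, 4.2)   # made-up observations
n <- length(x)

# Sample mean and sample variance computed from the definitions
x_bar <- sum(x) / n
s2 <- sum((x - x_bar)^2) / (n - 1)

# Compare with the built-in functions
c(x_bar, mean(x))
c(s2, var(x))
```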

R-studio

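r <- rexp(1000)    # exponential sample: right skewed
n <- rnorm(1000)   # standard normal sample: symmetric
hist(-r, freq = FALSE)   # mirrored sample: left skewed
hist(r, freq = FALSE)

par(mfrow = c(1, 3))     # three plots side by side
plot(density(n), main = 'Symmetric Distribution')
plot(density(-r + 10), main = 'left skewed distribution')
plot(density(r + 10), main = 'right skewed distribution')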
