You are on page 1of 33

Module 7

7 Lectures

Hydrological Statistics

Prof. Subhankar Karmakar


IIT Bombay
Objective of this module is to learn the fundamentals
of stochastic hydrological phenomena.

Module 7
Topics to be covered

 Statistical parameter estimation


 Probability distribution
 Goodness of fit
 Concepts of probability weighted moments & l-moments

Module 7
Module 7

Lecture 1: Introduction to probability


distribution
Introduction

 Deterministic model
– Variables involved are deterministic in nature

– No uncertainty for a given set of inputs


E.g. Newtonian model
=
s ut + 1 ft 2 (2nd law of motion)
2
 Probabilistic/Stochastic model
– Variables involved are random in nature.

– There is always uncertainty for a given set of inputs

E.g. Rainfall-runoff model

In a deterministic model, a given input yields the same output always, while a
probabilistic model yields different outputs for multiple trials with the same input.
Module 7
Random Variables

 Random Variable (R.V.) is a variable whose value can not be predicted


with certainty before it actually takes over the value. R.V may be discrete or
continuous.

E.g. rainfall, stream flow, soil hydraulic properties


(permeability, porosity, etc.), evaporation, diffusion, temperature, ground
water level, etc.
X≤x
Notations:
– Capital letters X, Y, Z indicates R.V.
Rainfall 30 mm (value)
– Small letter x, y, z indicates value of the R.V.

If X is a R.V. & Z(X) is function of X ==> Z is a R.V.


E.g. Rainfall is R.V, so Runoff is R.V.

Module 7
Random Variables Contd..…..

 Discrete random variables


 If the set of values for a R.V. can be assumed to be finite (or
countable infinite), then RV is said to be a discrete RV.
E.g. No. of raining days in a month (0,1,2,…,31)

No. of particles emitted by a radio-active material.

 Continuous random variables


 If the set of values for a R.V. can be assumed to be infinite,
then RV is said to be a continuous R.V.
E.g. Depth of rainfall in a period of given month

 By using ‘class interval’ we can convert continuous random


variables to discrete random variables.

Module 7
Probability Distribution

 Discrete Variable P [X=xi] Probability mass function (pmf)


Important conditions:
Splices
(i) 0 ≤ P [X=xi] ≤ 1

(ii) ∑ P [X=xi]
i
=1

[Here 1 is likelihood of occurrence] X1 X2 X3 Xn

Cumulative distribution function (CDF) Cumulative distribution function (cdf)


1.0
Notation
 CDF is a step function P
P [ X = x] = p(x)

p(x1)
P[ X ≤ =
xi ] ∑ P=
[X
xi ≤ x
xi ] p(x1)
+ p(x2)
+ p(x3)
p(x1) + p(x2)
for xi ≤ x2
P ( X = x ) = 0 {∀X = x1, x2 ,......, xn }
x1 x2 x3 x xn
Module 7
Probability Distribution Contd..…..

 Continuous Variable
f(x)
 Probability density function
(PDF), f(x) does not indicate
probability directly but it indicates
probability density
-α a +α
x
 CDF is given by
1.0
a
F (a ) = P [ X ≤ a ]= ∫α f ( x )dx

F(x)

-α a x +α

Module 7
Piecewise Continuous Distribution
d α

∫α f ( x )dx + P[X=d]+ ∫ f ( x )dx =



1
d
2 1
P[X=d]
f(x)
f1(x)
spike
Then, P[X<d] ≠ P[X ≤ d];
f2(x)
Simillarly, P[X>d] ≠ P[X ≥ d];

0 d x
CDF for this case
P[X = 0]

Jump ΔF X=d
P[X=d]
f(x)

0 X=0 d x 0 d x
Module 7
CDF within an interval

Here, interval is given by (x-∆x/2, x+∆x/2)


f(xi)
Δx

 ∆x  ∆x  ∆x 
x
 i − xi x +
 2  2  i
 2 

∆x ∆x xi
P[ xi − ≤ X ≤ xi + ]
2 2
∆x ∆x
= F ( xi + ) − F ( xi − ) = Area under the curve in the int erval
2 2
 ∆x ∆x 
 i
x − , x + 
 
i
2 2
= f ( x) * ∆xi
Module 7
Example Problem

Estimate the expected relative frequencies within each class of interval 0.25.
2
x1 x2 x3
3x
f=
(x) ; 0≤x≤5 0 0.25 0.75 1.25 1.75 5.0
125
xi f(xi)=3x2i/125 f(xi) x Δx = expected rel. freq.
0.25 0.0015 0.00075
0.75 0.0135 0.00675
1.25 0.0375 0.01875
1.75 0.0735 0.03675
2.25 0.1215 --
2.75 0.066 --
3.25 -- --
3.75 0.3375 0.16875
4.25 0.4335 0.21675
4.75 0.5415 0.27075
Sum 0.9975 ≈1.0

Module 7
Bivariate Random Variables

 (X,Y) – two dimensional random vector or random variable

X = rainfall
e.g. Temperature  Evaporation X

Temperature  Radiation Coeff. Y


Rainfall  Recharge
Rainfall  Runoff (X+Y)

y = runoff

 Variables may or may not be dependent

Case 1: (X,Y) is a 2-D discrete random vector


The possible values of may be represented as (Xi, Yj), i=1,2,….n & J=1,2,….n
Case 2: (X,Y) is a 2-D continuous random variable
Here, (X,Y) can assume all possible values in some non-countable set.
Module 7
1. Discrete 2-D Random Vector
X ≤x Y ≤ y
F ( x , y ) = P[ X ≤ x , Y ≤ y ] = ∑ ∑ p( X , Y ) F (∞, ∞ ) = 1
−∞ −∞

E.g. Calculate the following using the values from the given table.
1) F(3,2) i=1 2 3 4 5 6 P[Y]

2) F(7,4) X
0 1 2 3 4 5
Y
3) F(7,8)
j=1 0 0 0.01 0.03 0.05 0.07 0.09 0.25 P[Y=0]
6 4

∑ ∑ p( X , Y ) = 1
i =1 j =1
i j
2 1 0.01 0.02 0.04 0.05 0.06 0.08 0.26

F (3,2) = P[ X ≤ 3, Y ≤ 2] = 0.35 3 2 0.01 0.03 0.05 0.05 0.05 0.06 0.25

F (7,4) = F (∞, ∞ ) = 1 4 3 0.01 0.02 0.04 0.06 0.06 0.05 0.24 P[Y=3]

F (7,8) = 1 0.03 0.08 0.16 0.21 0.24 0.28

P[x=0] P[x=5]

Module 7
2. Continuous 2-D Random Vector
If f ( X ,Y ) → joint pdf of (X,Y)
f(X,Y) ≥ 0
α α

∫α ∫α f ( x, y )dxdy = 1
− −
B
F(x,y) → joint cdf of (X,Y)
f(x,y) plane
= P[X ≤ x,Y ≤ y ]
y x
= ∫α ∫α f ( x, y )dxdy
− −

Probablity of hatched region on the plane


F(α ,α )=1

F(-α ,y) = F(x, -α ) = 0


= ∫ f ( x, y )dB
B

Module 7
Example Problem 1

Consider the flows in two adjacent streams.


Denote it as a 2-D RV (X,Y), with a joint pdf,

=
f ( X ,Y ) C if 5,000 ≤ x ≤ 10,000 and
4,000 ≤ y ≤ 9,000

Solution: To get C,
α α α α


∫α ∫α f ( x, y )dxdy

1 ⇒ ∫ ∫ C dxdy =
=
−α −α
1

1
∴ C=
(5000)2
For a watershed region B,

P(B) = ∫ ∫ f ( x, y )dxdy
B

Module 7
Example Problem 1 Contd…

Determine P[X ≥ Y]
9000 y

= 1- P[X ≤ Y] = 1 - ∫ ∫
5000 5000
f ( x, y )dxdy

9000 y 9000
=1- ∫ ∫
5000 5000
Cdxdy =1- ∫
5000
[ y − 5000]dy

9000
y2 
= 1- C  − 5000 y 
 2  5000
9000
 (9000) 2
(5000)  2
= 1- C  − 45000000 − + 25000000 
 2 2  5000

= 17
25

Module 7
Example Problem 1 Contd…

Calculate F(10000,9000):
x y x y
1
F(x,y) = ∫
−α

−α
f ( x, y )dydx =∫
−α

− α (5000)
2
dydx
x
 y 4000 
= ∫
−α

 (5000)
2
-
(5000) 2  dx

yx 4000 y (5000) 5000 × 4000
= - x − +
(5000)2 (5000)2 (5000)2 (5000)2
xy 4x y 4
= - − + (Ans.)
(5000)2 5 × (5000) (5000) 5

F(10000,9000)

10000 × 9000 4(10000) 9000 4


= - − +
(5000) 2
5 × (5000) (5000) 5
= 3.6 - 1.6 -1.8 +0.8
= 1.0
Module 7
Example Problem 2

=
f(x,y) x 2 +xy 0 ≤ x ≤ 1& 0 ≤ y ≤ 2
= 0 elsewhere
α α
Find, P[x+y ≥ 1], verify ∫α ∫α f(x,y)dxdy
− −
=1

Solution :
1
2 1
xy )dxdy=
2
x x y
3 2

∫∫ + ∫0  3 6  dy
+
2
(x
3
0 0 0

2
2
1 y  1 y2  2 4 
= ∫  +  dy=  +  =  + 
0
3 6  3 12  0  3 12 

= 1 [verified]

Module 7
Example Problem 2 Contd…

P [ X + Y ≥ 1]
1
1
∫ − + x ) dx
2 3
=1 − P [ X + Y ≤ 1] =1-
60
(4 x 5 x
1− x
 2 xy 
1

∫ ∫ x + 3
1
= 1-  dydx 1  4x3 5x4 x2 
0 0   =1-  − + 
6 3 4 2 0
1− x
 2
1
y 2x 
=1- ∫  x y +  dx 1 4 5 1
=1 − − +
0  0 6 2
6
3 4 
 2
1
x (1 − 2 x + x 2 )  1 7 
= 1- ∫  x (1 − x ) +  dx = 1 −
0 
6  6 
 12 
65
 6x2 − 6x3 + x − 2x2 + x3 ) 
1
=
= 1- ∫   dx 72
0  
6

Module 7
Marginal Distribution Function

Discrete variables

α
Marginal Distribution=
of X is p[ xi ] ∑ p( x , y
j =1
i j ), ∀i
α
Marginal Distribution=
of y is q [ y j ] ∑ p( x , y
i =1
i j ), ∀j

Continuous variables
α
Marginal distribution function of x is given by g(x) = ∫α f ( x, y )dy

α
and of y is given by h(y) = ∫α f ( x, y )dx

[g(x) & h(y) are also pdf and should satisfy the conditions]

Module 7
Marginal Distribution Function Contd…

P [c ≤ X ≤ d ]= P[c ≤ X ≤ d , −α ≤ y ≤ α ]

α
d

= ∫  ∫ f ( x, y )dy dx
c  −α 
d
= ∫ g ( x )dx
c

[f ( x=
, y ) g ( x ) × h( y ) ......stochastically independent
g ( x ) is original distribution of the 1-D random variable X
x
Remember C.D.F = F(x) = ∫α g ( x )dx

g ( x ) is same as f(x), i.e. orginal density function of x.

Module 7
Example Problem

y
Function Px ,y ( x, y )= C (5 − ( ) − x ) for 0 ≤ x ≤ 2 and 0 ≤ y ≤ 2 can serve as a
2
bivariate contnuous probablity density function.

i) Find the value of C


2 2
S
∴ Probability (x ≤ 2 & y ≤ 2) = ∫0 ∫0 C (5 − (
2
) − t ) dsdt=1
2
2
 s2  2

∫0 C 5s − 4 − ts  dt = 1 ⇒ ∫0 C [10 − 1 − 2t ] dt = 1
0

∫ C [ 9 − 2t ] dt =
1 ⇒ C [9 t − t ]0 =
2 2
1
0

C [18 − 4] =1 ⇒ C =1
14
Module 7
ii) Find probability of x ≤ 1 & y ≤ 1
Px ,y ( x, y=
) prob( X ≤ x,Y ≤ y )
x y
S
= ∫0 ∫0 (5 − (
2
) − t ) /14 dsdt

1  xy 2 x 2 y 
x
1 y2 1  1 1
= ∫ [5 y − − ty ]dt = 5 xy − − = 5 − −  = 0.304
0
14 4 14  4 2 
 14  4 2

iii) Find "marginal desities" Px (x) & Py (Y)


2 2 2
Py (y) = ∫ PX ,Y (t , y )dt
Px (x) ∫P
0
X ,Y ( x, s )ds = ∫ (5 − s − x ) /14 ds
0
2 0
2
1 
2 1 y − t ) dt
=
s2
5s − − xs 
= ∫0 14 (5 −
2
14  4 0 2 2
1 1  yt t 
= [10 − 1 − 2 x ] = 5 t − − 
14 14  2 2 0
1 1 1
= (9 − 2 x ) = [10 − y − 2]= (8 − y )
14 14 14
Module 7
iv) Find cumulative marginal distributions Px ,y (x,α ) & Px x ,y (α ,Y)

9 − 2x
x x
∴ Px ,y (x,α ) =prob( X ≤ x )= ∫ PX , ( t )dt = ∫ ( ) dx
0 0
14
x
9x − x2   9x − x2 
=   =  
 14  0  14 
y
8−y
y
 8y − y 2 / 2 
(α , y ) prob(Y ≤ y )= ∫ Py ( s )ds = ∫ (
Px ,y= ) ds =  
0 0
14  14 
9x − x 2
Check putting x=2 , =1
14
 16 y − y 2 
& y=1,   =1
 28 

v) Find "conditional densities"


(5 − y − x)
Px / y (x/y) = Px x ,y (x,y)/Py (y) = 2
(8 − y )
(5 − y − x )
and Px / y (y/x) = Px x ,y (x,y)/Px (x) = 2
(9 − 2 x )
Module 7
Conditional Density Functions

Let (X,Y) is a 2-D random vector, with joint density function of f(x,y). Let g(x)
and h(y) be the marginal density functions of X and Y respectively.

The CDF of X, given Y=y is defined as, f ( x, y )


g(x | y ) = but h(y)>0
h( y )

f ( x, y )
and the CDF of Y, given X=x is defined as, h( y | x ) = but g(x)>0
g(x )

g ( x | y ) ≥ 0 as f(x,y) and h(y) are (+ve)


α α α
f ( x, y ) 1 1
∫−α g ( x | y )dx = 1= ∫−α h( y ) dx = ∫
h( y ) − α
f ( x , y )dx =
h( y )
h( y ) = 1

 α 
 ∫
∴ f ( x , y )dx = h ( y ) 
 −α 

Module 7
Conditional Density Functions Contd….

α
CDF = G(x|y) = P[ X ≤ x|Y = y ] = ∫α g ( x | y )dx

where y belongs to certain region, i.e. y ∈ R (R is a region)


∴ g(x | y ∈ R) =
f ( x, y )dy
R

∫ h( y )dy
R

Now cumulative conditional distribution function

= P [ X ≤ x | y ∈ R] = F( x | y ∈ R)
x
= F(X | y ∈=
R) ∫α g ( x | y ∈ R )dx

Module 7
Independence of random variables

g ( x | y ) = g ( x ), when X & Y are statistically independent variables


f ( x, y )
=
g(x|y) = g(x )
h( y )
If f(x,y) = g(x) x h(y), then x, y are independent

For e.g.,
f ( x, y ) = e − ( x + y ) ; x>0; y>0
α −( x + y ) α
e 
∴ g(x) = ∫ e − ( x + y )dy =  
0  − 1 0

= e − x , x>0 & h(y) = e − y , y>0

∴ f ( x, y ) =g ( x ) × h(y) ∴ x &y are independent

Module 7
Functions of random variables

 Simultaneous occurrence  Joint density function


 Distribution of one variable irrespective of the value of the other
variables  Marginal density function
 Distribution of one variable conditioned on the other variable
 Conditioned distribution
Example 1:
c
x : discrete variable, p(x) = f(x) =, x=2,3,4,5, y= x 2 − 7 x + 12
x
y takes two values 0 & 2, p(y=0) = p(x=3) + p(x=4)= c + c = 7c
3 4 12

p(y=2) = p(x=2) + p(x=5) = c + c =7c


2 5 10
x 2 3 4 5
y 2 0 0 2
p(x) c/2 c/3 c/4 c/5
Module 7
Functions of Random variables Contd…..

Example 2 :
2
for functions of RVs:
= 1- ∫ e − x dx
0
For continuous curve
−x e −2 e −0
f(x) = e ; x>0 = 1- [ − ]
−1 −1
y = 2x+1
y −1 = = e −2 (Ans.)
P[y ≥ 5] = P [ X ≥ ]
2
= P[X ≥ 2]

= 1- P[X ≤ 2] Alternatively
α α
2 e  −x

∫2
−x −2
= 1- ∫ e dx−x =e dx =  e (Ans.)
0
 −1  2
Module 7
i) g(x|y) = f(x,y) / h(y)
1 14 (10 − y − 2 x )
= (5 − y − x ) × = , 0 ≤ x ≤ 2, 0 ≤ y ≤ 2
14 2 (8 − y ) 2(8 − y )
Conditional CDF:

(10 x − xy − x 2 )
x
g ( x | y ) = P [ X ≤ x | Y = y ]= ∫ g ( x | y )dx =
−α
2(8 − y )

Now , P [ X ≤ 1,Y =
3 ]
2

(10(1) − (1)( 3 ) − 12 ) 10 − 3 − 1 15
= a[1| 3 ] = 2 = 2 =
2 2(8 − )3 2(13 ) 26
2 2
ii) Now, g(x|y ≤ 1] 0 ≤ x ≤ 2; 0 ≤ y ≤ 2
1 1
1 (5 − y − x )dy
∫ f ( x, y )dy
0
∫ 14
0
2
= 1
= 1

∫ h( y )dy
o
∫ (8 − y ) /14dy
o

Module 7
1 1
(5 − y  2

∫ 2
− x )dy
 5 y − y
4
− xy  5 − 1 − x 
= 0
= 0
=  4 
1
 2

1
8 − 1 
∫ (8 − y )dy 8 y − y  2
 2  0
o

(19 − 4 x )
=
30

x
 19 − 4 x   19 x x2 
iii ) F [ x | y ≤ 1]
= ∫0  30 dx =  30 − 15 
  
Now ,
2
18 x 2 x
P  X ≤ 1 | Y=
≤ 1 F  1 | Y < 1 = −
 2   2  15 15 for x= 1
2

18 x 2 17
= − =
2 × 15 15(4) 30

Module 7
Properties of RV

 Population: A complete assemblage of all possible RV

 Sample: A subset of population

 Realization: A time series of the RVs actually realized. They are continuous.
All samples are not realizations.

 Observation: A particular value of the RV in the realization


Note: ARMA is a synthetically generated realization

Module 7

You might also like