Hydrological Statistics: 7 Lectures

Module 7
7 Lectures
Hydrological Statistics
Prof. Subhankar Karmakar

IIT Bombay
Objective of this module is to learn the fundamentals
of stochastic hydrological phenomena.
Module 7
Topics to be covered
 Statistical parameter estimation

 Probability distribution
 Goodness of fit
 Concepts of probability weighted moments & l-moments
Module 7
Module 7
Lecture 1: Introduction to probability

distribution
Introduction
 Deterministic model
– Variables involved are deterministic in nature
– No uncertainty for a given set of inputs

E.g. Newtonian model
=
s ut + 1 ft 2 (2nd law of motion)
2
 Probabilistic/Stochastic model
– Variables involved are random in nature.
– There is always uncertainty for a given set of inputs
E.g. Rainfall-runoff model
In a deterministic model, a given input yields the same output always, while a
probabilistic model yields different outputs for multiple trials with the same input.
Module 7
Random Variables
 Random Variable (R.V.) is a variable whose value can not be predicted

with certainty before it actually takes over the value. R.V may be discrete or
continuous.
E.g. rainfall, stream flow, soil hydraulic properties

(permeability, porosity, etc.), evaporation, diffusion, temperature, ground
water level, etc.
X≤x
Notations:
– Capital letters X, Y, Z indicates R.V.
Rainfall 30 mm (value)
– Small letter x, y, z indicates value of the R.V.
If X is a R.V. & Z(X) is function of X ==> Z is a R.V.

E.g. Rainfall is R.V, so Runoff is R.V.
Module 7
Random Variables Contd..…..
 Discrete random variables

 If the set of values for a R.V. can be assumed to be finite (or
countable infinite), then RV is said to be a discrete RV.
E.g. No. of raining days in a month (0,1,2,…,31)
No. of particles emitted by a radio-active material.
 Continuous random variables

 If the set of values for a R.V. can be assumed to be infinite,
then RV is said to be a continuous R.V.
E.g. Depth of rainfall in a period of given month
 By using ‘class interval’ we can convert continuous random

variables to discrete random variables.
Module 7
Probability Distribution
 Discrete Variable P [X=xi] Probability mass function (pmf)

Important conditions:
Splices
(i) 0 ≤ P [X=xi] ≤ 1
(ii) ∑ P [X=xi]
i
=1
[Here 1 is likelihood of occurrence] X1 X2 X3 Xn
Cumulative distribution function (CDF) Cumulative distribution function (cdf)

1.0
Notation
 CDF is a step function P
P [ X = x] = p(x)
p(x1)
P[ X ≤ =
xi ] ∑ P=
[X
xi ≤ x
xi ] p(x1)
+ p(x2)
+ p(x3)
p(x1) + p(x2)
for xi ≤ x2
P ( X = x ) = 0 {∀X = x1, x2 ,......, xn }
x1 x2 x3 x xn
Module 7
Probability Distribution Contd..…..
 Continuous Variable
f(x)
 Probability density function
(PDF), f(x) does not indicate
probability directly but it indicates
probability density
-α a +α
x
 CDF is given by
1.0
a
F (a ) = P [ X ≤ a ]= ∫α f ( x )dx
−
F(x)
-α a x +α
Module 7
Piecewise Continuous Distribution
d α
∫α f ( x )dx + P[X=d]+ ∫ f ( x )dx =

−
1
d
2 1
P[X=d]
f(x)
f1(x)
spike
Then, P[X<d] ≠ P[X ≤ d];
f2(x)
Simillarly, P[X>d] ≠ P[X ≥ d];
0 d x
CDF for this case
P[X = 0]
Jump ΔF X=d
P[X=d]
f(x)
0 X=0 d x 0 d x
Module 7
CDF within an interval
Here, interval is given by (x-∆x/2, x+∆x/2)

f(xi)
Δx
 ∆x  ∆x  ∆x 
x
 i − xi x +
 2  2  i
 2 
∆x ∆x xi
P[ xi − ≤ X ≤ xi + ]
2 2
∆x ∆x
= F ( xi + ) − F ( xi − ) = Area under the curve in the int erval
2 2
 ∆x ∆x 
 i
x − , x + 
 
i
2 2
= f ( x) * ∆xi
Module 7
Example Problem
Estimate the expected relative frequencies within each class of interval 0.25.
2
x1 x2 x3
3x
f=
(x) ; 0≤x≤5 0 0.25 0.75 1.25 1.75 5.0
125
xi f(xi)=3x2i/125 f(xi) x Δx = expected rel. freq.
0.25 0.0015 0.00075
0.75 0.0135 0.00675
1.25 0.0375 0.01875
1.75 0.0735 0.03675
2.25 0.1215 --
2.75 0.066 --
3.25 -- --
3.75 0.3375 0.16875
4.25 0.4335 0.21675
4.75 0.5415 0.27075
Sum 0.9975 ≈1.0
Module 7
Bivariate Random Variables
 (X,Y) – two dimensional random vector or random variable
X = rainfall
e.g. Temperature  Evaporation X
Temperature  Radiation Coeff. Y

Rainfall  Recharge
Rainfall  Runoff (X+Y)
y = runoff
 Variables may or may not be dependent
Case 1: (X,Y) is a 2-D discrete random vector

The possible values of may be represented as (Xi, Yj), i=1,2,….n & J=1,2,….n
Case 2: (X,Y) is a 2-D continuous random variable
Here, (X,Y) can assume all possible values in some non-countable set.
Module 7
1. Discrete 2-D Random Vector
X ≤x Y ≤ y
F ( x , y ) = P[ X ≤ x , Y ≤ y ] = ∑ ∑ p( X , Y ) F (∞, ∞ ) = 1
−∞ −∞
E.g. Calculate the following using the values from the given table.
1) F(3,2) i=1 2 3 4 5 6 P[Y]
2) F(7,4) X
0 1 2 3 4 5
Y
3) F(7,8)
j=1 0 0 0.01 0.03 0.05 0.07 0.09 0.25 P[Y=0]
6 4
∑ ∑ p( X , Y ) = 1
i =1 j =1
i j
2 1 0.01 0.02 0.04 0.05 0.06 0.08 0.26
F (3,2) = P[ X ≤ 3, Y ≤ 2] = 0.35 3 2 0.01 0.03 0.05 0.05 0.05 0.06 0.25
F (7,4) = F (∞, ∞ ) = 1 4 3 0.01 0.02 0.04 0.06 0.06 0.05 0.24 P[Y=3]
F (7,8) = 1 0.03 0.08 0.16 0.21 0.24 0.28
P[x=0] P[x=5]
Module 7
2. Continuous 2-D Random Vector
If f ( X ,Y ) → joint pdf of (X,Y)
f(X,Y) ≥ 0
α α
∫α ∫α f ( x, y )dxdy = 1
− −
B
F(x,y) → joint cdf of (X,Y)
f(x,y) plane
= P[X ≤ x,Y ≤ y ]
y x
= ∫α ∫α f ( x, y )dxdy
− −
Probablity of hatched region on the plane

F(α ,α )=1
F(-α ,y) = F(x, -α ) = 0

= ∫ f ( x, y )dB
B
Module 7
Example Problem 1
Consider the flows in two adjacent streams.

Denote it as a 2-D RV (X,Y), with a joint pdf,
=
f ( X ,Y ) C if 5,000 ≤ x ≤ 10,000 and
4,000 ≤ y ≤ 9,000
Solution: To get C,
α α α α
−
∫α ∫α f ( x, y )dxdy
−
1 ⇒ ∫ ∫ C dxdy =
=
−α −α
1
1
∴ C=
(5000)2
For a watershed region B,
P(B) = ∫ ∫ f ( x, y )dxdy
B
Module 7
Example Problem 1 Contd…
Determine P[X ≥ Y]
9000 y
= 1- P[X ≤ Y] = 1 - ∫ ∫
5000 5000
f ( x, y )dxdy
9000 y 9000
=1- ∫ ∫
5000 5000
Cdxdy =1- ∫
5000
[ y − 5000]dy
9000
y2 
= 1- C  − 5000 y 
 2  5000
9000
 (9000) 2
(5000)  2
= 1- C  − 45000000 − + 25000000 
 2 2  5000
= 17
25
Module 7
Calculate F(10000,9000):
x y x y
1
F(x,y) = ∫
−α
∫
−α
f ( x, y )dydx =∫
−α
∫
− α (5000)
2
dydx
x
 y 4000 
= ∫
−α

 (5000)
2
-
(5000) 2  dx

yx 4000 y (5000) 5000 × 4000
= - x − +
(5000)2 (5000)2 (5000)2 (5000)2
xy 4x y 4
= - − + (Ans.)
(5000)2 5 × (5000) (5000) 5
F(10000,9000)
10000 × 9000 4(10000) 9000 4

= - − +
(5000) 2
5 × (5000) (5000) 5
= 3.6 - 1.6 -1.8 +0.8
= 1.0
Module 7
Example Problem 2
=
f(x,y) x 2 +xy 0 ≤ x ≤ 1& 0 ≤ y ≤ 2
= 0 elsewhere
α α
Find, P[x+y ≥ 1], verify ∫α ∫α f(x,y)dxdy
− −
=1
Solution :
1
2 1
xy )dxdy=
2
x x y
3 2
∫∫ + ∫0  3 6  dy
+
2
(x
3
0 0 0
2
2
1 y  1 y2  2 4 
= ∫  +  dy=  +  =  + 
0
3 6  3 12  0  3 12 
= 1 [verified]
Module 7
P [ X + Y ≥ 1]
1
1
∫ − + x ) dx
2 3
=1 − P [ X + Y ≤ 1] =1-
60
(4 x 5 x
1− x
 2 xy 
1
∫ ∫ x + 3
1
= 1-  dydx 1  4x3 5x4 x2 
0 0   =1-  − + 
6 3 4 2 0
1− x
 2
1
y 2x 
=1- ∫  x y +  dx 1 4 5 1
=1 − − +
0  0 6 2
6
3 4 
 2
1
x (1 − 2 x + x 2 )  1 7 
= 1- ∫  x (1 − x ) +  dx = 1 −
0 
6  6 
 12 
65
 6x2 − 6x3 + x − 2x2 + x3 ) 
1
=
= 1- ∫   dx 72
0  
6
Module 7
Marginal Distribution Function
Discrete variables
α
Marginal Distribution=
of X is p[ xi ] ∑ p( x , y
j =1
i j ), ∀i
α
Marginal Distribution=
of y is q [ y j ] ∑ p( x , y
i =1
i j ), ∀j
Continuous variables
α
Marginal distribution function of x is given by g(x) = ∫α f ( x, y )dy
−
α
and of y is given by h(y) = ∫α f ( x, y )dx
−
[g(x) & h(y) are also pdf and should satisfy the conditions]
Module 7
Marginal Distribution Function Contd…
P [c ≤ X ≤ d ]= P[c ≤ X ≤ d , −α ≤ y ≤ α ]
α
d

= ∫  ∫ f ( x, y )dy dx
c  −α 
d
= ∫ g ( x )dx
c
[f ( x=
, y ) g ( x ) × h( y ) ......stochastically independent
g ( x ) is original distribution of the 1-D random variable X
x
Remember C.D.F = F(x) = ∫α g ( x )dx
−
g ( x ) is same as f(x), i.e. orginal density function of x.
Module 7
Example Problem
y
Function Px ,y ( x, y )= C (5 − ( ) − x ) for 0 ≤ x ≤ 2 and 0 ≤ y ≤ 2 can serve as a
2
bivariate contnuous probablity density function.
i) Find the value of C

2 2
S
∴ Probability (x ≤ 2 & y ≤ 2) = ∫0 ∫0 C (5 − (
2
) − t ) dsdt=1
2
2
 s2  2
∫0 C 5s − 4 − ts  dt = 1 ⇒ ∫0 C [10 − 1 − 2t ] dt = 1
0
∫ C [ 9 − 2t ] dt =
1 ⇒ C [9 t − t ]0 =
2 2
1
0
C [18 − 4] =1 ⇒ C =1
14
Module 7
ii) Find probability of x ≤ 1 & y ≤ 1
Px ,y ( x, y=
) prob( X ≤ x,Y ≤ y )
x y
S
= ∫0 ∫0 (5 − (
2
) − t ) /14 dsdt
1  xy 2 x 2 y 
x
1 y2 1  1 1
= ∫ [5 y − − ty ]dt = 5 xy − − = 5 − −  = 0.304
0
14 4 14  4 2 
 14  4 2
iii) Find "marginal desities" Px (x) & Py (Y)

2 2 2
Py (y) = ∫ PX ,Y (t , y )dt
Px (x) ∫P
0
X ,Y ( x, s )ds = ∫ (5 − s − x ) /14 ds
0
2 0
2
1 
2 1 y − t ) dt
=
s2
5s − − xs 
= ∫0 14 (5 −
2
14  4 0 2 2
1 1  yt t 
= [10 − 1 − 2 x ] = 5 t − − 
14 14  2 2 0
1 1 1
= (9 − 2 x ) = [10 − y − 2]= (8 − y )
14 14 14
Module 7
iv) Find cumulative marginal distributions Px ,y (x,α ) & Px x ,y (α ,Y)
9 − 2x
x x
∴ Px ,y (x,α ) =prob( X ≤ x )= ∫ PX , ( t )dt = ∫ ( ) dx
0 0
14
x
9x − x2   9x − x2 
=   =  
 14  0  14 
y
8−y
y
 8y − y 2 / 2 
(α , y ) prob(Y ≤ y )= ∫ Py ( s )ds = ∫ (
Px ,y= ) ds =  
0 0
14  14 
9x − x 2
Check putting x=2 , =1
14
 16 y − y 2 
& y=1,   =1
 28 
v) Find "conditional densities"

(5 − y − x)
Px / y (x/y) = Px x ,y (x,y)/Py (y) = 2
(8 − y )
(5 − y − x )
and Px / y (y/x) = Px x ,y (x,y)/Px (x) = 2
(9 − 2 x )
Module 7
Conditional Density Functions
Let (X,Y) is a 2-D random vector, with joint density function of f(x,y). Let g(x)
and h(y) be the marginal density functions of X and Y respectively.
The CDF of X, given Y=y is defined as, f ( x, y )

g(x | y ) = but h(y)>0
h( y )
f ( x, y )
and the CDF of Y, given X=x is defined as, h( y | x ) = but g(x)>0
g(x )
g ( x | y ) ≥ 0 as f(x,y) and h(y) are (+ve)

α α α
f ( x, y ) 1 1
∫−α g ( x | y )dx = 1= ∫−α h( y ) dx = ∫
h( y ) − α
f ( x , y )dx =
h( y )
h( y ) = 1
 α 
 ∫
∴ f ( x , y )dx = h ( y ) 
 −α 
Module 7
Conditional Density Functions Contd….
α
CDF = G(x|y) = P[ X ≤ x|Y = y ] = ∫α g ( x | y )dx
−
where y belongs to certain region, i.e. y ∈ R (R is a region)
∫
∴ g(x | y ∈ R) =
f ( x, y )dy
R
∫ h( y )dy
R
Now cumulative conditional distribution function
= P [ X ≤ x | y ∈ R] = F( x | y ∈ R)
x
= F(X | y ∈=
R) ∫α g ( x | y ∈ R )dx
−
Module 7
Independence of random variables
g ( x | y ) = g ( x ), when X & Y are statistically independent variables

f ( x, y )
=
g(x|y) = g(x )
h( y )
If f(x,y) = g(x) x h(y), then x, y are independent
For e.g.,
f ( x, y ) = e − ( x + y ) ; x>0; y>0
α −( x + y ) α
e 
∴ g(x) = ∫ e − ( x + y )dy =  
0  − 1 0
= e − x , x>0 & h(y) = e − y , y>0
∴ f ( x, y ) =g ( x ) × h(y) ∴ x &y are independent
Module 7
Functions of random variables
 Simultaneous occurrence  Joint density function

 Distribution of one variable irrespective of the value of the other
variables  Marginal density function
 Distribution of one variable conditioned on the other variable
 Conditioned distribution
Example 1:
c
x : discrete variable, p(x) = f(x) =, x=2,3,4,5, y= x 2 − 7 x + 12
x
y takes two values 0 & 2, p(y=0) = p(x=3) + p(x=4)= c + c = 7c
3 4 12
p(y=2) = p(x=2) + p(x=5) = c + c =7c

2 5 10
x 2 3 4 5
y 2 0 0 2
p(x) c/2 c/3 c/4 c/5
Module 7
Functions of Random variables Contd…..
Example 2 :
2
for functions of RVs:
= 1- ∫ e − x dx
0
For continuous curve
−x e −2 e −0
f(x) = e ; x>0 = 1- [ − ]
−1 −1
y = 2x+1
y −1 = = e −2 (Ans.)
P[y ≥ 5] = P [ X ≥ ]
2
= P[X ≥ 2]
= 1- P[X ≤ 2] Alternatively
α α
2 e  −x
∫2
−x −2
= 1- ∫ e dx−x =e dx =  e (Ans.)
0
 −1  2
Module 7
i) g(x|y) = f(x,y) / h(y)
1 14 (10 − y − 2 x )
= (5 − y − x ) × = , 0 ≤ x ≤ 2, 0 ≤ y ≤ 2
14 2 (8 − y ) 2(8 − y )
Conditional CDF:
(10 x − xy − x 2 )
x
g ( x | y ) = P [ X ≤ x | Y = y ]= ∫ g ( x | y )dx =
−α
2(8 − y )
Now , P [ X ≤ 1,Y =
3 ]
2
(10(1) − (1)( 3 ) − 12 ) 10 − 3 − 1 15
= a[1| 3 ] = 2 = 2 =
2 2(8 − )3 2(13 ) 26
2 2
ii) Now, g(x|y ≤ 1] 0 ≤ x ≤ 2; 0 ≤ y ≤ 2
1 1
1 (5 − y − x )dy
∫ f ( x, y )dy
0
∫ 14
0
2
= 1
= 1
∫ h( y )dy
o
∫ (8 − y ) /14dy
o
Module 7
1 1
(5 − y  2

∫ 2
− x )dy
 5 y − y
4
− xy  5 − 1 − x 
= 0
= 0
=  4 
1
 2

1
8 − 1 
∫ (8 − y )dy 8 y − y  2
 2  0
o
(19 − 4 x )
=
30
x
 19 − 4 x   19 x x2 
iii ) F [ x | y ≤ 1]
= ∫0  30 dx =  30 − 15 
  
Now ,
2
18 x 2 x
P  X ≤ 1 | Y=
≤ 1 F  1 | Y < 1 = −
 2   2  15 15 for x= 1
2
18 x 2 17
= − =
2 × 15 15(4) 30
Module 7
Properties of RV
 Population: A complete assemblage of all possible RV
 Sample: A subset of population
 Realization: A time series of the RVs actually realized. They are continuous.
All samples are not realizations.
 Observation: A particular value of the RV in the realization

Note: ARMA is a synthetically generated realization
Module 7

Hydrological Statistics: 7 Lectures

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Hydrological Statistics: 7 Lectures

Uploaded by

Copyright:

Available Formats

Module 7

Prof. Subhankar Karmakar

 Statistical parameter estimation

Lecture 1: Introduction to probability

– No uncertainty for a given set of inputs

– There is always uncertainty for a given set of inputs

E.g. Rainfall-runoff model

 Random Variable (R.V.) is a variable whose value can not be predicted

E.g. rainfall, stream flow, soil hydraulic properties

If X is a R.V. & Z(X) is function of X ==> Z is a R.V.

 Discrete random variables

No. of particles emitted by a radio-active material.

 Continuous random variables

 By using ‘class interval’ we can convert continuous random

 Discrete Variable P [X=xi] Probability mass function (pmf)

[Here 1 is likelihood of occurrence] X1 X2 X3 Xn

Cumulative distribution function (CDF) Cumulative distribution function (cdf)

∫α f ( x )dx + P[X=d]+ ∫ f ( x )dx =

Here, interval is given by (x-∆x/2, x+∆x/2)

 (X,Y) – two dimensional random vector or random variable

Temperature  Radiation Coeff. Y

 Variables may or may not be dependent

Case 1: (X,Y) is a 2-D discrete random vector

F (3,2) = P[ X ≤ 3, Y ≤ 2] = 0.35 3 2 0.01 0.03 0.05 0.05 0.05 0.06 0.25

F (7,8) = 1 0.03 0.08 0.16 0.21 0.24 0.28

Probablity of hatched region on the plane

F(-α ,y) = F(x, -α ) = 0

Consider the flows in two adjacent streams.

10000 × 9000 4(10000) 9000 4

g ( x ) is same as f(x), i.e. orginal density function of x.

i) Find the value of C

iii) Find "marginal desities" Px (x) & Py (Y)

v) Find "conditional densities"

The CDF of X, given Y=y is defined as, f ( x, y )

g ( x | y ) ≥ 0 as f(x,y) and h(y) are (+ve)

where y belongs to certain region, i.e. y ∈ R (R is a region)

Now cumulative conditional distribution function

g ( x | y ) = g ( x ), when X & Y are statistically independent variables

= e − x , x>0 & h(y) = e − y , y>0

∴ f ( x, y ) =g ( x ) × h(y) ∴ x &y are independent

 Simultaneous occurrence  Joint density function

p(y=2) = p(x=2) + p(x=5) = c + c =7c

 Population: A complete assemblage of all possible RV

 Sample: A subset of population

 Observation: A particular value of the RV in the realization

You might also like