
QUANTITATIVE HYDROLOGY

Conceptual models

Overview of TANK model (Sugawara and Funiyuki, 1956)


[Figure: schematic of Tank A and Tank B]

Tank A
Overland flow - 1st side outlet
Interflow - 2nd side outlet
Percolation - bottom outlet

Tank B
Baseflow

The final discharge is computed by:
$Q_{tot,t} = q_{1,t} + q_{2,t} + q_{4,t}$

Parameters to be calibrated:
k1, k2, k3, k4, d1, d2, s1 and s2
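
To illustrate how the outlets combine into the final discharge, the sketch below steps a two-tank model through a rainfall series. The slide lists only the parameter names, so their roles here are assumptions: k1 to k4 are taken as outlet coefficients (overland flow, interflow, percolation, baseflow), d1 and d2 as the heights of the two side outlets of Tank A, and s1 and s2 as the initial storages of Tank A and Tank B. The function names and the handling of evapotranspiration are hypothetical.

```python
def tank_step(sa, sb, rain, evap, p):
    """Advance the two-tank sketch by one time step.

    sa, sb: storages of Tank A and Tank B; p: dict of parameters.
    Assumes k1 + k2 + k3 <= 1 so that Tank A cannot be over-drained.
    Returns the new storages and the total discharge q1 + q2 + q4.
    """
    # Tank A gains rainfall and loses evapotranspiration (not below zero).
    sa = max(sa + rain - evap, 0.0)
    q1 = p["k1"] * max(sa - p["d1"], 0.0)  # overland flow, 1st side outlet
    q2 = p["k2"] * max(sa - p["d2"], 0.0)  # interflow, 2nd side outlet
    q3 = p["k3"] * sa                      # percolation, bottom outlet
    sa -= q1 + q2 + q3
    # Percolation feeds Tank B, which releases baseflow.
    sb += q3
    q4 = p["k4"] * sb
    sb -= q4
    return sa, sb, q1 + q2 + q4


def simulate_tank(rain, evap, p):
    """Run the sketch over whole series, starting from storages s1 and s2."""
    sa, sb = p["s1"], p["s2"]
    flows = []
    for r, e in zip(rain, evap):
        sa, sb, q = tank_step(sa, sb, r, e, p)
        flows.append(q)
    return flows
```

Calibration would then search for the eight parameter values that bring the simulated series closest to the observed runoff.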
Conceptual models

Overview of ADM model (Todini, 1996)


Water balance component
surface runoff
interflow
percolation
baseflow

Transfer component
transfer along hillslopes
transfer along river network

Parameters to be calibrated:
Water Balance Component : B, Wm, D1, D2, P1, P2 and K0
Transfer Component : C1, D1, C2 and D2
Conceptual models

Overview of NAM model (DHI)


Storage
snow storage
surface storage
lower or root zone storage
groundwater storage

Modelling component
overland flow
interflow
interflow and overland flow routing
groundwater recharge
baseflow

Parameters to be calibrated:
Umax, Lmax, CQOF, CQIF, TIF, TOF, TG, CK12 and CKBF
Data-driven models

Overview of ANN
[Figure: node j with inputs X1, ..., Xi, ..., Xn, weights W1j, ..., Wij, ..., Wnj, bias bj and output Yj]

Output of jth node :
$Y_j = f\left( \sum_{i=1}^{N_{inp}} W_{ij} X_i + b_j \right)$

Most commonly used transfer functions:
1. Sigmoid function : $f(X_i, W_i) = \dfrac{1}{1 + \exp(-X_i)}$
2. Hyperbolic tangent function : $f(X_i, W_i) = \tanh(X_i)$
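
The node output above can be sketched in a few lines. This is a minimal illustration, not the software used in the study; the function names are assumptions.

```python
import math


def sigmoid(u):
    """Sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-u))


def node_output(x, w, b, transfer=sigmoid):
    """Output of one node: transfer function of the weighted input sum plus bias."""
    u = sum(wi * xi for wi, xi in zip(w, x)) + b
    return transfer(u)
```

Passing `math.tanh` as `transfer` gives the hyperbolic tangent variant.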
Data-driven models

Learning in ANN

Target vector : $T = (t_1, t_2, \ldots, t_p)$
Output vector : $Y = (y_1, y_2, \ldots, y_p)$
Error function : $E = \sum_{P} \sum_{p} (y_i - t_i)^2$

Minimise the value of E : optimisation problem
- Backpropagation algorithm
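
As an illustration of minimising E by backpropagation, the sketch below trains a network with one sigmoid hidden layer and a linear output node by gradient descent. The architecture, learning rate, number of epochs and all names are assumptions chosen for the example; the slides do not state the settings used in the study.

```python
import math
import random


def train_ann(samples, n_hidden=3, rate=0.1, epochs=500, seed=0):
    """Fit a one-hidden-layer network by backpropagation.

    samples: list of (inputs, target) pairs, inputs being a list of floats.
    Returns (predict, history); history holds the summed squared error E
    accumulated over each epoch.
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b_hid = [0.0] * n_hidden
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_out = 0.0

    def forward(x):
        hidden = []
        for j in range(n_hidden):
            u = sum(w * xi for w, xi in zip(w_hid[j], x)) + b_hid[j]
            hidden.append(1.0 / (1.0 + math.exp(-u)))
        y = sum(w * h for w, h in zip(w_out, hidden)) + b_out
        return hidden, y

    history = []
    for _ in range(epochs):
        total = 0.0
        for x, t in samples:
            hidden, y = forward(x)
            err = y - t              # dE/dy up to a factor 2, absorbed in rate
            total += err ** 2
            for j in range(n_hidden):
                # Error propagated back through output weight and sigmoid slope.
                delta = err * w_out[j] * hidden[j] * (1.0 - hidden[j])
                w_out[j] -= rate * err * hidden[j]
                for i in range(n_in):
                    w_hid[j][i] -= rate * delta * x[i]
                b_hid[j] -= rate * delta
            b_out -= rate * err
        history.append(total)

    def predict(x):
        return forward(x)[1]

    return predict, history
```

Inputs and targets are best scaled to a small range (for example 0 to 1) before training, otherwise the fixed learning rate may be too large.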
Data-driven models

Overview of Decision tree (DT)

- solves classification problem

• ID3 algorithm
$Entropy(S) = -\sum_{i=1}^{n} P_i \log_2 P_i$
$Gain(S, A) = Entropy(S) - \sum_{v \in values(A)} \frac{|S_v|}{|S|} Entropy(S_v)$

• C4.5 algorithm
$SplitInformation(S, A) = -\sum_{i=1}^{c} \frac{|S_i|}{|S|} \log_2 \frac{|S_i|}{|S|}$
$GainRatio(S, A) = \frac{InformationGain(S, A)}{SplitInformation(S, A)}$

[Figure: X1-X2 plane (X1 from 0 to 6, X2 from 0 to 4) partitioned into rectangular regions labelled class 0 and class 1]

Decision tree
x2 > 2
  Yes: x1 > 2.5
    Yes: x2 < 3.5
      Yes: Class 1
      No: Class 0
    No: Class 0
  No: x1 < 4
    Yes: Class 0
    No: x2 < 1
      Yes: Class 1
      No: Class 0
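
The ID3 and C4.5 splitting measures above can be computed directly from class counts. The following is a minimal sketch for a discrete attribute; the function names and the toy labels in the usage note are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy(S) = -sum of P_i * log2(P_i) over the classes present in S."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def partition(values, labels):
    """Group the class labels by the attribute value of each instance."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    return groups


def information_gain(values, labels):
    """Gain(S, A): entropy of S minus the weighted entropy of the subsets S_v."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in partition(values, labels).values())
    return entropy(labels) - remainder


def split_information(values):
    """SplitInformation(S, A): entropy of S with respect to the values of A."""
    return entropy(values)


def gain_ratio(values, labels):
    """GainRatio(S, A), defined as zero when the attribute has a single value."""
    si = split_information(values)
    return information_gain(values, labels) / si if si > 0 else 0.0
```

For example, with labels `["low", "low", "high", "high"]`, an attribute `["a", "a", "b", "b"]` separates the classes perfectly, while `["a", "b", "a", "b"]` carries no information about them.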


Data-driven models

Overview of Model tree (MT)

Tree structure where
• nodes are splitting conditions
• leaves are:
  - constants (regression tree)
  - linear regression models (M5 model tree)

• Building initial tree
$SDR = sd(T) - \sum_{i} \frac{|T_i|}{|T|} sd(T_i)$

• Pruning the tree : needed when a large tree overfits the data: a subtree is replaced by one linear model

• Smoothing : is used to compensate for the sharp discontinuities between adjacent linear models

[Figure: X1-X2 plane (X1 from 1 to 6, X2 from 1 to 4) partitioned into regions Model 1 to Model 6, with Y (output) as the third axis]

M5 model tree
x2 > 2
  Yes: x1 > 2.5
    Yes: x2 < 3.5
      Yes: Model 1
      No: Model 2
    No: Model 3
  No: x1 < 4
    Yes: Model 4
    No: x2 < 1
      Yes: Model 5
      No: Model 6
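
The standard deviation reduction (SDR) used to build the initial tree can be sketched as follows. The example assumes the population standard deviation for sd and candidate thresholds midway between consecutive input values; both choices, and the function names, are assumptions made for illustration.

```python
import statistics


def sdr(parent, subsets):
    """SDR = sd(T) - sum over subsets of |T_i| / |T| * sd(T_i)."""
    n = len(parent)
    return statistics.pstdev(parent) - sum(
        len(s) / n * statistics.pstdev(s) for s in subsets
    )


def best_split(x, y):
    """Return (threshold, SDR) of the split on x that reduces sd(y) the most."""
    pairs = sorted(zip(x, y))
    best = (None, 0.0)
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold fits between equal x values
        threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
        left = [v for _, v in pairs[:k]]
        right = [v for _, v in pairs[k:]]
        score = sdr(y, [left, right])
        if score > best[1]:
            best = (threshold, score)
    return best
```

Applying `best_split` recursively to each resulting subset yields the splitting conditions at the nodes; the leaves then receive either a constant or a linear regression model.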
Boosting techniques in ML
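The boosting algorithm for regression problem: AdaBoost.RT
• uniform distribution of weights
• while t ≤ T,
  • calculate error rate of $f_t(x)$ based on threshold value $\phi$:
    $\varepsilon_t = \sum_{i:\, \left| \frac{f_t(x_i) - y_i}{y_i} \right| > \phi} D_t(i)$
  • set $\beta_t = \varepsilon_t^2$
  • update distribution ($Z_t$ - normalisation factor):
    $D_{t+1}(i) = \frac{D_t(i)}{Z_t} \times \begin{cases} \beta_t & \text{if } \left| \frac{f_t(x_i) - y_i}{y_i} \right| \le \phi \\ 1 & \text{otherwise} \end{cases}$
  • set t = t + 1
• final hypothesis
  $f_{fin}(x) = \dfrac{\sum_t \log\left(\frac{1}{\beta_t}\right) f_t(x)}{\sum_t \log\left(\frac{1}{\beta_t}\right)}$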
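
The boosting loop above can be sketched as follows. The weak learner `fit_line` (a least-squares straight line) is a hypothetical stand-in for the ANN or MT used in the study, and passing the distribution to the learner by weighted resampling is an assumption of this sketch. The relative error requires non-zero observed values.

```python
import math
import random


def fit_line(xs, ys):
    """Hypothetical weak learner: least-squares straight line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx > 0 else 0.0
    return lambda x: my + slope * (x - mx)


def adaboost_rt(xs, ys, n_rounds=5, phi=0.1, power=2, seed=0):
    """Sketch of AdaBoost.RT; returns (predict, final distribution)."""
    rng = random.Random(seed)
    m = len(xs)
    dist = [1.0 / m] * m                      # uniform distribution of weights
    models, weights = [], []
    for _ in range(n_rounds):
        # Train the weak learner on a sample drawn according to dist.
        idx = rng.choices(range(m), weights=dist, k=m)
        model = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        # Error rate: total weight of examples with relative error above phi.
        bad = [abs((model(x) - y) / y) > phi for x, y in zip(xs, ys)]
        eps = sum(d for d, b in zip(dist, bad) if b)
        eps = min(max(eps, 1e-10), 1.0 - 1e-10)   # keeps log(1 / beta) finite
        beta = eps ** power
        # Shrink the weight of well-predicted examples, then normalise by Z_t.
        dist = [d * (1.0 if b else beta) for d, b in zip(dist, bad)]
        z = sum(dist)
        dist = [d / z for d in dist]
        models.append(model)
        weights.append(math.log(1.0 / beta))
    total = sum(weights)

    def predict(x):
        # Final hypothesis: average of the models weighted by log(1 / beta_t).
        return sum(w * f(x) for w, f in zip(weights, models)) / total

    return predict, dist
```

With `power=2` the update matches the step "set β_t = ε_t²" on the slide.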
Model evaluation criteria :

• Root mean squared error (RMSE) = $\sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \tilde{y}_i)^2}$

• Normalised Root mean squared error (NRMSE) = $\frac{RMSE}{\text{Standard deviation of observed data}}$

• Coefficient of efficiency (COE) = $1 - \frac{RMSE^2}{\frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y})^2}$
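
The three criteria can be computed as in the sketch below. It assumes the population form (division by n) for the standard deviation of the observed data, so that COE equals 1 - NRMSE² exactly; the function names are illustrative.

```python
import math


def rmse(obs, sim):
    """Root mean squared error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def observed_variance(obs):
    """Population variance of the observed data."""
    mean = sum(obs) / len(obs)
    return sum((o - mean) ** 2 for o in obs) / len(obs)


def nrmse(obs, sim):
    """RMSE divided by the standard deviation of the observed data."""
    return rmse(obs, sim) / math.sqrt(observed_variance(obs))


def coe(obs, sim):
    """Coefficient of efficiency: 1 - RMSE^2 / variance of the observed data."""
    return 1.0 - rmse(obs, sim) ** 2 / observed_variance(obs)
```

As a check on the relation between the criteria, an NRMSE of 0.469 corresponds to a COE of about 1 - 0.469² ≈ 0.780.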
STUDY AREA

Bagmati River Basin


Location : Central Nepal

Basin characteristics :
• Contributing Area : 2900 km²
• Elevation range : approx. 2700 m to < 100 m
• Average slope : 1%
DATA COLLECTION

• Precipitation data :
Daily data for 8 years (January 1988 to December 1995)
3 stations - (Kathmandu, Hariharpur and Daman)
Thiessen polygon - mean areal precipitation

• Runoff data :
Daily data for 8 years (January 1988 to December 1995)
1 station - (Pandheradobhan)

• Evapotranspiration data :
Calculated using the FAO modified Penman method
DATA ANALYSIS AND PREPARATION

Daily average areal precipitation and the runoff at the basin outlet

Precipitation:
Daily Average = 5.44 mm
Maximum = 364.59 mm

Runoff:
Daily Average = 149.96 Cumec
Maximum = 5030 Cumec

[Figure: Rainfall-Discharge plot. Runoff [Cumec] and Precipitation [mm] against Time [days], from Jan-88]


DATA ANALYSIS AND PREPARATION

Data analysis for inputs for data-driven model

• Visual inspection
• Correlation analysis
• Autocorrelation analysis of discharge

[Figure: Correlation of rainfall with runoff. Correlation coeff. against lag i in R(t-i), lags 0 to 8]
[Figure: Relationship between rainfall event & the resulting runoff. Runoff [Cumec] and Rainfall [mm] against Time [days]]
[Figure: Autocorrelation of discharge. Correlation coeff. against Lag [days], lags 0 to 8]
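
The correlation and autocorrelation analyses can be sketched as below: the rainfall series is shifted by a lag and correlated with the runoff, and the runoff is correlated with its own lagged copy. The function names are assumptions, and the series must not be constant (the correlation is then undefined).

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def lagged_correlation(rain, runoff, lag):
    """Correlation between rainfall R(t - lag) and runoff Q(t)."""
    if lag == 0:
        return pearson(rain, runoff)
    return pearson(rain[:-lag], runoff[lag:])


def autocorrelation(runoff, lag):
    """Correlation between Q(t - lag) and Q(t)."""
    return lagged_correlation(runoff, runoff, lag)
```

Evaluating `lagged_correlation` for lags 0 to 8 reproduces the kind of curve used to decide how many previous rainfall values to feed to the data-driven models.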
DATA ANALYSIS AND PREPARATION

Data transformation

• Box Cox Transformation
• Logarithmic Transformation

[Figure: Transformed runoff vs original runoff. Transformed runoff and Original runoff against Time [days]]
[Figure: Frequency Distribution of the original discharge. Frequency against Discharge [m3/s], bins]
[Figure: Frequency Distribution of the transformed discharge (Box-Cox). Frequency against Transformed Discharge, bins]
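
Both transformations can be written as one function, since the logarithmic transformation is the Box-Cox transformation with exponent zero. The sketch below is illustrative; the exponent `lam` is a free parameter here because the slides do not state the value used in the study.

```python
import math


def box_cox(q, lam):
    """Box-Cox transform of a positive discharge q; lam = 0 gives the logarithm."""
    if q <= 0:
        raise ValueError("Box-Cox transformation requires positive values")
    return math.log(q) if lam == 0 else (q ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value back to the original discharge scale."""
    return math.exp(z) if lam == 0 else (lam * z + 1.0) ** (1.0 / lam)
```

Model outputs obtained on transformed data need to be passed through `inverse_box_cox` before error measures are computed in discharge units.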
EXPERIMENTS: Conceptual models

Model setup for TANK, ADM and NAM models


Data sets :
Data for calibration : 2000 points (1 Jan 1988 to 22 June 1993)
Data for verification : 922 points (23 June 1993 to 31 Dec 1995)
EXPERIMENTS: Data-driven models

Model setup for ANN, MT and DT


Data sets :
Training Set = 2000 examples (3 Jan 1988 to 24 June 1993)
Verification Set = 919 examples (25 June 1993 to 30 Dec 1995)

Input variables :
Rainfall - up to 2 previous time steps (REt, REt-1 and REt-2)
Runoff - up to 1 previous time step (Qt and Qt-1)

Threshold for classification of runoff Qt+1


Qt+1 <= 300 m3/s - “low flow”
Qt+1 > 300 m3/s - “high flow”
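
The input variables and the flow classes above can be assembled from the daily series as in the following sketch. The function name and the tuple layout are assumptions made for illustration.

```python
def build_examples(rain, runoff, threshold=300.0):
    """Build (inputs, target, label) examples for predicting Q(t+1).

    inputs = (RE_t, RE_t-1, RE_t-2, Q_t, Q_t-1); label is "low flow" when
    Q(t+1) <= threshold (in m3/s) and "high flow" otherwise.
    """
    examples = []
    # t needs two earlier values and one later value to be available.
    for t in range(2, len(runoff) - 1):
        inputs = (rain[t], rain[t - 1], rain[t - 2], runoff[t], runoff[t - 1])
        target = runoff[t + 1]
        label = "low flow" if target <= threshold else "high flow"
        examples.append((inputs, target, label))
    return examples
```

Because each example needs two earlier days and one later day, a record of 2922 daily values yields 2919 examples, which matches the 2000 training and 919 verification examples listed above.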
RESULTS: Conceptual models (TANK, ADM and NAM)

             Tank                     ADM                      NAM
Parameters   Training  Verification   Training  Verification   Training  Verification
RMSE         105.08    179.15         107.12    132.81         105.36    177.01
NRMSE        0.469     0.511          0.478     0.379          0.470     0.505
COE          0.7798    0.7390         0.7712    0.8565         0.7786    0.7452

[Figure: Comparison of results computed by Conceptual models on Verification set. Runoff [Cumec] against Time [Days], 23-Jun-93 to 21-Sep-93; series: Observed, Computed by ADM, Computed by Tank, Computed by NAM]


RESULTS: Conceptual models

Comparison (TANK, ADM and NAM models)


Performance based on accumulated runoff

[Figure: Comparison of Accumulated Runoff on verification set. Runoff [x 10^3 Cumec] against Time [days], 20-Jun-93 to 7-Dec-95; series: Observed, Computed by ADM, Computed by Tank, Computed by NAM]


RESULTS: Predictive data-driven models
Performance of ANN
Best number of hidden nodes found = 3

[Figure: ANN model on verification set. Runoff [Cumec] against Time [Days], 25-Jun-93 to 23-Sep-93; series: Observed, Computed]
RESULTS: Data-driven models

Performance of MT

            Regression       Linear Model Tree
                             Trial 1          Trial 2          Trial 3          Trial 4
Parameter   Train   Verif    Train   Verif    Train   Verif    Train   Verif    Train   Verif
RMSE        111.5   157.7    96.8    155.3    98.4    153.6    100.1   160.7    100.2   161.0
NRMSE       0.5     0.4      0.4     0.4      0.4     0.4      0.4     0.5      0.4     0.5
COE         0.8     0.8      0.8     0.8      0.8     0.8      0.8     0.8      0.8     0.8

[Figure: MT on verification set (No of LM = 8). Runoff [Cumec] against Time [Days], 25-Jun-93 to 23-Sep-93; series: Observed, Computed]
RESULTS: Data-driven models

Performance of ANN using transformed data

Best found hidden nodes :
• For Box-Cox - 7
• For log - 3

[Figure: Comparison of ANN model on verification set (using data transformation). Runoff [Cumec] against Time [Days], 25-Jun-93 to 23-Sep-93; series: Measured, Computed-Log transformed, Computed-Box-Cox transformed]



RESULTS: Data-driven models

Performance of DT for classification

                            Unpruned Decision tree                          Pruned Decision tree
Evaluation     No. of       Correctly classified   Incorrectly classified   Correctly classified   Incorrectly classified
for            instances    Number   %             Number   %               Number   %             Number   %
Training       2000         1919     95.95%        81       4.05%           1915     95.75%        85       4.25%
Verification   919          857      93.25%        62       6.75%           855      93.04%        64       6.96%

(42 low, 22 high)

Data distribution according to the classification

Number of instances
                       According to the classification criteria   According to the Decision tree
CLASS   Criteria       Train   Verif   Total                      Train   Verif   Total
LOW     Qt+1 <= 300    1655    814     2469                       1692    832     2524
HIGH    Qt+1 > 300     302     148     450                        265     130     395

Daily runoff (Cumec)
                       According to the classification criteria   According to the Decision tree
CLASS   Criteria       Max     Min     Mean                       Max     Min     Mean
LOW     Qt+1 <= 300    300     5.1     68.74                      1970    5.1     80.87
HIGH    Qt+1 > 300     5030    301     596.4                      5030    64.4    592.4
RESULTS: Data-driven models (ANN & MT, low flow)

[Figure: Comparison of ANN and MT on verification set for low flow. Runoff [Cumec] against Time [Days], days 600 to 800; series: Measured, Computed by MT, Computed by ANN]


RESULTS: Data-driven models (ANN & MT, high flow)

[Figure: Comparison of ANN and MT on verification set for high flow. Runoff [Cumec] against Time [Days], days 0 to 150; series: Measured, Computed by MT, Computed by ANN]


RESULTS: Data-driven models (Comparison of ANN)
ANN model: whole data set with combined low & high

            using all events     combined low & high
Parameter   train     verif      train     verif
RMSE        113.13    77.67      102.68    68.46
NRMSE       0.392     0.335      0.356     0.295
COE         0.8466    0.8889     0.8735    0.9127

[Figure: Comparison of ANN model on verification set. Runoff [Cumec] against Time [Days], 11-May-95 to 28-Sep-95; series: Observed, Computed - whole set, Computed - combined low & high]


RESULTS: Data-driven models (Comparison of MT)
MT model: whole data set with combined low & high

            using all events     combined low & high
Parameter   train     verif      train     verif
RMSE        122.82    92.06      101.97    78.79
NRMSE       0.425     0.397      0.353     0.340
COE         0.8190    0.8421     0.8752    0.8843

[Figure: Comparison of MT on verification set. Runoff [Cumec] against Time [Days], 11-May-95 to 28-Sep-95; series: Observed, Computed - whole set, Computed - combined low & high]


RESULTS: Data-driven models (ANN & MT, low flow)
(using committee machine)

[Figure: Comparison of ANN and MT on verification set for low flow using Committee Machine - AdaBoost.RT. Runoff [Cumec] against Time [Days], days 600 to 800; series: Observed, Computed by MT, Computed by ANN]


RESULTS: Data-driven models (ANN & MT, high flow)
(using committee machine)

[Figure: Comparison of ANN and MT on verification set for high flow using Committee Machine - AdaBoost.RT. Runoff [Cumec] against Time [Days], days 0 to 150; series: Observed, Computed by MT, Computed by ANN]


RESULTS: Data-driven models

Comparison of results by ANN

                                                                           Using Committee Machine - AdaBoost.RT
                       With normal data          With data transformation  With normal data          With data transformation
Description            Model Lo     Model Hi     Model Lo     Model Hi     Model Lo     Model Hi     Model Lo     Model Hi
Number of Hidden Nodes 4            3            3            5            4            3            3            5
RMSE    Training       30.93        251.16       31.55        250.57       31.83        250.32       33.04        267.03
        Verification   29.14        160.60       30.57        154.13       29.92        158.82       31.67        164.35
NRMSE   Training       0.400        0.499        0.408        0.498        0.411        0.498        0.427        0.531
        Verification   0.370        0.468        0.388        0.449        0.379        0.463        0.402        0.479
COE     Training       0.8401       0.7497       0.8337       0.7509       0.8307       0.7514       0.8176       0.7171
        Verification   0.8939       0.7794       0.8495       0.7967       0.8559       0.7841       0.8385       0.7688
RESULTS: Data-driven models

Comparison of results by MT

                                                                           Using Committee Machine - AdaBoost.RT
                       With normal data          With data transformation  With normal data          With data transformation
Description            Model Lo     Model Hi     Model Lo     Model Hi     Model Lo     Model Hi     Model Lo     Model Hi
Number of Rules (LM)   4            6            2            3            4            6            2            3
RMSE    Training       30.96        249.25       32.62        345.78       29.46        241.02       30.50        246.47
        Verification   30.92        187.33       31.50        190.42       29.61        162.83       30.59        172.56
NRMSE   Training       0.400        0.496        0.422        0.688        0.381        0.479        0.394        0.490
        Verification   0.392        0.546        0.400        0.555        0.376        0.475        0.388        0.503
COE     Training       0.8399       0.7535       0.8221       0.5257       0.8550       0.7695       0.8445       0.7590
        Verification   0.8461       0.6996       0.8402       0.6897       0.8588       0.7730       0.8493       0.7451
RESULTS: Summary

Summary of model performances

            Conceptual models               Predictive DDM
                                            "whole event"        "high flow"
Parameter   Tank      ADM       NAM         ANN       MT         ANN*      MT*
RMSE        179.15    132.809   177.01      163.14    153.58     154.13    162.83
NRMSE       0.511     0.379     0.505       0.464     0.437      0.449     0.475
COE         0.7389    0.8565    0.7452      0.7842    0.809      0.7967    0.7730

ANN* - best result achieved using transformed data
MT* - best result achieved using committee machine on non-transformed data
