
Int. J. Embedded Systems, Vol. 12, No. 1, 2020

Exploration and application of the value of big data based on data-driven techniques for the hydraulic internet of things

Qiang Yue and Fusheng Liu*


College of Water Conservancy and Civil Engineering,
Shandong Agricultural University,
Taian, China
Email: yueqiang406@163.com
Email: liufsh8241703@163.com
*Corresponding author

Changqing Song
Agricultural Big Data Center,
Shandong Agricultural University,
Taian, China
Email: scq0705@163.com

Jing Liang
College of Information Engineering,
Qingdao Binhai University,
Qingdao, China
Email: 136184901@qq.com

Yanmin Liu
Faculty of Computer Science,
Dalhousie University,
Halifax, NS, Canada
Email: forevernianshao@163.com

Guangsheng Cao
College of Information Engineering,
Qingdao Binhai University,
Qingdao, China
Email: 52605560@qq.com

Abstract: The use of big data technology to screen the massive amounts of hydraulic
engineering data in the internet of things is important for its efficient application. This research
applies big data methodology to water management to solve numerous problems, such as the
demand diversification of related interest groups, overall water difficulties and other problems
that arise in hydraulic engineering. A historical database that contains a large amount of data and
feedback information is used to design an early-warning health model for a reservoir using big
data methods and based on the C5.0 decision-tree algorithm. The health status of Dingdong
reservoir is forecast using the model as a case study. The results show that the reservoir is in a
healthy state corresponding to no warning level. The early-warning health model is feasible and
effective for utilising abundant case resources, and could be used widely in reservoir health
management. The results obtained in this paper are beneficial to the sustainable development and
scientific management of reservoirs.

Keywords: big data; early health warning; water resources data; internet of things.

Reference to this paper should be made as follows: Yue, Q., Liu, F., Song, C., Liang, J., Liu, Y.
and Cao, G. (2020) ‘Exploration and application of the value of big data based on data-driven
techniques for the hydraulic internet of things’, Int. J. Embedded Systems, Vol. 12, No. 1,
pp.106–115.

Copyright © 2020 Inderscience Enterprises Ltd.



Biographical notes: Qiang Yue is an Associate Professor in the Shandong Agricultural
University. He received his Bachelor’s in Survey Engineering from the Shandong Mining
Institute in 1991, and PhD in Geotechnical Engineering from the Shandong University of Science
and Technology in 2007. He has published more than 20 technical papers in international
journals and conference proceedings. His research interests include IoT and future
communication technologies.

Fusheng Liu is a Professor in the Shandong Agricultural University. He received his Bachelor’s
in Water Conservancy and Hydropower Engineering from the North China University of Water
Resources and Electric Power in 1985, and PhD in Geotechnical Engineering from the Shandong
University of Science and Technology in 2006. He is currently working as an Academic Leader
of Water Conservancy and Hydropower Engineering in the College of Water Conservancy and
Civil Engineering, Shandong Agricultural University, Taian, China. His research interests
include security and IoT.

Changqing Song is a Professor in the Shandong Agricultural University. He received his Master’s
degree from the Dalian University of Technology. His current research interests include IoT,
e-commerce and wireless sensor networks.

Jing Liang is an Associate Professor in the College of Information Engineering at Qingdao Binhai
University. She has published more than ten academic papers in international journals and
conferences. Her research interests include data analysis and the internet of things.

Yanmin Liu is a senior undergraduate student in the Faculty of Computer Science at Dalhousie
University, Halifax, NS, Canada. Her main research interests include wireless and wired
networks, wireless sensor networks, energy efficient routing protocols, and web/mobile
application development.

Guangsheng Cao is a Lecturer in the College of Information Engineering at the Qingdao Binhai
University. His research interests include digital communication, machine learning and natural
language processing.

1 Introduction

With the rapid development of information technology, water information infrastructure and application systems are being applied increasingly to the construction and management of water-conservancy projects and also to water-administration management; as a result, the related data volume has increased vastly. In general, these ‘big data’ resources are gradually changing people’s work and lives, and collectively the data has begun to change from a simple processing object into a basic resource (Buxton et al., 2008; Liao et al., 2015). The introduction of big data technology to the water-conservancy industry, as the basic technology of water-conservancy construction and administration, has become an inevitable trend.

Water information covers the water-conservancy engineering survey, planning, design, construction, operation management and maintenance. Also covered are flood control, water-resource management, water-administration management, and soil and water conservation. Water resources can be in multi-ownership and cover multiple categories, so that the associated data arise from dispersed sources and in different forms from varying application services. These realities can restrict the extensive use of the data and hence diminish their value. Thus, the collection, storage, transmission, processing and application of water-conservancy data has become a problem and challenge facing the development of water conservancy. Enhancing the value of water-conservancy data to provide a scientific basis for associated project construction management is an important part of the informatisation program for water resources.

Reservoirs play an important role in solving the contradiction between urban and rural water supply and demand in water shortage areas and improving the aquatic ecological environment. As a reservoir is a channel for communication between humans and nature, it is very important to maintain reservoir health (Xiao and Liang, 2005; Zeng et al., 2015; Rapport, 1989; Jorgensen, 1995; Hu et al., 2008). With the rapid development of information technology, it is urgent to integrate the diversified water resources, realise information sharing and deep data mining, and develop an intelligent management system for reservoir information, so as to improve reservoir management efficiency and the capacity and level of services for economic and social development.

Many agencies apply geographic information systems and computer simulation in the field of reservoir safety monitoring. In reservoir information management, safety monitoring and flood forecasting, much useful research has been carried out and many application systems have been developed, such as the remote-sensing information system developed by the Canada Flood Emergency Management Agency, the flood assessment system developed by the Australian National University, the reservoir-monitoring information-processing system developed by Italy and France, the river information management system developed by the Tennessee River Authority (Mascolo et al., 2014; Tan et al., 2014), the

national water-resources information management system developed by the China Nanjing Water Conservancy Institute, the dam-safety information management system developed by the China National Dam Safety Supervision Centre and the reservoir information developed by the China Changjiang River Water Resources Commission (Wu et al., 2016; Liu and Li, 2006).

The above systems are aimed mainly at specific targets for single reservoirs or river basins. As such, there are some common problems because of a lack of unified planning and management, such as poor compatibility, low sharing ability (resulting in ‘information islands’), poor scalability, unsynchronised data acquisition and storage, and a lack of mass data-processing capabilities. With the development of modern information technologies such as the Internet of Things, cloud computing and mobile networks, the types and amounts of data are expanding at an alarming rate. Much concern is given to how to organise, store, analyse and apply such massive data (Meng and Ci, 2013). Applying big data technology to reservoir management, to analyse and forecast the flow of real-time monitoring data and to diagnose and predict reservoir health ahead of time, is the premise and foundation of intelligent reservoir management (Li et al., 2015a).

To summarise the characteristics of water-conservancy big data, this paper studies water conservancy driven by data and analyses the big data mining mode and its application in this field. We adopt the optimal-decision-tree method to study an early-warning system for reservoir health risks. The paper makes an initial exploration of the use of big data methods with water data to enhance the value and intelligent management of such data.

2 Data-driven and water-conservancy big data

2.1 Data-driven principle

The concept of ‘data driven’ originated from the field of computing. Its basic idea is to start from the data rather than depend on a model. It involves large amounts of online and offline data processing and analysis, mining the implied information from the data to support decision making and to realise the system’s functions of monitoring, diagnosis, decision, optimisation, etc. Because of the data-driven concept, it is not necessary to establish a global mathematical model of the controlled system, so the approach is especially suitable for nonlinear and uncertain process control. In recent years, the data-driven idea has been applied in various fields, such as engineering control, fault diagnosis and multi-information integration technology (Hasan et al., 2017; Kos et al., 2015; Lin and Chu, 2017).

Data-driven control refers to a controller that contains no mathematical model of the controlled process and is designed using only the online and offline data of the controlled system, together with data processing and knowledge. Data-driven control emphasises the dynamic feedback mechanism between the measured data and the controlled system. On the one hand, the data generated by the control system is ‘injected’ into the actual system that is running so that the state of the control system reflects the characteristics of the current actual system. On the other hand, the measurement points of the control system can be adjusted according to the results of the actual system so that the system itself can adapt to changes in the environment.

Since there is no fixed model, the data-driven approach does not need to build a model library. As a result, a data-driven controller is relatively simple, fast and robust. For offline data analysis and statistics, it can find trends in the data development quickly and easily and so is sensitive to the presence of abnormal data.

The data-driven approach can not only make effective use of the real-time information received and stored by the sensors in the application environment, but also work together with the model-driven method.

Figure 1 Functional block diagram of system simulation based on data driven



The intelligent big data management system for reservoirs based on the data-driven principle is shown in Figure 1. The management system and the reservoir’s actual operation system are connected to form a closed-loop system. The management system gathers in real time the data collected by the internet of things, concentrates the massive data of the reservoir operation, and analyses the different data according to the different decision goals to realise the health diagnosis and prediction of the reservoir. The management system issues commands to one or more control terminals based on the result of the big data analysis to dynamically control the actual system operating status. The control, scheduling, decision making, diagnosis and prediction of the whole system are based on massive online data analysis.

2.2 Water-conservancy big data

The long-term business practice of water-conservancy information has accumulated a large amount of heterogeneous independent business data. The development and application of modern information techniques such as remote sensing, geographic information systems, sensor networks and radio frequency technology has fully expanded the scale and element types of water-conservancy information, so that water-conservancy data now accumulate through a stable mechanism of continuous, incremental updating (Li et al., 2015b). Water-conservancy data covers the aspects of water and soil environment conservation, water-resource protection, flood control, water-conservancy project construction and management, maintenance, etc. Water-conservancy data has gradually come to show the multi-source, multi-dimensional, massive and polymorphic features of big data. Water-conservancy data is an important scientific basis for administrative decision-making of the water-conservancy industry, and represents an important resource for benefiting people and water conservancy.

Water-conservancy big data is a basis for engineering construction, management and identification of emergency events, and so has high application value, but water data is applied less efficiently than data in transportation, energy and communications. At present, the traditional storage and management of data in isolated fragments no longer satisfies the needs of management and application. How to manage, share and apply water-conservancy big data efficiently has become a primary problem and challenge in the development of water-conservancy informatisation.

The big data of water conservancy has the objective value of cognition, development and practice, and this value needs to be realised in the dynamic environment of comprehensive analysis and service of water-conservancy information. Integrating the internet, cloud computing and other advanced technology to manage the massive data efficiently makes it possible to find and make full use of the potential value of the data, thereby maximising that value.

With the extensive application of modern information technology such as internet of things, big data and cloud computing, data types and quantity are expanding at an alarming rate, and how to realise the organisation, storage, analysis and application of massive data of water resources is of keen concern. The application of big data technology in reservoir management can carry out analysis of real-time monitoring data flow and achieve real-time diagnosis and prediction of reservoir health, and is the premise and basis of intelligent reservoir management.

3 Key technology of water-conservancy big data

3.1 Intelligent reservoir-detection system

An intelligent reservoir-detection system requires a three-layer architecture of transparent perception, reliable transmission and intelligent processing. The reservoir is connected with the sensing nodes of the network, which can monitor the water level on the spot, as well as piezometric tube levels and evaporation or rainfall. The number of sensing nodes is determined based on the actual needs of the reservoir.

Each sensing node is composed of sensors, a network-sensing terminal, a network transmission terminal, a wireless transmission system, a power system and a bracket. The bracket is fixed on the dam, and the network system, the solar power system and other systems are fixed on the bracket. Many nodes constitute the perception layer of the network, and each node carries out the data transmission through the wireless network. A collecting node gathers the perceived information on the sensing network and passes it to the internet through the transmission terminal; the information is then transmitted over the internet to the intelligent big data management platform of the plain reservoir. The big data management platform carries out information cleaning, storage, processing and display. The processed information can be browsed via a personal computer, a laptop or a mobile phone.

3.2 Distributed storage and processing technology

The use of a relational database and a distributed file system can solve the problems of centralised storage of water-conservancy data and the unified management of both structured and unstructured data. The Hadoop open-source software framework can not only support large data-intensive distributed storage but also has strong batch data-processing and analysis capabilities. It is often used for storage and analysis of offline data, as a supplement to the relational database management system. A comparison between Hadoop and a traditional relational database is given in Table 1.

Hadoop divides the application into many small portions, each of which can be executed or re-executed on any node in the cluster. Hadoop provides an HDFS distributed file system for storing data from all computing nodes, thereby giving the entire cluster a very high data bandwidth and meaning that the entire framework can automatically handle node failures. The Hadoop architecture

uses Zookeeper to provide coordination management services within the cluster, uses the HBase column database to store and manage data, and uses Pig, Hive or Mahout for data analysis and mining.

Table 1 Hadoop compared with traditional relational database

Item | Traditional relational database | Hadoop
Data size | GB | TB or PB
Access | Interactive and batch | Batch
Update | Read and written many times | Written once, read many times
Structure | Static model | Dynamic model
Integrity | High | Low
Lateral spread | Nonlinear | Linear
Structured data sets | Structured data | Semi-structured and unstructured data

3.3 Big data analysis technology

Traditional data-analysis tools support only simple statistics, queries and data management, and cannot mine the potential value of information in detail. Big data technology can automatically extract implicit information from massive databases and obtain rules and models of practical value (Lusher et al., 2014). Big data analysis technology builds on traditional analysis methods and integrates techniques from several disciplines, including statistical analysis, pattern recognition and machine learning. Methods of this type include Bayesian networks, artificial neural networks and decision trees (Teng et al., 2015; Zhang and Zhong, 2013; Franco and Carrasco, 2012; Tawalbeh et al., 2016).

A Bayesian network is a graphical model that represents the probabilistic connections between variables in complex causal relationships; by reflecting the probabilistic relationships between data, it finds credible potential dependencies (Pearl, 1991). A Bayesian network is suitable for analysing incomplete data and can deduce dependencies from incomplete or inaccurate data.

The probability function of a Bayesian network node is the conditional probability distribution function, denoted by $P(A_i \mid \pi(A_i))$, where $A_i$ ($i = 1, 2, \ldots, n$) denotes the $i$-th node and $\pi(A_i)$ denotes the parents of the $i$-th node. Assuming that the nodes are conditionally independent given their parents, the joint probability distribution function is as follows:

$$P(A_1, A_2, \ldots, A_n) = \prod_{i=1}^{n} P(A_i \mid \pi(A_i)) \tag{1}$$

By the Bayesian chaining rule, the joint probability distribution can be written as:

$$P(A_1, A_2, \ldots, A_n) = \prod_{i=1}^{n} P(A_i \mid A_1, A_2, \ldots, A_{i-1}) \tag{2}$$

When the prior probabilities of the parent nodes and the conditional probability distributions of the child nodes are known, the joint probability distribution of all nodes can be calculated according to formula (2).

From the point of view of information processing, an artificial neural network is a simplified, abstract simulation of a complex network composed of interconnected neurons, and different neural networks are constructed according to different connection methods (Xu et al., 2015).

An artificial neural network has the advantages of good adaptability and anti-jamming, and can be used to find the hidden values of massive data, especially for vector and discrete data analysis.

In the artificial neural network, the input vector is $X = (x_1, x_2, \ldots, x_n)$, the output vector is $Y = (y_1, y_2, \ldots, y_m)$, and the connection weight from each input to the corresponding neuron is $\omega_{ij}$ ($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m$). If the threshold of each neuron is $\theta_j$ ($j = 1, 2, \ldots, m$), the output $y_j$ of each neuron is:

$$y_j = f\left(\sum_{i=1}^{n} \omega_{ij} x_i - \theta_j\right), \quad j = 1, 2, \ldots, m \tag{3}$$

The connection weight matrix $W$, which is made up of all the connection weights $\omega_{ij}$, is as follows:

$$W = \begin{pmatrix} \omega_{11} & \omega_{12} & \cdots & \omega_{1m} \\ \omega_{21} & \omega_{22} & \cdots & \omega_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ \omega_{n1} & \omega_{n2} & \cdots & \omega_{nm} \end{pmatrix} \tag{4}$$

A decision-tree algorithm compiles statistics on the attributes of massive, disordered data samples and constructs a decision tree to find the valuable information contained in big data, providing policy makers with a basis for judgment and prediction. It has no fixed functional form and requires no prior assumption about the distribution of the sample data.

Let $S$ be a collection of $n$ data samples; the decision attribute has $m$ different values, so we define $m$ different categories $P_i$ ($i = 1, 2, \ldots, m$). Let $n_i$ be the number of samples in the category $P_i$; the mathematical expectation of its information quantity is the information entropy:

$$I(p_1, p_2, \ldots, p_m) = -\sum_{i=1}^{m} \frac{n_i}{n} \log_2\left(\frac{n_i}{n}\right) \tag{5}$$

Let the condition attribute $A$ take the values $\{a_1, a_2, \ldots, a_k\}$; $A$ then partitions $S$ into $k$ subsets $\{C_1, C_2, \ldots, C_k\}$. If $A$ is the test attribute, each branch from the node corresponds to one of these subsets. Let $n_{ij}$ be the number of samples of category $P_i$ in the subset $C_j$; the entropy of the partition into subsets by $A$ is:

$$E(A) = \sum_{j=1}^{k} \frac{n_{1j} + n_{2j} + \cdots + n_{mj}}{n}\, I(p_{1j}, p_{2j}, \ldots, p_{mj}) \tag{6}$$

$$I(p_{1j}, p_{2j}, \ldots, p_{mj}) = -\sum_{i=1}^{m} \frac{n_{ij}}{n_j} \log_2\left(\frac{n_{ij}}{n_j}\right) \tag{7}$$

Here, $(n_{1j} + n_{2j} + \cdots + n_{mj})/n$ is the weight of the $j$-th subset, and $I(p_{1j}, p_{2j}, \ldots, p_{mj})$ is the expected information for subset $C_j$.

The information gain is obtained from the information entropy and the mathematical expectation, as in equation (8):

$$\mathrm{Gain}(A) = I(p_1, p_2, \ldots, p_m) - E(A) \tag{8}$$

The information gain rate is calculated as:

$$\mathrm{GainRatio}(A) = \mathrm{Gain}(A) / \mathrm{SplitI}(A) \tag{9}$$

$$\mathrm{SplitI}(A) = -\sum_{i=1}^{m} \frac{n_{ij}}{n_j} \log_2\left(\frac{p_{ij}}{n_j}\right) \tag{10}$$

Decision-tree growth is based on the information gain ratio: the best grouping variable and split point are found, and the branch structure is constructed from top to bottom. Nodes represent attribute tests, branches represent the outcomes of the tests, and leaf nodes represent the category distribution.

Decision-tree pruning proceeds layer by layer from the leaf nodes towards the top. Suppose node $i$ contains $N_i$ observations, of which $E_i$ are predicted incorrectly, so that the error rate is $f_i = E_i / N_i$. For the true error $e_i$ of node $i$, at a confidence level of $1 - \alpha$:

$$P\left(\frac{f_i - e_i}{\sqrt{f_i (1 - f_i) / N_i}} < z_{\alpha/2}\right) = 1 - \alpha \tag{11}$$

The upper bound of the confidence interval for the true error $e_i$ of node $i$ is:

$$e_i = f_i + z_{\alpha/2} \sqrt{\frac{f_i (1 - f_i)}{N_i}} \tag{12}$$

When the estimated error of the leaf nodes is greater than the error of their parent node, the branch is pruned. The branch structure is constructed from top to bottom: nodes represent attribute tests, branches represent test output and leaf nodes represent class distribution. A decision tree is pruned layer by layer from the leaf nodes to the top, as shown in Figure 2.

Many factors influence reservoir health, and the massive volume, randomness, uncertainty and ambiguity of the data make it difficult for health diagnosis and prediction to keep pace with the efficient operation of the reservoir (Yue et al., 2016; Li et al., 2007; Gao et al., 2005; Ren and Gao, 2014; Xie et al., 2014; Chen et al., 2015). Traditional data-analysis methods cannot explore the potentially valuable information buried deep in the massive data, so an efficient and practical data mining method for the big data environment needs to be studied.

3.4 Intelligent management

Intelligent hydraulics is the advanced stage of hydraulic informatisation. Through intelligent equipment, hydraulic engineering information is sensed three-dimensionally in all directions, and the perception, transmission, storage, processing and analysis of massive data are achieved, so that all aspects of hydraulic engineering are managed in a more sophisticated and dynamic way.

The internet of things can perceive reservoir water level, piezometric level, evaporation, rainfall and other information, which is then input through wireless transmission to the intelligent reservoir management system for data management and mining. Using fuzzy and hierarchy analyses, according to the health evaluation indicator system of the reservoir, we can conduct an accurate diagnosis of the reservoir health. Based on the C5.0 decision-tree technology, which is suitable for big data analysis, a reservoir health model is established according to the relevant information of the reservoir. The early warning rules are obtained, and the health status of the reservoir is forecast.

Figure 2 Classified forecasting decision tree
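
The quantities in equations (5) to (12) can be illustrated with a minimal Python sketch. The function names, the toy data and the default `alpha = 0.25` are assumptions made for illustration and are not taken from the paper. The split information is computed in the standard C4.5 form over subset sizes, which is one possible reading of equation (10).

```python
import math
from collections import Counter
from statistics import NormalDist


def entropy(labels):
    """Information entropy of a list of class labels, as in equation (5)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_and_ratio(values, labels):
    """Return (Gain(A), GainRatio(A)) for one attribute, equations (6)-(9).

    values: attribute value of each sample (defines the subsets C_j)
    labels: class label of each sample
    """
    n = len(labels)
    subsets = {}
    for v, y in zip(values, labels):
        subsets.setdefault(v, []).append(y)
    # E(A): subset entropies weighted by subset size, equation (6)
    cond = sum(len(s) / n * entropy(s) for s in subsets.values())
    gain = entropy(labels) - cond  # equation (8)
    # Split information over subset sizes (assumed standard C4.5 form)
    split = -sum(len(s) / n * math.log2(len(s) / n) for s in subsets.values())
    ratio = gain / split if split > 0 else 0.0  # equation (9)
    return gain, ratio


def error_upper_bound(errors, n_obs, alpha=0.25):
    """Upper confidence bound on the true node error, equation (12)."""
    f = errors / n_obs
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    return f + z * math.sqrt(f * (1 - f) / n_obs)


if __name__ == "__main__":
    # Toy data: the attribute separates the two classes perfectly.
    values = ["low", "low", "high", "high"]
    labels = ["risk", "risk", "healthy", "healthy"]
    print(gain_and_ratio(values, labels))
    print(error_upper_bound(errors=2, n_obs=10))
```

In a pruning pass, `error_upper_bound` would be evaluated for a parent node and for its leaves, and the subtree collapsed when the leaves' combined estimate is no better than the parent's.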



The integrated reservoir data warehouse, which contains hydrometeorological data, water and work conditions, scheduling operations and other types of data, is connected seamlessly to the internet of things data monitoring platform. Big data mining technology is used to summarise the opening and closing time, flow and water delivery capacity of the output and input pump stations, and then to formulate the opening and closing time, flow and water delivery capacity of the output and input pump stations while guaranteeing water supply and reservoir safety, as the basis for guiding reservoir dispatching. The decision tree is used to analyse the water, rain, work condition data and scheduling operation data of the plain reservoir over the years. We summarised a set of scheduling rules based on the experience of scientific reservoir management, and then combined the real-time information with the scheduling rules to generate the optimal scheduling scheme to ensure efficient and safe operation.

Figure 3 Decision-tree structure for health early-warning model of Dingdong reservoir



4 Engineering application

A reservoir is a channel for communication between people and nature (Costanza et al., 1999; Xu et al., 2005). Reservoir health warning is a scientific management mode. Reservoir health warning not only has important application value, but has also become a core topic and hot spot of current comprehensive evaluation of the ecological environment.

According to the connotation of a healthy reservoir (Chu et al., 2014; Zhang et al., 2012), evaluating the health of a reservoir involves four elements, namely dam safety, ecological environment, social function and state sustainability. The principles behind establishing an easily quantifiable health-warning index system for a reservoir are scientific soundness, systematicness, gradation and operability. The health of a reservoir is affected by many factors, and traditional data-analysis methods cannot deeply mine potential valuable information in such vast amounts of data. Instead, an efficient and practical data-analysis method for a big data environment should be used.

The Dingdong reservoir began filling in September 1997. Since then, a large amount of observational data has been accumulated through surveys and statistical analysis of historical data. The training data come from between 1998 and 2015. Once compiled, they make up the sample set, which is then entered into the comprehensive reservoir database. Factors affecting the health of the reservoir include indicators such as reservoir safety, ecological environment, social function and sustainability status. From long-term observational data regarding Dingdong reservoir, the main indicators affecting reservoir health include the piezometric-level index C31, the water-quality safety index C7, the underground water level index C10 and the sediment-change index C14.

We use the C5.0 decision-tree method for the warning model. Taking into account the characteristics of reservoir health-warning data, we choose a binary tree structure to analyse the health condition. The data of 1998–2005 are used as the training data to build the model, and the data of 2011–2015 are used as the test data to test the model. According to the principle of maximum information gain, the piezometric level index C31 is taken as the test attribute of the root node; node attributes are then chosen and branches created from top to bottom. The warning decision tree and warning rules of Dingdong reservoir after pruning are given in Table 2 and shown in Figure 3. The system of health early-warning rules is given in Table 3. Using the testing dataset to test the trained model, a good accuracy rate of 91.92% is achieved.

According to the forecast index values for the coming month, the decision-tree model predicts that the state of Dingdong reservoir is ‘healthy’, the alarm state is ‘no warning’ and the early-warning signal is ‘green’ for that month.

Table 2 Calculation results for conditional attributes of plain reservoir health

Serial no. | Conditions | Conditional entropy | Information gain | Information entropy | Information gain rate
1 | Piezometric level index C31 <= 42.5 | 0.845052 | 0.472189 | 0.472189 | 1.000000
2 | Water-quality safety index C7 >= 37.2 | 1.121332 | 0.195909 | 0.195909 | 1.000000
3 | Sediment-change index C14 <= 75.9 | 0.796942 | 0.520300 | 0.872475 | 0.596349
4 | Underground water level index C10 <= 59.4 | 0.902033 | 0.415208 | 0.987526 | 0.420453
5 | Piezometric level index C31 <= 74.5 | 0.941694 | 0.375548 | 0.999338 | 0.375797
6 | Sediment-change index C14 <= 77.6 | 0.869944 | 0.447298 | 0.782039 | 0.571964
7 | Sediment-change index C14 <= 77.2 | 0.905997 | 0.411244 | 0.815256 | 0.504436
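
As a consistency check on Table 2, the short Python sketch below recomputes the information gain rate from the tabulated columns. It assumes that the column headed ‘Information entropy’ plays the role of SplitI(A) in equation (9); under that reading, the information gain divided by that column reproduces the tabulated gain rate in every row. It also shows that conditional entropy plus information gain gives the same total entropy (about 1.31724) in every row, as equation (8) implies. The variable and function names are illustrative only.

```python
# Rows of Table 2: (conditional entropy, information gain,
#                   'information entropy' column, information gain rate)
TABLE_2 = [
    (0.845052, 0.472189, 0.472189, 1.000000),
    (1.121332, 0.195909, 0.195909, 1.000000),
    (0.796942, 0.520300, 0.872475, 0.596349),
    (0.902033, 0.415208, 0.987526, 0.420453),
    (0.941694, 0.375548, 0.999338, 0.375797),
    (0.869944, 0.447298, 0.782039, 0.571964),
    (0.905997, 0.411244, 0.815256, 0.504436),
]


def ratio_matches(gain, split_info, ratio, tol=1e-5):
    """True if gain / split_info equals the tabulated ratio within tol."""
    return abs(gain / split_info - ratio) < tol


def total_entropies(rows):
    """Conditional entropy + gain for each row; equation (8) rearranged."""
    return [cond + gain for cond, gain, _, _ in rows]


if __name__ == "__main__":
    for serial, (cond, gain, split, ratio) in enumerate(TABLE_2, start=1):
        print(serial, ratio_matches(gain, split, ratio), round(cond + gain, 5))
```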

Table 3 Early-warning rules regarding plain reservoir health

Serial no. | Conditions | Health condition | Warning level | Signal
1 | Piezometric level index C31 > 74.5 and underground water level index C10 > 59.4 and sediment-change index C14 > 77.6 | Healthy | No | Green
2 | Piezometric level index C31 > 42.5 and sediment-change index C14 <= 75.9 | Sub-health | Light | Yellow
3 | 42.5 < piezometric level index C31 <= 74.5 | Sub-health | Light | Yellow
4 | Piezometric level index C31 > 42.5 and underground water level index C10 <= 59.4 | Sub-health | Light | Yellow
5 | Piezometric level index C31 <= 42.5 and water-quality safety index C7 > 37.2 | Lesion | Heavy | Orange
6 | Piezometric level index C31 <= 42.5 and water-quality safety index C7 <= 37.2 | Risk | Severe | Red
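
The rules of Table 3 can be written as a small classifier. The following Python sketch is an illustration only: the `Status` type and the `classify` function are hypothetical names, and the thresholds are copied from Table 3. Index combinations that none of the six rules covers (for example C31 > 74.5, C10 > 59.4 and 75.9 < C14 <= 77.6) return `None` rather than a guessed status.

```python
from typing import NamedTuple, Optional


class Status(NamedTuple):
    health: str
    warning_level: str
    signal: str


HEALTHY = Status("Healthy", "No", "Green")
SUB_HEALTH = Status("Sub-health", "Light", "Yellow")
LESION = Status("Lesion", "Heavy", "Orange")
RISK = Status("Risk", "Severe", "Red")


def classify(c31: float, c7: float, c10: float, c14: float) -> Optional[Status]:
    """Apply the early-warning rules of Table 3 to one set of index values.

    c31: piezometric level index, c7: water-quality safety index,
    c10: underground water level index, c14: sediment-change index.
    """
    if c31 <= 42.5:
        # Rules 5 and 6: split on the water-quality safety index
        return LESION if c7 > 37.2 else RISK
    # From here on c31 > 42.5
    if c31 <= 74.5 or c14 <= 75.9 or c10 <= 59.4:
        # Rules 3, 2 and 4 all lead to the same status
        return SUB_HEALTH
    # From here on c31 > 74.5, c10 > 59.4 and c14 > 75.9
    if c14 > 77.6:
        return HEALTHY  # Rule 1
    return None  # Not covered by Table 3


if __name__ == "__main__":
    print(classify(c31=80.0, c7=50.0, c10=70.0, c14=80.0))
```

Ordering the checks from the root attribute C31 downwards mirrors the top-down structure of the decision tree, so each later test can rely on the conditions already established above it.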

5 Conclusions and recommendations

Big data technology promotes the rapid development of water-conservancy data acquisition, management and application. Extracting value from huge amounts of data and using that data to inform decisions represents a new approach to scientific decision-making. The big data approach starts from the point of view of data applications, using and reusing big data resources efficiently. With the increasing influence of big data technology on management decision-making, situations in which decisions have until now depended on intuition will change completely.

Data-driven data mining and value enhancement provide a more rapid and effective basis for interpretation in the scientific management of water-conservancy projects. Enhancing the value of water-conservancy big data with a decision-tree algorithm captures the regularities in the massive volume of historical water-conservancy data. The resulting rules are conceptually clear and can be used conveniently in operation, which helps to solve current problems in reservoir operational management.

Based on the internet of things intelligent monitoring system, the key data affecting the healthy operation of reservoirs are obtained through dynamic real-time access. Dynamic data-driven big data analysis then improves the speed of response and decision-making on reservoir health, as well as the accuracy and timeliness of optimal scheduling.

The discovery of big data value cannot be separated from the value orientation and application direction set by manual identification. Manual identification is the key to making big data work. An important task in the application of water resources data is to make the massive data applicable to more fields.

Acknowledgements

This work is supported by the National Key Technology Research and Development Program of the Ministry of Science and Technology of China (2015BAB07B05), the National Natural Science Foundation of China (No. 51574156), the Water Conservancy Scientific and Technical Program of Shandong Province (SDSLKY201305), the Key Research and Development Project of Shandong Province (No. 2018GNC110023) and the Significant Application of Agriculture Technology Innovation Program of Shandong Province (SDNYCX1531963).

References

Buxton, B., Goldston, D., Doctorow, C. and Waldrop, M. (2008) ‘Big data: science in the petabyte era’, Nature, Vol. 455, No. 7209, pp.121–136.
Chen, J.X., Fang, G.H., Jiang, R.F., Huang, X.F. and Chen, Y. (2015) ‘Evaluation on the health of river ecosystem based on the cascade development of reservoirs’, Advanced Materials Research, Vol. 36, No. 5, pp.2956–2963.
Chu, K., Kan, L., Hua, Z. and Liu, X. (2014) ‘Construction and application of an indicator system for assessment of river ecosystem in plain tributary networks’, Journal of Hydroelectric Engineering, Vol. 33, No. 5, pp.138–144.
Costanza, R., D’Arge, R., Groot, R.D., Farber, S., Grasso, M. and Hannon, B. (1999) ‘The value of the world’s ecosystem services and natural capital’, World Environment, Vol. 387, No. 1, pp.3–15.
Franco, A.A. and Carrasco, O.J.A. (2012) ‘Building fast decision trees from large training sets’, Intelligent Data Analysis, Vol. 16, No. 4, pp.649–664.
Gao, Y.S., Wang, H. and Wang, F. (2005) ‘The establishment of reservoir health connotation and evaluation index system’, Water Conservancy Development Research, Vol. 5, No. 9, pp.1–6.
Hasan, K.S., Antonio, J.K. and Radhakrishnan, S. (2017) ‘A model-driven approach for predicting and analysing the execution efficiency of multi-core processing’, International Journal of Computational Science & Engineering, Vol. 14, No. 2, pp.105–112.
Hu, X.X., Yang, X.H., Li, J.Q. and Geng, L.H. (2008) ‘Set pair analysis model for river health system assessment’, Systems Engineering-Theory & Practice, Vol. 28, No. 5, pp.164–170 + 176.
Jorgensen, S.E. (1995) ‘Energy and ecological buffer capacities as measures of ecosystem health’, Ecosystem Health, Vol. 1, No. 3, pp.150–160.
Kos, A., Sedlar, U., Volk, M., Peternel, K., Guna, J. and Kovačić, A. (2015) ‘Real-time health visualisation and actuation platform’, International Journal of Embedded Systems, Vol. 7, No. 2, pp.104–112.
Li, H., Bao, Y.Q., Li, S.L. and Zhang, D.Y. (2015a) ‘Data science and engineering for structural health monitoring’, Engineering Mechanics, Vol. 32, No. 8, pp.1–7.
Li, C.M., Zeng, Y., Wang, H., Zhang, L., University, H. and Center, W.I. (2015b) ‘Construction conceptions and key technologies about water resources informatization during 13th five-year’, Water Resources Informatization, Vol. 124, No. 1, pp.9–13.
Li, J.B., Dong, Z.C., Wang, H.C., Sun, Z.F. and Wang, X. (2007) ‘Discussion on healthy operation of reservoir and health of river’, Water Resources & Hydropower Engineering, Vol. 38, No. 9, pp.12–15.
Liao, Z., Yin, Q., Huang, Y. and Sheng, L. (2015) ‘Management and application of mobile big data’, International Journal of Embedded Systems, Vol. 17, No. 1, pp.63–71.
Lin, W.T. and Chu, C.P. (2017) ‘A fast and parallel algorithm for frequent pattern mining from big data in many-task environments’, International Journal of High Performance Computing & Networking, Vol. 10, No. 3, pp.157–183.
Liu, D.Z. and Li, J.J. (2006) ‘Design and implementation of earth rock fill dam security monitoring software system’, Journal of Dalian University of Technology, Vol. 46, No. 3, pp.407–412.
Lusher, S.J., Mcguire, R., van Schaik, R.C., Nicholson, C.D. and De, V.J. (2014) ‘Data-driven medicinal chemistry in the era of big data’, Drug Discovery Today, Vol. 19, No. 7, pp.859–868.
Mascolo, L., Nico, G., Pasquale, A.D. and Pitullo, A. (2014) ‘Use of advanced SAR monitoring techniques for the assessment of the behaviour of old embankment dams’, SPIE Remote Sensing. International Society for Optics and Photonics, pp.156–170, 92450N-92450N-10.

Meng, X.F. and Ci, X. (2013) ‘Big data management: concepts, techniques and challenges’, Journal of Computer Research and Development, Vol. 50, No. 1, pp.146–169.
Pearl, J. (1991) ‘Probabilistic reasoning in intelligent systems: networks of plausible inference’, Computer Science Artificial Intelligence, Vol. 70, No. 2, pp.1022–1027.
Rapport, D.J. (1989) ‘What constitutes ecosystem health?’, Perspectives in Biology & Medicine, Vol. 33, No. 1, pp.120–132.
Ren, Z.G. and Gao, Y.S. (2014) ‘A preliminary study on the comprehensive evaluation of reservoir health’, Haihe Water Resources, Vol. 17, No. 2, pp.56–60.
Tan, J., Chen, S., Weng, Y. and Yang, G. (2014) ‘Study on dam safety management and emergency response system’, Yangtze River, Vol. 27, No. 4, pp.102–106.
Tawalbeh, L., Haddad, Y., Khamis, O., Benkhelifa, E., Jararweh, Y. and Aldosari, F. (2016) ‘Efficient and secure software-defined mobile cloud computing infrastructure’, International Journal of High Performance Computing & Networking, Vol. 9, No. 4, pp.328–341.
Teng, F., Yang, H., Li, T. and Fan, X. (2015) ‘MUS: a novel deadline-constrained scheduling algorithm for hadoop’, International Journal of Computational Science & Engineering, Vol. 11, No. 4, pp.360–367.
Wu, J., Yi, M.Y., Zhang, J.H. and Zhang, Q.L. (2016) ‘Design and application of an information management system for structural behaviour monitoring based on big data technology’, Journal of Hunan University, Vol. 43, No. 9, pp.76–81.
Xiao, J.F. and Liang, H. (2005) ‘Ecological concept in reservoir development’, Water Resources and Hydropower Engineering, Vol. 35, No. 11, pp.8–12.
Xie, F., Gu, J.G. and Lin, Z.W. (2014) ‘Assessment of aquatic ecosystem health based on principal component analysis with entropy weight: a case study of Waning reservoir’, Chinese Journal of Applied Ecology, Vol. 25, No. 6, pp.1773–1779.
Xu, F.L., Zhao, Z.Y., Zhan, W., Zhao, S.S., Dawson, R.W. and Tao, S. (2005) ‘An ecosystem health index methodology (EHIM) for lake ecosystem health assessment’, Ecological Modelling, pp.327–339.
Xu, W., Dong, Z., Fu, X., Tan, J., Liu, Q. and Du, F. (2015) ‘Early warning of river ecosystem health based on BP artificial neural networks’, Journal of Hohai University, Vol. 43, No. 1, pp.54–59.
Yue, Q., Liu, F.S. and Liu, Z.Q. (2016) ‘Comprehensive assessment of plain reservoir health based on fuzzy and hierarchy analyses’, Hydro-Science and Engineering, Vol. 16, No. 2, pp.62–68.
Zeng, G., Binliang, K.E., Chen, G. and Lin, H.U. (2015) ‘Application of early warning coupled model of river health evaluation in Dongtiaoxi river basin’, Journal of Water Resources & Water Engineering, Vol. 26, No. 1, pp.77–81.
Zhang, H.Y., Cai, Q.H., Kong, L.H. and Wang, L. (2012) ‘Comprehensive assessment of Danjiangkou reservoir ecosystem health’, Chinese Journal of Applied Ecology, Vol. 18, No. 1, pp.86–92.
Zhang, W.T. and Zhong, Y.F. (2013) The Real Case Essence of IBM SPSS Data Analysis and Mining, pp.210–225, Tsinghua University Press, Beijing.
