42 (2008) 296– 306

Available at

journal homepage:

Improvement of remote monitoring on water quality in
a subtropical reservoir by incorporating grammatical
evolution with parallel genetic algorithms into
satellite imagery
Li Chena,, Chih-Hung Tanb, Shuh-Ji Kaoc, Tai-Sheng Wanga

Department of Civil Engineering and Engineering Informatics, Chung Hua University, Hsinchu, Taiwan 30067, ROC
Information Division, Agricultural Engineering Research Center, Taoyuan, Taiwan 32061, ROC
Research Center for Environmental Change, Academia Sinica, Taipei, Taiwan 11529, ROC

ar t ic l e i n f o

abs tra ct

Article history:

Parallel GEGA was constructed by incorporating grammatical evolution (GE) into the

Received 22 April 2007

parallel genetic algorithm (GA) to improve reservoir water quality monitoring based on

Received in revised form

remote sensing images. A cruise was conducted to ground-truth chlorophyll-a (Chl-a)

12 July 2007

concentration longitudinally along the Feitsui Reservoir, the primary water supply for

Accepted 12 July 2007

Taipei City in Taiwan. Empirical functions with multiple spectral parameters from the

Available online 20 July 2007

Landsat 7 Enhanced Thematic Mapper (ETM+) data were constructed. The GE, an

Grammatical evolution
Parallel genetic algorithm
Water quality monitoring
Remote-sensed imagery

evolutionary automatic programming type system, automatically discovers complex
nonlinear mathematical relationships among observed Chl-a concentrations and remotesensed imageries. A GA was used afterward with GE to optimize the appropriate function
type. Various parallel subpopulations were processed to enhance search efficiency during
the optimization procedure with GA. Compared with a traditional linear multiple
regression (LMR), the performance of parallel GEGA was found to be better than that of
the traditional LMR model with lower estimating errors.
& 2007 Elsevier Ltd. All rights reserved.



Chlorophyll-a (Chl-a) is used as an indication of the intensity
of algae growth and is one of the major factors affecting
water quality, which can produce visible changes in the
surface of waters (Ritchie et al., 1990). Observed Chl-a
concentrations can help to characterize the trophic state of
an aquatic ecosystem through various numerical schemes
(Carlson, 1977). Its concentrations in surface waters have
traditionally been measured spectrophotometrically after
samples were collected, preserved and transported to the
laboratory (APHA, 1992). Although this approach is well
accepted and widely utilized, the labor-intensive and
time-consuming field works do not allow researchers to 
Corresponding author. Tel.: +886 03 518 6718.

E-mail address: (L. Chen).
0043-1354/$ - see front matter & 2007 Elsevier Ltd. All rights reserved.

construct contemporaneous Chl-a maps at full spatial
scale (Allee and Johnson, 1999). Accordingly, characterizing
the spatial variation of the trophic state within a vast
water body by using a limited number of field data is often
To improve the traditional data collection method, utilization of remote sensing data for water quality assessment has
been investigated. The application of remote sensing to
assess freshwater production has escalated recently due to
its capability of scanning wide water bodies within a short
time period (Harrington et al., 1992). Water quality assessment using satellite data has been carried out since the first
remote sensing satellite Landsat Multi-Spectral Scanner
(MSS) became operational (Thiemann and Kaufmann, 2000).

0} S ¼ /exprS And P can be represented as (1) /exprS:: (2) /opS:: (3) /pre-opS:: (4) /varS:: 2. 2. Backus–Naur form BNF is a notation for expressing the grammar of a language in the form of production rules (Naur. which are items that can appear in the language. its quantitative use is still a difficult task (Dekker et al. BNF grammars consist of so-called terminals. pre_op} T ¼ {Sin. It has an advantage over traditional statistical methods because it is distribution free. Cos. Statistical applications are used traditionally to establish algorithms for predictions of various water quality variables. A mapping process is employed to generate programs in any language by using the binary strings to select production rules (O’Neill and Ryan. Log.. P. Below is an example BNF. and S a start symbol that is a member of N. ¼ /exprS/opS/exprSyyyyyrule 0 |(/exprS/opS/exprS)yyyyyrule 1 |/pre-opS (/exprS)yyyyyrule 2 |/varSyyyyyrule 3 ¼ +yyyyyrule 0 |yyyyyrule 1 |/yyyyyrule 2 |*yyyyyrule 3 ¼ Sin yyyyyrule 0 |Cos yyyyyrule 1 |Log yyyyyrule 2 ¼ X yyyyyrule 0 |1. However. sophisticated regression models must go through time-consuming trial and error procedures so that the correct regression type can be obtained. 2. Constant 1. no prior knowledge is needed about the statistical distribution of the data like the back-propagation network (BPN) (Kishore et al. *.. Then. which can be expanded into one or more terminals and nonterminals.. . which is used to optimize the appropriate function type generated by GE. it is probably the most complicated one to map correctly. Because of the complex nonlinear relationship between several bands of satellite and Chl-a concentration in a reservoir.. at least if a universal type of relation is desired (Zhang et al. This paper is intended to improve the monitoring techniques of using remote-sensed data to estimate water quality parameters. Two models including parallel GEGA and traditional linear multi-regression (LMR) were analyzed and compared in the case study of the Feitsui 297 42 (2008) 296 – 306 Reservoir.ARTICLE IN PRESS WAT E R R E S E A R C H Although the Landsat Thematic Mapper (TM) sensor. nonlinear transfer functions are often found when one relates water quality variables to the satellite imagery observations (Zhang et al. is able to present a synoptic monitoring of water quality problems.g.0 yyyyyrule 1 Mapping process The genotype is used to map the start symbol onto terminals by reading codons of 8 bits to generate a corresponding integer value from which an appropriate production rule is . and nonterminals. where N ¼ {expr. a new system identified method called parallel GEGA is presented for the first time. For example. On the other hand. The details of this procedure are described as follows. P a set of production rules that maps the elements of N to T. etc. the choices are delimited with the ‘‘|’’ symbol. in which N is a set of nonterminals. 1999).e. The integer values are used in a mapping function to select an appropriate production rule from the BNF definition. The last section presents a discussion about the findings of this work. 2002). i. /. which is the major water supply source in northern Taiwan. 2000).2. Evolutionary algorithms. we use the parallel structure to GEGA to increase the diversity of solutions by GA for obtaining the optimal equation between one remote-sensed imagery and water quality parameter efficiently. it is well known that the BPN is considered as a nonlinear black-box model. as well as future research directions. +. In the next section. T. 2003). the newly developed grammatical evolution (GE) performs the evolutionary process on a variable-length binary string that is more flexible than a dynamic tree structure.. Chl-a is one of the most important variables of great ecological significance examined by many researchers in this discipline. T a set of terminals. 2001). the numbers generated always representing one of the rules that can be used at that time (Elseth and Baumgardner. Chen (2003) pointed out that constructing the data structure of a dynamic tree of the GP could be a difficult task while applying GP to estimate the reservoir trophic state using remote-sensed data. In addition. After that. yet.. with each codon representing an integer value where codons are consecutive groups of 8 bits. 2003). A grammar can be represented by the tuple {N. . When there are a number of productions that can be applied to one particular N. Grammatical evolution GE is an evolutionary automatic programming type system that combines a variable-length binary string genome and a BNF (Backus–Naur form) grammar to evolve interesting structures. +. e. Nevertheless. 1995). we begin by introducing the GE algorithm and a discussion of the advantages of a real-coded representation. such as genetic programming (GP). and it is not unusual for it to be criticized as not enhancing our understanding of the physical mechanisms because of its complex weighting coefficients and numerous other parameters. S}. Variable-length binary string genomes are used. Variable X. this realcoded data structure of GE is convenient to incorporate with the developed real-coded genetic algorithm (GA). have been used with much success for the automatic generation of programs or equations between the inputs and outputs.. op. Generally.1. Regression analysis is popular in the formulation of predictive models (Allee and Johnson. 1963). which provides the longest continuous dataset of high-spatialresolution imagery of the Earth. it is hard to choose the proper size of a tree that can express a meaningful equation in advance.

i. we wrap the individual and reuse the codons. concentrating on the start symbol /exprS. (5) The mapping continues until eventually we are left with the following expression: sinðXÞ  cosðXÞ þ 1:0. The leftmost /exprS will now be replaced with /exprS/opS/exprS to give hexprihopihexprihopihexpri. the system traverses the genome. selected by using the following mapping function: Rule ¼ ðcodon integer valueÞ MOD ðnumber of rules for the current nonterminalÞ ð1Þ Considering the following rule. as opposed to binary-coded GA with 0–1 vector chromosomes.. which has been observed in many organisms (Elseth and Baumgardner. This technique of wrapping the individual draws inspiration from the gene-overlapping phenomenon. rule (0). 1 2 3 4 5 6 7 8 9 10 11 12 13 14 8-bit binary codon Integer value Mapping function BNF grammars 11001000 10100000 11001110 01100000 00011011 01001000 01101011 00111110 00010110 00110111 01011000 01100100 11001011 00101001 200 160 206 96 27 72 107 62 22 55 88 100 203 41 200 MOD 4 ¼ 0 160 MOD 4 ¼ 0 206 MOD 4 ¼ 2 96 MOD 3 ¼ 0 27 MOD 4 ¼ 3 72 MOD 2 ¼ 0 107 MOD 4 ¼ 3 62 MOD 4 ¼ 2 22 MOD 3 ¼ 1 55 MOD 4 ¼ 3 88 MOD 2 ¼ 0 100 MOD 4 ¼ 0 203 MOD 4 ¼ 3 41 MOD 2 ¼ 1 /exprS/opS/exprS /exprS/opS/exprS/opS/exprS /pre-opS(/exprS)/opS/exprS/opS/exprS sin(/exprS)/opS/exprS/opS/exprS sin(/varS)/opS/exprS/opS/exprS sin(X)/opS/exprS/opS/exprS sin(X)*/exprS/opS/exprS sin(X)*/pre-opS(/exprS)/opS/exprS sin(X)*cos(/exprS)/opS/exprS sin(X)*cos(/varS)/opS/exprS sin(X)*cos(X)/opS/exprS sin(X)*cos(X)+/exprS sin(X)*cos(X)+/varS sin(X)*cos(X)+1. It is very Table 1 – Example of each codon converted into corresponding BNF grammar No. in this case. (2) Continuing with the first /exprS. consider the individual in Table 1. Thus. To make this choice. We have the following: sinðhexpriÞhopihexprihopihexpri . It is possible for individuals to run out of codons and. If we assume that the codon being read produces the integer 6.. we can see that there are four productions to choose from. we have 200 MOD 4 ¼ 0. we revised it as a real-coded representation. The standard decode of the binary 11001000 is 1  27+1  26+0  25+0  24+1  23+0  22+0  21+0  20. we have the same choice for the first /exprS by reading the next codon value 206. there are four production rules to select from: /opS:: (3) ¼ +yyyyyrule 0 |yyyyyrule 1 |/yyyyyrule 2 |*yyyyyrule 3 hpreopiðhexpriÞhopihexprihopihexpri. In this way. 1995). The real numbers imply that each chromosome is a real-valued vector. For example. Each time a production rule has to be selected to map from a nonterminal. meaning we must take the zeroth production. i.e. another codon is read. given the nonterminal op. the individual in question is given the lowest possible fitness value. This number will then be used to decide which production rule to use according to Eq. then (4) 6 MOD 4 ¼ 2 would select /opS as rule 2: /. It is possible that an incomplete mapping could occur even after several wrapping events and. where /pre-opS becomes sin.. so that /exprS is now replaced with hexprihopihexpri. we read the first codon from the chromosome ‘‘11001000’’ and use it to generate a number ‘‘200’’. a similar choice must Again. (1) in BNF. rule 0.e. Because there is a problem that only integers can be presented by using the binary coding scheme. the result being the application of rule 2 to give Now. which equals to 200. the leftmost /pre-opS will be determined by the codon value 96 that gives us rule 0. .e.ARTICLE IN PRESS 298 WA T E R R E S E A R C H 42 (2008) 296– 306 be made by reading the next codon value 160 and again using the given formula we get 160 MOD 4 ¼ 0.0 . The decoding process is described as follows: (1) First. There are fourteen 8-bit binary codons in one string. Notice that if there had been any extra codons. always starting from the leftmost nonterminal.. i. they would have been simply ignored during the genotype-to-phenotype mapping process. in this case.

1. 3.2. has been recognized as a serious failure mode for GAs (Eshelman and Schaffer. The exploratory degree increases clockwise. Each subpopulation uses three genetic operators including linear ranking selection. real-coded GAs have advantages over binary-coded GAs (Chang and Chen. we only considered realcoded GAs. 2005). Each subpopulation evolves by genetic operations in parallel with the other. The most important advantage of subpopulations is the enhanced diversity among the subpopulations (Herrera and Lozano. starting at the lowest E1 and ending at the highest E4. However. useful and efficient to generate the real-number constants and coefficients shown in these output equations. GE combined with a parallel genetic algorithm 3. With GA with parallel structure The conventional GA is likely to be trapped in a region that does not contain the global optimum.. According to a couple of our previous works. 2000). with three dimensions was presented in Chen and Chang (2006). The front side is devoted to exploration. called HDGA. The other side (the rear side) is for exploitation. Chang et al. blend crossover and Gaussian mutation. The topology includes two important different sides. This revision makes GE combined with a real-coded GA very easily described as follows. starting at the lowest e1 and finishing at the highest e4. 1998. 3.ARTICLE IN PRESS WAT E R R E S E A R C H premature convergence. called Rear Side (Exploitation) e4 e1 + e4 e1 - E1 E4 + E4 E1 e3 e3 e2 E3 E2 E3 e2 E2 Front Side (Exploration) e4 e1 E1 E4 e3 E3 e2 E2 Fig. There are four subpopulations E1–E4. (c) expansion. 299 42 (2008) 296 – 306 Topology of the parallel GA A hypercube topology distributed GA.1. (b) refinement. 1991). A multipopulation GA divides a single population into smaller subpopulations. it is required to convert the real-numbers to integers in mapping chromosomes to the BNF. The exploitation degree increases clockwise. Hence. to which exploratory crossover operators are applied. 1975) have been developed into a powerful optimization approach. There are four subpopulations e1–e4. as shown in Fig. in this study. while maintaining a limited but powerful interaction between all subpopulations. GAs originating in the mid-1970s (Holland. . This problem. and exploitative crossover operators are used. 1 – Structure of an HDGA: (a) basic topology. A principal difference between optimization using a GA versus more traditional methods is that the decision space is searched from an entire population of potential designs.

third the expansion migrations and then the sequence starts again.e.e.3.. 2. The second effect is ‘‘expansion’’ (Fig. to generate the optimal relationship among inputs and outputs automatically.4. Finally. the best individual of each subpopulation is sent toward the corresponding subpopulation every five generations. In other words. i. a GA was incorporated with this GE to optimize the objective value of those functions. Particularly. 3. i. First. 1c). 1a). i. or between two exploitative subpopulations from a lower degree to a higher one. A case study in an oligotrophic/ mesotrophic reservoir As mentioned earlier.. a GE was employed to transfer the realcoded string through BNF grammars to mathematical function. i.. In order to improve the reliability and accuracy. 4. called GEGA. a parallel multi-resolution is obtained by using the crossover operation. . 2 – Three types of migration in an HDGA: (a) refinement migrations. second the refinement/ expansion migrations. the GA was used as a search strategy to determine the most proper relationship among remotely sensed data and water quality.5. e4 Fig. 3. 3. or between two exploratory subpopulations from a higher degree to a lower one. two effects are introduced. Further. which makes migrations in the opposite direction.e. from ei to ei+1. 42 (2008) 296– 306 first the refinement migrations. (c) expansion migrations. which makes migrations from an exploratory subpopulation toward an exploitative one. the Landsat 7 Enhanced Thematic Mapper (ETM+) data were used to map the spatial distribution e4 e1 E4 E4 E1 e3 E1 e3 e2 E3 E3 E2 e4 e2 E2 e1 E1 E4 e3 E3 e1 e2 E2 Fig. as shown in Fig. which allows a diversified search (reliability) and an effective local tuning (accuracy) to be achieved simultaneously (Fig. The sequence of application is from left to right. from Ei+1 to Ei.ARTICLE IN PRESS 300 WA T E R R E S E A R C H this structure. 3 shows a combination of GE and GA. The data from several bands of remotely sensed imagery were used in the GE as inputs to estimate the water quality in the reservoir. (b) ref/exp migrations. and each subsequent migration takes place along a different dimension of the hypercube.e.. this GEGA was implemented as a parallel structure to improve the searching efficiency and prevent premature convergence during the optimization. The first effect is ‘‘refinement’’ (Fig. from Ei to ei. Parallel GA incorporated with GE Refinement and expansion Migration An emigration model is one in which migrants are sent only toward immediate neighbors along a dimension of the hypercube. 1b).

The relationship between observed Chl-a and corresponding image data was constructed by using a parallel GEGA presented above. 2006).. ortho-rectified air photos of the Feitsui Reservoir area with a spatial resolution of 25 cm. gathering water quality data once a month.1. 4. 4) were analyzed for Chl-a (mg/L) before noontime when the Landsat satellite overpasses.02 mg/L.8 m) samples longitudinally along the Feitsui Reservoir (Fig. The ground truthing was conducted on April 18. which maps input values of ETM+ bands onto an output value of water quality such as Chl-a. the reservoir still receives much attention because of significant watershed nutrient load (Kuo et al. in Taiwan. The spatial resolution of ETM+ data is 30 m (except for band 6 of the thermal infrared channel with 120 m). were also . This system identification problem may be viewed as a search for an optimal function type. of surface Chl-a concentrations in an oligotrophic/mesotrophic reservoir. GPS 100 SURVEY II) with 2 m of precision. 4. 2005 was selected to match the simultaneous mission of water quality sampling at 24 points in the Feitsui Reservoir. Chl-a was measured by the fluorometry method (Turner Design 10-AU-005) after acetone extraction.48 and 4.3. Ground-truth observation A total of 24 surface water (0–0. 2005 since April is usually the first peak of Chl-a occurring every year. 4. the Landsat 7 ETM+ image of April 18. 4). The minimum and maximum values of the 24 samples are 0.1 mg/L. The study area—Feitsui Reservoir The Feitsui Reservoir at 251270 N and 1211330 E is the most important reservoir of northern Taiwan. Feitsui. respectively. which have been rectified to the 21 Transverse Mercator (TM2) coordinate system commonly adopted in Taiwan. In addition. supplying drinking water for more than four million people in Taipei City (Fig. with a detection limit of 0. Each sampling site was geographically located using a Global Positioning System (Garmin. Although water quality in the Feitusi Reservoir is the best in Taiwan. 3 – Flowchart of GE combined with GA.2. There are eight regular sampling stations in the reservoir. Concurrent remote-sensed data To quantitatively measure water quality in the study area.ARTICLE IN PRESS WAT E R R E S E A R C H 42 (2008) 296 – 306 Real-Coded Genetic Algorithm 301 Grammar Evolution Iteration = 0 Set parameters Decoding Initialize populations Mapping process Calculate fitness Generate equations Where terminal condition met ? Evaluate objective function value (RMSE) Yes No Linear ranking selection Obtain optimum equation Blend crossover and gaussian mutation New populations Iteration > 0 Fig. The traditional method of LMR was also analyzed for comparison.

This model utilized ETM+ bands 1–5. In order to extract the TM data at the water sampling locations. 2005 Band B1 (blue) B2 (green) B3 (red) B4 (near infrared) B5 (midinfrared) B7 (midinfrared) ð3Þ Spectral wavelength (mm) Range of digital numbers on 24 sampling sites Range of digital numbers on whole water body 0. to mid-infrared responses of the land surface. 2003). especially for bands 4 and 7. The Feitsui Reservoir was located in the center of the scene that SLC-off had very little effect on the image quality since the SLC-off effects are most pronounced along the edge of the scene and gradually diminish toward the center (USGS.52–0. it is not suitable to perform classification with other bands. Due to an instrument malfunction that occurred onboard Landsat 7.1. The Landsat 7 ETM+ image was acquired from the USGS National Center for Earth Resources Observation and Science. 4.487 to 0. the result of using ETM+ bands through logarithmic transformation was also presented as follows: LNChla ¼ 17:239  2:593LNðB1Þ þ 0:111LNðB2Þ þ 0:191LNðB3Þ  0:961LNðB4Þ þ 0:347LNðB5Þ  1:473LNðB7Þ. 4. (2) and (3) are 0.45 to 2. the mean value in a 3 by 3 window of the image was used to represent the ground point. The image was geometrically rectified to the TM2 coordinate system with ortho-rectified air photos. Most of these pixels were classified into the sixth level (the highest Chl-a values) located on the middle areas of the reservoir.75 14–34 10–35 2.ARTICLE IN PRESS 302 WA T E R R E S E A R C H 42 (2008) 296– 306 used for geographic rectification of the Landsat 7 ETM+ image and for a detailed land use/cover of the surroundings of the reservoir. Therefore.69 0. 5. Since analysis was made for a single image with quite a small angular range.35 mm. 2007).740 are all obviously significant. The other statistical parameters Table 2 – Properties of the Landsat 7 ETM+ data of the Feitsui Reservoir on April 18. Image classification was performed by the unsupervised classification algorithm named Iterative Self-Organizing Data Analysis Technique (ISODATA) using software ERDAS Image v8. 3. providing visible. of Eqs. Using the linear multi-regression method To map the spatial variation of water quality parameters in the reservoir with remotely sensed images. Thus.60 0.4. and 7 was given by Chla ¼ 4:483 þ 0:022ðB1Þ þ 0:031ðB2Þ  0:041ðB3Þ  0:13ðB4Þ þ 0:108ðB5Þ  0:235ðB7Þ.55–1. between the different ETM+ bands and water quality parameters of Chl-a ranging from 0. were used as input variables to estimate Chl-a. the correlation coefficient.765. respectively.90 91–103 61–69 40–52 19–29 83–103 56–72 38–57 18–30 1.76–0. which is also classified as an oligo-mesotrophic level of productivity. all image data acquired by the Landsat 7 ETM+ from July 14. The classification result classified the Chl-a of a water body as six levels in this reservoir as shown in Fig. We acquired the image in level 1G SLC-off mode. Fig. all these six bands 1–5 and 7.823 and 0.63–0. R. The Landsat 7 ETM+ band 6 is unique among the various bands in that it shows the emitted radiation from a surface in the thermal region of the spectrum. Estimation of chlorophyll-a of the reservoir 4.35 10–21 10–24 .6.52 0. Allee and Johnson (1999) used bands 1–5 and 7 of Landsat 7 ETM+ to estimate surface Chl-a for each sampling site in Bull Shoals Reservoir. The correlation coefficients.. 2007). 4 – The 24 sampling sites of the Feitsui Reservoir in Taiwan. the atmospheric correction was ignored for this clear image.45–0. USA. 2. 2003 have been collected in Scan Line Corrector turned off (SLC-off) mode (USGS. Besides. and USGS systematically replaced the duplicated image data caused by the SLC failure. The properties of these six bands are shown in Table 2.08–2. ð2Þ Then. 7 are optical bands. Those are the same input bands used in this study based on similar trophic states on both reservoirs. R. Landsat 7 ETM+ bands 1. 5. near infrared.4. recording electro-magnetic radiation from 0. the atmospheric correction has little effect on correlation analysis (Zhang et al. except band 6. which is radiometrically corrected. the empirical relationship between digital numbers of the preprocessed image bands and Chl-a was established using the simplest type of LMR method initially.

Moreover. Parallel GEGA was found to be better than the traditional LMR model for Chl-a concentration estimation.1673 within 300 generations. Using parallel GEGA The same data were used to compare with the traditional regression method described above. 4.891 3.89 and the determination coefficient R2 ¼ 0.765 0.167 0. These coefficients and forms in the above equation are found to be the optimal solutions based on the balance between the complexity of the equation and the number of input bands.371 0.444 0.301 Parallel GEGA including the sum of square errors (SSE) and root mean square errors (RMSE) of the above two equations are shown in Table 3. and four exploitative subpopulations e1–e4 to construct the topology as a three-dimensional hypercube. Since nonlinear relationships may exist between the inputs and outputs.741 2. Correlation coefficient SSE RMSE LMR (2) (3) (5) 0. The optimal relationship among remotely sensed imageries and Chl-a was acquired through parallel GEGA. The final results are also shown in Table 3.2.823 0. The optimal (best) fitness was found at the end of each generation and these on-line behaviors (the best objective value at each generation) of eight subpopulations are demonstrated in Table 4.30 of parallel GEGA are lower than those of the other methods. 5 – Six levels of chlorophyll-a concentration by ISODATA classification. B4. i¼1 (4) where oi is the actual value of Chl-a. Obviously. the SSE ¼ 2.ARTICLE IN PRESS WAT E R R E S E A R C H 303 42 (2008) 296 – 306 2760000 TM2 Northing (m) 2758000 level 6 2756000 level 5 level 4 level 3 level 2 level 1 2754000 308000 310000 312000 314000 316000 TM2 Easting (m) 318000 320000 Fig. it is necessary to use a more advanced automatic programming and optimization model.17 and RMSE ¼ 0.79 of parallel GEGA are better than those of the two types of linear multiple regression. to fit the complex nonlinear transfer function between the ETM+ bands and water quality parameters. At the later stages. B5 and B7 automatically combined in this nonlinear equation through a number of generations’ evolutions and competitions. The objective function can be written as Minimize RMSE ¼ " #1=2 n X ðoi  ei Þ2 =n . . most subpopulations converge because of the effect of migrations. ei is the estimated value by GEGA and n is the total number of the ground sampling data ( ¼ 24). In parallel GA. the transformation did not improve the retrieval accuracy of simple regression analysis significantly. It also shows that the optimal solutions of exploration subpopulations E1–E4 are more diverse than the exploitation subpopulations e1–e4 at the early stages. We add the ‘‘EXP’’ operator to the tuple of terminals of BNF grammars in GE to generate the equations.4. such as GEGA.309 4. which can be represented as  h ðB7=B1Þ ÞÞÞ Chla ¼ LN eð15:765706ðð60:304886=e  i   B1 þ 31:6022906  B4 þ LN 95:110605    B5  B7  9:884309  . It indicates that the correlation coefficient R ¼ 0. there are a total of eight different subpopulations. ð5Þ 26:271711 There are only four bands B1. All subpopulations converge to the optimal solution 2. including four exploratory subpopulations E1–E4. Each subpopulation size was set to be 50 and to use different degrees of exploratory/exploitative blend crossover operators to optimize the equations of six bands and Chl-a generated by GE separately. Table 3 – Results of Chl-a estimation using LMR and parallel GEGA Method Eq.

7106 3. (2)).702 4.1673 4.3847 3. which is described in the following section.4044 3.3214 4.4569 4.2068 4.8444 3.9816 3.3679 3.1673 2.4 4.5998 2.4567 3.7014 4.9825 3.446 3.5999 3.2476 4.3182 4.3678 3.0365 4.672 3.3854 3.2273 4.9815 3.068 3.9815 3.4826 3.1485 2.3679 2.3825 3.0509 4.9815 3.9815 3.2 2.4044 3.1484 4.5998 3.9752 3.1792 4.5168 3.448 3.7421 2.4459 2.4081 3.6732 3.3889 3.1673 4.9752 3.4 3.1673 2.4221 2.7952 2.4 2.1713 4.4278 3.9815 3.1947 4.2 1.3214 4.1673 4.8993 2.8994 2.7456 4.2355 4.2105 4.8993 2.244 4.5692 3.5. the output values of Chl-a were reasonable. based on the LMR and parallel GEGA.9815 3.7106 3.9815 3.5526 3.6 1.4044 3.648 4.9826 3.0393 4.9869 4.8994 2.0679 2.9815 3.9138 3.9815 3.0365 4.9815 3.1673 2.2124 4.9815 3.0394 3.2063 4.3855 3.2273 4.8049 3.9815 3. the 6677 pixels of the Landsat image covering the whole water body in the reservoir will be efficient tools.1673 2.7356 2.0446 3.0582 4.9815 3.0875 4.8 0.3849 3.1665 2.1932 4.2026 3.6721 3.808 2.8049 3.1673 2.78 2. 6 – Chlorophyll-a concentration distribution of the reservoir by LMR (Eq.4 2756000 2754000 308000 310000 312000 314000 316000 TM2 Easting (m) 318000 320000 Fig.3661 2.4049 2.9825 3.2 3.7029 4.0364 4.9752 3.7217 4.9815 3.7373 4.9815 3.4575 3.3687 4.8774 2.3214 4.1673 4.1673 2. Spatial distribution of chlorophyll-a In order to realize the Chl-a concentrations of the broad area of the reservoir.5627 4.0364 4.2273 4.0509 4.1673 4.3773 3.8995 2.1673 4.9802 2.3748 4.3668 2.3718 2.6 0.3854 3.4186 3.4623 3.4044 3.2136 4.37 3.3659 2.37 3.ARTICLE IN PRESS 304 WA T E R R E S E A R C H 42 (2008) 296– 306 Table 4 – Objective values of eight subpopulations in parallel GEGA Generations 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190 200 210 220 230 240 250 260 270 280 290 300 e1 e2 e3 e4 E1 E2 E3 E4 4.7356 2.5228 3.3679 2.1673 2760000 TM2 Northing (m) 2758000 4.2 4.3864 3.3952 3.8993 2.9364 4.7783 3.0325 3.8 2.0 0.0 1.6733 3.9752 3.2113 4.3668 2.4184 4.0679 3.9815 3.7797 2.4831 3.919 3.3679 3.0673 4.4459 2.7011 3.068 2.7125 3.6694 3.1673 2.4901 3.1673 2.9815 3. as indicated by the higher correlation coefficient and lower error between the observed data and the estimated value of models.3814 2.9817 3.8142 4.2456 3. We checked the domain of these four input variables in Eq.0609 4.6936 3. 4.8447 3.2063 4.379 3.9775 3. In this study.8 1.0 3.9043 3.0663 4.3678 2.9815 3.0529 4.93 3.4044 3.3661 2.7348 2.9817 3.1673 2.3776 2.9815 3.0771 4. (5) by generating the map of the whole water body in the reservoir.8995 2.9815 3.0509 4.9809 3.0591 4.141 4.9815 3.2365 4.0368 4.0365 4.0364 4.448 3.3214 4.4126 2.9058 3.7358 2.9815 3.0 2.3877 3.4487 3.0458 4.3668 2.43 4.1673 2.8049 3.1673 4.0291 3.6 3.0466 3.1491 4.3089 2.1673 2.4527 4.9815 3. the spatial distribution of Chl-a of the Feitusi Reservoir is demonstrated .9817 3.3099 4.6 2.8 3.8774 2.0287 3.4 1.1673 2.3938 3.0509 4.7437 2.9815 3.068 3.9817 3.

L.6 0. Eshelman. Process.2 4.76 process. Optimizing the reservoir operation rule curves by genetic algorithms. this newly developed method is flexibly applied to other reservoirs.. However.2 3.6 3.6 1.8 1.. R.. Carlson.48 2.8 0.8 2. J. Baumgardner. Preventing premature convergence in genetic algorithms by preventing incest. Nix. J. Hydrol. 1992. It is shown that the reservoir was mesotrophic in the central and lower areas and oligotropic in the upper places..4 1. Arkansas.ARTICLE IN PRESS WAT E R R E S E A R C H 305 42 (2008) 296 – 306 2760000 TM2 Northing (m) 2758000 4. Schifebe. 2003. Remote Sens.4 2756000 2754000 308000 310000 312000 314000 316000 TM2 Easting (m) 318000 320000 Fig. R. Int. A study of applying genetic programming to reservoir trophic state evaluation using remote sensor data. Therefore. Hydrol.C. (5)) LMR (Eq. MN.E. 18. Analytical algorithms for lake water TSM estimation for retrospective analyses of TM and SPOT sensor data. Elseth.. Process. Chang. 6 and 7. Int.J. the tendency of distributions of Chl-a was coincident with the classification imagery (Fig. 22 (2). 688–698. there are slight seasonal changes in the phytoplankton community. St. 2006. Limnol. American Public Health Association.3. Besides. 1057–1072. 20 (6). Table 5 – Statistical properties of 6677 estimated Chl-a data using LMR and parallel GEGA Parallel GEGA (Eq. 1977..0 3. L. 1999. 2277–2289. 361–369..4 3.76 2.. Harrington.J. Johnson. APHA (American Public Health Association). even in the Feitsui Reservoir. R. G. J. Booker. USA. Chen.. Manage.0 0. Proceedings of the Fourth International Conference on Genetic Algorithms. S. R E F E R E N C E S Minimum Average Maximum in Figs. 7 – Chlorophyll-a concentration distribution of the reservoir by parallel GEGA (Eq. Vos. West.76 3.. Chang. In: Belew.2 2.J. Paul.. 2005. Peters. A trophic state index for lakes.6 2.2 1.). Remote Sens. CA.J..J. 185–198. F.. L.E..D. K. Morgan Kaufmann. (Eds. Environ.8 3. 1991. Principles of Modern Genetics. L. 115–122. A. R. this phenomenon has been observed by both methods. Oceanogr.D. 5.. Standard Methods for the Examination of Water and Wastewater.86 0. L. San Marco. The statistical properties of the 6677 estimated Chl-a values by these two methods are shown in Table 5. Chen.B. The trophic state of the water body in the reservoir had a wider representation by the parallel GEGA than LMR. Use of satellite imagery to estimate surface chlorophyll a and Secchi disc depth of Bull Shoals Reservoir. 24 (11). 15–35. F. 16th ed. Summary and conclusions The main contribution of this paper is to provide a new parallel GEGA algorithm. 40. Chang. DC. Applying real-coded multi-population genetic algorithm to multi-reservoir operation. Schaffer. variability in species structure of phytoplankton assemblage may generate different optical spectra. J. J. Water Resour. J. F.. . 2002.E. L.W. Washington.R. 23. L.A. Chen. 1998. Chang... However. Int. 5) described in Section 4. 21.D. Remote Sens.G.4 4..4 2. through all the procedures described above in the text including the field data collection and remote-sensed imagery Allee.0 1. the maximum and minimum values of Chl-a generated by the parallel GEGA are closer to the ground observations than those of the LMR. 1992.J.. (5)).60 4... F. W. Dekker. 2265–2275.0 2. 79–100. 1995. Real-coded genetic algorithm for rulebased flood control reservoir management. which creates potentials to monitor chlorophyll-a level in a specific time frame and reservoir.M. It can deal easily with nonlinear transfer problems between remote-sensed imagery and water quality in the reservoir and is shown to be a very efficient and robust optimization tool. (2)) 0. pp. 12. Chen. Remote Sens. Determination of phytoplankton chlorophyll concentrations in the Chesapeake Bay with aircraft remote sensing.

Environ. 5 (4). J. IEEE Trans.K. 42 (2008) 296– 306 Ritchie.. Website of the USGS Landsat project SLC-off products background. C. Comput. Comput. A hybrid neural-genetic algorithm for reservoir water quality management. 41 (3). Mani... 43–63. 2003. IEEE Trans. Remote Sens. The University of Michigan Press..S. Koponen.phpS. Remote Sens. O’Neill. 2007.. Gradual distributed real-coded genetic algorithms. Y. Mississippi. 137–178. 1367–1376. M. Kaufmann. S. Water Res. /http://landsat7. Hallikainen. 73. V. Lung. 40. W.. 227–235.. Agrawal.T. 622–629. 2006. 2000. J. ACM. Water quality retrieval from combined Landsat TM data and ERS-2 SAR data in the Gulf of Finland.T.K. Environ. F. Lozano.T. Application of genetic programming for multicategory pattern classification.. IEEE Trans.R.M. chlorophyll. 242–257.M.. Evol. background. J. Comput. Thiemann. M.. Patnaik. S. L.. 33. 4 (3). J.. V. IEEE Trans. 1963.. Evol.H. Kuo. Commun. 1990. 4 (1). 1–17. Determination of chlorophyll content and trophic state of lakes using field spectrometer and IRS-1C satellite data in the Mecklenburg Lake District. Cooper. and temperature in Moon Lake.C. Evol.. . Germany. 2000...ARTICLE IN PRESS 306 WA T E R R E S E A R C H Herrera. Schiebe. 6 (1). Holland. Ryan. Geosci. C.. Pulliainen. M. 1975. The relationship of MSS and TM digital data with suspended sediments.. Ann Arbor. Zhang. F.. usgs.. Wang. Grammatical evolution. Revised report on the algorithmic language ALGOL 60. US Geological Survey (USGS). Y. Adaptation in Natural and Artificial Systems. H.Y. Naur. 349–357.. 2000. 2001. P. J. Kishore.S.. Remote Sens.