
Environ Monit Assess (2010) 163:49–56

DOI 10.1007/s10661-009-0815-y

Evaluation of significant sources influencing the variation of water quality of Kandla creek, Gulf of Katchchh, using PCA

S. G. Dalal · P. V. Shirodkar · T. G. Jagtap · B. G. Naik · G. S. Rao

Received: 7 August 2008 / Accepted: 27 January 2009 / Published online: 27 March 2009
© Springer Science + Business Media B.V. 2009

Abstract To evaluate the significant sources contributing to water quality parameters, we used principal component analysis (PCA) for the interpretation of a large complex data matrix obtained from the Kandla creek environmental monitoring program. The data set consists of analytical results of a seasonal sampling survey conducted over 2 years at four stations. PCA indicates five principal components to be responsible for the data structure and explains 76% of the total variance of the data set. The study stresses the need to include new parameters in the analysis in order to make the interpretation of principal components more meaningful. The PCA could be applied as a useful tool to eliminate multi-collinearity problems and to remove the indirect effect of parameters.

Keywords Water quality · PCA · Significant sources · Kandla creek

S. G. Dalal (B) · P. V. Shirodkar · T. G. Jagtap · B. G. Naik
National Institute of Oceanography, Dona Paula, Goa 403004, India
e-mail: dalal@nio.org

G. S. Rao
Kandla Port Trust, New Kandla, Gandhidham, Gujarat, India

Introduction

Maintenance of quality of nearshore marine water is a sensitive issue, as a wide variety of anthropogenic waste is discharged into it. Sewage and municipal wastes, landfill drainage, pulp and paper industry, forestry operations, agricultural runoff, aquaculture and fish processing plants, etc. are some major point and non-point sources of pollution. Such wastes, being the most ubiquitous and direct sources, contaminate the coastal zone to a large extent, often leading to eutrophication, which refers to the process of natural or man-made enrichment with inorganic nutrients (Kennish 1992; Schramm and Nienhuis 1996). From a coastal zone management perspective, it is important to understand the relative capacities of waters to absorb such wastes for sustainable development and to adopt proper management strategies. Since the aquatic environment shows spatial and temporal variations in water quality, there is a need to devise a monitoring program that will provide the best representation and a reliable estimate of a particular water mass. This is necessary to avoid frequent water samplings at many sites.

A 2-year survey (October 2002–September 2003 and June 2004–May 2005) to establish national databases on water quality was undertaken in Kandla creek. The Environmental Guidelines for Ports and Harbours (EGPH 1989), laid down by the Ministry of Environment & Forests,

Department of Environment, Forests & Wildlife, Government of India, New Delhi, were followed to assess the pollution levels in marine waters. In the present communication, the large data matrix obtained during the monitoring program (1,224 observations) in Kandla creek was utilized to extract information on (a) the temporal variations and (b) the significant sources contributing to water quality parameters.

Material and methods

Study area

Kandla creek lies between latitude 22°55′ to 23°05′ N and longitude 70°05′ to 70°02′ E in the Gulf of Katchchh. Kandla Port is one of the major ports, situated about 90 km from the mouth of the Gulf of Katchchh (Fig. 1). This port has many terminals for handling oil and oil products, fertilizers, and general cargo. Hectic loading and unloading activities and movement of personnel at the port are bound to have adverse effects on the creek water. Tidal height in the creek ranges from 0.83 to 7.2 m and the surface current varies from 1.5 to 5 kn.

Sampling and analysis

The sampling strategy was designed to cover a wide range of water quality determinants at key sites. The sampling for water was done once each season, i.e., premonsoon/summer (February–May), monsoon (June–September), and post monsoon/winter (October–January), at four key stations: mouth of the creek, off cargo jetty, off IOC oil jetty, and at the junction where Sara and Phang

Fig. 1 Location of port industrial units and sampling stations during a monitoring program of Kandla creek (map; longitude ticks 70°5′ to 70°20′ E, latitude ticks 22°55′ to 23°10′ N)

Fig. 2 Temporal variations: box plot for selected parameters for three seasons (S1 premonsoon, S2 monsoon, S3 post monsoon)

creeks meet Kandla creek (Fig. 1), over a 2-year period from October 2002 to September 2003 and June 2004 to May 2005. The water samples were collected from surface, mid-depth, and bottom water layers at each station. On-the-spot analyses were made for a few parameters, while for other parameters the samples were analyzed at the shore laboratory following standard methods for water analysis (APHA 1975; Grasshoff et al. 1983). Water quality variables analyzed include dissolved oxygen (DO), water temperature (TEMP), pH, salinity (SAL), suspended solids (SSOL), biochemical oxygen demand (BOD), phosphate (PO4–P), nitrite (NO2–N), nitrate (NO3–N), ammonia (NH3–N), and silicate (Si2O3–Si). Turbidity (TURB) was measured by the nephelometric method. Dissolved/dispersed petroleum hydrocarbons (PHC) were extracted from seawater with double distilled hexane and quantified using a Shimadzu RF-1501 fluorescence spectrofluorometer. Phenol (PHE) was extracted with chloroform after complexing with 4-aminoantipyrine, and the color was measured spectrophotometrically. For estimation of chlorophyll a (CHL), 500 ml of water sample was filtered through GF/F glass fiber filter paper and extracted in 90% acetone overnight. The extracts were used for the estimation of fluorescence before and after acidification using a Turner Design Fluorometer. The fluorescence values were converted to chlorophyll and phaeophytin (PHAE) using an appropriate calibration factor. Primary production (PP) was measured using the 14C technique. The data quality was checked by careful standardization and procedural blank measurements of spiked and duplicate samples.

Multivariate statistical methods

Each of the three seasons (winter, summer, and monsoon) was assigned a numerical value in the data file; this season variable was then correlated (pair by pair) with all the measured parameters.

In order to avoid misclassification due to wide differences in data dimensionality (Ross 1988), data was standardized through z-scale transformation before applying PCA. Standardization tends to minimize the influence of variance of parameters. It also eliminates the influence of different units of measurement and renders the data dimensionless. The main concern of the PCA is to understand the mode of action or behavior of components of a system and its subsystems (Petersen et al. 2001; Bengraine and Marhaba 2003). The use of PCA for water quality assessment has increased in the last few years, mainly due to the need to obtain appreciable data reduction for analysis and decision (Morales et al. 1999). Bartlett's sphericity test (χ² with degrees of freedom = p(p − 1)/2) was used to verify the applicability of PCA to raw data (Stevens 1986). The STATISTICA 6.0 software package was employed for data treatment.

Results and discussion

Exploratory data analysis

Box plots of selected parameters during three seasons (Fig. 2) at four sampling stations were examined. By inspecting these plots, it was possible to perceive differences between the seasons. Our first approach to establish the parameter-associated temporal variation was by use of the Spearman R. The bivariate results (Table 1) show that the five parameters having significant correlation with the season (p < 0.05) are: nitrates, suspended solids, ammonia, water temperature,

Table 1 Spearman non-parametric correlation coefficient (R) for selected parameters between seasons (bold figures indicate significance at p < 0.05)

Parameters   Summer and monsoon   Summer and winter   Monsoon and winter
TEMP         −0.39                −0.78               0.34
DO           −0.22                0.28                0.08
BOD          0.06                 0.30                0.05
TURB         0.71                 0.38                0.37
SSOL         0.93                 0.62                0.64
CHL          −0.21                0.74                −0.35
NO3          0.48                 0.84                0.55
NH3          0.50                 0.38                0.86
PHC          0.24                 0.12                0.31
PHE          0.27                 0.25                0.22
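As an illustration of the season correlation step, the following is a minimal sketch of a Spearman rank correlation between a numeric season code and one parameter. The season codes (1, 2, 3) and the SSOL values are hypothetical and are not taken from the monitoring data; the paper does not state which codes were used, and the analysis itself was run in STATISTICA.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_r(x, y):
    """Spearman R: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example: three observations per season, invented SSOL values.
season_code = [1, 1, 1, 2, 2, 2, 3, 3, 3]
ssol = [41.0, 38.5, 44.2, 60.1, 57.3, 63.8, 52.0, 49.9, 55.4]
print(round(spearman_r(season_code, ssol), 3))
```

Ranking both variables before correlating makes the coefficient insensitive to the units and the distribution of the parameter, which suits a coded variable such as season.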

and turbidity. The season-correlated parameters were taken as representing the major source of temporal variations in water quality. In view of the source types in the creek, these correlations can be explained on the basis of seasonal features in the monitoring region. The correlation matrix suggests that the temperature, SSOL, NO3, and NH3 are the most significant parameters to discriminate between the seasons, which also means that these parameters account for most of the expected temporal variations in the water quality; this also suggests that the anthropogenic input, which was the major pollution source mainly derived from the discharge of wastewater into the system, was independent of the season as it was present throughout the year. Some of these correlations can be explained by climatic changes associated with the three seasons. Land drainage and strong tidal currents in the creek bring a large amount of colloidal particles into suspension during monsoon, whereas, during summer, the strong tidal currents at the low water level of the creek disturb the settled sediments, bringing a large amount of colloidal particles into the water and thereby increasing the turbidity. This was also true for suspended solids, which were significantly higher during the summer and winter periods as compared to the monsoon. Although instances of waste releases due to port activities were evident in Kandla creek, the influence of seasonal changes appears to be fairly large.

Data treatment

In the application of PCA to water quality data from Kandla Port monitoring stations, the correlation matrix of variables (R, of size p × p) was used to obtain eigenvalues and weights of parameters. Since the four sampling stations were combined to calculate the correlation matrix, the correlation coefficients should be interpreted with caution as they are simultaneously affected both by spatial and temporal variations. The Scree plot was used here to identify the number of PCs to be retained in order to comprehend the underlying data structure (Jackson 1993). Eigenvector λ was used to obtain unrotated factor loadings. V_ia indicates the values of rotated factor loadings, which were obtained by varimax rotation. Rotated loadings in PCs indicate the percent contribution of the corresponding variable to the PC and are called the loading of the ith variable in the kth PC (the correlation between the ith variable and the kth PC). These loading values were used to group variables in PCs.

Score value (s_kj) for the jth observation in the kth PC was obtained from the weight of variables in PCs and standardized variables by using the following equation:

$s_{kj} = a_{1k} z_{1j} + a_{2k} z_{2j} + \cdots + a_{pk} z_{pj}$

where a is the weight of a variable in a PC and z the standardized variable; j = 1, 2, ..., n runs over the n observations; k = 1, 2, ..., q runs over the q selected PCs; and p is the number of independent variables.

By applying Bartlett's sphericity test, a value of 844.764 for the Bartlett chi-square statistic was found (df = 136, p < 0.01), confirming that the parameters are not orthogonal but correlated, so that the data variability can be explained with a smaller number of parameters (called principal components).

PCA on 17 parameters yielded five principal components explaining sample variance of about 76% (Table 2). The varimax rotation was then performed to obtain principal components of greater environmental significance; a similar approach based on PCA has been used to identify the main components in water quality (Vega et al. 1998; Helena et al. 2000; Wunderlin et al. 2001; Simeonov et al. 2003; Singh et al. 2004).

In the present study, the first PC that explains 29% of total variance has significant loadings (> 0.70) on salinity, suspended solids, turbidity, and petroleum hydrocarbons (Fig. 3). High loadings on suspended solids, turbidity, and salinity were due to the natural effects of strong tidal currents (tidal range 7 m) and intrusion of saline

Table 2 Eigenvalues and percentage of explained variance by first five components by PCA

Component   Eigenvalue   % total variance   Cumulative %
1           4.922        28.95              28.95
2           4.282        25.18              54.13
3           1.588        9.34               63.47
4           1.107        6.51               69.98
5           1.018        5.99               75.97
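Two of the reported figures can be checked with simple arithmetic, sketched below. The degrees of freedom of Bartlett's test follow from p(p − 1)/2 with p = 17. Because the PCA was run on a correlation matrix, whose eigenvalues sum to p, each component's share of the total variance is its eigenvalue divided by p. The variable names are illustrative only; the eigenvalues are copied from Table 2.

```python
p = 17  # number of water quality parameters entered in the PCA (from the text)

# Bartlett's sphericity test: degrees of freedom = p(p - 1)/2
bartlett_df = p * (p - 1) // 2

# Eigenvalues of the first five components, as listed in Table 2
eigenvalues = [4.922, 4.282, 1.588, 1.107, 1.018]

# For a correlation matrix the total variance equals p, so the share of
# each component is eigenvalue / p, expressed here as a percentage.
percent = [100 * ev / p for ev in eigenvalues]

cumulative = []
running = 0.0
for pct in percent:
    running += pct
    cumulative.append(running)

print("Bartlett df:", bartlett_df)
for k, (ev, pct, cum) in enumerate(zip(eigenvalues, percent, cumulative), start=1):
    print(f"PC{k}: eigenvalue={ev:.3f}  %variance={pct:.2f}  cumulative={cum:.2f}")
```

The recomputed percentages agree with Table 2 to within about 0.02, the small differences arising because the tabulated eigenvalues are themselves rounded to three decimals. All five eigenvalues exceed 1.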

Fig. 3 Factor loadings of PCA on 17 water quality parameters (five panels, Factor 1 to Factor 5; vertical axis: factor loadings; horizontal axis: TEMP, PH, SAL, DO, BOD, SSOL, TURB, PO4, NO2, NO3, NH3, SILI, PHC, PHE, PHAE, CHL, PP)
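Loadings such as those in Fig. 3 act as weights on the standardized variables when component scores are computed. The sketch below shows z-scale standardization followed by the weighted sum that gives each score. The parameter names, raw values, and weights are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the paper does not state which form was used.

```python
from statistics import mean, stdev


def zscores(column):
    """Standardize one variable: subtract the mean, divide by the sample SD."""
    m, s = mean(column), stdev(column)  # stdev uses the n - 1 denominator
    return [(x - m) / s for x in column]


def pc_scores(data, weights):
    """Compute s_kj = sum over variables i of a_ik * z_ij.

    data:    dict mapping variable name -> list of n raw observations
    weights: dict mapping variable name -> list of q weights (one per PC)
    Returns a list of q lists, each holding the n scores of one PC.
    """
    z = {name: zscores(values) for name, values in data.items()}
    n = len(next(iter(data.values())))
    q = len(next(iter(weights.values())))
    return [
        [sum(weights[name][k] * z[name][j] for name in data) for j in range(n)]
        for k in range(q)
    ]


# Hypothetical example: three variables, four observations, two components.
raw = {
    "SAL": [35.1, 36.4, 37.0, 38.2],
    "TURB": [12.0, 18.5, 25.1, 30.3],
    "NO3": [4.2, 3.9, 6.8, 5.5],
}
a = {"SAL": [0.8, 0.1], "TURB": [0.9, -0.2], "NO3": [0.1, 0.7]}

for k, scores in enumerate(pc_scores(raw, a), start=1):
    print(f"PC{k}:", [round(s, 3) for s in scores])
```

Standardizing first keeps a variable measured in large units from dominating the scores simply because of its scale.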

waters from the salt works. The high loading of petroleum hydrocarbons was due to the spillage from loading and unloading activities of oil and other petroleum products. The second PC that explains 25% of the total variance correlates to water-soluble nitrogenous species, i.e., NO2–N, NO3–N, NH3–N, and petroleum hydrocarbons (Fig. 3). The main sources of nitrate were agricultural activities and the increased drainage from fields after precipitation events. Within the part of variance described by the third PC, temperature and pH have an opposite sign in comparison to the DO and BOD. The contribution of salinity was negligible; phosphate and nutrients were only weakly involved (Fig. 3). The pattern can be interpreted in terms of biological activity as either primary production by algae or their subsequent microbial decomposition. The fourth

PC, significantly negatively correlated with DO and BOD and with positive loadings on chlorophyll a and phaeophytin (Fig. 3), represents the ‘organic source’ of the creek water. In the fifth PC, one can infer that the high loadings of ammonia (Fig. 3) were mainly due to the discharge from the nearby fertilizer plant and the oil jetty.

Given that using a cut-off criterion λ0 ≤ 0.70 retains too many parameters, and that a cut-off of λ0 > 0.70 leads to retention of too many components (Jackson 1993), we do not recommend the eigenvalue (λ) criterion. Our results also suggest that a cumulative proportion of variance (α0) criterion is inappropriate. Solutions based on proportionately more parameters will be less stable and the resulting eigenvector coefficients will be less reliable. As a consequence, the interpretability of the analyses will be compromised.

From an ecologist's view, parameter selection is useful for reducing the number of parameters required for statistical analyses since it can improve the reliability and stability of final results (Williams and Titus 1988; Grossman et al. 1991). In this study, PCA did not result in much data reduction, as one still needs 14 parameters (about 80% of the 17 parameters) to explain 76% of the data variance. Therefore, it becomes necessary to include new parameters such as toxic trace metals (viz. lead, cadmium, and mercury) in water and sediments in the monitoring program to make the interpretation of PCA more meaningful. However, PCA serves as a means to identify parameters that have the highest contribution in the creek water quality.

Conclusion

The study demonstrates the value of PCA of large and complex databases in deriving better information about the water quality and analytical protocols. PCA is also a powerful pattern recognition technique that attempts to explain the variance of a large dataset of intercorrelated parameters with a smaller set of independent parameters. Five components explain 76% of the total variation. It may be necessary to include certain new parameters such as toxic trace metals and eliminate some observed parameters for future monitoring studies. If resources become limited, the selected parameters may provide a suggestion for future data collection in environment monitoring studies. Our results suggest that a cumulative proportion of variance (α0) criterion is inappropriate for PCA.

Acknowledgements The authors wish to thank the Director, NIO, India for facilities and Kandla Port Trust Authorities for sponsoring the environmental monitoring program in Kandla creek. This is NIO (CSIR) contribution Number 4515.

References

APHA (1975). Standard methods for the examination of water and waste water (14th ed.). APHA-AWWA-WPCE, American Public Health, Washington DC 20036.

Bengraine, K., & Marhaba, T. F. (2003). Using principal component analysis to monitor spatial and temporal changes in water quality. Journal of Hazardous Materials, B100, 179–195. doi:10.1016/S0304-3894(03)00104-3.

EGPH (1989). Environmental guidelines for ports and harbour projects. New Delhi: Govt. of India.

Grasshoff, K., Ehrhardt, M., & Kremling, K. (1983). Methods of seawater analyses (Second revised and extended edition, 419 pp.). Weinheim: Verlag Chemie.

Grossman, G. D., Nickerson, D. M., & Freeman, M. C. (1991). Principal component analyses of assemblages structure data: Utility of tests based on eigenvalues. Ecology, 72(1), 341–347. doi:10.2307/1938927.

Helena, B., Pardo, R., Vega, M., Barrado, E., Fernandez, J. M., & Fernandez, L. (2000). Temporal evolution of groundwater composition in an alluvial aquifer (Pisuerga River, Spain) by principal component analysis. Water Research, 34, 807–816. doi:10.1016/S0043-1354(99)00225-0.

Jackson, D. A. (1993). Stopping rules in principal components analysis: A comparison of heuristical and statistical approaches. Ecology, 74, 2201–2214.

Kennish, M. J. (1992). Ecology of estuaries: Anthropogenic effects (494 pp.). Florida: CRC.

Morales, M. M., Mart, P., Llopis, A., Campos, L., & Sagrado, J. (1999). An environmental study by factor analysis of surface seawater in the Gulf of Valencia (western Mediterranean). Analytica Chimica Acta, 394, 109–117. doi:10.1016/S0003-2670(99)00198-1.

Petersen, W., Bertino, L., Callies, U., & Zorita, E. (2001). Process identification by principal component analysis of river water-quality data. Ecological Modelling, 138, 193–213. doi:10.1016/S0304-3800(00)00402-6.

Ross, P. J. (1988). Taguchi techniques for quality engineering. New York: McGraw-Hill.

Schramm, W., & Nienhuis, P. H. (1996). Marine benthic vegetation: Recent changes and the effects of eutrophication (470 pp.). Berlin: Springer.

Simeonov, V., Stratis, J. A., Samara, C., Zachariadis, G., Voutsa, D., Anthemidis, A., et al. (2003). Assessment of the surface water quality in Northern Greece. Water Research, 37, 4119–4124. doi:10.1016/S0043-1354(03)00398-1.

Singh, K. P., Malik, A., Mohan, D., & Sinha, S. (2004). Multivariate statistical techniques for the evaluation of spatial and temporal variations in water quality of Gomti River (India): A case study. Water Research, 38, 3980–3992. doi:10.1016/j.watres.2004.06.011.

Stevens, J. (1986). Applied multivariate statistics for the social science (515 pp.). Hillsdale: Erlbaum.

Vega, M., Pardo, R., Barrado, E., & Deban, L. (1998). Assessment of seasonal and polluting effects on the quality of river water by exploratory data analysis. Water Research, 32, 3581–3592. doi:10.1016/S0043-1354(98)00138-9.

Williams, K. K., & Titus, K. (1988). Assessment and sampling stability in ecological applications of discriminant analysis. Ecology, 69(4), 1275–1285. doi:10.2307/1941283.

Wunderlin, D. A., Diaz, M. P., Ame, M. V., Pesce, S. F., Hued, A. C., & Bistoni, M. (2001). Pattern recognition techniques for the evaluation of spatial and temporal variation in water quality. A case study: Suquia river basin (Cordoba Argentina). Water Research, 35, 2881–2894. doi:10.1016/S0043-1354(00)00592-3.
