
IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 10, NO. 2, MARCH 2013

Automatic Generation of Standard Deviation Attribute Profiles for Spectral–Spatial Classification of Remote Sensing Data
Prashanth R. Marpu, Member, IEEE, Mattia Pedergnana, Member, IEEE, Mauro Dalla Mura, Member, IEEE, Jon Atli Benediktsson, Fellow, IEEE, and Lorenzo Bruzzone, Fellow, IEEE

Abstract: Extended attribute profiles, which are based on attribute filters, have recently been presented as efficient tools for spectral–spatial classification of remote sensing images. However, construction of these profiles usually requires manual selection of parameters for the corresponding attribute filters. In this letter, we present a technique to automatically build the extended attribute profiles with the standard deviation attribute based on the statistics of the samples belonging to the classes of interest. The methodology is tested on two widely used hyperspectral images, and the results are found to be highly accurate.

Index Terms: Attribute profiles (APs), classification, hyperspectral data, mathematical morphology, spectral–spatial.

Manuscript received February 15, 2012; revised April 4, 2012; accepted May 5, 2012. Date of publication July 18, 2012; date of current version October 22, 2012. This work was supported in part by the Icelandic Research Fund and in part by the Research Fund of the University of Iceland.
P. R. Marpu was with the University of Iceland, 101 Reykjavik, Iceland. He is now with the Department of Water and Environment Engineering, Masdar Institute of Science and Technology, 54224 Abu Dhabi, UAE (e-mail: prashanthmarpu@ieee.org).
M. Pedergnana was with the University of Iceland, 101 Reykjavik, Iceland. He is now with German Aerospace Center (DLR), Munich, Germany (e-mail: mattia.pedergnana@dlr.de).
M. Dalla Mura is with Fondazione Bruno Kessler, Trento, Italy (e-mail: dallamura@fbk.eu).
J. A. Benediktsson is with the Department of Electronics and Computer Engineering, University of Iceland, 101 Reykjavik, Iceland (e-mail: benedict@hi.is).
L. Bruzzone is with the Department of Information Engineering and Computer Science, University of Trento, Trento, Italy (e-mail: bruzzone@ing.unitn.it).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/LGRS.2012.2203784

I. INTRODUCTION

Morphological profiles (MPs) and attribute profiles (APs) are found to be efficient tools to fuse spectral and spatial information for effective classification of remote sensing data [1]–[5]. An MP of a gray-level image is built using a sequence of fixed shape structuring elements of increasing sizes to perform geodesic opening and geodesic closing operations [6]. When multiple image layers are considered, the MPs are individually computed and stacked together, and the stack is referred to as the extended MP (EMP) [4]. An extension of MP is the more versatile morphological AP [7], which is built on a sequence of increasingly severe attribute filters (AFs) [8]. The use of AFs in the AP permits extracting features that are not only related to the scale of the regions in the image (as by using the geodesic operators in the MPs) but also to any measure (e.g., geometrical, textural, and spectral) that can be computed on the regions. Analogous to the EMP, the corresponding stack of the APs computed on features extracted from hyperspectral data is referred to as extended AP (EAP) [7]. When multiple EAPs are computed by considering different attributes and stacked together in the same data structure, an extended multiattribute profile is obtained [7]. By using multiple attributes, particularly when they are based on measures that are as uncorrelated as possible (e.g., shape, textural, and spectral characteristics), it is possible to extract a richer description of the regions in the image, providing a more complete modeling of the spatial information of the investigated scene [4].

However, even if the application of EMP and EAP for remote sensing data classification is found to be effective, a major issue is the selection of the filter parameters to generate the profiles. Various attributes such as area, width, and moment of inertia can be used. However, it is not directly possible to automatically estimate the ranges of the different attributes that are suitable for the image being processed. In this letter, an attempt is made to automatically generate the standard deviation AP using the statistics of the available training samples. The general assumption is that the maximum standard deviation of the pixel values within the segments of the classes of interest is similar to the standard deviation of the training pixels if they represent the classes of interest. Only the standard deviation attribute is used in this letter, and the efficiency of the automatic generation of the standard deviation EAP is validated.

Since the morphological operations used for building the profiles are operations performed on scalar values, when dealing with high-dimensional data, such as hyperspectral images, a prior feature reduction step is required [3], [5], [7].
In this letter, we consider two unsupervised feature reduction methods, namely, principal component analysis (PCA) [9] and kernel PCA (KPCA) [10], and three supervised feature reduction algorithms, namely, discriminant analysis feature extraction (DAFE), decision boundary feature extraction (DBFE), and nonparametric weighted feature extraction (NWFE) [11]. All these methods are frequently used in the literature.

1545-598X/$31.00 © 2012 IEEE

II. EAPs

EAPs [7] are an extension of the concept of APs for the analysis of hyperspectral images. APs are obtained by applying a sequence of morphological AFs to a scalar image [4]. AFs are connected operators defined in the mathematical morphology framework [6] that process an image by merging its connected components (i.e., regions of connected isolevel pixels) [8]. The operation done on the regions is driven by the result of a binary predicate that evaluates how an attribute $a$ (i.e., a measure computed on each region) compares to a given reference value $\lambda$ (e.g., $a(C_i) > \lambda$, with $C_i$ being an arbitrary connected component of the image). If the criterion is met, then the region is kept unaltered; otherwise, it is set to the grayscale value of the adjacent region with the closest gray level (i.e., producing a merging between the regions). When the region is merged to the adjacent region of a lower (greater) gray level, the operation performed is thinning (thickening). Given a sequence of ordered criteria $\{T\} = \{T_1, T_2, \ldots, T_n\}$ [4], an AP is obtained by applying a sequence of attribute thinning and attribute thickening operations to the input image $f$

$$\mathrm{AP}(f) = \{\phi_n(f), \ldots, \phi_1(f), f, \gamma_1(f), \ldots, \gamma_n(f)\} \tag{1}$$

where $\phi_i$ and $\gamma_i$ are the thickening and thinning transformations with criterion $T_i$, respectively. The EAP is obtained by generating an AP on each of the first $m$ features ($F_i$) computed using any feature reduction technique on the hyperspectral image [7]

$$\mathrm{EAP} = \{\mathrm{AP}(F_1), \mathrm{AP}(F_2), \ldots, \mathrm{AP}(F_m)\}. \tag{2}$$

In this letter, we focus on an EAP with the standard deviation attribute only. Thus, the criterion evaluated can be expressed as $\sigma(f(C_i)) > \lambda$, with $\sigma$ as the standard deviation, $f(C_i)$ as the values of the image on the connected component $C_i$, and $\lambda$ as the value of the standard deviation taken as a reference. The next section details the proposed technique to adaptively define the thresholds $\lambda$.

III. PROPOSED METHODOLOGY

In the proposed methodology, the thresholds (i.e., the values denoted with $\lambda$ in the previous section) for the standard deviation attribute are calculated based on the statistics of the classes of interest. As previously mentioned, the assumption is that the standard deviation of the training data of the classes of interest is related to the maximum standard deviation of the pixel values within individual segments of the corresponding classes of interest. The following method is used here for every feature.

1) Calculate the standard deviation of the training pixels of every class, represented as $\mathrm{std}_i$ for the $i$th class.
2) Find the minimum and maximum values from the standard deviations (i.e., $\mathrm{std}_{\min}$ and $\mathrm{std}_{\max}$, respectively) of the classes to define the range. All the values above 25% of the mean are ignored as they might not correspond to homogeneous regions.
3) The range can now be divided into intervals either by defining the number of intervals $l$ (e.g., 4) or the minimum interval size $\delta$ (e.g., 2.5% of the mean of the feature). The set of thresholds for constructing the profile is given as

$$\{\lambda\} = \mathrm{std}_{\min} + i\,\frac{\mathrm{std}_{\max} - \mathrm{std}_{\min}}{l},\quad i = 0, 1, 2, \ldots, l \qquad \text{or} \qquad \{\lambda\} = \{\mathrm{std}_{\min},\ \mathrm{std}_{\min} + \delta,\ \mathrm{std}_{\min} + 2\delta,\ \ldots,\ \mathrm{std}_{\max}\}. \tag{3}$$

4) If the range is less than the minimum interval size of the feature (i.e., $\mathrm{std}_{\max} - \mathrm{std}_{\min} < \delta$), retain the minimum and maximum values of the standard deviations as the thresholds instead of dividing into intervals.
5) The identified thresholds are used to build the AP.

It has to be noted that the size of the profile will be different for different features, but the minimum profile size is five (i.e., according to (1), two features corresponding to the extrema values of the standard deviation range for each of the thinning and thickening operations, plus the original image). This way, we can adaptively calculate the profiles for all the features.

Traditionally, the thresholds are manually selected in a trial-and-error way to cover a wide range of values, but this increases redundancy. The features normally have different statistics, and also, the individual classes have different statistics for different features. Therefore, it can be easily inferred that different thresholds are required to build the standard deviation EAP from different features. This is very difficult to do while manually selecting the thresholds. This can be illustrated by observing the values of the standard deviation of different classes for five DBFE features of the University of Pavia data set, as shown in Table I. It can be clearly seen that the ranges of the values are different for different features, and hence, using similar thresholds for all the features may not be appropriate. Therefore, the proposed method is an alternative way to build the standard deviation EAP using the statistics of the training samples. Refer to Section V-B for a comparison between the accuracy values obtained with a manual and the proposed approach for the selection of the attributes.

TABLE I. STANDARD DEVIATION OF THE CLASSES FOR DIFFERENT DBFE FEATURES OF THE UNIVERSITY OF PAVIA DATA. THE VALUES ARE ROUNDED AND GIVEN AS PERCENTAGE OF MEAN OF THE CORRESPONDING FEATURE. (C = CLASSES, F = FEATURES, AND N = NUMBER OF THRESHOLDS SELECTED USING THE PROPOSED METHOD)

IV. EXPERIMENTAL RESULTS

Two data sets, which are often found in the literature, namely, the ROSIS University of Pavia data set and the Airborne Visible Infrared Imaging Spectrometer (AVIRIS) Indian Pines data set, are used in this letter.
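The threshold-selection steps 1)–5) of Section III can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation. The function names (`class_stds`, `select_thresholds`, `profile_size`), the parameter names, the use of the population standard deviation, and the requirement of a positive feature mean are all assumptions made for the example, and the numbers in the usage lines are made up.

```python
from statistics import pstdev


def class_stds(samples_by_class):
    """Step 1: standard deviation of the training pixels of every class.

    Assumption: population standard deviation; the letter does not state
    which estimator is used.
    """
    return {name: pstdev(values) for name, values in samples_by_class.items()}


def select_thresholds(stds, feature_mean, min_interval_pct=2.5, max_pct=25.0):
    """Steps 2-4: turn per-class standard deviations into filter thresholds.

    `stds` are the per-class standard deviations of one feature and
    `feature_mean` is the mean of that feature (assumed positive here so
    that "percentage of the mean" is well defined).
    """
    if feature_mean <= 0:
        raise ValueError("this sketch assumes a positive feature mean")

    # Step 2: ignore values above 25% of the mean, then find the range.
    cutoff = max_pct * feature_mean / 100.0
    kept = [s for s in stds if s <= cutoff]
    if not kept:
        raise ValueError("no class standard deviation is below the cutoff")
    std_min, std_max = min(kept), max(kept)

    # Minimum interval size, e.g., 2.5% of the mean of the feature.
    delta = min_interval_pct * feature_mean / 100.0

    # Step 4: range narrower than one interval, so keep only the extrema.
    if std_max - std_min < delta:
        return [std_min, std_max]

    # Step 3: std_min, std_min + delta, ..., ending exactly at std_max.
    n = int((std_max - std_min) / delta + 1e-9)
    thresholds = [std_min + k * delta for k in range(n + 1)]
    if std_max - thresholds[-1] > 1e-9:
        thresholds.append(std_max)
    else:
        thresholds[-1] = std_max
    return thresholds


def profile_size(thresholds):
    """Size of one AP per (1): n thickenings + original image + n thinnings."""
    return 2 * len(thresholds) + 1


# Made-up per-class standard deviations for a feature whose mean is 100.
example = select_thresholds([4.0, 9.0, 14.0, 30.0], feature_mean=100.0)
print(example)                # -> [4.0, 6.5, 9.0, 11.5, 14.0]
print(profile_size(example))  # -> 11
```

In this made-up example the value 30.0 exceeds 25% of the mean and is dropped in step 2. A feature whose class standard deviations span less than one interval yields two thresholds and hence the minimum profile size of five.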

MARPU et al.: GENERATION OF APS FOR SPECTRAL–SPATIAL CLASSIFICATION OF REMOTE SENSING DATA

Fig. 1. False color representation of (a) the Indian Pines data, (b) test regions, and (c) training regions.

A. Indian Pines Data Set

The Indian Pines data set is a widely used data set acquired using the AVIRIS sensor in 1992 with a spatial resolution of 20 m, consisting of 220 data channels in the 400–2500-nm wavelength range. A false color composite of the data is shown in Fig. 1. Sixteen ground truth classes are identified, namely, Corn-notill (50/1384), Corn-mintill (50/784), Corn (50/184), Grass-pasture (50/447), Grass-trees (50/697), Hay-windrowed (50/439), Soybean-notill (50/918), Soybean-mintill (50/2418), Soybean-clean (50/564), Wheat (50/162), Woods (50/1244), Bld-Grass-Trees-D (50/330), Stone-Steel-Towers (50/45), Alfalfa (15/39), Grass-pasture-mowed (15/11), and Oats (15/5), where the numbers in the brackets indicate the number of available training and test pixels, respectively.

B. University of Pavia Data Set

The image is a widely used data set acquired within the framework of the European Union HySens project using a ROSIS-3 instrument on the flight operated by the German Aerospace Center (DLR) over the University of Pavia in Pavia, Italy. It consists of 115 data channels in the range of 0.43–0.86 μm at a spatial resolution of 1.3 m. Twelve noisy bands were removed, leading to a final number of 103 features. A false color composite of the data set is shown in Fig. 2. Nine ground truth classes are identified, namely, Trees (524/2912), Asphalt (548/6304), Bitumen (375/981), Gravel (392/1815), Metal sheets (265/1113), Shadow (231/795), Bricks (514/3364), Meadows (540/18146), and Bare soil (532/4572), where the number of available training and test pixels for every class is given in the brackets, respectively.

V. RESULTS AND DISCUSSION

Experiments were conducted using various feature reduction algorithms and using the standard deviation EAP (referred to as EAP in the rest of this letter). The classification was done using random forest (RF) [12] and support vector machine (SVM) [13] classifiers to compare the results.
SVM requires an extra step (e.g., by means of cross validation as is done in this work) to identify the optimal parameters, but RF does not require any parameter tuning. Accuracy values were compared using average accuracy (AA), overall accuracy (OA), and the kappa coefficient (Kappa). The experiments were done with both unsupervised (PCA and KPCA) and supervised (DAFE, DBFE, and NWFE) feature reduction algorithms. It has been indicated in [14] that NWFE is better suited when there are limited samples and that DBFE is better suited when sufficient training samples are considered for supervised feature reduction. An extra experiment was conducted in which supervised feature reduction was applied to the EAP itself, reducing the dimensionality of the profile in order to study the effect of the Hughes phenomenon. For this, NWFE was used for the Indian Pines data set, which has limited training data, and DBFE was used for the Pavia data set, which seems to have a sufficient number of training samples.

Below, the profiles are indicated using subscript p, e.g., PCA_p for the EAP built using the features from PCA. The feature reduction of the EAP is denoted using the corresponding first two initials of the feature reduction methods. For example, KPNW indicates NWFE feature reduction of the EAP built using KPCA features extracted from the original data.

The value of 2.5% of the mean is chosen as the minimum interval size in this work. The choice of this parameter is subjective, and it need not be the same in all cases. There is no definite way to choose the minimum interval size. In general, a smaller value produces more features in the profile and hence increases redundancy. A higher value produces fewer features, but not all the spatial information may be accounted for. It is therefore suggested that a smaller value be preferred so that the spatial information is better embedded in the profile.

Fig. 2. False color representation of (a) the University of Pavia image, (b) test regions, and (c) training regions.

TABLE II. RESULTS OF THE CLASSIFICATION OF THE INDIAN PINES IMAGE USING THE (a) RF CLASSIFIER AND (b) SVM CLASSIFIER. THE CLASSIFICATION USING THE ORIGINAL SPECTRAL BANDS IS DENOTED AS ORIG, AND THE OTHER FEATURE REDUCTION METHODS ARE DENOTED USING THE CORRESPONDING ACRONYMS. THE EAPs ARE DENOTED WITH SUBSCRIPT p. THE CORRESPONDING FEATURE SIZE IS GIVEN IN THE BRACKETS. A = ACCURACIES AND F = FEATURES. (a) ACCURACY VALUES IN PERCENTAGE USING RF CLASSIFIER. (b) ACCURACY VALUES IN PERCENTAGE FOR SVM CLASSIFIER

TABLE III. CLASSIFICATION RESULTS OF THE UNIVERSITY OF PAVIA IMAGE USING THE (a) RF CLASSIFIER AND (b) SVM CLASSIFIER. THE CLASSIFICATION USING THE ORIGINAL SPECTRAL BANDS IS DENOTED AS ORIG, AND THE OTHER FEATURE REDUCTION METHODS ARE DENOTED USING THE CORRESPONDING ACRONYMS. THE EAPs ARE DENOTED WITH SUBSCRIPT p. THE CORRESPONDING FEATURE SIZE IS GIVEN IN THE BRACKETS. A = ACCURACIES AND F = FEATURES. (a) ACCURACY VALUES IN PERCENTAGE FOR RF CLASSIFIER. (b) ACCURACY VALUES IN PERCENTAGE FOR SVM CLASSIFIER

A. Indian Pines Data

Table II shows the results of classification with the EAP using RF and SVM classifiers, respectively. It can be observed that very high accuracy values are obtained when the proposed EAP is used. As it was observed in [14], the EAP based on NWFE produces the best result, confirming that NWFE is better suited when limited training samples are available. PCA and KPCA as basis for the EAP also provide reasonably high accuracy values. While comparing RF and SVM classifiers, the effect of the Hughes phenomenon is clearly evident when the EAP is used with SVM. Better accuracy values are achieved with SVM only after a further feature reduction of the EAP, as shown in Table II. The accuracy values obtained with SVM are still lower compared to those with the RF classifier, indicating that RF is more stable when limited training samples are available. The classification maps of the best results using RF and SVM classifiers are shown in Fig. 3.
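The three accuracy measures used to compare the results (OA, AA, and Kappa) can all be derived from a confusion matrix. The sketch below uses the standard textbook definitions. The function name `accuracy_measures` and the example matrix are illustrative assumptions and are not taken from the letter.

```python
def accuracy_measures(confusion):
    """Return (OA, AA, kappa) for a square confusion matrix.

    confusion[i][j] is the number of test pixels whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = [confusion[i][i] for i in range(n)]
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]

    # Overall accuracy: fraction of all test pixels labeled correctly.
    oa = sum(correct) / total

    # Average accuracy: mean of the per-class accuracies, so small classes
    # weigh as much as large ones (classes without test pixels are skipped).
    per_class = [correct[i] / row_sums[i] for i in range(n) if row_sums[i] > 0]
    aa = sum(per_class) / len(per_class)

    # Kappa: agreement corrected for the agreement expected by chance.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (oa - expected) / (1.0 - expected)
    return oa, aa, kappa


# Made-up two-class example with unbalanced test sets (100 vs. 10 pixels).
oa, aa, kappa = accuracy_measures([[90, 10], [5, 5]])
print(round(oa, 4), round(aa, 4))  # -> 0.8636 0.7
```

With unbalanced classes, as in the made-up example, OA is dominated by the large class while AA exposes the weaker per-class accuracy of the small one, which is why reporting both is informative for data sets such as Indian Pines, where some classes have very few test pixels.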
Fig. 3. Classification maps of the best results obtained by (a) RF classifier using the EAP of NWFE features and (b) SVM classifier using NWFE feature reduction of the EAP of KPCA features.

B. University of Pavia Data

Table III shows the results of classification with the EAP using RF and SVM classifiers, respectively. It can be observed that very high accuracy values are obtained when the proposed EAP is used. As it was observed in [14], DAFE and DBFE provide the best accuracy values. The accuracy values provided by the RF classifier seem to be consistently better than those achieved by the SVM classifier.

As a comparison with a conventional approach for the selection of the reference values, we recall that the overall accuracy obtained for this data set in [7], using an RF classifier and a 36-level EAP computed considering the standard deviation attribute and 8 equally spaced reference values on each of the first four PCs, was 78.68%. The proposed approach, for the same settings, led to an OA value of 90.95% [see Table III(b)], an increment of more than 12 percentage points. This demonstrates the effectiveness of the proposed selection procedure with respect to a conventional manual approach. However, it has to be noted that the parameters are subjectively chosen in the manual approach, whereas the proposed method identifies the parameters based on the training data.

Even when a sufficient number of training samples were available, the SVM classifier required further feature reduction of the profile to achieve the highest accuracy. The EAP based on KPCA produces consistent accuracy values except for the Meadows class, which is a highly nonhomogeneous region and, as expected, is a difficult class to extract in an unsupervised manner. The classification maps of the best results using RF and SVM classifiers are shown in Fig. 4.

Fig. 4. Classification maps of the best results obtained by (a) RF classifier using the EAP of DBFE features and (b) SVM classifier using DBFE feature reduction of the EAP of DBFE features.

It can be observed from the results of both data sets that the EAP based on supervised feature reduction provides the best accuracy values, but the EAP of KPCA features also produces good classification maps consistently.

VI. CONCLUSION

In this letter, an attempt has been made to automatically generate the standard deviation-based APs for remote sensing data classification. The obtained results show that very high accuracy values are obtained using the proposed method even if only one attribute (standard deviation) is used. The accuracy values might improve if additional attributes are also used. In the future, methods to automatically generate APs using shape-based attributes such as area will be studied in order to add more spatial information. The results also agree with the results reported in [14], where it has been observed that the accuracy values depend on the choice of the feature reduction algorithm. NWFE is more accurate when limited samples are used, but DBFE is more accurate when sufficient samples are used. However, KPCA produces the most consistent results. In addition, RF is less affected by the Hughes phenomenon than the SVM, which seems to perform better when the dimensionality of the profile is reduced.

ACKNOWLEDGMENT

The authors would like to thank Prof. P. Gamba of the University of Pavia, Italy, for providing the Pavia University data set and also Prof. D. Landgrebe and the LARS group of Purdue University, USA, for providing the Indian Pines data set freely along with the MultiSpec software.

REFERENCES

[1] M. Pesaresi and J. A. Benediktsson, "A new approach for the morphological segmentation of high-resolution satellite imagery," IEEE Trans. Geosci. Remote Sens., vol. 39, no. 2, pp. 309–320, Feb. 2001.
[2] J. A. Benediktsson, M. Pesaresi, and K. Arnason, "Classification and feature extraction for remote sensing images from urban areas based on morphological transformations," IEEE Trans. Geosci. Remote Sens., vol. 41, no. 9, pp. 1940–1949, Sep. 2003.
[3] J. A. Benediktsson, J. A. Palmason, and J. R. Sveinsson, "Classification of hyperspectral data from urban areas based on extended morphological profiles," IEEE Trans. Geosci. Remote Sens., vol. 43, no. 3, pp. 480–491, Mar. 2005.
[4] M. Dalla Mura, J. A. Benediktsson, B. Waske, and L. Bruzzone, "Morphological attribute profiles for the analysis of very high resolution images," IEEE Trans. Geosci. Remote Sens., vol. 48, no. 10, pp. 3747–3762, Oct. 2010.
[5] M. Dalla Mura, A. Villa, J. A. Benediktsson, J. Chanussot, and L. Bruzzone, "Classification of hyperspectral images by using extended morphological attribute profiles and independent component analysis," IEEE Geosci. Remote Sens. Lett., vol. 8, no. 3, pp. 541–545, May 2011.
[6] P. Soille, Morphological Image Analysis: Principles and Applications, 2nd ed. Berlin, Germany: Springer-Verlag, 2003.
[7] M. Dalla Mura, J. A. Benediktsson, B. Waske, and L. Bruzzone, "Extended profiles with morphological attribute filters for the analysis of hyperspectral data," Int. J. Remote Sens., vol. 31, no. 22, pp. 5975–5991, Jul. 2010.
[8] E. J. Breen and R. Jones, "Attribute openings, thinnings, and granulometries," Comput. Vis. Image Underst., vol. 64, no. 3, pp. 377–389, Nov. 1996.
[9] K. Pearson, "On lines and planes of closest fit to systems of points in space," Phil. Mag., vol. 2, no. 6, pp. 559–572, 1901.
[10] B. Schölkopf, A. J. Smola, and K.-R. Müller, "Nonlinear component analysis as a kernel eigenvalue problem," Neural Comput., vol. 10, no. 5, pp. 1299–1319, Jul. 1998.
[11] D. A. Landgrebe, Signal Theory Methods in Multispectral Remote Sensing. London, U.K.: Wiley, 2003.
[12] L. Breiman, "Random forests," Mach. Learn., vol. 45, no. 1, pp. 5–32, 2001.
[13] B. Schölkopf and A. J. Smola, Learning With Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. Cambridge, MA: MIT Press, 2002.
[14] M. Pedergnana, M. Dalla Mura, S. Peeters, J. A. Benediktsson, and L. Bruzzone, "Classification of hyperspectral data using extended attribute profiles based on supervised and unsupervised feature extraction techniques," Int. J. Image Data Fusion, in press.