You are on page 1of 5

1

GPU Implementation of an Automatic Target Detection and Classication Algorithm for Hyperspectral Image Analysis
Sergio Bernab e, Sebasti an L opez, Member, IEEE, Antonio Plaza, Senior Member, IEEE, and Roberto Sarmiento

AbstractIn many applications, detecting objects (targets) is a very important task. In particular, algorithms for detecting (moving or static) targets in remotely sensed hyperspectral images often require real-time responses for swift decisions that depend upon high computing performance of algorithm analysis. The automatic target detection and classication algorithm (ATDCA) has been widely used for this purpose. In this letter, we develop two new optimizations for accelerating the computational performance of ATDCA. The rst one focuses on the use of the GramSchmidt orthogonalization method instead of the orthogonal projection process adopted by the classic algorithm. The second one is focused on the efcient implementation of the algorithm on commodity graphics processing units (GPUs), achieving realtime performance of ATDCA for the rst time in the literature. The proposed optimizations are evaluated not only in terms of target detection accuracy, but also in terms of computational performance using two different GPU architectures by NVidiaTM : Tesla C1060 and GeForce GTX 580. Experiments are conducted using hyperspectral data collected by the NASAs Airborne Visible InfraRed Imaging Spectrometer (AVIRIS) over the World Trade Center (WTC) complex in New York City and the Cuprite mining district in Nevada. These results reveal considerable acceleration factors while retaining the same target detection accuracy for the algorithm. Index TermsHyperspectral imaging, automatic target detection and classication algorithm (ATDCA), Gram-Schmidt orthogonalization, commodity graphics processing units (GPUs).

of least square-based error minimization [7]. The IEA uses a similar approach, but with a different initialization condition. The RX algorithm is based on the well-known Mahalanobis distance [8]. Many other target/anomaly detection algorithms have also been proposed in the recent literature, using different concepts such as background modeling and characterization [9]. Different quantitative and comparative assessments of target detection algorithms have revealed very good performance of ATDCA with regards to other approaches in terms of both detection accuracy and computational performance [1], [10]. Depending on the complexity and dimensionality of the hyperspectral data the aforementioned algorithms may be computationally very expensive, a fact that limits the possibility of utilizing those algorithms in time-critical applications [11]. In turn, the wealth of spectral information available in hyperspectral imaging data opens ground-breaking perspectives for improving target detection. In many applications it is of critical importance that automatic target detection algorithms complete their analysis quickly enough for practical use. Despite the growing interest in parallel hyperspectral imaging research [12], only a few parallel implementations of automatic target detection algorithms for hyperspectral data exist in the open literature [13]. However, with the recent explosion in the amount and dimensionality of hyperspectral imagery, parallel processing is expected to become a requirement in most remote sensing missions. In this letter, we develop the rst real-time implementation of the ATDCA algorithm for remotely sensed hyperspectral data exploitation. It is based on two optimizations: i) use of the Gram-Schmidt orthogonalization method instead of the OSP process adopted by the classic algorithm, which has already demonstrated its efectiveness when applied to the vertex component analysis (VCA) endmember extraction algorithm [14], and ii) efcient implementation on commodity graphics processing units (GPUs), a low-weight hardware platform that now offers a tremendous potential to bridge the gap towards real-time analysis of remotely sensed hyperspectral data [15] [20]. The remainder of this letter is organized as follows. Section II describes the original ATDCA and the proposed GramSchmidt optimization. Section III describes a GPU implementation of this algorithm. Section IV evaluates the proposed optimizations in terms of target detection accuracy and computational performance. Section V concludes the letter with some remarks and hints at plausible future research lines.

I. I NTRODUCTION Hyperspectral target detection and identication is a very important task in remotely sensed hyperspectral data exploitation [1]. Over the last few years, several algorithms have been developed for the purpose of target identication, including the automatic target detection and classication (ATDCA) algorithm [2], an unsupervised fully-constrained least squares (UFCLS) algorithm [3], an iterative error analysis (IEA) algorithm [4], or the well-known RX algorithm developed by Reed and Xiaoli for anomaly detection purposes [5]. The ATDCA algorithm nds a set of spectrally distinct target pixels vectors using the concept of orthogonal subspace projection (OSP) [6] in the spectral domain. On the other hand, the UFCLS algorithm generates a set of distinct targets using the concept
Sergio Bernab e and Antonio Plaza are with the Hyperspectral Computing Laboratory, Department of Technology of Computers and Communications, University of Extremadura, E-10071 C aceres, Spain. E-mail: {sergiobernabe, aplaza}@unex.es. Sebasti an L opez and Roberto Sarmiento are with the Institute for Applied Microelectronic - (IUMA), University of Las Palmas de Gran Canaria, E35017 Tara Baja, Spain. E-mail: {seblopez, roberto}@iuma.ulpgc.es.

II. ATDCA AND ITS G RAM -S CHMIDT O PTIMIZATION The original ATDCA algorithm, based on OSP concepts, will be referred hereinafter as ATDCA-OSP. The optimization using the Gram-Schmidt method will be referred hereinafter as ATDCA-GS. This optimization allows calculating the OSP without requiring the computation of the inverse of the matrix that contains the targets already identied in the image. A. ATDCA-OSP Let x0 be an initial target signature (i.e., the pixel vector with maximum length in the original n-dimensional hyperspectral image F Rn ). This algorithm uses an OSP operator which is given by the following expression [6]:
PU = I U(UT U)1 UT ,

the same k-dimensional subspace of Rn (k n) as A. In particular, B is obtained as follows:


b1 b2 b3 b4 . . . bk = a1 , = a2 proj b1 (a2 ), = a3 proj b1 (a3 ) proj b2 (a3 ), = a4 proj b1 (a4 ) proj b2 (a4 ) proj b3 (a4 ), = ak
k1 j =1

e1 e2 e3 e4

= = = =

b1 b1 b2 b2 b3 b3 b4 b4

projbj (ak ),

. . . bk ek = bk , (2)

where the projection operator is dened in Eq. (3), in which < a, b > denotes the inner product of vectors a and b. proj b (a) = < a, b > b. < b, b > (3)

(1)

where U is a matrix of spectral signatures, UT is the transpose of this matrix, and I is the identity matrix. This OSP operator is applied to all image pixels, with U = [x0 ]. It then nds a target signature, x1 , with the maximum projection in < x0 > , which is the orthogonal complement space linearly spanned by x0 . A second target signature x2 can then be found by applying another OSP operator PU , with U = [x0 , x1 ], to the original image, where the target signature that has the maximum orthogonal projection in < x0 , x1 > is selected as x2 . The above procedure is repeated until a set of target pixels {x0 , x1 , , xt1 } is extracted, where t is an input parameter to the algorithm. This process is summarized in Algorithm 1. Algorithm 1 Pseudocode of ATDCA-OSP
1: INPUTS: F Rn and t;

2: 3: 4:

5: 6:

7: 8: 9:

% F denotes an n-dimensional hyperspectral image with r pixels and t denotes the number of targets to be detected U =[x0 | 0 |, , | 0]; % x0 is the pixel vector with maximum length in F for i = 1 to t 1 do = I U(UT U)1 UT ; PU is a vector orthogonal to the subspace spanned by the % PU columns of U F; v = PU % F is projected onto the direction indicated by PU i = argmax{1, ,r} v[:, i]; % The maximum projection value is found, where r denotes the total number of pixels in the hyperspectral image and the operator : denotes all elements xi U[:, i + 1] = F[:, i]; % The target matrix is updated end for OUTPUT: U = [x0 , x1 , , xt1 ];

The sequence b1 , , bk in Eq. (2) represents the set of orthogonal vectors generated by the Gram-Schmidt method, and thus, the normalized vectors e1 , , ek in Eq. (2) form an orthonormal set. As far as B spans the same k -dimensional subspace of Rn as A, an additional vector bk+1 computed by following the procedure stated at Eq. (2) is also orthogonal to all the vectors included in A and B. This algebraic assertion constitutes the cornerstone of the ATDCA-GS algorithm, whose pseudocode is represented in Algorithm 2. As it can be observed from Algorithm 2, the computation of the orthogonal projector PU starts by rst initializing it to an arbitrary n-dimensional vector (step 6 of Algorithm 2). Then, the Gram-Schmidt orthogonal set B is generated from the targets already detected in the image (steps 710 of Algorithm 2). Finally, PU is updated in steps 1114 of Algorithm 2 by forcing it to be orthogonal to all the vectors in B and, as a consequence, to all the targets already stored in U. At this point, it is important to emphasize that the n components of the orthogonal projector PU could be initialized to any other values, since this would not affect its orthogonality with respect to the targets already extracted from the input hyperspectral image F. III. GPU IMPLEMENTATION In this section we describe the second optimization applied to the ATDCA-OSP algorithm, which consists of efciently implementing its ATDCA-GS variation in GPUs. It should be noted that a GPU implementation of the original ATDCA-OSP algorithm was presented in [13]. In the context of NVidiaTM compute unied device architecture (CUDA)1 adopted for our implementation, GPUs can be abstracted in terms of a stream model, under which all data sets are represented as streams (i.e., ordered data sets) [21]. The architecture of a GPU can be seen as a set of multiprocesors (MPs), where each multiprocessor is characterized by a single instruction multiple data (SIMD) architecture, i.e., in each clock cycle each processor executes the same instruction but operating on multiple data streams. Each processor has access to a local shared memory and also to local cache memories in the multiprocessor, while the multiprocessors have access to the global GPU (device) memory. Algorithms are constructed
1 http://www.nvidia.com/object/cuda

B. ATDCA-GS In this subsection we describe the rst optimization proposed for ATDCA-OSP, which consists of using the GramSchmidt method for orthogonalization purposes. This process selects a nite set of linearly independent vectors A = {a1 , , ak } in the inner product space Rn in which the original hyperspectral image F is dened, and generates an orthogonal set of vectors B = {b1 , , bk } which spans

home new.html

Algorithm 2 Pseudocode of ATDCA-GS


1: INPUTS: F Rn and t;

2: 3:

4: 5:

6: 7: 8: 9: 10:

11: 12: 13: 14:

15: 16:

17: 18: 19:

% F denotes an n-dimensional hyperspectral image with r pixels and t denotes the number of targets to be detected U =[x0 | 0 |, , | 0]; % x0 is the pixel vector with maximum length in F B =[0 | 0 |, , | 0 |]; % B is an auxiliary matrix for storing the orthogonal base generated by the Gram-Schmidt process for i = 1 to t 1 do B[:, i] = U[:, i]; % the i-th column of B is initialized with the target computed in the last iteration (here, the operator : denotes all elements) PU = [1, , 1]; for j = 2 to i do U[:,i]T B[:,j 1] proj B[:,j 1] (U[:, i]) = B[: B[:, j 1]; ,j 1]T B[:,j 1] B[:, i] = B[:, i] proj B[:,j 1] (U[:, i]); % The i-th column of B is updated end for j % The computation of B is nished for the current iteration of the main loop for k = 1 to i do wT B[:,k] proj B[:,k] (w) = B[: B[:, k]; ,k]T B[:,k] PU = PU proj B[:,k] (w); end for k is nished for the current iteration % The computation of PU of the main loop F; v = PU % F is projected onto the direction indicated by PU i = argmax{1, ,r} v[:, i]; % The maximum projection value is found, where r denotes the total number of pixels in the hyperspectral image xi U[:, i + 1] = F[:, i]; % The target matrix is updated end for OUTPUT: U = [x0 , x1 , , xt1 ];

Fig. 1. False color composition of an AVIRIS hyperspectral image collected by NASAs Jet Propulsion Laboratory over lower Manhattan on Sept. 16, 2001 (left). Location of thermal hot spots in the res observed in World Trade Center area, available online: http://pubs.usgs.gov/of/2001/ofr01-0429/hotspot.key.tgif.gif (right).

the Gram-Schmidt method as detailed in Algorithm 2. This operation is performed in the CPU. A new kernel, in which the number of blocks equals the number of lines in the hyperspectral image and the number of threads equals the number of samples, is now applied to project the orthogonal vector onto each pixel in the image. The maximum of all projected pixels is calculated using a separate kernel which obtains the new target x1 . 3) The algorithm now extends the target matrix as U = [x0 x1 ] and repeats from step 2 until the desired number of targets (specied by the input parameter t) has been detected. The output of the algorithm is a set of targets U = [x0 , x1 , , xt1 ]. IV. E XPERIMENTAL R ESULTS

by chaining so-called kernels which operate on entire streams and which are executed by a multiprocessor, taking one or more streams as inputs and producing one or more streams as outputs. Thereby, data-level parallelism is exposed to hardware, and kernels can be concurrently applied without any sort of synchronization. The kernels can perform a kind of batch processing arranged in the form of a grid of blocks, where each block is composed by a group of threads which share data efciently through the shared local memory and synchronize their execution for coordinating accesses to memory. Next we describe the different steps that we followed for the development of the GPU version of the ATDCA-GS algorithm: 1) Once the hyperspectral image F is mapped onto the GPU memory, a structure is created in which the number of blocks equals the number of lines in the hyperspectral image and the number of threads equals the number of samples. A kernel is now used to calculate the brightest pixel x0 in F. This kernel computes (in parallel) the dot product between each pixel vector and its own transposed version, retaining the pixel that results in the maximum projection value. 2) Once the brightest pixel in F has been identied, the pixel is allocated as the rst column in matrix U. The algorithm now calculates the orthogonal vectors through

A. Hyperspectral image data The rst data set used in our experiments was collected by the AVIRIS sensor over the World Trade Center (WTC) area in New York City on September 16, 2001, just ve days after the terrorist attacks that collapsed the two main towers and other buildings in the WTC complex. The data set consists of 614 512 pixels, 224 spectral bands and a total size of (approximately) 140 MB. The spatial resolution is 1.7 meters per pixel. The leftmost part of Fig. 1 shows a false color composite of the data set selected for experiments using the 1682, 1107 and 655 nm channels, displayed as red, green and blue, respectively. Vegetated areas appear green in the leftmost part of Fig. 1, while burned areas appear dark gray. Smoke coming from the WTC area (in the red rectangle) and going down to south Manhattan appears bright blue due to high spectral reectance in the 655 nm channel. Extensive reference information, collected by U.S. Geological Survey (USGS), is available online for the WTC scene. In this work, we use as a ground-truth reference a U.S. Geological Survey thermal map which shows the target locations of the thermal hot spots at the WTC area, displayed as bright red, orange and yellow spots at the rightmost part of Fig. 1. The second data set used in experiments was collected by the AVIRIS sensor over the Cuprite mining district in

TABLE II S PECTRAL ANGLE VALUES ( IN DEGREES ) BETWEEN THE TARGET PIXELS EXTRACTED BY ATDCA-OSP AND ATDCA-GS AND THE KNOWN GROUND TARGETS IN THE AVIRIS C UPRITE SCENE .
Version ATDCA-OSP ATDCA-GS Alunite 4.81 5.48 Buddingtonite 4.16 4.08 Calcite 9.52 5.87 Kaolinite 10.76 11.14 Muscovite 5.29 5.68 Average 6.91 6.45

Fig. 2. (a) False color composition of the AVIRIS hyperspectral over the Cuprite mining district in Nevada and (b) U.S. Geological Survey mineral spectral signatures used for validation purposes. TABLE I S PECTRAL ANGLE VALUES ( IN DEGREES ) BETWEEN THE TARGET PIXELS EXTRACTED BY ATDCA-OSP AND ATDCA-GS AND THE KNOWN GROUND TARGETS IN THE AVIRIS W ORLD T RADE C ENTER SCENE .
Version ATDCA-OSP ATDCA-GS A 0.00 0.00 B 14.43 27.16 C 0.00 0.00 D 27.38 15.62 E 20.32 27.81 F 7.13 3.98 G 4.15 2.72 H 31.27 24.26

cases, the number of target pixels to be detected was set to t = 19 after calculating the VD. As shown by Table II, the ATDCA-GS extracted targets which were slightly more similar (on average) to the ground references than those provided by ATDCA-OSP. This indicates that the proposed Gram-Schmidt optimization does not penalize the ATDCA algorithm in terms of target detection accuracy. C. Analysis of parallel performance Two different GPU platforms have been used in our experiments. The rst one is the NVidiaTM Tesla C1060 GPU4 , which features 240 processor cores operating at 1.296 GHz, with single precision oating point performance of 933 Gops, double precision oating point performance of 78 Gops, total dedicated memory of 4 GB, 800 MHz memory (with 512-bit GDDR3 interface) and memory bandwidth of 102 GB/s. The second one is the NVidiaTM GeForce GTX 580 GPU5 , which features 512 processor cores operating at 1.544 GHz, with single precision oating point performance of 1,354 Gops, double precision oating point performance of 198 Gops, total dedicated memory of 1,536 MB, 2,004 MHz memory (with 384-bit GDDR5 interface) and memory bandwidth of 192.4 GB/s. Both GPUs are connected to an Intel core i7 920 CPU at 2.67 GHz with 8 cores, which uses a motherboard Asus P6T7 WS SuperComputer. Before analyzing the parallel performance of the proposed GPU implementation, we emphasize that our parallel versions provide exactly the same results as the corresponding serial versions, executed in one of the cores of the i7 920 CPU and implemented using the gcc (gnu compiler default) with optimization ag O3 to exploit data locality and avoid redundant computations. As a result, the only difference between the serial and parallel versions is the time they need to complete their calculations. The reported GPU times are the mean of ten executions in each platform (the measured times were always very similar, with differences if any on the order of only a few milliseconds). Table III reports the processing times obtained for the GPU implementations of ATDCA-OSP and ATDCA-GS on the two considered GPU architectures, and for the two considered hyperspectral scenes. It should be noted that the GPU implementation of ATDCA-OSP corresponds to an improvement of the one described in [13] (some kernels were further optimized), while the GPU implementation of ATDCA-GS is the one described in this letter. As shown by Table III, the ATDCA-GS achieved signicant speedups in both GPU architectures and offered signicant improvements with regards to
4 http://www.nvidia.com/object/product

Nevada in the summer of 1997. It is available online (in reectance units) after atmospheric correction2 . The portion used in experiments corresponds to a 350 350-pixels subset of the sector labeled as f970619t01p02 r02 sc03.a.r in the online data, which comprise 188 spectral bands in the range from 400 to 2500 nm and a total size of around 50 MB. Water absorption bands as well as bands with low signal-tonoise ratio (SNR) were removed prior to the analysis. The site is well understood mineralogically, and has several exposed minerals of interest, including alunite, buddingtonite, calcite, kaolinite, and muscovite. Reference ground signatures of the above minerals, available in the form of a USGS library3 will be used in this work for evaluation purposes. B. Analysis of target detection accuracy 1) Results with the AVIRIS WTC scene: Table I shows the spectral angle distance (SAD) [1] values (in degrees) between the most similar target pixels detected by ATDCA-OSP and the pixel vectors at the known target positions, labeled from A to H in the rightmost part of Fig. 1. The same results are reported for the ATDCA-GS. In all cases, the number of target pixels to be detected was set to t = 30 after calculating the virtual dimensionality (VD) of the data [22]. As shown by Table I, the ATDCA-OSP and ATDCA-GS extracted targets which were similar, spectrally, to the known ground-truth targets. Both versions were able to perfectly detect the targets labeled as A and C, and had more difculties in detecting other targets. In the case of targets labeled as D to H, the ATDCA-GS improved the target detection results (lower SAD values in Table I) with regards to the ATDCA-OSP. 2) Results for the AVIRIS Cuprite scene: Table II shows the SAD values (in degrees) between the most similar target pixels detected by the two considered versions: ATDCA-OSP and ATDCA-GS, and the pixel vectors at the known positions of the target minerals in the AVIRIS Cuprite image. In all
2 http://aviris.jpl.nasa.gov 3 http://speclab.cr.usgs.gov

tesla c1060 us.html

5 http://www.nvidia.com/object/product-geforce-gtx-580-us.html

TABLE III P ROCESSING TIMES ( SECONDS ) AND SPEEDUPS ACHIEVED BY ATDCA-OSP AND ATDCA-GS IN TWO DIFFERENT GPU S .
AVIRIS World Trade Center ATDCA-OSP Serial time Time Tesla C1060 GPU Time GeForce GTX 580 GPU Speedup Tesla C1060 GPU Speedup GeForce GTX 580 GPU 512.112 52.063 11.575 9.84 44.24 ATDCA-GS 7.323 0.522 0.182 14.03 40.25 AVIRIS Cuprite ATDCA-OSP 87.982 9.088 2.125 9.68 41.40 ATDCA-GS 1.595 0.137 0.057 11.65 28.08

R EFERENCES
[1] C.-I. Chang, Hyperspectral Imaging: Techniques for Spectral Detection and Classication. Kluwer Academic/Plenum Publishers: New York, 2003. [2] H. Ren and C.-I. Chang, Automatic spectral target recognition in hyperspectral imagery, IEEE Transactions on Aerospace and Electronic Systems, vol. 39, no. 4, pp. 12321249, 2003. [3] D. Heinz and C.-I. Chang, Fully constrained least squares linear mixture analysis for material quantication in hyperspectral imagery, IEEE Trans. Geosci. Remote Sens., vol. 39, pp. 529545, 2000. [4] R. A. Neville, K. Staenz, T. Szeredi, J. Lefebvre, and P. Hauff, Automatic endmember extraction from hyperspectral data for mineral exploration, Proc. 21st Canadian Symp. Remote Sens., pp. 2124, 1999. [5] I. Reed and X.Yu, Adaptive multiple-band cfar detection of an optical pattern with unknown spectral distrihution. IEEE Trans. Acoustics, Speech and Signal Processing, vol. 38, pp. 17601770, 1990. [6] J. C. Harsanyi and C.-I. Chang, Hyperspectral image classication and dimensionality reduction: An orthogonal subspace projection, IEEE Trans. Geosci. Remote Sens., vol. 32, no. 4, pp. 779785. [7] C.-I. Chang and D. Heinz, Constrained subpixel target detection for remotely sensed imagery, IEEE Trans. Geosci. Remote Sens., vol. 38, pp. 11441159, 2000. [8] J. A. Richards and X. Jia, Remote Sensing Digital Image Analysis: An Introduction. Springer, 2006. [9] G. Shaw and D. Manolakis, Signal processing for hyperspectral image exploitation, IEEE Signal Processing Magazine, vol. 19, pp. 1216, 2002. [10] C.-I. Chang and H. Ren, An experiment-based quantitative and comparative analysis of hyperspectral target detection and image classication algorithms for hyperspectral imagery, IEEE Trans. Geosci. Remote Sens., vol. 2, p. 1044. [11] A. Plaza and C.-I. Chang, High Performance Computing in Remote Sensing. Taylor & Francis: Boca Raton, FL, 2007. [12] A. Plaza, J. Plaza, A. Paz, and S. Sanchez, Parallel hyperspectral image and signal processing, IEEE Signal Processing Magazine, vol. 28, no. 3, pp. 119126, 2011. [13] A. Paz and A. Plaza, Clusters versus GPUs for parallel automatic target detection in remotely sensed hyperspectral images, EURASIP Journal on Advances in Signal Processing, vol. 915639, pp. 118, 2010. [14] S. Lopez, P. Horstrand, G. M. Callico, J. F. Lopez, and R. Sarmiento, A low-computational-complexity algorithm for hyperspectral endmember extraction: Modied vertex component analysis, IEEE Geoscience and Remote Sensing Letters, vol. PP , Issue:99, pp. 15, 2011. [15] J. Setoain, M. Prieto, C. Tenllado, and F. Tirado, GPU for parallel onboard hyperspectral image processing, International Journal of High Performance Computing Applications, vol. 22, no. 4, pp. 424437, 2008. [16] Y. Tarabalka, T. V. Haavardsholm, I. Kasen, and T. Skauli, Real-time anomaly detection in hyperspectral images using multivariate normal mixture models and gpu processing, Journal of Real-Time Image Processing, vol. 4, pp. 114, 2009. [17] H. Yang, Q. Du, and G. Chen, Unsupervised hyperspectral band selection using graphics processing units, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 4, no. 3, pp. 660668, 2011. [18] J. A. Goodman, D. Kaeli, and D. Schaa, Accelerating an imaging spectroscopy algorithm for submerged marine environments using graphics processing units, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 4, no. 3, pp. 669676, 2011. [19] E. Christophe, J. Michel, and J. Inglada, Remote sensing processing: From multicore to GPU, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 4, no. 3, pp. 643652, 2011. [20] S. Sanchez, A. Paz, G. Martin, and A. Plaza, Parallel Unmixing of Remotely Sensed Hyperspectral Images on Commodity Graphics Processing Units, Concurrency and Computation: Practice & Experience, vol. 23, no. 13, pp. 15381557, 2011. [21] J. Setoain, M. Prieto, C. Tenllado, A. Plaza, and F. Tirado, Parallel morphological endmember extraction using commodity graphics hardware, IEEE Geosci. Remote Sens. Lett., vol. 43, pp. 441445, 2007. [22] Q. Du and C.-I. Chang, Estimation of number of spectrally distinct signal sources in hyperspectral imagery, IEEE Trans. Geosci. Remote Sens., vol. 42, no. 3, pp. 608619, 2004. [23] R. O. Green et al., Imaging spectroscopy and the airborne visible/infrared imaging spectrometer (AVIRIS), Remote Sensing of Environment, vol. 65, no. 3, pp. 227248, 1998.

the previously available GPU implementation of ATDCA-OSP. The slightly lower speedups achieved for the AVIRIS Cuprite image (50 MB in size) compared to those obtained for the AVIRIS World Trade Center (140 MB in size) indicate that the ATDCA-GS provides more signicant acceleration factors as the amount of data to be processed is larger. The processing times achieved by the GPU implementation of ATDCA-GS are strictly in real-time. The cross-track line scan time in AVIRIS, a push-broom instrument [23], is quite fast (8.3 milliseconds to collect 512 full pixel vectors). This introduces the need to process the AVIRIS World Trade Center scene (614 512 pixels and 224 spectral bands) in less than 5.09 seconds and the AVIRIS Cuprite scene (350 350 pixels and 188 spectral bands) in less than 1.98 seconds in order to achieve real-time performance. As noted in Table III, all the proposed GPU implementations of ATDCA-GS are well below one second in processing time, including data transfers from CPU to GPU and vice-versa. This represents a very signicant improvement in the computational performance of ATDCA with regards to previous implementations in the literature [13], [20]. V. C ONCLUSIONS AND FUTURE RESEARCH In this letter, we have developed the ATDCA based on two optimizations: i) the use of Gram-Schmidt orthogonalization method instead of the orthogonal subspace projector commonly applied by this algorithm, and ii) the exploitation of GPUs to efciently implement the algorithm. The experimental results are very encouraging, reporting signicant speedups with regards to previously available software and hardware versions, and with strictly real-time performance in two different hyperspectral analysis scenarios. Although these results are very encouraging, GPUs are still far from being exploited in real missions due to power consumption and radiation tolerance issues. Currently we are experimenting with FPGA platforms in order to be able to adapt the proposed algorithms to hardware devices that can be mounted onboard hyperspectral imaging instruments after certication by international space agencies. ACKNOWLEDGMENT This work has been supported by the European Communitys Marie Curie Research Training Networks Programme, reference MRTN-CT-2006-035927 (HYPER-I-NET). Funding from the Spanish Ministry of Science and Innovation (HYPERCOMP/EODIX project, reference AYA2008-05965-C0402; DREAMS project, reference TEC2011-28666-C04-04) is gratefully acknowledged.

You might also like