
Chein-I Chang

Real-Time Recursive Hyperspectral Sample and Band Processing:
Algorithm Architecture and Implementation

Chein-I Chang
Center for Hyperspectral Imaging in Remote Sensing (CHIRS)
Information and Technology College
Dalian Maritime University
Dalian, China
Remote Sensing Signal and Image Processing Laboratory (RSSIPL)
University of Maryland, Baltimore County (UMBC)
Baltimore, MD, USA

ISBN 978-3-319-45170-1    ISBN 978-3-319-45171-8 (eBook)
DOI 10.1007/978-3-319-45171-8

Library of Congress Control Number: 2016955822

© Springer International Publishing Switzerland 2017


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission
or information storage and retrieval, electronic adaptation, computer software, or by similar or
dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt
from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained
herein or for any errors or omissions that may have been made. The publisher remains neutral with
regard to jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by Springer Nature


The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
This book is dedicated to the fourth
anniversary of the passing of my mother
(張郭貴蘭女士), who passed away on
November 28, 2012, while this book was being
written. Her constant inspiration and endless
support were my driving force to complete
this book.
Preface

Owing to recent advances in hyperspectral imaging technology, with as many as
hundreds of spectral bands being used for data acquisition, an immediate issue is
dealing with enormous data volumes in an effective and efficient manner. Several
approaches have been studied to resolve this issue. One is focused on algorithm
design and analysis from a signal processing point of view (Chang 2003, 2013).
Another approach involves data compression from a data processing point of view
(Mota 2009; Huang 2011). A third approach is parallel processing from a data
structure point of view (Plaza and Chang 2007). A fourth approach is real-time
processing from a timely implementation point of view. A fifth approach is
progressive processing from a data transmission and communication point of view.
Finally, a sixth approach is Field Programmable Gate Array (FPGA) design from a
hardware point of view. While the first three have been investigated and studied
extensively over the past years, the last three seem relatively new and have yet to be
explored in the literature. This book intends to address issues arising in connection
with the last three approaches. It is an outgrowth of my recent research on the
design and development of algorithms for real-time processing of hyperspectral
imagery, which were supposed to be covered in Chang (2013) but could not be
included due to space limitations. Its main theme is real-time processing, which has
attracted considerable interest in recent years. However, to treat this subject more
effectively, this book introduces a new concept, Progressive HyperSpectral Imaging
(PHSI), which will be explored for the first time. Since PHSI is primarily developed
from the need for data communication in real-time processing, its idea originates
from two different means of data acquisition by sensors: the Band-Interleaved-by-
Sample (BIS) process, which operates on data sample vectors with full band
information sequentially, sample by sample, and the Band SeQuential (BSQ) process,
which operates on full data samples in individual bands, band by band sequentially
(Schowengerdt 1997). As a result, in light of PHSI, various approaches to processing
data can be developed under this umbrella, for example, onboard processing, online
processing, sequential processing, iterative processing, causal processing and real-
time processing.
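The difference between the two acquisition formats can be illustrated with a toy data cube. This is a minimal sketch; the array shapes and variable names below are illustrative only and not taken from any particular sensor:

```python
import numpy as np

# Toy hyperspectral cube: 2 rows x 3 columns of pixels, 4 spectral bands.
rows, cols, bands = 2, 3, 4
cube = np.arange(rows * cols * bands).reshape(rows, cols, bands)

# BIS/BIP delivery: one full spectral vector per sample, sample by sample.
bis_stream = cube.reshape(-1, bands)      # shape (6, 4): 6 samples, 4 bands each
sample_0 = bis_stream[0]                  # all 4 band values of the first pixel

# BSQ delivery: one full spatial image per band, band by band.
bsq_stream = np.moveaxis(cube, -1, 0)     # shape (4, 2, 3): 4 bands, each 2 x 3
band_0 = bsq_stream[0]                    # the complete image of the first band

# Both streams carry the same values; only the delivery order differs.
assert np.array_equal(np.moveaxis(bsq_stream, 0, -1), cube)
```

Sample-wise PHSI consumes the data in the first order, band-wise PHSI in the second.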


Two types of PHSI are designed and developed in this book to process
hyperspectral data in real time. One is sample-wise PHSI which processes
hyperspectral data sample by sample in a progressive manner, with full bands of
each data sample vector being processed. This is generally used to address the issue
of the BIS process. An approach to implementing sample-wise PHSI developed in this
book is called progressive hyperspectral sample processing (PHSP), which is
similar to the BIS concept. In other words, all data sample vectors are fully
processed sample by sample progressively in real time. Most hyperspectral imaging
algorithms currently being used can be interpreted by PHSP one way or another.
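As a minimal sketch of the recursive sample-wise idea (an illustration, not a specific algorithm from this book): many detectors need the inverse of the causal sample correlation matrix R(n) = (1/n) Σᵢ xᵢxᵢᵀ, and the Sherman–Morrison identity lets each newly arriving sample vector update that inverse directly, so earlier samples never need to be reprocessed. Function and variable names here are illustrative:

```python
import numpy as np

def update_corr_inverse(R_inv_prev, x, n):
    """Update the inverse of the causal sample correlation matrix when the
    n-th sample vector x arrives: R(n) = ((n-1)/n) R(n-1) + (1/n) x x^T.
    The Sherman-Morrison identity yields R(n)^{-1} from R(n-1)^{-1} alone."""
    A_inv = (n / (n - 1)) * R_inv_prev         # inverse of ((n-1)/n) R(n-1)
    u = A_inv @ x
    return A_inv - np.outer(u, u) / (n + x @ u)

# Sanity check against a batch computation on random full-band sample vectors.
rng = np.random.default_rng(0)
L = 5                                          # number of spectral bands
X = rng.standard_normal((50, L))               # 50 incoming data sample vectors
k = 10                                         # bootstrap with a small batch
R_inv = np.linalg.inv(X[:k].T @ X[:k] / k)
for n in range(k + 1, 51):                     # then update one sample at a time
    R_inv = update_corr_inverse(R_inv, X[n - 1], n)
R_batch = X.T @ X / 50                         # direct (non-recursive) result
assert np.allclose(R_inv, np.linalg.inv(R_batch))
```

Rank-one update identities of this kind are what make causal sample-wise processing feasible in real time.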
The other type is band-wise PHSI, which processes hyperspectral data
band by band in a progressive manner, where the data samples in each individual
band are processed separately and sequentially, one band after another. This type is
generally used to address the issue associated with the BSQ process. An approach to
band-wise PHSI is called progressive hyperspectral band process (PHBP), which is
derived from the BSQ idea with slight modifications. It can be further extended to
recursive hyperspectral band process (RHBP) in which data samples can be
processed in real time as new bands come in. In this book, both PHBP and RHBP
will be referred to as RHBP, since RHBP can be considered a recursive version
of PHBP. More specifically, data processing by RHBP can take place as new
incoming bands feed in, while at the same time, the processed results are output
at nearly the same time provided the computing time is negligible. Unfortunately,
such RHBP seems to have received little interest to date. For example, if we assume
that the use of each spectral band is specified by a bit in terms of whether or not it is
selected for data processing, the number of bits used to process data samples can be
interpreted as the number of bands used to process data. The proposed RHBP
materializes this concept to make it applicable to hyperspectral data processing.
Unlike PHSP, which seems natural and easy to understand, RHBP is rather new.
On some occasions RHBP has been confused with band selection (BS) methods.
This deserves more clarification. RHBP is a new theory developed for hyperspectral
data transmission and communications where hundreds of spectral channels can be
fine-tuned progressively for data processing. It is not a BS process. More
specifically, it deals with issues of transmitting and communicating hyperspectral data
band by band, not with issues of selecting bands. Thus, RHBP does not need to solve
band selection optimization problems to find optimal bands. In fact, BS and
progressive band process are completely separate subjects. They may be correlated
but their concepts are rather different. BS must select all necessary bands prior to data
processing. This requires that prior knowledge of how many bands are needed
be available before the optimal bands are selected. To this end, a recently
developed concept, Virtual Dimensionality (VD), has been used to estimate the band number,
and a band selection criterion is then used to find all required optimal bands. As a
result, BS generally cannot be implemented in real time. In contrast, RHBP
performs data processing as new incoming bands are fed in, without having to
determine the band number or select bands in advance. Such an advantage is significant
in data transmission and communication, a task which cannot be accomplished by
BS. Moreover, RHBP offers many advantages that cannot be provided by most

hyperspectral data processing techniques. First, it produces preliminary results
progressively, so that users can terminate data processing whenever an abort
decision is made. Second, it tremendously reduces computation time by using
only the new innovations information, without reprocessing all the data. Most importantly,
it can be realized in real time, so that a timely decision can be made while data
processing is taking place. Finally, it also paves the way for Field Programmable
Gate Array (FPGA) chip design. To support its utility in real-world problems, three
areas, spectral unmixing, endmember finding and hyperspectral target detection, are
used for illustration in this book.
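To give a flavor of band-wise progressive processing, consider a deliberately simple illustration (not one of the detectors developed in this book): a spectral-angle-type similarity between every data sample and a target signature d can be maintained from three running sums updated as each new band arrives, so a preliminary detection map is available after every band, and the final map equals the full-band result. All names below are illustrative:

```python
import numpy as np

def progressive_spectral_angle(bands, d):
    """Yield a cosine-similarity detection map after each incoming band.
    bands: iterable of 1-D pixel arrays, one per band (BSQ delivery order);
    d: target signature with one value per band."""
    s_xd = s_xx = s_dd = 0.0                 # running sums over the bands seen so far
    for l, band in enumerate(bands):
        s_xd = s_xd + band * d[l]            # per-pixel sum of x[l] * d[l]
        s_xx = s_xx + band * band            # per-pixel sum of x[l]^2
        s_dd = s_dd + d[l] ** 2              # sum of d[l]^2
        yield s_xd / np.sqrt(s_xx * s_dd)    # preliminary map using l + 1 bands

# The last yielded map matches the full-band cosine similarity exactly.
rng = np.random.default_rng(1)
n_bands, n_pixels = 6, 8
X = rng.standard_normal((n_bands, n_pixels))  # toy image, one row per band
d = rng.standard_normal(n_bands)
maps = list(progressive_spectral_angle(X, d))
full = (X.T @ d) / (np.linalg.norm(X, axis=0) * np.linalg.norm(d))
assert np.allclose(maps[-1], full)
```

Each intermediate map is exactly the result that would be obtained had only the bands received so far been acquired, which is the sense in which band-wise processing is progressive.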
Five separate but related subjects are presented. Part I: Fundamentals provides
the knowledge readers will need to understand subsequent chapters. Parts II–V are
devoted to recursive hyperspectral sample processing (RHSP) and recursive
hyperspectral band processing (RHBP). In particular, Parts II and III design and
develop techniques for sample spectral statistics-based RHSP and signature spectral
statistics-based RHSP, respectively, whereas Parts IV and V present treatments
similar to those in Parts II and III to design and develop techniques for RHBP.
In addition, Chap. 21 presents conclusions along with an
appendix on matrix identities.
This book represents a new addition to the series containing my three books
(Chang 2003a, 2013, 2016). It supplements materials not covered in these for a
better and more comprehensive treatment of hyperspectral imaging. However, to
make individual chapters as self-contained as possible, some narratives may be
repeated in each chapter. Also, the image data sets used for experiments
are described again in each chapter. I believe that this type of presentation will
allow readers to avoid having to flip back and forth continually between chapters.
However, readers who are already familiar with the given topics and image data sets
may skip these parts and go directly to the sections of greater interest.
For the data used in this book I would like to thank the Spectral Information
Technology Applications Center (SITAC), which furnished the HYDICE data used
in the experiments described in this book. In addition, I would also like to
thank and acknowledge the use of Purdue’s Indiana Indian Pines test site and the
AVIRIS Cuprite image data available on their Web sites.
Like my previous books, this book would have been impossible without the
tremendous contributions of many people who deserve my sincere gratitude and deepest
appreciation. They are my former Ph.D. students, Drs. Hsian-Min Chen (陳享民),
Shih-Yu Chen ( ), Cheng Gao ( ), Hsiao-Chi Li ( ), Drew Paylor,
Robert Schultz and Chao-Cheng Wu ( ), as well as my current Ph.D.
students, Marissa Hobbs, Li-Chien Lee ( ), Yao Li ( ) and Bai Xue
( ) plus four visiting scholars from China, Professor Chunhong Liu (劉春红)
from China Agriculture University, Professor Liaoying Zhao ( ) from Hangzhou
Dianzi University, Professor Meiping Song (宋梅萍) from Dalian Maritime
University and Professor Lin Wang ( ) from Xidian University and one former
visiting Ph.D. student, Professor Yulei Wang ( ) from Dalian Maritime

University, China. Also, I would like to thank the members of the Center for
Hyperspectral Imaging in Remote Sensing (CHIRS) at Dalian Maritime University,
Professors Meiping Song (宋梅萍), Sen Li (李森), Chunyan Yu (于纯妍), Lin
Wang ( ) and Yulei Wang ( ).
In addition, my appreciation is also extended to my colleagues, Professor
Pau-Choo Chung ( ) with the Department of Electrical Engineering, National
Cheng Kung University, Professor Yen-Chieh Ouyang ( ) with the
Department of Electrical Engineering, National Chung Hsing University, Professor
Chinsu Lin ( ) with the Department of Forestry and Natural Resources at
National Chiayi University, Professor Chia-Hsien Wen ( ) with Providence
University, Dr. Ching-Wen Yang ( ) who is the Director of Computer
Center, Taichung Veterans General Hospital and Ching-Tsong Tsai ( )
with Tunghai University. Specifically, I would like to particularly thank my former
Ph.D. student Dr. Hsiao-Chi Li for carrying out nearly all experiments presented in
Chaps. 2, 6, 10–12, 16 and current Ph.D. student Ms. Yao Li for doing all
experiments in Chaps. 14, 15, 18–20. This book could not have been completed
without their involvement.
I would also like to acknowledge several universities in Taiwan and China for
their financial and professional support during my visit to China and Taiwan,
particularly the Chang Jiang Scholar Chair Professorship (教育部長江學者講座教授)
from Dalian Maritime University (大連海事大學) in China, along with the strong
and enthusiastic support of its president, Dr. Yuqing Sun ( ), the
Distinguished Chair Professorship of Remote Sensing Technology (遙測科技傑出講座教授)
from the National Chung Hsing University, the Adjunct Chair Professorship
from Providence University, and the Chair Professorship with the Taichung
Veterans General Hospital.
Finally, I would also like to thank many of my special friends in Taichung
Veterans General Hospital (TCVGH), Dr. San-Kan Lee ( ) (former
Superintendent of TCVGH), Dr. Ping-Wing Lui ( ) (Deputy Superintendent
of TCVGH), Dr. Yong Kie Wong ( ) (Deputy Superintendent
at Show Chwan Health Care System), Dr. Clayton Chi-Chang Chen
( ) (Chairman of Radiology at TCVGH), Dr. Yen-Chuan Ou
( ) (Head of Department of Medical Research at TCVGH), Dr.
Bor-Jen Lee ( ) ( at TCVGH), Dr. Jyh-Wen Chai
( ) (Chief of Radiology at TCVGH), Dr. Man-Yee Chan ( )
(Chief of Oral and Maxillofacial Surgery at TCVGH), Dr. Francis S.K. Poon
( ) (Director, Clinical Informatics Research and Development Center
at TCVGH) and Dr. Siwa Chan ( ) at TCVGH who have unselfishly provided
their expertise and resources for the research described in this book during my stay
in Taichung, Taiwan. Last but not least, I would also like to thank my very close and
special friend, General Manager, Vincent Tseng ( ) at Bingotimes,
Inc. ( ). All their support is greatly appreciated.

Finally and most importantly, I would like to thank my family members: older
sister Feng-Chu Chang ( ), younger sister Mann-Li Chang ( ), younger
brother Chein-Chi Chang ( ), sister-in-law Chuen-Chin Kuo ( ), as
well as my niece Chia-Fang Lue ( ), also my three nephews, Yu-Wei Wayne
Chang ( ), Yu-Cheng Channing Chang ( ) and Yu-Rui Raymond
Chang ( ). Their support cannot be overstated, as this book was being
prepared during the most difficult time in my life after my mother passed away.

Baltimore, MD, USA
Dalian, Liaoning, China
Fall 2016

Chein-I Chang
Chang Jiang Scholar Chair Professor
Fellow, IEEE and SPIE
Contents

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Recursive Hyperspectral Sample Processing . . . . . . . . . . . . . 5
1.2.1 Sample Spectral Statistics-Based Recursive
Hyperspectral Sample Processing . . . . . . . . . . . . . . . 7
1.2.2 Signature Spectral Statistics-Based Recursive
Hyperspectral Sample Processing . . . . . . . . . . . . . . . 8
1.3 Recursive Hyperspectral Band Processing . . . . . . . . . . . . . . . 8
1.3.1 Band Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.2 Progressive Hyperspectral Band Processing . . . . . . . 9
1.3.3 Recursive Hyperspectral Band Processing . . . . . . . . 10
1.4 Scope of Book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.4.1 Part I: Fundamentals . . . . . . . . . . . . . . . . . . . . . . . . 12
1.4.2 Part II: Sample Spectral Statistics-Based
Recursive Hyperspectral Sample Processing . . . . . . . 12
1.4.3 Part III: Signature Spectral Statistics-Based
Recursive Hyperspectral Sample Processing . . . . . . . 13
1.4.4 Part IV: Sample Statistics-Based Recursive
Hyperspectral Band Processing . . . . . . . . . . . . . . . . 13
1.4.5 Part V: Signature Statistics-Based Recursive
Hyperspectral Band Processing . . . . . . . . . . . . . . . . 14
1.5 Real Hyperspectral Images to Be Used in This Book . . . . . . . 14
1.5.1 AVIRIS Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5.2 HYDICE Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.5.3 Hyperion Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.6 Synthetic Images to Be Used in this Book . . . . . . . . . . . . . . . 24
1.7 How to Use this Book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.8 Notations and Terminology Used in the Book . . . . . . . . . . . . . 28


Part I Fundamentals
2 Simplex Volume Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.2 Determinant-Based Simplex Volume Calculation . . . . . . . . . . 33
2.3 Geometric Simplex Volume Calculation . . . . . . . . . . . . . . . . . 34
2.4 General Theorem for Geometric Simplex
Volume Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.5 A Mathematical Toy Example . . . . . . . . . . . . . . . . . . . . . . . . 43
2.6 Real Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3 Discrete-Time Kalman Filtering for Hyperspectral Processing . . . . 49
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
3.2 Discrete-Time Kalman Filtering . . . . . . . . . . . . . . . . . . . . . . 50
3.2.1 A Priori and A Posteriori State Estimates . . . . . . . . . 51
3.2.2 Finding an Optimal Kalman Gain K(k) . . . . . . . . . . . 53
3.2.3 Orthogonality Principle . . . . . . . . . . . . . . . . . . . . . . 54
3.2.4 Discrete-Time Kalman Predictor and Filter . . . . . . . . 56
3.3 Kalman Filter-Based Linear Spectral Mixture Analysis . . . . . . 58
3.4 Kalman Filter-Based Hyperspectral Signal Processing . . . . . . . 61
3.4.1 Kalman Filter-Based Hyperspectral Signal
Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.4.2 Kalman Filter-Based Spectral Signature Estimator . . . 63
3.4.3 Kalman Filter-Based Spectral Signature Identifier . . . 64
3.4.4 Kalman Filter-Based Spectral Signature
Quantifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4 Target-Specified Virtual Dimensionality
for Hyperspectral Imagery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.2 Review of VD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.3 Eigen-Analysis-Based VD . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.3.1 Binary Composite Hypothesis Testing Formulation . . 81
4.3.2 Discussions of HFC Method and MOCA . . . . . . . . . . 85
4.4 Finding Targets of Interest . . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.4.1 What Are Targets of Interest? . . . . . . . . . . . . . . . . . . 87
4.4.2 Second-Order-Statistics (2OS)-Specified
Target VD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.4.3 HOS-Specified Target VD . . . . . . . . . . . . . . . . . . . . 97
4.5 Target-Specified VD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.5.1 Target-Specified Binary Hypothesis Testing . . . . . . . . . 100
4.5.2 ATGP-Specified VD Using ηl = ‖tl^ATGP‖² . . . . . . . . . 103
4.5.3 ATGP-Specified VD Using ηl = ‖tl^ATGP‖ . . . . . . . . . 103
4.5.4 Discussions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

4.6 Synthetic Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . 107
4.7 Real Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
4.7.1 HYDICE Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
4.7.2 AVIRIS Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
4.8 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

Part II Sample Spectral Statistics-Based Recursive
Hyperspectral Sample Processing
5 Real-Time Recursive Hyperspectral Sample Processing for Active
Target Detection: Constrained Energy Minimization . . . . . . . . . . . 123
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
5.2 Constrained Energy Minimization . . . . . . . . . . . . . . . . . . . . . 126
5.3 RT-CEM Using BIP/BIS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
5.4 CEM Using BIL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.5 Computational Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . 135
5.5.1 RT-CEM Using BIP/BIS . . . . . . . . . . . . . . . . . . . . . 135
5.5.2 RT-CEM Using BIL . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.6 Real Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5.6.1 HYDICE Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5.6.2 Computing Time Analysis . . . . . . . . . . . . . . . . . . . . 148
5.6.3 AVIRIS Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
5.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
6 Real-Time Recursive Hyperspectral Sample Processing
for Passive Target Detection: Anomaly Detection . . . . . . . . . . . . . . 157
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
6.2 Anomaly Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
6.2.1 K-AD/R-AD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
6.2.2 Causal R-AD/K-AD . . . . . . . . . . . . . . . . . . . . . . . . . 161
6.3 Matrix Inverse Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . 162
6.3.1 Causal Sample Covariance/Correlation Matrix . . . . . . 162
6.3.2 Calculation of Matrix Inverse Using
Matrix Identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
6.4 Real-Time Causal Anomaly Detection . . . . . . . . . . . . . . . . . . 167
6.4.1 Real-Time Causal R-AD . . . . . . . . . . . . . . . . . . . . . . 167
6.4.2 Real-Time Causal K-AD . . . . . . . . . . . . . . . . . . . . . 168
6.4.3 Computational Complexity . . . . . . . . . . . . . . . . . . . . 169
6.5 Synthetic Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . 171
6.5.1 Target Implanted . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
6.5.2 Target Embeddedness . . . . . . . . . . . . . . . . . . . . . . . . 174
6.5.3 Computational Complexity . . . . . . . . . . . . . . . . . . . . 177
6.6 Real Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
6.6.1 AVIRIS Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
6.6.2 HYDICE Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185

6.6.3 Background Suppression . . . . . . . . . . . . . . . . . . . . . 192
6.6.4 Computational Complexity . . . . . . . . . . . . . . . . . . . . 195
6.7 RT-AD Using BIL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
6.7.1 RT-R-AD Using Band-Interleaved-Line . . . . . . . . . . 199
6.7.2 RT-K-AD Using BIL . . . . . . . . . . . . . . . . . . . . . . . . 201
6.8 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205

Part III Signature Spectral Statistics-Based Recursive
Hyperspectral Sample Processing
7 Recursive Hyperspectral Sample Processing
of Automatic Target Generation Process . . . . . . . . . . . . . . . . . . . . 209
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
7.2 Recursive Hyperspectral Sample Processing of ATGP . . . . . . 211
7.3 Determination of Targets for RHSP-ATGP to Generate . . . . . . 215
7.4 Synthetic Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . 217
7.5 Discussions on Stopping Rule for RHSP-ATGP . . . . . . . . . . . 221
7.6 Computational Complexity of RHSP-ATGP . . . . . . . . . . . . . . 223
7.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
8 Recursive Hyperspectral Sample Processing
of Orthogonal Subspace Projection . . . . . . . . . . . . . . . . . . . . . . . . . 227
8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
8.2 Difference Between OSP and Linear Spectral
Mixture Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
8.3 Recursive Hyperspectral Sample Processing of OSP . . . . . . . . 230
8.3.1 Derivations of Recursive Update Equations . . . . . . . . 231
8.3.2 Computational Complexity . . . . . . . . . . . . . . . . . . . . 235
8.4 Signature Generation by RHSP-OSP . . . . . . . . . . . . . . . . . . . 235
8.4.1 Finding Unsupervised Target Signal Sources . . . . . . . 237
8.4.2 Determining the Number of Unwanted Targets . . . . . 238
8.4.3 RHSP-OSP Using an Automatic Stopping Rule . . . . . 241
8.5 HYDICE Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
8.6 Computational Complexity Analysis . . . . . . . . . . . . . . . . . . . 255
8.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
9 Recursive Hyperspectral Sample Processing of Linear
Spectral Mixture Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
9.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
9.2 LSMA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
9.2.1 Abundance Sum-to-One Constrained LSMA . . . . . . . 266
9.2.2 Abundance Nonnegativity Constrained LSMA . . . . . . 267
9.2.3 Abundance Fully Constrained LSMA . . . . . . . . . . . . 269
9.3 RHSP-LSMA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
9.3.1 Adaptive Linear Mixing Model . . . . . . . . . . . . . . . . 270
9.3.2 RHSP-LSMA Updated by Single Signatures . . . . . . . 270
9.3.3 RHSP-LSMA Fused by Two Signature-Varying
Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274

9.4 Adaptive RHSP-LSMA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
9.4.1 OSP-Based Finding Signatures for ALMM . . . . . . . . 277
9.4.2 Linear Spectral Unmixing–Based Finding
Signatures for ALMM . . . . . . . . . . . . . . . . . . . . . . . 278
9.5 Determination of Number of Signatures
for ARHSP-LSMA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
9.6 HYDICE Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
9.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
10 Recursive Hyperspectral Sample Processing of Maximum
Likelihood Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
10.2 Criteria for Finding Virtual Signatures for LSMA . . . . . . . . . . 291
10.2.1 Least-Squares LSMA . . . . . . . . . . . . . . . . . . . . . . . . 291
10.2.2 Orthogonal Projection-Based LSMA . . . . . . . . . . . . . 293
10.2.3 Maximal Likelihood Estimation-Based LSMA . . . . . . 295
10.3 Recursive Hyperspectral Sample Processing of MLE . . . . . . . 297
10.3.1 RHSP-LS-Based Algorithm . . . . . . . . . . . . . . . . . . . 297
10.3.2 RHSP-MLE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
10.4 Stopping Rule for RHSP-MLE . . . . . . . . . . . . . . . . . . . . . . . 301
10.5 Discussions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
10.6 Synthetic Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . 304
10.7 Real Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310
10.8 Unmixed Error Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
10.9 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
11 Recursive Hyperspectral Sample Processing of Orthogonal
Projection-Based Simplex Growing Algorithm . . . . . . . . . . . . . . . . 319
11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
11.2 Simplex Volume Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
11.3 Orthogonal Projection-Based Simplex Growing Algorithm . . . 323
11.4 Orthogonal Projection-Based SGA (OPSGA) . . . . . . . . . . . . . 328
11.5 Recursive OP-Simplex Growing Algorithm . . . . . . . . . . . . . . 329
11.5.1 Recursive GSV Calculation . . . . . . . . . . . . . . . . . . . 329
11.5.2 Derivations of RHSP-OPSGA Equations . . . . . . . . . . 331
11.5.3 RHSP-OPSGA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
11.6 Various Versions of GSVA-Based Algorithms . . . . . . . . . . . . 335
11.6.1 1-GSVA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
11.6.2 Adaptive GSVA . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
11.7 Determining the Number of Endmembers
for RHSP-OPSGA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
11.8 Real Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
11.8.1 HYDICE Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
11.8.2 CUPRITE Mining District Data . . . . . . . . . . . . . . . . 343
11.9 Computer Processing Time Analysis . . . . . . . . . . . . . . . . . . . 350
11.10 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356

12 Recursive Hyperspectral Sample Processing of Geometric
Simplex Growing Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357
12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
12.2 Geometric Simplex Growing Algorithm (GSGA) . . . . . . . . . . 361
12.2.1 Finding Heights of Simplexes for GSGA . . . . . . . . . . 366
12.2.2 Geometric Simplex Growing Algorithm . . . . . . . . . . 369
12.3 Recursive Hyperspectral Sample Processing
of Geometric Simplex Growing Algorithm . . . . . . . . . . . . . . . 370
12.3.1 Orthogonal Subspace Projection-Based
RHSP-GSGA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 370
12.3.2 Orthogonal Vector Projection-Based
RHSP-GSGA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373
12.4 Relationship Between RHSP-GSGA
and RHSP-OPSGA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
12.5 Determining Number of Endmembers
for RHSP-GSGA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
12.6 Computational Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . 378
12.6.1 Computational Complexity of Determinant-Based
SGA (DSGA) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
12.6.2 Computational Complexity of Dist-SGA . . . . . . . . . . 379
12.6.3 Computational Complexity of OPSGA . . . . . . . . . . . 379
12.6.4 Computational Complexity of Recursive OPSGA . . . 380
12.6.5 Computational Complexity of GSGA
Using Gram–Schmidt Orthogonalization Process . . . 380
12.6.6 Computational Complexity of Recursive GSGA . . . . 380
12.7 Real Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380
12.7.1 HYDICE Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
12.7.2 Cuprite Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384
12.7.3 Computer Processing Time Analysis . . . . . . . . . . . . . 392
12.8 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395

Part IV Sample Spectral Statistics-Based Recursive Hyperspectral
Band Processing
13 Recursive Hyperspectral Band Processing for Active Target
Detection: Constrained Energy Minimization . . . . . . . . . . . . . . . . . 399
13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399
13.2 Recursive Equations for Calculating the Inverse
of a Causal Band Correlation Matrix . . . . . . . . . . . . . . . . . . . 401
13.3 Recursive Band Processing of CEM . . . . . . . . . . . . . . . . . . . . 403
13.4 Real Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
13.4.1 HYDICE Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
13.4.2 Hyperion Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
13.5 Graphical User Interface Design . . . . . . . . . . . . . . . . . . . . . . 419
13.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 420

14 Recursive Hyperspectral Band Processing
for Passive Target Detection: Anomaly Detection . . . . . . . . . . . . . . 421
14.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422
14.2 Causal Bandwise R-AD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
14.3 Recursive Hyperspectral Band Processing
of Anomaly Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425
14.3.1 RHBP-R-AD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425
14.3.2 RHBP-K-AD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 426
14.4 Real Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428
14.5 Computing Time Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
14.6 Graphical User Interface Design . . . . . . . . . . . . . . . . . . . . . . 444
14.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447

Part V Signature Spectral Statistics-Based Recursive
Hyperspectral Band Processing (RHBP)
15 Recursive Hyperspectral Band Processing of Automatic
Target Generation Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
15.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452
15.2 ATGP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453
15.3 Recursive Equations for RHBP-ATGP . . . . . . . . . . . . . . . . . . 455
15.4 Algorithms for RHBP-ATGP . . . . . . . . . . . . . . . . . . . . . . . . . 458
15.5 Synthetic Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . 462
15.5.1 TI Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464
15.5.2 Discussion of Results . . . . . . . . . . . . . . . . . . . . . . . . 466
15.5.3 TE Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
15.6 Real Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
15.7 Computing Time Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 477
15.8 Graphical User Interface Design . . . . . . . . . . . . . . . . . . . . . . 478
15.9 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480
16 Recursive Hyperspectral Band Processing of Orthogonal
Subspace Projection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 483
16.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484
16.2 Orthogonal Subspace Projection . . . . . . . . . . . . . . . . . . . . . . 485
16.3 Recursive Equations for RHBP-OSP . . . . . . . . . . . . . . . . . . . 486
16.4 Recursive Hyperspectral Band Processing of OSP . . . . . . . . . 489
16.5 Synthetic Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . 490
16.5.1 TI Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
16.5.2 TE Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 494
16.6 Real Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496
16.7 Computing Time Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 499
16.8 Graphical User Interface Design . . . . . . . . . . . . . . . . . . . . . . 502
16.9 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 502

17 Recursive Hyperspectral Band Processing of Linear Spectral
Mixture Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505
17.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505
17.2 Linear Spectral Unmixing Via Recursive Hyperspectral
Band Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 507
17.2.1 Derivation of Update Equation for Innovation
Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 507
17.2.2 Computational Complexity . . . . . . . . . . . . . . . . . . . . 509
17.3 Discussions on RHBP-LSMA . . . . . . . . . . . . . . . . . . . . . . . . 512
17.4 Synthetic Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . 513
17.5 Real Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 519
17.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527
18 Recursive Hyperspectral Band Processing of Growing
Simplex Volume Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 529
18.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 529
18.2 Recursive Hyperspectral Band Processing of Orthogonal
Projection-Based Simplex Growing Algorithm . . . . . . . . . . . . 531
18.2.1 Recursive Equations for RHBP-OPGSVA . . . . . . . . . 531
18.2.2 Recursive Hyperspectral Band
Processing of OPSGA . . . . . . . . . . . . . . . . . . . . . . . 533
18.3 Recursive Hyperspectral Band Processing
of Geometric Simplex Growing Algorithm . . . . . . . . . . . . . . . 534
18.4 Real Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 539
18.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 542
19 Recursive Hyperspectral Band Processing
of Iterative Pixel Purity Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543
19.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 544
19.2 PPI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 546
19.3 IPPI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 548
19.3.1 P-IPPI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 549
19.3.2 C-IPPI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 550
19.3.3 FIPPI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551
19.3.4 Generalization of IPPI to Using Skewer Sets . . . . . . . 551
19.4 Recursive Hyperspectral Band Processing of IPPI . . . . . . . . . . 554
19.4.1 RHBP-IPPI Using a Skewer Set Fixed
for All Bands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 555
19.4.2 RHBP-IPPI Using Varying Skewer
Sets with Bands . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557
19.5 Synthetic Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . 560
19.5.1 TI Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 561
19.5.2 TE Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 569
19.6 Real Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 578
19.7 Graphical User Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . 586
19.8 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589

20 Recursive Band Processing of Fast Iterative
Pixel Purity Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 595
20.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 596
20.2 FIPPI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 597
20.3 Recursive Hyperspectral Band Processing of FIPPI . . . . . . . . . 600
20.3.1 Recursive Hyperspectral Band Processing
of FIPPI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 600
20.3.2 Recursive Skewer Processing of FIPPI . . . . . . . . . . . 602
20.4 Synthetic Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . 604
20.4.1 TI Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 605
20.4.2 TE Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 608
20.5 Real Image Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613
20.5.1 RHBP-FIPPI Experimental Results with nVD = 9 . . . . 615
20.5.2 RHBP-FIPPI Experimental Results with nVD = 18 . . . 617
20.5.3 RSP-FIPPI Experimental Results with nVD = 9 . . . . . 620
20.5.4 RSP-FIPPI Experimental Results with nVD = 18 . . . . . 621
20.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
21 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 627
21.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 627
21.2 Recursive Hyperspectral Sample Processing . . . . . . . . . . . . . . 630
21.2.1 Part II: Sample Spectral Statistics-Based
Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 630
21.2.2 Part III: Signature Spectral Statistics-Based
Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 632
21.3 Recursive Hyperspectral Band Processing . . . . . . . . . . . . . . . 637
21.3.1 Part IV: Band Spectral Correlation-Based
Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 639
21.3.2 Part V: Signature Spectral Correlation-Based
Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 642
21.4 Future Work on Hyperspectral Band Processing . . . . . . . . . . . 647
21.4.1 Multispectral Imaging by Nonlinear Band
Dimensionality Expansion . . . . . . . . . . . . . . . . . . . . 648
21.4.2 Hyperspectral Single Band Selection . . . . . . . . . . . . . 650
21.4.3 Hyperspectral Band Subset Processing . . . . . . . . . . . 650

Appendix A: Matrix Identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 653

Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 655

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 659

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 681
About the Author

Chein-I Chang is Professor in the Department of Computer Science and Electrical
Engineering at the University of Maryland, Baltimore County, where he founded
the Remote Sensing Signal and Image Processing Laboratory and conducts research
in the design and development of signal processing algorithms for hyperspectral
imaging, medical imaging, and pattern recognition. He also established the Center
for Hyperspectral Imaging in Remote Sensing (CHIRS) at Dalian Maritime
University, where he currently serves as Center Director and holds a Chang Jiang
Scholar Chair Professorship. Dr. Chang has published over 155 refereed journal articles,
including more than 50 papers just in IEEE Transactions on Geoscience and
Remote Sensing, and 7 patents, with several pending, on hyperspectral image
processing. He authored three books, Hyperspectral Imaging: Techniques for
Spectral Detection and Classification (Kluwer, 2003), Hyperspectral Data
Processing: Algorithm Design and Analysis (Wiley, 2013), and Real Time Pro-
gressive Hyperspectral Image Processing: Endmember Finding and Anomaly
Detection (Springer, 2016). He also edited two books, Recent Advances in
Hyperspectral Signal and Image Processing (Transworld Research Network,
India, 2006) and Hyperspectral Data Exploitation: Theory and Applications
(Wiley, 2007), and co-edited with A. Plaza a book on High Performance Comput-
ing in Remote Sensing (CRC Press, 2007). Dr. Chang received his Ph.D. in
electrical engineering from the University of Maryland, College Park. He is a
Fellow of IEEE and SPIE for contributions to hyperspectral image processing.

Chapter 1
Introduction

Abstract With advanced remote sensing technology, hyperspectral imaging has
become an emerging technique that has found its way into many applications
ranging from geology, agriculture, and law enforcement to defense, medical
imaging, and food safety and inspection. In particular, it can be used to reveal many
material substances that cannot be resolved by multispectral imaging, for example,
subpixel targets, anomalies, endmembers, and mixed pixels. However, such
imaging also brings with it issues of how to cope with these problems. In addition, owing
to the vast amount of data collected by hyperspectral imaging sensors, the technique
also faces many challenging issues, for example, excessive computational costs,
effective spectral dimensionality [generally referred to as virtual dimensionality by
Chang (Hyperspectral Imaging: Spectral Techniques for Detection and
Classification, Kluwer Academic Publishers, New York, 2003)], removal of highly correlated
interband information by data compression or data reduction, data communication,
and transmission once hyperspectral imaging sensors are deployed in space. One
effective means of dealing with these issues is to develop real-time hyperspectral
imaging algorithms that can process hyperspectral data in real time in a kind of
process-and-forget manner such that data that have already been visited and
processed are discarded and only new incoming data will be processed to update
the results so as to achieve significant savings in computation and data storage. A
recent book by Chang (Real time progressive hyperspectral image processing:
endmember finding and anomaly detection, Springer, New York, 2016) is devoted
to this specific topic, where a new theory of progressive hyperspectral imaging
(PHSI) is introduced for processing hyperspectral data samplewise progressively in
real time for two specific applications, endmember finding and anomaly detection,
both of which are considered passive hyperspectral target detection methods that
generally require timely decisions on finding targets. This book is considered a
companion book to Chang’s, in parallel to the PHSI theory developed there, and
extends PHSI theory to recursive hyperspectral imaging (RHSI), where two types of
RHSI are developed, recursive hyperspectral sample processing (RHSP) and
recursive hyperspectral band processing (RHBP), in the sense that RHSP can implement
PHSI more efficiently and effectively by further updating data samples recursively
via recursive equations, and RHBP can process hyperspectral images progressively
band by band as well as recursively via recursive equations, both of which can be
performed in a manner similar to that of a Kalman filter.

© Springer International Publishing Switzerland 2017 1


C.-I Chang, Real-Time Recursive Hyperspectral Sample and Band Processing,
DOI 10.1007/978-3-319-45171-8_1
2 1 Introduction

Acronyms

α Abundance vector
α̂ Estimate of abundance vector α
αj jth abundance fraction
α̂j Estimate of jth abundance fraction (αj)
A Weighting or mixing matrix
Az Area under an ROC curve
Bl lth band image
bl lth band image represented as a vector
C Total number of classes
Cj jth class
d Desired signature vector
D Desired signature matrix
Dλ Eigenvalue diagonal matrix
δ Detector or classifier
Δ Database
ε Error threshold
ej jth endmember
I Identity matrix
I(x) Self-information of x
k(.,.) Kernel function
K Total number of skewers used in Pixel Purity Index
K Sample covariance matrix
λ Eigenvalue of sample covariance matrix (K)
λ̂ Eigenvalue of sample correlation matrix (R)
l Index of band number
L Number of spectral channels or bands
Λ Eigenvector matrix
μ Global mean
μj Mean of jth class
mj jth signature vector
M Signature or endmember matrix
n Noise vector
N Total number of image pixel vectors in a band image, i.e., N = nr × nc
nB Number of bands being processed
nBS Number of bands to be selected
nl Set of first l bands
ns Number of first s sample vectors being visited
nVD Number of endmembers or signatures estimated by virtual dimensionality
p Number of endmembers
PD Detection power or probability
PF False alarm probability
P⊥U Projector to reject undesired target signatures in U
r Image pixel vector


R Sample correlation matrix
σ2 Variance
SB Between-class scatter matrix
SW Within-class scatter matrix
t⊥ Orthogonal projection of target t
τ Threshold
w Weight vector
W Weight matrix
U Undesired signature matrix
v Eigenvector
ξ Transform used to perform dimensionality reduction
Ψ Interference matrix
z Projection vector
<.,.> Inner product

1.1 Introduction

The development of progressive hyperspectral imaging (PHSI) arose from how data
are acquired and collected as well as how data are processed. In general,
hyperspectral imaging sensors use two data acquisition formats (Schowengerdt
1997). One is referred to as the band-interleaved-by-sample (BIS) or the band-
interleaved-by-pixel (BIP) process (Fig. 1.1a), which acquires data sample vectors
with full band information sequentially sample by sample, or the band-interleaved-
by-line (BIL) process shown in Fig. 1.1b, which acquires data sample vectors with
full band information sequentially line by line, where (x,y) indicates the spatial
coordinate of a data sample vector or pixel vector and λ is a wavelength parameter
used to specify spectral bands. Note that the pixels highlighted by red in Fig. 1.1a
and the line highlighted by red in Fig. 1.1b are the data sample vectors currently
being processed.
Another format is referred to as the Band SeQuential (BSQ) process, shown in
Fig. 1.2, which acquires band images sequentially band by band, where (x,y)
indicates the spatial coordinate of a data sample vector or pixel vector and λ is a
wavelength parameter used to specify spectral bands; the band image
highlighted by red in Fig. 1.2 is the band data currently being processed.
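As a rough sketch of the difference (an illustration of mine, not a figure or code from the book), the two acquisition formats can be regarded as two orderings of the same data cube; the array shapes and variable names below are hypothetical:

```python
import numpy as np

# Hypothetical data cube: nr x nc pixels, each with L spectral bands.
nr, nc, L = 4, 5, 6
cube = np.arange(nr * nc * L).reshape(nr, nc, L)

# BIP/BIS ordering: full spectra delivered pixel by pixel; each row of
# the stream is one data sample vector r with all L band values.
bip_stream = cube.reshape(nr * nc, L)

# BSQ ordering: full band images delivered band by band; slice l of the
# stream is the l-th band image B_l.
bsq_stream = cube.transpose(2, 0, 1)          # shape (L, nr, nc)

# Both orderings carry identical data; only the access pattern differs.
assert np.array_equal(bsq_stream[2], cube[:, :, 2])
assert np.array_equal(bip_stream[7], cube[7 // nc, 7 % nc, :])
```

A BIP/BIS/BIL real-time algorithm consumes rows of `bip_stream` as they arrive, whereas a BSQ algorithm consumes slices of `bsq_stream`; the two processing theories in this book update their results along these two orderings, respectively.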
According to Figs. 1.1 and 1.2, for a hyperspectral imaging algorithm to be
processed in real time, it must follow one of these two data acquisition formats.
With this interpretation, the BIP/BIS/BIL format generally lends itself to real-time
processing, since the data sample vectors are processed in a raster fashion
progressively sample by sample or line by line. The PHSI theory presented in Chang (2016)
was developed to address this issue. To further facilitate the real-time capability of
PHSI, the first part of this book (i.e., Part II) extends PHSI to recursive
hyperspectral sample processing (RHSP) so that RHSP can be carried out by
Fig. 1.1 Hyperspectral imagery acquired by BIP/BIS/BIL format. (a) BIP/BIS; (b) BIL

Fig. 1.2 Hyperspectral imagery acquired by BSQ format

updating data sample vectors recursively via recursive equations. On the other
hand, the BSQ format allows users to fully process each band image band by
band progressively. However, owing to the very high correlation among bands
resulting from hundreds of contiguous spectral channels used by hyperspectral
imaging sensors, it is anticipated that much interband redundancy can be removed
without loss of much information. Thus, effectively carrying out this task becomes
a challenging issue. The later parts of this book (i.e., Parts IV and V) take up this
issue. In particular, they extend PHSI to recursive hyperspectral band processing (RHBP),
which can be implemented more efficiently and effectively in real time by updating
data sample vectors to allow progressive hyperspectral band processing (PHBP) to
be carried out not only progressively band by band but also recursively via recursive
equations.

1.2 Recursive Hyperspectral Sample Processing

Many hyperspectral imaging algorithms developed in Chang (2003a, b, 2013) are
designed on a single-pixel basis without accounting for spatial correlation.
Accordingly, they can be carried out in a causal manner and very easily implemented in
real time, for example, algorithms based on orthogonal subspace projection (OSP)
(Harsanyi and Chang, 1994), fully constrained least squares (FCLS) (Heinz and
Chang, 2001), and simplex growing algorithm (SGA) (Chang et al. 2006a, b).
However, when these pixel-based algorithms are extended to their unsupervised
versions, two issues arise in their real-time implementation. The first is how to deal
with the growing knowledge provided by newly generated targets in real time. The
second, since the algorithms are unsupervised, is how to come up with an automatic
stopping rule that allows the algorithms to be terminated in real time while the data
processing is still ongoing. These two issues are the major driving forces behind the
design and development of progressive algorithms that allow data processing to be
executed by the growing knowledge generated from data currently being processed,
referred to as a posteriori knowledge. When knowledge thus obtained is provided
sample by sample or band by band, the algorithms can be further implemented in
real time as real-time processing algorithms. On the other hand, when the new
knowledge is produced by newly generated target knowledge one target at a time,
these algorithms can be further implemented in a progressive manner. During such
a progressive process these algorithms can provide progressive profiles of
information changes in real time as well.
Since data-obtained a posteriori target knowledge is increased progressively
sample by sample, the target knowledge obtained by previous samples remains
unchanged. In this case, it is apparent that algorithm design should take advantage
of this situation without reprocessing previous samples that have already been
visited. From a statistical signal processing point of view, data information can be
categorized into three types. One is so-called processed data information, obtained
from processing the data sample vectors that have already been visited. Another is new
data information provided by the current data sample vector being processed. A
third one is called innovations information contained in new information but not in
the processed data information. Technically, these three types of information are
supposed to be statistically independent and uncorrelated. Only the innovations
information is needed for data updates. One excellent example of taking this
approach is Kalman filtering, to be discussed in Chap. 3, which derives recursive
equations to update data information without reprocessing all previously visited data
sample vectors. Analogously to Kalman filtering, this book also follows the same
treatment to extend progressive algorithms to recursive algorithms, which requires
only innovations information to update results via recursive equations every time
a new target is generated. With this interpretation, innovations information is
defined as information that can be provided only by new targets but cannot be
obtained by previous targets. Such a recursive process utilizes recursive equations
to update results without reprocessing previously visited data sample vectors.
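The innovations-only updating just described can be made concrete for one simple statistic, the sample correlation matrix R; the following toy recursion (my own sketch, not the book's exact equations) folds each new sample vector into R without revisiting any earlier sample:

```python
import numpy as np

def update_correlation(R_prev, k, x):
    """Fold the k-th sample vector x (1-indexed) into the running
    sample correlation matrix without touching earlier samples."""
    # R_k = ((k-1)/k) R_{k-1} + (1/k) x x^T : only the new sample's
    # outer product (its innovations contribution) enters the update.
    return ((k - 1) / k) * R_prev + np.outer(x, x) / k

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))            # 100 pixel vectors, 5 bands

R = np.zeros((5, 5))
for k, x in enumerate(X, start=1):       # samples arrive one at a time
    R = update_correlation(R, k, x)      # earlier samples never revisited

# Matches the batch estimate computed from the whole data set at once.
assert np.allclose(R, X.T @ X / len(X))
```

Since only the newest outer product enters each step, visited samples can be discarded right after they are processed.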
Thus, practically speaking, a recursive process can be considered a
process-and-forget process in the sense that it discards data sample vectors it has already
processed. Accordingly, a progressive algorithm that is implemented by updating
data processing information in a recursive manner is called a recursive algorithm.
However, it should be noted that progressive and recursive are completely different
concepts; the former places a focus on algorithm implementation in terms of how
data are processed, whereas the latter emphasizes the algorithmic architecture in the
updating of data. On the other hand, the idea of progressive processing is to produce
progressive profiles of changes in sample processing sample by sample or changes in
band processing band by band so that sample-varying or band-varying changes can
be detected to provide data analysts with additional information to improve their
interpretations. By contrast, recursive processing is developed primarily to update
data according to specially designed recursive equations so that the data sample
vectors that have been visited and processed will not be reprocessed. A similar
concept to pairing progression with recursion can also be found in the pairing of a
Gaussian process with a Markov property, defined as a Gauss–Markov random
process, where Gaussian specifies the process under consideration governed by a
Gaussian distribution, while the data processing property is described by a Markov
property. Thus, Gaussian process and Markov property are completely different
concepts. Similarly, a recursive process is not a progressive process, and vice versa.
Nevertheless, to avoid confusion, when the term recursive is used in this book, it is
generally used with the intention of including recursive structures in a progressive
process; that is, a recursive process handles data not only recursively to update the
data but also progressively to produce progressive profiles of the processed data.
To resolve the issue of how to automatically terminate a recursive process in real
time, a Neyman–Pearson detection theory was developed. The idea is to compute
the maximal orthogonal projection (OP) leakage of each newly generated target into the
complement subspaces linearly spanned by previously generated targets, treat this
residual as a signal source, and then use it to formulate a binary composite
hypothesis-testing problem. A desired Neyman–Pearson detector (NPD) can be
developed in a manner similar to how the Harsanyi–Farrand–Chang (HFC) method
was developed by Harsanyi et al. (1994a, b) to estimate virtual dimensionality
(VD). As a result, an extension to a target-specified VD is discussed in Chap. 4,
where the targets to be specified for hypothesis testing can be generated by a
recursive process. The NPD determines whether the signal source specified by the
maximal OP leakage of a target fails the test. If it fails, the target under consideration
is declared a real target. Because the maximal OP leakages calculated from
progressively generated targets are monotonically decreasing, the test generally
fails at first and then continues until it passes for the first time, at which point the
NPD process is terminated. Such a detection is performed in conjunction with a
recursive process that generates each new target. It can also be
implemented in real time to determine whether the algorithm needs to be terminated
while a new target is being generated at the same time.
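A heavily simplified sketch of such a stopping rule (my own stand-in, which replaces the full Neyman–Pearson test with a plain threshold eps on the maximal OP leakage; the function names and data are hypothetical):

```python
import numpy as np

def generate_targets(X, eps):
    """ATGP-style target generation with a residual-threshold stopping
    rule standing in for the Neyman-Pearson test described in the text;
    eps plays the role of the detector's decision level."""
    targets = []
    while True:
        if targets:
            U = np.column_stack(targets)
            # Projector onto the complement of span{targets found so far}.
            P = np.eye(X.shape[1]) - U @ np.linalg.pinv(U)
        else:
            P = np.eye(X.shape[1])
        leakage = np.linalg.norm(X @ P, axis=1)   # OP leakage of every pixel
        j = int(np.argmax(leakage))
        if leakage[j] < eps:       # leakages are non-increasing, so the
            break                  # first time the test passes we can stop
        targets.append(X[j])
    return targets

rng = np.random.default_rng(1)
# Three strong directions buried in low-level noise (synthetic data).
X = np.repeat(np.eye(3), 30, axis=0) * 10 + rng.normal(scale=0.01, size=(90, 3))
assert len(generate_targets(X, eps=1.0)) == 3
```

Because the maximal leakage can only decrease as targets accumulate, the loop terminates the first time the test passes, mirroring the termination behavior described above.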
In general, hyperspectral sample processing algorithms can be grouped into
two categories. One group uses sample spectral covariance/correlation statistics,
such as anomaly detection, which finds targets whose signatures are spectrally
distinct from those of the surrounding sample vectors. The other is to use
signature characterization, such as a linear mixing model (LMM) used in linear
spectral mixture analysis (LSMA) for data unmixing. Thus, depending on which
one is used, the recursive equations derived for these two categories are also quite
different.

1.2.1 Sample Spectral Statistics-Based Recursive Hyperspectral Sample Processing

According to Chang (2016), hyperspectral target detection can be performed
in either an active or a passive mode. In active hyperspectral target detection,
constrained energy minimization (CEM), developed by Harsanyi (1993), is a
well-known technique used to detect known targets at the subpixel level. In
passive hyperspectral target detection, anomaly detection (AD) is most widely
used to detect unknown targets with their signatures spectrally distinct from the
signatures of data sample vectors in their surroundings. Interestingly, both tech-
niques make use of the sample covariance/correlation matrix to suppress back-
ground so that the contrast of targets of interest against the background can be
enhanced to improve target detectability. However, two issues prevent CEM and
AD from being implemented in real time. One is calculating the covariance/
correlation matrix, which requires the entire data set. In order for data to be
processed in real time, two passes or stages are needed to perform progressive
multiple pass/progressive stage processing (PMP/PSP), as described in Chang
(2016), each of which can be processed in real time, where the first pass/stage
process is to acquire the entire data set in real time to calculate the covariance/
correlation matrix, and in the second pass/stage the algorithms are run in real time
using the knowledge obtained in the first pass/stage. A second issue is causality,
which is a prerequisite for any real-time process. Causality as a concept and an
issue has been studied extensively (Chang 2016). A causal process is defined as a
process that can only use all data sample vectors up to the data sample vector
currently being visited. More specifically, in a causal process, no future data
sample vectors after the current data sample vector should be allowed to be
used for data processing.
Part II in this book develops a single recursive process to resolve these
two issues, viz., causality and real-time calculation of the covariance/correlation
matrix.
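As an illustration of what such a recursive process can look like (a sketch in my own notation, not the derivation given in Part II), the inverse of the causal sample correlation matrix needed by CEM and AD can be updated pixel by pixel with the Sherman–Morrison identity, so no future samples are used and no full inversion is ever repeated:

```python
import numpy as np

def sm_update(Rinv, k, x):
    """Causal Sherman-Morrison step: from inv(R_{k-1}) to inv(R_k),
    where R_k = ((k-1)/k) * R_{k-1} + (1/k) * x x^T."""
    A_inv = (k / (k - 1)) * Rinv        # inverse of the down-weighted old matrix
    u = A_inv @ x
    return A_inv - np.outer(u, u) / (k + x @ u)

rng = np.random.default_rng(2)
L = 4
X = rng.normal(size=(50, L))            # pixel vectors arriving one by one

k0 = 10                                 # seed the inverse with a small causal batch
Rinv = np.linalg.inv(X[:k0].T @ X[:k0] / k0)
for k in range(k0 + 1, len(X) + 1):     # every later pixel is folded in causally
    Rinv = sm_update(Rinv, k, X[k - 1])

# Agrees with inverting the full correlation matrix after the fact.
assert np.allclose(Rinv, np.linalg.inv(X.T @ X / len(X)))
```

Each step costs O(L^2) instead of the O(L^3) of a fresh inversion, which is what makes samplewise real-time updating feasible.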
1.2.2 Signature Spectral Statistics-Based Recursive Hyperspectral Sample Processing

To date, most hyperspectral imaging algorithms have been designed and developed
on the basis of a single-pixel vector. This means that the algorithms process data
sample vectors pixel by pixel without taking into account interpixel spatial
correlation. Examples include target detection, such as OSP (Harsanyi and Chang 1994)
and automatic target generation process (ATGP) (Ren and Chang 2003);
endmember extraction/finding, such as the Pixel Purity Index (PPI) (Boardman
1994), N-FINDR (Winter 1999), and a simplex growing algorithm (SGA) (Chang
et al. 2006a, b); and LSMA, such as fully constrained least squares (FCLS) method
(Heinz and Chang 2001). Thus, technically speaking, all these algorithms can be
processed in real time once their required prior knowledge is obtained. Unfortu-
nately, in real-world applications, obtaining such prior knowledge is either very
expensive or nearly impossible. Under such circumstances, algorithms must be
designed in an unsupervised manner, in which case the required knowledge can be
obtained only directly from the data to be processed a posteriori. Like target
detection (Sect. 1.2.1), PMP/PSP will require two passes or stages to perform
data processing in real time, where the first pass/stage acquires the necessary
a posteriori knowledge required by the algorithm to be implemented and the
second pass/stage runs the algorithm in real time. However, there is a
disadvantage to this method. If the obtained a posteriori knowledge needs to be
updated sample by sample, PMP/PSP must be implemented over and over again.
This results in significant increases in computing time, not to
mention that such time delays make it impossible to implement the algorithm
in real time. Part III of this book is dedicated to resolving this issue.
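To indicate the flavor of the recursions Part III develops (the code and notation here are mine, not the book's), suppose a newly generated signature m is appended to the signature matrix M used by least-squares-based processing; the inverse (M^T M)^{-1} can then be grown by a Schur-complement block inversion instead of being recomputed from scratch each time:

```python
import numpy as np

def grow_inverse(MtM_inv, M, m):
    """Given inv(M^T M), return inv([M m]^T [M m]) via the Schur
    complement, so appending one signature never redoes a full inverse."""
    b = M.T @ m
    u = MtM_inv @ b
    beta = m @ m - b @ u                 # Schur complement (a scalar here)
    top_left = MtM_inv + np.outer(u, u) / beta
    return np.block([[top_left, -u[:, None] / beta],
                     [-u[None, :] / beta, np.array([[1.0 / beta]])]])

rng = np.random.default_rng(3)
M = rng.normal(size=(20, 3))             # 20 bands, 3 known signatures
m = rng.normal(size=20)                  # newly generated signature

inv_new = grow_inverse(np.linalg.inv(M.T @ M), M, m)
M2 = np.column_stack([M, m])
assert np.allclose(inv_new, np.linalg.inv(M2.T @ M2))
```

Only the innovations contributed by the new signature (the vector b and scalar beta) are computed, so previously processed signatures need not be revisited.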

1.3 Recursive Hyperspectral Band Processing

RHBP can be considered a parallel theory to RHSP (Sect. 1.2), both of which
correspond to two completely different data acquisition formats, BIP/BIS/BIL and
BSQ, respectively. While these two processes appear similar, the techniques devel-
oped for both are quite different because recursive equations derived for BIP/BIS/
BIL are based on a covariance/correlation matrix or signature matrix formed by
data sample vectors, whereas the recursive equations derived for BSQ are from a
band matrix, which is formed by band images, not data sample vectors.
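The distinction between the two acquisition formats can be made concrete with a small sketch (a hypothetical 2 × 2 pixel, three-band cube; numpy is assumed):

```python
import numpy as np

# A hypothetical 2 x 2 pixel, 3-band cube (rows x columns x bands).
cube = np.arange(12).reshape(2, 2, 3)

# Samplewise (BIP/BIS/BIL-style) view: the processing unit is a pixel vector
# holding all band values of one spatial location; RHSP recursions consume
# these rows one at a time.
sample_vectors = cube.reshape(-1, 3)        # 4 samples x 3 bands
print(sample_vectors.shape)                 # (4, 3)

# Bandwise (BSQ-style) view: the processing unit is a whole band image;
# RHBP recursions consume these slices one at a time.
band_images = np.moveaxis(cube, -1, 0)      # 3 bands x 2 x 2
print(band_images.shape)                    # (3, 2, 2)
```

The covariance/correlation recursions of RHSP grow with the rows of the first view, while the band-matrix recursions of RHBP grow with the slices of the second.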
In recent years, hyperspectral band processing has generally been performed by
either data compression via linear/nonlinear transforms or data reduction via band
selection (BS), each of which has its own merits and applications. However, from a
real-time processing point of view, BS seems to be a better option since data
compression generally requires the entire data set before transforms are carried
out and cannot be implemented in a causal manner; thus, it cannot be implemented

in real time. This will be particularly important for data communications and
transmission in future satellite hyperspectral data processing. Thus, Parts IV
and V of this book will focus on how BS can be extended and expanded from
various perspectives.

1.3.1 Band Selection

Various spectral bands provide different levels of information of interest. The primary goal of BS is to select an appropriate band subset from the original band
set to represent the original data in some sense of optimality. Therefore, the
information preserved by BS has a significant impact on data analysis because the
information of unselected bands will be completely discarded following BS. Thus,
a key objective in BS is designing effective criteria so that BS can satisfy the
requirements of various applications. This generally requires an exhaustive search for all possible

\[ \binom{L}{|\Omega_{BS}|} = \frac{L!}{|\Omega_{BS}|!\,(L - |\Omega_{BS}|)!} \]

combinations, with L and |ΩBS| being the total number of spectral bands in Ω and the number of bands to be selected in ΩBS, respectively.
More specifically, assume that J(·) is a generic objective function of ΩBS for BS to be optimized over all possible ΩBS. For a given number of selected bands, nBS, the goal of BS is to find an optimal band subset, Ω*BS with |Ω*BS| = nBS, that satisfies the following optimization problem:

\[ \Omega_{BS}^{*} = \arg\max/\min_{\Omega_{BS} \subset \Omega,\; |\Omega_{BS}| = n_{BS}} J(\Omega_{BS}). \tag{1.1} \]

Depending on how the objective function J(ΩBS) is designed, the optimization in (1.1) can be performed by either maximization or minimization over all possible band subsets ΩBS in Ω with |ΩBS| = nBS.
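As an illustration of the search behind (1.1), the sketch below exhaustively evaluates a generic objective J over every nBS-band subset; the variance-sum objective here is only a placeholder for illustration, not a criterion advocated by the book:

```python
from itertools import combinations
from math import comb

import numpy as np

def exhaustive_band_selection(X, n_bs, J):
    """X: N samples x L bands; return the n_bs-band subset maximizing J."""
    L = X.shape[1]
    print(f"evaluating {comb(L, n_bs)} candidate subsets")  # the C(L, n_BS) cost
    return max(combinations(range(L), n_bs),
               key=lambda bands: J(X[:, bands]))

# Placeholder objective for illustration only: total variance of the subset.
J = lambda Xs: Xs.var(axis=0).sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6)) * np.array([1, 5, 1, 3, 1, 1])  # bands 1, 3 dominate
print(exhaustive_band_selection(X, 2, J))  # (1, 3), the two highest-variance bands
```

Even at this toy scale the search evaluates C(6, 2) = 15 subsets; for hundreds of bands the count is astronomical, which is what motivates the progressive and recursive alternatives discussed next.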
Over the years many BS techniques have been investigated using various criteria
or features to define J(ΩBS) in (1.1). In addition, two more key issues needed to be
addressed in connection with BS: (1) the number of bands that need to be selected,
nBS, and (2) how to select the appropriate bands, ΩBS via (1.1). While the first issue
was addressed by VD, recently developed by Chang (2003a, b) and Chang and Du
(2004), the second issue seems more challenging because it is determined by
particular applications that drive the different designs of BS algorithms.

1.3.2 Progressive Hyperspectral Band Processing

As noted earlier, the determination of nBS may be resolved by VD. Unfortunately, current VD estimation techniques are generally developed based on data statistics
and structures not specified by particular applications that actually determine how
bands are selected. To mitigate this problem, an alternative approach to BS was
developed by Chang et al. (2011a, b, c, d) and Chang (2013) and is called
progressive band dimensionality processing (PBDP), in which bands are processed
without determining the value of nBS. This was further extended to progressive
band selection by Chang and Liu (2014) and Chang (2013). To properly select
bands, both BS and PBDP approaches require band prioritization (BP) criteria to
first rank bands so that bands can be selected according to their priorities. However,
different BP criteria generally result in different sets of selected bands. In partic-
ular, when BP is designed based on data statistics and structures, the bands selected
by BP may have nothing to do with applications. Another issue is that if a band is
selected by its high priority, it is very likely that its adjacent bands may also be
selected owing to their high correlation with the selected band. This furthermore
indicates that these bands may be redundant and should not be selected. To resolve
such a dilemma, band decorrelation (BD) is required to remove bands having high
correlation with selected bands. However, this also leads to another challenging
issue: determining an appropriate threshold for BD so that BD between bands and
bands below this threshold will be not selected. Apparently, such a threshold cannot
be fixed and must adapt to the data set and applications. To avoid the use of BP and
BD, we need to reinvent the wheel and seek a completely different approach. This
book develops a new approach, called PHBP, that does not rely on BP and
BD. Instead, PHBP is performed using algorithms that can be specified by a
particular application. For example, in applications of hyperspectral target detec-
tion, Chang et al. (2015a, b) and Chang and Li (2016a, b) developed PHBP versions
of AD and CEM, respectively.
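The BP-then-BD procedure described above can be sketched as follows; variance as the BP criterion and the 0.9 correlation threshold are stand-in choices for illustration, not the book's prescriptions, and as the text notes the threshold would in practice have to adapt to the data set and application:

```python
import numpy as np

def bp_bd_select(X, n_bs, corr_threshold=0.9):
    """BP + BD sketch: rank bands by variance (a stand-in BP criterion), then
    skip any band whose correlation with an already-selected band is too high."""
    order = np.argsort(X.var(axis=0))[::-1]       # band prioritization (BP)
    C = np.abs(np.corrcoef(X, rowvar=False))      # band-to-band correlation
    selected = []
    for b in order:
        if all(C[b, s] < corr_threshold for s in selected):  # decorrelation (BD)
            selected.append(int(b))
        if len(selected) == n_bs:
            break
    return selected

# Demo: bands 0 and 1 are near duplicates; BD keeps only one of them.
rng = np.random.default_rng(0)
base = rng.normal(size=100)
X = np.column_stack([3 * base,
                     3 * base + 0.01 * rng.normal(size=100),
                     rng.normal(size=100)])
print(bp_bd_select(X, 2))   # one of bands 0/1, plus band 2
```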

1.3.3 Recursive Hyperspectral Band Processing

One issue associated with implementing PHBP developed in Sect. 1.3.2 is that it
must repeatedly recalculate causal band matrices as new band images come in. As
the number of bands grows large, such calculation becomes excessive and may be
impossible to manage. To address this computational issue, PHBP is further
extended to RHBP, which not only can implement PHBP progressively band by
band but can also update causal band matrices recursively band by band via
recursive equations. As a consequence, there are several salient differences between
PHBP and RHBP. First, RHBP can be considered a Kalman-filtering-like process
carried out by recursive equations. Second, owing to its recursive nature, RHBP can
also be implemented in real time. Third, RHBP can be easily implemented in
hardware, which provides further feasibility in chip design. Finally and most
importantly, RHBP has great promise in future hyperspectral data communication
and transmission.
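A minimal sketch of the recursive idea (not the book's exact equations): keep the causal band matrix, here a Gram matrix of the band images received so far, and compute only the new row and column when the next band image arrives:

```python
import numpy as np

class CausalBandGram:
    """Grow the causal band matrix (here a Gram matrix of band images)
    recursively, one new row/column per incoming band image, instead of
    recomputing the whole matrix from scratch; a sketch, not the book's
    exact recursive equations."""
    def __init__(self):
        self.bands = []            # flattened band images received so far
        self.G = np.empty((0, 0))  # current l x l causal band matrix

    def add_band(self, band_image):
        b = np.asarray(band_image, dtype=float).ravel()
        cross = np.array([prev @ b for prev in self.bands])
        l = len(self.bands)
        G_new = np.empty((l + 1, l + 1))
        G_new[:l, :l] = self.G                 # reuse the previous result
        G_new[:l, l] = G_new[l, :l] = cross    # only the new row/column is computed
        G_new[l, l] = b @ b
        self.bands.append(b)
        self.G = G_new
        return self.G

# Demo: feeding 3 band images matches direct batch computation.
rng = np.random.default_rng(0)
imgs = rng.normal(size=(3, 4, 4))     # 3 hypothetical 4 x 4 band images
g = CausalBandGram()
for im in imgs:
    G = g.add_band(im)
B = imgs.reshape(3, -1)
print(np.allclose(G, B @ B.T))        # True
```

Each update costs one row of inner products rather than a full recomputation, which is the source of the real-time and hardware advantages listed above.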

1.3.3.1 Sample Spectral Statistics-Based Recursive Hyperspectral Band Processing

Section 1.2.1 develops sample spectral statistics-based RHSP for real-time constrained energy minimization (RT-CEM) and real-time anomaly detection
(RT-AD). Correspondingly, we can also follow a similar treatment to derive sample
spectral statistics-based RHBP for CEM and AD as RHBP-CEM and RHBP-AD.
While RHBP-CEM and RHBP-AD may look similar to RT-CEM and RT-AD, their
conceptual developments are quite different. First, since RHSP is implemented
samplewise sample by sample according to the BIP/BIS/BIL format, it can be
carried out in real time. However, this is not true for RHBP because it is
implemented bandwise to acquire an entire band image one at a time, band by
band. Second, the causal sample spectral statistics-based covariance/correlation
matrix used by RHSP must be rederived for causal sample spectral statistics-
based band covariance/correlation matrices for RHBP. Third, RHSP can be
implemented as part of RHBP since RHSP is considered a special case of RHBP
when it uses up all the bands. In other words, RHSP is the completion of RHBP when the full set of bands has been processed.

1.3.3.2 Signature Spectral Statistics-Based Recursive Hyperspectral Band Processing

Similarly, a theory of signature spectral statistics-based RHBP can also be derived as a counterpart of signature spectral statistics-based RHSP in Sect. 1.2.2. In this
case, ATGP, OSP, LSMA, and SGA developed for RHSP can also be rederived as
their RHBP counterparts, RHBP-ATGP, RHBP-OSP, RHBP-LSMA, and RHBP-
SGA along with RHBP-PPI and RHBP-FIPPI whose RHSP counterparts are
already developed in Chaps. 8 and 12 of Chang (2016).

1.4 Scope of Book

This book comprises five parts. Part I: Fundamentals provides the background knowledge readers need to follow subsequent chapters. Parts II–V are devoted to RHSP and RHBP. In particular, Parts II and III design and develop techniques for sample
spectral statistics-based RHSP and signature spectral statistics-based RHSP,
respectively, whereas Parts IV and V follow treatments similar to those in Parts II
and III to design and develop techniques for sample spectral statistics-based RHBP
and signature spectral statistics-based RHBP, respectively.

1.4.1 Part I: Fundamentals

Many important applications can be found in hyperspectral imaging. Part I presents background knowledge for those unfamiliar with these topics. It consists of four
chapters. Chapter 2 covers three basic spectral unmixing techniques: abundance-
unconstrained, partially abundance-constrained, and fully abundance-constrained
methods, which play a key role in many subsequently developed techniques, such
as endmember finding and target detection and classification. Chapter 3 covers
another main theme in the book. It presents the fundamentals of designing and
developing hyperspectral imaging algorithms recursively. Since VD was originally defined as the number of spectrally distinct signatures, it does not specify what spectrally distinct signatures are. For example, there are different types of spectrally distinct signatures, such as anomalies, endmembers, unsupervised targets, and signatures used to form an LMM for LSMA or linear spectral unmixing (LSU), each of which
requires a different value of VD. To address this issue, Chap. 4 extends VD to
target-specified VD (TSVD).

1.4.2 Part II: Sample Spectral Statistics-Based Recursive Hyperspectral Sample Processing

Part II discusses recursive real-time RHSP for target detection, which makes the covariance/correlation matrix K/R recursive to update targets of interest as well as progressive to process data sample vectors. Its counterpart, real-time RHSP for signature generation, to be treated in Part III, finds signatures recursively on a single-pixel processing basis while progressively processing data sample vectors in
real time. As described in Chang (2016), causality is a prerequisite for real-time
processing. In this case, the concept of causality must be included in design and
development of target detection algorithms where causal sample covariance matrix
(CSCVM) and causal sample correlation matrix (CSCRM) are introduced as causal
versions of the commonly used global sample covariance matrix and global sample
correlation matrix, respectively. By virtue of these newly defined causal covariance/
correlation matrices, hyperspectral target detection can be carried out in real time
based on a BIP/BIS/BIL data acquisition format. Most importantly, the profiles of
various degrees of sample-wise progressive background suppression by CSCVM
and CSCRM can also be generated for visual assessment, with target detection
taking place recursively which cannot be accomplished by conventional target
detection.
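The causal sample correlation matrix at sample n, R(n) = (1/n) Σ x_i x_i^T, admits a one-term recursive update, and the Sherman-Morrison identity gives its inverse recursively as well; the sketch below illustrates the idea (the exact recursions used by the detectors are developed in Chaps. 5 and 6):

```python
import numpy as np

def update_R(R_prev, n, x):
    """Causal sample correlation matrix: R(n) = ((n-1)/n) R(n-1) + (1/n) x x^T."""
    x = x.reshape(-1, 1)
    return ((n - 1) / n) * R_prev + (x @ x.T) / n

def update_R_inv(Rinv_prev, n, x):
    """Sherman-Morrison rank-one update of R(n)^{-1}: no matrix inversion needed,
    which is what makes samplewise real-time detection tractable."""
    x = x.reshape(-1, 1)
    B = Rinv_prev / (n - 1)                    # inverse of (n-1) R(n-1)
    Bx = B @ x
    return n * (B - (Bx @ Bx.T) / (1.0 + float(x.T @ Bx)))

# Demo: recursive updates agree with direct recomputation at every step.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))                   # 10 samples, 3 bands
R = (X[:4].T @ X[:4]) / 4                      # start from the first 4 samples
Rinv = np.linalg.inv(R)
for n in range(5, 11):                         # samples 5..10 arrive one by one
    R = update_R(R, n, X[n - 1])
    Rinv = update_R_inv(Rinv, n, X[n - 1])
print(np.allclose(R, (X.T @ X) / 10))          # True
print(np.allclose(Rinv, np.linalg.inv((X.T @ X) / 10)))  # True
```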
Two chapters (Chaps. 5 and 6) on recursive real-time hyperspectral target
detection are included in Part II:
• Chapter 5: Recursive Real Time Hyperspectral Sample Processing for Active
Target Detection: Constrained Energy Minimization;
• Chapter 6: Recursive Real Time Hyperspectral Sample Processing for Passive
Target Detection: Anomaly Detection.
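For orientation, the detector of Chap. 5 is CEM, whose filter is w = R^{-1}d / (d^T R^{-1}d); the batch sketch below shows the filter itself, not the recursive real-time form the chapter develops:

```python
import numpy as np

def cem(X, d):
    """Constrained energy minimization (batch form): X is N samples x L bands,
    d is the desired target signature; returns one detection score per sample.
    The filter w = R^{-1} d / (d^T R^{-1} d) satisfies w^T d = 1 exactly."""
    R = (X.T @ X) / X.shape[0]        # sample correlation matrix
    Rinv_d = np.linalg.solve(R, d)
    w = Rinv_d / float(d @ Rinv_d)    # CEM filter weights
    return X @ w                      # w^T x for every sample x

# Demo: a sample equal to d scores exactly 1 by the unity constraint.
rng = np.random.default_rng(0)
d = np.array([1.0, 0.5, 0.2, 0.8])
X = rng.normal(size=(200, 4))
X[10] = d                             # plant the target signature
scores = cem(X, d)
print(np.isclose(scores[10], 1.0))    # True
```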

1.4.3 Part III: Signature Spectral Statistics-Based Recursive Hyperspectral Sample Processing

Unlike hyperspectral target detection, which relies on the covariance/correlation matrix to capture targets of interest, another type of hyperspectral imaging algorithm is pixel based and does not use sample spectral statistics. Accordingly, its
performance is completely determined by effective annihilation of undesired sig-
natures so as to improve target detectability. However, in reality the prior knowl-
edge of such signatures is generally not available and must be found directly from
the data to be processed. Consequently, finding signatures in an unsupervised
manner becomes very challenging because the same two key issues arising in
RHSP, described in Part II—(1) how many signatures must be generated and
(2) finding these signatures of interest—also need to be resolved. To deal with
these two issues, Part III, which consists of six chapters, Chaps. 7–12, develops
recursive theories for several well-known algorithms currently being used in the
literature: ATGP, OSP, LSMA, maximum likelihood estimation (MLE), and SGA,
each of which is described in a separate chapter in Part III as follows:
• Chapter 7: RHSP of Automatic Target Generation Process
• Chapter 8: RHSP of Orthogonal Subspace Projection
• Chapter 9: RHSP of Maximum Likelihood Estimation
• Chapter 10: RHSP of Linear Spectral Mixture Analysis
• Chapter 11: RHSP of Orthogonal Projection-Based Growing Simplex Volume
Analysis
• Chapter 12: RHSP of Geometric Growing Simplex Volume Analysis

1.4.4 Part IV: Sample Statistics-Based Recursive Hyperspectral Band Processing

Following the same logic used to derive Part II, a parallel theory is also developed
in Part IV for real-time RHBP based on the BSQ data acquisition format. Thus, Part
IV presents recursive real-time causal band covariance/correlation-based RHBP for
target detection, which makes use of the causal band covariance/correlation matrix
to update targets of interest band by band, while Part V addresses recursive real-
time RHBP for finding signatures band by band.
Since the process is carried out band by band in real time, causal versions of
band covariance/correlation matrices similar to CSCVM and CSCRM are also
introduced as causal band sample covariance matrix (CBSCVM) and causal band
sample correlation matrix (CBSCRM) for target detection. Two chapters
(Chaps. 13 and 14) are included as counterparts to Chaps. 5 and 6 in Part II:

• Chapter 13: RHBP for Active Target Detection: Constrained Energy


Minimization
• Chapter 14: RHBP for Passive Target Detection: Anomaly Detection.

1.4.5 Part V: Signature Statistics-Based Recursive Hyperspectral Band Processing

Although Part V can also be considered the counterpart to Part III, it should be
noted that in Part III signatures are updated recursively using full band information
through RHSP. By contrast, in Part V signatures are updated band by band while the
number of signatures is fixed by RHBP. As a result, there is no need to determine
how many signatures must be generated. Instead, RHBP allows users to determine
how many bands must be used when the performance of algorithms shows little
improvement as the number of bands increases.
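The band-stopping idea can be sketched generically: feed bands to the algorithm and stop once its figure of merit changes by less than a tolerance (the scores and tolerance below are purely illustrative):

```python
def bands_needed(scores, tol=0.01):
    """scores[l-1] is an algorithm's figure of merit after l bands have been
    processed; return the number of bands after which successive improvement
    first drops below tol (all bands if that never happens)."""
    for l in range(1, len(scores)):
        if abs(scores[l] - scores[l - 1]) < tol:
            return l          # l bands were enough; later bands add little
    return len(scores)

# Demo: performance flattens out after the third band.
print(bands_needed([0.50, 0.80, 0.90, 0.905, 0.906]))  # 3
```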
Six chapters (Chaps. 15–20) are included in Part V:
• Chapter 15: RHBP of Automatic Target Generation Process
• Chapter 16: RHBP of Orthogonal Subspace Projection
• Chapter 17: RHBP of Linear Spectral Mixture Analysis
• Chapter 18: RHBP of Growing Simplex Volume Analysis
• Chapter 19: RHBP of Iterative Pixel Purity Index
• Chapter 20: RHBP of Fast Iterative Pixel Purity Index

1.5 Real Hyperspectral Images to Be Used in This Book

Three real hyperspectral image data sets are frequently used in this book in
experiments. Two are AVIRIS real image data sets, Cuprite Mining District in
Nevada and Purdue’s Indian Pine test site in Indiana. A third image data set is the
HYperspectral Digital Imagery Collection Experiment (HYDICE) image scene
(Basedow et al. 1992). Each of these three data sets is briefly described in what
follows.

1.5.1 AVIRIS Data

Two AVIRIS data sets presented in this section are Cuprite data and Lunar Crater
Volcanic Field (LCVF) data, which can be used for different purposes in
applications.

1.5.1.1 Cuprite Data

One of the most widely studied hyperspectral image scenes available in the public
domain is the Cuprite Mining District in Nevada (Fig. 1.3a) with a region of interest
shown as a subscene in Fig. 1.3b, which is a 224-band image 350 × 350 pixels in
size collected over the Cuprite Mining District site in Nevada in 1997.
It is well understood mineralogically. As a result, a total of 189 bands were used
for experiments where bands 1–3, 105–115, and 150–170 were removed prior to the
analysis owing to water absorption and low signal-to-noise ratio (SNR) in those
bands.
Although there are more than five minerals in the data set in Fig. 1.4a, which is
an enlarged figure of Fig. 1.3b, the ground truth available for this region only
provides the locations of the pure pixels: alunite (A), buddingtonite (B), calcite (C),
kaolinite (K), and muscovite (M). The locations of these five pure minerals are
labeled A, B, C, K, and M, respectively, and shown in Fig. 1.4b.
The five pure pixels shown in Fig. 1.4b are carefully verified using laboratory
spectra in Fig. 1.5 provided by the USGS (available at http://speclab.cr.usgs.gov).

Fig. 1.3 Cuprite Mining District image scene (a) original Cuprite Mining District image scene;
(b) the image cropped from the center region of the original scene in (a) (350 × 350)

Fig. 1.4 (a) Spectral band number 170 of Cuprite Mining District AVIRIS image scene; (b)
spatial positions of five pure pixels corresponding to minerals alunite (A), buddingtonite (B),
calcite (C), kaolinite (K), and muscovite (M)


Fig. 1.5 Five USGS ground truth mineral spectra

It is recommended that bands 1–3, 105–115, and 150–170, owing to water absorption and low SNR in those bands, be removed prior to data processing.
As a result, a total of 189 bands are used for experiments, and the reflectance and
radiance signatures of the five pure pixels shown in Fig. 1.4b are shown in Fig. 1.6a,
b. The steps to produce the spectra in Fig. 1.6a, b follow:
1. Download the laboratory reflectance data at http://speclab.cr.usgs.gov/.
2. Use Spectral Angle Mapper (SAM) (Chang 2003a, b) as a spectral similarity
measure to identify the five pixels in Fig. 1.4a that correspond to the five
reflectances obtained in step 1 using the following procedure.


Fig. 1.6 (a) Reflectance signatures of five minerals marked in Fig. 1.4b in wavelengths;
(b) radiance signatures of five minerals marked in Fig. 1.4b in bands


Fig. 1.7 Alteration map available from USGS

• Remove noisy bands from the five reflectance data.


• Remove bands with abnormal readings from the spectral library.
• To measure the spectral similarity, several bands still need to be removed to
account for compatibility.
Note that the ground truth is not stored in a “file.” The locations of the five
minerals are identified by comparing their reflectance spectra against their
corresponding lab reflectance values in the spectral library.
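The SAM measure used in step 2 is simply the angle between two spectra; a minimal sketch:

```python
import numpy as np

def sam(s1, s2):
    """Spectral Angle Mapper: the angle (in radians) between two spectra;
    smaller angles mean more similar spectral shapes."""
    s1, s2 = np.asarray(s1, float), np.asarray(s2, float)
    cos = (s1 @ s2) / (np.linalg.norm(s1) * np.linalg.norm(s2))
    return np.arccos(np.clip(cos, -1.0, 1.0))

# SAM is invariant to scaling, which is why it can match an image pixel
# against a laboratory reflectance spectrum recorded in different units.
print(sam([1, 2, 3], [2, 4, 6]))   # 0.0
print(sam([1, 0], [0, 1]))         # ~1.5708 (orthogonal spectra)
```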
Figure 1.7 also shows an alteration map for some of the minerals that is
generalized from a ground map provided by the USGS and obtained by Tricorder
SW version 3.3. Note that this radiometrically calibrated and atmospherically
corrected data set, available at http://aviris.jpl.nasa.gov, is provided in reflectance
units with 224 spectral channels, where the data were calibrated and atmospheri-
cally rectified using the ACORN software package.

1.5.1.2 Lunar Crater Volcanic Field

A second AVIRIS image scene to be used in our experiments is shown in Fig. 1.8a.
It is the Lunar Crater Volcanic Field (LCVF) located in Northern Nye County,
Nevada, which is one of the earliest hyperspectral image scenes studied in the
literature. Atmospheric water bands and low-SNR bands were removed from the


Fig. 1.8 (a) AVARIS LCVF scene; (b) spectra of anomaly, cinders, dry lake, rhyolite, shade,
and vegetation

data, reducing the image cube from 224 to 158 bands. The LCVF image has a 10 nm
spectral resolution and 20 m spatial resolution.
The image scene in Fig. 1.8a is relatively simple compared to the Cuprite Mining
District scene in Fig. 1.4a, where there are five targets of interest: red oxidized
basaltic cinders, rhyolite, playa (dry lake), vegetation, and shade, whose radiance
spectra are plotted in Fig. 1.8b. In addition to these five target signatures, there is an
interesting target that is an anomaly two pixels in size located in the upper left dry
lake with its spectral signature also plotted in Fig. 1.8b.

1.5.2 HYDICE Data

The HYDICE image scene shown in Fig. 1.9a is 200 × 74 pixel vectors in size, and
its ground truth is provided in Fig. 1.9b, where the center and boundary pixels of the
objects are highlighted by red and yellow, respectively. The upper part contains
fabric panels 3, 2, and 1 m2 in size from the first column to the third column. Since
the spatial resolution of the data is 1.56 m, the panels in the third column are treated as subpixel targets. The lower part contains different vehicles with sizes of 4 × 8 m (the first four vehicles in the first column) and 6 × 3 m (the bottom vehicle
in the first column) and three objects in the second column (the first two have a size
of two pixels and the bottom one has three pixels), respectively. In this particular
scene there are three types of targets with different sizes, small targets (panels of
three different sizes, 3, 2, and 1 m2) and large targets (vehicles of two different sizes, 4 × 8 m and 6 × 3 m, and three objects two and three pixels in size) to be used
to validate and test target detection performance.
Figure 1.9c shows an enlarged HYDICE scene from the same flight for visual
assessment. It has a size of 33 × 90 pixel vectors with 10 nm spectral resolution and 1.56 m spatial resolution, where five vehicles are lined up vertically, parked along the tree line in a field. The red (R) pixel vectors in Fig. 1.9d show the center pixels of the vehicles, while the yellow (Y) pixels are vehicle pixels mixed with background pixels.

Fig. 1.9 HYDICE vehicle scene; (a) image scene; (b) ground truth map; (c) five vehicles; (d)
ground truth of (c)


Fig. 1.10 (a) A HYDICE panel scene containing 15 panels; (b) ground truth map of spatial
locations of the 15 panels

A third enlarged HYDICE image scene (Fig. 1.10a) is also cropped from the
upper part of the image scene in Fig. 1.9a, b marked by a square.
It has a size of 64 × 64 pixel vectors, with 15 panels in the scene. This particular
image scene was studied in depth in Chang (2003a, b, 2013, 2016). The scene
contains a large grass field background, a forest at the left edge, and a barely visible
road running along the right edge. Bands 1–3 and 202–210 are low signal/high
noise bands, and bands 101–112 and 137–153, which are water vapor absorption
bands, were removed. The spatial resolution is 1.56 m, and the spectral resolution is
10 nm. There are 15 panels located in the center of the grass field. They are arranged
in a 5 × 3 matrix, as shown in Fig. 1.10b, which provides the ground truth map of
Fig. 1.10a with the panel signatures in watts per square meter. Each element in this
matrix is a square panel and denoted by pij with rows indexed by i = 1, . . . , 5 and columns indexed by j = 1, 2, 3.
For each row i = 1, . . . , 5, the three panels were painted using the same material,
but they have three different sizes. According to the ground truth, the panels in rows
2 and 3 are made of the same materials with slightly different colors of paint, a light
olive parachute and a dark olive parachute. Consequently, if panels could be
detected in row 2, then panels in row 3 would also be detectable, and vice versa.
The same principle is also applicable to panels in rows 4 and 5, which are also made
of the same green fabrics with/without spectral tags. Nevertheless, they were still
considered different materials. For each column j = 1, 2, 3, the five panels have the
same size. The sizes of the panels in the first, second, and third columns are 3 × 3 m, 2 × 2 m, and 1 × 1 m, respectively.
Thus, the 15 panels have 5 different materials and 3 different sizes. Figure 1.10b
shows the precise spatial locations of these 15 panels, where red (R) pixels are a
panel’s center pixels and yellow pixels (Y) are panel pixels mixed with the
background. The 1.56 m spatial resolution of the image scene suggests that the
panels in the second and third columns, denoted by p12, p13, p22, p23, p32, p33, p42,
p43, p52, p53 in Fig. 1.10b, are one pixel in size. Additionally, except for the panel in
the first row and first column, denoted by p11, which also has a size of one pixel, all
other panels located in the first column are two-pixel panels; these are the panel in

the second row with two pixels lined up vertically, denoted by p211 and p221, the
panel in the third row with two pixels lined up horizontally, denoted by p311 and
p312, the panel in the fourth row with two pixels also lined up horizontally, denoted
by p411 and p412, and the panel in the fifth row with two pixels lined up vertically,
denoted by p511 and p521. Since the size of the panels in the third column is 1 × 1 m, they cannot be seen visually from Fig. 1.10a because their size is less than the 1.56 m pixel resolution.
It is worth noting that panel pixel p212, marked yellow in Fig. 1.10b, is of
particular interest. Based on the ground truth, this panel pixel is not a center
panel pixel representing a pure panel pixel. It is marked by a Y pixel to indicate
that it is a boundary panel pixel. However, in our extensive and comprehensive
experiments, conducted in later chapters, this Y panel pixel is always extracted as
the one with the most spectrally distinct signature compared to the R panel pixels in
row 2. This indicates that a signature of spectral purity is not equivalent to a
signature of spectral distinction. As a matter of fact, in many cases, panel pixel
p212 instead of panel pixel p221 is the first panel pixel extracted by endmember
finding algorithms to represent the panel signature in row 2, as will be demonstrated
experimentally in many chapters throughout the book. Also, because of such
ambiguity, the panel signature in the second row can be considered to be
represented by either p221 or p212. This implies that the ground truth of R panel
pixels in the second row in Fig. 1.10b may not be as pure as was originally thought.
Figure 1.11 plots the five panel spectral signatures obtained from Fig. 1.10b,
where the ith panel signature, denoted by pi, was generated by averaging the red
panel center pixels in row i. These panel signatures will be used to represent the
target knowledge of the panels in each row.


Fig. 1.11 Spectra of p1, p2, p3, p4, and p5



Fig. 1.12 Areas identified by ground truth and marked by three background signatures, grass, tree, and road, plus an interferer

A visual inspection and the ground truth in Fig. 1.10a, b also reveal four background signatures (Fig. 1.12), which can be identified and marked as interferer, grass, tree, and road. The interferer is a large rock located in the woods/forest. The
trees are selected from the woods/forest located at the left edge. The road runs
vertically along the right edge of the woods/forest. The grass is a dirt field where
panels are located. These four signatures, along with the five panel signatures in
Fig. 1.11, can be used to form a nine-signature matrix for a linear mixing model to
perform supervised LSMA.

1.5.3 Hyperion Data

The data to be studied were collected by the Hyperion sensor mounted on the Earth
Observer 1 (EO-1) satellite. Hyperion uses a high-resolution hyperspectral imager to
record Earth surface images of approximately 7.5 × 100 km with a 30 m spatial
resolution and a 10 nm spectral resolution. Image scene EO1H0150322011201110K3
was downloaded from the USGS Earth Explorer website (EarthExplorer. 01 Sept.
2013 http://earthexplorer.usgs.gov/). The image used is a Hyperion L1 data product,
which includes 198 channels of calibrated spectral information. This image was then
cropped to select a region of interest (ROI) covering the western side of the Chesa-
peake Bay Bridge. The resulting image cube is a square with spatial dimensions of 64 × 64 pixels and 198 spectral bands ranging from 426 to 2396 nm. The image was
then compared with higher-resolution areal imagery to identify five distinct areas of
interest (AOIs). The areas include a beach, Westinghouse Bay, Mezick Ponds,
farmland, and a large corporate building. Four pixels were selected at random from each of these areas and used to generate an average spectral signature for the material contained in the area. The exception is the farmland, which used seven pixels because it is composed of four disjoint regions in the image: four pixels were selected from the largest farmland region and a single pixel was chosen from each of the three remaining regions. A color image of the region and the spectral
signatures of each of the five AOIs are shown in Fig. 1.13.


Fig. 1.13 (a) Cropped Hyperion image scene EO1H0150322011201110K3. (b) Cropped Hype-
rion image scene EO1H0150322011201110K3 with five areas of interest (AOIs): beach (dark
blue), Westinghouse Bay (green), Mezick Ponds (orange), farmland (light blue), and a corporate
building (magenta). (c) Average spectral signatures for each of the five AOIs in (b)

1.6 Synthetic Images to Be Used in this Book

Since real images generally do not have complete ground truth about the endmembers, we must rely on synthetic images simulated with complete knowledge to conduct quantitative analysis for performance evaluation.
Several synthetic images recently developed in Chang (2013, 2016) and Wu
et al. (2009) can be used for this purpose. These synthetic images were


Fig. 1.14 (a) Cuprite Mining District AVIRIS image scene; (b) spatial positions of five pure
pixels corresponding to minerals A, B, C, K, and M; (c) five mineral reflectance spectra and
background signature (b); (d) five mineral radiance spectra and background signature in (b)

custom-designed and simulated based on the Cuprite Mining District image data
available on the USGS website http://aviris.jpl.nasa.gov/. This Cuprite Mining
District image scene (Fig. 1.4b) can be used to simulate synthetic images for our
study. Although there may be more than five minerals in the data set in Fig. 1.4a, the
ground truth available for this region only provides the locations of the five pure
pixels: alunite (A), buddingtonite (B), calcite (C), kaolinite (K), and muscovite
(M) (Fig. 1.4b). For clarity, Figs. 1.4a, b and 1.6a, b are reproduced in Fig. 1.14a–d,
which will be used to simulate synthetic images with various scenarios. An area
marked “BKG” in the upper right corner of Fig. 1.14a was selected to find its
sample mean, that is, the average of all pixel vectors within the area BKG, denoted
by b and plotted in Fig. 1.14c, d, which will be used to simulate the background for the image scene in Fig. 1.15.
The synthetic image designed here simulates 25 panels shown in Fig. 1.15, with
5 panels in each row simulated by the same mineral signature and 5 panels in each
column having the same size.
Among the 25 panels are five 4 × 4 pure-pixel panels in the first column (one per row), five 2 × 2 pure-pixel panels in the second column, five 2 × 2 mixed-pixel panels in the third column, and five 1 × 1 subpixel panels in each of the fourth and fifth columns, where the mixed and subpanel pixels were simulated according to the legend in Fig. 1.15. Thus, a total of 100 pure pixels (80 in the first column and 20 in the second column), referred to as endmember pixels, were simulated in the data by the five endmembers, A, B, C, K, and M. The area BKG in Fig. 1.14a was selected empirically because it appears more homogeneous than other regions; nevertheless, other areas could also be selected for the same purpose. This b-simulated image background is further corrupted by additive noise to achieve a certain SNR, which is defined
as 50 % signature (i.e., reflectance/radiance) divided by the standard deviation of the noise, as defined in Harsanyi and Chang (1994). Once target pixels and background are simulated, two types of target insertion can be designed to simulate experiments for various applications.

Fig. 1.14 (continued) (c) Five mineral reflectance spectra and the background signature b (500–2500 nm). (d) Five mineral radiance spectra and the sample mean of area BKG (400–2400 nm)
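As an illustration, this SNR definition can be sketched in a few lines of Python (a minimal sketch with toy values; the helper name is ours, not the book's):

```python
import numpy as np

def noise_sigma_for_snr(signature, snr):
    """Noise standard deviation realizing a target SNR under the
    Harsanyi-Chang definition: SNR = 50 % signature / noise std."""
    return 0.5 * float(np.mean(signature)) / snr

rng = np.random.default_rng(0)
b = np.full(189, 2000.0)               # flat toy background signature, 189 bands
sigma = noise_sigma_for_snr(b, 20.0)   # e.g., an SNR of 20:1 gives sigma = 50
noisy_b = b + rng.normal(0.0, sigma, size=b.shape)
```

A flat signature is used only to keep the numbers obvious; any simulated background signature b can be substituted.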
Fig. 1.15 A set of 25 panels simulated by A, B, C, K, and M, with panel pixels that are 100 % pure, 50 % signal + 50 % any of the other four signatures, 50 % signal + 50 % background, or 25 % signal + 75 % background

The first type of target insertion is a target implantation (TI), which can be
simulated by inserting clean target panels into a clean image background by
replacing their corresponding background pixels, referred to as TI1, and into a
clean image background corrupted by additive Gaussian noise by replacing their
corresponding background pixels, referred to as TI2. In both scenarios, there are
still 100 pure panel pixels, all located in the first and second columns. A third
scenario, referred to as TI3, is simulated by adding a Gaussian noise to TI1, in
which case the clean panels are corrupted by an additive Gaussian noise. As a result,
there are no pure signatures in TI3. The TI scenarios are designed to see what role the Gaussian noise term n in model (1) plays in responding to three criteria, OP, simplex volume, and fully constrained abundance fractions, where the panels are clean in TI1 and TI2 while the clean panels in TI3 are contaminated by additive Gaussian noise.
A second type of target insertion is target embeddedness (TE), which can also be
simulated by embedding clean target panels into a clean image background plus
additive Gaussian noise by superimposing target pixels over the background pixels.
Like TI, we can also simulate TE1, TE2, and TE3 as counterparts of TI1, TI2, and
TI3, respectively, with the only key difference among them being how target panels
are inserted. With TE, there are no endmembers present in the TE1, TE2, or TE3 scenarios. The need for the TE scenarios can be seen by adding a background signature b to the standard signal detection model

r = s + n,  (1.2)

which yields

r = s + b + n,  (1.3)

the same signal/background + noise (SBN) model used by Thai and Healey (2002). By virtue of (1.3), it is a signal detection model in which the abundance sum-to-one constraint (ASC) is violated. In this case, it is interesting to see how a pure signature can be found as an endmember by various algorithms.

Since TI2 and TE2 are of major interest for the experiments conducted in this
book, they will be referred to as TI and TE for simplicity throughout the book,
without the suffix 2.
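The key difference between the two insertion types can be made concrete with a toy sketch (assuming NumPy; the cube size and signatures are illustrative only): TI replaces the corresponding background pixel, whereas TE superimposes the target on it, so no pure target pixel survives.

```python
import numpy as np

rng = np.random.default_rng(1)
bkg = np.full((8, 8, 4), 10.0)          # toy background cube (rows, cols, bands)
panel = np.array([3.0, 1.0, 4.0, 1.0])  # toy target signature
noise = rng.normal(0.0, 0.1, bkg.shape)

# TI (target implantation): the target REPLACES its background pixel,
# so a pure panel pixel survives
ti = bkg + noise
ti[2, 2, :] = panel

# TE (target embeddedness): the target is SUPERIMPOSED on the background,
# r = s + b + n, so there is no pure target pixel and the ASC is violated
te = bkg + noise
te[2, 2, :] += panel
```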

1.7 How to Use this Book

Although this book can be studied chapter by chapter, it can also be used as a
reference book without having to go back and forth between chapters. To make
each chapter standalone and self-contained, the image scenes used for the experiments are briefly described in each chapter, with more details in Sects. 1.6 and 1.7
in this chapter. In addition, many descriptions and statements may be repeated and
reiterated in the introduction of each chapter, so that readers do not have to waste
time flipping through other chapters looking for the information they need. Thus,
readers who already possess knowledge on a particular topic can skip these descriptions and go straight to the desired sections. Finally, the book is arranged in five
different parts such that each part not only can be read independently but is also
closely tied to the other parts. Most importantly, each part is arranged as an
integrated entity on a specific theme.

1.8 Notations and Terminology Used in the Book

Since this book deals primarily with real hyperspectral data, image pixels are
generally mixed and not necessarily pure, and the term endmember is not used;
instead, a general term like signature or signature vector is used. In addition,
because we are only interested in target analysis, the term targets instead of
materials is used throughout the book. To draw a distinction between a target
pixel and its spectral signature vector, we use the notation t to represent the target
pixel, r to denote an image pixel vector, and s or m to indicate its spectral signature
vector. We also use bold uppercase for matrices and bold lowercase for vectors.
When a vector is used in the text, it is generally described by a row vector specified
by an open parenthesis “()” with transpose “T” as a superscript. When a vector is
used in mathematical derivations, it is generally described by a column vector with
closed brackets “[]” instead of “()” for better illustration. In other words, open
parenthesis “(.)” and bracket “[.]” are interchangeable in vector representation
whichever is more appropriate. An italic uppercase L will be used for the total
number of spectral bands, K for the sample spectral covariance matrix, and
R for the sample spectral correlation matrix. Also, δ*(r) is used to represent a
detector or classifier that operates on an image pixel vector r, where the superscript
“*” in δ*(r) specifies what type of detector or classifier is to be used. It should be noted that δ*(r) is a real-valued function that takes the form of an inner product of a filter vector w* with r, that is, δ*(r) = (w*)ᵀr, with the filter vector w* specified by a particular detector or classifier. We also use α and α̂ to represent the abundance vector and its estimate, where the hat over the α denotes “estimate.”
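As a small illustration of this notation (toy numbers only; this is not one of the detectors developed in the book), a detector output δ*(r) is simply the inner product of a filter vector with a pixel vector:

```python
import numpy as np

w = np.array([0.2, 0.5, 0.3])        # hypothetical filter vector w*
r = np.array([100.0, 220.0, 180.0])  # an image pixel vector with three toy bands
delta = float(w.T @ r)               # real-valued detector output (w*)^T r
```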
Part I
Fundamentals

This book is devoted to the design and development of recursive hyperspectral imaging algorithms for various applications. Part I includes three chapters that present the fundamental concepts and knowledge required for understanding the rest of the book: Chap. 2: “Simplex Volume Calculation,” Chap. 3: “Discrete-Time Kalman Filter for Hyperspectral Processing,” and Chap. 4: “Target-Specified Virtual Dimensionality for Hyperspectral Imagery.”
As for finding endmembers, the most widely used criterion for optimality is the
simplex volume (SV). However, finding a correct and true SV was overlooked in
earlier research since SV calculation is generally carried out by computing matrix
determinants, and this approach has been considered straightforward. Unfortunately, it was shown in Li et al. (2015) and Li (2016) that such determinant-based SV calculation can only produce true SVs when the matrix to be calculated is square. For a nonsquare matrix, using a matrix determinant to compute SVs suffers from a matrix singularity problem and may result in incorrect SVs. Chapter 2 investigates this issue. Furthermore, to explore the recursive structures of the designed algorithms, it is necessary to consider how algorithms are implemented. This leads to Chap. 3, in which discrete-time Kalman filtering is studied; its concept will be used to derive the recursive hyperspectral image algorithms developed in this book. Finally, for unsupervised hyperspectral target detection algorithms, two key issues are of interest: (1) how many targets should be present in the data? and (2) how can these targets be found? Chapter 4 develops a unified theory of target-specified virtual dimensionality that not only provides constructive algorithms for finding targets but also uses the found targets to determine the number of targets that need to be generated by the algorithms.
Chapter 2
Simplex Volume Calculation

Abstract Using maximal simplex volume (SV) as an optimal criterion for finding
endmembers is a common approach and has been widely adopted in the literature.
However, very little work has been reported on how SV is calculated because it can
be simply done by finding the determinant of a matrix formed by vertices of a
simplex. Interestingly, it turns out that calculating SV is much more complicated
and involved than one might think. This chapter investigates this issue from two
different aspects, through eigenanalysis and a geometric approach. The
eigenanalysis takes advantage of the Cayley–Menger determinant to calculate the
SV, referred to as determinant-based SV (DSV) calculation. The major issue with
this approach is that when the matrix is of ill rank, calculating the determinant runs
into a singularity problem. To deal with this issue, two methods are generally
considered. One is to perform data dimensionality reduction to make the matrix
of full rank. The drawback of this method is that the original volume has been
shrunk and the found volume of a dimensionality-reduced simplex is not the true
original SV. The other is to use singular value decomposition to find singular values
for calculating the SV. An issue arising from this method is its instability in
numerical calculations. An alternative to eigenanalysis is a geometric approach
derived from the simplex structure whose volume can be calculated by the product
of the base and height of a simplex. The resulting approach is called geometric SV
(GSV) calculation. This chapter explores DSV and GSV calculations, with further
discussions in Chaps. 11 and 12.

2.1 Introduction

Endmembers are defined as pure spectral signatures and provide very important
information for hyperspectral data exploitation. Accordingly, finding endmembers
has attracted considerable interest in recent years (Chang 2003a, b, 2013, 2016, and
references therein). Using maximal simplex volume (SV) as an optimal criterion is
a common practice in finding endmembers owing to the fact that the purity of
endmembers can be specified by the geometric convexity of a simplex. Therefore,
many well-known SV-based endmember finding algorithms (EFAs) have been


developed and reported in the literature, such as the minimum-volume transform (MVT) developed by Craig (1994); the N-finder algorithm (N-FINDR) by Winter (1999a, b), along with its various sequential versions, SeQuential N-FINDR (SQ N-FINDR) and SuCcessive N-FINDR (SC N-FINDR) (Wu et al. 2008; Xiong et al. 2011; Chang 2013, 2016); and the simplex growing algorithm (SGA) (Chang et al. 2006a, b), which were derived to ease the computational complexity of N-FINDR. Interestingly, very little work has been reported on how an SV is calculated because it is a general understanding that SVs can be simply calculated using the Cayley–Menger determinant. Unfortunately, such an SV-formed matrix is generally not of full rank because the number of endmembers used to form the vertices of a simplex is usually much smaller than the data band dimensionality. In this case, the problem of finding the determinant of an ill-ranked matrix needs to be addressed in the matrix determinant calculation. However, this issue of calculating the SV has been overlooked.
In general, there are two common approaches to dealing with the ill-rank issue arising in SV calculation. Both can be considered eigenanalysis methods that use transforms to compact the original high-dimensional data space into a lower-dimensional transformed data space. One method is to preprocess the data via dimensionality compression (DC), such as principal components analysis (PCA). The other is dimensionality reduction (DR), such as band selection (BS), prior to matrix determinant calculations. In both cases the data dimensionality can be reduced to make an ill-ranked matrix a full-ranked matrix. There are two problems with these two approaches. One is how to select an appropriate transform for DC or a BS algorithm for DR. The other is that the transformed data, or the data represented by selected bands, are no longer the original data sample vectors. As a result, the SV calculated in the data-reduced space is not the true SV in the original data space. As an alternative, singular value decomposition (SVD) can be used to find the matrix determinant without DC/DR. While this approach addresses the data dimensionality problem, it runs into another problem, numerical instability, when the data dimensionality becomes large, which is exactly the case in hyperspectral imagery.
To resolve the aforementioned issues, this chapter considers a rather different
approach from a geometric point of view. The idea stems from the fact that the
geometric convexity of a simplex can be used to realize the purity of endmembers
because a simplex satisfies two abundance constraints, the abundance sum-to-one
constraint (ASC) and the abundance nonnegativity constraint (ANC). According to
the geometric structure of a simplex, its volume can be calculated by the product of
its base and height. With this interpretation an (n + 1)-vertex simplex can be
calculated by the product of its base, which is an n-vertex simplex, and its height,
which is perpendicular to its base. Most importantly, such a geometric approach
does not require DR or any eigenanalysis-based methods. Consequently, the
volume that it calculates for any simplex is a true SV.

2.2 Determinant-Based Simplex Volume Calculation

Calculating a determinant-based SV (DSV) is not trivial. Suppose that Sj+1 is a ( j + 1)-vertex simplex specified by j + 1 endmembers, m1, …, mj, mj+1. Let DSV(Sj+1) be the volume of Sj+1 = S(m1, …, mj, mj+1) calculated by the Cayley–Menger determinant:

$$\mathrm{DSV}\left(\mathbf{m}_1,\mathbf{m}_2,\ldots,\mathbf{m}_{j+1}\right)=\frac{\left|\operatorname{Det}\begin{bmatrix}1&1&\cdots&1\\ \mathbf{m}_1&\mathbf{m}_2&\cdots&\mathbf{m}_{j+1}\end{bmatrix}\right|}{j!}\quad\text{for } j\ge 1.\tag{2.1}$$

Since the matrix in (2.1) is generally not a square matrix, the DSV computation cannot be done directly by simply multiplying eigenvalues because the matrix is of ill rank. Three comments on finding the DSV via (2.1) are in order:
1. For a given set of p endmembers in a hyperspectral image with band dimensionality L, the simplex Sp formed by these p endmembers has dimensionality at most p − 1. Because p is usually much smaller than the full spectral dimensionality L, calculating the DSV via (2.1) generally requires DC/DR to reduce L to p − 1 so that the determinant of a square matrix can be found. Once a simplex Sp is reduced to ( p − 1)-dimensional spectral space, the matrix used to calculate the DSV in (2.1) becomes a ( p − 1) × ( p − 1) square matrix, in which case the DSV is simply the product of the eigenvalues of the reduced square matrix.
2. According to the LU factorization of an n × n matrix A (Moon and Stirling 2000), the determinant of A can be calculated as

$$\left|\operatorname{Det}(\mathbf{A})\right|=\left|\operatorname{Det}(\mathbf{PLU})\right|=\left|\operatorname{Det}(\mathbf{P})\operatorname{Det}(\mathbf{L})\operatorname{Det}(\mathbf{U})\right|=\left|\prod\nolimits_{i=1}^{n}u_{ii}\right|,\tag{2.2}$$

where L is a lower triangular matrix and U is an upper triangular matrix, given by

$$\mathbf{L}=\begin{bmatrix}1&0&\cdots&0\\ l_{21}&1&\ddots&\vdots\\ \vdots&\vdots&\ddots&0\\ l_{n1}&l_{n2}&\cdots&1\end{bmatrix},\quad\mathbf{U}=\begin{bmatrix}u_{11}&u_{12}&\cdots&u_{1n}\\ 0&u_{22}&\cdots&u_{2n}\\ \vdots&\vdots&\ddots&\vdots\\ 0&0&\cdots&u_{nn}\end{bmatrix},\tag{2.3}$$

and P is a permutation matrix used for pivoting to achieve numerical stability during the LU factorization. However, it should be noted that the DSV found after DC/DR is not the true volume of Sp but rather an approximation, smaller than the original SV.
3. When no DC/DR is applied, we must deal with the issue of finding DSV for a
nonsquare matrix in (2.1). In this case, SVD (Chang 2013, Chap. 6) can be used
for this purpose. Unfortunately, there is a numerical stability issue associated
with it when its singular values become very small, as discussed in Li
et al. (2015a, b) and Li et al. (2016a, b).
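These observations can be contrasted numerically. The following sketch (assuming NumPy; all names are ours) reduces a nonsquare edge matrix to a square one by projecting onto the principal axes of its own edge vectors, takes the determinant of the square matrix (NumPy computes determinants internally through an LU factorization, as in (2.2)), and also forms the product of nonzero singular values without any DC/DR:

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(0)
L, p = 50, 4                     # 50 bands, p = 4 endmembers
E = rng.random((L, p))           # endmember matrix, one endmember per column
ME = E[:, 1:] - E[:, :1]         # L x (p-1) edge matrix: not square

# Route 1: DC/DR first. Project onto p-1 principal axes of the edge
# vectors so the edge matrix becomes (p-1) x (p-1), then use Det.
U, _, _ = np.linalg.svd(ME, full_matrices=False)
ME_red = U.T @ ME                # (p-1) x (p-1) square matrix
dsv_dr = abs(np.linalg.det(ME_red)) / factorial(p - 1)

# Route 2: no DC/DR. Product of the nonzero singular values of the
# nonsquare ME (the SVD route of comment 3).
svals = np.linalg.svd(ME, compute_uv=False)
dsv_svd = np.prod(svals[svals > 1e-12]) / factorial(p - 1)
```

Because the projection here is derived from the edge vectors themselves, the two routes agree; a transform fitted to the whole data set would generally shrink the volume, as noted in comment 2.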

2.3 Geometric Simplex Volume Calculation

In geometry, a simplex, also called a hypertetrahedron, is the generalization of a tetrahedral region of space to arbitrary dimension j. Suppose that m1, m2, …, mj+1 is a set of any j + 1 data sample vectors. A j-dimensional simplex Sj+1 is a convex polytope specified by j + 1 vertices. It should be noted that the subscript of Sj+1 denotes the number of vertices, j + 1, but the dimensionality of Sj+1 is j owing to the ASC imposed on the simplex, which reduces the dimensionality by one. Thus a single-point convex hull can be considered a one-vertex simplex with zero dimension, a line segment between two specified points can be considered a two-vertex simplex with one dimension, a three-vertex simplex is a triangle with two dimensions, a four-vertex simplex is a tetrahedron with three dimensions, a five-vertex simplex is a pentachoron with four dimensions, and so forth. If m1, m2, …, mj+1 are further assumed to be independent in the sense of affine transformation, which means that m̃2 = m2 − m1, m̃3 = m3 − m1, …, m̃j+1 = mj+1 − m1 are linearly independent, then a ( j + 1)-vertex simplex Sj+1 with dimensionality j is made up of all the data sample vectors specified by

$$S_{j+1}=S\left(\mathbf{m}_1,\mathbf{m}_2,\ldots,\mathbf{m}_{j+1}\right)=\left\{\alpha_1\mathbf{m}_1+\alpha_2\mathbf{m}_2+\cdots+\alpha_{j+1}\mathbf{m}_{j+1}\ \middle|\ \sum\nolimits_{i=1}^{j+1}\alpha_i=1,\ \alpha_i\ge 0\ \text{for}\ 1\le i\le j+1\right\}.\tag{2.4}$$

The content, or hypervolume, of a simplex is referred to as its geometric simplex volume (GSV), denoted by GSV(Sj+1), which represents the volume of a j-dimensional simplex Sj+1 with j + 1 vertices. Note that any vertex of a ( j + 1)-vertex simplex Sj+1 can be regarded as the apex over a j-vertex simplex Sj in a ( j − 1)-dimensional space as a base formed by the other j vertices, as illustrated in Fig. 2.1 (http://www.mathpages.com/home/kmath664/kmath664.htm).
As a matter of fact, we can prove the following theorem, which shows how to calculate the volume of a ( j + 1)-vertex simplex, Sj+1, as GSV(Sj+1).
Theorem 2.1: Geometric Simplex Volume Calculation The volume of a ( j + 1)-vertex simplex, Sj+1, is GSV(Sj+1), given by the volume of its corresponding parallelotope, V(Pj), multiplied by 1/j!, where V(Pj) can be calculated by ∏_{i=1}^{j} h_i, with h_i being the ith height of P_i. Special cases include the one-dimensional P1 as a line segment specified by h1, P2 as a parallelogram specified by h1 and h2, and the three-dimensional parallelepiped P3 illustrated in Figs. 2.1c and 2.2, respectively.
Note that the volume V(Pj) is calculated from the heights specified by the edge vectors m̃2 = m2 − m1, m̃3 = m3 − m1, …, m̃j+1 = mj+1 − m1, with h_{j+1} = ‖m̃⊥_{j+1}‖ obtained from the j + 1 vertices m1, m2, …, mj+1 that form a ( j + 1)-vertex simplex, Sj+1, where m̃⊥_{j+1} is the component of m̃_{j+1} orthogonal to the space linearly spanned by {m̃2, m̃3, …, m̃j} (see also Fig. 12.1 in Chap. 12).
Fig. 2.1 (a) Two-dimensional three-vertex simplex S3 formed by two edge vectors v1, v2; (b)
three-dimensional four-vertex simplex S4 formed by three edge vectors v1, v2, v3; (c) three-
dimensional parallelotope (or parallelepiped) P3 formed by the same edge vectors

Fig. 2.2 Example of a three-dimensional parallelotope (parallelepiped) with heights h1, h2, h3 along the x, y, and z directions

Proof Let Sj+1 be a ( j + 1)-vertex simplex whose base Sj is a j-vertex simplex. Assume that hj is the height of Sj+1, that is, the perpendicular distance (altitude) from the apex of Sj+1 to the subspace ⟨Sj⟩ linearly spanned by its base. For example, h2 is the height of a three-vertex simplex S3 whose base is the two-vertex simplex S2 formed by connecting two data sample points. Similarly, h3 is the height of S4 whose base is a three-vertex simplex S3 formed by three data sample points as a triangle. The volume DSV(Sj+1) of a ( j + 1)-vertex simplex Sj+1 given by (2.1) can be reexpressed by GSV(Sj+1) with a j-vertex simplex Sj as its base and hj as its height, as follows:

$$\mathrm{GSV}\left(S_{j+1}\right)\equiv V\left(S_{j+1}\right)=\int_{h=0}^{h_j}\mathrm{GSV}\left(S_j\right)\left(\frac{h}{h_j}\right)^{j-1}dh=\mathrm{GSV}\left(S_j\right)\frac{1}{h_j^{\,j-1}}\left[\frac{h^j}{j}\right]_{h=0}^{h_j}=\mathrm{GSV}\left(S_j\right)\frac{h_j}{j}\quad\text{for } j\ge 2.\tag{2.5}$$

Note that (2.5) is not defined for j = 1 since GSV(S1) = 0. However, if we further define h1 = GSV(S2) as the initial condition, where S1 is a zero-dimensional one-vertex simplex, then according to (2.5), GSV(Sj+1) can be calculated recursively from the previous GSV(Sj) for j ≥ 2 as

$$\mathrm{GSV}\left(S_{j+1}\right)=\mathrm{GSV}\left(S_j\right)\frac{h_j}{j}=\left(\mathrm{GSV}\left(S_2\right)/j!\right)h_j h_{j-1}\cdots h_2=(1/j!)\,h_j h_{j-1}\cdots h_2 h_1,\tag{2.6}$$

where h1, h2, h3, …, hj are the j edges of a parallelotope Pj shown in Fig. 2.2, with the initial condition h1 = GSV(S2) for (2.6). In other words, given a j-dimensional parallelotope Pj specified by j vectors, with h1, h2, …, hj indicating its j edges, we can consider hj to be the height perpendicular to the base, whose volume can be calculated by ∏_{i=1}^{j−1} h_i. Then the volume of Pj can actually be calculated recursively by

$$V\left(P_j\right)=\prod\nolimits_{i=1}^{j}h_i=h_j\cdot\prod\nolimits_{i=1}^{j-1}h_i=h_j\cdot V\left(P_{j-1}\right),\tag{2.7}$$

with h1 = V(P1) as the initial condition. Therefore, in light of (2.7), GSV(Sj+1) in (2.6) can be rewritten as

$$\mathrm{GSV}\left(S_{j+1}\right)=(1/j!)\prod\nolimits_{i=1}^{j}h_i=(1/j!)\,V\left(P_j\right).\tag{2.8}$$
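The recursions (2.7) and (2.8) can be sketched directly (a minimal sketch assuming NumPy): a QR factorization of the edge matrix performs exactly the successive orthogonalization, so the heights h_i appear as the absolute diagonal entries of R.

```python
import numpy as np
from math import factorial

def gsv_by_heights(vertices):
    """GSV per (2.8): (1/j!) * h1 * h2 * ... * hj, where each height hi is
    the component of edge i orthogonal to the span of the earlier edges."""
    V = np.asarray(vertices, dtype=float)  # columns are the j + 1 vertices
    edges = V[:, 1:] - V[:, :1]            # the j edge vectors
    _, R = np.linalg.qr(edges)             # Gram-Schmidt: heights = |diag(R)|
    j = edges.shape[1]
    return np.prod(np.abs(np.diag(R))) / factorial(j)

# Right triangle with legs 3 and 4 embedded in a 5-band space: area = 6
tri = np.zeros((5, 3))
tri[0, 1], tri[1, 2] = 3.0, 4.0
area = gsv_by_heights(tri)
```

No DR is needed here: the three vertices live in a 5-dimensional space, yet the computed volume is the true area of the triangle.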


Using Theorem 2.1 we can further derive the following theorem.

Theorem 2.2 Assume that m1, m2, …, mj+1 are the j + 1 vertices of a simplex Sj+1. Then the volume of the parallelotope Pj is determined by the j edge vectors m2 − m1, m3 − m1, …, mj+1 − m1 of Sj+1 and can be calculated by the absolute value of Det(M_E), with the edge matrix M_E = [m2 − m1 ⋯ mj+1 − m1] (Stein 1966), as

$$V\left(P_j\right)=\left|\operatorname{Det}\left(\mathbf{M}_{\mathrm{E}}\right)\right|.\tag{2.9}$$

By virtue of (2.8) and (2.9), the GSV of a ( j + 1)-vertex simplex Sj+1 can be expressed as

$$\mathrm{GSV}\left(S_{j+1}\right)=(1/j!)\,V\left(P_j\right)=(1/j!)\left|\operatorname{Det}\left(\mathbf{M}_{\mathrm{E}}\right)\right|.\tag{2.10}$$

Interestingly, (2.9) can also be obtained from

$$\operatorname{Det}\begin{bmatrix}1&1&\cdots&1\\ \mathbf{m}_1&\mathbf{m}_2&\cdots&\mathbf{m}_{j+1}\end{bmatrix}=\operatorname{Det}\begin{bmatrix}1&0&\cdots&0\\ \mathbf{m}_1&\mathbf{m}_2-\mathbf{m}_1&\cdots&\mathbf{m}_{j+1}-\mathbf{m}_1\end{bmatrix}=\operatorname{Det}\left[\,\mathbf{m}_2-\mathbf{m}_1\ \ \mathbf{m}_3-\mathbf{m}_1\ \cdots\ \mathbf{m}_{j+1}-\mathbf{m}_1\,\right]=\operatorname{Det}\left[\,\tilde{\mathbf{m}}_2\ \tilde{\mathbf{m}}_3\ \cdots\ \tilde{\mathbf{m}}_{j+1}\,\right],\tag{2.11}$$

which is well known as the Cayley–Menger determinant of (2.1). Comparing (2.11) to (2.1), it is worth noting that (2.1) is the simplex volume calculation based on the j + 1 vertices m1, m2, …, mj+1 used to form a ( j + 1)-vertex simplex, while (2.8) is the simplex volume calculation based on m̃2 = m2 − m1, m̃3 = m3 − m1, …, m̃j+1 = mj+1 − m1, which form the j edges of a ( j + 1)-vertex simplex, Sj+1. As a result, a ( j + 1)-vertex simplex represented by its j edge vectors has one dimension less than the same simplex Sj+1 represented by its j + 1 vertices with the ASC imposed.
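The identity (2.11) is easy to verify numerically for a small full-dimensional case (a toy check assuming NumPy): for three 2-D points, the bordered determinant of (2.1) equals the edge-matrix determinant.

```python
import numpy as np

m1 = np.array([0.0, 0.0])
m2 = np.array([3.0, 0.0])
m3 = np.array([0.0, 4.0])

bordered = np.vstack([np.ones(3), np.column_stack([m1, m2, m3])])  # 3 x 3
edges = np.column_stack([m2 - m1, m3 - m1])                        # 2 x 2

det_bordered = np.linalg.det(bordered)  # 12
det_edges = np.linalg.det(edges)        # 12, so the triangle area is 12/2! = 6
```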
Using (2.8) we can show that finding DSV(Sj+1) is equivalent to calculating GSV(Sj+1) as follows:

$$\mathrm{DSV}\left(S_{j+1}\right)=(1/j!)\left|\operatorname{Det}\left(\mathbf{M}_{\mathrm{E}}\right)\right|=(1/j!)\left|\operatorname{Det}\begin{bmatrix}1&1&\cdots&1\\ \mathbf{m}_1&\mathbf{m}_2&\cdots&\mathbf{m}_{j+1}\end{bmatrix}\right|=(1/j!)\left|\operatorname{Det}\left[\,\tilde{\mathbf{m}}_2\ \tilde{\mathbf{m}}_3\ \cdots\ \tilde{\mathbf{m}}_{j+1}\,\right]\right|=\mathrm{GSV}\left(S_{j+1}\right).\tag{2.12}$$

It is also worth noting that the volume derived by (2.12) is only applicable when M_E is a square matrix. More details on (2.12) can be found in Li et al. (2015a, b).
When no DR is performed, we must deal with the issue of an ill-ranked matrix determinant for a nonsquare matrix in (2.12). In this case, SVD can be used. Generally, the SVD M_E = UΣV* can be applied to any m × n matrix such as M_E, with the following two valid relations:

1. M_E*M_E = VΣ*U*UΣV* = V(Σ*Σ)V*;
2. M_EM_E* = UΣV*VΣ*U* = U(ΣΣ*)U*.

Consequently, the columns of V are eigenvectors of M_E*M_E, the columns of U are eigenvectors of M_EM_E*, and the nonzero elements of Σ, the singular values, are the square roots of the nonzero eigenvalues of M_E*M_E or M_EM_E*. Thus, the pseudodeterminant of M_E, Det(M_E^#), is defined as

$$\operatorname{Det}\left(\mathbf{M}_{\mathrm{E}}^{\#}\right)=\left[\operatorname{Det}\left(\mathbf{M}_{\mathrm{E}}^{*}\mathbf{M}_{\mathrm{E}}\right)\right]^{1/2}=\left[\operatorname{Det}\left(\mathbf{M}_{\mathrm{E}}\mathbf{M}_{\mathrm{E}}^{*}\right)\right]^{1/2}=\prod\nolimits_{\lambda_j\ne 0}\lambda_j^{1/2}=\prod\nolimits_{\sigma_j\ne 0}\sigma_j,\tag{2.13}$$

where the λ_j are the nonzero eigenvalues of M_E*M_E or M_EM_E* and σ_j is the corresponding singular value of Σ. Unfortunately, V(Sj+1) calculated by (2.12) does not give the true volume of a simplex Sj+1, but there is no such problem with using (2.8).
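As a small numerical check of (2.13) (assuming NumPy), the product of the singular values of a nonsquare, full-column-rank M_E coincides with [Det(M_E*M_E)]^(1/2):

```python
import numpy as np

rng = np.random.default_rng(4)
ME = rng.random((50, 3))         # nonsquare edge matrix, full column rank

svals = np.linalg.svd(ME, compute_uv=False)
pseudo_det = np.prod(svals)      # all three singular values are nonzero here
gram_root = np.sqrt(np.linalg.det(ME.T @ ME))
```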
Theoretically, both (2.10) and (2.12) should yield the same GSV when finding the volume of a ( j + 1)-vertex simplex in a j-dimensional space. In practical implementation, however, they do not produce the same result owing to the nonexistent determinant of a nonsquare matrix, which must be dealt with by applying DR or SVD; the height-based calculation in (2.8), on the other hand, has no such issue.
Note that there is a significant difference between finding the orthogonal projection (OP) perpendicular to a set of endmembers and finding the OP perpendicular to the simplex they form. Figure 2.3 shows a three-endmember-vertex simplex example for illustration.
Assume that m1, m2, and m3 are three endmembers forming a simplex S(m1, m2, m3), which is specified by the triangle connecting the three vertices m1, m2, and m3 lying on the hyperplane highlighted in blue, and S(m1, m2) is simply the segment connecting m1 and m2, with simplex volume given by the vector length of m2 − m1, ‖m2 − m1‖. S(m1, m2, m3) is obtained by adding to the two-vertex simplex S(m1, m2) a third endmember m3 that yields the maximal OP, m3 = arg max_r {‖P⊥_[m2−m1] r‖}, where the segment Am3 is perpendicular to S(m1, m2). Since S(m1, m2) must satisfy the ASC, α1 + α2 = 1, S(m1, m2) is reduced to a one-dimensional simplex. Thus, the third endmember for S(m1, m2, m3) is found by carrying out P⊥_[m2−m1] on the space perpendicular to the segment m2 − m1 and finding the data sample vector that yields the maximal OP, which is m3. However, as also shown in Fig. 2.3, P⊥_[m1 m2] is the OP perpendicular to the hyperplane highlighted in red linearly spanned by m1 and m2 with no ASC imposed. As a result, t3 = arg max_r {‖P⊥_[m1 m2] r‖}, derived from the well-known automatic target generation process (ATGP) developed by Ren and Chang (2003), is different from m3, which is found from the segment m2 − m1 rather than the space linearly spanned by m1 and m2. This major difference arises from the fact that P⊥_[m1 m2] operates on the two-dimensional space linearly spanned by the two independent vectors m1 and m2, while P⊥_[m2−m1] operates on the one-dimensional space spanned by m2 − m1, owing to the fact that the effect of the vector m1 is removed from the simplex volume calculation by placing m1 at the origin.
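The distinction can be sketched numerically (assuming NumPy; the vectors are toy values): projecting the same sample orthogonally to the segment m2 − m1 and orthogonally to the span of m1 and m2 leaves different residuals.

```python
import numpy as np

def perp_proj(U, r):
    """Component of r orthogonal to the column space of U: P_U^perp r."""
    return r - U @ np.linalg.pinv(U) @ r

m1 = np.array([1.0, 0.0, 0.0])
m2 = np.array([0.0, 1.0, 0.0])
r = np.array([1.0, 1.0, 1.0])

# SGA-style: orthogonal to the single edge m2 - m1 (ASC removes one dimension)
r_sga = perp_proj((m2 - m1).reshape(-1, 1), r)    # -> (1, 1, 1)
# ATGP-style: orthogonal to the 2-D space spanned by m1 and m2 (no ASC)
r_atgp = perp_proj(np.column_stack([m1, m2]), r)  # -> (0, 0, 1)
```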
Fig. 2.3 Three-endmember simplex example for illustration

2.4 General Theorem for Geometric Simplex Volume Calculation

Using Theorem 2.1 we can further prove the following theorem.


Theorem 2.3: Geometric Simplex Volume Calculation by OP Let

$$\mathbf{t}_{j+1}^{\mathrm{SGA}}=\arg\max\nolimits_{\mathbf{t}_{j+1}}\mathrm{SV}\left(\mathbf{m}_1,\mathbf{m}_2,\ldots,\mathbf{m}_j,\mathbf{t}_{j+1}\right)\tag{2.14}$$

and

$$\mathbf{t}_{j+1}^{\mathrm{ATGP}}=\arg\max\nolimits_{\mathbf{r}}\left\{\left\|P_{\mathbf{U}_j}^{\perp}\mathbf{r}\right\|\right\},\tag{2.15}$$

with U_j = [m1 m2 ⋯ mj]. Then

$$\left\|\mathbf{t}_{j+1}^{\mathrm{ATGP,OP}}\right\|=\left\|\mathbf{t}_{j+1}^{\mathrm{SGA,OP}}\right\|=h_j^{*}\tag{2.16}$$

and

$$\begin{aligned}V\left(S_{j+1}\right)=\mathrm{GSV}\left(\mathbf{m}_1,\ldots,\mathbf{m}_j,\mathbf{m}_{j+1}\right)&=\left\|\mathbf{m}_1-\mathbf{m}_2\right\|\prod_{i=2}^{j}\left(1/i\right)h_i^{*}\\&=\left\|\mathbf{m}_1-\mathbf{m}_2\right\|\prod_{i=3}^{j+1}\bigl(1/(i-1)\bigr)\bigl\|\mathbf{t}_i^{\mathrm{SGA,OP}}\bigr\|\\&=\left\|\mathbf{m}_1-\mathbf{m}_2\right\|\prod_{i=3}^{j+1}\bigl(1/(i-1)\bigr)\bigl\|\mathbf{t}_i^{\mathrm{ATGP,OP}}\bigr\|,\end{aligned}\tag{2.17}$$

where h_j* is the maximal height perpendicular to the simplex Sj = S(m1, …, mj−1, mj). Note that h_j* is defined as the height of the simplex Sj+1 = S(m1, …, mj−1, mj, mj+1); it is perpendicular to its base, Sj = S(m1, …, mj−1, mj), and can be obtained using t_{j+1}^SGA in (2.14) or t_{j+1}^ATGP in (2.15).
Proof Using Fig. 2.3 as an illustrative example, we can proceed with the proof by mathematical induction as follows.
1. Initial condition: j = 2 (degenerate simplex)
Since the lowest-dimensional simplex is a one-dimensional two-vertex degenerate simplex, let m1 and m2 be the two initial endmembers forming a degenerate (m1, m2) simplex given by the line segment connecting m1 and m2, denoted m1m2. There are two ways to obtain m1 and m2. One is to find the data sample vector with maximal vector length as m1 and then take m2 to be the data sample vector with maximal distance from m1. The other is to find the two data sample vectors with maximal mutual distance as m1 and m2. In either case, SV(m1, m2) = ‖m2 − m1‖. To find the third endmember m3, m3 should be the sample that yields the maximal height, that is, the maximal OP perpendicular to the line segment m1m2. More specifically, let U2 = [m1 m2], and let ⟨U2⟩ be the subspace linearly spanned by the two endmembers m1 and m2. Then any data sample vector r can be decomposed into the form r = r^OP + r^{U2}, where r^OP ∈ ⟨U2⟩⊥, which is orthogonal to the space ⟨U2⟩, and r^{U2} ∈ ⟨U2⟩. Therefore, t3^ATGP and t3^SGA can be decomposed as t3^ATGP = t3^{ATGP,OP} + t3^{ATGP,U2}, with t3^{ATGP,OP} ∈ ⟨U2⟩⊥ and t3^{ATGP,U2} ∈ ⟨U2⟩, and t3^SGA = t3^{SGA,OP} + t3^{SGA,U2}, with t3^{SGA,OP} ∈ ⟨U2⟩⊥ and t3^{SGA,U2} ∈ ⟨U2⟩, respectively, as shown in Fig. 2.4, where there are three ways to represent the third endmember m3 via the vector Om3:
(a) Om3 = Om1 + m1m3 with m1m3 = m1A + Am3, Am3 = m3^OP, m1A = m31^{U2}, and ‖m3^OP‖ = h2*;
(b) Om3 = Om2 + m2m3 with m2m3 = m2A + Am3, Am3 = m3^OP, m2A = m32^{U2}, and ‖m3^OP‖ = h2*;
(c) Om3 = OA + Am3, where OA = Om1 + m1A, OA = Om2 + m2A, and ‖m3^OP‖ = h2*.
Note that the spatial locations of m1, m2, and m3 are included in the information of m1A = m31^{U2} and m2A = m32^{U2} in the ⟨U2⟩ hyperplane. As a result, none of the three representations has any effect on Am3 = m3^OP.


According to Theorem 2.1, this height should be the one producing the triangle with maximal area, calculated by multiplying the base ‖m1 − m2‖ by the maximal height h2* given by h2* = ‖t3^{ATGP,OP}‖. This implies that

$$\max\nolimits_{\mathbf{r}}V\left(\mathbf{m}_1,\mathbf{m}_2,\mathbf{r}\right)=\mathrm{GSV}\left(\mathbf{m}_1,\mathbf{m}_2,\mathbf{m}_3\right)=(1/2)\left\|\mathbf{m}_1-\mathbf{m}_2\right\|h_2^{*}=(1/2)\left\|\mathbf{m}_1-\mathbf{m}_2\right\|\bigl\|\mathbf{t}_3^{\mathrm{ATGP,OP}}\bigr\|.\tag{2.18}$$

However,

$$\max\nolimits_{\mathbf{r}}\mathrm{GSV}\left(\mathbf{m}_1,\mathbf{m}_2,\mathbf{r}\right)=\mathrm{GSV}\left(\mathbf{m}_1,\mathbf{m}_2,\mathbf{t}_3^{\mathrm{SGA}}\right)=(1/2)\left\|\mathbf{m}_1-\mathbf{m}_2\right\|\bigl\|\mathbf{t}_3^{\mathrm{SGA,OP}}\bigr\|.\tag{2.19}$$

Equating (2.18) to (2.19) yields h2* = ‖t3^{ATGP,OP}‖ = ‖t3^{SGA,OP}‖. In this case, m3 can be either t3^ATGP or t3^SGA with the same maximal OP given by h2* = ‖t3^{ATGP,OP}‖ = ‖t3^{SGA,OP}‖.
2. Now assume that (2.16) and (2.17) are true for any positive integer j endmembers. We would like to prove that (2.16) and (2.17) are also true for j + 1 endmembers.

Since ⟨Uj⟩ is the space linearly spanned by the j endmembers m1, m2, . . ., mj, it obviously also contains the simplex formed by m1, m2, . . ., mj, denoted by S(m1, m2, . . ., mj), because a simplex is a convex set. In this case, the height of the simplex S(m1, m2, . . ., mj) must be perpendicular to S(m1, m2, . . ., mj) and thus to ⟨Uj⟩. More specifically, again using Fig. 2.4 as an illustrative example, by replacing m3 with m_{j+1}, ||m3^{OP}|| = h2* with ||m_{j+1}^{OP}|| = hj*, and m1, m2 with m1, m2, . . ., mj, any data sample vector r can be decomposed into the form r = r^{OP} + r_{Uj}, where r^{OP} ∈ ⟨Uj⟩^⊥ and r_{Uj} ∈ ⟨Uj⟩. Thus, the data sample vector that yields the maximal height should be the one lying in the hyperplane specified by P^⊥_{Uj}, with maximal ||P^⊥_{Uj} r|| over all data sample vectors r, according to (2.15).
In other words,

GSV(m1, m2, . . ., mj, t_{j+1}^{ATGP})
= GSV(m1, m2, . . ., mj) × (1/j) × ||t_{j+1}^{ATGP,OP}||   [by (2.6)]
= GSV(m1, m2, . . ., mj) × (1/j) × max_{hj} hj   [by (2.15)]
= GSV(m1, m2, . . ., mj) × (1/j) × hj*
= ||m2 − m1|| × [∏_{i=2}^{j−1} (1/i) hi*] × (1/j) × hj*   (by induction)
= (1/j!)||m2 − m1|| × ∏_{i=2}^{j} hi*   [by (2.6)]
= max_{t_{j+1}} GSV(m1, m2, . . ., mj, t_{j+1}) = GSV(m1, m2, . . ., m_{j+1}).   (2.20)

Fig. 2.4 Illustration of the initial condition: the third endmember m3 is decomposed into its orthogonal component with ||m3^{OP}|| = h2* and its projections m31^{U2} and m32^{U2} in the ⟨U2⟩ plane

On the other hand,

GSV(m1, m2, . . ., mj, m_{j+1}) = max_{t_{j+1}} GSV(m1, m2, . . ., mj, t_{j+1})
= GSV(m1, m2, . . ., mj, t_{j+1}^{SGA})
= GSV(m1, m2, . . ., mj) × (1/j) × ||t_{j+1}^{SGA,OP}||
= GSV(m1, m2, . . ., mj) × (1/j) × max_{hj} hj   [by (2.15)]
= GSV(m1, m2, . . ., mj) × (1/j) × hj*
= ||m2 − m1|| × [∏_{i=2}^{j−1} (1/i) hi*] × (1/j) × hj*   (by induction)
= (1/j!)||m2 − m1|| × ∏_{i=2}^{j} hi*   [by (2.6)]
= GSV(m1, m2, . . ., mj, t_{j+1}^{ATGP})   [by (2.15)].   (2.21)

By virtue of (2.20) and (2.21), both t_{j+1}^{ATGP} and t_{j+1}^{SGA} yield the same maximal SV, that is, (1/j!)||m2 − m1|| ∏_{i=2}^{j} hi*, by (2.6). More specifically, the OPs of t_{j+1}^{ATGP} and t_{j+1}^{SGA} projected on the P^⊥_{Uj}-specified hyperplane should be the same, ||t_{j+1}^{ATGP,OP}|| = ||t_{j+1}^{SGA,OP}||. □

An immediate result from Theorem 2.3 is the following corollary, by definition.

Corollary

(a) GSV(m1, m2, . . ., mj, t_{j+1}^{SGA}) ≥ GSV(m1, m2, . . ., mj, t_{j+1}^{VCA}) for all j ≥ 2, by (2.14).
(b) ||t_{j+1}^{ATGP,OP}|| ≥ ||t_{j+1}^{VCA,OP}|| for j ≥ 2, by (2.15).
(c) ||t_{j+1}^{SGA,OP}|| = ||t_{j+1}^{ATGP,OP}|| ≥ ||t_{j+1}^{VCA,OP}|| by (2.17) and (a)–(b).

Using Theorem 2.3 we can further derive the GSV for (2.12):


GSV(S_{j+1}) = (1/j!)|Det(M_E)| = (1/j!)|Det([1 1 · · · 1; m1 m2 · · · m_{j+1}])|
= (1/j!)|Det([m̃2 m̃3 · · · m̃_{j+1}])|
= (1/j!)||m2 − m1|| × ∏_{i=2}^{j} hi* = GSV(S_{j+1}).   (2.22)

2.5 A Mathematical Toy Example

In this section, we present a mathematical toy example to make a comparison of SV


calculations using various approaches described in this chapter. Let us define the
SV calculated by (2.6) as a GSV, the SV calculated by the Cayley–Menger
determinant by (2.1) as a determinant-based volume (DSV), and the SV calculated
by DSV using DR as DR-DSV or that using SVD as SVD-DSV. Theoretically, a
GSV and a DSV without DR or SVD can be shown to be identical. In addition, a
GSV using DR, referred to as a DR-GSV, is also included for comparison, where the DR-GSV is derived by reducing the simplex S_{j+1} from the L-dimensional data space to a j-dimensional space and then applying (2.6) to the dimensionality-reduced simplex.
To compare the volume values of simplexes using these methods, we first use
two simple 2D simplexes and a 3D simplex, all in three-dimensional space to
illustrate their differences.
First, three different types of simplexes are generated in a three-dimensional
space. A three-vertex triangle is formed by three points randomly generated in a
three-dimensional space, as shown in Fig. 2.5a, where three vertices are specified
by their spatial coordinates (7,7,7), (6,10,2), and (7,2,1). Also, a three-vertex
simplex shown as a regular triangle with equal edge lengths of 1.633 is generated
in Fig. 2.5b, where the three vertices are specified by their spatial coordinates (1,0,0), (−0.3333,0.9428,0), and (−0.3333,−0.4714,0.8165). Finally, an arbitrary
four-vertex simplex is generated as a pyramid (Fig. 2.5c) with the vertices specified
by their spatial coordinates (8,2,4), (7,3,8), (4,7,7), (4,0,3). Unlike the first two
simplexes in Fig. 2.5a and b, whose dimensionality is two, which is one less than the

Fig. 2.5 (a) Triangle; (b) regular triangle; (c) 3D simplex with one vertex in origin, in 3D space

Table 2.1 Comparative analysis of SV values among various methods for simplexes in three dimensions

Simplex                            GSV       DSV       SVD-DSV    PCA-GSV   PCA-DSV
Arbitrary 2D simplex (Fig. 2.5a)   21.8518   N/A       155.5426   21.8518   21.8518
Regular 2D simplex (Fig. 2.5b)     1.1547    N/A       1.2172     1.1547    1.1547
Arbitrary 3D simplex (Fig. 2.5c)   15.8333   15.8333   15.8333    N/A       N/A

data dimensionality of three, this four-vertex simplex has the same dimensionality
as its data dimensionality, three.
Table 2.1 tabulates the values of a SV calculated by various versions of SV
calculation methods using GSV and DSV, where “N/A” indicates “not applicable.”
As we see in Table 2.1, all three methods, GSV, DSV, and SVD-DSV, yield the
same SV value when the dimensionality of the simplex equals the data dimension-
ality. In this case, no dimensionality reduction is needed, and SVD could be used as
an alternative method to obtain the SV. But this result does not justify the use of
SVD for calculating the volumes of simplexes in a different dimensional space.
However, GSV still yields the same SV values as PCA-GSV and PCA-DSV when the dimensionality of a simplex is smaller than that of the data space, whereas the SV values calculated using SVD-DSV are not consistent.
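As a concrete check on Table 2.1, the GSV of a simplex embedded in a higher-dimensional space can be computed directly from the Gram determinant of its edge vectors, which is algebraically equivalent to the base-times-heights product in (2.6). The sketch below is our own illustration (the name `simplex_volume` is not from the book) and reproduces the GSV column of Table 2.1:

```python
import math
import numpy as np

def simplex_volume(vertices):
    """GSV of the simplex spanned by j + 1 vertices in an L-dimensional space.

    Uses V = sqrt(det(E^T E)) / j!, where the columns of E are the edge
    vectors m_i - m_1; the j x j Gram matrix E^T E stays full rank even
    when j < L, so no dimensionality reduction is needed.
    """
    v = np.asarray(vertices, dtype=float)
    edges = (v[1:] - v[0]).T               # L x j matrix of edge vectors
    j = edges.shape[1]
    gram = edges.T @ edges                 # j x j Gram matrix of inner products
    return np.sqrt(abs(np.linalg.det(gram))) / math.factorial(j)

# The three simplexes of Fig. 2.5, with coordinates taken from the text
tri = [(7, 7, 7), (6, 10, 2), (7, 2, 1)]                    # arbitrary 2D simplex
reg = [(1, 0, 0), (-0.3333, 0.9428, 0),
       (-0.3333, -0.4714, 0.8165)]                          # regular 2D simplex
tet = [(8, 2, 4), (7, 3, 8), (4, 7, 7), (4, 0, 3)]          # 3D simplex

print(round(simplex_volume(tri), 4))   # 21.8518, as in Table 2.1
print(round(simplex_volume(reg), 4))   # 1.1547
print(round(simplex_volume(tet), 4))   # 15.8333
```

Because only inner products of edge vectors enter the Gram matrix, the same routine handles a 2D triangle in 3D space and a full-dimensional tetrahedron without any special-casing.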

2.6 Real Image Experiments

A real image scene in Fig. 2.6 (also shown in Fig. 1.10), acquired by the airborne
HYperspectral Digital Imagery Collection Experiment (HYDICE) sensor in August
1995 from a flight altitude of 10,000 feet with a ground sampling distance of
approximately 1.56 m, was used for experiments. It has 210 spectral channels
ranging from 0.4 to 2.5 μm with a spectral resolution of 10 nm. The low signal/
high noise bands, bands 1–3 and 202–210, and the water vapor absorption bands,
bands 101–112 and 137–153, were removed. Thus, a total of 169 bands were used
for the experiments. The scene has a size of 64 × 64 pixel vectors.
To compare the SV values calculated by GSV with and without PCA as a DR,
the SGA developed by Chang et al. (2006a, b) was used to find a set of data sample
vectors known as endmembers from the image. The largest simplex is formed by a
set of 170 vertices specified by 170 data sample vectors found using SGA in a
169-dimensional data space. Since SGA finds endmembers progressively one after another, starting from a single endmember as a one-vertex simplex {m0} up to 170 endmembers {m_j}_{j=0}^{169} as a 170-vertex simplex, we used these progressively found endmembers to form a set of growing simplexes one by one, progressively denoted by S_j, where j is the number of endmembers found in order by SGA. S_1 is formed by the single vertex m0, S_2 is a line segment formed by two

Fig. 2.6 HYDICE panel scene

Fig. 2.7 Comparison of SV values calculated by (a) GSV versus PCA-DSV and (b) PCA-GSV versus PCA-DSV

vertices m0 and m1, and S_{j+1} consists of the j + 1 vertices m0, m1, . . ., mj. Figure 2.7 plots the SV values of the growing simplexes S_j against the number of vertices.
According to the results, several interesting observations can be made. First and
foremost, the SV values calculated using DR-DSV and GSV were different, especially when the number of vertices was large. Although the SV values calculated by
DR-DSV for the mathematical example were identical to GSV in Table 2.1,
numerical instability became an issue that had to be addressed as the number of
vertices increased. Second, the SV values produced by both methods showed a
trend whereby the SV value was increased until it reached a peak at a specific
number of vertices; it then began to decrease where a specific number occurs when
the height, h(Sj+1) ¼ hj which is the distance from the newly generated vertex mj+1
for the ( j + 1)-vertex simplex, Sj+1 to the previously formed j-vertex simplex Sj less
46 2 Simplex Volume Calculation

Table 2.2 Computational complexity


Method GSV DSV PCA-DSV SVD-DSV
   
Computational complexity OðpLÞ Oðp Þ
3
O pL2 þ Oðp3 Þ O pL2

1.4
GSV
1.2 PCA-DSV
Accumulated time (sec)

0.8

0.6

0.4

0.2

0
0 20 40 60 80 100 120 140 160 180
number of vertices

Fig. 2.8 Comparisons of cumulative computing times required by GSV and PCA-DSV

hðSjþ1 Þ
than the dimensionality of the simplex Sj+1, j + 1, that is, jþ1 < 1 . This peak
occurred at the 73rd endmember and was the 78th endmember when DR was
performed.
To further compare the performance of each method, a comparative analysis of
the computational complexity is provided in Table 2.2, where p is the number of
endmembers used to specify the vertices of simplexes and L is the data
dimensionality.
Since GSV only involves inner products, it has a complexity of O(pL), compared to DSV, which has a complexity of O(p^3). In real-world problems, high data dimensionality is always an issue. When a DR technique or SVD is performed prior to SV calculation, the computational complexity of implementing PCA-DSV and SVD-DSV becomes O(pL^2) + O(p^3) and O(p^2 L), respectively.

A major advantage of GSV over DSV is that GSV calculates SV using (2.6),
which only uses mathematical products. This cannot be done using DSV since the
volume of Sj can only be calculated one at a time. Figure 2.8 shows the accumulated
time required by GSV to calculate SV by multiplying height h(Sj+1) ¼ hj by the SV
of the previous j-vertex simplex Sj compared to the accumulated time required by
PCA-DSV, where the computing time was obtained by an average of 100 trials. The
experimental results show that GSV can significantly reduce the computing time of
SV calculation.
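The timing advantage comes from updating the SV recursively instead of recomputing a determinant at every step. A sketch of that progressive update, assuming our own helper name `growing_simplex_volumes` and log10 accumulation (as used for the plot in Fig. 2.7b) to avoid overflow for many vertices:

```python
import numpy as np

def growing_simplex_volumes(endmembers):
    """Progressively compute log10 GSV for a growing sequence of simplexes.

    Each new volume is obtained recursively from the previous one via
    V_{j+1} = V_j * h_j / j, where the height h_j of the new vertex is
    found by Gram-Schmidt orthogonalization against the current edge
    span -- inner products only, matching the O(pL) cost in Table 2.2.
    """
    m = [np.asarray(v, dtype=float) for v in endmembers]
    base = m[1] - m[0]
    log_vol = np.log10(np.linalg.norm(base))   # volume of the 2-vertex simplex
    basis = [base / np.linalg.norm(base)]      # orthonormal basis of the edge span
    log_vols = [log_vol]
    for j, vertex in enumerate(m[2:], start=2):
        e = vertex - m[0]
        for q in basis:                        # project out the current span
            e = e - (q @ e) * q
        h = np.linalg.norm(e)                  # height of the new vertex
        log_vol += np.log10(h) - np.log10(j)   # V_{j+1} = V_j * h_j / j
        log_vols.append(log_vol)
        basis.append(e / h)
    return log_vols

# Growing simplexes over the vertices of the 3D simplex in Fig. 2.5c
log_vols = growing_simplex_volumes([(8, 2, 4), (7, 3, 8), (4, 7, 7), (4, 0, 3)])
print([round(10**v, 4) for v in log_vols])   # [4.2426, 10.7121, 15.8333]
```

The last value matches the tetrahedron volume in Table 2.1, while every intermediate volume comes for free, which is exactly what a determinant-based DSV cannot provide.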

2.7 Conclusions

SV has been widely used as a criterion for algorithm design to find endmembers.
Since the number of endmembers is generally much smaller than the data dimen-
sionality, the matrix formed by endmembers, called the endmember matrix used to
calculate SV via its matrix determinant, is usually not of full rank. As a result,
calculating the determinant of an endmember matrix runs into a singularity prob-
lem. Two approaches are used to deal with this issue. One approach involves using
DC or DR, which reduces the original data space to a lower-dimensional data space
to make an endmember matrix of full rank. Unfortunately, as shown in Li
et al. (2015a, 2015b) and Li et al. (2016a, b), the SV found by a dimensionality-
reduced data space is not the true SV obtained directly from the original data space.
The other approach is to use SVD or the Moore–Penrose pseudo-inverse to find
SV. While the SV found by this approach is correct, it suffers from two issues,
numerical instability and excessive computing time when data dimensionality and
the number of endmembers are high. Interestingly, the issue of finding the true SV
has been overlooked by all SV-based endmember finding algorithms, including the well-known N-FINDR and SGA. This chapter investigates this important issue and
further develops a GSV-based method to resolve the dilemma. The proposed
GSV-based method has several benefits that a DSV-based method cannot provide.
One is that no DR is required. Another is that GSV can calculate the true SV
without experiencing the numerical issues. Third, the computational complexity is
significantly reduced. Last but not least, when GSV is applied to endmember
finding algorithms, it can avoid finding incorrect endmembers caused by numerical
errors resulting from use of the matrix determinant to calculate the SV.
Chapter 3
Discrete-Time Kalman Filtering
for Hyperspectral Processing

Abstract As the book’s title implies, recursive is a key word in hyperspectral


processing, which means that the data processing involved in hyperspectral imagery
can be implemented by recurrence relations. One way to realize a recurrence
relation is to use a recursive equation that specifies how a recurrence relation
takes place. One of the most significant advantages that can be gained from recursive
processing is that previously processed information is stored in recurrence relations
so that there is no need to reprocess all the data that had already been processed. As
a consequence of using recurrence relations, data processing need not revisit all the
past data sample vectors; rather, it focuses on incoming data sample vectors.
Accordingly, all of the data processing can be updated by including only the
information, referred to as innovations information, that is contained in the incom-
ing new data sample vectors but not in the already-processed past information. With
this unique feature data processing can be implemented in real time via recurrence
relations specified by recursive equations. This chapter is devoted to exploring the
concept of recursive processing, particularly to the concept of discrete-time Kalman
filtering, which has been widely applied in statistical signal processing to the design
and development of hyperspectral processing algorithms, presented in subsequent
chapters in Parts II-V.

3.1 Introduction

When it comes to real-time processing, two issues must be addressed. One is


causality, which only allows those data samples that have already been visited or
processed to be used for data processing. In other words, no future data samples that
are yet to be visited or processed later should be allowed to be included in data
processing. The other issue is that data processing should depend only on the data
sample currently being visited or processed, not on previously visited or processed
data samples. To effectively address these two issues the information required for
data processing can be decomposed into three pieces of information: (1) processed
information, which is obtained by processing data samples up to the data sample
currently being visited; (2) new information, which is the current data sample being


processed; and (3) innovations information, which retains only information that is
contained in the new data sample (i.e., the current data sample) but that cannot be
obtained from the processed information. One way to satisfy these requirements is
through the use of a recursive process. In statistical signal processing, the Kalman
filter (KF) is probably the most widely used linear operator that processes data
samples recursively in a very effective manner. It implements two equations, a state
equation, which updates internal states via the state transition matrix, and a mea-
surement or input/output equation, which updates outputs via the Kalman gain
obtained by error covariance matrix and state estimates. Since the update equations
derived for state estimates and outputs can be implemented causally and recur-
sively, the most appealing and attractive feature of the KF in real-world applica-
tions is its ability and capability in real-time processing.

3.2 Discrete-Time Kalman Filtering

Kalman filtering has been widely used in statistical signal processing for the
purpose of parameter estimation (Poor 1994; Gelb 1974). It makes use of a
measurement equation (input/output equation) and a state equation (process equa-
tion) to recursively estimate and update states and outputs. Here, we follow the
treatment in Gelb (1974) and briefly describe its idea with detailed derivations.
Suppose that there is a discrete-time linear dynamic system specified by the white noise vectors {u(k)}, output vectors {y(k)}, and system gain matrices {H(k)}. Now let {x(k)} be the system state vectors that describe the states of the system over time. Two equations are of interest. One is the state equation given by

x(k + 1) = Φ(k + 1, k)x(k) + u(k),   (3.1)

where Φ(k + 1, k) is the state transition matrix, E[u(k)u^T(l)] = Q(k)δ_kl, and δ_kl is the Kronecker delta. Figure 3.1 depicts a diagram of it.

A second equation is the measurement equation, which is the output of the system resulting from the state {x(k)}, given by

y(k) = H(k)x(k).   (3.2)

As a matter of fact, when the system is observed, it is also corrupted by white noise {v(k)} to yield the observation equation given by

z(k) = y(k) + v(k),   (3.3)

where E[v(k)v^T(l)] = R(k)δ_kl and E[u(k)v^T(l)] = C(k)δ_kl, with E[·] being the expectation operator. Figure 3.2 illustrates the final system output {z(k)} obtained from the observation process.

Fig. 3.1 Diagram of a state equation

Fig. 3.2 Diagram of an observation equation
As an example, assume that a random process {x(k)} can be observed only through an observation process {z(k)}. Let an estimate of x(k) at the time instant t_k based on k observations {z(i)}_{i=1}^{k} be given by x̂(k) = (1/k) Σ_{i=1}^{k} z(i). Thus, with k + 1 observations {z(i)}_{i=1}^{k+1},

x̂(k + 1) = (1/(k + 1)) Σ_{i=1}^{k+1} z(i) = (k/(k + 1))(1/k) Σ_{i=1}^{k} z(i) + (1/(k + 1))z(k + 1)
= (k/(k + 1))x̂(k) + (1/(k + 1))z(k + 1)
= x̂(k) + (1/(k + 1))(z(k + 1) − x̂(k)).   (3.4)

According to (3.4), x̂(k + 1), which is an estimate of x(k + 1), can be obtained simply from the previous estimate x̂(k) and the error measurement z(k + 1) − x̂(k), calculated by subtracting x̂(k) from the current measurement z(k + 1), without reprocessing the previously visited measurements {z(i)}_{i=1}^{k}. This simple example illustrates how we can take advantage of the previous estimate x̂(k) as processed information and z(k + 1) − x̂(k) as innovations information to update x̂(k + 1), where z(k + 1) is the new incoming information at time instant t_{k+1}. This interpretation provides insight into the central concept of Kalman filtering using three pieces of information.
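The running-average recursion in (3.4) can be sketched in a few lines; the function name `recursive_mean` is ours, and the gain 1/(k + 1) plays the role of a scalar Kalman gain acting on the innovation z(k + 1) − x̂(k):

```python
def recursive_mean(observations):
    """Running estimate per (3.4): each new observation z(k+1) enters only
    through its innovation z(k+1) - x_hat(k), scaled by the gain 1/(k+1)."""
    x_hat = 0.0
    for k, z in enumerate(observations):        # k = 0, 1, 2, ...
        x_hat = x_hat + (z - x_hat) / (k + 1)   # innovations update
    return x_hat

# The recursion reproduces the batch average without revisiting past samples
samples = [2.0, 4.0, 9.0]
print(recursive_mean(samples))   # 5.0, equal to sum(samples) / len(samples)
```

Only the current estimate and the incoming sample are ever touched, which is the causality and memory behavior the chapter requires of real-time processing.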

3.2.1 A Priori and A Posteriori State Estimates


  
Now we define an a priori estimate of x(k) as x̂(k|k−1), also denoted by x̂_k(−) in Gelb (1974), to indicate the prediction made at the time instant right before t_k based on the previous k − 1 observations {z(i)}_{i=1}^{k−1}, and an a posteriori estimate as x̂(k|k), also denoted by x̂_k(+) in Gelb (1974), to indicate the prediction made at the time instant immediately after t_k, based on the previously processed information x̂(k|k−1) and the kth observation z(k). In this case, x̂(k|k) can be considered an updated estimate of x̂(k|k−1) using the incoming measurement z(k) at time instant t_k (Figs. 3.2 and 3.3).

Fig. 3.3 Diagram of a priori and a posteriori state estimates

In other words, given an a priori estimate x̂(k|k−1), we are interested in updating the state estimate x̂(k|k) in a linear and recursive form, to avoid using a growing-memory filter, as follows:

x̂(k|k) = K̄(k)x̂(k|k−1) + K(k)z(k),   (3.5)

where both K̄(k) and K(k) are time-varying matrices yet to be determined. Define the estimation error vectors
x̃(k|k) = x̂(k|k) − x(k) ⟹ x̂(k|k) = x(k) + x̃(k|k),   (3.6)
x̃(k|k−1) = x̂(k|k−1) − x(k) ⟹ x̂(k|k−1) = x(k) + x̃(k|k−1),   (3.7)
x̃(k|k) = [K̄(k) + K(k)H(k) − I]x(k) + K̄(k)x̃(k|k−1) + K(k)v(k).   (3.8)

By definition, E[v(k)] = 0. Also, if E[x̃(k|k−1)] = 0, the estimate x̂(k|k) in (3.6) is an unbiased estimate for any given state vector x(k), that is, E[x̃(k|k)] = 0, only if the bracketed term in (3.8) equals zero:

K̄(k) + K(k)H(k) − I = 0 ⟹ K̄(k) = I − K(k)H(k).   (3.9)

Substituting (3.9) into (3.5) gives

x̂(k|k) = (I − K(k)H(k))x̂(k|k−1) + K(k)z(k),   (3.10)

or, alternatively,

x̂(k|k) = x̂(k|k−1) + K(k)[z(k) − H(k)x̂(k|k−1)],   (3.11)
x̃(k|k) = (I − K(k)H(k))x̃(k|k−1) + K(k)v(k).   (3.12)
Error Covariance Matrix

P(k|k) = E[x̃(k|k)x̃^T(k|k)]
= E{(I − K(k)H(k))x̃(k|k−1)x̃^T(k|k−1)(I − K(k)H(k))^T
+ K(k)v(k)x̃^T(k|k−1)(I − K(k)H(k))^T + (I − K(k)H(k))x̃(k|k−1)v^T(k)K^T(k)
+ K(k)v(k)v^T(k)K^T(k)}.   (3.13)

Since P(k|k−1) = E[x̃(k|k−1)x̃^T(k|k−1)] and

E[x̃(k|k−1)v^T(k)] = E[v(k)x̃^T(k|k−1)] = 0,   (3.14)

because the measurement error is uncorrelated with the a priori estimation error, (3.13) reduces to

P(k|k) = (I − K(k)H(k))P(k|k−1)(I − K(k)H(k))^T + K(k)R(k)K^T(k).   (3.15)

3.2.2 Finding an Optimal Kalman Gain K(k)

Assume that the objective function is

J(k) = E[x̃^T(k|k)Ax̃(k|k)],   (3.16)

where A is any positive semidefinite matrix. By letting A = I, (3.16) becomes

J(k) = trace[P(k|k)],   (3.17)

which is equivalent to minimizing the length of the estimation error vector. To find the optimal value of K(k), we differentiate (3.17) using (3.15) with respect to K(k) via the relation

∂/∂A [trace(ABA^T)] = 2AB   (3.18)

and obtain

−2(I − K(k)H(k))P(k|k−1)H^T(k) + 2K(k)R(k) = 0.   (3.19)

Solving (3.19) for K(k) results in

K(k) = P(k|k−1)H^T(k)[H(k)P(k|k−1)H^T(k) + R(k)]^{−1},   (3.20)

which is referred to as the Kalman gain matrix.


Substituting (3.20) into (3.15) yields

P(k|k) = P(k|k−1) − P(k|k−1)H^T(k)[H(k)P(k|k−1)H^T(k) + R(k)]^{−1}H(k)P(k|k−1)
= (I − K(k)H(k))P(k|k−1).   (3.21)

The extrapolation of the state estimate vector and error covariance matrix between two measurements can be performed by means of (3.1) as follows:

x̂(k|k−1) = Φ(k, k−1)x̂(k−1|k−1),   (3.22)
P(k|k−1) = Φ(k, k−1)P(k−1|k−1)Φ^T(k, k−1) + Q(k−1),   (3.23)
K(k|k−1) = Φ(k, k−1)K(k−1).   (3.24)
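Assembled into the familiar predict–update cycle, (3.11) and (3.20)–(3.23) give the following scalar sketch; the dynamics (Φ = H = 1) and the values of Q and R are our own illustrative choices, not taken from the book:

```python
import numpy as np

def kalman_step(x_hat, P, z, Phi=1.0, H=1.0, Q=1e-4, R=0.25):
    """One predict-update cycle of a scalar discrete-time Kalman filter."""
    x_pred = Phi * x_hat                     # state extrapolation, (3.22)
    P_pred = Phi * P * Phi + Q               # covariance extrapolation, (3.23)
    K = P_pred * H / (H * P_pred * H + R)    # Kalman gain, (3.20)
    x_hat = x_pred + K * (z - H * x_pred)    # a posteriori update, (3.11)
    P = (1.0 - K * H) * P_pred               # covariance update, (3.21)
    return x_hat, P

# Estimate a constant state x = 1.5 observed in white noise
rng = np.random.default_rng(0)
x_hat, P = 0.0, 1.0                          # initial guess and its variance
for z in 1.5 + 0.5 * rng.standard_normal(500):
    x_hat, P = kalman_step(x_hat, P, z)
print(round(x_hat, 1))                       # settles near 1.5
```

Only x̂ and P are carried between steps; each measurement contributes solely through its innovation z − Hx̂, which is the recursive behavior the chapter emphasizes.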

3.2.3 Orthogonality Principle




Assume that {z(l) : 0 ≤ l ≤ k − 1} is a given observation process. A predictor of y(k) using the k observations up to z(k − 1) is denoted by ŷ(k|k−1).

Theorem (Principle of Orthogonality) If ŷ(k|k−1) is the best linear estimate that solves the mean-squared-error (MSE) problem

min_{y′(k|k−1)} E[||y′(k|k−1) − y(k)||^2], with ||x||^2 defined by x^T x,   (3.25)

then ŷ(k|k−1) = arg min_{y′(k|k−1)} E[||y′(k|k−1) − y(k)||^2] if and only if the error vector ỹ(k|k−1) = y(k) − ŷ(k|k−1) is orthogonal to the observations {z(l) : 0 ≤ l ≤ k − 1}. That is,

E[ỹ(k|k−1)z^T(l)] = E[(y(k) − ŷ(k|k−1))z^T(l)] = 0 for all 0 ≤ l ≤ k − 1.   (3.26)

Proof Sufficiency of (3.26)

Assume that y′(k|k−1) is any other linear estimate based on the observations {z(l) : 0 ≤ l ≤ k − 1}. Then the MSE resulting from y′(k|k−1) is

E[||y′(k|k−1) − y(k)||^2]
= E[||(y′(k|k−1) − ŷ(k|k−1)) + (ŷ(k|k−1) − y(k))||^2]
= E[||y′(k|k−1) − ŷ(k|k−1)||^2]
+ 2E[(y′(k|k−1) − ŷ(k|k−1))^T(ŷ(k|k−1) − y(k))]
+ E[||ŷ(k|k−1) − y(k)||^2].   (3.27)

Since y′(k|k−1) − ŷ(k|k−1) is also a linear combination of the observations {z(l) : 0 ≤ l ≤ k − 1}, the middle term in (3.27) is E[(y′(k|k−1) − ŷ(k|k−1))^T(ŷ(k|k−1) − y(k))] = 0, according to (3.26). This implies that

E[||y′(k|k−1) − y(k)||^2]
= E[||y′(k|k−1) − ŷ(k|k−1)||^2] + E[||ŷ(k|k−1) − y(k)||^2]
≥ E[||ŷ(k|k−1) − y(k)||^2].   (3.28)

Because y′(k|k−1) is arbitrarily chosen, this proves the sufficiency of (3.26).

Necessity of (3.26)
Suppose that y′(k|k−1) is a linear estimate based on the observations {z(l) : 0 ≤ l ≤ k − 1} and that there is an observation z(j) for which (3.26) fails. Then

E[(y(k) − y′(k|k−1))z^T(j)] ≠ 0.   (3.29)

  
Now we construct a new linear estimate ȳ(k|k−1) by

ȳ(k|k−1) = y′(k|k−1) + (E[(y(k) − y′(k|k−1))^T z(j)]/E[z^2(j)]) z(j).   (3.30)

Its MSE is then calculated by

E[||y(k) − ȳ(k|k−1)||^2]
= E[||y(k) − y′(k|k−1) − (E[(y(k) − y′(k|k−1))^T z(j)]/E[z^2(j)]) z(j)||^2]
= E[||y(k) − y′(k|k−1)||^2]
− 2(E[(y(k) − y′(k|k−1))^T z(j)]/E[z^2(j)]) E[(y(k) − y′(k|k−1))^T z(j)]
+ (E[(y(k) − y′(k|k−1))^T z(j)]/E[z^2(j)])^2 E[z^2(j)]
= E[||y(k) − y′(k|k−1)||^2] − (E[(y(k) − y′(k|k−1))^T z(j)])^2/E[z^2(j)]   (3.31)
< E[||y(k) − y′(k|k−1)||^2],

which implies that ȳ(k|k−1) is a better linear estimate than y′(k|k−1). Thus y′(k|k−1) cannot be the best linear estimate, which proves the necessity of (3.26). □

By virtue of the orthogonality principle we can further define an innovations process e(k|k−1) resulting from the estimate ŷ(k|k−1) by

e(k|k−1) = z(k) − ŷ(k|k−1) = z(k) − H(k)x̂(k|k−1),   (3.32)

where ŷ(k|k−1) is the least-squares estimate of y(k) given the observations {z(l) : 0 ≤ l ≤ k − 1}. As a result, E[e(k|k−1)] = 0 and E[e(k|k−1)e^T(l|l−1)] = [P_y(k|k−1) + R(k)]δ_kl, where P_y(k|k−1) is the covariance matrix of ỹ(k|k−1) in the estimate ŷ(k|k−1), given by

P_y(k|k−1) = E[ỹ(k|k−1)ỹ^T(k|k−1)].   (3.33)

Innovations Theorem e(k|k−1) = z(k) − ŷ(k|k−1) is a white noise process with the same covariance matrix as v(k), that is, E[e(k|k−1)e^T(l|l−1)] = E[v(k)v^T(l)].

3.2.4 Discrete-Time Kalman Predictor and Filter

The prediction x̂(k+1|k) is the best least-squares estimate of x(k + 1) based on innovations using the observations {z(l) : 0 ≤ l ≤ k}, which can be obtained as follows.

Fig. 3.4 Diagram of discrete-time Kalman predictor and filter

Discrete-time Kalman predictor (Fig. 3.4):

x̂(k+1|k) = Σ_{l=0}^{k} E[x(k + 1)e^T(l|l−1)][P_y(l) + R(l)]^{−1} e(l|l−1)
= Σ_{l=0}^{k−1} E[x(k + 1)e^T(l|l−1)][P_y(l) + R(l)]^{−1} e(l|l−1)
+ E[x(k + 1)e^T(k|k−1)][P_y(k) + R(k)]^{−1} e(k|k−1)
= Φ(k + 1, k)x̂(k|k−1) + K(k)e(k|k−1);   (3.34)

discrete-time KF:

x̂(k|k) = x̂(k|k−1) + K(k)e(k|k−1)
= x̂(k|k−1) + K(k)[z(k) − ŷ(k|k−1)]
= x̂(k|k−1) + K(k)[z(k) − H(k)x̂(k|k−1)],   (3.35)

or, alternatively,

x̂(k|k) = Φ(k, k−1)x̂(k−1|k−1) + K(k)e(k|k−1)
= Φ(k, k−1)x̂(k−1|k−1) + K(k)[z(k) − ŷ(k|k−1)]
= Φ(k, k−1)x̂(k−1|k−1) + K(k)[z(k) − H(k)x̂(k|k−1)],   (3.36)

where the Kalman gain matrix K(k) is given by

K(k) = E[x(k + 1)e^T(k|k−1)][P_y(k|k−1) + R(k)]^{−1},   (3.37)

with P_y(k|k−1) = E[ỹ(k|k−1)ỹ^T(k|k−1)] = H(k)P(k|k−1)H^T(k).

As a result of (3.35)–(3.37), K(k) in (3.37) can be reexpressed through the following recursive equations:

K(k) = P(k|k−1)H^T(k)[H(k)P(k|k−1)H^T(k) + R(k)]^{−1},   (3.38)

where

P(k+1|k) = Φ(k + 1, k)A(k|k−1)Φ^T(k + 1, k) + Q(k),   (3.39)
A(k|k−1) = P(k|k−1) − K(k)[P_y(k|k−1) + R(k)]K^T(k),   (3.40)
x̂(k+1|k) = Φ(k + 1, k)x̂(k|k−1) + K(k)e(k|k−1),   (3.41)

x̂(k|k−1) → [K(k)e(k)] → x̂(k|k) → [Φ(k+1, k)] → x̂(k+1|k) → [K(k+1)e(k+1)] → x̂(k+1|k+1).   (3.42)

3.3 Kalman Filter-Based Linear Spectral Mixture Analysis

As described in Sect. 3.2, a KF makes use of a measurement equation (input/output


equation) and a state equation (process equation) to recursively estimate parameters
of states. When a KF is implemented for linear spectral mixture analysis (LSMA) as a
mixed pixel filter, we can interpret the state vector x in an equation, called a state
equation, as the abundance vector α present in an image pixel vector r, which is
specified by another equation, called a measurement equation. With this formulation,
a KF takes advantage of a linear mixing model (LMM) commonly used in LSMA to
describe how a pixel vector r is linearly mixed via a measurement equation. The
resulting LSMA was studied in Chang and Brumbley (1999a, b) for linear spectral
unmixing (LSU) and therefore is called Kalman filtering linear unmixing (KFLU).
However, in this chapter, KFLU is referred to as KF-LSMA in a more general sense to
reflect its applicability in LSMA. Specifically, KF-LSMA includes an additional
equation, a state equation, which is absent in LSU, to estimate the abundance
fractions of the abundance vector α of the image pixel vector r currently being
processed. Implementing these two equations recursively, KF-LSMA generally pro-
duces better estimates of abundance fractions than LSU methods.
Let r(x,y) denote a multispectral/hyperspectral image pixel vector at spatial coordinates (x,y), and let L be the total number of spectral channels (bands). Then r(x,y) is an L × 1 column vector. We further assume that r(x,y) can be
modeled by a linear mixture of p image endmembers, m1, m2, . . ., mp which are

present in the image with appropriate abundance fractions, α1, α2, . . . , αp with αj
corresponding to the abundance fraction of the jth endmember mj, as follows:

r(x, y) = Mα(x, y) + u(x, y),   (3.43)

where u(x,y) is a measurement or model error introduced at r(x,y), and M = [m1 m2 . . . mp] is the image endmember matrix formed by m1, m2, . . ., mp, which can be viewed as a measurement matrix.
matrix remains invariant and independent of the spatial location of r(x,y). However,
the abundance fractions α1, α2, . . . , αp are parameters that vary pixel vector by pixel
vector and must be estimated. To simplify the discussion, we represent the image
pixel vector r(x,y) by r(k), with the parameter k indicating the position of r(x,y),
which is processed top to bottom, left to right. As a result, its corresponding
abundance vector will be denoted by α(k). One of the major strengths of a KF is
its introduction of a so-called state equation that keeps track of changes in states
during their transitions and is absent in LSMA. If we interpret a state at position k as
an abundance vector α(k), then the state equation allows us to capture changes in
abundance fractions resident in the image pixel vector r(k) at position k with respect
to the image pixel vector r(k + 1) at position k + 1. More specifically, we can model
the state equation by

α(k + 1) = Φ(k + 1, k)α(k) + v(k),   (3.44)

where Φ(k + 1, k) is a p × p transition matrix from the pixel vector at position k to the pixel vector at position k + 1, and v(k) can be considered a state noise or model error at position k. To implement a KF, we assume that the noise u(k) in (3.43) is an additive white noise given by

E[u(k)u^T(l)] = R(k) = σ_u^2 δ_kl I_{L×L},   (3.45)

where δ_kl is Kronecker's notation, defined by δ_kl = 1 for k = l and δ_kl = 0 for l ≠ k. Similarly, we additionally assume that the noise v(k) in (3.44) is also an additive white noise given by

E[v(k)v^T(l)] = Q(k) = σ_v^2 δ_kl I_{p×p}.   (3.46)

Using (3.43)–(3.46), a KF can perform three operations, smoothing, filtering,


and prediction, where the smoothing and filtering can be essentially derived from
prediction (Gelb 1974). Since we are only interested in filtering and prediction that
can be used to estimate spectral signatures, only these two operations will be
reviewed and briefly discussed in what follows. For all details on Kalman filtering
implementation, see Gelb (1974).
Assume that k is the spatial position of the image pixel vector currently being
processed, and the observed image vectors described by (3.43) are available up to k,

{r(i)}_{i=1}^{k}. Let α̂(k+1|k) denote the minimum MSE prediction of the abundance vector α(k + 1) at position k + 1 and α̂(k|k) denote the minimum MSE estimate of the abundance vector α(k) at position k, based on all image pixel vectors {r(i)}_{i=1}^{k} that have already been visited up to position k. Similarly, P(k+1|k) and P(k|k) represent the one-step MSE prediction error covariance matrix and the MSE estimation error covariance matrix at position k. Then a KF that performs LSU can be described by the following procedure.
• Initial conditions:
  R(k) and Q(k),
  α̂(0|−1) = E[α(0)], the mean of α(0),
  P(0|−1) = Cov[α(0)], the covariance matrix of α(0);
• Kalman gain at k:
  K(k) = Φ(k + 1, k)P(k|k − 1)M^T[MP(k|k − 1)M^T + R]^(−1);
• Abundance update at k:
  α̂(k|k) = α̂(k|k − 1) + K(k)[r(k) − Mα̂(k|k − 1)];
• Error covariance update at k:
  P(k|k) = [I − K(k)M]P(k|k − 1);
• Abundance prediction at k + 1:
  α̂(k + 1|k) = Φ(k + 1, k)α̂(k|k);
• Error covariance prediction at k + 1:
  P(k + 1|k) = Φ(k + 1, k)P(k|k)Φ^T(k + 1, k) + Q.
Accordingly, KF-LSMA performs abundance fraction estimation for mixed
pixel quantification using its measurement equation, described by the kth image
pixel vector r(k), and its state equation, characterized by the abundance vector α(k)
present in r(k). Therefore, the abundance vector α(k) can be estimated via
recursive algorithms running between the measurement equation and the state equation
through all the image pixel vectors, r(1), r(2), . . ., in the entire image cube.
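The recursion above can be sketched in a few lines of NumPy. Note that the gain is written here in the standard update form K(k) = P(k|k − 1)Mᵀ[MP(k|k − 1)Mᵀ + R]⁻¹, which coincides with the gain in the procedure when Φ(k + 1, k) = I; the endmember matrix, transition matrix, and noise levels supplied by a caller are illustrative assumptions, not values from this chapter.

```python
import numpy as np

def kf_lsma_step(r_k, a_pred, P_pred, M, Phi, R, Q):
    """One KF-LSMA recursion: update the abundance estimate with pixel
    r(k), then predict the abundance state at position k + 1.

    r_k    : (L,)   observed pixel vector r(k)
    a_pred : (p,)   predicted abundance, alpha_hat(k|k-1)
    P_pred : (p, p) prediction error covariance, P(k|k-1)
    M      : (L, p) endmember signature matrix
    Phi    : (p, p) state transition matrix Phi(k+1, k)
    R, Q   : measurement/state noise covariance matrices
    """
    S = M @ P_pred @ M.T + R                         # innovation covariance
    K = P_pred @ M.T @ np.linalg.inv(S)              # Kalman gain
    a_filt = a_pred + K @ (r_k - M @ a_pred)         # abundance update at k
    P_filt = (np.eye(len(a_pred)) - K @ M) @ P_pred  # error update at k
    a_next = Phi @ a_filt                            # abundance prediction at k+1
    P_next = Phi @ P_filt @ Phi.T + Q                # error prediction at k+1
    return a_filt, a_next, P_next
```

Running this step over r(1), r(2), . . . reproduces the recursive estimation of α(k) described above.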

3.4 Kalman Filter-Based Hyperspectral Signal Processing

An application of Kalman filtering to LSMA for mixed pixel quantification,
called KF-LSMA, is discussed in Sect. 3.3. When a KF is implemented for LSU as a
mixed pixel quantifier, the state vector x in the state equation is specified by the
abundance vector α present in an image and the measurement equation specified by
the pixel vector r from which α can be estimated. The measurement equation thus
describes how a pixel vector r is linearly mixed. By
recursively implementing these two equations, KF-LSMA generally produces
better estimates of abundance fractions than other LSMA methods (Chang and
Brumbley 1999a, b). Several advantages resulting from the use of a KF are obvious
for remote sensing image analysis. One is its ability to deal with nonstationary data,
which are generally the case in remotely sensed imagery. Another is its recursive
structure that makes real-time processing feasible. A third advantage is its utiliza-
tion of two equations, measurement and state equations, which can be implemented
in various forms to accomplish different tasks such as smoothing, filtering, and
prediction (Gelb 1974).
While KF-LSMA is designed to process hyperspectral image pixel vectors in a
three-dimensional (3D) image cube, other applications of KF to one-dimensional
hyperspectral signal processing were studied in Chang (2013), where three Kalman-
filter spectral characterization signal processing (KF-SCSP) techniques are
designed for processing single signature-based pixels to perform spectral estima-
tion, identification, and quantification, rather than KF-LSMA, which requires
complete knowledge of image endmembers to perform mixed pixel quantification.
In what follows, we describe each of the three KF-SCSP techniques discussed in
Chang (2013).

3.4.1 Kalman Filter-Based Hyperspectral Signal Processing

In the KF-LSMA of Sect. 3.3, the measurement equation is described by an LMM, and the
state equation is included to perform abundance vector estimation for spectral
unmixing. Therefore, KF-LSMA actually operates on a hyperspectral image cube
and performs mixed pixel classification by estimating abundance vectors via its
state equation, while each image pixel vector in the image cube is used as an input
to the measurement equation. In contrast, KF-SCSP techniques are rather different
from KF-LSMA in the sense that a KF-SCSP technique is a one-dimensional
(1D) signal processing technique to exploit the spectral variability of a single
signature vector across its spectral coverage to perform spectral analysis without
referencing other data sample vectors. As a result, the sample correlation used in the
state equation by KF-LSMA is not available in KF-SCSP techniques.
This and the following sections present several new applications of Kalman
filtering to spectral estimation, identification, and quantification for hyperspectral

signature characterization by rederiving the measurement and state equations,
which results in several techniques referred to as KF-SCSP. There are important
and salient differences between KF-LSMA and KF-SCSP techniques worth men-
tioning. In KF-LSMA the measurement equation is described by an LMM and the
state equation is included to perform abundance vector estimation of image
endmembers via spectral unmixing. Therefore, KF-LSMA operates on a
hyperspectral image as an image cube and performs mixed pixel classification
using the estimated abundance vector produced by its state equation, where each
image pixel vector is considered an input to the measurement equation. Conversely,
KF-SCSP performs quite differently from KF-LSMA in the sense that it is a 1D
signal processing technique that explores spectral variability within a single signa-
ture vector across a spectral range band by band, where there are no sample vectors
involved in KF-SCSP such as sample covariance matrix used in KF-LSMA. Thus, if
KF-SCSP is applied to a hyperspectral image, it treats a pixel vector as a single
signature vector without accounting for sample correlation and performs as a
spectral signature filter rather than an image classifier as KF-LSMA does. Another
significant difference is the use of a measurement equation. KF-LSMA requires
prior knowledge of image endmembers to form an LMM to implement the mea-
surement equation for spectral unmixing. This is not true for KF-SCSP because the
data to be processed in KF-SCSP are a signature vector, not an image cube, and
there is no need for an LMM. Additionally, KF-SCSP can be implemented as a
signature estimator, an identifier, and an abundance quantifier, as opposed to
spectral unmixing methods, including KF-LSMA, which can only be implemented
as an abundance vector estimator for image endmembers assumed to be known a
priori. Finally, since KF-SCSP was not developed for image classification, the state
equation used in spectral unmixing to keep track of changes in abundance fractions
of image endmembers pixel vector by pixel vector is not available. Instead, the state
equation used in KF-SCSP is designed to capture the spectral variation of a
signature vector across its spectral coverage wavelength by wavelength.
Three KF-SCSP techniques are derived and presented in this section: Kalman
filter-based spectral signature estimator (KF-SSE), Kalman filter-based spectral
signature identifier (KF-SSI), and Kalman filter-based spectral signature quantifier
(KF-SSQ). In KF-SSE, the input and output of its measurement equation are
specified by a noise-corrupted signature vector and its true signature vector to be
estimated, respectively. KF-SSE then uses a state equation to predict the spectral
values of the true signature vector across its spectral coverage via a signal model
such as a Gaussian–Markov model. Thus, KF-SSE is designed to capture spectral
changes between adjacent spectral bands compared to KF-LSMA, which is
designed to capture changes in abundance fractions between two adjacent pixel
vectors. Most importantly, as noted previously, KF-SSE does not need a linear
mixture model as required by KF-LSMA. Therefore, there is no need for KF-SSE to
find image endmembers to form an LMM. On the other hand, KF-SSI is developed
to identify a signature vector via a matching signature vector chosen from a known
database or spectral library. In doing so, KF-SSI is derived from KF-SSE by
replacing the true signature vector used in KF-SSE with an auxiliary signature

vector that enables capturing the matching signature vector in identifying the
unknown signature vector. According to their functionality, both KF-SSE and
KF-SSI were developed as signature vector estimators, but not abundance vector
estimators, as KF-LSMA was originally designed. This is because no LMM is used
to unmix abundance fractions of image endmembers. Unlike KF-SSE or KF-SSI,
KF-SSQ can be considered a follow-up signature filter. It models its state equation
as a zero-holder interpolator by taking a KF-SSE-estimated signature vector, that is,
a spectral estimate in its measurement equation as a system gain vector, to achieve
spectral quantification of the estimated signature vector. As a result, the quantifi-
cation of the spectral signature value at the current lth band is used as a prediction of
the spectral value at the next adjacent band, the (l + 1)st band. Its ability in spectral
quantification makes KF-SSQ particularly useful in applications of chemical/bio-
logical (CB) defense, where the lethal level of a CB agent is determined by its
concentration (Wang et al. 2004b) and the collected samples do not have to be
spectrally correlated the way image pixel vectors do in an image cube. Therefore,
the quantification of a CB agent is crucial in damage control assessment.

3.4.2 Kalman Filter-Based Spectral Signature Estimator

Assume that t = (t_1, t_2, . . ., t_L)^T is a true signature vector to be estimated and
r = (r_1, r_2, . . ., r_L)^T is an observable signature vector from which the true signature
vector t can be estimated. Since the measurement equation and the state equation
described by (3.43) and (3.44) were developed for image classification, they are not
directly useful for our needs and must be modified as follows:

Measurement equation: r_l = c_l t_l + u_l,   (3.47)

State equation: t_{l+1} = φ(l + 1, l)t_l + v_l,   (3.48)

where the index l denotes the lth band, c = (c_1, c_2, . . ., c_L)^T is a system gain vector,
φ(l + 1, l) is a transition parameter from the lth band to the (l + 1)st band, and
u = (u_1, u_2, . . ., u_L)^T and v = (v_1, v_2, . . ., v_L)^T are white noise vectors, all of which
must be determined a priori. According to Chang and Brumbley (1999a, b), to
implement (3.47) and (3.48) recursively, the initial condition used to start the recursive
algorithm between the measurement and state equations is set to t̂_1 = 0, by
assuming that the estimate of the true spectral signature at the first band is 0. The KF
is then implemented recursively until it reaches the last band [see recursive formulas
(6)–(10) in Chang and Brumbley (1999a)].

Comparing (3.47)–(3.48) to (3.43)–(3.44), there are several salient differences
between KF-LSMA and KF-SSE. First, KF-LSMA uses (3.43) and (3.44) to model
the same image pixel vector. That is, (3.43) represents the status of the image pixel r,
which is linearly mixed by a set of p known image endmembers, m_1, m_2, . . ., m_p,
and (3.44) specifies its corresponding abundance vector α = (α_1, α_2, . . ., α_p)^T, which is
unknown but needs to be estimated via (3.43). Therefore, KF-LSMA requires
prior knowledge of the image endmembers to be used in the linear mixture model
(3.43). As a rather different approach, KF-SSE is used to estimate a target
signature vector t rather than an abundance vector α. It treats a true signature
vector t = (t_1, t_2, . . ., t_L)^T as a state vector in (3.48) and estimates the state vector
band by band via an observable vector r = (r_1, r_2, . . ., r_L)^T specified by the
measurement equation (3.47). As a result, the variables in both (3.47) and (3.48)
are scalars and represent spectral signature values of the lth band of the target
signature vector t within the observed signature vector r. In this case, the
measurement equation (3.47) is simply a true signature vector corrupted by the noise
u = (u_1, u_2, . . ., u_L)^T, not a linear mixture model as specified by (3.43). Another major
difference is that both (3.47) and (3.48) are linear functions of the same lth band of
the true spectral signature t, t_l. This is different from (3.43) and (3.44), which are
linear functions of two distinct vectors, the image vector r(k) and the abundance vector
α(k), with different dimensions L and p, respectively. Therefore, compared to
KF-LSMA, whose input and output are an image pixel vector r(k) and an estimate
of the abundance vector α(k), α̂^{KF-LU}(k), respectively, the input of KF-SSE is
simply the lth band spectral value of the observable spectral signature vector r, r_l,
and its output is an estimate of t_l, the lth band spectral value of the true
signature vector t, denoted by t̂_l^{KF-SSE}. Furthermore, the state equation (3.48) is designed
to predict the spectral value of the (l + 1)st band of the true signature vector t by
updating the currently estimated spectral value of its lth band according to the
spectral variation between two adjacent bands. This is quite different from the
state equation (3.44), which is included to keep track of changes in abundance vectors
between two adjacent image pixel vectors. In addition, (3.48) models the
relationship of the spectral signature values between one wavelength and the next
adjacent wavelength as a Gaussian–Markov process specified by the state transition
parameter φ(l + 1, l) and the additive Gaussian noise v = (v_1, v_2, . . ., v_L)^T. If
the variance σ_v² or standard deviation σ_v of v is set too low, then the state equation
may not be effective enough to capture real variations in the spectral values of one
material. Accordingly, the standard deviation of the state noise is always assumed
to be high.
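A minimal simulation makes the band-to-band Gaussian–Markov model of (3.47) and (3.48) concrete; the band count, transition parameter, and noise levels below are illustrative choices, not values taken from this chapter.

```python
import numpy as np

rng = np.random.default_rng(0)
L = 200          # number of spectral bands (illustrative)
phi = 1.0        # state transition parameter phi(l + 1, l)
sigma_v = 0.05   # state noise std; if set too low, the state equation
                 # cannot track real band-to-band spectral variation
sigma_u = 0.1    # measurement noise std

# State equation (3.48): t_{l+1} = phi * t_l + v_l (Gaussian-Markov process)
t = np.empty(L)
t[0] = 0.5
for l in range(L - 1):
    t[l + 1] = phi * t[l] + sigma_v * rng.standard_normal()

# Measurement equation (3.47) with c_l = 1: r_l = t_l + u_l
r = t + sigma_u * rng.standard_normal(L)
```

The observed signature r is then the input a KF-SSE would process band by band to recover t.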

3.4.3 Kalman Filter-Based Spectral Signature Identifier

In the previous section, KF-SSE was used to estimate a true signature vector
t through an observed signature vector r. By remodeling its measurement and
state equations, KF-SSE can be reinterpreted as a spectral signature identifier,
referred to as a KF-based spectral signature identifier (KF-SSI). To perform
spectral signature identification, it is always assumed that there is a database or
spectral signature library available for this purpose, denoted by Δ = {s_k}_{k=1}^{K}, which
is made up of K spectral signature vectors, s_1, s_2, . . ., s_K.
Suppose that the observable spectral signature vector r = (r_1, r_2, . . ., r_L)^T is to
be identified via Δ = {s_k}_{k=1}^{K}. The idea is to use the measurement equation to
describe the observable spectral signature vector r as a noise-corrupted matching
spectral signature vector s_k selected from the database or spectral signature library
Δ. An auxiliary spectral signature vector t is then introduced in the state equation to
model the state that describes the ability to identify a given matching signature
vector s_k in the observable signature vector r. In other words, the identification of the
observable spectral signature vector r is performed by finding the particular spectral
signature vector s_k in Δ that best matches r. In this case, the measurement
equation (3.47) and the state equation (3.48) can be remodeled, respectively, as
follows:

Measurement equation: r_l = c_l s_kl + u_l   (3.49)

and

State equation: t_{l+1} = t_l + v_l,   (3.50)

where the scalar system gain c_l is generally considered to be 1. Unfortunately,
according to (3.49) and (3.50), the two equations are uncoupled in the sense that
there is no state parameter t_l present in the measurement equation. To resolve this
dilemma, we reexpress (3.49) as

Measurement equation: r_l = (s_kl/t_l)t_l + u_l,   (3.51)

where the matching spectral signature vector sk ¼ ðsk1 ; sk2 ;   ; skL ÞT is included in
the measurement equation (3.51) and therefore is the signature vector to be used to
match the observable signature vector r ¼ ðr 1 ; r 2 ;   ; r L ÞT . The auxiliary spectral
signature vector t in (3.51) then serves to bridge the observable signature vector
r and the matching signature vector sk to dictate the ability of a given signature
vector sk to match the observable signature vector r. As a result of introducing the
target signature vector t in (3.51), the system gain parameter cl in (3.49) is
reexpressed in (3.51) and becomes a spectral-varying parameter specified by skl/tl.
Interestingly, the use of such a spectral-varying system gain parameter by cl ¼ skl =tl
takes care of the effects resulting from a matching signature vector sk in (3.51). This is
a key difference between KF-SSE and KF-SSI. Therefore, a KF using the pair (3.51)
and (3.50) to perform spectral signature identification is called a KF-SSI. In this case,
the input is provided by the observable spectral signature vector r ¼ ðr 1 ; r 2 ; . . . ; r L ÞT
K
to be identified by a particular signature vector in a database, Δ ¼ fsk gk¼1 , via a
KF-SSI. And the output of the KF-SSI is the estimate of the auxiliary signature
T
vector t, ^t
KFSSI
ðsÞ ¼ ^t 1 ðsÞ,^t 2 ðsÞ, . . . ,^t l
KFSSI KFSSI KFSSI
ðsÞ . To implement (3.50)

and (3.51) recursively, the initial condition used to start the recursive algorithm
between the measurement and state equations is set to t̂_1^{KF-SSI}(s) = 0, by assuming
that the estimate of the auxiliary signature vector t at the first band is 0. The
KF is then implemented recursively until it reaches the last band. Since a KF is an
MSE-based estimation technique, the least-squares error (LSE) between t̂^{KF-SSI}(s_k)
and the observable spectral signature r = (r_1, r_2, . . ., r_L)^T is used instead of the
MSE as a quantitative measure, defined by

LSE(r, s_k) = ||r − t̂^{KF-SSI}(s_k)|| = [Σ_{l=1}^{L} (r_l − t̂_l^{KF-SSI}(s_k))²]^{1/2}.   (3.52)

Using (3.52), the observable spectral signature vector r = (r_1, r_2, . . ., r_L)^T is
identified as the particular spectral signature vector s_k in Δ that yields the least
value of LSE(r, s_k).
It is worth noting that, in addition to the auxiliary spectral signature vector
t introduced in (3.51), another major distinction between KF-SSE and KF-SSI is
that KF-SSI identifies a signature vector via a database or spectral library
Δ = {s_k}_{k=1}^{K}, while KF-SSE estimates the true signature vector without the need
for Δ. As a result, their measurement equations are implemented differently: the
signature vector used in measurement equation (3.47) for KF-SSE is the true
signature vector t, but the signature vector used in measurement equations (3.49)
and (3.51) for KF-SSI is a matching signature vector s_k chosen from the database or
spectral library Δ = {s_k}_{k=1}^{K}. A third major distinction is that the standard deviation
σ_v of the state noise v specified in (3.50) generally has a significant effect on the
LSE, whereas it has no such effect on the estimation performance of KF-SSE.
This is because KF-SSI performs identification via a
matching signature vector in Δ. Consequently, a slight variation caused by σ_v
may result in an incorrect identification. This is particularly true when the spectral
signature vectors in Δ are similar. A fourth major distinction is that
the two equations specified by (3.50) and (3.51) implemented in KF-SSI must
process two different signature vectors, the observable signature vector r and the
matching signature vector s_k, whereas KF-SSE has only the observable signature
vector r processed in (3.47) and (3.48). Accordingly, KF-SSI is sensitive to the
measurement noise σ_u and the state noise σ_v, both of which are correlated; this
is not the case for KF-SSE. Thus, in KF-SSI, both measurement noise and state
noise must be appropriately addressed, as will be demonstrated by the experiments
in subsequent sections.
It is worthwhile to comment on the relationship between KF-SSE and KF-SSI,
which can be illustrated by the analogous relationship between constrained energy
minimization (CEM), developed in Harsanyi's dissertation (1993), and orthogonal
subspace projection (OSP), developed by Harsanyi and Chang (1994). CEM
assumes that there is a desired target signature vector d to be detected. This
d actually corresponds to the target signature vector t assumed in KF-SSE to be
estimated. CEM then uses d to generate detected abundance fractions of d present in all
the pixels in an image cube to perform target detection, whereas KF-SSE estimates
t̂_l^{KF-SSE} to approximate the lth band value r_l in the observed signature r. In other words,
CEM performs target detection in an image cube, while KF-SSE estimates target
abundance in a single signature vector. On the other hand, OSP assumes that there
are p image endmembers that can be used to unmix pixel vectors in an image cube
by estimating the abundance fractions of each of the p image endmembers present in
each of the image pixel vectors. This is exactly what KF-SSI does for a single
signature vector, where it uses a database formed by K signatures, Δ = {s_k}_{k=1}^{K}, that
correspond to the p image endmembers used by OSP for unmixing, and then identifies a
signature vector from the database by finding the best matching signature vector. This
functionality is equivalent to that of OSP, which uses its p estimated abundance
fractions for each of the image pixel vectors to perform linear spectral unmixing.
The key difference is that CEM and OSP operate on image cubes, as opposed to
KF-SSE and KF-SSI, which operate on single signature vectors only.

3.4.4 Kalman Filter-Based Spectral Signature Quantifier

So far, KF-SSE and KF-SSI have been developed in the two previous subsections in
attempts to perform signature vector estimation and identification from an observable
signature vector r. This subsection presents another new application of
KF-SSE, in spectral quantification, to quantify a signature vector r, which is referred
to as a Kalman filter-based spectral signature quantifier (KF-SSQ).
One of the great challenges in hyperspectral data exploitation is spectral
quantification. This is particularly important for CB defense, where the concentrations of
targets are of major interest rather than their shapes and sizes, as generally
encountered in image processing. The concentration of an agent is a key element in the
assessment of a threat (Kwan et al. 2006). In recent years, many algorithms have
been developed for quantification (Goetz and Boardman 1989; Shimabukuro and
Smith 1991; Settle and Drake 1993; Smith et al. 1994b; Tompkins et al. 1997;
Heinz and Chang 2001; Kwan et al. 2006). However, most of them are image
analysis-based linear spectral unmixing methods that estimate the abundance
fractions of image endmembers assumed to be present in a linear mixture model, such as
the fully constrained least-squares (FCLS) method (Heinz and Chang 2001). The
technique developed in this subsection is signature vector-based and allows one to
quantify a particular material substance present in a single signature vector of
interest, band by band, without resorting to an LMM. Once again, a direct application
of KF-SSE to spectral quantification is not feasible, since the state in the state equation
described by (3.48) is the spectral value t_l, not its abundance fraction α_l,
which is what is of interest in quantification. To estimate the abundance fraction α_l in the
state equation, the state t_l in (3.48) is replaced by its abundance fraction α_l and the
system gain c_l in (3.47) is replaced by the lth band of the target signature vector t, t_l.

The state transition parameter φ(l + 1, l) is set to 1 such that the state equation is
actually a zero-holder interpolator. The resulting measurement equation and state
equation become

Measurement equation: r_l = α_l t_l + u_l,   (3.53)

State equation: α_{l+1} = α_l.   (3.54)

By virtue of the pair (3.53) and (3.54), a KF-SSQ can be designed and implemented
to quantify the target signature t = (t_1, t_2, . . ., t_L)^T present in the data sample vector
r in terms of its abundance fraction α_l. In this case, the inputs of the KF-SSQ are an
observable signature vector r = (r_1, r_2, . . ., r_L)^T and the target signature vector
t = (t_1, t_2, . . ., t_L)^T. Its output is the estimate of α_l associated with the target signature
vector t, denoted by α̂_l^{KF-SSQ}(t).
Two scenarios can be implemented for the KF-SSQ. In the first, the target
signature vector t in (3.53) is assumed to be known a priori. However, in many
applications, we may not have such prior knowledge. Under this circumstance, in a
second scenario, t can be estimated from the observable signature vector
r = (r_1, r_2, . . ., r_L)^T via KF-SSE, where t_l in (3.53) is replaced by its KF-SSE
estimate, t̂_l^{KF-SSE}. The resulting measurement equation becomes

r_l = α_l t̂_l^{KF-SSE} + u_l.   (3.55)

The pair (3.54) and (3.55) specifies the implementation of a KF-SSQ for an unknown
target signature vector t, where the input is the KF-SSE-estimated t̂_l^{KF-SSE} specified by
(3.55) and the output is the estimate of α_l in (3.54) corresponding to t̂_l^{KF-SSE}, denoted
by α̂_l^{KF-SSQ}(t̂^{KF-SSE}).
The KF-SSQ described by (3.54) and (3.55) is quite different from KF-LSMA,
which assumes all target signature vectors, that is, image endmembers, are known a
priori and then quantifies all the known signatures present in the data vector r via
the FCLS method developed by Heinz and Chang (2001) using an LMM. However,
the second scenario of the KF-SSQ specified by (3.54) and (3.55) can be
implemented without prior knowledge of the target signature vector t, a task that
cannot be accomplished by KF-LSMA.
In what follows, the algorithmic steps of each of the three KF-SCSP techniques,
KF-SSE, KF-SSI, and KF-SSQ, are described. Assume that r = (r_1, r_2, . . ., r_L)^T
and t = (t_1, t_2, . . ., t_L)^T are the observable signature vector and the target signature
vector to be estimated, respectively.

KF-SSE
1. Initial conditions:
   (a) Preset the values of σ_u and σ_v.
   (b) Set c_l = φ(l + 1|l) = 1 for all l = 1, 2, . . ., L.
   (c) Let t̂(1|0) = 0 and P(1|0) = Var[t_1] = 1.
2. Start with r_l at l = 1:
   (a) Calculate the Kalman gain:
       K_l = φ(l + 1|l)P(l|l − 1)c_l[c_l P(l|l − 1)c_l + σ_u²]^(−1).
   (b) Update the state at l by
       t̂(l|l) = t̂(l|l − 1) + K_l[r_l − c_l t̂(l|l − 1)].
   (c) Update the variance of the state at l by
       P(l|l) = [1 − K_l c_l]P(l|l − 1).
   (d) Predict the state at l + 1 by
       t̂(l + 1|l) = φ(l + 1|l)t̂(l|l).
   (e) Predict the state variance at l + 1 by
       P(l + 1|l) = φ(l + 1|l)P(l|l)φ(l + 1|l) + σ_v².
3. Increase l by 1 and process r_l; repeat the five recursive steps outlined
   in step 2 until l = L.
4. Output the KF-SSE-estimated t̂^{KF-SSE} = (t̂_1^{KF-SSE}, t̂_2^{KF-SSE}, . . ., t̂_L^{KF-SSE})^T,
   with t̂_l^{KF-SSE} = t̂(l|l) for all 1 ≤ l ≤ L, obtained from steps 2 and 3.
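The KF-SSE steps above reduce to a few lines of scalar arithmetic; the sketch below follows steps 1–4 with c_l = φ = 1, writing the preset noise parameters as variances. The default σ values are illustrative assumptions, not values from this chapter.

```python
import numpy as np

def kf_sse(r, sigma_u2=0.01, sigma_v2=0.01):
    """KF-SSE (steps 1-4): band-by-band estimate of a true signature t
    from an observed signature r, with c_l = phi(l+1|l) = 1."""
    L = len(r)
    t_pred, P_pred = 0.0, 1.0                  # t_hat(1|0) = 0, P(1|0) = 1
    t_hat = np.empty(L)
    for l in range(L):
        K = P_pred / (P_pred + sigma_u2)       # step 2(a): Kalman gain
        t_filt = t_pred + K * (r[l] - t_pred)  # step 2(b): state update
        P_filt = (1.0 - K) * P_pred            # step 2(c): variance update
        t_pred = t_filt                        # step 2(d): state prediction
        P_pred = P_filt + sigma_v2             # step 2(e): variance prediction
        t_hat[l] = t_filt                      # t_hat_l = t_hat(l|l)
    return t_hat
```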

KF-SSI
1. Initial conditions:
   (a) Assume that Δ = {s_k}_{k=1}^{K} is a given spectral library or database and s_k is a
       matching signature selected from Δ.
   (b) Preset the values of σ_u and σ_v.
   (c) Set t̂(1|0)(s_k) = 0 and P(1|0) = Var[t_1] = 1.
   (d) Set c_l = s_kl/t̂_l for all l = 1, 2, . . ., L.
   (e) Set φ(l + 1|l) = 1 for all l = 1, 2, . . ., L.
2. Start with r_l at l = 1:
   (a) Calculate the Kalman gain:
       K_l = φ(l + 1|l)P(l|l − 1)c_l[c_l P(l|l − 1)c_l + σ_u²]^(−1).
   (b) Update the state at l by
       t̂(l|l) = t̂(l|l − 1) + K_l[r_l − c_l t̂(l|l − 1)].
   (c) Update the state variance at l by
       P(l|l) = [1 − K_l c_l]P(l|l − 1).
   (d) Predict the state at l + 1 by
       t̂(l + 1|l) = φ(l + 1|l)t̂(l|l).
   (e) Predict the state variance at l + 1 by
       P(l + 1|l) = φ(l + 1|l)P(l|l)φ(l + 1|l) + σ_v².
3. Increase l by 1 and process r_l. Update c_l = s_kl/t̂_l and repeat the five
   recursive steps outlined in step 2 until l = L.
4. Output the KF-SSI-estimated t̂^{KF-SSI}(s_k) = (t̂_1^{KF-SSI}(s_k), t̂_2^{KF-SSI}(s_k), . . ., t̂_L^{KF-SSI}(s_k))^T,
   with t̂_l^{KF-SSI}(s_k) = t̂(l|l) for all 1 ≤ l ≤ L, obtained from steps 2 and 3.
5. Find the s* that yields the minimum LSE between t̂^{KF-SSI}(s_k) and
   r = (r_1, r_2, . . ., r_L)^T over s_k in Δ via (3.52), that is, s* = arg min_{s_k ∈ Δ} LSE(r, s_k).
6. Identify r = (r_1, r_2, . . ., r_L)^T as the signature s*.
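A sketch of KF-SSI following the steps above is shown below. One caveat: with the initial condition t̂(1|0)(s_k) = 0, the first gain c_1 = s_k1/t̂_1 is undefined, so this sketch floors the predicted state at a small ε before the division; the floor, like the default σ values, is an implementation assumption rather than part of the published steps.

```python
import numpy as np

def kf_ssi_filter(r, s, sigma_u2=0.01, sigma_v2=0.01, eps=1e-6):
    """One KF-SSI run (steps 1-4) against a candidate library signature s,
    with spectrally varying gain c_l = s_l / t_hat_l and phi = 1."""
    L = len(r)
    t_pred, P_pred = 0.0, 1.0
    t_hat = np.empty(L)
    for l in range(L):
        c = s[l] / max(t_pred, eps)                    # c_l = s_kl / t_hat_l
        K = P_pred * c / (c * P_pred * c + sigma_u2)   # step 2(a)
        t_filt = t_pred + K * (r[l] - c * t_pred)      # step 2(b)
        P_filt = (1.0 - K * c) * P_pred                # step 2(c)
        t_pred = t_filt                                # step 2(d), phi = 1
        P_pred = P_filt + sigma_v2                     # step 2(e)
        t_hat[l] = t_filt
    return t_hat

def kf_ssi_identify(r, library, **kw):
    """Steps 5-6: identify r as the library signature minimizing the
    LSE of (3.52)."""
    lses = [float(np.linalg.norm(r - kf_ssi_filter(r, s, **kw)))
            for s in library]
    return int(np.argmin(lses)), lses
```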

KF-SSQ
1. Initial conditions:
   (a) Preset the value of σ_u.
   (b) Set σ_v = 0.
   (c) Assume that α(t) is the abundance vector corresponding to the target
       signature t. Set c_l = t_l for all l = 1, 2, . . ., L.
   (d) Let α̂(1|0)(t) = 0 and P(1|0) = Var[α_1] = 1.
   (e) Set φ(l + 1|l) = 1 for all l = 1, 2, . . ., L.
2. Start with r_l at l = 1:
   (a) Calculate the Kalman gain:
       K_l = φ(l + 1|l)P(l|l − 1)c_l[c_l P(l|l − 1)c_l + σ_u²]^(−1).
   (b) Update the state at l by
       α̂(l|l)(t) = α̂(l|l − 1)(t) + K_l[r_l − c_l α̂(l|l − 1)(t)].
   (c) Update the state variance at l by
       P(l|l) = [1 − K_l c_l]P(l|l − 1).
   (d) Predict the state at l + 1 by
       α̂(l + 1|l) = φ(l + 1|l)α̂(l|l).
   (e) Predict the state variance at l + 1 by
       P(l + 1|l) = φ(l + 1|l)P(l|l)φ(l + 1|l) + σ_v².
3. Increase l by 1 and process r_l. Update c_l = t_l and repeat the five recursive
   steps outlined in step 2 until l = L.
4. Output the KF-SSQ-estimated abundance vector α̂^{KF-SSQ}(t) = (α̂_1^{KF-SSQ}(t),
   α̂_2^{KF-SSQ}(t), . . ., α̂_L^{KF-SSQ}(t))^T, with α̂_l^{KF-SSQ}(t) = α̂(l|l)(t) for all 1 ≤ l ≤ L,
   obtained from steps 2 and 3.
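The KF-SSQ steps can be sketched in the same fashion. Here t is assumed known a priori (the first scenario); in the second scenario a caller would simply pass a KF-SSE estimate of t in its place, as in (3.55). The default σ_u value is an illustrative assumption.

```python
import numpy as np

def kf_ssq(r, t, sigma_u2=0.01):
    """KF-SSQ (steps 1-4): band-by-band abundance estimate of a target
    signature t in r, with c_l = t_l, phi = 1, and sigma_v = 0."""
    L = len(r)
    a_pred, P_pred = 0.0, 1.0              # alpha_hat(1|0) = 0, P(1|0) = 1
    a_hat = np.empty(L)
    for l in range(L):
        c = t[l]                                       # system gain c_l = t_l
        K = P_pred * c / (c * P_pred * c + sigma_u2)   # step 2(a)
        a_filt = a_pred + K * (r[l] - c * a_pred)      # step 2(b)
        P_filt = (1.0 - K * c) * P_pred                # step 2(c)
        a_pred, P_pred = a_filt, P_filt    # steps 2(d)-(e): phi = 1, sigma_v = 0
        a_hat[l] = a_filt
    return a_hat
```

With r containing the target at a fixed fractional abundance, the estimate converges to that fraction as bands accumulate.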

3.5 Conclusions

This chapter reviews Kalman filtering, which is one of the most widely used real-
time processing techniques in statistical signal processing. Its application to
hyperspectral imaging was first explored by Chang and Brumbley (1999a, b) to
develop KF-based LSMA (KF-LSMA). The concept of Kalman filtering was
further explored by Chang and Wang (2006) and Chang (2013), which developed
KF-based techniques for spectral signature characterization rather than KF-LSMA
for LSU. Accordingly, several new applications of Kalman filtering in
hyperspectral signature analysis were developed for spectral estimation, identifica-
tion, and quantification by rederiving the measurement and state equations, which
resulted in various techniques referred to as KF-based spectral characterization
signal processing (KF-SCSP) techniques.
Chapter 4
Target-Specified Virtual Dimensionality
for Hyperspectral Imagery

Abstract Virtual dimensionality (VD) was first envisioned and coined by Chang
(Hyperspectral Imaging: Techniques for Spectral Detection and Classification,
Kluwer Academic/Plenum Publishers, New York, 2003) and later written about
with details in Chang and Du (IEEE Transactions on Geoscience and Remote
Sensing 42:608–619, 2004). It was originally developed for the purpose of finding
an appropriate number of signatures required by linear spectral mixture analysis to
form an effective linear mixing model for data unmixing. To go beyond its original
intent, VD was further defined as the number of spectrally distinct signatures present
in hyperspectral data where the intention is to leave room for users to specify what
spectrally distinct signatures are. Unfortunately, since its inception in 2003, many
techniques developed to estimate VD have overlooked this issue and, thus, have
produced various values for VD. As a result, different conclusions have been drawn
about VD. This issue was realized by Chang (Hyperspectral data processing:
algorithm design and analysis, Wiley, Hoboken, 2013), where VD was defined by
two types of criteria, data characterization-driven criteria and data representation-
driven criteria. This chapter revisits VD and follows a different route. It extends VD
to target-specified VD from various target signal source perspectives of interest
instead of focusing on data as treated in Chang (Hyperspectral data processing:
algorithm design and analysis, Wiley, Hoboken, 2013). In particular, it generalizes
VD in two different approaches. One is to develop a unified theory for VD of
hyperspectral imagery, where the value of VD can be completely determined by
targets of interest. With this newly developed theory VD techniques can be cate-
gorized according to targets that can be characterized by various orders of statistics
into first-order statistics, second-order statistics, and high-order statistics, where a
binary composite hypothesis testing problem is formulated with a target of interest
considered as a desired signal under the alternative hypothesis, while the null
hypothesis represents an absence of the target. With this interpretation many VD
estimation techniques can be unified under the same framework, including those
defined in Chang (Hyperspectral data processing: algorithm design and analysis,
Wiley, Hoboken, 2013), for example, the Harsanyi–Farrand–Chang method or the
maximum orthogonal complement algorithm (MOCA) (Kuybeda et al., IEEE
Transactions on Signal Processing 55:5579–5592, 2007). A second approach is to
define VD as being completely determined by real target signal sources instead of
data spectral statistics. Once again, a binary composite hypothesis testing problem
is also used to determine whether a specific target signal source is a desired signal source to account for VD. Interestingly, this approach is more practical in real-world problems because VD is actually determined by real target signal sources
directly extracted from the data for a particular application rather than eigenvectors
specified by data spectral statistics, which are independent of applications.

4.1 Introduction

Linear spectral mixture analysis (LSMA) is one of the most important and fundamental
applications in hyperspectral data exploitation. When implemented, it requires prior
knowledge of the signatures used to form a linear mixing model (LMM) to find the
abundance fraction of each of the signatures present in a data sample vector.
Unfortunately, in reality, such prior knowledge is generally not available. Under
such a circumstance two challenging issues must be addressed. One is determining
the number of signatures, p. The other is finding these p signatures that are used to
form an LMM. In the past, the value of p was determined empirically, and no
specific criteria were proposed until virtual dimensionality (VD) was introduced in
2003 (Chang 2003a, b). Once the value of p is determined, a follow-up issue is to
find the desired p signatures. To resolve this issue, various techniques have been
developed, for example, endmember extraction algorithms (EEAs) discussed in
great detail in Part II of Chang (2013) as well as endmember finding algorithms
(EFAs) studied extensively in Parts II and III of Chang (2016) or unsupervised
target detection algorithms such as the automatic target generation process (ATGP),
unsupervised nonnegativity constrained least-squares (UNCLS) method, and
unsupervised fully constrained least-squares (UFCLS) method in Chang (2016).
Interestingly, it was shown in Gao et al. (2015a, b) and Chang (2016) that signatures
extracted by EEAs or found by EFAs are not necessarily desired signatures that are
suitable for the LMM used by LSMA. In other words, EEAs and EFAs are not the
best algorithms for finding signatures to form an LMM for LSMA, as many
researchers once expected. This leads to a crucial clarification of how to properly
use VD for interpretation. Specifically, the value of VD, nVD, determined for
endmembers should not be the same value of nVD determined for signatures used
by LSMA. The same principle holds for the number of anomalies. This implies that
the value of nVD must be determined by applications, not solely by data statistics.
This chapter investigates this important issue, which so far has not been recognized
or explored.
Many techniques have been developed to estimate the number of signal sources in
hyperspectral imagery. The earliest one was developed by Harsanyi et al. (1994a, b),
referred to as the Harsanyi–Farrand–Chang (HFC) method, which has been shown
to be by far the most effective and promising. However, the concept of VD was not
realized until Chang (2003a), which modeled the VD estimation problem as a
binary composite hypothesis testing problem with covariance eigenvalues and
correlation eigenvalues being treated as signal sources under the null and alterna-
tive hypotheses, respectively. It then developed a Neyman–Pearson detector (NPD)
to determine the value of nVD as specified by a prescribed false alarm probability.
The idea of using a binary hypothesis testing problem was also later adopted by the
maximal orthogonal complement algorithm (MOCA) (Kuybeda et al. 2007), which
used a maximum likelihood detector (Bayesian detector) instead of an NPD to determine the rank of rare signals in the data as an estimate of nVD. An idea of
combining the HFC method with MOCA was investigated in Chang et al. (2011b),
where an approach called maximal orthogonal subspace projection (MOSP) was
developed to estimate nVD. Most recently, MOSP was further extended to high-
order statistics (HOS) HFC methods (Chang et al. 2014a, b).
Another approach is to use a linear model for data representation to estimate
nVD. One is signal subspace estimation (SSE) and hyperspectral signal identifica-
tion by minimum error (HySime) proposed by Bioucas-Dias and Nascimento
(2005, 2008). However, there are two major issues with the SSE/HySime method.
The first has to do with the fact that the value of nVD estimated by SSE/HySime is
heavily reliant upon the method used to estimate the noise covariance matrix. The
second and most serious issue is that the HySime-estimated value is a single fixed
value and completely determined by the data statistics and linear representation
regardless of application. In other words, the SSE/HySime-estimated nVD is the
same for all applications, such as anomaly detection, endmember finding/extraction, and LSMA. Apparently, this should generally not be the case. Thus, from a practical point of view, it is not an appropriate measure for estimating nVD. A second one is
LSMA-based methods proposed by Chang et al. (2010a, b), where the value of
nVD is determined by the minimal unmixed error produced by the number of
signatures used to form an LMM for LSMA.
The aforementioned techniques have one thing in common: their estimated
values for nVD have nothing to do with applications. Although the value estimated
by the HFC method with its variants can be made to vary by adjusting different false
alarm probabilities, which may be somewhat specified by applications, the signal
sources used by HFC-based methods via a binary composite hypothesis testing
formulation are eigenvalues or eigenvectors, not real target signal sources. Accord-
ingly, they are not as effective as the signal sources found directly from applica-
tions. To resolve this dilemma, this chapter develops a target-specified VD (TSVD)
theory for hyperspectral imagery where the value of nVD is completely determined
by the targets of interest characterized by specific applications with algorithms
custom-designed for finding them.

4.2 Review of VD

Determining the value of nVD can be very challenging. One of the earliest attempts
at doing so involved using eigenanalysis, such as eigenvalues or singular values, as
a criterion to find a significant gap that would indicate how many eigenvalues/
singular values should be counted to represent the number of signal sources.


However, this approach requires a judicious selection of an appropriate threshold
to determine nVD. A detailed treatment on eigenanalysis can be found in Chang
(2013). Another approach involves the use of least-squares error (LSE) or mean
squared error (MSE) to determine nVD, for example, HySime or the sparsity-promoting
iterated constrained endmember (SPICE) (Zare and Gader 2007). Such approaches
generally require an assumption that data samples can be modeled by a linear
representation or a linear admixture of a number of signal sources plus a noise
term whose covariance matrix must be reliably estimated. Unfortunately, the nVD they produce is fixed at a single constant value, and their generalizability is very limited because it cannot be adapted to different applications, as noted in the introduction.
To address this issue, Harsanyi et al. (1994a) took advantage of concepts in statistical signal processing to develop a revolutionary approach, called the HFC
method, by formulating the determination of the number of signatures present in the
data as a binary composite hypothesis testing problem, where an NPD subject to a prescribed false alarm probability was used to determine how many signatures were assumed to be present in the data. The beauty of the HFC method is
that it interprets the problem of determining nVD as a signal detection problem
where the signal detectability is determined by the false alarm probability, which
can then be adjusted to adapt to various applications, rather than by a prescribed threshold or a single fixed constant value. Most importantly, it makes no assumptions
about LMM/representation. Interestingly, the potential usefulness of the HFC
method was not recognized until VD was first introduced and coined in Chang
(2003a) and later written about in publications by Chang and Du (2004) and Chang
(2013). It was originally designed to address the issue of how many spectrally
distinct signatures should be used to form an LMM required by LSMA to unmix
data. Surprisingly, since then, VD has successfully found its way into many other
applications in hyperspectral data exploitation, for example, the number of
endmembers to be extracted or found, nE, the number of dimensions to be retained
following data dimensionality reduction, nDR, or the number of bands to be selected by band
selection (BS), nBS (Chang 2003b). In particular, the use of an NPD has become one
of the most popular and widely used methods to determine nVD.
However, the most difficult and challenging issue in using the Neyman–Pearson
detection test is to find a probability distribution (PD) under each hypothesis. In the
HFC method, these PDs are Gaussian-distributed because an eigenvalue distribu-
tion has proven to be an asymptotic Gaussian distribution (Anderson 1984). By
taking advantage of the idea behind the HFC method, two approaches were
developed. One is called eigenvalue likelihood maximization (ELM) (Luo
et al. 2013), which maximizes the log of the product of the likelihood functions derived from the HFC method. In this case, it does not perform
NP tests but rather produces a single constant value. The other is the MOCA,
developed by Kuybeda et al. (2007), which uses extreme value theory (Leadbetter 1987)
to find PDs under the null and alternative hypotheses, which are assumed to be
Gaussian and uniformly distributed, respectively. As an alternative, Broadwater
and Banerjee (2009) proposed the subpixel dimensionality distance (SDD), which
generates nonparametric PDs for both hypotheses by self-created background and target pixels. Most recently, Ambikapathi et al. (2013) combined and integrated
LSMA-based VD in Chang et al. (2010a, b), HySime, MOCA, and MOSP to
develop a hybrid approach, referred to as geometry-based estimation of number
of endmembers (GENE), with more restricted assumptions. First, to develop their
own Neyman–Pearson detection theory, Chang et al. (2010a, b) had to assume the
presence of an LMM with an additive Gaussian noise. In addition, they had to make
a reliable estimation of the noise covariance matrix to calculate the LMM-fitting
error, in which case the PDs under both hypotheses could be shown to be
chi-squared distributions with the degrees of freedom, Nmax, needing to be determined a priori. More specifically, this approach implemented HySime in a formulation of binary hypothesis testing problems used in MOSP, where the signals under
both hypotheses are fitting errors obtained by the covariance noise estimation
resulting from the LMM. In other words, GENE is simply a combination of
LSMA-VD, HySime, MOCA, and MOSP in the sense that the signals under both
hypotheses are LMM-fitting errors obtained by the noise covariance estimation.
This idea is similar to that of HySime. The binary composite hypothesis testing for
its NPD was derived from MOSP. The idea of deriving chi-squared distributions
follows the same idea in MOCA, which used Gaussian and uniform distributions under the H0 and H1 hypotheses, respectively.
Most recently, a second-order-statistics (2OS)-based theory was developed for
estimating nVD in Paylor and Chang (2013) and Paylor (2014). Of particular interest
is an LSE-based model approach derived in Paylor and Chang (2014) in comparison
with the GENE. Of course, other approaches are worth mentioning, such as the
concept of intrinsic dimensionality (ID). For example, an approach in Eches
et al. (2010) discussed ID, which is totally different from VD. It proposed a normal
compositional model (NCM) that assumed that each pixel of an image was modeled
as a linear combination of an unknown number of pure materials, called
endmembers, and then estimated the mixture coefficients of the normal composi-
tional model, referred to as abundances, as well as their number using a reversible
jump Bayesian algorithm. Another approach proposed in Heylen and Scheunders
(2013) estimated ID using nearest-neighbor distance ratios with k nearest neighbors
(k-NN) as a criterion for each pixel and then determined the ID of the data by
finding the mean of the IDs over all pixels plus one. Most recently, a comparative
analysis among three methods, HFC, HySime, and random matrix theory, was also
conducted to estimate the ID (Cawse-Nicholson et al. 2013).
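To make the flavor of such k-NN-based ID estimation concrete, the following toy sketch computes a Levina–Bickel-style maximum likelihood ID per pixel from nearest-neighbor distance ratios and averages it over all pixels. It is an illustrative stand-in under my own naming, not the exact estimator of Heylen and Scheunders (2013), which also adds one to the averaged per-pixel ID:

```python
import numpy as np

def knn_id(X, k=10):
    """Average per-pixel intrinsic dimensionality (ID) from k-NN distance ratios.

    Levina-Bickel-style MLE: per pixel, ID = 1 / mean_j log(T_k / T_j),
    j = 1..k-1, where T_j is the distance to the jth nearest neighbor.
    Brute-force pairwise distances; use a KD-tree for large images.
    """
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    D.sort(axis=1)
    T = D[:, 1:k + 1]                    # k nearest-neighbor distances (skip self)
    per_pixel = 1.0 / np.log(T[:, -1:] / T[:, :-1]).mean(axis=1)
    return float(per_pixel.mean())
```

Sampling points from a low-dimensional linear subspace embedded in a higher-dimensional band space should recover an estimate close to the true subspace dimension.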
Generally speaking, all the aforementioned techniques can be considered
eigenanalysis approaches where the signal sources under both hypotheses are either
eigenvalues or eigenvectors. The only exception is the ATGP-NPD developed in
Chang et al. (2011a, b, c, d), which used ATGP-generated targets as signal sources.
However, in real practical applications, targets of different types in hyperspectral
imagery exhibit their own unique profiles of spectral characteristics, for example,
anomalies, endmembers, artificial targets, urban targets, natural targets, and so
forth, all of which can be characterized by their spectral properties (Chap. 18 in
Chang 2013). An eigenanalysis approach does not capture real spectral statistics
since various real target signal sources may be projected onto the same eigenvec-
tors. Accordingly, the spectral properties of real target signal sources are very likely
compromised. Thus, a crucial and critical issue in designing and developing
effective techniques for VD estimation is the signal sources that must be determined
by real targets of interest, not eigenvalues/eigenvectors. To the author’s best
knowledge, the ATGP-NPD developed in Chang et al. (2011a, b, c, d) and a theory
of HOS-based VD in Chang et al. (2014a, b) are the only approaches that make use
of real target signal sources to estimate VD. This chapter is believed to be the first to
extend the idea of ATGP-NPD and HOS-VD to derive a unified theory of TSVD.
Table 4.1 summarizes various VD estimation techniques available in the liter-
ature, along with their criteria, assumptions, and constraints.

4.3 Eigen-Analysis-Based VD

In real-world problems, targets of interest are generally determined by different applications. In this case, VD must be able to adapt its definition to various targets
under consideration. The theory of TSVD is developed for this purpose. In doing so
this theory formulates finding VD as a binary composite hypothesis testing problem
where the targets of interest specified by a particular application are treated as
desired signal sources to be detected. The value of nVD is then determined by the
number of failures in testing these signal sources by an NPD.
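In schematic form, this counting mechanism amounts to tallying the tests in which the null hypothesis fails. The minimal sketch below (function name mine) assumes the per-target test statistics and Neyman–Pearson thresholds have already been computed:

```python
def target_specified_vd(test_statistics, thresholds):
    """n_VD = number of binary composite tests in which H0 fails,
    i.e., the target's test statistic exceeds its NP threshold."""
    return sum(int(z > tau) for z, tau in zip(test_statistics, thresholds))
```

For instance, with four hypothetical targets whose test statistics are [5.2, 0.1, 3.3, 9.8] against a common threshold of 1.0, three tests fail H0, giving nVD = 3.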
The idea of characterizing targets in terms of spectral statistics was derived from
a new notion, called interband spectral information (IBSI) introduced in Chang
et al. (2010a, b, 2011a, b, c, d), which is also discussed in Chang (2013). Assume
that t is the target signature of interest and St the sample pool specified by t. The
spectral statistics of t is defined by IBSI(St), which is the sample statistics generated
by spectral correlation among data sample vectors in St. Using the concept of IBSI
(St), targets can be characterized by various orders of spectral statistics, for exam-
ple, first-order spectral statistics targets, second-order spectral statistics targets, and
high-order spectral statistics targets. As a consequence, for a given order of spectral
statistics, q, determining the value of nVD for the qth-order spectral targets is
equivalent to finding the number of the qth-order spectral targets. In doing so a
binary composite hypothesis testing formulation can be used where a qth-order
spectral target is considered a desired signal under the alternative hypothesis, H1,
with its absence represented by the null hypothesis H0. The number of failures (i.e.,
H1 is true and H0 fails) in a test is the number of qth-order spectral targets present in
the data.
Two types of detectors are generally used for signal detection. One is a Bayesian
detector, proposed in the MOCA (Kuybeda et al. 2007), which is an a priori detector
assuming uniform cost and equal probability hypotheses. In this case, the Bayesian
detector is reduced to a maximum likelihood detector (MLD), and only a single
value is produced for the number of the qth-order spectral targets. The other is an
Table 4.1 Methods for VD estimation along with their criteria, assumptions, and constraints

Method | Statistics | Criterion | LMM | Gaussian model | Threshold parameter | Noise estimation | Estimation for nVD | nVD
HFC | First order | Eigenanalysis | No | No | PF | No | NPD: eigenvalues | Vary/PF
ELM | Second order | Likelihood function | No | Yes | None | No | Maximization | Fixed
MOCA | Second order | SVD | No | No | None | No | MLD: singular vector | Fixed
HySime | Second order | MSE: LMM | Yes | Yes | None | Yes | Minimization | Fixed
OSP | Second order | LSE: OSP-LMM | Yes | No | ε | No | Minimization | Vary/ε
MOSP | Second order | OSP | No | No | PF | No | NPD: ATGP-generated targets | Vary/PF
SDD | Nonparametric | Detector power | No | No | PF | No | NPD: nonparametric PDs | Vary/PF
SPICE | Second order | LSE + V + SPT | No | No | ε | No | LSE | Vary/ε
GENE | Second order | LSE | Yes | Yes | Nmax, PF | Yes | NPD: LMM-fitting error | Vary/PF
Second-order statistics | Second order | LSE | Yes | No | PF | No | NPD: second-order-statistics-generated targets | Vary/PF
LSE model | Second order | LSE | Yes | Yes | PF | No | NPD: LS-generated targets | Vary/PF
NCM | Prior | Likelihood function | NCM | Yes | σ² | No | Bayesian algorithm for ID | Vary/σ²
HOS-VD | HOS | OSP | No | No | PF | No | NPD: HOS-generated targets | Vary/PF
k-NN | Manifold | NN | No | No | k (NN) | Yes | Average ID/pixel | Vary/k
NPD, developed in the HFC method, which assumes no prior knowledge; its signal detection performance is evaluated by the false alarm probability, PF, at a given significance level. As a result, the number of the qth-order spectral targets varies and is
determined by a prescribed value of PF. The TSVD for hyperspectral imagery is
derived by integrating the concept of specifying spectral targets of any given order
into a binary composite hypothesis testing detection formulation to derive a detec-
tor that can determine the value of nVD for a specific type of spectral target, such as
anomaly, endmember, and others. Under this umbrella many existing VD estima-
tion techniques can be treated as the first two orders of statistics-based techniques,
for example, HFC/noise-whitened HFC (NWHFC) as a first-order-statistics-based
technique and MOCA, SSE/HySime as 2OS-based techniques.
As noted in the introduction, VD was originally designed to resolve the issue of
how many signatures are required to form an LMM used by LSMA for data
unmixing and was used subsequently to address similar issues in various applica-
tions, such as the number of endmembers, nE, to be extracted, the number of
dimensions, nDR, required to be retained after data dimensionality reduction, the
number of bands, nBS, to be selected by BS, and the number of anomalies, nA,
present in the data. Unfortunately, when VD was defined in Chang (2003a), it did
not specify what a spectrally distinct signature should be. Accordingly, a single
value of VD, nVD, cannot be adequately used for all applications. Apparently, the
previously developed eigenanalysis-based approaches are not able to cope with this
issue. This chapter rebuilds the concept of VD from the ground up and develops a rather different theory for VD from those discussed in Chang (2003a,
Chap. 17) and Chang (2013, Chap. 5). First, finding nDR, nBS, nE, and nA requires
different values. That is, the value of nVD estimated for nDR, nBS, nE, and nA should
be different in each case and cannot be the same value determined by a single
constant value. For example, nDR and nBS are generally determined by data structures and statistics, whereas nE and nA are indeed determined by targets of interest. By far, most techniques available in the literature belong to the first
category and have no reference to any specific targets. Therefore, they are better for
estimating nDR and nBS, not nE and nA, since there is no prior knowledge but only
data structures and statistics available for use in DR and BS. Examples of such
techniques include the HFC method and HySime. On the other hand, for nE and nA
to be estimated effectively, the techniques used to estimate the value of nVD must be
those that can capture the spectral characteristics of specific targets of interest, not
data statistics or structure. This means that techniques used to estimate nDR and nBS
may not be suitable for estimating nE and nA. Therefore, new approaches must be
sought. This chapter presents a theory of TSVD for hyperspectral imagery where
the value of nVD is completely determined by targets of interest characterized by
their spectral statistics.
4.3.1 Binary Composite Hypothesis Testing Formulation

Binary hypothesis testing has been widely used for signal detection in noise where
two hypotheses represent signal presence and absence. This approach was first
applied to determine the value of nVD by the HFC method (Harsanyi et al. 1994a, b).

4.3.1.1 HFC Method

The idea of the HFC method is very simple. It assumes that the hyperspectral signatures are unknown, nonrandom, and deterministic signal sources and that the noise is white with zero mean. Under this circumstance, the signatures of interest contribute to the first-order statistics, that is, the sample mean. Let $\{r_i\}_{i=1}^{N}$ be the data sample vectors and $\mu$ the sample mean vector given by $\mu = (1/N)\sum_{i=1}^{N} r_i$. The HFC method first calculates the sample autocorrelation matrix, $R_{L\times L} = (1/N)\sum_{i=1}^{N} r_i r_i^{T}$, and the sample covariance matrix, $K_{L\times L} = (1/N)\sum_{i=1}^{N} (r_i - \mu)(r_i - \mu)^{T}$, and then finds the difference between their corresponding eigenvalues. More specifically, let $\{\hat{\lambda}_1 \geq \hat{\lambda}_2 \geq \cdots \geq \hat{\lambda}_L\}$ and $\{\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_L\}$ be the two sets of eigenvalues generated by $R_{L\times L}$ and $K_{L\times L}$, called correlation eigenvalues and covariance eigenvalues, respectively, where L is the number of spectral channels. If a hyperspectral signal source is present in the data, there should be some spectral dimension l with $1 \leq l \leq L$ such that $\hat{\lambda}_l > \lambda_l$, owing to the fact that the signal source contributes to the sample mean in the sample correlation matrix $R_{L\times L}$ but not to the sample covariance matrix $K_{L\times L}$. It should be noted that the sample correlation matrix $R_{L\times L}$ is invariant to any permutation $\{P(i)\}_{i=1}^{N}$ of $\{i\}_{i=1}^{N}$, that is, $R_{L\times L} = (1/N)\sum_{i=1}^{N} r_{P(i)} r_{P(i)}^{T}$, where $\{r_i\}_{i=1}^{N} = \{r_{P(i)}\}_{i=1}^{N}$. This implies that $R_{L\times L}$ is an intrasample correlation matrix, not an intersample spatial correlation matrix. On the basis of these observations, Harsanyi et al. (1994a, b) formulated the determination of the value of $n_{VD}$ as the following binary hypothesis testing problem:

$$
\begin{aligned}
H_0: z_l &= \hat{\lambda}_l - \lambda_l = 0 \\
\text{versus}\qquad & \\
H_1: z_l &= \hat{\lambda}_l - \lambda_l > 0
\end{aligned}
\qquad \text{for } l = 1, 2, \ldots, L, \tag{4.1}
$$

where the null hypothesis H0 and the alternative hypothesis H1 represent the case
that the correlation eigenvalue is equal to its corresponding covariance eigenvalue
and the case that the correlation eigenvalue is greater than its corresponding
covariance eigenvalue, respectively. In other words, when H1 is true (i.e., H0
fails), it implies that there is a hyperspectral signature contributing to the correlation
eigenvalue in terms of first-order statistics since the noise energy represented by the
eigenvalue of $R_{L\times L}$ in that particular component is the same as that represented by the eigenvalue of $K_{L\times L}$ in its corresponding component. The only difference is the sample mean vector, which is a nonzero vector in $R_{L\times L}$ but a zero vector in $K_{L\times L}$.
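The permutation invariance of the sample correlation matrix noted above is easy to check numerically; the following toy sketch (names mine) shuffles the sample order and recomputes $R_{L\times L}$:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))            # 100 sample vectors, 6 spectral bands
R = X.T @ X / len(X)                     # sample correlation matrix R_{LxL}
Xp = rng.permutation(X, axis=0)          # arbitrary permutation P(i) of the samples
Rp = Xp.T @ Xp / len(Xp)
same = bool(np.allclose(R, Rp))          # True: R ignores the sample ordering
```

Any intersample (spatial) correlation matrix would not survive this shuffle, which is exactly the intrasample/intersample distinction drawn above.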
An NPD for $\hat{\lambda}_l - \lambda_l$, denoted by $\delta_{NP}(l)$, for the binary composite hypothesis testing problem specified by (4.1) can be obtained by maximizing the detection power $P_D$, with the false alarm probability $P_F$ being fixed at a specific given value, $\alpha$, which determines the threshold value $\tau_l$ in the following randomized decision rule:

$$
\delta_{NP}(z_l) =
\begin{cases}
1, & \text{if } \Lambda(z_l) > \tau_l, \\
1 \text{ with probability } \kappa, & \text{if } \Lambda(z_l) = \tau_l, \\
0, & \text{if } \Lambda(z_l) < \tau_l,
\end{cases}
\tag{4.2}
$$

where the likelihood ratio test $\Lambda(z_l)$ is given by $\Lambda(z_l) = p_1(z_l)/p_0(z_l)$, with $p_0(z_l)$ and $p_1(z_l)$ given by (4.1). Thus, a case of $\hat{\lambda}_l - \lambda_l > \tau_l$ indicates that $\delta_{NP}(z_l)$ in (4.2) fails the test, in which case signal energy is assumed to contribute to the eigenvalue $\hat{\lambda}_l$ in the lth spectral dimension. It should be noted that the test for (4.1) must be performed for each of the L spectral dimensions. Therefore, for each difference $\hat{\lambda}_l - \lambda_l$, the threshold $\tau_l$ varies and should be spectral dimension-dependent. Using (4.2), the value of $n_{VD}$ can be determined by calculating

$$
\mathrm{VD}_{HFC}^{NP}(P_F) = \sum_{l=1}^{L} \lfloor \delta_{NP}(z_l) \rfloor, \tag{4.3}
$$

where $P_F$ is a predetermined false alarm probability, $\lfloor \delta_{NP}(z_l) \rfloor = 1$ only if $\delta_{NP}(z_l) = 1$, and $\lfloor \delta_{NP}(z_l) \rfloor = 0$ if $\delta_{NP}(z_l) < 1$. Because the hypothesis testing problem (4.1) is performed by detecting the spectral sample mean of each spectral band, which is the first-order spectral statistics, the HFC method is considered a first-order sample spectral statistics method.

Finally, the HFC method was further improved by the so-called NWHFC method, which whitens the data prior to implementation of the HFC method. As a result, its estimated VD is denoted by $\mathrm{VD}_{NWHFC}^{NP}(P_F)$.
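As a concrete illustration of (4.1)–(4.3), the following sketch (names mine) pairs the descending correlation and covariance eigenvalues, forms $z_l$, and counts the bands whose difference exceeds a band-wise Neyman–Pearson threshold. The Gaussian asymptotic variance used for the threshold is one common approximation, assumed here only for illustration:

```python
import numpy as np
from statistics import NormalDist

def hfc_vd(X, pf=1e-3):
    """HFC-style estimate of n_VD per (4.1)-(4.3).

    X : (N, L) array of N data sample vectors over L spectral bands.
    pf: prescribed false alarm probability P_F.
    """
    N, L = X.shape
    mu = X.mean(axis=0)
    R = X.T @ X / N                       # sample correlation matrix R_{LxL}
    K = R - np.outer(mu, mu)              # sample covariance matrix K_{LxL}
    lam_hat = np.sort(np.linalg.eigvalsh(R))[::-1]   # correlation eigenvalues
    lam = np.sort(np.linalg.eigvalsh(K))[::-1]       # covariance eigenvalues
    z = lam_hat - lam                                # z_l in (4.1)
    # One common asymptotic approximation of Var(z_l) under H0
    sigma = np.sqrt(2.0 * (lam_hat**2 + lam**2) / N)
    tau = sigma * NormalDist().inv_cdf(1.0 - pf)     # band-wise threshold tau_l
    return int(np.sum(z > tau))                      # (4.3): count of H0 failures

# Toy usage: white noise plus a strong nonzero mean in every band should
# yield at least one detected spectrally distinct signature.
rng = np.random.default_rng(1)
noise = rng.normal(size=(2000, 20))
shifted = noise + 10.0                    # constant signature added to all pixels
```

The shifted data set carries its signature entirely in the sample mean, which inflates the top correlation eigenvalue but leaves the covariance eigenvalues untouched, exactly the first-order effect the HFC test looks for.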

4.3.1.2 Maximum Orthogonal Complement Algorithm

The MOCA, a recent approach developed by Kuybeda et al. (2007), has attracted
considerable interest and been further studied by Chang et al. (2011a, b, c, d, 2014a, b).
It was designed to find a signal subspace and its rank dimensionality. If we interpret
the spectrally distinct signatures defined by VD as rare signal sources, MOCA can
also be considered a VD estimation technique for HOS-characterized target signal
sources (Chang et al. 2014a, b).
Let $\{r_i\}_{i=1}^{N}$ be the data sample vectors. Assume that for each l, which is supposed to be the rank of a signal subspace $S_l$, the data sample vectors $\{r_i\}_{i=1}^{N}$ are divided into two index classes, one for the target signal class, $I_T(l)$, and the other for the background class, $I_B(l)$. Now, we define

$$\nu_l = \max_{i \in I_B(l)} \left\| P_{S_l}^{\perp} r_i \right\|^2, \tag{4.4}$$

$$\xi_l = \max_{i \in I_T(l)} \left\| P_{S_l}^{\perp} r_i \right\|^2, \tag{4.5}$$

$$\eta_l = \max\{\xi_l, \nu_l\}. \tag{4.6}$$

More specifically, for each $1 \leq l \leq L$ we define

$$t_l^{SVD} = \arg\max_{r} \left\{ \left\| P_{S_l}^{\perp} r \right\| \right\}, \tag{4.7}$$

$$\eta_l = \left\| P_{S_l}^{\perp} t_l^{SVD} \right\|^2. \tag{4.8}$$

Note that $\{S_l\}$ is monotonically increasing in l, in the sense that $S_0 \subseteq S_1 \subseteq \cdots \subseteq S_l$, and $\{\eta_l\}$ is monotonically decreasing in l, in the sense that $\eta_0 \geq \eta_1 \geq \cdots \geq \eta_l$. The MOCA casts finding an optimal $l^*$ specified by (4.4)–(4.8) as a binary composite hypothesis testing problem with two hypotheses, $H_1$ and $H_0$, representing two corresponding cases, described as follows:

$$
\begin{aligned}
H_0: {} & \eta_l \sim p(\eta_l \mid H_0) = p_0(\eta_l) \\
\text{versus}\qquad & \\
H_1: {} & \eta_l \sim p(\eta_l \mid H_1) = p_1(\eta_l)
\end{aligned}
\qquad \text{for } l = 1, 2, \ldots, L, \tag{4.9}
$$

where the null hypothesis $H_0$ represents the case that the maximum residual comes from the background signal sources, while the alternative hypothesis $H_1$ represents the case that the maximum residual comes from the target signal sources. To make (4.9) work, we need to find the PDs governed by each of the two hypotheses. Because the orthogonal complement subspace projections of the data sample vectors, $P_{S_l}^{\perp} r_i$, under $H_0$ are supposed to be noise sample vectors, it is reasonable for MOCA to assume that the components of $P_{S_l}^{\perp} r_i$ under $H_0$ behave as independent, identically distributed Gaussian random variables. Moreover, $\eta_l$ represents the maximum residual of the orthogonal projections obtained in $\langle S_l \rangle^{\perp}$ under $H_0$. By virtue of extreme value theory (Leadbetter 1987), $\nu_l$ can be modeled by a Gumbel distribution; that is, the cumulative distribution function (cdf) of $\nu_l$, $F_{\nu_l}(x)$, is given by

$$
F_{\nu_l}(x) \approx \exp\left\{-e^{-(2\log N)^{1/2}\left[\frac{x-\sigma^2(L-l)}{\sigma^2\sqrt{2(L-l)}}-(2\log N)^{1/2}+\frac{1}{2}(2\log N)^{-1/2}(\log\log N+\log 4\pi)\right]}\right\}. \tag{4.10}
$$

Furthermore, MOCA makes another assumption about $\eta_l$ under the alternative hypothesis $H_1$. That is, for each pixel $r_i$, $i \in I_T(l)$, the maximum of the residuals of the orthogonal complement subspace projections can be modeled as a random variable $\xi_l$ with the PD $p_{\xi_l}(\eta_l)$, which is assumed to be uniformly distributed on $[0, \eta_{l-1}]$. Since there is no prior knowledge available about the distribution of target pixels, assuming that $\eta_l$ under $H_1$ is uniformly distributed seems most reasonable according to the maximum entropy principle in Shannon's information theory.
Under these two assumptions, we can obtain

$$p(H_0; \eta_l) = p_{\nu_l}(\eta_l) F_{\xi_l}(\eta_l) = p_{\nu_l}(\eta_l)(\eta_l/\eta_{l-1}), \tag{4.11}$$

$$p(H_1; \eta_l) = F_{\nu_l}(\eta_l) p_{\xi_l}(\eta_l) = F_{\nu_l}(\eta_l)(1/\eta_{l-1}). \tag{4.12}$$

Since $p_{\eta_l}(\eta_l) = p(H_0; \eta_l) + p(H_1; \eta_l) = (1/\eta_{l-1})\left[\eta_l p_{\nu_l}(\eta_l) + F_{\nu_l}(\eta_l)\right]$, we can obtain an a posteriori PD of $p(H_0 \mid \eta_l)$ given by

$$p(H_0 \mid \eta_l) = \frac{\eta_l p_{\nu_l}(\eta_l)}{\eta_l p_{\nu_l}(\eta_l) + F_{\nu_l}(\eta_l)} \tag{4.13}$$

and an a posteriori PD of $p(H_1 \mid \eta_l)$ given by

$$p(H_1 \mid \eta_l) = \frac{F_{\nu_l}(\eta_l)}{\eta_l p_{\nu_l}(\eta_l) + F_{\nu_l}(\eta_l)}. \tag{4.14}$$

By virtue of (4.13) and (4.14), the rank of the desired signal subspace can be obtained by

$$l^* = \arg\min_{1 \leq l \leq L} \left\{ p(H_0 \mid \eta_l) \geq p(H_1 \mid \eta_l) \right\}. \tag{4.15}$$

By virtue of (4.15), we can define the VD determined by MOCA as

$$\mathrm{VD}_{MOCA} = l^*. \tag{4.16}$$
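A minimal numerical sketch of this decision process is given below (all names mine). It assumes a known noise variance σ², grows $S_l$ greedily with the maximal-residual sample vector, and uses the density obtained by differentiating the Gumbel cdf in (4.10); the common factor $1/\eta_{l-1}$ cancels from the comparison in (4.15):

```python
import numpy as np

def gumbel_cdf(x, N, L, l, sigma2):
    """Gumbel approximation (4.10) to the cdf of the maximum background residual."""
    a = np.sqrt(2.0 * np.log(N))
    s = sigma2 * np.sqrt(2.0 * (L - l))
    z = (x - sigma2 * (L - l)) / s
    b = a - 0.5 * (np.log(np.log(N)) + np.log(4.0 * np.pi)) / a
    return np.exp(-np.exp(-a * (z - b)))

def gumbel_pdf(x, N, L, l, sigma2):
    """Density obtained by differentiating gumbel_cdf with respect to x."""
    a = np.sqrt(2.0 * np.log(N))
    s = sigma2 * np.sqrt(2.0 * (L - l))
    z = (x - sigma2 * (L - l)) / s
    b = a - 0.5 * (np.log(np.log(N)) + np.log(4.0 * np.pi)) / a
    u = np.exp(-a * (z - b))
    return (a / s) * u * np.exp(-u)

def moca_vd(X, sigma2):
    """MOCA-style rank estimate: smallest l with p(H0|eta_l) >= p(H1|eta_l)."""
    N, L = X.shape
    R = X.astype(float).copy()            # residuals w.r.t. the current subspace S_l
    for l in range(1, L):
        # grow S_{l-1} -> S_l with the maximal-residual sample vector (t_l)
        idx = int(np.argmax(np.sum(R**2, axis=1)))
        t = R[idx] / np.linalg.norm(R[idx])
        R = R - np.outer(R @ t, t)        # deflate residuals onto <S_l>^perp
        eta = float(np.max(np.sum(R**2, axis=1)))    # eta_l in (4.8)
        # (4.13)-(4.15): the common factor 1/eta_{l-1} cancels in the comparison
        if eta * gumbel_pdf(eta, N, L, l, sigma2) >= gumbel_cdf(eta, N, L, l, sigma2):
            return l                      # l* in (4.15), so VD_MOCA = l*
    return L
```

While strong target residuals dominate, the Gumbel density at $\eta_l$ is essentially zero and the loop continues; once only noise-level residuals remain, the posterior comparison flips in favor of $H_0$ and the current rank is returned.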

A final comment on MOCA is noteworthy. The idea of MOCA is to decompose all singular vectors into three sets corresponding to primary signal sources forming the
background subspace, secondary signal sources forming rare signal source subspace,
and noise sources forming a noise subspace. The l* determined by MOCA via (4.16) is
to separate signal sources including background and rare signal sources from noise
sources. The value of l* is exactly the value of nVD. It then goes one step further by
developing an algorithm, called minimax-singular value decomposition (MX-SVD),
to find an optimal value, p*(l), for given l signal sources so that the background and
signal sources can be separated into p* rare signal sources and l  p* background signal
sources. When MX-SVD is implemented in conjunction with MOCA, two optimal
values can be found, l* and p* for a given l*, denoted by p*(l*). Or, alternatively, a better
and simpler way to explain the idea of MOCA is to use eigenvalues instead of singular
vectors for illustration. Let $\{\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_L\}$ be all the eigenvalues arranged in descending order. The purpose of MOCA and MX-SVD is to find a set of two optimal values, $l^*$ and $p^*$, to decompose $\{\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_L\}$ into three separate disjoint sets of eigenvalues: $\{\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_{l^*-p^*}\}$, corresponding to background signal sources specified by SVD-generated singular vectors; $\{\lambda_{l^*-p^*+1} \geq \lambda_{l^*-p^*+2} \geq \cdots \geq \lambda_{l^*}\}$, corresponding to rare signal sources specified by SVD-generated singular vectors; and $\{\lambda_{l^*+1} \geq \lambda_{l^*+2} \geq \cdots \geq \lambda_L\}$, corresponding to noise signal sources specified by SVD-generated singular vectors.
determine the number of target signal vectors for the value of nVD, not to find the
number of rare signal sources. Thus, only MOCA is of major interest in finding l* rather
than MX-SVD in finding p*. In this case, the MX-SVD is not considered part
of MOCA.

4.3.2 Discussions of HFC Method and MOCA

While both the HFC method and MOCA take advantage of binary hypothesis
testing theory, several crucial differences between the HFC method and MOCA
need to be noted.
1. From a detection point of view, the HFC method is developed based on
Neyman–Pearson detection theory, as opposed to MOCA, which was derived
from Bayesian or maximum likelihood detection theory. As a result, the
HFC-estimated nVD is a function of the false alarm probability, P_F, compared
to the single fixed value of VD produced by MOCA.
2. For each spectral component the HFC method designs an NPD to determine
whether a spectrally distinct signature is present in the lth spectral component by
testing the difference between the correlation eigenvalue and the corresponding
covariance eigenvalue calculated from this particular spectral component,
λ̂_l − λ_l. By contrast, MOCA does not use singular values resulting from the
SVD in the same way the HFC method uses eigenvalues. Instead, it replaces
λ̂_l − λ_l with the maximum l² norm of singular vectors to perform binary
hypothesis testing to determine whether a given singular vector is a rare signal
source vector. Therefore, technically speaking, the HFC method can be considered
a first-order statistics method since it detects λ̂_l − λ_l resulting from the
sample mean of the lth spectral component. On the other hand, MOCA computes
the signal energy strength of a singular vector measured by the l² norm. Accordingly,
MOCA can be considered a 2OS method because singular vectors are
obtained by second-order statistics. For this reason, MOCA is more accurately
expressed as SVD-MOCA.
3. Interestingly, the HFC method can also be implemented in the same framework
as MOCA, with λ̂_l − λ_l replaced by the lth eigenvector using Neyman–Pearson
detection theory. The resulting HFC method, referred to as the principal
components analysis (PCA)-HFC method, is a 2OS HFC method because the
eigenvectors are obtained from PCA, which is formed by the 2OS sample
covariance matrix, and can be considered a counterpart of MOCA.
The HFC method makes no assumptions about the data set compared to MOCA,
which assumes data sample vectors represented by a linear form from which a
sequence of linear signal subspaces can be determined via orthogonal subspace
projection (OSP). Thus, basically, MOCA can be considered as belonging to the
same category as signal subspace estimation (SSE) (Bioucas-Dias and Nascimento
2005), hyperspectral signal subspace identification by minimum error (HySime)
(Bioucas-Dias and Nascimento 2008), and the LSMA-based methods in Chang
et al. (2010a, b), all of which take advantage of various linear data representations
to derive VD estimation techniques via OSP. As a consequence, all of these methods, including
MOCA, can be considered data representation-driven techniques. In contrast, the
HFC method only deals with data statistics and has nothing to do with data
representation. In this case, the HFC method can be considered a data
characterization-driven technique. More details regarding such taxonomy can be
found in Chang (2013).
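To make the data-characterization nature of the HFC method concrete, the sketch below computes the eigenvalue differences λ̂_l − λ_l on which its Neyman–Pearson tests operate. The synthetic data, the fixed threshold, and the resulting count are all illustrative assumptions; the actual HFC method thresholds each difference via a false alarm probability P_F rather than a fixed constant.

```python
import numpy as np

rng = np.random.default_rng(0)
L, N = 5, 1000
# Synthetic data: two "signal" directions plus low noise (purely illustrative)
S = rng.random((L, 2))
A = rng.random((2, N))
X = S @ A + 0.01 * rng.standard_normal((L, N))    # L x N data matrix

R = X @ X.T / N                                    # sample correlation matrix
mu = X.mean(axis=1, keepdims=True)
K = (X - mu) @ (X - mu).T / N                      # sample covariance matrix

corr_eigs = np.sort(np.linalg.eigvalsh(R))[::-1]   # lambda_hat_l, descending
cov_eigs = np.sort(np.linalg.eigvalsh(K))[::-1]    # lambda_l, descending
diffs = corr_eigs - cov_eigs                       # lambda_hat_l - lambda_l

# HFC declares a signal source in component l when the difference is
# significantly greater than zero (via a Neyman-Pearson threshold);
# here a fixed cutoff gives only a rough proxy for nVD.
nVD_proxy = int(np.sum(diffs > 0.01))
```

Since R = K + μμᵀ with μ the sample mean, every difference λ̂_l − λ_l is nonnegative, which is what the HFC test exploits.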

4.4 Finding Targets of Interest

While using binary hypothesis testing to determine the value of nVD is an elegant
idea, the signals considered under each hypothesis are not real signals. For example,
the HFC method and MOCA respectively use eigenvalues and the vector length of
eigenvectors/singular vectors to determine the value of nVD, but none of them is a
real signal. As noted in Sect. 4.3, the HFC method and MOCA may underestimate
VD because the first few very large eigenvalues may dominate the subsequent
eigenvalues, so that a single eigenvector may not suffice to specify one real target
signal. On some occasions different eigenvectors may even produce the same signal
sources and thus determine the same value of nVD. The concept of using target
pixels to estimate the value of nVD was further
investigated in Chang et al. (2014a, b) to extend the HFC method to HOS-based
HFC methods. This approach has proven to be very promising and effective at
unifying various VD estimation techniques in the same problem setting and formu-
lation. The TSVD presented in this chapter is developed to address this issue. In
what follows we will characterize targets according to their spectral statistics into
two categories, 2OS and HOS.

4.4.1 What Are Targets of Interest?

One great advantage of using a hyperspectral imaging sensor is its ability to
uncover subtle and relatively small targets, for example, endmembers and
anomalies. Targets of this type are generally unknown and very difficult to locate by
visual inspection owing to their limited spatial extent, or sometimes no spatial
information resulting from their presence at the subpixel level. To address this
issue, a new concept of spectral targets was recently introduced in Chang
et al. (2010a, b, 2011a, b, c, d) to differentiate them from spatial targets, which are generally
identified by their spatial properties such as size, shape, and texture. In traditional
image processing there are no spectral bands involved. Targets of interest are
generally defined and identified by their spatial properties such as size, shape, and
texture. Accordingly, the targets of this type are considered spatial targets. The
techniques developed to recognize such spatial targets are referred to as spatial
domain-based image processing techniques. On the other hand, owing to the use of
spectral bands specified by a range of wavelengths, a multispectral or hyperspectral
data sample is actually a vector expressed as a column vector, a sample of which in
a spectral band is produced by a particular wavelength. As a consequence, a single
hyperspectral sample vector already contains abundant spectral information pro-
vided by hundreds of contiguous spectral bands that can be used for data exploita-
tion. Such spectral information within a single data sample vector is called IBSI,
defined earlier. By virtue of such IBSI, two single data sample vectors can be
discriminated, classified, and identified via a spectral similarity measure, such as
the Spectral Angle Mapper (SAM). A target is called a spectral target if it is
analyzed based on its spectral properties characterized by IBSI(S), as opposed to
a spatial target, which is analyzed by interpixel spatial information provided by the
spatial correlation among sample pixels.
As an illustrative example, let S = {r_i}_{i=1}^N be a set of N data sample vectors, where r_i = (r_{i1}, r_{i2}, ..., r_{iL})^T is the ith data sample vector in S and L is the total number of spectral bands. The spectral correlation across all the spectral bands within r_i is defined and referred to as the IBSI of the signature r_i, denoted by IBSI(r_i). That is, IBSI(r_i) is provided by the spectral correlation among the L spectral values, {r_{ij}}_{j=1}^L, across the spectral bands within the single data sample vector r_i, such as the 2OS provided by IBSI(r_i): the autocorrelation of r_i, Σ_{j=1}^L r_{ij}², and the cross correlation of r_i, Σ_{j=1,k=1,j≠k}^L r_{ij} r_{ik}. However, what we are really interested in is the sample statistics provided by IBSI(r_i) among a set of data sample vectors, S = {r_i}_{i=1}^N, denoted by IBSI(S), specifically the 2OS of IBSI(S), such as a sample autocorrelation matrix of S, Σ_{i=1}^N r_i r_i^T, and a sample cross-correlation matrix of S, Σ_{i=1,j=1,i≠j}^N r_i r_j^T. It should be noted that IBSI(S) is independent of the intersample spatial correlation because IBSI(S) remains the same after the samples in S used to calculate IBSI(S) are reshuffled. This type of spectral information is the opposite of the sample spatial statistics commonly used in traditional image processing, which take into account the spatial correlation among a set of data sample vectors, where reshuffling the data sample vectors in a sample spatial correlation matrix can result in a different sample spatial correlation matrix because it alters the spatial correlation when the data sample vectors are rearranged in different spatial coordinates.
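The shuffle invariance of IBSI(S) is easy to verify numerically. The toy matrix below is an illustrative assumption; its columns play the role of the data sample vectors r_i.

```python
import numpy as np

rng = np.random.default_rng(1)
L, N = 4, 6
S = rng.random((L, N))                  # columns are data sample vectors r_i

# Sample autocorrelation matrix of S: sum_i r_i r_i^T
R = S @ S.T

# Reshuffle the samples (permute columns): IBSI(S) is unchanged,
# because it carries no intersample spatial information
perm = rng.permutation(N)
R_shuffled = S[:, perm] @ S[:, perm].T
```

The two matrices agree exactly, whereas any statistic that depends on where each sample sits in the image would change under the permutation.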
Two types of spectral target are generally of interest based on sample spectral
statistics, one characterized by second-order sample IBSI and the other by sample
IBSI of order higher than two, referred to as high-order IBSI. It should be noted that
the term IBSI(S) is defined as a sample spectral correlation resulting from a set of
data sample vectors specified by S. It is actually the size of S that is closely related to
how second-order IBSI and high-order IBSI are defined. In hyperspectral image analysis,
the spectral targets of interest in hyperspectral data exploitation are those that either
occur with low probability or have small populations when they are present. Such
spectral targets generally appear in small populations and occur with low probabilities,
for example, special species in agriculture and ecology, toxic wastes in
environmental monitoring, rare minerals in geology, drug/smuggler trafficking in
law enforcement, combat vehicles on the battlefield, landmines in war zones,
chemical/biological agents in bioterrorism, weapon concealment, and mass graves.
As a result, the sample size of data sample vectors specified by such a spectral
target, S, is relatively small and can generally be considered as being composed of
insignificant objects because of their very limited spatial information, but they are
actually critical and crucial for defense and intelligence analysis owing to the fact
that they are generally hard to identify by visual inspection. From a statistical point
of view, they are insignificant compared to targets with large sample pools and the
spectral information statistics of such special targets can be captured not by second-
order IBSI but rather by high-order IBSI.

4.4.2 Second-Order-Statistics (2OS)-Specified Target VD

Since most VD estimation techniques reported in the literature are developed based
on eigendecomposition, one common drawback of these methods is that the value of
nVD is generally determined by eigenvalues or eigenvectors/singular vectors,
none of which represents real target signal sources or energies. In this case,
even though the number of spectrally distinct signatures can be estimated by VD,
we still do not know what these spectrally distinct signatures are. In this section we
present two techniques that generate 2OS target signal sources to be used to
estimate the value of nVD: (1) the LSMA-based techniques developed in Chang
et al. (2010a, b) and Paylor (2014), where particular algorithms are involved in
finding real signal sources used to unmix data in the LSE sense, and (2) the ATGP/
MOCA developed in Chang et al. (2011a, b, c, d), where the ATGP developed in
Ren and Chang (2003) was used to generate real target sample vectors to replace the
singular vectors used in MOCA.

4.4.2.1 OSP-Specified Targets

Assume that m_1, m_2, ..., m_p are the spectral signatures used to unmix the data sample vectors. Let L be the number of spectral bands and r an L-dimensional data sample vector that can be modeled as a linear combination of m_1, m_2, ..., m_p with appropriate abundance fractions specified by α_1, α_2, ..., α_p. More precisely, r is an L × 1 column vector and M is an L × p target spectral signature matrix, denoted by [m_1 m_2 ⋯ m_p], where m_j is an L × 1 column vector represented by the spectral signature of the jth target resident in the pixel vector r. Let α = (α_1, α_2, ..., α_p)^T be a p × 1 abundance column vector associated with r, where α_j denotes the fraction of the jth target signature m_j present in the pixel vector r. A classical approach to solving a mixed pixel classification problem is linear unmixing, which assumes that the spectral signature of the pixel vector r is linearly mixed by m_1, m_2, ..., m_p, referred to as an LMM, as follows:

r = Mα + n,   (4.17)

where n is noise or can be interpreted as a measurement or model error.


Equation (4.17) represents a standard signal detection model where Mα is a desired signal vector to be detected and n is corrupting noise. Since we are interested in detecting one target at a time, we can divide the set of p target signatures, m_1, m_2, ..., m_p, into a desired target, say m_p, and a class of undesired target signatures, m_1, m_2, ..., m_{p−1}. In this case, a logical approach is to eliminate the effects caused by the undesired targets m_1, m_2, ..., m_{p−1}, which are considered interferers to m_p, before the detection of m_p takes place. With annihilation of the undesired target signatures, the detectability of m_p can therefore be enhanced. In doing so, we first separate m_p from m_1, m_2, ..., m_p in M and rewrite (4.17) as

r = dα_p + Uγ + n,   (4.18)

where d = m_p is the desired spectral signature of m_p and U = [m_1 m_2 ⋯ m_{p−1}] is the undesired target spectral signature matrix made up of m_1, m_2, ..., m_{p−1}, the spectral signatures of the remaining p − 1 undesired targets. Using (4.18) we can design an OSP to annihilate U from the pixel vector r prior to the detection of m_p. One such desired orthogonal subspace projector is the OSP projector derived by Harsanyi and Chang (1994), given by

P_U^⊥ = I − UU^#,   (4.19)

where U^# = (U^T U)^{−1} U^T is the pseudo-inverse of U. The notation ⊥ in P_U^⊥ indicates that the projector P_U^⊥ maps the observed pixel vector r into the orthogonal complement of <U>, denoted by <U>^⊥. Applying P_U^⊥ to (4.18) yields

P_U^⊥ r = P_U^⊥ dα_p + P_U^⊥ Uγ + P_U^⊥ n = α_p P_U^⊥ d + P_U^⊥ n.   (4.20)

[Fig. 4.1 Diagram of linear detection system δ_LDS(r) designed for (4.20): the input x = P_U^⊥ r produces the output y = w^T x = α_p w^T P_U^⊥ d + w^T P_U^⊥ n.]
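A small numerical sketch, with hypothetical signatures, confirms the annihilation property behind (4.19) and (4.20): P_U^⊥ U = 0, so only the desired-signature component (plus projected noise) survives the projection.

```python
import numpy as np

rng = np.random.default_rng(2)
L, p = 6, 3
M = rng.random((L, p))                  # signatures m_1..m_p (illustrative)
d, U = M[:, -1:], M[:, :-1]             # desired d = m_p, undesired U

# (4.19): P_U_perp = I - U U#, with U# = (U^T U)^{-1} U^T the pseudo-inverse
U_pinv = np.linalg.inv(U.T @ U) @ U.T
P_U_perp = np.eye(L) - U @ U_pinv

# A mixed pixel following (4.18): r = d*alpha_p + U*gamma + n
alpha_p, gamma = 0.4, np.array([[0.3], [0.3]])
n = 0.001 * rng.standard_normal((L, 1))
r = d * alpha_p + U @ gamma + n

# (4.20): the projector annihilates U, leaving alpha_p * P_U_perp d
# plus only the projected noise term
residual = P_U_perp @ r - alpha_p * (P_U_perp @ d)
```

The residual equals P_U^⊥ n, whose norm is bounded by the (small) noise norm, which is precisely why annihilating U enhances the detectability of d.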

To detect the signal P_U^⊥ d with signal strength α_p in (4.20), we need to design a linear detection system (LDS), δ_LDS(r) (Fig. 4.1), operating on the input x = P_U^⊥ r, with the output specified by y = w^T x via a weighting vector w, where the signal-to-noise ratio (SNR) can be defined as

SNR(w) = (w^T α_p P_U^⊥ d)² / E[(w^T P_U^⊥ n)²]
       = α_p² (w^T P_U^⊥ d)(d^T P_U^⊥ w) / (w^T P_U^⊥ E[n n^T] P_U^⊥ w)
       = α_p² (w^T P_U^⊥ d)(d^T P_U^⊥ w) / (σ² w^T P_U^⊥ P_U^⊥ w)
       = (α_p²/σ²) (w^T P_U^⊥ d d^T P_U^⊥ w) / (w^T P_U^⊥ w),   (4.21)

where E(·) denotes expectation, (P_U^⊥)² = P_U^⊥, and (P_U^⊥)^T = P_U^⊥.
There are two approaches to maximizing (4.21) over w. One is from a pattern classification point of view, formulating (4.21) as a generalized eigenvalue problem as follows (Duda and Hart 1973):

λ(w) = SNR(w) = (α_p²/σ²) (w^T P_U^⊥ d d^T P_U^⊥ w) / (w^T P_U^⊥ w),   (4.22)

which turns out to be the well-known Fisher's ratio or Rayleigh quotient, a criterion commonly used for linear Fisher discriminant analysis (Duda and Hart 1973). As a result, solving (4.22) is equivalent to solving

(P_U^⊥)^{−1} P_U^⊥ d d^T P_U^⊥ w = λ(w) w.   (4.23)

Specifically, maximizing (4.22) over w is equivalent to finding the maximal eigenvalue of (4.23). Since the rank of the matrix d d^T P_U^⊥ is one, the matrix has only one nonzero eigenvalue, which is also the maximal eigenvalue, λ_max. If we substitute P_U^⊥ d for w in (4.23) and multiply by P_U^⊥ on both sides of (4.23), we can obtain λ_max = d^T P_U^⊥ d, which is exactly the maximal SNR derived in (4.29). By means of (4.23), a linear optimal signal detector for (4.18), denoted by δ_OSP(r), was developed in Harsanyi and Chang (1994) as well as Chang (2003a, b) and is given by

δ_OSP(r) = d^T P_U^⊥ r.   (4.24)
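The behavior of the detector (4.24) can be illustrated with hypothetical signatures: a pixel containing d yields a positive score, while a pure mixture of the undesired signatures scores (numerically) zero, since U is annihilated.

```python
import numpy as np

rng = np.random.default_rng(3)
L = 6
d = rng.random((L, 1))                          # desired signature (illustrative)
U = rng.random((L, 2))                          # undesired signatures
# (4.19): orthogonal subspace projector annihilating U
P_U_perp = np.eye(L) - U @ np.linalg.inv(U.T @ U) @ U.T

def delta_OSP(r):
    """OSP detector (4.24): d^T P_U_perp r."""
    return (d.T @ P_U_perp @ r).item()

# A pixel containing the target scores higher than a pure-background pixel
r_target = 0.5 * d + U @ np.array([[0.25], [0.25]])
r_background = U @ np.array([[0.5], [0.5]])
score_t, score_b = delta_OSP(r_target), delta_OSP(r_background)
```

Here score_t = 0.5 d^T P_U^⊥ d > 0, while score_b vanishes because r_background lies entirely in <U>.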

A second approach to maximizing (4.21) is from a signal detection point of view via Schwarz's inequality, where the SNR in (4.22) can be treated as a deflection detection criterion. That is,

|w^T P_U^⊥ d| ≤ ‖w‖ ‖P_U^⊥ d‖,   (4.25)

where equality holds ⇔ w = κ P_U^⊥ d for some constant κ. In this case, (4.21) can be reexpressed as
SNR(w) = (α_p²/σ²) (w^T P_U^⊥ d d^T P_U^⊥ w) / (w^T P_U^⊥ w)
       = (α_p²/σ²) |w^T P_U^⊥ d|² / ‖P_U^⊥ w‖²
       = (α_p²/σ²) |w̃^T P_U^⊥ d̃|²,  where w̃ = w/‖P_U^⊥ w‖^{1/2} and d̃ = d/‖P_U^⊥ w‖^{1/2},
       ≤ (α_p²/σ²) ‖w̃‖² ‖P_U^⊥ d̃‖²,   (4.26)

where

equality ⇔ w̃ = κ P_U^⊥ d̃ ⇔ w/‖P_U^⊥ w‖^{1/2} = κ P_U^⊥ d/‖P_U^⊥ w‖^{1/2} ⇔ w = κ P_U^⊥ d.   (4.27)

By virtue of (4.27), the optimal weight vector w* solving (4.26) is given by

w* = κ P_U^⊥ d.   (4.28)

Using the w* obtained in (4.28) to specify the system δ_LDS(r) in Fig. 4.1 yields the maximal SNR, given by

SNR(w*) = (α_p²/σ²)(d^T P_U^⊥ d),   (4.29)

and the optimal signal detection system is given by

δ_LDS(r) = y = (w*)^T P_U^⊥ r = κ (P_U^⊥ d)^T P_U^⊥ r = κ d^T P_U^⊥ r,   (4.30)

which is exactly the same as the OSP detector given by (4.24). Nevertheless, it should be noted that (4.28) gives the same maximal SNR in (4.29) regardless of the scaling constant κ because (4.26) is independent of κ.
The idea of (4.19) provides a clue as to how to design the following unsupervised OSP (UOSP) algorithm to find targets of interest through successive operations of (4.19) on a growing undesired target signal matrix U (Wang et al. 2002). It should also be noted that the ATGP developed by Ren and Chang (2003) works by the same principle as UOSP.

UOSP Algorithm
1. Initial condition:
Let ε be a prescribed error threshold and t_0 a data sample vector with maximal vector length, that is, t_0 = arg max_r {‖r‖}, where r is a data sample vector and ‖r‖² = r^T r. Set k = 0, and use (4.19) to define P_{t_0}^⊥ = I − t_0(t_0^T t_0)^{−1} t_0^T, with U_0 = [t_0].
2. Let k ← k + 1, apply P_{U_{k−1}}^⊥ = I − U_{k−1}(U_{k−1}^T U_{k−1})^{−1} U_{k−1}^T via (4.19), with U_{k−1} = [t_0 t_1 ⋯ t_{k−1}], to all image pixels r in the image, and find the kth target t_k generated at the kth stage that has the maximum orthogonal projection as follows:

t_k = arg max_r [(P_{U_{k−1}}^⊥ r)^T (P_{U_{k−1}}^⊥ r)].   (4.31)

3. If m_OSP(t_k) > ε, where m_OSP(·) is a well-defined OSP measure, then go to step 2. Otherwise, the algorithm is terminated. At this point, all the generated target pixels t_0, t_1, ..., t_{k−1} are considered the desired targets.

4.4.2.2 Least-Squares-Specified Targets

Unsupervised Least-Squares OSP Method

Originally, the OSP specified by (4.20) was designed to detect the abundance fraction of the desired signature d, not to unmix d, since it made use of the SNR to perform signal detection, not signal estimation. To address this issue, Settle (1996) derived a maximum likelihood estimate of α_p, α̂_p(r), corresponding to the signature m_p = d, which was also shown later by Tu et al. (1997) to be a least-squares OSP (LSOSP), given by

δ_LSOSP(r) = d^T P_U^⊥ r / (d^T P_U^⊥ d),   (4.32)

which is identical to (4.30) up to the constant (d^T P_U^⊥ d)^{−1}, included to account for the estimation error caused by the OSP. By virtue of (4.32), we can develop an unsupervised LSOSP (ULSOSP) algorithm to find abundance-constrained least-squares-specified targets.

ULSOSP Algorithm
1. Initial condition:
Select ε to be a prescribed error threshold, let t_0 = arg max_r {r^T r}, where r runs over all image pixel vectors, and set k = 0.
2. Let LSE^{(0)}(r) = (r − α̂_0^{(1)}(r) t_0)^T (r − α̂_0^{(1)}(r) t_0), and check whether max_r LSE^{(0)}(r) < ε. If so, the algorithm is terminated; otherwise, continue.
3. Let k ← k + 1 and find t_k = arg max_r {LSE^{(k−1)}(r)}.
4. Apply the LSOSP method with the signature matrix M^{(k)} = [t_0 t_1 ⋯ t_{k−1}] to estimate the abundance fractions of t_0, t_1, ..., t_{k−1}: α̂_1^{(k)}(r), α̂_2^{(k)}(r), ..., α̂_{k−1}^{(k)}(r).
5. Find the kth maximum LSE, defined by

max_r {LSE^{(k)}(r)} = max_r [(r − Σ_{j=1}^{k−1} α̂_j^{(k)}(r) t_j)^T (r − Σ_{j=1}^{k−1} α̂_j^{(k)}(r) t_j)].   (4.33)

6. If max_r LSE^{(k)}(r) < ε, the algorithm is terminated; otherwise, go to step 3.
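Under the assumption that a target count replaces the error threshold ε as the stopping rule (as TSVD suggests later in the chapter), the ULSOSP loop can be sketched as follows. Here `np.linalg.lstsq` plays the role of the unconstrained LS/LSOSP abundance estimate, and the data matrix and all names are illustrative.

```python
import numpy as np

def ulsosp_targets(X, n_targets):
    """Sketch of the ULSOSP loop: grow a target set by picking, at each
    stage, the pixel with the largest least-squares reconstruction error.
    X is an L x N matrix of pixel vectors; a fixed target count replaces
    the error threshold epsilon as the stopping rule."""
    # Step 1: t0 is the pixel with maximal r^T r
    t_idx = [int(np.argmax(np.sum(X * X, axis=0)))]
    for _ in range(1, n_targets):
        M = X[:, t_idx]                             # M^(k) = [t0 ... t_{k-1}]
        # Unconstrained LS abundances for every pixel at once
        alpha, *_ = np.linalg.lstsq(M, X, rcond=None)
        residual = X - M @ alpha
        lse = np.sum(residual * residual, axis=0)   # per-pixel LSE, as in (4.33)
        t_idx.append(int(np.argmax(lse)))           # next target t_k
    return t_idx

# Toy example: three distinct columns (one repeated); the loop should pick
# the three distinct ones
X = np.array([[1.0, 0.0, 0.0, 1.0],
              [0.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 1.5, 0.0]])
targets = ulsosp_targets(X, 3)
```

On this toy matrix the loop first picks the brightest pixel (column 1), then the two columns that the growing signature set reconstructs worst.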

Unsupervised Nonnegativity Constrained Least-Squares Method

An alternative to solving the problem specified by (4.17) is a classical approach, the least-squares estimation, given by

α̂^LS(r) = (M^T M)^{−1} M^T r,   (4.34)

where α̂^LS(r) = (α̂_1^LS(r), α̂_2^LS(r), ..., α̂_p^LS(r))^T and α̂_j^LS(r) is the least-squares (LS) solution estimated for the abundance fraction of the jth substance signature m_j from the data sample vector r. Here the data sample vector is included to emphasize that the abundance estimate is determined by r. As shown in Chang (2013), it turns out that the LS solution α̂_j^LS(r) is exactly the same solution as that obtained by the LSOSP developed above with d = m_j.
Unfortunately, the previously described LSOSP and LSE above are abundance-unconstrained algorithms. To reflect real-world problems, two physical constraints, the abundance sum-to-one constraint (ASC), Σ_{j=1}^p α_j = 1, and the abundance nonnegativity constraint (ANC), α_j ≥ 0 for all 1 ≤ j ≤ p, can be imposed on (4.17). When only the ANC is imposed on (4.17), LSOSP can be extended to the nonnegativity constrained least-squares (NCLS) method developed by Chang and Heinz (2000) (see details in Sect. 9.2.2). Moreover, to use NCLS to find targets of interest in an unsupervised manner, an unsupervised version, called unsupervised NCLS (UNCLS), was further developed in Chang (2013) and Chang (2016), described as follows.

UNCLS Algorithm
1. Initial condition:
Select ε to be a prescribed error threshold, let t_0 = arg max_r {r^T r}, where r runs over all image pixel vectors, and set k = 0.
2. Let LSE^{(0)}(r) = (r − α̂_0^{(1)}(r) t_0)^T (r − α̂_0^{(1)}(r) t_0), and check whether max_r LSE^{(0)}(r) < ε. If so, the algorithm is terminated; otherwise, continue.
3. Let k ← k + 1, and find t_k = arg max_r {LSE^{(k−1)}(r)}.
4. Apply the NCLS method with the signature matrix M^{(k)} = [t_0 t_1 ⋯ t_{k−1}] to estimate the abundance fractions of t_0, t_1, ..., t_{k−1}: α̂_1^{(k)}(r), α̂_2^{(k)}(r), ..., α̂_{k−1}^{(k)}(r).
5. Find the kth maximum LSE, defined by

max_r {LSE^{(k)}(r)} = max_r [(r − Σ_{j=1}^{k−1} α̂_j^{(k)}(r) t_j)^T (r − Σ_{j=1}^{k−1} α̂_j^{(k)}(r) t_j)].   (4.33)

6. If max_r LSE^{(k)}(r) < ε, the algorithm is terminated; otherwise, go to step 3.
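A minimal sketch of the UNCLS building block follows. Since the chapter defers the actual active-set NCLS algorithm of Chang and Heinz (2000) to Sect. 9.2.2, the projected-gradient solver below is only a stand-in that enforces the ANC; the mixing matrix and abundances are illustrative.

```python
import numpy as np

def nnls_pg(M, r, n_iter=500):
    """Minimal projected-gradient solver for the NCLS problem
    min ||r - M a||^2 subject to a >= 0 (ANC only). A stand-in for the
    active-set NCLS algorithm of Chang and Heinz (2000)."""
    a = np.zeros(M.shape[1])
    step = 1.0 / np.linalg.norm(M.T @ M, 2)      # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = M.T @ (M @ a - r)
        a = np.maximum(a - step * grad, 0.0)     # gradient step + ANC projection
    return a

# Toy check: mixing two signatures with true abundances (0.7, 0.2)
M = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
r = M @ np.array([0.7, 0.2])
a_hat = nnls_pg(M, r)
```

Because the problem is convex and the unconstrained optimum is already nonnegative here, the iteration recovers the true abundances; in general the projection actively clips signatures that would otherwise receive negative fractions.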

Unsupervised Fully Constrained Least-Squares Method

NCLS is a partially abundance-constrained method that imposes only the ANC on (4.17). If we further impose both constraints, ANC and ASC, on (4.17), NCLS can be extended to become a fully constrained least-squares (FCLS) method, as in Heinz and Chang (2001) (see details in Sect. 9.2.3). In analogy with UNCLS, an unsupervised version of FCLS, called unsupervised FCLS (UFCLS), was also developed in Chang (2013, 2016), with its implementation given in what follows.

UFCLS Algorithm
1. Initial condition:
Select ε to be a prescribed error threshold, let t_0 = arg max_r {r^T r}, where r runs over all image pixel vectors, and set k = 0.
2. Let LSE^{(0)}(r) = (r − α̂_0^{(1)}(r) t_0)^T (r − α̂_0^{(1)}(r) t_0), and check whether max_r LSE^{(0)}(r) < ε. If so, the algorithm is terminated; otherwise, continue.
3. Let k ← k + 1, and find t_k = arg max_r {LSE^{(k−1)}(r)}.
4. Apply the FCLS method with the signature matrix M^{(k)} = [t_0 t_1 ⋯ t_{k−1}] to estimate the abundance fractions of t_0, t_1, ..., t_{k−1}: α̂_1^{(k)}(r), α̂_2^{(k)}(r), ..., α̂_{k−1}^{(k)}(r).
5. Find the kth maximum LSE, defined by

max_r {LSE^{(k)}(r)} = max_r [(r − Σ_{j=1}^{k−1} α̂_j^{(k)}(r) t_j)^T (r − Σ_{j=1}^{k−1} α̂_j^{(k)}(r) t_j)].   (4.33)

6. If max_r LSE^{(k)}(r) < ε, the algorithm is terminated; otherwise, go to step 3.
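A common way to realize FCLS in practice, following the augmentation idea of Heinz and Chang (2001), is to append a sum-to-one row to M and r and solve the resulting ANC-only problem. The solver and the weight `delta` below are illustrative stand-ins rather than the book's implementation (Sect. 9.2.3).

```python
import numpy as np

def nnls_pg(M, r, n_iter=5000):
    # Minimal projected-gradient NNLS (stand-in for the NCLS solver)
    a = np.zeros(M.shape[1])
    step = 1.0 / np.linalg.norm(M.T @ M, 2)
    for _ in range(n_iter):
        a = np.maximum(a - step * (M.T @ (M @ a - r)), 0.0)
    return a

def fcls(M, r, delta=0.1):
    """FCLS via the augmentation used by Heinz and Chang (2001): append a
    sum-to-one row so that NNLS on the augmented system enforces the ASC
    (with strength controlled by delta) on top of the ANC."""
    L, p = M.shape
    M_aug = np.vstack([delta * M, np.ones((1, p))])
    r_aug = np.append(delta * r, 1.0)
    return nnls_pg(M_aug, r_aug)

# Toy check: a fully constrained mixture with true abundances (0.3, 0.7)
M = np.array([[0.9, 0.1],
              [0.2, 0.8],
              [0.4, 0.6]])
r = M @ np.array([0.3, 0.7])
a_hat = fcls(M, r)
```

Smaller `delta` weights the ASC more heavily relative to the data-fit term; the recovered abundances are nonnegative and sum (approximately) to one.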


The three aforementioned unsupervised LS-based algorithms, ULSOSP,
UNCLS, and UFCLS, can be considered unsupervised linear spectral mixture
analysis (ULSMA) techniques. When they are implemented, a prescribed error ε,
which is determined by various applications, is required to terminate the algo-
rithms. In general, this is done by visual inspection on a trial-and-error basis and is
not practical for our purposes. Therefore, instead of using ε as a stopping rule, TSVD provides an alternative rule that determines how many targets our designed ULSMA algorithms are required to generate.

4.4.2.3 ATGP-Specified Targets

Despite the fact that the OSP specified by (4.20) cannot unmix signatures accu-
rately, it can indeed be used to produce 2OS targets. ATGP is such an algorithm that
uses the OSP concept to find LS targets.
Let X be a data matrix formed by the data sample vectors {r_i}_{i=1}^N, that is, X = [r_1 r_2 ⋯ r_N]. Define the norm of the data matrix X by

‖X‖ = max_{1≤i≤N} ‖r_i‖,   (4.35)

where ‖r_i‖ is the length of the vector r_i = (r_{i1}, r_{i2}, ..., r_{iL})^T, defined by ‖r_i‖² = Σ_{l=1}^L r_{il}². Assume that i* = arg max_{1≤i≤N} {‖r_i‖}. The norm of the matrix X in (4.35) can then be further expressed by

‖X‖ = ‖r_{i*}‖,   (4.36)

which is exactly the norm of the brightest pixel r_{i*}, the pixel with the maximum vector length. That is, the maximum l² norm defined by (4.35) is indeed the maximum pixel vector length, corresponding to the brightest pixel vector in the data set.
Using (4.36), ATGP performs a sequence of OSPs by repeatedly applying the orthogonal projector P_U^⊥ = I − UU^# specified by (4.19). Thus, if X is the original hyperspectral image cube, ATGP first selects an initial target pixel t_0^ATGP that yields the norm of the space X, ‖t_0^ATGP‖ = ‖X‖ via (4.35), and then projects the space X into a subspace orthogonal to <t_0^ATGP> via P_{U_0}^⊥ with U_0 = [t_0^ATGP]. Let the resulting subspace be denoted by X_1 = <t_0^ATGP>^⊥. Next, ATGP selects the target pixel t_1^ATGP that yields the norm of the space X_1, ‖t_1^ATGP‖ = ‖X_1‖ via (4.35), and then projects the space X into a subspace orthogonal to <t_0^ATGP, t_1^ATGP> via P_{U_1}^⊥ with U_1 = [t_0^ATGP t_1^ATGP]; the resulting subspace is denoted by X_2 = <t_0^ATGP, t_1^ATGP>^⊥. The same procedure is repeated until a stopping rule is satisfied, that is, until the number of target pixels that ATGP must extract is reached. The details of its algorithmic implementation are given in what follows.

Automatic Target Generation Process

1. Initial condition:
Let ε be a prescribed error threshold and t_0 a pixel with the brightest intensity value, that is, the maximal gray level value. Set k = 0.
2. Let k ← k + 1, apply P_{t_0}^⊥ via (4.19) with U = [t_0] to all image pixels r in the image, and find the kth target t_k generated at the kth stage that has the maximum orthogonal projection as follows:

t_k = arg max_r [(P_{U_{k−1}}^⊥ r)^T (P_{U_{k−1}}^⊥ r)].   (4.37)

3. If m(t_{k−1}, t_k) > ε, where m(·, ·) can be any target discrimination measure, then go to step 2. Otherwise, the algorithm is terminated. At this point, all the generated target pixels t_0, t_1, ..., t_{k−1} are considered the desired targets. It should be noted that in Ren and Chang (2003) the m(·, ·) was called the Orthogonal Projection Correlation Index (OPCI), defined as

OPCI = t_0^T P_{U_{k−1}}^⊥ t_0,   (4.38)

where U_{k−1} = [t_0 t_1 ⋯ t_{k−1}] and <U_{k−1}> is the space linearly spanned by the k targets {t_l}_{l=0}^{k−1}.

As shown in step 3, ATGP is terminated by a threshold ε that is preselected empirically prior to algorithm implementation. In many real applications, such a selection may be subjective, and an appropriate threshold selection may require some prior knowledge. Interestingly, selecting an appropriate ε is equivalent to determining an appropriate number of targets for ATGP to generate. Now, if we rewrite (4.37) as

t_k = arg max_r {‖P_{U_{k−1}}^⊥ r‖²},   (4.39)

where (P_{U_{k−1}}^⊥ r)^T (P_{U_{k−1}}^⊥ r) = r^T P_{U_{k−1}}^⊥ r = ‖P_{U_{k−1}}^⊥ r‖², then finding t_k in (4.39) is equivalent to finding the data sample vector that yields the maximal leakage into the hyperplane specified by P_{U_{k−1}}^⊥, which turns out to be the same t_k as well. This implies that ATGP can be terminated by estimating the VD with target signal sources specified by the ATGP-generated targets {t_k}.
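The ATGP loop in steps 1-3 can be sketched as follows; a fixed target count stands in for the OPCI threshold ε, and the toy data matrix is an illustrative assumption.

```python
import numpy as np

def atgp(X, n_targets):
    """Sketch of ATGP: repeatedly find the pixel with maximal residual
    norm after orthogonally projecting away all previously found targets.
    X is an L x N matrix; a fixed target count stands in for the OPCI
    threshold epsilon as the stopping rule."""
    targets = [int(np.argmax(np.sum(X * X, axis=0)))]    # brightest pixel t0
    for _ in range(1, n_targets):
        U = X[:, targets]                                # U_{k-1} = [t0 ... t_{k-1}]
        P = np.eye(X.shape[0]) - U @ np.linalg.pinv(U)   # (4.19): I - U U#
        proj = P @ X
        # (4.37)/(4.39): pixel with maximal leakage ||P r||^2
        targets.append(int(np.argmax(np.sum(proj * proj, axis=0))))
    return targets

# Toy cube (columns are pixels); the brightest pixel and the two most
# orthogonal pixels should be found, skipping the duplicate last column
X = np.array([[3.0, 0.0, 0.1, 3.0],
              [0.0, 2.0, 0.1, 0.0],
              [0.0, 0.0, 2.5, 0.0]])
found = atgp(X, 3)
```

The duplicate of the brightest pixel (column 3) is never selected, because its projection onto <U_{k−1}>^⊥ is zero once column 0 has been extracted.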

4.4.3 HOS-Specified Target VD

First, we describe a practical means of generating spectral targets that can be specified by any arbitrary order of statistics. Since the targets specified by the first two orders of statistics can be found by ATGP, only targets specified by HOS are of major interest. In this case, data sphering is performed to remove the first two orders of statistics. Assume that there are N data points {x_i}_{i=1}^N, each of which has dimensionality L, and that X = [x_1 x_2 ⋯ x_N] is an L × N data matrix formed by {x_i}_{i=1}^N. Let w be an L-dimensional column vector assumed to be a desired projection vector (PV). Then z = w^T X = (z_1, z_2, ..., z_N) is a 1 × N row vector that represents the projection of the data {x_i}_{i=1}^N mapped along the direction of w, where "T" denotes the transpose of a vector or matrix. Now assume that F(·) is a function to be explored, defined on the projection space z = w^T X. The selection of the function F depends upon the particular application.
To set up a general framework, we assume that the function F(·) is specified by any kth-order statistic, κ_k, with the kth central moment defined by

F(z_i) = κ_k(z_i) = E[(z_i − μ)^k]/σ^k = E[(w^T x_i − μ)^k]/σ^k   (4.40)

for each i = 1, 2, ..., N, where the μ and σ in (4.40) are the mean and standard deviation of the random variable z_i, respectively. Since small targets can be characterized by data sample vectors that cause a maximum magnitude of asymmetry and ripples in a distribution, finding a PV w that maximizes (4.40) is equivalent to finding the direction with which these pixels are most likely aligned. By projecting all data samples {x_i}_{i=1}^N onto the PV w, the desired small targets can be detected as those pixels that yield the largest projection along the direction of w.
Following the same treatment developed in Ren et al. (2006), an iterative algorithm can be designed to find a projector that solves the following constrained optimization problem with the optimal criterion specified by (4.40):

max_w {(1/N) Σ_{i=1}^N z_i^k} = max_w {(1/N) Σ_{i=1}^N w^T y_i (y_i^T w)^{k−2} y_i^T w}  subject to w^T w = 1,   (4.41)

where z_i is the projection resulting from the sphered data sample y_i via the PV w. The constraint w^T w = 1 is used for normalization so that the kth central moment of the resulting data after the projection is not affected by the magnitude of w. Using a Lagrange multiplier to solve (4.41) results in solving the equation

[E(y_i (y_i^T w)^{k−2} y_i^T) − λ_0 I] w = 0,   (4.42)

so that w is an eigenvector of E[y_i (y_i^T w)^{k−2} y_i^T]. Using the eigendecomposition, (4.42) can be reduced to

w^T E[y_i (y_i^T w)^{k−2} y_i^T] w = λ_0   (4.43)

because w^T w = 1, and (4.43) can be further simplified to

E[w^T y_i (y_i^T w)^{k−2} y_i^T w] = E[(y_i^T w)^k] = E[z_i^k] = λ_0,   (4.44)

which turns out to be the kth central moment of z = (w*)^T Y. For example, (4.42) reduces to

[E(y_i y_i^T w y_i^T) − λ_0 I] w = 0  for k = 3,   (4.45)
[E(y_i y_i^T w w^T y_i y_i^T) − λ_0 I] w = 0  for k = 4.   (4.46)

It should be noted that solving (4.42) yields only a single PV w*. To find more PVs, a sequence of OSPs must be performed. In doing so, when a PV w*_1 is found, the decorrelated data Y are mapped into the linear subspace <w*_1>^⊥ orthogonal to <w*_1>, the space linearly spanned by w*_1. The next PV w*_2 is then found by solving (4.42) in the space <w*_1>^⊥. The same process is continued until a stopping criterion is satisfied, such as a predetermined number of projections to be generated. The resulting procedure is called the kth central moment projection vector generation algorithm (PVGA) and can be described in detail as follows.

kth Central Moment Projection Vector Generation Algorithm

1. Sphere the original data set X. The resulting data set is denoted by Y.
2. Find the first PV $\mathbf{w}_1^*$ by solving (4.42) for the optimal PV that maximizes the kth normalized central moment.
3. Using the found $\mathbf{w}_1^*$, generate the first projection image $Z^1=\left(\mathbf{w}_1^*\right)^T\mathbf{Y}=\left\{z_i^1\mid z_i^1=\left(\mathbf{w}_1^*\right)^T\mathbf{y}_i\right\}$, which can be used to detect the first type of anomaly.
4. Apply the OSP specified by $P_{\mathbf{w}_1^*}^{\perp}=\mathbf{I}-\mathbf{w}_1^*\left[\left(\mathbf{w}_1^*\right)^T\mathbf{w}_1^*\right]^{-1}\left(\mathbf{w}_1^*\right)^T$ to the data set Y to produce the first OSP-projected data set, denoted by $\mathbf{Y}^1=P_{\mathbf{w}_1^*}^{\perp}\mathbf{Y}$.
5. Use the data set $\mathbf{Y}^1$ to find the second PV $\mathbf{w}_2^*$ by solving (4.42).
6. Apply $P_{\mathbf{w}_2^*}^{\perp}=\mathbf{I}-\mathbf{w}_2^*\left[\left(\mathbf{w}_2^*\right)^T\mathbf{w}_2^*\right]^{-1}\left(\mathbf{w}_2^*\right)^T$ to $\mathbf{Y}^1$ to produce the second OSP-projected data set, $\mathbf{Y}^2=P_{\mathbf{w}_2^*}^{\perp}\mathbf{Y}^1$, which can be used to produce the third PV $\mathbf{w}_3^*$ by solving (4.42). Equivalently, define the projection matrix $\mathbf{W}^2=\left[\mathbf{w}_1^*\ \mathbf{w}_2^*\right]$ and apply $P_{\mathbf{W}^2}^{\perp}=\mathbf{I}-\mathbf{W}^2\left[\left(\mathbf{W}^2\right)^T\mathbf{W}^2\right]^{-1}\left(\mathbf{W}^2\right)^T$ to the original sphered data set Y to obtain $\mathbf{Y}^2=P_{\mathbf{W}^2}^{\perp}\mathbf{Y}$.
7. Repeat steps 5 and 6 to produce $\mathbf{w}_3^*,\ldots,\mathbf{w}_p^*$ until the predetermined number of PVs is reached.
Note that the implementation of step 2 in the preceding algorithm is not trivial. To solve (4.42) for the optimal PV $\mathbf{w}_1^*$, the following iterative procedure is proposed to execute step 2.

(a) Initialize a random PV $\mathbf{w}_1^{(0)}$, and set $n=0$.
(b) Calculate the matrix $E\left[\mathbf{y}_i\left(\mathbf{y}_i^T\mathbf{w}_1^{(n)}\right)^{k-2}\mathbf{y}_i^T\right]$, and find an eigenvector $\mathbf{v}_1^{(n)}$ corresponding to the eigenvalue of largest magnitude of this matrix.
(c) If the Euclidean distances $\left\|\mathbf{w}_1^{(n)}-\mathbf{v}_1^{(n)}\right\|>\varepsilon$ and $\left\|\mathbf{w}_1^{(n)}+\mathbf{v}_1^{(n)}\right\|>\varepsilon$ (both must be checked, since the sign of an eigenvector is arbitrary), then let $\mathbf{w}_1^{(n+1)}=\mathbf{v}_1^{(n)}$ and $n\leftarrow n+1$, and go to step (b). Otherwise, $\mathbf{w}_1^{(n)}$ is the desired PV: let $\mathbf{w}_1^*=\mathbf{w}_1^{(n)}$, and return to step 3 of the kth central moment PVGA.
According to the kth central moment PVGA, it is worth noting that, like eigen-
vectors and singular vectors, the kth moment PVGA-generated PVs are only feature
vectors that represent the target signal directions. They are not necessarily real target
signal vectors. Thus, to find a real target signal vector, we can project the entire data
set on each of the feature vectors to find a data sample vector that yields the maximum
projection along the feature vector. This found data sample vector is selected as the
desired real target signal vector that represents this particular feature vector.
As noted, (4.44) can be calculated for the kth central moment, with k being any arbitrary integer. However, when k = 1, (4.44) is not applicable. In this case, the kth central moment PVGA derived by (4.44) is replaced by the FastICA developed by Hyvarinen and Oja (1997), where the so-called negentropy criterion is defined by

$$\mathrm{negentropy}\left(B_j\right)=\left(1/12\right)\left[\kappa_j^3\right]^2+\left(1/48\right)\left[\kappa_j^4-3\right]^2,\qquad(4.47)$$

which is a combination of the third- and fourth-order statistics of $\zeta_j$ in Hyvarinen et al. [2001, Eq. (5.35), p. 115] and has been shown to approximate the mutual information that is generally used as a criterion to measure statistical independence between two signal sources.
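As an illustration of (4.47), the negentropy approximation can be computed from the sample third and fourth central moments of a normalized projection. This generic sketch (the function name is our own) implements only the criterion, not the full FastICA algorithm:

```python
import numpy as np

def negentropy_approx(z):
    """Approximate negentropy of a sample z via (4.47):
    (1/12)*kappa3^2 + (1/48)*(kappa4 - 3)^2, where kappa3 and kappa4 are the
    sample third and fourth central moments of the normalized data."""
    z = np.asarray(z, dtype=float)
    z = (z - z.mean()) / z.std()        # normalize to zero mean, unit variance
    k3 = np.mean(z ** 3)                # third central moment (skewness)
    k4 = np.mean(z ** 4)                # fourth central moment (kurtosis)
    return k3 ** 2 / 12.0 + (k4 - 3.0) ** 2 / 48.0
```

For a Gaussian sample this value is near zero, while markedly non-Gaussian projections, the "interesting" directions FastICA seeks, score higher.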

4.5 Target-Specified VD

Once targets of interest are found, a follow-up task is to use these targets as signal
sources under a binary composite hypothesis testing problem for TSVD.

4.5.1 Target-Specified Binary Hypothesis Testing

Binary hypothesis testing has become a major technique for determining the number of signal sources, for example, spectrally distinct signatures by the HFC method via (4.1) and rare signal sources by MOCA via (4.9). Nevertheless, several key differences between MOCA and the HFC method are crucial and were discussed in Sect. 4.3.2.
In this section we develop an approach to TSVD estimation by taking advantage
of an NPD-based binary hypothesis testing problem formulated by (4.1) and (4.9),
where under each hypothesis the target signal vectors are specified by the PV
algorithm developed in Sect. 4.4.2 and the extreme value theory (Leadbetter
1987) to find the PD for each of the two hypotheses.
Specifically, let $\left\{\mathbf{t}_l^{\mathrm{PV}(k)}\right\}_{l=1}^{L}$ denote the set of L PVs generated by the kth central moment PVGA, with the kth central moment defined by (4.43)-(4.44); each PV represents a feature vector that can be used to find a real target signal vector, as noted earlier. This set of L feature vectors can be divided into two parts: some represent vector directions of the useful signals $\mathbf{s}_l$, including target signals and background, while the others are noise $\mathbf{n}_l$. Then, similar to (4.1), the problem is simplified to a binary hypothesis testing problem, that is, $\mathbf{t}_l^{\mathrm{PV}(k)}$ represents a noise direction under $H_0$ or a signal direction under $H_1$, $1\le l\le L$.
Using the probability density assumption made by MOCA, the binary hypothesis test can be reexpressed as

$$H_0:\ z_l^k=0\quad\text{versus}\quad H_1:\ z_l^k>0\quad\text{for }l=1,2,\ldots,L,\qquad(4.48)$$

where $z_l^k$ is the maximum of the vector residuals given by $z_l^k=\max_{1\le i\le N}\left\|P_{\mathbf{U}_{l-1}^{\mathrm{PV}(k)}}^{\perp}\mathbf{r}_i\right\|^2$ and $P_{\mathbf{U}_{l-1}^{\mathrm{PV}(k)}}^{\perp}\mathbf{r}_i$ is the orthogonal projection of each pixel vector $\mathbf{r}_i$, $1\le i\le N$, with $\mathbf{U}_{l-1}^{\mathrm{PV}(k)}=\left[\mathbf{t}_1^{\mathrm{PV}(k)}\cdots\mathbf{t}_{l-1}^{\mathrm{PV}(k)}\right]$, $P_{\mathbf{U}_{l-1}^{\mathrm{PV}(k)}}^{\perp}=\mathbf{I}-\mathbf{U}_{l-1}^{\mathrm{PV}(k)}\left[\left(\mathbf{U}_{l-1}^{\mathrm{PV}(k)}\right)^T\mathbf{U}_{l-1}^{\mathrm{PV}(k)}\right]^{-1}\left(\mathbf{U}_{l-1}^{\mathrm{PV}(k)}\right)^T$, and $l>1$. As we know, the maximum of the residuals of PVs on the subspace spanned by a feature vector representing a certain signal direction is a high value contributed by the signals in this direction under $H_1$, whereas in a noise direction under $H_0$ it is a low value governed by the maximum-norm noise residual. Under the null hypothesis $H_0$ with the white Gaussian noise assumption, the cdf of $z_l^k$ is a Gumbel distribution given by

$$F_0(z)\approx\exp\left\{-e^{-(2\log N)^{1/2}\left[\frac{z-\sigma^2(L-l)}{\sigma^2\sqrt{2(L-l)}}-(2\log N)^{1/2}+\frac{1}{2}(2\log N)^{-1/2}(\log\log N+\log 4\pi)\right]}\right\}.\qquad(4.49)$$

In addition, under the alternative hypothesis $H_1$ with the uniform distribution assumption, the a posteriori PDs are given by

$$p\left(H_1\mid z_l^k\right)=\frac{F_0\left(z_l^k\right)}{z_l^k\,p_0\left(z_l^k\right)+F_0\left(z_l^k\right)},\qquad(4.50)$$

$$p\left(H_0\mid z_l^k\right)=\frac{z_l^k\,p_0\left(z_l^k\right)}{z_l^k\,p_0\left(z_l^k\right)+F_0\left(z_l^k\right)},\qquad(4.51)$$

where $p_0\left(z_l^k\right)$ is the probability density function (pdf) of $z_l^k$ under $H_0$.
There are two detectors that can be derived to determine the value of $n_{\mathrm{VD}}$. One is a Bayesian detector using the a posteriori probabilities, called the kth central moment MOCA and denoted by k-MOCA:

$$\delta^{\mathrm{MOCA}}\left(z_l^k\right)=\begin{cases}1,&\text{if }\Lambda\left(z_l^k\right)>1,\\[2pt]0,&\text{if }\Lambda\left(z_l^k\right)\le 1,\end{cases}\qquad(4.52)$$

where $\Lambda\left(z_l^k\right)=p\left(H_1\mid z_l^k\right)/p\left(H_0\mid z_l^k\right)$. Using (4.52), the value of $n_{\mathrm{VD}}$ can be determined by calculating

$$\mathrm{VD}_{k\text{-MOCA}}=\sum_{l=1}^{L}\delta^{\mathrm{MOCA}}\left(z_l^k\right)\quad\text{for }k>2.\qquad(4.53)$$

The other detector is based on Neyman-Pearson detection theory, with the target feature values $z_l^k=\max_{1\le i\le N}\left\|P_{\mathbf{t}_l^{\mathrm{PV}(k)}}\mathbf{r}_i\right\|^2$ in (4.49) obtained from $\mathbf{t}_l^{\mathrm{PV}(k)}$, $1\le l\le L$, which are generated by the kth moment PVGA. The resulting NPD is referred to as $\delta_{\mathrm{PV}(k)}^{\mathrm{NP}}\left(z_l^k\right)$ for $1\le l\le L$, and the VD using targets specified by the kth moment is then defined via the HFC method as

$$\mathrm{VD}_{\mathrm{HFC}}^{\mathrm{PV}(k)}\left(P_F\right)=\sum_{l=1}^{L}\delta_{\mathrm{PV}(k)}^{\mathrm{NP}}\left(z_l^k\right),\qquad(4.54)$$

with the false alarm probability given by $P_F\left(\delta_{\mathrm{PV}(k)}^{\mathrm{NP}}\right)=\int_{\Lambda\left(z_l^k\right)\ge\tau}p_0\left(z_l^k\right)\,dz_l^k$.
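For a given false alarm probability, the NPD reduces to thresholding each maximal residual at the $(1-P_F)$-quantile of the noise-maximum distribution. A hedged sketch under a generic Gumbel noise model (the location and scale parameters are illustrative, not those of (4.49)):

```python
import math

def gumbel_inverse_cdf(p, mu, beta):
    """Inverse of F0(z) = exp(-exp(-(z - mu)/beta)): returns tau with F0(tau) = p,
    i.e., the detection threshold for a false alarm probability P_F = 1 - p."""
    return mu - beta * math.log(-math.log(p))

def vd_npd(z_values, pf, mu, beta):
    """Neyman-Pearson counting rule: declare a signal direction whenever the
    maximal residual exceeds the (1 - P_F)-quantile of the noise maximum."""
    tau = gumbel_inverse_cdf(1.0 - pf, mu, beta)
    return sum(1 for z in z_values if z > tau)
```

Sweeping `pf` over $10^{-1},\ldots,10^{-5}$ reproduces the kind of per-$P_F$ VD profiles tabulated later in this chapter.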

Finally, if the kth central moment PVGA-generated PVs in (4.53) are replaced by FastICA-generated PVs, the resulting HFC used to estimate the value of $n_{\mathrm{VD}}$ is called Fast ICA-HFC, with VD defined by

$$\mathrm{VD}_{\mathrm{HFC}}^{\mathrm{Fast\ ICA}}\left(P_F\right)=\sum_{l=1}^{L}\delta_{\mathrm{Fast\ ICA}}^{\mathrm{NP}}\left(z_l\right),\qquad(4.55)$$

where the superscript "k," as in kth central moment, is replaced by the combined third and fourth moments and the subscript "PV(k)" is replaced by "Fast ICA."
Now, if we interpret SVD and PCA as the second central moment transformation with k = 2, then SVD-MOCA/PCA-MOCA and PCA-HFC can be considered 2OS methods: 2-MOCA (variance-MOCA) and 2-HFC (variance-HFC), respectively. By including the kth-order-statistics methods using the kth central moment for $3\le k<\infty$ specified by (4.43)-(4.44), joined with the HFC method as a first-order-statistics method, as well as negentropy-based ICA as a combined third- and fourth-order-statistics method specified by (4.55), a complete unified theory for VD can be derived using spectral target statistics of kth order for any positive integer k. The whole family of spectral-target-statistics-based VD estimation techniques is tabulated in Table 4.2, where ATGP-HFC/ATGP-MOCA and ULSMA-HFC/ULSMA-MOCA can be considered joint first- and second-order-statistics-based methods.

The HFC method and MOCA discussed in Sect. 4.3.2, along with their related methods, are all eigenanalysis based. More specifically, the signal sources under the hypotheses are either eigenvalues in (4.1) or eigenvectors/singular vectors in (4.9). Since eigenvalues/eigenvectors are obtained by PCA, they are not real target signal sources. As expected, the VD estimated by the eigenanalysis-based methods is well below the number of real target signal sources. Unfortunately, this issue was not addressed until the work reported in Chang et al. (2014a, b), where the real target signal sources were generated by specific algorithms.

Table 4.2 Categories of target-specified VD estimation techniques

| Target generation | NPD | Bayes |
|---|---|---|
| First-order statistics (sample mean) | HFC/NWHFC | N/A |
| Second-order statistics (eigenvectors) | PCA-HFC | SVD-MOCA, PCA-MOCA |
| First- and second-order statistics | ATGP-HFC | ATGP-MOCA |
| First- and second-order statistics | ULSMA-HFC | ULSMA-MOCA |
| Third-order statistics | Skewness-HFC | Skewness-MOCA |
| Fourth-order statistics | Kurtosis-HFC | Kurtosis-MOCA |
| kth-order statistics | k-HFC | k-MOCA |
| (3rd, 4th)-order IBSI | Fast ICA-HFC | Fast ICA-MOCA |
4.5.2 ATGP-Specified VD Using $\eta_l=\left\|\mathbf{t}_l^{\mathrm{ATGP}}\right\|^2$

One particular algorithm of interest is ATGP, which has been widely used in hyperspectral target detection to find unsupervised target signal sources. Replacing $\mathbf{t}_l^{\mathrm{SVD}}$ in (4.7) with $\mathbf{t}_l^{\mathrm{ATGP}}$ in (4.39) and $\eta_l=\left\|\mathbf{t}_l^{\mathrm{SVD}}\right\|^2$ in (4.8) with $\eta_l=\left\|\mathbf{t}_l^{\mathrm{ATGP}}\right\|^2$, we can reformulate (4.9) as a new hypothesis testing problem:

$$H_0:\ \eta_l=\left\|\mathbf{t}_l^{\mathrm{ATGP}}\right\|^2\sim p\left(\eta_l\mid H_0\right)=p_0\left(\eta_l\right)$$
versus
$$H_1:\ \eta_l=\left\|\mathbf{t}_l^{\mathrm{ATGP}}\right\|^2\sim p\left(\eta_l\mid H_1\right)=p_1\left(\eta_l\right)\quad\text{for }l=1,2,\ldots,L,\qquad(4.56)$$

where the signal sources under the hypotheses are now real target signal sources rather than the eigenvectors in (4.9) or the eigenvalues in (4.1). Then we can follow the same treatment used by MOCA to derive

$$l^{*,\mathrm{ATGP}}=\arg\left\{\min_{1\le l\le L}\left[p\left(H_0\mid\eta_l\right)>p\left(H_1\mid\eta_l\right)\right]\right\}\qquad(4.57)$$

and ATGP-specified VD as

$$\mathrm{VD}^{\mathrm{ATGP}}=l^{*,\mathrm{ATGP}},\qquad(4.58)$$

along with its NPD version

$$l^{*,\mathrm{ATGP}}\left(P_F\right)=\arg\left\{\min_{1\le l\le L}\left[F_{v_l}\left(\eta_l\right)\le 1-P_F\right]\right\},\qquad(4.59)$$
$$\mathrm{VD}^{\mathrm{ATGP}}\left(P_F\right)=l^{*,\mathrm{ATGP}}\left(P_F\right).\qquad(4.60)$$
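Although ATGP itself is defined elsewhere in the book, the sequence of energies $\eta_l=\left\|\mathbf{t}_l^{\mathrm{ATGP}}\right\|^2$ entering (4.56) can be sketched schematically: each round takes the pixel with the maximum residual norm as the next target and deflates the data with the OSP operator used throughout this chapter (the function and variable names are our own):

```python
import numpy as np

def atgp_energies(R, num_targets):
    """Schematic ATGP: return targets t_l and energies eta_l = ||t_l||^2 taken as
    the maximum residual norms of an L x N data matrix R under OSP deflation."""
    L = R.shape[0]
    X = R.astype(float).copy()
    targets, energies = [], []
    U = np.zeros((L, 0))
    for _ in range(num_targets):
        norms = np.sum(X ** 2, axis=0)          # ||P_U^perp r_i||^2 for every pixel
        idx = int(np.argmax(norms))
        targets.append(R[:, idx])               # keep the real pixel vector
        energies.append(float(norms[idx]))      # eta_l for the hypothesis test
        U = np.column_stack([U, R[:, idx]])
        X = (np.eye(L) - U @ np.linalg.pinv(U)) @ R   # OSP deflation
    return np.column_stack(targets), energies
```

Because each deflation can only shrink residual norms, the sequence $\eta_1\ge\eta_2\ge\cdots$ is non-increasing, which is what makes the "first l where noise wins" rule in (4.57) and (4.59) well posed.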

4.5.3 ATGP-Specified VD Using $\sqrt{\eta_l}=\left\|\mathbf{t}_l^{\mathrm{ATGP}}\right\|$

According to (4.56), the signal source $\eta_l=\left\|\mathbf{t}_l^{\mathrm{ATGP}}\right\|^2$ is a random variable representing the signal energy of a real target signal source $\mathbf{t}_l^{\mathrm{ATGP}}$. However, in communications it is highly desirable to represent a signal source by its signal strength, that is, its vector length, rather than by its signal energy. In this case, we can reformulate (4.56) as a new signal-strength-based hypothesis testing problem:

$$H_0:\ \sqrt{\eta_l}=\left\|\mathbf{t}_l^{\mathrm{ATGP}}\right\|\sim p\left(\sqrt{\eta_l}\mid H_0\right)=p_0\left(\sqrt{\eta_l}\right)$$
versus
$$H_1:\ \sqrt{\eta_l}=\left\|\mathbf{t}_l^{\mathrm{ATGP}}\right\|\sim p\left(\sqrt{\eta_l}\mid H_1\right)=p_1\left(\sqrt{\eta_l}\right)\quad\text{for }l=1,2,\ldots,L.\qquad(4.61)$$

Solving (4.61) is not straightforward, since MOCA cannot be directly applied to finding the pdfs of $\sqrt{\eta_l}$ under each of the hypotheses in (4.61). Nevertheless, the same extreme value theory in Leadbetter (1987) can be used to modify MOCA for this purpose, as follows.

Assume that $\left\{\mathbf{X}_i\right\}_{i=1}^{N}$, where $\mathbf{X}_i=\left(X_{i1},X_{i2},\ldots,X_{iL}\right)^T$, is a set of L-dimensional random vectors whose components $X_{il}$ are Gaussian random variables with zero mean and variance $\sigma^2$. Let $Y_i=\sqrt{\sum_{l=1}^{L}X_{il}^2}=\left\|\mathbf{X}_i\right\|$. Then $Y_i$ follows a chi distribution with L degrees of freedom and parameter $\sigma$, with probability density function

$$f_Y(y)=\frac{2^{1-L/2}\,y^{L-1}}{\Gamma(L/2)\,\sigma^L}\,e^{-y^2/2\sigma^2},$$

mean $E[Y]=\sqrt{2}\,\sigma\,\Gamma\left((L+1)/2\right)/\Gamma\left(L/2\right)$, and variance $\mathrm{var}[Y]=L\sigma^2-\left(E[Y]\right)^2$, calculated from $E\left[Y^2\right]=2\sigma^2\,\Gamma\left(L/2+1\right)/\Gamma\left(L/2\right)=L\sigma^2$. As $L\to\infty$, $Y_i=\left\|\mathbf{X}_i\right\|\sim\chi(L)\approx N\left(E\left[Y_i\right],\mathrm{var}\left[Y_i\right]\right)$.
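The moment identities above rest on the gamma-function recursion $\Gamma(x+1)=x\,\Gamma(x)$, which gives $2\,\Gamma(L/2+1)/\Gamma(L/2)=L$ exactly. A quick numerical check (the helper names are ours):

```python
import math

def chi_mean(L, sigma):
    """E[Y] = sqrt(2)*sigma*Gamma((L+1)/2)/Gamma(L/2) for Y = ||X||, X ~ N(0, sigma^2 I_L)."""
    return math.sqrt(2.0) * sigma * math.gamma((L + 1) / 2) / math.gamma(L / 2)

def chi_var(L, sigma):
    """var[Y] = L*sigma^2 - (E[Y])^2, since E[Y^2] = 2*sigma^2*Gamma(L/2+1)/Gamma(L/2) = L*sigma^2."""
    return L * sigma ** 2 - chi_mean(L, sigma) ** 2

# Gamma recursion behind E[Y^2]: 2*Gamma(L/2 + 1)/Gamma(L/2) equals L exactly.
```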
Theorem 4.1 (Kuybeda et al. 2007) Let $M_N=\max_{1\le i\le N}N_i$, where each $N_i$ is a standard Gaussian random variable. Then

$$P\left(a_N\left(M_N-b_N\right)\le u\right)=G(u)=\exp\left(-\exp(-u)\right)$$

and

$$P\left(M_N\le u\right)=G\left(a_N\left(u-b_N\right)\right),\qquad(4.62)$$

where $a_N=\sqrt{2\log N}$ and $b_N=\sqrt{2\log N}-\frac{1}{2}\left(2\log N\right)^{-1/2}\left(\log\log N+\log 4\pi\right)$.
Now we use Theorem 4.1 to derive the distribution of $V=\max_{1\le i\le N}Y_i$ by assuming that $\{Y_i\}$ is independent and identically distributed as $Y$, with mean $\mu_{Y_i}=\mu_Y$ and variance $\sigma_{Y_i}^2=\sigma_Y^2$. Then $V=\max_{1\le i\le N}Y_i=\max_{1\le i\le N}\left(\sigma_{Y_i}N_i+\mu_{Y_i}\right)=\sigma_Y\max_{1\le i\le N}N_i+\mu_Y$, and its cdf $F_V(v)$ can be derived as

$$\begin{aligned}
F_V(v)&=P(V\le v)=P\left(\sigma_Y\max_{1\le i\le N}N_i+\mu_Y\le v\right)\\
&=P\left(\max_{1\le i\le N}N_i\le\frac{v-\mu_Y}{\sigma_Y}\right)=P\left(M_N\le\frac{v-\mu_Y}{\sigma_Y}\right)=G\left(a_N\left(\frac{v-\mu_Y}{\sigma_Y}-b_N\right)\right)\\
&=G\left(\frac{v-\left(\mu_Y+b_N\sigma_Y\right)}{\sigma_Y/a_N}\right)=\exp\left\{-\exp\left(-\frac{v-\left(\mu_Y+b_N\sigma_Y\right)}{\sigma_Y/a_N}\right)\right\}.
\end{aligned}\qquad(4.63)$$

This implies that V is described by a Gumbel distribution with pdf

$$f_V(v)=\frac{1}{\beta}\exp\left\{-\left[\frac{v-\mu}{\beta}+\exp\left(-\frac{v-\mu}{\beta}\right)\right]\right\},\qquad(4.64)$$

where $\mu=\mu_Y+b_N\sigma_Y$ and $\beta=\sigma_Y/a_N$.

Now, replacing (4.49) with (4.63) and (4.56) with (4.61), we can perform the binary composite hypothesis testing problem specified by (4.61) as follows:

$$l_{\sqrt{\eta}}^{*,\mathrm{ATGP}}=\arg\left\{\min_{1\le l\le L}\left[p\left(H_0\mid\sqrt{\eta_l}\right)>p\left(H_1\mid\sqrt{\eta_l}\right)\right]\right\}\qquad(4.65)$$

and further define the signal-strength version of ATGP-specified VD as

$$\mathrm{VD}_{\sqrt{\eta}}^{\mathrm{ATGP}}=l_{\sqrt{\eta}}^{*,\mathrm{ATGP}},\qquad(4.66)$$

along with its NPD version given by

$$l_{\sqrt{\eta}}^{*,\mathrm{ATGP}}\left(P_F\right)=\arg\left\{\min_{1\le l\le L}\left[F_{v_l}\left(\sqrt{\eta_l}\right)\le 1-P_F\right]\right\},\qquad(4.67)$$
$$\mathrm{VD}_{\sqrt{\eta}}^{\mathrm{ATGP}}\left(P_F\right)=l_{\sqrt{\eta}}^{*,\mathrm{ATGP}}\left(P_F\right).\qquad(4.68)$$

Also, by virtue of (4.61) and (4.62), we can derive a new version of MLD-MOCA using $\sqrt{\eta_l}$:

$$l_{\sqrt{\eta}}^{*,\mathrm{MOCA}}=\arg\left\{\min_{1\le l\le L}\left[p\left(H_0\mid\sqrt{\eta_l}\right)>p\left(H_1\mid\sqrt{\eta_l}\right)\right]\right\},\qquad(4.69)$$
$$\mathrm{VD}_{\sqrt{\eta}}^{\mathrm{MOCA}}=l_{\sqrt{\eta}}^{*,\mathrm{MOCA}},\qquad(4.70)$$

along with its NPD version given by

$$l_{\sqrt{\eta}}^{*,\mathrm{MOCA}}\left(P_F\right)=\arg\left\{\min_{1\le l\le L}\left[F_{v_l}\left(\sqrt{\eta_l}\right)\le 1-P_F\right]\right\},\qquad(4.71)$$
$$\mathrm{VD}_{\sqrt{\eta}}^{\mathrm{MOCA}}\left(P_F\right)=l_{\sqrt{\eta}}^{*,\mathrm{MOCA}}\left(P_F\right).\qquad(4.72)$$

4.5.4 Discussions

VD has proved to be a very useful concept in determining the number of potential signal sources present in hyperspectral data. However, owing to different interpretations of signal sources, various techniques have been developed for VD estimation. To address this issue, a recent attempt, made in Chang (2013), categorizes VD estimation techniques into two types: data characterization-driven VD (DC-VD) and data representation-driven VD (DR-VD). The former is based on statistical characteristics of signal sources, such as eigenvalue-based approaches (HFC, PCA), information-theoretic criteria [Akaike's information criterion (AIC), minimum description length (MDL)], Gershgorin disk/radius, and Malinowski factor-analysis-based error theory, while the latter finds an optimal number of signal sources to linearly represent the data, such as linear regression model-based SSE and HySime, LSMA-based OSP, and signal subspace-based MOCA. The work presented in Sect. 2.4 introduces a new third type of VD, called target-specified VD (TSVD), which makes use of target signal vectors specified by their spectral statistics to determine the value of $n_{\mathrm{VD}}$, such as HOS-based methods.

Comments Regarding TSVD
1. Since the PVs are generated to identify target signal directions, they are not real
target signal vectors. The role that PVs play is similar to that played by
eigenvectors in eigendecomposition.
2. According to Liu and Chang (2005) and Chang (2013), PVGA can also be used to generate eigenvectors in (4.40), with k = 2, without first finding eigenvalues via a characteristic polynomial equation. As a result, the kth central moment PVGA can be extended to cover the case of k = 2.
3. Since the first-order-statistics-based HFC/NWHFC only estimates the value of $n_{\mathrm{VD}}$, it does not provide an algorithm to extract target signal vectors. In this case, ATGP can be used to find the first-order-statistics-specified target signal vectors. Since ATGP can extract both first- and second-order spectral-statistics-specified target signal vectors, PCA-HFC can be used to find only the 2OS-specified target signal vectors. If we let $S_{\mathrm{ATGP}}$ be the set of target signal vectors generated by ATGP and $S_{\text{PCA-HFC}}$ be the set of target signal vectors generated by PCA-HFC, then $S_{\mathrm{ATGP}}$ is treated as the set of first-order-statistics-specified target signal vectors, while $S_{\text{PCA-HFC}}-\left(S_{\mathrm{ATGP}}\cap S_{\text{PCA-HFC}}\right)$ can be treated as the set of 2OS-specified target signal vectors.
4. A spectral target may exhibit strong statistics of various orders. In this case, such
a target will be extracted by more than one order of spectral-statistics-based
methods. This is particularly true for a strong target signal source in terms of its
spectral statistics of many orders. As a consequence, the value of nVD deter-
mined by the number of PVs is generally smaller than the number of real target
signal vectors.
5. The HFC method developed for VD estimation does not assume specific targets the way an HOS-based method does. As a matter of fact, the HFC method assumes that the desired targets are characterized by the sample means of each spectral band and then makes use of 2OS-characterized eigenvalues to perform detection. On the other hand, ATGP-MOCA and ATGP-HFC use ATGP-generated targets in the way an HOS-based method does. With this interpretation, the HFC method and ATGP-MOCA, along with ATGP-HFC, can be considered first-order- and second-order-spectral-statistics-based methods, respectively. Accordingly, the HFC method, ATGP-HFC, and HOS-based methods form a complete family of HFC-based VD estimation techniques in which targets of interest can be characterized by spectral statistics of any order. Therefore, ATGP-HFC can be considered a second-order-spectral-statistics version of the HFC method, while HOS-HFC can be referred to as an HOS version of the HFC method. In particular, when a certain statistics criterion is used to generate PVs as feature vectors, the method is called criterion-HFC; for example, if skewness is used, it is referred to as skewness-HFC.
6. Since the kth central moment PVGA generates PVs as feature vectors that are not
real hyperspectral targets, their roles are similar to singular vectors considered in
MOCA and eigenvectors in the HFC method. To obtain real hyperspectral target
pixels, for each PV we can project all data sample vectors onto this PV and select
one with the maximal projection as a desired hyperspectral target. This same
idea can also be applied to MOCA and HFC methods.

4.6 Synthetic Image Experiments

The synthetic image simulated in Fig. 1.15 is shown in Fig. 4.2 with five panels in
each row simulated by the same mineral signature and five panels in each column
having the same size.
Among the 25 panels are five 4 x 4 pure-pixel panels in the first column, five 2 x 2 pure-pixel panels in the second column, five 2 x 2 mixed-pixel panels in the third column, and five 1 x 1 subpixel panels in each of the fourth and fifth columns, where the mixed and subpanel pixels were simulated according to the legend in Fig. 4.2. Thus, a total of 100 pure pixels (80 in the first column and 20 in the second column), referred to as endmember pixels, were simulated in the data by the five endmembers A, B, C, K, and M. The area marked "BKG" in the upper right corner of Fig. 1.14a was selected to find its sample mean, that is, the average of all pixel vectors within the "BKG" area, denoted by b and plotted in Fig. 1.14b, which was used to simulate the background (BKG) for the 200 x 200-pixel image scene in Fig. 4.2. The reason for this background selection is empirical, since the selected "BKG" area

[Fig. 4.2 Set of 25 panels simulated by A, B, C, K, and M. Legend: 100 % signal; 50 % signal + 50 % any other four; 50 % signal + 50 % background; 25 % signal + 75 % background.]



seemed more homogeneous than other regions. Nevertheless, other areas could also be selected for the same purpose. This b-simulated image background was further corrupted by additive noise to achieve a certain SNR, defined in Harsanyi and Chang (1994) as a 50 % signature (i.e., reflectance/radiance) divided by the standard deviation of the noise. Once target pixels and background are simulated, two types of target insertion can be designed to simulate experiments for various applications.
The first type of target insertion is target implantation (TI), which can be simulated by inserting clean target panels into a noisy BKG image by replacing their corresponding BKG pixels, where the SNR is empirically set at 20:1. That is, TI implants clean target panel pixels into the noise-corrupted BKG image with SNR = 20:1, in which case there are 100 pure panel pixels in the first and second columns.
A second type of target insertion is target embeddedness (TE), which is simu-
lated by embedding clean target panels into a noisy BKG image by superimposing
target panel pixels over the BKG pixels, where the SNR is empirically set to 20:1.
That is, TE embeds clean target panel pixels into noise-corrupted BKG image with
SNR ¼ 20:1, in which case none of the 100 pure panel pixels in the first and second
columns is pure any longer. In other words, a salient difference between TI and TE
is worth mentioning. Since TE inserts targets by adding target pixels to and
superimposing over background pixels instead of replacing background pixels the
way TI does for target insertion, the abundance fraction of the pixel into which a
target pixel is embedded does not sum to one.
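The replace-versus-superimpose distinction can be made concrete in a few lines. This schematic sketch (our own function, with placeholder signatures) illustrates only the insertion mechanics, not the exact simulation of Fig. 4.2:

```python
import numpy as np

def insert_targets(background, target, positions, mode):
    """Insert a clean target signature (length-L vector) into an L x N noisy
    background image at the given pixel indices.
    mode='TI': replace the background pixel (target implantation);
    mode='TE': superimpose the target on the background pixel (target embeddedness)."""
    scene = background.copy()
    for i in positions:
        if mode == "TI":
            scene[:, i] = target                  # background replaced: pixel stays pure
        elif mode == "TE":
            scene[:, i] = scene[:, i] + target    # background retained: pixel no longer pure
        else:
            raise ValueError("mode must be 'TI' or 'TE'")
    return scene
```

Under TE the inserted pixel equals target plus background, so even a "100 %" panel pixel is in fact mixed, which is exactly why TE yields no pure panel pixels.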
Table 4.3 tabulates the values of $n_{\mathrm{VD}}$ estimated for TI and TE by HFC and NWHFC, where NWHFC produced $n_{\mathrm{VD}}=6$ for all false alarm probabilities, whereas HFC produced various values. However, both HFC and NWHFC produced the same value of $n_{\mathrm{VD}}=6$ for $P_F\le 10^{-3}$. Also included in Table 4.3 are the results produced by SSE and HySime, which were very close to those produced by NWHFC.
For TSVD, Table 4.4 tabulates the values of $n_{\mathrm{TSVD}}$ estimated using 2OS targets specified by ATGP, that is, by ATGP-MOCA and ATGP-HFC, where $n_{\text{2OS-VD}}=8$ or 9 for TI and $n_{\text{2OS-VD}}=7$ for TE. Table 4.5 tabulates the values of $n_{\mathrm{TSVD}}$ estimated by various methods using HOS-specified targets, where $n_{\text{HOS-VD}}=5$ for TI and $n_{\text{HOS-VD}}=6$ for TE.
A comparison of Tables 4.4 and 4.5 reveals that the results make sense. The reason the values of VD, $n_{\text{2OS-VD}}$, estimated by MOCA and HFC using ATGP-specified targets are higher than those produced by methods using
Table 4.3 $n_{\mathrm{VD}}$ estimated for synthetic images by HFC and NWHFC

| Data | SSE/HySime | Method | P_F = 10^-1 | 10^-2 | 10^-3 | 10^-4 | 10^-5 |
|---|---|---|---|---|---|---|---|
| TI | 5/5 | HFC | 18 | 13 | 12 | 8 | 8 |
| | | NWHFC | 6 | 6 | 6 | 6 | 6 |
| TE | 6/5 | HFC | 14 | 9 | 6 | 6 | 6 |
| | | NWHFC | 6 | 6 | 6 | 6 | 6 |

Table 4.4 $n_{\mathrm{VD}}$ estimated by 2OS-VD for synthetic images by ATGP-MOCA and ATGP-HFC

| Data | ATGP-MOCA | P_F = 10^-1 | 10^-2 | 10^-3 | 10^-4 | 10^-5 |
|---|---|---|---|---|---|---|
| TI | 9 | 9 | 9 | 9 | 8 | 8 |
| TE | 7 | 7 | 7 | 7 | 7 | 7 |

Table 4.5 $n_{\mathrm{VD}}$ estimated for synthetic images by HOS-based methods

| Data | Method | MOCA | P_F = 10^-1 | 10^-2 | 10^-3 | 10^-4 | 10^-5 |
|---|---|---|---|---|---|---|---|
| TI | Skewness | 5 | 5 | 5 | 5 | 5 | 5 |
| | Kurtosis | 5 | 5 | 5 | 5 | 5 | 5 |
| | Fifth moment | 5 | 5 | 5 | 5 | 5 | 5 |
| | Fast ICA | 5 | 5 | 5 | 5 | 5 | 5 |
| TE | Skewness | 6 | 6 | 6 | 6 | 6 | 6 |
| | Kurtosis | 6 | 6 | 6 | 6 | 6 | 6 |
| | Fifth moment | 6 | 6 | 6 | 6 | 6 | 6 |
| | Fast ICA | 6 | 6 | 6 | 6 | 6 | 6 |

Table 4.6 $n_{\mathrm{TSVD}}$ estimated for TI and TE by MOCA and ATGP using signal energy and strength

| Data | Criterion | Method | MLD l* | P_F = 10^-1 | 10^-2 | 10^-3 | 10^-4 | 10^-5 |
|---|---|---|---|---|---|---|---|---|
| TI | ‖P_U^⊥ r‖ | MOCA | 6 | 6 | 6 | 6 | 6 | 6 |
| | | ATGP | 7 | 8 | 7 | 7 | 7 | 7 |
| | ‖P_U^⊥ r‖² | MOCA | 6 | 6 | 6 | 6 | 6 | 6 |
| | | ATGP | 8 | 9 | 8 | 8 | 7 | 7 |
| TE | ‖P_U^⊥ r‖ | MOCA | 6 | 6 | 6 | 6 | 6 | 5 |
| | | ATGP | 6 | 6 | 6 | 6 | 6 | 5 |
| | ‖P_U^⊥ r‖² | MOCA | 6 | 6 | 6 | 6 | 6 | 6 |
| | | ATGP | 6 | 6 | 6 | 6 | 6 | 6 |

HOS-specified targets is that background signatures are considered to be specified by 2OS, not by HOS. As a matter of fact, Table 4.5 offers very intriguing findings. For TI there are five pure panel signatures, and thus $n_{\text{HOS-VD}}=5$ for the five endmembers. However, since TE has no pure panel pixels in the scene to represent any pure panel signatures, $n_{\text{HOS-VD}}=6$ represents the six distinct spectral signatures actually present in the scene.

Finally, Table 4.6 tabulates the values of TSVD, $n_{\mathrm{TSVD}}$, estimated for TI and TE by MOCA and by ATGP-specified signal energy and signal strength, respectively, where MLD is the maximum likelihood detector used by MOCA and ATGP in place of the NPD. The ATGP- and MOCA-estimated TSVD using signal energy and strength for TE was the same, $n_{\mathrm{TSVD}}=6$, except for the value of 5 at $P_F=10^{-5}$. Note that for TI, the MOCA-estimated TSVD using signal energy and strength was $n_{\mathrm{TSVD}}=6$ across the board, but the ATGP-estimated TSVD varied, with $n_{\mathrm{TSVD}}$ ranging from 7 to 9, higher than the MOCA-estimated $n_{\mathrm{TSVD}}=6$. This is because ATGP produced real targets, whereas MOCA produced singular vectors.

4.7 Real Image Experiments

Although many real hyperspectral image scenes can be used for experiment, for
comparative analysis, here we used two real image scenes that have been studied
extensively in the literature.

4.7.1 HYDICE Data

The image scene shown in Fig. 4.3 (also shown in Fig. 1.10a) was used for experiments. It was acquired by the airborne HYperspectral Digital Imagery Collection Experiment (HYDICE). It has a size of 64 x 64 pixel vectors, with 15 panels in the scene and the ground truth map given in Fig. 4.3b (Fig. 1.10b).
It is worth noting that panel pixel p212, marked yellow in Fig. 4.3b, is of
particular interest. Based on the ground truth, this panel pixel is not a pure panel
pixel and is marked yellow as a boundary panel pixel. However, with our extensive
and comprehensive experiments, this yellow panel pixel is always extracted as the
one with the most spectrally distinct signature compared to the R panel pixels in
row 2. This indicates that a signature of spectral purity is not equivalent to a
signature of spectral distinction. As a matter of fact, in many cases, panel pixel
p212 instead of panel pixel p221 is the one extracted by EFAs to represent the panel
signature in row 2. Also, because of such ambiguity, the panel signature
representing panel pixels in the second row is either p221 or p212, which is always
the last one found by EFAs. This implies that the ground truth of R panel pixels in
the second row in Fig. 4.3b may not be as pure as was thought.
Table 4.7 tabulates the values of $n_{\mathrm{VD}}$ estimated for HYDICE by four VD estimation methods, HFC/NWHFC and SSE/HySime, together with the values of $n_{\mathrm{VD}}$ estimated by ATGP-MOCA and ATGP-HFC using 2OS targets specified by ATGP and by HOS-based methods using HOS-specified targets.

[Fig. 4.3 (a) HYDICE panel scene containing 15 panels; (b) ground truth map of the spatial locations of the 15 panels: p11, p12, p13; p211, p221, p212, p22, p23; p311, p312, p32, p33; p411, p412, p42, p43; p511, p521, p52, p53]

Table 4.7 $n_{\mathrm{VD}}$ estimated for HYDICE by various VD estimation techniques

| Category | Method | n_VD |
|---|---|---|
| SSE/HySime | | 10/20 |
| Second-order MOCA | PCA | 24 |
| | ATGP | 37 |
| HOS-MOCA | Skewness | 19 |
| | Kurtosis | 21 |
| | Fifth moment | 25 |
| | Fast ICA | 22 |

| NPD-based VD | Method | P_F = 10^-1 | 10^-2 | 10^-3 | 10^-4 | 10^-5 |
|---|---|---|---|---|---|---|
| First-order statistics HFC | HFC | 14 | 11 | 9 | 9 | 7 |
| | NWHFC | 20 | 14 | 13 | 13 | 13 |
| Second-order-statistics-based HFC | PCA-HFC | 25 | 24 | 24 | 24 | 24 |
| | ATGP-HFC | 39 | 37 | 35 | 35 | 34 |
| HOS-HFC | Skewness-HFC | 19 | 19 | 17 | 17 | 16 |
| | Kurtosis-HFC | 22 | 21 | 19 | 19 | 19 |
| | Fifth-HFC | 28 | 25 | 24 | 22 | 20 |
| | Fast ICA-HFC | 25 | 22 | 18 | 17 | 16 |

Table 4.8 $n_{\mathrm{TSVD}}$ estimated for HYDICE by MOCA and ATGP using signal energy ($\eta_l=$ ‖t_l^ATGP‖²)

| Method | MLD l* | P_F = 10^-1 | 10^-2 | 10^-3 | 10^-4 | 10^-5 |
|---|---|---|---|---|---|---|
| MOCA | 33 | 33 | 32 | 32 | 32 | 31 |
| ATGP | 43 | 44 | 43 | 41 | 40 | 40 |

Table 4.9 $n_{\mathrm{TSVD}}$ estimated for HYDICE by MOCA and ATGP using signal strength ($\sqrt{\eta_l}=$ ‖t_l^ATGP‖)

| Method | MLD l* | P_F = 10^-1 | 10^-2 | 10^-3 | 10^-4 | 10^-5 |
|---|---|---|---|---|---|---|
| MOCA | 32 | 33 | 32 | 31 | 31 | 30 |
| ATGP | 41 | 43 | 41 | 40 | 39 | 36 |

As shown by the synthetic image experiments in Sect. 4.6, VD and TSVD produced similar results there. For real images, however, TSVD generally produces higher values than VD because the targets of interest in real image scenes are more complicated than those in synthetic images. Tables 4.8 and 4.9 also tabulate the values of TSVD, $n_{\mathrm{TSVD}}$, estimated by MOCA and by ATGP-specified signal energy and signal strength, respectively, where MLD is the maximum likelihood detector used by MOCA and ATGP to replace the NPD.

A comparison of Tables 4.8 and 4.9 with Table 4.7 makes it clear that the values of TSVD, $n_{\mathrm{TSVD}}$, obtained using ATGP-specified target signals in Tables 4.8 and 4.9 were much higher than the corresponding values of VD, $n_{\mathrm{VD}}$, in Table 4.7 and those of MOCA in Tables 4.8 and 4.9. This makes sense, because several spectrally distinct real target signal sources may be projected onto the same eigenvectors produced by MOCA.

4.7.2 AVIRIS Data

A second real image scene used for experiments is a well-known Airborne Visible
InfraRed Imaging Spectrometer (AVIRIS) image scene, Cuprite, shown in Fig. 4.4
(see also Fig. 1.4a) available at the USGS Web site http://aviris.jpl.nasa.gov/. This
scene is a 224-band image with a size of 350  350 pixels; it was collected at the
Cuprite Mining District site in Nevada in 1991. It is one of the most widely used
hyperspectral image scenes available in the public domain and has 20 m spatial
resolution and 10 nm spectral resolution in a range of 0.4–2.5 μm. Since it is well
understood mineralogically and has reliable ground truth, this scene has been
studied extensively. Two data sets for this scene, reflectance and radiance data,
are also available for study. Five pure pixels in Fig. 4.4a, b can be identified as
corresponding to five different minerals, alunite (A), buddingtonite (B), calcite (C),
kaolinite (K), and muscovite (M) labeled A, B, C, K, and M in Fig. 4.4b, along with
their respective spectral signatures plotted in Fig. 4.4c, d.
Although the data set contains more than five minerals, the ground truth avail-
able for this region only provides the locations of the pure pixels: alunite (A),


Fig. 4.4 (a) Cuprite AVIRIS image scene; (b) spatial positions of five pure pixels corresponding
to minerals: A, B, C, K, and M; (c) five mineral reflectance spectra; (d) five mineral radiance
spectra

[Fig. 4.4 (continued): (c) reflectance spectra and (d) radiance spectra of the five minerals (alunite, buddingtonite, calcite, kaolinite, muscovite) over the 400-2400 nm wavelength range]

buddingtonite (B), calcite (C), kaolinite (K), and muscovite (M), shown in Fig. 4.4b with their reflectance and radiance spectra plotted in Fig. 4.4c, d.
Tables 4.10 and 4.11 tabulate the values of VD estimated by 2OS- and HOS-based methods for the Cuprite reflectance and radiance data, respectively.
Since background is generally characterized by 2OS and targets are generally
specified by HOS, it is expected that the VD values estimated by 2OS-based
114 4 Target-Specified Virtual Dimensionality for Hyperspectral Imagery

Table 4.10 nTSVD estimated for Cuprite reflectance data by various VD estimation techniques

  SSE/HySime: 27/16
  2OS-based MOCA:  PCA 31; ATGP 51
  HOS-MOCA:        Skewness 21; Kurtosis 24; Fifth moment 30; ICA 23

  NPD-based VD                 P_F=10^-1  P_F=10^-2  P_F=10^-3  P_F=10^-4  P_F=10^-5
  2OS-based HFC:  HFC              34         30         24         22         20
                  NWHFC            32         29         25         23         21
                  PCA-HFC          31         31         30         30         30
                  ATGP-HFC         51         51         50         48         47
  HOS-HFC:        Skewness-HFC     22         21         21         21         21
                  Kurtosis-HFC     24         24         22         22         22
                  Fifth-HFC        30         30         29         28         28
                  ICA-HFC          25         23         23         23         22

Table 4.11 nTSVD estimated for Cuprite radiance data by various VD estimation techniques

  SSE/HySime: 18/17
  Second-order MOCA:  PCA 46; ATGP 51
  HOS-MOCA:           Skewness 25; Kurtosis 23; Fifth moment 28; Fast ICA 25

  NPD-based VD                 P_F=10^-1  P_F=10^-2  P_F=10^-3  P_F=10^-4  P_F=10^-5
  2OS-based HFC:  HFC              29         18         17         15         15
                  NWHFC            23         20         20         19         19
                  PCA-HFC          47         46         44         43         43
                  ATGP-HFC         52         51         51         51         48
  HOS-HFC:        Skewness-HFC     22         21         21         21         21
                  Kurtosis-HFC     24         24         22         22         22
                  Fifth-HFC        30         30         29         28         28
                  Fast ICA-HFC     25         23         23         23         22


Fig. 4.5 (a) AVIRIS LCVF scene. (b) Spectra of anomaly, cinders, dry lake, rhyolite, shade, and
vegetation

methods will be higher than those estimated by HOS-based methods, since back-
ground signatures extracted by 2OS-based methods are not extracted by
HOS-based methods. The results in Tables 4.10 and 4.11 confirm this conjecture.
A third AVIRIS image scene used for experiments is shown in Fig. 4.5 (also
Fig. 1.8a). It is the Lunar Crater Volcanic Field (LCVF) located in Northern Nye
County, Nevada, which is one of the earliest hyperspectral image scenes studied in
the literature. Atmospheric water bands and low-SNR bands were removed from
the data, reducing the image cube from 224 to 158 bands. The LCVF image has a
10 nm spectral resolution and 20 m spatial resolution.

Unlike the Cuprite data in Fig. 4.4, which is a mineralogical scene containing
many pure mineral signatures, the image scene in Fig. 4.5a has five targets of
interest, red oxidized basaltic cinders, rhyolite, playa (dry lake), vegetation, and
shade, whose radiance spectra are plotted in Fig. 4.5b. According to the scene,
these target areas, except vegetation, cover very large regions, and it is anticipated
that these regions will be heavily mixed. Thus, it is expected that there will not
be too many pure signatures as endmembers. However, owing to a volcanic explo-
sion, the crater lake in the lower right section may actually contain pure mineral-
ogical signatures as endmembers. In addition to these five target signatures, there is
also an interesting target, an anomaly two pixels in size located in the upper left dry
lake, with its spectral signature also plotted in Fig. 4.5b. Thus, if endmembers are
present in this scene, they will very likely be in areas of the dry lake and vegetation,
along with anomalies. Table 4.12 tabulates the values of VD estimated by 2OS- and
HOS-based methods. As we see from the table, the results make sense and can be
categorized into two classes, values around 11 ± 1 estimated by 2OS-based
methods, which include background signatures, and values around 3 or 4 estimated
by HOS-based methods, which only specify targets by HOS.
In analogy with HYDICE experiments, Table 4.13 also tabulates the value of
TSVD, nTSVD, estimated by MOCA and ATGP-specified target signal energy and
signal strength, respectively, where MLD is the maximum likelihood detector used
by the MOCA and ATGP to replace the NPD.
As we can see from Table 4.13, conclusions similar to those drawn from
Tables 4.7 to 4.9 also apply to Tables 4.11, 4.12, and 4.13, where the values of
TSVD, nTSVD, estimated by ATGP-specified target signal energy and strength in

Table 4.12 nTSVD estimated for LCVF data by various VD estimation techniques

  SSE/HySime: 12/11
  Second-order MOCA:  PCA 4; ATGP 12
  HOS-MOCA:           Skewness 3; Kurtosis 3; Fifth moment 5; Fast ICA 3

  NPD-based VD                 P_F=10^-1  P_F=10^-2  P_F=10^-3  P_F=10^-4  P_F=10^-5
  2OS-based HFC:  HFC               8          6          4          4          4
                  NWHFC            20         13         11         11         10
                  PCA-HFC           4          4          4          4          4
                  ATGP-HFC         12         11         10         10          9
  HOS-HFC:        Skewness-HFC      4          3          3          3          3
                  Kurtosis-HFC      3          3          3          3          3
                  Fifth-HFC         5          4          4          4          4
                  Fast ICA-HFC      3          3          3          3          3
Table 4.13 nTSVD estimated by MOCA and ATGP using signal energy and strength

                                       MLD l*  P_F=10^-1  P_F=10^-2  P_F=10^-3  P_F=10^-4  P_F=10^-5
  Cuprite reflectance  ||P_U^⊥ r||
                       MOCA              40        41         40         39         38         37
                       ATGP              67        71         67         64         58         53
                       ||P_U^⊥ r||^2
                       MOCA              43        45         43         40         40         39
                       ATGP              74        80         73         69         67         65
  Cuprite radiance     ||P_U^⊥ r||
                       MOCA              43        45         43         41         39         38
                       ATGP              66        74         66         61         59         54
                       ||P_U^⊥ r||^2
                       MOCA              47        48         45         45         42         41
                       ATGP              74        81         74         68         63         61
  LCVF                 ||P_U^⊥ r||
                       MOCA               7         7          7          7          7          7
                       ATGP               9        12         11          9          8          8
                       ||P_U^⊥ r||^2
                       MOCA               8         9          8          7          7          7
                       ATGP              12        12         12         11          9          9

Table 4.13 were much higher than the corresponding values of VD, nVD, in
Tables 4.10, 4.11, and 4.12 and of nTSVD by MOCA in Table 4.13. This is true
since some spectrally distinct real target signal sources may be projected onto the
same eigenvectors produced by MOCA.

4.8 Conclusions

This chapter revisits the concept of VD and presents a unified theory to expand the
applicability of VD from a broader point of view, where spectrally distinct signa-
tures defined by VD can be interpreted as various types of targets characterized by
data statistics explored in Chang (2013). Within this context many works reported
in the literature can be considered VD estimation methods for hyperspectral targets
specified by the first two orders of data statistics. Depending on how spectrally
distinct signatures are defined as desired targets of interest, VD emerges as a
versatile concept that can be adapted to various applications. For example, if
hyperspectral targets of interest are specified by first-order statistics, then the HFC
method can be used. If hyperspectral targets of interest exhibit 2OS, then
2OS-based techniques, such as least-squares methods (Paylor 2014; Paylor and
Chang 2013, 2014), PCA, and signal subspace methods, may be applicable to
estimating the value of nVD. On the other hand, when target signatures of major
interest are rare signal sources, such as endmembers and anomalies, their data
statistics generally go beyond the first two orders of data statistics.
In this case, they can be better characterized by HOS. To the author's best
knowledge, the HOS-based approach to finding the value of nVD in Chang
et al. (2014a, b) was developed to resolve this issue; it deviates from
conventional wisdom by finding specific HOS-characterized feature vectors that
can be used to generate desired target signal sources. These feature vectors are then
used to perform Neyman–Pearson detection to determine whether a test feature
vector is a target signal source. The novelty of the HOS methods is to integrate all
key ideas—Neyman–Pearson detection formulations in the HFC method (Harsanyi
et al., 1994a), HOS target generation algorithms in Ren et al. (2006), and finding
PDs under binary hypotheses via the extreme value theory in Kuybeda
et al. (2007)—into a new aspect of HOS-based binary hypothesis testing problems
that can be used to solve VD estimation problems for HOS hyperspectral target
signal sources. Table 4.14 summarizes various types of VD estimation techniques
according to their categorization in Chang (2013a, b), along with TSVD as
presented in this chapter.

Table 4.14 Categorization of types of VD/TSVD

  Type                          VD/TSVD estimation technique
  VD data characterization      Eigenvalue: eigenvalue distribution; NPD (HFC, NWHFC);
                                  Gershgorin radius/disk
                                SVD: singular value distribution
                                Factor analysis: Malinowski error theory
                                Information criterion: AIC, MDL
  VD data representation        Signal model: SSE, HySime, MOCA
                                Linear mixing model: LSMA-OSP
  TSVD                          First order: HFC, NWHFC
                                Second order: PCA-HFC, PCA-MOCA
                                First and second orders: ATGP-HFC, ATGP-MOCA
                                HOS: HOS-HFC, HOS-MOCA
Part II
Sample Spectral Statistics-Based Recursive
Hyperspectral Sample Processing

Part II develops sample spectral statistics-based recursive hyperspectral sample
processing (RHSP) algorithms that allow data processing to be executed in a
recursive manner, where the knowledge provided by the causal sample correla-
tion/covariance matrix (CSCRM/CSCVM) is grown sample by sample in a causal
manner. Algorithms using the obtained knowledge can be further implemented in
real time as real-time processing algorithms. In particular, these algorithms can be
implemented in a progressive manner. Since the knowledge used for data
processing is augmented by only including the knowledge provided by new sample
vectors while the previously known and available knowledge remains unchanged,
we can also implement progressive algorithms recursively in the sense that only
newly generated knowledge is used to update data information for processing. In
other words, a recursive process decomposes data information into three pieces of
information: (1) processed information obtained by processing data sample vectors
that have already been visited, (2) new information given by data sample vectors
currently being processed, and (3) innovation information that is provided by new
information from the current sample vector but cannot be obtained from the
processed information. Thus, a recursive process makes use of recursive equations
to update results only through innovation information. It is this innovation infor-
mation that significantly reduces the computational complexity and processing time
of the algorithms.
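This innovation-based updating can be made concrete with a minimal sketch (illustrative code of our own, not taken from the book): the causal sample correlation matrix R(n) = (1/n) \sum_{i=1}^{n} r_i r_i^T is grown sample by sample, and each update touches only the outer product contributed by the newest sample, never the previously visited ones.

```python
import numpy as np

def recursive_cscrm(samples):
    """Illustrative sketch: grow the causal sample correlation matrix
    R(n) = (1/n) * sum_{i=1}^{n} r_i r_i^T recursively. Only the current
    sample r_n (the new information) updates the previously accumulated
    R(n-1); earlier samples are never revisited."""
    R = None
    for n, r in enumerate(samples, start=1):
        innovation = np.outer(r, r)  # contribution of r_n alone
        R = innovation if R is None else ((n - 1) / n) * R + innovation / n
        yield R

# Usage: the recursive result matches the batch computation at every step,
# which is the defining property of a causal recursive process.
rng = np.random.default_rng(0)
data = rng.normal(size=(5, 4))  # 5 samples, 4 bands
for n, R in enumerate(recursive_cscrm(data), start=1):
    assert np.allclose(R, data[:n].T @ data[:n] / n)
```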
Part II is primarily devoted to two types of real-time RHSP target detection,
active target detection and passive target detection: Chap. 5, “Real-Time RHSP for
Active Target Detection: Constrained Energy Minimization,” and Chap. 6, “Real-
Time RHSP for Passive Target Detection: Anomaly Detection.”
Chapter 5
Real-Time Recursive Hyperspectral Sample
Processing for Active Target Detection:
Constrained Energy Minimization

Abstract As discussed in Chap. 5 of Chang (Real-time progressive hyperspectral
image processing: endmember finding and anomaly detection, Springer, New York,
2016) hyperspectral target detection can be generally performed in two completely
opposite modes, active hyperspectral target detection and passive hyperspectral
target detection. Active hyperspectral target detection requires specific prior knowl-
edge that can be used to detect targets of interest as desired objects. Its applications
include reconnaissance, rescue and search, and detection of targets specified by
known knowledge. In the meantime, target detection of this type generally requires
real-time processing to find targets on a timely basis. However, for an algorithm to
be implemented in real time, the data samples used can only be those data sample
vectors up to the data sample currently being processed; no future data sample
vectors yet to be visited should be involved in data processing. Such a property is
generally called causality, which has unfortunately received little attention in real-
time hyperspectral data processing in recent years. This chapter investigates one of
the well-known active hyperspectral target detection techniques, constrained
energy minimization (CEM), for its real-time processing in subpixel detection. In
this investigation, the concept of causal sample correlation matrix (CSCRM),
introduced in Chap. 14 in Chang (Real-time progressive hyperspectral image
processing: endmember finding and anomaly detection, Springer, New York,
2016), will play a key role in allowing CEM to be implemented in a progressive
manner sample by sample. Such resulting CEM is called progressive CEM
(P-CEM). Because CSCRM varies with data sample vectors, P-CEM requires
repeatedly calculating the inverses of such sample-varying CSCRM matrices. To
further reduce computational complexity and computer processing time, the notion
of innovation developed in Chap. 3 for a Kalman filter is used to derive recursive
equations for P-CEM. The resultant recursive version of P-CEM is referred to as
recursive CEM (R-CEM), which paves the way to developing real-time constrained
energy minimization (RT-CEM), which can be executed in a causal manner recur-
sively as well as in real-time sample by sample progressively.

© Springer International Publishing Switzerland 2017 123


C.-I Chang, Real-Time Recursive Hyperspectral Sample and Band Processing,
DOI 10.1007/978-3-319-45171-8_5

5.1 Introduction

Because hyperspectral imaging sensors use hundreds of contiguous spectral bands
to acquire data, it is highly anticipated that many unknown signal sources and
material substances will be uncovered without prior knowledge of what these target
sources are. Under such a circumstance, two approaches are generally adopted. One
is to design unsupervised target detection techniques that can detect such unknown
target signal sources in an unsupervised fashion with no need of prior knowledge.
An example of such a technique is the automatic target generation process (ATGP)
developed by Ren and Chang (2003) and discussed in Sect. 4.4.2.3 of Chap. 4.
However, this approach can be effective only when the characteristics of targets to
be detected can be captured by the designed algorithms. As an alternative, the other
approach is to suppress unknown signal sources with no need of finding them,
regardless of what they are, such as background signatures. One issue with this
approach is that certain target signal sources of interest may also be suppressed. The
goal of designing and developing constrained energy minimization (CEM) in
Harsanyi (1993) was originally to deal with the latter approach. Its idea is to
decompose signal sources into two classes, a target class, comprised of targets of
interest designated as desired signal sources, and a background class, consisting of
all signal sources other than those in the target class. Two operators are then utilized
to perform target detection. Since it does not know all signal sources in the
background class, CEM computes the inverse of the global sample correlation
matrix, R, to suppress the signal sources in the background class. Once the
background is suppressed, a follow-up operator, which is a matched filter, is used
to extract all desired signal sources specified by the designated signal sources. More
specifically, CEM is a two-stage operator that implements an operator of the form
d^T R^{-1} r on the data sample vector r, with R^{-1} performing background suppression,
followed by the matched filter using the matched signal designated by d that is used to
specify a target signal source of interest as a desired signal. When there are multiple
targets of interest designated as desired signal sources, CEM is then expanded into
the target-constrained interference-minimized filter (TCIMF), developed by Ren
and Chang (2000b). Interestingly, the effect of using R^{-1} is similar to that of an
orthogonal subspace projection (OSP), developed by Harsanyi and Chang (1994),
which annihilates undesired signatures via an OSP operator. However, there is a
crucial difference between CEM and OSP. Unlike CEM, which assumes no prior
knowledge about the background, OSP assumes that background signatures are
given a priori and designated as undesired signal sources, denoted by U. With this
scenario OSP designs a particular undesired signal projector P_U^⊥, specified by (4.20),
to eliminate all the undesired signal sources in the background class. This is quite
different from CEM, which does not have knowledge of U, in which case it inverts
R, that is, uses R^{-1}, to suppress all unknown signal sources in the background class. As a
result, the R^{-1} used in CEM accomplishes the same task as that carried out by the P_U^⊥ used
in OSP. Nonetheless, both OSP and CEM use the same follow-up matched filter to

detect the desired signal source d so as to accomplish target detection. Details about
their relationships can be found in Chap. 5 in Chang (2016).
Since finding complete background knowledge is nearly impossible, CEM has
emerged as a very attractive technique because it requires no prior knowledge other
than about targets of interest. In particular, no background knowledge is needed to
process CEM. This advantage makes CEM perform subpixel detection very effec-
tively, specifically when a known target signature is identified and targets specified
by this target signature need to be found in various locations, such as in rescue and
search or finding combat vehicles in an unknown battlefield. Moreover, many such
targets of interest may also be moving objects that can be detected only by ongoing
processing in real time, in which case the capability to process in real time becomes
essential to implementing CEM. Most importantly, there may also be certain subtle
targets that can be only revealed and captured by real-time processing but may be
missed and overwhelmed by targets detected by CEM with no real-time processing.
Despite the fact that many real-time processing algorithms have been proposed
in the literature for target detection and classification, technically speaking, none of
these are true real-time processing algorithms; they are, rather, fast computational
algorithms. Theoretically, a true real-time processing algorithm must produce its
output at the same time as input comes in. However, in reality this is impossible
since there is always a time delay caused by data processing. With this interpreta-
tion all claimed real-time processing algorithms can only be called near real time,
with the assumption that the data processing time is negligible. Examples include
constrained linear discriminant analysis (CLDA) (Du et al. 2003a; Du and Nekovei
2005, 2009) and parallel processing algorithms such as anomaly detection using a
multivariate normal mixture model and graphic processing unit (GPU) processing
(Tarabalka et al. 2009). Nevertheless, from a practical point of view, such a time
delay is actually determined by the specific application. For instance, in surveil-
lance and reconnaissance applications, finding moving targets such as missiles is
urgent, and the response time must be instantaneous. In this case, very little time
delay should be allowed. As another example, for applications in fire damage
management/assessment, the time to respond can be minutes or hours, in which
case the allowable time delay can be longer. Thus, as long as an algorithm can meet
a required time constraint, it can be considered a real-time processing algorithm.
In particular, this chapter looks into issues of how to take full advantage of CEM
for subpixel target detection. As noted earlier, one of the great advantages of CEM
is that it suppresses background variation with sample vectors. It designs and
develops “causal processing” for real-time implementation via the concept of
innovations information, discussed in Chap. 3 as well as in Poor (1994, footnote
p. 78), which is defined as new information of the current input sample that cannot
be predicted from past events. It was originally proposed by Kailath (1968) who
redeveloped a Kalman filter and showed it to be a very promising and effective
means of updating data in a causal and real-time manner. To investigate these
issues, this chapter develops a progressive version of CEM, called progressive
CEM (P-CEM), that processes data in a causal manner sample by sample,
progressively and in real time. Since P-CEM requires repeatedly calculating the

sample-varying correlation matrix R1 ðrn Þ with the current data sample vector rn,
its computing time is expected to be very high. To further reduce the computational
complexity, the notion of innovation information is further used to derive recursive
equations to update R1 ðrn Þ without recalculating all previously visited data sample
vectors except rn. As a result, a recursive version of P-CEM can therefore be
developed, called recursive CEM (R-CEM), which can not only process data
sample by sample progressively but also update data processing results recursively
based only on the data sample vector rn currently being processed and previously
processed data in the same way that a Kalman filter does. Such R-CEM is indeed the
final version of real-time processing of CEM, to be called real-time CEM
(RT-CEM) in the rest of this chapter.
Note that the idea of the proposed causal processing arises in updating needed
information only through the data sample vector currently being processed and the
information generated by processing previous data sample vectors. Although it is
similar to Kalman filtering (Chap. 3), there are several notable differences between
the two. First, since CEM is performed on a single data sample basis, there is no
counterpart of a state equation as used by a Kalman filter found in CEM. Second,
the measurement equation used by CEM is quite different from that used by a
Kalman filter, where a noise term involved in a Kalman filter is not present in CEM.
Third, CEM usually requires the inversion of a sample correlation/covariance
matrix. To implement CEM in real time, the matrix inversion must be updated
sample by sample. Such an update is not found in the measurement equation in a
Kalman filter, which is updated from the state equation. Thus, a direct application
of real-time Kalman filtering to CEM is not feasible due to the lack of a state
equation that can be derived for CEM. To resolve this issue, an alternative approach
is to use Woodbury’s identity in Appendix A to derive causal innovation informa-
tion update equations for PC-CEM in the same way that is used to derive a Kalman
filter so that only the current data sample vector and the processed information
obtained by previous data sample vectors can be used to generate the innovation
information to update data processing results sample by sample.

5.2 Constrained Energy Minimization

This section briefly reviews CEM developed by Harsanyi in his dissertation (1993).
It assumes that a desired signature specified by d is used for active detection. CEM
uses this designated signature d to constrain signals passed through a custom-
designed finite impulse response (FIR) filter while minimizing the least-squares
error caused by data samples that fail to pass the filter due to the fact that they are
not spectrally related to the desired signature d and can be considered interfering
signal sources.
Accordingly, a major strength resulting from CEM is that it requires no prior
knowledge about the data to be processed other than the desired signature d to
achieve background suppression. This advantage is very significant and unique

compared to other detection techniques, which require unsupervised knowledge
about unwanted target signal sources, such as background information, to be treated
as interfering unsupervised signal sources for annihilation, for example, OSP by
linear spectral mixture analysis (Chap. 2) (Chang 2016) and unsupervised targets
found by automatic target recognition (Chap. 5) (Chang 2016). This is because so
many background signatures cannot be either identified or inspected visually, and
finding these signal sources can be extremely difficult if not impossible. CEM deals
with this issue by inverting the sample correlation matrix R to perform background
suppression prior to the extraction of d. The use of the inversion of R has the same
effect as the removal of unsupervised signal sources, as noted in the introduction.
Unfortunately, finding unsupervised target signal sources and calculating the sam-
ple correlation matrix R must be done before desired target detection takes place.
Thus, technically speaking, none of the active target detection techniques can really
be implemented in real time. This chapter presents a new technique for real-time
processing of CEM according to two formats of data acquisition, band-interleaved-
pixel/sample (BIP/BIS) and band-interleaved-line (BIL), where the global sample
correlation matrix R must be replaced with a causal sample correlation matrix
(CSCRM), defined in Chap. 14 in Chang (2016), which is formed by only those data
sample vectors up to the pixel/sample vector currently being processed or a causal
data line matrix formed by all data lines up to the data line just completed. Neither
version of CEM has been investigated to date.
Suppose that a hyperspectral image is acquired and represented by a collection
of image pixel vectors, denoted by {r_1, r_2, ..., r_N}, where
r_i = (r_{i1}, r_{i2}, ..., r_{iL})^T for 1 ≤ i ≤ N is an L-dimensional pixel vector, N is
the total number of pixels in the image, and L is the total number of spectral
channels. Further assume that d = (d_1, d_2, ..., d_L)^T is specified by a desired
signature of interest to be used for target detection. The goal is to find a target
detector detecting data samples specified by the desired target signal d via an FIR
linear filter with L filter coefficients {w_1, w_2, ..., w_L}, denoted by an L-dimensional
vector w = (w_1, w_2, ..., w_L)^T, that minimizes the filter output energy
subject to the constraint d^T w = w^T d = 1. More specifically, let y_i denote the output
of the designed FIR filter resulting from the input r_i. Then y_i can be expressed by

    y_i = \sum_{l=1}^{L} w_l r_{il} = w^T r_i = r_i^T w    (5.1)

and the average energy of the filter output is given by

    (1/N) \sum_{i=1}^{N} y_i^2 = w^T R w,  with  R = (1/N) \sum_{i=1}^{N} r_i r_i^T,    (5.2)

where R is the autocorrelation sample matrix of the image. CEM is developed to


solve the following linearly constrained optimization problem:

    min_w { w^T R w }  subject to  d^T w = w^T d = 1.    (5.3)

To solve (5.3), we introduce λ as a Lagrange multiplier to take care of the
constraint d^T w = w^T d = 1. We then define a Lagrangian, that is, an objective
function J(w), by

    J(w) = w^T R w + λ(d^T w - 1).    (5.4)

Differentiating J(w) in (5.4) with respect to w and setting it to zero, we can find the
optimal solution w* by

    ∂J(w)/∂w |_{w=w*} = 0 ⇒ 2Rw* + λd = 0 ⇒ w* = -(1/2)λ R^{-1} d.    (5.5)

Multiplying (5.5) by d^T on the left and using the constraint d^T w* = 1, we obtain

    d^T w* = -(1/2)λ d^T R^{-1} d = 1 ⇒ (1/2)λ = -(d^T R^{-1} d)^{-1}.    (5.6)

Substituting (5.6) back into (5.5) yields

    w* = R^{-1} d / (d^T R^{-1} d).    (5.7)

Thus, the optimal solution to (5.3) is given by the L-dimensional weight vector w* in
(5.7). The CEM filter, denoted by δ^{CEM}(r) and derived in Harsanyi (1993), is given by

    δ^{CEM}(r) = d^T R^{-1} r / (d^T R^{-1} d).    (5.8)
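The closed-form solution can be sketched in a few lines of NumPy (an illustrative translation of (5.2), (5.7), and (5.8); the function and array names are ours, not the book's):

```python
import numpy as np

def cem(data, d):
    """Sketch of the CEM filter in (5.8): data is an N x L array of pixel
    vectors, d the L-dimensional desired signature. Returns the N outputs
    delta_CEM(r_i) = d^T R^{-1} r_i / (d^T R^{-1} d)."""
    N = data.shape[0]
    R = data.T @ data / N            # global sample correlation matrix, (5.2)
    Rinv_d = np.linalg.solve(R, d)   # R^{-1} d without forming R^{-1} explicitly
    w = Rinv_d / (d @ Rinv_d)        # optimal weight vector w*, (5.7)
    return data @ w                  # filter outputs y_i = w^T r_i

# Usage: a pixel equal to d itself yields output exactly 1, reflecting the
# constraint w^T d = 1, while dissimilar background pixels are suppressed.
rng = np.random.default_rng(1)
d = rng.random(6)
data = np.vstack([rng.random((50, 6)), d])  # 50 background pixels + target
out = cem(data, d)
assert np.isclose(out[-1], 1.0)
```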

To explore the real-time capability of CEM, we need to understand how data are
acquired. Based on data acquisition, three data formats have been widely used by
hyperspectral imaging sensors (Schowengerdt 1997). One is the band-sequence
(BSQ) format, which collects data band by band. The other two are BIP/BIS, which
collects data sample by sample with full band information, and BIL, which collects
data line by line with full band information as well. Obviously, for real-time
processing, BIP/BIS and BIL are more appropriate compared to BSQ, which is
mainly focused on individual band processing rather than data sample processing,
where the former is referred to as hyperspectral sample processing and the latter as
hyperspectral band processing. Interestingly, while CEM using the BSQ format was
recently investigated in Chang et al. (2015a, b, c, d, e) and will also be discussed in
Chap. 13, the real-time sample processing of CEM using BIP/BIS and BIL has not
been addressed and reported in the literature, at least to the best of my knowledge.
Furthermore, although BIP/BIS and BIL look similar, their real-time
implementations are quite different owing to their use of causal sample correlation

matrices, that is, one is updated by samples and the other by data lines. As a result,
two different recursive equations are derived for CEM using BIP/BIS and BIL
respectively for real-time implementation.
Basically, between BIL and BIP/BIS, the latter is the only format that can be
used to perform CEM in real time. However, since hyperspectral imaging sensors
currently being used for data acquisition are carried out in a push-broom fashion,
their data are collected line by line. Accordingly, it makes more sense for CEM to
be implemented in such a manner as a compromise. As noted, the main problem
with the real-time implementation of CEM is its calculation of the global data
sample correlation matrix R, which requires entire data samples. To make CEM a
real-time processing algorithm, the R must be able to adapt to the data samples
currently being processed. In addition, it must also be causal in the sense that it can
only use data samples it has already visited and processed; no future data samples
should be allowed in the data processing. To deal with this causal issue, two types
of causal correlation matrix are defined to implement BIP/BIS and BIL formats. For
the BIP/BIS format, data samples are processed pixel by pixel/sample by sample. In
this case, a BIP/BIS-based CSCRM specified by R(n) can only be formed by data
samples up to the pixel/sample currently being processed, the nth data sample rn.
More specifically, let {r_i}_{i=1}^{n-1} be the set of data samples already processed by CEM.
Then the causal sample correlation matrix used for BIP/BIS should be formed by
R(n) = (1/n) \sum_{i=1}^{n} r_i r_i^T. For the BIL format, data samples are processed line by
line. Thus, a causal line correlation matrix (CLCRM) specified by R(L(n)) should
be formed by only those data lines up to the data line currently being processed,
denoted by the nth data line, L_n, where L(n) comprises all data lines up to L_n, that is,
L(n) = \cup_{l=1}^{n} L_l, with L_l being the lth data line. If we further assume that the number of
samples in each line is N, then the set of data samples {x_i^n}_{i=1}^{N} forms the nth data
line, L_n, which is represented by a data matrix X_n = [x_1^n x_2^n ... x_N^n]. Thus,
R(L_n) = (1/N) X_n X_n^T = (1/N) \sum_{i=1}^{N} x_i^n (x_i^n)^T is a CLCRM formed only by the nth data
line. By virtue of R(L_n), we can define the BIL-based CSCRM up to the nth
data line L_n as R(L(n)) = (1/n) \sum_{l=1}^{n} R(L_l) = (1/(nN)) \sum_{l=1}^{n} \sum_{i=1}^{N} x_i^l (x_i^l)^T. By com-
paring the BIL-based causal R(L(n)) to the R(n) used by BIP/BIS, it is expected that
paring BIL-based causal R(L(n)) to R(n) used by BIP/BIS, it is expected that
developing real-time CEM using the BIL format is more sophisticated than CEM
using BIP/BIS, as will be shown subsequently in derivations.
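To make the two causal constructions concrete, the following sketch (our own illustrative code and naming, not the book's) grows R(n) sample by sample for BIP/BIS and R(L(n)) line by line for BIL; after a whole number of complete lines has been processed, the two constructions agree:

```python
import numpy as np

def cscrm_bip(samples):
    """BIP/BIS: causal sample correlation matrix R(n) = (1/n) sum r_i r_i^T,
    updated one pixel vector at a time."""
    L = samples.shape[1]
    S = np.zeros((L, L))
    for n, r in enumerate(samples, start=1):
        S += np.outer(r, r)
        yield S / n                      # R(n) after seeing n samples

def cscrm_bil(lines):
    """BIL: causal line correlation matrix R(L(n)) = (1/n) sum_l R(L_l),
    where each R(L_l) = (1/N) X_l X_l^T is formed by one full data line."""
    S = None
    for n, X in enumerate(lines, start=1):   # X is N x L (one data line)
        R_line = X.T @ X / X.shape[0]
        S = R_line if S is None else S + R_line
        yield S / n                      # R(L(n)) after n lines

# Usage: a toy cube of 3 lines, N=4 samples per line, L=5 bands.
rng = np.random.default_rng(2)
cube = rng.random((3, 4, 5))
flat = cube.reshape(-1, 5)
R_bip = list(cscrm_bip(flat))[-1]        # R(n) after all 12 samples
R_bil = list(cscrm_bil(cube))[-1]        # R(L(3)) after 3 lines
assert np.allclose(R_bip, R_bil)
```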
There are several advantages of implementing RT-CEM over CEM. First is its
ability to perform CEM in such a manner that various levels of background
suppression for each data sample or data line can be observed visually and analyzed
for detection. Second, these various background suppressions provide users with
sample-varying or data line-varying detection maps. For example, when weak
targets are present in earlier stages, the profiles of sample-varying or data line-
varying background suppressions provided by RT-CEM will help extract such
targets before they are overwhelmed by subsequently detected strong targets.
Third, RT-CEM offers a means of implementing CEM in real time. In fact, the

CEM currently in use is not a true real-time processing algorithm, although its
processing time is very fast and it can be implemented in a near-real-time fashion.
This is because causality is a prerequisite for real-time processing algorithms, and
CEM does not satisfy causality. Finally, the recursive innovation information update
equations developed for RT-CEM pave the way to future hardware implementa-
tion for chip design, such as a field-programmable gate array (FPGA).

5.3 RT-CEM Using BIP/BIS

Now assume that r_n is the data sample vector currently being visited and that {r_i}_{i=1}^{n-1} represents the set of data samples already visited and processed. Then the concept of a CSCRM, denoted by R(n), up to r_n is defined by

R(n) = (1/n) Σ_{i=1}^{n} r_i r_i^T.   (5.9)

As a result, for a given current data sample r_n, a causal CEM (C-CEM) can be rederived from (5.8) via R(n) defined in (5.9) as follows:

δ^{C-CEM}(r_n) = [d^T R^{-1}(n) r_n] / [d^T R^{-1}(n) d].   (5.10)

By virtue of such causality, designing and developing a real-time processing version of CEM becomes feasible. However, C-CEM by itself does not have the capability of being implemented in real time, since the inverse of R(n) must be recalculated each time a new data sample vector r_n comes in. To resolve this issue, this section develops a recursive C-CEM (RC-CEM), denoted by δ^{RC-CEM}(r_n), that implements a recursive update equation with the help of the following Woodbury matrix identity in Appendix A:

(A + uv^T)^{-1} = A^{-1} − [A^{-1} u][v^T A^{-1}] / (1 + v^T A^{-1} u).   (5.11)
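Identity (5.11) is easy to sanity-check numerically; the following sketch (ours) verifies it for the rank-one case u = v actually used below, where the denominator 1 + u^T A^{-1} u is guaranteed positive for a positive definite A:

```python
import numpy as np

# Numerical check of the Woodbury rank-one identity (5.11):
# (A + u v^T)^{-1} = A^{-1} - (A^{-1} u)(v^T A^{-1}) / (1 + v^T A^{-1} u)
rng = np.random.default_rng(1)
B = rng.normal(size=(5, 5))
A = B @ B.T + 5.0 * np.eye(5)   # symmetric positive definite, well conditioned
u = rng.normal(size=5)
v = u                           # u = v is the case needed below for (5.12)

A_inv = np.linalg.inv(A)
lhs = np.linalg.inv(A + np.outer(u, v))
rhs = A_inv - np.outer(A_inv @ u, v @ A_inv) / (1.0 + v @ A_inv @ u)
assert np.allclose(lhs, rhs)
```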

It should be noted that the causal sample autocorrelation matrix R(n) defined in (5.9) can be reexpressed so that R^{-1}(n) = [((n − 1)/n) R(n − 1) + (1/n) r_n r_n^T]^{-1}. By virtue of (5.11), we can derive a causal update equation for δ^{RC-CEM}(r_n) in a manner similar to how a Kalman filter is derived in Poor (1994). More specifically, the derived causal update equation extracts the difference between the pixel currently being processed, r_n, and the processed information obtained from the previous n − 1 data sample vectors, {r_i}_{i=1}^{n-1}, through R^{-1}(n − 1), that is, the information that is contained in r_n but that cannot be obtained or predicted from the previously visited data sample vectors {r_i}_{i=1}^{n-1}. Now let A = ((n − 1)/n) R(n − 1) and u = v = (1/√n) r_n in (5.11). Then R^{-1}(n) can be shown to be

R^{-1}(n) = [(1 − 1/n) R(n − 1)]^{-1} − {[(1 − 1/n) R(n − 1)]^{-1} (1/√n) r_n}{(1/√n) r_n^T [(1 − 1/n) R(n − 1)]^{-1}} / {1 + (1/√n) r_n^T [(1 − 1/n) R(n − 1)]^{-1} (1/√n) r_n}.   (5.12)

Fig. 5.1 Diagram showing implementation of recursive equation specified by (5.12) to find R^{-1}(n)

Furthermore, let R̃(n − 1) = ((n − 1)/n) R(n − 1) and r̃_n = (1/√n) r_n in (5.12); then R̃^{-1}(n − 1) = [(1 − 1/n) R(n − 1)]^{-1}. Figure 5.1 presents a diagram implementing (5.12), where D is a delay of one time unit. In particular, it shows how R̃(n) can be updated recursively from the previously calculated R̃(n − 1) and the data sample currently being processed, r̃_n, via (5.12). This diagram was also derived in Chen et al. (2014a, b, c), since both anomaly detection and CEM use the same causal correlation matrix R(n) to account for spectral statistics provided by causal data sample vectors.
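Equation (5.12) reduces each update of R^{-1}(n) to one matrix-vector product plus a rank-one correction. A minimal NumPy sketch (ours, with hypothetical names; not the book's code) is:

```python
import numpy as np

def update_R_inv(R_inv_prev, r, n):
    """One step of (5.12): given R^{-1}(n-1) and the nth sample r_n, return
    R^{-1}(n), where R(n) = ((n-1)/n) R(n-1) + (1/n) r_n r_n^T.
    Here A^{-1} = (n/(n-1)) R^{-1}(n-1) and u = v = r_n / sqrt(n)."""
    A_inv = (n / (n - 1.0)) * R_inv_prev
    u = r / np.sqrt(n)
    Au = A_inv @ u                   # A^{-1} u (A^{-1} is symmetric here)
    return A_inv - np.outer(Au, Au) / (1.0 + u @ Au)
```

Only the initial condition requires an explicit matrix inversion; each subsequent sample costs O(L^2) rather than the O(L^3) of reinverting R(n) from scratch.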
Now, using (5.11) in conjunction with the causal recursive update equation specified by (5.12), CEM can be further implemented as a recursive version of CEM, R-CEM. Interestingly, according to Fig. 5.1, C-CEM can also be implemented as a real-time version of CEM, referred to as real-time CEM (RT-CEM), δ^{RT-CEM}(r_n), as follows:

δ_n^{RT-CEM}(r_n) = κ(n) d^T R^{-1}(n) r_n
= κ(n) d^T [(1 − 1/n) R(n − 1)]^{-1} r_n − κ(n) d^T ({[(1 − 1/n) R(n − 1)]^{-1} (1/√n) r_n}{(1/√n) r_n^T [(1 − 1/n) R(n − 1)]^{-1}} / {1 + (1/√n) r_n^T [(1 − 1/n) R(n − 1)]^{-1} (1/√n) r_n}) r_n
= (1 − 1/n)^{-1} κ(n) d^T R^{-1}(n − 1) r_n − κ(n) {((1 − 1/n)√n)^{-2} [d^T R^{-1}(n − 1) r_n][r_n^T R^{-1}(n − 1) r_n]} / {1 + ((1 − 1/n) n)^{-1} r_n^T R^{-1}(n − 1) r_n}
= (1 − 1/n)^{-1} κ(n) κ^{-1}(n − 1) δ_{n-1}^{RT-CEM}(r_n) − κ(n) {((1 − 1/n)√n)^{-2} [d^T R^{-1}(n − 1) r_n][r_n^T R^{-1}(n − 1) r_n]} / {1 + ((1 − 1/n) n)^{-1} r_n^T R^{-1}(n − 1) r_n},   (5.13)
where κ(n) = [d^T R^{-1}(n) d]^{-1} is a scalar varying with the data sample r_n. However, κ(n) can also be calculated by updating κ(n − 1) using (5.12). It is worth noting that the first term in (5.13) is the result of combining the information R^{-1}(n − 1) obtained from the previous data samples, {r_i}_{i=1}^{n-1}, with the current data sample, r_n, and the second term in (5.13) is new information, considered innovation information, obtained by correlating R^{-1}(n − 1) and r_n, which cannot be predicted from R^{-1}(n − 1) or r_n alone. It should also be noted that causality is a prerequisite of any real-time processing algorithm because such an algorithm cannot use any data sample vectors beyond the data sample vector r_n currently being processed. In this case, RT-CEM must also be causal.
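Putting (5.10) and (5.12) together, a causal, recursively updated CEM detector can be sketched as follows (our own illustration under the stated assumptions, not the book's code; n0 is the number of samples used for the initial condition):

```python
import numpy as np

def rt_cem_bip(X, d, n0):
    """Sample-by-sample RT-CEM sketch: for each n > n0, update R^{-1}(n)
    recursively via (5.12) and emit the causal CEM detection value (5.10),
    delta(r_n) = d^T R^{-1}(n) r_n / (d^T R^{-1}(n) d)."""
    n_total, L = X.shape
    assert n0 >= L, "n0 >= L is needed to avoid a singular initial R(n0)"
    R_inv = np.linalg.inv(X[:n0].T @ X[:n0] / n0)   # only explicit inversion
    out = []
    for n in range(n0 + 1, n_total + 1):
        r = X[n - 1]
        A_inv = (n / (n - 1.0)) * R_inv              # [(1 - 1/n) R(n-1)]^{-1}
        u = r / np.sqrt(n)
        Au = A_inv @ u
        R_inv = A_inv - np.outer(Au, Au) / (1.0 + u @ Au)   # (5.12)
        out.append((d @ R_inv @ r) / (d @ R_inv @ d))       # (5.10)
    return np.array(out)
```

Each emitted score depends only on samples up to and including r_n, which is precisely the causality requirement discussed above.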
Theoretically speaking, the CSCRM defined in (5.9) should not include the current data sample vector r_n because the role of R^{-1}(n) in (5.10) is to perform background suppression. If r_n is included in (5.9), then the detection of r_n will also be suppressed and reduced. In this case, all the derivations presented in this chapter can easily be modified by replacing R(n) and R(n − 1) with R(n − 1) and R(n − 2), respectively. Nevertheless, the results should not differ much from each other, since the data samples specified by r_n make up very little of R(n) (probably only r_n itself). However, if there is a very large number of data sample vectors specified by r_n, then using R(n − 1) to replace R(n) in RT-CEM will certainly make a significant difference in performance analysis.
The results derived by (5.13) are similar to but different from those derived in Chen et al. (2014a, b, c) and Chang et al. (2015a, b, c, d, e). Because the matched signatures used by the anomaly detectors in Chen et al. (2014a, b, c) must vary with the data sample vectors r_n, the derived real-time anomaly detectors cannot be implemented by recursively updating the previous results as in (5.13), where RT-CEM can actually take advantage of the previously generated CEM result, δ_{n-1}^{RT-CEM}(r_n), to update the current CEM result, δ_n^{RT-CEM}(r_n), because its matched signature is specified by a designated signature d, which remains constant and fixed for all processed data sample vectors. Also, the derivations of (5.13) are not provided in Chang et al. (2015a, b, c, d, e).

5.4 CEM Using BIL

RT-CEM using the BIL format is quite different from RT-CEM using the BIP format because the former must be implemented line by line, in which case data sample vectors are only available after a line of data sample vectors is completed. Thus, the update of the sample correlation matrix must be done line by line instead of sample by sample. Also, assume that the size of a data matrix is M × N, where there are M data lines and each line has N pixels. Let L_l be the lth data line and L(n) the data comprising all of the first n data lines, that is, L(n) = ∪_{l=1}^{n} L_l = {r_i^l}_{i=1,l=1}^{N,n}. Then R(L_l) = (1/N) Σ_{i=1}^{N} r_i^l (r_i^l)^T, where the number of pixels in each line is N, and R(L(n)) = (1/n) Σ_{l=1}^{n} R(L_l) = (nN)^{-1} Σ_{l=1}^{n} Σ_{i=1}^{N} r_i^l (r_i^l)^T. To derive a real-time version of CEM using the BIL format, we use the following matrix identity in Appendix A:

(A + BCD)^{-1} = A^{-1} − A^{-1} B (D A^{-1} B + C^{-1})^{-1} D A^{-1}.   (5.14)

If we also let A = ((n − 1)/n) R(L(n − 1)), B = (1/n) R(L_n), and C = D = I = identity matrix, then

A^{-1} B = R^{-1}(L(n − 1)) R(L_n)/(n − 1) = R_{n|n−1},   (5.15)

and (5.14) becomes

(A + B)^{-1} = A^{-1} − A^{-1} B (A^{-1} B + I)^{-1} A^{-1}.   (5.16)

Now, using (5.16), we can further derive

R^{-1}(L(n)) = [((n − 1)/n) R(L(n − 1)) + (1/n) R(L_n)]^{-1}
= (n/(n − 1)) {R^{-1}(L(n − 1)) − R_{n|n−1} (R_{n|n−1} + I)^{-1} R^{-1}(L(n − 1))}.   (5.17)

Figure 5.2 presents a diagram describing the implementation of (5.17), where D is a time delay of one data line.
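As a sketch of (5.15) through (5.17) (ours, not the book's code; the function name is hypothetical), one line-by-line update of the inverse correlation matrix can be written as:

```python
import numpy as np

def update_RL_inv(RL_inv_prev, Xn, n):
    """One step of (5.17): given R^{-1}(L(n-1)) and the nth line X_n (L x N),
    return R^{-1}(L(n)), where R(L(n)) = ((n-1)/n) R(L(n-1)) + (1/n) R(L_n)."""
    R_Ln = Xn @ Xn.T / Xn.shape[1]                       # R(L_n)
    R_cond = RL_inv_prev @ R_Ln / (n - 1.0)              # R_{n|n-1} in (5.15)
    M = np.linalg.inv(R_cond + np.eye(R_cond.shape[0]))  # (R_{n|n-1} + I)^{-1}
    return (n / (n - 1.0)) * (RL_inv_prev - R_cond @ M @ RL_inv_prev)
```

Unlike the rank-one BIP/BIS update, each line update inverts the full L × L matrix R_{n|n−1} + I, which is where the extra sophistication of the BIL format shows up.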
Fig. 5.2 Diagram showing implementation of (5.17)

It should be noted that a formula similar to (5.17) was also derived as Eq. (5.13) in Du and Nekovei (2009), which was the core of the algorithms they used, namely the TCIMF (Ren and Chang 2000b; Chang 2002a, b), constrained linear discriminant analysis (CLDA) (Du and Chang 2001a, b), and the RX detector (Reed and Yu 1990), all of which took advantage of Eq. (5.13) to calculate the sample correlation matrix R^{-1} specified by Eq. (5.14) in Du and Nekovei (2009). Unfortunately, that work stopped short of deriving recursive equations for all the algorithms, which is actually crucial for them to be implemented in real time. It is our belief that without these recursive equations, their implementation cannot be considered a true real-time process. As will be demonstrated in what follows, deriving such recursive equations is not a trivial matter.
Let X_n = [r_1^n r_2^n ⋯ r_N^n] be the nth data line, where r_i^n is the ith pixel and N is the total number of pixels in each line. If CEM uses the desired signature d and operates on the correlation matrix formed by the first n − 1 data lines, then, by virtue of (5.17), CEM using the desired signature d and the nth line data matrix X_n is given by

δ_{n-1}^{RT-CEM}(X_n) = [d^T R^{-1}(L(n − 1)) X_n] / [d^T R^{-1}(L(n − 1)) d].   (5.18)

In this case, CEM using the desired signature d and the nth line data matrix X_n can be derived from the following (5.19) and (5.20):

d^T R^{-1}(L(n)) X_n = (n/(n − 1)) {d^T R^{-1}(L(n − 1)) X_n − d^T R_{n|n−1} (R_{n|n−1} + I)^{-1} R^{-1}(L(n − 1)) X_n},   (5.19)

d^T R^{-1}(L(n)) d = (n/(n − 1)) {d^T R^{-1}(L(n − 1)) d − d^T R_{n|n−1} (R_{n|n−1} + I)^{-1} R^{-1}(L(n − 1)) d}.   (5.20)

Thus, for n ≥ 2 we can have



δ_{n-1}^{RT-CEM}(X_n) = [d^T R^{-1}(L(n − 1)) d]^{-1} d^T R^{-1}(L(n − 1)) X_n
= ([d^T R^{-1}(L(n − 1)) d]^{-1} d^T R^{-1}(L(n − 1)) r_1^n, …, [d^T R^{-1}(L(n − 1)) d]^{-1} d^T R^{-1}(L(n − 1)) r_N^n)^T   (5.21)

and

δ_n^{RT-CEM}(X_n) = [d^T R^{-1}(L(n)) d]^{-1} d^T R^{-1}(L(n)) X_n
= [d^T R^{-1}(L(n)) d]^{-1} (n/(n − 1)) {d^T R^{-1}(L(n − 1)) X_n − d^T R_{n|n−1} (R_{n|n−1} + I)^{-1} R^{-1}(L(n − 1)) X_n}
= (n/(n − 1)) [d^T R^{-1}(L(n)) d]^{-1} {d^T R^{-1}(L(n − 1)) d · δ_{n-1}^{RT-CEM}(X_n) − d^T R_{n|n−1} (R_{n|n−1} + I)^{-1} R^{-1}(L(n − 1)) X_n}.   (5.22)
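The line-by-line recursion (5.17) through (5.22) can be sketched end to end as follows (our own NumPy illustration, not the book's implementation; for simplicity the first line alone supplies the initial condition, which requires N ≥ L):

```python
import numpy as np

def rt_cem_bil(lines, d):
    """Line-by-line RT-CEM sketch: after each incoming line X_n (L x N),
    update R^{-1}(L(n)) via (5.15)-(5.17) and emit the N detection values
    delta_n(X_n) = d^T R^{-1}(L(n)) X_n / (d^T R^{-1}(L(n)) d)."""
    X1 = lines[0]
    RL_inv = np.linalg.inv(X1 @ X1.T / X1.shape[1])     # R^{-1}(L(1))
    maps = [d @ RL_inv @ X1 / (d @ RL_inv @ d)]
    for n, Xn in enumerate(lines[1:], start=2):
        R_Ln = Xn @ Xn.T / Xn.shape[1]
        R_cond = RL_inv @ R_Ln / (n - 1.0)                          # (5.15)
        M = np.linalg.inv(R_cond + np.eye(R_cond.shape[0]))
        RL_inv = (n / (n - 1.0)) * (RL_inv - R_cond @ M @ RL_inv)   # (5.17)
        maps.append(d @ RL_inv @ Xn / (d @ RL_inv @ d))
    return maps
```

Each returned vector is one line of the progressive detection map, produced as soon as that line of data is complete.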

5.5 Computational Complexity

When an algorithm is used for hardware implementation, its computational complexity determines how complicated the hardware design will be. In this section we analyze the complexity of implementing RT-CEM using BIP/BIS and BIL in terms of how information is processed and updated via the recursive equations (5.13) and (5.22).

5.5.1 RT-CEM Using BIP/BIS

According to (5.13), the computational complexity of RT-CEM using BIP/BIS is solely determined by a matrix-vector product, R^{-1}(n − 1) r_n, obtained by multiplying R^{-1}(n − 1) by r_n, together with two inner products: one of d with R^{-1}(n − 1) r_n and one of r_n with R^{-1}(n − 1) r_n, the latter yielding r_n^T R^{-1}(n − 1) r_n. As a result, a total of one matrix-vector product, R^{-1}(n − 1) r_n, and two inner products, d^T R^{-1}(n − 1) r_n and r_n^T R^{-1}(n − 1) r_n, are required for updating δ^{RT-CEM}(r_n). The only matrix inverse that needs to be calculated for RT-CEM is the initial condition R^{-1}(n_0), where n_0 must be chosen such that n_0 ≥ L, with L being the total number of spectral bands, to avoid singularity in calculating the matrix inverse.

5.5.2 RT-CEM Using BIL

As will be shown in this section, the computational complexity of RT-CEM using BIL is greater than that required by RT-CEM using BIP/BIS in Sect. 5.5.1.
From (5.22), δ_n^{RT-CEM}(X_n) can be updated by δ_{n-1}^{RT-CEM}(X_n) specified by (5.21) and R^{-1}(L(n)) specified by (5.17) through the following three pieces of information:
1. Processed information: δ_{n-1}^{RT-CEM}(X_n) and R^{-1}(L(n − 1));
2. Incoming new line data matrix: X_n = [r_1^n r_2^n ⋯ r_N^n];
3. Innovation information: R_{n|n−1} given by (5.15).

Thus, the computational complexity of δ_n^{RT-CEM}(X_n) is determined by the computational complexity of δ_{n-1}^{RT-CEM}(X_n) as well as the computational complexity of R^{-1}(L(n)), which are in turn determined by:
• δ_{n-1}^{RT-CEM}(X_n) calculated by (5.21): a total of 2N inner products, d^T R^{-1}(L(n − 1)) r_i^n, with 1 ≤ i ≤ N;
• R^{-1}(L(n)) calculated by (5.17): a matrix product R^{-1}(L(n − 1)) R(L_n) calculated by (5.15) for R_{n|n−1}, an inverse of (R_{n|n−1} + I), plus two matrix products calculated by (5.17) to obtain R_{n|n−1}(R_{n|n−1} + I)^{-1} and (R_{n|n−1} + I)^{-1} R^{-1}(L(n − 1));
• Initial condition: R^{-1}(L(n_0)), with n_0 chosen as the smallest number of data lines satisfying n_0 N ≥ L, to avoid singularity in calculating the matrix inverse, where L is the total number of spectral bands.
Note that, unlike δ^{RT-CEM}(r_n) in (5.13), which processes the data r_n sample by sample, δ_{n-1}^{RT-CEM}(X_n) in (5.21) and δ_n^{RT-CEM}(X_n) in (5.22) process data line by line in the sense that they do not process the data line X_n until the nth line data matrix is completed. Thus, initial conditions must start with δ_1^{RT-CEM}(X_1) = [d^T R^{-1}(L_1) X_1]/[d^T R^{-1}(L_1) d] and δ_1^{RT-CEM}(X_2) = [d^T R^{-1}(L_1) X_2]/[d^T R^{-1}(L_1) d]. That is, we need to wait for the first line to come in and be processed to calculate R^{-1}(L_1) prior to δ_1^{RT-CEM}(X_1), and then further wait for the second line to be processed to calculate δ_1^{RT-CEM}(X_2). This is because both δ_1^{RT-CEM}(X_1) and δ_1^{RT-CEM}(X_2) use the same initial sample correlation matrix R^{-1}(L_1) to complete their operations. As a consequence, for n ≥ 2, (5.22) is repeatedly implemented recursively by

δ_1^{RT-CEM}(X_1) → δ_1^{RT-CEM}(X_2) → δ_2^{RT-CEM}(X_2) → ⋯ →
δ_{n-1}^{RT-CEM}(X_{n-1}) → δ_{n-1}^{RT-CEM}(X_n) → δ_n^{RT-CEM}(X_n) → ⋯ →
δ_{M-1}^{RT-CEM}(X_{M-1}) → δ_{M-1}^{RT-CEM}(X_M) → δ_M^{RT-CEM}(X_M),   (5.23)

which is similar to Kalman filtering (Poor 1994).
Figure 5.3 presents the diagram for implementing RT-CEM using BIP/BIS, where c_1(n) = ((1 − 1/n)√n)^{-2}, c_2(n) = ((1 − 1/n)n)^{-1}, κ(n) = [d^T R^{-1}(n) d]^{-1}, and R^{-1}(n − 1) can be calculated by (5.12). Figure 5.4 shows a diagram illustrating the implementation of RT-CEM using BIL, where κ(n) = [d^T R^{-1}(L(n)) d]^{-1} and R^{-1}(L(n)) can be calculated by (5.17).

5.6 Real Image Experiments

Two real hyperspectral image scenes were used in experiments to conduct a performance evaluation of target detection.

5.6.1 HYDICE Data

The image scene shown in Fig. 5.5 (and Fig. 1.9a) was used for the experiments. It was acquired by the airborne HYperspectral Digital Imagery Collection Experiment (HYDICE). It has a size of 64 × 64 pixel vectors with 15 panels in the scene and the ground truth map in Fig. 5.5b (Fig. 1.9b), where the ith panel signature, denoted by p_i, was generated by averaging the red panel center pixels in row i, as shown in Fig. 5.5c (and Fig. 1.10). These panel signatures will be used to represent target knowledge of the panels in each row.
The panel signatures in Fig. 5.5c were used as prior target knowledge required
by CEM for the desired target signatures. Figure 5.6 shows five detection maps
produced by CEM using the five panel signatures p1, p2, p3, p4, and p5 in Fig. 5.5c
as the desired target signatures.
In accordance with the ground truth, the panels in rows 2 and 3 were made of the same material with slightly different colors of paint, a light olive parachute and a dark olive parachute. As a consequence, detecting panels in row 2 would also detect panels in row 3 and vice versa. This fact was confirmed in Fig. 5.7b, c, which reproduces Fig. 5.6 in decibels (db). Similarly, the same applies to the panels in rows 4 and 5, which were also made of the same material with slightly different paint colors, as shown in Fig. 5.7d, e.
Using the results of Fig. 5.6 for a benchmark comparison, RT-CEM was
implemented in two formats, BIP/BIS and BIL, using the same five panel signatures
p1, p2, p3, p4, and p5 in Fig. 5.5c as desired target signatures. Figures 5.8, 5.9, 5.10,
5.11, and 5.12 show color sample-varying CEM-detection maps in db of panel
Fig. 5.3 Diagram showing implementation of δ^{RT-CEM}(r_n) in (5.13)

Fig. 5.4 Diagram showing implementation of δ_n^{RT-CEM}(X_n) in (5.22)


Fig. 5.5 (a) HYDICE panel scene containing 15 panels; (b) ground truth map of spatial locations
of the 15 panels; (c) spectra of p1, p2, p3, p4, and p5

Fig. 5.6 Detection maps of CEM using (a) p1, (b) p2, (c) p3, (d) p4, and (e) p5 as desired target
signature; (a) row 1, (b) row 2, (c) row 3, (d) row 4, and (e) row 5

Fig. 5.7 Detection maps in db of CEM using (a) p1, (b) p2, (c) p3, (d) p4, and (e) p5 as desired
target signature. (a) Panel pixels in row 1, (b) panel pixels in row 2, (c) panel pixels in row 3, (d)
panel pixels in row 4, and (e) panel pixels in row 5

Fig. 5.8 Detection maps in db of RT-CEM using BIP/BIS with p1 used as desired target signature

Fig. 5.9 Detection maps in db of RT-CEM using BIP/BIS with p2 used as desired target signature

Fig. 5.10 Detection maps in db of RT-CEM using BIP/BIS with p3 used as desired target
signature

Fig. 5.11 Detection maps in db of RT-CEM using BIP/BIS with p4 used as desired target
signature

pixels in each of the five rows produced by RT-CEM as processed in real time according to the BIP/BIS format. As in Fig. 5.7, panels in rows 2 and 3 were detected in Figs. 5.9 and 5.10, and panels in rows 4 and 5 were detected in Figs. 5.11 and 5.12. Most importantly, the color sample-varying CEM-detection maps produced by RT-CEM using the BIP/BIS format yielded an intriguing finding. Initially, when no pixels matched the desired signature d, the detection maps showed completely random abundance fractions in Figs. 5.8a–5.12a until the first panel pixel that matched d was detected, at which point the

Fig. 5.12 Detection maps in db of RT-CEM using BIP/BIS with p5 used as desired target
signature

background was suddenly suppressed in Figs. 5.8b–5.12b and the detected panel
pixels stood out as bright pixels with very high-intensity values in Figs. 5.8c–5.12c.
Then the background was continuously suppressed until the process was completed
as shown in Figs. 5.8d, f to 5.12d, f. Of particular interest are the detection maps of panels in row 3 in Fig. 5.10c–e and the detection maps of panels in row 5 in Fig. 5.12c–e, where the background was not supposed to be suppressed until the process reached the matching panel pixels in Figs. 5.10e and 5.12e. However, as noted,
the signatures of panels in rows 2 and 3 are very similar due to the fact that they
were made by the same parachutes with light and dark olive paints, respectively.
Similarly, this is also true of panels in rows 4 and 5. Thus, when RT-CEM
processed the panels in row 2 in Fig. 5.10c and panels in row 4 in Fig. 5.12c,
their signatures were close to the matching signatures p3 and p5, respectively. As a
consequence, these panels were picked up by RT-CEM. When RT-CEM reached
the panels in row 3 in Fig. 5.10d and the panels in Fig. 5.12e that matched the
desired signatures, the background was further suppressed once again. Obviously,
such a phenomenon cannot be provided by Fig. 5.7, which represents results
obtained by implementing CEM in a one-shot operation, that is, the use of global
correlation matrix R, not causal correlation matrix R(n). Comparing Figs. 5.8, 5.9,
5.10, 5.11, and 5.12 to Fig. 5.7, it is clear that RT-CEM seemed to have better
background suppression than CEM did in Fig. 5.7. Nevertheless, final results
produced by RT-CEM using BIP/BIS were comparable to those produced by
CEM, with little appreciable difference between them.
Following similar experiments performed by RT-CEM using BIP/BIS,
Figs. 5.13, 5.14, 5.15, 5.16, and 5.17 show the time-varying progressive detection
maps of panels in five rows produced by RT-CEM according to the BIL format,

Fig. 5.13 Detection maps in db of RT-CEM using BIL with p1 used as desired target signature

Fig. 5.14 Detection maps in db of RT-CEM using BIL with p2 used as desired target signature

where the detected amounts of panel pixels are plotted in db for better visual
assessment.
Experimental results are the same as those obtained by BIP/BIS, except for several very interesting findings.
1. There was a significant improvement in the detection of panels in row 1, as seen by comparing Figs. 5.8 and 5.13: the three panel pixels in row 1 were clearly detected in Fig. 5.13, whereas the subpixel panel p13 in row 1 was barely detected in Fig. 5.8 and was invisible in Fig. 5.15;

Fig. 5.15 Detection maps in db of RT-CEM using BIL with p3 used as desired target signature

Fig. 5.16 Detection maps in db of RT-CEM using BIL with p4 used as desired target signature

2. In detecting panels in row 4, RT-CEM using BIL also detected panels in row 1 before the row 4 panels themselves were detected. Interestingly, once the panels in row 4 were detected, the previously detected panels in row 1 faded away immediately, and eventually only small amounts of abundance fractions of panels in row 1 were picked up in the final detection map. Based on the spectral signature analysis conducted in Chap. 2 of Chang (2003a, b), the spectral signature of p1 is indeed very similar to those of p4 and p5. Thus, before the panels in row 4 were detected, the panels in row 1 were considered close to p4, in which case they were actually picked up as if they were panels in row 4. However, once p411 was detected, the detected abundance fractions of the three panel pixels p11, p12, and p13 quickly diminished. Interestingly, this phenomenon was not observed in Figs. 5.7d and 5.16;
3. Similarly, the foregoing observations also applied to the detection of panels in
row 5. These experiments demonstrated that RT-CEM using BIL did have an
advantage in terms of finding similar signatures while proceeding line by line;
4. As noted, BIL was implemented as a compromise between RT-CEM using R(n)
according to the BIP/BIS format and CEM using the global correlation matrix R.
It turned out that it did have some advantages, such as those mentioned earlier,
and computational savings, to be described in the following section.
Since we have the ground truth of the 19 R panel pixels, we can further calculate detection rates of RT-CEM using the BIP/BIS and BIL formats via a receiver operating characteristic (ROC) analysis for performance evaluation, that is, the detection of panel pixels p11, p12, and p13 in row 1 using p1, the detection of panel pixels p211, p221, p22, and p23 in row 2 using panel signature p2, the detection of panel pixels p311, p312, p32, and p33 in row 3 using panel signature p3, the detection of panel pixels p411, p412, p42, and p43 in row 4 using panel signature p4, and the detection of panel pixels p511, p521, p52, and p53 in row 5 using panel signature p5. To include the number of pixels and data lines as a parameter in examining the various detection performances of RT-CEM on these 19 R panel pixels sample by sample and line by line, we do not follow the traditional approach of plotting an ROC curve of detection probability, PD, versus false alarm probability, PF. Instead, we adopt the common practice used in medical diagnosis of calculating the area under an ROC curve, referred to as the area under curve (AUC) (Metz 1978). Using the AUC, an alternative ROC curve can be plotted with AUC on the y-axis versus pixels or data lines on the x-axis. Figure 5.18 plots the AUC values for the real-time performance of RT-CEM in detecting the 19 R panel pixels using the five panel signatures in Fig. 5.5c as desired target signatures, where data lines instead of pixels are used as the parameter on the x-axis so that a comparative analysis between BIP/BIS and BIL can be conducted; the plots were not started until the data line containing the panel pixels to be detected was reached. For example, for detecting panel pixels in row 5, the plot in Fig. 5.18e was not started until it reached the 58th data line.

Fig. 5.18 AUC values of 19 R panel pixels detected by RT-CEM: (a) panel pixels in row 1; (b) panel pixels in row 2; (c) panel pixels in row 3; (d) panel pixels in row 4; and (e) panel pixels in row 5
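The AUC values plotted in Fig. 5.18 can be computed directly from detector scores and binary ground truth labels; a minimal rank-based sketch (ours, equivalent to the trapezoidal area under the ROC curve, not the book's code) is:

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation:
    the probability that a randomly chosen target pixel scores higher than
    a randomly chosen background pixel, with ties counted as 1/2."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()   # O(P*B) pairwise comparison
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)
```

The pairwise formulation is fine for small scenes; for large images a sorting-based implementation of the same quantity would be preferred.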
As demonstrated in Fig. 5.18, it is interesting to note that RT-CEM using
BIP/BIS produced higher values of AUC than did RT-CEM using BIL for the
detection of panels in rows 2 and 4. However, the conclusion is reversed for the
detection of panels in rows 1, 3, and 5. Moreover, there were significant discrep-
ancies between RT-CEM using BIP/BIS and BIL in detecting panels in rows 1 and
2 compared to small differences in detecting panels in rows 3–5. This is mainly
due to the fact that the spectral signatures in rows 1 and 2 are closer to those in
rows 3–5 according to analysis (Chang 2003a, b, Chap. 2) where sample-varying
RT-CEM could perform better detection than data line-varying RT-CEM did. In
addition, comparison of Figs. 5.17 and 5.12 makes it very clear that RT-CEM
using BIP outperformed BIL in terms of false alarms prior to the detection of the
first panel pixel in row 5 at the 58th data line since RT-CEM using BIP did not
pick up the panel pixels in row 1 as RT-CEM using BIL did. However, this fact
was not reflected in the ROC curve analysis in Fig. 5.18e, where RT-CEM using
BIL performed slightly better than RT-CEM using BIP/BIS. Thus, visual inspec-
tion of detection maps in Figs. 5.12 and 5.17 reveals that RT-CEM using BIP/BIS
performed better than RT-CEM using BIL. On the other hand, examination of the
ROC analysis in Fig. 5.18e leads to the opposite conclusion. Similarly, compar-
ison of Figs. 5.15 and 5.10 showed the same phenomenon in Fig. 5.18c. These
simple examples demonstrated that without implementing CEM in real time, an
incorrect final conclusion could be drawn if one looked only at the final results
produced by CEM. This is because ROC analysis is performed solely based on the
performance of detection probability relative to false alarm probability, not on
both individually. In other words, ROC analysis does not deal with the back-
ground suppression issue (Wang et al. 2013a, b, c). To address this issue, a 3D
ROC analysis recently developed by Chang (2010) and Chang (2013, Chap. 3) can
be used for this purpose.

5.6.2 Computing Time Analysis

One major advantage of implementing RT-CEM is that it makes use of the most
recent incoming data samples, a pixel from BIP/BIS and a line from BIL, to update
currently available information via recursive equations without reprocessing past
information. This can reduce costs by minimizing the memory required for
processing very large hyperspectral image cubes. Specifically, a typical image
cube is near 1 GB in size, while RT-CEM, which inverts a correlation matrix combined with newly received data, needs less than 100 KB. Figure 5.19 plots the
processing times required for RT-CEM using BIP/BIS and BIL to complete the
HYDICE scene where the y-axis is the processing time in seconds and the x-axis is
the value of n corresponding to the data line, Xn, being processed.

Fig. 5.19 Comparison between processing time of RT-CEM using BIP/BIS and BIL

As we can see, after initial conditions the processing times for both formats are nearly constant. As
also shown in the plots, the processing time required for RT-CEM using BIL is much less than that required for RT-CEM using BIP/BIS. This is due to the fact that RT-CEM using BIP/BIS must include time to process each individual pixel, whereas RT-CEM using BIL only processes one data line at a time, in which case it only needs to invert one single correlation matrix to update the current information.
Despite the fact that RT-CEM has advantages over CEM, RT-CEM does have some disadvantages as well. The primary disadvantage is that, with the exception of the pixels/data lines used to generate the initial inverse correlation matrix, each pixel or line must be processed using a different, new causal sample correlation matrix. As a result, the computational complexity increases. Nevertheless, RT-CEM has been shown to have significant advantages over CEM in many respects. Specifically, the ability of RT-CEM to process data in real time without waiting for a complete data set opens up opportunities for applications such as moving target detection, an advantage not offered by the typical CEM algorithm.

5.6.3 AVIRIS Data

In analogy with the HYDICE experiments, we also performed similar experiments on the LCVF data shown in Fig. 1.7 and reproduced in Fig. 5.20, where the five desired target signatures plus two anomalous pixels are highlighted by open circles in Fig. 5.20a, with their corresponding spectral signatures shown in Fig. 5.20b.
However, owing to a lack of complete ground truth, an objective quantitative analysis, such as the ROC analysis conducted for the HYDICE experiments, is not feasible. In this case, we must rely on a visual assessment, where background suppression becomes crucial, as shown in Chap. 16. Figures 5.21, 5.22, 5.23,
Fig. 5.20 Five desired targets and two anomalies shown in (a) with corresponding spectra of
signatures in (b)

Fig. 5.21 Detection maps in db of dry lake by RT-CEM. (a) BIP/BIS. (b) BIL

Fig. 5.22 Detection maps in db of cinders by RT-CEM. (a) BIP/BIS. (b) BIL

5.24, 5.25, and 5.26 show real-time CEM-detection maps for the five signatures, dry lake, cinders, vegetation, shade, and rhyolite, plus the anomaly, produced by RT-CEM using BIP/BIS and BIL, where the desired signatures were obtained by averaging the pixels within a 3 × 3 window according to the ground truth shown in Fig. 5.20a, b. For better visual assessment, the results in Figs. 5.21, 5.22, 5.23, 5.24, 5.25, and 5.26 are presented in db so that the background suppression can be brought out more visibly for inspection.
As we can see from these progressive real-time detection maps, RT-CEM using BIL always had better background suppression than RT-CEM using BIP/BIS at various stages of detection when the areas covered are large, such as cinders, shade, rhyolite, and the middle stages of dry lake. Of particular interest is Fig. 5.23, which shows that

Fig. 5.23 Detection maps in db of vegetation by RT-CEM. (a) BIP/BIS. (b) BIL

real-time detection maps of vegetation produced by RT-CEM using BIL suppressed the background much better than those produced by RT-CEM using BIP/BIS. While we do not have complete pixel-level ground truth of the entire LCVF, the experiments seemed to support the idea that RT-CEM using BIL performed better than RT-CEM using BIP/BIS in the sense of background suppression.

Fig. 5.24 Detection maps in dB of shade by RT-CEM. (a) BIP/BIS. (b) BIL

Finally, Fig. 5.27 also plots the computer processing time required by RT-CEM
using BIP/BIS and BIL. As in Fig. 5.19, RT-CEM using BIL required much less
time than RT-CEM using BIP/BIS, as expected.

Fig. 5.25 Detection maps in dB of rhyolite by RT-CEM. (a) BIP/BIS. (b) BIL

Fig. 5.26 Detection maps in dB of anomaly by RT-CEM. (a) BIP/BIS. (b) BIL

[Fig. 5.27 graphic: time (sec), 0–0.05, versus data line being processed, 0–200; curves: BIP, BIL]

Fig. 5.27 Comparison of computer processing times required by RT-CEM using BIP/BIS and BIL for the LCVF image scene

5.7 Conclusions

This chapter developed real-time processing of CEM with several new ideas. First and foremost was the introduction of the concept of causality into CEM so that RT-CEM can be implemented in real time. Second, a new real-time CEM using BIP/BIS via a causal sample-varying correlation matrix for performing sample-by-sample subpixel detection was developed. Third, a new real-time CEM using BIL via a causal line-varying correlation matrix for performing subpixel detection line by line was also developed, where the derivations for RT-CEM using BIL were completely new and only recently reported in Chang et al. (2015d). These derivations are by no means trivial; they are much more involved than those for RT-CEM using BIP/BIS, as shown by the computational complexity analysis in Sect. 5.4. Finally, for RT-CEM to deal with nonstationarity and improve its computational complexity, RT-CEM was further extended to recursive RT-CEM, where recursive equations are derived, in a fashion similar to how a Kalman filter is carried out, for RT-CEM using both BIP/BIS and BIL. Accordingly, only the most recently received data sample vector is used to update results recursively, sample by sample and line by line. As a result, RT-CEM can extract targets of interest during its ongoing processing on a timely basis.
Chapter 6
Real-Time Recursive Hyperspectral Sample
Processing for Passive Target
Detection: Anomaly Detection

Abstract In Chap. 5, a particular subpixel target detection technique in active hyperspectral target detection, called constrained energy minimization (CEM), was developed for its real-time and causal implementation. Rather than CEM, this chapter focuses on passive hyperspectral target detection and investigates a commonly used passive target detection technique, anomaly detection (AD), especially the real-time and causal processing capabilities that were developed for CEM in Chap. 5, which will also be derived in a similar manner for AD. To this end, this
chapter can be considered a companion chapter of Chap. 5. Technically speaking,
passive target detection must be carried out without any prior knowledge, specif-
ically, a complete lack of availability of target knowledge. Owing to the nature of
hyperspectral imaging sensors, which can uncover many unknown material sub-
stances, passive hyperspectral target detection is a major task in hyperspectral
image analysis and has been studied extensively in the literature. Applications of
passive target detection include surveillance and monitoring, where no knowledge
is required a priori. Of particular interest in passive target detection is AD, which is
generally performed in a completely blind environment. To suppress unknown
backgrounds, AD makes use of the global sample correlation/covariance matrix
R/K, referred to as R/K-AD, so as to enhance its detectability. In general, anomalies
appear unexpectedly and cannot be detected by visual inspection. Most importantly,
they vary with time and data sample vectors. Accordingly, developing real-time
processing for R/K-AD on a timely basis sample by sample is crucial and critical in
many real-world applications, for example, abnormalities in agriculture and for-
estry, environmental monitoring, combat vehicles on a battlefield, drug trafficking
in law enforcement, and food inspection. As noted in Chap. 5, this real-time process
requires that causality be included in an algorithm’s design. In analogy with real-
time CEM, the concept of a causal sample correlation matrix (CSCRM), introduced
in Chap. 5, is also applicable to AD for this purpose. However, in a broader sense,
CSCRM is expanded into a causal sample correlation/covariance matrix (CSCRM/
CSCVM) to replace R/K with CSCRM/CSCVM specified by R(rn)/K(rn), respec-
tively, where rn is the current data sample vector being processed. The capability
for real-time processing is then derived from such a CSCRM/CSCVM because
CSCRM/CSCVM varies with data sample vectors. Thus, causal processing of K-AD/R-AD requires repeatedly calculating inverses of such a sample-varying CSCRM/CSCVM and must be carried out in a causal manner sample by sample, referred to as causal K-AD/causal R-AD (CK-AD/CR-AD). To further reduce


computational complexity and computer processing time for CK-AD/CR-AD, the notion of innovation in a Kalman filter developed in Chap. 3 is once again
used to derive recursive causal update equations to calculate CSCRM/CSCVM,
which leads to recursive versions of K-AD/R-AD, to be called recursive CK-AD/
CR-AD (R-CK-AD/R-CR-AD). In particular, when R-CK-AD/R-CR-AD is realized by real-time processing, it is referred to as RT-CK-AD/RT-CR-AD in this chapter.

6.1 Introduction

With very high spectral resolution, a hyperspectral imaging sensor is capable of uncovering many subtle signal sources that cannot be known by prior knowledge or
be visually inspected by image analysts. Many such signal sources generally appear
as anomalies in data. Accordingly, anomaly detection (AD) has received consider-
able attention in hyperspectral data exploitation in recent years. While a cut-and-
dried definition of anomaly may not be possible, a general consensus is that
anomalies should be those targets that stand out in their surrounding neighborhood.
In other words, an anomaly should be a target whose presence cannot be known
prior to data processing but can be characterized by several unique features: (1) it
has unexpected presence, (2) has a low probability of occurrence, (3) is an
insignificant object resulting from a relatively small sample population, and,
(4) most importantly, has a signature that is spectrally distinct from the spectral
signatures of its surrounding data samples. Chapter 14 in Chang (2016) investi-
gates these issues and characterizes anomalies in great detail. Targets with these
properties include endmembers defined as pure signatures to specify spectral
classes, special species in agriculture and ecology, rare minerals in geology, toxic
wastes in environmental monitoring, oil spills in water pollution, drug and
smuggler trafficking in law enforcement, artificial objects on a battlefield,
unusual terrorism activities in intelligence gathering, and tumors in medical
imaging. To effectively detect such targets, an algorithm developed by Reed
and Yu (1990), referred to as the Reed–Xiaoli detector (RXD), has been widely
used for this purpose. Since its introduction, many AD-like anomaly detectors
have been proposed. Of particular interest are anomaly detectors that modify
RXD by replacing the global sample covariance matrix, K, with the global
sample correlation matrix, R. In this case, the resulting RXD is called R-AD,
while, by contrast, the RXD using K is denoted by K-AD. This R-AD was further
used as a base to develop a causal version of R-AD, referred to as causal R-AD (CR-AD) in Chap. 14 (Chang 2016), which implements R-AD using a causal sample correlation matrix (CSCRM), R(r_n), formed by only the data sample vectors $\{r_i\}_{i=1}^{n-1}$ up to the data sample vector currently being processed, r_n. This CR-AD
is then used to derive a causal version of K-AD, referred to as causal K-AD

(CK-AD). One of the most important applications for AD is the detection of
moving unknown targets in real time to locate these targets before their disap-
pearance in a short time period or before they are compromised by background or
from being dominated by other signal sources. Both CR-AD and CK-AD pave the
way to real-time causal versions of anomaly detectors.
In AD real-time causal processing is particularly important and vital in many
practical applications. First, it brings tremendous cost and data storage saving,
specifically, data archiving in data communication as well as data transmission.
Second, it can satisfy constraints on available limited bandwidth. Third, it also
achieves data compression while data are being processed. Fourth, it can detect
anomalies, such as moving targets, which may appear in a very short time frame.
This type of target may show up suddenly and instantly, then disappear quickly.
Finally and most importantly, real-time processing allows users to see time-varying
progressive background suppression, which cannot be accomplished by traditional
one-shot-operation anomaly detectors such as K-AD/R-AD. This is because no
prior knowledge is available for AD, and progressive background suppression
varying with time provides an opportunity to see how anomalies are detected in
real time as the AD process is taking place. It is also particularly critical for cases
where weak anomalies are detected and may be overwhelmed by subsequently
detected strong anomalies. Therefore, for an algorithm to be able to detect these
targets on a timely basis, the process must be able to be carried out in real time. In
the meantime, the data that can be used for real-time data processing should be only
those that have been visited and already processed. Accordingly, an AD process
must be carried out in a causal manner. However, owing to the nature of AD, an
anomaly generally has distinct spectral properties from those in its surrounding
neighborhood that cannot be causal. To capture such distinct spectral characteris-
tics, an anomaly detector generally requires intersample statistics, such as sample
covariance/correlation statistics, for example, using a sliding window centered at
the data sample currently being processed. This requirement makes AD inapplica-
ble to causal processing because it must calculate sample covariance or correlation
statistics from the entire set of data sample vectors or sample vectors within a given
window, which certainly cannot be calculated causally, as discussed in Chap. 18
(Chang 2016). Furthermore, since K-AD makes use of the global sample covariance
matrix, K, the global sample mean of all data sample vectors must be calculated,
which requires full access to the entire data set, and must be done prior to
AD. Therefore, from an algorithmic implementation point of view, K-AD is neither
a causal processing algorithm nor a real-time processing algorithm. To resolve this
issue, an anomaly detector proposed in Chang and Chiang (2002) suggested the use
of the sample correlation matrix, R, to replace the sample covariance matrix in
K-AD as R-AD, which was further extended by Chang and Hsueh (2006) to causal
R-AD (CR-AD) so that CR-AD can be implemented via a QR decomposition in real
time (Chang et al. 2001). However, this approach did not really address the issue of causality in real-time implementation, nor was it addressed in Du and Nekovei (2005, 2009) or Tarabalka et al. (2009). Interestingly, on the one hand, causal

processing does not have to be done in real time. On the other hand, real-time
processing must be causal. Unfortunately, this causal issue has never been
addressed by many reported real-time processing algorithms. Following an approach
similar to that treated in Chap. 5, two commonly used anomaly detectors, K-AD and
R-AD, can be first extended to their causal versions, CK-AD and CR-AD, which can
process data in a causal manner sample by sample. Then CK-AD and CR-AD are
further extended to their respective recursive versions, referred to as recursive
CK-AD/CR-AD (R-CK-AD/R-CR-AD), which can update data information via
recursive equations without reprocessing all past data sample vectors as the causal
sample correlation/covariance matrix (CSCRM/CSCVM) varies with the data sample
vectors being processed. Since R-CK-AD and R-CR-AD are eventually realized by
real-time processing, they are actually real-time CK-AD (RT-CK-AD) and real-time
CR-AD (RT-CR-AD).

6.2 Anomaly Detection

One of the most widely used anomaly detectors is the algorithm developed by Reed
and Yu, the K-Anomaly Detector (K-AD), which makes use of the global sample
covariance matrix, K, to account for spectral statistics among data sample vectors.
Since its introduction, many different K-AD-like anomaly detectors have been
proposed. Among them, of particular interest is causal R-AD (CR-AD), developed in Chang and Chiang (2002) and Chang (2003a).
In what follows, we briefly describe these two anomaly detectors. Assume that $\{r_i\}_{i=1}^N$ is the set of entire data sample vectors in the data, where N is the total number of data sample vectors and r_i = (r_{i1}, r_{i2}, …, r_{iL})^T is the ith data sample vector, with L being the total number of spectral bands.

6.2.1 K-AD/R-AD

The K-AD, denoted by δK-AD(r), is also known as RXD, developed by Reed and Yu
(1990), and is given by

δ_{K-AD}(r) = (r − μ)^T K^{−1} (r − μ),   (6.1)

where μ is the global sample mean and K is the global sample covariance matrix. The form of δ_{K-AD}(r) in (6.1) is actually the well-known Mahalanobis distance. However, from a detection point of view, the use of K^{−1} can be interpreted as a whitening process to suppress the image background. By replacing the global covariance matrix K and (r − μ) with the global correlation matrix R and the data sample vector r,

Chang and Chiang (2002) further modified K-AD by developing a correlation matrix (R)-based anomaly detector, referred to as R-AD, which is given by

δ_{R-AD}(r) = r^T R^{−1} r,   (6.2)

where the correlation matrix R is used to account for both first- and second-order statistics by including the sample mean μ, compared to K-AD, which uses the covariance matrix K to account for only second-order statistics.
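To make (6.1) and (6.2) concrete, the following pure-Python sketch (ours, not from the text) scores a tiny four-sample, two-band toy data set; the sample values, helper names, and the closed-form 2 × 2 inverse are illustrative assumptions standing in for a general L-band implementation.

```python
def mat_inv_2x2(m):
    # closed-form inverse of a 2 x 2 matrix [[a, b], [c, d]]
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def quad_form(x, m_inv, y):
    # x^T M^{-1} y for 2-D vectors, given M^{-1}
    t = [m_inv[0][0] * y[0] + m_inv[0][1] * y[1],
         m_inv[1][0] * y[0] + m_inv[1][1] * y[1]]
    return x[0] * t[0] + x[1] * t[1]

samples = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1], [5.0, 9.0]]  # last sample plays the anomaly
n = len(samples)
mu = [sum(r[k] for r in samples) / n for k in range(2)]      # global sample mean
K = [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in samples) / n
      for j in range(2)] for i in range(2)]                  # global covariance matrix
R = [[sum(r[i] * r[j] for r in samples) / n
      for j in range(2)] for i in range(2)]                  # global correlation matrix
K_inv, R_inv = mat_inv_2x2(K), mat_inv_2x2(R)

def delta_k_ad(r):
    # (6.1): Mahalanobis distance (r - mu)^T K^{-1} (r - mu)
    d = [r[0] - mu[0], r[1] - mu[1]]
    return quad_form(d, K_inv, d)

def delta_r_ad(r):
    # (6.2): r^T R^{-1} r
    return quad_form(r, R_inv, r)

scores = [delta_k_ad(r) for r in samples]
assert scores.index(max(scores)) == len(scores) - 1          # the anomaly stands out
```

A useful sanity check on such a sketch is the trace identity: the δ_{K-AD} scores (and likewise the δ_{R-AD} scores) of all n samples sum to nL when K (or R) is formed from those same samples, so no single score can grow without bound.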

6.2.2 Causal R-AD/K-AD

Let $\{r_i\}_{i=1}^N$ be a set of data sample vectors to be processed. The causal R-AD (CR-AD), denoted by δ_{CR-AD}(r), is specified by

δ_{CR-AD}(r_n) = r_n^T R(n)^{−1} r_n,   (6.3)

where r_n is the nth data sample vector currently being processed, R(n) is the CSCRM formed by R(n) ≡ R(r_n) = (1/n) Σ_{i=1}^n r_i r_i^T, and r_i = (r_{i1}, r_{i2}, …, r_{iL})^T is the ith data sample vector, where L is the total number of spectral bands. By virtue of (6.1), a causal version of K-AD can be expressed as

δ_{CK-AD}(r_n) = (r_n − μ(n))^T K(n)^{−1} (r_n − μ(n)),   (6.4)


where μ(n) = (1/n) Σ_{i=1}^n r_i is the causal sample mean averaged over all the data sample vectors $\{r_i\}_{i=1}^n$, and K(n) ≡ K(r_n) = (1/n) Σ_{i=1}^n (r_i − μ(n))(r_i − μ(n))^T is the CSCVM formed by the data sample vectors $\{r_i\}_{i=1}^n$. In light of (6.4), K-AD in (6.1) can be reexpressed as δ_{K-AD}(r_n) = (r_n − μ(N))^T K(N)^{−1} (r_n − μ(N)), where μ(N) = (1/N) Σ_{i=1}^N r_i is the global sample mean averaged over all data sample vectors $\{r_i\}_{i=1}^N$ and K(N) = (1/N) Σ_{i=1}^N (r_i − μ(N))(r_i − μ(N))^T is the global covariance matrix formed by all the data sample vectors $\{r_i\}_{i=1}^N$. Thus, K-AD can be considered a special case of CK-AD, and the two detectors are identical only when they reach the last data sample vector, r_N.
An interesting note is worthwhile at this point. If the r_n^T in (6.3) is replaced by [d/(d^T R^{−1}(n) d)]^T, where d is the desired signature to be detected, then (6.3) becomes a causal version of a well-known subpixel detector, constrained energy minimization (CEM), developed in Harsanyi (1993) and Chang (2003).
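The causality of (6.3) can be sketched as follows. This is our own minimal illustration with made-up two-band samples: R(n) at each step is formed only from the samples seen so far, and each incoming sample is scored against it.

```python
def inv2(m):
    # closed-form inverse of a 2 x 2 matrix
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

stream = [[1.0, 2.0], [1.1, 2.1], [0.9, 1.8], [1.2, 2.2], [6.0, 1.0]]
S = [[0.0, 0.0], [0.0, 0.0]]            # running sum of r_i r_i^T
scores = []
for n, r in enumerate(stream, start=1):
    for i in range(2):
        for j in range(2):
            S[i][j] += r[i] * r[j]
    if n < 2:                            # need at least L = 2 samples for full rank
        continue
    Rn = [[S[i][j] / n for j in range(2)] for i in range(2)]   # CSCRM R(n)
    Rinv = inv2(Rn)
    t = [Rinv[0][0] * r[0] + Rinv[0][1] * r[1],
         Rinv[1][0] * r[0] + Rinv[1][1] * r[1]]
    scores.append(r[0] * t[0] + r[1] * t[1])  # delta_CR-AD(r_n) = r_n^T R(n)^{-1} r_n
assert scores[-1] == max(scores)         # the spectrally distinct sample stands out
```

Note that only previously visited samples enter R(n), which is exactly the causality constraint; a one-shot R-AD would instead use R built from the whole data set.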

6.3 Matrix Inverse Calculation

According to the functional forms (6.1)–(6.4) operated by an anomaly detector, a matrix inverse calculation generally involves finding the inverse of either the CSCVM, K(n), or the CSCRM, R(n), to account for the surrounding environments of anomalies when a given set of data sample vectors $\{r_i\}_{i=1}^n$ is considered.

6.3.1 Causal Sample Covariance/Correlation Matrix

Assume that $\{r_i\}_{i=1}^n$ is a set of input data sample vectors, where r_i = (r_{i1}, r_{i2}, …, r_{iL})^T is the ith data sample vector and L is the total number of spectral bands. Suppose that r_n is the most recent input data sample vector and μ(n) = (1/n) Σ_{i=1}^n r_i is the sample mean of $\{r_i\}_{i=1}^n$. Then the CSCRM and CSCVM formed by $\{r_i\}_{i=1}^n$, R(n) and K(n), are defined by R(n) = (1/n) Σ_{i=1}^n r_i r_i^T and K(n) = (1/n) Σ_{i=1}^n (r_i − μ(n))(r_i − μ(n))^T, respectively, and the relationship between R(n) and K(n) can be derived as follows:

K(n) = (1/n) Σ_{i=1}^n (r_i − μ(n))(r_i − μ(n))^T
     = (1/n) Σ_{i=1}^n r_i r_i^T − [(1/n) Σ_{i=1}^n r_i] μ^T(n) − μ(n) [(1/n) Σ_{i=1}^n r_i^T] + (1/n) Σ_{i=1}^n μ(n) μ^T(n)
     = R(n) − μ(n) μ^T(n) − μ(n) μ^T(n) + μ(n) μ^T(n)
     = R(n) − μ(n) μ^T(n).   (6.5)

There are two forms of particular interest.

6.3.1.1 Real-Time Processing Form

The form specified by (6.5) for R(n) and K(n) can be expressed by separating the most recent input data sample vector r_n from the previous input data sample vectors, $\{r_i\}_{i=1}^{n-1}$, to derive a form to be used for real-time processing as follows:

R(n) = (1/n) Σ_{i=1}^n r_i r_i^T = (1/n) Σ_{i=1}^{n−1} r_i r_i^T + (1/n) r_n r_n^T
     = ((n − 1)/n) (1/(n − 1)) Σ_{i=1}^{n−1} r_i r_i^T + (1/n) r_n r_n^T   (6.6)
     = ((n − 1)/n) R(n − 1) + (1/n) r_n r_n^T

and

K(n) = (1/n) Σ_{i=1}^n (r_i − μ(n))(r_i − μ(n))^T
     = ((n − 1)/n) K(n − 1) + ((n − 1)/n²) (r_n − μ(n − 1))(r_n − μ(n − 1))^T,   (6.7)

respectively.
Since μ(n) = (1/n) Σ_{i=1}^n r_i and K(n) = (1/n) Σ_{i=1}^n (r_i − μ(n))(r_i − μ(n))^T, we can derive, exactly as in (6.5),

K(n) = R(n) − μ(n) μ^T(n)   (6.8)

and

μ(n) = (1/n) Σ_{i=1}^n r_i = (1/n) Σ_{i=1}^{n−1} r_i + (1/n) r_n
     = ((n − 1)/n) (1/(n − 1)) Σ_{i=1}^{n−1} r_i + (1/n) r_n
     = (1 − 1/n) μ(n − 1) + (1/n) r_n.

Thus,

μ(n) μ^T(n) = ((n − 1)/n)² μ(n − 1) μ^T(n − 1) + ((n − 1)/n²) [μ(n − 1) r_n^T + r_n μ^T(n − 1)] + (1/n²) r_n r_n^T,

and therefore

K(n) = R(n) − μ(n) μ^T(n) = (1/n) Σ_{i=1}^{n−1} r_i r_i^T + (1/n) r_n r_n^T − μ(n) μ^T(n)
     = ((n − 1)/n) R(n − 1) + (1/n) r_n r_n^T − ((n − 1)/n)² μ(n − 1) μ^T(n − 1)
       − ((n − 1)/n²) [μ(n − 1) r_n^T + r_n μ^T(n − 1)] − (1/n²) r_n r_n^T
     = ((n − 1)/n) {R(n − 1) − ((n − 1)/n) μ(n − 1) μ^T(n − 1)}
       − ((n − 1)/n²) [μ(n − 1) r_n^T + r_n μ^T(n − 1)] + ((n − 1)/n²) r_n r_n^T
     = ((n − 1)/n) {K(n − 1) + (1/n) μ(n − 1) μ^T(n − 1)}
       − ((n − 1)/n²) [μ(n − 1) r_n^T + r_n μ^T(n − 1)] + ((n − 1)/n²) r_n r_n^T
     = ((n − 1)/n) K(n − 1) + ((n − 1)/n²) [μ(n − 1) μ^T(n − 1) − μ(n − 1) r_n^T − r_n μ^T(n − 1) + r_n r_n^T]
     = (1 − 1/n) K(n − 1) + ((n − 1)/n²) (r_n − μ(n − 1))(r_n − μ(n − 1))^T.   (6.9)
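As a quick numeric cross-check (our own, not in the text), the recursions (6.6) for R(n), (6.9) for K(n), and the sample-mean update can be verified against the direct definitions on a short two-band stream:

```python
stream = [[1.0, 2.0], [3.0, 1.0], [2.0, 2.0], [4.0, 0.5]]

def direct_R(samples):
    # R(n) = (1/n) sum r_i r_i^T
    n = len(samples)
    return [[sum(r[i] * r[j] for r in samples) / n for j in range(2)] for i in range(2)]

def direct_K(samples):
    # K(n) = (1/n) sum (r_i - mu(n))(r_i - mu(n))^T
    n = len(samples)
    mu = [sum(r[k] for r in samples) / n for k in range(2)]
    return [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in samples) / n
             for j in range(2)] for i in range(2)]

R, K, mu = direct_R(stream[:1]), direct_K(stream[:1]), list(stream[0])
for n in range(2, len(stream) + 1):
    r = stream[n - 1]
    # (6.6): R(n) = ((n-1)/n) R(n-1) + (1/n) r_n r_n^T
    R = [[((n - 1) / n) * R[i][j] + r[i] * r[j] / n for j in range(2)] for i in range(2)]
    # (6.9): K(n) = (1 - 1/n) K(n-1) + ((n-1)/n^2)(r_n - mu(n-1))(r_n - mu(n-1))^T
    d = [r[0] - mu[0], r[1] - mu[1]]
    K = [[(1 - 1 / n) * K[i][j] + ((n - 1) / n ** 2) * d[i] * d[j]
          for j in range(2)] for i in range(2)]
    # mean update: mu(n) = (1 - 1/n) mu(n-1) + (1/n) r_n
    mu = [(1 - 1 / n) * mu[k] + r[k] / n for k in range(2)]

assert all(abs(R[i][j] - direct_R(stream)[i][j]) < 1e-12 for i in range(2) for j in range(2))
assert all(abs(K[i][j] - direct_K(stream)[i][j]) < 1e-12 for i in range(2) for j in range(2))
```

The recursive results agree with the batch definitions to machine precision, which is the property the real-time forms in Sects. 6.3.2 and 6.4 rely on.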

6.3.1.2 Matrix Form


Define X_n = [r_1 r_2 ⋯ r_{n−1} r_n] as the data matrix formed by $\{r_i\}_{i=1}^n$; then R(n) can be expressed in an alternative form:

R(n) = (1/n) X_n X_n^T.   (6.10)

Now, if we define r̃_i = r_i/√n and X̃_n = [r̃_1 r̃_2 ⋯ r̃_n], then μ̃(n) = (1/n) Σ_{i=1}^n r̃_i = (1/√n) μ(n) is the mean of the data sample vectors $\{r̃_i\}_{i=1}^n$, and (6.10) becomes

R(n) = X̃_n X̃_n^T.   (6.11)

Similarly, we can define r̄_i = (r_i − μ(n))/√n and X̄_n = [r̄_1 r̄_2 ⋯ r̄_n]; then

K(n) = X̄_n X̄_n^T.   (6.12)

Since r̄_i = (r_i − μ(n))/√n = r_i/√n − μ(n)/√n = r̃_i − μ̃(n), we have

K(n) = (1/n) Σ_{i=1}^n (r_i − μ(n))(r_i − μ(n))^T = Σ_{i=1}^n (r̃_i − μ̃(n))(r̃_i − μ̃(n))^T
     = R(n) − μ(n) μ^T(n) = R(n) − n μ̃(n) μ̃^T(n).   (6.13)

It should be noted that from (6.11) and (6.12) both products X̃_n X̃_n^T and X̄_n X̄_n^T have the same size of L × L, which is determined by the total number of spectral bands, L, not the total number of data sample vectors, n.

6.3.2 Calculation of Matrix Inverse Using Matrix Identities

The most challenging issue in performing AD is the matrix inverse calculation of R(n) and K(n).

6.3.2.1 Real-Time Processing Form Using Woodbury Identity

To implement real-time processing, the Woodbury matrix identity

(A + u v^T)^{−1} = A^{−1} − (A^{−1} u)(v^T A^{−1}) / (1 + v^T A^{−1} u)   (6.14)

is very helpful.
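A small numeric check (ours, not from the text) of the rank-one identity (6.14) on 2 × 2 matrices; the particular A, u, and v are arbitrary test values:

```python
def inv2(m):
    # closed-form inverse of a 2 x 2 matrix
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2.0, 0.3], [0.3, 1.0]]
u, v = [1.0, 2.0], [0.5, -1.0]
Ainv = inv2(A)
Au = [Ainv[0][0] * u[0] + Ainv[0][1] * u[1],
      Ainv[1][0] * u[0] + Ainv[1][1] * u[1]]                  # A^{-1} u
vA = [v[0] * Ainv[0][0] + v[1] * Ainv[1][0],
      v[0] * Ainv[0][1] + v[1] * Ainv[1][1]]                  # v^T A^{-1}
denom = 1 + vA[0] * u[0] + vA[1] * u[1]                       # 1 + v^T A^{-1} u
rhs = [[Ainv[i][j] - Au[i] * vA[j] / denom for j in range(2)] for i in range(2)]
lhs = inv2([[A[i][j] + u[i] * v[j] for j in range(2)] for i in range(2)])
assert all(abs(lhs[i][j] - rhs[i][j]) < 1e-12 for i in range(2) for j in range(2))
```

The identity holds whenever 1 + v^T A^{−1} u ≠ 0, which is exactly the quantity that appears in the denominators of (6.15)–(6.17) below it in the chapter.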
Since R(n) = ((n − 1)/n) R(n − 1) + (1/n) r_n r_n^T from (6.6), we can calculate R^{−1}(n) via (6.14) by setting A = ((n − 1)/n) R(n − 1) and u = v = (1/√n) r_n as follows:

R^{−1}(n) = [(1 − 1/n) R(n − 1)]^{−1}
          − {[(1 − 1/n) R(n − 1)]^{−1} (1/√n) r_n}{(1/√n) r_n^T [(1 − 1/n) R(n − 1)]^{−1}}
            / {1 + (1/n) r_n^T [(1 − 1/n) R(n − 1)]^{−1} r_n}.   (6.15)

Similarly, we can calculate K^{−1}(n) from (6.13) in terms of R^{−1}(n) by setting A = R(n), u = μ(n), and v = −μ(n) in (6.14):

K^{−1}(n) = [R(n) − μ(n) μ^T(n)]^{−1}
          = R^{−1}(n) + [R^{−1}(n) μ(n)][μ^T(n) R^{−1}(n)] / [1 − μ^T(n) R^{−1}(n) μ(n)],   (6.16)

where R^{−1}(n) from (6.15) can be used to calculate (6.16).
As an alternative to (6.16), we can also calculate K^{−1}(n) directly without resorting to R^{−1}(n) by setting A = (1 − 1/n) K(n − 1) and u = v = (√(n − 1)/n)(r_n − μ(n − 1)) in (6.14) as follows:

K^{−1}(n) = [(1 − 1/n) K(n − 1)]^{−1}
          − {[(1 − 1/n) K(n − 1)]^{−1} (√(n − 1)/n)(r_n − μ(n − 1))}{(√(n − 1)/n)(r_n − μ(n − 1))^T [(1 − 1/n) K(n − 1)]^{−1}}
            / {1 + ((n − 1)/n²)(r_n − μ(n − 1))^T [(1 − 1/n) K(n − 1)]^{−1} (r_n − μ(n − 1))}.   (6.17)

6.3.2.2 Matrix Form Using Sample Correlation Matrix Identity

Assume that M is a matrix of size L × p that can be written M = [U d], with U an L × (p − 1) matrix and d an L-dimensional vector. Then, according to a matrix inverse identity derived in Settle (1996), the inverse of M^T M can be expressed as

(M^T M)^{−1} = [ U^T U    U^T d
                 d^T U    d^T d ]^{−1}
             = [ (U^T U)^{−1} + β U^# d d^T (U^#)^T    −β U^# d
                 −β d^T (U^#)^T                        β ],   (6.18)

where U^# = (U^T U)^{−1} U^T and β = {d^T [I − U (U^T U)^{−1} U^T] d}^{−1} = (d^T P_U^⊥ d)^{−1}.
Now, if we let U = [r_1 ⋯ r_{n−2} r_{n−1}] = X_{n−1} and d = r_n, then M = [U d] = [r_1 r_2 ⋯ r_{n−1} r_n] = X_n is an L × n matrix, and (6.18) gives

(n R(n))^{−1} = [ X_{n−1}^T X_{n−1}    X_{n−1}^T r_n
                  r_n^T X_{n−1}        r_n^T r_n ]^{−1}
              = [ ((n − 1) R(n − 1))^{−1} + β X_{n−1}^# r_n r_n^T (X_{n−1}^#)^T    −β X_{n−1}^# r_n
                  −β r_n^T (X_{n−1}^#)^T                                           β ],   (6.19)

with X_{n−1}^# = (X_{n−1}^T X_{n−1})^{−1} X_{n−1}^T and β = (r_n^T P^⊥_{X_{n−1}} r_n)^{−1}.
Using (6.11) with d̃ = r̃_n = r_n/√n and X̃_n = [r̃_1 r̃_2 ⋯ r̃_n], (6.19) can be reexpressed as

R^{−1}(n) = (X̃_n X̃_n^T)^{−1} = [ X̃_{n−1} X̃_{n−1}^T    X̃_{n−1} r̃_n
                                  r̃_n^T X̃_{n−1}        r̃_n^T r̃_n ]^{−1}
          = [ [((n − 1)/n) R(n − 1)]^{−1} + β X̃_{n−1}^# d̃ d̃^T (X̃_{n−1}^#)^T    −β X̃_{n−1}^# r̃_n
              −β r̃_n^T (X̃_{n−1}^#)^T                                            β ],   (6.20)

where X̃_{n−1}^# = (X̃_{n−1}^T X̃_{n−1})^{−1} X̃_{n−1}^T and β = (r̃_n^T P^⊥_{X̃_{n−1}} r̃_n)^{−1}.

6.4 Real-Time Causal Anomaly Detection

Two types of real-time anomaly detectors are derived in this section: real-time causal R-AD, which uses the CSCRM, R(n), to account for the spectral correlation characterized by the first two orders of spectral statistics, and real-time causal K-AD, which uses K(n) to account for only the second-order spectral statistics.

6.4.1 Real-Time Causal R-AD

From (6.6) the innovation information is provided by the pixel currently being processed, r_n, and R^{−1}(n − 1) is the processed information obtained from the previous n − 1 data sample vectors $\{r_i\}_{i=1}^{n-1}$. By means of (6.15) we can derive

δ_{RT-CR-AD}(r_n) = r_n^T R^{−1}(n) r_n = r_n^T [(1 − 1/n) R(n − 1)]^{−1} r_n
  − r_n^T {[(1 − 1/n) R(n − 1)]^{−1} (1/√n) r_n}{(1/√n) r_n^T [(1 − 1/n) R(n − 1)]^{−1}} r_n
    / {1 + (1/n) r_n^T [(1 − 1/n) R(n − 1)]^{−1} r_n}
  = (1 − 1/n)^{−1} r_n^T R^{−1}(n − 1) r_n
  − (1/n)(1 − 1/n)^{−2} {r_n^T R^{−1}(n − 1) r_n}² / {1 + (1/(n − 1)) r_n^T R^{−1}(n − 1) r_n}.   (6.21)
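The practical payoff of (6.21) is that R^{−1}(n) can be maintained by a rank-one update per incoming sample instead of a fresh inversion. The sketch below is our own toy illustration (two bands, made-up samples): it applies the Woodbury step of (6.15) per sample and cross-checks the result against a from-scratch inversion at every step.

```python
def inv2(m):
    # closed-form inverse of a 2 x 2 matrix
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

stream = [[1.0, 2.0], [2.0, 1.0], [1.5, 1.8], [0.5, 2.2], [3.0, 3.1]]
L = 2
# one-time initial condition: invert R(L) directly
S = [[sum(r[i] * r[j] for r in stream[:L]) for j in range(L)] for i in range(L)]
Rinv = inv2([[S[i][j] / L for j in range(L)] for i in range(L)])
for n in range(L + 1, len(stream) + 1):
    r = stream[n - 1]
    # R(n) = ((n-1)/n) R(n-1) + (1/n) r_n r_n^T, so apply (6.14) with
    # A = ((n-1)/n) R(n-1) and u = v = r_n / sqrt(n)
    a = (n - 1) / n
    Ainv = [[Rinv[i][j] / a for j in range(L)] for i in range(L)]
    w = [sum(Ainv[i][k] * r[k] for k in range(L)) for i in range(L)]   # A^{-1} r_n
    denom = 1 + sum(r[k] * w[k] for k in range(L)) / n                 # 1 + v^T A^{-1} u
    Rinv = [[Ainv[i][j] - (w[i] * w[j] / n) / denom for j in range(L)] for i in range(L)]
    # sanity check against a from-scratch inversion of R(n)
    for i in range(L):
        for j in range(L):
            S[i][j] += r[i] * r[j]
    direct = inv2([[S[i][j] / n for j in range(L)] for i in range(L)])
    assert all(abs(Rinv[i][j] - direct[i][j]) < 1e-9 for i in range(L) for j in range(L))
score = sum(stream[-1][i] * sum(Rinv[i][j] * stream[-1][j] for j in range(L))
            for i in range(L))           # delta_RT-CR-AD of the most recent sample
```

Only one true matrix inverse, R^{−1}(L), is ever computed; every later R^{−1}(n) comes from the per-sample update, mirroring the cost argument made in Sect. 6.4.3.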

If we further let A = ((n − 1)/n) R(n − 1) ≡ R̃(n − 1) and u = v = (1/√n) r_n ≡ r̃_n in (6.14), with R̃^{−1}(n − 1) = [(1 − 1/n) R(n − 1)]^{−1} and r̃_n = (1/√n) r_n, (6.21) becomes

δ_{RT-CR-AD}(r_n) = r_n^T R^{−1}(n) r_n
  = r_n^T [ [((n − 1)/n) R(n − 1)]^{−1} + β X̃_{n−1}^# d̃ d̃^T (X̃_{n−1}^#)^T    −β X̃_{n−1}^# r̃_n
            −β r̃_n^T (X̃_{n−1}^#)^T                                            β ] r_n   (6.22)
  = r_n^T [(1 − 1/n) R(n − 1)]^{−1} r_n + β r_n^T X̃_{n−1}^# d̃ d̃^T (X̃_{n−1}^#)^T r_n
    − 2β r_n^T X̃_{n−1}^# r̃_n + β r_n^T r_n.

Fig. 6.1 Flowchart showing the implementation of the recursive update equation used to find R^{−1}(n)

By taking advantage of (6.22), a flowchart showing the implementation of δ_{RT-CR-AD}(r_n) is depicted in Fig. 6.1, where D is a one-time-unit delay. Of particular interest in Fig. 6.1 is how R̃(n) can be updated recursively from the previously calculated R̃(n − 1) and the data sample vector currently being processed, r̃_n, via (6.22).

6.4.2 Real-Time Causal K-AD

Unlike CR-AD, which uses the sample correlation matrix without calculating the sample mean, finding a real-time causal version of K-AD is not trivial because it requires knowing the causal sample mean prior to calculating the causal covariance matrix. In this case, we reexpress

r_n − μ(n) = r_n − [((n − 1)/n)(1/(n − 1)) Σ_{i=1}^{n−1} r_i + (1/n) r_n]
           = r_n − [(1 − 1/n) μ(n − 1) + (1/n) r_n]   (6.23)
           = (1 − 1/n)(r_n − μ(n − 1)).
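A quick numeric confirmation (our own, not from the text) of the identity in (6.23), using made-up two-band samples:

```python
# check: r_n - mu(n) = (1 - 1/n)(r_n - mu(n-1))
stream = [[1.0, 2.0], [2.0, 0.5], [4.0, 3.0]]
n = len(stream)
mu_n = [sum(r[k] for r in stream) / n for k in range(2)]          # mu(n)
mu_prev = [sum(r[k] for r in stream[:-1]) / (n - 1) for k in range(2)]  # mu(n-1)
r_n = stream[-1]
lhs = [r_n[k] - mu_n[k] for k in range(2)]
rhs = [(1 - 1 / n) * (r_n[k] - mu_prev[k]) for k in range(2)]
assert all(abs(lhs[k] - rhs[k]) < 1e-12 for k in range(2))
```

This is the step that lets the centered residual in (6.24) be expressed entirely in terms of μ(n − 1) and r_n, so the causal mean never needs to be recomputed from scratch.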

Using (6.23) in conjunction with (6.17), we can further derive a real-time causal version of K-AD (RT-CK-AD) in (6.24) as follows, which can be easily updated from K^{−1}(n − 1), μ(n − 1), and the current input data sample vector r_n, where no matrix inverse need be calculated once the initial calculation of K^{−1}(L) is done:

δ_{RT-CK-AD}(r_n) = (r_n − μ(n))^T K^{−1}(n) (r_n − μ(n))
  = (r_n − μ(n))^T {(1 − 1/n) K(n − 1) + ((n − 1)/n²)(r_n − μ(n − 1))(r_n − μ(n − 1))^T}^{−1} (r_n − μ(n))
  = (1 − 1/n)^{−1} (r_n − μ(n))^T K^{−1}(n − 1) (r_n − μ(n))
  − (1 − 1/n)^{−2} ((n − 1)/n²) {(r_n − μ(n))^T K^{−1}(n − 1)(r_n − μ(n − 1))}²
    / {1 + (1 − 1/n)^{−1} ((n − 1)/n²)(r_n − μ(n − 1))^T K^{−1}(n − 1)(r_n − μ(n − 1))}.   (6.24)

It should also be noted that there is no derivation similar to (6.21) and (6.24) in Du and Nekovei (2009), since all the classifiers in Du and Nekovei (2009) used the sample correlation matrix R, not the sample covariance matrix K. Also, a comparison of (6.24) with (6.21) shows that it is more complicated to calculate the causal form of RT-CK-AD than that of RT-CR-AD.

6.4.3 Computational Complexity

This section provides a detailed analysis of the complexity of calculating (6.21) and (6.24). Since the computational complexity of (6.24) is the same as that of (6.21), only (6.21) is discussed.
First, to avoid a singularity problem, the initial condition for calculating R^{−1}(n) in (6.21) should involve collecting a sufficient number of pixels to ensure that the sample correlation/covariance matrix is of full rank. In this case, the causal processing must begin with the Lth pixel, r_L, where L is the total number of spectral bands. Also, according to (6.21), the R^{−1}(n) in the RT-CR-AD, δ_{RT-CR-AD}(r_n), can be easily updated using the previously processed information R^{−1}(n − 1) and the innovation information provided by r_n without actually calculating the matrix inverse once the initial calculation of R^{−1}(L) is done. In other words, we only need to calculate a matrix inverse once, namely R^{−1}(L). However, in causal processing the R^{−1}(n) must be recalculated for each incoming pixel r_n. This may lead to a belief that its computational complexity must be much higher than that required for real-time causal processing using (6.21). As a matter of fact, it is not. In causal processing there are three major operations involved for each data sample

vector. One is to calculate the outer product of the incoming pixel r_n, which is r_n r_n^T, to update R(n − 1) to R(n), which requires L² multiplications. Another is to calculate the inverse of R(n), which has size L × L. The third is to calculate r_n^T [R(n)]^{−1} r_n, which requires L + 1 inner products, each of which requires L multiplications. As a result, the computational complexity of finding R^{−1}(n) is the same as that required for R^{−1}(L) and is constant for all n.
On the other hand, the real-time causal processing using (6.21) requires three operations of the form x^T A y (i.e., two calculations of r_n^T [R(n − 1)]^{−1} r_n and one calculation of {r_n^T [R(n − 1)]^{−1} r_n}{r_n^T [R(n − 1)]^{−1} r_n}), each of which requires L + 1 inner products, resulting in L(L + 1) multiplications, as well as one outer product {[R(n − 1)]^{−1} r_n}{r_n^T [R(n − 1)]^{−1}}, which requires L² multiplications. More specifically, at the nth data sample vector r_n, with n ≥ L + 1, the calculation of R^{−1}(n) requires L-dimensional inner products, that is, [R(n − 1)]^{−1} r_n and r_n^T [R(n − 1)]^{−1} r_n, which result in L² multiplications, one outer product O(n − 1) = {[R(n − 1)]^{−1} r_n}{r_n^T [R(n − 1)]^{−1}}, which carries out L² multiplications, plus another L-dimensional inner product to calculate r_n^T O(n − 1) r_n, which amounts to L multiplications. As a result, a total of two L-dimensional inner products and one matrix outer product are required, giving rise to L² + L multiplications, which is exactly the same as that derived in Du and Nekovei (2009). But the entire complexity to carry out the real-time causal anomaly detector in (6.21) is actually 3(L² + L) + L² multiplications plus the complexity of calculating the initial condition, R^{−1}(L)/K^{−1}(L).
According to the foregoing analysis, the computational complexity of causal and real-time anomaly detectors is determined by the computer processing time (CPT) required to calculate three elements: an inverse of an L × L matrix, an inner product of two L-dimensional vectors, and an outer product of two L-dimensional vectors. Most importantly, in both causal processing and real-time causal processing the computational complexity is independent of the data sample vector to be processed and increases linearly with the number of data sample vectors. A detailed study of computational complexity is provided in Table 6.1.
Table 6.1 tabulates the computational complexity of causal and real-time causal versions of R-AD and K-AD in terms of the required number of multiplications "×" used for calculation, where in the second column, under causal anomaly detectors, an additional nL² multiplications are required to calculate the sample correlation/covariance matrix R(n)/K(n), plus the processing time of the matrix inversion of R(n)/K(n) as n varies, neither of which is needed if the real-time causal algorithm in the third column is used via (6.21) or (6.24).
To conclude this section, it is worth noting that the CK-AD and CR-AD can be
indeed implemented without real-time processing. In other words, the sample
covariance matrix K(n) and sample correlation matrix R(n) can be recalculated
without using the causal innovation update equations (6.21) and (6.24) for updating
Table 6.1 Comparative analysis of computational complexity between causal and real-time
causal versions of R-AD and K-AD

  Computational complexity          CR-AD/CK-AD                          RT-CR-AD/RT-CK-AD
  Initial condition                 N/A                                  R^{-1}(L)/K^{-1}(L)
  Input r_n                         R^{-1}(n)/K^{-1}(n) for each         N/A
                                    input sample r_n
  Number of inner products/sample   L + 1                                3(L + 1)
  Number of outer products/sample   1 (for r_n r_n^T)                    1 (for {[R(n − 1)]^{-1}r_n}{r_n^T[R(n − 1)]^{-1}})
  Number of "×"/sample              [(L + 1)L + L^2 = 2L^2 + L + c]/2    [3(L + 1)L + L^2 = 4L^2 + 3L + c]/2
  Number of "×"/data set            (N − L)(2L^2 + L + c)/2              (N − L)(4L^2 + 3L + c)/2
  Complexity                        O(N)                                 O(N)

innovations information. As a result of not using (6.21) and (6.24), the following
experiments demonstrate that the computer processing time for CK-AD and
CR-AD is linearly increased with the number of pixels to be processed compared
to the computer processing time remaining constant when their real-time counter-
parts, RT-CK-AD and RT-CR-AD, are used.

6.5 Synthetic Image Experiments

The synthetic image simulated in Fig. 1.15 is shown in Fig. 6.2 with five panels in
each row simulated by the same mineral signature and five panels in each column
having the same size.
Among the 25 panels are five 4 × 4 pure pixel panels for each row in the first
column and five 2 × 2 pure pixel panels for each row in the second column, five
2 × 2 mixed pixel panels for each row in the third column, and five 1 × 1 subpixel
panels for each row in both the fourth and fifth columns, where the mixed and
subpanel pixels were simulated according to the legends in Fig. 6.2. Thus, a total of
100 pure pixels (80 in the first column and 20 in the second column), referred to as
endmember pixels, were simulated in the data by the five endmembers A, B, C, K,
and M. An area marked “BKG” in the upper right corner of Fig. 1.14a was selected
to find its sample mean, that is, the average of all pixel vectors within the “BKG”
area, denoted by b and plotted in Fig. 1.14b, to be used to simulate the background
(BKG) for the image scene with a size of 200 × 200 pixels in Fig. 6.2. The reason
for this background selection is empirical since the selected “BKG” area seemed
more homogeneous than other regions. Nevertheless, other areas could also be
selected for the same purpose. This b-simulated image background was further
Fig. 6.2 Set of 25 panels simulated by A, B, C, K, and M [legend: 100 % pure signal;
50 % signal + 50 % any other four; 50 % signal + 50 % background; 25 % signal + 75 % background]

corrupted by an additive noise to achieve a certain level of signal-to-noise ratio
(SNR), which was defined in Harsanyi and Chang (1994) as a 50 % signature (i.e.,
reflectance/radiance) divided by the standard deviation of the noise. Once the target
pixels and background are simulated, two types of target insertion, referred to as
target implantation (TI) and target embeddedness (TE), can be designed to simulate
experiments for various applications. Two types of six anomaly detectors, correla-
tion matrix R-based anomaly detectors, R-AD, CR-AD, and RT-CR-AD, and
covariance matrix K-based anomaly detectors, K-AD, CK-AD, and RT-CK-AD,
are evaluated for detection performance and CPT.
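The difference between the two insertion schemes can be made concrete. In the following sketch (simplified to a single band value per pixel for brevity; the function and variable names are illustrative, not from the text), TI replaces the background pixel with the target signature, while TE superimposes the target on the background it occupies:

```python
import numpy as np

def implant(background, target, mask):
    """Target implantation (TI): target pixels replace background pixels."""
    scene = background.copy()
    scene[mask] = target[mask]
    return scene

def embed(background, target, mask):
    """Target embeddedness (TE): target pixels are superimposed on background."""
    scene = background.copy()
    scene[mask] = scene[mask] + target[mask]
    return scene

# Toy 4 x 4 single-band scene with one inserted target pixel
bkg = np.full((4, 4), 10.0)
tgt = np.zeros((4, 4))
tgt[1, 1] = 5.0
mask = tgt > 0

ti = implant(bkg, tgt, mask)   # ti[1, 1] == 5.0  (background removed)
te = embed(bkg, tgt, mask)     # te[1, 1] == 15.0 (background retained)
```

The distinction matters for anomaly detection: under TE the inserted pixel still carries background energy, so its spectral contrast against the surrounding background differs from the TI case.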

6.5.1 Target Implanted

The first type of target insertion is TI, in which the previously mentioned 130 panel
pixels are inserted into the image by replacing their corresponding background
pixels. Thus, the resulting synthetic image has clean panel pixels implanted in a
noisy background with an additive Gaussian noise of SNR = 20:1 for this scenario,
as shown in Fig. 6.3a. Figure 6.3b–g shows the results of traditional R-AD and
K-AD, along with their causal and real-time causal versions in terms of detected
abundance fractions, where the value of x is represented by decibel (db), that is,
20 log10x with x being the detected abundance fractions to enhance visual assess-
ment. As we see, all versions of anomaly detector performed very comparably,
except for the different degrees of background suppression resulting from the use of
global and causal correlation matrices.
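The decibel display used throughout the figures is a simple pointwise transform. As a minimal illustration (not from the text), it can be coded as follows, where the guard value eps is a practical choice added here to avoid taking the logarithm of zero:

```python
import numpy as np

def to_db(detected, eps=1e-12):
    """Convert detected abundance fractions x to 20*log10(x) for display."""
    return 20.0 * np.log10(np.maximum(detected, eps))

x = np.array([1.0, 0.1, 0.01])
db = to_db(x)   # 0 dB, -20 dB, -40 dB
```

Because the transform compresses the dynamic range, weakly detected pixels that would be invisible on a linear scale become visible, which is why it aids visual assessment.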
Since CR-AD and RT-CR-AD use the same updating autocorrelation matrix R
(n), they both produce exactly the same detection performance with the only
difference in CPT resulting from whether or not the causal innovation information
update equation (6.22) is implemented. Similarly, this is also true of CK-AD and

Fig. 6.3 Detection results for TI with detected abundance fractions in db: (a) 50th band of TI
scenario, (b) K-AD, (c) CK-AD, (d) RT-CK-AD, (e) R-AD, (f) CR-AD, and (g) RT-CR-AD

Fig. 6.4 CK-AD detection results for TI with detected abundance fractions in db: (a) no panels
detected, (b) row 1 panels detected, (c) row 2 panels detected, (d) row 3 panels detected, (e) row
4 panels detected, and (f) row 5 panels detected

RT-CK-AD. Thus, Figs. 6.4, 6.5, 6.6, and 6.7 show the detection results of the
covariance matrix K-based causal and real-time causal anomaly detectors, CK-AD
and RT-CK-AD, and the correlation matrix R-based anomaly detectors, CR-AD and
RT-CR-AD, respectively.

Fig. 6.5 RT-CK-AD detection results for TI with detected abundance fractions in db: (a) no
panels detected, (b) row 1 panels detected, (c) row 2 panels detected, (d) row 3 panels detected, (e)
row 4 panels detected, and (f) row 5 panels detected

Fig. 6.6 CR-AD detection results for TI with detected abundance fractions in db: (a) no panels
detected, (b) row 1 panels detected, (c) row 2 panels detected, (d) row 3 panels detected, (e) row
4 panels detected, and (f) row 5 panels detected

6.5.2 Target Embedded

The second type of target insertion is TE, which is the same as TI described earlier
except for the way panel pixels are inserted. Background pixels are not removed to

Fig. 6.7 RT-CR-AD detection results for TI with detected abundance fractions in db: (a) no
panels detected, (b) row 1 panels detected, (c) row 2 panels detected, (d) row 3 panels detected, (e)
row 4 panels detected, and (f) row 5 panels detected

Fig. 6.8 Detection results for TE with detected abundance fractions in db: (a) 50th band of TE
scenario, (b) K-AD, (c) CK-AD, (d) RT-CK-AD, (e) R-AD, (f) CR-AD, and (g) RT-CR-AD

accommodate the inserted panel pixels as in TI but are rather superimposed with the
inserted panel pixels. Thus, in this case, the resulting synthetic image shown in
Fig. 6.8a has clean panel pixels embedded in a noisy background with the same
additive Gaussian noise as TI. The same experiments conducted for TI in Sect. 6.5.1
were repeated for the TE scenario. Figure 6.8b–g shows the detection results
produced by K-AD and R-AD, along with their causal and real-time causal

Fig. 6.9 CK-AD detection results for TE with detected abundance fractions in db: (a) no panels
detected, (b) row 1 panels detected, (c) row 2 panels detected, (d) row 3 panels detected, (e) row
4 panels detected, and (f) row 5 panels detected

Fig. 6.10 RT-CK-AD detection results for TE with detected abundance fractions in db: (a) no
panels detected, (b) row 1 panels detected, (c) row 2 panels detected, (d) row 3 panels detected, (e)
row 4 panels detected, and (f) row 5 panels detected

versions, where their performances are close with various degrees of background
suppression due to the use of global and causal correlation matrices. Figures 6.9,
6.10, 6.11, and 6.12 show the detection results of the covariance matrix K-based
causal and real-time causal anomaly detectors, CK-AD and RT-CK-AD, and the
correlation matrix R-based anomaly detectors, CR-AD and RT-CR-AD, respectively.

Fig. 6.11 CR-AD detection results for TE with detected abundance fractions in db: (a) no panels
detected, (b) row 1 panels detected, (c) row 2 panels detected, (d) row 3 panels detected, (e) row
4 panels detected, and (f) row 5 panels detected

Fig. 6.12 RT-CR-AD detection results for TE with detected abundance fractions in db: (a) no
panels detected, (b) row 1 panels detected, (c) row 2 panels detected, (d) row 3 panels detected, (e)
row 4 panels detected, and (f) row 5 panels detected

6.5.3 Computational Complexity

Despite the fact that causal and real-time causal anomaly detectors produce exactly the
same detection results, their required CPTs are different. The computer environments
used for experiments were a 64-bit operating system on an Intel i5-2500 3.3 GHz CPU

Fig. 6.13 CPT of CR-AD, RT-CR-AD, CK-AD, and RT-CK-AD required for TI: (a) r_n r_n^T,
(b) R^{-1}(n), (c) r_n^T[R(n − 1)]^{-1}r_n, and (d) {[R(n − 1)]^{-1}r_n}{r_n^T[R(n − 1)]^{-1}}

and 8 GB RAM. Figures 6.13a–d and 6.14a–d plot the CPT of calculating R^{-1}(n), r_n r_n^T,
r_n^T[R(n − 1)]^{-1}r_n, and {[R(n − 1)]^{-1}r_n}{r_n^T[R(n − 1)]^{-1}} required for per-pixel
vector processing by running four anomaly detectors, CR-AD, RT-CR-AD, CK-AD, and
RT-CK-AD, on the TI and TE scenarios, respectively, where the x- and y-axes represent the
order of pixels being processed, that is, the nth pixel vector with n varying from 189 to
40,000, and the CPT required to process the nth pixel vector. Comparing Figs. 6.13a, c
and 6.14a, c to Figs. 6.13b, d and 6.14b, d reveals that the computational complexity
of computing a matrix inverse (on the order of 10^{-3} s) is one order of magnitude higher
than that of computing an inner product (on the order of 10^{-4} s).
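This order-of-magnitude gap can be reproduced with a simple microbenchmark. The sketch below is hypothetical (it uses L = 158 bands, as in the LCVF data of Sect. 6.6.1, and names of its own choosing); absolute numbers depend on the machine, but the matrix inverse should dominate the inner and outer products used by the recursive update:

```python
import time
import numpy as np

rng = np.random.default_rng(1)
L = 158                                         # number of spectral bands
A = rng.normal(size=(L, L))
R = A @ A.T + L * np.eye(L)                     # well-conditioned SPD matrix
R_inv = np.linalg.inv(R)
r = rng.normal(size=L)

def cpt(fn, reps=200):
    """Average wall-clock time per call, in seconds."""
    t0 = time.perf_counter()
    for _ in range(reps):
        fn()
    return (time.perf_counter() - t0) / reps

t_mi = cpt(lambda: np.linalg.inv(R))            # matrix inverse, MI(L)
t_ip = cpt(lambda: r @ (R_inv @ r))             # quadratic form via inner products
t_op = cpt(lambda: np.outer(r, r))              # outer product, OP(L)

print(f"MI: {t_mi:.2e} s, IP: {t_ip:.2e} s, OP: {t_op:.2e} s")
```

The O(L^3) inversion is the cost the real-time causal detectors avoid at every pixel by paying only the O(L^2) inner- and outer-product costs.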
As shown in Figs. 6.13 and 6.14, the processing time of running the causal and
real-time causal anomaly detectors, CR-AD/CK-AD and RT-CR-AD/RT-CK-AD, on both
the TI and TE scenarios is nearly constant for each pixel vector. Thus, if an anomaly detector
is run on the entire image data, the CPT must be calculated by processing all image
pixel vectors. In this case, the CPT required for a causal anomaly detector, CR-AD or

Fig. 6.14 CPT of CR-AD, RT-CR-AD, CK-AD, and RT-CK-AD required for TE: (a) r_n r_n^T,
(b) R^{-1}(n), (c) r_n^T[R(n − 1)]^{-1}r_n, and (d) {[R(n − 1)]^{-1}r_n}{r_n^T[R(n − 1)]^{-1}}

CK-AD, is CPT(OP(L)) + Σ_{n=189}^{40000} [CPT(MI(n)) + (n + 1)CPT(IP(L))], where
CPT(MI(n)) and CPT(IP(L)) are the CPT required to find the matrix
inverse, R^{-1}(n)/K^{-1}(n), and the inner product of two L-dimensional vectors, respec-
tively, and CPT(OP(L)) is the CPT used to calculate the matrix outer product, r_n r_n^T, of the
new nth L-dimensional input vector r_n. On the other hand, the CPT required for a
real-time causal anomaly detector, RT-CR-AD or RT-CK-AD, is (40000 − 189)
[(L + 1)CPT(IP(L)) + CPT(OP(L))], plus the CPT of processing the initial condi-
tion, CPT(MI(L)), where CPT(OP(L)) is the CPT required to calculate the matrix
outer product of two L-dimensional vectors. The results of various CPTs in seconds
required by running CR-AD, CK-AD, RT-CR-AD, and RT-CK-AD on complete
images of scenarios TI and TE are tabulated in Table 6.2.
One final comment on the performance in Figs. 6.3 and 6.8 is worthwhile.
Plotting the areas under their receiver operating characteristic (ROC) curve (Poor
1994) shows that they are all nearly close to one. However, their detection maps are
Table 6.2 CPT for causal and real-time causal versions of R-AD and K-AD

  CPT                  CR-AD/CK-AD                               RT-CR-AD/RT-CK-AD
  Initial condition    CPT(R(L)) = L·CPT(OP(L))                  CPT(R(L)) = L·CPT(OP(L)) + CPT(MI(L))
                                                                 (R^{-1}(L)/K^{-1}(L))
                       TI: 0.01375 s    TE: 0.02111 s            TI: 0.0150 s     TE: 0.02882 s
  CPT/pixel, n > L     CPT(r_n r_n^T):                           CPT({[R(n − 1)]^{-1}r_n}{r_n^T[R(n − 1)]^{-1}})
                       TI: 0.0001075 s  TE: 0.00009723 s           + 3·CPT(x^T A y):
                       CPT(R^{-1}(n)/K^{-1}(n)):                 TI: 0.0004898 s  TE: 0.0004650 s
                       TI: 0.001222 s   TE: 0.001215 s
  Total CPT            Σ_{n=L}^{N} [CPT(OP(n)) + CPT(MI(n))]     (N − L)[3(L + 1)·CPT(IP(L)) + CPT(OP(L))]
                       TI: 54.0114 s    TE: 53.3096 s            TI: 20.8528 s    TE: 6.8661 s

actually quite different in terms of background suppression by visual inspection.
This is mainly due to the use of different sample covariance/correlation matrices
implemented by various anomaly detectors. This phenomenon will be particularly
evident in the HYperspectral Digital Imagery Collection Experiment (HYDICE)
experiments conducted in Sect. 6.6.

6.6 Real Image Experiments

Two real hyperspectral image scenes were specifically selected for experiments to
conduct a performance evaluation of AD.

6.6.1 AVIRIS Data

An Airborne Visible Infrared Imaging Spectrometer (AVIRIS) image data set was
used for the experiments shown in Fig. 6.15, which is the Lunar Crater Volcanic
Field (LCVF) shown in Fig. 1.8 located in Northern Nye County, Nevada. Atmo-
spheric water bands and low SNR bands were removed from the data, reducing the
image cube from 224 to 158 bands. The image in Fig. 6.15 has a 10 nm spectral
resolution and 20 m spatial resolution. There are five targets of interest, the radiance
spectra of red oxidized basaltic cinders, rhyolite, playa (dry lake), vegetation, and
shade. This scene is of particular interest because there is a two-pixel-wide anomaly
located at the left top edge of the crater.
Figure 6.16 shows the final detection maps in db produced by six anomaly
detectors, K-AD, R-AD, CK-AD, CR-AD, RT-CK-AD, and RT-CR-AD, where
all of them were able to detect the two-pixel-wide anomaly.
Figures 6.17, 6.18, 6.19, and 6.20 show progressive real-time causal processing
of CK-AD, CR-AD, RT-CK-AD, and RT-CR-AD in six progressive stages where

vegetation

cinders

shade anomaly

rhyolite

dry lake

Fig. 6.15 AVIRIS LCVF subscene



Fig. 6.16 Detection maps of LCVF with detected abundance fractions in db: (a) K-AD, (b)
CK-AD, (c) RT-CK-AD, (d) R-AD, (e) CR-AD, and (f) RT-CR-AD

Fig. 6.17 CK-AD detection results with detected abundance fractions in db: (a) vegetation, (b)
cinders, (c) playa and anomaly detected, (d) shade, (e) rhyolite, and (f) CK-AD

the detected abundance fraction maps are displayed in db for better visual
assessment.
Interestingly, as the detection process progresses (Figs. 6.17, 6.18, 6.19, and
6.20), different levels of background suppression could also be witnessed. This was
particularly evident when the background was significantly suppressed once the
process detected the anomaly. This was because the detected abundance fraction of
the anomaly was so strong that the previously detected background information was

Fig. 6.18 CR-AD detection results with detected abundance fractions in db: (a) vegetation, (b)
cinders, (c) playa and anomaly detected, (d) shade, (e) rhyolite, and (f) CR-AD

Fig. 6.19 RT-CK-AD detection results with detected abundance fractions in db: (a) vegetation,
(b) cinders, (c) playa and anomaly detected, (d) shade, (e) rhyolite, and (f) RT-CK-AD

overwhelmed by the anomaly. This is a very good example demonstrating the issue
of background suppression in AD, which will be further discussed in Sect. 6.6.3.
To further evaluate computational complexity, Fig. 6.21 plots the averaged CPT
[10^{-4} s for (a, c) and 10^{-3} s for (b, d)] of running CR-AD, RT-CR-AD, CK-AD,
and RT-CK-AD on LCVF five times to compute various individual operations,
r_n r_n^T, R^{-1}(n), r_n^T[R(n − 1)]^{-1}r_n, and {[R(n − 1)]^{-1}r_n}{r_n^T[R(n − 1)]^{-1}}, where
the x- and y-axes represent the order of pixels being processed, that is, the nth pixel

Fig. 6.20 RT-CR-AD detection results with detected abundance fractions in db: (a) vegetation,
(b) cinders, (c) playa and anomaly detected, (d) shade, (e) rhyolite, and (f) RT-CR-AD

Fig. 6.21 CPT of CR-AD, RT-CR-AD, CK-AD, and RT-CK-AD required for LCVF: (a) r_n r_n^T,
(b) R^{-1}(n), (c) r_n^T[R(n − 1)]^{-1}r_n, and (d) {[R(n − 1)]^{-1}r_n}{r_n^T[R(n − 1)]^{-1}}

vector in the data and the CPT required to process the nth pixel vector. It is clearly
shown that the CPT resulting from using the recursive update equations (6.6) and
(6.8) is nearly constant for each image pixel vector. This implies that the CPT is
indeed linearly increased with the number of data sample vectors processed. Similar
to Figs. 6.13 and 6.14, inverting a matrix (on the order of 10^{-3} s) also required one
order of magnitude more time than computing an inner product (on the order of 10^{-4} s).

6.6.2 HYDICE Data

The HYDICE image scene in Fig. 1.10 is used for experiments and reproduced in
Fig. 6.22a, b. It has a size of 200 × 74 pixel vectors shown in Fig. 6.22a, along with
its ground truth provided in Fig. 6.22b, where the center and boundary pixels of
objects are highlighted by red and yellow, respectively.
The upper part in Fig. 6.22b contains fabric panels with sizes of 3, 2, and 1 m2
from the first column to the third column. Since the spatial resolution of the data is
1.56 m2, the panels in the third column are considered subpixel anomalies. The
lower part in Fig. 6.22c contains different vehicles with sizes of 4 × 8 m (the first
four vehicles in the first column) and 6 × 3 m (the bottom vehicle in the first
column) and three objects in the second column (the first two have a size of
two pixels and the bottom one has a size of three pixels), respectively. In this
particular scene, there are three types of artificial targets with different sizes, small
targets (panels of three different sizes, 3, 2, and 1 m^2) and large targets (vehicles of
two different sizes, 4 × 8 m and 6 × 3 m, and three objects two pixels and three
pixels in size) to be used to validate and test AD performance.
There are several advantages of using this HYDICE image scene in Fig. 6.22a.
One is that the ground truth provides precise spatial locations of all artificial target
pixels, which allows us to evaluate the real-time processing performance of AD
pixel by pixel, a task that has not been explored before now. Second, the provided
ground truth enables us to perform ROC analysis for AD via ROC curves of the
detection rate versus false alarm rate. Third, the scene has objects of various sizes
that can be used to evaluate the ability of an anomaly detector to detect anomalies of
different sizes, an issue that has not received sufficient treatment in many reports.
Fourth, this scene can be processed by operating the same anomaly detector on
three different image sizes (Fig. 6.22a–c) (i.e., a 15-panel scene 64 × 64 pixel
vectors in size, marked by an upper rectangle, a vehicles + objects scene of
size 100 × 64 pixel vectors, marked by a lower rectangle, and the entire scene
containing 15 panels and vehicles + objects) to evaluate the effectiveness of its
performance. Finally and most importantly, the clean natural background and
targets make it easier to see various degrees of background being suppressed by
an anomaly detector.

Fig. 6.22 HYDICE scene with ground truth map of spatial locations of 15 panels, 5 vehicles and
3 objects. (a) HYDICE scene with ground truth map of spatial locations of 15 panels, 5 vehicles
and 3 objects, (b) 15 panels, and (c) vehicles + object scene with ground truth map of 5 vehicles
and 3 objects

6.6.2.1 Real-Time Causal Processing

To see how causal and real-time causal AD perform, Figs. 6.23, 6.24, 6.25, and 6.26
show the real-time causal processing of CK-AD, CR-AD, RT-CK-AD, and RT-CR-
AD on Fig. 6.22a with their detected abundance fractions in db, where each pass
shows a real-time detection map of different targets.
Since panels b and c in Fig. 6.22 are part of the scene in Fig. 6.22a, the results of
real-time processing of these two subscenes are not included here. Nevertheless,
their detection results will be discussed in detail in the following two subsections. It
is worth noting that to avoid the singularity problem of calculating the inverse of the
sample correlation/covariance matrix used by anomaly detectors, an anomaly
detector will not begin to operate until it collects a sufficient number of initial
data sample vectors, which is the total number of spectral bands of the image to be
processed. It should be noted that K-AD and R-AD are not included in the
experiments because they are neither causal nor real time. By visually inspecting
the results in Figs. 6.23, 6.24, 6.25, and 6.26, sample correlation matrix R-based
causal and real-time causal anomaly detectors seemed to perform slightly better
than their K-based counterparts in terms of panel pixel detection. Interestingly, the
conclusion is reversed if the detection of vehicles is of major interest. This obser-
vation is further confirmed by the following ROC analysis. Since the CPT for this
scene is very similar to that in Fig. 6.21, its plots are not included here.

Fig. 6.23 CK-AD with detected abundance fractions in db



Fig. 6.24 CR-AD with detected abundance fractions in db

Fig. 6.25 RT-CK-AD with detected abundance fractions in db



Fig. 6.26 RT-CR-AD with detected abundance fractions in db

6.6.2.2 Detection Performance and 3D ROC Analysis

Using the ground truth provided by Fig. 6.22, we can perform a quantitative study
via ROC analysis. In doing so, an idea similar to that proposed in Chang and Ren
(2000) and Chang (2003) can be derived by converting real values to hard decisions
as follows.
Assume that δ_AD(r) is the detected abundance fraction obtained by operating an
anomaly detector on a data sample vector r. We then define a normalized detected
abundance fraction δ̂^AD_normalized(r) by

    δ̂^AD_normalized(r) = [δ_AD(r) − min_r δ_AD(r)] / [max_r δ_AD(r) − min_r δ_AD(r)].     (6.25)

More specifically, δ̂^AD_normalized(r) in (6.25) can be regarded as a probability vector that
calculates the likelihood of the data sample vector r being detected as an anomaly
according to its detected abundance fraction, δ_AD(r). By virtue of (6.25), we can
develop an abundance percentage anomaly converter (APAC), with a% as a
thresholding criterion, referred to as a%APAC, χ_{a%APAC}(r), similar to one pro-
posed in Chang (2003) and Ren and Chang (2000), as follows:

    χ_{a%APAC}(r) = 1, if δ̂^AD_normalized(r) ≥ τ = a/100;  0, otherwise.     (6.26)

If δ̂^AD_normalized(r) in (6.26) exceeds τ = a%/100, then r will be detected as an
anomaly. Thus, a “1” produced by (6.26) indicates that pixel r is detected as an
anomaly; otherwise, it is considered a background pixel.
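Equations (6.25) and (6.26) translate directly into code. The following sketch (hypothetical scores, not the book's implementation) normalizes detector outputs and applies the a%APAC hard-decision converter:

```python
import numpy as np

def normalize(delta):
    """Eq. (6.25): min-max normalize detected abundance fractions to [0, 1]."""
    return (delta - delta.min()) / (delta.max() - delta.min())

def apac(delta, a):
    """Eq. (6.26): a%APAC hard-decision converter with threshold tau = a/100."""
    return (normalize(delta) >= a / 100.0).astype(int)

scores = np.array([0.0, 0.2, 0.5, 1.0])   # hypothetical detector outputs
decisions = apac(scores, 50)              # pixels at or above tau = 0.5 -> 1
```

Here apac(scores, 50) yields [0, 0, 1, 1]: only the last two samples meet the 50 % threshold and are declared anomalies.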
In the context of (6.26), we consider the Neyman–Pearson detection theory for a
binary hypothesis testing problem to perform signal detection (Poor 1994), where
δ̂^AD_normalized(r) in (6.25) can be used as a Neyman–Pearson detector to perform the
ROC analysis as a performance evaluation tool. For example, for a particular
threshold τ, a detection probability/power, PD, and a false alarm probability, PF,
can be calculated. Varying the threshold τ = a%/100 in (6.26), we can produce an
ROC curve of PD versus PF and further calculate the area under the ROC curve for
quantitative performance analysis. Interestingly, the threshold τ is absent in the
traditional ROC curve. But according to (6.26), the values of PD and PF are actually
calculated through τ. To address this issue, a three-dimensional (3D) ROC anal-
ysis was recently developed in Chang (2010, 2013), where 3D ROC curves can be
generated by considering PD, PF, and τ as three parameters, each of which
represents one dimension. In other words, a 3D ROC curve is a 3D curve of
(PD,PF,τ), from which three two-dimensional ROC curves can also be generated,
that is, a 2D ROC curve of (PD,PF), which is the traditional ROC curve discussed
in Poor (1994), along with two other new 2D ROC curves, a 2D ROC curve of
(PD,τ) and a 2D ROC curve of (PF,τ). Figure 6.27a–d plots 3D ROC curves along
with their three corresponding 2D ROC curves produced by six AD algorithms,
K-AD, R-AD, CK-AD, CR-AD, RT-CK-AD, and RT-CR-AD for the three image
scenes in Fig. 6.22a–c, an entire image scene, a 15-panel scene, and a vehicles
+ objects scene.
To perform quantitative analysis, we further calculate the area under curve
(AUC), denoted by Az, for each of the 2D ROC curves in Fig. 6.27b–d by six
AD algorithms, K-AD, R-AD, CK-AD, CR-AD, RT-CK-AD, and RT-CR-AD,
and their results are tabulated in Tables 6.3, 6.4, and 6.5, where the best
results are highlighted and the results of two global anomaly detectors, K-AD
and R-AD, are also included for comparison. For the 2D ROC curves of (PD,
PF) and (PD,τ), the higher the value of Az, the better the detector. Conversely,
for the 2D ROC curves of (PF,τ), the lower the value of Az, the better the
detector.
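Given ground-truth labels, the three Az values can be estimated by sweeping τ through (6.26) and integrating each 2D projection with the trapezoidal rule. The sketch below uses synthetic, hypothetical scores; roc_3d and its parameters are illustrative names, not from the text:

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal-rule area of y over x (x need not be uniformly spaced)."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def roc_3d(scores, truth, num_taus=101):
    """Trace the 3D ROC curve (P_D, P_F, tau) by sweeping tau as in (6.26)
    and return the areas under its three 2D projections."""
    s = (scores - scores.min()) / (scores.max() - scores.min())  # Eq. (6.25)
    taus = np.linspace(0.0, 1.0, num_taus)
    pd = np.array([(s[truth == 1] >= t).mean() for t in taus])   # detection rate
    pf = np.array([(s[truth == 0] >= t).mean() for t in taus])   # false alarm rate
    # P_F decreases as tau increases, so reverse both curves before integrating
    az_pd_pf = trapezoid(pd[::-1], pf[::-1])   # conventional ROC AUC
    az_pd_tau = trapezoid(pd, taus)
    az_pf_tau = trapezoid(pf, taus)
    return az_pd_pf, az_pd_tau, az_pf_tau

rng = np.random.default_rng(2)
truth = np.r_[np.ones(50, dtype=int), np.zeros(950, dtype=int)]
scores = np.r_[rng.normal(3.0, 1.0, 50), rng.normal(0.0, 1.0, 950)]
az_roc, az_pd, az_pf = roc_3d(scores, truth)   # well-separated classes: az_roc near 1
```

Note the asymmetric reading of the three areas stated above: a larger Az of (PD,PF) or (PD,τ) is better, while a smaller Az of (PF,τ) is better.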
Based on Tables 6.3, 6.4, and 6.5, the best AD performance varies with image
size, even for the same targets present in the three image scenes in Fig. 6.22a–c. For
example, the same 15 panels appear in Fig. 6.22a, b, but the best anomaly
detector was different in terms of Az calculated for 2D ROC curves of (PD,PF)
and (PD,τ) in Tables 6.3 and 6.4, that is, K-AD for the entire image and R-AD for
the 15-panel scene. On the other hand, for the same five vehicles and three objects

Fig. 6.27 3D ROC curve and its three corresponding 2D ROC curves. (a) 3D ROC curve of (PD,
PF,τ) for 15-panel, vehicle, and entire scenes; (b) 2D ROC curve of (PD,PF); (c) 2D ROC curve of
(PD,τ); and (d) 2D ROC curve of (PF,τ)

in Fig. 6.22a, c, the best anomaly detector was real-time or causal K-AD for the
vehicles scene in Table 6.5 according to the values of Az calculated for the 2D ROC
curves of (PD,PF) and (PD,τ). Interestingly, for all three image scenes, the best
anomaly detector to produce the smallest Az of (PF,τ) was real-time/causal K-AD/
R-AD. This indicates that a smaller Az of (PF,τ) implies less background suppres-
sion. Furthermore, a higher Az of (PD,PF) does not necessarily imply a higher Az of
(PF,τ), as shown in Tables 6.4 and 6.5. Unfortunately, two such pieces of informa-
tion are not provided by the traditional 2D ROC analysis, Az of (PD,PF). These
experiments demonstrate the utility of 3D ROC analysis via three 2D ROC curves
generated from a 3D ROC curve, where AD performance can be analyzed through
interrelationships among PD, PF, and the threshold τ via three 2D ROC curves
plotted based on three different pairs, (PD,PF), (PD,τ), and (PF,τ).


6.6.3 Background Suppression

In general, AD performance is evaluated based on its detection rates or ROC
analysis, as demonstrated earlier in Sect. 6.6.2. However, since AD is carried out
without prior knowledge or ground truth, there is no way to use ROC analysis to
conduct performance evaluation. It must rely on visual inspection, which becomes
the only means of evaluating AD performance. In this case, background suppression
has an impact on visual inspection and is crucial for AD, as already demonstrated in
Figs. 6.17, 6.18, 6.19, and 6.20 for the LCVF scene, where the two-pixel-wide
anomaly dominates the entire detection process. In other words, if we consider
background the null hypothesis, H0, versus targets as the alternative hypothesis,
H1, in a binary hypothesis testing problem, 3D ROC analysis dictates the behavior of
a detector in terms of detection rate, PD, and false alarm rate, PF, versus the threshold
τ. That is, a better target detection produces a higher Az of (PD,τ) as well as a higher
Az of (PF,τ) as a false alarm probability and, thus, also results in better background
suppression, which indicates poor background detection according to a binary


hypothesis testing formulation. Unfortunately, to the authors’ best knowledge, the
issue of background suppression has not been explored or investigated. This
HYDICE image data offer an excellent opportunity to look into the issue and further
demonstrate that an anomaly detector with a high detection rate may generate a
higher false alarm rate, which in turn may have more background suppression. But
does it imply that better background suppression gives rise to better AD? To
illustrate this phenomenon, Fig. 6.28c–f shows detected abundance fraction maps
of three different scenes generated by completing the real-time processing of
CK-AD, CR-AD, RT-CK-AD, and RT-CR-AD. In this case, we also include the
detected abundance fraction maps produced by the global anomaly detectors K-AD
and R-AD in Fig. 6.28a, b for comparison. By examining the abundance fractions
detected by the six anomaly detectors, there is no appreciable visual difference
among all the results. However, if we display the original detected abundance
fractional values in db [i.e., 20 log10(x), with x being the original detected abundance
fraction], as shown in Fig. 6.29, it turns out that Fig. 6.29 in db provides better visual
inspection and assessment than does Fig. 6.28.


Figure 6.29 suggests that all six anomaly detectors performed comparably in the
detection of targets, but the global anomaly detectors K-AD and R-AD had better
background suppression than their real-time and causal counterparts in terms of
suppressing grass surrounding panels, vehicles, and objects. This certainly made
sense. Since a global anomaly detector utilizes the global spectral correlation
provided by the sample correlation/covariance matrix of all the image data, it
performs better background suppression than any local anomaly detector, as
expected. However, on many occasions, when no prior knowledge is available,
background information may help image analysts perform better data analysis
because background generally provides crucial information surrounding anomalies.
If background suppression is overdone, we may have no clue about anomalies. For
example, in Fig. 6.28, anomalies were detected with very clean background sup-
pression, in which case we have no idea what these anomalies are; all we know are
their spatial locations. But if we look closely at Fig. 6.29, the background has a tree
line along the left edge, panels were placed on grass, while vehicles were actually

Table 6.3 Values of areas under three 2D ROC curves, Az, produced by six algorithms (entire
panels + vehicles scene)
Algorithm K-AD CK-AD RT-CK-AD R-AD CR-AD RT-CR-AD
Az of (PD,PF) 0.9886 0.9818 0.9819 0.9840 0.9747 0.9747
Az of (PD,τ) 0.2368 0.1372 0.1372 0.2349 0.1356 0.1356
Az of (PF,τ) 0.0193 0.0144 0.0144 0.0199 0.0145 0.0145

Table 6.4 Values of areas under three 2D ROC curves, Az, produced by six algorithms (15-panel
scene)
Algorithm K-AD CK-AD RT-CK-AD R-AD CR-AD RT-CR-AD
Az of (PD,PF) 0.9898 0.9680 0.9683 0.99 0.9691 0.9691
Az of (PD,τ) 0.3329 0.2590 0.2590 0.3342 0.2596 0.2596
Az of (PF,τ) 0.0428 0.0372 0.0372 0.0433 0.0377 0.0377

Table 6.5 Values of areas under three 2D ROC curves, Az, produced by six algorithms (vehicles
scene)
Algorithm K-AD CK-AD RT-CK-AD R-AD CR-AD RT-CR-AD
Az of (PD,PF) 0.9751 0.9776 0.9776 0.9669 0.9662 0.9662
Az of (PD,τ) 0.2172 0.1307 0.1307 0.2150 0.1294 0.1294
Az of (PF,τ) 0.0332 0.0221 0.0221 0.0333 0.0222 0.0222

parked in a dirt field. This is particularly true for medical imaging, where
background detection is interpreted as tissue anatomical structures, which assist
doctors greatly in making diagnoses.

6.6.4 Computational Complexity

This section reports the computing time in seconds required by running two causal anomaly detectors, CK-AD and CR-AD, and two real-time causal anomaly detectors, RT-CK-AD and RT-CR-AD, on the LCVF and three HYDICE scenes, averaging five different runs, where the computer environment used for the experiments was a 64-bit operating system on an Intel i5-2500 3.3 GHz processor with 8 GB of RAM. The results are tabulated in Table 6.6 and show interesting findings. The time required by a causal AD to compute initial conditions for the vehicles scene was greater than that for the entire scene that includes the vehicles scene as a subscene. Similarly, this is also true of the real-time causal AD, where the computing time for the panel scene is greater than that for the panels + vehicles scene. This is because none of the causal and real-time causal anomaly detectors starts processing data until sufficient data sample vectors are collected to compute initial conditions, such
as the total number of spectral bands, to avoid an ill-rank issue in calculating the sample correlation/covariance matrix. In this case, all the algorithms use the same set of 169 image pixels to calculate their initial conditions, and their algorithmic structures determine the computational complexity.

Fig. 6.28 Detection maps with detected abundance fractions: (a) K-AD, (b) R-AD, (c) CK-AD, (d) CR-AD, (e) RT-CK-AD, and (f) RT-CR-AD
Fig. 6.29 Detection maps with detected abundance fractions in dB: (a) K-AD, (b) R-AD, (c) CK-AD, (d) CR-AD, (e) RT-CK-AD, and (f) RT-CR-AD

As we see from Table 6.6, a real-time causal anomaly detector generally runs
two or three times faster than its causal counterpart while showing the same
performance. Table 6.7 also tabulates the computing time required by two global
anomaly detectors, K-AD and R-AD, to process all the image data.
Table 6.6 CPT in seconds for CR-AD/CK-AD and RT-CR-AD/RT-CK-AD

CR-AD/CK-AD:
  Initial condition, CPT(R(L)) = L·CPT(OP(L)):
    LCVF 0.03075; Vehicles + panels 0.009483; Panels 0.004943; Vehicles 0.01503
  CPT/pixel (n > L), CPT(r_n r_n^T):
    LCVF 9.224 × 10⁻⁵; Vehicles + panels 6.227 × 10⁻⁵; Panels 5.4498 × 10⁻⁵; Vehicles 6.026 × 10⁻⁵
  CPT/pixel (n > L), CPT(R⁻¹(n)/K⁻¹(n)):
    LCVF 8.737 × 10⁻⁴; Vehicles + panels 9.9041 × 10⁻⁴; Panels 9.6491 × 10⁻⁴; Vehicles 9.931 × 10⁻⁴
  Total CPT, Σ_{n=L}^{N} [CPT(OP(n)) + CPT(MI(n))]:
    LCVF 39.426; Vehicles + panels 15.1476; Panels 4.2796; Vehicles 7.9846

RT-CR-AD/RT-CK-AD:
  Initial condition, CPT(R(L)) = L·CPT(OP(L)) + CPT(MI(L)) (for R⁻¹(L)/K⁻¹(L)):
    LCVF 0.01965; Vehicles + panels 0.003772; Panels 0.01040; Vehicles 0.002998
  CPT/pixel (n > L), CPT{[R(n - 1)]⁻¹ r_n r_n^T [R(n - 1)]⁻¹} + 3·CPT(x^T A y):
    LCVF 3.398 × 10⁻⁴; Vehicles + panels 3.171 × 10⁻⁴; Panels 3.198 × 10⁻⁴; Vehicles 3.22 × 10⁻⁴
  Total CPT, (N - L){3(L + 1)·CPT(IP(L)) + CPT(OP(L))}:
    LCVF 6.754; Vehicles + panels 4.4624; Panels 1.3203; Vehicles 6.3856

Table 6.7 CPT in seconds for K-AD and R-AD

K-AD:
  K⁻¹: LCVF 0.108553; Vehicles + panels 0.036878; Panels 0.015634; Vehicles 0.018485
  (r_n - μ)^T K⁻¹ (r_n - μ): LCVF 0.190389; Vehicles + panels 0.069578; Panels 0.024734; Vehicles 0.050299

R-AD:
  R⁻¹: LCVF 0.115556; Vehicles + panels 0.035656; Panels 0.013076; Vehicles 0.026824
  r_n^T R⁻¹ r_n: LCVF 0.120606; Vehicles + panels 0.067772; Panels 0.020679; Vehicles 0.027323

A comparison of Tables 6.7 and 6.6 may lead one to believe that the computational complexity required by a real-time causal anomaly detector is exceedingly high. As a matter of fact, it is not, if we consider that a real-time causal anomaly detector must update and recalculate its sample correlation/covariance matrix every time a new input data sample vector comes in, so that its computational complexity increases linearly with the number of data sample vectors that must be processed, as shown in Fig. 6.21, where the processing time for each pixel is nearly constant. Since a global anomaly detector only needs to calculate the sample correlation/covariance matrix once for all data sample vectors, one might expect the computing time of a real-time causal anomaly detector to be, roughly, the computing time required by a global anomaly detector multiplied by the total number of data sample vectors it processes. However, the computing time documented in Table 6.6 is significantly less than that. This tremendous savings is mainly due to the fact that a real-time causal anomaly detector makes use of a recursive causal update equation, specified by either (6.21) or (6.24), which uses only the innovation information provided by the current data sample vector to update the detector without reprocessing already visited data sample vectors.
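To make this concrete, the following is a minimal sketch, not the book's code, of the generic per-sample rank-one update that recursions such as (6.21) and (6.24) rely on (the exact forms of those equations are not reproduced here); it updates the inverse sample correlation matrix via the Sherman–Morrison formula using only the current sample:

```python
import numpy as np

def update_Rinv(Rinv_prev, r_n, n):
    """Sherman-Morrison rank-one update of the inverse sample correlation matrix.
    Uses R(n) = ((n - 1)/n) R(n - 1) + (1/n) r_n r_n^T, so only the innovation
    carried by the current sample r_n is needed; past samples are not revisited."""
    A_inv = (n / (n - 1.0)) * Rinv_prev      # inverse of ((n - 1)/n) R(n - 1)
    u = A_inv @ r_n
    return A_inv - np.outer(u, u) / (n + r_n @ u)
```

Starting from an initial block of at least L samples (so that R(L) is invertible), each new pixel then costs O(L²) instead of the O(L³) of recomputing a fresh inverse, which is consistent with the nearly constant per-pixel processing time seen in Fig. 6.21.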

6.7 RT-AD Using BIL

Since RT-AD using the sample correlation matrix R(n) to suppress the background
is carried out in the same way as RT-CEM using R(n), a theory similar to RT-CEM
using band-interleaved-line (BIL) in Sect. 5.4 can also be developed for RT-AD
using BIL as follows.

6.7.1 RT-R-AD Using Band-Interleaved-Line

Suppose that the size of a data matrix is M × N, where there are M data lines and each line has N pixels. Let L_l be the lth data line and L(n) the data comprising all the first n data lines, that is, L(n) = ∪_{l=1}^{n} L_l = {r_i^l}_{i=1,l=1}^{N,n}. Then R(L_l) = (1/N) Σ_{i=1}^{N} r_i^l (r_i^l)^T, where N is the number of pixels in each line, and R(L(n)) = (1/n) Σ_{l=1}^{n} R(L_l) = (nN)⁻¹ Σ_{l=1}^{n} Σ_{i=1}^{N} r_i^l (r_i^l)^T. To derive a real-time version of AD using BIL, we need the following matrix identity in Appendix A:

(A + BCD)⁻¹ = A⁻¹ - A⁻¹B(DA⁻¹B + C⁻¹)⁻¹DA⁻¹.   (6.27)

If we also let A = ((n - 1)/n)R(L(n - 1)), B = (1/n)R(L_n), and C = D = I, the identity matrix, then

A⁻¹B = (1/(n - 1))R⁻¹(L(n - 1))R(L_n) = R_{n|n-1},   (6.28)

and (6.27) becomes

(A + B)⁻¹ = A⁻¹ - A⁻¹B(A⁻¹B + I)⁻¹A⁻¹.   (6.29)

Now, using (6.29), we can further derive

R⁻¹(L(n)) = [((n - 1)/n)R(L(n - 1)) + (1/n)R(L_n)]⁻¹
          = (n/(n - 1)){R⁻¹(L(n - 1)) - R_{n|n-1}(R_{n|n-1} + I)⁻¹R⁻¹(L(n - 1))},   (6.30)

which can be implemented by the flowchart in Fig. 6.30.

Fig. 6.30 Flowchart of updating R⁻¹(L(n)) implementing (6.29)

According to the concept of causal processing and the recursive equation derived in (6.26), a real-time R-AD, referred to as RT-R-AD, can be formulated as

δ_n^{RT-R-AD}(X_n) = r^T R⁻¹(L(n)) X_n
  = (n/(n - 1)){r^T R⁻¹(L(n - 1)) X_n - r^T R_{n|n-1}(R_{n|n-1} + I)⁻¹ R⁻¹(L(n - 1)) X_n}
  = (n/(n - 1)){δ_{n-1}^{RT-R-AD}(X_n) - r^T [R_{n|n-1}(R_{n|n-1} + I)⁻¹ R⁻¹(L(n - 1))] X_n}.   (6.31)
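As an illustrative sketch (not the book's implementation), the line-by-line recursion (6.30) can be coded directly; `lines` below stands for hypothetical BIL data lines, each an L × N array, and a pseudo-inverse guards a possibly rank-deficient first line:

```python
import numpy as np

def rt_r_ad_bil(lines):
    """RT-R-AD over BIL lines: update R^-1(L(n)) via (6.30) one line at a time
    and score every pixel r of the current line by r^T R^-1(L(n)) r."""
    lines = list(lines)
    L, N = lines[0].shape
    Rinv = np.linalg.pinv(lines[0] @ lines[0].T / N)      # R^-1(L_1)
    scores = [np.einsum('ij,jk,ki->i', lines[0].T, Rinv, lines[0])]
    for n, Xn in enumerate(lines[1:], start=2):
        RLn = Xn @ Xn.T / Xn.shape[1]                     # R(L_n) of the new line
        Rt = Rinv @ RLn / (n - 1)                         # R_{n|n-1} of (6.28)
        Rinv = (n / (n - 1)) * (Rinv - Rt @ np.linalg.inv(Rt + np.eye(L)) @ Rinv)
        scores.append(np.einsum('ij,jk,ki->i', Xn.T, Rinv, Xn))
    return scores
```

Each line update inverts only the L × L matrix R_{n|n-1} + I once; the per-pixel score r^T R⁻¹(L(n)) r is then a plain quadratic form.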

6.7.2 RT-K-AD Using BIL

To implement RT-K-AD using BIL, we first express the sample covariance matrix K(L(n)) of the first n data lines L(n) as

K(L(n)) = (1/nN) Σ_{l=1}^{n} Σ_{i=1}^{N} (r_i^l - μ(L(n)))(r_i^l - μ(L(n)))^T
        = (1/nN) Σ_{l=1}^{n} Σ_{i=1}^{N} r_i^l (r_i^l)^T - 2[(1/nN) Σ_{l=1}^{n} Σ_{i=1}^{N} r_i^l] μ^T(L(n)) + μ(L(n))μ^T(L(n))
        = R(L(n)) - μ(L(n))μ^T(L(n)).   (6.32)

Using (6.31) and (6.32), we can derive RT-K-AD using BIL as follows:

δ_n^{RT-K-AD}(X_n) = (r - μ(L(n)))^T K⁻¹(L(n)) (X_n - μ(L(n)))
  = (r - μ(L(n)))^T { R⁻¹(L(n)) + [R⁻¹(L(n))μ(L(n))][μ^T(L(n))R⁻¹(L(n))] / [1 - μ^T(L(n))R⁻¹(L(n))μ(L(n))] } (X_n - μ(L(n)))
  = (r - μ(L(n)))^T R⁻¹(L(n)) (X_n - μ(L(n)))
    + (r - μ(L(n)))^T { [R⁻¹(L(n))μ(L(n))][μ^T(L(n))R⁻¹(L(n))] / [1 - μ^T(L(n))R⁻¹(L(n))μ(L(n))] } (X_n - μ(L(n)))
  = (r - μ(L(n)))^T [(1 - 1/n)R(L(n - 1))]⁻¹ (X_n - μ(L(n)))
    - (r - μ(L(n)))^T { [(1 - 1/n)R(L(n - 1))]⁻¹ [(1/√n)r][(1/√n)r]^T [(1 - 1/n)R(L(n - 1))]⁻¹ / [1 + (1/√n)r^T [(1 - 1/n)R(L(n - 1))]⁻¹ (1/√n)r] } (X_n - μ(L(n))).   (6.33)

Unlike RT-R-AD using BIL, where δ_n^{RT-R-AD}(X_n) can be updated recursively by δ_{n-1}^{RT-R-AD}(X_n) via (6.31), δ_n^{RT-K-AD}(X_n) specified by (6.33) cannot be updated recursively by δ_{n-1}^{RT-K-AD}(X_n). This implies that RT-K-AD does not have the same advantage as RT-R-AD in terms of being implemented as a Kalman filter.
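Although the full recursion attempted in (6.33) is not pursued further, the relation it starts from, K⁻¹ = R⁻¹ + (R⁻¹μ)(μ^TR⁻¹)/(1 - μ^TR⁻¹μ), which follows from (6.32) by the Sherman–Morrison formula, is easy to verify numerically; the snippet below is an illustrative check on hypothetical data, not code from the book:

```python
import numpy as np

# Numerical check of K^-1 = R^-1 + (R^-1 mu)(mu^T R^-1) / (1 - mu^T R^-1 mu),
# which follows from K = R - mu mu^T (6.32) via the Sherman-Morrison formula.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 200)) + 1.0      # L x N data with a nonzero mean
mu = X.mean(axis=1)
R = X @ X.T / X.shape[1]                 # sample correlation matrix
K = R - np.outer(mu, mu)                 # sample covariance matrix
Rinv = np.linalg.inv(R)
u = Rinv @ mu
Kinv = Rinv + np.outer(u, u) / (1.0 - mu @ u)
assert np.allclose(Kinv, np.linalg.inv(K))
```

In other words, K⁻¹(L(n)) can still be obtained from the recursively maintained R⁻¹(L(n)) and the running mean, even though the detector output δ_n^{RT-K-AD} itself admits no recursion in δ_{n-1}^{RT-K-AD}.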

In analogy with the experiments conducted for the 15-panel HYDICE scene in Sect. 5.6.1, Fig. 6.31a, b shows the final anomaly detection maps resulting from operating RT-K-AD using BIL and RT-R-AD using BIL, respectively, on the 15-panel HYDICE data in Fig. 6.26b for comparison, where detection maps are shown as grayscale maps, color maps, and color maps in dB.
Despite the fact that both RT-K-AD and RT-R-AD produced nearly the same results, with no appreciable differences in the grayscale detection maps in Fig. 6.31 by inspection, there remain some visible differences in the detection of panels in rows 4 and 5 shown in the color detection maps, due to different levels of background suppression. As also shown in Fig. 6.31, the color detection maps suppressed more background than the color detection maps in dB did, because the dB scale closes the gap in gray values between the detected panel pixels and the background. As discussed in Chaps. 5 and 16 of Chang (2016), background suppression is a crucial factor in performance assessment when no prior knowledge is available. To see this further, Figs. 6.32 and 6.33 show sequences of progressive color detection maps produced by RT-K-AD using BIL and RT-R-AD using BIL, respectively, along with their respective results in dB shown in Figs. 6.34 and 6.35, where the color detection maps again suppressed more background than the color detection maps in dB did.

Fig. 6.31 Final detection maps of RT-AD using BIL on 15-panel HYDICE data in Fig. 6.26b: (a) δ_n^{RT-K-AD}(X_n), (b) δ_n^{RT-R-AD}(X_n)

Fig. 6.32 RT-K-AD using K(L(n)) on HYDICE

Fig. 6.33 RT-R-AD using R(L(n)) on HYDICE



Fig. 6.34 RT-K-AD using K(L(n)) in dB on HYDICE

Fig. 6.35 RT-R-AD using R(L(n)) in dB on HYDICE



As we can see from the foregoing results, progressive detection maps provide more
information than one-shot results (Fig. 6.31). For example, the panel pixels in row
1 were visibly detected in the first few passes and began to fade away when more
panel pixels were detected.
With regard to experiments on the synthetic image data in Fig. 6.2, the interested reader is referred to Li and Chang (2016), which provides detailed experiments.

6.8 Conclusions

One of the most important applications in hyperspectral data exploitation is AD. However, to see how effectively an anomaly detector can perform, real-time
processing is more practical in real-world applications, specifically in the detection
of moving targets or instantaneous targets. Most significantly, real-time processing
AD also provides an unparalleled advantage that commonly used anomaly detectors
cannot offer, that is, time-varying progressive changes in different levels of back-
ground suppression for visual assessment and evaluation. Unfortunately, a true real-
time causal processing algorithm generally does not exist if real-time processing is
interpreted as having input and output data simultaneously. However, from a practical point of view, as long as an algorithm can process data in negligible time while satisfying the constraints imposed by specific applications, it can be viewed as a real-time processing algorithm. With this interpretation, many algorithms currently claimed to
be real-time processing algorithms are actually fast computational algorithms that
explore various data organizations, parallel structures, Field Programmable Gate
Array (FPGA) architectures, and so forth. Nevertheless, there is a missing element
in such real-time processing algorithms, which is causality, a very important
prerequisite in real-time processing. In other words, a real-time processing algo-
rithm must also be a causal algorithm because a real-time processing algorithm
does not have access to future inputs during the course of data processing. This is
particularly true of many anomaly detectors using local windows, which are
actually not causal at all and so are not actually real-time processing detectors
(Chap. 18). This chapter is believed to be the first work devoted to incorporating
this concept into AD. Specifically, it further derives a causal innovation information
update equation for implementing real-time causal AD. To investigate the compu-
tational complexity issue, a comprehensive comparative analysis on computer
processing time of running causal and real-time causal AD-based anomaly detec-
tors was conducted at theoretical and experimental levels. Finally, the real image
experiments conducted in Sect. 6.6 brought up an interesting and intriguing issue in
time-varying progressive background suppression, which has a substantial impact
on AD. In global AD, very little has been done with respect to background
suppression. However, as demonstrated in AD of LCVF in Fig. 6.21, real-time
processing offers a significant advantage in terms of revealing time-varying
changes in the suppression of background information as time progresses, where
various levels of background suppression produce different rates of false alarms,
and thus has a tremendous effect on visual assessment. This issue is worth pursuing
further but beyond the scope of this chapter.
Part III
Signature Spectral Statistics-Based Recursive Hyperspectral Sample Processing

Part III develops signature spectral statistics-based recursive hyperspectral sample processing (RHSP) algorithms that allow data processing to be executed sample by
sample in a progressive manner but that also produce signatures in a recursive
manner. The main goal of Part III is to find new target signatures signature by
signature recursively while data processing is being carried out sample by sample
progressively. This part is completely different from Part II, which implements
RHSP algorithms in real time sample by sample in a recursive manner like a
Kalman filter but without finding signatures. In particular, these algorithms produce
new targets one at a time recursively, and their processing can also be implemented
in a progressive manner. More specifically, the algorithms developed in Part III
grow signature matrices by augmenting one target signature at a time without
reference to the causal sample correlation matrix/causal sample covariance matrix
(CSCRM/CSCVM) used in Part II. Since such grown signature matrices only
involve newly generated target signatures, while the previously known and avail-
able target signatures remain unchanged, these algorithms can also be implemented
as recursive algorithms in the sense that only newly generated target signatures are
used to update data information for processing. In other words, as mentioned in Part
II, a recursive process decomposes data information into three pieces of informa-
tion: (1) processed information obtained by processing already visited data sample
vectors, (2) new information given by data sample vectors currently being
processed, and (3) innovation information provided by new information that cannot
be obtained by the processed information. Thus, a recursive process makes use of
recursive equations to update results only through innovation information. It is this
innovation information that significantly reduces computational complexity and
processing time. In addition, this innovation information can not only be used to
generate new target signatures of interest; it can also be used to determine how
many of these generated target signatures are indeed real targets. To resolve the
issue of how to automatically terminate a recursive process in real time, a Neyman–
Pearson detector (NPD) is developed. The idea is to compute and consider the
maximal orthogonal projection (OP) leakage of each newly generated target signature into the complement subspaces linearly spanned by previous target signatures
as a signal source and then use this signal source to formulate a binary hypothesis
testing problem. A desired NPD can be developed in a manner similar to how the
Harsanyi–Farrand–Chang (HFC) method was developed by Harsanyi et al. (1994)
to estimate the virtual dimensionality (VD), which is extended to target-specified
VD in Chap. 4, where target signatures to be specified can be generated by a
recursive process. The NPD determines whether or not a signal source specified
by the maximal OP leakage of a target signature fails the test, in which case the
considered target signature is declared to be a real target. Since the maximal OP
leakages calculated from progressively generated targets are monotonically
decreasing, the NPD generally fails in the beginning and the test is then continued
until it reaches the first time the NPD passes the test, in which case the NPD is
terminated. Because Neyman–Pearson detection is performed in conjunction with a
recursive process that generates each new target signature, it can be implemented in
real time while a new target is being generated at the same time.
Part III is mainly focused on RHSP. Chapter 7, “Recursive Hyperspectral
Sample Processing of the Automatic Target Generation Process,” extends a well-
known automatic target detection algorithm, the automatic target generation pro-
cess (ATGP), to RHSP-ATGP, which can implement ATGP recursively. Using an
idea similar to that used to derive RHSP-ATGP, Chap. 8, “Recursive Hyperspectral
Sample Processing of Orthogonal Subspace Projection,” also extends the well-
known orthogonal subspace projection (OSP) to RHSP-OSP. Similar to OP used
by both ATGP and OSP to derive recursive equations, a second application is
discussed in Chap. 9, “Recursive Hyperspectral Sample Processing of Linear
Spectral Mixture Analysis,” and Chap. 10, “Recursive Hyperspectral Sample
Processing of Maximum Likelihood Estimation,” use least-squares error (LSE) in
RHSP-LSMA and RHSP-MLE, respectively. Finally, growing simplex volume
analysis (GSVA) is considered as a third application in finding endmembers.
Chapter 11, “Recursive Hyperspectral Sample Processing of Growing Simplex
Volume Analysis” makes use of OP to derive a recursive version of SGA developed
in Chang et al. (2006) called RHSP OP-Based Simple Growing Algorithm (RHSP-
OPSGA), which not only quickly computes simplex volumes but also significantly
reduces computer processing time. With another new approach different from
RHSP-OPSGA, Chap. 12, “Recursive Hyperspectral Sample Processing of
Geometric Simplex Growing Algorithm” makes use of the Gram–Schmidt orthog-
onalization process (GSOP) to derive an RHSP-geometric SGA (RHSP-GSGA). It
turns out that RHSP-OPSGA and RHSP-GSGA are the best algorithms of all the
SGA-based variants reported in the literature in terms of computational complexity
and computing time.
Chapter 7
Recursive Hyperspectral Sample Processing
of Automatic Target Generation Process

Abstract The automatic target generation process (ATGP) described in Sect. 4.4.2.3 is an active unsupervised subpixel target detection technique. It has
been used in a wide range of applications in hyperspectral image analysis to find
unknown targets and endmembers. Since it is a pixel-based technique, it can be very
easily implemented in real time. In addition, because it is also unsupervised, it can
be used to find unknown targets automatically without prior knowledge via a
succession of orthogonal subspace projections (OSPs) to search for potential targets
of interest in sequence. In this regard, ATGP is actually a progressive target
detection technique in the sense that it finds one target at a time in real time and
one target after another progressively. However, ATGP cannot be implemented as a
real-time process to find all targets because ATGP repeatedly implements OSPs
on growing subspaces, which are augmented by newly found targets one at a time.
This process requires a significant amount of computing time, which will grow
exponentially as the number of targets is increased. Another issue is that ATGP does not have an automatic stopping rule to terminate the process in real time. This chapter
develops a recursive version of ATGP, called recursive hyperspectral sample
processing of ATGP (RHSP-ATGP), to address these two issues. By taking
advantage of the fact that target subspaces are nested in a cascade in the sense
that previous target subspaces are always embedded as part of subsequently
generated target subspaces, RHSP-ATGP derives recursive equations to update
current target subspaces without reprocessing previously generated target sub-
spaces. As a result, it works as if it were a Kalman filter as a real-time processing
algorithm. To terminate RHSP-ATGP in real time, a Neyman–Pearson detector
based on target-specified virtual dimensionality, developed in Chap. 4, is further
developed to test each target found by ATGP in real time to determine whether
ATGP is to be terminated. This idea is similar to the Harsanyi–Farrand–Chang
method used to estimate virtual dimensionality (VD) (Harsanyi et al. Annual
meeting, proceedings of American society of photogrammetry and remote sens-
ing, Reno, 236–247, 1994a; Chang Hyperspectral imaging: techniques for spec-
tral detection and classification. Dordrecht: Kluwer Academic, 2003a; Chang and
Du, IEEE Transactions on Geoscience and Remote Sensing 42:608–619, 2004).

© Springer International Publishing Switzerland 2017
C.-I Chang, Real-Time Recursive Hyperspectral Sample and Band Processing,
DOI 10.1007/978-3-319-45171-8_7

7.1 Introduction

The automatic target generation process (ATGP), developed by Ren and Chang
(2003), was originally designed to find targets of interest in an unsupervised manner
when no prior knowledge is available. Due to its effectiveness and simple imple-
mentation, ATGP has become a versatile technique and has also proven very useful
in many applications. For example, it has been used for anomaly detection (Chang
2003a), endmember finding (Chang and Plaza 2006; Chang 2013, 2016),
unsupervised linear spectral mixture analysis (Chang et al. 2010a), and magnetic
resonance imaging (Wang et al. 2002). Its Field Programmable Gate Array (FPGA)
design (Bernabe et al. 2011) and graphics processing unit (GPU) implementation
(Bernabe et al. 2013) have also been studied. Most recently, it was shown that many
algorithms, such as vertex component analysis (VCA) developed by Nascimento
and Bioucas-Dias (2005) and the simplex growing algorithm (SGA) developed by
Chang et al. (2006a, b, c, d), can be considered as its variants (Chang et al. 2013),
and their underlying ideas are exactly the same as that of ATGP (Chang 2013,
Chap. 11, 2016, Chap. 13; Chang et al. 2016a, b).
Although ATGP has already been proven very effective, it still has room for improvement from a practical point of view, specifically, its repeated implementation of orthogonal subspace projections (OSPs). Assume that t_1 is its initial target, obtained by finding a data sample vector with maximal vector length, that is, t_1 = arg max_r {‖r‖}, to produce an initial target subspace ⟨U_1⟩ generated by the matrix U_1 = [t_1]. By means of U_1, ATGP finds its second target, t_2, with the maximal vector length in the subspace orthogonal to ⟨U_1⟩, that is, t_2 = arg max_r {(P⊥_{U_1}r)^T(P⊥_{U_1}r)}, where P⊥_{U_1} = I - t_1(t_1^T t_1)⁻¹t_1^T. Suppose that t_1, t_2, ..., t_{p-1} are the p - 1 targets already found by ATGP. Forming U_{p-1} = [t_1 t_2 ⋯ t_{p-1}] as a target subspace, ATGP can extract its pth target by finding a data sample vector with the maximal vector length in the subspace orthogonal to ⟨U_{p-1}⟩ as follows:

t_p = arg max_r {(P⊥_{U_{p-1}}r)^T(P⊥_{U_{p-1}}r)},   (7.1)

where

P⊥_{U_{p-1}} = I - U_{p-1}(U_{p-1}^T U_{p-1})⁻¹U_{p-1}^T,   (7.2)

with ⟨U_{p-1}⟩ defined as the target subspace linearly spanned by the p - 1 targets, t_1, t_2, ..., t_{p-1}, already found by ATGP. In this case, a sequence of nested embedded target subspaces {⟨U_p⟩}_p can be produced as ⟨U_1⟩ ⊂ ⟨U_2⟩ ⊂ ⋯ ⊂ ⟨U_{p-1}⟩ ⊂ ⟨U_p⟩ ⊂ ⋯ ⊂ ⟨U_k⟩, where target subspaces generated by targets previously found by ATGP are embedded in, and are also part of, the target subspaces produced by targets subsequently found by ATGP.
For ATGP to find a set of p targets, t_1, ..., t_p, it must repeatedly implement (7.1) and (7.2) p times. It does not take full advantage of the fact that {⟨U_p⟩}_p is a sequence of nested embedded subspaces in which U_{p-1} in U_p = [U_{p-1} t_p] is fixed and remains unchanged as ATGP finds all subsequent targets beyond the first p - 1 targets. Second, in Ren and Chang (2003), ATGP derived a stopping measure, called the orthogonal projection correlation index (OPCI), defined by

t_1^T P⊥_{U_{p-1}} t_1   (7.3)

to be used to terminate the algorithm. Unfortunately, it requires an error threshold ε to be determined in advance, that is, t_1^T P⊥_{U_{p-1}} t_1 < ε. The selection of ε is empirical but not automatic. Thus, this chapter addresses the aforementioned issues and further develops a recursive version of ATGP, referred to as recursive hyperspectral sample processing of ATGP (RHSP-ATGP), described in the following section.
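To make the repeated-OSP cost concrete before deriving the recursion, here is a minimal sketch of plain ATGP per (7.1) and (7.2) (illustrative Python, not the book's code; pixels are assumed to be stored as the columns of an L × N array); note that the projector is rebuilt from scratch, with a fresh pseudo-inverse, for every new target:

```python
import numpy as np

def atgp(X, num_targets):
    """Plain ATGP per (7.1)-(7.2): X is an L x N matrix whose columns are
    data sample vectors. A new projector inverse is recomputed at every step."""
    L, _ = X.shape
    U = np.zeros((L, 0))                           # U_0: empty target matrix
    targets = []
    for _ in range(num_targets):
        # (7.2): P_perp = I - U (U^T U)^-1 U^T, recomputed from scratch
        P_perp = np.eye(L) - U @ np.linalg.pinv(U)
        # (7.1): pick the sample with maximal orthogonal-projection length
        scores = np.einsum('ij,ji->i', X.T @ P_perp, X)
        t = X[:, np.argmax(scores)]
        targets.append(t)
        U = np.column_stack([U, t])
    return np.column_stack(targets)
```

This per-target rebuild is exactly the overhead that the recursive version derived next removes.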

7.2 Recursive Hyperspectral Sample Processing of ATGP

The key step in implementing ATGP is (7.2), which requires repeatedly calculating P⊥_{U_{p-1}} as p is increased. Assume that U_{p-1} = [t_1 ⋯ t_j ⋯ t_{p-1}] is a matrix of size L × (p - 1) made up of the p - 1 previously found targets. Also, let t_p be a new L-dimensional target vector to be added to form a new target matrix, U_p = [U_{p-1} t_p] = [t_1 t_2 ⋯ t_{p-1} t_p]. Then, according to a matrix inverse identity derived in Appendix A and Chang (2013, Eq. 12.25), the inverse of U_p^T U_p can be expressed as

[U_p^T U_p]⁻¹ = [ U_{p-1}^T U_{p-1}    U_{p-1}^T t_p
                  t_p^T U_{p-1}        t_p^T t_p     ]⁻¹

              = [ (U_{p-1}^T U_{p-1})⁻¹ + β U#_{p-1} t_p t_p^T (U#_{p-1})^T    -β U#_{p-1} t_p
                  -β t_p^T (U#_{p-1})^T                                        β              ],   (7.4)

where U#_{p-1} = (U_{p-1}^T U_{p-1})⁻¹ U_{p-1}^T and

β = {t_p^T [I - U_{p-1}(U_{p-1}^T U_{p-1})⁻¹ U_{p-1}^T] t_p}⁻¹ = {t_p^T P⊥_{U_{p-1}} t_p}⁻¹,   (7.5)
U#_p = (U_p^T U_p)⁻¹ U_p^T = [ U#_{p-1} + β U#_{p-1} t_p t_p^T (U#_{p-1})^T U_{p-1}^T - β U#_{p-1} t_p t_p^T
                               -β t_p^T (U#_{p-1})^T U_{p-1}^T + β t_p^T                                  ],   (7.6)

U_p U#_p = [U_{p-1} t_p](U_p^T U_p)⁻¹ U_p^T
         = U_{p-1}U#_{p-1} + β U_{p-1}U#_{p-1} t_p t_p^T (U#_{p-1})^T U_{p-1}^T - β U_{p-1}U#_{p-1} t_p t_p^T
           - β t_p t_p^T (U#_{p-1})^T U_{p-1}^T + β t_p t_p^T,   (7.7)

P⊥_{U_p} = I - U_p U#_p = P⊥_{U_{p-1}} - β(ũ_{p-1}ũ_{p-1}^T - ũ_{p-1}t_p^T - t_pũ_{p-1}^T + t_pt_p^T)
         = P⊥_{U_{p-1}} - β(ũ_{p-1} - t_p)(ũ_{p-1} - t_p)^T,   (7.8)

r^T P⊥_{U_p} r = r^T P⊥_{U_{p-1}} r - β r^T (ũ_{p-1} - t_p)(ũ_{p-1} - t_p)^T r
              = r^T P⊥_{U_{p-1}} r - β[(ũ_{p-1} - t_p)^T r]²,   (7.9)

where t_{p+1} can also be found by (7.1), and (7.9) is used to find the next target, t_{p+1}. As noted, P⊥_{U_{p-1}} = I - U_{p-1}U#_{p-1} = I - U_{p-1}(U_{p-1}^T U_{p-1})⁻¹U_{p-1}^T, t_p is obtained by (7.1), and ũ_{p-1} = U_{p-1}U#_{p-1}t_p with t_p^Tũ_{p-1} = ũ_{p-1}^Tt_p. It should also be pointed out that r^T(ũ_{p-1} - t_p) = (ũ_{p-1} - t_p)^T r and β are scalars, where t_p^Tũ_{p-1} = ũ_{p-1}^Tt_p and β = {t_p^T P⊥_{U_{p-1}} t_p}⁻¹ are used to account for the correlation between U_{p-1} and t_p.
According to (7.2), the P⊥_{U_{p-1}} implemented by ATGP can be updated from P⊥_{U_{p-2}} without recalculating P⊥_{U_{p-1}} using all p - 1 targets, t_1, t_2, ..., t_{p-1}. Also, replacing t_1 in (7.3) with t_p yields (P⊥_{U_{p-1}}t_p)^T(P⊥_{U_{p-1}}t_p) = t_p^T P⊥_{U_{p-1}} t_p, which can be updated by (7.9) via P⊥_{U_{p-1}}. Thus, using (7.8) and (7.9), ATGP can update P⊥_{U_{p-1}} recursively, while proceeding to produce the new target t_p progressively, signature by signature, as p is increased. The resulting ATGP is called recursive hyperspectral sample processing of ATGP (RHSP-ATGP), and its step-by-step implementation is given as follows.

Recursive Hyperspectral Sample Processing of ATGP

• The outer loop is a successive process indexed by the parameter p to find a growing set of target signal sources, {t_p}_{p=1}^{L}.

1. Initial condition:
   Find an initial target pixel vector, t_1 = arg{max_r r^T r}. Set U_1 = [t_1], and calculate P⊥_{U_1}.

• Inner Loop
   (a) Progressive process indexed by parameter i (running through all data sample vectors {r_i}_{i=1}^{N}):
       For p ≥ 2, find t_p by maximizing r_i^T P⊥_{U_{p-1}} r_i via (7.9) over all data sample vectors, r.
       It is worth noting that r_i^T P⊥_{U_{p-1}} r_i in (7.9) is identical to (P⊥_{U_{p-1}}r_i)^T(P⊥_{U_{p-1}}r_i) owing to the fact that (P⊥_{U_{p-1}})^T = P⊥_{U_{p-1}} and P⊥_{U_{p-1}} is idempotent. That is, (P⊥_{U_{p-1}}r_i)^T(P⊥_{U_{p-1}}r_i) = r_i^T(P⊥_{U_{p-1}})^T P⊥_{U_{p-1}} r_i = r_i^T P⊥_{U_{p-1}} r_i.
   (b) Recursive process updating P⊥_{U_p}:
       Use (7.6) and (7.8) to update U#_p and P⊥_{U_p} via the previously calculated U#_{p-1} and P⊥_{U_{p-1}}.
• End (Inner Loop)

2. Stopping rule:
   If a stopping rule is satisfied (to be discussed in Sect. 7.3), RHSP-ATGP is terminated. Otherwise, let p ← p + 1 and go to step 1(a).

• End (Outer Loop)
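The loop above can be sketched as follows (illustrative Python, not the book's code); since ũ_{p-1} - t_p = -P⊥_{U_{p-1}}t_p, the update (7.8) needs only one matrix-vector product and one outer product per new target, and the only "inverse" is the scalar β of (7.5):

```python
import numpy as np

def rhsp_atgp(X, num_targets):
    """Sketch of RHSP-ATGP: X is an L x N matrix of data sample vectors.
    P_perp is updated recursively via (7.8); no matrix inverse is formed."""
    L, _ = X.shape
    P = np.eye(L)                                   # P_perp for U_0 (empty set)
    targets = []
    for _ in range(num_targets):
        scores = np.einsum('ij,ji->i', X.T @ P, X)  # r^T P_perp r for every r
        t = X[:, np.argmax(scores)]
        targets.append(t)
        # (7.8) with u_{p-1} - t_p = -P_perp t_p, beta = 1/(t_p^T P_perp t_p)
        v = P @ t
        P = P - np.outer(v, v) / (t @ v)
    return np.column_stack(targets)
```

Compared with the plain ATGP of Sect. 7.1, no (U_p^T U_p)⁻¹ is ever computed.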
Figure 7.1 shows a flowchart diagram illustrating the implementation of RHSP-ATGP, which consists of three processes: a progressive process that runs through the data sample vectors sample by sample to calculate the maximal value of r_i^T P⊥_{U_{p-1}} r_i to find t_p; a recursive process used to update U#_p and P⊥_{U_p}; and a successive process to grow the target signal sources, {t_p}_{p=1}^{L}, where n_VD is a stopping rule that can be estimated by the Neyman–Pearson detector described in Sect. 7.3.
One significant advantage of RHSP-ATGP over ATGP is that no matrix inverse calculation is required, whereas ATGP must compute the matrix inverse in (7.1) and (7.2) repeatedly as p grows. In other words, according to RHSP-ATGP, the only inverse that needs to be calculated is the inverse of a scalar, (t_1^T t_1)⁻¹, in the initial condition specified by P⊥_{U_1} = I - t_1(t_1^T t_1)⁻¹t_1^T = I - (t_1^T t_1)⁻¹t_1t_1^T, which turns out to involve only an outer product of t_1. This reduces the computational complexity significantly in hardware design. If real-world applications have some partial knowledge that can be used as initial conditions, RHSP-ATGP can be modified by replacing U_1 with U_1 = [t_{11} t_{12} ⋯ t_{1,initial}], where t_{11}, t_{12}, ..., t_{1,initial} are the initial targets provided by prior knowledge, in which case P⊥_{U_1} requires calculating the matrix inverse, U#_1, only once. Nevertheless, this is the only time we need to calculate a matrix inverse.
Fig. 7.1 Flowchart of RHSP-ATGP

Finally, what follows is a summary of four advantages of RHSP-ATGP over ATGP:
1. It uses recursive update equations to update target subspaces without
reprocessing targets already obtained.
2. There is no need to invert a matrix as with ATGP for finding each new target.
3. It significantly reduces the computational complexity in hardware design because of its recursive structure.
4. It provides an automatic stopping rule derived by the Neyman–Pearson detection
theory to terminate the algorithm, as described in the following section.

7.3 Determination of Targets for RHSP-ATGP to Generate

To terminate RHSP-ATGP, a stopping rule was proposed in Ren and Chang (2003), where (7.3) was introduced as an error threshold for convergence. Interestingly, we can modify the OPCI by replacing t_1 in (7.3) with the pth target, t_p, obtained by (7.1), to yield a new measure,

η_p = t_p^T P⊥_{U_{p-1}} t_p,   (7.10)

which turns out to be exactly the same signal source as that under the binary hypothesis testing problem considered in the maximal orthogonal subspace projection (MOSP) in Chang et al. (2011c), where the energy of the pth target signal t_p, ‖t_p‖², was used to determine the virtual dimensionality (VD) or the rank of rare signals by the maximal orthogonal complement algorithm (MOCA) in Kuybeda et al. (2007). The only difference between the OPCI in (7.3) and η_p in (7.10) is that the former requires a prescribed error ε to terminate ATGP, while the latter needs a predetermined false alarm probability to terminate RHSP-ATGP.
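As a quick illustrative sketch on hypothetical data (not the book's code), the sequence {η_p} of (7.10) can be computed alongside the recursive projector update; because the target subspaces are nested, the orthogonal complement only shrinks, so the sequence is monotonically nonincreasing, which is what makes a one-sided test on it sensible:

```python
import numpy as np

def eta_sequence(X, k):
    """eta_p = t_p^T P_perp_{U_{p-1}} t_p of (7.10), computed recursively.
    X is an L x N matrix of data sample vectors; k <= L targets are generated."""
    L, _ = X.shape
    P = np.eye(L)                                    # P_perp for the empty U_0
    etas = []
    for _ in range(k):
        scores = np.einsum('ij,ji->i', X.T @ P, X)   # r^T P_perp r for all pixels
        j = int(np.argmax(scores))
        etas.append(float(scores[j]))                # eta_p = max_r r^T P_perp r
        v = P @ X[:, j]                              # P_perp t_p
        P -= np.outer(v, v) / (X[:, j] @ v)          # deflation, as in (7.8)
    return etas
```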
Following the target-specified virtual dimensionality (TSVD) developed in Sect. 4.5.1 of Chap. 4, we assume that for each 1 ≤ p ≤ L, U_0 = ∅, and U_{p-1} is the target space linearly spanned by the p − 1 targets {t_j^{RHSP-ATGP}}_{j=1}^{p-1} previously found by RHSP-ATGP, which is exactly the space U_{p-1} described in the RHSP-ATGP algorithm. Then for 1 ≤ p ≤ L we can find

t_p^{RHSP-ATGP} = arg max_r {||P⊥_{U_{p-1}} r||}  (which is the same as (7.1)),   (7.11)

η_p^{RHSP-ATGP} = ||P⊥_{U_{p-1}} t_p^{RHSP-ATGP}||^2 = max_r {||P⊥_{U_{p-1}} r||^2},   (7.12)

where η_p^{RHSP-ATGP} in (7.12) is the maximal residual of the pth target data sample, t_p^{RHSP-ATGP}, found by RHSP-ATGP leaked from <U_{p-1}> into <U_{p-1}>⊥, the complement space orthogonal to the space <U_{p-1}>. This η_p^{RHSP-ATGP} is exactly the same as v_k specified by equation (7.11) in Kuybeda et al. (2007). It should be noted that ||P⊥_{U_0} r||^2 = r^T r in (7.11) when p = 1. Since η_p^{RHSP-ATGP} in (7.12) calculates the maximum residual of t_p^{RHSP-ATGP} leaked into <U_{p-1}>⊥, it is this sequence, {η_p^{RHSP-ATGP}}_{p=1}^L, that will be used as the signal source in a binary composite hypothesis testing problem formulated as follows to determine whether or not the pth potential target candidate, t_p^{RHSP-ATGP}, is a true target by a detector.

||P⊥_{U_{p-1}} t_p^{RHSP-ATGP}||^2 = [P⊥_{U_{p-1}} t_p^{RHSP-ATGP}]^T [P⊥_{U_{p-1}} t_p^{RHSP-ATGP}]
                                 = (t_p^{RHSP-ATGP})^T (P⊥_{U_{p-1}})^T P⊥_{U_{p-1}} t_p^{RHSP-ATGP}.   (7.13)

Since P⊥_{U_{p-1}} is symmetric and idempotent, that is, (P⊥_{U_{p-1}})^T P⊥_{U_{p-1}} = P⊥_{U_{p-1}}, (7.13) is reduced to

||P⊥_{U_{p-1}} t_p^{RHSP-ATGP}||^2 = (t_p^{RHSP-ATGP})^T P⊥_{U_{p-1}} t_p^{RHSP-ATGP} = η_p^{RHSP-ATGP},   (7.14)

which is exactly (7.12). This signal energy can be used as a signal source in a binary composite hypothesis testing problem (Poor 1994) to determine whether the pth potential target candidate t_p^{RHSP-ATGP} is a true target by a detector. That is, for p = 1, 2, ..., L,

H_0: η_p^{RHSP-ATGP} governed by a pdf p(η_p^{RHSP-ATGP}|H_0) = p_0(η_p^{RHSP-ATGP})
versus
H_1: η_p^{RHSP-ATGP} governed by a pdf p(η_p^{RHSP-ATGP}|H_1) = p_1(η_p^{RHSP-ATGP}),   (7.15)

where the alternative hypothesis H_1 and the null hypothesis H_0 represent two cases of t_p^{RHSP-ATGP}, as a target signal source under H_1 and not a target signal source under H_0, respectively, in the sense that H_0 represents the maximum residual resulting from the background signal sources, while H_1 represents the maximum residual leaked from the target signal sources. To make (7.15) work, we need to find probability distributions under both hypotheses. The assumptions made on (7.15) are derived from Kuybeda et al. (2007), where H_0 represents a background that can be characterized by a Gaussian distribution and H_1 represents targets of interest that are distributed uniformly. That is, according to Kuybeda et al. (2007), the signal P⊥_{U_{p-1}} r can be modeled by a Gaussian distribution so that ||P⊥_{U_{p-1}} r||^2 is a chi-squared distribution that can be approximated by a Gaussian distribution asymptotically. Then the maximal value of ||P⊥_{U_{p-1}} r||^2 over r, η_p^{RHSP-ATGP} = max_r {||P⊥_{U_{p-1}} r||^2} via (7.12), can be shown by the extreme value theory (Leadbetter 1987) to converge to a Gumbel distribution, as was done in Kuybeda et al. (2007). For details on deriving probability distributions, we refer the interested reader to Kuybeda et al. (2007).
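The symmetry and idempotency of P⊥_{U_{p-1}} that reduce (7.13) to (7.14) are easy to verify numerically. A minimal check, with the projector built via a pseudo-inverse and all dimensions chosen arbitrarily for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
L, p = 8, 3
U = rng.normal(size=(L, p))               # p - 1 previously found targets
P = np.eye(L) - U @ np.linalg.pinv(U)     # complement projector P_perp

# symmetric and idempotent, so (P^T)(P) = P and (7.13) collapses to (7.14)
assert np.allclose(P, P.T)
assert np.allclose(P @ P, P)

t = rng.normal(size=L)                    # a candidate target sample
eta = float(t @ P @ t)                    # eta_p of (7.14)
assert np.isclose(eta, np.linalg.norm(P @ t) ** 2)
```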
By virtue of the Neyman–Pearson detection theory, we can derive the following Neyman–Pearson detector, which maximizes the detection power P_D subject to the false alarm probability P_F, which in turn determines the threshold value τ_p:

δ^NP_{RHSP-ATGP}(η_p^{RHSP-ATGP}) = 1,                      if Λ(η_p^{RHSP-ATGP}) > τ_p,
                                  = 1 with probability κ,   if Λ(η_p^{RHSP-ATGP}) = τ_p,
                                  = 0,                      if Λ(η_p^{RHSP-ATGP}) < τ_p,   (7.16)
 
where the likelihood ratio test Λ(η_p^{RHSP-ATGP}) is given by Λ(η_p^{RHSP-ATGP}) = p_1(η_p^{RHSP-ATGP}) / p_0(η_p^{RHSP-ATGP}), with p_0(η_p^{RHSP-ATGP}) and p_1(η_p^{RHSP-ATGP}) given by (7.15), and the constant κ is the probability of saying H_1. Thus, a case of η_p^{RHSP-ATGP} = t_p^T P⊥_{U_{p-1}} t_p > τ_p indicates that δ^NP_{RHSP-ATGP}(η_p^{RHSP-ATGP}) in (7.16) fails the test, in which case t_p^{RHSP-ATGP} is assumed to be a desired target. Note that the test for (7.16) must be performed for each of the potential target candidates. Therefore, for a different value of p, the threshold τ_p varies. Using (7.16) the VD can be determined by calculating

VD^NP_{RHSP-ATGP}(P_F) = arg max_p {⌊δ^NP_{RHSP-ATGP}(η_p^{RHSP-ATGP})⌋ = 1},   (7.17)

where P_F is a predetermined false alarm probability, ⌊δ^NP_{RHSP-ATGP}(η_p^{RHSP-ATGP})⌋ = 1 only if δ^NP_{RHSP-ATGP}(η_p^{RHSP-ATGP}) = 1, and ⌊δ^NP_{RHSP-ATGP}(η_p^{RHSP-ATGP})⌋ = 0 if δ^NP_{RHSP-ATGP}(η_p^{RHSP-ATGP}) < 1.
A comment on (7.17) is worthwhile. The sequence {η_p^{RHSP-ATGP}} is monotonically decreasing. Thus, the sequence {δ^NP_{RHSP-ATGP}(η_p^{RHSP-ATGP})} starts with a failure of the test (7.16), that is, δ^NP_{RHSP-ATGP}(η_p^{RHSP-ATGP}) = 1, and continues as the value of p is increased until p reaches the value where the test is passed, in which case δ^NP_{RHSP-ATGP}(η_p^{RHSP-ATGP}) < 1. The largest value of p that makes δ^NP_{RHSP-ATGP}(η_p^{RHSP-ATGP}) = 1 is the value of VD^NP_{RHSP-ATGP}(P_F) according to (7.17). This unique property allows VD^NP_{RHSP-ATGP}(P_F) to be implemented in real time as the process is continued with increasing p.
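This monotone behavior is what makes the stopping rule implementable on the fly. The toy sketch below replaces the Gumbel-based Neyman–Pearson thresholds with fixed hypothetical numbers τ_p purely for illustration; only the "largest p whose test still fails" logic of (7.17) is faithful:

```python
def vd_from_residuals(eta, tau):
    """Toy illustration of (7.17): eta is the monotonically decreasing
    residual sequence {eta_p}, tau the per-stage thresholds tau_p (in the
    text these come from a Neyman-Pearson test at a given false alarm
    probability; fixed numbers are used here instead).  delta_p = 1 while
    eta_p exceeds tau_p; VD is the largest such p."""
    vd = 0
    for p, (e, t) in enumerate(zip(eta, tau), start=1):
        if e > t:          # the test "fails", i.e., a target is declared
            vd = p
    return vd

eta = [9.0, 7.5, 4.2, 1.1, 0.9, 0.2]   # hypothetical residual sequence
tau = [2.0] * len(eta)                  # hypothetical thresholds
vd = vd_from_residuals(eta, tau)        # -> 3: the first three exceed tau
```

Because the sequence is monotone, the loop can stop as soon as the test is first passed, which is what allows the rule to run in real time alongside target generation.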

7.4 Synthetic Image Experiments

The synthetic image data shown in Fig. 7.2 (also shown in Fig. 1.15) have two
scenarios, target implantation (TI) and target embeddedness (TE), that can be used
for experiments.
Fig. 7.2 Set of 25 panels simulated by A, B, C, K, and M (legend: 100 %; 50 % signal + 50 % any other four; 50 % signal + 50 % background; 25 % signal + 75 % background)

There are 25 panels with five 4 × 4 pure pixel panels for each row in the first column, five 2 × 2 pure pixel panels for each row in the second column, five 2 × 2 mixed pixel panels for each row in the third column, and five 1 × 1 subpixel panels for each row in both the fourth and fifth columns, where the mixed panel pixels and subpanel pixels were all simulated according to the legends in Fig. 7.2. Thus, a total of 100 pure pixels (80 in the first column and 20 in the second column), referred to as endmember pixels, were simulated in the data by the five endmembers A, B, C, K, and M. An area marked “BKG” in the upper right corner of Fig. 1.14a was selected to find its sample mean, that is, the average of all pixel vectors within the “BKG” area, denoted by b and plotted in Fig. 1.14b, to be used to simulate the background (BKG) with a signal-to-noise ratio (SNR) of 20:1, defined in Harsanyi and Chang (1994), for an image scene with a size of 200 × 200 pixels in Fig. 7.2. Once the target pixels and background are simulated, two types of target insertion, TI and TE, can be designed to simulate experiments for various applications.
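A minimal simulation of one such mixed panel pixel might look as follows. The signature, background mean, and noise model are hypothetical stand-ins (the book uses the Harsanyi–Chang SNR definition rather than the plain additive noise assumed here):

```python
import numpy as np

rng = np.random.default_rng(0)
L = 169                                  # number of bands kept in Sect. 7.4
A = np.linspace(0.2, 0.8, L)             # hypothetical endmember signature
b = np.full(L, 0.4)                      # hypothetical background mean "b"

# mixed panel pixel per the Fig. 7.2 legend: 25 % signal + 75 % background
mixed = 0.25 * A + 0.75 * b

# additive Gaussian noise at roughly 20:1 SNR; only a stand-in for the
# Harsanyi-Chang (1994) definition used in the text
noise_std = np.sqrt(np.mean(mixed ** 2)) / 20.0
noisy = mixed + rng.normal(scale=noise_std, size=L)
```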
Figure 7.3 shows 189 targets generated recursively by RHSP-ATGP for TI and
TE target by target, where numerals are used to indicate the orders in which these
targets were generated. Because different criteria were used, (7.3) for ATGP and
(7.10) for RHSP-ATGP, ATGP and RHSP-ATGP did not necessarily produce the
same targets as shown in Fig. 7.3. Nonetheless, sets of their extracted targets were
nearly identical.
Two real hyperspectral image scenes were used for experiments. The first image
data to be studied are the HYperspectral Digital Imagery Collection Experiment
(HYDICE) image scene shown in Fig. 7.4a (and Fig. 1.10a), with a size of 64 × 64
pixel vectors with 15 panels in the scene and the ground truth map in Fig. 7.4b
(Fig. 1.10b). It was acquired by 210 spectral bands with a spectral coverage from
0.4 to 2.5 μm. Low signal/high noise bands, bands 1–3 and 202–210, and water
vapor absorption bands, bands 101–112 and 137–153, were removed. Thus, a total
of 169 bands were used in the experiments. The spatial resolution and spectral
resolution of this image scene are 1.56 m and 10 nm, respectively.

Fig. 7.3 One hundred eighty-nine targets generated by RHSP-ATGP for TI and TE

Fig. 7.4 (a) HYDICE panel scene containing 15 panels; (b) ground truth map of spatial locations of the 15 panels, labeled p11, p12, p13; p211, p212, p22, p23; p221; p311, p312, p32, p33; p411, p412, p42, p43; p511, p52, p53; p521

Another data set is one of the most widely used hyperspectral image scenes
available in the public domain, Cuprite Mining District site in Nevada, shown in
Fig. 7.5a (and Fig. 1.6). It is an image scene with 20 m spatial resolution collected by
224 bands using 10 nm spectral resolution in the range of 0.4–2.5 μm. Since it is well
understood mineralogically and has reliable ground truth, this scene has been studied
extensively. There are five pure pixels in Fig. 7.5a, b, which can be identified as
corresponding to five different minerals, alunite (A), buddingtonite (B), calcite (C),
kaolinite (K), and muscovite (M), labeled A, B, C, K, and M in Fig. 7.5b.
Figure 7.6 shows targets generated recursively by RHSP-ATGP for TI and TE
target by target with numerals used to indicate the orders in which these targets
were generated. In general, both ATGP and RHSP-ATGP should not generate
identical sets of targets because of the different criteria used to generate targets,
(7.3) for ATGP and (7.10) for RHSP-ATGP. Interestingly, as shown in Fig. 7.6,
most of the targets extracted by ATGP and RHSP-ATGP were nearly the same
except in a few cases. In this case, if the targets generated by ATGP and

Fig. 7.5 (a) Spectral band number 170 of Cuprite AVIRIS image scene; (b) spatial positions of
five pure pixels corresponding to minerals alunite (A), buddingtonite (B), calcite (C), kaolinite (K),
and muscovite (M)

Fig. 7.6 Targets generated by ATGP and RHSP-ATGP for HYDICE and Cuprite data sets. (a) HYDICE (169 targets) and (b) Cuprite (189 targets)

RHSP-ATGP are the same, they are highlighted by yellow circles; otherwise,
targets from ATGP and RHSP-ATGP are highlighted by red circles and blue
circles, respectively, where four targets are different for the HYDICE data in
Fig. 7.6a and only one target is different for the Cuprite data in Fig. 7.6b.
Table 7.1 n_RHSP-ATGP estimated by (7.17)

P_F                    | 10^-1 | 10^-2 | 10^-3 | 10^-4 | 10^-5
VD^NP_RHSP-ATGP(P_F)   | 4/21  | 4/19  | 3/19  | 3/19  | 3/19

7.5 Discussions on Stopping Rule for RHSP-ATGP

According to the discussions in Sect. 2.6.1.1, to terminate ATGP, the original ATGP suggested a criterion, called OPCI, specified by (7.3), which calculates the leakage of the initial target t_0 into the growing target subspace U_{p-1} linearly spanned by {t_j}_{j=1}^{p-1}. If OPCI is smaller than a prescribed error threshold ε, ATGP is terminated. However, the major problem with this approach is how to appropriately select the threshold ε, which is clearly determined empirically by various applications. The stopping rule developed in Sect. 7.3 and specified by η_p in (7.10) is modified from (7.3) and is specially designed to calculate the maximal leakage of the target t_p to be generated into the target subspace U_{p-1} linearly spanned by {t_j}_{j=1}^{p-1}. This is different from the initial target t_0 used in (7.3). As shown in (7.4),

η_p = ||P⊥_{U_{p-1}} t_p||^2 = t_p^T P⊥_{U_{p-1}} t_p,   (7.18)

which is exactly the energy of the residual of t_p leaked into <U_{p-1}>⊥. By means of this η_p a recursive version of ATGP, RHSP-ATGP, can be derived and further used to estimate the ATGP-specified VD, n_RHSP-ATGP, to determine how many targets are required for RHSP-ATGP to generate via a binary hypothesis testing problem (7.16), that is, n_RHSP-ATGP = VD^NP_RHSP-ATGP(P_F), specified by (7.17).
Since noise whitening can affect VD estimation (Chang 2003a, b; Chang and Du 2004), we also ran RHSP-ATGP on two different data sets, the original data set and a noise-whitened data set, where the noise covariance matrix was estimated using the technique developed in Roger and Arnold (1996). Table 7.1 tabulates the values of n_RHSP-ATGP estimated by (7.17), where {η_p^{RHSP-ATGP}}_{p=1}^L in (7.12) was used as the signal source for the binary hypothesis testing problem specified by (7.15), with a/b indicating with/without noise whitening.
As we can see, the estimated value of n_RHSP-ATGP without noise whitening was 19, which is very close to the 18 reported in Chang et al. (2010a, 2011b), that is, the number of endmembers. But the VD with noise whitening is three or four, which indicates that many signals were considered noise because of their weak signal energies.
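A noise-whitening step of the kind referred to above can be sketched as follows. The noise covariance is simply assumed known here, whereas the experiments estimate it with the Roger and Arnold (1996) technique; that estimation step is not reproduced:

```python
import numpy as np

def whiten(X, noise_cov):
    """Noise-whitening sketch: apply K_n^{-1/2} so the noise covariance of
    the transformed data becomes the identity."""
    evals, evecs = np.linalg.eigh(noise_cov)
    F = evecs @ np.diag(evals ** -0.5) @ evecs.T   # K_n^{-1/2}
    return F @ X

rng = np.random.default_rng(2)
Kn = np.diag(rng.uniform(0.5, 2.0, size=5))        # hypothetical noise covariance
X = rng.normal(size=(5, 100))                      # hypothetical 5-band data
Xw = whiten(X, Kn)
```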
Another legitimate criterion by which to terminate RHSP-ATGP or ATGP is to
use unmixed error as a stopping rule where the fully constrained least-squares
(FCLS) method developed by Heinz and Chang (2001) was used to perform linear
spectral data unmixing. In this case, FCLS was performed using the targets gener-
ated by RHSP-ATGP for TI and TE in Fig. 7.3 and for the HYDICE and Cuprite
Fig. 7.7 Unmixing error vs. number of targets for TI, TE, HYDICE, and Cuprite data sets. (a) TI (189), (b) TE (189), (c) HYDICE data (115), and (d) Cuprite data (186)

data in Fig. 7.6 for data unmixing. Figure 7.7a–d plots the FCLS-unmixed errors for the TI, TE, HYDICE, and Cuprite data sets versus the number of target signatures, where the number of targets in parentheses is the one that yields the minimal unmixed errors.
For TI and TE this number was 189, which means that the unmixed errors
for both TI and TE are strictly decreasing curves as the number of target
signatures grows. For HYDICE and Cuprite data, this number was 115 and
186, respectively.
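The unmixed-error curves can be emulated with a simple reconstruction-error loop. Plain least squares is used below as a hedged stand-in for FCLS (Heinz and Chang 2001), which additionally enforces the two abundance constraints; data and names are illustrative:

```python
import numpy as np

def unmixing_error(X, M):
    """Total squared reconstruction error when X (L x N) is unmixed with
    the signature matrix M (L x k).  Unconstrained least squares stands in
    for FCLS here."""
    A, *_ = np.linalg.lstsq(M, X, rcond=None)   # unconstrained abundances
    return float(np.sum((X - M @ A) ** 2))

rng = np.random.default_rng(3)
X = rng.normal(size=(10, 200))
# grow the signature set one target at a time, as RHSP-ATGP does
errors = [unmixing_error(X, X[:, :k]) for k in range(1, 6)]
```

Because the signature sets are nested, the error curve is nonincreasing, mirroring the strictly decreasing TI and TE curves described above.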
It is worth noting that the unmixing error plot is different from that in Gao and
Chang (2014), where 117 yielded the minimal unmixed error (Fig. 7.8). This is
because the number of bands went from 1 to 169 here, compared to Gao and Chang
(2014), where it went from 9 to 169.
Fig. 7.8 FCLS unmixed error using 169 RHSP-ATGP-generated targets (total error vs. N, with the minimal unmixed error at N = 117)

The foregoing experiments demonstrated that the number of target signals


actually determined what value of nRHSP-ATGP should be used under the binary
hypothesis testing specified by (7.15), and RHSP-ATGP seemed to be a very
effective means of finding these targets.

7.6 Computational Complexity of RHSP-ATGP

According to (7.6), the computational complexity required by RHSP-ATGP is on the order of O(L^2), while ATGP requires a computational complexity of O(p^3 + Lp), where O(p^3) is required to calculate the inverse of a p × p matrix in P⊥_{U_p} (http://en.wikipedia.org/wiki/Computational_complexity_of_mathematical_operations). Thus, if p is much smaller than L, L will dominate p, in which case there is no visible saving in computing (run) time, as shown in Fig. 7.9, where the y-axis represents the cumulative computing time in seconds required to run ATGP and RHSP-ATGP on the TI, TE, HYDICE, and Cuprite data sets from two up to the number of targets, p, indicated on the x-axis. However, as p increased, the savings in run time by RHSP-ATGP began to be visible.
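The per-step cost model stated above can be made concrete with two rough counting functions. These are illustrative only: constants and lower-order terms are ignored, and they count the projector work rather than actual run time:

```python
def flops_atgp_step(L, p):
    """Rough cost of one ATGP step per the text: invert a p x p matrix
    (~p^3) plus the projection work (~L * p)."""
    return p ** 3 + L * p

def flops_rhsp_step(L, p):
    """Rough cost of one RHSP-ATGP step: a rank-1 projector update,
    independent of p (~L^2)."""
    return L ** 2

L = 169                                   # e.g., the HYDICE band count
# for small p the fixed L^2 update is not yet a saving ...
assert flops_rhsp_step(L, 5) > flops_atgp_step(L, 5)
# ... but as p grows the p^3 inversion overtakes it
assert flops_rhsp_step(L, 150) < flops_atgp_step(L, 150)
```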
Here we would like to point out that computing (run) time should not be
considered the same as or equivalent to computational complexity. Thus, the
computing time in Fig. 7.9 does not really reflect computational complexity. As a
matter of fact, ATGP requires higher complexity than RHSP-ATGP in the sense
that ATGP needs to calculate matrix inverses compared to RHSP-ATGP, which
only involves vector products. To demonstrate this advantage, we ran ATGP and
RHSP-ATGP for any fixed number of targets. Figure 7.10 shows a comparison in

Fig. 7.9 Comparison of cumulative running times required for running ATGP and RHSP-ATGP on TI, TE, HYDICE, and Cuprite data sets as the number of targets increases from one up to the numbers indicated on the x-axis

computing time between ATGP and RHSP-ATGP, where the y-axis represents their
computing time in seconds and the x-axis indicates the Nth individual target. As we
can see, for any given target the computing time required by RHSP-ATGP is nearly
constant regardless of how many signatures are used. By contrast, ATGP gradually
increases its computing time as the Nth signature grows.
According to Figs. 7.9 and 7.10, RHSP-ATGP seemed to gain more benefits on
the HYDICE data than the other data sets. Nevertheless, RHSP-ATGP does have a
significant benefit in the reduction of computational complexity, which is a major
advantage for simple and cheap hardware design. An effort to implement RHSP-
ATGP in a FPGA is currently being investigated, which will certainly improve the
work in Bernabe et al. (2011, 2013).

Fig. 7.10 Comparison of running times required for running ATGP and RHSP-ATGP on TI, TE, HYDICE, and Cuprite data sets for each individual target as the number of targets increases from one up to the numbers indicated on the x-axis

7.7 Conclusions

ATGP has shown great potential in many applications in hyperspectral data exploi-
tation, for example, automatic target recognition, supervised subpixel detection/
classification, and endmember extraction. However, when the number of targets it
produces is very large, which is indeed the case in hyperspectral imaging, ATGP
becomes slow as a result of inverting its growing large target matrices performed by
a sequence of orthogonal subspace projections. To resolve this issue, this chapter
developed RHSP-ATGP to update targets found by ATGP via specially derived
recursive equations. As a result of such recursions, RHSP-ATGP works as if it were
a Kalman filter, described in Chap. 3, to perform real-time processing by only
updating the innovation information. Most importantly, requires no matrix inver-
sion, just matrix multiplications and outer products of vectors. Another significant
advantage of RHSP-ATGP over ATGP is that RHSP-ATGP can be terminated by

an automatic stopping rule in real time rather than a prescribed error threshold, as in
Ren and Chang (2003), via OPCI, which must be determined a priori. This stopping
rule is carried out by an ATGP-specified Neyman–Pearson detector to determine
when RHSP-ATGP should stop finding new targets as the process proceeds in
real time.
Chapter 8
Recursive Hyperspectral Sample Processing
of Orthogonal Subspace Projection

Abstract Orthogonal subspace projection (OSP) developed by Harsanyi and Chang (IEEE Transactions on Geoscience and Remote Sensing 32:779–785, 1994) (see Hyperspectral image: spectral techniques for detection and classification, Kluwer Academic Publishers, New York, 2003; Hyperspectral data processing: algorithm design and analysis, Wiley, Hoboken, 2013) has found its
processing: algorithm design and analysis, Wiley, Hoboken, 2013) has found its
potential in many hyperspectral data exploitation applications. It works in two
stages: an OSP-based projector to annihilate undesired signal sources in the first
stage, to improve background suppression so as to increase target detectability,
followed by a matched filter in the second stage, to extract the desired signal source
for target enhancement. However, for OSP to be effective it assumes that the signal
sources are provided a priori. As a result, OSP can only be used as a supervised
algorithm. In many real-world applications, there are many unknown signal sources
that can be revealed by hyperspectral imaging sensors. It is highly desirable to
extend OSP to an unsupervised version, called unsupervised OSP (UOSP) devel-
oped by Wang et al. (Optical Engineering 41:1546–1557, 2002), where the signal
sources used for OSP can be found in an unsupervised manner. An issue arising in
UOSP is how to determine the number of such found unsupervised signal sources,
which must be known in advance. This chapter further extends UOSP to progressive
OSP (P-OSP) so that P-OSP can not only generate a growing set of new unknown
signal sources one at a time progressively but can also determine the number of
unknown signal sources to be generated while OSP processing is taking place. Since
the unknown signal sources generated by P-OSP remain unchanged after they are
generated, OSP should be able to take advantage of it without reprocessing these
signal sources. This leads to a new development of a recursive version of OSP, called
recursive hyperspectral sample processing of OSP (RHSP-OSP).

8.1 Introduction

© Springer International Publishing Switzerland 2017
C.-I Chang, Real-Time Recursive Hyperspectral Sample and Band Processing,
DOI 10.1007/978-3-319-45171-8_8

Orthogonal subspace projection (OSP) has proven very effective and useful in many applications in hyperspectral data exploitation, for example, subpixel target detection, mixed pixel classification/quantification, and dimensionality reduction. It can be implemented in two stages in a cascade, annihilation of undesired/unwanted signal sources in the first stage, followed by the extraction of desired
signal sources via a matched filter in the second stage. While the matched filter can
be implemented to enhance signal detectability, it is the unwanted signal source
annihilation performed in the first stage that plays a key role in the success of target
extraction. In OSP the first stage process is specially designed to accomplish this
task by using an OSP to eliminate unwanted signatures in such a way that the
background can be suppressed to improve target contrast against the background.
Consequently, the effectiveness of background suppression has a significant impact
on the performance of the follow-up matched filter.
To be more specific, assume that there are p signal signatures, m_1, m_2, ..., m_p, and m_p is the desired signal signature of interest to be extracted without loss of generality. To effectively extract m_p we need to annihilate the unwanted signal signatures m_1, m_2, ..., m_{p-1} prior to the extraction of m_p. To achieve this goal, we first form an unwanted signal source matrix made up of all the unwanted signal signatures, m_1, m_2, ..., m_{p-1}, denoted by U_{p-1} = [m_1 m_2 ... m_{p-1}]. Then we design an unwanted signature annihilator, P⊥_{U_{p-1}}, defined in (4.19) as

P⊥_{U_{p-1}} = I − U_{p-1} U_{p-1}^#,   (8.1)

with U_{p-1}^# being the pseudo-inverse of U_{p-1} given by U_{p-1}^# = (U_{p-1}^T U_{p-1})^{-1} U_{p-1}^T, to annihilate the interfering effect of m_1, m_2, ..., m_{p-1} on the detection of m_p. Once these unwanted signal signatures are annihilated, a matched filter using the matched signal signature m_p is then applied in the following stage to extract m_p.
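The two-stage structure can be sketched directly from (8.1). The dimensions and signatures below are randomly generated placeholders, and the matched filter is reduced to a bare inner product for illustration:

```python
import numpy as np

rng = np.random.default_rng(5)
L = 20
U = rng.normal(size=(L, 3))              # unwanted signatures m_1..m_{p-1}
m_p = rng.normal(size=L)                 # desired signature m_p

# stage 1: the annihilator of (8.1), built here with the pseudo-inverse
P = np.eye(L) - U @ np.linalg.pinv(U)

# stage 2: matched filter on the desired signature
def detect(r):
    return float(m_p @ (P @ r))

# a pixel made purely of unwanted signatures is annihilated ...
r_bkg = U @ rng.normal(size=3)
assert abs(detect(r_bkg)) < 1e-8
# ... while the desired signature itself yields a positive response
assert detect(m_p) > 0.0
```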
Apparently, for OSP to work effectively, finding an appropriate set of unwanted
signal signatures for P⊥ Up1 to perform unwanted signature annihilation is crucial.
Unfortunately, when OSP is implemented, the p signal signatures m1, m2, . . . , mp
are assumed to be known in advance. To address this issue, an unsupervised version
of OSP, called unsupervised OSP (UOSP), was developed by Wang et al. (2002);
it can be considered an earlier version of the automatic target generation process
(ATGP) developed by Ren and Chang (2003) and is also discussed in Sect. 4.4.2.3.
It allows OSP to repeatedly find unwanted signal sources in an unsupervised
manner so as to achieve better background suppression prior to the extraction of a
desired target signal signature. In this case, UOSP can process OSPs with growing
signatures, as opposed to the Harsanyi–Chang OSP, which requires prior knowledge about signatures at a fixed number of signatures, p. Although UOSP can find
unknown signal sources, it also assumes that the number of unknown signal sources
is known and fixed, as OSP does. In many practical applications, this number is
generally not known a priori. To address this issue, this chapter further extends
UOSP to progressive OSP (P-OSP), which can progressively generate a growing set
of unknown signal sources one after another directly from the data to be processed
without prior knowledge of this number. To effectively determine how many
unwanted signal signatures should be used in Up1, a real-time automatic stopping
rule is further developed. In other words, P-OSP can also automatically determine

the number of signal signatures, p, by a Neyman–Pearson detector while signals are


being generated by P-OSP without resorting to virtual dimensionality (VD), developed by Chang (2003a) and Chang and Du (2004).
According to (8.1), P⊥_{U_{p-1}} accounts for most of the computer processing time as a result of inverting the matrix U_{p-1}^T U_{p-1}. This is particularly true when the number of unknown signal signatures, p − 1, becomes large, a case that does indeed occur in hyperspectral data. As a consequence, the computational complexity of P⊥_{U_{p-1}} increases exponentially as p grows. Interestingly, the unknown signal source matrices U_{p-1} used in (8.1) are generated by P-OSP one at a time, where the set of signal sources previously found by P-OSP is always part of the set of subsequently found signal sources, that is, U_1 ⊂ U_2 ⊂ ... ⊂ U_{p-1}. In other words, once signal sources are found by P-OSP, they are fixed and remain unchanged in the signal source matrix U_{p-1} afterwards. With this in mind we should be able to perform P-OSP without reprocessing the signal signatures m_1, m_2, ..., m_{p-2} but rather focus on the newly found signal signature m_{p-1}. This advantage leads to the development of a recursive version of P-OSP, referred to as recursive hyperspectral sample processing of OSP (RHSP-OSP), which can be derived from the concept of Kalman filtering (Chap. 3). Most importantly, RHSP-OSP can significantly reduce computational complexity by eliminating the matrix inversions executed by P⊥_{U_{p-1}} via recursive equations that require only inner and outer products of vectors. More specifically, RHSP-OSP can be implemented not only to find unknown signal sources to grow U_{p-1} one signal source at a time progressively, as with P-OSP, but also to process P⊥_{U_{p-1}} in (8.1) recursively. That is, RHSP-OSP allows users to find new signal sources to grow U_{p-1} in (8.1) in a progressive manner while updating P⊥_{U_{p-1}} using only P⊥_{U_{p-2}} and m_{p-1} in a recursive manner at the same time.

8.2 Difference Between OSP and Linear Spectral Mixture


Analysis

Although OSP can be used to perform linear spectral mixture analysis (LSMA) via
a signal detection model, it is very important to realize its salient differences from
LSMA. The first prominent difference is that OSP is developed to detect the
abundance fraction of a particular specified desired signal signature while making
use of prior knowledge of undesired signal signatures to annihilate their interfering
effects on the detection of the desired signal signature. Thus, from an unmixing
point of view, OSP uses its detected abundance fractions as unmixed results with no
abundance constraints. In order for OSP to perform LSMA to unmix data accu-
rately, it must satisfy two abundance constraints, the abundance sum-to-one con-
straint (ASC) and abundance nonnegativity constraint (ANC). However, since OSP
only detects the abundance fraction of each individual signature present in a data

sample vector independently, it does not satisfy ASC. In addition, its detected
abundance fraction may also be negative, so it does not satisfy ANC either.
Accordingly, OSP should not be considered an abundance-constrained LSMA technique. Nevertheless, OSP can be considered an abundance-unconstrained LSMA if an error-correction term, (d^T P⊥_U d)^{-1} in (4.21), is included in its detector to become a least-squares OSP (LSOSP), specified by (4.32), as an abundance estimator.
As also noted in Chang and Heinz (2000), a fully abundance-constrained LSMA
may not be as effective as OSP in terms of signal detection, where the nonnegativity
constrained least-squares (NCLS) method performed better than the fully
constrained least-squares (FCLS) method in detecting signals because the NCLS
method does not satisfy ASC. In other words, to comply with ASC, no abundance
fraction of any signature can exceed one, and the abundance fractions of all
signatures must be constrained to a range between 0 and 1. Consequently, the
more signatures, the more abundance fractions that need to be constrained to the
range from 0 to 1, thus the fewer their abundance fractions. As a result, the detected
abundance fractions of signatures are diminishing. By contrast, OSP is a detection
technique specified by (4.20) which uses the signal-to-noise ratio (SNR) as an
optimal criterion, as opposed to the least squares error (LSE) used by LSMA to
perform abundance fraction estimation via (4.21) as an estimation technique.
Accordingly, OSP can enhance signal strength as much as it can for target detectability without complying with ASC, regardless of how many signatures are considered. Therefore, OSP and LSMA are completely different theories, but their connection is bridged by (4.20) and (4.21).
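The point that OSP-style estimation enforces neither constraint is easy to demonstrate with a small least-squares example; the signature matrix and abundances below are entirely hypothetical:

```python
import numpy as np

rng = np.random.default_rng(6)
L, p = 15, 4
M = rng.uniform(0.1, 1.0, size=(L, p))       # hypothetical signature matrix
a_true = np.array([0.7, 0.3, 0.0, 0.0])      # abundances satisfying ASC/ANC
r = M @ a_true + rng.normal(scale=0.05, size=L)

# OSP-style estimation is plain least squares: nothing in it enforces the
# sum-to-one (ASC) or nonnegativity (ANC) constraints on the estimate
a_hat, *_ = np.linalg.lstsq(M, r, rcond=None)
asc_gap = abs(float(a_hat.sum()) - 1.0)      # generally nonzero
```

Imposing ANC alone (as NCLS does) or both constraints (as FCLS does) requires constrained solvers and, as the text notes, trades detection strength for accurate abundance estimation.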

8.3 Recursive Hyperspectral Sample Processing of OSP

The idea of RHSP-OSP arises from the need to update the unwanted signature
matrix U for signature annihilation to improve OSP performance as more unwanted
signatures are found and added to U. As noted in (8.1), the impact caused by the
effect of U significantly impairs the detectability of the desired signature d. When
the environment is unknown, the signature matrix M should be adapted and updated
as data processing progresses. This is specifically critical in real-time processing. In
this case, P⊥U in (8.1) must be repeatedly implemented as U varies. From an
algorithmic implementation point of view, this is not effective for target detection
or for real-time implementation. Thus, it is highly desirable to have OSP
implemented adaptively in an unsupervised manner so that only the new signature
m added to the signature matrix U needs to be processed without having to
reimplement U over and over again. This is because the new augmented undesired
signature matrix [U m] includes only the new signature m, while other signatures
in U remain unchanged. This section presents a new look at OSP, RHSP-OSP, to address this issue.

8.3.1 Derivations of Recursive Update Equations

For simplicity, we assume that m_p is the pth undesired signal signature to be added to U_{p-1} to form a new matrix, U_p, and d is the desired signal signature not in U_p. Assume that U_p is a matrix of size L × p and can be written as U_p = [U_{p-1} m_p], with U_{p-1} and m_p being an L × (p − 1) matrix and an L-dimensional vector, respectively. Then, according to a matrix inverse identity derived using Eq. (12.25) in Chang (2003a, b), the inverse of U_p^T U_p can be expressed as

(U_p^T U_p)^{-1} = [ U_{p-1}^T U_{p-1}   U_{p-1}^T m_p ;  m_p^T U_{p-1}   m_p^T m_p ]^{-1}
                = [ (U_{p-1}^T U_{p-1})^{-1} + β_{p,p-1} U_{p-1}^# m_p m_p^T (U_{p-1}^#)^T   −β_{p,p-1} U_{p-1}^# m_p ;
                    −β_{p,p-1} m_p^T (U_{p-1}^#)^T   β_{p,p-1} ],   (8.2)

where U_{p-1}^# = (U_{p-1}^T U_{p-1})^{-1} U_{p-1}^T and

β_{p,p-1} = {m_p^T [I − U_{p-1} (U_{p-1}^T U_{p-1})^{-1} U_{p-1}^T] m_p}^{-1} = {m_p^T P⊥_{U_{p-1}} m_p}^{-1},   (8.3)

U_p U_p^# = U_p (U_p^T U_p)^{-1} U_p^T = [U_{p-1} m_p] (U_p^T U_p)^{-1} [U_{p-1} m_p]^T
          = U_{p-1} (U_{p-1}^T U_{p-1})^{-1} U_{p-1}^T
            + β_{p,p-1} U_{p-1} U_{p-1}^# m_p m_p^T (U_{p-1}^#)^T U_{p-1}^T
            − β_{p,p-1} m_p m_p^T (U_{p-1}^#)^T U_{p-1}^T
            − β_{p,p-1} U_{p-1} U_{p-1}^# m_p m_p^T
            + β_{p,p-1} m_p m_p^T.   (8.4)

If we define $\tilde{m}_p^{U_{p-1}} = U_{p-1} U_{p-1}^{\#} m_p$, then $P_{U_p}^{\perp} = \mathbf{I} - U_p U_p^{\#} = \mathbf{I} - U_p\left(U_p^T U_p\right)^{-1} U_p^T$ becomes

$$
\begin{aligned}
P_{U_p}^{\perp} &= \mathbf{I} - U_{p-1}\left(U_{p-1}^T U_{p-1}\right)^{-1} U_{p-1}^T
- \beta_{p|p-1}\left[ \tilde{m}_p^{U_{p-1}} \left(\tilde{m}_p^{U_{p-1}}\right)^T - m_p \left(\tilde{m}_p^{U_{p-1}}\right)^T - \tilde{m}_p^{U_{p-1}} m_p^T + m_p m_p^T \right] \\
&= P_{U_{p-1}}^{\perp} - \beta_{p|p-1}\left(m_p - \tilde{m}_p^{U_{p-1}}\right)\left(m_p - \tilde{m}_p^{U_{p-1}}\right)^T,
\end{aligned}
\tag{8.5}
$$

and therefore

$$
\begin{aligned}
\hat{\alpha}_p^{\mathrm{OSP}}(r) &= d^T P_{U_p}^{\perp} r \\
&= d^T \left[ P_{U_{p-1}}^{\perp} - \beta_{p|p-1}\left(m_p - \tilde{m}_p^{U_{p-1}}\right)\left(m_p - \tilde{m}_p^{U_{p-1}}\right)^T \right] r \\
&= d^T P_{U_{p-1}}^{\perp} r - \beta_{p|p-1}\, d^T\left(m_p - \tilde{m}_p^{U_{p-1}}\right)\left(m_p - \tilde{m}_p^{U_{p-1}}\right)^T r.
\end{aligned}
\tag{8.6}
$$
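Before moving on, the update formulas (8.5) and (8.6) are easy to check numerically. The following is a quick sanity-check sketch (the variable names and toy sizes are mine, not the book's; `P_perp` is just the textbook definition of the orthogonal complement projector):

```python
import numpy as np

rng = np.random.default_rng(0)
L, p = 20, 4                           # L bands, p signatures (toy sizes)
U_prev = rng.normal(size=(L, p - 1))   # U_{p-1}
m_p = rng.normal(size=L)               # new undesired signature m_p
d = rng.normal(size=L)                 # desired signature
r = rng.normal(size=L)                 # a data sample vector

def P_perp(U):
    """Direct computation: P_U^perp = I - U (U^T U)^{-1} U^T."""
    return np.eye(U.shape[0]) - U @ np.linalg.inv(U.T @ U) @ U.T

# Recursive update (8.5): obtain P_{U_p}^perp from P_{U_{p-1}}^perp
Pp_prev = P_perp(U_prev)
beta = 1.0 / (m_p @ Pp_prev @ m_p)                 # beta_{p|p-1}, (8.3)
m_tilde = U_prev @ np.linalg.pinv(U_prev) @ m_p    # leakage of m_p into <U_{p-1}>
v = m_p - m_tilde                                  # equals P_{U_{p-1}}^perp m_p
Pp_rec = Pp_prev - beta * np.outer(v, v)

U_p = np.column_stack([U_prev, m_p])
assert np.allclose(Pp_rec, P_perp(U_p))            # (8.5) holds

# (8.6): the recursively updated detector value matches the direct one
alpha_direct = d @ P_perp(U_p) @ r
alpha_rec = d @ Pp_prev @ r - beta * (d @ v) * (v @ r)
assert np.allclose(alpha_direct, alpha_rec)
```

Note that no matrix inverse is needed in the update itself: beta is a scalar reciprocal, and the correction term is a single outer product.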
In case another new set of undesired signal signatures, $U_q$, instead of a single signal signature, $m_p$, is found and added to the signal signature matrix $U_p$, the resulting unwanted signal signature matrix becomes a combined signal signature matrix, $U_{p+q}$. This situation often occurs when OSP is implemented as RHSP-OSP, where more undesired signal signatures, instead of a single signal signature, are found and need to be included for undesired signal signature annihilation. For example, assume that two sets of p and q undesired signal signatures are found by different algorithms, in which case we can operate RHSP-OSP on $U_p$ and $U_q$ independently and individually, and then fuse their corresponding results without reprocessing all p + q signal signatures in $U_{p+q}$. In this case, (8.4) can be extended as follows:
" # " T #
UpT   Up Up UpT Uq
T
Upþq Upþq ¼ Up Uq ¼ ; ð8:7Þ
T
Uq UqT Up UqT Uq
" T #!1
 1 Up Up UpT Uq
T
Upþq Upþq ¼
UqT Up UqT Uq
2 1  T 3
ð8:8Þ
U T
6 p p U þ U #
U
p q ΓU T
q Up
#
U#p Uq Γ 7
¼6
4  T
7;
5
Γ T UqT U#p Γ

 1 1 n h i o1


where Γ ¼ UqT I  Up Up Up
T
UpT Uq ¼ UqT P⊥
Up Uq ¼ ΓT ;

$$
\begin{aligned}
U_{p+q} U_{p+q}^{\#} &= \left[U_p\ U_q\right]
\begin{bmatrix}
\left(U_p^T U_p\right)^{-1} + U_p^{\#} U_q \Gamma U_q^T \left(U_p^{\#}\right)^T & -U_p^{\#} U_q \Gamma \\
-\Gamma^T U_q^T \left(U_p^{\#}\right)^T & \Gamma
\end{bmatrix}
\begin{bmatrix} U_p^T \\ U_q^T \end{bmatrix} \\
&= U_p \left(U_p^T U_p\right)^{-1} U_p^T
+ U_p U_p^{\#} U_q \Gamma U_q^T \left(U_p^{\#}\right)^T U_p^T
- U_p U_p^{\#} U_q \Gamma U_q^T \\
&\quad - U_q \Gamma^T U_q^T \left(U_p^{\#}\right)^T U_p^T
+ U_q \Gamma U_q^T.
\end{aligned}
\tag{8.9}
$$

Now we can further define $P_{U_{p+q}} = U_{p+q} U_{p+q}^{\#}$ and $P_{U_{p+q}}^{\perp} = \mathbf{I} - U_{p+q} U_{p+q}^{\#}$. Then

$$
\begin{aligned}
d^T P_{U_{p+q}}^{\perp} r &= d^T \left(\mathbf{I} - P_{U_{p+q}}\right) r = d^T r - d^T P_{U_{p+q}} r \\
&= d^T r - d^T U_{p+q} U_{p+q}^{\#} r \\
&= d^T P_{U_p}^{\perp} r - d^T U_p U_p^{\#} U_q \Gamma U_q^T \left(U_p^{\#}\right)^T U_p^T r + d^T U_p U_p^{\#} U_q \Gamma U_q^T r \\
&\quad + d^T U_q \Gamma^T U_q^T \left(U_p^{\#}\right)^T U_p^T r - d^T U_q \Gamma U_q^T r \\
&= d^T P_{U_p}^{\perp} r - d^T P_{U_p} U_q \Gamma U_q^T P_{U_p} r + d^T P_{U_p} U_q \Gamma U_q^T r
+ d^T U_q \Gamma^T U_q^T P_{U_p} r - d^T U_q \Gamma U_q^T r \\
&= d^T P_{U_p}^{\perp} r + d^T P_{U_p} U_q \Gamma U_q^T \left(\mathbf{I} - P_{U_p}\right) r - d^T U_q \Gamma U_q^T \left(\mathbf{I} - P_{U_p}\right) r \\
&= d^T P_{U_p}^{\perp} r + d^T P_{U_p} U_q \Gamma U_q^T P_{U_p}^{\perp} r - d^T U_q \Gamma U_q^T P_{U_p}^{\perp} r \\
&= d^T P_{U_p}^{\perp} r - d^T \left(\mathbf{I} - P_{U_p}\right) U_q \Gamma U_q^T P_{U_p}^{\perp} r \\
&= d^T P_{U_p}^{\perp} r - d^T P_{U_p}^{\perp} U_q \Gamma U_q^T P_{U_p}^{\perp} r.
\end{aligned}
\tag{8.10}
$$

As a special case, let $U_q = m_{p+1}$, and let d be the desired signal signature not in $U_q$. Then $\Gamma$ becomes $\beta_{p+1|p}$, (8.9) is reduced to (8.4), and (8.10) is reduced to (8.6):

$$
\hat{\alpha}_{p+1}^{\mathrm{OSP}}(r) = d^T P_{U_{p+1}}^{\perp} r
= d^T P_{U_p}^{\perp} r - \beta_{p+1|p}\, d^T P_{U_p}^{\perp} m_{p+1} m_{p+1}^T P_{U_p}^{\perp} r,
\tag{8.11}
$$

where $P_{U_p}^{\perp} = \mathbf{I} - U_p U_p^{\#} = \mathbf{I} - U_p\left(U_p^T U_p\right)^{-1} U_p^T$. It is worth noting that $\beta_{p+1|p}$ in (8.3) can be further reexpressed as $\beta_{p+1|p} = \left\{ m_{p+1}^T P_{U_p}^{\perp} m_{p+1} \right\}^{-1} = \left\{ \left[P_{U_p}^{\perp} m_{p+1}\right]^T \left[P_{U_p}^{\perp} m_{p+1}\right] \right\}^{-1}$ because $P_{U_p}^{\perp}$ is symmetric and idempotent.
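The fusion identity (8.10) can likewise be checked numerically. The sketch below (toy sizes and helper names are my own) confirms that the fused detector output matches a direct computation with the combined matrix $U_{p+q}$, without ever inverting $U_{p+q}^T U_{p+q}$:

```python
import numpy as np

rng = np.random.default_rng(1)
L, p, q = 25, 3, 2
U_p = rng.normal(size=(L, p))   # p undesired signatures found by one algorithm
U_q = rng.normal(size=(L, q))   # q more found by another algorithm
d = rng.normal(size=L)
r = rng.normal(size=L)

def P_perp(U):
    """P_U^perp = I - U (U^T U)^{-1} U^T."""
    return np.eye(U.shape[0]) - U @ np.linalg.inv(U.T @ U) @ U.T

Pp = P_perp(U_p)
Gamma = np.linalg.inv(U_q.T @ Pp @ U_q)   # Gamma = {U_q^T P_{U_p}^perp U_q}^{-1}

# (8.10): fuse the two results; only the q x q matrix Gamma is inverted
lhs = d @ P_perp(np.column_stack([U_p, U_q])) @ r
rhs = d @ Pp @ r - d @ Pp @ U_q @ Gamma @ U_q.T @ Pp @ r
assert np.allclose(lhs, rhs)
```

The only inversion performed on the right-hand side is of the small $q \times q$ matrix $\Gamma^{-1}$, which is the practical point of the fusion formula.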

According to (8.5), $P_{U_{p+1}}^{\perp}$ can be updated via $P_{U_p}^{\perp}$ by recursively including a correction term, $\beta_{p+1|p}\left(m_{p+1} - \tilde{m}_{p+1}^{U_p}\right)\left(m_{p+1} - \tilde{m}_{p+1}^{U_p}\right)^T$.

Several comments on (8.11) are worthwhile.

1. $\tilde{m}_{p+1}^{U_p} = U_p U_p^{\#} m_{p+1}$ is an innovation vector representing the leakage of the signal signature $m_{p+1}$ into $\langle U_p \rangle$. In this case, $\left(m_{p+1} - \tilde{m}_{p+1}^{U_p}\right)\left(m_{p+1} - \tilde{m}_{p+1}^{U_p}\right)^T$ in (8.5), which is an outer product of the vector $m_{p+1} - \tilde{m}_{p+1}^{U_p}$, can be considered an innovation information matrix. Thus, if there is no leakage into $\langle U_p \rangle^{\perp}$, then $U_p^{\#}$ is the inverse of $U_p$ and $U_p U_p^{\#} = \mathbf{I}$ is an identity matrix, which results in $\tilde{m}_{p+1}^{U_p} = m_{p+1}$. As a result, $P_{U_{p+1}}^{\perp} = P_{U_p}^{\perp}$ according to (8.5).
2. Since $P_{U_p}^{\perp}$ projects the leakage of $\langle U_p \rangle$ into its complement subspace $\langle U_p \rangle^{\perp}$, $P_{U_p}^{\perp} m_{p+1}$ is the projection of $m_{p+1}$ onto $\langle U_p \rangle^{\perp}$. Thus, $\beta_{p+1|p}$ accounts for the innovation information provided by $m_{p+1}$ that cannot be provided by $U_p$.
3. Finally, if we examine the term $m_{p+1} - \tilde{m}_{p+1}^{U_p}$ closely, it is interesting to discover that it can be reexpressed as

$$
m_{p+1} - \tilde{m}_{p+1}^{U_p} = \left(\mathbf{I}_{L \times L} - U_p U_p^{\#}\right) m_{p+1} = P_{U_p}^{\perp} m_{p+1}.
\tag{8.12}
$$

Using (8.12), $\hat{\alpha}_{p+1}^{\mathrm{OSP}}(r)$ in (8.6) can actually be calculated by

$$
\begin{aligned}
\hat{\alpha}_{p+1}^{\mathrm{OSP}}(r) &= d^T P_{U_p}^{\perp} r - \beta_{p+1|p}\, d^T P_{U_p}^{\perp} m_{p+1} \left(P_{U_p}^{\perp} m_{p+1}\right)^T r \\
&= \hat{\alpha}_p^{\mathrm{OSP}}(r) - \beta_{p+1|p}\, d^T P_{U_p}^{\perp} m_{p+1} m_{p+1}^T P_{U_p}^{\perp} r,
\end{aligned}
\tag{8.13}
$$

where $P_{U_p}^{\perp}$ is idempotent and $\left(P_{U_p}^{\perp}\right)^T = P_{U_p}^{\perp}$.

Recursive Update Algorithm (RUA) for $P_{U_j}^{\perp}$

1. Initial condition:
   p = total number of signatures used to form a linear mixing model. Technically speaking, its value can be determined at one's discretion.
   Start with the first signal signature, denoted by $m_1$. Set $U_1 = [m_1]$, and calculate $P_{U_1}^{\perp}$. Let j = 2.
2. At the jth recursion use (8.2) and (8.5) to calculate $\left(U_j^T U_j\right)^{-1}$ and $P_{U_j}^{\perp}$.
3. Stopping rule:
   If a stopping rule is satisfied (to be discussed in Sect. 8.4.2), RUA is terminated. Otherwise, let j ← j + 1, and go to step 2.

The most important advantage of the aforementioned RUA is that it requires no matrix inversion calculations. The only inverse that needs to be calculated for RUA is the inverse of a scalar, the inner product of the first signature, $m_1$, involved in the initial condition, that is, $\left(m_1^T m_1\right)^{-1}$ with $U_1 = [m_1]$. After the first recursion, $P_{U_{p+1}}^{\perp}$ is updated from $P_{U_p}^{\perp}$ together with the outer product $\left(m_{p+1} - \tilde{m}_{p+1}^{U_p}\right)\left(m_{p+1} - \tilde{m}_{p+1}^{U_p}\right)^T$ via (8.5) and (8.6), both of which involve only inner products of two L-dimensional vectors with no need for matrix inversion. Another significant benefit of RUA is that its computational complexity remains nearly constant as the number of signatures, p, grows, compared to OSP, whose computational complexity grows rapidly as the value of p increases. This is a great benefit when OSP is extended to RHSP-OSP, as described in Sect. 8.4.
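A minimal sketch of RUA in this spirit is given below (the function and variable names are mine, not the book's). The only divisions are by scalars, and each recursion costs a handful of inner and outer products:

```python
import numpy as np

def rua(signatures):
    """Recursive Update Algorithm sketch: build P_{U_j}^perp for
    U_j = [m_1 ... m_j] with no matrix inversion (scalar reciprocals only)."""
    m1 = signatures[0]
    L = m1.shape[0]
    # Initial condition: the lone inverse is the scalar (m_1^T m_1)^{-1}
    P = np.eye(L) - np.outer(m1, m1) / (m1 @ m1)
    for m in signatures[1:]:
        v = P @ m                     # m_j - m~_j = P_{U_{j-1}}^perp m_j, by (8.12)
        beta = 1.0 / (v @ v)          # (8.3), using idempotence/symmetry of P
        P = P - beta * np.outer(v, v) # recursion (8.5)
    return P

# Agreement with a direct pseudoinverse-based projector
rng = np.random.default_rng(2)
U = rng.normal(size=(30, 5))
P_direct = np.eye(30) - U @ np.linalg.pinv(U)
assert np.allclose(rua(list(U.T)), P_direct)
```

Because the update touches only one new vector per recursion, the per-step cost stays flat as p grows, which is the point made in the paragraph above.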

8.3.2 Computational Complexity

Table 8.1 details the computational complexity of (8.6) and (8.13) used to calculate $\hat{\alpha}_p^{\mathrm{OSP}}(r)$, where the third column summarizes the number of mathematical operations used to compute the specific formulas listed in the second column; an inner product and an outer product of two L-dimensional vectors require L multiplications and $L^2$ multiplications, respectively. Note that there is no need to calculate matrix inverses, but only the outer product of the L-dimensional vector $m_p - \tilde{m}_p^{U_{p-1}}$, namely $\left(m_p - \tilde{m}_p^{U_{p-1}}\right)\left(m_p - \tilde{m}_p^{U_{p-1}}\right)^T$.

8.4 Signature Generation by RHSP-OSP

One immediate advantage of using the recursive equations for $P_{U_j}^{\perp}$ is to extend OSP sample by sample in a progressive manner, referred to as RHSP-OSP, which allows RHSP-OSP to find signal signatures as target signatures one at a time recursively. Its algorithmic structure is very similar to that of recursive hyperspectral sample processing of ATGP (RHSP-ATGP) discussed in Chap. 7. It can be done by a succession of OSPs via $P_{U_p}^{\perp}$ by growing p target signal sources. In general, as p grows, $P_{U_p}^{\perp}$ must be reimplemented with the new signature matrix $U_p$ repeatedly, over and over again. This is a very time-consuming process as the value of p becomes very large, which is indeed the case in hyperspectral imagery. The computational complexity of $P_{U_p}^{\perp}$ implemented by RHSP-OSP is determined only by $\left(m_p - \tilde{m}_p^{U_{p-1}}\right)\left(m_p - \tilde{m}_p^{U_{p-1}}\right)^T$, where $\tilde{m}_p^{U_{p-1}} = U_{p-1} U_{p-1}^{\#} m_p$ does not require computing $U_{p-1} U_{p-1}^{\#}$, which was already computed in the previous recursion.
Table 8.1 Computational complexity of (8.6) and (8.13) to calculate $\hat{\alpha}_p^{\mathrm{OSP}}(r)$

Initial conditions ($m_1$): compute $\left(m_1^T m_1\right)^{-1}$ and $m_1 m_1^T$. Cost: one inner product (L multiplications) plus one outer product of an L-dimensional vector $m_1$ ($L^2$ multiplications).

$P_{U_1}^{\perp}$: $P_{U_1}^{\perp} = \mathbf{I} - m_1\left(m_1^T m_1\right)^{-1} m_1^T$. Cost: L subtractions and L multiplications.

$\hat{\alpha}_1^{\mathrm{OSP}}(r)$: $\hat{\alpha}_1^{\mathrm{OSP}}(r) = d^T P_{U_1}^{\perp} r$. Cost: (L + 1) inner products of two L-dimensional vectors ($L + L^2$ multiplications).

Available information $U_{p-1}$, $U_{p-1}^{\#}$, $m_p$, $P_{U_{p-1}}^{\perp}$:
$\tilde{m}_p^{U_{p-1}} = U_{p-1} U_{p-1}^{\#} m_p$. Cost: $L^2$ subtractions and L inner products of two L-dimensional vectors ($L^2$ multiplications).
$m_p - \tilde{m}_p^{U_{p-1}} = P_{U_{p-1}}^{\perp} m_p$. Cost: L inner products of L-dimensional vectors ($L^2$ multiplications).

Innovation information $\beta_{p|p-1}$ by (8.3): $\beta_{p|p-1} = \left\{\left[P_{U_{p-1}}^{\perp} m_p\right]^T \left[P_{U_{p-1}}^{\perp} m_p\right]\right\}^{-1}$. Cost: L inner products of two L-dimensional vectors and one inner product ($L^2 + L$ multiplications).

Update $P_{U_p}^{\perp}$ by (8.5): $P_{U_p}^{\perp} = P_{U_{p-1}}^{\perp} - \beta_{p|p-1}\left(m_p - \tilde{m}_p^{U_{p-1}}\right)\left(m_p - \tilde{m}_p^{U_{p-1}}\right)^T$. Cost: one outer product of an L-dimensional vector $m_p - \tilde{m}_p^{U_{p-1}}$ ($L^2$ multiplications) and $L^2$ subtractions plus $L^2$ multiplications.

Update $\hat{\alpha}_p^{\mathrm{OSP}}(r)$ by (8.6): $\hat{\alpha}_p^{\mathrm{OSP}}(r) = \hat{\alpha}_{p-1}^{\mathrm{OSP}}(r) - \beta_{p|p-1}\, d^T P_{U_{p-1}}^{\perp} m_p m_p^T P_{U_{p-1}}^{\perp} r$. Cost: one outer product of $m_p - \tilde{m}_p^{U_{p-1}} = P_{U_{p-1}}^{\perp} m_p$ ($L^2$ multiplications), L inner products of two L-dimensional vectors and one inner product ($L^2 + L$ multiplications), plus one subtraction and one multiplication.

However, RHSP-OSP also comes with two challenging issues: (1) how many target
signal sources does RHSP-OSP have to generate? (2) how does one find these signal
signatures? Each of these issues is discussed in what follows.

8.4.1 Finding Unsupervised Target Signal Sources

First, let us look at the innovation information matrix specified by

$$
\left(m_{p+1} - \tilde{m}_{p+1}^{U_p}\right)\left(m_{p+1} - \tilde{m}_{p+1}^{U_p}\right)^T
\tag{8.14}
$$

in the derivation of RHSP-OSP in (8.11). Now, for general purposes, we use target signal sources $t_1, t_2, \ldots, t_p$ to replace the unwanted or undesired signal signatures $m_1, m_2, \ldots, m_p$ used in the previous derivations. To find the next target signal source $t_{p+1}$, the sources $t_1, t_2, \ldots, t_p$ must be excluded from consideration. In this case, $U_p = \left[t_1\ t_2\ \cdots\ t_p\right]$ forms an unwanted target signal matrix.

According to (8.11), the most unwanted target signal source that will be added to create a new (p + 1)-unwanted signature matrix $U_{p+1}$ is the one that yields the maximum innovation information. Since the matrix in (8.14) is the outer product of the vector $t_{p+1} - \tilde{t}_{p+1}^{U_p}$, its rank is one, with only one nonzero eigenvalue given by its inner product,

$$
\left(t_{p+1} - \tilde{t}_{p+1}^{U_p}\right)^T \left(t_{p+1} - \tilde{t}_{p+1}^{U_p}\right).
\tag{8.15}
$$

In this case, (8.15) represents the innovation information provided by the correlation between a new target signal source $t_{p+1}$ and the previously generated target signal sources $t_1, t_2, \ldots, t_p$. Using (8.12), (8.15) can be further written as

$$
\left(t_{p+1} - \tilde{t}_{p+1}^{U_p}\right)^T \left(t_{p+1} - \tilde{t}_{p+1}^{U_p}\right)
= \left(P_{U_p}^{\perp} t_{p+1}\right)^T \left(P_{U_p}^{\perp} t_{p+1}\right)
= t_{p+1}^T P_{U_p}^{\perp} t_{p+1},
\tag{8.16}
$$

which suggests that the most unwanted new target signal source should be the one that maximizes

$$
\rho\left(r, U_p\right) = \left(r - \tilde{r}^{U_p}\right)^T \left(r - \tilde{r}^{U_p}\right)
= \left\| r - \tilde{r}^{U_p} \right\|^2
= \left(P_{U_p}^{\perp} r\right)^T \left(P_{U_p}^{\perp} r\right)
= r^T P_{U_p}^{\perp} r
\tag{8.17}
$$

over all data sample vectors r, with $\tilde{r}^{U_p} = U_p U_p^{\#} r$.



In light of (8.16) and (8.17), we can design a criterion to find a potential target signal source, $t_{p+1}$:

$$
t_{p+1} = \arg\max_r \left\{ \rho\left(r, U_p\right) \right\} = \arg\max_r \left\{ r^T P_{U_p}^{\perp} r \right\}.
\tag{8.18}
$$

Interestingly, the quantity to be maximized in (8.18) turns out to be the inverse of $\beta_{p+1|p}$ in (8.3), with $m_{p+1}$ replaced by $t_{p+1}$. This indicates that (8.18) in fact provides a reasonable means of finding the unwanted target signal sources required for UOSP to produce.
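Criterion (8.18), combined with the recursive projector update (8.5), yields an ATGP-style generator of unwanted target signal sources. The following sketch (function and variable names are my own; the data layout is one L-dimensional sample per column) illustrates the idea:

```python
import numpy as np

def find_targets(X, num_targets):
    """Generate target signal sources t_1, ..., t_p via (8.18): each new
    target maximizes r^T P_U^perp r over all samples, and the projector is
    updated recursively by (8.5). X is L x N, one sample vector per column."""
    L, N = X.shape
    P = np.eye(L)        # P_{U_0}^perp = I, so t_1 = argmax_r r^T r
    targets = []
    for _ in range(num_targets):
        scores = np.einsum('ln,ln->n', X, P @ X)   # r_i^T P^perp r_i for every i
        t = X[:, int(np.argmax(scores))]
        targets.append(t)
        v = P @ t                                  # equals t - t~, by (8.12)
        P = P - np.outer(v, v) / (v @ v)           # recursive update (8.5)
    return np.column_stack(targets)

rng = np.random.default_rng(3)
X = rng.normal(size=(12, 200))
T = find_targets(X, 4)
```

After the run, the projector built from all found targets annihilates each of them, which is exactly the "exclusion from consideration" described above.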

8.4.2 Determining the Number of Unwanted Targets

Once a sequence of unwanted target signal sources $\{t_p\}$ is generated by (8.18), the next challenging issue for RHSP-OSP is how to terminate RHSP-OSP. Using (8.16) and (8.18) we can define

$$
\eta_p = \left(t_p^{\mathrm{RHSP\text{-}OSP}}\right)^T P_{U_{p-1}}^{\perp}\, t_p^{\mathrm{RHSP\text{-}OSP}}
= \left\| P_{U_{p-1}}^{\perp} t_p^{\mathrm{RHSP\text{-}OSP}} \right\|^2.
\tag{8.19}
$$

This $\eta_p$ turns out to be exactly the same signal source used in the binary hypothesis testing problem considered in the maximal orthogonal subspace projection (MOSP) of Chang et al. (2011c), where the energy of the pth target signal source $t_p$, $\left\|t_p\right\|^2$, was used to determine VD, or in the maximal orthogonal complement algorithm (MOCA) of Kuybeda et al. (2007), which was used to determine the rank of rare signal dimensionality.
Following Sect. 4.5.1 of Chap. 4, we assume that for each $1 \le p \le L$, $U_0 = \mathbf{I}$ is the identity matrix and $\langle U_{p-1} \rangle$ is the target space linearly spanned by the previously found p − 1 target signal sources $\left\{t_j^{\mathrm{RHSP\text{-}OSP}}\right\}_{j=1}^{p-1}$ obtained by RHSP-OSP. Then, for $1 \le p \le L$, we can find $t_p^{\mathrm{RHSP\text{-}OSP}}$ via (8.18) and $\eta_p$ via (8.19), where $\eta_p$ in (8.19) is the maximum residual of the pth target signal source $t_p^{\mathrm{RHSP\text{-}OSP}}$ leaked from $\langle U_{p-1} \rangle$ into $\langle U_{p-1} \rangle^{\perp}$, the complement space orthogonal to the space $\langle U_{p-1} \rangle$. It should be noted that $\left\|P_{U_0}^{\perp} r\right\| = \left\|r\right\|$ in (8.19) when p = 1. Since the targets $\left\{t_p^{\mathrm{RHSP\text{-}OSP}}\right\}_{p=1}^{L}$ found by RHSP-OSP may be highly correlated, $\eta_p$ in (8.19) is used instead because $\eta_p$ represents the maximum residual of $t_p^{\mathrm{RHSP\text{-}OSP}}$ leaked into $\langle U_{p-1} \rangle^{\perp}$. It is this sequence, $\left\{\eta_p\right\}_{p=1}^{L}$, that will be used as the signal source in a binary composite hypothesis testing problem, formulated as follows, to determine by a detector whether or not the pth potential target candidate, $t_p^{\mathrm{RHSP\text{-}OSP}}$, is a true target signal source.

$$
H_0: \eta_p \sim p\left(\eta_p \mid H_0\right) = p_0\left(\eta_p\right)
\quad \text{versus} \quad
H_1: \eta_p \sim p\left(\eta_p \mid H_1\right) = p_1\left(\eta_p\right),
\qquad \text{for } p = 1, 2, \ldots, L,
\tag{8.20}
$$

where the alternative hypothesis $H_1$ and the null hypothesis $H_0$ represent the two cases of $t_p^{\mathrm{RHSP\text{-}OSP}}$ being an endmember under $H_1$ and not being an endmember under $H_0$, respectively, in the sense that $H_0$ represents the maximum residual resulting from the background signal sources, while $H_1$ represents the maximum residual leaked from the target signal sources. Note that when p = 0, $\eta_0$ is undefined in (8.20). To make (8.20) work, we need to find probability distributions under both hypotheses. The assumption made on (8.20) is that if a signal source is not an endmember under $H_0$, then it should be considered part of the background, which can be characterized by a Gaussian distribution. On the other hand, if a signal source is indeed an endmember, it should be uniformly distributed over $[0, \eta_{p-1}]$. By virtue of extreme value theory (Leadbetter 1987), $\eta_p$ can be modeled as a Gumbel distribution; that is, $F_{\nu_p}\left(\eta_p\right)$ is the cumulative distribution function (cdf) of $\nu_p$ given by

$$
F_{\nu_p}(x) \approx \exp\left\{ -e^{-\left[ \frac{(2\log N)^{1/2}\left(x - \sigma^2 (L-p)\right)}{\sigma^2 \sqrt{2(L-p)}} - (2\log N)^{1/2} + \frac{1}{2}(2\log N)^{-1/2}\left(\log\log N + \log 4\pi\right) \right]} \right\}.
\tag{8.21}
$$

Since there is no prior knowledge available about the distribution of the signal sources, assuming that $\eta_p$ under $H_1$ is uniformly distributed seems most reasonable according to the maximum entropy principle in Shannon's information theory. Under these two assumptions, we obtain

$$
p\left(H_0, \eta_p\right) = p_{\nu_p}\left(\eta_p\right)\left(\eta_p / \eta_{p-1}\right),
\tag{8.22}
$$

$$
p\left(H_1, \eta_p\right) = F_{\nu_p}\left(\eta_p\right)\left(1 / \eta_{p-1}\right).
\tag{8.23}
$$

Since $p\left(\eta_p\right) = p\left(H_0, \eta_p\right) + p\left(H_1, \eta_p\right) = \left(1/\eta_{p-1}\right)\left[\eta_p\, p_{\nu_p}\left(\eta_p\right) + F_{\nu_p}\left(\eta_p\right)\right]$, we can obtain an a posteriori probability distribution of $p\left(H_0 \mid \eta_p\right)$ given by

$$
p\left(H_0 \mid \eta_p\right) = \frac{\eta_p\, p_{\nu_p}\left(\eta_p\right)}{\eta_p\, p_{\nu_p}\left(\eta_p\right) + F_{\nu_p}\left(\eta_p\right)}
\tag{8.24}
$$

and an a posteriori probability distribution of $p\left(H_1 \mid \eta_p\right)$ given by

$$
p\left(H_1 \mid \eta_p\right) = \frac{F_{\nu_p}\left(\eta_p\right)}{\eta_p\, p_{\nu_p}\left(\eta_p\right) + F_{\nu_p}\left(\eta_p\right)}.
\tag{8.25}
$$

By virtue of (8.24) and (8.25), a Neyman–Pearson detector, denoted by $\delta_{\mathrm{NP}}^{\mathrm{RHSP\text{-}OSP}}\left(\eta_p\right)$, for the binary composite hypothesis testing problem specified by (8.20) can be obtained by maximizing the detection power $P_D$ while the false alarm probability $P_F$ is fixed at a specific given value, α, which determines the threshold value $\tau_p$ in the following randomized decision rule:

$$
\delta_{\mathrm{NP}}^{\mathrm{RHSP\text{-}OSP}}\left(\eta_p\right) =
\begin{cases}
1, & \text{if } \Lambda\left(\eta_p\right) > \tau_p, \\
1 \text{ with probability } \kappa, & \text{if } \Lambda\left(\eta_p\right) = \tau_p, \\
0, & \text{if } \Lambda\left(\eta_p\right) < \tau_p,
\end{cases}
\tag{8.26}
$$

where the likelihood ratio test $\Lambda\left(\eta_p\right)$ is given by $\Lambda\left(\eta_p\right) = p_1\left(\eta_p\right)/p_0\left(\eta_p\right)$, with $p_0\left(\eta_p\right)$ and $p_1\left(\eta_p\right)$ given by (8.24) and (8.25). Thus, a case of $\eta_p = t_p^T P_{U_{p-1}}^{\perp} t_p > \tau_p$ in (8.20) indicates that $\delta_{\mathrm{NP}}^{\mathrm{RHSP\text{-}OSP}}\left(\eta_p\right)$ fails the test, in which case $t_p^{\mathrm{RHSP\text{-}OSP}}$ is assumed to be a desired target signal source. Note that the test in (8.26) must be performed for each of the L potential target signal candidates. Therefore, the threshold $\tau_p$ varies with the value of p. Using (8.26) we can derive the OSP-specified VD, $n_{\mathrm{RHSP\text{-}OSP}}$, by calculating

$$
n_{\mathrm{RHSP\text{-}OSP}} = \mathrm{VD}_{\mathrm{NP}}^{\mathrm{RHSP\text{-}OSP}}\left(P_F\right)
= \arg\left\{ \max_p \left[ \delta_{\mathrm{NP}}^{\mathrm{RHSP\text{-}OSP}}\left(\eta_p^{\mathrm{RHSP\text{-}OSP}}\right) = 1 \right] \right\},
\tag{8.27}
$$

where $P_F$ is a predetermined false alarm probability, and the candidate passes at level p, $\delta_{\mathrm{NP}}^{\mathrm{RHSP\text{-}OSP}}\left(\eta_p\right) = 1$, only if the decision in (8.26) is 1, with the decision set to 0 whenever $\delta_{\mathrm{NP}}^{\mathrm{RHSP\text{-}OSP}}\left(\eta_p\right) < 1$.

Note that the sequence $\left\{\eta_p^{\mathrm{RHSP\text{-}OSP}}\right\}$ is monotonically decreasing. Thus, the sequence $\left\{\delta_{\mathrm{NP}}^{\mathrm{RHSP\text{-}OSP}}\left(\eta_p^{\mathrm{RHSP\text{-}OSP}}\right)\right\}$ starts with a failure of the test (8.26), that is, $\delta_{\mathrm{NP}}^{\mathrm{RHSP\text{-}OSP}}\left(\eta_p^{\mathrm{RHSP\text{-}OSP}}\right) = 1$, and continues as the value of p is increased until p reaches the value where the test is passed, in which case $\delta_{\mathrm{NP}}^{\mathrm{RHSP\text{-}OSP}}\left(\eta_p^{\mathrm{RHSP\text{-}OSP}}\right) < 1$. The largest value of p that makes $\delta_{\mathrm{NP}}^{\mathrm{RHSP\text{-}OSP}}\left(\eta_p^{\mathrm{RHSP\text{-}OSP}}\right) = 1$ is the value of $\mathrm{VD}_{\mathrm{NP}}^{\mathrm{RHSP\text{-}OSP}}\left(P_F\right)$ according to (8.27). This unique property allows $\mathrm{VD}_{\mathrm{NP}}^{\mathrm{RHSP\text{-}OSP}}\left(P_F\right)$ to be implemented in real time as the process continues with increasing p.
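A sketch of how the test statistic $\Lambda(\eta_p)$ can be evaluated under this model is given below. It is an illustrative implementation of (8.21), (8.24), and (8.25) only, with the noise variance σ² and sample count N assumed known; determining $\tau_p$ from a prescribed $P_F$ is omitted, and the function names are my own:

```python
import numpy as np

def gumbel_cdf_pdf(x, p, L, N, sigma2):
    """Gumbel approximation (8.21) of the null distribution of eta_p and its
    density (the derivative of the cdf). sigma2 and N are assumed known."""
    c = np.sqrt(2.0 * np.log(N))
    scale = sigma2 * np.sqrt(2.0 * (L - p))
    a = c * (x - sigma2 * (L - p)) / scale - c \
        + 0.5 * (np.log(np.log(N)) + np.log(4.0 * np.pi)) / c
    F = np.exp(-np.exp(-a))             # cdf F_{nu_p}(x)
    f = F * np.exp(-a) * (c / scale)    # pdf p_{nu_p}(x) = dF/dx
    return F, f

def likelihood_ratio(eta, p, L, N, sigma2):
    """Lambda(eta_p) = p(H1|eta_p) / p(H0|eta_p) from (8.24) and (8.25);
    the common normalizer cancels, leaving F_nu(eta) / (eta * p_nu(eta))."""
    F, f = gumbel_cdf_pdf(eta, p, L, N, sigma2)
    return F / (eta * f)
```

In a full implementation, the monotonically decreasing η sequence would be fed through this ratio, and the largest p whose decision is 1 would be reported as the VD estimate of (8.27).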

8.4.3 RHSP-OSP Using an Automatic Stopping Rule

Using (8.27) we can now design an algorithm to perform RHSP-OSP with an


automatic stopping rule. Since RHSP-OSP can be implemented in a manner very
similar to how RHSP-ATGP is implemented in Chap. 7, a similar algorithmic
implementation for RHSP-OSP can also be derived as follows.

Recursive Hyperspectral Sample Processing of OSP

• The outer loop is a successive process indexed by the parameter p to find a growing set of signatures, $\left\{m_p\right\}_{p=1}^{L}$.

1. Initial condition:
   Find an initial target pixel vector $m_1 = \arg\left\{\max_r r^T r\right\}$. Set $U_1 = [m_1]$, and calculate $P_{U_1}^{\perp}$.

• Inner loop
  (a) Progressive process indexed by the parameter i (running through all N data sample vectors $\left\{r_i\right\}_{i=1}^{N}$):
      For p ≥ 2, find $m_p$ by maximizing $r_i^T P_{U_{p-1}}^{\perp} r_i$ via (8.18) over all data sample vectors $r_i$; the maximized value is the reciprocal of $\beta_{p|p-1} = \left\{m_p^T P_{U_{p-1}}^{\perp} m_p\right\}^{-1} = \left\{\left[P_{U_{p-1}}^{\perp} m_p\right]^T \left[P_{U_{p-1}}^{\perp} m_p\right]\right\}^{-1}$.
  (b) Recursive process by updating $\hat{\alpha}_p^{\mathrm{OSP}}(r)$:
      Use (8.6) to update $\hat{\alpha}_p^{\mathrm{OSP}}(r)$ via the previously calculated $\hat{\alpha}_{p-1}^{\mathrm{OSP}}(r)$, $\beta_{p|p-1}$, and $\tilde{m}_p^{U_{p-1}}$.
• End (Inner loop)

2. Stopping rule:
   If a stopping rule is satisfied by (8.26) and (8.27), RHSP-OSP is terminated. Otherwise, let p ← p + 1, and go to step 1(a).

• End (Outer loop)
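The two-loop organization above can be sketched as follows (a simplified illustration, not the book's code; the stopping rule is abstracted into a caller-supplied `stop` predicate standing in for the Neyman–Pearson test of (8.26) and (8.27)):

```python
import numpy as np

def rhsp_osp(X, d, stop):
    """RHSP-OSP sketch: grow unwanted signatures (outer loop), scan all
    samples to pick each new one (inner loop), and update the OSP detection
    map recursively via (8.13). X: L x N data matrix (one sample per column);
    d: desired signature; stop(p, eta_p) -> bool is a stand-in stopping rule."""
    L, N = X.shape
    P = np.eye(L)                 # P_{U_0}^perp = I
    alpha = d @ X                 # alpha_0(r_i) = d^T r_i
    p = 0
    while True:
        PX = P @ X                                # P_{U_p}^perp r_i for all i
        scores = np.einsum('ln,ln->n', X, PX)     # r_i^T P_{U_p}^perp r_i, cf. (8.18)
        i_star = int(np.argmax(scores))
        eta = float(scores[i_star])               # cf. (8.19)
        p += 1
        if stop(p, eta):
            return alpha
        v = PX[:, i_star]                         # P_{U_p}^perp m_{p+1}, cf. (8.12)
        beta = 1.0 / eta                          # (8.3)
        alpha = alpha - beta * (d @ v) * (v @ X)  # (8.13), all samples at once
        P = P - beta * np.outer(v, v)             # (8.5)

rng = np.random.default_rng(4)
X = rng.normal(size=(10, 50))
d = rng.normal(size=10)
alpha3 = rhsp_osp(X, d, lambda p, eta: p > 3)     # stop after three signatures
```

Because the whole detection map is updated at once at each recursion, the progressive maps discussed later in this chapter fall out of the same loop for free.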
Figure 8.1 presents a flowchart illustrating the implementation of RHSP-OSP, which consists of three processes: a progressive process used to process data sample vectors one by one to calculate the maximal value of $r_i^T P_{U_{p-1}}^{\perp} r_i$ to find $m_p$; a recursive process used to update $U_p^{\#}$ and $P_{U_p}^{\perp}$; and a successive process to grow the target signal sources, $\left\{m_p\right\}_{p=1}^{L}$, where the stopping rule is the OSP-specified VD, $n_{\mathrm{RHSP\text{-}OSP}}$, estimated by the Neyman–Pearson detector in (8.26) and $\mathrm{VD}_{\mathrm{NP}}^{\mathrm{RHSP\text{-}OSP}}\left(P_F\right)$ in (8.27).
Fig. 8.1 Flowchart of RHSP-OSP

Since the previously described RHSP-OSP can be implemented in two loops, the inner loop for data sample vectors in a progressive manner and the outer loop for signal signatures in a successive manner, we can also interchange these two loops by implementing the signal signatures in the inner loop and the data sample vectors in the outer loop. The resulting RHSP-OSP is called causal RHSP-OSP (CRHSP-OSP) and is described in what follows.

Causal RHSP-OSP

1. Initial condition:
   Assume that there are N data sample vectors, $\left\{r_i\right\}_{i=1}^{N}$.
2. Outer loop: iterate over $\left\{r_i\right\}_{i=1}^{N}$ by index k.
3. Inner loop: iterate over $\left\{t_j^{\mathrm{R\text{-}OSP}}\left(r_i\right)\right\}_{j=1}^{L}$ by index j.
   Use RHSP-OSP to iterate the signal sources $t_j^{\mathrm{R\text{-}OSP}}\left(r_i\right)$ from j = 1 to $p_k$, where $p_k$ is determined by $n_{\mathrm{RHSP\text{-}OSP}}$ via $\mathrm{VD}_{\mathrm{NP}}^{\mathrm{RHSP\text{-}OSP}}\left(P_F\right)$ based on $\left\{r_i\right\}_{i=1}^{k}$ in (8.27).
The foregoing CRHSP-OSP is of only theoretical interest and is not practical, for two reasons. One is that the signal signature $t_j$ found by CRHSP-OSP in the inner loop depends on the data sample vector $r_i$. The other is that $p_k$ is determined by a growing data sample pool, $\left\{r_i\right\}_{i=1}^{k}$. Thus, the RHSP-OSP described in Fig. 8.1 is the more practical version. Nevertheless, RHSP-OSP and CRHSP-OSP have their own merits. For example, CRHSP-OSP iterates signal signatures in an inner loop that can produce the appropriate number of signal signatures required for each data sample to achieve its best possible performance in terms of detecting target signal signatures of interest. It also utilizes an outer loop to find the number of signal signatures for the growing data sample pools $\left\{r_i\right\}_{i=1}^{k}$ to determine the best possible number of signal signatures. An advantage of CRHSP-OSP is that it provides users with a means of seeing different levels of detection of a particular target signature of interest as each new data sample vector is fed in. On the other hand, RHSP-OSP reverses the two loops implemented in CRHSP-OSP by iterating all the data sample vectors in an inner loop while iterating signal signatures in an outer loop. In this case, RHSP-OSP allows users to see progressive abundance fraction detection maps by adding one signal signature at a time. This is particularly useful because users can see the detection performance progressively improve as more signal signatures are added to the unwanted/undesired signal source matrix U.
A final note on RHSP-OSP is worthwhile. The quantity $\eta_p$ specified by (8.19) is actually the same criterion specified by

$$
t_p = \arg\max_r \left\{ \left(P_{U_{p-1}}^{\perp} r\right)^T \left(P_{U_{p-1}}^{\perp} r\right) \right\},
\tag{8.28}
$$

with $U_{p-1} = \left[t_1\ t_2\ \cdots\ t_{p-1}\right]$, called the orthogonal projection correlation index (OPCI) by the automatic target detection and classification algorithm (ATDCA) that was developed by Ren and Chang (2003) to find a new target signal source $t_p$, where ATDCA was subsequently renamed ATGP. With this interpretation, RHSP-OSP becomes the RHSP-ATGP recently developed in Chap. 7. It should also be noted that the OPCI used by ATGP in Ren and Chang (2003) is exactly the same as the $\eta_p$ derived in (8.19). Figure 8.2 shows a diagram depicting the relationships among all the variations derived from OSP.

Fig. 8.2 Diagram of various versions of OSP (OSP; unsupervised OSP (UOSP) with ATGP and RHSP-ATGP; supervised OSP with RHSP-OSP and CRHSP-OSP)

Fig. 8.3 (a) HYDICE panel scene containing 15 panels; (b) ground truth map of spatial locations of the 15 panels (p11, p12, p13; p211, p212, p221, p22, p23; p311, p312, p32, p33; p411, p412, p42, p43; p511, p521, p52, p53)

8.5 HYDICE Experiments

The image data to be studied are from the HYperspectral Digital Imagery Collection Experiment (HYDICE) image scene shown in Fig. 8.3a (and Fig. 1.10a), which has a size of 64 × 64 pixel vectors with 15 panels in the scene, along with the ground truth map in Fig. 8.3b (Fig. 1.10b). The scene was acquired by 210 spectral bands with a spectral coverage from 0.4 to 2.5 μm. Low-signal/high-noise bands, bands 1–3 and 202–210, and water vapor absorption bands, bands 101–112 and 137–153, were removed. Thus, a total of 169 bands were used in the experiments. The spatial resolution and spectral resolution of this image scene are 1.56 m and 10 nm, respectively.
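The band-selection bookkeeping described above can be reproduced directly (1-based band indices, as in the text):

```python
# HYDICE band selection: start with 210 bands, drop low-signal/high-noise
# bands and water vapor absorption bands, keeping 169 for the experiments.
removed = set(range(1, 4)) | set(range(202, 211))        # bands 1-3, 202-210
removed |= set(range(101, 113)) | set(range(137, 154))   # bands 101-112, 137-153
kept = [b for b in range(1, 211) if b not in removed]
assert len(kept) == 169
```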

Table 8.2 $n_{\mathrm{RHSP\text{-}OSP}}$ estimated by $\mathrm{VD}_{\mathrm{NP}}^{\mathrm{RHSP\text{-}OSP}}\left(P_F\right)$

Signal sources used in (8.20) | PF = 10⁻¹ | PF = 10⁻² | PF = 10⁻³ | PF = 10⁻⁴ | PF = 10⁻⁵
$\eta_p = t_p^T P_{U_{p-1}}^{\perp} t_p$ in (8.19) | 21 | 19 | 19 | 19 | 19
$\eta_p = \left\|t_p\right\|^2$ (Chang et al. 2014a) | 39 | 37 | 35 | 35 | 34

Table 8.2 tabulates the values of $\mathrm{VD}_{\mathrm{NP}}^{\mathrm{RHSP\text{-}OSP}}\left(P_F\right)$, denoted by $n_{\mathrm{RHSP\text{-}OSP}}$, for various false alarm probabilities, $P_F$. Also included in Table 8.2 are the VD results in Chang et al. (2014a) for comparison.

Interestingly, since $\eta_p$ in (8.19) represents the maximal residual leakage of the pth signal $t_p^{\mathrm{RHSP\text{-}OSP}}$ into the subspace $\langle U_{p-1} \rangle^{\perp}$ and $\eta_p$ in Chang et al. (2014a) is indeed the energy of the pth signal $t_p^{\mathrm{R\text{-}OSP}}$, they produced two ranges of values, from 19 to 21 and from 34 to 37 (Table 8.2), respectively, both of which have been used in the literature for various applications. For example, it was shown in Heinz and Chang (2001) and Chang (2003a, b) that 34 was an appropriate value to estimate the number of signatures with target pixels found using the unsupervised FCLS (UFCLS) method, $\left\{t_p^{\mathrm{UFCLS}}\right\}_{p=1}^{34}$, for spectral unmixing. On the other hand, it was also shown in Chang et al. (2010a, 2011b) that a value of 18 was an appropriate estimate of the number of signatures to use in unsupervised linear spectral mixture analysis. These two sets of values were actually the two VD values estimated by $\eta_p = t_p^T P_{U_{p-1}}^{\perp} t_p$ and $\eta_p = \left\|t_p\right\|^2$, respectively, in Table 8.2, where VD = 19 was obtained across the board with $P_F \le 10^{-2}$ for $\eta_p = t_p^T P_{U_{p-1}}^{\perp} t_p$, and VD = 34–37 with $P_F \le 10^{-2}$ for $\eta_p = \left\|t_p\right\|^2$. Using these two sets of values as a guide, Fig. 8.4a shows a set of 37 targets, $\left\{t_j^{\mathrm{RHSP\text{-}OSP}}\right\}_{j=1}^{37}$, with numerals used to indicate the order in which they were extracted. Figure 8.4b also shows that the last target panel pixel to complete five-panel-pixel extraction was actually the 18th RHSP-OSP-generated target pixel, which was p411. However, it should be noted that the panel pixel found in row 2 is the yellow panel pixel p212, not the R panel pixel p221. In other words, the panel pixel p212 found by OP-based unsupervised algorithms such as ATGP or RHSP-OSP is not the same panel pixel p221 found by endmember-finding algorithms such as N-FINDR. This was also noted in Experiment 6.8.1.1 in Sect. 6.8.1 of Chap. 6 in Chang (2016). Nevertheless, only 18 target pixels are required for RHSP-OSP to find the five panel pixels corresponding to the five panel signatures, a fact that was confirmed in Chang et al. (2010a, 2011b), where the number of signatures was estimated to be twice the value of VD estimated by the Harsanyi–Farrand–Chang (HFC) method. This value of 18 is indeed very close to $n_{\mathrm{RHSP\text{-}OSP}} = 19$ with $P_F \le 10^{-2}$. In fact, the 19th target pixel found by RHSP-OSP in Fig. 8.4b was located in the background along the tree line. As a result, there is nearly no impact on the unmixed results. In addition, the values produced by means of $\eta_p = \left\|t_p\right\|^2$, to be used as signal sources for (8.20), were in a range of 34–37 with $P_F \le 10^{-2}$, which are also indeed very close to the value of 34 shown in Heinz and Chang (2001). Thus, in this case, values of 34–37 were used for the experiments.

Fig. 8.4 Target pixels found by RHSP-OSP: (a) 37 RHSP-OSP-generated target pixels; (b) 19 RHSP-OSP-generated target pixels

Since p13, p23, p33, p43, and p53 in the third column of Fig. 8.3b are subtarget pixels, detecting these targets in an unsupervised fashion without prior knowledge generally presents a challenge because their presence cannot be inspected visually. Thus, our experiments were conducted for the detection of these five subpixel panels. To see the causal detection of RHSP-OSP in finding the abundance fractions of the subtarget pixels p13, p23, p33, p43, and p53 as the number of unwanted signal sources, p, increases, Fig. 8.5a plots the abundance fractions of these five subtarget panels detected by CRHSP-OSP using the targets $\left\{t_j^{\mathrm{RHSP\text{-}OSP}}\right\}_{j=1}^{p}$, with p ranging from 1 to 169, found by RHSP-OSP; in addition, Fig. 8.5b is a zoomed-in plot of Fig. 8.5a from p = 1 to 37 for better visual assessment.

Fig. 8.5 CRHSP-OSP-detected abundance fractions of five subtarget panel pixels (p13, p23, p33, p43, p53) with the number of signal sources growing from p = 1 to 169 and p = 1 to 37: (a) detected abundance fractions of CRHSP-OSP with p = 1–169; (b) zoomed-in plots of the detected abundance fractions with p = 1–37

According to Fig. 8.5a, b, the detected abundance fractions of all the subpixel panels showed three clear cuts, with sudden drops at p = 6, 9, and 18, where dashed lines are used to highlight p = 6, 9, 18, and 19. Since the HYDICE data have five distinct panel signatures plus a background signature, the first cut at p = 6 is a prior estimate using the ground truth. Without using prior knowledge, the VD of this scene was estimated as nine, calculated by the HFC method (Harsanyi et al. 1994a, b; Chang 2003a, b),

which corresponds to the second cut. The value of $n_{\mathrm{RHSP\text{-}OSP}} = 19$ was estimated by the orthogonal complement maximal residual via (8.19), which is very close to the third cut at 18. This shows that $n_{\mathrm{RHSP\text{-}OSP}} = 19$ is indeed a very good estimate of VD, since after the third cut all the detected abundance fractions of the five subpixel panels were saturated and flat. Figure 8.6a, b shows the plots of the abundance fractions of all five subpixel panels estimated by CRHSP-LSOSP with p = 1–169 and zoomed-in plots with p = 1–37. Note that, unlike the very high RHSP-OSP-detected abundance fractions produced in Fig. 8.5a, b to enhance signal detectability, Fig. 8.6a, b tried to estimate the abundance fractions, in which case the abundance fractions were relatively small, within a range between 0.1 and 0.5. The implemented CRHSP-LSOSP is simply the same as RHSP-OSP, except that OSP, $\hat{\alpha}_p^{\mathrm{OSP}}(r)$, in (4.20), used for abundance detection, was replaced by LSOSP, $\hat{\alpha}_p^{\mathrm{LSOSP}}(r)$, in (4.21), used for abundance estimation, which includes $\left(d^T P_U^{\perp} d\right)^{-1}$ to account for the estimation error.

Like Fig. 8.5a, b, Fig. 8.6a, b also shows clear cuts at p = 6, 9, and 18. But unlike Fig. 8.5a, b, where the abundance fractions approached zero, the estimated abundance fractions in Fig. 8.6a, b fluctuated after p = 18. It has been shown that for supervised LSMA for this HYDICE scene, p = 9 was estimated by VD in Chang (2003a, b, 2013) and Chang and Du (2004), p = 18 was estimated for unsupervised LSMA (Chang et al. 2010a, 2011b), p = 19 was estimated by $n_{\mathrm{RHSP\text{-}OSP}}$, p = 29 was estimated using an approach called RHSP-specified VD in Chang et al. (2014b), and p = 34 was estimated for the UFCLS method in Heinz and Chang (2001). In this case, Fig. 8.6b indicates the changes in the estimated abundance fractions, where dashed lines are used to highlight p = 9, 18, 19, 29, and 34–37.
The results shown in Figs. 8.5 and 8.6 are the RHSP-OSP-detected and RHSP-LSOSP-estimated abundance fractions of five single subtarget panel pixels produced by CRHSP-OSP and CRHSP-LSOSP. The figures do not show the RHSP-OSP detection maps of all the panel pixels and their estimated abundance fractions. It is interesting to see the progressive RHSP-OSP detection maps produced by RHSP-OSP as p varies from 1 to 169, along with their estimated abundance fractions, for comparison. In this way, RHSP-OSP must generate 169 detection maps showing progressive changes in the detection of abundance fractions. As noted in Figs. 8.5 and 8.6, the only cases of interest are p = 9, 18, 19, 29, 34–37, and 169. Therefore, we only need to show the progressive detection maps and their estimated abundance fractions for these cases. The recursive equations derived in the appendix allow us to accomplish this task without stepping through every value of p one by one. Figures 8.7 and 8.8 show the RHSP-OSP-detected and RHSP-LSOSP-estimated abundance fractional maps of the 19 R panel pixel targets produced by RHSP-OSP, with p = 9 and q ranging from q = 9 (p + q = 18) in Chang et al. (2010a, 2011b), to q = 10 (p + q = 19 = $n_{\mathrm{RHSP\text{-}OSP}}$), q = 20 (p + q = 29) in Chen et al. (2014a, b, c), and q = 25 (p + q = 34) in Heinz and Chang (2001), and the full set of 169 signatures. Since the RHSP-OSP-detected abundances were so high, their detection maps are plotted in decibels (dB), defined by 20 log₁₀ x. Even so, the detected abundance amounts of the detection maps were still very bright compared to those in Fig. 8.8, which are abundance
8.5 HYDICE Experiments 249
Fig. 8.6 Abundance fractions of five subtarget pixels (p13, p23, p33, p43, p53) estimated by
CRHSP-LSOSP from number of signal sources growing from 1 to 37. (a) Estimated abundance
fractions by CRHSP-LSOSP with p = 1–169; (b) zoomed-in plot of estimated abundance fractions
of CRHSP-LSOSP with p = 1–37
250 8 Recursive Hyperspectral Sample Processing of Orthogonal Subspace Projection

Fig. 8.7 Progressive RHSP-OSP-detected fractional maps using targets produced by RHSP-OSP
as p grows from 9, 18, 19, 29, 34, 35, 36, 37, and 169; (a) 9 signatures; (b) 18 signatures; (c)
19 signatures; (d) 29 signatures of RHSP-OSP; (e) 34 signatures; (f) 35 signatures; (g) 36 signatures;
(h) 37 signatures; (i) 169 signatures
fractions estimated by RHSP-LSOSP for true abundances within a range from 0.1
to 0.5, in which case the background was largely suppressed to yield better visual
inspection. Nevertheless, this by no means implies that RHSP-OSP-detected maps
were worse than RHSP-LSOSP-estimated maps. If we examine carefully the results
in Fig. 8.7, we can also find that the five rows of panels, including the five subpixel
panels in the third column, were actually detected effectively.
One note is worthwhile here. According to the ground truth, the panels in rows
4 and 5 were made of the same materials with slightly different colors of paint. As a
consequence, a technique to detect panels in row 4 would also be able to detect
panels in row 5 and vice versa. This phenomenon was confirmed by Figs. 8.7 and
8.8. Similarly, it was also applied to the panels in rows 2 and 3, which were also
made of the same materials with slightly different paint colors.
Interestingly, the good results in Figs. 8.7 and 8.8 were the same as those
produced by p = 18 (Chang et al. 2010a, 2011b), nRHSP-OSP = 19 and 29 (Chang
et al. 2014b), and then performance seemed to deteriorate after p = 34 (Heinz and
Chang 2001), with the worst result given by p = 169. This made sense because the
more signal sources generated by RHSP-OSP, the more annihilation occurs, owing
to the fact that OSP uses P⊥U in (8.1) to annihilate unwanted signal sources, which
may include signal sources similar to the signal sources one wishes to detect, for
example, the panels in rows 2 and 3 and those in rows 4 and 5. As a result, the
detectability of the desired signal sources was largely degraded and reduced by
annihilation of similar
signal sources in U. But this did not occur for data unmixing via fully abundance-
constrained methods, which are designed to estimate rather than detect abundance
fractions.

Fig. 8.8 Progressive RHSP-LSOSP-detected abundance fractional maps using targets produced
by RHSP-LSOSP as p grows from 9, 18, 19, 29, 34, 35, 36, 37, and 169; (a) 9 signatures; (b)
18 signatures; (c) 19 signatures; (d) 29 signatures of RHSP-LSOSP; (e) 34 signatures; (f) 35 signatures;
(g) 36 signatures; (h) 37 signatures; (i) 169 signatures

To verify this assertion, Fig. 8.9 shows the abundance fractions of the
same five subtarget panels in the third column produced in Fig. 8.7, where the FCLS
method in Heinz and Chang (2001) was used to perform linear spectral unmixing,
instead of using OSP to perform signal detection, as is done in Fig. 8.7 for panel
pixels. According to Fig. 8.9d, h, the FCLS-unmixed results were indeed very good
and very close after p + q = 18, 19. Comparing these results to those in Fig. 8.8
produced by RHSP-LSOSP demonstrated that RHSP-LSOSP-generated
target signal sources could also be used as unsupervised signal sources to perform
unsupervised LSMA for data unmixing, as an alternative to unsupervised least
squares–based fully constrained LSMA (Chang et al. 2010a) and component
analysis–based fully constrained LSMA (Chang et al. 2011b).
As a final comment on the implementation of RHSP-OSP, it should be kept in
mind that the plots in Figs. 8.5 and 8.6 were obtained by CRHSP-OSP and CRHSP-
LSOSP for a single panel pixel, with p as a parameter varying from 1 to 169, while
Figs. 8.7 and 8.8 show progressive RHSP-OSP-detection maps and RHSP-LSOSP-
estimation maps, respectively, with p a fixed value for all data sample vectors,
{r_i}_{i=1}^N.
Fig. 8.9 Progressive FCLS-unmixed abundance fractional maps using targets produced by RHSP-
OSP as p grows from 9, 18, 19, 29, 34, 35, 36, 37, and 169; (a) 9 signatures; (b) 18 signatures; (c)
19 signatures; (d) 29 signatures; (e) 34 signatures; (f) 35 signatures; (g) 36 signatures; (h)
37 signatures; (i) 169 signatures
8.6 Computational Complexity Analysis

Figure 8.10a, b plots the computing times required by OSP and RHSP-OSP as
p grows from 1 to 35 and from 1 to 169, respectively, where the x-axis indicates that
p is the cumulative number of signatures used by OSP to account for all signatures
to be used to form the signature matrix. Figure 8.10a is simply a zoomed-in version
of the plots in Fig. 8.10b, in particular for p from 1 to 35, for better visual assessment.
As seen in Fig. 8.10a, RHSP-OSP does not offer significant computing-time savings
over OSP when the cumulative number of signatures, p, is small. However,
as the number of signatures, p, grows large, the savings begin to emerge, as shown
in Fig. 8.10b, where RHSP-OSP required nearly constant computing time to process
each signal source, compared to OSP, whose computing time grew rapidly, on the
order of p³, with the value of p. This is not a surprise from a signal processing point
of view, such as Kalman filtering (Poor 1994), because only the new incoming
signature, which is the data sample currently being visited, is required for RHSP-OSP to be
Fig. 8.10 Comparative performance of computing times required by OSP and RHSP-OSP. (a)
Computing time for p from 1 to 35; (b) computing time for p from 1 to 169

processed, regardless of how many signatures have already been processed, compared
to OSP, which requires reprocessing all signatures whenever a new signature is
included in a new signature matrix, in which case computing time increases
rapidly as the number of signatures grows.
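The computing-time behavior described above can be illustrated with a small numerical sketch (not the author's code; the signature matrix here is random and purely illustrative). It contrasts recomputing the complement projector from scratch, which requires a matrix inversion at every step, with the standard rank-one deflation update that starts from the initial condition I − m1m1ᵀ/(m1ᵀm1) listed in Table 8.3 and needs no inversion at all:

```python
import numpy as np

rng = np.random.default_rng(0)
L, p = 169, 30                          # bands, number of signatures (illustrative)
M = rng.standard_normal((L, p))         # hypothetical signature matrix

# Batch OSP: reinvert U^T U every time a signature is added -> O(p^3) per step.
def batch_projector(U):
    return np.eye(U.shape[0]) - U @ np.linalg.inv(U.T @ U) @ U.T

# Recursive update: one rank-one deflation per new signature, no matrix inversion.
P = np.eye(L) - np.outer(M[:, 0], M[:, 0]) / (M[:, 0] @ M[:, 0])  # initial condition
for j in range(1, p):
    v = P @ M[:, j]                     # project new signature onto current complement
    P = P - np.outer(v, v) / (M[:, j] @ v)

# Both constructions yield the same orthogonal complement projector.
assert np.allclose(P, batch_projector(M), atol=1e-8)
```

Because each recursive step touches only the newly added signature, its cost per signature stays essentially flat, which is the behavior seen in Fig. 8.10b.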
Table 8.3 Comparison of computational complexity required to process a single signature

                                           LS specified           OSP specified                 RHSP-OSP specified
                                           by (2)                 by (4)                        by (13)
Initial condition                          N/A                    N/A                           I − m1m1^T/(m1^T m1)
Input                                      M = [m1, m2, ..., mp]  M = [m1, m2, ..., mp]         mp
Number of inner products of                p(p + 1)               p^2 − 2p + 2L + 3             5L + 2
  L-dimensional vectors per signature
Number of outer products of                pL                     pL + L^2 − L                  0
  p-dimensional vectors per signature
Number of outer products of                0                      0                             1
  L-dimensional vectors
Number of multiplications of scalar        0                      0                             L
  with L-dimensional vectors
Number of "×" per signature                2p^2 L + pL            2p^2 L + pL^2 − 4pL           7L^2 + 2L
                                                                    + L^2 + 4L
Complexity of matrix inversions            O(p^3)                 O(p^3)                        0
Table 8.3 tabulates a comparison of the computational complexity required by
LS, OSP, and RHSP-OSP to process a single signature.
It should be noted that computing time is not necessarily equivalently transferred
to computational complexity because computing time is largely determined by the
computer used for data processing as well as how effective running codes are
programmed. However, computational complexity in Table 8.3 does offer a view
of how complicated operations must be implemented in hardware design such as the
Field Programmable Gate Array (FPGA).

8.7 Conclusions

The OSP developed by Harsanyi and Chang (1994) has produced promising results
in hyperspectral data exploitation. Its potential remains underexplored. This chapter
revisits the Harsanyi–Chang OSP and further develops a recursive version of OSP,
RHSP-OSP, that can implement OSP without inverting matrices as OSP does. As a
result, its computational complexity is significantly reduced. Such recursiveness
allows users not only to find unknown target signal sources in an unsupervised
manner but also to determine the number of signal sources as signatures, p, that
must be found at the same time. With RHSP-OSP there is no need to know the value
of p beforehand because its value can be determined by its ongoing process. Since
RHSP-OSP also iterates the signal signatures to be found in the outer loop, while
iterating the data sample vectors in the inner loop, we can derive a causal version of
RHSP-OSP, called causal RHSP-OSP (CRHSP-OSP), by interchanging these two
loops so that the data sample pool is implemented in the outer loop and the signal
signatures to be found in the inner loop. Accordingly, each data sample vector that
produces a varying data sample pool can determine its own number of signal
sources for data processing. In contrast to CRHSP-OSP, the number of signal
signatures in RHSP-OSP can be adapted to various applications.
Many users who have used OSP to perform LSMA to unmix data might believe
that OSP is a linear spectral unmixing (LSU) technique. Technically speaking, it is
not. In order for OSP to unmix data accurately, a correction term is required to
account for estimation errors, which leads to an LSOSP, as developed in Chap. 2 as
well as in Chang (2003a, b, 2013). Even in this case, LSOSP can only be used for
abundance-unconstrained LSU, not abundance-constrained LSU, owing to the fact
that OSP produces the abundance fraction of each signature in the signature matrix
individually and separately, in which case the ASC cannot be applied until all
abundance fractions are generated for all signatures by OSP. This is the main reason
why, when LSMA is implemented for abundance-constrained LSU, least squares–
based methods have been used instead of OSP because OSP is based on a signal
detection criterion, SNR, not an LSE criterion. However, this does not exclude OSP
from being used for abundance-constrained LSU. Recently, several projects
have been undertaken, such as those of Chen et al. (2014b), Li et al. (2014a, b),
Wang et al. (2013a, b), and Yang et al. (2014).
As a final remark, a recent work (Du et al. 2008b) also showed a close relation-
ship between RHSP-OSP and the Gram–Schmidt orthogonalization process
(GSOP). This finding was also independently reported in Song et al. (2014a),
where a new approach, called orthogonal vector projection with its recursive
version, was further developed in Song et al. (2014b). Nevertheless, there are
several significant differences between RHSP-OSP and GSOP. First and foremost
is that RHSP-OSP performs outer products of vectors via (8.5) to find orthogonal
subspaces, while GSOP takes advantage of inner products of vectors to
orthogonalize a set of vectors. Second, RHSP-OSP produces orthogonal subspaces,
as opposed to GSOP, which produces orthonormal base vectors. In other words,
RHSP-OSP does not perform orthogonalization among signatures in an undesired
signature matrix U. Instead, it performs OSP to create a hyperplane and in the
meantime also suppresses data via OSP. Compared to OSP, GSOP must orthogo-
nalize each signature in U to produce a set of orthonormal base vectors, but it does
not suppress data as RHSP-OSP does. This is a critical difference. As a result of
data suppression by undesired signatures, OSP’s ability to perform mixed pixel
classification is enhanced, which GSOP is unable to do. Third, for GSOP to produce
subspaces, it should include an additional process to find projection subspaces via
the orthonormal base vectors it finds. This creates extra computational complexity.
Finally, RHSP-OSP can be used to extend (8.2)–(8.6) to implement subspaces
recursively, as demonstrated in (8.7)–(8.10), while GSOP does not perform orthog-
onalization of one set of vectors to another, but rather one vector at a time.
Obviously, for GSOP to work on a subset of vectors, it must use an orthogonal
subspace approach to accomplish what RHSP-OSP does. In this case, GSOP can be
considered a simple special case of RHSP-OSP.
Interestingly, we can also follow a similar idea used to develop causal iterative
pixel purity index (C-IPPI) in Chap. 8 and progressive iterative pixel purity index
(P-IPPI) in Chap. 12 (Chang 2016) to implement RHSP-OSP in a two-loop process:
(1) finding new signal signatures one at a time by P-OSP and (2) updating P⊥U(p−1) in
(8.1) recursively by RHSP-OSP. Depending on how these two loops are designed
and implemented, two different versions can be derived for RHSP-OSP. One is
called causal RHSP-OSP (CRHSP-OSP), which iterates signal signatures in an
inner loop while iterating data sample vectors in an outer loop. The other is called
progressive hyperspectral sample processing of OSP (PHSP-OSP), which iterates
data sample vectors in an inner loop while iterating signal sources in an outer loop.
Both CRHSP-OSP and PHSP-OSP implement their inner and outer loops in the
complete opposite order. As a result of CRHSP-OSP, each data sample vector r can
produce its own appropriate number of undesired signal signatures for OSP to find
abundance fractions of a desired signal signature present in that particular data
sample vector r. On the other hand, when PHSP-OSP is implemented, we see
abundance fractions of a particular desired signal signature present in each data
sample vector progressively as new signatures are added for annihilation. In
addition, PHSP-OSP also provides a signature-varying progressive profile of
detected abundance fractions for a particular desired signal signature when each
new undesired signal signature is added. In both cases CRHSP-OSP and PHSP-OSP
offer advantages of hardware implementation in chip design such as FPGA design
(Chang and Wang 2008).
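A minimal sketch of the two loop orders follows, under the assumption of a hypothetical OSP detector δ_OSP(r) = dᵀP⊥U r and random illustrative data (the names osp_detect, sigs, and samples are invented for this example and are not from the text):

```python
import numpy as np

rng = np.random.default_rng(2)
L, N, q = 20, 5, 3
samples = rng.standard_normal((N, L))     # data sample vectors r_i (illustrative)
sigs = rng.standard_normal((q, L))        # undesired signatures found so far
d = rng.standard_normal(L)                # desired signature (illustrative)

def osp_detect(r, U):
    """Hypothetical OSP detector: d^T P_U-perp r for undesired matrix U (columns)."""
    P = np.eye(L) - U @ np.linalg.inv(U.T @ U) @ U.T
    return d @ P @ r

# CRHSP-OSP order: data sample vectors in the outer loop, signatures in the inner loop.
causal = [[osp_detect(r, sigs[:k + 1].T) for k in range(q)] for r in samples]

# PHSP-OSP order: signatures in the outer loop, data sample vectors in the inner loop.
progressive = [[osp_detect(r, sigs[:k + 1].T) for r in samples] for k in range(q)]

# Same detected abundances; only the traversal order differs.
assert np.allclose(np.array(causal), np.array(progressive).T)
```

The sketch makes the text's point concrete: the two versions compute identical quantities, and the choice of loop order only determines whether results arrive sample by sample (causal) or signature by signature (progressive).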
There are several benefits. First, RHSP-OSP allows OSP to be implemented with
varying signatures via a recursive equation without repeatedly reinverting
undesired signature matrices. In this case, RHSP-OSP can be carried out in a
manner similar to that of a Kalman filter (Chap. 3) by only retaining the innovation
information to update results without reprocessing already known information,
including data sample vectors already visited. As a result, the computer processing
time required by RHSP-OSP is nearly constant regardless of how many targets need
to be processed. This significant advantage makes RHSP-OSP very attractive and
effective. Second, RHSP-OSP can also be used to find new unknown signal sources
recursively while simultaneously determining a desired number of signal sources
while data processing is happening. A similar idea is also used by RHSP-ATGP in
Chap. 7. As a result, with appropriate interpretations, RHSP-ATGP can be consid-
ered a special case of RHSP-OSP. Third, for practical applications RHSP-OSP can
also be expanded in two different ways, to a causal process and to a progressive
process, which give rise to causal RHSP-OSP and progressive RHSP-OSP, respec-
tively, both of which can be easily realized by hardware design. Finally, in analogy
with RHSP-ATGP, RHSP-OSP also provides a real-time stopping rule via a
Neyman–Pearson detector to determine whether each target that is newly generated
by RHSP-OSP is a target while it is being generated.
Chapter 9
Recursive Hyperspectral Sample Processing
of Linear Spectral Mixture Analysis

Abstract In linear spectral mixture analysis (LSMA) the signatures used to form a
linear mixing model (LMM) have a significant role and impact in unmixing
performance but are generally assumed to be fixed and known a priori. However,
in real-world applications, such prior knowledge is either very difficult if not
impossible to obtain or unreliable even if it is known. Consequently, a realistic
solution to this issue is to adapt signatures while LSMA is taking place. Another
issue is that according to the general consensus, the signatures used to perform
LSMA should be endmembers. As noted in Chaps. 4 and 9 in Chang (Real time
progressive hyperspectral image processing: endmember finding and anomaly
detection, Springer, New York, 2016), this is usually not true. In other words, the
signatures used to form an LMM should be those that can best represent the data to be
processed but are not necessarily endmembers as pure signatures. Specifically,
background signatures which are generally mixed are considered to be the most
integral part of data and should be included for data unmixing. To address the
aforementioned issues, this chapter introduces a new concept called virtual signa-
tures (VSs), which are defined as spectrally distinct signatures that are different from
endmembers as pure signatures in the sense that VSs ought to be extracted directly
from the data themselves and cannot be from spectral libraries or databases or
outside the data space. Furthermore, VSs are also different from virtual
endmembers (VEs) introduced in nonlinear spectral mixture analysis according to
a bilinear model. With this definition, finding and adapting VSs is a very challeng-
ing issue because these unknown VSs must be found in an unsupervised manner
without prior knowledge. In addition, an associated issue is the need to determine
how to terminate the VS finding process once an unsupervised VS finding algorithm
is initiated. To accomplish this task, this chapter develops a theory of adaptive
recursive hyperspectral sample processing of LSMA (ARHSP-LSMA). It first
develops a recursive hyperspectral sample processing of LSMA (RHSP-LSMA),
which allows LSMA to be implemented progressively, by growing VSs one at a
time to form an adaptive linear mixing model (ALMM) that makes the LMM adapt
to signatures, as well as recursively, by taking advantage of previously known
signatures to update the LMM. In this way, LSMA does not necessarily have to be
reimplemented over and over again during data processing. This allows LSMA to
perform data unmixing at the same time that ALMM is being updated. The resulting
LSMA is called ARHSP-LSMA. Most importantly, one great benefit of ARHSP-
LSMA is that it can be used to fuse LSMA results obtained by multiple different

sets of signatures without going through the entire signature set to perform LSMA.
This is particularly useful in data communication and transmission when LSMA
can be processed at different locations by different users independently and sepa-
rately, and then these LSMA results can be further fused by ARHSP-LSMA without
performing LSMA using the complete set of signatures. Finally, it takes advantage
of the concept of target-specified virtual dimensionality developed in Chap. 4 to
determine when ARHSP-LSMA should be terminated automatically by a Neyman–
Pearson detector (NPD).

9.1 Introduction

Assume that any data sample vector r can be represented by a linear admixture of a
finite number of signatures, say, p signatures, m1, m2, . . ., mp by which r can be
unmixed into their corresponding abundance fractions α1, α2, . . ., αp as an abun-
dance vector, αp = (α1, α2, . . ., αp)^T. In recent years these p signatures have been
assumed to be known and fixed a priori. Such linear spectral mixture analysis
(LSMA) is generally referred to as supervised LSMA (SLSMA). Unfortunately, in
practical applications, such prior signature knowledge is generally unobtainable.
Under this circumstance, an unsupervised LSMA (ULSMA) must be developed
without appealing to prior knowledge of the signatures used to form a linear
mixing model (LMM). However, two major challenging issues arise in ULSMA.
The first issue has to do with determining the number of signatures, p, followed by
the second issue, which is that the p signatures, m1, m2, . . ., mp used to form an
LMM must be found in a completely unsupervised manner. To determine an
appropriate value of p, the concept of virtual dimensionality (VD), recently devel-
oped in Chap. 17 (Chang 2003) and Chap. 5 (Chang 2013) and Chang and Du
(2004), can be used. As for finding m1, m2, . . ., mp a general approach is to assume
that these p signatures are endmembers, that is, pure signatures, so that many
endmember extraction algorithms (EEAs) developed in the literature can be used
to extract endmembers. Unfortunately, this assumption is generally not true. First,
the background is considered an integral part of data. Thus, an LMM should include
background signatures to faithfully represent data. Practically speaking, many
background signatures are indeed not pure. Second, even though endmembers
exist in the data, these endmembers may very well be contaminated by other
signatures or corrupted by noise in real-world environments. Therefore, the pres-
ence of endmembers is not guaranteed. In both cases, extracting something that is
not present in the data is meaningless. Interestingly, many EEAs that reportedly
extract endmembers actually find endmember candidates, not true endmembers,
because true endmembers may not be present in the data. A good example is the
N-finder algorithm (N-FINDR) (Winter 1999a, b), which was originally developed
to find endmembers, not extract them. It has been misinterpreted in the literature as
an EEA. Third, the concept of VD, originally developed in Chang (2003) and
Chang and Du (2004), was to determine the number of spectrally distinct signa-
tures, not endmembers, present in the data. Accordingly, using VD to determine the
number of endmembers is misleading. This chapter clarifies this issue by introduc-
ing the new concept of virtual signature (VS), which was previously used in
unsupervised LSMA (Chang et al. 2010a, 2011b) to define a VS as a spectrally
distinct signature that is a real data sample vector and must be present in data. With
this clear definition a VS distinguishes itself from an endmember in two respects.
One is that a VS is not necessarily pure like an endmember must be. In other words,
a VS can be a mixed signature. Another difference is that the existence of VS in the
data is guaranteed. This is unlike endmembers, which may not exist in real data
owing to signature purity. More specifically, a real endmember extracted directly
from data does not imply that it is in fact a true endmember. Previously, to address
this issue, the term virtual endmember (VE) was used in Tompkins et al. (1997) and
Bowles and Gilles (2007) to distinguish it from true endmembers. With this
interpretation endmembers extracted by an EEA in this chapter should be treated
as VEs. However, it is very unfortunate that the same term, VE, was also adopted in
several reports in nonlinear spectral mixture analysis (Chen et al. 2011), where VEs
are defined as relevant endmembers resulting from multiple-scattering effects in
each pixel and can be described by their product terms according to a bilinear
model. Such VEs can be highly correlated, as opposed to true endmembers, which
are generally spectrally distinct. Thus, to avoid any controversy and confusion, VS
is specifically defined and used in this chapter to differentiate VSs from VEs and
endmembers.
One of the major applications resulting from LSMA is linear spectral unmixing
(LSU), which is probably one of the most studied subjects in hyperspectral data
exploitation (Chang 2003, 2013). When LSMA is implemented, it generally
assumes that signatures used to form an LMM for mixture analysis must be
known and fixed during data processing. In real practical problems obtaining
such prior knowledge is usually very costly or may not even be reliable owing to
many unknown subtle substances present in the data. To resolve this dilemma, one
feasible approach is to implement LSMA in an unsupervised manner, such as
ULSMA developed by Chang et al. (2010a, 2011b) and Chang (2013), where
signatures can be obtained directly from the data being processed by an
unsupervised target detection algorithm. This approach must address two
major issues: determining the number of signatures, p, required to form an LMM
and finding these signatures once the value of p is determined. In ULSMA, the first issue
was resolved by virtual dimensionality (VD), recently developed by Chang (2003)
and Chang and Du (2004), while the second issue was resolved by least squares–
based unsupervised target finding algorithms proposed in Chang et al. (2010a) and
component analysis in Chang et al. (2011b). However, such ULSMA also fixes the
number of signatures during data processing once these unsupervised signatures are
found. In other words, this ULSMA cannot adapt to signatures once they are found
and cannot vary but must be fixed. More specifically, if signatures are found to be
insufficient or need to be expanded, ULSMA must be reimplemented to extract
another set of signatures; it cannot take advantage of previously found signatures.
To cope with this issue, this chapter develops an adaptive recursive hyperspectral
sample processing of LSMA (ARHSP-LSMA), which makes LSMA adapt to
signatures being generated during data processing. It is composed of three stage
processes. In the first stage process, it develops a recursive hyperspectral sample
processing of LSMA (RHSP-LSMA) to find new signatures via LSMA. Then, in the
second stage process, it uses the RHSP-found signatures to extend LMM to
adaptive LMM (ALMM), which makes ALMM a signature-varying LMM so that
implementing RHSP-LSMA in conjunction with ALMM results in ARHSP-LSMA,
which allows users to update signatures to perform LSMA according to various
applications. More usefully, ARHSP-LSMA can be used in data communication to
fuse LSMA results obtained by using multiple sets of signatures without having to
perform LSMA using LMM formed by the complete set of signatures. Finally, in
the third stage process, it derives an automatic stopping rule to determine when
ARHSP-LSMA needs to be terminated.
The idea of RHSP-LSMA is similar to the concept of Kalman filtering devel-
oped in Chap. 3, which utilizes recursive equations to update outputs by taking
advantage of previously processed information. To be more specific, the signa-
tures used to form an ALMM can be updated by adding more signatures either one
after another or one set after another without reprocessing the entire set of
signatures. Thus, two remaining issues are (1) finding appropriate signatures to
adapt ALMM and (2) determining the sufficient number of signatures to form an
ALMM. As for the first issue, the unsupervised algorithms developed in Sect. 4.4
of Chap. 4 can be used to generate new signatures. To resolve the second issue
about determining how many signatures are required for RHSP-LSMA, the target-
specified VD (TSVD) developed in Sect. 4.5 of Chap. 4 can be used for this
purpose.
Note that according to Chaps. 3 and 9 in Chang (2016), the signatures used by
LSU to unmix data are not necessarily endmembers, as was commonly thought, for
several reasons. First and foremost is that when LSMA is used to perform LSU, its
task is to unmix the data via an LMM. In this case, the fact that signatures used to
form an LMM are assumed to be endmembers seems valid. It is also natural to see
that a data sample vector can be unmixed into abundance fractions of these
endmembers. However, if LSMA is performed as a linear system represented by
an LMM instead of LSU, then, for LSMA to work effectively, background signa-
tures must be included in its LMM to faithfully represent data through the system,
in which case background signatures are generally mixed. Accordingly, using the
term endmember in connection with LSMA is somewhat misleading. Thus,
throughout this chapter the term signature or VS rather than endmember is used
for clarity.
9.2 LSMA

LSMA is quite different from the orthogonal subspace projection (OSP) presented
in Sect. 4.4.2.1 in two crucial respects. First and foremost are the performance
criteria they use. OSP is designed based on signal-to-noise ratio (SNR), whereas
LSMA is derived from least-squares error (LSE). As a result, OSP detects signal
strength compared to LSMA, which estimates signal parameters. Second, LSMA is
a vector estimation technique that can estimate signal abundance fractions simul-
taneously. Thus, it can be used for data unmixing by imposing abundance con-
straints, abundance sum-to-one constraint (ASC) and abundance nonnegativity
constraint (ANC). However, since OSP performs detection for signals one at a
time, ASC cannot be directly applied to OSP unless ASC must be adaptively
applied to each signal it detects consecutively. Therefore, in essence, OSP and
LSMA can be considered two completely different analysis concepts.
Assume that L is the number of spectral bands and r is an L-dimensional image
pixel vector. Assume that there are p material substance signatures, m1, m2, . . ., mp,
which are generally referred to as digital numbers (DNs). A linear mixture model of
r models the spectral signature of r as a linear combination of m1, m2, . . ., mp with
appropriate abundance fractions specified by α1, α2, . . ., αp. More precisely, r is an
L × 1 column vector and M is an L × p substance spectral signature matrix, denoted
by [m1 m2 ⋯ mp], where mj is an L × 1 column vector represented by the
spectral signature of the jth substance tj resident in the pixel vector r. Let
α = (α1, α2, . . ., αp)^T be a p × 1 abundance column vector associated with r, where αj
denotes the abundance fraction of the jth substance signature mj present in the pixel
vector r. To restore the pixel vector r, we assume that the spectral signature of r can
be represented by an LMM of m1, m2, . . ., mp as follows:

    r = Mα + n,                                                          (9.1)

where n is noise or can be interpreted as a measurement or model error.
Inspection of (9.1) reveals that the LMM used by LSMA is similar to a linear
model used by a Wiener filter, except that (9.1) explores the correlation among
p substance signatures compared to the latter, which uses a linear model to account
for p past observed data samples to make predictions. Furthermore, by virtue of
(9.1), a hyperspectral image viewed as an image cube can be restored by a set of
p abundance fraction maps.
A general approach to solving (9.1) is the least-squares solution given by

    α̂^LS(r) = (M^T M)^{-1} M^T r,                                        (9.2)

where α̂^LS(r) = (α̂_1^LS(r), α̂_2^LS(r), . . ., α̂_p^LS(r))^T and α̂_j^LS(r) is the abundance
fraction of the jth substance signature mj estimated from the data sample vector
r, with the data sample vector r included to emphasize that the abundance estimate
is dependent on r.
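To make (9.1) and (9.2) concrete, the following sketch (synthetic signatures and abundances, not real hyperspectral data) forms a pixel vector by the LMM and recovers the abundance vector with the unconstrained least-squares estimate:

```python
import numpy as np

rng = np.random.default_rng(3)
L, p = 100, 4                                   # bands, number of signatures
M = np.abs(rng.standard_normal((L, p)))         # illustrative signature matrix
alpha = np.array([0.4, 0.3, 0.2, 0.1])          # true abundance vector
r = M @ alpha + 0.001 * rng.standard_normal(L)  # LMM of (9.1) with small noise n

# Unconstrained least-squares estimate of (9.2).
alpha_ls = np.linalg.inv(M.T @ M) @ M.T @ r

# With low noise the LS estimate recovers the true abundances closely.
assert np.allclose(alpha_ls, alpha, atol=0.01)
```

Note that nothing in (9.2) forces the estimated fractions to sum to one or to be nonnegative; those physical constraints are the subject of the two subsections that follow.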
To faithfully restore the spectral information contained in data sample vectors via LSMA, three issues need to be addressed. One is to determine the value of p, that is, the number of substance signatures used to form an LMM in (9.1). Another is to find a desired set of p substance signatures, m_1, m_2, ..., m_p, to form a signature matrix M in an unsupervised manner. The last issue is to impose two physical constraints on model (9.1): (1) an abundance sum-to-one constraint (ASC), Σ_{j=1}^p α_j = 1, and (2) an abundance nonnegativity constraint (ANC), α_j ≥ 0 for all 1 ≤ j ≤ p.

9.2.1 Abundance Sum-to-One Constrained LSMA

The simplest constraint to impose on (9.1) is the ASC. In other words, a sum-to-one constrained least-squares (SCLS) problem can be cast as follows:

    min_α {(r − Mα)^T (r − Mα)} subject to Σ_{j=1}^p α_j = 1.    (9.3)

To solve (9.3), we use the Lagrange multiplier λ_1 to constrain the ASC, 1^T α = 1, by

    J(α, λ_1) = (1/2)(Mα − r)^T (Mα − r) + λ_1 (1^T α − 1),    (9.4)

where 1 = (1, 1, ..., 1)^T is the p-dimensional unity vector. Differentiating (9.4) with respect to α and λ_1 yields
    ∂J(α, λ_1)/∂α |_{α̂^SCLS(r)} = M^T M α̂^SCLS(r) − M^T r + λ_1* 1 = 0
    ⟹ α̂^SCLS(r) = (M^T M)^{-1} M^T r − λ_1* (M^T M)^{-1} 1    (9.5)
    ⟹ α̂^SCLS(r) = α̂^LS(r) − λ_1* (M^T M)^{-1} 1

and

    ∂J(α, λ_1)/∂λ_1 |_{λ_1*} = 1^T α̂^SCLS(r) − 1 = 0.    (9.6)

Note that (9.5) and (9.6) must be solved simultaneously in such a way that both optimal solutions, α̂^SCLS(r) and λ_1*, appear in (9.5).
Using (9.6) further implies

    1^T α̂^LS(r) − λ_1* 1^T (M^T M)^{-1} 1 = 1
    ⟹ λ_1* = −[1^T (M^T M)^{-1} 1]^{-1} (1 − 1^T α̂^LS(r)).    (9.7)
9.2 LSMA 267

Substituting (9.7) into (9.5) yields

    α̂^SCLS(r) = α̂^LS(r) + (M^T M)^{-1} 1 [1^T (M^T M)^{-1} 1]^{-1} (1 − 1^T α̂^LS(r))
               = P^⊥_{M,1} α̂^LS(r) + (M^T M)^{-1} 1 [1^T (M^T M)^{-1} 1]^{-1},    (9.8)

where

    P^⊥_{M,1} = I − (M^T M)^{-1} 1 [1^T (M^T M)^{-1} 1]^{-1} 1^T.    (9.9)
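The closed-form SCLS solution in (9.5)–(9.9) can be checked numerically. The sketch below uses a made-up random M and r, computes λ_1* from (9.7), the SCLS estimate from (9.5), and the equivalent projection form (9.8)–(9.9):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.random((6, 3))   # hypothetical 6-band, 3-signature matrix
r = rng.random(6)        # an arbitrary pixel vector

MtM_inv = np.linalg.inv(M.T @ M)
ones = np.ones(3)

# Unconstrained LS estimate, (9.2)
alpha_ls = MtM_inv @ M.T @ r

# Lagrange multiplier (9.7), then SCLS via (9.5)
c = ones @ MtM_inv @ ones                       # scalar 1^T (M^T M)^{-1} 1
lam = -(1.0 / c) * (1.0 - ones @ alpha_ls)      # lambda_1^*
alpha_scls = alpha_ls - lam * (MtM_inv @ ones)

# Equivalent projection form (9.8) with P_perp_{M,1} of (9.9)
P = np.eye(3) - np.outer(MtM_inv @ ones, ones) / c
alpha_scls_2 = P @ alpha_ls + (MtM_inv @ ones) / c
```

Both forms give the same vector, and its components sum to one by construction.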

9.2.2 Abundance Nonnegativity Constrained LSMA

Since there is no physical constraint imposed on the solution α̂^LS(r), its generated abundance fractions α̂_1^LS(r), α̂_2^LS(r), ..., α̂_p^LS(r) may be negative. To avoid this situation, an ANC must be imposed on (9.1). A general approach to solving such an ANC-constrained problem is to introduce an LSE-based objective function imposed by the ANC for j ∈ {1, 2, ..., p}, defined as

    J = (1/2)(Mα − r)^T (Mα − r) + λ^T (α − c),    (9.10)

where λ = (λ_1, λ_2, ..., λ_p)^T is a Lagrange multiplier vector and c = (c_1, c_2, ..., c_p)^T is a constraint vector with c_j > 0 for 1 ≤ j ≤ p. In analogy with (9.5) we obtain
    ∂J(α, λ)/∂α |_{α̂^NCLS(r)} = M^T M α̂^NCLS(r) − M^T r + λ = 0
    ⟹ α̂^NCLS(r) = (M^T M)^{-1} M^T r − (M^T M)^{-1} λ    (9.11)
                  = α̂^LS(r) − (M^T M)^{-1} λ,

which implies

    (M^T M)^{-1} λ* = α̂^LS(r) − α̂^NCLS(r)
    ⟹ λ* = M^T M α̂^LS(r) − M^T M α̂^NCLS(r)
    ⟹ λ* = M^T M (M^T M)^{-1} M^T r − M^T M α̂^NCLS(r)    (9.12)
    ⟹ λ* = M^T r − M^T M α̂^NCLS(r).

In order for α̂^LS(r) to further satisfy the ANC, the following Kuhn–Tucker conditions must be implemented:

    λ_i = 0, i ∈ P,
    λ_i < 0, i ∈ R,    (9.13)

where P and R denote the passive and active sets, containing the indices of nonnegative and negative abundance fractions, respectively. By virtue of (9.11)–(9.13), a numerical algorithm, referred to as a nonnegativity constrained least-squares (NCLS) algorithm, can be designed to start off with the initial estimate given by α̂^LS(r) in (9.2). If all abundance fractions are nonnegative, the NCLS algorithm stops. Otherwise, all indices corresponding to negative abundance fractions are moved to the active set R, while the remaining indices stay in the passive set P. According to the Kuhn–Tucker conditions (9.13), any λ_i with index i ∈ P is set to zero, and the multipliers for the remaining indices are calculated based on (9.12). If all λ_i are negative, then the NCLS algorithm stops. If not, the index corresponding to the largest λ_i is moved from R to P. A new vector λ is then recalculated based on the modified index sets, and the new Lagrange multiplier vector is used to find a new set of abundance fractions. Any resulting negative abundance indices are then shuffled from P to R. By iteratively implementing (9.11) and (9.12), using α̂^LS(r) in (9.2) as an initial abundance vector, the NCLS algorithm can be derived to find the optimal solution, α̂^NCLS(r) (Chang and Heinz 2000). A detailed step-by-step implementation of the NCLS algorithm is given in what follows.

NCLS Algorithm
1. Initialization: Set the passive set P^(0) = {1, 2, ..., p}, the active set R^(0) = ∅, and k = 0.
2. Compute α̂^LS(r) by (9.2), and let α̂^NCLS,(0)(r) = α̂^LS(r).
3. At the kth iteration, if all components in α̂^NCLS,(k)(r) are nonnegative, the algorithm is terminated. Otherwise, continue.
4. Let k = k + 1.
5. Move all indices in P^(k−1) that correspond to negative components of α̂^NCLS,(k−1)(r) to R^(k−1). Create a new index set S^(k), and set it equal to R^(k).
6. Let α̂_{R^(k)} denote the vector consisting of all components of α̂^NCLS,(k−1)(r) in R^(k).
7. Form a matrix Φ^(k) by deleting all rows and columns in the matrix (M^T M)^{-1} that are specified by P^(k).
8. Calculate λ^(k) = (Φ^(k))^{-1} α̂_{R^(k)}. If all components in λ^(k) are negative, go to step 13. Otherwise, continue.
9. Calculate λ_max^(k) = arg max_j {λ_j^(k)}, and move the index in R^(k) that corresponds to λ_max^(k) to P^(k).
10. Form another matrix Ψ_λ^(k) by deleting every column of (M^T M)^{-1} that is specified by P^(k).
11. Set α̂_{S^(k)} = α̂^NCLS,(k) − Ψ_λ^(k) λ^(k).
12. If any components of α̂_{S^(k)} in S^(k) are negative, then move these components from P^(k) to R^(k). Go to step 6.
13. Form another matrix Ψ_λ^(k) by deleting every column of (M^T M)^{-1} that is specified by P^(k).
14. Set α̂^NCLS,(k) = α̂^LS − Ψ_λ^(k) λ^(k). Go to step 3.

9.2.3 Abundance Fully Constrained LSMA

Because the NCLS algorithm does not impose the ASC, its generated abundance fractions may not necessarily sum to one. In this case, it must solve the following constrained optimization problem:

    min_{α∈Δ} {(r − Mα)^T (r − Mα)} subject to Δ = {α | α_j ≥ 0 for all j, Σ_{j=1}^p α_j = 1}.    (9.14)

The optimal solution to (9.14) first takes advantage of the SCLS solution, α̂^SCLS(r) in (9.8), as an initial estimate to derive

    α̂^FCLS(r) = P^⊥_{M,1} α̂^SCLS(r) + (M^T M)^{-1} 1 [1^T (M^T M)^{-1} 1]^{-1}    (9.15)

and

    P^⊥_{M,1} = I − (M^T M)^{-1} 1 [1^T (M^T M)^{-1} 1]^{-1} 1^T.    (9.16)

It then uses the SCLS together with the ANC by introducing a new signature matrix N and an observation vector s into the NCLS, specified by

    N = [ ηM ]    and    s = [ ηr ],    (9.17)
        [ 1^T ]              [ 1  ]

where η is a parameter that controls the effect of the ASC on the NCLS algorithm and is defined as the reciprocal of the maximum element in the matrix M = [m_ij], that is, η = 1/max_{ij} m_ij. The utilization of η in (9.17) controls the impact of the ASC. Using (9.17), an FCLS algorithm can be derived directly from the NCLS algorithm described in the previous section by replacing M, r, and α̂^LS(r) used in the NCLS algorithm with N, s, and α̂^SCLS(r).

9.3 RHSP-LSMA

The rationale for developing RHSP-LSMA is quite different from that of LSMA in the sense that RHSP-LSMA can be implemented in data transmission and communication, which is not applicable to LSMA. One major advantage of RHSP-LSMA is that it can be performed in a recursive manner as signatures of interest are generated gradually and progressively. More specifically, RHSP-LSMA can process signatures at the same time as they are being generated. This is particularly useful when the signatures used to form an LMM may not be sufficient and more signatures need to be generated and included. To deal with this dilemma, the traditional LMM is no longer applicable and must be rederived to adapt to varying signatures as follows.

9.3.1 Adaptive Linear Mixing Model

In (9.1) the signature matrix M is assumed to be known and fixed. Thus, if the
signatures m1, m2, . . ., mp in M are not sufficiently representative, then the LMM in
(9.1) needs to be updated by adding more signatures. An LMM that adapts to varying signatures is referred to as an adaptive LMM (ALMM). In most cases, when
ALMM is implemented, the current signatures remain unchanged. Only future
signatures need to be updated. In this case, we should be able to take advantage
of what has already been done for the current signatures and only process new
signatures without reprocessing LSMA, which uses signatures already visited.
This type of scenario is very common in communications and signal processing
when a new incoming input is fed into a system. In LSMA, we consider LMM a
linear system with inputs determined by signatures m1, m2, . . ., mp. Thus, if a
new input, such as a new signature mp+1, comes in to update ALMM, the
system should not be processed by entire inputs again but rather updated by
only the new incoming input, mp+1. A good representative example is a Kalman
filter (Chap. 3), which operates in exactly this way. Applying an idea similar to
Kalman filtering, the following subsection develops the so-called RHSP-LSMA,
which can update LSMA recursively in signatures in the same way that a
Kalman filter does.

9.3.2 RHSP-LSMA Updated by Single Signatures

In this section two scenarios can be used to implement RHSP-LSMA. One is to


implement RHSP-LSMA signature by signature recursively. This scenario can be
used to determine an appropriate number of signatures required for LSMA. Another
scenario is to implement RHSP-LSMA by fusing the LSMA results from two different sets of signatures. This may occur in data transmission and communication, where two different sets of signatures are acquired by different sensors or received at different receiving ends or time instances. As a consequence, there is no need to wait for the completion of signature acquisition or reception. In this case, LSMA can be processed by two different sets of signatures independently and separately, and their results can be fused and integrated subsequently. Most importantly, the correlation between these two LSMA results can be explored via the interaction between the two sets of signatures and expressed by derived recursive equations. Such a correlation is generally referred to as innovations information, which cannot be provided or generated by two independently processed LSMA results.
Assume that M_p = [m_1 m_2 ⋯ m_p] is a matrix of size L × p, and let m_{p+1} be a new L-dimensional virtual signature (VS) vector added to form M_{p+1} = [m_1 m_2 ⋯ m_p m_{p+1}]. Then, according to a matrix inverse identity derived in Eq. (12.25) in Chang (2013), the inverse of the matrix M_{p+1}^T M_{p+1} can be expressed as

    M_{p+1}^T M_{p+1} = [ M_p^T     ] [M_p m_{p+1}] = [ M_p^T M_p       M_p^T m_{p+1}     ],    (9.18)
                        [ m_{p+1}^T ]                 [ m_{p+1}^T M_p   m_{p+1}^T m_{p+1} ]

    (M_{p+1}^T M_{p+1})^{-1} = [ (M_p^T M_p)^{-1} + β M_p^# m_{p+1} m_{p+1}^T (M_p^#)^T    −β M_p^# m_{p+1} ],    (9.19)
                               [ −β m_{p+1}^T (M_p^#)^T                                      β               ]

where M_p^# = (M_p^T M_p)^{-1} M_p^T and β = {m_{p+1}^T [I − M_p (M_p^T M_p)^{-1} M_p^T] m_{p+1}}^{-1} = {m_{p+1}^T P^⊥_{M_p} m_{p+1}}^{-1}. Note that P^⊥_{M_p} = I − P_{M_p} can be easily calculated via P_{M_p} = M_p (M_p^T M_p)^{-1} M_p^T = M_p M_p^# using (9.19). The least-squares estimate of α_{p+1}, α̂^LS_{p+1}(r), is then given by

    α̂^LS_{p+1}(r) = (M_{p+1}^T M_{p+1})^{-1} M_{p+1}^T r
      = [ (M_p^T M_p)^{-1} M_p^T r + β M_p^# m_{p+1} m_{p+1}^T (M_p^#)^T M_p^T r − β M_p^# m_{p+1} m_{p+1}^T r ]
        [ −β m_{p+1}^T (M_p^#)^T M_p^T r + β m_{p+1}^T r                                                        ]
      = [ α̂^LS_p(r) + β M_p^# m_{p+1} m_{p+1}^T (P_{M_p} − I) r ] = [ α̂^LS_p(r) − β M_p^# m_{p+1} m_{p+1}^T P^⊥_{M_p} r ],    (9.20)
        [ β m_{p+1}^T (I − P_{M_p}) r                            ]   [ β m_{p+1}^T P^⊥_{M_p} r                             ]

where P_{M_p}^T = P_{M_p} = M_p M_p^#, and α̂^LS_{p+1|p+1}(r) = β m_{p+1}^T P^⊥_{M_p} r in (9.20) is a scalar that is exactly the least-squares OSP (LSOSP) estimate, α̂^LSOSP_{p+1}(r), of the (p + 1)st abundance fraction of α_{p+1} = (α_1, α_2, ⋯, α_p, α_{p+1})^T obtained via OSP in (4.24), where U and d are specified by U = [m_1 ⋯ m_{p−1} m_p] and d = m_{p+1}, respectively. According to (9.20), there are three pieces of information needed to implement RHSP-LSMA.
1. Processed information: α̂^LS_p(r) and P_{M_p}.
2. New information: m_{p+1}.
3. Innovations information, provided by the correlation between the processed information and the new information: β = {m_{p+1}^T P^⊥_{M_p} m_{p+1}}^{-1}, determined by m_{p+1}^T P^⊥_{M_p}. Technically speaking, β is also updated as p varies. Thus, it should be more specifically expressed as β_{p+1|p} to indicate such a correlation and dependence between p and p + 1.
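The recursive update built from these three pieces of information can be verified numerically by comparing it against a direct (non-recursive) least-squares solve on the augmented matrix. A numpy sketch with made-up dimensions and data:

```python
import numpy as np

rng = np.random.default_rng(3)
L, p = 8, 3
Mp = rng.random((L, p))            # current signature matrix M_p
m_new = rng.random(L)              # new signature m_{p+1}
r = rng.random(L)                  # a data sample vector

# Processed information: alpha_p^LS(r) and the projector P^perp_{M_p}
MtM_inv = np.linalg.inv(Mp.T @ Mp)
Mp_sharp = MtM_inv @ Mp.T                       # M_p^#
alpha_p = Mp_sharp @ r
P_perp = np.eye(L) - Mp @ Mp_sharp              # P^perp_{M_p}

# Innovations information: beta = {m^T P^perp m}^{-1}
beta = 1.0 / (m_new @ P_perp @ m_new)

# Recursive update (9.20): no re-inversion of the full augmented matrix
top = alpha_p - beta * (Mp_sharp @ m_new) * (m_new @ P_perp @ r)
bottom = beta * (m_new @ P_perp @ r)
alpha_rec = np.append(top, bottom)

# Direct LS on the augmented matrix [M_p m_{p+1}] for comparison
M_aug = np.column_stack([Mp, m_new])
alpha_direct = np.linalg.solve(M_aug.T @ M_aug, M_aug.T @ r)
```

The two abundance vectors agree to machine precision, confirming that the recursion only needs the processed, new, and innovations information.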


If we further define α̂^LS_{p+1}(r) = (α̂^LS_{1|p+1}(r), ..., α̂^LS_{p|p+1}(r), α̂^LS_{p+1|p+1}(r))^T and α̂^LS_{p|p+1}(r) = (α̂^LS_{1|p+1}(r), ..., α̂^LS_{p|p+1}(r))^T, then

    α̂^LS_{p+1}(r) = [ α̂^LS_{p|p+1}(r)   ],
                    [ α̂^LS_{p+1|p+1}(r) ]

where

    α̂^LS_{p|p+1}(r) = α̂^LS_{p|p}(r) − β M_p^# m_{p+1} m_{p+1}^T P^⊥_{M_p} r,    (9.21)

    α̂^LS_{p+1|p+1}(r) = β m_{p+1}^T P^⊥_{M_p} r.    (9.22)

Apparently, (9.21) is a recursive equation used to update α̂^LS_{p+1}(r), that is, α̂^LS_{p|p}(r) ⟶ via (9.21) ⟶ α̂^LS_{p|p+1}(r) ⟶ via (9.22) ⟶ α̂^LS_{p+1|p+1}(r), which is exactly how a Kalman filter is implemented (Chap. 3). In addition,
    1_{p+1}^T (M_{p+1}^T M_{p+1})^{-1} 1_{p+1}
      = [1_p^T 1] [ (M_p^T M_p)^{-1} + β M_p^# m_{p+1} m_{p+1}^T (M_p^#)^T    −β M_p^# m_{p+1} ] [ 1_p ]
                  [ −β m_{p+1}^T (M_p^#)^T                                      β              ] [ 1   ]
      = 1_p^T (M_p^T M_p)^{-1} 1_p + β 1_p^T M_p^# m_{p+1} m_{p+1}^T (M_p^#)^T 1_p − β m_{p+1}^T (M_p^#)^T 1_p − β 1_p^T M_p^# m_{p+1} + β
      = 1_p^T (M_p^T M_p)^{-1} 1_p + β [(1_p^T M_p^# m_{p+1})^2 − 2 (1_p^T M_p^# m_{p+1}) + 1]
      = 1_p^T (M_p^T M_p)^{-1} 1_p + β [1_p^T M_p^# m_{p+1} − 1]^2.    (9.23)
By virtue of (9.23), P^⊥_{M,1} in (9.9) can be calculated recursively through the scalar 1^T (M_p^T M_p)^{-1} 1 in (9.23), and thus α̂^SCLS(r) in (9.8), α̂^NCLS(r) in (9.12), and α̂^FCLS(r) in (9.15) can also be calculated recursively. Note that sign(α̂^LS_{p+1}(r)) can be calculated directly from the obtained α̂^LS_{p+1}(r) = (M_{p+1}^T M_{p+1})^{-1} M_{p+1}^T r once M_{p+1}^# = (M_{p+1}^T M_{p+1})^{-1} M_{p+1}^T is calculated.
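The scalar recursion (9.23) is easy to sanity-check numerically. The sketch below (with a made-up random M_p and m_{p+1}) evaluates both the recursive right-hand side and the direct quantity 1^T (M_{p+1}^T M_{p+1})^{-1} 1:

```python
import numpy as np

rng = np.random.default_rng(4)
L, p = 8, 3
Mp = rng.random((L, p))
m_new = rng.random(L)

MtM_inv = np.linalg.inv(Mp.T @ Mp)
Mp_sharp = MtM_inv @ Mp.T
P_perp = np.eye(L) - Mp @ Mp_sharp
beta = 1.0 / (m_new @ P_perp @ m_new)
ones_p = np.ones(p)

# Right-hand side of (9.23): previous scalar plus a rank-one correction
prev_val = ones_p @ MtM_inv @ ones_p
recursive_val = prev_val + beta * (ones_p @ Mp_sharp @ m_new - 1.0) ** 2

# Direct evaluation of 1^T (M_{p+1}^T M_{p+1})^{-1} 1
M_aug = np.column_stack([Mp, m_new])
ones_p1 = np.ones(p + 1)
direct_val = ones_p1 @ np.linalg.inv(M_aug.T @ M_aug) @ ones_p1
```

Since the only new quantities are β and the scalar 1_p^T M_p^# m_{p+1}, the sum-to-one machinery of (9.8)–(9.9) can indeed be maintained without re-inverting the augmented matrix.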

9.3.3 RHSP-LSMA Fused by Two Signature-Varying Matrices

Unlike the previous subsection, where RHSP-LSMA is updated signature by signature, this subsection presents another advantage of RHSP-LSMA: it can fuse the LSMA results obtained by two different sets of signatures without reprocessing the complete set of signatures.

Assume that M_p and M_q are two disjoint signature matrices. Then we form a joint signature matrix M_{p+q} = [M_p M_q], which can be expressed as follows:

    M_{p+q}^T M_{p+q} = [ M_p^T ] [M_p M_q] = [ M_p^T M_p   M_p^T M_q ],    (9.24)
                        [ M_q^T ]             [ M_q^T M_p   M_q^T M_q ]

    (M_{p+q}^T M_{p+q})^{-1} = [ (M_p^T M_p)^{-1} + M_p^# M_q Γ M_q^T (M_p^#)^T    −M_p^# M_q Γ ],    (9.25)
                               [ −Γ M_q^T (M_p^#)^T                                  Γ           ]

where Γ = {M_q^T [I − M_p (M_p^T M_p)^{-1} M_p^T] M_q}^{-1} = {M_q^T P^⊥_{M_p} M_q}^{-1}, which can be more specifically expressed as Γ(p, q) to indicate its dependency on the matrices M_p and M_q, and

    α̂^LS_{p+q}(r) = (M_{p+q}^T M_{p+q})^{-1} M_{p+q}^T r
      = [ (M_p^T M_p)^{-1} M_p^T r + M_p^# M_q Γ M_q^T (M_p^#)^T M_p^T r − M_p^# M_q Γ M_q^T r ]
        [ −Γ M_q^T (M_p^#)^T M_p^T r + Γ M_q^T r                                               ]
      = [ α̂^LS_p(r) + M_p^# M_q Γ M_q^T (P_{M_p} − I) r ] = [ α̂^LS_p(r) − M_p^# M_q Γ M_q^T P^⊥_{M_p} r ],    (9.26)
        [ Γ M_q^T (I − P_{M_p}) r                        ]   [ α̂^LS_q(r)                                  ]

where (P^⊥_{M_p})^T = P^⊥_{M_p} owing to its symmetry, M_p^# M_p = I, and α̂^LS_q(r) = Γ M_q^T P^⊥_{M_p} r = {M_q^T P^⊥_{M_p} M_q}^{-1} M_q^T P^⊥_{M_p} r in (9.26) is an estimate of α_q(r) in a least-squares sense. Some comments on (9.26) are worthwhile.
1. Since P^⊥_{M_p} is also idempotent, that is, (P^⊥_{M_p})^2 = P^⊥_{M_p} and (P^⊥_{M_p})^T P^⊥_{M_p} = P^⊥_{M_p}, we have Γ = {[P^⊥_{M_p} M_q]^T [P^⊥_{M_p} M_q]}^{-1}. This gives rise to α̂^LS_q(r) = {[P^⊥_{M_p} M_q]^T [P^⊥_{M_p} M_q]}^{-1} M_q^T P^⊥_{M_p} r, where P^⊥_{M_p} r maps a data sample vector r onto the linear subspace ⟨M_p⟩^⊥ orthogonal to ⟨M_p⟩ spanned by M_p via P^⊥_{M_p}.
2. It is also interesting to note that α̂^LS_q(r) has exactly the same form as the LSOSP estimate obtained in (4.21) and (9.2), α̂_j^LSOSP(r) = (m_j^T P^⊥_U m_j)^{-1} m_j^T P^⊥_U r, with m_j and U (the signature matrix consisting of all signatures but m_j) replaced by M_q and M_p, respectively. As also noted in the first component of (9.26), M_p^# M_q Γ M_q^T P^⊥_{M_p} r, the vector P^⊥_{M_p} r is simply the residual of the data sample vector r leaking from the space ⟨M_p⟩ into the space ⟨M_p⟩^⊥.
3. If we let M_q be a singleton set consisting of only one signature, m_{p+1}, then (9.26) reduces to (9.20), where the matrix Γ becomes the scalar β.
As a final note, the results presented in this section can be extended to more than two sets of signatures by processing two disjoint signature matrices at a time with RHSP-LSMA. Conversely, RHSP-LSMA can also be used to break up the signature matrix M_p into two disjoint submatrices, M_r and M_s, with p = r + s.
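The fusion equations (9.25)–(9.26) can likewise be verified against a direct joint least-squares solve. A numpy sketch with two made-up disjoint signature sets:

```python
import numpy as np

rng = np.random.default_rng(5)
L, p, q = 10, 3, 2
Mp = rng.random((L, p))            # first set of signatures
Mq = rng.random((L, q))            # second, disjoint set
r = rng.random(L)

Mp_sharp = np.linalg.inv(Mp.T @ Mp) @ Mp.T
P_perp = np.eye(L) - Mp @ Mp_sharp
alpha_p = Mp_sharp @ r                         # LSMA result on M_p alone

# Innovations matrix Gamma = {M_q^T P^perp_{M_p} M_q}^{-1}, from (9.25)
Gamma = np.linalg.inv(Mq.T @ P_perp @ Mq)

# Fused abundances, (9.26): correct alpha_p and append alpha_q
top = alpha_p - Mp_sharp @ Mq @ Gamma @ Mq.T @ P_perp @ r
bottom = Gamma @ Mq.T @ P_perp @ r
alpha_fused = np.concatenate([top, bottom])

# Direct LS on the joint matrix [M_p M_q] for comparison
M_joint = np.column_stack([Mp, Mq])
alpha_direct = np.linalg.solve(M_joint.T @ M_joint, M_joint.T @ r)
```

The fused result matches the joint solve, so the two partial LSMA results plus the innovations matrix Γ carry all the information of the full problem.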

9.4 Adaptive RHSP-LSMA

The RHSP-LSMA developed in Sect. 9.3 assumes that the signature knowledge of
m1, m2, . . ., mp is provided a priori. In fact, this is generally not true in practical
applications, and such knowledge must be obtained directly from the data to be
processed. In doing so, two key issues must be addressed: determining the number
of signatures, p, that need to be generated and finding p signatures once the value of
p is determined. The ULSMA was the first method developed to deal with both
issues (Chang et al. 2010a, 2011b; also Chap. 17 in Chang 2013). It first estimated
appropriate values for p by virtual dimensionality (VD), developed in Chang (2003)
and Chang and Du (2004); then it used least-squares methods in Chang et al. (2010)
or component analysis in Chang et al. (2011b) to find the desired signatures, m1,
m2, . . ., mp. In this type of ULSMA, determining the value of p and finding
p desired signatures m1, m2, . . ., mp are decoupled. However, in reality, the
signatures used to form an LMM are actually closely related to the value of p, which
is in turn determined by various applications. In other words, if signatures are
selected adequately, the needed value of p will be smaller; otherwise, it may be
larger than its real value if the signatures are poorly selected. Thus, to cope with this
dilemma, this section presents a new alternative approach, called ARHSP-LSMA,
to resolve these two issues via ALMM defined by adaptively adjusting p signatures
used in LMM in (9.1) as

    r = M_p α_p + n,    (9.27)

where p is considered a variable and M_p = [m_1 m_2 ⋯ m_{p−1} m_p], with m_p being generated by RHSP-LSMA as the new pth signature.
Implementing ARHSP-LSMA requires several pieces of information, which
must be obtained and generated directly from the data to be processed: (1) how to
find appropriate signatures to be used to form an ALMM (9.27); (2) how to
determine the number of signatures needed for an ALMM; (3) how to implement
LSMA without repeatedly processing entire signatures for an ALMM. Each of
these is described in steps 2, 4, and 3, respectively, and summarized as follows.

ARHSP-LSMA
1. Initial condition: Find the initial signature, t_1^LSMA = arg max_r {r^T r}, over all data sample vectors r.
2. Design a particular algorithm to find the desired signatures to form an ALMM.
3. Implement RHSP-LSMA with the found signatures one by one recursively while the signatures are being generated.
4. Apply a specific stopping rule to determine whether ARHSP-LSMA is to be terminated, in which case the number of signatures, p, is automatically set at this stage.

In ARHSP-LSMA, finding the signatures in step 2 and applying a stopping rule in step 4 are very closely related to applications. For example, if LSMA is used for subpixel target detection or mixed target classification, finding unsupervised targets is the first priority, and the criterion that determines how many such targets there are should be coupled with the algorithm used to generate these targets. On the other hand, if LSMA is used for LSU to unmix data, an appropriate criterion for optimality may be the unmixed error, and the algorithm for finding signatures should be a spectral unmixing method. If LSMA is used to find endmembers, an appropriate criterion for optimality will be maximum simplex volume, and the signature finding algorithm should be one used to generate endmembers. In what follows, two approaches to generating signatures for ALMM are presented.

9.4.1 OSP-Based Finding Signatures for ALMM

To see how to find new signatures, we consider the second term of (9.21),

    β M_p^# m_{p+1} m_{p+1}^T P^⊥_{M_p} r,    (9.28)

which is an error-correction term when α̂^LS_{p|p}(r) is used to estimate α̂^LS_{p|p+1}(r) via a new signature, m_{p+1}. To make the best possible estimate of α̂^LS_{p|p+1}(r), we need to minimize this error term as much as we can. This is equivalent to minimizing β = {m_{p+1}^T P^⊥_{M_p} m_{p+1}}^{-1} or maximizing m_{p+1}^T P^⊥_{M_p} m_{p+1} in (9.28), both of which turn out to be the same thing. If we interpret U = M_p and d = m_{p+1}, then β is exactly the constant used to correct the estimation error in the LSOSP estimate of α̂^LSOSP_{p+1}(r), and α̂^LS_{p+1|p+1}(r) = β m_{p+1}^T P^⊥_{M_p} r, as derived in (9.22).

Now let ρ(m_{p+1}, M_p) be a measure defined by m_{p+1}^T P^⊥_{M_p} m_{p+1}. Since P^⊥_{M_p} is idempotent and (P^⊥_{M_p})^T = P^⊥_{M_p},

    ρ(m_{p+1}, M_p) = m_{p+1}^T P^⊥_{M_p} m_{p+1} = [P^⊥_{M_p} m_{p+1}]^T [P^⊥_{M_p} m_{p+1}] = ‖P^⊥_{M_p} m_{p+1}‖² = β^{-1}.    (9.29)

The significance of (9.29) is that ρ(m_{p+1}, M_p) measures the residual of m_{p+1} leaked into the orthogonal complement subspace ⟨M_p⟩^⊥, which determines how much energy the new signature m_{p+1} can contribute beyond M_p. In other words, maximizing (9.29) is equivalent to minimizing β, which in turn means minimizing the estimation error. This quantity provides an effective criterion for generating new signatures.

OSP-Based Algorithm for Finding New Signatures
1. Initial condition: Find the initial signature, t_1^OSP = arg max_r {r^T r}, over all data sample vectors r. Let M_1 = [t_1^OSP]. Let ε be a prescribed error threshold.
2. Signature generation: For p ≥ 2 find

    t_p^OSP = arg max_{m_p} {ρ(m_p, M_{p−1})} = arg max_{m_p} {‖P^⊥_{M_{p−1}} m_p‖²}.    (9.30)

3. Stopping rule: Use (9.29) and (9.30) to calculate

    ρ(t_p^OSP, M_{p−1})    (9.31)

and check whether (9.31) is less than ε. If it is, the algorithm is terminated. Otherwise, augment M_{p−1} by including this new signature t_p^OSP to form a new p-member signature matrix, M_p = [M_{p−1} t_p^OSP], let p ← p + 1, and go to step 2.

Note that the sequence {ρ(t_p^OSP, M_{p−1})}_{p≥2} is monotonically decreasing as p is increased. This is because ⟨M_{p−1}⟩ ⊂ ⟨M_p⟩ ⟹ ρ(t_p^OSP, M_{p−1}) ≥ ρ(t_{p+1}^OSP, M_p).

Interestingly, the preceding OSP-based algorithm can be shown to be the automatic target detection and classification algorithm (ATDCA) proposed by Ren and Chang (2003), which is commonly referred to as the automatic target generation process (ATGP) (Sect. 4.4.2.3).
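A minimal sketch of this OSP-based search in the spirit of ATGP, using the maximum-residual criterion (9.30) (the data matrix and helper name are illustrative, not from the book):

```python
import numpy as np

def osp_find_signatures(X, num):
    """OSP/ATGP-style search: repeatedly pick the sample with the largest
    residual ||P^perp_{M} r||^2 outside the space of signatures found so far.
    X is an L x N matrix whose columns are data sample vectors."""
    L, N = X.shape
    idx = [int(np.argmax(np.sum(X * X, axis=0)))]   # t_1 = arg max r^T r
    for _ in range(1, num):
        M = X[:, idx]
        P_perp = np.eye(L) - M @ np.linalg.pinv(M)   # P^perp_{M_{p-1}}
        resid = np.sum((P_perp @ X) ** 2, axis=0)    # rho(r, M_{p-1}) per sample
        idx.append(int(np.argmax(resid)))
    return idx

# Toy data: three strong "pure" columns (0, 1, 2) plus two weak mixtures
X = np.zeros((4, 5))
X[0, 0] = X[1, 1] = X[2, 2] = 10.0
X[:, 3] = [3.0, 3.0, 0.0, 0.0]
X[:, 4] = [2.0, 0.0, 2.0, 0.0]
found = osp_find_signatures(X, 3)   # picks the three pure columns
```

On this toy data the search recovers the three strong, mutually orthogonal columns before any mixture, which is the behavior the maximum-residual criterion is designed to produce.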

9.4.2 Linear Spectral Unmixing–Based Finding Signatures for ALMM

In addition to the OSP-based algorithm developed in Sect. 9.4.1, other algorithms can also be used to find signatures. For example, we can use LSU to calculate its unmixed error by

    LSU(M_p) = Σ_{i=1}^N (r_i − M_p α̂_p(r_i))^T (r_i − M_p α̂_p(r_i))    (9.32)

as a criterion to replace ρ(t_p^OSP, M_{p−1}) in (9.31), where {r_i}_{i=1}^N are all the data sample vectors. Unlike (9.31), which is monotonically decreasing and requires a prescribed error threshold to terminate the algorithm, (9.32) is not monotonically decreasing as M_p is increasingly augmented, owing to a model fitting error. In other words, when the set of {t_p} underrepresents the LMM, the unmixed error in (9.32) decreases as p grows until p reaches a certain value, which is optimal; after that, (9.32) begins to increase again because of an overfitting error resulting from a set of {t_p} that overrepresents the LMM. Thus, applying this fact, we can use this model fitting error resulting from the LMM to find the optimal value of p, p*, as follows:

    p* = arg min_{p≥2} {LSU(M_p)}.    (9.33)

Note that ideas similar to (9.33) are also explored by HySime and GENE (Ambikapathi et al. 2013).

LSU-Based Algorithm for Finding New Signatures
1. Initial condition: Find the initial signature, t_1^LSU = arg max_r {r^T r}, over all data sample vectors r. Let M_1 = [t_1^LSU].
2. Signature generation: For p ≥ 2 find

    t_p^LSU = arg max_r {LSU(M_{p−1}; r)},    (9.34)

which is the data sample vector that yields the maximum unmixed error among all data sample vectors. The reason for maximizing (9.34) is that t_p^LSU causes the maximum unmixed error and should therefore be included in the signature matrix to reduce the unmixed error.
3. Augment M_p = [M_{p−1} t_p^LSU], and check whether p = L, where L is the total number of spectral bands. If it is not, go to step 2. Otherwise, continue.
4. Stopping rule: Find the minimal unmixed error over p via (9.33). It is worth noting that p* is largely dependent on the set of found {t_p^LSU} used to form the LMM. Different algorithms find different sets of {t_p^LSU}, which of course results in different values of p*.

In the LSU-based algorithm, there is no specific algorithm prescribed to find α̂_p in (9.32). Thus, when (9.32) is computed, three types of least-squares algorithms can be used: an unconstrained least-squares method, such as α̂_p^LS specified by (9.2); a partially abundance-constrained least-squares method, such as α̂_p^NCLS specified by (9.11); or the fully abundance-constrained least-squares method, α̂_p^FCLS specified by (9.15). If the FCLS method is used to find t_p^LSU via (9.34) in step 2, the LSU-based algorithm becomes the unsupervised fully constrained least-squares (UFCLS) method in Sect. 4.4.2.1.3.
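A rough numpy sketch of the LSU-based search, using the unconstrained LS unmixer of (9.2) in place of NCLS/FCLS for simplicity (function name and toy data are illustrative):

```python
import numpy as np

def lsu_find_signatures(X, num):
    """LSU-style search: grow M with the sample whose per-pixel unmixed
    error is largest, as in (9.34). Uses unconstrained LS abundances;
    the book also allows NCLS or FCLS unmixers here."""
    L, N = X.shape
    idx = [int(np.argmax(np.sum(X * X, axis=0)))]    # initial signature
    errors = []
    for _ in range(1, num):
        M = X[:, idx]
        A = np.linalg.pinv(M) @ X      # LS abundances for every sample
        R = X - M @ A                  # per-sample unmixing residuals
        err = np.sum(R * R, axis=0)
        errors.append(float(err.sum()))              # LSU(M_p) of (9.32)
        idx.append(int(np.argmax(err)))              # (9.34)
    return idx, errors

# Same toy data shape as before: three strong pure columns plus mixtures
X = np.zeros((4, 5))
X[0, 0] = X[1, 1] = X[2, 2] = 10.0
X[:, 3] = [3.0, 3.0, 0.0, 0.0]
X[:, 4] = [2.0, 0.0, 2.0, 0.0]
idx, errors = lsu_find_signatures(X, 3)
```

On this toy data the total unmixed error drops as each new maximum-error sample is admitted, mirroring the underfitting-to-optimal behavior described for (9.32)–(9.33).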

9.5 Determination of Number of Signatures for ARHSP-LSMA

This section follows the concept of target-specified virtual dimensionality (TSVD) developed in Chap. 4. The idea behind it is to use real target samples generated by a specific algorithm as signal sources for a Neyman–Pearson detection problem.

We assume that for each 1 ≤ p ≤ L, M_{p−1} is the target space linearly spanned by the p − 1 targets {t_j^{RHSP-LSMA}}_{j=1}^{p−1} previously found by RHSP-LSMA. Here we use "RHSP-LSMA" as a generic term to represent any algorithm used by LSMA, which can be an OSP-based algorithm, UFCLS, any endmember finding algorithm, or an unsupervised target detection algorithm such as ATGP. Then, for 1 ≤ p ≤ L, we assume that M_0 = ∅ and ‖P^⊥_{M_0} t_1^{RHSP-LSMA}‖² = (t_1^{RHSP-LSMA})^T t_1^{RHSP-LSMA}. Then we can derive a stopping measure defined by

    η_p = ‖P^⊥_{M_{p−1}} t_p^{RHSP-LSMA}‖²,    (9.35)

where η_p in (9.35) is the maximum residual of the pth target data sample, t_p^{RHSP-LSMA}, found by RHSP-LSMA and leaked from ⟨M_{p−1}⟩ into ⟨M_{p−1}⟩^⊥, the complement space orthogonal to ⟨M_{p−1}⟩. It should be noted that since the targets {t_p^{RHSP-LSMA}}_{p=1}^L obtained by the RHSP-LSMA algorithm may be highly correlated, η_p in (9.35) is used instead because η_p represents the maximum residual of t_p^{RHSP-LSMA} leaked into ⟨M_{p−1}⟩^⊥.
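A small numpy sketch of computing the stopping measure η_p of (9.35) for a sequence of found targets (the targets here are made-up random vectors standing in for t_p^{RHSP-LSMA}):

```python
import numpy as np

rng = np.random.default_rng(6)
L = 6
targets = [rng.random(L) for _ in range(4)]   # stand-ins for t_p^{RHSP-LSMA}

etas = []
for p, t in enumerate(targets):
    if p == 0:
        # M_0 is empty, so the projector is the identity: eta_1 = t_1^T t_1
        etas.append(float(t @ t))
    else:
        M = np.column_stack(targets[:p])               # M_{p-1}
        P_perp = np.eye(L) - M @ np.linalg.pinv(M)     # P^perp_{M_{p-1}}
        etas.append(float(t @ P_perp @ t))             # eta_p, (9.35)
```

Each η_p is a nonnegative residual energy bounded above by ‖t_p‖², shrinking as more of t_p is explained by the previously found targets.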
It is this sequence, {η_p}, given by (9.35) that will be used as the signal source in a binary composite hypothesis testing problem to determine whether or not the pth potential target candidate, t_p^{RHSP-LSMA}, is a true target, by a detector formulated as follows:

    H_0: η_p ~ p(η_p | H_0) = p_0(η_p)
    versus                                        for p = 1, 2, ..., L,    (9.36)
    H_1: η_p ~ p(η_p | H_1) = p_1(η_p)

where the alternative hypothesis H_1 and the null hypothesis H_0 represent the two cases of t_p^{RHSP-LSMA} being a target signal source under H_1 and not being a target signal source under H_0, respectively, in the sense that H_0 represents the maximum residual resulting from the background signal sources, while H_1 represents the maximum residual leaked from the target signal sources. Note that when p = 0, η_0 is undefined in (9.35). To make (9.36) work, we need to find the probability distributions under both hypotheses. The assumption made on (9.35) is that if a signal source is not an endmember under H_0, it should be considered part of the background, which can be characterized by a Gaussian distribution. On the other hand, if a signal source is indeed a desired target signal source, it should be uniformly distributed over the range from 0 to η_{p−1}. As shown in Sect. 4.5.1, the probability distribution under each hypothesis can be obtained by

    p(H_0 | η_p) = η_p p_{ν_p}(η_p) / [η_p p_{ν_p}(η_p) + F_{ν_p}(η_p)],    (9.37)

and the a posteriori probability distribution p(H_1 | η_p) is given by

    p(H_1 | η_p) = F_{ν_p}(η_p) / [η_p p_{ν_p}(η_p) + F_{ν_p}(η_p)].    (9.38)

By virtue of (9.37) and (9.38), a Neyman–Pearson detector, denoted by δ^NP_{ARHSP-LSMA}(η_p), for the binary composite hypothesis testing problem specified by (9.36) can be obtained by maximizing the detection power P_D, with the false alarm probability P_F being fixed at a specific given value, α, which determines the threshold value τ_p in the following randomized decision rule:

    δ^NP_{ARHSP-LSMA}(η_p) = { 1,                     if Λ(η_p) > τ_p,
                             { 1 with probability κ,  if Λ(η_p) = τ_p,    (9.39)
                             { 0,                     if Λ(η_p) < τ_p,

where the likelihood ratio test Λ(η_p) is given by Λ(η_p) = p_1(η_p)/p_0(η_p), with p_0(η_p) and p_1(η_p) given by (9.37) and (9.38). Thus, the case of η_p in (9.35) exceeding τ_p indicates that δ^NP_{ARHSP-LSMA}(η_p) in (9.39) fails the null hypothesis test, in which case t_p^{RHSP-LSMA} is assumed to be an endmember. Note that the test in (9.39) must be performed for each of the L potential endmember candidates, and the threshold τ_p therefore varies with p. Using (9.39), the RHSP-LSMA-specified VD, n_RHSP-LSMA, can be determined by

    VD^NP_{ARHSP-LSMA}(P_F) = Σ_{p=1}^L δ^NP_{ARHSP-LSMA}(η_p),    (9.40)

where P_F is a predetermined false alarm probability and δ^NP_{ARHSP-LSMA}(η_p) = 1 only if the test in (9.39) is passed; otherwise, δ^NP_{ARHSP-LSMA}(η_p) = 0.

In analogy with Sect. 4.5.3, we can also replace the signal energy specified by η_p in (9.35) with the signal strength √η_p specified by

    √η_p = ‖P^⊥_{M_{p−1}} t_p^{RHSP-LSMA}‖.    (9.41)

As an example, we consider the HYperspectral Digital Imagery Collection Experiment (HYDICE) image scene shown in Fig. 9.1a (and Fig. 1.10a), which has a size of 64 × 64 pixel vectors with 15 panels, and the ground truth map in Fig. 9.1b (Fig. 1.10b). It was acquired by 210 spectral bands with a spectral coverage from 0.4 to 2.5 μm. Low-signal/high-noise bands, bands 1–3 and 202–210, and water vapor absorption bands, bands 101–112 and 137–153, were removed. Thus, a total of 169 bands were used in the experiments. The spatial resolution and spectral resolution of this image scene are 1.56 m and 10 nm, respectively.

Fig. 9.1 (a) HYDICE panel scene containing 15 panels; (b) ground truth map of spatial locations of the 15 panels

Table 9.1 n_RHSP-LSMA via VD^NP_{ARHSP-LSMA}(P_F)

Signal sources used in (9.40) and (9.41)           | P_F = 10^-1 | 10^-2 | 10^-3 | 10^-4 | 10^-5
η_p = ‖P^⊥_{M_{p−1}} t_p^{RHSP-LSMA}‖²  (ATGP)     |     43      |  44   |  43   |  41   |  40
η_p = ‖P^⊥_{M_{p−1}} t_p^{RHSP-LSMA}‖²  (UNCLS)    |    107      | 107   | 107   | 107   | 107
η_p = ‖P^⊥_{M_{p−1}} t_p^{RHSP-LSMA}‖²  (UFCLS)    |    116      | 116   | 116   | 116   | 116
√η_p = ‖P^⊥_{M_{p−1}} t_p^{RHSP-LSMA}‖  (ATGP)     |     41      |  43   |  41   |  40   |  39
√η_p = ‖P^⊥_{M_{p−1}} t_p^{RHSP-LSMA}‖  (UNCLS)    |    107      | 107   | 107   | 107   | 107
√η_p = ‖P^⊥_{M_{p−1}} t_p^{RHSP-LSMA}‖  (UFCLS)    |    116      | 116   | 116   | 116   | 116
HFC method                                         |     14      |  11   |   9   |   9   |   7
NWHFC method                                       |     20      |  14   |  13   |  13   |  13

Table 9.1 tabulates the values of n_RHSP-LSMA via VD_NP^{A_RHSP-LSMA}(P_F) determined by (9.40) and (9.41) with various false alarm probabilities P_F, using three unsupervised target finding algorithms: the OSP-based ATGP, described in Sect. 4.4.2.3, and two LSE-based methods, UNCLS and UFCLS, described in Sects. 4.4.2.2.2 and 4.4.2.2.3. It should be noted that since ATGP is the same as ULSOSP in Sect. 4.4.2.2.1 scaled by a constant, their TSVD results are the same. Also included in Table 9.1 are the VD results estimated by the Harsanyi–Farrand–Chang (HFC)/noise-whitened HFC (NWHFC) method tabulated in Table 4.7 in Sect. 4.4 for comparison.
9.5 Determination of Number of Signatures for ARHSP-LSMA 283

Interestingly, according to Table 9.1, the value of n_RHSP-LSMA produced by UNCLS using the signal energy in (9.40) and the signal strength in (9.41) was the same and constant across the board, 107. The same finding also applies to UFCLS, which produced 116. To further illustrate this, Fig. 9.2 plots the values of (9.40) and (9.41) versus the targets {t_p^{RHSP-LSMA}}_{p=1}^{L} generated by ATGP, UNCLS, and UFCLS.

Fig. 9.2 Plots of the values of (9.40) and (9.41) versus the targets {t_p^{RHSP-LSMA}}_{p=1}^{L} generated by ATGP, UNCLS, and UFCLS (one row per algorithm; left column: maximum residual energy versus iteration; right column: maximum residual length versus iteration, both on logarithmic scales)

As shown in Table 9.1, the value of n_RHSP-LSMA is always greater than the value of VD estimated by eigenanalysis-based methods, such as the HFC and NWHFC methods. Interestingly, the value of n_RHSP-LSMA increases with the number of abundance constraints imposed on the target finding algorithm used. For example, ATGP produced the smallest value of n_RHSP-LSMA, while UFCLS produced the largest.

9.6 HYDICE Experiments

To extract panel pixels in the five different rows of the HYDICE scene in Fig. 9.1, the value of VD, n_VD, must be at least 18, as demonstrated in Chang et al. (2010a, 2011b). However, according to Table 9.1, the value of n_RHSP-LSMA via VD_NP^{A_RHSP-LSMA}(P_F) must be at least 39. In this case, we chose these two values to form an ALMM to perform ARHSP-LSMA, where Fig. 9.3a, b shows the 18 and 39 RHSP-LSMA-extracted target pixels, respectively.
As shown in Fig. 9.3a, RHSP-LSMA could extract one panel pixel from each of the five different rows. This implies that 18 is the minimum number of target pixels that must be generated by ARHSP-LSMA. Table 9.2 tabulates the panel pixels found among target pixels 1–9 and 10–18 extracted by RHSP-LSMA in Fig. 9.3a.
Table 9.3 tabulates the abundance fractions of 20 panel pixels, including 19 R
panel pixels and a Y panel pixel, p212, estimated by FCLS using target pixels
extracted by RHSP-LSMA in Fig. 9.3a, b as signatures to form ALMM. The first
and second columns of Table 9.3 tabulate the FCLS-estimated abundance fractions
of the 1st to 9th RHSP-LSMA-generated target pixels, including 3 R panel pixels,
p11, p312, p521, and the 10th to 18th RHSP-LSMA-generated target pixels, including
1 R panel pixel, p521, and 1 Y panel pixel, p212, given in Table 9.2, respectively. As
we can see from Table 9.3, no R panel pixels in rows 2 and 4 were estimated in the
second column since no panel pixels were found by RHSP-LSMA in its first nine
target pixels. Similarly, no panel pixels in rows 1, 3, and 5 were found by RHSP-
LSMA in its 10th–18th target pixels (Fig. 9.3a). Thus, no abundance fractions of the
panel pixels in rows 1, 3, and 5 were estimated in the third column. However, we
can also fuse the results in the second and third columns via (9.26) to obtain the
results in the fourth column, which turn out to be the same as the results in the fifth
column obtained by applying FCLS directly to the signatures made up of the 1st to
18th target pixels. Following the same treatment, we can also fuse the FCLS-
estimated abundance fraction results of 20 panel pixels using the first 18 RHSP-
LSMA-generated target pixels with FCLS-estimated abundance fraction results of
20 panel pixels using the 19th to 39th RHSP-LSMA-generated target pixels in
Fig. 9.3b to find the FCLS-estimated abundance fractions of the 20 panel pixels in
the sixth column, which are also equal to the FCLS-estimated abundance fraction

Fig. 9.3 Target pixels extracted by RHSP-LSMA. (a) 18 RHSP-LSMA-extracted target pixels; (b) 39 RHSP-LSMA-extracted target pixels

Table 9.2 Panel pixels extracted by RHSP-LSMA

Target pixels extracted by RHSP-LSMA      Panel pixels extracted by RHSP-LSMA
Target pixels 1–9                         p11, p312, p521
Target pixels 10–18                       p212, p411

results of the 20 panel pixels in the seventh column using the 39 RHSP-LSMA-
generated target pixels in Fig. 9.3b.
As a final concluding remark, despite the fact that the popular Cuprite scene has
been widely studied for endmember extraction, it is impossible to use this scene in

Table 9.3 Abundance fractions of 20 panel pixels estimated by RHSP-LSMA

        α_{1–9}      α_{10–18}    α_{9+9}     α_{1–18}    α_{18+21}   α_{1–39}
p11     1.000        –            1.000       1.000       1.000       1.000
p12     0.488        –            0.479       0.479       0.500       0.500
p13     0.252        –            0.088       0.088       0.138       0.138
p211    –            0.934        0.782       0.782       0.822       0.822
p212    –            1.000        1.000       1.000       1.000       1.000
p221    –            0.968        0.857       0.857       0.839       0.839
p22     –            0.802        0.648       0.648       0.697       0.697
p23     –            0.441        0.364       0.364       0.363       0.363
p311    0.910        –            0.888       0.888       0.874       0.874
p312    1.000        –            1.000       1.000       1.000       1.000
p32     0.508        –            0.583       0.583       0.572       0.572
p33     0.335        –            0.352       0.352       0.352       0.352
p411    –            1.000        1.000       1.000       1.000       1.000
p412    –            1.025        0.802       0.802       0.833       0.833
p42     –            0.822        0.829       0.829       0.809       0.809
p43     –            0.282        0.319       0.319       0.279       0.279
p512    0.718        –            0.687       0.687       0.705       0.705
p521    1.000        –            1.000       1.000       1.000       1.000
p52     0.779        –            0.731       0.731       0.740       0.740
p53     0.151        –            0.067       0.067       0.123       0.123
LSE     2.21×10^4    2.17×10^4    6.3×10^3    6.32×10^3   2.57×10^3   2.57×10^3

our experiments to evaluate data unmixing owing to insufficient ground truth about the background within the scene, which actually plays a key role in LSMA. Nonetheless, the experiments conducted on the HYDICE scene should be sufficient to justify our findings.

9.7 Conclusions

LSMA is generally performed with signatures, which are used to form an LMM to
be provided a priori. In reality, such signature knowledge must be obtained directly
from the data to be processed since the assumed signature knowledge may not be
reliable. However, to extend LSMA in an unsupervised manner, two main issues
need to be addressed: how many signatures are needed and how these signatures are
to be found. This chapter developed a theory of adaptive recursive hyperspectral
sample processing of LSMA (ARHSP-LSMA) that deals with these two issues
altogether as a whole. Many benefits can be gained from ARHSP-LSMA that
cannot be found in LSMA. (1) The LMM used by LSMA is extendable to an
adaptive LMM (ALMM). (2) To find signatures to update ALMM, ARHSP-LSMA
is developed to allow LSMA to perform recursively signature by signature.
(3) ARHSP-LSMA can fuse LSMA results obtained using two different sets of
signatures without reprocessing all signatures. (4) With the development of
ARHSP-LSMA, the innovations information provided by the correlation between
processed information and new signature information in the recursive equation
provides a means of finding new signatures that can adapt to different applications.
(5) The signatures found by RHSP-LSMA can be used for TSVD (Chap. 4) to
develop RHSP-LSMA-specified VD, nRHSP-LSMA, which further determines how
many signatures are needed to perform ARHSP-LSMA. (6) By virtue of ARHSP-
LSMA, there is no need for data storage or complex calculations. With such
advantages, ARHSP-LSMA can be very useful in data transmission and commu-
nication. (7) Most importantly, ARHSP-LSMA can take advantage of ALMM to
fuse LSMA results obtained by multiple different sets of signatures without
reprocessing the complete set of signatures. This is particularly useful in satellite
data communication when data processing can be carried out at different locations
by different users. This idea is similar to that of parallel processing, where each
processor corresponds to data processing using one set of signatures. (8) Finally,
owing to its recursive structures, ARHSP-LSMA can be realized in hardware
design, such as in Field Programmable Gate Array implementation.
Chapter 10
Recursive Hyperspectral Sample Processing
of Maximum Likelihood Estimation

Abstract Chapter 9 presents a theory of recursive hyperspectral sample processing


of linear spectral mixture analysis (RHSP-LSMA) to form an adaptive linear
mixing model (ALMM) that can adapt to the signatures, referred to as virtual
signatures (VSs), generated directly from data in an unsupervised and recursive
manner. This chapter considers an alternative approach to RHSP-LSMA, called
recursive hyperspectral sample processing of maximum likelihood estimation
(RHSP-MLE), that uses MLE error instead of orthogonal projection (OP) residual
used by recursive hyperspectral sample processing of OSP (RHSP-OSP) in Chap. 8
and least-squares error (LSE) used by RHSP-LSMA as a criterion to generate VSs.
Following approaches similar to those described in Chaps. 8 and 9, this chapter
develops a theory of RHSP-MLE in conjunction with ALMM to generate unknown
VSs recursively in an unsupervised manner, as RHSP-LSMA does, while
implementing a binary composite hypothesis testing-based Neyman–Pearson detec-
tor (NPD) at the same time to automatically determine when RHSP-MLE should
stop generating VSs.

10.1 Introduction

Maximum likelihood estimation (MLE) has been widely used in multispectral


imaging (Landgrebe 2003; Richards and Jia 1999). Basically, it is a pure pixel spatial
domain-based classification technique. In early years linear spectral mixture anal-
ysis (LSMA) was developed by Adams and Smith (1986) and Adams et al. (1989,
1993) and later by Shimabukuro and Smith (1991) and Settle and Drake (1993) to
address the mixed nature present in a single pixel. The orthogonal subspace
projection (OSP) approach by Harsanyi and Chang (1994) is believed to be the
first work to look into LSMA from a statistical signal processing perspective for
mixed pixel classification. Interestingly, Settle (1996) subsequently showed that if
MLE was considered a least-squares error (LSE)-based estimator rather than a
classification method, then OSP can be realized by MLE on the assumption that
the noise in the linear mixing model (LMM) used is an additive Gaussian random
process. However, despite the fact that both MLE and OSP produce identical

© Springer International Publishing Switzerland 2017
C.-I Chang, Real-Time Recursive Hyperspectral Sample and Band Processing,
DOI 10.1007/978-3-319-45171-8_10

results, Chang (1998) further showed that MLE and OSP were actually two
different approaches designed with different rationales. More specifically, if MLE
performs as an estimator and not a classifier, as is common in traditional remote
sensing, it is indeed an a priori estimation approach with a Gaussian assumption
(Poor 1994), while OSP is an a posteriori signal detection technique that uses
signal-to-noise ratio (SNR) as an optimal criterion without making an additive
Gaussian noise assumption (Tu et al. 1997; Chang 2005, 2013). In particular, the
signatures used to form an LMM can be found directly from the data to be
processed, referred to as virtual signatures (VSs) in Chang et al. (2010a, 2011b).
The relationship between MLE and LSMA in mixed pixel classification was further
explored in Chang et al. (1998) and Chang (2007b). With this in mind we can
expect that there should also be a very close relationship between recursive
hyperspectral sample processing of linear spectral mixture analysis (RHSP-
LSMA) and recursive hyperspectral sample processing of maximal likelihood
estimation (RHSP-MLE) to be derived in this chapter. Accordingly, for least
squares (LS)-based LSMA the three criteria LSE (Chang et al. 1998b), maximal
OSP residual (Chang et al. 2011c), and MLE error (Settle 1996) can be used to find
VSs for LSMA to unmix data.
Similarly, as discussed in Chap. 9, two major issues are associated with finding
VSs. One is the number of VSs that must be determined. The other issue is how to
find VSs to be used by LSMA for data unmixing. Despite the fact that the first issue
can be addressed by virtual dimensionality (VD) defined in Chap. 4, in many reports
the value of VD really has nothing to do with LSMA because it was not determined
by real signatures found in the data to be processed, as pointed out by Chang
et al. (2014). To resolve this dilemma, we can use target-specified VD developed in
Chap. 4 to produce LS-specified VSs for LSMA. With regard to the second issue,
one interesting approach is called minimum estimated abundance covariance
(MEAC), recently proposed by Du (2012). It assumes that the LMM can be
described by a Gaussian distribution and then the resulting Gaussian MLE error
matrix used as a criterion to produce real data samples as VSs. To facilitate the
process, it further develops a fast procedure to find desired VSs via a recursive
equation. Unfortunately, MEAC did not really address the first issue since it did not
use its generated VSs but rather used VD to determine the number of VSs needed
for data unmixing by LSMA. As mentioned earlier, VD is an eigen-based but not an
LS-based criterion. This chapter addresses this issue and further develops an RHSP-
MLE that extends MEAC to deal with the first issue. In addition to RHSP-MLE,
two other recursive approaches, OSP-based recursive hyperspectral sample
processing of OSP (RHSP-OSP), derived in Chap. 8, and LSMA-based recursive
hyperspectral sample processing of LSMA (RHSP-LSMA), derived in Chap. 9, are
also included for comparative study and analysis. All three of these recursive
approaches take advantage of Neyman–Pearson detection theory (Poor 1994; Chang et al. 2011c) to determine the number of VSs. It turns out that the
number of VSs is generally much higher than the value determined by VD. This
makes perfect sense because VSs are not necessarily pure and many of them may
also be background mixed signatures. As shown in Chang et al. (2010a, 2011b),

Chang (2013, 2016), and Gao et al. (2015), for LSMA to work effectively, it is
essential to include background signatures for data unmixing. The experimental
results conducted in this chapter also confirm this fact.

10.2 Criteria for Finding Virtual Signatures for LSMA

Assume that L is the total number of spectral bands and r is an L-dimensional image pixel vector. Assume that m_1, m_2, ..., m_p are p material substance signatures. An LMM models the spectral signature of r as a linear combination of m_1, m_2, ..., m_p with appropriate abundance fractions specified by α_1, α_2, ..., α_p. More precisely, r is an L × 1 column vector and M_p an L × p substance spectral signature matrix, denoted by M_p = [m_1 m_2 ... m_{p-1} m_p], where m_j is an L × 1 column vector represented by the spectral signature of the jth substance t_j resident in the pixel vector r. Let α_p = (α_1, α_2, ..., α_p)^T be a p × 1 abundance column vector associated with r, where α_j denotes the abundance fraction of the jth substance signature m_j present in the pixel vector r. To restore the pixel vector r, we assume that the spectral signature of the pixel vector r is linearly mixed by m_1, m_2, ..., m_p as follows:

  r = M_p α_p + n,   (10.1)

where n is noise or can be interpreted as a measurement or model error. Now suppose that m_{p+1} is a new VS to be added to M_p to form a new (p+1)-VS signature matrix M_{p+1} = [m_1 m_2 ... m_p m_{p+1}] = [M_p m_{p+1}], and (10.1) becomes a new LMM mixed by p+1 signatures, m_1, m_2, ..., m_{p+1}, as follows:

  r = M_{p+1} α_{p+1} + n.   (10.2)

A key issue that arises in (10.2) is how to find a desired mp+1. In what follows,
we develop three LS-based criteria to find VSs to be used for LSMA.
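The models (10.1) and (10.2) can be exercised on a small synthetic example; all dimensions and values below are arbitrary stand-ins, not data from the book.

```python
import numpy as np

rng = np.random.default_rng(0)
L, p = 50, 4                              # L spectral bands, p signatures

Mp = rng.random((L, p))                   # signature matrix M_p
alpha_p = np.array([0.4, 0.3, 0.2, 0.1])  # abundance vector alpha_p
n = 0.01 * rng.standard_normal(L)         # additive noise / model error

r = Mp @ alpha_p + n                      # the LMM of (10.1)

# Growing the model with a new virtual signature m_{p+1}, as in (10.2):
m_new = rng.random((L, 1))
Mp1 = np.hstack([Mp, m_new])              # M_{p+1} = [M_p  m_{p+1}]
print(Mp1.shape)  # -> (50, 5)
```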

10.2.1 Least-Squares LSMA

A classical approach to solving (10.2) is the least-squares estimation given by

  α̂^LS_{p+1}(r) = (M^T_{p+1} M_{p+1})^{-1} M^T_{p+1} r = M^#_{p+1} r,   (10.3)

where α̂^LS_{p+1}(r) = (α̂^LS_1(r), α̂^LS_2(r), ..., α̂^LS_{p+1}(r))^T and α̂^LS_j(r) is the abundance fraction of the jth substance signature m_j estimated from the data sample vector r.

In (10.3), the subscript “p+1” of α̂^LS_{p+1}(r) is specifically used to emphasize that α̂^LS_{p+1}(r) is obtained by p+1 signatures, and the boldfaced α̂ is used to indicate that α̂^LS_{p+1}(r) is an abundance vector, not the scalar α̂^LS_{p+1}(r) obtained by Chang et al. (1998b) and Chang (2003, 2005, 2013) as follows:

  α̂^LS_{p+1}(r) = (m^T_{p+1} P⊥_{M_p} m_{p+1})^{-1} m^T_{p+1} P⊥_{M_p} r,   (10.4)

where

  P⊥_{M_p} = I − M_p M^#_p = I − M_p (M^T_p M_p)^{-1} M^T_p   (10.5)

and M^#_p is the pseudo-inverse of M_p given by (M^T_p M_p)^{-1} M^T_p. In particular,

  (m^T_{p+1} P⊥_{M_p} m_{p+1})^{-1}   (10.6)

is included to account for the LSE resulting from estimating the abundance fraction α_{p+1} of m_{p+1}. As a consequence, minimizing (10.6) provides an optimal criterion to find a desired VS:

  t^LS_{p+1} = arg min_{m_{p+1}} {(m^T_{p+1} P⊥_{M_p} m_{p+1})^{-1}}
            = arg max_{m_{p+1}} {m^T_{p+1} P⊥_{M_p} m_{p+1}}.   (10.7)

Using (10.7), an algorithm for finding new VSs can be described as follows.

LS-Based Algorithm for Finding New VSs

1. Initial condition:
   Let t^LS_1 = arg max_r {r^T r}, where r runs over all image pixel vectors, and set p = 1.
2. Use (10.7) to find t^LS_{p+1}.
3. If a stopping rule is satisfied, the algorithm is terminated. Otherwise, let p ← p + 1 and go to step 2.
As a concluding remark, it is worth mentioning that the preceding algorithm is
quite different from the LSE-based algorithms in Sect. 9.4.2 in the sense that the
former uses the LSE correction term (10.6) as a criterion to produce VSs, whereas
the latter makes use of unmixed linear spectral unmixing (LSU) error specified by
(9.32) as a criterion to generate signatures.
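A minimal, non-recursive sketch of this search on hypothetical synthetic data follows; note that the pseudo-inverse of the growing M_p is recomputed at every step, which is exactly the cost the recursive algorithms of Sect. 10.3 remove.

```python
import numpy as np

def perp_projector(M):
    """P_perp_M = I - M (M^T M)^{-1} M^T, as in (10.5)."""
    return np.eye(M.shape[0]) - M @ np.linalg.pinv(M)

def find_vs_ls(X, n_vs):
    """Grow M_p one pixel at a time, each time taking the pixel that
    maximizes m^T P_perp_{M_p} m, i.e., the criterion (10.7).
    X is (L, N): N pixel vectors of L bands."""
    idx = [int(np.argmax(np.sum(X**2, axis=0)))]  # step 1: brightest pixel
    for _ in range(n_vs - 1):
        P = perp_projector(X[:, idx])
        scores = np.sum(X * (P @ X), axis=0)      # m^T P_perp m per pixel
        idx.append(int(np.argmax(scores)))
    return idx

rng = np.random.default_rng(1)
X = rng.random((20, 100))
sel = find_vs_ls(X, 3)
print(len(sel))  # -> 3
```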

10.2.2 Orthogonal Projection-Based LSMA

This section presents an approach similar to the OSP-based algorithm described in


Sect. 9.4.1. In solving (10.2), Harsanyi and Chang offered a completely different approach (Harsanyi et al. 1994). It converts the LS problem (10.2) into a signal detection problem in which SNR is used to replace least squares as a performance measure. Specifically, it looks into a signal detection approach by detecting the abundance fraction of a particular substance signature, say, the abundance fraction of the (p+1)st substance signature m_{p+1}, denoted by α̂^OSP_{p+1} in (10.2). Considering (10.2) as a standard signal detection model, M_{p+1}α_{p+1} is a desired signal vector to be detected and n is corrupting noise. Since M_{p+1}α_{p+1} is a signal mixed by m_1, m_2, ..., m_{p+1}, a direct use of a signal detection technique is not applicable. The idea of Harsanyi and Chang's approach is to divide the set of p+1 substance signatures, m_1, m_2, ..., m_{p+1}, into two groups of signals, one for a desired substance, for example, m_{p+1}, and the other consisting of the undesired substances, m_1, m_2, ..., m_p. In this case, the undesired substances m_1, m_2, ..., m_p can be considered interferers to m_{p+1}, and their interfering effect can be annihilated prior to the detection of m_{p+1}. With the annihilation of the undesired substance signatures, the detectability of m_{p+1} can be enhanced. In doing so, we first need to separate m_{p+1} from m_1, m_2, ..., m_p in M_{p+1} and rewrite (10.2) as

r ¼ αpþ1 mpþ1 þ Mp γ þ n; ð10:8Þ


 
where Mp ¼ m1 m2    mp is the undesired substance spectral signature matrix
made up of m1, m2, . . ., mp. Here, without loss of generality, we assume that the
desired substance is a single substance signature mp+1. According to the derivation
provided in Harsanyi and Chang (1994), the solution to (10.2) can be obtained by
signal detection theory using SNR as an optimal criterion to produce an OSP solution
given by

^ OSP
α pþ1 ðrÞ ¼ mpþ1 PMp r:
T
ð10:9Þ

The concept of (10.9) was extended to an unsupervised version of OSP, known


as the automatic target detection and classification algorithm (ATDCA) in Ren and
Chang (2003), which becomes the well-known automatic target generation process
(ATGP) (Chang 2013). It can be considered an unsupervised and unconstrained
OSP technique that performs a succession of orthogonal subspace projections
specified by (10.9) to find a set of sequential data sample vectors that represents
targets of interest as follows.

OSP-Based Algorithm for Finding New VSs

1. Initial condition:
   Find t^OSP_1 = arg max_r {r^T r}. Set p = 1.
2. Apply P⊥_{M_p} via (10.5), with M_p = [t^OSP_1 ... t^OSP_p], to all pixels m_{p+1} in the image to find the (p+1)th target t^OSP_{p+1} by

     t^OSP_{p+1} = arg max_{m_{p+1}} {||P⊥_{M_p} m_{p+1}||²}.   (10.10)

3. If a stopping rule is satisfied, the algorithm is terminated. Otherwise, let p ← p + 1 and go to step 2.
Interestingly, since P⊥_{M_p} is idempotent and symmetric, that is, (P⊥_{M_p})^T P⊥_{M_p} = (P⊥_{M_p})² = P⊥_{M_p}, it yields

  ||P⊥_{M_p} m_{p+1}||² = (P⊥_{M_p} m_{p+1})^T (P⊥_{M_p} m_{p+1}) = m^T_{p+1} (P⊥_{M_p})^T P⊥_{M_p} m_{p+1} = m^T_{p+1} P⊥_{M_p} m_{p+1}.   (10.11)

In light of (10.11) it is easy to see that (10.10) is indeed the same as (10.7) and also similar to (9.27) in Sect. 9.4.1. In other words, using an LS-based criterion to find new VSs by minimizing the LSE specified by (10.7) is identical to using an SNR-based criterion to find new VSs via OSP specified by (10.10). This is not a coincidence. It was shown in Chap. 12 of Chang (2013) that an OSP can also be derived as a least-squares OSP (LSOSP) estimate for α_{p+1}, denoted by α̂^LSOSP_{p+1}(r), by including the constant (m^T_{p+1} P⊥_{M_p} m_{p+1})^{-1} in α̂^OSP_{p+1}(r) in (10.9) as follows:

  α̂^LSOSP_{p+1}(r) = (m^T_{p+1} P⊥_{M_p} m_{p+1})^{-1} α̂^OSP_{p+1}(r).   (10.12)

Comparing (10.12) to (10.4), it turns out that α̂^LSOSP_{p+1}(r) in (10.12) is exactly the same as α̂^LS_{p+1}(r) in (10.4). However, it should be noted that there are fundamental differences between α̂^OSP_{p+1}(r) and α̂^LS_{p+1}(r). First, α̂^LS_{p+1}(r) = (α̂^LS_1(r), ..., α̂^LS_p(r), α̂^LS_{p+1}(r))^T is an estimator that estimates the abundance fraction vector α_{p+1} = (α_1, ..., α_p, α_{p+1})^T, whereas α̂^OSP_{p+1}(r) is merely a detector that detects the amount of the abundance fraction α_{p+1} of a particular VS, m_{p+1}, present in r. In order for α̂^OSP_{p+1}(r) to perform the estimation of the abundance fraction α_{p+1}, α̂^LSOSP_{p+1}(r) was developed in Settle (1996), Chang et al. (1998b), and Chang (2005, 2007c, 2013) by including (m^T_{p+1} P⊥_{M_p} m_{p+1})^{-1} in α̂^OSP_{p+1}(r), which turns out to be identical to α̂^LS_{p+1}(r). Second, α̂^OSP_{p+1}(r) performs the detection of the specific substance signature m_{p+1} as a signal, compared to α̂^LS_{p+1}(r), which estimates the abundance fractions of all signatures, with m_1, m_2, ..., m_{p+1} as a parameter vector. Nevertheless, it was shown in Chang (2007c, 2013) that for all 1 ≤ j ≤ p+1, α̂^LS_j(r) = α̂^LSOSP_j(r). Third, α̂^OSP_{p+1}(r) has an advantage over α̂^LS_{p+1}(r) in that α̂^OSP_{p+1}(r) provides a constructive means of producing new signal sources in an unsupervised manner. One such approach is the ATGP described earlier to perform automatic target recognition. Finally, it is worth noting that the interpretations of (10.7) and (10.10) are completely different. While (10.7) is used to minimize the abundance estimation error, (10.10) is used to maximize the maximal OP residual in the hyperplane ⟨M_p⟩.
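That α̂^LSOSP_{p+1}(r) coincides with the last component of the full LS estimate α̂^LS_{p+1}(r) can be checked numerically; all data below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(3)
L, p = 40, 6
Mp = rng.random((L, p))        # first p signatures
m = rng.random(L)              # candidate (p+1)st signature m_{p+1}
r = rng.random(L)              # a data sample vector

P = np.eye(L) - Mp @ np.linalg.pinv(Mp)   # P_perp_{M_p}

a_osp = m @ P @ r                          # OSP detector (10.9)
a_lsosp = a_osp / (m @ P @ m)              # LSOSP estimate (10.12)

# Full LS estimate using all p+1 signatures, (10.3):
a_ls = np.linalg.pinv(np.hstack([Mp, m[:, None]])) @ r

print(np.isclose(a_lsosp, a_ls[-1]))  # -> True
```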

10.2.3 Maximum Likelihood Estimation-Based LSMA

An earlier approach to solving (10.1) was to assume that the n in (10.1) is white Gaussian noise with covariance matrix σ² I_{L×L} (Settle 1996). Then its solution could be obtained by finding the MLE of α, α̂^MLE_{p+1}(r), given by

  α̂^MLE_{p+1}(r) = (M^T_{p+1} M_{p+1})^{-1} M^T_{p+1} r = M^#_{p+1} r,   (10.13)

which is identical to (10.3). Its prediction error is also given by the following MLE error matrix:

  Error Matrix(α̂^MLE_{p+1}(r)) = σ² (M^T_{p+1} M_{p+1})^{-1}.   (10.14)

Since (10.14) is a matrix, its trace

  trace(σ² (M^T_{p+1} M_{p+1})^{-1})   (10.15)

is then used to measure its error performance. Like LS and OSP, (10.15) can be used as an optimal criterion to find VSs to derive an MLE-based algorithm, summarized as follows.

MLE-Based Algorithm for Finding New VSs

1. Find an initial VS given by t^MLE_1 = arg max_r {r^T r}. Set M_1 = [t_1] and p = 1.
2. Find

     t^MLE_{p+1} = arg min_{m_{p+1}} trace(σ² (M^T_{p+1} M_{p+1})^{-1}),   (10.16)

   where m_{p+1} is the (p+1)th signature needed to be found and M_{p+1} = [M_p m_{p+1}].
3. If a stopping rule is satisfied, the algorithm is terminated. Otherwise, let p ← p + 1 and go to step 2.

If the n in (10.1) is not white noise but color Gaussian noise with covariance matrix Σ_n, then (10.13) becomes

  α̂^MLE_{p+1}(r) = (M^T_{p+1} Σ_n^{-1} M_{p+1})^{-1} M^T_{p+1} Σ_n^{-1} r.   (10.17)

Now we can use the square-root representation of Σ_n^{-1} in Poor (1994) to whiten the color Gaussian noise. In this case, M^T_{p+1} Σ_n^{-1} M_{p+1} in (10.17) can be reexpressed as

  M^T_{p+1} Σ_n^{-1} M_{p+1} = (Σ_n^{-1/2} M_{p+1})^T (Σ_n^{-1/2} M_{p+1}) = M̃^T_{p+1} M̃_{p+1},   (10.18)

where M̃_{p+1} = Σ_n^{-1/2} M_{p+1} is the noise-whitened M_{p+1}. Accordingly,

  α̂^MLE_{p+1}(r) = (M^T_{p+1} Σ_n^{-1} M_{p+1})^{-1} M^T_{p+1} Σ_n^{-1} r
                = [(Σ_n^{-1/2} M_{p+1})^T (Σ_n^{-1/2} M_{p+1})]^{-1} (Σ_n^{-1/2} M_{p+1})^T (Σ_n^{-1/2} r)   (10.19)
                = (M̃^T_{p+1} M̃_{p+1})^{-1} M̃^T_{p+1} r̃ = M̃^#_{p+1} r̃ = α̂^LS_{p+1}(r̃),

which is reduced to (10.13) with r̃ = Σ_n^{-1/2} r.
Several comments are in order here.
1. The same idea that used (10.18) and (10.19) as criteria to find a new VS was also used in Du (2012) to derive an abundance covariance-based method, MEAC. As a consequence, MEAC can be considered a special case of MLE in (10.17). That is, if we let R_{p+1} = K^{-1/2} M_{p+1}, as used in Du (2012), the MLE specified by (10.17) becomes exactly the same as MEAC.
2. The sequential forward feature search (SFFS) used in MEAC is essentially the same as a sequential process using (10.16), where each signature produced by SFFS is the same as each new VS generated by MLE via (10.16).
3. Using the noise covariance matrix to perform noise whitening is completely different from using the data covariance matrix to perform data whitening because the data covariance matrix K is formed from the entire set of data samples, while the noise covariance matrix Σ_n must be estimated by a reliable method such as the one developed by Roger and Arnold (1996), which is used in all the experiments performed in this chapter. The difference between noise-whitened data via Σ_n^{-1/2} and data whitened by K^{-1/2} will be clearly demonstrated in the experiments conducted in Sects. 10.6 and 10.7.
4. It should be noted that the three criteria, OSP, LS, and MLE/MEAC, presented in this section are not interchangeable, even though they produce the same results. For example, OSP is a detection technique that is derived based on a signal detection model, with SNR used as an optimal criterion. Unlike OSP, both LS and MLE/MEAC are estimation techniques. Thus, all these techniques are indeed quite different in their design rationales. More specifically, LS is a least-squares error-based estimation technique that makes no assumptions about the probability distributions of the data. By contrast, MLE/MEAC assumes the model error or noise term n in (10.1) and (10.2) to be Gaussian distributed. Thus, it is an a priori approach. On the other hand, LS makes no such statistical assumption about the LMM, so it is an a posteriori approach. Therefore, OSP, LS, and MLE/MEAC are not the same techniques. Nevertheless, they can be interpreted by one another in an appropriate manner.
Nevertheless, they can be interpreted by one another in an appropriate manner.

10.3 Recursive Hyperspectral Sample Processing of MLE

In Sect. 10.2, three criteria, LSE in (10.7), maximal OSP residual in (10.10), and MLE error in (10.16), were derived to find VSs. However, to find a new VS, m_{p+1}, using these equations, the signature matrix M_{p+1} = [m_1 m_2 ... m_p m_{p+1}] must be recalculated. These three equations cannot take advantage of the results previously produced by M_p to calculate M_{p+1} = [M_p m_{p+1}] from its two separate components, M_p and m_{p+1}. This section addresses this issue and develops recursive algorithms that implement these three criteria to find each new VS one after another recursively.

10.3.1 RHSP-LS-Based Algorithm


As noted, (m^T_{p+1} P⊥_{M_p} m_{p+1})^{-1} in (10.6) is used to account for the estimation error. Similarly, ||P⊥_{M_p} m_{p+1}|| in (10.10) is used to find the maximal OP residual in the hyperplane ⟨M_p⟩, which accounts for the maximal residual of data sample vectors leaked from ⟨M_p⟩. As shown, minimizing (10.6) is equivalent to maximizing (10.7) as well as maximizing (10.10) via ||P⊥_{M_p} m_{p+1}||². Equation (10.11) shows that maximizing (10.7) and (10.10) is equivalent to maximizing m^T_{p+1} P⊥_{M_p} m_{p+1}. In this case, we only need to derive a recursive equation for m^T_{p+1} P⊥_{M_p} m_{p+1} for both the LS-based algorithm and the OSP-based algorithm. Using a matrix identity in Appendix A,

" #1
h i1 MpT Mp MpT mpþ1
T
Mpþ1 Mpþ1 ¼ T T
mpþ1 Mp mpþ1 mpþ1
2 1  T 3
MpT Mp þ β

M#p mpþ1 mpþ1


T
M#p β
M# mpþ1
6 pþ1 p pþ1
p p 7
¼4  T 5;

mpþ1T
M#p β

pþ1 p pþ1 p

ð10:20Þ
 1
where M#p ¼ MpT Mp MpT and

  1 1
β

¼ T
mpþ1 I  Mp Mp Mp
T T
Mp mpþ1
pþ1 p
ð10:21Þ
n h i o1
¼ T
mpþ1 P⊥Mp mpþ1 ;
  
T
mpþ1 P⊥
Mp mpþ1 ¼ ⊥
β

mpþ1
mpþ1 PMp1 mpþ1 
T T et p  tp et p  tp T mpþ1
pþ1 p
h i2
¼ mpþ1
T
P⊥  β
m T  et p  tp  ;
Mp1 m pþ1

pþ1 p pþ1

ð10:22Þ

where et p ¼ Mp1 M#p1 tp .
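The recursion in (10.21) and (10.22) can be checked numerically against a direct evaluation on synthetic data (the sign of t_p − t̃_p inside the squared term is immaterial).

```python
import numpy as np

rng = np.random.default_rng(6)
L = 30
Mprev = rng.random((L, 3))        # M_{p-1}
t = rng.random(L)                 # newly found target t_p
m = rng.random(L)                 # a candidate m_{p+1}

def perp(M):
    return np.eye(M.shape[0]) - M @ np.linalg.pinv(M)

# Recursive update (10.22): downdate the old score by a rank-one term.
d = t - Mprev @ (np.linalg.pinv(Mprev) @ t)   # t_p - t~_p = P_perp_{M_{p-1}} t_p
beta = 1.0 / (d @ d)                          # beta_{p|p-1}, cf. (10.21)
score_old = m @ perp(Mprev) @ m               # m^T P_perp_{M_{p-1}} m
score_rec = score_old - beta * (m @ d) ** 2

# Direct (non-recursive) evaluation with the enlarged M_p = [M_{p-1}  t_p]:
score_dir = m @ perp(np.hstack([Mprev, t[:, None]])) @ m

print(np.isclose(score_rec, score_dir))  # -> True
```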


Using (10.20), a recursive hyperspectral sample processing of LS/OSP (RHSP-LS/OSP) algorithm can be derived for both LS-based and OSP-based algorithms as follows.

RHSP-LS/OSP Algorithm
1. Initial condition:
   (a) Find an initial target pixel vector t^{RHSP-LS/OSP}_1 = arg max_{m_1} {m^T_1 m_1}. Set M_1 = [t^{RHSP-LS/OSP}_1].
   (b) Calculate m^T_2 P⊥_{M_1} m_2 as follows:

         m^T_2 P⊥_{t^{RHSP-LS/OSP}_1} m_2 = m^T_2 [I − t^{RHSP-LS/OSP}_1 ((t^{RHSP-LS/OSP}_1)^T t^{RHSP-LS/OSP}_1)^{-1} (t^{RHSP-LS/OSP}_1)^T] m_2   (10.23)

       and use (10.11) to find

         t^{RHSP-LS/OSP}_2 = arg max_{m_2} {m^T_2 P⊥_{t^{RHSP-LS/OSP}_1} m_2}.   (10.24)

   (c) Form M_2 = [t^{RHSP-LS/OSP}_1 t^{RHSP-LS/OSP}_2].
   (d) Let p = 2.
2. At the (p+1)th iteration, find t^{RHSP-LS/OSP}_{p+1} by maximizing (10.22) over all data sample vectors m_{p+1}:

     t^{RHSP-LS/OSP}_{p+1} = arg max_{m_{p+1}} {m^T_{p+1} P⊥_{M_p} m_{p+1}},   (10.25)

   where M_p = [t^{RHSP-LS/OSP}_1 ... t^{RHSP-LS/OSP}_p].
3. Stopping rule:
   If a stopping rule to be discussed in the following section is satisfied, RHSP-LS/OSP is terminated. Otherwise, let p ← p + 1 and go to step 2.
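A compact sketch of RHSP-LS/OSP on synthetic data follows: the per-pixel scores m^T P⊥_{M_p} m are downdated by the rank-one term of (10.22), so the pseudo-inverse of the growing M_p is never recomputed. The residual-deflation form below is one equivalent way to realize the recursion, not the book's exact implementation.

```python
import numpy as np

def rhsp_ls_osp(X, n_vs):
    """Recursive LS/OSP target finding on X of shape (L, N):
    maintain residuals R = P_perp_{M_p} X and scores m^T P_perp_{M_p} m,
    updating both with the rank-one term of (10.22) per new target."""
    scores = np.sum(X**2, axis=0)        # p = 0: P_perp = I, score = m^T m
    R = X.copy()                         # residuals P_perp_{M_p} X
    idx = []
    for _ in range(n_vs):
        j = int(np.argmax(scores))       # (10.25): maximal residual score
        idx.append(j)
        d = R[:, j]                      # P_perp_{M_{p-1}} t_p
        scores = scores - (d @ R) ** 2 / (d @ d)   # recursion (10.22)
        R = R - np.outer(d, d @ R) / (d @ d)       # deflate residuals
    return idx

rng = np.random.default_rng(7)
X = rng.random((20, 80))
sel = rhsp_ls_osp(X, 4)
print(len(sel))  # -> 4
```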

10.3.2 RHSP-MLE

To implement MLE recursively as RHSP-MLE, we can derive a recursive equation


to find (10.16) using (10.20) as follows:
$$
\begin{aligned}
\operatorname{trace}\!\left[\left(M_{p+1}^{T}M_{p+1}\right)^{-1}\right]
&= \operatorname{trace}\!\left[\left(M_p^{T}M_p\right)^{-1}\right] + \beta_{p+1}\left( 1 + \operatorname{trace}\!\left[ M_p^{\#} m_{p+1} m_{p+1}^{T} \big(M_p^{\#}\big)^{T} \right] \right)\\
&= \operatorname{trace}\!\left[\left(M_p^{T}M_p\right)^{-1}\right] + \beta_{p+1}\left( 1 + m_{p+1}^{T} \big(M_p^{\#}\big)^{T} M_p^{\#} m_{p+1} \right)\\
&= \operatorname{trace}\!\left[\left(M_p^{T}M_p\right)^{-1}\right] + \left\{ m_{p+1}^{T} P_{M_p}^{\perp} m_{p+1} \right\}^{-1} \left( 1 + m_{p+1}^{T} \big(M_p^{\#}\big)^{T} M_p^{\#} m_{p+1} \right)\\
&= \operatorname{trace}\!\left[\left(M_p^{T}M_p\right)^{-1}\right] + \left\| P_{M_p}^{\perp} m_{p+1} \right\|^{-2} \left( 1 + \left\| \hat{\alpha}_{\langle M_p\rangle}^{\text{LS}}\left(m_{p+1}\right) \right\|^{2} \right)\\
&= \operatorname{trace}\!\left[\left(M_p^{T}M_p\right)^{-1}\right] + \left[ \hat{\alpha}_{p+1}^{\text{OP}}\left(m_{p+1}\right) \right]^{-2} + \left( \frac{\left\| \hat{\alpha}_{\langle M_p\rangle}^{\text{LS}}\left(m_{p+1}\right) \right\|}{\left\| P_{M_p}^{\perp} m_{p+1} \right\|} \right)^{2},
\end{aligned}
\qquad (10.26)
$$

with $\left\| \hat{\alpha}_{\langle M_p\rangle}^{\text{LS}}\left(m_{p+1}\right) \right\|^{2} = m_{p+1}^{T}\big(M_p^{\#}\big)^{T} M_p^{\#} m_{p+1}$, where $\hat{\alpha}_{p+1}^{\text{OP}}\left(m_{p+1}\right) = \left\| P_{M_p}^{\perp} m_{p+1} \right\|$ is the OP of $m_{p+1}$ to account for the maximal OP residual from the hyperplane $\langle M_p\rangle$ and

Fig. 10.1 Illustration of relationship between ATGP and MLE

 
$\hat{\alpha}_{\langle M_p\rangle}^{\text{LS}}\left(m_{p+1}\right)$ is the abundance vector of $m_{p+1}$ unmixed by $t_1, t_2, \ldots, t_p$ in the hyperplane $\langle M_p\rangle$ via (10.3). In addition, $t_{p+1}^{\text{RHSP-MLE}}$ can be found by

$$
\begin{aligned}
t_{p+1}^{\text{RHSP-MLE}}
&= \arg\min_{m_{p+1}}\left\{ \left\| P_{M_p}^{\perp} m_{p+1} \right\|^{-2} + \left( \frac{\left\| \hat{\alpha}_{\langle M_p\rangle}^{\text{LS}}\left(m_{p+1}\right) \right\|}{\left\| P_{M_p}^{\perp} m_{p+1} \right\|} \right)^{2} \right\}\\
&= \arg\min_{m_{p+1}}\left\{ \left[ \hat{\alpha}_{p+1}^{\text{OP}}\left(m_{p+1}\right) \right]^{-2} + \left( \frac{\left\| \hat{\alpha}_{\langle M_p\rangle}^{\text{LS}}\left(m_{p+1}\right) \right\|}{\left\| \hat{\alpha}_{p+1}^{\text{OP}}\left(m_{p+1}\right) \right\|} \right)^{2} \right\}.
\end{aligned}
\qquad (10.27)
$$

Further comparing (10.27) to (10.7), we immediately discover that (10.7) is only one of two terms in (10.27) that need to be minimized. In other words, (10.27) must simultaneously minimize both $\left\| P_{M_p}^{\perp} m_{p+1} \right\|^{-2}$ and $\left( \left\| \hat{\alpha}_{\langle M_p\rangle}^{\text{LS}}\left(m_{p+1}\right) \right\| \big/ \left\| P_{M_p}^{\perp} m_{p+1} \right\| \right)^{2}$ at the same time, while (10.7) only has to minimize $\left\| P_{M_p}^{\perp} m_{p+1} \right\|^{-2}$.

Now, let $\tan\varphi = \tan\left(\pi/2 - \theta\right) = \left\| \hat{\alpha}_{\langle M_p\rangle}^{\text{LS}}\left(m_{p+1}\right) \right\| \big/ \left\| \hat{\alpha}_{p+1}^{\text{OP}}\left(m_{p+1}\right) \right\|$. Figure 10.1 illustrates the relationship between $\hat{\alpha}_{p+1}^{\text{OP}}\left(m_{p+1}\right)$ and $\hat{\alpha}_{\langle M_p\rangle}^{\text{LS}}\left(m_{p+1}\right)$. From (10.27), $t_{p+1}^{\text{MLE}}$ maximizes $\hat{\alpha}_{p+1}^{\text{OP}}\left(m_{p+1}\right)$ and minimizes $\left\| \hat{\alpha}_{\langle M_p\rangle}^{\text{LS}}\left(m_{p+1}\right) \right\|$, while $t_{p+1}^{\text{OSP}}$ only maximizes $\hat{\alpha}_{p+1}^{\text{OP}}\left(m_{p+1}\right)$. In this case, if there is more than one target sample that yields the same maximal $\hat{\alpha}_{p+1}^{\text{OP}}\left(m_{p+1}\right)$, $t_{p+1}^{\text{MLE}}$ will find the one with the smallest $\left\| \hat{\alpha}_{\langle M_p\rangle}^{\text{LS}}\left(m_{p+1}\right) \right\|$, whereas $t_{p+1}^{\text{ATGP}}$ does not specify which one should be selected. However, it should be noted that $\hat{\alpha}_{p+1}^{\text{OP}}\left(t_{p+1}^{\text{ATGP}}\right) \geq \hat{\alpha}_{p+1}^{\text{OP}}\left(m_{p+1}\right)$ for all $m_{p+1}$. In this case, $t_{p+1}^{\text{ATGP}}$ always yields an OP at least greater than or equal to the OP produced by $t_{p+1}^{\text{MLE}}$.
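The trace identity in (10.26) can be checked numerically. This sketch (random data, numpy assumed, not from the book) compares the recursively computed trace against the trace of the directly augmented inverse.

```python
import numpy as np

rng = np.random.default_rng(6)
L, p = 15, 3
Mp = rng.standard_normal((L, p))   # signature matrix M_p
m = rng.standard_normal(L)         # candidate m_{p+1}

Mp_sharp = np.linalg.pinv(Mp)                    # M_p^#
P_perp = np.eye(L) - Mp @ Mp_sharp
op = np.linalg.norm(P_perp @ m)                  # alpha_OP(m_{p+1})
alpha_ls = Mp_sharp @ m                          # LS abundance of m_{p+1}

# Right-hand side of (10.26): old trace + OP^{-2} + (||alpha_LS|| / OP)^2
rhs = (np.trace(np.linalg.inv(Mp.T @ Mp))
       + op ** -2 + (np.linalg.norm(alpha_ls) / op) ** 2)

# Left-hand side: trace of the directly augmented inverse
Mp1 = np.column_stack([Mp, m])
lhs = np.trace(np.linalg.inv(Mp1.T @ Mp1))
print(np.isclose(lhs, rhs))   # True
```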

Interestingly, according to the preceding derivations, finding $t_{p+1}^{\text{RHSP-MLE}}$ by RHSP-MLE is identical to the MEAC method using fast processing, with $R_p = \Sigma_n^{-1/2} M_p$, in Du (2012). Furthermore, using RHSP-MLE to find the set of $\left\{ t_p^{\text{RHSP-MLE}} \right\}$ via (10.27) is also identical to MEAC using the sequential forward feature search (SFFS).
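One step of the RHSP-MLE selection in (10.27) can be sketched by brute force; `rhsp_mle_next` is a hypothetical helper name, and for clarity the projector is recomputed directly instead of being updated recursively.

```python
import numpy as np

def rhsp_mle_next(X, idx_found):
    """Sketch of one RHSP-MLE step, (10.27): among all remaining pixels,
    pick the one minimizing ||P_perp m||^{-2} + (||alpha_LS(m)||/||P_perp m||)^2,
    where alpha_LS(m) = M_p^# m is the LS abundance of m."""
    Mp = X[:, idx_found]
    Mp_sharp = np.linalg.pinv(Mp)                 # M_p^#
    P_perp = np.eye(X.shape[0]) - Mp @ Mp_sharp
    best_val, best_j = np.inf, -1
    for j in range(X.shape[1]):
        if j in idx_found:
            continue
        m = X[:, j]
        op = np.linalg.norm(P_perp @ m)           # alpha_OP(m) = ||P_perp m||
        if op == 0.0:                             # m lies in <M_p>; skip it
            continue
        ls = np.linalg.norm(Mp_sharp @ m)         # ||alpha_LS(m)||
        val = op ** -2 + (ls / op) ** 2
        if val < best_val:
            best_val, best_j = val, j
    return best_j
```

The first term favors pixels far from the hyperplane $\langle M_p\rangle$ (large OP), the second favors pixels with small LS abundances with respect to the signatures already found, matching the discussion above.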

10.4 Stopping Rule for RHSP-MLE

One key challenging issue in implementing the developed recursive algorithms is step 3, which requires a stopping rule to terminate the VS finding process. This section presents an approach similar to that proposed in Ren and Chang (2003) to develop a stopping rule. The idea is to use VS as a signal source via a binary composite hypothesis testing problem, where a Neyman–Pearson detector is then derived to determine whether the considered VS is an appropriate signature that can be used for LSU.
Following the same approach as the one developed in Chang et al. (2014), we assume that for each $1 \leq p \leq L$, $M_0 = \varnothing$ and the VS space $\langle M_{p-1} \rangle$ is linearly spanned by the previously found $\left\{ t_j^{\text{VS}} \right\}_{j=1}^{p-1}$ generated by any algorithm developed in Sect. 10.3, that is, RHSP-LS/OSP or RHSP-MLE, where $t_j^{\text{VS}}$ is generated by either RHSP-LS/OSP as $t_j^{\text{RHSP-LS/OSP}}$ or RHSP-MLE as $t_j^{\text{RHSP-MLE}}$. Then for $1 \leq p \leq L$ we can find

$$
t_p^{\text{VS}} = \arg\max_{r}\left\{ \left\| P_{M_{p-1}}^{\perp} r \right\|^{2} \right\},
\qquad (10.28)
$$

$$
\eta_p = \left\| P_{M_{p-1}}^{\perp} t_p^{\text{VS}} \right\|^{2} = \max_{r} \left\| P_{M_{p-1}}^{\perp} r \right\|^{2},
\qquad (10.29)
$$

where $\eta_p$ in (10.29) is the maximal residual of the $p$th target data sample, $t_p^{\text{VS}}$, found by the VS leaked from $\langle M_{p-1} \rangle$ into $\langle M_{p-1} \rangle^{\perp}$, which is the complement space orthogonal to the space $\langle M_{p-1} \rangle$, and turns out to be exactly the same criterion used by OSP specified by (10.10). It should be noted that $\left\| P_{M_0}^{\perp} r \right\|^{2} = r^{T} r$ in (10.29) when $p = 1$. Since the found $\left\{ t_p^{\text{VS}} \right\}_{p=1}^{L}$ obtained by a VS may be highly correlated, $\eta_p$ in (10.29) is used instead because $\eta_p$ represents the maximum residual of $t_p^{\text{VS}}$ leaked into $\langle M_{p-1} \rangle^{\perp}$. It is this sequence, $\left\{ \eta_p \right\}_{p=1}^{L}$, that will be used as a signal source in a binary composite hypothesis testing problem to determine whether or not the $p$th potential target candidate $t_p^{\text{VS}}$ is a true target. First, observe that

$$
\left\| P_{M_{p-1}}^{\perp} t_p^{\text{VS}} \right\|^{2}
= \left[ P_{M_{p-1}}^{\perp} t_p^{\text{VS}} \right]^{T} \left[ P_{M_{p-1}}^{\perp} t_p^{\text{VS}} \right]
= \left( t_p^{\text{VS}} \right)^{T} \left( P_{M_{p-1}}^{\perp} \right)^{T} P_{M_{p-1}}^{\perp}\, t_p^{\text{VS}}.
\qquad (10.30)
$$

Since $P_{M_{p-1}}^{\perp}$ is symmetric and idempotent, that is, $\left( P_{M_{p-1}}^{\perp} \right)^{T} P_{M_{p-1}}^{\perp} = P_{M_{p-1}}^{\perp}$, (10.30) is reduced to

$$
\left\| P_{M_{p-1}}^{\perp} t_p^{\text{VS}} \right\|^{2} = \left( t_p^{\text{VS}} \right)^{T} P_{M_{p-1}}^{\perp}\, t_p^{\text{VS}},
\qquad (10.31)
$$

which is exactly the reciprocal of (10.6). This shows that the stopping rule using (10.29) for terminating VSs is identical to the signal sources used under each hypothesis by the Neyman–Pearson detection theory proposed in Chang (2003) and Chang and Du (2004) to determine VD. It is this sequence, $\left\{ \eta_p \right\}$, given by (10.29), that will be used as a signal source in a binary composite hypothesis testing problem to determine whether or not the $p$th VS $t_p^{\text{VS}}$ is a true target using a detector formulated as follows:


  
$$
H_0: \eta_p \sim p\left(\eta_p \mid H_0\right) = p_0\left(\eta_p\right)
\quad \text{versus} \quad
H_1: \eta_p \sim p\left(\eta_p \mid H_1\right) = p_1\left(\eta_p\right)
\quad \text{for } p = 1, 2, \ldots, L,
\qquad (10.32)
$$

where the alternative hypothesis H1 and the null hypothesis H0 represent two cases
of tVS
p being a target signal source under H1 and not a target signal source under H0
respectively in the sense that H0 represents the maximum residual resulting from
the background signal sources, while H1 represents the maximum residual leaked
from the target signal sources. To make (10.32) work, we need to find the proba-
bility distributions under both hypotheses. The assumption made in (10.32) is that if
a signal source is not an endmember under H0, it should be considered part of the
background, which can be characterized by a Gaussian distribution. On the other
hand, if a signal source is indeed a desired target signal source, it should be
uniformly distributed over the interval $(0, \eta_{p-1})$. For details on deriving probability distributions, we refer the reader to Kuybeda et al. (2007). By virtue of the
Neyman–Pearson detection theory, we can derive the following Neyman–Pearson
detector (NPD), which maximizes the detection power PD subject to the false alarm
probability PF, which determines the threshold value τp:
$$
\delta_{\text{VS}}^{\text{NP}}\left(\eta_p\right) =
\begin{cases}
1, & \text{if } \Lambda\left(\eta_p\right) > \tau_p,\\
1 \text{ with probability } \kappa, & \text{if } \Lambda\left(\eta_p\right) = \tau_p,\\
0, & \text{if } \Lambda\left(\eta_p\right) < \tau_p,
\end{cases}
\qquad (10.33)
$$

where the likelihood ratio test $\Lambda\left(\eta_p\right)$ is given by $\Lambda\left(\eta_p\right) = p_1\left(\eta_p\right)/p_0\left(\eta_p\right)$, with $p_0\left(\eta_p\right)$ and $p_1\left(\eta_p\right)$ given by (10.32), and $\kappa$ is the probability of saying $H_1$. Thus, a case of

$\eta_p = \left( t_p^{\text{VS}} \right)^{T} P_{M_{p-1}}^{\perp}\, t_p^{\text{VS}} > \tau_p$ indicates that $\delta_{\text{VS}}^{\text{NP}}\left(\eta_p\right) = 1$ in (10.33) fails the test, in which case $t_p^{\text{VS}}$ is assumed to be an endmember. It should be noted that the test in (10.33) must be performed for each of the $L$ VSs. Therefore, for a different value of $p$ the threshold $\tau_p$ varies. Using (10.33), the value of VD, $n_{\text{VS}}$, can be determined by calculating

$$
\text{VD}_{\text{VS}}^{\text{NP}}\left(P_F\right) = \arg\left\{ \max_p \ \delta_{\text{VS}}^{\text{NP}}\left(\eta_p\right) = 1 \right\},
\qquad (10.34)
$$

where $P_F$ is a predetermined false alarm probability, and the detector output is counted as 1 only if $\delta_{\text{VS}}^{\text{NP}}\left(\eta_p\right) = 1$ and as 0 if $\delta_{\text{VS}}^{\text{NP}}\left(\eta_p\right) < 1$.

According to (10.29), the sequence $\left\{ \eta_p \right\}$ is monotonically decreasing. Thus, $\delta_{\text{VS}}^{\text{NP}}\left(\eta_p\right)$ begins with a failure of the NPD test specified by (10.33) when $p = 1$, that is, $\delta_{\text{VS}}^{\text{NP}}\left(\eta_1\right) = 1$. Then $\delta_{\text{VS}}^{\text{NP}}\left(\eta_p\right) = 1$ continues as the value of $p$ is increased until $p$ reaches the value at which the NPD test passes, in which case $\delta_{\text{VS}}^{\text{NP}}\left(\eta_p\right) < 1$. The largest value of $p$ making $\delta_{\text{VS}}^{\text{NP}}\left(\eta_p\right) = 1$ is the value of $\text{VD}_{\text{VS}}^{\text{NP}}\left(P_F\right)$ according to (10.34). This unique property allows $\text{VD}_{\text{VS}}^{\text{NP}}\left(P_F\right)$ to be implemented in real time as the process continues with increasing $p$.
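The real-time scanning structure implied by (10.33) and (10.34) can be sketched as follows. The function `test_fails` is a hypothetical stand-in for the full Neyman–Pearson test (whose densities come from Kuybeda et al. 2007 and whose thresholds τp depend on PF); a simple fixed threshold on ηp is used here purely for illustration.

```python
def vd_by_npd(etas, test_fails):
    """Scan the monotonically decreasing residuals {eta_p} and return the
    largest p for which the detector still 'fails' (delta = 1), per (10.34).
    The scan can stop as soon as the test passes, which is what makes the
    rule implementable in real time with increasing p."""
    n_vs = 0
    for p, eta in enumerate(etas, start=1):
        if test_fails(eta, p):   # delta(eta_p) = 1: eta_p is still target-like
            n_vs = p
        else:
            break                # etas are decreasing, so no later p can fail
    return n_vs

# Hypothetical stand-in: flag eta_p as target-like while it exceeds a fixed
# threshold (a real NPD would derive tau_p from P_F and the H0/H1 densities).
etas = [9.1, 4.2, 2.0, 0.4, 0.1, 0.05]
print(vd_by_npd(etas, lambda eta, p: eta > 1.0))   # -> 3
```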
As a concluding remark, we should point out that there is no stopping rule on
how to terminate MEAC in Du (2012). Instead, MEAC assumes that the number is
determined by VD.

10.5 Discussions

It is known that an endmember is an idealistic and pure signature that is generally used to specify a spectral class. However, justifying the existence of endmembers in reality is
very difficult for several practical reasons. One is a lack of careful data calibration
such as an atmospheric correction. Another is a lack of ground truth such as a
database or spectral library. A third one is a lack of prior knowledge that can be
used for validation. As a result, finding “true” endmembers is extremely challeng-
ing because endmembers are usually corrupted by many unknown effects and these
contaminated endmembers are no longer pure signatures. This is the scenario that is
most likely the case we must deal with in real-world applications.
It is generally believed that signatures used to form an LMM for LSMA to
perform spectral data unmixing are known as endmembers. Unfortunately, as
shown in several recent reports (Chang et al. 2010a, 2011b), this is not true. In
other words, the signatures used by LSMA should be those that can represent the
data as being processed and are not necessarily pure-signature endmembers. Spe-
cifically, background signatures that are generally mixed should be included for
data unmixing. Since these signatures are generally different from endmembers,
using endmembers to specify m1, m2, . . ., mp in (10.1) for LSMA is misleading. The

concept of VS arises from this need. It is derived from VD, which was originally
developed to find the number of spectrally distinct signatures in data that can be
used by LSMA to unmix data via these signatures defined as VSs. With this
interpretation the signatures m1, m2, . . ., mp used to form the LMM in (10.1) are
actually VSs.
Since VSs do not necessarily have to be endmembers, those algorithms that are
designed for the purpose of finding endmembers may not be applicable to finding
VSs owing to constraints imposed on the algorithms, such as simplex-based algo-
rithms, N-FINDR, and simplex growing algorithm (SGA) (Chang et al. 2006). To
resolve this dilemma, this chapter takes a rather different approach by looking at
various criteria used for LSMA and other designs and develops approaches to
finding VSs where three criteria, LS, OSP, and MLE, are developed in Sect. 10.2
and two recursive algorithms (RHSP-LS/OSP for LS/OSP and RHSP-MLE for
MLE) are designed in Sect. 10.3.
Recently, endmember variability has received considerable attention from
researchers (Dennison and Roberts 2003; Somers et al. 2011) who take into account
the variability of an endmember present in the data. Instead of working on a single endmember, a group of signatures, referred to as an endmember class, is used to represent one type of endmember so as to address signature corruption caused by physical effects, such as noise or interference encountered in real environments. In this case,
several issues need to be addressed. One is how to find an endmember class that can
appropriately account for a given true endmember. Another issue is dealing with
individual samples in an endmember class. A third issue concerns finding a
signature that represents its endmember class. Interestingly, the concept of VS
can be used to address these issues because a VS need not be a true endmember
as a pure signature. Instead, it must be a real data sample vector directly extracted
from the data. More details can be found in Gao et al. (2014).

10.6 Synthetic Image Experiments

The synthetic image simulated in Fig. 1.15 is shown in Fig. 10.2 with five panels in
each row simulated by the same mineral signature and five panels in each column
having the same size.
Among the 25 panels are five 4 × 4 pure-pixel panels for each row in the first column, five 2 × 2 pure-pixel panels for each row in the second column, five 2 × 2 mixed-pixel panels for each row in the third column, and five 1 × 1 subpixel panels for each row in both the fourth and fifth columns, where the mixed and subpanel pixels
were simulated according to the legends in Fig. 10.2. Thus, a total of 100 pure
pixels (80 in the first column and 20 in second column), referred to as endmember
pixels, were simulated in the data by the five endmembers A, B, C, K, and M. An
area marked “BKG” in the upper right corner of Fig. 1.14a was selected to find its
sample mean, that is, the average of all pixel vectors within the BKG area, denoted
by b and plotted in Fig. 1.14b, to be used to simulate the background (BKG) for an

Fig. 10.2 A set of 25 panels simulated by A, B, C, K, and M (legend: 100 % signal; 50 % signal + 50 % any other four; 50 % signal + 50 % background; 25 % signal + 75 % background)

image scene with a size of 200 × 200 pixels in Fig. 10.2. The reason for this background selection is empirical, since the selected BKG area seemed more homogeneous than other regions. Nevertheless, other areas can also be selected for the same purpose. This b-simulated image background was further corrupted by additive noise to achieve a certain SNR, defined in Harsanyi and Chang (1994) as 50 % signature (i.e., reflectance/radiance) divided by the standard deviation of the noise. Once target pixels and background are simulated, two types of target
insertion can be designed to simulate experiments for various applications.
The first type of target insertion is target implantation (TI), which can be
simulated by inserting clean target panels into a noisy image BKG by replacing
their corresponding BKG pixels, where the SNR is empirically set to 20:1. That is,
TI implants clean target panel pixels into noise-corrupted image BKG with an
SNR of 20:1, in which case there are 100 pure panel pixels in the first and second
columns.
A second type of target insertion is target embeddedness (TE), which is simu-
lated by embedding clean target panels into a noisy image BKG by superimposing
target panel pixels over the BKG pixels where the SNR is empirically set to 20:1.
That is, TE embeds clean target panel pixels into noise-corrupted image BKG with
SNR ¼ 20:1 in which case all 100 pure panel pixels in the first and second columns
are no longer pure. In other words, a salient difference between TI and TE is worth mentioning: TE inserts targets by adding target pixels to and superimposing them over background pixels, instead of replacing background pixels as TI does for target insertion. As a consequence, the abundance fractions of a pixel into which a target pixel is embedded no longer sum to one.
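The distinction between the two insertion types can be made concrete with a small sketch (hypothetical array sizes and signatures, numpy assumed): TI replaces a background pixel, while TE adds the target on top of it.

```python
import numpy as np

rng = np.random.default_rng(2)
L = 10
background = rng.standard_normal((L, 4, 4)) * 0.1 + 1.0  # noisy BKG cube
target = np.full(L, 5.0)                                 # clean target signature

# Target implantation (TI): the clean target replaces the background pixel.
ti = background.copy()
ti[:, 1, 1] = target

# Target embeddedness (TE): the clean target is superimposed on the
# background pixel, so the resulting abundances no longer sum to one.
te = background.copy()
te[:, 1, 1] += target
```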
RHSP-LS/OSP and RHSP-MLE were then implemented in conjunction with
(10.29) to estimate nVS for TI and TE scenarios. For the purpose of comparative
analysis the unsupervised fully constrained least-squares (UFCLS) method devel-
oped by Heinz and Chang (2001) was also implemented. However, in this case, due

Table 10.1 nVS estimated by ATGP, MLE, and UFCLS for TI data

                       NPD          NPD          NPD          NPD          NPD
                       PF = 10^-1   PF = 10^-2   PF = 10^-3   PF = 10^-4   PF = 10^-5   MLD
ATGP                   F            F            F            F            F            F
ATGP (K)               9            8            8            7            7            8
ATGP (Σn)              8            8            8            7            7            8
RHSP-MLE               N/A          N/A          N/A          N/A          N/A          N/A
RHSP-MLE (K) (MEAC)    N/A          N/A          N/A          N/A          N/A          N/A
RHSP-MLE (Σn)          N/A          N/A          N/A          N/A          N/A          N/A
UFCLS                  F            F            F            F            F            F
UFCLS (K)              8            7            7            7            7            7
UFCLS (Σn)             6            6            6            6            6            6

Table 10.2 nVS estimated by ATGP, MLE, and UFCLS for TE data

                       NPD          NPD          NPD          NPD          NPD
                       PF = 10^-1   PF = 10^-2   PF = 10^-3   PF = 10^-4   PF = 10^-5   MLD
ATGP                   F            F            F            F            F            F
ATGP (K)               8            7            6            6            6            6
ATGP (Σn)              7            6            6            6            6            6
RHSP-MLE               N/A          N/A          N/A          N/A          N/A          N/A
RHSP-MLE (K) (MEAC)    N/A          N/A          N/A          N/A          N/A          N/A
RHSP-MLE (Σn)          N/A          N/A          N/A          N/A          N/A          N/A
UFCLS                  F            F            F            F            F            F
UFCLS (K)              7            6            6            6            6            6
UFCLS (Σn)             6            6            6            6            6            6

to its use of full abundance constraints, UFCLS does not have an analytical form to
be implemented recursively.
Tables 10.1 and 10.2 tabulate the results with various false alarm probabilities along with the maximum likelihood detector (MLD), where an “F” indicates that NPD failed the test and all 189 generated target pixels are treated as VSs. Since RHSP-MLE makes use of the linear model in (10.1) assuming white Gaussian noise (Σ = I) or colored Gaussian noise (Σ = Σn) (i.e., the MEAC method), experiments were also conducted on data noise-whitened by $\Sigma_n^{-1/2}$, with Σn estimated using the method in Du (2012), and on data whitened by $K^{-1/2}$ (i.e., the inverse square root of the sample covariance matrix K, used as the whitening matrix (Poor 1994)). Similarly, both RHSP-LS/OSP and UFCLS were also implemented on the original data with white Gaussian noise, Σ = I, where no preprocessing is required, on data noise-whitened by $\Sigma_n^{-1/2}$, and on data whitened by $K^{-1/2}$.
Interestingly, for both TI and TE, RHSP-MLE did not work for all scenarios, nor
did RHSP-LS/OSP and UFCLS on the original data with Σ ¼ I. This implied that
more than 189 background pixels were also treated as VSs due to the additive white

Fig. 10.3 Targets found by MEAC for TI scenario. (a) Noise-whitened data. (b) Whitened data

Gaussian noise. By contrast, when RHSP-LS/OSP and UFCLS were implemented


on noise-whitened data and whitened data, they all produced reasonable numbers of VSs. Despite the fact that TI has only five pure signatures as endmembers and TE has no pure signatures, the number of VSs, nVS, estimated by VD via (10.29) was indeed higher than 5. This shows that the VSs used to perform LSMA are not necessarily pure signatures, as many thought they should be endmembers. Technically
speaking, for both TI and TE there are only six spectrally distinct signatures (five
panel signatures, A, B, C, K, and M, with pure or corrupted signatures plus a
background signature that is a mixed signature). In this regard, UFCLS was the best
among all the three methods.
Furthermore, there are some interesting findings on experiments applying
MEAC to TI and TE. Figures 10.3 and 10.4 show the results obtained by applying
MEAC to TI and TE where noise-whitened data and whitened data are used for
experiments. As we can see from the results, MEAC kept finding the same
signature-specified target panel pixels. Accordingly, the error matrix specified by
(10.14) and (10.15) is close to a singular matrix so that VD may not be applicable to
finding nVS (“N/A” in Tables 10.1 and 10.2).

Fig. 10.4 Targets found by MEAC for TE scenario (targets found by MEAC, with zoomed-in cropped areas). (a) Noise-whitened data. (b) Whitened data

Figures 10.5, 10.6, and 10.7 also show the VSs found by RHSP-LS/OSP, RHSP-
MLE, and UFCLS for TI and TE from the original data space, noise-whitened data,
and whitened data, respectively. Since the VSs were generated sequentially and
those generated by a smaller value of nVD were also part of the VSs generated by a
large value of nVS, the number of VSs was set to the largest value, for example, 9 for
TI and 8 for TE.
Apparently, RHSP-MLE did not work well for both scenarios. No matter which
data set was used, it missed at least one panel signature (Fig. 10.5). As also
demonstrated in Fig. 10.5, RHSP-MLE did not pick up panel pixels corresponding
to endmembers in five rows. This is because MLE is not a criterion particularly
designed to find endmembers. By contrast, RHSP-LS/OSP and UFCLS worked very effectively for both scenarios for all types of data sets in Figs. 10.6 and 10.7, where five
panel pixels corresponding to five different mineral signatures were already found
by the first five RHSP-LS/OSP-generated pixel vectors. This is because OP has
been widely used for finding endmembers such as PPI (Boardman 1994). Compared
to RHSP-LS/OSP, UFCLS always extracted a panel pixel in the fifth row as its first
Fig. 10.5 9 VSs found by MLE (original space, noise-whitened data, whitened data). (a) TI. (b) TE

Fig. 10.6 8 VSs found by ATGP (original space, noise-whitened data, whitened data). (a) TI. (b) TE



Fig. 10.7 9 VSs found by UFCLS (original space, noise-whitened data, whitened data). (a) TI. (b) TE

VS and a background pixel as its second VS on all data sets. This indicates that a
background signature is a crucial signature for both TI and TE scenarios and must
be included in the LMM to properly represent data so that UFCLS can perform
LSMA effectively. However, such a background signature is certainly not a pure
signature but rather a mixed signature.

10.7 Real Image Experiments

The image scene shown in Fig. 10.8 (also shown in Fig. 1.10a) was used for experiments. It was acquired by the airborne Hyperspectral Digital Imagery Collection Experiment (HYDICE). It has a size of 64 × 64 pixel vectors with 15 panels in the scene and the ground truth map in Fig. 10.8b (Fig. 1.10b).
As noted in Fig. 10.8b, the panel pixel, p212, marked in yellow is of particular
interest. According to the ground truth, this panel pixel is not a pure panel pixel but
rather a boundary panel pixel marked in yellow. Thus, when an endmember finding
algorithm, such as N-FINDR developed by Winter (1999), is implemented, p221 is
always extracted. However, if an unsupervised target detection algorithm is used,
panel pixel p212, instead of panel pixel p221, is always the one that gets extracted to

Fig. 10.8 (a) A HYDICE panel scene containing 15 panels and (b) ground truth map of spatial locations of the 15 panels (p11, p12, p13; p211, p212, p221, p22, p23; p311, p312, p32, p33; p411, p412, p42, p43; p511, p521, p52, p53)

Table 10.3 nVS estimated by (10.29) with various false alarm probabilities for HYDICE data, where RHSP-LS/OSP, RHSP-MLE, and UFCLS are used to find VSs

                       NPD          NPD          NPD          NPD          NPD
                       PF = 10^-1   PF = 10^-2   PF = 10^-3   PF = 10^-4   PF = 10^-5   MLD
RHSP-LS/OSP            118          118          116          115          114          118
RHSP-LS/OSP (K)        54           53           49           49           46           53
RHSP-LS/OSP (Σn)       33           32           32           30           29           32
RHSP-MLE               128          128          127          126          123          128
RHSP-MLE (K) (MEAC)    79           70           68           59           59           73
RHSP-MLE (Σn)          44           44           43           43           43           44
UFCLS                  F            F            F            F            F            F
UFCLS (K)              133          133          130          129          128          133
UFCLS (Σn)             81           79           79           75           69           79

represent the panel signature in row 2. Also, because of such ambiguity, the panel
signature representing panel pixels in the second row is either p221 or p212, which
endmember finding algorithms always have a difficult time finding. This implies
that the ground truth of R panel pixels in the second row provided in Fig. 10.8b may
not be as accurate as thought.
Similar to the synthetic image experiments, RHSP-MLE was also implemented on the original data, data noise-whitened by $\Sigma_n^{-1/2}$, and data whitened by $K^{-1/2}$, where the noise covariance matrix $\Sigma_n$ was estimated by the method in Du (2012). For comparison, RHSP-LS/OSP and UFCLS were also implemented for the three cases. Table 10.3 tabulates $n_{\text{VS}}$ estimated for various scenarios using the signal energy specified by (10.29).
Obviously, the number of VSs estimated by VD in Table 10.3 for this HYDICE scene is higher than the number of endmembers obtained in Chang (2003, 2013) and Chang and Du (2004). This is mainly because the generated VSs include endmembers, and some of them are not necessarily pure signatures like endmembers.

Fig. 10.9 VSs found by MLE, RHSP-LS/OSP, and UFCLS using (10.28). (a) 118 VSs produced by RHSP-LS/OSP. (b) 128 VSs produced by RHSP-MLE. (c) 81 target pixels produced by UFCLS

Figure 10.9 shows the VSs found by RHSP-LS/OSP, RHSP-MLE, and UFCLS via (10.28) from the original data space, noise-whitened data, and whitened data, respectively. Since these VSs were generated sequentially, and those generated by a smaller value of nVS were also part of the VSs generated by a larger value of nVS, the number of VSs was set to the largest value, for example, 128. As shown in Fig. 10.9,
all methods were able to find pixels corresponding to five panel signatures, and a
large number of VSs are actually background pixels representing mixed signatures.

10.8 Unmixed Error Analysis

This section performs an unmixed error analysis on three algorithms, RHSP-LS/OSP, RHSP-MLE, and UFCLS, where three different data sets, the original data, data noise-whitened by $\Sigma_n^{-1/2}$, and data whitened by $K^{-1/2}$, are used for experiments. In particular, RHSP-MLE implemented on whitened data is exactly MEAC in Du (2012).
1. First, total LSEs were calculated for the entire image scene unmixed by a fully
constrained least-squares method (Heinz and Chang 2001) using the VSs
found in Fig. 10.9 to form an LMM to unmix data samples in the scene.
Figure 10.10a–f plots FCLS-unmixed errors using VSs generated by RHSP-
LS/OSP, RHSP-MLE, and UFCLS methods starting from nVS ¼ 9 up to the
estimated value of nVD, which yielded the minimal FCLS-unmixed errors. As
can be seen, all three unmixed error curves are monotonically decreasing until
the very end. This phenomenon indicates that it is always more effective to
include more signatures for data unmixing since the more signatures there are,
the better the data representation.
2. The best results were obtained by UFCLS, and the worst were derived from RHSP-MLE. This is mainly because UFCLS was designed to
unmix data samples using a specific set of target samples, generally referred to
as spectrally distinct signatures, not necessarily endmembers. As a result,
UFCLS produced the least FCLS-unmixed errors for both the entire image
scene and 19R panel pixels, while RHSP-MLE yielded the maximal errors.
These interesting experiments provide evidence that MLE may not be effective
at finding VSs for LSMA.
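The error curves above are computed with FCLS. As a simplified, hypothetical stand-in, the sketch below computes the total unconstrained least-squares unmixing error for a growing VS set; it exhibits the same monotone decrease, although the full FCLS additionally imposes the sum-to-one and nonnegativity constraints.

```python
import numpy as np

def total_ls_error(X, idx_vs):
    """Total LSE of unmixing every pixel in X (L, N) with the VS set
    X[:, idx_vs] via unconstrained least squares."""
    M = X[:, idx_vs]
    P = M @ np.linalg.pinv(M)           # projector onto the VS hyperplane
    R = X - P @ X                       # unmixing residuals for all pixels
    return float(np.sum(R * R))

rng = np.random.default_rng(4)
X = rng.standard_normal((8, 200))
errors = [total_ls_error(X, list(range(1, p + 1))) for p in range(1, 8)]
# errors is nonincreasing: adding signatures can only improve the fit,
# which mirrors the monotonically decreasing curves in Fig. 10.10.
```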
Table 10.4 tabulates several sudden drops marked on the curves in Fig. 10.10, which indicate significant improvements in the unmixed errors. As we can see,
the first significant drop always occurred in the range from 9 to 18, where 9 was
estimated by the Harsanyi–Farrand–Chang (HFC) method by Harsanyi et al. (1994)
and 18 was twice nVD ¼ 9 in Chang et al. (2010a, 2011b). Such small numbers are
derived from eigenanalysis methods such as eigenvalues that can be used to
determine the number of endmembers. Unlike the first drop, the second drop
occurred in a wider range from 20 to 37 and the third drop occurred in a wider
range from 43 to 78. These high numbers are determined by real target samples and
not necessarily pure samples as endmembers. The first three sudden drops in
Table 10.4 can explain the values in Table 10.3. Furthermore, the results also
demonstrated that a VS is not necessarily an endmember. Obviously, most found
VSs were actually background pixels. This also provides evidence that RHSP-MLE
could not be used to estimate the number of VSs.
It is worth noting that no experiments were conducted for Cuprite Mining District
data as was done here. This is primarily because Cuprite data have been used for
endmember finding, not for LSU, where there is no ground truth to verify our findings.
Nevertheless, HYDICE data experiments should be sufficient to draw conclusions.

Fig. 10.10 FCLS-unmixing errors using ATGP-, MLE (MEAC)-, and UFCLS-generated VSs. (a) FCLS-unmixed error for entire image. (b) FCLS-unmixed error for 19 pixels. (c) FCLS-unmixed error for entire image (K). (d) FCLS-unmixed error for 19 pixels (K). (e) FCLS-unmixed error for entire image (Σn). (f) FCLS-unmixed error for 19 pixels (Σn)

Fig. 10.10 (continued)

Fig. 10.10 (continued)

Table 10.4 First four sudden drops at unmixed curves generated by ATGP, UFCLS, and MLE (entire scene/19R)

              First drop   Second drop   Third drop   Fourth drop
ATGP          9/18         NA/62         NA/71        NA/NA
ATGP (K)      10/12        13/14         73/25        NA/57
ATGP (Σn)     18/10        72/15         NA/20        NA/NA
MLE           13/13        71/15         92/25        NA/61
MLE (K)       15/12        25/85         35/NA        45/NA
MLE (Σn)      14/10        74/13         NA/NA        NA/NA
UFCLS         11/23        25/35         NA/43        NA/49
UFCLS (K)     NA/13        NA/20         NA/28        NA/33
UFCLS (Σn)    10/10        29/20         50/24        NA/31

10.9 Conclusions

In recent years, the simplex volume has been widely used to find endmembers that
are supposed to be pure signatures (Schowengerdt 1997). Unfortunately, these
endmembers may not be effective for LSMA when they are used to perform data
unmixing. This is mainly because the signatures used to form an LMM for LSMA
must be representative, and they are not necessarily pure signatures. This is
particularly true of some background signatures that may be mixed, as demon-
strated in Chang et al. (2010a, 2011b). This issue was already addressed in Chaps. 4
and 9 (Chang 2016). This chapter dealt with the same issue from the viewpoint of
MLE from several new aspects. First and foremost, the concept of VS for LSMA
was introduced. Second, three criteria, LSE, OSP, and MLE, were designed to find
VSs where each of them leads to a particular approach, called LS, OSP, and MLE
methods, respectively. Third, to produce VSs in an unsupervised manner, this
chapter also derived recursive algorithms to implement LS, OSP, and MLE,
where RHSP-MLE turned out to be MEAC (Du 2012). In addition, RHSP-LS and RHSP-OSP were shown to be the same algorithm, which happens to be the recursive algorithm called recursive ATGP (R-ATGP) developed in Gao and Chang (2014) and Chang et al. (2015g), and RHSP-ATGP in Chap. 7.
further determine the number of VSs, a Neyman–Pearson detection theory-based
approach using real target pixels derived from Chang et al. (2014), called target-
specified virtual dimensionality (TSVD) described in Chap. 4, is proposed to
replace VD to reflect real applications.
Chapter 11
Recursive Hyperspectral Sample Processing
of Orthogonal Projection-Based Simplex
Growing Algorithm

Abstract The simplex growing algorithm (SGA) developed by Chang et al.


(A growing method for simplex-based endmember extraction algorithms. IEEE
Transactions on Geoscience and Remote Sensing 44(10): 2804–2819, 2006b) has
been used for finding endmembers and was studied in Chap. 10 of Chang (Real time
progressive hyperspectral image processing: endmember finding and anomaly
detection, Springer, New York, 2016). It can be considered a sequential version
of the well-known N-finder endmember finding algorithm (N-FINDR) developed
by Winter (Proceedings of 13th international conference on applied geologic
remote sensing, Vancouver, BC, Canada, pp. 337–344, 1999a; Image
spectrometry V, Proceedings of SPIE 3753, pp. 266–277, 1999b) to find
endmembers one after another by growing simplexes one vertex at a time. However,
one of the major hurdles for N-FINDR and SGA is the calculation of the simplex
volume (SV), as discussed in Chap. 2, which poses a great challenge in designing
any algorithm that uses an SV to find endmembers. This chapter develops an orthogonal
projection (OP)-based approach to SGA, called OPSGA, which essentially
resolves this computational issue. The idea is based on a geometric SV (GSV) from
structures of simplexes. If we consider a j-vertex simplex Sj specified by
j previously found endmembers as a base and the next endmember mj+1 to be
found as a new vertex to be added to Sj to form a new ( j + 1)-vertex simplex, Sj+1,
then calculating the GSV of this new ( j + 1)-vertex simplex, Sj+1, is equivalent to
multiplying the GSV of Sj, which is considered a base with the OP on the base Sj
from mj+1, which is considered its height. As a result, finding the mj+1 that yields
an Sj+1 with the maximal GSV is equivalent to finding the mj+1 with the maximal
OP on Sj. Under this interpretation, OPSGA converts the problem of calculating a
determinant-based SV (DSV) into one of finding an OP, without actually computing
matrix determinants. Accordingly, OPSGA can be considered a technique for calculating a GSV by OP
(GSV-OP). To further reduce the computational complexity, OPSGA is also
extended to a Kalman filter-like recursive hyperspectral sample
processing of OPSGA (RHSP-OPSGA), which has several advantages and benefits
in terms of computational savings and hardware implementation over N-FINDR
and SGA.


11.1 Introduction

One of the major tasks in hyperspectral image analysis is to find the endmembers
present in hyperspectral imagery, because endmembers can be used to specify particular
spectral classes for data sample vectors. Accordingly, over the past years the
finding of endmembers has attracted considerable interest, and many different
criteria have been proposed to design and develop endmember finding algorithms
(Chang 2013). In particular, the simplex volume (SV) has emerged as a leading
and preferred criterion for optimality due to the fact that it imposes two physical
abundance constraints, an abundance sum-to-one constraint (ASC) and an abun-
dance nonnegativity constraint (ANC). For example, the minimum volume trans-
form (MVT) developed by Craig (1994) is an early approach to using an SV as a
criterion to find a set of endmembers that form a simplex with the minimum
volume to embrace all data sample vectors. In contrast, N-FINDR also finds a set
of endmembers that form a simplex with the maximal volume embedded in the
data space. More specifically, MVT deflates simplexes by reducing SV to include
all data sample vectors until it achieves the minimal SV, as opposed to N-FINDR,
which inflates simplexes embedded in the data space until it yields the maximal
SV. However, these two approaches generally do not produce the same results.
Nevertheless, MVT and N-FINDR have set the pace for future design and devel-
opment in the finding of endmembers. For example, convex cone analysis
(CCA), developed by Ifarraguerri and Chang (1999), and convex cone volume
analysis (CCVA) in Chap. 7 (Chang 2016) and Chang et al. (2016a), arose from
the concept of MVT. On the other hand, N-FINDR has gone through many
different modifications and revisions in various forms, from a redesign of
its sequential versions, such as SeQuential N-FINDR (SQ N-FINDR) and
SuCcessive N-FINDR (SC N-FINDR) in Chap. 6 of Chang (2016), to its progres-
sive version, the simplex growing algorithm (SGA), in Chap. 10 of Chang (2016).
All these variants are focused on developing algorithms to reduce computer
processing time. Unfortunately, these algorithms must still deal with SV
calculation directly.
It is known that N-FINDR faces several challenging issues. First, it must
determine the number of endmembers, p, required for N-FINDR to generate
beforehand. In addition, an exhaustive search must be conducted to find an optimal
set of p endmembers, which is considered to be nearly impossible. To resolve the
first issue, the concept of virtual dimensionality (VD) (Chang 2003a; Chang and Du
2004) can be used for this purpose. As for the second issue, N-FINDR has been
modified as various versions of numerical algorithms to ease the computational
burden (Chang 2013), in particular, in two sequential versions, SQ N-FINDR and
SC N-FINDR (Xiong et al. 2011; Chang 2013). As an alternative approach to easing
computational complexity, Chang et al. (2006b) recently developed a new algo-
rithm, SGA, for finding endmembers by growing simplexes one at a time, where
each newly generated endmember yields the maximal SV during the simplex
growing process. Consequently, SGA significantly reduces the computer

processing time required for calculating SV. However, all the aforementioned
algorithms involve computing SV via a matrix determinant calculation, referred
to as determinant-based SV (DSV) calculation, which usually requires excessively
high computing time. In particular, both N-FINDR and SGA run into the same
challenging numerical issue. That is, the matrices used to calculate the DSV are
generally not of full rank because the number of endmembers, p, used to
form a simplex is relatively small compared to the data band dimensionality, L,
the total number of spectral bands. Under such a circumstance, two commonly
used approaches are suggested. One is to reduce the band dimensionality, L, to
p − 1. A second approach is to use singular value decomposition (SVD) without
band dimensionality reduction (DR). In the former case, different DR transforms
result in different sets of endmembers (Li et al. 2015a). On the other hand, in the
latter case, SVD may cause numerical instability (Li et al. 2015a). Most impor-
tantly, the DSVs found by both approaches are not necessarily true SV, as shown
in Li et al. (2015a). In particular, it was shown in Chang (2013) and Chang et al.
(2016b) that DSV-based SGA and OP-based vertex component analysis (VCA)
are essentially the same as the automatic target generation process (ATGP)
developed in Sect. 4.4.2.3, provided that the initial conditions are appropriately
selected [also see Chaps. 4 and 13 in Chang (2016)]. However, a challeng-
ing issue in computational complexity remains: finding the matrix inverse in
OP-based endmember finding algorithms (EFAs) and calculating DSV in
SV-based EFAs. Interestingly, the issue of how to calculate simplex volume
has received little attention.
In this chapter we present an OP-based SGA (OPSGA) recently developed by Li
and Chang (2016b) that transforms the DSV computational issue into one of finding
OP. Its key idea lies in the fact that an SV can be calculated from a geometric point
of view, referred to as geometric SV (GSV) calculation, by multiplying the SV of its
base by its height, where the base of a simplex is formed by previously found
endmembers and its height is actually the OP of the next new endmember perpen-
dicular to the base. If we grow simplexes one vertex at a time from a single vertex,
then the vertices of previously constructed simplexes will form part of the vertices
of subsequently constructed simplexes. By taking advantage of such a growing
process, we can construct a ( j + 1)-vertex simplex from the previously generated j-
vertex simplex by adding a newly generated endmember, mj+1, as the ( j + 1)th
vertex. SGA turns out to be a perfect candidate to be used for this purpose because it
grows SV one after another in sequence, and vertices previously found by SGA are
indeed part of the vertices of subsequent simplexes found by SGA. The next issue is
finding a desired mj+1. To do so, we first consider the SGA-generated j-vertex
simplex $S_j^{\mathrm{SGA}}$ as a base from which SGA grows by adding its next, ( j + 1)th,
endmember mj+1 to $S_j^{\mathrm{SGA}}$ to form a new ( j + 1)-vertex simplex, $S_{j+1}^{\mathrm{SGA}}$. In this case,
the volume of $S_{j+1}^{\mathrm{SGA}}$ can be calculated simply by multiplying the volume of $S_j^{\mathrm{SGA}}$
by its height, found as the OP of mj+1 perpendicular to the base, which is exactly $S_j^{\mathrm{SGA}}$.

Interestingly, this height can be further found via the OP of mj+1 in the
complement subspace orthogonal to the base $S_j^{\mathrm{SGA}}$ through orthogonal subspace
projection (OSP), discussed in Sect. 4.4.2.1 of Chap. 4. In other words, we can use
the vertices previously found by SGA, m1, m2, . . ., mj, to form an undesired signature
matrix $\mathbf{U}_j = [\mathbf{m}_1\ \mathbf{m}_2 \cdots \mathbf{m}_j]$ and find its orthogonal complement subspace via the
operator $P^{\perp}_{\mathbf{U}_j} = \mathbf{I} - \mathbf{U}_j\mathbf{U}_j^{\#}$ specified by (4.19), with $\mathbf{U}_j^{\#} = \left(\mathbf{U}_j^T\mathbf{U}_j\right)^{-1}\mathbf{U}_j^T$
being the pseudo-inverse of $\mathbf{U}_j$. Then the height over the base $S_j^{\mathrm{SGA}}$ must lie in the
subspace specified by $P^{\perp}_{\mathbf{U}_j}$. More specifically, we can find

$$\mathbf{m}_{j+1} = \arg\max_{\mathbf{r}}\left\{\left\|P^{\perp}_{\mathbf{U}_j}\mathbf{r}\right\|\right\}  (11.1)$$

as its height, which yields the $S_{j+1}^{\mathrm{SGA}}$ with the maximal GSV. It also turns out that
(11.1) can be carried out by ATGP. In other words, OPSGA can be interpreted as a
new application of ATGP to finding new endmembers. Interestingly, this is also
demonstrated in Chap. 13 of Chang (2016) and in Chang et al. (2016b), where the
endmembers $\left\{\mathbf{e}_j\right\}_{j=1}^{p}$ found by SGA [referred to as determinant-based
SGA (DSGA) in Chap. 2] are actually identical to those found by ATGP, provided
that the initial condition used by SGA and ATGP as the first endmember e1 is the same.
This indicates that OPSGA essentially finds the same set of endmembers $\left\{\mathbf{e}_j\right\}_{j=1}^{p}$
found by SGA. That is, OPSGA and SGA produce identical sets of
endmembers, but SGA requires repeatedly calculating the DSV through matrix
determinants.
According to (11.1), $P^{\perp}_{\mathbf{U}_j}$ requires inverting the matrix $\mathbf{U}_j$. For SGA to generate new
endmembers, $P^{\perp}_{\mathbf{U}_j}$ must be recomputed over and over again as $\mathbf{U}_j$ grows.
This is because each time a new endmember mj+1 is generated, Uj is
augmented by this new endmember to form a new Uj+1. That is, to
generate the ( j + 1)th endmember, mj+1, $P^{\perp}_{\mathbf{U}_{j+1}}$ must be recalculated from the new Uj+1,
which comprises Uj and the new endmember mj+1. Since a recursive version of
ATGP, referred to as RHSP-ATGP, is derived in Chap. 7, a similar idea can be used
to derive a recursive version of OPSGA, called RHSP-OPSGA, to address this issue.
Specifically, RHSP-OPSGA can be considered a Kalman-like filter that uses
a recursive equation to update $P^{\perp}_{\mathbf{U}_{j+1}}$ via $P^{\perp}_{\mathbf{U}_j}$ with no need to reprocess all
the previous j endmembers, m1, m2, . . ., mj. Most interestingly, the recursive equations
also provide a means of calculating the GSV recursively while also determining
the number of endmembers required for SGA to generate.
These two advantages have never been reported in the literature and are significant
for hardware design and implementation.
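The max-OP search in (11.1) can be sketched in a few lines of NumPy. This is a minimal illustration, not the book's implementation; the function name `atgp_like_selection`, the column-wise pixel layout, and the maximal-norm initialization are assumptions made here for concreteness.

```python
import numpy as np

def atgp_like_selection(X, p):
    """Pick p pixels by repeatedly maximizing the orthogonal projection
    residual ||P_U^perp r|| over all pixels r, as in (11.1).
    X: L x N matrix whose columns are pixel vectors; returns pixel indices."""
    L, N = X.shape
    idx = [int(np.argmax(np.sum(X**2, axis=0)))]   # start from max-norm pixel
    P = np.eye(L)                                  # projector for empty U
    for _ in range(1, p):
        u = X[:, idx[-1]]
        Pu = P @ u
        P = P - np.outer(Pu, Pu) / (Pu @ Pu)       # deflate the newest signature
        idx.append(int(np.argmax(np.sum((P @ X)**2, axis=0))))
    return idx
```

The rank-one deflation inside the loop keeps `P` equal to $\mathbf{I} - \mathbf{U}_j\mathbf{U}_j^{\#}$ without ever forming a matrix inverse explicitly, which is the computational point the chapter develops.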

11.2 Simplex Volume Analysis

According to Sect. 2.2 of Chap. 2, the volume of a p-vertex simplex,
$S_p = S\left(\mathbf{m}_1, \mathbf{m}_2, \ldots, \mathbf{m}_p\right)$, formed by any p data sample vectors m1, m2, . . ., mp is
defined as V(m1, m2, . . ., mp) and can be calculated by the following determinant:

$$\mathrm{DSV}\left(\mathbf{m}_1, \mathbf{m}_2, \ldots, \mathbf{m}_p\right) = \frac{\left|\mathrm{Det}\begin{bmatrix} 1 & 1 & \cdots & 1 \\ \mathbf{m}_1 & \mathbf{m}_2 & \cdots & \mathbf{m}_p \end{bmatrix}\right|}{(p-1)!}.  (11.2)$$

The p vertices of S(m1, m2, . . ., mp), m1, m2, . . ., mp in (11.2), are L-dimensional
vectors. The SGA developed by Chang et al. (2006b) uses (11.2) to calculate the
DSV without DR. More specifically, SGA is designed to find a set of j data sample
vectors, denoted by $\left\{\mathbf{m}_1^*, \mathbf{m}_2^*, \ldots, \mathbf{m}_j^*\right\}$, one after another in such a way that for each
$1 \le j \le p$, $\mathbf{m}_j^*$ yields the maximum DSV of (11.2), that is,

$$\mathbf{m}_j^* = \arg\max_{\mathbf{m}_j}\left\{\mathrm{DSV}\left(\mathbf{m}_1^*, \ldots, \mathbf{m}_{j-1}^*, \mathbf{m}_j\right)\right\}.  (11.3)$$

To emphasize that SGA uses the matrix determinant specified by (11.2) to calculate
the DSV in (11.3), such an SGA is referred to as DSGA, as noted in Sect. 2.2 of
Chap. 2. As we can see from (11.3), the most time-consuming step is to repeatedly
recalculate matrix determinants via (11.2).
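A direct implementation of (11.2) is straightforward once the data have been reduced to p − 1 bands so that the bordered matrix is square. The sketch below is an illustration under that assumption (the function name `dsv` is ours, not the book's):

```python
import numpy as np
from math import factorial

def dsv(vertices):
    """Determinant-based simplex volume, Eq. (11.2), for p vertices given as
    columns of an L x p array; the determinant form requires L = p - 1,
    i.e., band dimensionality reduced to p - 1 as discussed in Sect. 11.1."""
    L, p = vertices.shape
    assert L == p - 1, "bordered matrix must be square"
    M = np.vstack([np.ones((1, p)), vertices])   # prepend the row of ones
    return abs(np.linalg.det(M)) / factorial(p - 1)
```

For the unit right triangle with vertices (0, 0), (1, 0), (0, 1), this returns 0.5, the triangle's area.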

11.3 Orthogonal Projection-Based Simplex Growing


Algorithm

In order for SGA to find endmembers, we need to calculate the DSV via (11.2)
through the determinant $\mathrm{Det}\begin{bmatrix} 1 & \cdots & 1 & 1 \\ \mathbf{e}^{(1)} & \cdots & \mathbf{e}^{(j-1)} & \mathbf{r} \end{bmatrix}$ for each j to produce the jth
endmember. Although fast computational methods have been proposed in Xiong
et al. (2010) to relieve the computational complexity, they do not change the fact
that a determinant must be computed. A recent study by Chen et al. (2014) showed that
finding the height of a simplex is equivalent to finding the OP onto the hyperplane
linearly spanned by its base. Since the volume of a simplex can be calculated by
multiplying its base by its height, this implies that the simplex OP is actually the OP
from a vertex perpendicular to the simplex formed by all the remaining vertices.
Figure 11.1a illustrates this concept with p = 3, where a two-dimensional
(2D) three-vertex simplex has m1 and m2 forming its base and a third vertex
represented by three different data sample vectors, $\mathbf{m}_3$, $\tilde{\mathbf{m}}_3$, and $\hat{\mathbf{m}}_3$, with the same
magnitude of OP specified by $\mathbf{m}_3A$, $\tilde{\mathbf{m}}_3B$, and $\hat{\mathbf{m}}_3C$ as their heights when these three

Fig. 11.1 Finding volumes of 2D and 3D simplexes. (a) 2D three-vertex simplexes. (b) 3D four-vertex simplexes

Fig. 11.2 Finding endmembers by ATGP, VCA, and SGA via OP and simplex

vertices are orthogonally projected onto the line connecting m1 and m2. This
indicates that the volumes of the three three-vertex simplexes, $\mathbf{m}_1\mathbf{m}_2\mathbf{m}_3$,
$\mathbf{m}_1\mathbf{m}_2\tilde{\mathbf{m}}_3$, and $\mathbf{m}_1\mathbf{m}_2\hat{\mathbf{m}}_3$, are identical because they have the same base, formed by the
line segment connecting m1 and m2, and the same height, specified by the identical
magnitude of the three different OPs.

Similarly, Fig. 11.1b shows the case of p = 4, where a three-dimensional
(3D) four-vertex simplex has a triangle formed by m1, m2, and m3 as its base
and a fourth vertex represented by three different data sample vectors, $\mathbf{m}_4$, $\tilde{\mathbf{m}}_4$, and
$\hat{\mathbf{m}}_4$, with the same magnitude of OP specified by $\mathbf{m}_4A$, $\tilde{\mathbf{m}}_4B$, and $\hat{\mathbf{m}}_4C$ as their heights
when these three vertices are orthogonally projected onto a hyperplane where the
triangle formed by m1, m2, and m3 lies. Figure 11.2 further shows how this concept
plays out for the three different EFAs discussed in Chap. 3 (Chang 2016): ATGP,
VCA, and SGA.

Assume that m1, m2, and m3 are three previously found endmembers that form a
hyperplane $U = \langle\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3\rangle$, and that the fourth endmember to be found by ATGP,
VCA, and SGA is denoted by $\mathbf{t}_4^{\mathrm{ATGP}}$, $\mathbf{t}_4^{\mathrm{VCA}}$, and $\mathbf{t}_4^{\mathrm{SGA}}$, with $\left(\mathbf{t}_4^{\mathrm{ATGP}}\right)^{\perp}$, $\left(\mathbf{t}_4^{\mathrm{VCA}}\right)^{\perp}$, and
$\left(\mathbf{t}_4^{\mathrm{SGA}}\right)^{\perp}$ representing their corresponding OP vectors in the subspace specified by
$P_U^{\perp}$. The resulting OPs are denoted by their vector lengths, $\left\|\left(\mathbf{t}_4^{\mathrm{ATGP}}\right)^{\perp}\right\|$, $\left\|\left(\mathbf{t}_4^{\mathrm{VCA}}\right)^{\perp}\right\|$,
and $\left\|\left(\mathbf{t}_4^{\mathrm{SGA}}\right)^{\perp}\right\|$.

From Fig. 11.2 we see that, instead of using (11.2) directly to calculate the DSV
formed by the three vertices (m1, m2, m3) and the fourth vertex $\mathbf{t}_4^{\mathrm{SGA}}$ as
$\mathrm{DSV}\left(\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3, \mathbf{t}_4^{\mathrm{SGA}}\right)$, an equivalent method of finding the DSV is to calculate the
GSV as the product of its base, which is the GSV(m1, m2, m3) of the simplex formed
by the three previously found endmembers m1, m2, m3, that is, the area of
the triangle formed by m1, m2, m3, with its height specified by the OP, $\left\|\left(\mathbf{t}_4^{\mathrm{SGA}}\right)^{\perp}\right\|$,
that is,

$$\mathrm{GSV}\left(\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3, \mathbf{t}_4^{\mathrm{SGA}}\right) = (1/3!)\left\|\left(\mathbf{t}_4^{\mathrm{SGA}}\right)^{\perp}\right\| \cdot \mathrm{GSV}\left(\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3\right).  (11.4)$$

Similarly, we can also obtain

$$\mathrm{GSV}\left(\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3, \mathbf{t}_4^{\mathrm{ATGP}}\right) = (1/3!)\left\|\left(\mathbf{t}_4^{\mathrm{ATGP}}\right)^{\perp}\right\| \cdot \mathrm{GSV}\left(\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3\right),  (11.5)$$

$$\mathrm{GSV}\left(\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3, \mathbf{t}_4^{\mathrm{VCA}}\right) = (1/3!)\left\|\left(\mathbf{t}_4^{\mathrm{VCA}}\right)^{\perp}\right\| \cdot \mathrm{GSV}\left(\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3\right).  (11.6)$$

Since $\mathbf{t}_4^{\mathrm{SGA}}$ must be the one to produce the maximal GSV, this implies that

$$\mathrm{GSV}\left(\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3, \mathbf{t}_4^{\mathrm{SGA}}\right) \ge \mathrm{GSV}\left(\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3, \mathbf{t}_4^{\mathrm{ATGP}}\right),  (11.7)$$

which implies

$$\left\|\left(\mathbf{t}_4^{\mathrm{SGA}}\right)^{\perp}\right\| \ge \left\|\left(\mathbf{t}_4^{\mathrm{ATGP}}\right)^{\perp}\right\|  (11.8)$$

because of (11.4) and (11.5). On the other hand, ATGP finds the $\mathbf{t}_4^{\mathrm{ATGP}}$ that produces the
maximal OP among all data sample vectors, that is, $\left\|\left(\mathbf{t}_4^{\mathrm{ATGP}}\right)^{\perp}\right\|^2 =
\max_{\mathbf{r}}\left(P_U^{\perp}\mathbf{r}\right)^T\left(P_U^{\perp}\mathbf{r}\right) = \left(P_U^{\perp}\mathbf{t}_4^{\mathrm{ATGP}}\right)^T\left(P_U^{\perp}\mathbf{t}_4^{\mathrm{ATGP}}\right)$, i.e.,

$$\left\|\left(\mathbf{t}_4^{\mathrm{ATGP}}\right)^{\perp}\right\|^2 \ge \left(P_U^{\perp}\mathbf{t}_4^{\mathrm{SGA}}\right)^T\left(P_U^{\perp}\mathbf{t}_4^{\mathrm{SGA}}\right) = \left\|\left(\mathbf{t}_4^{\mathrm{SGA}}\right)^{\perp}\right\|^2,  (11.9)$$

which yields

$$\left\|\left(\mathbf{t}_4^{\mathrm{ATGP}}\right)^{\perp}\right\| \ge \left\|\left(\mathbf{t}_4^{\mathrm{SGA}}\right)^{\perp}\right\|.  (11.10)$$

Combining (11.8) and (11.10) yields $\left\|\left(\mathbf{t}_4^{\mathrm{ATGP}}\right)^{\perp}\right\| = \left\|\left(\mathbf{t}_4^{\mathrm{SGA}}\right)^{\perp}\right\|$. As a result, their
simplex volumes must be the same, given by

$$\mathrm{GSV}\left(\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3, \mathbf{t}_4^{\mathrm{ATGP}}\right) = \mathrm{GSV}\left(\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3, \mathbf{t}_4^{\mathrm{SGA}}\right).  (11.11)$$

With regard to VCA, if the maximal OP is used as a criterion for optimality, then

$$\left\|\left(\mathbf{t}_4^{\mathrm{ATGP}}\right)^{\perp}\right\| \ge \left\|\left(\mathbf{t}_4^{\mathrm{VCA}}\right)^{\perp}\right\|.  (11.12)$$

According to (11.6), this implies that

$$\mathrm{GSV}\left(\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3, \mathbf{t}_4^{\mathrm{ATGP}}\right) \ge \mathrm{GSV}\left(\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3, \mathbf{t}_4^{\mathrm{VCA}}\right).  (11.13)$$

Also, if the maximal GSV is used as the optimality criterion, then

$$\mathrm{GSV}\left(\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3, \mathbf{t}_4^{\mathrm{SGA}}\right) \ge \mathrm{GSV}\left(\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3, \mathbf{t}_4^{\mathrm{VCA}}\right).  (11.14)$$

In the cases of both (11.13) and (11.14), the best performance of VCA can be
achieved by ATGP in terms of the maximal OP and by SGA in terms of the
maximal GSV.

The foregoing argument concludes that, even though the three target pixels
$\mathbf{t}_4^{\mathrm{ATGP}}$, $\mathbf{t}_4^{\mathrm{VCA}}$, and $\mathbf{t}_4^{\mathrm{SGA}}$ may be different in the original data space, they produce an
identical OP in the orthogonal complement subspace specified by $P_U^{\perp}$, that is,
$\left\|\left(\mathbf{t}_4^{\mathrm{ATGP}}\right)^{\perp}\right\| = \left\|\left(\mathbf{t}_4^{\mathrm{VCA}}\right)^{\perp}\right\| = \left\|\left(\mathbf{t}_4^{\mathrm{SGA}}\right)^{\perp}\right\|$. Accordingly, their GSVs are actually
the same:

$$\mathrm{GSV}\left(\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3, \mathbf{t}_4^{\mathrm{ATGP}}\right) = \mathrm{GSV}\left(\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3, \mathbf{t}_4^{\mathrm{VCA}}\right) = \mathrm{GSV}\left(\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3, \mathbf{t}_4^{\mathrm{SGA}}\right).  (11.15)$$
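The height-times-base identity underlying (11.4)-(11.15) can be checked numerically. The following toy script (ours, using the translated-edge formulation discussed later in this section, not the book's code) verifies that the Gram-determinant volume of a grown simplex equals the base volume times the orthogonal-projection height of the new vertex:

```python
import numpy as np
from math import factorial

# Numeric check: after translating by m1, the volume of the grown simplex
# equals the base volume times the OP height of the new vertex over the base.
rng = np.random.default_rng(1)
L = 6
m1, m2, m3 = rng.normal(size=(3, L))
B = np.column_stack([m2 - m1, m3 - m1])          # translated base edges
P = np.eye(L) - B @ np.linalg.pinv(B)            # projector onto base's complement
base_vol = np.sqrt(np.linalg.det(B.T @ B)) / factorial(2)

r = rng.normal(size=L)                           # a candidate fourth vertex
A = np.column_stack([m2 - m1, m3 - m1, r - m1])
grown_vol = np.sqrt(np.linalg.det(A.T @ A)) / factorial(3)
height = np.linalg.norm(P @ (r - m1))
assert abs(grown_vol - height * base_vol / 3) < 1e-9
```

Since the base volume is common to every candidate vertex, ranking candidates by this height ranks them by grown volume, which is exactly why the OP criterion and the GSV criterion select the same pixel.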

It should be noted that there is a significant difference between finding OPs
perpendicular to endmembers and finding OPs perpendicular to a simplex.
Figure 11.3 shows a three-endmember simplex example for illustration. Assume
that m1, m2, and m3 are three endmembers that form a simplex
$S(\mathbf{m}_1, \mathbf{m}_2-\mathbf{m}_1, \mathbf{m}_3-\mathbf{m}_1)$, which is specified by the triangle connecting the
three vertices m1, m2 − m1, and m3 − m1 lying on the hyperplane highlighted in
blue, and $S(\mathbf{m}_1, \mathbf{m}_2-\mathbf{m}_1)$ is simply the segment connecting m1 and m2, with the
simplex volume given by the vector length of m2 − m1, $\|\mathbf{m}_2-\mathbf{m}_1\|$. Since
$S(\mathbf{m}_1, \mathbf{m}_2-\mathbf{m}_1)$ must satisfy the ASC, with the abundances of m1 and m2 summing
to one, $S(\mathbf{m}_1, \mathbf{m}_2-\mathbf{m}_1)$ is reduced to a one-dimensional (1D) simplex.

Thus, the third endmember for $S(\mathbf{m}_1, \mathbf{m}_2-\mathbf{m}_1, \mathbf{m}_3-\mathbf{m}_1)$ is found by operating
$P^{\perp}_{[\mathbf{m}_2-\mathbf{m}_1]}$ on the space perpendicular to the segment m2 − m1. The data sample
vector m3 is assumed to be the one that yields the maximal OP. In this case, m3 can

Fig. 11.3 Three-endmember simplex example


be expressed as a sum of two terms, m3 ¼ m⊥ ⊥
3 þ ðm3 Þ⊥, where m3 ¼ P½m2 m1  m3 is
the maximal OP on hm2  m1 , m3  m1 i⊥ perpendicular to the hyperplane
hm2  m1 , m3  m1 i , and ðm3 Þ⊥ ¼ m3  m⊥ 3 ¼ P½m2 m1  m3 is the projection
onto the hyperplane hm2  m1 , m3  m1 i (Fig. 11.3). This
Sðm1 , m2  m1 , m3  m1 Þ is simply the triangle Δ(m1, m2, m3) obtained by adding
to the two-vertex simplex Sðm n 1 , m2 m1 Þ a third endmember m3, which yields
 ⊥ o
the maximal OP, m3 ¼ arg maxr P r , where the segment Am3 is
½m2 m1 
perpendicular to Sðm1 , m2  m1 Þ and its volume is calculated as
GSVðm1 , m2  m1 , m3  m1 Þ ¼ m⊥ 3  ð m 2  m1 Þ.

However, as also shown in Fig. 11.3, t⊥ 3 ¼ P½m1 m2  t3 is assumed to be the one
yielding the maximal OP perpendicular to the hyperplane
n highlighted inored linearly
 ⊥ 
spanned by m1 and m2 with no ASC, where t3 ¼ arg maxr P   r is derived
½m1 m2 
from the well-known ATGP and is different from m3, which is found by the
segment m2  m1 rather than the space linearly spanned by m1 and m2. Then the
simplex specified by the triangle formed by m1, m2, and t⊥ 3 is highlighted in red, and
its volume can be calculated by GSVðm1 ; m2 ; t3 Þ ¼ t⊥ 3 GSV ðm1 ; m2 Þ according to
(11.5). Obviously the simplex formed by the
triangle Δ(m 1, m2, m3) is different
from that formed by the triangle Δ m1 ; m2 ; t⊥ 3 . This major difference arises from

the fact that P½m1 m2  operates on a 2D space linearly spanned by two independent
vectors, m1 and m2, while P⊥ ½m2 m1  operates on a 1D space spanned by m2  m1 due
to the fact that the effect of the vector m1 is removed from the simplex volume
calculation simply by placing m1 at the origin. Nevertheless, if all data sample
vectors
in
the data space are linearly translated by m1, then
Δ m1 ; m2 ; t⊥
3 ¼ Δðm1 ; m2 ; m3 Þ. In other words, m1 is considered a biased vector.
Once it is removed, DSGA and ATGP will produce the same set of endmembers
and the same GSV.
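The following small NumPy illustration (ours, with arbitrary random vectors, not the book's code) makes the bias-vector point concrete: the height of a candidate pixel over the 2D span ⟨m1, m2⟩ generally differs from its height over the 1D edge m2 − m1 after translation by m1, which is exactly why the translation matters.

```python
import numpy as np

# Projecting perpendicular to the 2D span <m1, m2> (what ATGP does on raw
# data) differs from projecting perpendicular to the 1D edge m2 - m1 (what
# the translated simplex uses); the two only coincide on data translated by m1.
rng = np.random.default_rng(2)
L = 4
m1, m2, r = rng.normal(size=(3, L))

def perp(A):                                   # P_A^perp = I - A A^#
    return np.eye(A.shape[0]) - A @ np.linalg.pinv(A)

P_span = perp(np.column_stack([m1, m2]))       # 2D span of m1 and m2
P_edge = perp((m2 - m1).reshape(-1, 1))        # 1D span of m2 - m1

h_span = np.linalg.norm(P_span @ r)            # height of raw pixel r
h_edge = np.linalg.norm(P_edge @ (r - m1))     # height of translated pixel
assert abs(h_span - h_edge) > 1e-6             # they differ in general
```

Translating every pixel by m1 maps m1 to the origin, so the span of the translated pair collapses to the edge's 1D span and the two projections agree, as the text states.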

11.4 Orthogonal Projection-Based SGA (OPSGA)

According to Sect. 2.3 of Chap. 2, a j-dimensional simplex Sj+1 is a ( j + 1)-vertex
convex polytope. For example, a simplex in 1D space is simply a line segment
between two specified points. A 2D simplex is a triangle (three vertices), a 3D
simplex is a tetrahedron (four vertices), and so on. Thus, the volume of a simplex
Sj+1, denoted by V(Sj+1) (viz., the length of a 1D simplex, j = 1; the area of a 2D
simplex, j = 2; the volume of a 3D simplex, j = 3; and so on) can be expressed very
simply as a function of the coordinates of the j + 1 vertices. Suppose that B(Sj+1) is
the base of Sj+1, V(B(Sj+1)) is the volume of B(Sj+1), and h(Sj+1) is the height of
Sj+1, that is, the perpendicular distance of the ( j + 1)th vertex (the apex) from the
subspace containing the base B(Sj+1). The volume V(Sj+1) of Sj+1 is then
given by (http://mathpages.com/home/kmath664/kmath664.htm)

$$\mathrm{GSV}\left(S_{j+1}\right) \equiv V\left(S_{j+1}\right) = V\left(B\left(S_{j+1}\right)\right)\int_0^{h\left(S_{j+1}\right)}\left(\frac{h}{h\left(S_{j+1}\right)}\right)^j dh = V\left(B\left(S_{j+1}\right)\right)\frac{h\left(S_{j+1}\right)}{j+1} = (1/(j+1)!)\,h\left(S_{j+1}\right)h\left(S_j\right)\cdots h\left(S_1\right),  (11.16)$$

where h(Sj+1) is the height of the simplex Sj+1, which is perpendicular to its base
B(Sj+1). For example, h(S2) is the height of a two-vertex simplex S2, which is simply
the distance from its second vertex to the first; h(S3) is the height of a triangle S3, which
is the distance of the third vertex perpendicular to the line segment connecting the two
vertices specified by S2; h(S4) is the height of a pyramid S4, which is the distance of the
fourth vertex perpendicular to the plane containing the triangle S3 formed by the first three vertices; and so on. The
method using (11.16) to calculate the GSV is called OPSGA.
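Equation (11.16) says the GSV is a product of successive heights. One way to sketch this computation (a device of ours, since QR decomposition is not mentioned in the chapter) is to note that the diagonal of the R factor in a QR decomposition of the translated edge vectors contains exactly those heights:

```python
import numpy as np
from math import factorial

def gsv_by_heights(vertices):
    """GSV as a product of heights, in the spirit of Eq. (11.16): translate so
    the first vertex is at the origin, then each |R[k,k]| of the QR factor is
    the height of edge k above the span of the earlier edges.
    vertices: L x p array whose columns are the p vertices."""
    v = vertices - vertices[:, [0]]          # put m1 at the origin
    E = v[:, 1:]                             # edge vectors m_j - m_1
    R = np.linalg.qr(E, mode='r')            # heights sit on R's diagonal
    heights = np.abs(np.diag(R))
    return heights.prod() / factorial(E.shape[1])
```

For the unit tetrahedron with vertices at the origin and the three standard basis vectors, this returns 1/6, the expected 3D simplex volume.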
As an alternative, the volume of a simplex Sn can also be calculated as

$$\mathrm{DSV}\left(S_n\right) = \frac{\left|\mathrm{Det}\begin{bmatrix} 1 & 1 & \cdots & 1 \\ \mathbf{v}_1 & \mathbf{v}_2 & \cdots & \mathbf{v}_n \end{bmatrix}\right|}{(n-1)!},  (11.17)$$

which is identical to (11.2). The v1, v2, . . ., vn in (11.17) are the n vertices of Sn,
which are L-dimensional vectors. The method using (11.17) to calculate the DSV is
called DSGA. Details on DSV calculation by (11.16) and (11.17) can be found in Li
et al. (2015).
Theoretically, both (11.16) and (11.17) should yield the same SV. In practice,
however, they do not produce the same answers, owing to the use of SVD in
(11.17), an issue from which (11.16) does not suffer. Consequently, OPSGA is
generally preferred to DSGA for two reasons. First, DSGA may not yield correct
results when it is used to find the determinant of a nonsquare matrix. Second,
DSGA generally entails very high computational costs, while OPSGA does not.
Finally, it should be noted that, since a simplex is fully constrained, extra care is
required when computing its volume. Assume that a simplex $S_p = S\left(\mathbf{m}_1, \ldots, \mathbf{m}_p\right)$ is
formed by p vertices, m1, . . ., mp. Then the correct volume is actually computed by

$$\mathrm{DSV}\left(S_p\right) = \frac{1}{(p-1)!}\left|\mathrm{Det}\left(\mathbf{M}_E\right)\right|,  (11.18)$$

where

$$\mathrm{Det}\left(\mathbf{M}_E\right) = \mathrm{Det}\begin{bmatrix} 1 & 1 & \cdots & 1 \\ \mathbf{m}_1 & \mathbf{m}_2 & \cdots & \mathbf{m}_p \end{bmatrix} = \mathrm{Det}\begin{bmatrix} 1 & 0 & \cdots & 0 \\ \mathbf{m}_1 & \mathbf{m}_2-\mathbf{m}_1 & \cdots & \mathbf{m}_p-\mathbf{m}_1 \end{bmatrix} = \mathrm{Det}\left[\mathbf{m}_2-\mathbf{m}_1 \;\; \mathbf{m}_3-\mathbf{m}_1 \;\cdots\; \mathbf{m}_p-\mathbf{m}_1\right].  (11.19)$$

In other words, the correct volume of the simplex Sp should be computed from
the simplex with m1 centered at the origin. To do so, we simply subtract m1
from the other p − 1 endmembers, m2, . . ., mp, by letting $\tilde{\mathbf{m}}_j = \mathbf{m}_j - \mathbf{m}_1$ for all
$2 \le j \le p$, and the correct volume of Sp can be calculated as follows:

$$\mathrm{GSV}\left(S_p\right) = \frac{1}{(p-1)!}\left|\mathrm{Det}\left[\tilde{\mathbf{m}}_2 \;\; \tilde{\mathbf{m}}_3 \;\cdots\; \tilde{\mathbf{m}}_p\right]\right|.  (11.20)$$
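The column-reduction identity in (11.19) is easy to confirm numerically. The following check (ours, on arbitrary random vertices) compares the bordered determinant with the determinant of the translated edge vectors:

```python
import numpy as np
from math import factorial

# Subtracting the first vertex from the others turns the bordered determinant
# of (11.18)-(11.19) into an ordinary (p-1) x (p-1) determinant of edges.
rng = np.random.default_rng(3)
p = 4
M = rng.normal(size=(p - 1, p))                    # p vertices in R^{p-1}
bordered = np.vstack([np.ones((1, p)), M])
lhs = np.linalg.det(bordered)                      # Det(M_E), bordered form
rhs = np.linalg.det(M[:, 1:] - M[:, [0]])          # Det[m2-m1 ... mp-m1]
assert abs(lhs - rhs) < 1e-9
dsv = abs(lhs) / factorial(p - 1)                  # Eq. (11.18)
```

The translated form is the one that generalizes to L > p − 1 bands via the Gram determinant, which is why (11.20) underlies the GSV computation used in the rest of the chapter.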

11.5 Recursive OP-Simplex Growing Algorithm

According to (11.4), two operations are involved in GSV computation:
$V\left(\mathbf{m}_1, \ldots, \mathbf{m}_{j-1}\right)$ and the maximal OP found by SGA.

11.5.1 Recursive GSV Calculation

By virtue of (11.4)-(11.7) and (11.12), a general expression can be further
represented by

$$\mathrm{GSV}\left(\mathbf{m}_1, \ldots, \mathbf{m}_{j-1}, \mathbf{t}_j\right) = (1/(j-1))\left\|\mathbf{t}_j^{\perp}\right\| V\left(\mathbf{m}_1, \ldots, \mathbf{m}_{j-1}\right) = \frac{1}{(j-1)!}\left|\mathrm{Det}\left[\tilde{\mathbf{m}}_2 \;\; \tilde{\mathbf{m}}_3 \;\cdots\; \tilde{\mathbf{m}}_{j-1} \;\; \tilde{\mathbf{t}}_j\right]\right|  (11.21)$$

for $1 \le j \le p$, where tj is the jth endmember to be determined, $\tilde{\mathbf{t}}_j = \mathbf{t}_j - \mathbf{m}_1$, and $\mathbf{t}_j^{\perp}$ is
the OP of tj orthogonally projected onto the hyperplane containing the points
m1, . . ., mj−1 in geometric space. It is worth noting that $\mathbf{t}_j^{\perp}$ is the same as the OP of
$\tilde{\mathbf{t}}_j = \mathbf{t}_j - \mathbf{m}_1$ orthogonally projected onto the hyperplane $\left\langle\tilde{\mathbf{U}}_{j-1}\right\rangle$, with $\tilde{\mathbf{U}}_{j-1} =
\left[\tilde{\mathbf{m}}_2 \;\; \tilde{\mathbf{m}}_3 \;\cdots\; \tilde{\mathbf{m}}_{j-1}\right]$ and $\tilde{\mathbf{m}}_j = \mathbf{m}_j - \mathbf{m}_1$ for all $2 \le j \le p$. Also, since m1 is a constant
vector, $\left\|\tilde{\mathbf{t}}_j^{\perp}\right\| = \left\|\mathbf{t}_j^{\perp}\right\|$.
To find the jth endmember, mj, we use (11.20) and (11.21) to find a data sample
vector $\tilde{\mathbf{m}}_j = \mathbf{m}_j - \mathbf{m}_1$ that yields the maximal GSV of (11.16), that is,

$$\tilde{\mathbf{m}}_j = \arg\max_{\mathbf{t}_j}\left\{\mathrm{GSV}\left(\mathbf{m}_1, \ldots, \mathbf{m}_{j-1}, \tilde{\mathbf{t}}_j\right)\right\} = \arg\max_{\mathbf{t}_j}\left\{\left\|\tilde{\mathbf{t}}_j^{\perp}\right\| \cdot \mathrm{GSV}\left(\mathbf{m}_1, \ldots, \mathbf{m}_{j-1}\right)\right\},  (11.22)$$

which is equivalent to finding a target data sample vector tj producing the maximal
OP in the space $\left\langle\tilde{\mathbf{U}}_{j-1}\right\rangle^{\perp}$ perpendicular to the linear subspace $\left\langle\tilde{\mathbf{U}}_{j-1}\right\rangle$ spanned by
the previously found j − 1 endmembers, $\tilde{\mathbf{m}}_2, \ldots, \tilde{\mathbf{m}}_{j-1}$. Interestingly, solving
(11.22) is equivalent to solving

$$\tilde{\mathbf{m}}_j = \arg\max_{\tilde{\mathbf{t}}_j}\left\{\left\|\tilde{\mathbf{t}}_j^{\perp}\right\|\right\} = \arg\max_{\tilde{\mathbf{r}}=\mathbf{r}-\mathbf{m}_1}\left\{\tilde{\mathbf{r}}^T P^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{r}}\right\},  (11.23)$$

which is in turn equivalent to finding

$$\tilde{\mathbf{m}}_j = \arg\max_{\tilde{\mathbf{t}}_j}\left\{\left\|\tilde{\mathbf{t}}_j^{\perp}\right\|\right\} = \arg\max_{\tilde{\mathbf{r}}}\left\{\left\|P^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{r}}\right\|\right\},  (11.24)$$

with $P^{\perp}_{\tilde{\mathbf{U}}_{j-1}} = \mathbf{I} - \tilde{\mathbf{U}}_{j-1}\tilde{\mathbf{U}}_{j-1}^{\#}$ and $\tilde{\mathbf{U}}_{j-1}^{\#} = \left(\tilde{\mathbf{U}}_{j-1}^T\tilde{\mathbf{U}}_{j-1}\right)^{-1}\tilde{\mathbf{U}}_{j-1}^T$ being the pseudo-inverse
of $\tilde{\mathbf{U}}_{j-1}$, where $\left\|P^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{r}}\right\|^2 = \left(P^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{r}}\right)^T\left(P^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{r}}\right) = \tilde{\mathbf{r}}^T P^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{r}}$ because
$P^{\perp}_{\tilde{\mathbf{U}}_{j-1}} = \left(P^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\right)^2$ is idempotent. In other words, (11.4) can be solved via
(11.23) and (11.24) by

$$\tilde{\mathbf{m}}_j = \arg\max_{\tilde{\mathbf{r}}}\left\{\mathrm{GSV}\left(\tilde{\mathbf{m}}_2, \ldots, \tilde{\mathbf{m}}_{j-1}, \tilde{\mathbf{r}}\right)\right\} = \arg\max_{\tilde{\mathbf{r}}}\left\{(1/(j-1))\left\|\tilde{\mathbf{r}}^{\perp}\right\| \cdot \mathrm{GSV}\left(\tilde{\mathbf{m}}_2, \ldots, \tilde{\mathbf{m}}_{j-1}\right)\right\} = \arg\max_{\tilde{\mathbf{r}}}\left\{\tilde{\mathbf{r}}^T P^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{r}}\right\} = \arg\max_{\tilde{\mathbf{r}}}\left\{\left(P^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{r}}\right)^T\left(P^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{r}}\right)\right\}.  (11.25)$$

Coincidentally, the solution of (11.25) can also be found by ATGP. For example,
Fig. 11.1 can be redrawn as Fig. 11.4 to illustrate this idea.

Fig. 11.4 Finding volumes of 2D and 3D simplexes. (a) 2D three-vertex simplexes. (b) 3D four-vertex simplexes

While we can find an SV via (11.16), we can further derive a recursive equation
that reduces one dimension at a time in (11.21). More specifically, we can
rewrite (11.21) via (11.22) as

$$\mathrm{GSV}\left(\mathbf{m}_1, \ldots, \mathbf{m}_j\right) = (1/(j-1))\left\|\tilde{\mathbf{m}}_j^{\perp}\right\| \cdot \mathrm{GSV}\left(\mathbf{m}_1, \ldots, \mathbf{m}_{j-1}\right) = \frac{(j-k-2)!}{(j-1)!}\left[\prod_{i=0}^{k}\left\|\tilde{\mathbf{m}}_{j-i}^{\perp}\right\|\right] \cdot \mathrm{GSV}\left(\mathbf{m}_1, \ldots, \mathbf{m}_{j-k-1}\right)  (11.26)$$

for $0 \le k \le j-3$, until k = j − 3, in which case (11.26) becomes



$$\mathrm{GSV}\left(\mathbf{m}_1, \ldots, \mathbf{m}_j\right) = (1/(j-1))\left\|\tilde{\mathbf{m}}_j^{\perp}\right\| \cdot \mathrm{GSV}\left(\mathbf{m}_1, \ldots, \mathbf{m}_{j-1}\right) = \frac{1}{(j-1)!}\left[\prod_{i=0}^{j-3}\left\|\tilde{\mathbf{m}}_{j-i}^{\perp}\right\|\right] \cdot \mathrm{GSV}\left(\mathbf{m}_1, \mathbf{m}_2\right),  (11.27)$$

where GSV(m1, m2) is the volume of a 1D simplex with two vertices, namely the length
of the line segment connecting m1 and m2, $\tilde{\mathbf{m}}_2 = \mathbf{m}_2 - \mathbf{m}_1$. It should be noted that
RHSP-OPSGA does not directly use (11.2) to calculate an SV. Instead it makes use
of (11.21) and (11.27) without DR. In addition, it is also worth noting that the
volume GSV(m1, . . ., mj) is actually calculated by

$$\mathrm{GSV}\left(\mathbf{m}_1, \ldots, \mathbf{m}_j\right) = \frac{1}{(j-1)!}\left|\mathrm{Det}\left[\tilde{\mathbf{m}}_2 \;\; \tilde{\mathbf{m}}_3 \;\cdots\; \tilde{\mathbf{m}}_{j-1} \;\; \tilde{\mathbf{m}}_j\right]\right|.  (11.28)$$
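The recursion (11.26)/(11.27) can be sketched as follows. This toy script (ours, not the book's code) grows a simplex from random vertices and cross-checks the recursively accumulated GSV against the one-shot Gram-determinant volume corresponding to (11.28):

```python
import numpy as np
from math import factorial

# Growing-volume recursion: each new vertex multiplies the current GSV by
# its orthogonal-projection height over (j - 1), as in (11.27).
rng = np.random.default_rng(4)
L, p = 6, 4
m = rng.normal(size=(L, p))                     # columns m1, ..., mp
gsv = np.linalg.norm(m[:, 1] - m[:, 0])         # GSV(m1, m2): segment length
E = m[:, [1]] - m[:, [0]]                       # current translated base edges
for j in range(3, p + 1):
    e = m[:, j - 1] - m[:, 0]                   # next translated vertex
    P = np.eye(L) - E @ np.linalg.pinv(E)       # projector off the base span
    gsv *= np.linalg.norm(P @ e) / (j - 1)      # recursive GSV update
    E = np.column_stack([E, e])

# one-shot Gram-determinant volume of the full simplex, cf. (11.28)
direct = np.sqrt(np.linalg.det(E.T @ E)) / factorial(p - 1)
assert abs(gsv - direct) < 1e-8
```

The Gram form `sqrt(det(E.T @ E))` is used here instead of a plain determinant because the translated edge matrix is tall (L > p − 1); the two agree when the matrix happens to be square.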

11.5.2 Derivations of RHSP-OPSGA Equations

As shown in (11.27), the key element in calculating the SV is the product of
maximal OPs,

$$\prod_{i=0}^{j-3}\left\|\tilde{\mathbf{m}}_{j-i}^{\perp}\right\|,  (11.29)$$

produced by the endmembers mj that solve (11.21) and (11.22). This section
develops a recursive equation to solve (11.25). According to (11.25),
$\tilde{\mathbf{m}}_j = \arg\max_{\tilde{\mathbf{r}}}\left\{\left(P^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{r}}\right)^T\left(P^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{r}}\right)\right\}$, which corresponds to finding the
maximal OP in the space $\left\langle\tilde{\mathbf{U}}_{j-1}\right\rangle^{\perp}$. In this case, we need to derive an equation
that can calculate $P^{\perp}_{\tilde{\mathbf{U}}_j}$ recursively via $P^{\perp}_{\tilde{\mathbf{U}}_{j-1}}$ as follows. Let $\tilde{\mathbf{U}}_{j-1} =
\left[\left(\mathbf{m}_2-\mathbf{m}_1\right) \; \left(\mathbf{m}_3-\mathbf{m}_1\right) \cdots \left(\mathbf{m}_{j-1}-\mathbf{m}_1\right)\right] = \left[\tilde{\mathbf{m}}_2 \;\; \tilde{\mathbf{m}}_3 \;\cdots\; \tilde{\mathbf{m}}_{j-1}\right]$ and $\tilde{\mathbf{U}}_j =
\left[\tilde{\mathbf{m}}_2 \;\; \tilde{\mathbf{m}}_3 \;\cdots\; \tilde{\mathbf{m}}_j\right] = \left[\tilde{\mathbf{U}}_{j-1} \;\; \tilde{\mathbf{m}}_j\right]$. Then we can obtain the following recursive
equations for implementing OPSGA:

$$\left[\tilde{\mathbf{U}}_j^T\tilde{\mathbf{U}}_j\right]^{-1} = \begin{bmatrix} \tilde{\mathbf{U}}_{j-1}^T\tilde{\mathbf{U}}_{j-1} & \tilde{\mathbf{U}}_{j-1}^T\tilde{\mathbf{m}}_j \\ \tilde{\mathbf{m}}_j^T\tilde{\mathbf{U}}_{j-1} & \tilde{\mathbf{m}}_j^T\tilde{\mathbf{m}}_j \end{bmatrix}^{-1} = \begin{bmatrix} \left(\tilde{\mathbf{U}}_{j-1}^T\tilde{\mathbf{U}}_{j-1}\right)^{-1} + \beta\tilde{\mathbf{U}}_{j-1}^{\#}\tilde{\mathbf{m}}_j\tilde{\mathbf{m}}_j^T\left(\tilde{\mathbf{U}}_{j-1}^{\#}\right)^T & -\beta\tilde{\mathbf{U}}_{j-1}^{\#}\tilde{\mathbf{m}}_j \\ -\beta\tilde{\mathbf{m}}_j^T\left(\tilde{\mathbf{U}}_{j-1}^{\#}\right)^T & \beta \end{bmatrix},  (11.30)$$

where $\tilde{\mathbf{U}}_{j-1}^{\#} = \left(\tilde{\mathbf{U}}_{j-1}^T\tilde{\mathbf{U}}_{j-1}\right)^{-1}\tilde{\mathbf{U}}_{j-1}^T$ and

$$\beta = \left[\tilde{\mathbf{m}}_j^T\left(\mathbf{I} - \tilde{\mathbf{U}}_{j-1}\left(\tilde{\mathbf{U}}_{j-1}^T\tilde{\mathbf{U}}_{j-1}\right)^{-1}\tilde{\mathbf{U}}_{j-1}^T\right)\tilde{\mathbf{m}}_j\right]^{-1} = \left(\tilde{\mathbf{m}}_j^T P^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{m}}_j\right)^{-1};  (11.31)$$

$$\tilde{\mathbf{U}}_j^{\#} = \left(\tilde{\mathbf{U}}_j^T\tilde{\mathbf{U}}_j\right)^{-1}\tilde{\mathbf{U}}_j^T = \begin{bmatrix} \left(\tilde{\mathbf{U}}_{j-1}^T\tilde{\mathbf{U}}_{j-1}\right)^{-1} + \beta\tilde{\mathbf{U}}_{j-1}^{\#}\tilde{\mathbf{m}}_j\tilde{\mathbf{m}}_j^T\left(\tilde{\mathbf{U}}_{j-1}^{\#}\right)^T & -\beta\tilde{\mathbf{U}}_{j-1}^{\#}\tilde{\mathbf{m}}_j \\ -\beta\tilde{\mathbf{m}}_j^T\left(\tilde{\mathbf{U}}_{j-1}^{\#}\right)^T & \beta \end{bmatrix}\begin{bmatrix} \tilde{\mathbf{U}}_{j-1}^T \\ \tilde{\mathbf{m}}_j^T \end{bmatrix} = \begin{bmatrix} \tilde{\mathbf{U}}_{j-1}^{\#} + \beta\tilde{\mathbf{U}}_{j-1}^{\#}\tilde{\mathbf{m}}_j\tilde{\mathbf{m}}_j^T\left(\tilde{\mathbf{U}}_{j-1}^{\#}\right)^T\tilde{\mathbf{U}}_{j-1}^T - \beta\tilde{\mathbf{U}}_{j-1}^{\#}\tilde{\mathbf{m}}_j\tilde{\mathbf{m}}_j^T \\ -\beta\tilde{\mathbf{m}}_j^T\left(\tilde{\mathbf{U}}_{j-1}^{\#}\right)^T\tilde{\mathbf{U}}_{j-1}^T + \beta\tilde{\mathbf{m}}_j^T \end{bmatrix};  (11.32)$$

$$\begin{aligned} \tilde{\mathbf{U}}_j\tilde{\mathbf{U}}_j^{\#} &= \left[\tilde{\mathbf{U}}_{j-1} \;\; \tilde{\mathbf{m}}_j\right]\left(\tilde{\mathbf{U}}_j^T\tilde{\mathbf{U}}_j\right)^{-1}\tilde{\mathbf{U}}_j^T \\ &= \tilde{\mathbf{U}}_{j-1}\tilde{\mathbf{U}}_{j-1}^{\#} + \beta\tilde{\mathbf{U}}_{j-1}\tilde{\mathbf{U}}_{j-1}^{\#}\tilde{\mathbf{m}}_j\tilde{\mathbf{m}}_j^T\left(\tilde{\mathbf{U}}_{j-1}^{\#}\right)^T\tilde{\mathbf{U}}_{j-1}^T - \beta\tilde{\mathbf{U}}_{j-1}\tilde{\mathbf{U}}_{j-1}^{\#}\tilde{\mathbf{m}}_j\tilde{\mathbf{m}}_j^T - \beta\tilde{\mathbf{m}}_j\tilde{\mathbf{m}}_j^T\left(\tilde{\mathbf{U}}_{j-1}^{\#}\right)^T\tilde{\mathbf{U}}_{j-1}^T + \beta\tilde{\mathbf{m}}_j\tilde{\mathbf{m}}_j^T \\ &= \tilde{\mathbf{U}}_{j-1}\tilde{\mathbf{U}}_{j-1}^{\#} + \beta\tilde{\mathbf{u}}_{j-1}\tilde{\mathbf{u}}_{j-1}^T - \beta\tilde{\mathbf{u}}_{j-1}\tilde{\mathbf{m}}_j^T - \beta\tilde{\mathbf{m}}_j\tilde{\mathbf{u}}_{j-1}^T + \beta\tilde{\mathbf{m}}_j\tilde{\mathbf{m}}_j^T, \end{aligned}  (11.33)$$

where $\tilde{\mathbf{u}}_{j-1} = \tilde{\mathbf{U}}_{j-1}\tilde{\mathbf{U}}_{j-1}^{\#}\tilde{\mathbf{m}}_j$;

$$\begin{aligned} P^{\perp}_{\tilde{\mathbf{U}}_j} &= \mathbf{I} - \tilde{\mathbf{U}}_j\tilde{\mathbf{U}}_j^{\#} = \mathbf{I} - \tilde{\mathbf{U}}_j\left(\tilde{\mathbf{U}}_j^T\tilde{\mathbf{U}}_j\right)^{-1}\tilde{\mathbf{U}}_j^T \\ &= P^{\perp}_{\tilde{\mathbf{U}}_{j-1}} - \beta\tilde{\mathbf{u}}_{j-1}\tilde{\mathbf{u}}_{j-1}^T + \beta\tilde{\mathbf{u}}_{j-1}\tilde{\mathbf{m}}_j^T + \beta\tilde{\mathbf{m}}_j\tilde{\mathbf{u}}_{j-1}^T - \beta\tilde{\mathbf{m}}_j\tilde{\mathbf{m}}_j^T \\ &= P^{\perp}_{\tilde{\mathbf{U}}_{j-1}} - \beta\left(\tilde{\mathbf{u}}_{j-1} - \tilde{\mathbf{m}}_j\right)\left(\tilde{\mathbf{u}}_{j-1} - \tilde{\mathbf{m}}_j\right)^T,  (11.34) \end{aligned}$$

where $P^{\perp}_{\tilde{\mathbf{U}}_{j-1}} = \mathbf{I} - \tilde{\mathbf{U}}_{j-1}\tilde{\mathbf{U}}_{j-1}^{\#} = \mathbf{I} - \tilde{\mathbf{U}}_{j-1}\left(\tilde{\mathbf{U}}_{j-1}^T\tilde{\mathbf{U}}_{j-1}\right)^{-1}\tilde{\mathbf{U}}_{j-1}^T$. It should also be
noted that $\tilde{\mathbf{m}}_j^T\tilde{\mathbf{u}}_{j-1} = \tilde{\mathbf{u}}_{j-1}^T\tilde{\mathbf{m}}_j$, $\tilde{\mathbf{m}}_j^T\left(\tilde{\mathbf{u}}_{j-1} - \tilde{\mathbf{m}}_j\right) = \tilde{\mathbf{u}}_{j-1}^T\tilde{\mathbf{m}}_j - \tilde{\mathbf{m}}_j^T\tilde{\mathbf{m}}_j$, and β in
(11.30)-(11.34) are scalars. If $\tilde{\mathbf{U}}_{j-1}$ is of full column rank, then $\tilde{\mathbf{U}}_{j-1}^{\#}\tilde{\mathbf{U}}_{j-1} = \mathbf{I}$. It should be
noted that $\tilde{\mathbf{u}}_{j-1} = \tilde{\mathbf{U}}_{j-1}\tilde{\mathbf{U}}_{j-1}^{\#}\tilde{\mathbf{m}}_j$ and $\beta = \left(\tilde{\mathbf{m}}_j^T P^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{m}}_j\right)^{-1}$ are
used to account for the correlation between $\tilde{\mathbf{U}}_{j-1}$ and $\tilde{\mathbf{m}}_j$.
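The rank-one projector update (11.34) can be verified numerically. In the check below (ours, with arbitrary random vectors), β and the vector $\tilde{\mathbf{u}}_{j-1}$ follow (11.30)-(11.33), and the recursively updated projector is compared with the one computed directly from the augmented matrix:

```python
import numpy as np

# Kalman-filter-like update: the deflated projector for [U m] equals the
# previous projector minus a beta-scaled rank-one correction, Eq. (11.34).
rng = np.random.default_rng(5)
L = 8
U = rng.normal(size=(L, 3))                     # previous U~_{j-1}
mt = rng.normal(size=L)                         # new translated endmember m~_j

P_prev = np.eye(L) - U @ np.linalg.pinv(U)
beta = 1.0 / (mt @ P_prev @ mt)                 # Eq. (11.31)
u = U @ np.linalg.pinv(U) @ mt                  # u~_{j-1} = U~ U~# m~_j
P_rec = P_prev - beta * np.outer(u - mt, u - mt)    # Eq. (11.34)

Uj = np.column_stack([U, mt])
P_direct = np.eye(L) - Uj @ np.linalg.pinv(Uj)  # full recomputation
assert np.allclose(P_rec, P_direct)
```

Note that $\tilde{\mathbf{u}}_{j-1} - \tilde{\mathbf{m}}_j = -P^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{m}}_j$, so the correction is a deflation along the residual of the new endmember; this is what makes the update O(L²) per recursion instead of requiring a fresh pseudo-inverse.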

11.5.3 RHSP-OPSGA

According to (11.21) and (11.22), the jth endmember $\tilde{\mathbf{m}}_j$ can be found by maximizing
the residuals $P^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{r}}$ over all the data sample vectors in the space $\left\langle\tilde{\mathbf{U}}_{j-1}\right\rangle^{\perp}$, and
(11.34) is a recursive equation that can be used to compute $P^{\perp}_{\tilde{\mathbf{U}}_j}$ via $P^{\perp}_{\tilde{\mathbf{U}}_{j-1}}$ recursively.
By taking advantage of (11.26), (11.27), and (11.34), we can derive recursive
hyperspectral sample processing of OPSGA (RHSP-OPSGA) as follows.

RHSP-OPSGA
1. Initial conditions:
   (a) Find two endmembers, $\mathbf{m}_1$ and $\mathbf{m}_2$, that yield a line segment with maximal length. In other words, find $\mathbf{m}_1$ and $\mathbf{m}_2$ with the maximal Euclidean distance, that is, $(\mathbf{m}_1,\mathbf{m}_2)=\arg\max_{(\mathbf{r},\mathbf{s})}\{d(\mathbf{r},\mathbf{s})\}$, where $d(\cdot,\cdot)$ is the Euclidean distance measure.
   (b) Find $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_2}=\mathbf{I}-\tilde{\mathbf{U}}_2\tilde{\mathbf{U}}_2^{\#}=\mathbf{I}-\tilde{\mathbf{U}}_2\big(\tilde{\mathbf{U}}_2^{T}\tilde{\mathbf{U}}_2\big)^{-1}\tilde{\mathbf{U}}_2^{T}$, with $\tilde{\mathbf{U}}_2=[\mathbf{m}_2-\mathbf{m}_1]$ and $V(\mathbf{m}_1,\mathbf{m}_2)=d(\mathbf{m}_1,\mathbf{m}_2)$.
   (c) Set $j=3$.
2. At the $j$th recursion:
   (a) Find $\tilde{\mathbf{m}}_j=\arg\max_{\tilde{\mathbf{r}}}\big\{\|\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{r}}\|\big\}$ via (11.22).
   (b) Calculate
   $$\|\tilde{\mathbf{m}}_j^{\perp}\|=\|\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{m}}_j\|. \qquad(11.35)$$
   (c) Compute, using $\tilde{\mathbf{m}}_j^{\perp}$ obtained in step 2(b),
   $$\mathrm{GSV}\big(\mathbf{m}_1,\ldots,\mathbf{m}_j\big)=\big(1/(j-1)\big)\|\tilde{\mathbf{m}}_j^{\perp}\|\cdot\mathrm{GSV}\big(\mathbf{m}_1,\ldots,\mathbf{m}_{j-1}\big). \qquad(11.27)$$
   (d) Use the following recursive equation to calculate $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j}}$ via $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j-1}}$:
   $$\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j}}=\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j-1}}-\beta\big(\tilde{\mathbf{u}}_{j-1}-\tilde{\mathbf{m}}_j\big)\big(\tilde{\mathbf{u}}_{j-1}-\tilde{\mathbf{m}}_j\big)^{T}, \qquad(11.36)$$
   where $\tilde{\mathbf{u}}_{j-1}=\tilde{\mathbf{U}}_{j-1}\tilde{\mathbf{U}}_{j-1}^{\#}\tilde{\mathbf{m}}_j$.
3. If $j<p$, then go to step 2. Otherwise, continue.
4. At this step, all $p$ endmembers, $\mathbf{m}_1,\mathbf{m}_2,\ldots,\mathbf{m}_p$, with $\mathbf{m}_j=\tilde{\mathbf{m}}_j+\mathbf{m}_1$ for $2\le j\le p$, have been generated by RHSP-OPSGA, and their formed simplex volume can be calculated by (11.27) in step 2.
Several advantages can be gained from OPSGA that are not found in SGA.
1. There is no randomness issue in initialization since the initial condition is obtained by the two endmembers, $\mathbf{m}_1$ and $\mathbf{m}_2$, with maximal Euclidean distance.
2. Once $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_2}$ is calculated with $\tilde{\mathbf{U}}_2=[\tilde{\mathbf{m}}_2]$, with $\tilde{\mathbf{m}}_2=\mathbf{m}_2-\mathbf{m}_1$, $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_j}$ can be calculated recursively from $j=3$ to $p-1$ in step 2(d).
3. By virtue of $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j-1}}$, for each $j$ we can obtain the $j$th endmember $\mathbf{m}_j$ via (11.22) in step 2(a).
4. In the meantime we can also find the SV via (11.27) in step 2(c).
5. Most importantly, the SV calculation specified by (11.27) is exactly the same as (11.21) but can be recursively implemented by RHSP-OPSGA to find the SVs of growing simplexes.

11.6 Various Versions of GSVA-Based Algorithms

All the algorithms derived from DSGA can be considered variants of growing simplex volume analysis (GSVA)-based algorithms. These include SGA, RT-SGA, GSGA, OPSGA, and a recently developed distance-based SGA (Dist-SGA) (Wang et al. 2013). However, depending on how their initial conditions are implemented, these GSVA-based algorithms also produce different results. For example, an initial condition can be initialized by one endmember $\mathbf{m}_1$ with zero simplex volume or by a one-dimensional two-vertex simplex with maximal volume, that is, the maximal Euclidean distance of a pair $(\mathbf{m}_1,\mathbf{m}_2)$ among all data sample vectors. This section takes initial conditions into account to derive various GSVA algorithms. Since initial conditions apply to all GSVA variants and SGA-like algorithms, the terminology GSVA is used for generic and general purposes, while the algorithm's implementation is described by OPSGA for illustration. Nevertheless, the same algorithmic description can also be applied to other SVA-based algorithms, such as DSGA, GSGA, and Dist-SGA.

11.6.1 1-GSVA

The first version of GSVA starts its initial condition with a zero-dimensional one-vertex simplex of zero SV, where a one-vertex simplex is simply a single data sample vector and can be considered a degenerate simplex. Using such a degenerate simplex we can derive a special version, 1-GSVA.

1-GSVA
1. Initial condition:
   (a) Find $\mathbf{m}_1$ with maximal vector length, that is, $\mathbf{m}_1=\arg\max_{\mathbf{r}}\{\mathbf{r}^{T}\mathbf{r}\}$.
   (b) Find $\mathbf{P}^{\perp}_{\mathbf{U}_1}=\mathbf{I}-\mathbf{U}_1\mathbf{U}_1^{\#}=\mathbf{I}-\mathbf{U}_1\big(\mathbf{U}_1^{T}\mathbf{U}_1\big)^{-1}\mathbf{U}_1^{T}$, with $\mathbf{U}_1=[\mathbf{m}_1]$ and $\mathrm{GSV}(\mathbf{m}_1)=\|\mathbf{m}_1\|=\sqrt{\mathbf{m}_1^{T}\mathbf{m}_1}$, where $\mathbf{U}_1=[\mathbf{m}_1]=\tilde{\mathbf{U}}_1$.
   (c) Set $j=2$.
2. At the $j$th recursion:
   (a) Find $\tilde{\mathbf{m}}_j=\arg\max_{\tilde{\mathbf{r}}}\big\{\|\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{r}}\|\big\}$ via (11.22), where $\tilde{\mathbf{r}}=\mathbf{r}-\mathbf{m}_1$ and $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j-1}}=\mathbf{I}-\tilde{\mathbf{U}}_{j-1}\tilde{\mathbf{U}}_{j-1}^{\#}$.
   (b) Calculate $\|\tilde{\mathbf{m}}_j^{\perp}\|=\|\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{m}}_j\|$.
   (c) Compute
   $$\mathrm{GSV}\big(\mathbf{m}_1,\ldots,\mathbf{m}_j\big)=\big(1/(j-1)\big)\|\tilde{\mathbf{m}}_j^{\perp}\|\cdot\mathrm{GSV}\big(\mathbf{m}_1,\ldots,\mathbf{m}_{j-1}\big) \qquad(11.27)$$
   using $\tilde{\mathbf{m}}_j^{\perp}$ obtained in step 2(b).
   (d) Use the following recursive equation to calculate $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j}}$ via $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j-1}}$:
   $$\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j}}=\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j-1}}-\beta\big(\tilde{\mathbf{u}}_{j-1}-\tilde{\mathbf{m}}_j\big)\big(\tilde{\mathbf{u}}_{j-1}-\tilde{\mathbf{m}}_j\big)^{T}, \qquad(11.36)$$
   where $\tilde{\mathbf{u}}_{j-1}=\tilde{\mathbf{U}}_{j-1}\tilde{\mathbf{U}}_{j-1}^{\#}\tilde{\mathbf{m}}_j$. Continue.
3. If $j<p$, then go to step 2. Otherwise, continue.
4. At this step, all $p$ endmembers, $\mathbf{m}_1,\mathbf{m}_2,\ldots,\mathbf{m}_p$, have been generated by 1-GSVA, and their formed simplex volume can be calculated by (11.27) in step 2(c).
According to the preceding description, 1-GSVA is indeed identical to recursive ATGP. It is also interesting to note that $\mathbf{U}_1$ in 1-GSVA is formed by a single vector, so $\big(\mathbf{U}_1^{T}\mathbf{U}_1\big)^{-1}=\big(\mathbf{m}_1^{T}\mathbf{m}_1\big)^{-1}$ is simply the inverse of the scalar $\mathbf{m}_1^{T}\mathbf{m}_1$. As a result, $\mathbf{P}_{\mathbf{U}_1}$ is the outer product of $\mathbf{m}_1$, $\mathbf{m}_1\mathbf{m}_1^{T}$, multiplied by $\big(\mathbf{m}_1^{T}\mathbf{m}_1\big)^{-1}$, that is, $\mathbf{P}_{\mathbf{U}_1}=\big(\mathbf{m}_1^{T}\mathbf{m}_1\big)^{-1}\mathbf{m}_1\mathbf{m}_1^{T}$. In this case, there is no matrix inverse involved in $\mathbf{P}^{\perp}_{\mathbf{U}_1}$. Moreover, since $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j}}$ is updated from $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j-1}}$ according to (11.36), only inner products and outer products are required for finding $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j}}$, and no matrix inverse computation is needed. This is a significant advantage for implementing 1-GSVA in hardware design.
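For instance, the initial projector of 1-GSVA needs only the scalar $\mathbf{m}_1^{T}\mathbf{m}_1$ and the outer product $\mathbf{m}_1\mathbf{m}_1^{T}$. A small NumPy check (the function name is mine):

```python
import numpy as np

def p_perp_u1(m1):
    """Complement projector for U1 = [m1] without any matrix inverse:
    P_U1 = (m1^T m1)^{-1} m1 m1^T, and P_perp = I - P_U1."""
    return np.eye(m1.size) - np.outer(m1, m1) / (m1 @ m1)
```

The result annihilates $\mathbf{m}_1$ and is symmetric and idempotent, as an orthogonal-complement projector must be.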

11.6.2 Adaptive GSVA

A second version of GSVA is the adaptive GSVA (AGSVA), which can adapt its initial condition to different needs by starting with a simplex of any dimensionality as its initial condition. In particular, let $\mathbf{m}_1,\ldots,\mathbf{m}_j$ be the initial $j$ endmembers with $j\ge 3$ that are provided a priori by any means, either through visual inspection or from prior knowledge. To specify how many endmembers are used as initial endmembers, AGSVA using $n_{\text{initial}}$ endmembers as its initial endmembers is denoted by $n_{\text{initial}}$-GSVA. With this definition, 1-GSVA and GSVA can be considered special cases of $n_{\text{initial}}$-GSVA with $n_{\text{initial}}=1$ and 2, respectively.

$n_{\text{initial}}$-RHSP-GSVA
1. Initial condition:
   (a) Let $n_{\text{initial}}=j$ and find $j$ endmembers, $\mathbf{m}_1,\ldots,\mathbf{m}_j$, that yield a $j$-vertex simplex with maximal volume, $\mathrm{GSV}(\mathbf{m}_1,\ldots,\mathbf{m}_j)$.
   (b) Find $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j}}=\mathbf{I}-\tilde{\mathbf{U}}_{j}\tilde{\mathbf{U}}_{j}^{\#}=\mathbf{I}-\tilde{\mathbf{U}}_{j}\big(\tilde{\mathbf{U}}_{j}^{T}\tilde{\mathbf{U}}_{j}\big)^{-1}\tilde{\mathbf{U}}_{j}^{T}$, with $\tilde{\mathbf{U}}_{j}=[\mathbf{m}_2-\mathbf{m}_1\;\cdots\;\mathbf{m}_j-\mathbf{m}_1]$, and the simplex volume $V(\mathbf{m}_1,\ldots,\mathbf{m}_j)$.
2. At the $(j+1)$th recursion:
   (a) Find $\tilde{\mathbf{m}}_{j+1}=\arg\max_{\tilde{\mathbf{r}}}\big\{\|\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j}}\tilde{\mathbf{r}}\|\big\}$ via (11.22).
   (b) Calculate $\|\tilde{\mathbf{m}}_{j+1}^{\perp}\|=\|\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j}}\tilde{\mathbf{m}}_{j+1}\|$.
   (c) Compute, using $\tilde{\mathbf{m}}_{j+1}^{\perp}$ obtained in step 2(b),
   $$\mathrm{GSV}\big(\mathbf{m}_1,\ldots,\mathbf{m}_{j+1}\big)=(1/j)\|\tilde{\mathbf{m}}_{j+1}^{\perp}\|\cdot\mathrm{GSV}\big(\mathbf{m}_1,\ldots,\mathbf{m}_j\big). \qquad(11.27)$$
   (d) Use the following recursive equation to calculate $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j+1}}$ via $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j}}$:
   $$\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j+1}}=\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j}}-\beta\big(\tilde{\mathbf{u}}_{j}-\tilde{\mathbf{m}}_{j+1}\big)\big(\tilde{\mathbf{u}}_{j}-\tilde{\mathbf{m}}_{j+1}\big)^{T}, \qquad(11.36)$$
   where $\tilde{\mathbf{u}}_{j}=\tilde{\mathbf{U}}_{j}\tilde{\mathbf{U}}_{j}^{\#}\tilde{\mathbf{m}}_{j+1}$.
3. If $j+1<p$, then go to step 2. Otherwise, continue.
4. At this step, all $p$ endmembers, $\mathbf{m}_1,\mathbf{m}_2,\ldots,\mathbf{m}_p$, have been generated by AGSVA, and their formed simplex volume can be calculated by (11.27) in step 2(c).
It is worth noting that the AGSVA developed earlier can be considered a general version of GSVA. For example, when $n_{\text{initial}}=1$ and 2, AGSVA reduces to 1-GSVA and GSVA, respectively. If $n_{\text{initial}}=p$, AGSVA's search for the initial condition becomes N-FINDR.

11.7 Determining the Number of Endmembers for RHSP-OPSGA

In OPSGA and RHSP-OPSGA, the value of $p$, that is, the total number of endmembers, is known and fixed. When its value is unknown, it must be determined in an unsupervised fashion. The concept of VD, developed in Chang (2003) and Chang and Du (2004), has been widely used for this purpose. In this section, we take a rather different approach, using real target data sample vectors instead of eigenvalues or eigenvectors as signal sources for Neyman-Pearson detection problems.

The idea originated from $\beta=\big(\tilde{\mathbf{m}}_j^{T}\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{m}}_j\big)^{-1}$ in (11.31), which accounts for the accuracy in the prediction of $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j}}$ via $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j-1}}$: the smaller $\beta$ is, the better the prediction of $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j}}$ by $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j-1}}$ is. In this case, minimizing $\beta$ is equivalent to maximizing $\tilde{\mathbf{m}}_j^{T}\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{j-1}}\tilde{\mathbf{m}}_j$. Suppose that $p$ is the number of endmembers that must be determined. Then the quantity

$$\eta_p=\tilde{\mathbf{m}}_p^{T}\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{p-1}}\tilde{\mathbf{m}}_p \qquad(11.37)$$

can be used to determine how many endmembers RHSP-OPSGA should generate. In other words, both the $p$th endmember, $\tilde{\mathbf{m}}_p$, and $\eta_p$ can be found by (11.37). More interestingly, $\eta_p$ turns out to be exactly the same signal source used under the binary hypothesis testing problem considered in the maximal orthogonal subspace projection (MOSP) in Chang et al. (2011) for determining the VD, or in the maximal orthogonal complement algorithm (MOCA) in Kuybeda et al. (2007) for determining the rank of the rare signal dimensionality, as demonstrated in what follows.
Following the same approach as in Kuybeda et al. (2007) and Chang et al. (2011), we assume that for each $2\le p\le L$, $E_0=\varnothing$ and $\tilde{E}_{p-1}$ is the endmember space linearly spanned by the previously found $p-1$ endmembers $\{\tilde{\mathbf{m}}_j^{\text{RHSP-OPSGA}}\}_{j=1}^{p-1}$ generated by RHSP-OPSGA, which is exactly the space $\tilde{\mathbf{U}}_{p-1}$ described in RHSP-OPSGA, $\tilde{E}_{p-1}=\tilde{\mathbf{U}}_{p-1}$. Since RHSP-OPSGA is recursive and more effective than OPSGA, it will be used to generate the endmembers. In addition, it should be noted that the value of $p$ starts with 2, where $\mathbf{m}_1^{\text{RHSP-OPSGA}}$ and $\mathbf{m}_2^{\text{RHSP-OPSGA}}$ are the two endmembers generated by finding the maximal segment in the data space, and $\tilde{\mathbf{m}}_2^{\text{RHSP-OPSGA}}=\mathbf{m}_2^{\text{RHSP-OPSGA}}-\mathbf{m}_1^{\text{RHSP-OPSGA}}$ makes up the space $\tilde{E}_1=\langle\tilde{\mathbf{m}}_2^{\text{RHSP-OPSGA}}\rangle$. Then for $2\le p\le L$ we can find, via (11.23) and (11.24),

$$\tilde{\mathbf{m}}_p^{\text{RHSP-OPSGA}}=\arg\max_{\mathbf{r}}\big\{\|\mathbf{P}^{\perp}_{\tilde{E}_{p-1}}\mathbf{r}\|\big\}, \qquad(11.38)$$

$$\eta_p=\tilde{\mathbf{m}}_p^{T}\mathbf{P}^{\perp}_{\tilde{E}_{p-1}}\tilde{\mathbf{m}}_p=\big\|\mathbf{P}^{\perp}_{\tilde{E}_{p-1}}\tilde{\mathbf{m}}_p\big\|^{2}, \qquad(11.39)$$

where $\eta_p$ in (11.37) can be reexpressed as the maximal residual of the $p$th endmember, $\tilde{\mathbf{m}}_p^{\text{RHSP-OPSGA}}$, found by RHSP-OPSGA, leaked from $\langle\tilde{E}_{p-1}\rangle$ into $\langle\tilde{E}_{p-1}\rangle^{\perp}$, the complement space orthogonal to the space $\langle\tilde{E}_{p-1}\rangle$. Since the found $\{\tilde{\mathbf{m}}_p^{\text{RHSP-OPSGA}}\}_{p=2}^{L}$ obtained by RHSP-OPSGA may be highly correlated, $\eta_p$ in (11.39) is used instead because $\eta_p$ represents the maximum residual of $\tilde{\mathbf{m}}_p^{\text{RHSP-OPSGA}}$ leaked into $\langle\tilde{E}_{p-1}\rangle^{\perp}$. It is this sequence, $\{\eta_p\}_{p=2}^{L}$, that will be used as the signal source in a binary composite hypothesis testing problem to determine whether or not the $p$th potential target candidate, $\tilde{\mathbf{m}}_p^{\text{RHSP-OPSGA}}$, is a true target by a detector formulated as follows:

$$\big\|\mathbf{P}^{\perp}_{\tilde{E}_{p-1}}\tilde{\mathbf{m}}_p^{\text{RHSP-OPSGA}}\big\|^{2}
=\big(\mathbf{P}^{\perp}_{\tilde{E}_{p-1}}\tilde{\mathbf{m}}_p^{\text{RHSP-OPSGA}}\big)^{T}\big(\mathbf{P}^{\perp}_{\tilde{E}_{p-1}}\tilde{\mathbf{m}}_p^{\text{RHSP-OPSGA}}\big)
=\big(\tilde{\mathbf{m}}_p^{\text{RHSP-OPSGA}}\big)^{T}\big(\mathbf{P}^{\perp}_{\tilde{E}_{p-1}}\big)^{T}\mathbf{P}^{\perp}_{\tilde{E}_{p-1}}\tilde{\mathbf{m}}_p^{\text{RHSP-OPSGA}}. \qquad(11.40)$$

Because $\mathbf{P}^{\perp}_{\tilde{E}_{p-1}}$ is symmetric and idempotent, that is, $\big(\mathbf{P}^{\perp}_{\tilde{E}_{p-1}}\big)^{T}\mathbf{P}^{\perp}_{\tilde{E}_{p-1}}=\mathbf{P}^{\perp}_{\tilde{E}_{p-1}}$, (11.40) is reduced to

$$\big\|\mathbf{P}^{\perp}_{\tilde{E}_{p-1}}\tilde{\mathbf{m}}_p^{\text{RHSP-OPSGA}}\big\|^{2}=\big(\tilde{\mathbf{m}}_p^{\text{RHSP-OPSGA}}\big)^{T}\mathbf{P}^{\perp}_{\tilde{E}_{p-1}}\tilde{\mathbf{m}}_p^{\text{RHSP-OPSGA}}. \qquad(11.41)$$


Now if we replace $\mathbf{P}^{\perp}_{\tilde{E}_{p-1}}$ in (11.41) with $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{p-1}}$, with $\tilde{E}_{p-1}=\tilde{\mathbf{U}}_{p-1}$, (11.41) becomes (11.37) and (11.39). This shows that the stopping rule using (11.41) to terminate RHSP-OPSGA is identical to the signal sources used under each hypothesis by Neyman-Pearson detection theory to determine the VD. It is this sequence, $\{\eta_p\}$ given by (11.39), that will be used as the signal sources in a binary composite hypothesis testing problem to determine whether or not the $p$th potential endmember candidate $\tilde{\mathbf{m}}_p^{\text{RHSP-OPSGA}}$ is a true endmember by a detector, formulated as follows:


$$H_0:\ \eta_p\sim p\big(\eta_p\mid H_0\big)=p_0\big(\eta_p\big)
\quad\text{versus}\quad
H_1:\ \eta_p\sim p\big(\eta_p\mid H_1\big)=p_1\big(\eta_p\big)
\quad\text{for } p=1,2,\ldots,L, \qquad(11.42)$$

where the alternative hypothesis $H_1$ and the null hypothesis $H_0$ represent the two cases of $\tilde{\mathbf{m}}_p^{\text{RHSP-OPSGA}}$ being an endmember signal source under $H_1$ and not being an endmember signal source under $H_0$, respectively, in the sense that $H_0$ represents the maximum residual resulting from the background signal sources, while $H_1$ represents the maximum residual leaked from the endmember signal sources. Note that when $p=0$, $\eta_0$ is undefined in (11.42). To make (11.42) work, we need to find probability distributions under both hypotheses. The assumption made on (11.42) is that if a signal source is not an endmember under $H_0$, it should be considered part of the background, which can be characterized by a Gaussian distribution. On the other
hand, if a signal source is indeed a desired endmember, it should be uniformly distributed over $[0,\eta_{p-1}]$. By virtue of extreme value theory (Leadbetter 1987), $\eta_p$ can be modeled by a Gumbel distribution; that is, $F_{v_p}(\eta_p)$ is the cumulative distribution function (cdf) of $v_p$, given by

$$F_{v_p}(x)\approx\exp\left\{-e^{-(2\log N)^{1/2}\left[\frac{x-\sigma^{2}(L-p)}{\sigma^{2}\sqrt{2(L-p)}}-(2\log N)^{1/2}+\frac{1}{2}(2\log N)^{-1/2}(\log\log N+\log 4\pi)\right]}\right\}. \qquad(11.43)$$
: ð11:43Þ

Since there is no prior knowledge available about the distribution of signal sources, assuming that $\eta_p$ under $H_1$ is uniformly distributed seems most reasonable according to the maximum entropy principle in Shannon's information theory. Under these two assumptions, we obtain

$$p\big(H_0;\eta_p\big)=p_{\nu_p}\big(\eta_p\big)F_{\xi_p}\big(\eta_p\big)=p_{\nu_p}\big(\eta_p\big)\big(\eta_p/\eta_{p-1}\big), \qquad(11.44)$$

$$p\big(H_1;\eta_p\big)=F_{\nu_p}\big(\eta_p\big)p_{\xi_p}\big(\eta_p\big)=F_{\nu_p}\big(\eta_p\big)\big(1/\eta_{p-1}\big). \qquad(11.45)$$

Since $p_{\eta_p}\big(\eta_p\big)=p\big(H_0;\eta_p\big)+p\big(H_1;\eta_p\big)=\big(1/\eta_{p-1}\big)\big[\eta_p\,p_{\nu_p}\big(\eta_p\big)+F_{\nu_p}\big(\eta_p\big)\big]$, we can obtain an a posteriori probability distribution of $p\big(H_0\mid\eta_p\big)$ given by

$$p\big(H_0\mid\eta_p\big)=\frac{\eta_p\,p_{\nu_p}\big(\eta_p\big)}{\eta_p\,p_{\nu_p}\big(\eta_p\big)+F_{\nu_p}\big(\eta_p\big)} \qquad(11.46)$$

and an a posteriori probability distribution of $p\big(H_1\mid\eta_p\big)$ given by

$$p\big(H_1\mid\eta_p\big)=\frac{F_{\nu_p}\big(\eta_p\big)}{\eta_p\,p_{\nu_p}\big(\eta_p\big)+F_{\nu_p}\big(\eta_p\big)}. \qquad(11.47)$$

By virtue of (11.46) and (11.47), a Neyman-Pearson detector (NPD), denoted by $\delta^{\text{NP}}(\eta_p)$, for the binary composite hypothesis testing problem specified by (11.42) can be obtained by maximizing the detection power $P_D$ with the false alarm probability $P_F$ fixed at a specific given value, $\alpha$, which determines the threshold value $\tau_p$ in the following randomized decision rule:

$$\delta^{\text{NP}}_{\text{RHSP-OPSGA}}\big(\eta_p\big)=
\begin{cases}
1, & \text{if } \Lambda\big(\eta_p\big)>\tau_p,\\
1 \text{ with probability } \kappa, & \text{if } \Lambda\big(\eta_p\big)=\tau_p,\\
0, & \text{if } \Lambda\big(\eta_p\big)<\tau_p,
\end{cases} \qquad(11.48)$$

where the likelihood ratio test $\Lambda\big(\eta_p\big)$ is given by $\Lambda\big(\eta_p\big)=p_1\big(\eta_p\big)/p_0\big(\eta_p\big)$, with $p_0\big(\eta_p\big)$ and $p_1\big(\eta_p\big)$ given by (11.44) and (11.45). Thus, according to (11.48), a case of $\eta_p>\tau_p$ indicates that $\delta^{\text{NP}}_{\text{RHSP-OPSGA}}\big(\eta_p\big)$ in (11.48) fails the test, in which case $\tilde{\mathbf{m}}_p^{\text{RHSP-OPSGA}}$ is assumed to be an endmember. Note that the test for (11.48) must be performed for each of the $L$ potential endmember candidates. Therefore, for a different value of $p$, the threshold $\tau_p$ varies. Using (11.48), the value of VD, $n_{\text{RHSP-OPSGA}}$, can be determined by calculating

$$\text{VD}^{\text{NP}}_{\text{RHSP-OPSGA}}\big(P_F\big)=\arg\max_p\big\{\delta^{\text{NP}}_{\text{RHSP-OPSGA}}\big(\eta_p\big)=1\big\}, \qquad(11.49)$$

where $P_F$ is a predetermined false alarm probability, $\delta^{\text{NP}}_{\text{RHSP-OPSGA}}\big(\eta_p\big)=1$ only if the test in (11.48) fails, and $\delta^{\text{NP}}_{\text{RHSP-OPSGA}}\big(\eta_p\big)=0$ otherwise. Note that since $p$ starts with 2, the actual value of VD estimated for RHSP-OPSGA in (11.49) should be $\text{VD}^{\text{NP}}_{\text{RHSP-OPSGA}}\big(P_F\big)+1$.
Since the sequence of subspaces $\{\tilde{E}_p\}$ is nested, in the sense that $\tilde{E}_1\subseteq\cdots\subseteq\tilde{E}_p$, the sequence of $\max_{\tilde{\mathbf{r}}}\big\{\|\mathbf{P}^{\perp}_{\tilde{E}_{p-1}}\tilde{\mathbf{r}}\|\big\}$ produced by (11.38) is monotonically decreasing. This further implies that the sequence of $\{\eta_p\}$ produced by (11.39) is also monotonically decreasing. Using this fact, the sequence of detectors $\{\delta^{\text{NP}}_{\text{RHSP-OPSGA}}(\eta_p)\}$ starts with a failure of the test specified by (11.48), that is, $\delta^{\text{NP}}_{\text{RHSP-OPSGA}}\big(\eta_p\big)=1$, and continues as the value of $p$ is increased until $p$ reaches the value at which the test passes, in which case $\delta^{\text{NP}}_{\text{RHSP-OPSGA}}\big(\eta_p\big)<1$. The largest value of $p$ that makes $\delta^{\text{NP}}_{\text{RHSP-OPSGA}}\big(\eta_p\big)=1$ is the value of $\text{VD}^{\text{NP}}_{\text{RHSP-OPSGA}}\big(P_F\big)$ according to (11.49). This unique property allows $\text{VD}^{\text{NP}}_{\text{RHSP-OPSGA}}\big(P_F\big)$ to be implemented in real time as the process continues with increasing $p$.
Interestingly, it was shown in Chang et al. (2015) that the residual energy of the signal source $\tilde{\mathbf{m}}_p$ leaked into $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{p-1}}$ in (11.37) can be further modified to the residual strength of the signal source $\tilde{\mathbf{m}}_p$ leaked into $\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{p-1}}$, given by

$$\rho_p=\sqrt{\eta_p}=\big(\tilde{\mathbf{m}}_p^{T}\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{p-1}}\tilde{\mathbf{m}}_p\big)^{1/2}. \qquad(11.50)$$

The detector in (11.48) is then performed on the signal residual strength $\rho_p$ in (11.50) under the binary hypothesis testing problem (11.42). In this case, the VD in (11.49) becomes

$$\text{VD}^{\text{NP}}_{\text{RHSP-OPSGA}}\big(P_F\big)=\arg\max_p\big\{\delta^{\text{NP}}_{\text{RHSP-OPSGA}}\big(\rho_p\big)=1\big\}. \qquad(11.51)$$
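As a schematic illustration of the stopping rule, the sketch below evaluates the Gumbel cdf (11.43), forms the likelihood ratio from (11.44)-(11.45) (the common $1/\eta_{p-1}$ factor cancels), and returns the largest $p$ at which the test still fails per (11.49). This is not the chapter's implementation: all names are mine, the thresholds $\tau_p$ are taken as given (in practice they would be derived from the prescribed false alarm probability $P_F$, which is outside this sketch), and the background density $p_{\nu_p}$ is obtained by numerical differentiation of (11.43).

```python
import numpy as np

def gumbel_cdf(x, N, L, p, sigma2):
    """Cdf in (11.43) for the maximal background residual."""
    a = np.sqrt(2.0 * np.log(N))
    z = (x - sigma2 * (L - p)) / (sigma2 * np.sqrt(2.0 * (L - p)))
    b = a - 0.5 * (np.log(np.log(N)) + np.log(4.0 * np.pi)) / a
    return float(np.exp(-np.exp(-a * (z - b))))

def vd_by_npd(eta, taus, N, L, sigma2, dx=1e-3):
    """Largest p with delta(eta_p) = 1, per (11.48)-(11.49).
    eta[k] holds eta_p for p = k + 2; taus are given thresholds."""
    vd = 0
    for k, (e, tau) in enumerate(zip(eta, taus)):
        p = k + 2
        F = gumbel_cdf(e, N, L, p, sigma2)
        pdf = (gumbel_cdf(e + dx, N, L, p, sigma2) - F) / dx
        lam = F / max(e * pdf, 1e-300)   # p1/p0 from (11.44)-(11.45)
        if lam > tau:                    # test fails: endmember
            vd = p
    return vd
```

Because $\{\eta_p\}$ is monotonically decreasing, the loop can also be terminated at the first passing test and run online as $p$ grows.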

11.8 Real Image Experiments

Although many real hyperspectral image scenes could be used for experiments, the conclusions they lead to are very similar. Accordingly, we used two real image scenes that have been studied extensively in the literature for comparative analysis.

11.8.1 HYDICE Data

The image scene shown in Fig. 11.5 (also shown in Fig. 1.10a) was used for the experiments. It was acquired by the airborne HYperspectral Digital Imagery Collection Experiment (HYDICE). It has a size of 64 × 64 pixel vectors, with 15 panels in the scene, and its ground truth map is shown in Fig. 11.5b (Fig. 1.10b).
It is worth noting that panel pixel p212, marked in yellow in Fig. 11.5b, is of
particular interest. Based on the ground truth, this panel pixel is not a pure panel
pixel and is marked by yellow as a boundary panel pixel. However, with our
extensive and comprehensive experiments, this yellow panel pixel is always
extracted as the one with the most spectrally distinct signature compared to R
panel pixels in row 2. This indicates that the signature of spectral purity is not
equivalent to the signature of spectral distinction. As a matter of fact, in many cases
panel pixel p212 instead of panel pixel p221 is the one extracted by EFAs to represent
the panel signature in row 2. Also, because of such ambiguity, the panel signature
representing panel pixels in the second row is either p221 or p212, which is always
the last one to be found by EFAs. This implies that the ground truth of R panel
pixels in the second row in Fig. 11.5b may not be as pure as was thought.
Even though RHSP-OPSGA can be used to find endmember candidates up to the total number of bands, L, there is no necessity for doing so. Using (11.37) and (11.50), we can estimate the number of endmembers that must be generated by RHSP-OPSGA. Table 11.1 tabulates $n_{\text{RHSP-OPSGA}}$ estimated by the MOCA as well as by the NPD for various false alarm probabilities, where MOCA is actually a maximum likelihood detector.

Fig. 11.5 (a) HYDICE panel scene containing 15 panels and (b) ground truth map of spatial locations of the 15 panels

Table 11.1 $n_{\text{RHSP-OPSGA}}$ estimated by (11.39) and (11.50)

| η_p in (11.39) | MOCA | P_F = 10⁻¹ | P_F = 10⁻² | P_F = 10⁻³ | P_F = 10⁻⁴ | P_F = 10⁻⁵ |
| DSGA           | 98   | 98         | 98         | 98         | 98         | 98         |
| RHSP-OPSGA     | 45   | 46         | 44         | 43         | 40         | 39         |
| ρ_p in (11.50) | MOCA | P_F = 10⁻¹ | P_F = 10⁻² | P_F = 10⁻³ | P_F = 10⁻⁴ | P_F = 10⁻⁵ |
| DSGA           | 98   | 98         | 98         | 98         | 98         | 98         |
| RHSP-OPSGA     | 43   | 45         | 43         | 40         | 39         | 37         |
According to Chang (2003) and Chang and Du (2004), the VD estimated for this HYDICE scene by the Harsanyi-Farrand-Chang (HFC) method (Harsanyi et al. 1994) would be 9. Recently, an approach using real target signal sources, proposed for VD estimation in Chang et al. (2014) and described in Chap. 4, was used to estimate $n_{\text{RHSP-OPSGA}}$ to be in a range between 39 and 45. However, when unsupervised fully constrained least squares (FCLS) was used in Heinz and Chang (2001) to find targets, the number of targets was 34. Thus, the value of VD, $n_{\text{VD}}$, was set from 9 to 45. Figure 11.6 shows the endmember pixels found by the four EFAs: RHSP-OPSGA, OPSGA, DSGA, and a new distance-based SGA (Dist-SGA) developed in Wang et al. (2013a). However, looking more closely at the development of Dist-SGA, we see that the distance used is actually the orthogonal distance, which is exactly an OP. Accordingly, it is not surprising to see from these figures that RHSP-OPSGA, OPSGA, and Dist-SGA produced identical endmember pixels, while DSGA produced completely different results. The reason DSGA produced different endmembers from the other algorithms is mainly that the SV calculated by DSGA, which uses a determinant via SVD to find endmember pixels, is different from that calculated by OPSGA, RHSP-OPSGA, and Dist-SGA using the OP. Interestingly, none of the four algorithms could find pure panel pixels corresponding to the five panel signatures when $n_{\text{VD}}=9$, but they could do so, as shown in Fig. 11.6b, when $n_{\text{VD}}=18$, which is twice the value of $n_{\text{VD}}=9$, a fact that was also noted in Chang et al. (2010a, 2011b). Since the four algorithms are sequential, the endmember pixels generated by a smaller value of $n_{\text{RHSP-OPSGA}}$ are always part of the endmember pixels generated by a larger value, and the endmember pixels found in Fig. 11.6 are separated into four different ranges: (a) $n_{\text{VD}}=9$, (b) $n_{\text{VD}}=18$, (c) $n_{\text{RHSP-OPSGA}}=34$, and (d) $n_{\text{RHSP-OPSGA}}=45$.
As expected, the Dist-SGA, OPSGA, and RHSP-OPSGA found identical sets of
endmember pixels owing to the fact that the maximal OP of a vertex perpendicular
to a simplex base is indeed its height.

11.8.2 CUPRITE Mining District Data

Another real image scene used for the experiments is the well-known Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) image scene, Cuprite, shown in Fig. 11.7 (also see Fig. 1.4a) and available at the USGS Web site http://aviris.jpl.nasa.gov/. This scene is a 224-band image with a size of 350 × 350 pixels and was collected over the Cuprite mining site in Nevada in 1991. It is one of the most widely used hyperspectral image scenes available in the public domain and has 20 m spatial resolution and 10 nm spectral resolution in the range of 0.4-2.5 μm. Since it is well understood mineralogically and has reliable ground truth, this scene has been studied extensively. Two data sets for this scene, reflectance and radiance data, are also available for study.

Fig. 11.6 Endmember pixels found by DSGA, Dist-SGA, OPSGA, and RHSP-OPSGA on HYDICE. (a) 9 endmember pixels. (b) 10-18 endmember pixels. (c) 19-34 endmember pixels. (d) 35-45 endmember pixels

There are five pure pixels in Fig. 11.7a, b, which can be identified as corresponding to five different minerals: alunite (A), buddingtonite (B), calcite (C), kaolinite (K), and muscovite (M), labeled A, B, C, K, and M in Fig. 11.7b, along with their spectral signatures plotted in Fig. 11.7c, d.
Although the scene contains more than five minerals, the ground truth available for this region only provides the locations of the five pure pixels: alunite (A), buddingtonite (B), calcite (C), kaolinite (K), and muscovite (M), shown in Fig. 11.7b, with their reflectance and radiance spectra plotted in Fig. 11.7c, d. Since there is no available prior knowledge about the spatial locations of endmembers, we must rely on an unsupervised means of identifying whether an extracted target pixel is an endmember. To address this issue, the following endmember identification algorithm (EIDA), developed in Chang et al. (2014), was used for this purpose.

EIDA
1. Assume that $\{\mathbf{t}_j\}_{j=1}^{J}$ are $J$ extracted target pixels and $\{\mathbf{m}_i\}_{i=1}^{p}$ are $p$ known ground truth endmembers.
2. Cluster all extracted pixels, $\{\mathbf{t}_j\}_{j=1}^{J}$, into $p$ endmember classes $\{C_j\}_{j=1}^{p}$ according to the following rule:
$$\mathbf{t}_j\in C_{j^*},\quad j^*=\arg\min_{1\le i\le p}\mathrm{SAM}\big(\mathbf{t}_j,\mathbf{m}_i\big), \qquad(11.52)$$
where the spectral angle mapper (SAM) is a spectral similarity measure.
3. For each of the endmembers, $\mathbf{m}_i$, find the target pixel $\mathbf{t}_{i^*}$ among all the extracted pixels $\{\mathbf{t}_j\}_{j=1}^{J}$ that is closest to $\mathbf{m}_i$ by
$$i^*=\arg\min_{1\le j\le J}\mathrm{SAM}\big(\mathbf{t}_j,\mathbf{m}_i\big). \qquad(11.53)$$
4. Find all target pixels that satisfy
$$i^*=\arg\min_{1\le j\le J}\mathrm{SAM}\big(\mathbf{t}_j,\mathbf{m}_i\big),\quad \mathbf{t}_{i^*}\in C_i. \qquad(11.54)$$
5. All target pixels found in step 4 are extracted as endmembers.

Fig. 11.7 (a) Cuprite AVIRIS image scene. (b) Spatial positions of five pure pixels corresponding to the following minerals: alunite (A), buddingtonite (B), calcite (C), kaolinite (K), and muscovite (M). (c) Five mineral reflectance spectra. (d) Five mineral radiance spectra
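Under the assumption that the extracted targets and ground truth endmembers are given as rows of arrays, steps 1-5 above can be sketched as follows (function names are mine; SAM is implemented as the usual arccosine of the normalized inner product):

```python
import numpy as np

def sam(x, y):
    """Spectral angle mapper between two spectra, in radians."""
    c = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

def eida(T, M):
    """EIDA sketch. T: J x L extracted target pixels; M: p x L ground
    truth endmembers. Returns, for each m_i, the index of its closest
    target pixel if that pixel also belongs to class C_i per
    (11.52)-(11.54), otherwise None."""
    # Step 2: cluster each t_j into its nearest endmember class (11.52).
    cluster = [int(np.argmin([sam(t, m) for m in M])) for t in T]
    found = []
    for i, m in enumerate(M):
        # Step 3: closest target pixel to m_i (11.53).
        i_star = int(np.argmin([sam(t, m) for t in T]))
        # Step 4: keep it only if t_{i*} fell into class C_i (11.54).
        found.append(i_star if cluster[i_star] == i else None)
    return found
```

The double condition (nearest in both directions) is what rejects extracted pixels that are close to one endmember but actually cluster with another.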


First, we use (11.37) and (11.48) to estimate the VD for the reflectance and radiance data sets; the results are tabulated in Tables 11.2 and 11.3, where "F" indicates that the NPD test passed all the signal sources, in which case $n_{\text{RHSP-OPSGA}}=169$ for HYDICE, $n_{\text{RHSP-OPSGA}}=120$ for Cuprite reflectance data, and $n_{\text{VD}}=139$ for Cuprite radiance data. It should be noted that for Cuprite data DSGA could not produce 189 target signal sources owing to its use of SVD to calculate SV.

Table 11.2 $n_{\text{RHSP-OPSGA}}$ estimated for Cuprite reflectance data by (11.37) and (11.50)

| η_p in (11.37) | MOCA | P_F = 10⁻¹ | P_F = 10⁻² | P_F = 10⁻³ | P_F = 10⁻⁴ | P_F = 10⁻⁵ |
| DSGA           | F    | F          | F          | 119        | 119        | 117        |
| RHSP-OPSGA     | 75   | 82         | 74         | 69         | 64         | 62         |
| ρ_p in (11.50) | MOCA | P_F = 10⁻¹ | P_F = 10⁻² | P_F = 10⁻³ | P_F = 10⁻⁴ | P_F = 10⁻⁵ |
| DSGA           | 119  | 119        | 119        | 117        | 103        | 95         |
| RHSP-OPSGA     | 75   | 82         | 74         | 69         | 64         | 62         |

Table 11.3 $n_{\text{RHSP-OPSGA}}$ estimated for Cuprite radiance data by (11.37) and (11.50)

| η_p in (11.37) | MOCA | P_F = 10⁻¹ | P_F = 10⁻² | P_F = 10⁻³ | P_F = 10⁻⁴ | P_F = 10⁻⁵ |
| DSGA           | F    | F          | 109        | 106        | 104        | 95         |
| RHSP-OPSGA     | 72   | 79         | 72         | 69         | 65         | 62         |
| ρ_p in (11.50) | MOCA | P_F = 10⁻¹ | P_F = 10⁻² | P_F = 10⁻³ | P_F = 10⁻⁴ | P_F = 10⁻⁵ |
| DSGA           | 103  | 106        | 103        | 95         | 95         | 85         |
| RHSP-OPSGA     | 67   | 71         | 67         | 62         | 58         | 56         |
In addition, according to Chang (2003) and Chang and Du (2004), the value of the VD estimated by the HFC method was $n_{\text{VD}}=22$ for reflectance data, with $2n_{\text{VD}}=44$, and $n_{\text{VD}}=15$ for radiance data, with $2n_{\text{VD}}=30$. Using these values, along with the values of $n_{\text{RHSP-OPSGA}}$ in Tables 11.2 and 11.3, the experiments were conducted for three ranges: (a) $n_{\text{VD}}=22$ for both reflectance and radiance data, (b) $n_{\text{RHSP-OPSGA}}=46$ for both reflectance and radiance data, and (c) $n_{\text{RHSP-OPSGA}}=75$ for both reflectance and radiance data. The reason we could do this is that DSGA, Dist-SGA, OPSGA, and RHSP-OPSGA are all sequential algorithms, and the endmember pixels generated by a smaller value of $n_{\text{RHSP-OPSGA}}$ are always part of the endmember pixels generated by a larger value of $n_{\text{RHSP-OPSGA}}$.
Figures 11.8 and 11.9 show the endmembers found by DSGA, OPSGA, RHSP-OPSGA, and Dist-SGA for Cuprite reflectance data and Cuprite radiance data, respectively, where the pixels marked by yellow open circles were found by the algorithms, with numerals indicating the order in which they were found.
To identify which pixels among the found endmembers in Figs. 11.8 and 11.9 are
desired endmember pixels, the EIDA, described earlier, was used to find the spatial
locations of these desired endmember pixels shown in Figs. 11.10 and 11.11, where
the pixels marked by lowercase a, b, c, k, and m with red triangles are the desired
endmember pixels found by the EIDA that correspond to the five ground truth
mineral endmembers marked by uppercase A, B, C, K, and M with yellow crosses.
Table 11.4 further tabulates the five desired endmember pixels found by the
EIDA among target pixels found by the DSGA, {tDSGA j } and target pixels found by
OP - SGA
the OPSGA, {tj } where the subscript j indicates the order in which a
particular target pixel was found.

Fig. 11.8 Endmember pixels found by DSGA without DR, Dist-SGA, OPSGA, RHSP-OPSGA
on Cuprite reflectance data. (a) 22 endmember pixels. (b) 23–46 endmember pixels. (c) 47–75
endmember pixels

As we can see from Table 11.4 for the reflectance data, the DSGA required at
least 64 target pixels to find the last mineral signature K to complete all five mineral
signatures, while 53 target pixels were required by the RHSP-OPSGA to find the
last mineral signature B. For radiance data it took the DSGA at least 103 target
pixels to find the last mineral signature K to complete all five mineral signatures,
while 68 target pixels were required by the RHSP-OPSGA to complete all five
mineral signatures with the mineral signature C being the last one to be found.
Nonetheless, both DSGA and RHSP-OPSGA found the same first mineral signature, M, probably because the spectrum of M is the most distinct among the five mineral signatures according to Fig. 11.7c, d. Once the spatial locations of the five desired endmember pixels were found in Figs. 11.8 and 11.9, we can further perform a comparative spectral analysis between the EIDA-identified endmember pixels in Table 11.4 and the ground truth pixels in Fig. 11.7b. For each of the five mineral signatures, A, B, C, K, and M, for reflectance data and radiance data, Figs. 11.12a-e and 11.13a-e plot three spectra: the ground truth pixel spectrum and the spectra of $\mathbf{t}_j^{\text{DSGA}}$ and $\mathbf{t}_j^{\text{OP-SGA}}$ identified in Table 11.4.

Fig. 11.9 Endmember pixels found by DSGA without DR, Dist-SGA, OPSGA, RHSP-OPSGA
on Cuprite radiance data. (a) 22 endmember pixels. (b) 23–46 endmember pixels. (c) 47–75
endmember pixels

Fig. 11.10 Endmembers found by EIDA using DSGA and RHSP-OPSGA for Cuprite reflectance
data. (a) DSGA. (b) RHSP-OPSGA

Fig. 11.11 Endmembers found by EIDA using DSGA and RHSP-OPSGA for Cuprite radiance data. (a) DSGA. (b) RHSP-OPSGA

Table 11.4 Endmember pixels found by EIDA

| Cuprite                 | A                              | B                              | C                              | K                               | M                             |
| DSGA, reflectance       | $\mathbf{t}_{11}^{\text{DSGA}}$ | $\mathbf{t}_{13}^{\text{DSGA}}$ | $\mathbf{t}_{44}^{\text{DSGA}}$ | $\mathbf{t}_{64}^{\text{DSGA}}$  | $\mathbf{t}_{8}^{\text{DSGA}}$ |
| DSGA, radiance          | $\mathbf{t}_{12}^{\text{DSGA}}$ | $\mathbf{t}_{14}^{\text{DSGA}}$ | $\mathbf{t}_{28}^{\text{DSGA}}$ | $\mathbf{t}_{103}^{\text{DSGA}}$ | $\mathbf{t}_{6}^{\text{DSGA}}$ |
| RHSP-OPSGA, reflectance | $\mathbf{t}_{29}^{\text{OP-SGA}}$ | $\mathbf{t}_{53}^{\text{OP-SGA}}$ | $\mathbf{t}_{40}^{\text{OP-SGA}}$ | $\mathbf{t}_{27}^{\text{OP-SGA}}$ | $\mathbf{t}_{9}^{\text{OP-SGA}}$ |
| RHSP-OPSGA, radiance    | $\mathbf{t}_{16}^{\text{OP-SGA}}$ | $\mathbf{t}_{14}^{\text{OP-SGA}}$ | $\mathbf{t}_{68}^{\text{OP-SGA}}$ | $\mathbf{t}_{11}^{\text{OP-SGA}}$ | $\mathbf{t}_{7}^{\text{OP-SGA}}$ |

While the plots in Figs. 11.12 and 11.13 offer the advantage of a visual assessment of how close a found endmember pixel is to a ground truth pixel, they do not provide quantitative measurements of spectral similarity. Tables 11.5 and 11.6 calculate the spectral similarity values for the plots in Figs. 11.12 and 11.13, respectively, where SAM and the spectral information divergence (SID) were used as spectral measures. As we can see from Tables 11.5 and 11.6, the spectral similarity values between the pixels found by DSGA and RHSP-OPSGA and the ground truth pixels were indeed very close, even though the pixels found by DSGA and RHSP-OPSGA were identified by EIDA at different locations.
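For reference, the two measures used in Tables 11.5 and 11.6 can be written compactly as below. This is a sketch under my own naming; SID is taken here as the symmetric Kullback-Leibler divergence between the spectra normalized to probability vectors:

```python
import numpy as np

def sam(x, y):
    """Spectral angle mapper (radians)."""
    c = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

def sid(x, y, eps=1e-12):
    """Spectral information divergence: symmetric relative entropy
    between the band-normalized spectra."""
    p = x / x.sum() + eps
    q = y / y.sum() + eps
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))
```

Both measures are zero for identical spectra and strictly positive otherwise, which is why the zero entries in Tables 11.5 and 11.6 indicate that the found pixel coincides with the ground truth pixel.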

11.9 Computer Processing Time Analysis

One of the great advantages that the RHSP-OPSGA provides is computational cost
savings. Figures 11.14 and 11.15 show the cumulative computing time in seconds
required by the four algorithms DSGA, Dist-SGA, OPSGA, and RHSP-OPSGA to
run HYDICE and Cuprite data sets respectively as the number of endmembers, p,
Fig. 11.12 Comparative plots between spectral signatures found by DSGA and R-OPSGA and ground truth pixels for Cuprite reflectance data. (a) "a" signature against "A." (b) "b" signature against "B." (c) "c" signature against "C." (d) "k" signature against "K." (e) "m" signature against "M"

increases to 169. The experiments were executed on the following system: Windows 7 64-bit operating system with an Intel Core i5-2500 CPU @ 3.30 GHz; memory: 8.00 GB; programming language: MATLAB V8.0. Note that the computer processing times required to run the Cuprite reflectance and radiance sets were nearly the same because the same image scene was used; thus, there is no visible difference, and only the reflectance data were plotted. As shown in Figs. 11.14a and 11.15a, the DSGA required extremely high computing time compared to the
Fig. 11.13 Comparative plots between spectral signatures found by DSGA and R-OPSGA and ground truth pixels for Cuprite radiance data. (a) "a" signature against "A." (b) "b" signature against "B." (c) "c" signature against "C." (d) "k" signature against "K." (e) "m" signature against "M"

other three, so there is no visible difference among the other three due to the large
magnitude caused by the DSGA. To provide a better visual assessment, Figs. 11.14b
and 11.15b replot the computing times for the RHSP-OPSGA, OPSGA, and Dist-
SGA on the same scale, where the RHSP-OPSGA was slightly better than the
OPSGA, but both the OPSGA and RHSP-OPSGA were indeed better than the Dist-
SGA in terms of computing time.
Table 11.5 SAM/SID of closest endmembers to ground truth endmembers by DSGA and RHSP-OPSGA on Cuprite reflectance data

                        (A,a)    (B,b)    (C,c)    (K,k)    (M,m)
DSGA          SAM       0.0167   0.0334   0.0379   0.0341   0
              SID       0.0002   0.0009   0.0009   0.0007   0
RHSP-OPSGA    SAM       0        0.0497   0.0379   0.0304   0.0264
              SID       0        0.0012   0.0009   0.0006   0.0005

Table 11.6 SAM/SID of closest endmembers to ground truth endmembers by DSGA and RHSP-OPSGA on Cuprite radiance data

                        (A,a)    (B,b)    (C,c)    (K,k)    (M,m)
DSGA          SAM       0.0205   0        0.0247   0.0123   0
              SID       0.0003   0        0.0006   0.0001   0
RHSP-OPSGA    SAM       0.0098   0        0.0253   0.0100   0
              SID       0.0001   0        0.0010   0.0001   0

[Figure 11.14: Cumulative computing time in seconds of DSGA, Dist-SGA, OPSGA, and RHSP-OPSGA as p increases on HYDICE data. Panel (a) plots all four algorithms; panel (b) replots Dist-SGA, OPSGA, and RHSP-OPSGA on the same scale (x-axis: number of endmembers (p); y-axis: accumulative computing time (sec)).]

Tables 11.7 and 11.8 also tabulate the computer processing times required by
the RHSP-OPSGA, OPSGA, Dist-SGA, and DSGA to generate Figs. 11.6
and 11.11.
As we can see from Tables 11.7 and 11.8, the RHSP-OPSGA was the best,
requiring the least computing time, while the DSGA was the worst, requiring
much higher computing time. In addition, the more endmember pixels that had to be
found, the more computational savings could be achieved. It is worth noting that the
Dist-SGA was shown in Wang et al. (2013) to be the best among all variants of
SGA. Figures 11.14 and 11.15 and Tables 11.7 and 11.8 further show that our
developed OPSGA and RHSP-OPSGA outperformed the Dist-SGA.
[Figure 11.15: Cumulative computing time in seconds of DSGA, Dist-SGA, OPSGA, and RHSP-OPSGA as p increases on Cuprite data. Panel (a) plots all four algorithms; panel (b) replots Dist-SGA, OPSGA, and RHSP-OPSGA on the same scale (x-axis: number of endmembers (p); y-axis: accumulative computing time (sec)).]

Table 11.7 Comparison of computing time in seconds by various methods along with DSGA for HYDICE data

p     DSGA      Dist-SGA   OPSGA     RHSP-OPSGA
9     1.7086    0.1134     0.1136    0.1026
18    5.2592    0.2542     0.2292    0.2114
34    20.1801   0.5202     0.4177    0.4014
45    43.7703   0.7192     0.5482    0.5344

Table 11.8 Comparison of computing time in seconds by various methods along with DSGA for Cuprite data

p     DSGA      Dist-SGA   OPSGA     RHSP-OPSGA
22    262.31    11.1501    11.6834   11.6477
46    1465.45   11.9714    16.3953   16.3544
75    4430.95   39.8937    26.9343   26.8683

To conclude this section, some comments are noteworthy. It is known that VCA
is also an OP-based algorithm and has become popular for finding endmembers in
recent years. However, it suffers from several implementation issues. One is its use
of random initial conditions, which results in inconsistent sets of endmembers being
found. Another is that it generally requires preprocessing, such as a DR transform to
reduce data volume, prior to finding endmembers. As shown in Chang (2013), different
DR transforms generally produce different sets of endmembers. Most importantly,
VCA is not a fully constrained algorithm that generally finds suboptimal solutions.
It is shown that VCA cannot compete against ATGP in the sense of finding the
maximal OP or against the SGA in the sense of finding the maximal SV. Compared
to VCA, Wang et al.’s Dist-SGA and the proposed OPSGA, along with RHSP-
OPSGA, have no such issues. The only major advantage that VCA can offer is its
lower computational complexity, which requires only inner products. But this
advantage can also be gained by the RHSP-OPSGA, which requires only inner
and outer products of vectors, none of which are matrix multiplications.
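To make the inner/outer-product-only recursion concrete, the following Python sketch (an illustration of the general idea with NumPy, not the book's exact RHSP-OPSGA implementation; the function name is my own) finds successive maximal-OP targets while deflating every sample's residual with nothing more costly than inner products and one outer product per step, and no matrix inverse:

```python
import numpy as np

def atgp_recursive(data, p):
    """Sketch of an ATGP/OP-style recursion without matrix inverses.

    data: L x N array of sample vectors. Maintains the orthogonal-complement
    residual of every sample and deflates it against each newly found target
    using only inner products and a single outer product per iteration.
    """
    R = data.astype(float).copy()                 # residuals; U starts empty
    idx = []
    for _ in range(p):
        k = int(np.argmax((R * R).sum(axis=0)))   # sample with maximal OP norm
        idx.append(k)
        u = R[:, k].copy()
        R -= np.outer(u, (u @ R) / float(u @ u))  # deflate along u
    return idx
```

On a toy 2-band data set with one strong and one orthogonal weaker signature, the recursion picks them in order of residual energy.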
[Figure 11.16: Computing times for ATGP, VCA with/without DR, Dist-SGA, OPSGA, and RHSP-OPSGA. (a) HYDICE data. (b) Cuprite data. (x-axis: number of endmembers (p); y-axis: accumulative computing time (sec).)]

Figure 11.16a, b shows that the computing time of VCA varies with the data set and is
not as stable as that of the other algorithms: VCA with no DR required the least
computing time for both data sets, while VCA with DR had the worst computing time for
HYDICE but the second best for Cuprite data. Overall, the Dist-SGA required the most
computing time among all six test algorithms (except for VCA with DR on the HYDICE
data set), while the OPSGA and RHSP-OPSGA ranked third best, next to VCA with no DR
and the ATGP. However, since VCA is not designed to find the maximal SV, according to
Chang (2013) and Chen et al. (2014), despite its savings in computing time, VCA did
not produce good endmember results, except when it used ATGP-generated target pixels
as initial conditions. It is also interesting to see that VCA with DR required the
most time for HYDICE (Fig. 11.16a) but had the second best time for Cuprite data
(Fig. 11.16b). This is because DR is performed prior to VCA: when the data size is
very large, as with Cuprite, VCA with DR actually takes advantage of the DR-reduced
data cube to save computing time, owing to the fact that DR is only a one-time
operation. As a result, VCA with/without DR achieved the best computing times
(Fig. 11.16b) regardless of its endmember-finding performance.
Another comment is in order. Since VCA makes use of random initial conditions
for each of the endmembers it generates, it is interesting to see how random initial
conditions affect its final endmember results. In Chang (2013) and Chen et al.
(2014), three initial conditions were investigated for VCA with no DR: (1) initial
conditions generated by Gaussian random variables, as originally used for VCA by
Nascimento and Dias (2005); (2) the unity vector with all L components specified
by 1s, that is, 1 = (1, 1, . . ., 1)^T; and (3) ATGP-generated target pixels. It was shown
in Chang (2013) and Chen et al. (2014) that the best performance produced by VCA
was the one that used ATGP-generated target pixels as the initial conditions.
Interestingly, in this case, VCA was simply reduced to the ATGP, as shown in
Chang (2013) and Chen et al. (2014). Moreover, it was also shown in Chang (2013)
and Chen et al. (2014) that if VCA is implemented with DR or using random initial
conditions, VCA generally did not perform as well as the ATGP.
Finally, we point out that as long as the maximal OP and the maximal SV are
used as criteria for finding endmembers, VCA may not be effective, even though
VCA with/without DR may require the least computing time (Fig. 11.16). Its
savings in computing time must be traded off against its suboptimal performance,
a compromise we consider not worthwhile. All things considered, the RHSP-
OPSGA is generally preferred among all currently available SV-based EFAs in
terms of both performance and computing efficiency.

11.10 Conclusions

Owing to its simple architecture and competitive performance, the SGA has emerged
as a good alternative to N-FINDR when it comes to reducing computational
complexity and saving computing time. However, calculating an SV remains a
challenging issue in terms of computational cost. This chapter develops an OP-based
SGA (OPSGA) that transforms the calculation of an SV into finding an OP. In
particular, a recursive version of OPSGA, RHSP-OPSGA, is derived to allow OPSGA
to perform the SGA via an OP in a recursive manner. The RHSP-OPSGA offers several
significant benefits. First, it provides an effective means of determining, in an
unsupervised fashion, how many endmembers need to be generated, whereas the SGA
must know this number in advance prior to implementation. Another benefit is that
it reduces the computational cost of calculating an SV to finding only inner and
outer products of vectors, not matrices; in particular, no matrix inverses are
required during the recursion. Finally, the recursive equation developed to
implement the RHSP-OPSGA is a Kalman filter-like recursion that can be easily
implemented in hardware design.
Chapter 12
Recursive Hyperspectral Sample Processing
of Geometric Simplex Growing Algorithm

Abstract Simplex volumes (SVs) have been used in the literature as a criterion for
finding endmembers. A main issue that arises in finding SVs is inverting a
nonsquare matrix, which involves excessive computing time in calculating the
matrix determinant. This type of SV calculation is referred to as a determinant-
based SV (DSV) calculation (Chap. 2). Therefore, several preprocessing steps are
suggested for DSV calculation in Chap. 2 to ease computational complexity, such
as reducing data dimensionality, easing computational complexity by manipulating
determinant calculation, narrowing search regions via Pixel Purity Index (PPI), or
developing algorithms such as the simplex growing algorithm (SGA) to grow
simplexes one after another sequentially instead of finding all endmembers together
simultaneously. However, all these DSV calculation techniques remain stuck with
certain inherent drawbacks encountered in finding SVs using a matrix determinant
calculation, in addition to another issue: the calculated SV may not be a true SV, as
pointed out in Chap. 2. To resolve this dilemma, Chap. 11 developed an orthogonal
projection (OP)-based growing simplex volume analysis (GSVA) approach, called
orthogonal projection SGA (OPSGA), which calculates a geometric SV (GSV)
from a simplex geometry point of view instead of resorting to the matrix
determinant. It transforms finding a new endmember that yields the maximal DSV by
growing previously found simplexes into finding an endmember with the maximal OP
onto a hyperplane that is linearly spanned by the simplexes specified by previously
found endmembers. Accordingly, OPSGA can be considered a technique for
calculating GSV by OP (GSV-OP). As an alternative to OPSGA, this chapter
presents another GSVA approach to finding GSV, referred to as the geometric
SGA (GSGA), developed by Chang, Li, and Song (IEEE Journal of Selected Topics
in Applied Earth Observations and Remote Sensing 99:1–13, 2016c), which con-
verts GSV calculation into a product of the height and base of a simplex. In parallel
to OPSGA, GSGA can also be considered a technique for calculating the GSV by
the simplex height (GSV-SH). In other words, finding the maximal SV is equivalent
to finding the maximal height of a simplex. As a result, when simplexes grow one
vertex at a time, GSV can be calculated by multiplying only successive heights
where the heights can be found by the Gram–Schmidt orthogonalization process
(GSOP). This is because the bases of previously grown simplexes are already
known and need not be recalculated. Furthermore, in analogy with recursive
hyperspectral sample processing of OPSGA (RHSP-OPSGA), a recursive
hyperspectral sample processing of GSGA (RHSP-GSGA) can also be derived

and implemented as a Kalman filter-like algorithm so as to achieve significant
savings in computing time when implemented in real time.
Interestingly, GSGA/RHSP-GSGA produces sets of endmembers identical to those
of OPSGA/RHSP-OPSGA, even though both calculate SVs differently. In other
words, the set of endmembers produced by GSGA through finding the maximal
heights turns out to be the same set of endmembers found by OPSGA through
maximal OPs. This is because finding the maximal OP by orthogonal subspace
projection (OSP) can be shown to be equivalent to finding the maximal height by
GSOP. However, there is a key difference between GSGA and OPSGA, where
GSGA works on simplex edges, as opposed to OPSGA, which deals with vertices
directly. Consequently, the initial conditions required by GSGA and OPSGA are
also different, where GSGA must start with two vertices as an initial simplex edge,
whereas OPSGA can start with any single vertex as its initial endmember in the
same way SGA does. Another difference between GSGA and OPSGA is the
computational complexity and computer processing time resulting from the use
of OSP and GSOP. Despite the fact that OP and SVs are different criteria used to
design endmember finding algorithms (EFAs), several recent studies showed that
they were actually closely related (Chen 2014; Chang Real time progressive
hyperspectral image processing: endmember finding and anomaly detection.
Springer, 2016; IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. (99):1–27,
2016b). This fact is further confirmed by OPSGA developed in Chap. 11 and GSGA
developed in this chapter.

12.1 Introduction

As discussed in Chang (2016), the two most widely used algorithms for finding
endmembers are the OP-based Pixel Purity Index (PPI) developed by Boardman
(1994) and the simplex volume (SV)-based N-finder algorithm (N-FINDR) devel-
oped by Winter (1999a, b). In recent years N-FINDR seems to have emerged as a
preferred endmember finding algorithm (EFA) over other EFAs using different
criteria such as OP because its use of SVs as a criterion for optimality turns out also
to satisfy the convexity required for the purity of endmembers by imposing two
physical abundance constraints, abundance sum-to-one constraint (ASC) and abun-
dance nonnegativity constraint (ANC), used by linear spectral mixture analysis
(LSMA) for linear spectral unmixing (LSU).
Although SGA significantly reduces computer processing time for N-FINDR,
finding SVs generally requires calculating the determinant of a nonsquare matrix
defined in (2.1) in Chap. 2 by

    DSV(m_1, m_2, . . ., m_p) = |Det([1 1 · · · 1; m_1 m_2 · · · m_p])| / (p − 1)!,    (12.1)
where {m_j}_{j=1}^p are the p endmembers. This is because the dimensionality of the
simplex specified by p endmembers is p − 1, which is relatively small compared
to the total number of bands, L. In this case, SGA repeatedly uses (12.1) to
calculate DSV via matrix determinants. Under such a circumstance two approaches
are usually adopted. One is to perform data dimensionality reduction (DR) to make
a nonsquare matrix a square matrix. The other is to perform singular value
decomposition (SVD) to find a square submatrix. Unfortunately, neither approach
gives a true SV, as discussed in Chap. 2. This issue was overlooked since all
simplex-based EFAs are designed to find endmembers, not to find true SVs. As a
result, finding true SVs is immaterial to EFAs. In fact, it was shown in Li et al.
(2015a) and Chap. 2 that the determinant-based SVs (DSVs) calculated by N-FINDR
and SGA are not true SVs. Similarly, when vertex component analysis (VCA), using
the maximal OP, is implemented to find endmembers via convex hulls, whether or not
the SVs it finds are maximal is not important.
Chap. 4 in Chang (2016), Chang et al. (2016b), and Chen et al. (2014a, b, c) that the
OP found using VCA is indeed not necessarily a maximal SV. Furthermore, the
convex hulls found by VCA do not yield maximal convex cone volume either, as
shown in Chap. 11 in Chang (2016) and in Chen et al. (2014a, b, c). The main issue
arises in VCA from the fact that the DSV or convex cone volume calculated by
vertices or endmembers is not a true SV because the vertices found by VCA do not
satisfy convexity constraints, and the dimensionality of the simplex and convex
cone formed by these vertices is always one dimension less than the total number of
vertices. To resolve this issue, the matrix determinant used to calculate SVs must
include a unity vector with ones in all components as an additional row in (2.1) in
Chap. 2 or (12.1) to impose ASC. Second, the computing time involved in finding a
DSV grows exponentially with increasing numbers of endmembers because the
computing time required in inverting larger matrices to find matrix determinants via
SVD is increased substantially. Third, DSV calculation may run into an inevitable
numerical issue as the number of endmembers, p, grows.
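As an illustration of the determinant-based calculation discussed above, the following Python sketch (a hypothetical helper of my own, assuming the augmented matrix is square, i.e., L = p − 1, so that the determinant is directly defined) computes a DSV in the style of (12.1):

```python
import numpy as np
from math import factorial

def dsv(endmembers):
    """Determinant-based simplex volume in the style of Eq. (12.1).

    endmembers: list of p vectors of dimension L. The determinant is only
    directly defined when the augmented matrix is square (L = p - 1);
    otherwise DR or SVD must be applied first, which, as noted above,
    no longer yields a true SV.
    """
    M = np.column_stack(endmembers)        # L x p endmember matrix
    p = M.shape[1]
    A = np.vstack([np.ones((1, p)), M])    # prepend a row of 1s (imposes ASC)
    return abs(np.linalg.det(A)) / factorial(p - 1)
```

For the unit right triangle with vertices (0,0), (1,0), (0,1), this returns 0.5, the triangle's area.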
More specifically, assume that a given p-vertex simplex is formed by a set of
p endmembers {m_j}_{j=1}^p already found by a sequential EFA, which finds
endmembers one at a time. To find the (p + 1)th endmember, m_{p+1}, as a new
endmember, the DSGA grows (p + 1)-vertex simplexes from the p-vertex simplex
specified by {m_j}_{j=1}^p and finds the particular (p + 1)-vertex simplex with
m_{p+1} as a new vertex that yields the maximal SV among all possible
(p + 1)-vertex simplexes by calculating matrix determinants.
The OPSGA developed in Chap. 11 considers a hyperplane linearly spanned by
the p previously found endmembers {m_j}_{j=1}^p, with the p-vertex simplex
specified by {m_j}_{j=1}^p, denoted by S(m_1, m_2, . . ., m_p), embedded in this
hyperplane. It then finds m_{p+1} as a new endmember that yields the maximal OP
via P_U^⊥ in (4.19), with the undesired signature matrix U formed by
{m_j}_{j=1}^p. It turns out that finding such a maximal OP can be carried out by
the automatic target generation process (ATGP) developed in Sect. 4.4.2.3. In
other words, OPSGA can be interpreted as a new application of ATGP in finding new
endmembers. Interestingly, this is also demonstrated in Chap. 13 in Chang (2016)
and in Chang et al. (2016b), where the endmembers {m_j}_{j=1}^p found by DSGA are
actually identical to those found by ATGP, provided that the initial endmember
m_1 used by the DSGA and ATGP is the same. This indicates that OPSGA finds the
same set of endmembers {m_j}_{j=1}^p found by DSGA but significantly reduces
computing time compared to DSGA, which produces endmembers by repeatedly
calculating DSV through matrix determinants.
In contrast to OPSGA, a new geometric approach to SGA, called the geometric
simplex growing algorithm (GSGA), recently developed by Chang et al. (2016c),
considers the p-vertex simplex specified by the previously found p endmembers
{m_j}_{j=1}^p, S(m_1, m_2, . . ., m_p), as the base of a newly grown
(p + 1)-vertex simplex, S(m_1, m_2, . . ., m_{p+1}), with m_{p+1} as the newly
added endmember that yields the maximal height perpendicular to the base. To find
such an m_{p+1}, GSGA takes advantage of the Gram–Schmidt orthogonalization
process (GSOP) to orthonormalize the p endmembers {m_j}_{j=1}^p and then finds
the m_{p+1} with the maximal magnitude along the (p + 1)th vector orthonormal to
{m_j}_{j=1}^p as its maximal height above the p-vertex simplex
S(m_1, m_2, . . ., m_p). That is, GSGA produces endmembers by finding consecutive
maximal heights using GSOP, compared to OPSGA, which produces endmembers by
finding consecutive maximal OPs via ATGP. More specifically, GSGA calculates GSV
by finding successive maximal simplex heights, referred to as GSV calculation by
simplex height (GSV-SH), as opposed to OPSGA, which calculates GSV by finding
successive maximal OPs, referred to as GSV calculation by OP (GSV-OP).
Technically speaking, all three algorithms, DSGA, GSGA, and OPSGA, should produce
the same set of endmembers if the initial conditions used to initialize them are
the same. However, there are several key differences among DSGA, OPSGA, and GSGA,
described in what follows.
One is how they calculate SVs, even though they may all produce the same sets
of endmembers given the same circumstances. Second, DSGA and OPSGA gener-
ate one endmember at a time and then use the generated endmembers to form
simplexes. As a result, the dimensionality of endmembers (i.e., the number of total
spectral bands, L) is much higher than the dimensionality of their formed sim-
plexes, that is, p. Thus, DSGA generally requires DR to calculate SVs. By contrast,
OPSGA and GSGA find endmembers that yield maximal GSVs and, thus, do not
require DR. As a consequence, OPSGA and GSGA can calculate true SVs but
DSGA cannot. This leads to an important difference in how they use initial
conditions. Since DSGA and OPSGA find endmembers directly and individually,
they can start by having any data sample vector be one initial endmember as an
initial condition. But GSGA is designed to calculate GSV based on simplex edges,
and so it must start with two endmembers as its initial condition. In this case, the
lowest dimensional simplex that has edges is the two-vertex simplex. Therefore,
GSGA must start with a two-vertex simplex specified by two data sample vectors
with the maximal distance as two initial endmembers. A most important advantage
of GSGA over DSGA and OPSGA, which are endmember-dependent methods, is that the
simplexes found by GSGA have nothing to do with the spatial locations of the
endmembers because they are determined by simplex edges, not simplex vertices. In
this sense, GSGA is actually a simplex edge-based technique. In particular, it is
also translation-invariant and rotation-invariant.
Analogously to RHSP-OPSGA, GSGA can also be extended to a recursive GSGA
(RHSP-GSGA), where two Kalman filter (KF)-like versions can be derived (Chang and Li
2016a, b). One is called KF-based orthogonal subspace projection RHSP-GSGA
(KF OSP-RHSP-GSGA), which takes advantage of OSP to derive a recursive
equation. The other is called KF-based orthogonal vector projection RHSP-
GSGA (KF OVP-RHSP-GSGA), which makes use of GSOP to derive a recursive
equation. The recursive structures of both algorithms provide further advantages in
the application of GSGA in hardware implementation and design.

12.2 Geometric Simplex Growing Algorithm (GSGA)

As shown in (2.10)–(2.11), finding V(S_j) is equivalent to calculating

    DSV(S_{j+1}) = (1/j!)|Det(M_E)| = (1/j!)|Det([1 1 · · · 1; m_1 m_2 · · · m_{j+1}])|
                 = (1/j!)|Det([m̃_2 m̃_3 · · · m̃_{j+1}])| = GSV(S_{j+1}),    (12.2)

where m̃_j = m_j − m_1. The method of finding the maximal height via (12.2) for
each 1 ≤ i ≤ j to calculate the maximal SV is called GSGA, as opposed to the
original SGA developed by Chang et al. (2006a, b, c), referred to as the
determinant-based SGA (DSGA), which uses the matrix determinant specified by
(12.1) to find the maximal SV. Interestingly, (12.2) also shows that GSGA is
essentially the same as DSGA. Nevertheless, it is worth noting that the SV in
(12.2) can only be calculated when M_E is a square matrix, since a determinant is
defined only for square matrices. More details on (12.2) can be found in Stein
(1966). The GSGA presented in this section was developed by Chang et al. (2016c)
to address all the issues mentioned in the introduction as an alternative
approach to DSGA by taking advantage of simplex geometric structures to find SVs
without performing either DR or SVD.
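The equality in (12.2) between the determinant form and a product of successive heights can be checked numerically. The short Python sketch below (illustrative only, using random data; NumPy's QR factorization performs Gram–Schmidt, and the absolute values of R's diagonal are exactly the initial edge length and the successive heights) confirms it for a 4-vertex simplex:

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 4))        # p = 4 endmembers in L = 3 bands
E = M[:, 1:] - M[:, [0]]               # edge matrix [m2-m1 m3-m1 m4-m1], square here
_, R = np.linalg.qr(E)                 # Gram-Schmidt in matrix form
heights = np.abs(np.diag(R))           # edge length ||m2-m1|| and successive heights
# product of edge length and heights = |Det| of the edge matrix, as in (12.2)
assert np.isclose(heights.prod(), abs(np.linalg.det(E)))
gsv = heights.prod() / factorial(3)    # geometric simplex volume for j = 3
```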
The idea of GSGA comes very naturally from the geometric structure of a simplex,
where a simplex can be considered a convex set formed by its edges instead of its
vertices, and its volume can be calculated by finding the GSV formed by its
simplex edges. In this case, GSGA starts with its initial condition, which is a
single edge, ||m̃_2|| = ||m_2 − m_1||, connecting two endmembers m_1 and m_2 with
maximal length defined by the l2 norm, ||x||^2 = x^T x. This is because
GSV(m_1, m_2) = ||m_2 − m_1|| is the maximal GSV of a degenerate one-dimensional
(1D) two-vertex simplex, S(m_1, m_2). In this case, we can select either m_1 or
m_2 as the initial endmember, and there is no randomness issue encountered in
GSGA. Also, it should be noted that GSGA does not start with a single vertex,
since a one-vertex simplex has dimension zero and zero volume. This is quite
different from all EFAs that start with random initial conditions by either
randomly selecting the first endmember for initialization, such as SGA or VCA, or
p endmembers, such as N-FINDR. To find a third vertex, GSGA finds the data sample
vector, denoted by m_3, that yields the maximal height h_2 perpendicular to the
edge vector m̃_2 = m_2 − m_1. We define m̃_3 = m_3 − m_1 as a second simplex edge
vector and further decompose m̃_3 into m̃_3 = m̃_3^⊥ + m̃_3^∥, where m̃_3^⊥ is the
component orthogonal to the edge m̃_2, with the maximal OP given by
h_2 = ||m̃_3^⊥||, and m̃_3^∥ is the component along m̃_2. These three vertices are
then used to form a triangle Δ(m_1, m_2, m_3) as a two-dimensional (2D)
three-vertex simplex, S(m_1, m_2, m_3), with 3 = C(3,2) simplex edges specified
by ||m_2 − m_1||, ||m_3 − m_1||, and ||m_3 − m_2||. Its corresponding volume is
calculated by GSV(m_1, m_2, m_3) = GSV(m_1, m_2) × h_2. The same procedure is
repeated to find a fourth data sample vector m_4 defining a third simplex edge
vector, m̃_4 = m_4 − m_1, which can be decomposed into two components,
m̃_4 = m̃_4^⊥ + m̃_4^∥, where m̃_4^⊥ yields the maximal OP, defined as
h_3 = ||m̃_4^⊥||, perpendicular to the space ⟨m̃_2, m̃_3⟩ linearly spanned by m̃_2
and m̃_3, which is the plane containing the triangle Δ(m_1, m_2, m_3), and m̃_4^∥
is the component lying in the space ⟨m̃_2, m̃_3⟩. We then form a
three-dimensional (3D) four-vertex simplex, S(m_1, m_2, m_3, m_4), with
6 = C(4,2) simplex edges specified by ||m_2 − m_1||, ||m_3 − m_1||,
||m_3 − m_2||, ||m_4 − m_1||, ||m_4 − m_2||, and ||m_4 − m_3||, and the same
procedure is repeated over and over again. Figure 12.1 provides an example to
illustrate how m_3 and m_4 are found by GSGA.
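The growing procedure just described can be sketched in a few lines of Python (an illustrative toy implementation with my own function name and a brute-force initial-edge search, not the book's optimized algorithm):

```python
import numpy as np

def gsga(data, p):
    """Toy sketch of GSGA: grow a simplex by successive maximal heights.

    data: L x N array of sample vectors; returns the indices of p vertices.
    Each new vertex maximizes its height above the current simplex base,
    computed Gram-Schmidt style against the already-found edge vectors.
    """
    # initial edge: the pair of samples with maximal pairwise distance
    # (O(N^2) brute force, purely for illustration)
    d2 = ((data[:, :, None] - data[:, None, :]) ** 2).sum(axis=0)
    i, j = np.unravel_index(np.argmax(d2), d2.shape)
    idx = [int(i), int(j)]
    e = data[:, j] - data[:, i]
    Q = (e / np.linalg.norm(e)).reshape(-1, 1)   # orthonormal basis of edges
    for _ in range(p - 2):
        edges = data - data[:, [i]]              # candidate edge vectors
        resid = edges - Q @ (Q.T @ edges)        # components orthogonal to base
        heights = np.linalg.norm(resid, axis=0)  # candidate simplex heights
        k = int(np.argmax(heights))              # maximal height -> new vertex
        idx.append(k)
        Q = np.column_stack([Q, resid[:, k] / heights[k]])
    return idx
```

On a toy data set whose columns are the vertices of a triangle plus interior points, the three returned indices are exactly the triangle's vertices.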
Suppose that for j ≥ 2, m_1, m_2, . . ., m_j are previously found endmembers and
m_{j+1} is the next endmember to be found. Assume that S(m_1, m_2, . . ., m_j) is
the (j − 1)-dimensional j-vertex simplex formed by m_1, m_2, . . ., m_j, with
volume V(m_1, m_2, . . ., m_j). Let t_{j+1} be a new data sample to be added as
the (j + 1)th vertex to form a (j + 1)-vertex simplex,
S(m_1, m_2, . . ., m_j, t_{j+1}). Then its SV, V(m_1, m_2, . . ., m_j, t_{j+1}),
is proportional to the volume of its base, V(m_1, m_2, . . ., m_j), multiplied by
its height h_j, where t̃_{j+1} = t_{j+1} − m_1 = t̃_{j+1}^⊥ + t̃_{j+1}^∥ and
h_j = ||t̃_{j+1}^⊥|| is obtained by finding the distance from t_{j+1}
perpendicular to the base; that is,

    GSV(m_1, m_2, . . ., m_j, t_{j+1}) ∝ GSV(m_1, m_2, . . ., m_j) × h_j,    (12.3)

where the notation "∝" means "proportional to," scaled by a constant. Then the
desired (j + 1)th endmember, m_{j+1}, is the one that maximizes
GSV(m_1, m_2, . . ., m_j, t_{j+1}) over t_{j+1}; that is,
[Fig. 12.1: Simplexes with two, three, and four vertices along with their edges: a 2-vertex simplex with 1 edge, a 3-vertex simplex with 3 edges, and a 4-vertex simplex with 6 edges.]
  
    m_{j+1} = arg max_{t_{j+1}} {GSV(m_1, m_2, . . ., m_j, t_{j+1})}.    (12.4)

Accordingly, the volume of a j-dimensional (j + 1)-vertex simplex,
GSV(m_1, m_2, . . ., m_j, m_{j+1}), can be simply calculated by multiplying
GSV(m_1, m_2, . . ., m_j) of the simplex S(m_1, m_2, . . ., m_j) as its base by
its maximal height h_j = ||m̃_{j+1}^⊥||, where
m̃_{j+1} = m_{j+1} − m_1 = m̃_{j+1}^⊥ + m̃_{j+1}^∥. As a result, calculating GSV
can be performed by successively computing (12.3) and finding m_{j+1} via (12.4)
without calculating the matrix determinant in (12.1). Most importantly, there are
two significant differences between DSGA using (12.1) and GSGA using (12.3). One
is that DSGA calculates DSV using vertices as endmembers, with the ASC imposed by
the unity vector 1 = (1, 1, . . ., 1)^T with L components, whereas GSGA
calculates GSV using p − 1 simplex edge vectors without referring to vertices,
since GSV is calculated by multiplying the simplex edge length
||m̃_2|| = ||m_2 − m_1|| by the product of successive heights, with no knowledge
of the spatial positions of endmembers. Nonetheless, the rank of the matrix in
(12.1) has the same dimensionality, p − 1, as (12.2), which is specified by the
p − 1 vectors m̃_2 = m_2 − m_1, m̃_3 = m_3 − m_1, m̃_4 = m_4 − m_1, . . .,
m̃_p = m_p − m_1, each of which yields one dimension. The other difference is
that DSGA is determined by its initial condition, whereas GSGA is independent of
the initial condition because the initial endmember m_1 has been subtracted out
from all other endmembers {m_j}_{j=2}^p. In other words, the DSV calculated by
DSGA is determined by the endmembers themselves, but the GSV calculated by GSGA
is determined by the relative distances among endmembers as simplex edge lengths.
Accordingly, endmembers found by DSGA are completely determined by initial
conditions, and DSGA also generally requires DR, which does not produce a true
SV. By contrast, GSGA-calculated GSV is a true SV, owing to the fact that GSGA
does not require DR. Moreover, the simplexes found by GSGA are independent of
endmembers' spatial locations and can be moved around. This crucial difference
distinguishes GSGA from all variants of DSGA reported in the literature.
Interestingly, GSGA can also be interpreted by linear spectral unmixing using a
noise-free linear mixing model,

$$\mathbf{r}_{L\times 1} = \left[\mathbf{m}_1 \; \cdots \; \mathbf{m}_p\right]_{L\times p} \boldsymbol{\alpha}_{p\times 1}, \tag{12.5}$$

where $\mathbf{M}_{L\times p} = \left[\mathbf{m}_1 \; \cdots \; \mathbf{m}_p\right]$ is a signature matrix formed by the $p$ $L$-dimensional
endmembers, $\mathbf{m}_1, \mathbf{m}_2, \ldots, \mathbf{m}_p$, and $\boldsymbol{\alpha}_{p\times 1} = \left(\alpha_1, \ldots, \alpha_p\right)^{T}$ is a $p$-dimensional abundance
vector, with $\alpha_j$ indicating the abundance fraction of the $j$th endmember
present in the $L$-dimensional data sample vector $\mathbf{r}_{L\times 1}$. According to Honeine and
Richard (2012), (12.5) can be reexpressed as

$$\begin{bmatrix} 1 \\ \mathbf{r} \end{bmatrix}_{(L+1)\times 1} = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ \mathbf{m}_1 & \mathbf{m}_2 & \cdots & \mathbf{m}_p \end{bmatrix}_{(L+1)\times p} \boldsymbol{\alpha}_{p\times 1}, \tag{12.6}$$

which yields

$$\begin{aligned}
\begin{bmatrix} 1 \\ \mathbf{r} \end{bmatrix}_{(L+1)\times 1} &= \begin{bmatrix} 1 & 0 & \cdots & 0 \\ \mathbf{m}_1 & \mathbf{m}_2 - \mathbf{m}_1 & \cdots & \mathbf{m}_p - \mathbf{m}_1 \end{bmatrix}_{(L+1)\times p} \tilde{\boldsymbol{\alpha}}_{p\times 1} \\
\Rightarrow \; \mathbf{r}_{L\times 1} &= \mathbf{m}_1 + \left[\mathbf{m}_2 - \mathbf{m}_1 \; \cdots \; \mathbf{m}_p - \mathbf{m}_1\right]_{L\times (p-1)} \tilde{\boldsymbol{\alpha}}_{(p-1)\times 1} \\
\Rightarrow \; \mathbf{r}_{L\times 1} - \mathbf{m}_1 &= \left[\mathbf{m}_2 - \mathbf{m}_1 \; \cdots \; \mathbf{m}_p - \mathbf{m}_1\right]_{L\times (p-1)} \tilde{\boldsymbol{\alpha}}_{(p-1)\times 1} \\
\Rightarrow \; \tilde{\mathbf{r}}_{L\times 1} &= \left[\tilde{\mathbf{m}}_2 \; \cdots \; \tilde{\mathbf{m}}_p\right]_{L\times (p-1)} \tilde{\boldsymbol{\alpha}}_{(p-1)\times 1},
\end{aligned} \tag{12.7}$$

where $\tilde{\boldsymbol{\alpha}}_{(p-1)\times 1} = \left(\alpha_2, \alpha_3, \ldots, \alpha_p\right)^{T}$ is a $(p-1)$-dimensional abundance vector that
removes $\alpha_1$ from the original $p$-dimensional abundance vector $\boldsymbol{\alpha}_{p\times 1}$,
$\tilde{\mathbf{r}}_{L\times 1} = \mathbf{r}_{L\times 1} - \mathbf{m}_1$, and $\tilde{\mathbf{m}}_j = \mathbf{m}_j - \mathbf{m}_1$. Using the ASC, the abundance fraction $\alpha_1$ can be
obtained by

$$\alpha_1 = 1 - \sum_{j=2}^{p} \alpha_j. \tag{12.8}$$

12.2 Geometric Simplex Growing Algorithm (GSGA)

Using (12.7), the $j$th abundance $\alpha_j$ can be solved by Cramer's rule as follows:

$$\alpha_j = \frac{\operatorname{Det}\begin{bmatrix} 1 & \cdots & 1 & 1 & 1 & \cdots & 1 \\ \mathbf{m}_1 & \cdots & \mathbf{m}_{j-1} & \mathbf{r} & \mathbf{m}_{j+1} & \cdots & \mathbf{m}_p \end{bmatrix}}{\operatorname{Det}\begin{bmatrix} 1 & \cdots & 1 & 1 & 1 & \cdots & 1 \\ \mathbf{m}_1 & \cdots & \mathbf{m}_{j-1} & \mathbf{m}_j & \mathbf{m}_{j+1} & \cdots & \mathbf{m}_p \end{bmatrix}} \quad \text{for } 1 \leq j \leq p \tag{12.8}$$

and

$$\alpha_j = \frac{\operatorname{Det}\left(\left[\tilde{\mathbf{m}}_2 \; \cdots \; \tilde{\mathbf{m}}_{j-1} \; \tilde{\mathbf{r}} \; \tilde{\mathbf{m}}_{j+1} \; \cdots \; \tilde{\mathbf{m}}_p\right]\right)}{\operatorname{Det}\left(\left[\tilde{\mathbf{m}}_2 \; \cdots \; \tilde{\mathbf{m}}_{j-1} \; \tilde{\mathbf{m}}_j \; \tilde{\mathbf{m}}_{j+1} \; \cdots \; \tilde{\mathbf{m}}_p\right]\right)} \quad \text{for } 2 \leq j \leq p. \tag{12.9}$$

Interestingly, if we consider a $p$-vertex simplex whose $p$ vertices are specified by
the $p$ signature vectors in $\mathbf{M}$ in (12.5), $\mathbf{m}_1, \mathbf{m}_2, \ldots, \mathbf{m}_p$, then its SV can be calculated
by

$$\mathrm{DSV}\left(\mathbf{m}_1, \ldots, \mathbf{r}, \ldots, \mathbf{m}_p\right) \propto \frac{\operatorname{Det}\left(\left[\tilde{\mathbf{m}}_2 \; \cdots \; \tilde{\mathbf{m}}_{j-1} \; \tilde{\mathbf{r}} \; \tilde{\mathbf{m}}_{j+1} \; \cdots \; \tilde{\mathbf{m}}_p\right]\right)}{(p-1)!}. \tag{12.10}$$

By virtue of (12.9) and (12.10), $\alpha_j$ can be further calculated as

$$\alpha_j = \frac{\mathrm{DSV}\left(\mathbf{m}_1, \ldots, \mathbf{m}_{j-1}, \mathbf{r}, \mathbf{m}_{j+1}, \ldots, \mathbf{m}_p\right)}{\mathrm{DSV}\left(\mathbf{m}_1, \ldots, \mathbf{m}_{j-1}, \mathbf{m}_j, \mathbf{m}_{j+1}, \ldots, \mathbf{m}_p\right)} = \frac{\mathrm{DSV}\left(\tilde{\mathbf{m}}_2, \ldots, \tilde{\mathbf{m}}_{j-1}, \tilde{\mathbf{r}}, \tilde{\mathbf{m}}_{j+1}, \ldots, \tilde{\mathbf{m}}_p\right)}{\mathrm{DSV}\left(\tilde{\mathbf{m}}_2, \ldots, \tilde{\mathbf{m}}_{j-1}, \tilde{\mathbf{m}}_j, \tilde{\mathbf{m}}_{j+1}, \ldots, \tilde{\mathbf{m}}_p\right)} \quad \text{for } 2 \leq j \leq p, \tag{12.11}$$

along with $\alpha_1$ given by (12.8) and (12.2), which can then be used to calculate
(12.10) and (12.11).
Furthermore, a comparison of (12.6) and (12.7) clearly shows that the ASC
imposed on (12.6) has been removed in (12.7). This is because the SV calculated by
(12.6) is based on the $p$ signatures, $\mathbf{m}_1, \mathbf{m}_2, \ldots, \mathbf{m}_p$, which must satisfy the ASC,
while the SV calculated by (12.7) is based on the $p-1$ simplex edge vectors,
$\tilde{\mathbf{m}}_2 = \mathbf{m}_2 - \mathbf{m}_1, \ldots, \tilde{\mathbf{m}}_p = \mathbf{m}_p - \mathbf{m}_1$, whose number is less than $p$ by one, since
(12.7) does not need to impose the ASC. This further indicates that, unlike (12.6),
which is determined by the signatures themselves, the GSV calculated by (12.7) according
to their edges has nothing to do with the spatial locations of the signatures. This crucial
difference allows OP to be used to calculate the GSV.
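As a numerical illustration of the abundance formulas above, the following Python sketch (our own toy example, not code from the book; the endmember values are hypothetical) recovers the abundance fractions of a noise-free mixed pixel from determinant ratios of edge-vector matrices as in (12.9), with the first fraction recovered through the ASC as in (12.8). Since the book's determinant is taken over $(p-1)$ edge vectors, for $L > p-1$ we first project onto the span of the edge vectors (our assumption, a standard reduction):

```python
import numpy as np

# Hypothetical 3-band data with p = 3 endmembers (a triangle).
m1, m2, m3 = np.array([1., 0., 0.]), np.array([0., 1., 0.]), np.array([0., 0., 1.])
alpha = np.array([0.2, 0.5, 0.3])                  # ground-truth fractions (sum to 1)
r = alpha[0] * m1 + alpha[1] * m2 + alpha[2] * m3  # noise-free mixed pixel, (12.5)

# Edge vectors relative to m1 remove the ASC, as in (12.7).
e2, e3 = m2 - m1, m3 - m1
r_t = r - m1

def det2(a, b):
    # Determinant of the 2 x 2 matrix [a b] after projecting onto an
    # orthonormal basis of the edge-vector span (needed because L > p - 1).
    E = np.column_stack([e2, e3])
    Q, _ = np.linalg.qr(E)                 # orthonormal basis of the edge span
    return np.linalg.det(Q.T @ np.column_stack([a, b]))

denom = det2(e2, e3)
alpha2 = det2(r_t, e3) / denom            # replace 1st edge column by r_t, (12.9)
alpha3 = det2(e2, r_t) / denom            # replace 2nd edge column by r_t, (12.9)
alpha1 = 1.0 - alpha2 - alpha3            # ASC, (12.8)
print(alpha1, alpha2, alpha3)             # recovers (approximately) 0.2 0.5 0.3
```

Replacing a column by the shifted pixel and dividing by the full-edge determinant is exactly the Cramer's-rule ratio of (12.9); the same ratio could equivalently be read as a ratio of simplex volumes via (12.10) and (12.11).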
Most recently, Wang et al. (2013a, b, c) developed an approach, called distance-based
SGA (Dist-SGA), which used a new distance measure to find $\mathbf{t}_{j+1}$. Interestingly,
this approach amounts to finding a distance perpendicular to the simplex
base, which can be considered an OP onto the hyperplane containing previously
grown simplexes. As a result, Dist-SGA can be considered a technique for
calculating GSV by perpendicular distance (GSV-PD) and is essentially the same as
OPSGA developed in Chap. 11, which is a GSV-OP technique; a comparison
between OPSGA and Dist-SGA can be found in Chap. 11. In addition to Dist-SGA,
Geng et al. (2014) recently also proposed a similar approach for band selection but
not for finding endmembers.

12.2.1 Finding Heights of Simplexes for GSGA

As previously noted, a key element in implementing GSGA is finding the maximal
height of previously grown simplexes.

Suppose that for each $3 \leq j \leq p$ a simplex, $S_{j-1} = S\left(\mathbf{m}_1, \mathbf{m}_2, \ldots, \mathbf{m}_{j-1}\right)$, is a
$(j-2)$-dimensional $(j-1)$-vertex simplex, with the $j-1$ vertices specified by the
$j-1$ previously found endmembers, $\left\{\mathbf{m}_i\right\}_{i=1}^{j-1}$. The main goal of GSGA is to find
the maximal height, $h_j^{\mathrm{GSGA}}$, and its corresponding data sample vector, $\mathbf{m}_j^{\mathrm{GSGA}}$, so
that the simplex $S_j\left(\mathbf{m}_1, \ldots, \mathbf{m}_{j-1}, \mathbf{m}_j^{\mathrm{GSGA}}\right)$ formed by adding $\mathbf{m}_j^{\mathrm{GSGA}}$ to $S\left(\mathbf{m}_1, \ldots, \mathbf{m}_{j-1}\right)$ as
the $j$th vertex yields the maximal GSV. That is,

$$\begin{aligned}
\mathrm{GSV}\left(S\left(\mathbf{m}_1, \ldots, \mathbf{m}_{j-1}, \mathbf{m}_j^{\mathrm{GSGA}}\right)\right) &= \max_{\mathbf{m}_j} \mathrm{GSV}\left(S\left(\mathbf{m}_1, \ldots, \mathbf{m}_{j-1}, \mathbf{m}_j\right)\right) \\
&= h_j^{\mathrm{GSGA}} \cdot \mathrm{GSV}\left(S\left(\mathbf{m}_1, \ldots, \mathbf{m}_{j-1}\right)\right).
\end{aligned} \tag{12.12}$$

The solution to (12.12) can be found by the well-known GSOP. Two versions of
GSOP can be derived.

GSOP1 for Finding $h_j^{\mathrm{GSGA}}$

1. Initial condition:
   Find two data sample vectors, $\mathbf{m}_1$ and $\mathbf{m}_2$, with the maximal segment length as
   two endmembers. Let $\tilde{\mathbf{m}}_2 = \mathbf{m}_2 - \mathbf{m}_1$, $\mathbf{u}_1 = \mathbf{m}_1 / \left\|\mathbf{m}_1\right\|$, and
   $\mathbf{u}_2 = \tilde{\mathbf{m}}_2 / \left(\tilde{\mathbf{m}}_2^{T} \tilde{\mathbf{m}}_2\right)^{1/2} = \tilde{\mathbf{m}}_2 / \left\|\tilde{\mathbf{m}}_2\right\|$.
2. For the given $j$th target $\mathbf{m}_j$ for $j > 2$ find

$$\tilde{\mathbf{m}}_j = \mathbf{m}_j - \mathbf{m}_1 \tag{12.13}$$

   and the vector $\overline{\mathbf{m}}_j$ orthogonal to the space linearly spanned by $\mathbf{u}_2, \mathbf{u}_3, \ldots, \mathbf{u}_{j-1}$
   by

$$\begin{aligned}
\overline{\mathbf{m}}_j &= \tilde{\mathbf{m}}_j - \sum_{i=2}^{j-1} \frac{\left\langle\tilde{\mathbf{m}}_j, \overline{\mathbf{m}}_i\right\rangle}{\left\langle\overline{\mathbf{m}}_i, \overline{\mathbf{m}}_i\right\rangle} \overline{\mathbf{m}}_i \\
&= \tilde{\mathbf{m}}_j - \sum_{i=2}^{j-1}\left\langle\tilde{\mathbf{m}}_j, \mathbf{u}_i\right\rangle \mathbf{u}_i = \tilde{\mathbf{m}}_j - \sum_{i=2}^{j-1}\left(\tilde{\mathbf{m}}_j^{T} \mathbf{u}_i\right) \mathbf{u}_i \\
&= \tilde{\mathbf{m}}_j - \sum_{i=2}^{j-1} \mathbf{u}_i \mathbf{u}_i^{T} \tilde{\mathbf{m}}_j,
\end{aligned} \tag{12.14}$$

   where

$$\mathbf{u}_j = \frac{\overline{\mathbf{m}}_j}{\left(\overline{\mathbf{m}}_j^{T} \overline{\mathbf{m}}_j\right)^{1/2}} = \frac{\overline{\mathbf{m}}_j}{\left\|\overline{\mathbf{m}}_j\right\|}. \tag{12.15}$$

3. The maximal height of GSGA, $h_j^{\mathrm{GSGA}}$, can then be found and computed by

$$\tilde{\mathbf{m}}_j^{\mathrm{GSGA}} = \arg\max_{\tilde{\mathbf{r}}}\left\{\tilde{\mathbf{r}}^{T} \mathbf{u}_j\right\}, \tag{12.16}$$

$$h_j^{\mathrm{GSGA}} = \max_{\tilde{\mathbf{r}}} \tilde{\mathbf{r}}^{T} \mathbf{u}_j = \left(\tilde{\mathbf{m}}_j^{\mathrm{GSGA}}\right)^{T} \mathbf{u}_j, \tag{12.17}$$

$$\mathbf{m}_j^{\mathrm{GSGA}} = \tilde{\mathbf{m}}_j^{\mathrm{GSGA}} + \mathbf{m}_1, \tag{12.18}$$

   where $\tilde{\mathbf{r}} = \mathbf{r} - \mathbf{m}_1$.
4. Check whether $j = p$. If so, the algorithm is terminated. Otherwise, let $j \leftarrow j + 1$
   and go to step 2.
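The per-iteration search in steps 2 and 3 can be sketched in a few lines of Python (a minimal illustration of ours, not the authors' implementation; the data and names are hypothetical). For each candidate sample, the shifted vector $\mathbf{r} - \mathbf{m}_1$ is orthogonalized against the simplex base and the residual norm, which equals the height in (12.17), is maximized:

```python
import numpy as np

def find_next_endmember(data, endmembers):
    """Return (index, height) of the sample yielding the maximal simplex height."""
    m1 = endmembers[0]
    # Orthonormalize the edge vectors of the found endmembers, (12.13)-(12.15).
    U = []
    for m in endmembers[1:]:
        v = m - m1
        for u in U:
            v = v - (v @ u) * u
        U.append(v / np.linalg.norm(v))
    best_k, best_h = -1, -1.0
    for k, r in enumerate(data):
        v = r - m1                      # r~ = r - m1
        for u in U:
            v = v - (v @ u) * u         # residual orthogonal to the simplex base
        h = float(np.linalg.norm(v))    # equals r~^T u_j for u_j = v/||v||, (12.17)
        if h > best_h:
            best_k, best_h = k, h
    return best_k, best_h

# Tiny synthetic example: the sample farthest "above" the segment m1-m2 wins.
data = np.array([[0., 0.], [4., 0.], [2., 3.], [1., 1.]])
k, h = find_next_endmember(data, [data[0], data[1]])
print(k, h)   # 2 3.0
```

Note that the maximizing sample defines the new direction $\mathbf{u}_j$ itself, so in practice the search reduces to maximizing the residual norm, which is exactly what GSOP2 below makes explicit via (12.21).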
According to (12.16), $\tilde{\mathbf{m}}_j^{\mathrm{GSGA}}$ is obtained by finding the maximal OPs of all data
sample vectors $\tilde{\mathbf{r}} = \mathbf{r} - \mathbf{m}_1$ along the unit vector $\mathbf{u}_j$. In this case, $\tilde{\mathbf{r}}$ can be
decomposed into $\tilde{\mathbf{r}} = \tilde{\mathbf{r}}_j^{\perp} + \tilde{\mathbf{r}}_j^{\|}$, where $\tilde{\mathbf{r}}_j^{\perp} = P_{\mathbf{U}_{j-1}}^{\perp} \tilde{\mathbf{r}}$ and $\tilde{\mathbf{r}}_j^{\|} = P_{\mathbf{U}_{j-1}} \tilde{\mathbf{r}} = \mathbf{U}_{j-1} \mathbf{U}_{j-1}^{\#} \tilde{\mathbf{r}}$,
using the orthogonal subspace projector defined by $P_{\mathbf{U}_{j-1}}^{\perp} = \mathbf{I} - \mathbf{U}_{j-1} \mathbf{U}_{j-1}^{\#}$, with
$\mathbf{U}_{j-1} = \left[\mathbf{u}_2 \; \mathbf{u}_3 \cdots \mathbf{u}_{j-1}\right]$ and the pseudo-inverse of $\mathbf{U}_{j-1}$, $\mathbf{U}_{j-1}^{\#} = \left(\mathbf{U}_{j-1}^{T} \mathbf{U}_{j-1}\right)^{-1} \mathbf{U}_{j-1}^{T}$.
From a signal processing point of view, $\tilde{\mathbf{r}}_j^{\perp}$ and $\tilde{\mathbf{r}}_j^{\|}$ can be considered the high-pass
and low-pass filtered parts of the signal source $\tilde{\mathbf{r}}$. Thus, we can further decompose
the low-pass part $\tilde{\mathbf{r}}_{j-1}^{\|}$ into its high-pass part, $\tilde{\mathbf{r}}_{j-2}^{\perp}$, and low-pass part, $\tilde{\mathbf{r}}_{j-2}^{\|}$,
and the process continues until it reaches the first basis vector $\mathbf{u}_2$ as follows:

$$\tilde{\mathbf{r}}_{j-1}^{\|} = \tilde{\mathbf{r}}_{j-2}^{\perp} + \tilde{\mathbf{r}}_{j-2}^{\|} = \cdots = \tilde{\mathbf{r}}_{j-2}^{\perp} + \cdots + \tilde{\mathbf{r}}_3^{\perp} + \mathbf{u}_2. \tag{12.19}$$

It should be noted that (12.19) is very similar to the well-known wavelet representation
discussed in Chang (2013, Chap. 28). By means of (12.19), any data sample
vector $\mathbf{r}$ can be decomposed into $j$ mutually orthogonal components via the obtained
$j-2$ orthonormal basis unit vectors, $\mathbf{u}_2, \mathbf{u}_3, \ldots, \mathbf{u}_{j-1}$, given by (12.15) as
$\tilde{\mathbf{r}} = \tilde{\mathbf{r}}_j^{\perp} + \tilde{\mathbf{r}}_j^{\|} = \sum_{i=3}^{j} \tilde{\mathbf{r}}_i^{\perp} + \mathbf{u}_2$.

Moreover, in analogy with (12.14), we can represent

$$\overline{\mathbf{r}}_j = \tilde{\mathbf{r}} - \sum_{i=2}^{j-1} \frac{\left\langle\tilde{\mathbf{r}}, \overline{\mathbf{m}}_i\right\rangle}{\left\langle\overline{\mathbf{m}}_i, \overline{\mathbf{m}}_i\right\rangle} \overline{\mathbf{m}}_i = \tilde{\mathbf{r}} - \sum_{i=2}^{j-1}\left(\tilde{\mathbf{r}}^{T} \mathbf{u}_i\right) \mathbf{u}_i, \tag{12.20}$$

with $\tilde{\mathbf{r}} = \mathbf{r} - \mathbf{m}_1$. As a result, $\tilde{\mathbf{r}}$ can also be expressed in terms of the basis vectors
$\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_j$ by its coordinates $\left(\tilde{\mathbf{r}}^{T}\mathbf{u}_1, \tilde{\mathbf{r}}^{T}\mathbf{u}_2, \ldots, \tilde{\mathbf{r}}^{T}\mathbf{u}_j\right)$. This leads to
another version of GSOP for finding $h_j^{\mathrm{GSGA}}$.

GSOP2 for Finding $h_j^{\mathrm{GSGA}}$

1. Initial condition:
   Find two data sample vectors, $\mathbf{m}_1$ and $\mathbf{m}_2$, with the maximal segment length as
   two endmembers. Let $\tilde{\mathbf{m}}_2 = \mathbf{m}_2 - \mathbf{m}_1$, $\mathbf{u}_1 = \mathbf{m}_1 / \left\|\mathbf{m}_1\right\|$, and
   $\mathbf{u}_2 = \tilde{\mathbf{m}}_2 / \left(\tilde{\mathbf{m}}_2^{T} \tilde{\mathbf{m}}_2\right)^{1/2} = \tilde{\mathbf{m}}_2 / \left\|\tilde{\mathbf{m}}_2\right\|$.
2. For the given $j$th target $\mathbf{m}_j$ for $j > 2$, find $\tilde{\mathbf{m}}_j = \mathbf{m}_j - \mathbf{m}_1$ via (12.13), the
   orthogonalized vector $\overline{\mathbf{m}}_j = \tilde{\mathbf{m}}_j - \sum_{i=2}^{j-1} \frac{\left\langle\tilde{\mathbf{m}}_j, \overline{\mathbf{m}}_i\right\rangle}{\left\langle\overline{\mathbf{m}}_i, \overline{\mathbf{m}}_i\right\rangle} \overline{\mathbf{m}}_i$ by (12.14), and
   $\mathbf{u}_j = \overline{\mathbf{m}}_j / \left(\overline{\mathbf{m}}_j^{T} \overline{\mathbf{m}}_j\right)^{1/2} = \overline{\mathbf{m}}_j / \left\|\overline{\mathbf{m}}_j\right\|$ by (12.15).
3. Furthermore, find the $j$th mutually orthogonal component $\overline{\mathbf{r}}_j$ given by (12.20).
4. Using (12.17), the maximal height of GSGA, $h_j^{\mathrm{GSGA}}$, can be found and
   computed by

$$\overline{\mathbf{m}}_j^{\mathrm{GSGA}} = \arg\max_{\overline{\mathbf{r}}_j}\left\{\left\|\overline{\mathbf{r}}_j\right\|\right\} = \arg\max_{\overline{\mathbf{r}}_j}\left\{\overline{\mathbf{r}}_j^{T} \mathbf{u}_j\right\}, \tag{12.21}$$

$$\mathbf{m}_j^{\mathrm{GSGA}} = \overline{\mathbf{m}}_j^{\mathrm{GSGA}} + \mathbf{m}_1, \tag{12.22}$$

$$h_j^{\mathrm{GSGA}} = \left\|\overline{\mathbf{m}}_j^{\mathrm{GSGA}}\right\|. \tag{12.23}$$

5. Check whether $j = p$. If so, the algorithm is terminated. Otherwise, let $j \leftarrow j + 1$
   and go to step 2.
It is worth noting that GSOP has recently emerged as a promising alternative to
OSP, as demonstrated by Du et al. (2008b) for endmember extraction, by Geng
et al. (2014) for band selection, and by Song et al. (2014a, b) for LSMA.

12.2.2 Geometric Simplex Growing Algorithm

Even though GSOP1 and GSOP2, developed in Sect. 12.2.1, produce identical
results, there is a key difference between them. The equation used in GSOP1,
(12.16), finds a new endmember by projecting all data sample vectors
$\tilde{\mathbf{r}} = \mathbf{r} - \mathbf{m}_1$ directly onto the $j$th orthonormal basis vector $\mathbf{u}_j$, whereas (12.21), used
in GSOP2, finds a new endmember by orthogonalizing all data sample
vectors $\tilde{\mathbf{r}} = \mathbf{r} - \mathbf{m}_1$ against $\mathbf{u}_2, \mathbf{u}_3, \ldots, \mathbf{u}_{j-1}$ via (12.20) to obtain $\overline{\mathbf{r}}_j$ prior to projecting onto the
$j$th orthonormal basis vector $\mathbf{u}_j$. Nonetheless, either one can be used in the implementation
of GSGA, described as follows.

GSGA
1. Initial conditions:
   Find the two data sample vectors forming the maximal segment, with the two endpoints
   specified by $\mathbf{m}_1$ and $\mathbf{m}_2$, denoted by $\mathbf{m}_1^{\mathrm{GSGA}}$ and $\mathbf{m}_2^{\mathrm{GSGA}}$, respectively.
   Define $\tilde{\mathbf{m}}_2^{\mathrm{GSGA}} = \mathbf{m}_2^{\mathrm{GSGA}} - \mathbf{m}_1^{\mathrm{GSGA}}$.
2. For each $3 \leq j \leq p$ implement GSOP to find $h_j^{\mathrm{GSGA}}$ by (12.17) via finding
   $\overline{\mathbf{m}}_j^{\mathrm{GSGA}}$ in (12.14) through $\tilde{\mathbf{m}}_j^{\mathrm{GSGA}} = \arg\max_{\tilde{\mathbf{r}}}\left\{\tilde{\mathbf{r}}^{T} \mathbf{u}_j\right\}$ in (12.16). Or we can find
   $\overline{\mathbf{m}}_j^{\mathrm{GSGA}} = \arg\max_{\overline{\mathbf{r}}_j}\left\{\left\|\overline{\mathbf{r}}_j\right\|\right\}$ using (12.21).
3. Find either $\mathbf{m}_j^{\mathrm{GSGA}} = \tilde{\mathbf{m}}_j^{\mathrm{GSGA}} + \mathbf{m}_1$ via (12.18) or

$$\mathbf{m}_j^{\mathrm{GSGA}} = \overline{\mathbf{m}}_j^{\mathrm{GSGA}} + \mathbf{m}_1. \tag{12.24}$$

4. The set of $\left\{\mathbf{m}_j^{\mathrm{GSGA}}\right\}_{j=1}^{p}$ obtained in steps 2 and 3 is the desired set of endmembers
   generated by GSGA. In addition, $\mathrm{GSV}\left(S_j\left(\mathbf{m}_1^{\mathrm{GSGA}}, \ldots, \mathbf{m}_{p-1}^{\mathrm{GSGA}}, \mathbf{m}_p^{\mathrm{GSGA}}\right)\right)$ calculated
   by (12.12) is the maximal GSV produced by the simplex with its vertices
   specified by $\left\{\mathbf{m}_j^{\mathrm{GSGA}}\right\}_{j=1}^{p}$.

According to the preceding algorithm, GSGA can be implemented as a series of
sequential processes described as follows:
$$\begin{aligned}
&\mathbf{m}_j^{\mathrm{GSGA}} \;(j\text{th vertex/endmember}) \\
&\xrightarrow{(12.13)} \tilde{\mathbf{m}}_j^{\mathrm{GSGA}} \;(j\text{th edge vector}) = \mathbf{m}_j^{\mathrm{GSGA}} - \mathbf{m}_1 \\
&\xrightarrow{(12.14)} \overline{\mathbf{m}}_j^{\mathrm{GSGA}} \;(\text{orthogonalized vector}) \\
&\xrightarrow{(12.15)} \mathbf{u}_j^{\mathrm{GSGA}} \;(\text{orthonormalized vector}) \\
&\xrightarrow{(12.16)} \tilde{\mathbf{m}}_j^{\mathrm{GSGA}} \xrightarrow{(12.17)} h_j^{\mathrm{GSGA}} \;(j\text{th height}) \\
&\xrightarrow{(12.18)} \mathbf{m}_{j+1}^{\mathrm{GSGA}} = \tilde{\mathbf{m}}_{j+1}^{\mathrm{GSGA}} + \mathbf{m}_1 \;\left((j+1)\text{th vertex/endmember}\right).
\end{aligned} \tag{12.25}$$

It should be noted that GSGA can be considered a geometric version of the 2-SGA
defined in Chang (2016), where two data sample vectors instead of one are used as
the initial conditions of the SGA.
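Putting the pieces together, the whole GSGA procedure can be sketched compactly in Python (our own minimal implementation for illustration, not the book's code; data and names are hypothetical): start from the pair of samples with the maximal segment length, then repeatedly append the sample with the maximal height over the current simplex base:

```python
import numpy as np

def gsga(data, p):
    """data: N x L array; returns indices of p endmembers found by GSGA."""
    # Initial condition: the pair with the maximal pairwise distance.
    d = np.linalg.norm(data[:, None, :] - data[None, :, :], axis=2)
    i1, i2 = np.unravel_index(np.argmax(d), d.shape)
    idx = [int(i1), int(i2)]
    v = data[i2] - data[i1]                  # m~_2 = m_2 - m_1
    U = np.empty((0, data.shape[1]))         # rows are u_2, u_3, ...
    U = np.vstack([U, v / np.linalg.norm(v)])
    while len(idx) < p:
        R = data - data[idx[0]]              # shift every sample by m_1
        resid = R - (R @ U.T) @ U            # component orthogonal to the base
        h = np.linalg.norm(resid, axis=1)    # candidate heights
        k = int(np.argmax(h))                # sample with the maximal height
        idx.append(k)
        U = np.vstack([U, resid[k] / h[k]])  # new orthonormal direction u_j
    return idx

data = np.array([[0., 0.], [4., 0.], [2., 3.], [1., 1.], [2., 1.]])
print(gsga(data, 3))   # [0, 1, 2]: the vertices of the enclosing triangle
```

Because the base directions are kept orthonormal, each growth step costs only one pass over the data, which is the property the recursive versions below exploit.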

12.3 Recursive Hyperspectral Sample Processing of Geometric Simplex Growing Algorithm

Apparently, the main steps of GSOP are to find (12.16) and (12.17). In this section
we derive a recursive equation for $\overline{\mathbf{m}}_j$ in (12.14) and its normalized version $\mathbf{u}_j$ in (12.15) as
follows.

12.3.1 Orthogonal Subspace Projection-Based RHSP-GSGA

For $j \geq 2$, a new endmember $\overline{\mathbf{m}}_{j+1}$ can be found by

$$\overline{\mathbf{m}}_{j+1} = \tilde{\mathbf{m}}_{j+1} - \sum_{i=2}^{j} \mathbf{u}_i \mathbf{u}_i^{T} \tilde{\mathbf{m}}_{j+1} = \tilde{\mathbf{m}}_{j+1} - \mathbf{U}_j \mathbf{U}_j^{T} \tilde{\mathbf{m}}_{j+1} = \left(\mathbf{I} - \mathbf{U}_j \mathbf{U}_j^{T}\right) \tilde{\mathbf{m}}_{j+1}. \tag{12.26}$$

For the processed data given by $\mathbf{u}_j$ and a new input data sample vector $\mathbf{m}_{j+1}$ with
$\tilde{\mathbf{m}}_{j+1} = \mathbf{m}_{j+1} - \mathbf{m}_1$, we have $\overline{\mathbf{m}}_j = \tilde{\mathbf{m}}_j - \sum_{i=2}^{j-1} \frac{\left\langle\tilde{\mathbf{m}}_j, \overline{\mathbf{m}}_i\right\rangle}{\left\langle\overline{\mathbf{m}}_i, \overline{\mathbf{m}}_i\right\rangle} \overline{\mathbf{m}}_i$ and
$\mathbf{u}_j = \overline{\mathbf{m}}_j / \left(\overline{\mathbf{m}}_j^{T} \overline{\mathbf{m}}_j\right)^{1/2} = \overline{\mathbf{m}}_j / \left\|\overline{\mathbf{m}}_j\right\|$.
Now, let $\mathbf{U}_j = \left[\mathbf{u}_2 \; \mathbf{u}_3 \cdots \mathbf{u}_j\right]$. Then $\overline{\mathbf{m}}_{j+1}$ can be found by

$$\overline{\mathbf{m}}_{j+1} = \tilde{\mathbf{m}}_{j+1} - \sum_{i=2}^{j} \mathbf{u}_i \mathbf{u}_i^{T} \tilde{\mathbf{m}}_{j+1} = \tilde{\mathbf{m}}_{j+1} - \mathbf{U}_j \mathbf{U}_j^{T} \tilde{\mathbf{m}}_{j+1} = \left(\mathbf{I} - \mathbf{U}_j \mathbf{U}_j^{T}\right) \tilde{\mathbf{m}}_{j+1}, \tag{12.27}$$

where

$$\mathbf{U}_j \mathbf{U}_j^{T} = \sum_{i=2}^{j} \mathbf{u}_i \mathbf{u}_i^{T} = \mathbf{U}_{j-1} \mathbf{U}_{j-1}^{T} + \mathbf{u}_j \mathbf{u}_j^{T}. \tag{12.28}$$

RHSP-GSGA
1. Initial conditions:
   Find the two data sample vectors forming the maximal segment, with the two endpoints
   specified by $\mathbf{m}_1$ and $\mathbf{m}_2$, denoted by $\mathbf{m}_1^{\mathrm{GSGA}}$ and $\mathbf{m}_2^{\mathrm{GSGA}}$, respectively.
   Define $\tilde{\mathbf{m}}_2^{\mathrm{GSGA}} = \mathbf{m}_2^{\mathrm{GSGA}} - \mathbf{m}_1^{\mathrm{GSGA}}$, $\mathbf{u}_1^{\mathrm{GSGA}} = \mathbf{m}_1^{\mathrm{GSGA}} / \left\|\mathbf{m}_1^{\mathrm{GSGA}}\right\|$, and
   $\mathbf{u}_2^{\mathrm{GSGA}} = \tilde{\mathbf{m}}_2^{\mathrm{GSGA}} / \left\|\tilde{\mathbf{m}}_2^{\mathrm{GSGA}}\right\|$.
2. For each $3 \leq j \leq p$ use (12.13) to find $\tilde{\mathbf{m}}_j = \mathbf{m}_j - \mathbf{m}_1$ and (12.27) to find

$$\overline{\mathbf{m}}_j^{\mathrm{GSGA}} = \left(\mathbf{I} - \mathbf{U}_{j-2}^{\mathrm{GSGA}} \left(\mathbf{U}_{j-2}^{\mathrm{GSGA}}\right)^{T}\right) \tilde{\mathbf{m}}_j^{\mathrm{GSGA}} \tag{12.29}$$

   and

$$\mathbf{u}_j^{\mathrm{GSGA}} = \frac{\overline{\mathbf{m}}_j^{\mathrm{GSGA}}}{\left\|\overline{\mathbf{m}}_j^{\mathrm{GSGA}}\right\|}, \tag{12.30}$$

   where $\mathbf{U}_{j-2}^{\mathrm{GSGA}} = \left[\mathbf{u}_2^{\mathrm{GSGA}} \; \mathbf{u}_3^{\mathrm{GSGA}} \cdots \mathbf{u}_{j-2}^{\mathrm{GSGA}}\right]$.
3. Use (12.16) to find $\tilde{\mathbf{m}}_j^{\mathrm{GSGA}} = \arg\max_{\tilde{\mathbf{r}}}\left\{\tilde{\mathbf{r}}^{T} \mathbf{u}_j\right\}$ or (12.21) to find
   $\overline{\mathbf{m}}_j^{\mathrm{GSGA}} = \arg\max_{\overline{\mathbf{r}}_j}\left\{\left\|\overline{\mathbf{r}}_j\right\|\right\}$.
4. Calculate $h_j^{\mathrm{GSGA}} = \left\|\overline{\mathbf{m}}_j^{\mathrm{GSGA}}\right\|$.
5. Use (12.12) to find the SV

$$\mathrm{GSV}\left(S\left(\mathbf{m}_2^{\mathrm{GSGA}}, \ldots, \mathbf{m}_{j-1}^{\mathrm{GSGA}}, \mathbf{m}_j^{\mathrm{GSGA}}\right)\right) = h_j^{\mathrm{GSGA}} \cdot \mathrm{GSV}\left(S\left(\mathbf{m}_2^{\mathrm{GSGA}}, \ldots, \mathbf{m}_{j-1}^{\mathrm{GSGA}}\right)\right). \tag{12.31}$$

6. Use the recursive equations

$$\nabla\tilde{\mathbf{m}}_j^{\mathrm{GSGA}} = \tilde{\mathbf{m}}_j^{\mathrm{GSGA}} - \tilde{\mathbf{m}}_{j-1}^{\mathrm{GSGA}} \tag{12.32}$$

   and

$$\mathbf{U}_{j-1}^{\mathrm{GSGA}} \left(\mathbf{U}_{j-1}^{\mathrm{GSGA}}\right)^{T} = \mathbf{U}_{j-2}^{\mathrm{GSGA}} \left(\mathbf{U}_{j-2}^{\mathrm{GSGA}}\right)^{T} + \mathbf{u}_{j-1}^{\mathrm{GSGA}} \left(\mathbf{u}_{j-1}^{\mathrm{GSGA}}\right)^{T}. \tag{12.33}$$

7. Let $j \leftarrow j + 1$ and go back to step 2.



It should be noted that the major difference between RHSP-GSGA and GSGA is
the use of the recursive equations (12.32) and (12.33) in step 6.
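The rank-one update behind this recursion can be checked numerically. The sketch below (toy sizes; variable names are ours) verifies that the projector $\mathbf{I} - \mathbf{U}_{j-1}\mathbf{U}_{j-1}^{T}$ updated by a single rank-one term $\mathbf{u}_j\mathbf{u}_j^{T}$, as in (12.28) and (12.33), matches the directly computed $\mathbf{I} - \mathbf{U}_j\mathbf{U}_j^{T}$, which is what spares RHSP-GSGA from recomputing the full matrix product at every step:

```python
import numpy as np

rng = np.random.default_rng(1)
L = 6
# Build a few orthonormal vectors u_2, u_3, u_4 via a QR factorization.
U = np.linalg.qr(rng.standard_normal((L, 3)))[0]     # columns are u_2, u_3, u_4
P_prev = np.eye(L) - U[:, :2] @ U[:, :2].T           # I - U_{j-1} U_{j-1}^T
u_j = U[:, 2]
P_rec = P_prev - np.outer(u_j, u_j)                  # rank-one update, (12.28)
P_dir = np.eye(L) - U @ U.T                          # direct computation
print(np.allclose(P_rec, P_dir))                     # True
```

The update costs O(L^2) regardless of how many directions have already been accumulated, and applying the updated projector to all N samples costs one pass over the data.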
Alternatively, from (12.26) we can further derive Kalman filter-like equations for
GSGA using OSP, in a manner similar to that described in Chap. 3.
For $j \geq 2$ a new endmember $\overline{\mathbf{m}}_{j+1}$ can be found by

$$\begin{aligned}
\overline{\mathbf{m}}_{j+1} &= \tilde{\mathbf{m}}_{j+1} - \sum_{i=2}^{j} \mathbf{u}_i \mathbf{u}_i^{T} \tilde{\mathbf{m}}_{j+1} = \tilde{\mathbf{m}}_{j+1} - \mathbf{U}_j \mathbf{U}_j^{T} \tilde{\mathbf{m}}_{j+1} \\
&= \tilde{\mathbf{m}}_{j+1} - \tilde{\mathbf{m}}_j + \tilde{\mathbf{m}}_j - \mathbf{U}_{j-1} \mathbf{U}_{j-1}^{T} \tilde{\mathbf{m}}_{j+1} - \mathbf{u}_j \mathbf{u}_j^{T} \tilde{\mathbf{m}}_{j+1} \\
&= \tilde{\mathbf{m}}_{j+1} - \tilde{\mathbf{m}}_j + \tilde{\mathbf{m}}_j - \mathbf{U}_{j-1} \mathbf{U}_{j-1}^{T}\left(\tilde{\mathbf{m}}_{j+1} - \tilde{\mathbf{m}}_j + \tilde{\mathbf{m}}_j\right) - \mathbf{u}_j \mathbf{u}_j^{T} \tilde{\mathbf{m}}_{j+1} \\
&= \nabla\tilde{\mathbf{m}}_{j+1} + \overline{\mathbf{m}}_j - \mathbf{U}_{j-1} \mathbf{U}_{j-1}^{T} \nabla\tilde{\mathbf{m}}_{j+1} - \mathbf{u}_j \mathbf{u}_j^{T} \tilde{\mathbf{m}}_{j+1} \\
&= \overline{\mathbf{m}}_j + \left(\mathbf{I} - \mathbf{U}_{j-1} \mathbf{U}_{j-1}^{T}\right) \nabla\tilde{\mathbf{m}}_{j+1} - \mathbf{u}_j \mathbf{u}_j^{T} \tilde{\mathbf{m}}_{j+1},
\end{aligned} \tag{12.34}$$

where

$$\nabla\tilde{\mathbf{m}}_{j+1} = \tilde{\mathbf{m}}_{j+1} - \tilde{\mathbf{m}}_j = \mathbf{m}_{j+1} - \mathbf{m}_j \tag{12.35}$$

is the innovation, that is, the information contained in $\tilde{\mathbf{m}}_{j+1}$ but not in $\tilde{\mathbf{m}}_j$, and with

$$\mathbf{U}_j \mathbf{U}_j^{T} = \sum_{i=2}^{j} \mathbf{u}_i \mathbf{u}_i^{T} = \mathbf{U}_{j-1} \mathbf{U}_{j-1}^{T} + \mathbf{u}_j \mathbf{u}_j^{T} \tag{12.36}$$

simply updated by $\mathbf{U}_{j-1}$ and $\mathbf{u}_j \mathbf{u}_j^{T}$.

KF OSP-GSGA
1. Initial conditions:
   Find the two data sample vectors forming the maximal segment, with the two endpoints
   specified by $\mathbf{m}_1$ and $\mathbf{m}_2$, denoted by $\mathbf{m}_1^{\mathrm{GSGA}}$ and $\mathbf{m}_2^{\mathrm{GSGA}}$, respectively. Define
   $\tilde{\mathbf{m}}_2^{\mathrm{GSGA}} = \mathbf{m}_2^{\mathrm{GSGA}} - \mathbf{m}_1^{\mathrm{GSGA}}$, $\mathbf{u}_1^{\mathrm{GSGA}} = \mathbf{m}_1^{\mathrm{GSGA}} / \left\|\mathbf{m}_1^{\mathrm{GSGA}}\right\|$, and
   $\mathbf{u}_2^{\mathrm{GSGA}} = \tilde{\mathbf{m}}_2^{\mathrm{GSGA}} / \left\|\tilde{\mathbf{m}}_2^{\mathrm{GSGA}}\right\|$.
2. For each $3 \leq j \leq p$ use (12.13) to find $\tilde{\mathbf{m}}_j^{\mathrm{GSGA}} = \mathbf{m}_j^{\mathrm{GSGA}} - \mathbf{m}_1^{\mathrm{GSGA}}$ and (12.34) to find

$$\overline{\mathbf{m}}_j^{\mathrm{GSGA}} = \overline{\mathbf{m}}_{j-1}^{\mathrm{GSGA}} + \left(\mathbf{I} - \mathbf{U}_{j-2}^{\mathrm{GSGA}} \left(\mathbf{U}_{j-2}^{\mathrm{GSGA}}\right)^{T}\right) \nabla\tilde{\mathbf{m}}_j^{\mathrm{GSGA}} - \mathbf{u}_{j-1}^{\mathrm{GSGA}} \left(\mathbf{u}_{j-1}^{\mathrm{GSGA}}\right)^{T} \tilde{\mathbf{m}}_j^{\mathrm{GSGA}} \tag{12.37}$$

   and

$$\mathbf{u}_j^{\mathrm{GSGA}} = \frac{\overline{\mathbf{m}}_j^{\mathrm{GSGA}}}{\left\|\overline{\mathbf{m}}_j^{\mathrm{GSGA}}\right\|}, \tag{12.38}$$

   where $\mathbf{U}_{j-2}^{\mathrm{GSGA}} = \left[\mathbf{u}_2^{\mathrm{GSGA}} \; \mathbf{u}_3^{\mathrm{GSGA}} \cdots \mathbf{u}_{j-2}^{\mathrm{GSGA}}\right]$.
3. Find $h_j^{\mathrm{GSGA}} = \left\|\overline{\mathbf{m}}_j^{\mathrm{GSGA}}\right\|$ via (12.17).
4. Use (12.12) to find the SV

$$\mathrm{GSV}\left(S\left(\mathbf{m}_2^{\mathrm{GSGA}}, \ldots, \mathbf{m}_{j-1}^{\mathrm{GSGA}}, \mathbf{m}_j^{\mathrm{GSGA}}\right)\right) = h_j^{\mathrm{GSGA}} \cdot \mathrm{GSV}\left(S\left(\mathbf{m}_2^{\mathrm{GSGA}}, \ldots, \mathbf{m}_{j-1}^{\mathrm{GSGA}}\right)\right). \tag{12.39}$$

5. Use the recursive equations

$$\nabla\tilde{\mathbf{m}}_j^{\mathrm{GSGA}} = \tilde{\mathbf{m}}_j^{\mathrm{GSGA}} - \tilde{\mathbf{m}}_{j-1}^{\mathrm{GSGA}} \tag{12.40}$$

   and

$$\mathbf{U}_{j-1}^{\mathrm{GSGA}} \left(\mathbf{U}_{j-1}^{\mathrm{GSGA}}\right)^{T} = \mathbf{U}_{j-2}^{\mathrm{GSGA}} \left(\mathbf{U}_{j-2}^{\mathrm{GSGA}}\right)^{T} + \mathbf{u}_{j-1}^{\mathrm{GSGA}} \left(\mathbf{u}_{j-1}^{\mathrm{GSGA}}\right)^{T}. \tag{12.41}$$

6. Let $j \leftarrow j + 1$ and go back to step 2.


Thus, according to (12.34)–(12.36), three pieces of information are used to
perform (12.29) recursively.

1. Processed information:
   • $\mathbf{U}_{j-1} = \left[\mathbf{u}_2 \; \mathbf{u}_3 \cdots \mathbf{u}_{j-1}\right]$ and $\mathbf{U}_{j-1} \mathbf{U}_{j-1}^{T} = \sum_{i=2}^{j-1} \mathbf{u}_i \mathbf{u}_i^{T}$;
   • $\overline{\mathbf{m}}_j = \tilde{\mathbf{m}}_j - \sum_{i=2}^{j-1} \mathbf{u}_i \mathbf{u}_i^{T} \tilde{\mathbf{m}}_j = \tilde{\mathbf{m}}_j - \mathbf{U}_{j-1} \mathbf{U}_{j-1}^{T} \tilde{\mathbf{m}}_j = \left(\mathbf{I} - \mathbf{U}_{j-1} \mathbf{U}_{j-1}^{T}\right) \tilde{\mathbf{m}}_j$;
   • $\mathbf{u}_j = \overline{\mathbf{m}}_j / \left(\overline{\mathbf{m}}_j^{T} \overline{\mathbf{m}}_j\right)^{1/2} = \overline{\mathbf{m}}_j / \left\|\overline{\mathbf{m}}_j\right\|$;
   • $\mathbf{I} - \mathbf{U}_{j-1} \mathbf{U}_{j-1}^{T}$.
2. New information: $\tilde{\mathbf{m}}_{j+1} = \mathbf{m}_{j+1} - \mathbf{m}_1$.
3. Innovation information: $\nabla\tilde{\mathbf{m}}_{j+1} = \tilde{\mathbf{m}}_{j+1} - \tilde{\mathbf{m}}_j$.
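The Kalman-filter-like recursion (12.34) can likewise be verified numerically. The following sketch (toy data; variable names are ours) confirms that the recursive form built from the processed information, the new information, and the innovation agrees with the direct orthogonalization of (12.26):

```python
import numpy as np

rng = np.random.default_rng(2)
L = 5
U_prev = np.linalg.qr(rng.standard_normal((L, 2)))[0]   # U_{j-1} = [u_2 u_3]
m_t_j = rng.standard_normal(L)                          # m~_j (processed)
m_bar_j = m_t_j - U_prev @ (U_prev.T @ m_t_j)           # (I - U_{j-1}U_{j-1}^T) m~_j
u_j = m_bar_j / np.linalg.norm(m_bar_j)
m_t_next = rng.standard_normal(L)                       # m~_{j+1} (new information)
inno = m_t_next - m_t_j                                 # innovation, (12.35)
# Recursive form: last line of (12.34).
m_bar_next_rec = m_bar_j + inno - U_prev @ (U_prev.T @ inno) - u_j * (u_j @ m_t_next)
# Direct form: (12.26) with U_j = [U_{j-1} u_j].
U_j = np.column_stack([U_prev, u_j])
m_bar_next_dir = m_t_next - U_j @ (U_j.T @ m_t_next)
print(np.allclose(m_bar_next_rec, m_bar_next_dir))      # True
```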

12.3.2 Orthogonal Vector Projection-Based RHSP-GSGA

As an alternative approach, (12.20) can be rederived as

$$\begin{aligned}
\overline{\mathbf{r}}_{j+1} &= \tilde{\mathbf{r}} - \sum_{i=2}^{j} \mathbf{u}_i \mathbf{u}_i^{T} \tilde{\mathbf{r}} = \tilde{\mathbf{r}} - \mathbf{U}_j \mathbf{U}_j^{T} \tilde{\mathbf{r}} \\
&= \tilde{\mathbf{r}} - \tilde{\mathbf{m}}_j + \tilde{\mathbf{m}}_j - \mathbf{U}_{j-1} \mathbf{U}_{j-1}^{T} \tilde{\mathbf{r}} - \mathbf{u}_j \mathbf{u}_j^{T} \tilde{\mathbf{r}} \\
&= \nabla\tilde{\mathbf{r}}_j + \overline{\mathbf{m}}_j - \mathbf{U}_{j-1} \mathbf{U}_{j-1}^{T} \nabla\tilde{\mathbf{r}}_j - \mathbf{u}_j \mathbf{u}_j^{T} \tilde{\mathbf{r}} \\
&= \overline{\mathbf{m}}_j + \left(\mathbf{I} - \mathbf{U}_{j-1} \mathbf{U}_{j-1}^{T}\right) \nabla\tilde{\mathbf{r}}_j - \mathbf{u}_j \mathbf{u}_j^{T} \tilde{\mathbf{r}},
\end{aligned} \tag{12.42}$$

where $\tilde{\mathbf{r}} = \mathbf{r} - \mathbf{m}_1$ and $\nabla\tilde{\mathbf{r}}_j = \tilde{\mathbf{r}} - \tilde{\mathbf{m}}_j$. By virtue of (12.42), equations (12.16)–(12.18) become

$$\overline{\mathbf{m}}_j^{\mathrm{GSGA}} = \arg\max_{\overline{\mathbf{r}}_j}\left\{\left\|\overline{\mathbf{r}}_j\right\|\right\}, \tag{12.43}$$

$$\mathbf{m}_j^{\mathrm{GSGA}} = \overline{\mathbf{m}}_j^{\mathrm{GSGA}} + \mathbf{m}_1, \tag{12.44}$$

$$h_j^{\mathrm{GSGA}} = \left\|\overline{\mathbf{m}}_j^{\mathrm{GSGA}}\right\|. \tag{12.45}$$

Using (12.43)–(12.45), RHSP-GSGA can be rederived as a Kalman filter-based
orthogonal vector projection GSGA (KF OVP-GSGA) as follows.

KF OVP-GSGA
1. Initial conditions:
   Find the two data sample vectors forming the maximal segment, with the two endpoints
   specified by $\mathbf{m}_1$ and $\mathbf{m}_2$, denoted by $\mathbf{m}_1^{\mathrm{GSGA}}$ and $\mathbf{m}_2^{\mathrm{GSGA}}$, respectively.
   Define $\tilde{\mathbf{m}}_2^{\mathrm{GSGA}} = \mathbf{m}_2^{\mathrm{GSGA}} - \mathbf{m}_1^{\mathrm{GSGA}}$, $\mathbf{u}_1^{\mathrm{GSGA}} = \mathbf{m}_1^{\mathrm{GSGA}} / \left\|\mathbf{m}_1^{\mathrm{GSGA}}\right\|$, and
   $\mathbf{u}_2^{\mathrm{GSGA}} = \tilde{\mathbf{m}}_2^{\mathrm{GSGA}} / \left\|\tilde{\mathbf{m}}_2^{\mathrm{GSGA}}\right\|$.
2. For each $3 \leq j \leq p$ find $\tilde{\mathbf{r}} = \mathbf{r} - \mathbf{m}_1$ and use (12.42) and (12.43) to find

$$\overline{\mathbf{m}}_j^{\mathrm{GSGA}} = \overline{\mathbf{m}}_{j-1}^{\mathrm{GSGA}} + \left(\mathbf{I} - \mathbf{U}_{j-2}^{\mathrm{GSGA}} \left(\mathbf{U}_{j-2}^{\mathrm{GSGA}}\right)^{T}\right) \nabla\tilde{\mathbf{m}}_j^{\mathrm{GSGA}} - \mathbf{u}_{j-1}^{\mathrm{GSGA}} \left(\mathbf{u}_{j-1}^{\mathrm{GSGA}}\right)^{T} \tilde{\mathbf{m}}_j^{\mathrm{GSGA}} \tag{12.46}$$

   and

$$\mathbf{u}_j^{\mathrm{GSGA}} = \frac{\overline{\mathbf{m}}_j^{\mathrm{GSGA}}}{\left\|\overline{\mathbf{m}}_j^{\mathrm{GSGA}}\right\|}, \tag{12.47}$$

   where $\mathbf{U}_{j-2}^{\mathrm{GSGA}} = \left[\mathbf{u}_2^{\mathrm{GSGA}} \; \mathbf{u}_3^{\mathrm{GSGA}} \cdots \mathbf{u}_{j-2}^{\mathrm{GSGA}}\right]$.

3. Find $h_j^{\mathrm{GSGA}} = \left\|\overline{\mathbf{m}}_j^{\mathrm{GSGA}}\right\|$ via (12.45).
4. Use (12.12) to find the SV

$$\mathrm{GSV}\left(S\left(\mathbf{m}_2^{\mathrm{GSGA}}, \ldots, \mathbf{m}_{j-1}^{\mathrm{GSGA}}, \mathbf{m}_j^{\mathrm{GSGA}}\right)\right) = h_j^{\mathrm{GSGA}} \cdot \mathrm{GSV}\left(S\left(\mathbf{m}_2^{\mathrm{GSGA}}, \ldots, \mathbf{m}_{j-1}^{\mathrm{GSGA}}\right)\right). \tag{12.48}$$

5. Use the recursive equations

$$\nabla\tilde{\mathbf{m}}_j^{\mathrm{GSGA}} = \tilde{\mathbf{m}}_j^{\mathrm{GSGA}} - \tilde{\mathbf{m}}_{j-1}^{\mathrm{GSGA}} \tag{12.49}$$

   and

$$\mathbf{U}_{j-1}^{\mathrm{GSGA}} \left(\mathbf{U}_{j-1}^{\mathrm{GSGA}}\right)^{T} = \mathbf{U}_{j-2}^{\mathrm{GSGA}} \left(\mathbf{U}_{j-2}^{\mathrm{GSGA}}\right)^{T} + \mathbf{u}_{j-1}^{\mathrm{GSGA}} \left(\mathbf{u}_{j-1}^{\mathrm{GSGA}}\right)^{T}. \tag{12.50}$$

6. Let $j \leftarrow j + 1$ and go back to step 2.


Thus, to perform KF OVP-GSGA recursively in a Kalman filter manner, three
pieces of information are required, as follows.

1. Processed information:
   • $\overline{\mathbf{m}}_{j-1}^{\mathrm{GSGA}}$;
   • $\mathbf{u}_{j-1}^{\mathrm{GSGA}} = \overline{\mathbf{m}}_{j-1}^{\mathrm{GSGA}} / \left\|\overline{\mathbf{m}}_{j-1}^{\mathrm{GSGA}}\right\|$;
   • $\mathbf{U}_{j-2}^{\mathrm{GSGA}} = \left[\mathbf{u}_2^{\mathrm{GSGA}} \; \mathbf{u}_3^{\mathrm{GSGA}} \cdots \mathbf{u}_{j-2}^{\mathrm{GSGA}}\right]$;
   • $\mathbf{I} - \mathbf{U}_{j-2}^{\mathrm{GSGA}} \left(\mathbf{U}_{j-2}^{\mathrm{GSGA}}\right)^{T}$.
2. New information:
   • $\tilde{\mathbf{m}}_j$;
   • $\tilde{\mathbf{r}}_j$.
3. Innovation information:
   • $\nabla\tilde{\mathbf{m}}_j$;
   • $\nabla\tilde{\mathbf{r}}_j = \tilde{\mathbf{r}} - \tilde{\mathbf{m}}_j$.
Table 12.1 provides a comparison of how information is used to update GSGA by
KF OSP-GSGA and KF OVP-GSGA.

Table 12.1 Comparison between KF OSP-GSGA and KF OVP-GSGA

Processed information:
  KF OSP-GSGA: $\mathbf{U}_{j-1} = \left[\mathbf{u}_2 \; \mathbf{u}_3 \cdots \mathbf{u}_{j-1}\right]$; $\mathbf{U}_{j-1}\mathbf{U}_{j-1}^{T} = \sum_{i=2}^{j-1} \mathbf{u}_i\mathbf{u}_i^{T}$; $\overline{\mathbf{m}}_j = \left(\mathbf{I} - \mathbf{U}_{j-1}\mathbf{U}_{j-1}^{T}\right)\tilde{\mathbf{m}}_j$; $\mathbf{u}_j = \overline{\mathbf{m}}_j / \left\|\overline{\mathbf{m}}_j\right\|$; $\mathbf{I} - \mathbf{U}_{j-1}\mathbf{U}_{j-1}^{T}$
  KF OVP-GSGA: $\overline{\mathbf{m}}_{j-1}^{\mathrm{GSGA}}$; $\mathbf{u}_{j-1}^{\mathrm{GSGA}} = \overline{\mathbf{m}}_{j-1}^{\mathrm{GSGA}} / \left\|\overline{\mathbf{m}}_{j-1}^{\mathrm{GSGA}}\right\|$; $\mathbf{U}_{j-2}^{\mathrm{GSGA}} = \left[\mathbf{u}_2^{\mathrm{GSGA}} \cdots \mathbf{u}_{j-2}^{\mathrm{GSGA}}\right]$; $\mathbf{I} - \mathbf{U}_{j-2}^{\mathrm{GSGA}}\left(\mathbf{U}_{j-2}^{\mathrm{GSGA}}\right)^{T}$

New information:
  KF OSP-GSGA: $\tilde{\mathbf{m}}_{j+1} = \mathbf{m}_{j+1} - \mathbf{m}_1$
  KF OVP-GSGA: $\tilde{\mathbf{m}}_j$, $\tilde{\mathbf{r}}_j$

Innovation information:
  KF OSP-GSGA: $\nabla\tilde{\mathbf{m}}_{j+1} = \tilde{\mathbf{m}}_{j+1} - \tilde{\mathbf{m}}_j$
  KF OVP-GSGA: $\nabla\tilde{\mathbf{m}}_j$, $\nabla\tilde{\mathbf{r}}_j = \tilde{\mathbf{r}} - \tilde{\mathbf{m}}_j$
12.4 Relationship Between RHSP-GSGA and RHSP-OPSGA

Recalling (12.14), we can rewrite $\overline{\mathbf{m}}_{j+1}$ in terms of $\tilde{\mathbf{m}}_{j+1}$ and $\mathbf{U}_j$ as

$$\overline{\mathbf{m}}_{j+1} = \tilde{\mathbf{m}}_{j+1} - \mathbf{U}_j \mathbf{U}_j^{T} \tilde{\mathbf{m}}_{j+1} = \left(\mathbf{I} - \mathbf{U}_j \mathbf{U}_j^{T}\right) \tilde{\mathbf{m}}_{j+1}, \tag{12.51}$$

where $\mathbf{U}_j = \left[\mathbf{u}_2 \; \mathbf{u}_3 \cdots \mathbf{u}_j\right]$, and $\mathbf{U}_j \mathbf{U}_j^{T} = \mathbf{U}_{j-1}\mathbf{U}_{j-1}^{T} + \mathbf{u}_j \mathbf{u}_j^{T}$ is given by (12.28).
Since the $\left\{\mathbf{u}_i\right\}_{i=1}^{j}$ are orthonormal vectors, that is, $\mathbf{u}_i^{T}\mathbf{u}_j = 1$ for $i = j$ and $0$ for $i \neq j$,
so that $\mathbf{U}_j^{T}\mathbf{U}_j = \mathbf{I}$, it follows that
$P_{\mathbf{U}_j}^{\perp} = \mathbf{I} - \mathbf{U}_j \mathbf{U}_j^{\#} = \mathbf{I} - \mathbf{U}_j \left(\mathbf{U}_j^{T}\mathbf{U}_j\right)^{-1}\mathbf{U}_j^{T} = \mathbf{I} - \mathbf{U}_j \mathbf{U}_j^{T}$, which is exactly
the same as (12.26). Thus, (12.51) can be further expressed as

$$\overline{\mathbf{m}}_{j+1} = P_{\mathbf{U}_j}^{\perp} \tilde{\mathbf{m}}_{j+1} \quad \text{and} \quad \mathbf{u}_{j+1} = \frac{\overline{\mathbf{m}}_{j+1}}{\left\|\overline{\mathbf{m}}_{j+1}\right\|}. \tag{12.52}$$

Thus, according to (12.52), finding the maximal $\overline{\mathbf{m}}_{j+1}$ over $\tilde{\mathbf{m}}_{j+1}$ is equivalent to the
ATGP, which finds

$$\tilde{\mathbf{m}}_{j+1}^{\mathrm{ATGP}} = \arg\max_{\tilde{\mathbf{m}}_{j+1}}\left\{\left\|P_{\mathbf{U}_j}^{\perp} \tilde{\mathbf{m}}_{j+1}\right\|\right\}. \tag{12.53}$$

However, (12.53) is the same as finding the $(j+1)$th endmember $\tilde{\mathbf{m}}_{j+1}^{\mathrm{GSGA}}$ obtained
by GSGA that is given by (12.16), that is,

$$\tilde{\mathbf{m}}_{j+1}^{\mathrm{GSGA}} = \arg\max_{\tilde{\mathbf{r}}}\left\{\tilde{\mathbf{r}}^{T} \mathbf{u}_{j+1}\right\} = \arg\max_{\mathbf{r}}\left\{\mathbf{r}^{T} \mathbf{u}_{j+1}\right\}. \tag{12.54}$$

Since (12.53) and (12.54) are the same, $\tilde{\mathbf{m}}_{j+1}^{\mathrm{GSGA}} = \tilde{\mathbf{m}}_{j+1}^{\mathrm{ATGP}}$. Also, from (12.16) and
(12.53), $h_{j+1}^{\mathrm{GSGA}} = \left\|P_{\mathbf{U}_j}^{\perp}\tilde{\mathbf{m}}_{j+1}^{\mathrm{ATGP}}\right\|$. This further shows that GSGA and OPSGA do
indeed produce identical results even though they are derived from two completely
different design rationales. There are two differences between GSGA and OPSGA.
One is that the $\left\{\tilde{\mathbf{m}}_i\right\}_{i=2}^{j}$ in $\tilde{\mathbf{U}}_j$ are not necessarily orthogonal or orthonormal, compared to
the $\left\{\mathbf{u}_i\right\}_{i=2}^{j}$ in $\mathbf{U}_j$, which must be orthonormal. The other is that GSGA generates
projection vectors that are normal vectors identifying the direction of the heights of
simplexes, while OPSGA produces OP subspaces that contain all data sample
vectors orthogonal to simplex bases.
As a final comment, despite the fact that the maximal OP found by OPSGA can
be interpreted by the maximal simplex heights found by GSGA, they are indeed
quite different techniques. First, OPSGA is an EFA that finds endmembers and uses
them to calculate GSV by subtracting the first endmember from all remaining
endmembers found by OPSGA (ATGP). On the other hand, GSGA is a
GSV-based approach that does not find endmembers but rather finds maximal
simplex heights orthogonal to previously found simplex bases. It then considers
the data sample vectors with the found maximal GSV as endmembers. In other
words, GSGA can be considered a simplex-edge-based approach that calculates
GSV by finding maximal simplex heights. Interestingly, such maximal heights turn
out to be maximal OPs found by ATGP. Thus, basically, these two approaches can
be considered reversed versions of each other. This difference is very crucial and
critical. Because GSGA is a simplex-edge approach, its simplest simplex is a line
segment. Thus, the initial condition for GSGA begins with a degenerate line
segment with the maximal length, which is considered an edge simplex, compared
to OPSGA, which can begin with any single endmember. Second, GSGA is a
vector-based approach that makes use of GSOP to find a sequence of maximal
heights as base vectors, compared to OPSGA, which is a subspace-based approach
that finds a sequence of orthogonal subspaces. This is a significant difference
between GSGA and OPSGA. Third, GSGA must orthonormalize all found heights,
but OPSGA does not. That is why GSGA has less computational complexity and
performs slightly faster than OPSGA. Most importantly, according to
(12.51)–(12.54), GSGA can be shown to be exactly identical to OPSGA, even
though they are completely different approaches.
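The equivalence can also be illustrated numerically. In the sketch below (random toy data; variable names are ours), the sample with the maximal orthogonal projection $\left\|P_{\mathbf{U}_j}^{\perp}\tilde{\mathbf{r}}\right\|$, which is the ATGP criterion of (12.53), is also the sample with the maximal projection $\tilde{\mathbf{r}}^{T}\mathbf{u}_{j+1}$ onto the direction that winner defines, which is the GSGA height criterion of (12.54):

```python
import numpy as np

rng = np.random.default_rng(3)
data = rng.standard_normal((50, 4))                # hypothetical N x L samples
m1 = data[0]
U = np.linalg.qr(rng.standard_normal((4, 2)))[0]   # stand-in for [u_2 u_3]
R = data - m1                                      # r~ = r - m1 for all samples
resid = R - (R @ U) @ U.T                          # P^perp_{U_j} r~ for all samples
op = np.linalg.norm(resid, axis=1)                 # ATGP criterion, (12.53)
k_atgp = int(np.argmax(op))
u_next = resid[k_atgp] / op[k_atgp]                # u_{j+1} defined by the winner
k_gsga = int(np.argmax(R @ u_next))                # GSGA criterion, (12.54)
print(k_atgp == k_gsga)                            # True
```

The agreement is not a coincidence: $\tilde{\mathbf{r}}^{T}\mathbf{u}_{j+1} \le \left\|P_{\mathbf{U}_j}^{\perp}\tilde{\mathbf{r}}\right\|$ for every sample, with equality exactly at the ATGP winner, so both criteria select the same pixel (up to ties, which have probability zero for continuous data).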

12.5 Determining the Number of Endmembers for RHSP-GSGA

RHSP-GSGA makes it possible to generate new endmembers indefinitely if no
stopping rule is imposed. Taking advantage of the target-specified virtual dimensionality
(TSVD) in Sect. 4.5 and following the same treatment as in Sect. 11.7, we can
also derive a stopping rule for RHSP-GSGA using (12.52) as follows:

$$\overline{\mathbf{m}}_{j+1} = P_{\mathbf{U}_j}^{\perp} \tilde{\mathbf{m}}_{j+1} \; \Rightarrow \; \overline{\mathbf{m}}_{j+1}^{T} \overline{\mathbf{m}}_{j+1} = \left(P_{\mathbf{U}_j}^{\perp} \tilde{\mathbf{m}}_{j+1}\right)^{T} \left(P_{\mathbf{U}_j}^{\perp} \tilde{\mathbf{m}}_{j+1}\right) = \tilde{\mathbf{m}}_{j+1}^{T} P_{\mathbf{U}_j}^{\perp} \tilde{\mathbf{m}}_{j+1}. \tag{12.55}$$

Now we can define the following quantity similar to (11.29):

$$\eta_{j+1} = \tilde{\mathbf{m}}_{j+1}^{T} P_{\mathbf{U}_j}^{\perp} \tilde{\mathbf{m}}_{j+1}, \tag{12.56}$$

which can be used to estimate the value of VD, $n_{\mathrm{RHSP\text{-}GSGA}}$, to determine how many
endmembers should be generated by RHSP-GSGA, by letting $\tilde{\mathbf{m}}_{j+1} = \tilde{\mathbf{m}}_{j+1}^{\mathrm{RHSP\text{-}GSGA}}$
obtained by (12.54) or $h_j^{\mathrm{GSGA}} = \left\|\overline{\mathbf{m}}_j^{\mathrm{GSGA}}\right\|$ derived from (12.29). In other words, the
$\eta_{j+1}$ obtained by (12.56) can be used as the $(j+1)$th signal source under the binary
hypothesis testing problem used to determine the VD. Similarly, (12.56) can also be
modified to the residual strength of the signal source, denoted by $\rho_{j+1}$, similar to
(11.42) in Chap. 11, and given by

$$\rho_{j+1} = \sqrt{\eta_{j+1}} = \left(\tilde{\mathbf{m}}_{j+1}^{T} P_{\mathbf{U}_j}^{\perp} \tilde{\mathbf{m}}_{j+1}\right)^{1/2}, \tag{12.57}$$

with $\tilde{\mathbf{m}}_{j+1} = \tilde{\mathbf{m}}_{j+1}^{\mathrm{RHSP\text{-}GSGA}}$ in (12.54).
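The stopping-rule quantities can be sketched as follows (our toy illustration; the detector and thresholds are omitted): $\eta_{j+1}$ in (12.56) is the maximal residual energy left after $j$ directions have been absorbed, and $\rho_{j+1} = \sqrt{\eta_{j+1}}$ in (12.57) is its strength. Both shrink as the simplex grows, which is what a TSVD-style binary hypothesis test thresholds to stop the growth:

```python
import numpy as np

rng = np.random.default_rng(4)
data = rng.standard_normal((200, 8))        # hypothetical N x L data
R = data - data[0]                          # r~ = r - m1
U = np.zeros((8, 0))                        # gains one orthonormal column per step
etas = []
for _ in range(5):
    resid = R - (R @ U) @ U.T               # P^perp_{U_j} applied to every sample
    energy = np.sum(resid * resid, axis=1)  # m~^T P^perp m~ per sample
    k = int(np.argmax(energy))
    etas.append(float(energy[k]))           # eta_{j+1}, (12.56)
    U = np.column_stack([U, resid[k] / np.sqrt(energy[k])])
rhos = [np.sqrt(e) for e in etas]           # rho_{j+1}, (12.57)
print([round(e, 2) for e in etas])          # a nonincreasing sequence of energies
```

Because each new projector removes one more direction, the residual energies are nonincreasing, so comparing $\eta_{j+1}$ (or $\rho_{j+1}$) against a detector threshold gives a well-defined stopping index.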

12.6 Computational Complexity

Let N denote the total number of pixels in a hyperspectral image, L the number of
bands, j the number of endmembers found so far during the process, and p the total
number of endmembers that need to be generated. Since the initial steps of these
methods are the same—either finding a data sample with the maximum vector length
and then another data sample with the maximum distance from the first, or
finding the two samples yielding the maximum distance—the analysis ignores the
complexity of this initial step to simplify the discussion.

12.6.1 Computational Complexity of Determinant-Based SGA (DSGA)

Since the volume calculation in SGA is based on the Cayley–Menger determinant,
DR is required to reduce the dimensionality of the endmember matrix formed by
previously found endmembers. As a result, the complexity of DR should be
included. If principal components analysis (PCA) is used, the complexity of PCA
is O(NL²). The complexity of calculating the DSV can be further simplified as the
complexity of finding the determinant of

$$\begin{bmatrix} 1 & 1 \\ \mathbf{E} & \mathbf{r} \end{bmatrix} = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ \mathbf{e}_1 & \mathbf{e}_2 & \cdots & \mathbf{r} \end{bmatrix}.$$

Hence, the complexity of finding the DSV for each data sample vector is O(j³). With a given
total number of data sample vectors, N, the total complexity is O(j³N). Summing
the complexity of DR and the complexity of finding the determinant for N data
sample vectors yields a complexity of $O\left(\sum_{j=1}^{p}\left(NL^2 + j^3 N\right)\right)$ for DSGA.

12.6.2 Computational Complexity of Dist-SGA

To perform Dist-SGA, which replaces the volume calculation with a distance measurement
proposed by Wang et al. (2013a, b, c), a measure function, $f_i$, is constructed to
solve the following equations:

$$\begin{cases} f\left(\mathbf{e}_i\right) = \mathbf{k}_i^{T} \mathbf{e}_i + b_i = 1, \\ f\left(\mathbf{e}_{j \neq i}\right) = \mathbf{k}_i^{T} \mathbf{e}_{j \neq i} + b_i = 0, \end{cases} \tag{12.58}$$

where $\mathbf{k}_i$ is the slope of the affine space spanned by the other endmembers. To derive $\mathbf{k}_i$
and $b_i$, an inverse calculation is needed with a complexity of $O\left(j^3 + j^2 L\right) \approx O\left(j^2 L\right)$.
Then the projection of all data sample vectors onto the affine space requires a
complexity of O(jNL). This is followed by finding the distance between the data
sample vectors and their projections onto the affine space. Consequently, the
overall per-iteration complexity is $O\left(j^2 L + 2jNL\right) \approx O\left(j^2 L + jNL\right)$. Finally, the complexity of
Dist-SGA is $O\left(\sum_{j=2}^{p}\left(j^2 L + jNL\right)\right) = O\left(\sum_{j=2}^{p} j^2 L + p^2 NL\right)$.

12.6.3 Computational Complexity of OPSGA

To find endmembers without calculating determinants, OPSGA uses a projection
operator to find a data sample vector that yields the maximal distance from the
space formed by previously found endmembers. A complexity of $O\left(j^3 + jL^2\right)$
is required to obtain the $j$th endmember. In addition, it requires a complexity of
O(NL²) to project all data sample vectors onto the OP spaces. Therefore, the
complexity of each iteration is O(NL²), since the number of endmembers, k, used
in the calculation is much smaller than the total number of data sample vectors,
N, that is, $k \ll N$. As a result, the complexity of OPSGA is
$O\left(\sum_{j=2}^{p}\left(j^3 + jL^2 + NL^2\right)\right)$.

12.6.4 Computational Complexity of Recursive OPSGA

The complexity of finding the second endmember is the same as that of each iteration of
OPSGA, $O\left(j^3 + jL^2 + NL^2\right) \approx O\left(L^2 + NL^2\right)$. For $j > 2$, the projection operator
$P_{\tilde{\mathbf{U}}_j}^{\perp}$ can be updated from $P_{\tilde{\mathbf{U}}_{j-1}}^{\perp}$ recursively with complexity O(NL²). Thus, the
overall complexity of recursive OPSGA (ROPSGA) is O(pNL²).

12.6.5 Computational Complexity of GSGA Using the Gram–Schmidt Orthogonalization Process

To reduce the complexity of projecting all data sample vectors onto orthogonal
subspaces, GSOP requires only O(jL) per data sample vector.
Thus, a complexity of O(jNL) is needed for the total number of N data sample
vectors. This implies that the complexity of GSGA using GSOP is
$O\left(\sum_{j=2}^{p} jNL\right) \approx O\left(p^2 NL\right)$.

12.6.6 Computational Complexity of Recursive GSGA

The complexity of finding the second endmember is O(NL). For $j > 2$, the projections
specified by (12.26)–(12.28) require complexity O(NL). This leads to a
complexity of O(pNL) for RHSP-GSGA. A comparison of the complexities of
all the SGA-based methods is summarized in Table 12.2.

12.7 Real Image Experiments

Two real hyperspectral image scenes in Figs. 1.10 and 1.14, which have been
studied extensively in the literature, are used for experiments.
Table 12.2 Computational complexity comparison

  Algorithm      Computational complexity
  DSGA           $O\left(p^2 NL^2\right) + \sum_{j=2}^{p} O\left(j^3 N\right)$
  Dist-SGA       $O\left(p^2 NL\right) + \sum_{j=2}^{p} O\left(j^2 L\right)$
  GSGA           $O\left(p^2 NL\right)$
  RHSP-GSGA      $O(pNL)$
  KF OSP-GSGA    $O(pNL)$
  KF OVP-GSGA    $O(pNL)$
  OPSGA          $O\left(\sum_{j=2}^{p}\left(j^3 + jL^2 + NL^2\right)\right)$
  RHSP-OPSGA     $O\left(pNL^2\right)$

Fig. 12.2 (a) HYDICE panel scene containing 15 panels; (b) ground truth map of spatial locations
of the 15 panels (panel pixels labeled by row: p11, p12, p13; p211, p212, p221, p22, p23; p311,
p312, p32, p33; p411, p412, p42, p43; p511, p521, p52, p53)

12.7.1 HYDICE Data

The image data to be studied are from the HYperspectral Digital Imagery Collection
Experiment (HYDICE) image scene shown in Fig. 12.2a (also shown in
Fig. 1.10a), which has a size of 64 × 64 pixel vectors, with 15 panels in the scene
and the ground truth map in Fig. 12.2b (Fig. 1.10b).
As noted, panel pixel p212, marked by yellow in Fig. 12.2b, is of particular
interest. Based on the ground truth, this panel pixel is not a pure panel pixel and is
marked by yellow as a boundary panel pixel. However, with our extensive and
comprehensive experiments this yellow panel pixel is always extracted as the one
with the most spectrally distinct signature compared to those R panel pixels in row
2. This indicates that a signature of spectral purity is not equivalent to a signature of
spectral distinction. As a matter of fact, in many cases panel pixel p212 instead of
panel pixel p221 is the one extracted by EFAs to represent the panel signature in row
2. Also, because of such ambiguity, the panel signature representing the panel
pixels in the second row is either p221 or p212, which is always difficult to find
using EFAs. This implies that the ground truth of R panel pixels in the second row
provided in Fig. 12.2b may not be as pure as was thought.
Table 12.3 $n_{\mathrm{RHSP\text{-}GSGA}}$ estimated for HYDICE data

  $\eta_{j+1}$ in (12.56)   MOCA   $P_F=10^{-1}$   $P_F=10^{-2}$   $P_F=10^{-3}$   $P_F=10^{-4}$   $P_F=10^{-5}$
  DSGA                      98     98              98              98              98              98
  GSGA                      45     46              44              43              40              39
  $\rho_{j+1}$ in (12.57)   MOCA   $P_F=10^{-1}$   $P_F=10^{-2}$   $P_F=10^{-3}$   $P_F=10^{-4}$   $P_F=10^{-5}$
  DSGA                      98     98              98              98              98              98
  GSGA                      43     45              43              40              39              37

By virtue of (12.56) and (12.57), the energies [$\eta_{j+1}$ in (12.56)] and strengths [$\rho_{j+1}$
in (12.57)] of real target pixels found by DSGA and GSGA can be used to
determine the number of endmembers by VD, with RHSP-OPSGA replaced by
RHSP-GSGA. Table 12.3 tabulates their values estimated by the maximal orthogonal
complement algorithm (MOCA), which is equivalent to a Bayes detector, as
well as by an NPD with various false alarm probabilities.

Since DSGA found targets without performing data DR, it seemed to produce
larger values for $n_{\mathrm{RHSP\text{-}GSGA}}$ than GSGA, which grows simplexes up to $n_{\mathrm{RHSP\text{-}GSGA}}$
vertices. In addition, since 45 is the least upper bound on $n_{\mathrm{RHSP\text{-}GSGA}}$ determined by
GSGA (Table 12.3), DSGA and GSGA were applied to find up to 45 target pixels as
potential endmembers, as shown in Fig. 12.3 for the HYDICE data. As we can see from
Fig. 12.3a, b, the first nine target pixels found in Fig. 12.3a already included three
panel pixels, p11, p312, and p521, and the next nine target pixels (pixels 10–18) in
Fig. 12.3b also picked up the remaining two panel pixels, p212 and p412. As shown in
Fig. 12.2b, p212 is more pure than its neighboring pixels p211 and p221. Although
both DSGA and GSGA found five panel pixels corresponding to the three panel
pixels p11, p312, and p521 in Fig. 12.3a and the remaining two panel pixels p212 and
p412 in Fig. 12.3b, their orders of appearance were different. Interestingly, both
required 18 target pixels to find one panel pixel in each of the 5 different rows, and the last
panel pixels found were the 18th pixels: p412 by DSGA and p212 by GSGA.

Interestingly, according to Chang (2003a, b, 2013) and Chang and Du (2004),
the number of endmembers estimated for this scene by VD was $n_{\mathrm{VD}} = 9$. Also, from
Chang et al. (2010a, 2011b), the number of signatures used for LSMA was
estimated as twice that of $n_{\mathrm{VD}}$, $2n_{\mathrm{VD}} = 18$. In this case, the first 9 target pixels in
Fig. 12.3a and target pixels 10–18 shown in Fig. 12.3b found by DSGA and GSGA,
respectively, can be used as desired endmembers, which include all 5 panel
signatures.

Furthermore, the target pixels found in Fig. 12.3a, b were used to form 9-vertex
simplexes and 18-vertex simplexes, from which we can calculate their SVs.
Table 12.4 tabulates their SVs, where DSGA-generated simplexes always yielded
larger volumes than GSGA-generated simplexes.
Fig. 12.3 Target pixels found by DSGA and GSGA for HYDICE data. (a) 9 endmember pixels.
(b) 10–18 endmember pixels. (c) 19–34 endmember pixels. (d) 35–45 endmember pixels
Table 12.4 Simplex volumes calculated based on target pixels found by DSGA and GSGA

                     DSGA          GSGA
  $n_{\mathrm{VD}} = 9$      1.4319E+24    1.0617E+24
  $2n_{\mathrm{VD}} = 18$    2.0509E+39    1.5677E+39

Fig. 12.4 (a) Cuprite AVIRIS image scene. (b) Spatial positions of five pure pixels corresponding
to minerals alunite (A), buddingtonite (B), calcite (C), kaolinite (K), and muscovite (M). (c) Five
mineral reflectance spectra. (d) Five mineral radiance spectra

12.7.2 Cuprite Data

The image scene used is real Cuprite image data (Fig. 12.4, also shown in
Fig. 1.4), which are available at the USGS Web site http://aviris.jpl.nasa.gov/. This
scene is a 224-band image with a size of 350 × 350 pixels and was collected over
the Cuprite Mining District site in Nevada in 1997. It is one of the most widely used
hyperspectral image scenes available in the public domain and has 20 m spatial
resolution and 10 nm spectral resolution in the range 0.4–2.5 μm. This scene has been
studied extensively because it is well understood mineralogically and has reliable
ground truth. Two data sets for this scene, reflectance and radiance data, are also
available for study. There are five pure pixels in Fig. 12.4a, b, which can be
identified as corresponding to five different minerals, alunite (A), buddingtonite
(B), calcite (C), kaolinite (K), and muscovite (M), labeled A, B, C, K, and M in
Fig. 12.4b, along with their spectral signatures plotted in Fig. 12.4c, d.

12.7.2.1 Reflectance Data

Once again, the energies [ηj+1 in (12.56)] and strengths [ρj+1 in (12.57)] of real
target pixels found by DSGA and GSGA are used to determine the number of
endmembers tabulated in Table 12.5.
Fig. 12.4 (continued) (c) Reflectance spectra of the five minerals. (d) Radiance spectra of the five
minerals, both plotted over the wavelength range 400–2400 nm

Because DSGA found endmembers in the original L-dimensional data space
without DR, the value of nVD produced by DSGA was larger than that produced by
GSGA, which found endmembers in growing data spaces. In this case, a value of
75, which was the smallest upper bound on nRHBP-GSGA produced by GSGA with
PF ≤ 10^-2 (Tables 12.5 and 12.6), was used for DSGA and GSGA to generate
target pixels. Figure 12.5a–c shows the results: the first 22 target pixels appear in
Fig. 12.5a, pixels 23–46 in Fig. 12.5b, and pixels 47–75 in Fig. 12.5c.

Table 12.5 nRHBP-GSGA estimated for Cuprite reflectance data

  ηj+1 in (12.56)   MOCA      PF=10^-1   PF=10^-2   PF=10^-3   PF=10^-4   PF=10^-5
  DSGA              F (120)   F (120)    F (120)    119        119        117
  GSGA              75        82         74         69         64         62
  ρj+1 in (12.57)   MOCA      PF=10^-1   PF=10^-2   PF=10^-3   PF=10^-4   PF=10^-5
  DSGA              119       119        119        117        103        95
  GSGA              75        82         74         69         64         62

Table 12.6 Target pixels found by EIDA for Cuprite reflectance data

  Cuprite              A            B            C            K            M
  DSGA  Reflectance    t_11^DSGA    t_13^DSGA    t_44^DSGA    t_64^DSGA    t_8^DSGA
  GSGA  Reflectance    t_29^GSGA    t_53^GSGA    t_40^GSGA    t_27^GSGA    t_9^GSGA

Fig. 12.5 Endmember pixels found by DSGA and GSGA for Cuprite reflectance data. (a) 22 endmember pixels. (b) 23–46 endmember pixels. (c) 47–75 endmember pixels


Fig. 12.6 Target pixels found by EIDA compared to ground truth pixels for Cuprite reflectance
data. (a) DSGA. (b) GSGA

Because there is no available prior knowledge about the spatial locations of
endmembers, we must rely on an unsupervised means of identifying whether an
extracted target pixel is an endmember. To address this issue, the Endmember
IDentification Algorithm (EIDA) developed in Chang et al. (2014a, b) is used for
this purpose. Figure 12.6 shows target pixels highlighted in red with lowercase
letters identified by EIDA corresponding to ground truth pixels highlighted in
yellow.
Table 12.6 also tabulates the target pixels found in Fig. 12.6 by DSGA and
GSGA via EIDA, where all five mineral signatures were extracted by target pixels,
with their order of appearance labeled by subscripts and the particular SGA-based
algorithm labeled by superscripts.
As we see from Table 12.6, the most difficult mineral signatures for DSGA and
GSGA to find are different: "K," which requires 64 target pixels, for DSGA, and
"B," which requires 53 target pixels, for GSGA.
To see how close the spectral signatures of found endmembers are compared to
their corresponding ground truth signatures, Fig. 12.7a–e shows profiles of the
spectral signatures of each of the five identified target pixels against their
corresponding ground truth pixels in Fig. 12.4.
Since Figs. 12.6 and 12.7 do not provide quantitative measures of spectral
signature similarity, Table 12.7 calculates spectral similarity values for the identified
target pixels against the ground truth pixels using the Spectral Angle Mapper
(SAM) and Spectral Information Divergence (SID) (Chang 2003a, b).
Apparently, from the results in Figs. 12.6 and 12.7 and Table 12.7, DSGA and
GSGA found quite different sets of target pixels, with only one pixel, “c,” in
common. Nonetheless, the spectral signatures of all the found pixels are relatively
close.
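SAM and SID have simple closed forms, so the similarity values in Tables 12.7 and 12.10 are easy to reproduce for any pair of spectra. The following sketch uses the standard definitions (not code from this book) and assumes strictly positive spectra for SID:

```python
import numpy as np

def sam(x, y):
    """Spectral Angle Mapper: the angle (in radians) between two spectra."""
    c = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

def sid(x, y):
    """Spectral Information Divergence: symmetric KL divergence between
    spectra normalized to probability vectors (assumes positive entries)."""
    p, q = x / np.sum(x), y / np.sum(y)
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

# Identical spectra give 0 under both measures, matching the zeros reported
# in Table 12.7 for target pixels that coincide with ground truth pixels.
a = np.array([0.2, 0.5, 0.9, 0.4])
print(sam(a, a), sid(a, a))  # both ~0
```

Note that SAM is invariant to the scaling of either spectrum, while SID compares the shapes of the normalized spectral distributions, which is why the two measures can rank signature pairs slightly differently.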

Fig. 12.7 Comparative plots of spectral signatures found by DSGA and GSGA on Cuprite reflectance data (reflectance versus band number, 1–200; curves: ground truth, DSGA, GSGA). (a) "a" signatures against "A." (b) "b" signatures against "B." (c) "c" signatures against "C." (d) "k" signatures against "K." (e) "m" signatures against "M"

12.7.2.2 Radiance Data

Similarly, the same experiments as were conducted for Cuprite reflectance data
were also performed for Cuprite radiance data. Table 12.8 tabulates nRHBP-GSGA
determined by the energies [ηj+1 in (12.56)] and strengths [ρj+1 in (12.57)] of real
target pixels found by DSGA and GSGA.

Table 12.7 SAM/SID of closest endmembers with ground truth by DSGA and GSGA on Cuprite reflectance data

               (A,a)     (B,b)     (C,c)     (K,k)     (M,m)
  DSGA  SAM    0.0167    0.0334    0.0379    0.0341    0
        SID    0.0002    0.0009    0.0009    0.0007    0
  GSGA  SAM    0         0.0497    0.0379    0.0304    0.0264
        SID    0         0.0012    0.0009    0.0006    0.0005

Table 12.8 nRHBP-GSGA estimated for Cuprite radiance data

  ηj+1 in (12.56)   MOCA      PF=10^-1   PF=10^-2   PF=10^-3   PF=10^-4   PF=10^-5
  DSGA              F (139)   F (139)    109        106        104        95
  GSGA              72        79         72         69         65         62
  ρj+1 in (12.57)   MOCA      PF=10^-1   PF=10^-2   PF=10^-3   PF=10^-4   PF=10^-5
  DSGA              103       106        103        95         95         85
  GSGA              67        71         67         62         58         56

Once again, a value of 75 was used for DSGA and GSGA to generate 75 target
pixels. Figure 12.8a–c shows the first 22 target pixels (Fig. 12.8a), followed by
pixels 23–46 (Fig. 12.8b) and pixels 47–75 (Fig. 12.8c).
Figure 12.9 shows the five target pixels identified by EIDA from Fig. 12.8
against the five ground truth pixels, and Table 12.9 tabulates these five target
pixels found by DSGA and GSGA with their orders of appearance.
From Table 12.9 we see that the most difficult mineral signatures for DSGA and
GSGA to find are different: "K," which requires 103 target pixels, for DSGA, and
"C," which requires 68 target pixels, for GSGA.
In analogy with Fig. 12.7, Fig. 12.10a–e shows profiles of the spectral
signatures of each of the five identified target pixels against their corresponding
ground truth pixels in Fig. 12.4.
Once again, SAM and SID are used to calculate the spectral similarity values
among the identified target pixels in Fig. 12.9 against the ground truth pixels, with
the results tabulated in Table 12.10, where DSGA and GSGA found two
different sets of pixels with two common pixels that corresponded to two ground
truth pixels, "B" and "M." Nonetheless, the spectral signatures of all the found
pixels are also very close.
The experiments conducted for the Cuprite reflectance and radiance data sets
demonstrated an important fact: there are no unique target pixels corresponding to
ground truth pixels in the scene. In other words, many more pixels scattered around
the scene can also correspond to ground truth pixels.
Finally, the target pixels in Figs. 12.5 and 12.8 were used to form 22/44-vertex
simplexes for reflectance data and 15/30-vertex simplexes for radiance data, from
which we can calculate their SVs. Table 12.11 tabulates their respective SVs.
Interestingly, unlike HYDICE data, GSGA produced larger SVs than DSGA did
for both the reflectance and radiance data sets.
Fig. 12.8 Target pixels found by DSGA and GSGA for Cuprite radiance data. (a) 22 endmember pixels. (b) 23–46 endmember pixels. (c) 47–75 endmember pixels

Fig. 12.9 Endmember pixels found by EIDA compared to ground truth pixels for Cuprite
radiance data. (a) DSGA. (b) GSGA
Table 12.9 Target pixels found by EIDA for Cuprite radiance data

  Algorithm   Cuprite     A            B            C            K             M
  DSGA        Radiance    t_12^DSGA    t_14^DSGA    t_28^DSGA    t_103^DSGA    t_6^DSGA
  GSGA        Radiance    t_16^GSGA    t_14^GSGA    t_68^GSGA    t_11^GSGA     t_7^GSGA

Fig. 12.10 Comparative plots of spectral signatures found by DSGA and GSGA on Cuprite radiance data (radiance versus band number, 1–200; curves: ground truth, DSGA, GSGA). (a) "a" signatures against "A." (b) "b" signatures against "B." (c) "c" signatures against "C." (d) "k" signatures against "K." (e) "m" signatures against "M"

Table 12.10 SAM/SID of closest endmembers with ground truth by DSGA and GSGA on Cuprite radiance data

               (A,a)     (B,b)   (C,c)     (K,k)     (M,m)
  DSGA  SAM    0.0205    0       0.0247    0.0123    0
        SID    0.0003    0       0.0006    0.0001    0
  GSGA  SAM    0.0098    0       0.0253    0.0100    0
        SID    0.0001    0       0.0010    0.0001    0

Table 12.11 Simplex volumes calculated based on target pixels found by DSGA and GSGA

  Number of endmembers          DSGA         GSGA
  nVD = 22 for reflectance      6.0364E+52   12.0991E+52
  2nVD = 44 for reflectance     1.2024E+79   2.1774E+79
  nVD = 15 for radiance         1.0170E+40   1.9248E+40
  2nVD = 30 for radiance        5.6016E+58   5.4038E+58

Fig. 12.11 Cumulative computing time in seconds of DSGA without DR, Dist-SGA, OPSGA, RHSP-OPSGA, GSGA, RHSP-GSGA, KF OSP-GSGA, and KF OVP-GSGA as p increases on HYDICE data. (a) All eight algorithms. (b) The seven SGA-modified algorithms without DSGA

12.7.3 Computer Processing Time Analysis

This section summarizes the computer processing times required by DSGA and seven
SGA-based algorithms: four developed in this chapter, GSGA, RHSP-GSGA, KF
OSP-GSGA, and KF OVP-GSGA; two developed in Chap. 11, OPSGA and RHSP-
OPSGA; plus Dist-SGA, developed by Wang et al. (2013a, b, c). Figure 12.11a
presents an overall comparative analysis of cumulative computing time in seconds
among all eight algorithms, DSGA without DR, Dist-SGA, OPSGA, RHSP-OPSGA,
GSGA, RHSP-GSGA, KF OSP-GSGA, and KF OVP-GSGA applied to HYDICE
data as the number of target pixels, p, increases. Since DSGA requires a significant
amount of time, which overwhelmed the other seven SGA-modified algorithms, it is
impossible to see their relative performances. In this case, Fig. 12.11b only plots the

Table 12.12 Comparison of computing time in seconds by various methods with DSGA on HYDICE data

  p    DSGA     Dist-SGA   OPSGA    RHSP-OPSGA   GSGA     RHSP-GSGA   KF OSP-GSGA   KF OVP-GSGA
  9    1.7086   0.1134     0.1136   0.1026       0.0697   0.0640      0.0673        0.1519
  18   5.2592   0.2542     0.2292   0.2114       0.1521   0.1359      0.1457        0.3433
  34   20.180   0.5202     0.4177   0.4014       0.3087   0.2639      0.2842        0.6843
  45   43.770   0.7192     0.5482   0.5344       0.4236   0.3532      0.3763        0.9230

Fig. 12.12 Cumulative computing time in seconds of DSGA without DR, Dist-SGA, OPSGA, RHSP-OPSGA, GSGA, RHSP-GSGA, KF OSP-GSGA, and KF OVP-GSGA as p increases on Cuprite data. (a) All eight algorithms. (b) The seven SGA-modified algorithms without DSGA

computing times required by these seven algorithms for better visual assessment.
According to Fig. 12.11b, the best algorithm is clearly RHSP-GSGA, and in terms of
computing time the worst is Dist-SGA when the number of target pixels, nT, is
greater than 140 and KF OVP-GSGA when nT is less than 140. In addition, Fig. 12.11
also partitions all eight algorithms into four groups: the best group is composed of
RHSP-GSGA and KF OSP-GSGA, followed by the second-best group, made up of
GSGA, OPSGA, and RHSP-OPSGA, the third group, consisting of Dist-SGA and KF
OVP-GSGA, and, finally, the worst group, DSGA, which is the original SGA
developed by Chang et al. (2006a, b, c) without DR.
To give a better idea about computing times, Table 12.12 tabulates the computing
times required by all eight algorithms for p = 9, 18, 34, and 45 as examples,
where all seven algorithms except DSGA required less than 1 s to accomplish the
task.
Similarly, for an overall comparative analysis, Fig. 12.12 also plots the computer
processing times required by the eight algorithms, DSGA without DR, Dist-SGA,
OPSGA, RHSP-OPSGA, GSGA, RHSP-GSGA, KF OSP-GSGA, and KF
OVP-GSGA applied to Cuprite data as the number of target pixels, p, increases.
Conclusions similar to those drawn from the HYDICE data are also applied to the
Cuprite data, except that KF OVP-GSGA was the worst and Dist-SGA was the
second-worst algorithm (Fig. 12.12).

Table 12.13 Comparison of computing time in seconds by various methods with DSGA

  p    DSGA     Dist-SGA   OPSGA     RHSP-OPSGA   GSGA     RHSP-GSGA   KF OSP-GSGA   KF OVP-GSGA
  22   262.31   10.150     7.6834    7.6477       6.0605   5.3360      5.5380        14.832
  46   1465.5   12.971     16.3953   16.3544      13.510   12.3633     12.897        32.425
  75   4431     39.894     26.9343   26.8683      12.343   112.6557    19.637        53.631
Like Table 12.12, Table 12.13 also tabulates the computing times required by all
eight algorithms for p = 22, 46, and 75 as examples, where all seven algorithms
required significantly less time than DSGA.
According to the computer processing times shown by Figs. 12.11 and 12.12,
when nT is less than approximately 140, the performance of DSGA, Dist-SGA,
OPSGA, RHSP-OPSGA, GSGA, RHSP-GSGA, KF OSP-GSGA, and KF
OVP-GSGA can be ranked as

RHSP-GSGA ≻ KF OSP-GSGA ≻ GSGA ≻ KF OVP-GSGA ≻ RHSP-OPSGA ≻ OPSGA ≻ Dist-SGA ≻ DSGA   (12.59)

for HYDICE data,

RHSP-GSGA ≻ KF OSP-GSGA ≻ GSGA ≻ RHSP-OPSGA ≈ OPSGA ≻ Dist-SGA ≻ KF OVP-GSGA ≻ DSGA   (12.60)

for Cuprite data, where "A ≻ B" means that A performs better than B in the sense of
requiring less computing time. When nT grows beyond 140, OPSGA and
RHSP-OPSGA began to outperform GSGA for both the HYDICE and Cuprite
data. Interestingly, for the HYDICE data, Dist-SGA emerged as the worst among
all the test algorithms except DSGA. This demonstrates that the best SGA-based
algorithm is RHSP-GSGA followed by KF OSP-GSGA and RHSP-OPSGA, with
DSGA being the worst. Apparently, Dist-SGA developed by Wang et al. (2013a, b,
c) was ranked fifth among all eight test SGA-based algorithms. However, two
comments on (12.59) are worth making.
• The reason KF OVP-GSGA yields the second-worst computing time next
to DSGA is that KF OVP-GSGA requires an additional orthonormalization
procedure via (12.42) that is not required by KF OSP-GSGA. If this OVP
process is replaced with OSP, it becomes the second-best SGA algorithm.
• KF OSP-GSGA is supposed to be the best algorithm. However, it ranks second
best next to RHSP-GSGA owing to its extra use of ∇m̃_{j+1} = m̃_{j+1} − m̃_j
specified by (12.35) to calculate the innovation information provided by m̃_{j+1} but
not in m̃_j, which is not required by RHSP-GSGA. Nonetheless, it is this equation
(12.35) that makes easy hardware design on a chip a real-time process. It is our
belief that when both RHSP-GSGA and KF OSP-GSGA are implemented on a
chip, KF OSP-GSGA will outperform RHSP-GSGA.

12.8 Conclusions

When N-FINDR is implemented, its main purpose is to find endmembers. It seems
that no attention has been paid to finding SV correctly, effectively, or efficiently. As
a matter of fact, there are some issues involved in calculating DSV addressed in
Chap. 2. Since the matrix determinant used to calculate DSV is generally ill-ranked,
DSV must be calculated by either performing DR to reduce a nonsquare matrix to a
square matrix or using SVD to find nonzero eigenvalues. Unfortunately, in most
cases, neither will provide true SVs. This leads to the development of SGA, which
grows simplexes one vertex at a time to avoid computational problems. However,
SGA only reduces the computational time for calculating DSV and does not really
address the DSV computational issue, because it still makes use of ill-ranked matrix
determinants to calculate DSV, which results in untrue SVs. For this reason, SGA
is referred to as a determinant-based SGA (DSGA) to emphasize the use of the
determinant by SGA to calculate DSV. To further avoid calculating the
determinants of ill-ranked matrices, Chap. 11 develops an OP-based SGA
(OPSGA) to completely change the traditional wisdom in calculating DSV via
matrix determinants by converting SV computation to finding OPs. In other words,
it considers a hyperplane that is linearly spanned by previously found endmembers.
The next new endmember to be generated should be one that yields the maximal OP
onto the hyperplane. Then finding the maximal OP turns out to be the well-known
ATGP, which finds the maximal OP via an OSP. A similar idea was also noted in
Wang et al. (2013a, b, c), who used maximal distance instead of OP to find
endmembers. This chapter developed yet another new approach from a geometric
point of view. Instead of calculating volumes of simplexes formed by endmembers
by finding the maximal OP as OPSGA does, the chapter developed GSGA, which
actually finds simplex edge vectors derived from endmembers rather than finding
endmembers as simplex vertices, as is done by DSGA, OPSGA, and Dist-SGA. In
doing so, GSGA transforms a DSV calculation into a GSV calculation by finding
the product of the simplex base multiplied by the simplex height. Accordingly,
GSGA finds endmembers that are exactly those producing successive maximal
heights of growing simplexes. To accomplish this goal, GSOP is used to find
maximal heights as opposed to OPSGA, which uses OSPs to find the maximal
OP. There are also three key differences between OPSGA and GSGA. One is their
use of initial conditions. Since OPSGA finds endmembers as vertices, its initial
condition can start off with one endmember as a single-vertex simplex. In contrast,

Fig. 12.13 Diagram of evolution of SV calculation, tracing SVA through N-FINDR, MVT, SGA, and ATGP to SQ N-FINDR, SC N-FINDR, RT-SGA, Dist-SGA, GSGA, and OPSGA, and finally to KF OSP-GSGA, KF OVP-GSGA, RHSP-GSGA, and RHSP-OPSGA

GSGA calculates GSV through a base formed by simplex edge vectors and the
height. Thus, it must start with a two-vertex simplex as its initial condition because
a degenerated simplex with a simplex edge vector is a simplex connected by two
endmembers. As a consequence, GSGA-produced simplexes have nothing to do
with the spatial locations of simplexes it finds, but OPSGA does. A second
difference is that OPSGA suppresses data sample vectors via OSPs, while GSGA
does not because GSGA orthogonalizes only height vectors via GSOP. Lastly but
most importantly is finding endmembers. Assume that at the ( j + 1)th stage we
have already found j endmembers, {m_k}_{k=1}^j, and need to find the next
endmember, the ( j + 1)th endmember, m_{j+1}, to form a new ( j + 1)-vertex simplex,
S(m_1, m_2, ..., m_{j+1}). There are three approaches to finding such an m_{j+1}. The
first one is DSGA, which finds an m_{j+1} that yields the maximal DSV by calculating
the matrix determinant specified by (12.1). A second one is OPSGA, which finds
m_{j+1} via m_{j+1} = arg max_r {‖P⊥_{U_j} r‖} specified by (11.1), where
U_j = [m_1 m_2 ⋯ m_j]. A third approach is GSGA, which finds the maximal height
via (12.12), where the simplex formed by {m_k}_{k=1}^j, S(m_1, m_2, ..., m_j), is
considered a base of S(m_1, m_2, ..., m_{j+1}) with the height determined by m_{j+1}.
Because they avoid calculating DSV by matrix determinants, OPSGA and GSGA
turn out to be the best SGA-based algorithms reported in the literature and
outperform Wang et al.'s Dist-SGA in the sense of having the least computational
complexity and lowest computer processing time while producing true SVs.
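The height-maximization step at the heart of this geometric view can be illustrated with a small sketch. This is our own illustration of the base-times-height idea (project each candidate's edge vector off the subspace spanned by the existing edge vectors and keep the largest residual), using QR factorization as the Gram–Schmidt orthonormalization; it is not the book's exact recursive GSGA, and the function name is hypothetical:

```python
import numpy as np

def next_endmember(candidates, found):
    """Return the candidate with the maximal simplex height, i.e., the
    residual norm of its edge vector after projecting out the span of the
    edge vectors of already-found endmembers (Gram-Schmidt via QR)."""
    m1 = found[0]
    edges = np.stack([m - m1 for m in found[1:]], axis=1)
    Q, _ = np.linalg.qr(edges)                   # orthonormal basis of the base subspace
    best, best_h = None, -1.0
    for r in candidates:
        e = r - m1
        h = np.linalg.norm(e - Q @ (Q.T @ e))    # height above the base hyperplane
        if h > best_h:
            best, best_h = r, h
    return best, best_h

# Starting from a two-vertex simplex (GSGA's initial condition), a candidate
# off the line through the two vertices has positive height and is selected.
found = [np.zeros(3), np.array([1.0, 0.0, 0.0])]
cands = [np.array([2.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
best, h = next_endmember(cands, found)
print(best, h)
```

Note how the candidate collinear with the existing vertices contributes zero height and can never be chosen, which is exactly why height maximization avoids degenerate simplexes without evaluating any determinant.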
The diagram of Fig. 12.13 summarizes all the SVA-based approaches, MVT
and N-FINDR, DSGA, ATGP, GSGA and RHSP-GSGA, and OPSGA and RHSP-
OPSGA in Chap. 11, along with the Dist-SGA developed by Wang et al. (2013a, b, c).
Part IV
Sample Spectral Statistics-Based
Recursive Hyperspectral Band Processing

Thus far, all the major hyperspectral imaging algorithms described in this book have
been redesigned for implementation as real-time processing algorithms in Part II.
Specifically, Part II studied the band-interleaved-pixel/sample/line (BIP/BIS/BIL)
data acquisition format for hyperspectral target detection that uses a causal sample
correlation/covariance matrix (CSCRM/CSCVM) to perform real-time detection.
At this point we have not taken up the issue of designing and developing algorithms
according to the other data acquisition format, the band-sequential (BSQ) format,
which collects data band image by band image, shown in Fig. 1.2 and reproduced in
Fig. 4.1, where (x, y) indicates the spatial coordinate of a data sample vector or pixel
vector and λ is a parameter used to specify spectral bands.

Fig. 4.1 Hyperspectral imagery collected by BSQ format



To understand the difference between the BIS/BIP/BIL and BSQ formats, we
consider the difference between two image processes, sequential and progressive.
In general, sequential image processing processes an 8-bit image using 8-bit
grayscales to fully process each of its pixels sequentially pixel by pixel without
having to revisit the pixels. An example is downloading images directly from a Web
site. By contrast, progressive image processing using 8-bit grayscales processes
each pixel of an 8-bit image bit by bit. As a result, each image pixel is revisited over
and over again and is processed eight times. An example is the bit plane coding for
image enhancement and compression, where Fig. 4.2 shows an 8-bit grayscale
image encoded in Fig. 4.2a and decoded in Fig. 4.2b by bit plane coding [see
Chap. 3 in Gonzalez and Woods (2007)].
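The bit-plane analogy can be made concrete with a short sketch (our own illustration, not taken from Gonzalez and Woods): each pass adds one more significant bit, just as PHBP adds one more band.

```python
import numpy as np

def bit_planes(img):
    """Decompose an 8-bit image into 8 binary bit planes (index 0 = LSB)."""
    return [(img >> b) & 1 for b in range(8)]

def progressive_reconstruct(planes, n_msb):
    """Rebuild the image from only the n_msb most significant bit planes,
    mirroring progressive (BSQ-like) refinement of resolution."""
    img = np.zeros_like(planes[0], dtype=np.uint8)
    for b in range(7, 7 - n_msb, -1):
        img |= planes[b].astype(np.uint8) << b
    return img

img = np.array([[200, 15], [128, 255]], dtype=np.uint8)
planes = bit_planes(img)
# Using all 8 planes recovers the image exactly; fewer planes give a
# coarser, progressively refined approximation.
print(np.array_equal(progressive_reconstruct(planes, 8), img))  # True
```

With one plane the reconstruction is a crude two-level image; each additional plane halves the quantization error, which is the grayscale analog of adding one more spectral band under PHBP.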
Thus, according to the preceding descriptions, a sequential image process is
actually a BIS/BIP/BIL process, whereas a progressive image process is a BSQ
process. Therefore, analogous to 8-bit plane coding, which can encode and decode
an 8-bit grayscale image progressively by increasing the grayscale resolution bit by
bit gradually, we can also design progressive hyperspectral band processing
(PHBP) that processes data progressively by increasing band resolution gradually
band by band according to the BSQ format. Part IV is devoted to extending real-
time constrained energy minimization (Chap. 5) and real-time anomaly detection
(Chap. 6) in Part II to their progressive and recursive band processing counterparts.
Part IV starts with active target detection, Chap. 13, “Recursive Hyperspectral
Band Processing for Active Target Detection: Constrained Energy Minimization,”
and passive target detection, Chap. 14, “Recursive Hyperspectral Band Processing
for Passive Target Detection: Anomaly Detection.”

Fig. 4.2 8-bit grayscale image encoded and decoded by bit plane image coding. (a) Progressive
8-bit grayscale processing of bit plane coding from least significant bit (bit plane 1) to most
significant bit (bit plane 8) (Gonzalez and Woods 2008). (b) Progressive 8-bit grayscale recon-
struction processing from the two, three, and four most significant bit planes (bit planes 8, 7, 6, 5)
to the 8-bit image (Gonzalez and Woods 2007)
Chapter 13
Recursive Hyperspectral Band Processing for
Active Target Detection: Constrained Energy
Minimization

Abstract Chapter 5 extends constrained energy minimization (CEM) to a
real-time processing version of CEM, called real-time CEM (RT CEM), that allows
CEM to process data according to a band-interleaved-by-pixel/sample (BIP/BIS)
data acquisition format sample by sample recursively in real time. This chapter
extends CEM to another type of real-time implementation of CEM from a band-
sequential (BSQ) format perspective to recursive hyperspectral band processing
(RHBP) of CEM (RHBP-CEM) that can perform CEM according to BSQ band by
band recursively in real time. In doing so we introduce a new concept, called causal
band correlation matrix (CBCRM), which is a correlation matrix formed by only
those bands that were already visited up to the band currently being processed but
not bands yet to be visited in the future, to replace the global sample correlation
matrix R so that CBCRM can be updated band by band in real time. The RHBP-
CEM presented in this chapter allows CEM to perform target detection progres-
sively and recursively whenever bands are available without waiting for the com-
pletion of band collection. With such an advantage RHBP-CEM has potential in
data transmission and communication, specifically in satellite data processing.

13.1 Introduction

Subpixel detection generally refers to the detection of a material substance of
interest whose spatial extent lies within a single pixel. Specifically, when
the target size is smaller than the pixel size, it is called a subpixel target. Two
major causes result in a subpixel issue. One is insufficient spatial resolution. As a
result, many material substances may partially occupy a single pixel. This is
particularly true for multispectral images, which are generally acquired by tens of
discrete wavelengths with a spatial resolution ranging from 20 to 30 m. Under such
a circumstance material substances are usually mixed all together in a single pixel,
which results in a mixed pixel. As a consequence, it requires subpixel detection to
identify target substances involved with such mixing. The other cause is the high
spectral resolution resulting from hyperspectral imaging sensors using hundreds of
contiguous wavelengths where a target substance embedded in a single pixel can be

© Springer International Publishing Switzerland 2017 399


C.-I Chang, Real-Time Recursive Hyperspectral Sample and Band Processing,
DOI 10.1007/978-3-319-45171-8_13

uncovered as a subpixel target. This is especially important in hyperspectral data
exploitation because subpixel targets provide crucial information for image
analysts. In either case, subpixel detection plays a key role in image interpretation.
Though many target detection algorithms have been proposed in the past (Chang
2003), the constrained energy minimization (CEM) developed by Harsanyi (1993)
remains one of the most effective techniques and has shown great success in
subpixel detection. Its idea was originally derived from Frost (1972) and since
then has been studied extensively in Chang (2002, 2003a), where the details of
CEM are described in Chap. 5. CEM assumes that there is a desired signature
specified by d. It then uses this designated signature to custom-design a finite
impulse response (FIR) filter to pass data sample vectors matched by d through a
constraint while minimizing the least-squares error caused by unmatched data
sample vectors. One of the major strengths resulting from CEM is that no prior
knowledge is required for data processing other than the desired signature d. This
advantage is very significant since so many signal sources cannot be either identi-
fied or inspected visually owing to the fact that these substances are very likely to be
extracted unknowingly by a sensor. Finding these signal sources is either impossi-
ble or extremely difficult. Using the specified signature d allows users to extract
targets of interest without knowing the background a priori. As a matter of fact,
CEM inverts the sample correlation matrix R so as to perform background
suppression prior to the extraction of the desired signature d. With background
suppression by R^-1 followed by a matched filter using the signature d as its
matched signal, CEM not only increases target contrast against background but
also performs very effectively in enhancing target detectability via a matched filter.
Interestingly, using the inversion of R has an effect similar to that from using
orthogonal subspace projection (OSP) in signal detection, as discussed in Chap. 5
(Chang 2016). Recently, a real-time CEM is further proposed in Chang et al.
(2001a). Chapter 5 also derives another approach by adapting R to incoming data
samples to further make CEM a real-time processing algorithm as data are being
collected according to the BIS/BIP format so that CEM can also be carried out
sample by sample at the same time. This chapter takes an approach that is rather
different from a band processing point of view by adapting both R and d to vary
bands instead of varying data sample vectors. It is called progressive hyperspectral
band processing of CEM (PHBP-CEM) and implements CEM progressively band
by band as each new band is received. This type of data processing is derived from
the need for a band-sequential (BSQ) format (Schowengerdt 1997), where remotely
sensed images are acquired and processed band by band.
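As a reference point for the band-processing variants developed in this chapter, the classical CEM filter, which minimizes the output energy w^T R w subject to the constraint w^T d = 1, has the closed form w = R^-1 d / (d^T R^-1 d). The following sketch (our own illustration with toy data, not code from the book) builds it from a data matrix:

```python
import numpy as np

def cem_filter(X, d):
    """CEM FIR filter weights w = R^{-1} d / (d^T R^{-1} d), where
    R = (1/N) X X^T is the sample correlation matrix of the data matrix X
    (one L-band pixel vector per column) and d is the desired signature.
    The constraint w^T d = 1 passes d while suppressing the background."""
    N = X.shape[1]
    R = (X @ X.T) / N
    Rinv_d = np.linalg.solve(R, d)    # solve R x = d instead of inverting R
    return Rinv_d / (d @ Rinv_d)

# Toy example: 3 bands, pixels drawn around a target and a background signature.
rng = np.random.default_rng(0)
d = np.array([1.0, 0.2, 0.1])                      # desired (target) signature
b = np.array([0.1, 1.0, 0.9])                      # background signature
X = np.column_stack([d, b, b, b]) + 0.01 * rng.standard_normal((3, 4))
w = cem_filter(X, d)
print(abs(w @ d - 1.0) < 1e-9)  # constraint w^T d = 1 satisfied: True
```

The detector output for any pixel r is then simply w^T r; the band-processing versions below replace the global R with a correlation matrix that grows as bands arrive.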
The proposed PHBP-CEM is quite different from traditional band selection
(BS) in several aspects. First, BS generally requires prior knowledge about the
number of bands, nBS, to be selected, but PHBP-CEM does not. Specifically, it can
process CEM whenever bands are available. Second, to avoid repeatedly solving
BS optimization problems as the value of nBS changes, band prioritization (BP) is
developed for BS to rank all bands according to the significance of their contained
information (Chang et al. 2011d; Chang 2013). PHBP-CEM does not need BP since
bands can come in any order and PHBP-CEM performs CEM whenever a new band
13.2 Recursive Equations for Calculating the Inverse of a Casual Band. . . 401

is received. Third, to make BS more effective, band decorrelation is included to


remove interband redundancy, so that highly correlated bands will not be selected.
For PHBP-CEM there is no need for band decorrelation. It simply uses up all the
bands available at the time it is processed.
To process PHBP-CEM in real time, a new concept, called the causal band
correlation matrix (CBCRM), which varies band by band, must be introduced. It is a
sample correlation matrix formed by all bands that are already visited and
processed up to the band currently being processed. For example, assume that B_l
is the current band to be processed and {B_j}_{j=1}^{l-1} are previously processed bands. The
CBCRM is a sample correlation matrix formed by all data sample vectors provided
by {B_j}_{j=1}^{l} but not by data sample vectors in any band {B_j}_{j=l+1}^{L} yet to be visited in the
future, where L is the total number of bands. Such causality is derived from the
causal Wiener filtering (Poor 1994), where only data samples up to the sample
currently being processed can be used to process a Wiener filter. By virtue of the
CBCRM, PHBP-CEM performs a progressive process of CEM one band at a time in
a band-causal manner in real time. In other words, band-causal processing can be
defined as data processing that uses only bands already collected but not those
bands yet to be received in the future. Since CBCRM must be recalculated every
time a new band is received, its computational complexity is exceedingly high. In a
manner similar to how a Kalman filter is derived from a causal Wiener filter in Poor
(1994), a recursive innovation information update equation for PHBP-CEM is
derived to calculate the CBCRM recursively in real time where only new-band
information is required for updating. The resulting PHBP-CEM is referred to as recursive hyperspectral band processing of CEM (RHBP-CEM), where this Kalman filtering-like recursive equation is the key to making it possible to implement PHBP-CEM recursively in real time and causally in the sense of band processing.

13.2 Recursive Equations for Calculating the Inverse of a Causal Band Correlation Matrix
Let $\mathbf{r}_i(l) = (r_{i1}, r_{i2}, \ldots, r_{il})^T$ be an $l$-dimensional data sample vector consisting of the data samples acquired by the first $l$ bands in the $i$th data sample vector $\mathbf{r}_i$, where $r_{ij}$ is the image pixel acquired by the $j$th spectral band for $1 \le j \le l \le L$. Assume that the $l$th band, $B_l$, is the band image currently being processed and $\Omega_{l-1} = \{B_i\}_{i=1}^{l-1}$ are all the $l-1$ band images that have been visited. We now introduce a new concept, the causal band sample correlation matrix, defined by $\mathbf{R}_{l\times l} = (1/N)\sum_{i=1}^{N} \mathbf{r}_i(l)\,\mathbf{r}_i^T(l)$, which is formed by the image pixel vectors $\{\mathbf{r}_i(l)\}_{i=1}^{N}$ using image pixels in all previously visited band images, $\Omega_{l-1} = \{B_i\}_{i=1}^{l-1}$, up to the current $l$th band image, $B_l$. Note that for each band image $B_j$ there is a total of $N$ pixels, denoted by $\{r_{ij}\}_{i=1}^{N}$, where $r_{ij}$ is the $i$th pixel in the $j$th band image.
402 13 RHBP for Active Target Detection: Constrained Energy Minimization
We first define $\mathbf{x}(l) = (r_{1l}, r_{2l}, \ldots, r_{Nl})^T$ as an $N$-dimensional data sample vector made up of the data samples of $\{\mathbf{r}_i\}_{i=1}^{N}$ in the $l$th band image. We also let $\mathbf{X}_l = [\mathbf{r}_1(l)\ \mathbf{r}_2(l)\ \cdots\ \mathbf{r}_{N-1}(l)\ \mathbf{r}_N(l)]$ be the data matrix formed by $\{\mathbf{r}_i(l)\}_{i=1}^{N}$, given by

$$\mathbf{X}_l = [\mathbf{r}_1(l)\ \mathbf{r}_2(l)\ \cdots\ \mathbf{r}_{N-1}(l)\ \mathbf{r}_N(l)] =
\begin{bmatrix}
r_{11} & \cdots & r_{(N-1)1} & r_{N1}\\
\vdots & \ddots & \vdots & \vdots\\
r_{1(l-1)} & \cdots & r_{(N-1)(l-1)} & r_{N(l-1)}\\
r_{1l} & \cdots & r_{(N-1)l} & r_{Nl}
\end{bmatrix}, \qquad (13.1)$$

where $\mathbf{r}_i(l) = (r_{i1}, r_{i2}, \ldots, r_{i(l-1)}, r_{il})^T$. In this case, $\mathbf{X}_l$ can be reexpressed as $\mathbf{X}_l = \begin{bmatrix}\mathbf{X}_{l-1}\\ \mathbf{x}^T(l)\end{bmatrix}$, where $\mathbf{x}(l) = (r_{1l}, r_{2l}, \ldots, r_{Nl})^T$ is the $N$-dimensional data sample vector in the last row of (13.1), consisting of all the data sample components of $\{\mathbf{r}_i\}_{i=1}^{N}$ in the $l$th band image. Then we define the causal band correlation matrix (CBCRM), denoted by $\mathbf{R}_{l\times l}$, as
" #
Xl1  
Rll ¼ ð1=N ÞXl XlT ¼ ð1=N Þ T
Xl1 xðlÞ
xT ð l Þ
" # ð13:2Þ
T
Xl1 Xl1 Xl1 xðlÞ
¼ ð1=N Þ :
xT ðlÞXl1
T
xT ðlÞxðlÞ
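The construction in (13.1) and (13.2) is straightforward to express in code. The experiments reported later in this chapter were run in MATLAB; the sketch below uses NumPy instead, and the function name `cbcrm` is an editorial choice, not from the text.

```python
import numpy as np

def cbcrm(visited_bands):
    """Causal band correlation matrix R_{l x l} = (1/N) X_l X_l^T of (13.2).

    visited_bands : l x N array; row j holds the N pixels of band B_{j+1},
                    so the columns are the causal sample vectors r_i(l).
    """
    X_l = np.asarray(visited_bands, dtype=float)
    N = X_l.shape[1]
    return X_l @ X_l.T / N
```

Equivalently, $\mathbf{R}_{l\times l}$ is the average of the outer products $\mathbf{r}_i(l)\mathbf{r}_i^T(l)$ over all $N$ pixels, which is the definition used at the start of Sect. 13.2.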

Using a matrix identity in Settle (1996) and Chang (2003), we can derive band-by-band update equations for calculating the inverse of the CBCRM, $\mathbf{R}_{l\times l}$, as follows. For $\mathbf{M} = [\mathbf{U}\ \mathbf{d}]$,

$$\left[\mathbf{M}^T\mathbf{M}\right]^{-1} = \begin{bmatrix}\mathbf{U}^T\mathbf{U} & \mathbf{U}^T\mathbf{d}\\ \mathbf{d}^T\mathbf{U} & \mathbf{d}^T\mathbf{d}\end{bmatrix}^{-1}
= \begin{bmatrix}\left(\mathbf{U}^T\mathbf{U}\right)^{-1} + \beta\,\mathbf{U}^{\#}\mathbf{d}\mathbf{d}^T\left(\mathbf{U}^{\#}\right)^T & -\beta\,\mathbf{U}^{\#}\mathbf{d}\\ -\beta\,\mathbf{d}^T\left(\mathbf{U}^{\#}\right)^T & \beta\end{bmatrix}, \qquad (13.3)$$

where $\mathbf{U}^{\#} = \left(\mathbf{U}^T\mathbf{U}\right)^{-1}\mathbf{U}^T$ and $\beta = \left\{\mathbf{d}^T\left[\mathbf{I} - \mathbf{U}\left(\mathbf{U}^T\mathbf{U}\right)^{-1}\mathbf{U}^T\right]\mathbf{d}\right\}^{-1} = \left[\mathbf{d}^T\mathbf{P}_{\mathbf{U}}^{\perp}\mathbf{d}\right]^{-1}$.
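The identity (13.3) can be checked numerically before being put to use. In the following snippet the random $\mathbf{U}$ and $\mathbf{d}$ are purely illustrative; it assembles the right-hand side of (13.3) so it can be compared against a directly computed inverse of $\mathbf{M}^T\mathbf{M}$ with $\mathbf{M} = [\mathbf{U}\ \mathbf{d}]$.

```python
import numpy as np

rng = np.random.default_rng(7)
U = rng.normal(size=(10, 3))          # tall matrix U with full column rank
d = rng.normal(size=10)               # appended column, so M = [U d]
M = np.column_stack([U, d])

UtU_inv = np.linalg.inv(U.T @ U)
U_sharp = UtU_inv @ U.T               # U^# = (U^T U)^{-1} U^T
Ud = U_sharp @ d                      # U^# d
beta = 1.0 / (d @ (d - U @ Ud))       # beta = [d^T P_U^perp d]^{-1}
right_side = np.block([
    [UtU_inv + beta * np.outer(Ud, Ud), -beta * Ud[:, None]],
    [-beta * Ud[None, :],               np.array([[beta]])],
])
```

Here `right_side` reproduces the block inverse of (13.3); the scalar `beta` is the reciprocal of the Schur complement of $\mathbf{U}^T\mathbf{U}$ in $\mathbf{M}^T\mathbf{M}$.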
  1
Now, we let U ¼ Xl1 T
, d ¼ xðlÞ and obtain X#l1 ¼ Xl1 Xl1 T
Xl1 ,
n h   i o 1  1
1
β ¼ xT ðlÞ I  Xl1 T T
Xl1 Xl1 Xl xðlÞ ¼ xT ðlÞP⊥ T xð l Þ
Xl1
, P⊥ T
Xl1
¼ In 
T
 T
 1
Xl1 Xl1 Xl1 Xl1 , and
" #1
T
Xl1 Xl1 Xl1 xðlÞ
xT ðlÞXl1
T
xT ðlÞxðlÞ
2 1  1 h 1 iT  1 3
X XT þ β Xl1 Xl1
T
Xl1 xðlÞ Xl1 Xl1
T
Xl1 xðlÞ β Xl1 Xl1
T
Xl1 xðlÞ
6 l1 l1 7
¼4 h iT 5:
1
βxT ðlÞ Xl1 Xl1T
Xl1 β

ð13:4Þ

Then the inverse of $\mathbf{R}_{l\times l}$ can be calculated by

$$\mathbf{R}_{l\times l}^{-1} = \left[(1/N)\mathbf{X}_l\mathbf{X}_l^T\right]^{-1}
= N\begin{bmatrix}\mathbf{X}_{l-1}\mathbf{X}_{l-1}^T & \mathbf{X}_{l-1}\mathbf{x}(l)\\ \mathbf{x}^T(l)\mathbf{X}_{l-1}^T & \mathbf{x}^T(l)\mathbf{x}(l)\end{bmatrix}^{-1}
= \begin{bmatrix}
\mathbf{R}_{(l-1)\times(l-1)}^{-1} + (1/N)\dfrac{\left[\mathbf{R}_{(l-1)\times(l-1)}^{-1}\mathbf{X}_{l-1}\mathbf{x}(l)\right]\left[\mathbf{R}_{(l-1)\times(l-1)}^{-1}\mathbf{X}_{l-1}\mathbf{x}(l)\right]^T}{\mathbf{x}^T(l)\mathbf{P}_{\mathbf{X}_{l-1}^T}^{\perp}\mathbf{x}(l)} & -\dfrac{\mathbf{R}_{(l-1)\times(l-1)}^{-1}\mathbf{X}_{l-1}\mathbf{x}(l)}{\mathbf{x}^T(l)\mathbf{P}_{\mathbf{X}_{l-1}^T}^{\perp}\mathbf{x}(l)}\\[1.5ex]
-\dfrac{\left[\mathbf{R}_{(l-1)\times(l-1)}^{-1}\mathbf{X}_{l-1}\mathbf{x}(l)\right]^T}{\mathbf{x}^T(l)\mathbf{P}_{\mathbf{X}_{l-1}^T}^{\perp}\mathbf{x}(l)} & N\left[\mathbf{x}^T(l)\mathbf{P}_{\mathbf{X}_{l-1}^T}^{\perp}\mathbf{x}(l)\right]^{-1}
\end{bmatrix}. \qquad (13.5)$$

According to (13.5), three types of information are used to update $\mathbf{R}_{l\times l}^{-1}$: processed information obtained from past information provided by visited data sample vectors, innovation information, and input information.

1. Past available information and processed information: $\mathbf{X}_{l-1}$ (i.e., $\{\mathbf{r}_i(l-1)\}_{i=1}^{N}$), $\mathbf{R}_{(l-1)\times(l-1)}$, and $\mathbf{R}_{(l-1)\times(l-1)}^{-1}$.
2. Innovation information: $\mathbf{X}_{l-1}\mathbf{x}(l)$ and $\left[\mathbf{x}^T(l)\mathbf{P}_{\mathbf{X}_{l-1}^T}^{\perp}\mathbf{x}(l)\right]^{-1}$.
3. Input information: $\mathbf{x}(l) = (r_{1l}, r_{2l}, \ldots, r_{Nl})^T$.
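A minimal sketch of the update (13.5), written here in NumPy with an editorial function name rather than the authors' MATLAB code, shows that the $l \times l$ inverse is obtained from the three types of information above without ever inverting an $l \times l$ matrix:

```python
import numpy as np

def update_cbcrm_inverse(R_prev_inv, X_prev, x_l, N):
    """Band-by-band update of the inverse CBCRM following (13.5).

    R_prev_inv : inverse of the (l-1) x (l-1) CBCRM R_{(l-1)x(l-1)}
    X_prev     : (l-1) x N matrix of the previously visited bands
    x_l        : length-N vector of the newly received lth band
    """
    Xx = X_prev @ x_l                       # correlation of new band with old ones
    nu = R_prev_inv @ Xx                    # innovation vector R^{-1} X_{l-1} x(l)
    beta = 1.0 / (x_l @ x_l - Xx @ nu / N)  # [x^T(l) P_perp x(l)]^{-1}
    return np.block([
        [R_prev_inv + beta / N * np.outer(nu, nu), -beta * nu[:, None]],
        [-beta * nu[None, :],                      np.array([[N * beta]])],
    ])
```

The scalar denominator uses $\mathbf{x}^T(l)\mathbf{P}^{\perp}\mathbf{x}(l) = \mathbf{x}^T(l)\mathbf{x}(l) - (1/N)\left[\mathbf{X}_{l-1}\mathbf{x}(l)\right]^T\mathbf{R}_{(l-1)\times(l-1)}^{-1}\mathbf{X}_{l-1}\mathbf{x}(l)$, so no $N \times N$ projector is ever formed.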

13.3 Recursive Band Processing of CEM

Using the recursive formula for $\mathbf{R}_{l\times l}^{-1}$ derived in (13.5) in Sect. 13.2, we can further derive the RHBP-CEM subpixel detector, $\delta^{\text{RHBP-CEM}}(\mathbf{r}(l))$, as follows:

$$\begin{aligned}
\delta^{\text{RHBP-CEM}}(\mathbf{r}(l)) &= \kappa_l\,\mathbf{d}^T(l)\mathbf{R}_{l\times l}^{-1}\mathbf{r}(l)\\
&= \kappa_l\,\big(\mathbf{d}^T(l-1),\, d_l\big)
\begin{bmatrix}
\mathbf{R}_{(l-1)\times(l-1)}^{-1} + (1/N)\beta_{l(l-1)}\boldsymbol{\nu}_{l(l-1)}\boldsymbol{\nu}_{l(l-1)}^T & -\beta_{l(l-1)}\boldsymbol{\nu}_{l(l-1)}\\
-\beta_{l(l-1)}\boldsymbol{\nu}_{l(l-1)}^T & N\beta_{l(l-1)}
\end{bmatrix}
\begin{pmatrix}\mathbf{r}(l-1)\\ r_l\end{pmatrix}\\
&= \kappa_l\,\mathbf{d}^T(l-1)\mathbf{R}_{(l-1)\times(l-1)}^{-1}\mathbf{r}(l-1)
+ (1/N)\kappa_l\beta_{l(l-1)}\,\mathbf{d}^T(l-1)\boldsymbol{\nu}_{l(l-1)}\,\boldsymbol{\nu}_{l(l-1)}^T\mathbf{r}(l-1)\\
&\quad - \kappa_l d_l\beta_{l(l-1)}\,\boldsymbol{\nu}_{l(l-1)}^T\mathbf{r}(l-1)
- \kappa_l\beta_{l(l-1)}\,\mathbf{d}^T(l-1)\boldsymbol{\nu}_{l(l-1)}\,r_l
+ N\kappa_l d_l\beta_{l(l-1)}\,r_l\\
&= (\kappa_l/\kappa_{l-1})\,\delta^{\text{RHBP-CEM}}(\mathbf{r}(l-1))
+ (1/N)\kappa_l\beta_{l(l-1)}\big[\mathbf{d}^T(l-1)\boldsymbol{\nu}_{l(l-1)} - N d_l\big]\big[\boldsymbol{\nu}_{l(l-1)}^T\mathbf{r}(l-1) - N r_l\big],
\end{aligned} \qquad (13.6)$$

where $\kappa_l = \big[\mathbf{d}^T(l)\mathbf{R}_{l\times l}^{-1}\mathbf{d}(l)\big]^{-1}$ is a scalar, $\boldsymbol{\nu}_{l(l-1)} = \mathbf{R}_{(l-1)\times(l-1)}^{-1}\mathbf{X}_{l-1}\mathbf{x}(l)$, and $\beta_{l(l-1)} = \big[\mathbf{x}^T(l)\mathbf{P}_{\mathbf{X}_{l-1}^T}^{\perp}\mathbf{x}(l)\big]^{-1}$. In addition,

$$\begin{aligned}
\kappa_l^{-1} &= \mathbf{d}^T(l)\mathbf{R}_{l\times l}^{-1}\mathbf{d}(l)\\
&= \big(\mathbf{d}^T(l-1),\, d_l\big)\,\mathbf{R}_{l\times l}^{-1}\begin{pmatrix}\mathbf{d}(l-1)\\ d_l\end{pmatrix}\\
&= \mathbf{d}^T(l-1)\mathbf{R}_{(l-1)\times(l-1)}^{-1}\mathbf{d}(l-1)
+ (1/N)\beta_{l(l-1)}\,\mathbf{d}^T(l-1)\boldsymbol{\nu}_{l(l-1)}\,\boldsymbol{\nu}_{l(l-1)}^T\mathbf{d}(l-1)\\
&\quad - d_l\beta_{l(l-1)}\,\boldsymbol{\nu}_{l(l-1)}^T\mathbf{d}(l-1)
- \beta_{l(l-1)}\,\mathbf{d}^T(l-1)\boldsymbol{\nu}_{l(l-1)}\,d_l
+ N\beta_{l(l-1)}\,d_l^2\\
&= \kappa_{l-1}^{-1} + (1/N)\beta_{l(l-1)}\big[\mathbf{d}^T(l-1)\boldsymbol{\nu}_{l(l-1)} - N d_l\big]^2,
\end{aligned} \qquad (13.7)$$

where $\kappa_l$ can be calculated by updating $\kappa_{l-1}$. Of course, $\kappa_l$ in (13.6) can also be calculated directly by $\kappa_l = \big[\mathbf{d}^T(l)\mathbf{R}_{l\times l}^{-1}\mathbf{d}(l)\big]^{-1}$ since $\mathbf{d}$ is known a priori.
By taking advantage of (13.5), $\delta^{\text{RHBP-CEM}}(\mathbf{r}(l))$ is easily updated from $\delta^{\text{RHBP-CEM}}(\mathbf{r}(l-1))$ as follows:

$$\delta^{\text{RHBP-CEM}}(\mathbf{r}(l)) = (\kappa_l/\kappa_{l-1})\,\delta^{\text{RHBP-CEM}}(\mathbf{r}(l-1))
+ (1/N)\kappa_l\beta_{l(l-1)}\big[\mathbf{d}^T(l-1)\boldsymbol{\nu}_{l(l-1)} - N d_l\big]\big[\boldsymbol{\nu}_{l(l-1)}^T\mathbf{r}(l-1) - N r_l\big], \qquad (13.8)$$

where $\mathbf{r}(l)$ is an $l$-dimensional data sample vector given by $\mathbf{r}(l) = (r_1, r_2, \ldots, r_l)^T$, $\kappa_l = \big[\mathbf{d}^T(l)\mathbf{R}_{l\times l}^{-1}\mathbf{d}(l)\big]^{-1}$, $\boldsymbol{\nu}_{l(l-1)} = \mathbf{R}_{(l-1)\times(l-1)}^{-1}\mathbf{X}_{l-1}\mathbf{x}(l)$, and $\beta_{l(l-1)} = \big[\mathbf{x}^T(l)\mathbf{P}_{\mathbf{X}_{l-1}^T}^{\perp}\mathbf{x}(l)\big]^{-1}$.
Interestingly, from a statistical signal processing point of view, the CEM specified by (13.6)–(13.8) actually performs as if it were a Kalman filter, with the key difference that only a measurement equation, also known as an output equation, is needed; there is no need for a state equation in CEM. More specifically, the abundance fractional amount detected by CEM using $l$ bands, $\delta^{\text{RHBP-CEM}}(\mathbf{r}(l))$, is initiated by the first-band information and then updated recursively through three types of information: new information from the newly received $l$th band; processed information produced using the first $l-1$ bands that have already been visited, including the abundance fractional amount detected by CEM, $\delta^{\text{RHBP-CEM}}(\mathbf{r}(l-1))$; and innovation information provided by the correlation between the $l$th band and the first $l-1$ bands. Each of these is described in detail as follows.
1. Initial conditions: $\mathbf{d}(1)$, $\mathbf{r}(1)$, and $\mathbf{R}(1) = \mathbf{r}(1)\mathbf{r}^T(1)$.
2. Input information from all data sample vectors: $\{r_{il}\}_{i=1}^{N}$ in the $l$th band $B_l$, used to form $\mathbf{x}(l)$ and $\kappa_l = \big[\mathbf{d}^T(l)\mathbf{R}_{l\times l}^{-1}\mathbf{d}(l)\big]^{-1}$.
3. Available processed information from the first $l-1$ bands: $\mathbf{r}(l-1)$, $\mathbf{R}_{(l-1)\times(l-1)}^{-1}$, $\mathbf{X}_{l-1}$, $\delta^{\text{RHBP-CEM}}(\mathbf{r}(l-1))$, and
$$\mathbf{P}_{\mathbf{X}_{l-1}^T}^{\perp} = \mathbf{I}_N - \mathbf{X}_{l-1}^T\left(\mathbf{X}_{l-1}\mathbf{X}_{l-1}^T\right)^{-1}\mathbf{X}_{l-1} = \mathbf{I}_N - (1/N)\,\mathbf{X}_{l-1}^T\mathbf{R}_{(l-1)\times(l-1)}^{-1}\mathbf{X}_{l-1}.$$
4. Innovation information:
$$\beta_{l(l-1)} = \left[\mathbf{x}^T(l)\mathbf{P}_{\mathbf{X}_{l-1}^T}^{\perp}\mathbf{x}(l)\right]^{-1}, \qquad \boldsymbol{\nu}_{l(l-1)} = \mathbf{R}_{(l-1)\times(l-1)}^{-1}\mathbf{X}_{l-1}\mathbf{x}(l).$$
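The information flow above can be organized as a single band-by-band loop. The sketch below is an editorial NumPy illustration, not the authors' MATLAB implementation; it grows $\mathbf{R}_{l\times l}^{-1}$ by (13.5) and emits the CEM detection map after every band.

```python
import numpy as np

def rhbp_cem(bands, d):
    """RHBP-CEM sketch: CEM detection maps produced band by band.

    bands : L x N array in BSQ order (row l-1 is band B_l over N pixels)
    d     : length-L desired signature; d[:l] is d(l)
    Yields delta^RHBP-CEM(r(l)) for all N pixels after each band, with
    the inverse CBCRM grown recursively via (13.5), never re-inverted.
    """
    L, N = bands.shape
    R_inv = np.array([[N / (bands[0] @ bands[0])]])   # initial condition, l = 1
    for l in range(1, L + 1):
        if l > 1:                                     # recursive update (13.5)
            X_prev, x_l = bands[:l - 1], bands[l - 1]
            Xx = X_prev @ x_l
            nu = R_inv @ Xx                           # innovation nu_{l(l-1)}
            beta = 1.0 / (x_l @ x_l - Xx @ nu / N)    # beta_{l(l-1)}
            R_inv = np.block([
                [R_inv + beta / N * np.outer(nu, nu), -beta * nu[:, None]],
                [-beta * nu[None, :],                 np.array([[N * beta]])],
            ])
        w = R_inv @ d[:l] / (d[:l] @ R_inv @ d[:l])   # CEM weights for l bands
        yield w @ bands[:l]
```

Because CEM constrains $\mathbf{d}^T(l)\mathbf{w} = 1$, a pixel whose first $l$ band values equal $\mathbf{d}(l)$ is always detected with abundance one, at every stage of the loop.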

13.4 Real Image Experiments

Two real hyperspectral image scenes were used in experiments to evaluate the detection performance of RHBP-CEM.

13.4.1 HYDICE Data

The image data to be studied are from the HYperspectral Digital Imagery Collection Experiment (HYDICE) image scene shown in Fig. 13.1a (also shown in Fig. 1.10a), which has a size of 64 × 64 pixel vectors with 15 panels in the scene

Fig. 13.1 (a) HYDICE panel scene containing 15 panels. (b) Ground truth map of spatial
locations of the 15 panels. (c) Spectra of p1, p2, p3, p4, and p5

and the ground truth map in Fig. 13.1b (Fig. 1.10b). Figure 13.1c plots the five panel
spectral signatures obtained from Fig. 13.1b, where the ith panel signature, denoted
by pi, was generated by averaging the red panel center pixels in row i, as shown in
Fig. 13.1c (also shown in Fig. 1.11). These panel signatures will be used to
represent target knowledge of the panels in each row.
The image was acquired with 210 spectral bands with a spectral coverage from 0.4 to 2.5 μm. Low-signal/high-noise bands, bands 1–3 and 202–210, and water vapor absorption bands, bands 101–112 and 137–153, were removed. Thus, a total of
169 bands were used in the experiments. The spatial resolution and spectral
resolution of this image scene are 1.56 m and 10 nm, respectively.
The panel signatures in Fig. 13.1c were used as required prior target knowledge
for the desired target signatures d by CEM. Figure 13.2a–e shows the RHBP-CEM-detected amounts of the 19 R panel pixels, p11, p12, p13, p211, p221, p22, p23, p311, p312, p32, p33, p411, p412, p42, p43, p511, p521, p52, p53, with the desired spectral signature d specified by one of the five panel signatures, p1, p2, p3, p4, p5, in Fig. 13.1c, respectively, as the band number is progressively increased from 1 to 169. The
y-axis is the detected amount and the x-axis is the number of bands used to perform
RHBP-CEM. The displayed detected amounts correspond to $\delta^{\text{RHBP-CEM}}(\mathbf{r}(l))$ for the specified panel pixels. Each value was normalized so that the minimum value is zero and the maximum value is one; thus, the pixel with the strongest response to the algorithm has a value of one. Such $\delta^{\text{RHBP-CEM}}(\mathbf{r}(l))$-generated abundance fractional amounts are referred to as normalized abundance values.
According to Fig. 13.2, it is clear that over 75 % of the 19 R panel pixels located in the first two columns required fewer than 50 bands to reach abundance fractions similar to those detected by full-band CEM, with few changes as additional new bands beyond 50 were used. Of particular interest are panel pixels p411, p412 and p511, p521, where p411 and p412 needed more than 50 bands to reach values consistent with CEM, while p511 and p521 required fewer than 25 bands. To further demonstrate the differential detected amounts between two consecutive band numbers, Fig. 13.3a–e shows the difference in the detected abundance fractions of the 19 R panel pixels with the desired signature d specified by one of the five panel signatures p1, p2, p3, p4, p5 as the band number l is increased from 1 to 169. This can be written mathematically as the difference $\delta^{\text{RHBP-CEM}}(\mathbf{r}(l)) - \delta^{\text{RHBP-CEM}}(\mathbf{r}(l-1))$, where the detected amounts are the newly added detected abundance fractions provided by the $l$th band, $B_l$, minus the detected amounts from the previous $l-1$ bands. This updates the results from the first $l-1$ bands, $\{B_j\}_{j=1}^{l-1}$, to the first $l$ bands, $\{B_j\}_{j=1}^{l}$.
The plots in Fig. 13.3 provide a better picture of dynamic changes in the detected
amounts of the 19 R panel pixels as the band number is progressively processed. In
particular, these plots can also be used to observe fluctuation and saturation in the
detected amounts, which showed that there was no need to use the complete set of
bands to perform CEM. As a matter of fact, this could easily have been done with
less than one-third of the total number of bands. Although panel pixel p212 is also of
interest, its experiment is not included because it is not an R center panel pixel, as
shown in Fig. 13.1b. To provide an even better view of the RHBP-CEM-detected abundance fractional maps in Figs. 13.2 and 13.3, their three-dimensional (3D) progressive plots of band-varying RHBP-CEM detection maps are shown in Figs. 13.4, 13.5, 13.6, 13.7, and 13.8, where the x-, y-, and z-axes are specified by the 19 R panel pixels plus a yellow panel pixel, p212, the number of bands being used to process CEM, and the detected abundance fractions, respectively.
Since we have the ground truth of the 19 R panel pixels, we can further calculate
their detection rates via a receiver operating characteristic (ROC) analysis for a
performance evaluation (Poor 1994), that is, the detection of panel pixels p11, p12,
p13 using p1, the detection of panel pixels p211, p221, p22, p23 using panel signature

Fig. 13.2 Normalized values of 19 R panel pixels in Fig. 13.1b specified by the five panel
signatures p1, p2, p3, p4, p5 by RHBP-CEM

p2, the detection of panel pixels p311, p312, p32, p33 using panel signature p3, the
detection of panel pixels p411, p412, p42, p43 using panel signature p4, and the
detection of panel pixels p511, p521, p52, p53 using panel signature p5. To include
the band number as a parameter to see the progressive detection performance of
RHBP-CEM on these 19 R panel pixels band by band, we do not follow the
traditional approach to plotting an ROC curve of detection probability, PD, versus

Fig. 13.3 Difference in normalized values of the 19 R panel pixels in Fig. 13.1b specified by the
five panel signatures p1, p2, p3, p4, p5 between δRHBP-CEM(r(l )) and δRHBP-CEM(r(l  1))

the false alarm probability, PF. Instead, we choose the common practice used in
medical diagnosis to calculate the area under an ROC curve, referred to as the area
under curve (AUC) in Chang et al. (1998b). When such an AUC is used, an
alternative ROC curve can be plotted in terms of AUC on the y-axis versus band
number on the x-axis. Figure 13.9 plots five ROC curves of (number of bands, area
Fig. 13.4 3D progressive plots of band-varying RHBP-CEM detection maps of 19 R panel pixels
plus p212 specified by the signature, p1

Fig. 13.5 3D progressive plots of band-varying RHBP-CEM detection maps of 19 R panel pixels
plus p212 specified by the signature, p2

under curve (AUC)) for the progressive band-varying performance of RHBP-CEM in the detection of the 19 R panel pixels using the five panel signatures in Fig. 13.1c as desired target signatures, where $\{B_j\}_{j=1}^{l}$ are progressively included in the process of (13.4), with $l$ being increased from 1 to 169.
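The AUC values plotted in Fig. 13.9 can be computed without explicitly tracing each ROC curve, using the equivalent rank-statistic identity AUC = P(target score > background score). The sketch below (function names are editorial) assumes a binary ground-truth mask marking the target pixels:

```python
import numpy as np

def roc_auc(scores, target_mask):
    """Area under the ROC curve via the Mann-Whitney rank statistic."""
    s = np.asarray(scores, dtype=float)
    m = np.asarray(target_mask, dtype=bool)
    pos, neg = s[m], s[~m]
    wins = (pos[:, None] > neg[None, :]).sum()   # target outranks background
    ties = (pos[:, None] == neg[None, :]).sum()  # ties count half
    return (wins + 0.5 * ties) / (pos.size * neg.size)

def auc_vs_bands(maps, target_mask):
    """One AUC value per band-varying detection map delta(r(l)), l = 1..L."""
    return [roc_auc(m, target_mask) for m in maps]
```

Plotting `auc_vs_bands` against the band count l reproduces the alternative ROC curves of Fig. 13.9, with AUC on the y-axis and band number on the x-axis.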
According to Fig. 13.9, at least 50 bands were required to achieve reasonable performance in detecting the panel pixels in row 1, but only 40 bands were required to detect all the other R panel pixels in rows 2–5 with an AUC close to one. Nevertheless,
Fig. 13.6 3D progressive plots of band-varying RHBP-CEM detection maps of 19 R panel pixels
plus p212 specified by the signature, p3

Fig. 13.7 3D progressive plots of band-varying RHBP-CEM detection maps of 19 R panel pixels
plus p212 specified by the signature, p4

Fig. 13.9 demonstrates that, for CEM to be effective, there is no need to use the full set of bands; in fact, fewer than one-third of the bands are sufficient for good CEM performance. In addition, the plots in Fig. 13.9 also provide information on how an individual band affects CEM detection performance. This is particularly interesting in Fig. 13.9a, where the progressive band-varying detected abundance fractions had ups and downs between bands 8 and 50. Such valuable information is not offered by detection techniques that use full bands to process data.
Fig. 13.8 3D progressive plots of band-varying RHBP-CEM detection maps of 19 R panel pixels
plus p212 specified by the signature, p5

13.4.2 Hyperion Data

The data to be studied were collected by the Hyperion sensor mounted on the
Earth Observer 1 (EO-1) satellite. Hyperion uses a high-resolution hyperspectral
imager to record Earth surface images of approximately 7.5 × 100 km with a 30 m spatial resolution and a 10 nm spectral resolution. The image scene
EO1H0150322011201110K3 was downloaded from the USGS Earth Explorer
Web site (http://earthexplorer.usgs.gov/) and is shown in Fig. 1.13. The image
used is a Hyperion L1 data product that includes 198 channels of calibrated
spectral information. This image is reproduced in Fig. 13.10 and was then
cropped to select a region of interest (ROI) covering the western side of the
Chesapeake Bay Bridge. The resulting image cube has a spatial dimension of 64 × 64 pixels and 198 spectral bands ranging from 426 to 2396 nm. The image was then compared with higher-resolution aerial imagery to identify five distinct areas of interest (AOIs): a beach, Westinghouse Bay, Mezick Ponds, farmland, and a large corporate building. Four pixels were selected at random from each of these areas and used to generate an average spectral signature for the material contained in the area. The exception was the farmland, for which seven pixels were used because it comprises four disjoint regions in the image; four pixels were selected from the largest farmland region and a single pixel was chosen from each of the remaining three regions. A color image of the region and the spectral signatures
of each of the five AOIs are shown in Fig. 13.10.
Using the spectral signatures of five AOIs plotted in Fig. 13.10c as the desired
target signatures for CEM, Fig. 13.11a–e shows detected amounts of five AOIs, a
beach, Westinghouse Bay, Mezick Ponds, farmland, and a corporate building, by

Fig. 13.9 ROC curves of (number of bands, area under curve (AUC)) for band-varying progres-
sive detection performance of RHBP-CEM. (a) ROC curve for panels in row 1. (b) ROC curve for
panels in row 2. (c) ROC curve for panels in row 3. (d) ROC curve for panels in row 4. (e) ROC
curve for panels in row 5

RHBP-CEM as the band number is progressively increased from 1 to 198, where the y-axis is the detected amount and the x-axis is the number of bands used to perform
CEM. Each of these images includes the response due to a randomly selected pixel
from within each of the five AOIs. While the pixels were selected randomly, they
were compared to the locations chosen for the desired spectral signatures to match

Fig. 13.10 (a) Cropped Hyperion image scene EO1H0150322011201110K3. (b) Cropped Hyperion image scene EO1H0150322011201110K3 with five areas of interest (AOIs): beach (dark blue), Westinghouse Bay (green), Mezick Ponds (orange), farmland (light blue), and a corporate building (magenta). (c) Average spectral signatures for each of the five AOIs

what they represented. The pixels used to generate the desired spectral signatures
were excluded, with the exception of the corporate building, which only consisted
of the four pixels used to generate the spectral signature. This allows a comparison
of the response for each region relative to each of the five corresponding signatures.
Unlike the HYDICE data, there is no complete ground truth about specific targets
other than the AOIs selected. In addition, the purity of the desired signatures d used was degraded by the 30 m spatial resolution, so the CEM-detected abundance fractions fluctuated. Nevertheless, analogous to the HYDICE experiments, the plots in
Fig. 13.11 also indicated that there was no need to use the complete set of bands,
and in most cases less than 50 bands were required to perform CEM.

Fig. 13.11 Normalized values of five AOIs: beach, Westinghouse Bay, Mezick Ponds, farmland,
corporate building by RHBP-CEM with desired material signatures. (a) d ¼ beach signature, (b)
d ¼ Westinghouse Bay signature, (c) d ¼ Mezick Ponds signature, (d) d ¼ farmland signature, and
(e) d ¼ corporate building signature

Fig. 13.12 Result of RHBP-CEM using Westinghouse Bay signature after receiving: (a) 1 band,
(b) 2 bands, (c) 10 bands, (d) 20 bands, (e) 30 bands, (f) 40 bands, (g) 60 bands, and (h) all bands

Since the plots in Fig. 13.11 do not illustrate how detection is performed by CEM in a progressive manner, Fig. 13.12a–g visually depicts the progression in the detected amounts of abundance for the entire image using the Westinghouse Bay spectral signature as the desired signature d as the number of bands processed increased. Similar results are also obtained for each spectral signature, with the
exception of the Mezick Ponds signature, which resulted in poor discrimination;
however, they are not included here. It is worth noting that plots similar to Fig. 13.9
were not applicable owing to a lack of specific ground truth in the AOIs.
Interestingly, as shown in Fig. 13.12, after 20 bands, CEM was able to outline
the bay area in Fig. 13.12d and to improve the detection results up to 40 bands in
Fig. 13.12e–g, and then its silhouette began to gradually deteriorate until full bands
were used (Fig. 13.12h). This experiment further demonstrates that progressive
band processing of CEM offers unique advantages regarding how a detected target
varies its detected abundance in a progressive manner. Since no ground truth is
provided by this Hyperion data, we are not able to perform ROC analysis for
quantitative study.
The computational time of RHBP-CEM was compared to that of CEM. Each algorithm was implemented in MATLAB R2012a and executed on an Intel Core i7-3770 running at 3.4 GHz with 16 GB RAM; each was run on the HYDICE image five times to produce an average computing time. While CEM required 34.3 ms to complete its processing, RHBP-CEM required 106.9 ms to process the first band and 380.2 ms to process all bands. This is mainly due to the fact that CEM calculates the global sample correlation matrix R only once prior to its processing; once that is done, it does not have to do it again. By contrast, RHBP-CEM must update the band-varying correlation matrix via the recursive equation specified by (13.4). Of course, RHBP-CEM
requires more time for data processing than CEM. However, this disadvantage is offset by several benefits. First, detection performance can be observed progressively as bands are processed. Second, with such progressive performance we can learn the impact of each band on CEM performance. Third, the recursive equation (13.4) lends itself to hardware implementation. Finally, RHBP-CEM can begin as soon as the first band has been received, as opposed to CEM, which must wait for the complete data before it can calculate the matrix R.
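A rough version of this timing comparison can be reproduced on synthetic data with the harness below. It is an editorial NumPy sketch, not the authors' MATLAB code, and absolute times are hardware-specific, so only the relative cost is meaningful.

```python
import time
import numpy as np

def time_cem_vs_rhbp(bands, d, repeats=5):
    """Average wall-clock times of one-shot CEM vs. band-recursive CEM."""
    L, N = bands.shape
    t_cem = t_rhbp = 0.0
    for _ in range(repeats):
        t0 = time.perf_counter()                 # one-shot CEM: invert R once
        R = bands @ bands.T / N
        w = np.linalg.solve(R, d)
        _ = (w / (d @ w)) @ bands
        t_cem += time.perf_counter() - t0

        t0 = time.perf_counter()                 # recursive band processing
        R_inv = np.array([[N / (bands[0] @ bands[0])]])
        for l in range(2, L + 1):
            X_prev, x_l = bands[:l - 1], bands[l - 1]
            Xx = X_prev @ x_l
            nu = R_inv @ Xx
            beta = 1.0 / (x_l @ x_l - Xx @ nu / N)
            R_inv = np.block([
                [R_inv + beta / N * np.outer(nu, nu), -beta * nu[:, None]],
                [-beta * nu[None, :],                 np.array([[N * beta]])],
            ])
            # detection map emitted after every band, as RHBP-CEM requires
            _ = (R_inv @ d[:l] / (d[:l] @ R_inv @ d[:l])) @ bands[:l]
        t_rhbp += time.perf_counter() - t0
    return t_cem / repeats, t_rhbp / repeats
```

As in the experiment above, the recursive version does more total work because it produces a map after every band, which is the price of progressive output.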

13.5 Graphical User Interface Design

A graphical user interface (GUI), a screenshot of which is shown in Fig. 13.13, was developed using MATLAB's GUIDE to aid in algorithm performance analysis.
The GUI allows the user to load different data sources for analysis. At the bottom
of the GUI are three image display windows. The images displayed from left to
right are a color image of the scene, a grayscale image of the last band processed,
and the current abundance image. Once the data are loaded, the user is given an
option to select a pixel from the image to be the target material d. Once the desired
pixel is selected, the user can start processing by clicking the start button. The
program then simulates data transmission using the MATLAB timer function. At
each timer tick, one complete band of data is considered to be received, and RHBP-CEM processes the newly received data.

Fig. 13.13 Snapshot of the developed GUI software for RHBP-CEM

Upon completion of processing, the
abundance image and the most recently processed band images are updated to
allow the user to observe the results. A textbox indicating the current band is
displayed at the top center so that the user can easily tell how many bands have
been processed. The cycle repeats until all bands have been processed or the user
clicks on the stop button. If the processing is stopped, it may be continued from the
current band or reset to begin processing at the first band. The user may adjust the
simulated transmission rate by adjusting a slide bar in the upper left-hand corner of
the GUI. The range of data rates varies from 66 kbps to 66 Mbps and assumes the
data consist of 16 bit numbers. The user is also given an opportunity to observe the
spectral signatures of both the desired material and any pixel in the abundance
image. The two spectral signatures are located above their corresponding images.
Their corresponding pixels are highlighted by a red circle for the desired material
d and green for the observed pixel r. In the case of the desired material spectral
signature, the complete signature is displayed. In the case of the observed pixel
spectral signature, only the bands that have been received and processed are
displayed. Finally, the normalized value for the observed pixel is displayed in the
center of the GUI.
According to Fig. 13.13, the developed GUI performs RHBP-CEM whenever
new bands are added. The processing can be carried out band by band in real time.
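The GUI's timer-driven loop can be emulated without any graphics. The sketch below is an editorial one; the function name, callback interface, and the wait cap are assumptions. It "receives" one band per tick at a simulated link rate and hands it to a processing callback such as an RHBP-CEM update followed by a display refresh.

```python
import time
import numpy as np

def stream_bands(cube, on_band, rate_bps=66_000, bits=16, max_wait=0.001):
    """Simulate BSQ band-by-band transmission, as the GUI's timer loop does.

    cube    : L x N array; one band is 'received' per timer tick
    on_band : callback(band_index, band_pixels), e.g., an RHBP-CEM update
    A band of N samples at `bits` bits each takes N*bits/rate_bps seconds
    to arrive; the wait is capped at max_wait so the sketch runs quickly.
    """
    N = cube.shape[1]
    dwell = N * bits / rate_bps
    for l, band in enumerate(cube):
        time.sleep(min(dwell, max_wait))  # stand-in for the MATLAB timer tick
        on_band(l, band)

received = []
stream_bands(np.arange(12.0).reshape(3, 4), lambda l, b: received.append(l))
```

Raising `rate_bps` toward the 66 Mbps end of the GUI's slider shortens the dwell per band, which is exactly how the slide bar alters the simulated transmission rate.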

13.6 Conclusions

This chapter presented a new look at CEM, called recursive hyperspectral band
processing of CEM (RHBP-CEM), which makes use of the causal band correlation
matrix (CBCRM) to adapt CEM to different bands. Since the CBCRM varies with bands, a recursive innovation information update equation was derived specifically to avoid repeatedly recalculating the CBCRM; only the information provided by a new incoming band needs to be processed to update the results. As a result, RHBP-CEM can be performed as if it were a real-time processing algorithm with respect to data transmission and communication when ground stations receive data sequentially band by band. In this case, RHBP-CEM can be performed instantly as the information in new bands is received. To further extend its practical value in applications, a GUI software program was also developed to allow users to process CEM band by band progressively in real time according to a BSQ format.
Chapter 14
Recursive Hyperspectral Band Processing for
Passive Target Detection: Anomaly Detection

Abstract Anomaly detection (AD) is studied extensively in Chaps. 5 and 14–18 in Chang (Real time progressive hyperspectral image processing: endmember finding
and anomaly detection, New York, 2016), where the main focus of AD is on the
design and development of AD algorithms for causal processing, which is a
prerequisite for real-time processing. Chapter 6 in this book makes use of causality
to further develop various real-time processing versions of AD so that AD can be
carried out according to the band-interleaved-pixel/sample (BIP/BIS) data acquisition format sample by sample recursively in real time. This chapter follows
Chap. 13 to look into the causality required for AD to be implemented in band
processing from a band-sequential (BSQ) format perspective rather than sample
processing, as described in Chap. 6 from a BIP/BIS data acquisition format perspective. Since anomalies are generally unknown and cannot be inspected by prior
knowledge, their presence can only be detected by an unsupervised means. Also,
because different anomalies respond to certain specific bands in terms of their own
unique spectral characteristics, finding anomalies via band processing becomes a
necessity. In particular, there may be anomalies that can be detected only in a small
range of spectral wavelengths but may be overwhelmed by the entire wavelength
coverage. In this case, detecting these anomalies using full band information may
be ineffective. The progressive profiles of band-by-band detection maps offer
great value to image analysts for finding such anomalous targets. To address this
issue, progressive hyperspectral band processing of anomaly detection (PHBP-
AD), recently developed by Chang et al. (Progressive band processing of anomaly
detection. IEEE Journal of Selected Topics in Applied Earth Observations and
Remote Sensing 8(7): 3558–3571, 2015c), allows AD to be performed
progressively band by band so as to provide progressive detection maps of anomalies band
by band, a task that cannot be accomplished by any AD reported in the literature
using full band information. Similar to the causal sample correlation matrix
(CSCRM) introduced in Chaps. 5 and 6, we also introduce a new concept, causal
band correlation matrix (CBCRM), to replace the global sample correlation matrix
R. Like CSCRM, CBCRM is a correlation matrix formed by only those bands that
had already been visited up to the band currently being processed but not bands yet
to be visited. In this case, CBCRM must be updated repeatedly as new bands come
in. To address this issue, PHBP-AD is further extended to recursive hyperspectral
band processing of anomaly detection (RHBP-AD) in a manner similar to how
Kalman filtering operates where results can be updated by recursive equations that

© Springer International Publishing Switzerland 2017
C.-I Chang, Real-Time Recursive Hyperspectral Sample and Band Processing,
DOI 10.1007/978-3-319-45171-8_14

contain only the innovation information provided by new bands but not the
already processed information. Consequently, RHBP-AD not only can be carried
out band by band progressively without waiting for all bands to be completed, as
can PHBP-AD, but it can also process data recursively to significantly reduce
computational complexity and computer process time. This great advantage allows
AD to be implemented in real time in the sense of progressive as well as recursive
band processing, with data processing taking place at the same time the data are
being collected. The progressive and recursive capability of RHBP-AD makes AD
feasible for use in future satellite data communication and transmission where the
data can be processed and downlinked from satellites band by band simultaneously.

14.1 Introduction

Recursive hyperspectral band processing of anomaly detection (RHBP-AD) is a


new approach developed on the basis of bandwise processing that is quite different
from conventional AD using full band information. It emerged from attempts to
address several crucial issues encountered in AD, some of which have already been
addressed in Chang (2016). A key issue in AD is how to determine whether a target
is an anomaly. Several scenarios allow for the possibility of anomalies in data. One
is the presence of strong interferers. Another is the presence of weak anomalies,
which can be dominated or overwhelmed by strong anomalies and thus go undetected. A third
one is where anomalies are obscured or overshadowed by the background and go
undetected unless the background has been suppressed. A fourth scenario involves
moving anomalies, which may appear within a short time frame and vanish quickly.
Most importantly, since an anomaly detector, such as the widely used RX detector
(Reed and Yu 1990) referred to as K-AD or an anomaly detector using the global
sample correlation matrix R developed by Chang and Chiang (2002), referred to as
R-AD, is generally a real-valued filter, it requires an appropriate threshold to segment
out anomalies from the background. However, if prior knowledge is lacking, coming
up with an optimal threshold selection method will be very challenging.
To deal with the aforementioned issues, one approach, called real-time AD
(RT-AD), was investigated in Chap. 6, where AD is carried out in real time. This
chapter first looks at an approach called progressive hyperspectral band processing
of AD (PHBP-AD), recently developed by Chang et al. (2015c), according to the
band-sequential (BSQ) format in a progressive manner band by band. It is different
from RT-AD (Chap. 6). Its idea is similar to but different from that of progressive
spectral band dimensionality (PSBD), developed in Chang et al. (2011d) and Chang
(2013). Unlike commonly used anomaly detectors, which make use of full band
information, PHBP-AD can process data while data are still being collected without
waiting for the data to be completely acquired. Since anomalies are generally
unknown and unexpected, they cannot be visualized or inspected by prior
knowledge. Accordingly, their detection must be carried out by some unsupervised

means. Without a guideline to select a good threshold, anomalies can be better


detected by visual inspection. PHBP-AD seems to provide a solution to this
dilemma through the many advantages gained by implementing it. The most
unique feature of PHBP-AD is the progressive changes in
detection maps as new bands are added and included in AD. Such a changing profile
of progressive detection maps can yield several benefits. It allows users to see
progressive changes in background suppression, as discussed in Sects. 5.4 and 5.5
(Chang 2016) and Chap. 16 (Chang 2016), so that weak anomalies can be detected
before they are overwhelmed by subsequent strong anomalies. This is mainly due to
the fact that some anomalies may be very sensitive to certain ranges of
wavelengths, which may be dominated by the entire wavelength coverage. In this case,
such anomalies can be detected only when these spectral bands are processed.
PHBP-AD can be used to identify these spectral bands. In addition, it can also be
used to detect anomalies in real time, like RT-AD discussed in Chap. 6. In some
cases, moving targets may show up during data acquisition but disappear before
all bands are collected.
However, an issue arises in the implementation of PHBP-AD. Because PHBP-
AD is processed band by band, the results must also be updated by including the
information provided by these new bands. To avoid repeatedly processing data
sample vectors that have been visited, PHBP-AD is further extended to RHBP-AD
in a manner similar to how a Kalman filter is derived. In other words, RHBP-AD
decomposes data information into three pieces: processed information, new
information, and innovation information, and only the innovation information is
used to update the results of RHBP-AD as a new band comes in. To
further reduce computational complexity and computer processing time, RHBP-AD
derives recursive equations to update innovation information without reprocessing
data sample vectors that have already been visited and processed. The real-time
processing capability and recursive structure of RHBP-AD make it very
attractive in data communication and transmission. As satellite communication
becomes inevitable in the future of hyperspectral imaging sensors operated in
space, RHBP-AD can be used for effective data downlink and transmission,
where data processing can be carried out progressively as well as recursively
band by band.

14.2 Causal Bandwise R-AD

The causal band correlation matrix (CBCRM) introduced in Sect. 13.2 and defined by $\mathbf{R}_{ll} = (1/N)\sum_{i=1}^{N}\mathbf{r}_i(l)[\mathbf{r}_i(l)]^{\mathrm{T}}$ in (13.1) is formed by the image pixel vectors $\{\mathbf{r}_i(l)\}_{i=1}^{N}$ using image pixels in all previously visited band images, $\Omega_{l-1} = \{B_i\}_{i=1}^{l-1}$, up to the current $l$th band image, $B_l$. Note that each band image $B_j$ has a total of $N$ pixels, denoted by $\{r_{ij}\}_{i=1}^{N}$, where $r_{ij}$ is the $i$th pixel in the $j$th band image. According to (13.5), the inverse of $\mathbf{R}_{ll}$ can be calculated by

$$
\begin{aligned}
\mathbf{R}_{ll}^{-1} &= \left[(1/N)\mathbf{X}_l\mathbf{X}_l^{\mathrm{T}}\right]^{-1}
= N\begin{bmatrix}
\mathbf{X}_{l-1}\mathbf{X}_{l-1}^{\mathrm{T}} & \mathbf{X}_{l-1}\mathbf{x}(l)\\
\mathbf{x}^{\mathrm{T}}(l)\mathbf{X}_{l-1}^{\mathrm{T}} & \mathbf{x}^{\mathrm{T}}(l)\mathbf{x}(l)
\end{bmatrix}^{-1}\\[4pt]
&= \begin{bmatrix}
\mathbf{R}_{(l-1)(l-1)}^{-1}
+ (1/N)\,\beta_{l|(l-1)}
\left[\mathbf{R}_{(l-1)(l-1)}^{-1}\mathbf{X}_{l-1}\mathbf{x}(l)\right]
\left[\mathbf{R}_{(l-1)(l-1)}^{-1}\mathbf{X}_{l-1}\mathbf{x}(l)\right]^{\mathrm{T}}
& -\beta_{l|(l-1)}\,\mathbf{R}_{(l-1)(l-1)}^{-1}\mathbf{X}_{l-1}\mathbf{x}(l)\\[4pt]
-\beta_{l|(l-1)}\left[\mathbf{R}_{(l-1)(l-1)}^{-1}\mathbf{X}_{l-1}\mathbf{x}(l)\right]^{\mathrm{T}}
& N\beta_{l|(l-1)}
\end{bmatrix},
\end{aligned}
\tag{14.1}
$$

where $\beta_{l|(l-1)} = \left\{\mathbf{x}^{\mathrm{T}}(l)P^{\perp}_{\mathbf{X}_{l-1}^{\mathrm{T}}}\mathbf{x}(l)\right\}^{-1}$ and $\mathbf{X}_l$ is the $l \times N$ matrix whose rows are the $l$ visited band images.

According to (14.1), three types of information, processed information derived from previously visited data sample vectors, innovation information, and input information, are used to update $\mathbf{R}_{ll}^{-1}$ in (14.1):

1. Past available and processed information: $\mathbf{X}_{l-1}$, i.e., $\{\mathbf{r}_i(l-1)\}_{i=1}^{N}$, $\mathbf{R}_{(l-1)(l-1)}$, $\mathbf{R}_{(l-1)(l-1)}^{-1}$;
2. Innovation information: $\mathbf{X}_{l-1}\mathbf{x}(l)$, $\left\{\mathbf{x}^{\mathrm{T}}(l)P^{\perp}_{\mathbf{X}_{l-1}^{\mathrm{T}}}\mathbf{x}(l)\right\}^{-1}$;
3. Input information: $\mathbf{x}(l) = (r_{1l}, r_{2l}, \ldots, r_{Nl})^{\mathrm{T}}$.
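The bookkeeping above can be made concrete with a short sketch. The following NumPy reconstruction is our own illustration, not code from the book; the function and variable names are hypothetical. It grows $\mathbf{R}_{ll}^{-1}$ by one band via the block inverse in (14.1), using only the processed, innovation, and input information listed above:

```python
import numpy as np

def update_inverse_correlation(R_prev_inv, X_prev, x_new, N):
    """Grow R^{-1} from (l-1) x (l-1) to l x l via the block inverse (14.1).

    R_prev_inv : inverse of the causal band correlation matrix R_(l-1)(l-1)
    X_prev     : (l-1, N) matrix whose rows are the visited band images
    x_new      : (N,) newly received l-th band image x(l)
    """
    Xx = X_prev @ x_new                              # X_{l-1} x(l)
    # x^T P_perp x computed without forming the N x N projector:
    # x.x - (1/N)(Xx)^T R^{-1} (Xx), since (X X^T)^{-1} = (1/N) R^{-1}
    beta = 1.0 / (x_new @ x_new - (Xx @ R_prev_inv @ Xx) / N)
    u = R_prev_inv @ Xx                              # R^{-1} X_{l-1} x(l)
    l = X_prev.shape[0] + 1
    R_inv = np.empty((l, l))
    R_inv[:-1, :-1] = R_prev_inv + (beta / N) * np.outer(u, u)
    R_inv[:-1, -1] = -beta * u
    R_inv[-1, :-1] = -beta * u
    R_inv[-1, -1] = N * beta
    return R_inv
```

Starting from the one-band inverse, repeated calls reproduce $\mathbf{R}_{ll}^{-1}$ without ever re-inverting an $l \times l$ matrix from scratch.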
Using (14.1), a causal band-based R-AD (CBR-AD), $\delta^{\text{CBR-AD}}(\mathbf{r}(l))$, can be derived as follows:

$$
\begin{aligned}
\delta^{\text{CBR-AD}}(\mathbf{r}(l)) &= \mathbf{r}^{\mathrm{T}}(l)\,\mathbf{R}_{ll}^{-1}\,\mathbf{r}(l)
= \left(\mathbf{r}^{\mathrm{T}}(l-1),\, r_l\right)\mathbf{R}_{ll}^{-1}
\begin{pmatrix}\mathbf{r}(l-1)\\ r_l\end{pmatrix}\\[4pt]
&= \mathbf{r}^{\mathrm{T}}(l-1)\mathbf{R}_{(l-1)(l-1)}^{-1}\mathbf{r}(l-1)
+ (1/N)\,\beta_{l|(l-1)}\,\eta_{l|(l-1)}^{2}
- 2\,\beta_{l|(l-1)}\,\eta_{l|(l-1)}\, r_l
+ N\beta_{l|(l-1)}\, r_l^{2},
\end{aligned}
\tag{14.2}
$$

with $\mathbf{R}_{ll}^{-1}$ given by (14.1), where $\eta_{l|(l-1)} = \mathbf{r}^{\mathrm{T}}(l-1)\mathbf{R}_{(l-1)(l-1)}^{-1}\mathbf{X}_{l-1}\mathbf{x}(l)$ and $\beta_{l|(l-1)} = \left\{\mathbf{x}^{\mathrm{T}}(l)P^{\perp}_{\mathbf{X}_{l-1}^{\mathrm{T}}}\mathbf{x}(l)\right\}^{-1}$.

14.3 Recursive Hyperspectral Band Processing of Anomaly Detection

As noted in (14.2), CBR-AD, $\delta^{\text{CBR-AD}}(\mathbf{r}(l))$, operating on the $l$th band can be updated recursively via $\eta_{l|(l-1)}$, $\beta_{l|(l-1)}$, and $\mathbf{R}_{(l-1)(l-1)}^{-1}$, but not via the previous result, $\delta^{\text{CBR-AD}}(\mathbf{r}(l-1))$, obtained from the previous $(l-1)$st band.

14.3.1 RHBP-R-AD

In fact, we can further show that $\delta^{\text{CBR-AD}}(\mathbf{r}(l))$ can be easily updated from the previous anomaly detector $\delta^{\text{CBR-AD}}(\mathbf{r}(l-1))$, obtained using the first $l-1$ bands and already available, together with the new band information $r_l$ and two pieces of innovation information, $\beta_{l|(l-1)}$ and $\eta_{l|(l-1)}$, as follows. The resulting R-AD is called recursive hyperspectral band processing of R-AD (RHBP-R-AD):

$$
\begin{aligned}
\delta^{\text{RHBP-R-AD}}(\mathbf{r}(l))
&= \delta^{\text{RHBP-R-AD}}(\mathbf{r}(l-1)) + (1/N)\,\beta_{l|(l-1)}\,\eta_{l|(l-1)}^{2}
- 2\,\beta_{l|(l-1)}\,\eta_{l|(l-1)}\, r_l + N\beta_{l|(l-1)}\, r_l^{2}\\[2pt]
&= \delta^{\text{RHBP-R-AD}}(\mathbf{r}(l-1)) + (1/N)\,\beta_{l|(l-1)}\left(N r_l - \eta_{l|(l-1)}\right)^{2}.
\end{aligned}
\tag{14.3}
$$

More specifically, $\delta^{\text{RHBP-R-AD}}(\mathbf{r}(l))$ can be interpreted as a Kalman filter (Poor 1994) updating $\delta^{\text{RHBP-R-AD}}(\mathbf{r}(l-1))$ via (14.3), where $\mathbf{r}(l)$ is an $l$-dimensional data sample vector given by $\mathbf{r}(l) = (r_1, r_2, \ldots, r_l)^{\mathrm{T}}$, $\eta_{l|(l-1)} = \mathbf{r}^{\mathrm{T}}(l-1)\mathbf{R}_{(l-1)(l-1)}^{-1}\mathbf{X}_{l-1}\mathbf{x}(l)$, and $\beta_{l|(l-1)} = \left\{\mathbf{x}^{\mathrm{T}}(l)P^{\perp}_{\mathbf{X}_{l-1}^{\mathrm{T}}}\mathbf{x}(l)\right\}^{-1}$.
With regard to computing $\mathbf{R}_{ll}^{-1}$ via (14.1), the recursive structure derived in (14.2) has many significant advantages over (14.1) in terms of how information is used effectively, as summarized in the following. In particular, no innovation information is derived in (14.1). Theoretically speaking, (14.2) can also be implemented in real time, but its computational complexity may come at the cost of its real-time processing capability. In fact, (14.3) is a much better alternative to $\delta^{\text{CBR-AD}}(\mathbf{r}(l))$ in (14.2) because it makes AD more effective and efficient in real-time implementation by updating only $\delta^{\text{RHBP-R-AD}}(\mathbf{r}(l-1))$, along with the innovation information:

1. Initial conditions: $\mathbf{r}(1)$, $\mathbf{R}_{(1)(1)} = (1/N)\mathbf{r}(1)\mathbf{r}^{\mathrm{T}}(1)$;
2. Newly received band information: input information from the $l$th band image of all data sample vectors, $\{r_{il}\}_{i=1}^{N}$, used to form $\mathbf{x}(l)$;
3. Processed information produced by the previously visited $l-1$ band images, $\mathbf{r}(l-1)$ and $\mathbf{X}_{l-1}$:
$$\mathbf{R}_{(l-1)(l-1)}^{-1};\qquad \delta^{\text{RHBP-R-AD}}(\mathbf{r}(l-1));$$
$$P^{\perp}_{\mathbf{X}_{l-1}^{\mathrm{T}}} = \mathbf{I}_{N} - \mathbf{X}_{l-1}^{\mathrm{T}}\left(\mathbf{X}_{l-1}\mathbf{X}_{l-1}^{\mathrm{T}}\right)^{-1}\mathbf{X}_{l-1}
= \mathbf{I}_{N} - (1/N)\,\mathbf{X}_{l-1}^{\mathrm{T}}\mathbf{R}_{(l-1)(l-1)}^{-1}\mathbf{X}_{l-1};$$
4. Innovation information: correlation provided by the $l$th band and the previous $l-1$ bands:
$$\beta_{l|(l-1)} = \left\{\mathbf{x}^{\mathrm{T}}(l)P^{\perp}_{\mathbf{X}_{l-1}^{\mathrm{T}}}\mathbf{x}(l)\right\}^{-1};\qquad
\eta_{l|(l-1)} = \mathbf{r}^{\mathrm{T}}(l-1)\mathbf{R}_{(l-1)(l-1)}^{-1}\mathbf{X}_{l-1}\mathbf{x}(l).$$

Equation (14.3) is very easily implemented in real time by updating only $\delta^{\text{RHBP-R-AD}}(\mathbf{r}(l-1))$ with the innovation information specified by the second term in (14.3), i.e., through $\beta_{l|(l-1)}$ and $\eta_{l|(l-1)}$.
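As a concrete illustration of this single-band update, the following NumPy sketch (our own reconstruction under hypothetical names, not the authors' implementation) applies (14.3) to one pixel when the $l$th band arrives:

```python
import numpy as np

def rhbp_r_ad_step(delta_prev, r_prev, r_l, R_prev_inv, X_prev, x_new, N):
    """One band update of RHBP-R-AD via (14.3):
    delta_l = delta_{l-1} + (beta/N) * (N*r_l - eta)^2.

    delta_prev : detector value for this pixel after l-1 bands
    r_prev     : (l-1,) the pixel's spectrum over the first l-1 bands
    r_l        : the pixel's value in the new l-th band
    """
    Xx = X_prev @ x_new                           # X_{l-1} x(l)
    beta = 1.0 / (x_new @ x_new - (Xx @ R_prev_inv @ Xx) / N)
    eta = r_prev @ (R_prev_inv @ Xx)              # eta_{l|(l-1)}
    return delta_prev + (beta / N) * (N * r_l - eta) ** 2
```

Only the scalar innovation terms change per band, so the per-pixel cost of the update is dominated by the two matrix–vector products.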

14.3.2 RHBP-K-AD

Despite the fact that R-AD is derived from K-AD defined in (6.1), finding the inverse of a causal sample covariance matrix for K-AD is more complicated than that of a causal correlation matrix, since no sample mean is involved in R-AD. However, once $\mathbf{R}$ is calculated, $\mathbf{K}$ can be obtained from $\mathbf{K} = \mathbf{R} - \boldsymbol{\mu}\boldsymbol{\mu}^{\mathrm{T}}$.
Unlike CBR-AD, which uses the autocorrelation matrix without calculating the sample mean, a real-time CBK-AD requires the causal sample mean prior to calculating the causal covariance matrix. Assume that $\boldsymbol{\mu}(l) = (\mu_1, \ldots, \mu_l)^{\mathrm{T}}$, with $\mu_j = (1/N)\sum_{i=1}^{N} r_{ij}$ for $1 \le j \le l$, and $\mathbf{K}_{ll} = (1/N)\sum_{i=1}^{N}\left(\mathbf{r}_i(l) - \boldsymbol{\mu}(l)\right)\left(\mathbf{r}_i(l) - \boldsymbol{\mu}(l)\right)^{\mathrm{T}}$. Then $\mathbf{K}_{ll} = \mathbf{R}_{ll} - \boldsymbol{\mu}(l)\boldsymbol{\mu}^{\mathrm{T}}(l)$, $\boldsymbol{\mu}(l-1) = (1/N)\sum_{i=1}^{N}\mathbf{r}_i(l-1)$, and

$$
\boldsymbol{\mu}(l)\boldsymbol{\mu}^{\mathrm{T}}(l)
= \begin{pmatrix}\boldsymbol{\mu}(l-1)\\ \mu_l\end{pmatrix}
\left(\boldsymbol{\mu}^{\mathrm{T}}(l-1),\, \mu_l\right)
= \begin{bmatrix}
\boldsymbol{\mu}(l-1)\boldsymbol{\mu}^{\mathrm{T}}(l-1) & \boldsymbol{\mu}(l-1)\mu_l\\
\mu_l\boldsymbol{\mu}^{\mathrm{T}}(l-1) & \mu_l^{2}
\end{bmatrix}.
$$

Using the Woodbury matrix identity in Appendix A,

$$
\left(\mathbf{A} + \mathbf{u}\mathbf{v}^{\mathrm{T}}\right)^{-1}
= \mathbf{A}^{-1} - \frac{\left(\mathbf{A}^{-1}\mathbf{u}\right)\left(\mathbf{v}^{\mathrm{T}}\mathbf{A}^{-1}\right)}{1 + \mathbf{v}^{\mathrm{T}}\mathbf{A}^{-1}\mathbf{u}},
\tag{14.4}
$$

$\mathbf{K}_{ll}^{-1}$ can be calculated through $\mathbf{R}_{ll}^{-1}$ by letting $\mathbf{A} = \mathbf{R}_{ll}$, $\mathbf{u} = -\boldsymbol{\mu}(l)$, and $\mathbf{v} = \boldsymbol{\mu}(l)$ in (14.4) as follows:

$$
\mathbf{K}_{ll}^{-1} = \left[\mathbf{R}_{ll} - \boldsymbol{\mu}(l)\boldsymbol{\mu}^{\mathrm{T}}(l)\right]^{-1}
= \mathbf{R}_{ll}^{-1} + \frac{\left[\mathbf{R}_{ll}^{-1}\boldsymbol{\mu}(l)\right]\left[\boldsymbol{\mu}^{\mathrm{T}}(l)\mathbf{R}_{ll}^{-1}\right]}{1 - \boldsymbol{\mu}^{\mathrm{T}}(l)\mathbf{R}_{ll}^{-1}\boldsymbol{\mu}(l)}.
\tag{14.5}
$$

Then $\mathbf{R}_{ll}^{-1}\boldsymbol{\mu}(l)$ and $\boldsymbol{\mu}^{\mathrm{T}}(l)\mathbf{R}_{ll}^{-1}\boldsymbol{\mu}(l)$ in (14.5) can be further calculated as

$$
\Phi_{ll} = \left[\mathbf{R}_{ll}^{-1}\boldsymbol{\mu}(l)\right]\left[\boldsymbol{\mu}^{\mathrm{T}}(l)\mathbf{R}_{ll}^{-1}\right]
= \mathbf{R}_{ll}^{-1}
\begin{bmatrix}
\boldsymbol{\mu}(l-1)\boldsymbol{\mu}^{\mathrm{T}}(l-1) & \boldsymbol{\mu}(l-1)\mu_l\\
\mu_l\boldsymbol{\mu}^{\mathrm{T}}(l-1) & \mu_l^{2}
\end{bmatrix}
\mathbf{R}_{ll}^{-1},
\tag{14.6}
$$

with

$$
\mathbf{R}_{ll}^{-1}\boldsymbol{\mu}(l)
= \begin{bmatrix}\mathbf{A}_{(l-1)(l-1)} & \mathbf{b}\\ \mathbf{b}^{\mathrm{T}} & d\end{bmatrix}
\begin{pmatrix}\boldsymbol{\mu}(l-1)\\ \mu_l\end{pmatrix}
= \begin{pmatrix}\mathbf{A}_{(l-1)(l-1)}\boldsymbol{\mu}(l-1) + \mathbf{b}\mu_l\\
\mathbf{b}^{\mathrm{T}}\boldsymbol{\mu}(l-1) + d\mu_l\end{pmatrix}
\tag{14.7}
$$

and

$$
\begin{aligned}
\rho_{l|(l-1)} &= \boldsymbol{\mu}^{\mathrm{T}}(l)\mathbf{R}_{ll}^{-1}\boldsymbol{\mu}(l)
= \begin{pmatrix}\boldsymbol{\mu}(l-1)\\ \mu_l\end{pmatrix}^{\mathrm{T}}
\begin{bmatrix}\mathbf{A}_{(l-1)(l-1)} & \mathbf{b}\\ \mathbf{b}^{\mathrm{T}} & d\end{bmatrix}
\begin{pmatrix}\boldsymbol{\mu}(l-1)\\ \mu_l\end{pmatrix}\\
&= \boldsymbol{\mu}^{\mathrm{T}}(l-1)\mathbf{A}_{(l-1)(l-1)}\boldsymbol{\mu}(l-1)
+ 2\,\boldsymbol{\mu}^{\mathrm{T}}(l-1)\mathbf{b}\,\mu_l + d\mu_l^{2},
\end{aligned}
\tag{14.8}
$$

where $\boldsymbol{\mu}(l-1) = (\mu_1, \ldots, \mu_{l-1})^{\mathrm{T}}$, $\mathbf{b} = -\beta_{l|(l-1)}\mathbf{R}_{(l-1)(l-1)}^{-1}\mathbf{X}_{l-1}\mathbf{x}(l)$,

$$
\beta_{l|(l-1)} = \left\{\mathbf{x}^{\mathrm{T}}(l)P^{\perp}_{\mathbf{X}_{l-1}^{\mathrm{T}}}\mathbf{x}(l)\right\}^{-1},
\tag{14.9}
$$

$d = N\beta_{l|(l-1)}$, and
$$
\mathbf{A}_{(l-1)(l-1)} = \mathbf{R}_{(l-1)(l-1)}^{-1}
+ (1/N)\,\beta_{l|(l-1)}\left[\mathbf{R}_{(l-1)(l-1)}^{-1}\mathbf{X}_{l-1}\mathbf{x}(l)\right]\left[\mathbf{R}_{(l-1)(l-1)}^{-1}\mathbf{X}_{l-1}\mathbf{x}(l)\right]^{\mathrm{T}},
\tag{14.10}
$$

$$
\mathbf{K}_{ll}^{-1} = \mathbf{R}_{ll}^{-1} + \frac{\Phi_{ll}}{1 - \rho_{l|(l-1)}}.
\tag{14.11}
$$

Using (14.11), we can further derive

$$
\begin{aligned}
\delta^{\text{CBK-AD}}(\mathbf{r}(l))
&= \left(\mathbf{r}(l) - \boldsymbol{\mu}(l)\right)^{\mathrm{T}}\mathbf{K}_{ll}^{-1}\left(\mathbf{r}(l) - \boldsymbol{\mu}(l)\right)
= \left(\mathbf{r}(l) - \boldsymbol{\mu}(l)\right)^{\mathrm{T}}\left[\mathbf{R}_{ll}^{-1} + \frac{\Phi_{ll}}{1 - \rho_{l|(l-1)}}\right]\left(\mathbf{r}(l) - \boldsymbol{\mu}(l)\right)\\[4pt]
&= \left(\mathbf{r}(l) - \boldsymbol{\mu}(l)\right)^{\mathrm{T}}\mathbf{R}_{ll}^{-1}\left(\mathbf{r}(l) - \boldsymbol{\mu}(l)\right)
+ \frac{1}{1 - \rho_{l|(l-1)}}\left(\mathbf{r}(l) - \boldsymbol{\mu}(l)\right)^{\mathrm{T}}\Phi_{ll}\left(\mathbf{r}(l) - \boldsymbol{\mu}(l)\right)\\[4pt]
&= \begin{pmatrix}\mathbf{r}(l-1) - \boldsymbol{\mu}(l-1)\\ r_l - \mu_l\end{pmatrix}^{\mathrm{T}}
\begin{bmatrix}\mathbf{A}_{(l-1)(l-1)} & \mathbf{b}\\ \mathbf{b}^{\mathrm{T}} & d\end{bmatrix}
\begin{pmatrix}\mathbf{r}(l-1) - \boldsymbol{\mu}(l-1)\\ r_l - \mu_l\end{pmatrix}\\
&\quad+ \frac{1}{1 - \rho_{l|(l-1)}}
\begin{pmatrix}\mathbf{r}(l-1) - \boldsymbol{\mu}(l-1)\\ r_l - \mu_l\end{pmatrix}^{\mathrm{T}}
\Phi_{ll}
\begin{pmatrix}\mathbf{r}(l-1) - \boldsymbol{\mu}(l-1)\\ r_l - \mu_l\end{pmatrix},
\end{aligned}
\tag{14.12}
$$

where $\rho_{l|(l-1)} = \boldsymbol{\mu}^{\mathrm{T}}(l)\mathbf{R}_{ll}^{-1}\boldsymbol{\mu}(l)$ is given by (14.8).
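The rank-one correction in (14.5), on which (14.11) and (14.12) rest, is a direct application of the Sherman–Morrison form of the Woodbury identity with $\mathbf{u} = -\boldsymbol{\mu}(l)$ and $\mathbf{v} = \boldsymbol{\mu}(l)$. A minimal NumPy sketch (our own illustration; the function name is hypothetical, not from the book):

```python
import numpy as np

def k_inverse_from_r_inverse(R_inv, mu):
    """K_ll^{-1} = (R_ll - mu mu^T)^{-1} obtained from R_ll^{-1} by (14.5),
    i.e., Sherman-Morrison with A = R_ll, u = -mu, v = mu."""
    w = R_inv @ mu                    # R^{-1} mu
    return R_inv + np.outer(w, w) / (1.0 - mu @ w)
```

Since $\mathbf{R} = \mathbf{K} + \boldsymbol{\mu}\boldsymbol{\mu}^{\mathrm{T}}$, the denominator $1 - \boldsymbol{\mu}^{\mathrm{T}}\mathbf{R}^{-1}\boldsymbol{\mu}$ stays strictly positive whenever the causal covariance matrix is positive definite, so the update is well posed.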
According to (14.12), CBK-AD can also be considered a recursive hyperspectral band processing of K-AD (RHBP-K-AD) since it also implements $\mathbf{R}_{ll}^{-1}\boldsymbol{\mu}(l)$ via (14.7) recursively. However, implementing RHBP-K-AD in real time involves more computation than RHBP-R-AD because it requires the additional calculation of the causal sample mean, $\boldsymbol{\mu}(l)$. Specifically, to do so we need another matrix identity, Woodbury's identity, which is not required for RHBP-R-AD. Also, (14.12) is not recursive like (14.3) because it does not update $\delta^{\text{RHBP-K-AD}}(\mathbf{r}(l))$ through $\delta^{\text{RHBP-K-AD}}(\mathbf{r}(l-1))$ but rather calculates $\mathbf{R}_{(l-1)(l-1)}^{-1}$ through $\mathbf{A}$ and $\mathbf{b}$ in (14.7).
Finally, note that when the previously derived RHBP-R-AD and RHBP-K-AD are applied to process AD band by band according to BSQ, progressively as well as recursively, they are different from the real-time causal anomaly detectors developed in Chang et al. (2014), which require full band information for each of the data sample vectors according to BIP/BIS.
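Putting the pieces above together, an entire progressive RHBP-R-AD pass over a BSQ cube can be sketched as follows. This is an illustrative NumPy reconstruction under our own naming, not the authors' implementation: the cube is taken as an (L, N) array with one row per band image, and the returned array holds the detector value for every pixel after each number of processed bands.

```python
import numpy as np

def rhbp_r_ad(cube):
    """Progressive, recursive band processing of R-AD over a BSQ cube.

    cube : (L, N) array, one row per band image (N pixels per band).
    Returns an (L, N) array whose row l holds delta^{RHBP-R-AD} for every
    pixel after the first l+1 bands have been received.
    """
    L, N = cube.shape
    x1 = cube[0]
    R_inv = np.array([[N / (x1 @ x1)]])     # inverse of R_11 = (1/N) x1.x1
    delta = np.empty((L, N))
    delta[0] = R_inv[0, 0] * x1 ** 2        # one-band detector values
    X = x1[None, :]                         # visited band images, (l, N)
    for l in range(1, L):
        x = cube[l]
        Xx = X @ x
        beta = 1.0 / (x @ x - (Xx @ R_inv @ Xx) / N)
        u = R_inv @ Xx                      # R^{-1} X_{l-1} x(l)
        eta = X.T @ u                       # eta_{l|(l-1)} for all pixels
        # recursion (14.3), vectorized over all N pixels at once
        delta[l] = delta[l - 1] + (beta / N) * (N * x - eta) ** 2
        # grow R^{-1} by one band via the block inverse (14.1)
        R_inv = np.block([[R_inv + (beta / N) * np.outer(u, u),
                           -beta * u[:, None]],
                          [-beta * u[None, :], np.array([[N * beta]])]])
        X = np.vstack([X, x])
    return delta
```

Each row of the result corresponds to one of the progressive detection maps discussed in the next section; the final row coincides with conventional full-band R-AD.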

14.4 Real Image Experiments

The image data to be studied are the HYDICE image scene shown in Fig. 14.1a (and
Fig. 1.10a), which has a size of 64 × 64 pixel vectors with 15 panels in the scene and
the ground truth map in Fig. 14.1b (Fig. 1.10b).

[Figure: (a) image scene; (b) ground truth map labeling panel pixels p11, p12, p13; p211, p212, p221, p22, p23; p311, p312, p32, p33; p411, p412, p42, p43; p511, p521, p52, p53]
Fig. 14.1 (a) HYDICE panel scene containing 15 panels; (b) ground truth map of spatial locations
of the 15 panels

The image was acquired using 210 spectral bands with spectral coverage from 0.4
to 2.5 μm. Low-signal/high-noise bands (bands 1–3 and 202–210) and water vapor
absorption bands (bands 101–112 and 137–153) were removed. Thus, a total of
169 bands were used in the experiments. The spatial resolution and spectral
resolution of this image scene are 1.56 m and 10 nm, respectively. As noted in
previous chapters, the panel pixel p212, marked yellow in Fig. 14.1b, is of particular
interest. It is always extracted as the one with the most spectrally distinct signature
compared to the R panel pixels in row 2. This indicates that a signature of spectral
purity is not equivalent to a signature of spectral distinction. Also, because of such
ambiguity, the panel signature representing panel pixels in the second row is either
p221 or p212, which is always difficult to determine by endmember finding
algorithms. This implies that the ground truth of R panel pixels in the second row in
Fig. 14.1b may not be as pure as was thought.
Figures 14.2 and 14.3 plot progressive changes in the detection of each of 19 R
panel pixels plus a yellow panel pixel, p212, as shown in Fig. 14.1b, produced by
RHBP-R-AD and RHBP-K-AD, respectively, where the x-axis is the number of
bands processed and the y-axis is the amount of detected abundance fractions. Note
that, based on the number of processed bands, say, n in Figs. 14.2 and 14.3, the first
n bands are used to process the data.
For a better visual assessment of Fig. 14.3 we plot Figs. 14.4 and 14.5 in three-
dimensional plots, where the x-axis specifies 19 R panel pixels plus a yellow panel
pixel denoted by p212, the y-axis denotes the number of bands being processed, and
the z-axis is R-AD or K-AD detected abundance fractions.
These figures clearly show three significant changes in the detected abundances
in ranges of processed bands around 20, 50, and 110 bands. These three major
ranges of processed bands correspond to Figs. 14.4 and 14.5 where the first 16–21
bands, the first 43–57 bands, and the first 103–113 bands are highlighted in pink to
show the dramatic changes in the amounts of detected abundance fractions for both
RHBP-R-AD and RHBP-K-AD.
Using the information on the three ranges of processed bands provided by
Figs. 14.2, 14.3, 14.4, and 14.5, Figs. 14.6 and 14.7 show the progressive detection

[Figure: plots of R-AD value versus number of processed bands for each of the 19 R panel pixels plus p212]

Fig. 14.2 Detected abundance fractions of 19 R panel pixels plus p212 by RHBP-R-AD versus
number of bands processed

[Figure: plots of K-AD value versus number of processed bands for each of the 19 R panel pixels plus p212]

Fig. 14.3 Detected abundance fractions of 19 R panel pixels plus p212 by RHBP-K-AD versus
number of bands processed
[Figure: 3D plot of R-AD detected abundance fractions of the 20 panel pixels versus number of processed bands, with the ranges 16–21, 43–57, and 103–113 highlighted]
Fig. 14.4 3D plots of 19 R panel pixels plus p212 detected by RHBP-R-AD versus number of
bands processed

[Figure: 3D plot of K-AD detected abundance fractions of the 20 panel pixels versus number of processed bands, with the ranges 16–21, 43–57, and 103–113 highlighted]

Fig. 14.5 3D Plots of 19 R panel pixels plus p212 by RHBP-K-AD versus number of bands
processed

maps of anomalies produced by RHBP-R-AD and RHBP-K-AD, respectively,
according to three ranges of processed bands: (a–f) the first range of 16–21 bands,
(g–l) odd bands in the second range of 43–57 bands, and (m–r) odd bands in the
third range of 103–113 bands. As shown, the first range of processed bands began
with no panel pixel detected using the first 16 bands, and then gradually the

Fig. 14.6 Results of RHBP-R-AD of HYDICE data after receiving: (a) 16 bands, (b) 17 bands,
(c) 18 bands, (d) 19 bands, (e) 20 bands, (f) 21 bands, (g) 43 bands, (h) 45 bands, (i) 47 bands,
(j) 49 bands, (k) 51 bands, (l) 53 bands, (m) 103 bands, (n) 105 bands, (o) 107 bands,
(p) 109 bands, (q) 111 bands, (r) 113 bands

detection of panel pixels in rows 4 and 5 improved, until the first 21 bands were
processed. Similarly, the panel pixels in rows 1–3 were detected by the second
range of processed bands, from barely being picked up when the first 43 bands were
processed to being clearly detected using the first 51–53 bands. Finally, using the
third range of processed bands, the detection of 19 R panel pixels in the five rows
pretty much stayed the same, but the interferer at the left corner, which was not
detected by the previous first and second ranges of processed bands, was clearly
detected as a very bright pixel. Interestingly, this strong interferer is always the first
target detected by an unsupervised target detection algorithm, the automatic target
generation process (ATGP) developed by Ren and Chang (2003).
Since we have the ground truth of the 19 R panel pixels and the yellow panel
pixel p212, we can further calculate the detection rates of RHBP-R-AD and RHBP-
K-AD via a receiver operating characteristic (ROC) analysis for performance
evaluation (Poor 1994). To include the number of processed bands as a parameter
to see their progressive detection performance on these 19 R panel pixels, we

Fig. 14.6 (continued)

calculated the area under an ROC curve, referred to as area under curve (AUC) in
Metz (1978). Using such an AUC, an alternative ROC curve can be plotted in
terms of AUC on the y-axis versus number of processed bands. Figure 14.8a–c plots
the AUC curves versus the number of processed bands with (a) an ROC curve of
(number of bands, area under curve (AUC)) for panel pixels in column 1, p11, p211,
p212, p221, p311, p312, p411, p412, p511, and p521; (b) an ROC curve of (band, AUC) for
p12, p22, p32, p42, p52; and (c) an ROC curve of (band, AUC) for p13, p23, p33, p43,
p53.
As shown in Fig. 14.8, RHBP-K-AD generally performed better than RHBP-R-
AD when the number of processed bands was smaller than 45 (Fig. 14.8a),
40 (Fig. 14.8b), and 30 (Fig. 14.8c). Nevertheless, after 45 bands were processed,
both yielded nearly the same detection performance.
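The alternative ROC analysis used here, plotting AUC as a function of the number of processed bands, can be sketched as follows. This is our own illustration, not code from the chapter: detector scores and a ground-truth mask are assumed given, and AUC is computed via the rank-sum (Mann–Whitney) statistic.

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve: the probability that a randomly chosen
    target pixel outscores a randomly chosen background pixel (ties 1/2)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def progressive_auc(score_by_band, labels):
    """AUC after each number of processed bands; score_by_band is an
    (L, N) array whose row l holds detector values after l+1 bands."""
    return np.array([auc(s, labels) for s in score_by_band])
```

The pairwise comparison is quadratic in the number of pixels but is perfectly adequate for scenes of the size studied here (64 × 64).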
Similarly, Fig. 14.9a–e plots the AUC curves versus the number of processed
bands with (a) an ROC curve of (band, AUC) for panel pixels in row 1, p11, p22, p33;
(b) an ROC curve of (band, AUC) for panel pixels in row 2, p211, p212, p221, p22, p23;
(c) an ROC curve of (band, AUC) for panel pixels in row 3, p311, p312, p32, p33;
(d) an ROC curve of (band, AUC) for row 4, p411, p412, p42, p43; and (e) an ROC
curve of (band, AUC) for row 5, p511, p521, p52, p53.

Fig. 14.7 Results of RHBP-K-AD of HYDICE data after receiving: (a) 16 bands, (b) 17 bands, (c)
18 bands, (d) 19 bands, (e) 20 bands, (f) 21 bands, (g) 43 bands, (h) 45 bands, (i) 47 bands, (j)
49 bands, (k) 51 bands, (l) 53 bands, (m) 103 bands, (n) 105 bands, (o) 107 bands, (p) 109 bands,
(q) 111 bands, (r) 113 bands

Finally, Fig. 14.10 plots the curves of (band, AUC) for the progressive detection
performance of RHBP-R-AD and RHBP-K-AD for all 19 R panel pixels plus the
p212 yellow panel pixel.
As a second example, an Airborne Visible/InfraRed Image Spectrometer
(AVIRIS) image scene shown in Fig. 14.11a (and Fig. 1.8) was used for experi-
ments. It is the Lunar Crater Volcanic Field (LCVF) located in Northern Nye
County, Nevada.
Figure 14.12a plots the detected amount of abundance fractions of six signatures
specified in Fig. 14.11b by RHBP-R-AD versus the number of processed bands,
where it is very clear that the detection curve of the anomaly increased exponentially
as the number of processed bands grew, as did that of vegetation at a slightly slower pace.
By contrast, the detection curves of the other four signatures seemed to be linear. A
comparison of Fig. 14.12a with Figs. 14.2 and 14.4 reveals that LCVF has stable
and slow-changing R-AD detected values as the number of processed bands
increased, as opposed to those in Fig. 14.2, which had more dynamic ranges with
clear cuts on detecting pure panel pixels. In particular, there seemed to be three
ranges of interest in which the R-AD detected values changed rapidly, namely the
band-number ranges 20 to 30, 57 to 69, and 110 to 110. For a better
visual assessment, Fig. 14.12b also plots the RHBP-R-AD-detected amounts of
abundance fractions of six signatures versus the number of processed bands in a 3D
view, which also confirmed the results.

Fig. 14.7 (continued)

Similar conclusions can also be drawn for RHBP-K-AD, as shown in
Fig. 14.13a, b, where the same three ranges of processed band numbers exhibiting
rapid changes in K-AD-detected values were also observed: 20 to 30, 57 to 69,
and 110 to 110.
To see the changes in the detection maps over the three ranges of processed
band numbers identified in Figs. 14.12 and 14.13 (20 to 30, 57 to 69, and 110 to
110), Figs. 14.14 and 14.15 show the R-AD and K-AD detection maps,
respectively, for selected processed bands for visual inspection.


Fig. 14.8 Plots of (number of bands, area under curve (AUC)) for progressive detection perfor-
mance of RHBP-R-AD and RHBP-K-AD: (a) ROC curve of (number of bands, AUC) for panel
pixels in column 1, p11, p211, p212, p221, p311, p312, p411, p412, p511, p521; (b) ROC curve of (number
of bands, AUC) for p12, p22, p32, p42, p52; (c) ROC curve of (number of bands, AUC) for p13, p23,
p33, p43, p53

As we can see, in the first range of processed band numbers, vegetation was
detected, while an anomaly was detected in the second range of processed bands.
Interestingly, adding the third range of processed bands did not much improve the
detection of vegetation and anomalies. Most importantly, since there are large areas
of cinders, rhyolite, playa (dry lake), and shade, they are not considered anomalies.
As a result, it makes sense that none of these four areas was detected in Figs. 14.14
and 14.15.

Fig. 14.9 Plots of (band, AUC) for progressive detection performance of RHBP-R-AD and
RHBP-K-AD: (a) ROC curve of (band, AUC) for panel pixels in row 1, p11, p12, p13; (b) ROC
curve of (band, AUC) for row 2, p211, p212, p221, p22, p23; (c) ROC curve of (band, AUC) for
row 3, p311, p312, p32, p33; (d) ROC curve of (band, AUC) for row 4, p411, p412, p42, p43;
(e) ROC curve of (band, AUC) for row 5, p511, p521, p52, p53

Fig. 14.10 Plots of (band, AUC) for progressive detection performance of RHBP-R-AD and
RHBP-K-AD for all 19 R panel pixels plus the p212 yellow panel pixel

14.5 Computing Time Analysis

The computing time of RHBP-AD was compared with that of traditional AD. Each algorithm was implemented in MATLAB R2011b on an Intel Core i7-3770 running at 3.40 GHz with 16 GB of RAM and was executed on the HYDICE images five times to produce an average computing time. Figure 14.16 plots the computing time versus the number of processed bands for both RHBP-R-AD and RHBP-K-AD running on the HYDICE and LCVF data sets, where RHBP-R-AD required less time than RHBP-K-AD because the former implements the recursive equation specified by (14.6), while the latter implements the nonrecursive equation specified by (14.12). As also shown in Fig. 14.16, the computing time increases as new bands are added because the growing sizes of R and K raise the computational complexity. The drastic change at the second band results from the calculation of the initial conditions; once the initial conditions are calculated, no dramatic changes in computing time occur, as shown in Fig. 14.16. It is also worth noting that the fluctuations in the plots result from the numerical computations of the computer implementation.
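The five-run averaging described above can be sketched as a small timing harness. The detector call here is a hypothetical stand-in (a matrix inversion of comparable flavor), not the book's MATLAB code.

```python
import time
import numpy as np

def average_runtime(fn, *args, repeats=5):
    """Run fn(*args) `repeats` times and return the mean wall-clock time in seconds."""
    elapsed = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / repeats

# Hypothetical stand-in for one detector pass over an N x L data matrix
data = np.random.rand(1000, 169)
t_avg = average_runtime(lambda X: np.linalg.inv(X.T @ X / len(X)), data)
print(f"average over 5 runs: {t_avg:.4f} s")
```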
In addition, R-AD required 0.44 s to complete the entire process, while RHBP-R-AD spent 0.02 s processing the first band and a total of 32.43 s processing the entire image cube of 169 bands. For the global K-AD, 0.34 s was needed to complete the process, compared to RHBP-K-AD, which took 0.28 s to process the first band and an additional 41.94 s to process the whole data set. This is mainly because R-AD and K-AD require calculation of the global sample correlation matrix R/global sample covariance matrix K only once prior to
440 14 RHBP for Passive Target Detection: Anomaly Detection

Fig. 14.11 AVIRIS LCVF image scene: (a) image scene with labeled areas of vegetation, cinders, shade, anomaly, rhyolite, and dry lake; (b) radiance spectra of the six signatures, shade, rhyolite, dry lake, cinders, vegetation, and anomaly, over the 0.4–2.5 μm wavelength range

processing. Once calculated, R/K is used throughout the entire process. By contrast, RHBP-R-AD and RHBP-K-AD must recalculate the band-varying correlation and covariance matrices via the equations specified by (14.6) and (14.15). Consequently, they both require more time than R-AD and K-AD. Nevertheless, this disadvantage is offset by several benefits. First, detection performance can be observed progressively as bands are progressively processed. Second, with such progressive performance, the impact of each band on AD performance can be analyzed. Third, (14.6) and (14.15) provide a feasible implementation for hardware design. Finally, RHBP-R-AD and RHBP-K-AD can be performed whenever bands become available, as opposed to R-AD and K-AD, which must wait for the complete data set before calculating the global matrices R and K. Table 14.1 summarizes the computing times in seconds required by K-AD, R-AD, RHBP-K-AD, and RHBP-R-AD to run the HYDICE and LCVF data sets.
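Equations (14.6) and (14.15) are not reproduced in this section; as a generic illustration of why growing R/K by one band per step need not require a full re-inversion, the sketch below extends the inverse of an (l−1) × (l−1) correlation matrix to l × l with a Schur-complement block update and checks it against direct inversion. Function and variable names are illustrative.

```python
import numpy as np

def grow_inverse(A_inv, b, c):
    """Given A^{-1} for the (l-1) x (l-1) block A, return the inverse of
    [[A, b], [b^T, c]] using the scalar Schur complement s = c - b^T A^{-1} b."""
    Ab = A_inv @ b
    s = c - b @ Ab                          # scalar Schur complement
    top_left = A_inv + np.outer(Ab, Ab) / s
    return np.block([[top_left, -Ab[:, None] / s],
                     [-Ab[None, :] / s, np.array([[1.0 / s]])]])

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 6))
R = X.T @ X / len(X)                        # band correlation matrix, 6 bands
R5_inv = np.linalg.inv(R[:5, :5])           # inverse with 5 bands
R6_inv = grow_inverse(R5_inv, R[:5, 5], R[5, 5])
print(np.allclose(R6_inv, np.linalg.inv(R)))  # -> True
```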

Fig. 14.12 (a) Detected amounts of abundance fractions of six signatures (shade, rhyolite, dry lake, cinders, vegetation, anomaly) by RHBP-R-AD versus number of processed bands; (b) 3D plots of RHBP-R-AD-detected abundance fractions of the six signatures versus number of processed bands

Because the computing times for RHBP-K-AD and RHBP-R-AD were accumulated over the processing of all bands, they are significantly higher than those of their counterparts without RHBP. This is because RHBP must be executed repeatedly for each new incoming band with growing matrix sizes.

Fig. 14.13 (a) Detected amounts of abundance fractions of six signatures (shade, rhyolite, dry lake, cinders, vegetation, anomaly) by RHBP-K-AD versus number of processed bands; (b) 3D plots of RHBP-K-AD-detected abundance fractions of the six signatures versus number of processed bands

Fig. 14.14 Results of RHBP-R-AD of LCVF data after receiving (a) 20 bands, (b) 22 bands, (c)
24 bands, (d) 26 bands, (e) 28 bands, (f) 30 bands, (g) 58 bands, (h) 60 bands, (i) 62 bands, (j)
64 bands, (k) 66 bands, (l) 68 bands, (m) 100 bands, (n) 102 bands, (o) 104 bands, (p) 106 bands,
(q) 108 bands, (r) 110 bands


14.6 Graphical User Interface Design

A graphical user interface (GUI), a screenshot of which is shown in Fig. 14.17, was developed using MATLAB's GUIDE tool to aid in algorithm performance analysis. The GUI allows the user to load different data sources and choose different AD methods (R-AD/K-AD) for analysis. The three images displayed at the top of the window show a color image of the scene, a grayscale image of the current band being processed, and the detection map after the current band is processed. At the bottom, a window shows the changes in the R-AD (K-AD) value in real time, with detected abundance fractions highlighted by a pink line with circles at the top, as the number of processed bands increases. Once the data are loaded, the user can start processing by clicking the start button. As a new band is received at each iteration, RHBP-AD (R-AD/K-AD) processes the newly received data. Upon completion of processing, the current band and the resulting image are updated to allow the user to observe the results. A final note on Fig. 14.17 is in order. The plots shown in the bottom window represent a real-time progressive detection version of Figs. 14.4 and 14.5/Figs. 14.12b and 14.13b. This attractive feature can only be fully demonstrated in a live real-time process.
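The GUI's per-band cycle, processing the newly received band and then refreshing the displays, can be sketched as a plain loop. The sketch below recomputes the band-restricted R-AD score r^T R_l^{−1} r for every pixel after each new band and hands the detection map to a display callback; it is progressive but not recursive, and the callback stands in for the MATLAB GUI updates. Names are illustrative.

```python
import numpy as np

def rhbp_r_ad(cube, on_band_done):
    """Process an (H, W, L) cube band by band: after each new band l, score
    every pixel with the band-restricted R-AD form r^T R_l^{-1} r and hand
    the detection map to a display callback (the GUI refresh stand-in).
    Note: R_l is recomputed here, i.e., progressive but not recursive."""
    H, W, L = cube.shape
    pixels = cube.reshape(-1, L)
    for l in range(1, L + 1):
        X = pixels[:, :l]
        R = X.T @ X / len(X)                      # l x l correlation matrix
        scores = np.einsum('ij,jk,ik->i', X, np.linalg.inv(R), X)
        on_band_done(l, scores.reshape(H, W))     # e.g., refresh the images

maps = {}
rhbp_r_ad(np.random.rand(8, 8, 5), lambda l, m: maps.update({l: m}))
print(sorted(maps))  # -> [1, 2, 3, 4, 5]
```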

Fig. 14.15 Results of RHBP-K-AD of LCVF data after receiving (a) 20 bands, (b) 22 bands, (c)
24 bands, (d) 26 bands, (e) 28 bands, (f) 30 bands, (g) 58 bands, (h) 60 bands, (i) 62 bands, (j)
64 bands, (k) 66 bands, (l) 68 bands, (m) 100 bands, (n) 102 bands, (o) 104 bands, (p) 106 bands,
(q) 108 bands, (r) 110 bands


Fig. 14.16 Plots of computing time versus number of processed bands for RHBP-R-AD and RHBP-K-AD

Table 14.1 Summary of computing time (seconds) required by K-AD, R-AD, RHBP-K-AD, and RHBP-R-AD to run HYDICE and LCVF data sets

         R-AD     K-AD     RHBP-R-AD   RHBP-K-AD
HYDICE   0.0528   0.0689   30.9614     41.9660
LCVF     0.3710   0.4552   1716.9      1854.4

Fig. 14.17 GUI design for RHBP-R-AD/RHBP-K-AD

14.7 Conclusions

The idea of RHBP-AD was developed in this chapter. It performs anomaly detection band by band according to the band-sequential (BSQ) format. Unlike a real-time AD process, which must process each data sample vector using full band information, as proposed in Chen et al. (2014) and Chap. 19 of this book, the RHBP-AD process uses whatever bands are available without waiting for all bands to be collected. As a result, RHBP-AD allows users to observe progressive band-by-band anomaly detection performance during band acquisition. In particular, RHBP-AD offers users a significant advantage in seeing, band by band, the background suppression that results from AD using growing numbers of bands. By virtue of such progressive background suppression in anomaly detection, we may be able to identify the various ranges of bands that are crucial to anomaly detection. This benefit cannot be provided by anomaly detectors that use full bands.
Part V
Signature Spectral Statistics-Based
Recursive Hyperspectral Band
Processing (RHBP)

While Part IV develops real-time recursive hyperspectral band processing (RHBP)


algorithms as counterparts of recursive hyperspectral sample band processing
(RHSP) in Part II, Part V redesigns RHSP algorithms in Part III as RHBP algo-
rithms as their counterparts. Specifically, Part V extends progressive band dimen-
sionality process (PBDP) originally developed in Chang et al. (2011d) and
discussed in Chaps. 21 and 23 (Chang 2013), which first prioritizes bands according
to band selection criteria and then selects bands in a progressive manner. The
chapters included in Part V take different approaches. Instead of relying on band
prioritization criteria for processing bands, the RHBP algorithms in Part V process
bands progressively according to various applications from linear spectral mixture
analysis (LSMA) to endmember finding while updating data information recur-
sively according to new information provided by incoming spectral bands. Similar
to the theory developed for RHSP in Part III, a parallel theory can also be derived
for RHBP in Part V. As a result, Part V can be treated as a companion part of
Part III.
It is important to note that PHBP/RHBP is not a band selection process. It has two key features that cannot be found in any band selection process. First, it processes data band by band progressively according to the band-sequential (BSQ) format. Second, it updates data information recursively for band-by-band processing. It should also be noted that progressive and recursive are two completely different notions: the former generally refers to the algorithmic implementation of sample/band processing, whereas the latter refers to the algorithm architecture that updates information within sample vectors and bands. These two are very similar to the Gauss and Markov pair used to characterize a random process, where the "Gauss" part means that the random process behaves according to a Gaussian distribution while the "Markov" part indicates that the random process has the Markov property. Accordingly, when recursive is used for simplicity, one should understand it as progressive-recursive.
RHBP is a completely new concept yet to be developed in hyperspectral
imaging. It is specially designed to take advantage of the BSQ acquisition format
to process hyperspectral imagery band by band so as to make data communication

and transmission effective in future satellite communications. Most importantly, all


the different versions of RHBP presented in Part IV, “Sample Spectral Statistics-
Based Recursive Hyperspectral Band Processing,” and Part V, “Signature Spectral
Statistics-Based Recursive Hyperspectral Band Processing,” are quite different
from each other in terms of their design rationales and have their own merits.
This is because in Part IV data sample vectors are processed band by band both
progressively and recursively, as opposed to Part V, where data sample vectors are
processed band by band progressively and signatures are processed band by band
recursively.
Part V is made up of six chapters. Since target knowledge is generally crucial for
target detection, Chap. 15, “Recursive Hyperspectral Band Processing of the
Automatic Target Generation Process,” extends ATGP to RHBP-ATGP to produce
target knowledge a posteriori in a completely unsupervised manner via a succession
of orthogonal subspace projections. Such RHBP-ATGP-generated target knowl-
edge can be used as unsupervised target knowledge for constrained energy mini-
mization (CEM). This is followed by Chap. 16, “Recursive Hyperspectral Band
Processing of Orthogonal Subspace Projection” (RHBP-OSP), which requires
complete target knowledge that can be provided either a priori or a posteriori and
generated by RHBP-ATGP. Since OSP is designed for signal detection and not for
linear spectral mixture analysis (LSMA), Chap. 17, “Recursive Hyperspectral Band
Processing of Linear Spectral Mixture Analysis” (RHBP-LSMA), extends LSMA
to RHBP-LSMA for linear spectral unmixing. Because it is shown in Chap. 11 that
orthogonal projection (OP) can also be used to rederive a simplex growing algo-
rithm (SGA) for finding endmembers, we can also derive an RHBP version for
SGA, which is the topic of Chap. 18, “Recursive Hyperspectral Band Processing of
Growing Simplex Volume Analysis”. Finally, we conclude with one of most
important concepts resulting from the principle of orthogonality discussed in
Sect. 3.2 of Chap. 3, “Orthogonal Projection,” which has been widely used in
many applications, particularly for finding endmembers. To this end, the popular
pixel purity index (PPI) and its fast version, of fast iterative PPI (FIPPI), are
extended in Chap. 19, “Recursive Hyperspectral Band Processing of Pixel Purity
Index,” and Chap. 20, “Recursive Hyperspectral Band Processing of Fast Iterative
Pixel Purity Index.”
Chapter 15

Recursive Hyperspectral Band Processing


of Automatic Target Generation Process

Abstract The automatic target generation process (ATGP) presented in


Sect. 4.4.2.3 has been widely used for unsupervised hyperspectral target detection.
It requires no a priori knowledge to detect targets. However, as designed, it detects targets using full band information. Unfortunately, on many occasions various targets can be detected by different sets of bands, and ATGP can only produce one-shot target detection with all bands being used. In other words, ATGP does not reveal how band information affects its target detection ability. This chapter
develops a new approach that implements ATGP bandwise in a progressive man-
ner, called progressive hyperspectral band processing of ATGP (PHBP-ATGP), so
that ATGP can be carried out band by band. Several advantages are gained from
PHBP-ATGP. First, it can be implemented whenever bands are available without
waiting for full bands of data information being collected according to the band-
sequential (BSQ) format. Second, PHBP-ATGP can detect weak targets before they
vanish as new bands come in. In other words, PHBP-ATGP makes it possible to find
targets that will be overwhelmed or dominated later by subsequently detected
strong targets. Third, with such a band-by-band progression capability, PHBP-
ATGP is also able to find various targets produced by different sets of bands,
compared to only one final target produced by ATGP with full band information.
Fourth, PHBP-ATGP can also be used to identify significant bands for finding
targets. It offers an effective means of selecting bands without performing band
selection. Fifth, PHBP-ATGP can generate a progressive profile in target detection
from band to band, a capability that no target detection techniques using full band
information can provide. Finally and most importantly, PHBP-ATGP can be
extended to a recursive version of PHBP-ATGP, called recursive hyperspectral
band processing of ATGP (RHBP-ATGP), which makes use of recursive equations
that are executed much like a Kalman filter. Such RHBP-ATGP confers a great
benefit in terms of hardware implementation and design owing to its recursive
structures that only need to update innovation information without reprocessing all
past information.

© Springer International Publishing Switzerland 2017
C.-I Chang, Real-Time Recursive Hyperspectral Sample and Band Processing,
DOI 10.1007/978-3-319-45171-8_15

15.1 Introduction

ATGP is an unsupervised target detection technique that has been widely used in
hyperspectral target detection (Ren and Chang 2003). It implements a succession of
orthogonal subspace projections (OSPs) to find targets that have maximal leaking
residuals from subspaces generated by previous targets. It is very simple but also
very effective. Recently, it has also been applied to finding endmembers and shown
to be effective because two popular endmember finding algorithms currently being
used in the literature, vertex component analysis (VCA) (Nascimento and Bioucas-Dias 2005) and the simplex growing algorithm (SGA) (Chang 2006), can also be shown
to be equivalent to ATGP under certain assumptions (Chang et al. 2013, 2016b;
Greg 2010; Chap. 11 in Chang 2013).
This chapter takes a rather different approach by investigating ATGP in a very
interesting way. It considers ATGP to be performed band by band from a specific
data acquisition point of view. In general, there are two formats, band-interleaved-
pixel/sample/line (BIP/BIS/BIL) and band sequential (BSQ), which have been used
for the acquisition of remote sensing imagery (Schowengerdt 1997). While the
former has been widely used for real-time processing sample by sample (Du and
Ren 2003; Du and Nevovei 2005, 2009; Tarabalka et al. 2009; Chen et al. 2014), the
latter has not received much attention to date.
The concept of using the BSQ format to implement ATGP is novel and derived from bit plane coding (Gonzalez and Woods 2007). Suppose that the use of a band is coded by "1" and "0" otherwise. Under this interpretation, the total number of bands used for data acquisition corresponds to the total number of bits used for image coding. Bit plane coding encodes an image bit by bit from the most significant bit to the least significant bit, with image quality gradually improving in a progressive manner until all bits are fully used for encoding. Such image processing is generally referred to as progressive image processing.
Following the same treatment we can consider implementing ATGP on an L-band
hyperspectral image band by band as the way the bit plane coding does for an L-bit
image bit by bit. In other words, if a band is used by ATGP, the band is specified by
“1” and “0” otherwise. As a result, all the advantages resulting from the bit plane
coding can also be applied to ATGP provided that ATGP can be implemented
progressively band by band. Inspired by this striking similarity, it is highly desir-
able to develop a progressive band processing version for ATGP that can be used
for progressive target detection. Such progressive hyperspectral band processing of
ATGP (PHBP-ATGP) is a result of this need. More specifically, if the total number
of bands is L, for each recursive hyperspectral band processing of ATGP (RHBP-
n oL
ðlÞ
ATGP)-detected target tp, we are able to find L targets tp migrated from band
l¼1
ðlÞ
1 to band L, where tp is the pth target detected by RHBP-ATGP using the first
n oL
ðlÞ ðLÞ ðlÞ
l bands, that is, tp ! tp ¼ tp as l ! L. The set of these targets, tp , provides
l¼1
ð1Þ
important band-to-band (interband) transition information from tp obtained by
15.2 ATGP 453

ðLÞ
using only the first band on tp obtained by using full bands, a benefit that cannot be
offered by ATGP using full bands, as originally proposed in Ren and Chang (2003).
Many advantages can be gained from PHBP-ATGP. First and foremost, PHBP-ATGP provides profiles of progressive band-varying target detection maps that allow users to see changes in the target detection maps produced by PHBP-ATGP in each band. This benefit cannot be offered by a one-shot-operation ATGP using all spectral bands. Intuitively, we can view PHBP-ATGP as a slow-motion version of ATGP in the sense that slowly varying changes in target detection maps can be observed and tracked from band to band. Second, using progressive target detection map profiles enables users to find weak targets that may be captured by certain bands and later be overwhelmed and dominated by strong targets subsequently detected by ATGP.
Targets of this type, such as moving targets, are generally not shown in the final
ATGP detection map, that is, detection maps produced by ATGP using full bands.
Third, PHBP-ATGP can help identify significant bands for target detection without
performing band selection. This is a significant advantage since band selection
requires prior knowledge of how many bands need to be selected and as well
intelligent selection of appropriate bands. Finally, PHBP-ATGP can be used for
data communication and transmission by progressively processing data band by
band in the same manner that progressive image processing does for image trans-
mission and communication. However, in order for PHBP-ATGP to be
implemented in real time for data communication and transmission, a recursive
version of PHBP-ATGP, called recursive hyperspectral band processing of ATGP
(RHBP-ATGP), was further derived by Chang and Li (2016). It can be considered a
Kalman filter-like detector that can be carried out in a manner similar to that of a
Kalman filter by only updating innovation information provided by the new incom-
ing band via recursive equations band by band without reprocessing previous
bands. Most importantly, the computer processing time of RHBP-ATGP is inde-
pendent of the number of bands being processed. In other words, RHBP-ATGP
requires nearly linear time for each band compared to PHBP-ATGP whose com-
puting time is accumulated with the number of processed bands and results in
exponential growth in computation.

15.2 ATGP

The development of ATGP primarily stems from the need to find targets of interest
in image data when no prior knowledge is available. It utilizes a sequence of
orthogonal complement subspaces from which a succession of targets of interest
can be extracted by finding maximal orthogonal projections. Interestingly, as shown
in Greg (2010), Chang et al. (2013), and Chang (2013), most such ATGP-generated
target pixels turn out to be endmembers. This is certainly not a coincidence because
the concept behind ATGP is actually the same as that behind the pixel purity index
(PPI), except for two key differences. PPI requires a very large number of random

vectors, called skewers, to find maximal/minimal orthogonal projections (OPs)


compared to ATGP, which only needs to find targets of interest from a sequence
of OP subspaces with maximal OP. As a result, PPI simultaneously extracts
endmembers by human manipulation, whereas ATGP extracts targets sequentially
one at a time automatically. Additionally, PPI takes advantage of the random nature
of skewers to cover as many directions as possible to find endmembers, as opposed
to ATGP, which searches for potential targets by finding specific directions deter-
mined by maximal OPs in specific directions. Nonetheless, both PPI and ATGP use
the same principle, OP, in two different ways.
ATGP was originally called the automatic target detection and classification
algorithm (ATDCA) in Ren and Chang (2003) and is also presented in Sect. 4.4.2.3.
It repeatedly implements an orthogonal subspace projector defined by
P^⊥_U = I − U(U^T U)^{−1} U^T    (15.1)

to find targets of interest directly from data without prior knowledge. Its step-by-
step implementation can be described as follows.

Automatic Target Generation Process

1. Initial condition:
   Select an initial target pixel vector t_0 = arg max_r {r^T r} and an error threshold ε. Set p = 1 and U_0 = [t_0].
2. At the pth iteration, apply P^⊥_{U_(p−1)} via (15.1) to all image pixels r in the image and find the pth target t_p satisfying

   t_p = arg max_r {(P^⊥_{U_(p−1)} r)^T (P^⊥_{U_(p−1)} r)},    (15.2)

   where U_(p−1) = [t_1 t_2 ⋯ t_(p−1)] is the target matrix generated at the (p−1)st stage. It should be pointed out that (15.2) can be carried out using the following procedure:
   (a) Let {r_i}_{i=1}^N be the set of all data sample vectors. Set i = 1 and r_p^max ← r_1.
   (b) If i > 1, calculate (P^⊥_{U_(p−1)} r_i)^T (P^⊥_{U_(p−1)} r_i) = ‖P^⊥_{U_(p−1)} r_i‖². If ‖P^⊥_{U_(p−1)} r_i‖² > ‖P^⊥_{U_(p−1)} r_p^max‖², then r_p^max ← r_i and go to step (c). Otherwise, continue.
   (c) If i < N, let i ← i + 1 and go to step (b). Otherwise, continue.
   (d) Set t_p ← r_p^max and go to step 3.
3. Stopping rule:
   If

   t_p^T P^⊥_{U_(p−1)} t_p > ε,    (15.3)

   which is the same as (4.38), then augment U_p = [U_(p−1) t_p] = [t_1 t_2 ⋯ t_p] by adding t_p to U_(p−1) and go to step 2. Otherwise, continue.
4. At this stage ATGP is terminated, and the final set of produced target pixel vectors comprises p target pixel vectors, {t_0, t_1, t_2, …, t_(p−1)} = {t_0} ∪ {t_1, t_2, …, t_(p−1)}.

It is worth noting that replacing t_p in (15.3) with t_0 yields

   t_0^T P^⊥_{U_(p−1)} t_0,    (15.4)

which is exactly the orthogonal projection correlation index (OPCI) defined in Ren and Chang (2003).
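The full-band ATGP loop described above can be sketched directly from (15.1) and (15.2). For brevity, this sketch stops after a fixed number of targets rather than testing the ε threshold of (15.3); function and variable names are illustrative.

```python
import numpy as np

def atgp(data, num_targets):
    """ATGP on an (N, L) matrix of pixel vectors: pick the pixel with the
    largest norm as t_0, then repeatedly project all pixels onto the
    orthogonal complement of the targets found so far (15.1) and take the
    pixel with the largest residual norm (15.2)."""
    N, L = data.shape
    idx = int(np.argmax((data * data).sum(axis=1)))     # t_0 = arg max r^T r
    targets = [data[idx]]
    for _ in range(num_targets - 1):
        U = np.column_stack(targets)                    # L x (#found)
        P = np.eye(L) - U @ np.linalg.pinv(U)           # P_U^perp
        residual = data @ P                             # rows are (P r)^T
        idx = int(np.argmax((residual * residual).sum(axis=1)))
        targets.append(data[idx])
    return np.array(targets)

pts = np.array([[10.0, 0.0], [0.0, 8.0], [1.0, 1.0]])
print(atgp(pts, 2))  # rows: [10, 0] then [0, 8]
```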

15.3 Recursive Equations for RHBP-ATGP

In order for ATGP to be processed band by band, a key issue is to implement step
2 of ATGP progressively band by band. First, let
U_lp = [t_1 t_2 ⋯ t_p]
     = [ t_11      t_21      ⋯  t_(p−1)1      t_p1
         t_12      t_22      ⋯  t_(p−1)2      t_p2
          ⋮         ⋮        ⋱      ⋮           ⋮
         t_1(l−1)  t_2(l−1)  ⋯  t_(p−1)(l−1)  t_p(l−1)
         t_1l      t_2l      ⋯  t_(p−1)l      t_pl ]
     = [ U_(l−1)p
         t^T(l)  ],    (15.5)

where U_(l−1)p is the (l−1) × p matrix formed by the first l − 1 band values of the p targets and t(l) = (t_1l, t_2l, …, t_pl)^T is the p-dimensional vector collecting their lth band values. Then

(U_lp^T U_lp)^{−1} = ( [U_(l−1)p^T  t(l)] [ U_(l−1)p ; t^T(l) ] )^{−1} = ( U_(l−1)p^T U_(l−1)p + t(l) t^T(l) )^{−1}.    (15.6)

Applying Woodbury's identity in Appendix A,

(A + u v^T)^{−1} = A^{−1} − (A^{−1} u v^T A^{−1}) / (1 + v^T A^{−1} u),    (15.7)

to (15.6) yields

(U_lp^T U_lp)^{−1} = (U_(l−1)p^T U_(l−1)p)^{−1}
   − [ (U_(l−1)p^T U_(l−1)p)^{−1} t(l) t^T(l) (U_(l−1)p^T U_(l−1)p)^{−1} ] / [ 1 + t^T(l) (U_(l−1)p^T U_(l−1)p)^{−1} t(l) ].    (15.8)

Define the p-dimensional vector ρ_l(l−1) = (U_(l−1)p^T U_(l−1)p)^{−1} t(l). Substituting (15.8) into P^⊥_{U_lp} = I_{l×l} − U_lp U_lp^#, with U_lp^# = (U_lp^T U_lp)^{−1} U_lp^T, and carrying out the block multiplication gives

P^⊥_{U_lp} = [ P^⊥_{U_(l−1)p}              −U_(l−1)p ρ_l(l−1)
               −ρ_l(l−1)^T U_(l−1)p^T       1 − t^T(l) ρ_l(l−1) ]
   + [ 1/(1 + t^T(l) ρ_l(l−1)) ] ×
     [ U_(l−1)p ρ_l(l−1) ρ_l(l−1)^T U_(l−1)p^T     U_(l−1)p ρ_l(l−1) ρ_l(l−1)^T t(l)
       t^T(l) ρ_l(l−1) ρ_l(l−1)^T U_(l−1)p^T       t^T(l) ρ_l(l−1) ρ_l(l−1)^T t(l)  ],    (15.9)

which updates P^⊥_{U_lp} directly from P^⊥_{U_(l−1)p}. Now let r_l = (r_1, r_2, …, r_(l−1), r_l)^T = (r_(l−1)^T, r_l)^T. Expanding the quadratic form r_l^T P^⊥_{U_lp} r_l with the block form of (15.9) and collecting terms yields

r_l^T P^⊥_{U_lp} r_l = r_(l−1)^T P^⊥_{U_(l−1)p} r_(l−1) − 2 r_l ρ_l(l−1)^T U_(l−1)p^T r_(l−1) + (1 − t^T(l) ρ_l(l−1)) r_l^2
   + [ 1/(1 + t^T(l) ρ_l(l−1)) ] ( ρ_l(l−1)^T U_(l−1)p^T r_(l−1) + ρ_l(l−1)^T t(l) r_l )^2,    (15.10)

where ρ_l(l−1) = (U_(l−1)p^T U_(l−1)p)^{−1} t(l) is a p-dimensional vector.
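The block update (15.9) can be checked numerically. Collapsing its two-matrix sum algebraically, with u = U_(l−1)p ρ_l(l−1) and β = 1/(1 + t^T(l) ρ_l(l−1)), the blocks reduce to P^⊥ + βuu^T, −βu, and β; the sketch below grows the projector by one band this way and compares it with direct computation. This is a verification sketch, not the book's implementation, and names are illustrative.

```python
import numpy as np

def grow_projector(P_prev, U_prev, t_l):
    """One band-recursion step: given P_prev = P^perp for the (l-1)-band
    target matrix U_prev ((l-1) x p) and the p new band values t_l = t(l),
    return P^perp for the l-band target matrix [U_prev; t_l^T]."""
    rho = np.linalg.solve(U_prev.T @ U_prev, t_l)   # rho_{l(l-1)} in (15.9)
    beta = 1.0 / (1.0 + t_l @ rho)
    u = U_prev @ rho
    return np.block([[P_prev + beta * np.outer(u, u), -beta * u[:, None]],
                     [-beta * u[None, :], np.array([[beta]])]])

rng = np.random.default_rng(1)
U = rng.standard_normal((6, 3))                     # 6 bands, 3 targets
P5 = np.eye(5) - U[:5] @ np.linalg.pinv(U[:5])      # projector with 5 bands
P6 = grow_projector(P5, U[:5], U[5])                # recursive update to 6 bands
direct = np.eye(6) - U @ np.linalg.pinv(U)          # direct computation
print(np.allclose(P6, direct))  # -> True
```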

15.4 Algorithms for RHBP-ATGP

According to (15.9) and (15.10), the projector P^⊥_{U_(p−1)} in step 2 of ATGP can be updated from P^⊥_{U_(l−1)p} without recalculating P^⊥_{U_lp} from scratch via (15.9), and the quantity t_p^T P^⊥_{U_(l−1)p} t_p in step 3 can be updated through P^⊥_{U_(l−1)p} using the recursive equation (15.10). In addition, it should also be noted that r^T P^⊥_{U_(l−1)p} r in (15.10) is identical to (P^⊥_{U_(l−1)p} r)^T (P^⊥_{U_(l−1)p} r) owing to the fact that P^⊥_{U_(l−1)p} is symmetric, (P^⊥_{U_(l−1)p})^T = P^⊥_{U_(l−1)p}, and idempotent. That is, (P^⊥_{U_(l−1)p} r)^T (P^⊥_{U_(l−1)p} r) = r^T P^⊥_{U_(l−1)p} P^⊥_{U_(l−1)p} r = r^T P^⊥_{U_(l−1)p} r. By virtue of the two recursive equations, (15.9) and (15.10), RHBP-ATGP can be derived as follows.

Recursive Hyperspectral Band Processing of ATGP

Outer loop indexed by p from 1 to L
Inner loop indexed by l from 1 to L
1. Initial condition:
   Let l ≥ p. Find an initial target pixel vector t_p^(l) = arg max_{r_l} {r_l^T r_l}. Set U_p^(l) = [t_1^(l) t_2^(l) ⋯ t_p^(l)], and calculate P^⊥_{U_p^(l)}.
2. At the lth iteration, find t_p^(l) by maximizing r^T P^⊥_{U_lp} r via (15.9) over all data sample vectors {r_l(i)}_{i=1}^N. This can be done by the following steps:
   (a) Initial conditions: let max_l^p = (r_l(1))^T P^⊥_{U_lp} r_l(1) and t_p^(l) ← r_l(1). Set i = 2.
   (b) At the ith iteration, calculate (r_l(i))^T P^⊥_{U_lp} r_l(i).
   (c) If (r_l(i))^T P^⊥_{U_lp} r_l(i) > max_l^p, then set max_l^p = (r_l(i))^T P^⊥_{U_lp} r_l(i) and t_p^(l) ← r_l(i). Otherwise, continue.
   (d) If i < N, then let i ← i + 1 and go to step 2(b). Otherwise, the algorithm is terminated and t_p^(l) has been found.
3. Let l ← l + 1, and use (15.8) and (15.9) to update U_lp^# and P^⊥_{U_lp} from the previously calculated U_(l−1)p^# and P^⊥_{U_(l−1)p}.
End (inner loop)
Go to step 2 until l = L.
End (outer loop)

Fig. 15.1 Flowchart of RHBP-ATGP

Figure 15.1 depicts a flowchart of RHBP-ATGP.
From the previously designed RHBP-ATGP we can generate an L  L target
matrix, TM ¼ MRHBP-ATGP, denoted by
460 15 Recursive Hyperspectral Band Processing of Automatic Target Generation Process

(1 )
(1 ) t2 (L) (L)
t1 (1 ) t1 t2
tj
x
(l )
tp (L) (L)
s1 tj
s2 tp
(1 )
tp
sp

nB=1 nB=l nB=L


sL
y
n oL, L
Fig. 15.2 MRHBP-ATGP consisting of L2 target pixels, tpl , generated by RHBP-ATGP
p¼1, l¼1

2 ð1Þ ð2Þ ðL1Þ ðLÞ


3
t t1  t1 t1
6 1ð1Þ ð2Þ ðL1Þ ðLÞ 7
h i 6 t2 t2  t2 t2 7
6 7
TM ¼ MRHBP-ATGP ¼ tðplÞ ¼6 ⋮ ⋱ ⋱ ⋮ ⋮ 7; ð15:11Þ
LL 6 ð1Þ ðL1Þ ð LÞ 7
4 tL1 ⋮ ⋱ tL1 tL1 5
ð1Þ ð2Þ ðL1Þ ðLÞ
tL tL  tL tL

ðlÞ
with tpðlÞ being the pth target generated by RHBP-ATGP using the first l bands, as
shown in Fig. 15.2, where the x- and y-axes denote the number of the first l bands
processed by nl, used to perform RHBP-ATGP, and sp is the pth target generated by
RHBP-ATGP.
Since the notation nl will be repeatedly used in the remaining discussions in this
chapter, a note on nl is necessary for clarification. It is defined as the number of
processed bands starting from the first and second bands and continuing on until the
lth band. Thus, both l and nl have dual interpretations. When nl is used, it indicates
that there are l bands being used to process RHBP-ATGP starting from the first
band and ending with the lth band, and the total number of these processed bands is
l, that is, nl ¼ l. On the other hand, when the notation of l is used, it can be an index
to specify the lth band or simply an integer number, l. Which interpretation is
applicable can be seen from the context without ambiguity.
ðlÞ
In general, the order of the target tpðlÞ specified by the subscript of p(l ) should be
a function of nl, which is used to produce the target. However, for simplicity and to
ðlÞ ðlÞ
avoid confusion, tpðlÞ is simplified as tp in Fig. 15.2 without including l as a
variable of p.
Theoretically speaking, t_{p(L)}^(L) = t_p^(L) = t_p, which is the pth target generated by
ATGP using full bands, as it was originally designed in Ren and Chang
(2003). That is, ATGP generates a set of p target signal vectors {t_j^(L)} for j = 1, ..., p that
can be arranged as a target vector, T_V = V_ATGP,

$$
\mathbf{T}_V = \mathbf{V}_{\text{ATGP}} = \left(\mathbf{t}_1^{(L)}, \mathbf{t}_2^{(L)}, \ldots, \mathbf{t}_p^{(L)}\right), \quad (15.12)
$$
corresponding to the last, Lth column of M_RHBP-ATGP in Fig. 15.2. It is the matrix
M_RHBP-ATGP in (15.11) generated by RHBP-ATGP that distinguishes itself
from the target vector V_ATGP in (15.12) generated by ATGP.
This is because for each j ATGP generates only one single target, t_j^(L),
whereas RHBP-ATGP can generate at most L different targets,
{t_j^(l)} for l = 1, ..., L, that is, {t_j^(1), t_j^(2), ..., t_j^(L)}, where t_j^(i) may not be the
same as t_j^(k) if i ≠ k for each j. Also note that l ≥ p is required to avoid the matrix
singularity issue in performing (15.1). Thus, in practical terms, there are at
most (L − p)p targets to form M_RHBP-ATGP in (15.11).
Five concluding remarks are worth making.
1. RHBP-ATGP is quite different from RHBP-AD in Chap. 5 and RHBP-CEM in
Chap. 6 in how targets are extracted and found. In RHBP-AD there are no
specific targets in mind; targets are found by examining changes in the progressive
detection profiles produced by AD, whereas RHBP-CEM looks for a
specific target specified by a desired target signature d. RHBP-ATGP searches
for specific targets in an unsupervised manner where the targets of interest vary
with different sets of processed bands being used to find targets. Technically
speaking, RHBP-AD is a passive target detection algorithm that automatically
finds targets without target knowledge at all. By contrast, RHBP-ATGP is an
unsupervised active target detection algorithm that requires unsupervised target
knowledge to find targets of interest sequentially through a succession of OSPs
where the subsequent targets are obtained by finding maximal projections
orthogonal to the linear subspace spanned by previously found targets. This is
exactly the main reason that RHBP-ATGP can generate varying targets to
produce a target matrix in (15.11), as illustrated in Fig. 15.2.
2. In addition to the foregoing remark, another important note is that RHBP-ATGP
is also very different from RHBP-AD and RHBP-CEM in the sense that the
recursive equations used by RHBP-ATGP are derived from the target signature
matrix U in (15.1), while the recursive equations used by RHBP-AD and RHBP-
CEM are derived from the autocorrelation matrix R formed by data sample
vectors, not target signal sources used by RHBP-ATGP. In other words, the
recursive equation (15.10) derived for RHBP-ATGP iterates the undesired
signatures in U via (15.9), which has nothing to do with data sample vectors.
By contrast, the recursive equations derived for RHBP-AD and RHBP-CEM
iterate the sample band correlation matrix R, which accounts for dynamic
changes in sample spectral statistics provided by newly incoming bands. As a
result, the proposed approach to deriving RHBP-ATGP is completely different
from that used to derive RHBP-AD and RHBP-CEM. Most importantly, for
RHBP-AD and RHBP-CEM there are no targets of the kind shown in Fig. 15.2 to
form a target matrix as in (15.11); such a target matrix is a unique feature that
RHBP-AD and RHBP-CEM do not have.
3. Since RHBP-ATGP is designed to produce target signal sources in an
unsupervised manner, it does not perform target detection. Consequently, there
is no receiver operating characteristic (ROC) analysis involved to evaluate target
detection performance as with CEM and AD. Nevertheless, such RHBP-ATGP-
generated targets can be used for follow-up supervised target detection discussed
in Chap. 2, for example, CEM and OSP or linear spectral mixture analysis
(LSMA) to perform ROC analysis.
4. Note that RHBP-ATGP is quite different from recursive hyperspectral sample
processing of ATGP (RHSP-ATGP) in Chap. 7 and recursive hyperspectral
sample processing of OSP (RHSP-OSP) in Chap. 8. RHSP-ATGP/RHSP-OSP
is designed to generate target signal sources sequentially and recursively using
full L bands, while the value of p varies and is determined by RHSP-ATGP/
RHSP-OSP-based virtual dimensionality (VD). By contrast, RHBP-ATGP is
designed to generate a fixed number of targets, p, while the set of the first l bands, n_l, used
by RHBP-ATGP varies with the value of l. More specifically, RHSP-ATGP/
RHSP-OSP is implemented varying p and fixing the total number of bands, L,
compared to RHBP-ATGP, which varies the set of the first l bands, nl, while
fixing the number of targets to be generated, p. Both represent completely
different approaches.
5. Finally, several unsupervised hyperspectral target detection algorithms reported
in the literature are closely related to ATGP. Most notable are vertex component
analysis (VCA), simplex growing algorithm (SGA), and successive projection
algorithm (SPA) (Ma et al. 2014), where SPA is exactly the same as ATGP. In
particular, ATGP, VCA, and SGA can be considered an abundance-
unconstrained endmember finding algorithm, a partial abundance-constrained
[abundance nonnegativity constraint (ANC)] endmember finding algorithm, and
a fully abundance-constrained [abundance sum-to-one constraint (ASC) and
ANC] endmember finding algorithm, respectively. These three algorithms are
perfect candidates to be used for comparison. Interestingly, it was shown in Greg
(2010), Chang et al. (2013, 2016b), Chen (2014), and Chang (2016) that the
ideas of VCA and SGA are essentially the same as that used in ATGP and, thus,
can be treated as variants of ATGP.
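The succession of OSPs described in the first remark can be sketched in a few lines of NumPy. This is a minimal illustration of the idea only, not the book's implementation; the function name `atgp`, the choice of the brightest pixel as the initial target, and the (bands × samples) data layout are assumptions made here.

```python
import numpy as np

def atgp(X, p):
    """Automatic target generation process (ATGP) via successive OSPs.

    X : (L, N) array of N data sample vectors with L spectral bands.
    p : number of targets to generate (p <= L assumed).

    Returns the column indices of the p targets: each new target is the
    sample with the maximal residual after orthogonally projecting out
    the subspace spanned by the previously found targets.
    """
    L, _ = X.shape
    # Initial target: the data sample vector with the largest norm.
    idx = [int(np.argmax(np.sum(X * X, axis=0)))]
    P = np.eye(L)  # orthogonal-complement projector P_U^perp (U empty so far)
    for _ in range(1, p):
        u = X[:, idx[-1]]            # most recently found target signature
        Pu = P @ u
        # Rank-one downdate: P_[U u]^perp = P - (Pu)(Pu)^T / (u^T P u).
        P = P - np.outer(Pu, Pu) / (u @ Pu)
        R = P @ X                    # orthogonal-projection residuals of all samples
        idx.append(int(np.argmax(np.sum(R * R, axis=0))))
    return idx
```

Running this sketch on the first l bands of the data for each l = p, ..., L would fill the columns of the target matrix in (15.11); the recursive equations (15.9) and (15.10) exist precisely to avoid redoing this projector arithmetic from scratch for every l.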

15.5 Synthetic Image Experiments

The synthetic image simulated in Fig. 1.15 is shown in Fig. 15.3, with five panels in
each row simulated by the same mineral signature and five panels in each column
having the same size.

100%
50% signal + 50% any other four

A
50% signal + 50% background
B

C 25% signal + 75% background

Fig. 15.3 Set of 25 panels simulated by A, B, C, K, M

Fig. 15.4 3D plot of progressive detection maps of five signatures versus n_l for TI (annotated targets: t_1^(6) = K(1,1), t_2^(6) = A(1,1), t_27^(15) = M(1,1), t_50^(28) = B(1,1), t_141^(170) = C(1,1))

Among the 25 panels are five 4 × 4 pure pixel panels for each row in the first
column, five 2 × 2 pure pixel panels for each row in the second column, five
2 × 2 mixed pixel panels for each row in the third column, and five 1 × 1 subpixel
panels for each row in both the fourth and fifth columns, where the mixed and
subpanel pixels were simulated according to the legends in Fig. 15.3. Thus, a total
of 100 pure pixels (80 in the first column and 20 in the second column), referred to
as endmember pixels, were simulated in the data by the five endmembers A, B, C,
K, and M. The area marked "BKG" in the upper right corner of Fig. 1.14a was
selected to find its sample mean, that is, the average of all pixel vectors within the
area "BKG," denoted by b and plotted in Fig. 1.14b, to be used to simulate the
background (BKG) for the image scene with a size of 200 × 200 pixels in Fig. 15.3.
This b-simulated image background was further corrupted by additive noise to
achieve a signal-to-noise ratio (SNR) of 20:1, which was defined as 50% signature
(i.e., reflectance/radiance) divided by the standard deviation of the noise in
Harsanyi and Chang (1994).

Fig. 15.5 Progressive target detection maps by RHBP-ATGP using (a) n_l = 6, (b) n_l = 15, (c) n_l = 28, (d) n_l = 170, (e) n_l = 189 (full 189 bands)
Once target pixels and background are simulated, two types of target insertion,
referred to as target implantation (TI) and target embeddedness (TE), can be
designed to simulate experiments for various applications. Since five mineral
signatures plus a background signature b are used to simulate the synthetic images,
there are six target signatures present in TI and TE.
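As a quick illustration of the SNR definition quoted above, the simulated scene can be corrupted as follows. This is a sketch under assumptions, not the original simulation code: the function name, the per-band noise model, and the fixed seed are choices made here for reproducibility.

```python
import numpy as np

def add_noise_for_snr(scene, signature, snr=20.0, rng=None):
    """Corrupt a simulated scene with white Gaussian noise so that
    SNR = (50% signature) / (noise standard deviation), per band,
    following the Harsanyi-Chang definition.

    scene     : (rows, cols, L) simulated image.
    signature : (L,) reference signature, e.g., the background b.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    sigma = 0.5 * np.asarray(signature) / snr   # per-band noise std
    return scene + rng.standard_normal(scene.shape) * sigma
```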

15.5.1 TI Experiments

Figure 15.4 shows progressive detection profiles of these five mineral signatures, A,
B, C, K, and M, for TI, where the x-, y-, and z-axes denote extracted signatures, n_l,
and RHBP-ATGP-detected abundance fractions, respectively. As shown in
Fig. 15.4, the signatures s_1, s_2, s_3, s_4, and s_5 on the x-axis are the first five targets
found by RHBP-ATGP, and the five mineral signatures A, B, C, K, and M are
represented by the panel pixels A(1,1), K(1,1), M(1,1), B(1,1), and C(1,1)
corresponding to the first panel pixel of the five 4 × 4 panels in five rows, each of
which is located in the upper left corner in each row in the first column in Fig. 15.3.

Table 15.1 Summary of panel pixels identified by RHBP-ATGP versus n_l for TI

  n_l        Panel pixels found by RHBP-ATGP
  6          K = t_1^(6), A = t_2^(6)
  15         K = t_1^(15), M = t_2^(15)
  28         A = t_1^(28), K = t_2^(28), B = t_3^(28)
  170, 189   M = t_1^(170), A = t_2^(170), B = t_3^(170), K = t_4^(170), C = t_5^(170)

Table 15.2 Summary of minimal n_l to identify signatures and their orders extracted in their particular bands for TI

  Signature found    Minimal n_l        Order in entire RHBP-      Order within its
  by RHBP-ATGP       identifying it     ATGP target sequence       particular band
  A = t_2^(6)        6                  2                          2
  B = t_50^(28)      28                 50                         3
  C = t_141^(170)    170                141                        5
  K = t_1^(6)        6                  1                          1
  M = t_27^(15)      15                 27                         2
Figure 15.5 shows five RHBP-ATGP-extracted target maps using different
numbers of bands starting with the six bands in Fig. 15.5a progressively increasing
to 15 bands in Fig. 15.5b, 28 bands in Fig. 15.5c, 170 bands in Fig. 15.5d, and
finally reaching the full bands, 189 bands in Fig. 15.5e, each of which shows the
targets found in particular transition bands specified by Fig. 15.4.
More specifically, the bands noted below the figures in Fig. 15.5 were those that
actually extracted panel pixels, where the red crosses indicate ground truth pixels and
triangles highlight the targets found in the current band, with the numbers next to
the triangles indicating the order of the targets found. Also, the targets found in the
previous bands are marked by magenta.
Table 15.1 summarizes six target pixels extracted by RHBP-ATGP, with n_l
varying over 6 bands, 15 bands, 28 bands, 170 bands, and the full 189 bands, to extract
panel pixels corresponding to the five mineral signatures, A, B, C, K, and M. The panel
pixels t_j^(l) in the second column of the table were identified as the jth target among
the six target pixels found by RHBP-ATGP using the first l bands. The value of
j indicates the jth order of the target extracted by RHBP-ATGP when the particular
lth band was the new band added to process RHBP-ATGP.
According to Figs. 15.4 and 15.5 and Tables 15.1 and 15.2, all panel pixels found
by RHBP-ATGP correspond to the five mineral signatures using the minimal n_l.
The value of the subscript "j" in t_j^(l), listed in the third column, is the order in which
it appears in the entire sequence of RHBP-ATGP-generated target pixels. The order
of the five panel pixels found by RHBP-ATGP by their particular bands is listed in
the fourth column, with the total number of bands up to the particular band being
used to process RHBP-ATGP. For example, panel pixel B(1,1) was found by
RHBP-ATGP to be the third target (Fig. 15.5c), B = t_3^(28), when n_l = 28 was used.
But if we include the previous RHBP-ATGP-generated targets found using fewer
processed bands, n_l < 28, panel pixel B(1,1) is actually extracted as the
50th target by RHBP-ATGP, that is, B = t_50^(28). Similarly, C(1,1) was picked up as
the 5th target by RHBP-ATGP in Fig. 15.5d when the minimal n_l = 170, but it was
actually extracted as the 141st target by counting all previously generated RHBP-
ATGP targets.

Fig. 15.6 Six ATGP- and RHBP-ATGP-generated targets for TI. (a) Six ATGP-generated targets for TI; (b) orders of panel pixels extracted by RHBP-ATGP (x-axis: number of processed bands; y-axis: order of panel pixels extracted)
Figure 15.6a shows the first six distinct targets extracted by ATGP for TI
using all 189 bands, which are exactly the same as the first six target pixels found
in Fig. 15.5e. Also, from the results given by Fig. 15.6 and Table 15.2, Fig. 15.6b
shows how many bands are required to extract the panel pixels A(1,1), K(1,1),
M(1,1), B(1,1), and C(1,1) corresponding to the five mineral signatures A, B, C, K,
and M, where the x- and y-axes indicate n_l and their extracted order.

15.5.2 Discussion of Results

As noted, n_l is defined as the number of the first l bands used to process RHBP-
ATGP, where these l bands start with the first band and end with the lth band, and
the subscript "l" is used as an index to track the lth band. Thus, if two different
targets, t_j^(l) and t_k^(m), are produced by RHBP-ATGP using n_l and n_m bands, then these
targets will be ranked by n_l and n_m, respectively. If l < m, then t_j^(l) is more
significant than t_k^(m) because t_j^(l) can be extracted by RHBP-ATGP using a smaller
n_l than that used to extract t_k^(m). If both t_j^(l) and t_k^(m) are extracted with the same
n_l = n_m and j < k, then t_j^(l) will be ranked higher than t_k^(l) since t_j^(l) is extracted ahead
of t_k^(l). Therefore, the target sets, {t_p^(l)}, found by RHBP-ATGP can be ranked by their
significance according to two indices, the superscript "l" and the subscript "p." More
specifically, the value of the superscript "l" in t_p^(l) is first used to rank targets. If the
values of l are the same, the subscript "p" will then be used to prioritize their orders.
In other words, when two targets are found using the same n_l bands, their
significance should be ranked by their extracted order specified by the subscript "p."
For a given n_l bands being used, RHBP-ATGP is implemented to find p targets.
If two targets t_j^(l) and t_k^(l) with j < k are found by RHBP-ATGP using the same n_l
bands, it implies that t_j^(l) is extracted prior to t_k^(l). This indicates that t_j^(l) is more
significant than t_k^(l). Thus, for example, according to Table 15.1 the two mineral
signatures A and K were identified by A(1,1) = t_2^(6) and K(1,1) = t_1^(6), and both A
and K were extracted by RHBP-ATGP using the same n_l = 6 bands. Now if n_l is
increased from 6 to 15, then the two mineral signatures K and M were identified by
K(1,1) = t_1^(15) and M(1,1) = t_2^(15), where K once again was picked up by RHBP-
ATGP using the first 15 bands while M was found for the first time by RHBP-
ATGP, as the second target. In this case, the minimal numbers of bands for A, K, and
M were 6, 6, and 15, respectively, but their extracted orders in the entire RHBP-ATGP
target sequence were actually 2, 1, and 27, as tabulated in Table 15.2. Tables 15.1 and 15.2 show an
advantage of RHBP-ATGP over ATGP in that RHBP-ATGP keeps track of the
targets it detects and records these targets in its database.
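The two-index ranking rule described above amounts to a lexicographic sort on (l, p). A small illustration using the TI results of Table 15.1, with hypothetical tuples of the form (signature, minimal n_l, in-band order):

```python
# Each target tagged with (signature name, minimal n_l, order within that run),
# taken from Table 15.1 for TI.
targets = [("B", 28, 3), ("K", 6, 1), ("A", 6, 2), ("M", 15, 2), ("C", 170, 5)]

# Rank first by the superscript l (fewer bands = more significant),
# then by the extraction order p within the same band set.
ranked = sorted(targets, key=lambda t: (t[1], t[2]))
print([name for name, *_ in ranked])  # → ['K', 'A', 'M', 'B', 'C']
```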

15.5.3 TE Experiments

The same experiments conducted for TI were also performed for TE. Figure 15.7
shows the progressive detection profiles of these five mineral signatures, A, B, C, K,
and M, for TE, where the x-, y-, and z-axes denote extracted signatures, nl, and
RHBP-ATGP-detected abundance fractions, respectively.
Similar to Fig. 15.4, Fig. 15.7 also shows the five mineral signatures A, B, C, K, and
M extracted by the panel pixels A(1,1), K(1,1), M(1,1), B(1,1), and C(1,1)
corresponding to the first panel pixel of the five 4 × 4 panels in the five rows,
each of which is located at the upper left corner in each row in the first column in
Fig. 15.3.
Figure 15.8 further shows five RHBP-ATGP-extracted target maps using different
sets of bands, starting with n_l = 6 bands in Fig. 15.8a and progressively increasing to
15 bands in Fig. 15.8b, 22 bands in Fig. 15.8c, 141 bands in Fig. 15.8d, and
finally reaching the full 189 bands in Fig. 15.8e, where each detection map
shows the targets found in the particular transition bands specified by Fig. 15.7.
Specifically, the bands noted below the figures were those that actually extracted
panel pixels, where the red crosses indicate ground truth pixels and triangles
highlight the targets found in the current band, with the numbers next to the
triangles indicating the order of the targets found. The targets found in the previous
bands are also marked by magenta.

Fig. 15.7 3D plot of progressive detection maps of five signatures versus n_l for TE (annotated targets: t_1^(6) = K(1,1), t_2^(6) = A(1,1), t_24^(15) = M(1,1), t_35^(22) = B(1,1), t_108^(141) = C(1,1))

Fig. 15.8 Progressive target detection maps by RHBP-ATGP with different numbers of processed bands: (a) 6 bands, (b) 15 bands, (c) 22 bands, (d) 141 bands, (e) 189 bands (full bands)

Table 15.3 Summary of panel pixels identified by RHBP-ATGP versus n_l for TE

  n_l        Panel pixels found by RHBP-ATGP
  6          K = t_1^(6), A = t_2^(6)
  15         K = t_1^(15), M = t_2^(15)
  22         A = t_1^(22), K = t_2^(22), B = t_4^(22)
  141, 189   M = t_1^(141), A = t_2^(141), B = t_3^(141), K = t_4^(141), C = t_5^(141)

Table 15.4 Summary of minimal n_l to identify signatures and their orders extracted in their particular bands for TE

  Signature found    Minimal n_l        Order in entire RHBP-      Order within its
  by RHBP-ATGP       identifying it     ATGP target sequence       particular band
  A = t_2^(6)        6                  2                          2
  B = t_35^(22)      22                 35                         4
  C = t_108^(141)    141                108                        5
  K = t_1^(6)        6                  1                          1
  M = t_24^(15)      15                 24                         2
According to Fig. 15.8, Table 15.3 summarizes six target pixels extracted by
RHBP-ATGP when n_l varies, where the processed bands start from the first
6 bands, the first 15 bands, the first 22 bands, the first 141 bands, and the full
189 bands, to extract panel pixels corresponding to the five panel signatures. The panel
pixels t_j^(l) in the second column of the table are identified as the jth target among
these six target pixels, with the first l bands being used to process RHBP-ATGP.

According to Fig. 15.8 and Table 15.3, Table 15.4 tabulates all panel pixels found by
RHBP-ATGP corresponding to the five mineral signatures using the minimal n_l,
where the value of the subscript j in t_j^(l) listed in the third column is its order in the
entire sequence of RHBP-ATGP-generated target pixels. The order of the five panel
pixels found by RHBP-ATGP is listed in the fourth column, with the total number of
bands up to the particular band being used to process RHBP-ATGP. For example,
panel pixel B(1,1) was found to be the fourth target in Fig. 15.8c, B = t_4^(22), when
n_l = 22 was used. But if we include the previous RHBP-ATGP-generated targets
found using the processed bands, n_l < 22, panel pixel B(1,1) is actually extracted as
the 35th target by RHBP-ATGP, that is, B = t_35^(22). Similarly, C(1,1) was picked up
as the 5th target by RHBP-ATGP in Fig. 15.8d when the minimal n_l = 141, but it
was actually extracted as the 108th target, C = t_108^(141), by counting all previous
RHBP-ATGP-generated targets.

Figure 15.9a shows the first six distinct targets extracted by ATGP using all
189 bands from TE. According to the results in Fig. 15.7 and Table 15.4,
Fig. 15.9b shows how many bands are required to extract the panel pixels
corresponding to the five mineral signatures A, B, C, K, and M, where the x- and
y-axes indicate n_l used to process the data and the order of the panel pixels A(1,1),
K(1,1), M(1,1), B(1,1), and C(1,1).

Fig. 15.9 Six ATGP- and RHBP-ATGP-generated targets for TE. (a) Six ATGP-generated targets for TE; (b) orders of panel pixels extracted by RHBP-ATGP

15.6 Real Image Experiments

The image data to be studied here are from the HYperspectral Digital Imagery
Collection Experiment (HYDICE) image scene shown in Fig. 15.10a (and Fig. 1.
10a), which has a size of 64 × 64 pixel vectors with 15 panels in the scene and the
ground truth map in Fig. 15.10b (Fig. 1.10b). It was acquired by 210 spectral bands
with a spectral coverage from 0.4 to 2.5 μm. Low signal/high noise bands, bands
1–3 and 202–210, and water vapor absorption bands, bands 101–112 and 137–153,
were removed. Thus, a total of 169 bands were used in the experiments. The spatial
resolution and spectral resolution of this image scene are 1.56 m and 10 nm,
respectively.
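The band bookkeeping above can be checked directly; this throwaway sketch uses 1-based band indices as in the text:

```python
# Bands removed from the 210-band HYDICE cube (1-based indices).
removed = set(range(1, 4)) | set(range(202, 211))       # low signal / high noise
removed |= set(range(101, 113)) | set(range(137, 154))  # water vapor absorption

kept = [b for b in range(1, 211) if b not in removed]
print(len(kept))  # → 169, matching the number of bands used in the experiments
```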

Fig. 15.10 (a) HYDICE panel scene containing 15 panels. (b) Ground truth map of spatial locations of the 15 panels (row 1: p11, p12, p13; row 2: p211, p212, p221, p22, p23; row 3: p311, p312, p32, p33; row 4: p411, p412, p42, p43; row 5: p511, p521, p52, p53)

It has been demonstrated that panel pixel p212, marked by yellow in Fig. 15.10b,
is of particular interest. It is always extracted as the one with the most spectrally
distinct signature compared to the R panel pixels in row 2. This indicates that a
signature of spectral purity is not equivalent to a signature of spectral distinction. In
fact, in many cases panel pixel p212 instead of panel pixel p221 is the first panel pixel
in row 2 extracted by endmember finding algorithms to represent the panel signa-
ture in row 2. Also, because of such ambiguity the panel signature representing
panel pixels in the second row is either p221 or p212, which is always difficult for
endmember finding algorithms to find. This implies that the ground truth of R panel
pixels in the second row in Fig. 15.10b may not be as pure as was thought. Thus, it
has a total number of 20 panel pixels of interest, 19 R panel pixels plus a yellow
panel pixel, p212.
According to some research (Chang et al. 2010a, 2011b), the number of targets
required for ATGP to find the five ground truth panel signatures in Fig. 1.11 was
shown to be 18. Figure 15.11 plots 3D detection maps from the 5th to the 18th
signatures on the x-axis versus nl on the y-axis from 1 to 169 and detected
abundance fractions specified by the z-axis since the first panel pixel extracted by
RHBP-ATGP was the 5th target pixel. The reason we started with the fifth signature
and not the first one is simply because the fifth and sixth signatures, p221 and p521,
were found to be the first two panel pixels representing two of the five panel
signatures p2 and p5 when the first six spectral bands, that is, l ¼ 6, were used to
process RHBP-ATGP.
To further track the targets found by RHBP-ATGP in the transition bands shown
in Fig. 15.11, these targets are also shown in Fig. 15.12, where the bands below the
figures were those bands actually picking up panel pixels. In Fig. 15.12, the red
crosses indicate the 19 R panel pixels and triangles highlight the targets found in the
current band, with the numbers next to the triangles indicating the order of the
targets being extracted by RHBP-ATGP. In addition, the targets found in the
previous bands are also marked by magenta.
According to Fig. 15.12, with 7 transition bands, Table 15.5 summarizes 18 target
pixels extracted by RHBP-ATGP when n_l varies, with 18 bands, 26 bands,
27 bands, 46 bands, 48 bands, 97 bands, and all 169 bands, to extract panel pixels
corresponding to the five panel signatures, where the panel pixels t_j^(l) in the second
column of the table are identified as the jth target among these 18 target pixels, with
l bands being used to process RHBP-ATGP.

Fig. 15.11 3D detection map plot of 5th to 18th signatures versus n_l for HYDICE data (annotated targets: t_5^(18) = p521, t_37^(27) = p221, t_35^(26) = p212, t_55^(48) = p311, t_53^(46) = p312, t_56^(48) = p11, t_98^(97) = p411)
Comparing the results available in Chang (2013), two more panel pixels, p212
and p42, were found in Fig. 15.12f. These experiments demonstrated the advantages
of using RHBP-ATGP.
Furthermore, for comparative study, Fig. 15.13 also shows the 18 targets produced
by ATGP using all 169 bands, with the number of targets to be generated, p = 18,
determined by the work in Chang et al. (2010a, 2011b), where the red crosses
indicate the locations of the 19 R panel pixels and the open circles show the targets
found by ATGP.
If we compare Fig. 15.13 to Fig. 15.12f, g, the results using l = 97 and 169 could
extract five R panel pixels in five different rows, p11 = t_14^(97), p221 = t_16^(97),
p312 = t_5^(97), p411 = t_18^(97), and p521 = t_3^(97) using the first 97 bands, and
p11 = t_6^(169), p212 = t_17^(169), p312 = t_5^(169), p411 = t_18^(169), and p521 = t_3^(169)
using all 169 full bands. However, there is a discrepancy between the five panel
pixels generated by 97 bands and 169 bands in that the target pixel found by t_17^(169)
is the yellow panel pixel p212, which is different from the ground truth R panel pixel
p221 found using 97 bands. In addition, the extracted order of p11 was also different.
If we further examine the panel pixels found in Fig. 15.12 combined with Table 15.5,
RHBP-ATGP actually found a total of seven panel pixels with six R panel pixels,
p11, p221, p311, p312, p411, p521, and one yellow panel pixel, p212, compared to
only four R panel pixels, p11, p312, p411, p521, plus a yellow panel pixel, p212,
extracted in Fig. 15.13 by ATGP using all 169 bands. This implies that RHBP-ATGP
provides advantages over ATGP in allowing users to find two more R panel pixels,
p221 and p311.

Fig. 15.12 RHBP-ATGP progressive detection maps using various numbers of processed bands for HYDICE data: (a) 18 bands, (b) 26 bands, (c) 27 bands, (d) 46 bands, (e) 48 bands, (f) 97 bands, (g) 169 bands

Table 15.5 Summary of panel pixels identified by RHBP-ATGP versus n_l for HYDICE data

  n_l    Panel pixels found by RHBP-ATGP
  18     p521 = t_5^(18)
  26     p212 = t_12^(26), p521 = t_3^(26)
  27     p221 = t_5^(27), p521 = t_3^(27)
  46     p221 = t_6^(46), p312 = t_17^(46), p521 = t_2^(46)
  48     p11 = t_16^(48), p221 = t_6^(48), p311 = t_8^(48), p521 = t_2^(48)
  97     p11 = t_14^(97), p221 = t_16^(97), p312 = t_5^(97), p411 = t_18^(97), p521 = t_3^(97)
  169    p11 = t_6^(169), p212 = t_17^(169), p312 = t_5^(169), p412 = t_18^(169), p521 = t_3^(169)

Fig. 15.13 ATGP-generated target pixels for HYDICE data

Table 15.6 Summary of minimal n_l to identify signatures and their orders extracted in their particular bands for HYDICE data

  Signature found     Minimal n_l        Order in entire RHBP-      Order within its
  by RHBP-ATGP        identifying it     ATGP target sequence       particular band
  p11 = t_56^(48)     48                 56                         16
  p221 = t_37^(27)    27                 37                         5
  p212 = t_35^(26)    26                 35                         12
  p311 = t_55^(48)    48                 55                         8
  p312 = t_53^(46)    46                 53                         17
  p411 = t_98^(97)    97                 98                         18
  p521 = t_5^(18)     18                 5                          5
Table 15.6 summarizes the results in Fig. 15.12 by tabulating the minimal n_l that
first identified a signature and the order of this signature found in a particular band.
More specifically, all seven panel pixels found by RHBP-ATGP correspond to the
five panel signatures using the minimal n_l, where the value of the subscript j in t_j^(l)
given in the third column is its extracted order in the entire sequence of RHBP-
ATGP-generated target pixels. The order of the panel pixels found by
RHBP-ATGP is listed in the fourth column, with the total number of bands up to the
particular band being used to process RHBP-ATGP.

Fig. 15.14 Orders of panel pixels extracted by RHBP-ATGP for HYDICE data (x-axis: number of processed bands; y-axis: order of panel pixels extracted by RHBP-ATGP)

Fig. 15.15 Plot of number of signatures found by RHBP-ATGP versus n_l for HYDICE data (x-axis: number of processed bands; y-axis: number of distinct targets found by RHBP-ATGP)

According to Fig. 15.12 and Table 15.6, there are a total of seven panel pixels,
including six R panel pixels, p11, p221, p311, p312, p411, and p521, and one yellow
panel pixel, p212, identified by RHBP-ATGP. Figure 15.14 further shows the order of
these seven panel pixels found by RHBP-ATGP, indicated by the y-axis versus n_l
indicated by the x-axis. For example, when n_l = 18, only one panel pixel, p521,
was found. Furthermore, when n_l = 26 and 27, two more panel pixels, p212 and p221,
were found, and then three panel pixels, p312, p11, and p311, were found before n_l
reached 97. After 97 bands were processed, the last panel pixel, p411, was found.
Theoretically speaking, the target set found by RHBP-ATGP has the maximal
number of targets, p × L, through band-by-band progressive processing. That is,
for each different band, RHBP-ATGP can generate a different target. However,
practically speaking, the size of an RHBP-ATGP-generated target set is smaller
than p × L since some of the targets may be picked up by RHBP-ATGP more than
once. For example, for the HYDICE data set, p = 18 and p × L = 3042. In reality, the
number of distinct targets found by RHBP-ATGP was 98, which was much smaller
than 3042. In the meantime, the number of targets found by RHBP-ATGP tended to
stabilize as more bands were added to the progressive process, and the RHBP-
ATGP-generated targets stopped varying after 97 bands were processed
(Fig. 15.15).

Fig. 15.16 3D histogram of pixels picked up by RHBP-ATGP as targets
Furthermore, Fig. 15.16 also plots a 3D histogram to show how frequently a
particular pixel was picked up by RHBP-ATGP as a target, where the x-axis and y-
axis correspond respectively to the columns and rows of spatial locations of the
HYDICE scene.
It should be noted that the z-axis in Fig. 15.16 indicates the number of times a
given pixel at a particular spatial location was detected as a target by RHBP-ATGP
after all bands were processed. Thus, for the HYDICE scene with 169 bands, the
maximum value for the z-axis is 169. As shown in Fig. 15.16, the magenta upward
arrows indicate the spatial locations of the seven ground truth R panel pixels.
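The counts behind such a histogram can be accumulated in a few lines. This is a sketch only; the input layout, one list of (row, column) target locations per value of n_l, and the function name are assumptions made here.

```python
import numpy as np

def target_frequency(targets_per_band, rows=64, cols=64):
    """Count how many times each pixel is picked as a target over all
    processed-band counts n_l; freq[r, c] is the z-value of the histogram."""
    freq = np.zeros((rows, cols), dtype=int)
    for targets in targets_per_band:      # one list of (row, col) per n_l
        for r, c in targets:
            freq[r, c] += 1
    return freq
```

For the 169-band HYDICE scene the resulting counts are bounded by 169, the maximum of the z-axis in Fig. 15.16.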
Finally, according to the second comment at the end of Sect. 15.4, we emphasize
that RHBP-ATGP is an unsupervised target detection algorithm that is specially
designed to produce target signal sources in a completely unsupervised manner via
(15.1) to find maximal leakage residuals through a succession of OSPs. Its main
task is to provide a posteriori unsupervised target knowledge, not to perform target
detection. Thus, there is no need to use Neyman–Pearson detection theory to
conduct ROC analysis for RHBP-ATGP. In addition, we would also like to point
out one very important feature resulting from RHBP-ATGP. Since target detection
maps produced by hyperspectral target detection algorithms are actually real
valued, an appropriate threshold to segment out targets as desired targets is generally
required. Unfortunately, finding an appropriate threshold is very difficult
because it varies and is determined by the data sets to be used and different applications.
This is precisely the reason for, and a major advantage of, RHBP-ATGP over
other target detection algorithms: RHBP-ATGP provides progressive target
detection maps for visual inspection of all possible detected targets while selecting
targets with the highest detected abundance amounts as the desired targets.

15.7 Computing Time Analysis

To evaluate the computational efficiency of RHBP-ATGP in data processing time,
we calculate the computing time required by RHBP-ATGP using the recursive
equations (15.9) and (15.10) to update the results, and by PHBP-ATGP, which does
not use recursive equations but instead repeatedly implements (15.1) with U
augmented band by band every time the lth band arrives as a new incoming band.
PHBP-ATGP were run and executed in MATLAB R2012b on an Intel Core i7-3770
running at 3.40 GHz with 16 GB RAM on three data sets ten times to produce an
average computing time. Figure 15.17a–c plots computing times required for
RHBP-ATGP and PHBP-ATGP without using recursive equations (15.9) and
(15.10) to process TI, TE, and HYDICE data, respectively, where the y-axis is
the computer processing time in seconds and the x-axis is the lth band as a new
incoming band. As can be seen from these figures, the processing time required by
RHBP-ATGP and PHBP-ATGP to run each new individual band is nearly linear,
where RHBP-ATGP required less time than PHBP-ATGP. It is also worth noting
that the fluctuations in the plots result from numerical computation in the computer
implementation.
To conduct a fair comparative analysis, Fig. 15.18a–c further plots computing
time versus the number of processed bands, nl (i.e., the number of the first l bands
used for processing), required by RHBP-ATGP using recursive equations (15.9)
and (15.10) and by PHBP-ATGP without using the recursive equations, for TI, TE,
and HYDICE data, respectively. It should be noted that the plots of RHBP-ATGP
represent the cumulative computer processing time required by RHBP-ATGP,
obtained by summing up all the computing times in Fig. 15.17 up to the
lth band. As we can see, RHBP-ATGP ran faster than PHBP-ATGP, which was
implemented without using recursive equations (15.9) and (15.10). In the latter
case, PHBP-ATGP repeatedly used (15.1) to directly calculate the band-varying
P⊥_{U_{l×p}} when each new band, that is, the lth band, came in.
Table 15.7 lists the final computing times required by RHBP-ATGP and
PHBP-ATGP to complete the entire data sets, where RHBP-ATGP achieved a
saving of around 30 % compared to PHBP-ATGP. Moreover, we believe that this
saving would be even more pronounced once RHBP-ATGP is implemented in
hardware, owing to its recursive structure.
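To make the contrast concrete, the following toy operation-count model (an illustrative assumption of ours, not an analysis or measurement from this chapter) compares recomputing the pseudo-inverse from scratch at every band against a Woodbury rank-one update; the constants are rough multiply-add counts, and L = 169, p = 9 are sample values.

```python
# Toy cost model (illustrative assumptions, not the book's measurements):
# recomputing U^# from scratch at band l costs on the order of l*p^2 (forming
# U^T U) plus p^3 (inverting the p x p Gram matrix), whereas the recursive
# Woodbury update costs on the order of p^2 per band, independent of l.

def recompute_cost(l, p):
    return l * p * p + p ** 3      # direct recomputation at band l

def recursive_cost(p):
    return 3 * p * p               # rank-one update of a p x p inverse

L, p = 169, 9                      # HYDICE-like band count, sample p
total_recompute = sum(recompute_cost(l, p) for l in range(p, L + 1))
total_recursive = sum(recursive_cost(p) for _ in range(p, L + 1))
print(total_recompute, total_recursive)
```

Under this model the recursive route's cumulative cost grows only linearly in the number of bands, which is consistent with the nearly constant per-band processing times reported above.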


Fig. 15.17 Computing time versus lth band by RHBP-ATGP for (a) TI, (b) TE, (c) HYDICE

15.8 Graphical User Interface Design

A graphical user interface (GUI), with screenshot shown in Fig. 15.19, was devel-
oped to implement RHBP-ATGP using MATLAB for performance analysis. The
GUI allows users to load different data sources and choose different numbers of
targets for analysis. The three images displayed at the top of the window show a
color image of the scene, a grayscale image of the band currently being processed,
and the result of subpixel detection after the current band is processed. At the
bottom, a window shows changes in the detected abundance fraction values in real
time with the cyan upward arrow at the top highlighting the results of the band
currently being processed. Once the data are loaded, users can start processing by


Fig. 15.18 Computing time versus nl by ATGP and RHBP-ATGP for (a) TI, (b) TE, (c) HYDICE

Table 15.7 Comparison of computing time in seconds required by RHBP-ATGP and PHBP-ATGP

              TI (s)     TE (s)     HYDICE (s)
  ATGP        0.5794     0.5895     0.1854
  RHBP-ATGP   43.4537    43.4818    10.0628
  PHBP-ATGP   61.5242    60.0149    15.5124

clicking the start button. When each band is received at each iteration, RHBP-
ATGP processes the newly received band, and the current band and the resulting
detection image are updated to allow users to observe the changes in target
detection by RHBP-ATGP.

Fig. 15.19 GUI design for RHBP-ATGP

15.9 Conclusions

ATGP is an unsupervised hyperspectral target detection technique that has proven
very effective in many applications, such as the detection of subpixel targets,
endmembers, anomalies, and artificial targets. Its field-programmable gate array
(FPGA) implementations were also studied in Bernabe et al. (2011, 2013).
Interestingly, the idea of implementing ATGP according to BSQ band by band had
not been investigated previously. The development of ATGP has gone through
various designs: the original ATGP presented in Sect. 4.4.2.3, developed by Ren
and Chang (2003) and used for target detection; an endmember version of ATGP
for finding endmembers; and RHSP-ATGP in Chap. 7 for finding targets
recursively, where the number of targets can be determined by the virtual
dimensionality (VD) developed in Chap. 4.
This chapter presented another new design of ATGP, recursive hyperspectral band
processing (RHBP) of ATGP, RHBP-ATGP, which allows image analysts to
produce progressive band-varying target detection maps so that a target matrix
can be generated to dictate target detection variability during BSQ. Eventually,
RHBP-ATGP produces a set of L² targets, which provides more target information
band to band compared to the L targets generated by ATGP using the entire set of
L bands. These additional progressive target detection profiles provide crucial
information needed to detect weak targets that may be dominated by
ATGP-generated targets using full bands.
Chapter 16
Recursive Hyperspectral Band Processing
of Orthogonal Subspace Projection

Abstract Progressive hyperspectral band processing (PHBP) processes data band
by band, without waiting for the data to be completely collected, according to the
band-sequential (BSQ) format acquired by a hyperspectral imaging sensor. It provides
progressive band-varying profiles for data processing. In order for PHBP to be
implemented in real time, a recursive version of PHBP, called recursive
hyperspectral band processing (RHBP), can also be derived by a concept similar
to that of a Kalman filter. With such a Kalman filter-like RHBP, a real-time
capability can be realized. This is particularly important for satellite communica-
tion when data download is limited by bandwidth and transmission capacity. In
Chap. 15 a commonly used subpixel target detection algorithm, the automatic target
generation process (ATGP), presented in Sect. 4.4.2.3, was extended to an RHBP
version. This chapter follows a similar treatment
to extend another widely used mixed pixel classification technique, orthogonal
subspace projection (OSP) developed by Harsanyi and Chang (IEEE Trans Geosci
Remote Sens 32(4):779–785, 1994), to an RHBP version that can process OSPs
band by band progressively to produce progressive band-varying profiles for
various targets of interest. Unlike RHBP-anomaly detection (Chap. 14), which
requires no target knowledge at all, RHBP-constrained energy minimization
(Chap. 13), which requires only the desired target information as partial knowledge,
and RHBP-ATGP (Chap. 15), which generates a set of targets of interest without
prior knowledge, RHBP-OSP actually makes use of complete target knowledge
provided a priori by known information or by unsupervised target finding algo-
rithms such as ATGP a posteriori to produce a progressive band-varying profile of
the abundance fraction of each target signature of interest for further band-by-band
mixed pixel classification. Such progressive band-varying profiles offered by
RHBP-OSP cannot be provided by any other OSP-like operator. Finally and most
importantly, RHBP-OSP can locate and identify bands that are significant for data
processing in a progressive manner, and the results can be updated only by
innovation information generated by recursive equations. As a consequence, no
accumulated computer processing time is required by RHBP to process all bands,
and the time of processing each new incoming band is nearly constant.

© Springer International Publishing Switzerland 2017
C.-I Chang, Real-Time Recursive Hyperspectral Sample and Band Processing,
DOI 10.1007/978-3-319-45171-8_16

16.1 Introduction

Orthogonal subspace projection (OSP) was originally developed by Harsanyi and
Chang (1994) to perform mixed pixel classification via dimensionality reduction
(DR). It was later shown that OSP could be considered an abundance-unconstrained
linear spectral unmixing (LSU) technique (Tu et al. 1997; Chang et al. 1998). The
idea of OSP is to decompose its operating process into two stages, an undesired
signature annihilation process via an orthogonal complement subspace projection
in the first stage followed by a matched filter specified by a desired signature in the
second stage. It is very effective as long as undesired signatures are well selected in
the sense of data represented by a linear mixing model (LMM).
Since OSP is operated on a single pixel vector basis, it can be implemented in
real time provided that the signatures used to form an LMM are known a priori.
From the viewpoint of a band-interleaved-by-pixel/sample/line (BIP/BIS/BIL)
acquisition format (Schowengerdt 1997), OSP meets this requirement. However, there is
another acquisition format, band-sequential (BSQ) (Schowengerdt 1997), which is
particularly useful in data transmission and communication where pixel informa-
tion is acquired band by band rather than pixel by pixel with full bands. Interest-
ingly, the concept of using BSQ format to implement OSP has not been investigated
and explored.
The concept of BSQ can be well explained by one of the simplest and most
effective coding techniques, called bit plane coding, which encodes an image bit by
bit from the most significant bit to the least significant bit with image quality
gradually improving in a progressive manner until all bits are fully used for
encoding (Gonzalez and Woods 2007). Such image processing is generally referred
to as progressive image processing. Inspired by such a progressive process, we can
interpret the number of bits as the number of bands where the use of a particular
band is encoded by “1” and “0” otherwise. As a result, BSQ format can be
interpreted as operating on an L-band hyperspectral image band by band in the
same way that bit plane coding encodes an L-bit image bit by bit. Consequently, all
the advantages resulting from bit plane coding can also be applied to BSQ, such that
BSQ can be implemented progressively band by band. Inspired by this striking
similarity, a PHBP version for OSP using the BSQ format can also be derived for
progressive mixed pixel classification in the same way that bit plane coding works
for images. In particular, if band prioritization is done prior to PHBP, OSP can
perform progressive band selection from the most significant band to the least
significant band as bit plane coding does from the most significant bit to the least
significant bit.
By taking advantage of PHBP, a profile of progressive band-varying OSP–mixed
pixel classification for each pixel can be created for data analysis. Such a profile
allows users to see progressive changes in mixed pixel classification by OSP in each
band, which cannot be provided by OSP using full band information as a one-shot
operation. Intuitively, we can view PHBP-OSP as a slow-motion version of OSP,
which shows slow varying mixed pixel classification changing from band to band.

Second, the progressive profiles in interband changes enable us to find bands that
are significant and crucial to mixed pixel classification without using band
selection. Finally and most importantly, PHBP-OSP provides missing details of
spectral band-to-band transition information that can be used to analyze interband
correlation in mixed pixel classification.
As PHBP-OSP is implemented, the computer processing time increases and
accumulates as new bands are added and processed. Such computing time grows
rapidly with the number of processed bands and becomes a major hurdle for
real-time processing. To resolve this dilemma, a Kalman filtering approach is used
to derive a recursive version of PHBP-OSP, called recursive hyperspectral band
processing of OSP (RHBP-OSP). The idea behind it is to update results only by new
information, referred to as innovation information, which is provided only by a new
incoming band but not in the processed information obtained from previous bands.
Such innovation information updating is done by recursive equations, and the time
of processing each new band by RHBP-OSP is nearly constant. As a result, RHBP-
OSP can be implemented not only progressively, as bit plane coding is, but also
recursively via recursive equations, as a Kalman filter is. This benefit is very
significant when it comes to hyperspectral data communication and transmission as
the hyperspectral data must be processed on a timely basis.

16.2 Orthogonal Subspace Projection

Let L be the number of spectral bands and r be an L-dimensional data sample
vector. Assume that there are p material substance signatures, m_1, m_2, ..., m_p,
and that d is the designated signature selected from the p signatures m_1, m_2, ...,
m_p. Let d = m_p without loss of generality, and let U = [m_1 m_2 ⋯ m_{p−1}] be the
undesired substance spectral signature matrix made up of m_1, m_2, ..., m_{p−1}.
We can rewrite a linear mixing model as

    r = Mα + n = dα_p + Uγ + n.    (16.1)

Then OSP uses the signal-to-noise ratio (SNR) as a criterion for optimality to
derive

    α̂_p^OSP(r) = d^T P⊥_U r,    (16.2)

where

    P⊥_U = I − UU^# = I − U(U^T U)^{−1} U^T    (16.3)

and U^# = (U^T U)^{−1} U^T is the pseudo-inverse of U. It was shown in Tu et al.
(1997) and Chang et al. (1998) that OSP can also be derived as a least-squares OSP
(LSOSP) for α_p, denoted by α̂_p^LSOSP(r), as

    α̂_p^LSOSP(r) = (d^T P⊥_U d)^{−1} α̂_p^OSP(r),    (16.4)

which turns out to be exactly the same as the pth abundance fraction of α_p estimated
by α̂^LS(r), as specified by

    α̂^LS = (M^T M)^{−1} M^T r,    (16.5)

where α̂^LS = (α̂_1^LS, ..., α̂_p^LS)^T. Interestingly, it was shown in Chang (2007) that
α̂_j^LS(r) = α̂_j^LSOSP(r) for all 1 ≤ j ≤ p. In other words, α̂_p^LS(r) and α̂_p^LSOSP(r) can
be simply obtained from α̂_p^OSP(r) with the scale constant (d^T P⊥_U d)^{−1}. Nonetheless,
α̂_p^OSP(r) has an advantage over α̂^LS(r) in that α̂_p^OSP(r) can be used to generate new
signal sources in an unsupervised manner, a task with which the LS approach
specified by (16.5) has difficulty because it requires prior knowledge of
all signatures for abundance estimation. One such algorithm is the automatic target
generation process (ATGP) developed by Ren and Chang (2003), which will be
used in our experiments to generate unsupervised target signal sources to form M in
(16.1) for OSP.
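For readers who want to see (16.2)–(16.4) in action, here is a minimal pure-Python sketch for the special case of a single undesired signature u, where U^# collapses to u^T/(u^T u); the three-band signatures below are made-up values for illustration, not data from this book.

```python
# Minimal sketch of OSP/LSOSP with one undesired signature u, so that
# P_U^perp r is just r minus its projection onto u. Illustrative only.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def osp(d, u, r):
    """alpha_OSP = d^T P_U^perp r with U = [u] (Eq. 16.2)."""
    c = dot(u, r) / dot(u, u)                        # projection coefficient on u
    resid = [ri - c * ui for ri, ui in zip(r, u)]    # P_U^perp r
    return dot(d, resid)

def lsosp(d, u, r):
    """alpha_LSOSP = (d^T P_U^perp d)^(-1) * alpha_OSP (Eq. 16.4)."""
    cd = dot(u, d) / dot(u, u)
    pd = [di - cd * ui for di, ui in zip(d, u)]      # P_U^perp d
    return osp(d, u, r) / dot(d, pd)

d = [1.0, 0.0, 1.0]                                  # desired signature (made up)
u = [0.0, 1.0, 0.0]                                  # undesired signature (made up)
r = [0.5 * di + 0.3 * ui for di, ui in zip(d, u)]    # r = 0.5 d + 0.3 u, no noise
print(lsosp(d, u, r))   # recovers the abundance 0.5 of d
```

Note the difference the scale constant makes: osp returns a detector value proportional to the abundance, while lsosp returns the abundance estimate itself.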

16.3 Recursive Equations for RHBP-OSP

Examination of (16.2) and (16.3) reveals that a key to developing RHBP-OSP is to
derive a recursive equation that can update P⊥_{U_{l×p}} from P⊥_{U_{(l−1)×p}} and a
new incoming band, the lth spectral band B_l. Let

    U_{l×p} = [m_1 m_2 ⋯ m_p]

            = [ m_{11}       m_{21}       ⋯  m_{(p−1)1}       m_{p1}
                m_{12}       m_{22}       ⋯  m_{(p−1)2}       m_{p2}
                ⋮            ⋮               ⋮                ⋮
                m_{1(l−1)}   m_{2(l−1)}   ⋯  m_{(p−1)(l−1)}   m_{p(l−1)}
                m_{1l}       m_{2l}       ⋯  m_{(p−1)l}       m_{pl}     ]

            = [ U_{(l−1)×p}
                m^T(l)      ],                                            (16.6)

    U_{l×p}^T U_{l×p} = [ U_{(l−1)×p}^T  m(l) ] [ U_{(l−1)×p}
                                                  m^T(l)      ]
                      = U_{(l−1)×p}^T U_{(l−1)×p} + m(l) m^T(l),          (16.7)

where U_{(l−1)×p} is the (l−1)×p matrix formed by the first l−1 rows of U_{l×p} and
m(l) = (m_{1l}, m_{2l}, ..., m_{pl})^T is a p-dimensional vector collecting the values of
the p signatures in the lth band. Then, applying Woodbury's identity in Appendix A,

    (A + uv^T)^{−1} = A^{−1} − (A^{−1}u)(v^T A^{−1}) / (1 + v^T A^{−1}u),  (16.8)

to (16.7), we can obtain

    [U_{l×p}^T U_{l×p}]^{−1}
      = (U_{(l−1)×p}^T U_{(l−1)×p} + m(l) m^T(l))^{−1}
      = (U_{(l−1)×p}^T U_{(l−1)×p})^{−1}
        − [(U_{(l−1)×p}^T U_{(l−1)×p})^{−1} m(l)] [m^T(l) (U_{(l−1)×p}^T U_{(l−1)×p})^{−1}]
          / [1 + m^T(l) (U_{(l−1)×p}^T U_{(l−1)×p})^{−1} m(l)]            (16.9)

and, with the p-dimensional vector ρ_{l|(l−1)} = (U_{(l−1)×p}^T U_{(l−1)×p})^{−1} m(l),

    U_{l×p}^# = [U_{l×p}^T U_{l×p}]^{−1} U_{l×p}^T
      = [ (U_{(l−1)×p}^T U_{(l−1)×p})^{−1} U_{(l−1)×p}^T   ρ_{l|(l−1)} ]
        − (1 + m^T(l) ρ_{l|(l−1)})^{−1}
          [ ρ_{l|(l−1)} ρ_{l|(l−1)}^T U_{(l−1)×p}^T   ρ_{l|(l−1)} ρ_{l|(l−1)}^T m(l) ].   (16.10)

Consequently,

    P⊥_{U_{l×p}} = I_{l×l} − U_{l×p} U_{l×p}^#
      = [ P⊥_{U_{(l−1)×p}}                    −U_{(l−1)×p} ρ_{l|(l−1)}
          −ρ_{l|(l−1)}^T U_{(l−1)×p}^T         1 − m^T(l) ρ_{l|(l−1)}   ]
        + (1 + m^T(l) ρ_{l|(l−1)})^{−1}
          [ U_{(l−1)×p} ρ_{l|(l−1)} ρ_{l|(l−1)}^T U_{(l−1)×p}^T    U_{(l−1)×p} ρ_{l|(l−1)} ρ_{l|(l−1)}^T m(l)
            m^T(l) ρ_{l|(l−1)} ρ_{l|(l−1)}^T U_{(l−1)×p}^T         m^T(l) ρ_{l|(l−1)} ρ_{l|(l−1)}^T m(l)   ],   (16.11)

and, writing d_l = (d_{l−1}^T, d_l)^T and r_l = (r_{l−1}^T, r_l)^T, where the boldface-style
d_l and r_l denote the l-dimensional vectors of the first l components of d and r while
d_l and r_l alone denote their lth components,

    α̂_{d_l}^{RHBP-OSP}(r_l) = d_l^T P⊥_{U_{l×p}} r_l
      = α̂_{d_{l−1}}^{RHBP-OSP}(r_{l−1})
        − d_{l−1}^T U_{(l−1)×p} ρ_{l|(l−1)} r_l
        − d_l ρ_{l|(l−1)}^T U_{(l−1)×p}^T r_{l−1}
        + d_l (1 − m^T(l) ρ_{l|(l−1)}) r_l
        + (1 + m^T(l) ρ_{l|(l−1)})^{−1}
          (d_{l−1}^T U_{(l−1)×p} ρ_{l|(l−1)} + d_l m^T(l) ρ_{l|(l−1)})
          (ρ_{l|(l−1)}^T U_{(l−1)×p}^T r_{l−1} + ρ_{l|(l−1)}^T m(l) r_l),   (16.12)

where α̂_{d_{l−1}}^{RHBP-OSP}(r_{l−1}) = d_{l−1}^T P⊥_{U_{(l−1)×p}} r_{l−1}. According
to (16.12), three pieces of information are used to update α̂_{d_l}^{RHBP-OSP}(r_l),
where d_l denotes the signatures used to unmix r_l:
1. Previously unmixed abundance fraction of r_{l−1} by d_{l−1} via P⊥_{U_{(l−1)×p}}:
   α̂_{d_{l−1}}^{RHBP-OSP}(r_{l−1}) = d_{l−1}^T P⊥_{U_{(l−1)×p}} r_{l−1}, using the first
   l−1 bands.
2. New information: the lth band, m(l) = (m_{1l}, m_{2l}, ..., m_{pl})^T, d_l, and r_l.
3. Innovation information, which is correlation information provided by U_{(l−1)×p}
   and m(l): ρ_{l|(l−1)} = (U_{(l−1)×p}^T U_{(l−1)×p})^{−1} m(l).
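As a sanity check on the rank-one inverse update in (16.8) and (16.9), the following pure-Python fragment verifies it numerically for p = 2; the matrix U and new-band vector m are arbitrary illustrative numbers, not data from this book.

```python
# Verify (G + m m^T)^{-1} = G^{-1} - rho rho^T / (1 + m^T rho), rho = G^{-1} m,
# for G = U^T U (symmetric, so m^T G^{-1} = rho^T). Pure Python, p = 2.

def inv2(g):
    (a, b), (c, d) = g
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(g, v):
    return [g[0][0] * v[0] + g[0][1] * v[1], g[1][0] * v[0] + g[1][1] * v[1]]

U = [[1.0, 0.2], [0.3, 1.0], [0.5, 0.4]]           # three processed bands, p = 2
G = [[sum(U[k][i] * U[k][j] for k in range(3)) for j in range(2)] for i in range(2)]
m = [0.6, 0.1]                                      # new incoming band's values

# direct route: invert the augmented Gram matrix G + m m^T
direct = inv2([[G[i][j] + m[i] * m[j] for j in range(2)] for i in range(2)])

# recursive route: Woodbury rank-one update of inv(G)
rho = matvec(inv2(G), m)                            # innovation vector rho_{l|l-1}
denom = 1.0 + m[0] * rho[0] + m[1] * rho[1]
woodbury = [[inv2(G)[i][j] - rho[i] * rho[j] / denom for j in range(2)]
            for i in range(2)]

max_err = max(abs(direct[i][j] - woodbury[i][j]) for i in range(2) for j in range(2))
print(max_err < 1e-12)   # the two inverses agree to machine precision
```

The same identity applied row by row is what lets (16.11) and (16.12) avoid ever re-inverting the Gram matrix from scratch.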

16.4 Recursive Hyperspectral Band Processing of OSP

By virtue of (16.11), P⊥_U = P⊥_{U_{l×p}} used in (16.3) can be updated from
P⊥_{U_{(l−1)×p}} without recalculating P⊥_{U_{l×p}}, and d_l^T P⊥_{U_{l×p}} r_l can
also be updated by (16.12). Thus, using (16.11) and (16.12), a recursive
hyperspectral band processing of OSP (RHBP-OSP) can be described as follows.

Recursive Hyperspectral Band Processing of OSP (RHBP-OSP)

1. Initial condition: input the first band information, U_{1×p} = [m_{11} m_{21} ⋯ m_{p1}],
   and calculate P⊥_{U_{1×p}}. Input d_1 = d_1 and r_1 = r_1. Let l = 1.
2. At the lth iteration, input the lth band and calculate d_l^T P⊥_{U_{l×p}} r_l via (16.11).
3. Use (16.10) and (16.11) to update U^#_{l×p} and P⊥_{U_{l×p}}, respectively, from the
   previously calculated U^#_{(l−1)×p} and P⊥_{U_{(l−1)×p}}.
4. Using (16.12), update α̂_{d_l}^{RHBP-OSP}(r_l) from α̂_{d_{l−1}}^{RHBP-OSP}(r_{l−1}).
5. Let l ← l + 1 and go to step 2 until l = L.
Following RHBP-OSP, we can derive RHBP-LSOSP for α̂_{d_l}^{RHBP-LSOSP}(r_l) in
(16.4) by implementing α̂_{d_l}^{RHBP-OSP}(r_l) recursively with the scale constant
(d_l^T P⊥_{U_{l×p}} d_l)^{−1}, which can also be obtained at the same time that
α̂_{d_l}^{RHBP-OSP}(r_l) is generated, by simply replacing r_l and r_{l−1} with d_l and
d_{l−1} in (16.12). Figure 16.1 depicts a flowchart of RHBP-OSP.
As a concluding remark, we would also like to point out that what Recursive
Hyperspectral Sample Processing of OSP (RHSP-OSP) in Chap. 8 is to RHBP-
OSP developed in this chapter is exactly what Recursive Hyperspectral Sample
Processing of ATGP (RHSP-ATGP) in Chap. 7 is to Recursive Hyperspectral Band
Processing of ATGP (RHBP-ATGP) in Chap. 15. That is, RHSP-OSP is designed to

Fig. 16.1 Flowchart of RHBP-OSP

find an appropriate set of signatures for OSP recursively, signature by signature,
using all full bands, whereas RHBP-OSP is designed to process OSP recursively
and progressively by varying the value of l, using the first l bands, denoted by nl,
band by band while fixing the number of signatures p at a constant value. In other
words, RHSP-OSP varies the value of p while fixing l at L, as opposed to
RHBP-OSP, which varies l but fixes the value of p.
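To make the band-by-band loop of Sect. 16.4 concrete, here is a hedged pure-Python sketch with p fixed at 2; every signature value below is invented for illustration. Rather than carrying the growing l × l matrix P⊥, it carries the p × p inverse (U^T U)^{−1} updated by the Woodbury step and evaluates d_l^T P⊥_{U_{l×p}} r_l in the algebraically equivalent form d^T r − (U^T d)^T (U^T U)^{−1} (U^T r).

```python
# Band-by-band RHBP-OSP sketch (pure Python, p = 2, made-up values): keep
# (U^T U)^{-1} current with the rank-one Woodbury step as each band arrives,
# and emit the band-varying detector value alpha_l at every step.

def inv2(g):
    (a, b), (c, d) = g
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(g, v):
    return [g[0][0] * v[0] + g[0][1] * v[1], g[1][0] * v[0] + g[1][1] * v[1]]

# per-band data: (m(l) = the two undesired signatures' values, d_l, r_l)
bands = [([1.0, 0.2], 0.9, 1.1), ([0.3, 1.0], 0.4, 0.7),
         ([0.5, 0.4], 1.0, 0.9), ([0.6, 0.1], 0.2, 0.5)]

# initialize once U^T U is invertible (here after the first p = 2 bands)
U0 = [bands[0][0], bands[1][0]]
G = [[sum(U0[k][i] * U0[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
Ginv = inv2(G)
a = [sum(bands[k][1] * bands[k][0][i] for k in range(2)) for i in range(2)]  # U^T d
b = [sum(bands[k][2] * bands[k][0][i] for k in range(2)) for i in range(2)]  # U^T r
s = sum(bands[k][1] * bands[k][2] for k in range(2))                          # d^T r

alphas = []
for m, dl, rl in bands[2:]:                      # each new incoming band
    rho = matvec(Ginv, m)                        # innovation term rho_{l|l-1}
    denom = 1.0 + m[0] * rho[0] + m[1] * rho[1]
    Ginv = [[Ginv[i][j] - rho[i] * rho[j] / denom for j in range(2)]
            for i in range(2)]                   # Woodbury update of (U^T U)^{-1}
    a = [a[i] + dl * m[i] for i in range(2)]     # grow U^T d by one band
    b = [b[i] + rl * m[i] for i in range(2)]     # grow U^T r by one band
    s += dl * rl                                 # grow d^T r by one band
    alphas.append(s - sum(a[i] * matvec(Ginv, b)[i] for i in range(2)))

print(alphas)   # band-varying detector values alpha_l
```

This mirrors steps 2–5 of the algorithm: only the p × p inverse and three small accumulators are updated per band, never the full l × l projector.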

16.5 Synthetic Image Experiments

The synthetic image simulated in Fig. 1.15 is shown in Fig. 16.2, with five panels in
each row simulated by the same mineral signature and five panels in each column
having the same size. Among the 25 panels are five 4 × 4 pure-pixel panels for each
row in the first column, five 2 × 2 pure-pixel panels for each row in the second
column, five 2 × 2 mixed-pixel panels for each row in the third column, and five 1 × 1 subpixel

(Legend of Fig. 16.2: 100 % signal; 50 % signal + 50 % any other four; 50 % signal
+ 50 % background; 25 % signal + 75 % background.)
Fig. 16.2 Set of 25 panels simulated by A, B, C, K, M

panels for each row in both the fourth and fifth columns, where the mixed and
subpanel pixels were simulated according to the legends in Fig. 16.2. Thus, a total
of 100 pure pixels (80 in the first column and 20 in the second column), referred to as
endmember pixels, were simulated in the data by the five endmembers A, B, C, K,
and M. An area marked “BKG” in the upper right corner of Fig. 1.14a was selected to
find its sample mean, that is, the average of all pixel vectors within the “BKG” area,
denoted by b and also plotted in Fig. 1.14b, to be used for simulating the background
(BKG) for the image scene with a size of 200 × 200 pixels in Fig. 16.2. This
b-simulated image background was further corrupted by additive noise to achieve
an SNR of 20:1, which was defined in Harsanyi and Chang (1994) as a 50 % signature
(i.e., reflectance/radiance) divided by the standard deviation of the noise.
Once target pixels and background are simulated, two types of target insertion,
referred to as target implantation (TI) and target embeddedness (TE), can be
designed to simulate experiments for various applications. Since five mineral
signatures plus a background signature b are used to simulate the synthetic images,
there are six target signatures present in TI and TE.
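As a small worked example of the SNR convention quoted above (our reading of the 20:1 definition; the signature amplitude used here is hypothetical), the noise standard deviation follows directly:

```python
# SNR = (0.5 * signature amplitude) / sigma  =>  sigma = 0.5 * s / SNR,
# per the 50 % signature definition of Harsanyi and Chang (1994) cited above.

def noise_std(signature_value, snr):
    """Noise standard deviation that yields the requested SNR."""
    return 0.5 * signature_value / snr

print(noise_std(1.0, 20.0))   # unit-amplitude signature at SNR 20:1 -> 0.025
```

So, under this convention, a 20:1 SNR on a unit-amplitude signature corresponds to additive noise with standard deviation 0.025.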

16.5.1 TI Experiments

Let Ai, Bi, Ci, Ki, Mi for 1 ≤ i ≤ 4 denote the four panel pixels in the first row of
each of the five 4 × 4 pure panels in the first column in Fig. 16.2. In addition, let A5,
B5, C5, K5, and M5 also denote the first 2 × 2 panel pixel in the upper left corner of
the five panels in the second column in Fig. 16.2, respectively. Then Fig. 16.3
shows a 3D RHBP-OSP detection map of 25 pure panel pixels with each set of five
panel pixels Ai, Bi, Ci, Ki, Mi for 1 ≤ i ≤ 5 representing each of five mineral
signatures, A, B, C, K, M.

Fig. 16.3 3D detection plot of 25 panel pixels by PHBP-OSP versus nl for TI

To make the discussion clear, we will use the notation nl defined in Chap. 15, where
nl is defined as the number of the first l bands being used to process OSP, starting
from the first band and ending with the lth band. In other words, nl has a dual interpretation.
It not only provides information about how many bands are to be used for
processing OSP but also specifies the processed bands starting from the first band
and ending with the lth band. Similarly, when the notation of l is used, it can either
be used as an index to specify the lth band or simply be an integer number, l. Which
interpretation is applicable can be easily seen from the context.
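The dual reading of nl can be made explicit with a tiny helper (purely illustrative, not from the book):

```python
# n_l read both as a count and as the concrete band subset {1, ..., l}.

def n_l(l):
    bands = list(range(1, l + 1))   # the first l bands, in acquisition order
    return len(bands), bands

count, bands = n_l(5)
print(count, bands)   # 5 [1, 2, 3, 4, 5]
```

The first return value is the "how many bands" reading; the second is the "which bands" reading.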
Visual inspection of Fig. 16.3 reveals that the most distinct mineral signature
is M, followed by A, then K and B, with the least distinguishable mineral signature
being C. This unique progressive feature is a significant advantage of PHBP-OSP
not offered by OSP, which performs a one-shot operation using all 189 bands, as
shown in the last row corresponding to nl = 189 in Fig. 16.3. Figure 16.4 further shows
the detected abundance fractions of these 25 panel pixels with the 5 highest detected
abundance fractions highlighted on the plots, which are M1, M2, A1, M3, and M4.
Similarly, Fig. 16.5 shows a 3D RHBP-LSOSP estimation map of 25 pure panel
pixels with each set of 5 panel pixels Ai, Bi, Ci, Ki, and Mi for 1 ≤ i ≤ 5
representing each of 5 mineral signatures, A, B, C, K, and M.
It should be noted that, unlike Fig. 16.3, which shows the detected abundance
fractions of 25 panel pixels, the plots in Fig. 16.5 are actually abundance fractions
of the 25 panel pixels estimated by LSOSP. Since LSOSP is not fully abundance-
constrained, the peaks of 25 panel pixels in Fig. 16.5 that are supposed to be close to
1 may not be close to 1, as shown for the other panel pixels in Fig. 16.5. To see this,
Fig. 16.6 additionally shows the estimated abundance fractions of these 25 panel
pixels, where the abundance fractions of the 25 panel pixels were grouped into three
categories. One includes the 10 panel pixels A1, A2, B1, B2, C1, C2, K1, K2, M1, and
M2, with the correct estimated abundance fractions being one. Another is made up
of A3, A4, B3, B4, C3, C4, K3, K4, M3, and M4, which had estimated abundance
fractions of around 0.5, and the remaining panel pixels, A5, B5, C5, K5, and M5,
form the third category and had estimated abundance fractions of around 0.25.


Fig. 16.4 Detection of 25 panel pixels by RHBP-OSP versus nl for TI with top five ranked panels
indicated on plots
Fig. 16.5 3D detection plot of 25 panel pixels by RHBP-LSOSP versus nl for TI

Interestingly, these three groups also correspond to the first two panel pixels, Ai, Bi,
Ci, Ki, and Mi for 1 ≤ i ≤ 2, the next two panel pixels, Ai, Bi, Ci, Ki, and Mi for
3 ≤ i ≤ 4, and the last panel pixels, Ai, Bi, Ci, Ki, and Mi for i = 5, respectively.
Furthermore, if we compare Fig. 16.3 to Fig. 16.5 we immediately notice that
there are significant differences in the results produced by RHBP-OSP and RHBP-
LSOSP due to the fact that OSP is performed as a detector and LSOSP is indeed an
estimator. More specifically, RHBP-OSP attempted to detect most distinct signa-
tures with the highest possible abundance fractions in Fig. 16.3, which are M1, M2,


Fig. 16.6 Detection of 25 panel pixels by RHBP-LSOSP versus nl on TI


Fig. 16.7 3D detection plots of 25 panel pixels by RHBP-OSP versus nl for TE

A1, M3, and M4, as opposed to LSOSP, which tried to accurately estimate abun-
dance fractions within the range between 0 and 1 in Fig. 16.5, where all 10 panel
pixels A1, A2, B1, B2, C1, C2, K1, K2, M1, and M2 were estimated accurately in
Fig. 16.6 by LSOSP with abundance fractions equal to 1 as pure pixels.

16.5.2 TE Experiments

The same experiments conducted for TI were also performed for TE. Figure 16.7
shows a 3D RHBP-OSP detection map of 25 pure panel pixels with each set of
5 panel pixels Ai, Bi, Ci, Ki, and Mi for 1 ≤ i ≤ 5 representing each of the 5 mineral


Fig. 16.8 Detection of 25 panel pixels by RHBP-OSP versus nl for TE



Fig. 16.9 3D detection plot of 25 panel pixels by RHBP-LSOSP versus nl for TE

signatures, A, B, C, K, and M, along with Fig. 16.8, which shows the detected
abundance fractions of these 25 panel pixels with the 5 highest detected abundance
fractions highlighted on the plots, which are M1, M2, A1, M3, and M4.
Comparing Fig. 16.3 to Fig. 16.7 and Fig. 16.4 to Fig. 16.8, the results obtained
for TE were very close to those obtained for TI.
Analogous to Figs. 16.5 and 16.6, Figs. 16.9 and 16.10 also show a 3D RHBP-
LSOSP estimation map of 25 pure panel pixels with each set of 5 panel pixels Ai,
Bi, Ci, Ki, and Mi for 1 ≤ i ≤ 5 representing each of the 5 mineral signatures, A,
B, C, K, and M, along with their estimated abundance fractions. Once again, the
results obtained for TE were very close to those obtained for TI.


Fig. 16.10 Detection of 25 panel pixels by RHBP-LSOSP versus nl for TE

16.6 Real Image Experiments

The image data to be studied are the HYperspectral Digital Imagery Collection
Experiment (HYDICE) image scene shown in Fig. 16.11a (and Fig. 1.10a), which
has a size of 64 × 64 pixel vectors with 15 panels in the scene and the ground truth
map in Fig. 16.11b (Fig. 1.10b). It was acquired by 210 spectral bands with a
spectral coverage from 0.4 to 2.5 μm. Low signal/high noise bands, bands 1–3 and
202–210, and water vapor absorption bands, bands 101–112 and 137–153, were
removed. Thus, a total of 169 bands were used in the experiments. The spatial
resolution and spectral resolution of this image scene are 1.56 m and 10 nm,
respectively. It is worth noting that panel pixel p212, marked by yellow in
Fig. 16.11b, is of particular interest. It is a yellow panel pixel that is always
extracted as being the most spectrally distinct signature compared to the R panel
pixels in row 2. This implies that a signature of spectral purity is not equivalent to a
signature of spectral distinction. Also, because of such ambiguity, the panel signa-
ture representing panel pixels in the second row is either p221 or p212, which is
always the last one found by endmember finding algorithms. This implies that the
ground truth of the R panel pixels in the second row in Fig. 16.11b may not be as
pure as was thought.
According to Chang (2003) and Chang and Du (2004), the virtual dimensionality
(VD) estimated for this HYDICE scene was up to nVD = 18. In this case, we used
ATGP to generate 18 targets, shown in Fig. 16.12, where panel pixels
corresponding to the five panel signatures were extracted, with p11 as the 6th target
t6, p212 as the 17th target t17, p312 as the 5th target t5, p411 as the 18th target t18, and
p521 as the 3rd target t3.


Fig. 16.11 (a) HYDICE panel scene containing 15 panels; (b) ground truth map of spatial
locations of the 15 panels

Fig. 16.12 18 target pixels generated by ATGP

Figures 16.13 and 16.14 show that RHBP-OSP and RHBP-LSOSP used the
18 target pixels extracted by ATGP in Fig. 16.12, where the z-axis in Figs. 16.13a
and 16.14a shows the progressive detection of changes in abundance fractions of
19 R panel pixels produced by RHBP-OSP and RHBP-LSOSP, respectively, and
Figs. 16.13b and 16.14b are their corresponding 3D plots of real-time progressive
profiles of abundance fractions of 19 R panel pixels detected by RHBP-OSP and
estimated by RHBP-LSOSP, respectively.
As we can see from Figs. 16.13a, b and 16.14a, b, the abundance fractions of all
panel pixels saturated and stayed nearly the same after n = 118 bands were
processed. This implied that the remaining 51 bands provided no additional infor-
mation to improve the detection of desired targets. Figure 16.13a, b also clearly
shows that RHBP-OSP is a real-time progressive signal detection technique that
enhances detected signals in Fig. 16.13b, compared to RHBP-LSOSP, which is
indeed a real-time progressive signal estimation technique to estimate signal

Fig. 16.13 Detection abundance fractions of 19 R panel pixels plus panel pixel p212 by RHBP-
OSP. (a) RHBP-OSP-detected abundance fractions of 19 R panel pixels plus p212. (b) 3D plot of
progressive profile of 19 R panel pixels plus panel pixel p212 by RHBP-OSP

abundances accurately, where the peaks shown in Fig. 16.14b, highlighted by open
circles, represent significant changes in the abundance fractions estimated by
RHBP-LSOSP for each of the 19 R panel pixels at band numbers below 60, and
Fig. 16.14a plots the abundances of the 19 R panel pixels versus nl, from the first
9 bands up to the total number of bands, 169.
These experiments demonstrate the advantages of RHBP-OSP and RHBP-
LSOSP, which are able to provide progressive changes in abundance fractions
band by band during real-time band processing.
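The distinction drawn here between detection (OSP) and least-squares estimation (LSOSP) can be made concrete with a small non-recursive sketch. The snippet below is illustrative only (random stand-in signatures, hypothetical sizes) and applies the standard operators $\mathbf{d}^T\mathbf{P}_{\mathbf{U}}^{\perp}\mathbf{r}$ and $(\mathbf{d}^T\mathbf{P}_{\mathbf{U}}^{\perp}\mathbf{d})^{-1}\mathbf{d}^T\mathbf{P}_{\mathbf{U}}^{\perp}\mathbf{r}$ to growing band subsets, rather than the recursive updates (16.11) and (16.12):

```python
import numpy as np

def osp_lsosp(d, U, r):
    """OSP detector and LSOSP abundance estimator for one pixel r,
    desired signature d, and undesired signature matrix U."""
    L = d.shape[0]
    # P_U^perp = I - U U^+ annihilates the undesired signatures in U
    P = np.eye(L) - U @ np.linalg.pinv(U)
    osp = float(d @ P @ r)            # OSP: detected signal strength
    lsosp = osp / float(d @ P @ d)    # LSOSP: least-squares abundance estimate
    return osp, lsosp

rng = np.random.default_rng(0)
L, p = 169, 5                          # bands and signatures (hypothetical sizes)
M = rng.random((L, p))                 # random stand-in signature matrix
d, U = M[:, 0], M[:, 1:]               # desired vs. undesired signatures
r = M @ np.array([0.4, 0.2, 0.2, 0.1, 0.1])   # noise-free mixed pixel

# PHBP-style processing: recompute using only the first l bands
for l in (10, 60, 118, 169):
    osp, a = osp_lsosp(d[:l], U[:l, :], r[:l])
    print(l, round(a, 4))
```

For a noise-free mixed pixel the LSOSP estimate recovers the true abundance at every band subset, while the raw OSP output grows with the number of bands processed, mirroring the contrast between Figs. 16.13 and 16.14.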


Fig. 16.14 Detection abundance fractions of 19 R panel pixels plus p212 by RHBP-LSOSP versus
nl. (a) RHBP-LSOSP-estimated abundance fractions of 19 R panel pixels plus panel pixel p212. (b)
3D plot of progressive profile of 19 R panel pixels plus panel pixel p212 by RHBP-LSOSP

16.7 Computing Time Analysis

To see the computational efficiency of RHBP-OSP/LSOSP in data processing time,
Fig. 16.15 plots the computing time required for RHBP-OSP/LSOSP to process the TI,
TE, and HYDICE data, where the y-axis is the computer processing time in seconds
and the x-axis is the band index, with the running time measured for the recursive
equations specified by (16.11) and


Fig. 16.15 Computing time versus lth band by PHBP-OSP/LSOSP, RHBP-OSP/LSOSP for (a) TI,
(b) TE, (c) HYDICE scene

(16.12) to update the results when the lth band is a new incoming band. The
computer environment was as follows: operating system: Windows 7, 64-bit;
CPU: Intel Core i7-3770 @ 3.40 GHz; memory: 16.0 GB; programming language:
MATLAB v8.0.
As can be seen, the computing time required for PHBP-OSP/LSOSP and RHBP-
OSP/LSOSP to run each individual band is nearly linear for TI and TE but shows
slowly varying exponential curves for the HYDICE scene. Figure 16.16 further plots
the computing time required for PHBP-OSP/LSOSP without using recursive
equations (16.11) and (16.12) to run using the first l bands, where the x-axis is
the number of the first l bands, nl.
For a fair comparative study, we also include the cumulative computer
processing time of RHBP-OSP/LSOSP, which is obtained by summing up all the
computing times in Fig. 16.15 up to the lth band.
As shown in Fig. 16.16, the computing time increases as new bands are added.
This is due to an increase in the size of $\mathbf{P}_{\mathbf{U}}^{\perp}$, which results in higher computational


Fig. 16.16 Computing time versus nl by PHBP-OSP/LSOSP and RHBP-OSP/LSOSP for (a) TI,
(b) TE, (c) HYDICE

Table 16.1 Comparison of computing time in seconds required by OSP, LSOSP, and their PHBP and RHBP versions

Data set   OSP      PHBP-OSP   RHBP-OSP   LSOSP    PHBP-LSOSP   RHBP-LSOSP
TI         0.0286   5.4131     2.6752     0.0294   5.5565       2.7541
TE         0.0288   5.4365     2.7687     0.0296   5.5925       2.8565
HYDICE     0.0074   1.2583     0.7311     0.0083   1.4040       0.7921

complexity. Table 16.1 lists the final computing times required for RHBP-
OSP/LSOSP and PHBP-OSP/LSOSP to complete the entire data sets, where
RHBP-OSP/LSOSP ran roughly twice as fast as PHBP-OSP/LSOSP. However,
we believe that this saving could be even greater once RHBP-OSP/LSOSP is
implemented in hardware, thanks to its recursive structure.

16.8 Graphical User Interface Design

Finally, analogous to RHBP-ATGP in Fig. 15.19, a graphical user interface (GUI)
can also be developed for RHBP-OSP, as shown in Fig. 16.17. The toolbar at
the top allows users to load the data sets and select a particular target generation
algorithm, such as ATGP, or an unsupervised target detection algorithm, such as
unsupervised nonnegativity-constrained least squares in Sect. 4.4.2.2.2 or
unsupervised fully constrained least squares in Sect. 4.4.2.2.3, to produce the targets
of interest specified by endmembers that form the signature matrix M in (16.1) for
RHBP-OSP to process, as well as the desired target to be classified; both are
shown in two separate windows in the top bar. A 3D progressive band-varying
profile produced by RHBP-OSP is then generated at the bottom of the GUI.

16.9 Conclusions

RHBP-OSP/LSOSP was developed according to the BSQ format to implement
OSP/LSOSP band by band recursively instead of pixel by pixel, which uses
full band information. It allows users to see the effect of interband correlation on
detected/estimated abundance fractions via PHBP. Despite the fact that RHBP-
OSP/LSOSP and RHBP-ATGP in Chap. 15 update $\mathbf{P}_{\mathbf{U}_{l,p}}^{\perp}$ recursively by (16.10) and
(16.11), respectively, both RHBP-OSP/LSOSP and RHBP-ATGP are indeed quite
different in terms of design rationale. First, RHBP-OSP/LSOSP detects the abundance
fractions $\hat{\alpha}_{\mathbf{d}_l}^{OSP}(\mathbf{r}_l)$ by (16.12), while RHBP-ATGP finds targets $\mathbf{t}_p^{(l)}$ via
$\mathbf{r}_l^T\mathbf{P}_{\mathbf{U}_{l,p}}^{\perp}\mathbf{r}_l$, specified by (15.10). As a result, RHBP-OSP/LSOSP produces

Fig. 16.17 GUI design for RHBP-OSP



progressive band-varying detected abundance fractions $\hat{\alpha}_{\mathbf{d}_l}^{OSP}(\mathbf{r}_l)$, and RHBP-
ATGP produces a target matrix $\mathbf{T}_M = \left[\mathbf{t}_p^{(l)}\right]$ in (15.11). Second, RHBP-OSP/
LSOSP needs to fix the number of signatures, p, at a constant, compared to RHBP-
ATGP, which varies the value of p and requires an automatic stopping rule to
determine an appropriate value of p, which can be determined by recursive ATGP
(R-ATGP) in Chap. 7. Third and most importantly, knowledge of the signature
matrix M in (16.1) must be provided a priori, while RHBP-ATGP is completely
unsupervised and requires no prior target knowledge.
Chapter 17
Recursive Hyperspectral Band Processing
of Linear Spectral Mixture Analysis

Abstract In previous chapters, recursive hyperspectral band processing (RHBP)
was developed for subpixel detection (RHBP of constrained energy minimization in
Chap. 13, RHBP of anomaly detection in Chap. 14), unsupervised target detection
(RHBP of an automatic target generation process in Chap. 15), and mixed pixel
detection/classification (RHBP of orthogonal subspace projection in Chap. 16). This
chapter presents the last piece of RHBP's applications, the well-known technique of
linear spectral mixture analysis (LSMA). It develops a new approach, called
RHBP of LSMA (RHBP-LSMA), which can process data unmixing according to
the band-sequential (BSQ) format. This new concept is different from band selec-
tion (BS), which must select bands from a fully collected band set according to a
band optimization criterion. There are several advantages of using RHBP-LSMA
over BS. In particular, it allows users to perform LSMA using available bands
without waiting for a complete collection of full bands. In doing so, an innovation
information update recursive equation is further derived that can process LSMA
progressively as well as recursively as its band processing takes place. The
resulting LSMA becomes RHBP-LSMA. To be more specific, RHBP-LSMA can
be carried out by updating LSMA results recursively band by band in the same way
that a Kalman filter does in updating data information in a recursive fashion.
Consequently, RHBP-LSMA can provide progressive band-varying profiles of
LSMA results so that significant bands can also be detected and identified by
inter-band changes in LSMA results without BS.

17.1 Introduction

Band selection (BS) is a commonly used data reduction technique to reduce the
original data space to a lower-dimensional band selected data space as an alterna-
tive to data compression, which reduces the original data space into a lower-
dimensional compressed data space. In other words, BS selects a subset of bands
of interest from a fully collected band set according to a certain criterion for
optimality. Linear spectral mixture analysis (LSMA) is a fundamental task that
has been widely used to perform linear spectral unmixing (LSU) for image analysis.

© Springer International Publishing Switzerland 2017 505


C.-I Chang, Real-Time Recursive Hyperspectral Sample and Band Processing,
DOI 10.1007/978-3-319-45171-8_17

It is generally performed using full bands. Since hyperspectral imaging sensors use
hundreds of contiguous bands to acquire data, the provided spectral information
sometimes is more than what LSMA generally needs. This is particularly true for
LSMA, where only a small number of signatures is required for LSU. Most
recently, it was shown in Chang et al. (2010a, 2011b) that LSMA could be effective
even if only a small number of bands are used, according to a concept, virtual
dimensionality (VD), recently developed by Chang (2003a) and Chang and Du
(2004). More specifically, in the context of VD, if a signature used to unmix data by
LSMA can be effectively represented by a spectral band, then VD can be further
used to determine the number of signatures needed for LSMA to perform effec-
tively. The issue remaining is how to find appropriate signatures that correspond to
these spectral bands. Though this issue is very challenging and deserves further
investigation, this chapter explores an interesting approach different from BS,
called recursive hyperspectral band processing of LSMA (RHBP-LSMA).
The main idea of RHBP-LSMA is derived from progressive spectral band
dimensionality (PSBD), originally developed by Chang et al. (2011d) and
Chap. 21 in Chang (2013), so that LSMA can be performed progressively band
by band without resorting to BS. In other words, LSMA can be carried out band by
band without waiting for the completion of band collection. Such a resulting LSMA
is called progressive hyperspectral band processing of LSMA (PHBP-LSMA)
according to the band-sequential (BSQ) format; it can take place while band
collection is going on. It is important to recognize the major and crucial differences
between PHBP and BS. First, BS is a preprocessing technique that requires com-
plete knowledge of data prior to LSMA, while PHBP can be implemented by band
sequential (BSQ) format in conjunction with LSMA at the same time as bands are
being collected. Second, BS requires a specific optimal criterion to find a subset of
appropriate bands, while PHBP operates without selecting bands. Third, PHBP can
be implemented in such a way that the results unmixed by LSMA can be evaluated
by its progressive performance, while BS/LSMA gives one-shot unmixed results
produced by LSMA using selected bands, where the notation "/" is used to indicate
that BS and LSMA are carried out in sequence, with BS followed by LSMA, as
opposed to the notation "-" used in PHBP-LSMA, which indicates that both PHBP
and LSMA are jointly implemented simultaneously. Finally and most importantly,
PHBP-LSMA can be further extended to a recursive version of PHBP-LSMA,
called RHBP-LSMA, which can update LSMA results only through innovation
information via recursive equations in a manner similar to that of a Kalman filter. In
doing this we decompose the information required for RHBP-LSMA into three
pieces of information, processed information obtained by the previous bands, new
information provided by the current incoming band, and innovation information by
extracting discrepant information between processed information and new infor-
mation. Such information decomposition originating from Kalman filtering enables
us to derive a recursive equation that updates innovation information band by band
without reprocessing all past information. The derived equation includes two terms,
the unmixed results obtained by previous bands and the innovation information that
is provided by the new incoming band but cannot be predicted from the past

information. It is this innovation information that can be implemented recursively


band by band as a new band feeds in. Consequently, RHBP-LSMA can be
implemented in real time and is useful in satellite data communication.

17.2 Linear Spectral Unmixing Via Recursive Hyperspectral Band Processing

As noted, LSMA as implemented in the current literature is performed using full
band information, in which case LSMA can only be carried out following the
completion of data acquisition. From the point of view of satellite data communi-
cation and transmission, it would be highly desirable to process LSMA while data
sample vectors are being collected. Two major benefits could be gained from such a
process. One is that it would allow users to see progressive bandwise changes in
LSMA results in such a way that bands with significant impacts on LSMA perfor-
mance could be identified. Since targets of interest have different degrees of mixing
difficulty, various bands and band numbers are required to perform LSMA effec-
tively. The proposed RHBP-LSMA fits this need.

17.2.1 Derivation of Update Equation for Innovation Information

To perform RHBP-LSMA band by band, we decompose a data sample vector into
two components, one consisting of data sample vectors acquired by previous bands
and the other containing only the current band. In this section, we assume that
prior knowledge of the signature matrix $\mathbf{M}$ is given. In addition, we also assume that
$\{B_k\}_{k=1}^{l-1}$ are the $(l-1)$ bands already processed and that the current band is the $l$th
band, denoted by $B_l$, yet to be processed. For a given set of $p$ $l$-dimensional image
endmembers $\{\mathbf{m}_j(l)\}_{j=1}^{p}$, where $\mathbf{m}_j(l) = \left(m_{j1}(l), m_{j2}(l), \ldots, m_{jl}(l)\right)^T$ is the $j$th
$l$-dimensional signature vector formed by the $l$ spectral bands $\{B_k\}_{k=1}^{l-1}$ and $B_l$,
we then introduce a new $p$-dimensional vector defined by $\tilde{\mathbf{m}}_l =
\left(m_{l1}, m_{l2}, \ldots, m_{l(p-1)}, m_{lp}\right)^T$, which is composed of only the $l$th spectral band in
all $\mathbf{m}_j(l)$ for $1 \le j \le p$, and we further define a signature matrix up to $l$ spectral
bands by $\mathbf{M}_l = \left[\mathbf{m}_1(l)\,\mathbf{m}_2(l)\cdots\mathbf{m}_p(l)\right]$, so that

$$
\mathbf{P}_l = \mathbf{M}_l^T\mathbf{M}_l
= \begin{bmatrix}\mathbf{M}_{l-1}\\ \tilde{\mathbf{m}}_l^T\end{bmatrix}^T
\begin{bmatrix}\mathbf{M}_{l-1}\\ \tilde{\mathbf{m}}_l^T\end{bmatrix}
= \mathbf{M}_{l-1}^T\mathbf{M}_{l-1} + \tilde{\mathbf{m}}_l\tilde{\mathbf{m}}_l^T
= \mathbf{P}_{l-1} + \tilde{\mathbf{m}}_l\tilde{\mathbf{m}}_l^T. \qquad (17.1)
$$

According to (2.3) in Chap. 2, the least-squares solution to LSMA is given by

$$\hat{\boldsymbol{\alpha}}^{LS}(\mathbf{r}) = \left(\mathbf{M}^T\mathbf{M}\right)^{-1}\mathbf{M}^T\mathbf{r}. \qquad (17.2)$$

Substituting (17.1) into (17.2) yields

$$\hat{\boldsymbol{\alpha}}^{LS}(l) = \left(\mathbf{P}_{l-1} + \tilde{\mathbf{m}}_l\tilde{\mathbf{m}}_l^T\right)^{-1}\mathbf{M}_l^T\mathbf{r}_l. \qquad (17.3)$$

Since $\mathbf{P}_{l-1} = \mathbf{M}_{l-1}^T\mathbf{M}_{l-1}$ is a $p \times p$ matrix formed by the $(l-1)$ spectral bands
$\{B_k\}_{k=1}^{l-1}$, we can use Woodbury's identity in Appendix A,

$$\left(\mathbf{A} + \mathbf{v}\mathbf{v}^T\right)^{-1} = \mathbf{A}^{-1} - \frac{\left(\mathbf{A}^{-1}\mathbf{v}\right)\left(\mathbf{v}^T\mathbf{A}^{-1}\right)}{1 + \mathbf{v}^T\mathbf{A}^{-1}\mathbf{v}}, \qquad (17.4)$$

and the new information $\tilde{\mathbf{m}}_l$ to update $\mathbf{P}_l^{-1}$ as follows:

$$\mathbf{P}_l^{-1} = \left(\mathbf{P}_{l-1} + \tilde{\mathbf{m}}_l\tilde{\mathbf{m}}_l^T\right)^{-1}
= \mathbf{P}_{l-1}^{-1} - \frac{\left(\mathbf{P}_{l-1}^{-1}\tilde{\mathbf{m}}_l\right)\left(\tilde{\mathbf{m}}_l^T\mathbf{P}_{l-1}^{-1}\right)}{1 + \tilde{\mathbf{m}}_l^T\mathbf{P}_{l-1}^{-1}\tilde{\mathbf{m}}_l}. \qquad (17.5)$$

Furthermore, by letting $\mathbf{A} = \mathbf{M}_{l-1}^T\mathbf{M}_{l-1}$ and $\mathbf{v} = \tilde{\mathbf{m}}_l$ in (17.4), equation (17.3)
becomes
$$
\begin{aligned}
\hat{\boldsymbol{\alpha}}^{LS}(l)
&= \left(\mathbf{P}_{l-1}^{-1} - \frac{\mathbf{v}_{l|(l-1)}\mathbf{v}_{l|(l-1)}^{T}}{1+\rho_{l|(l-1)}}\right)
\begin{bmatrix}\mathbf{M}_{l-1}\\ \tilde{\mathbf{m}}_{l}^{T}\end{bmatrix}^{T}
\begin{pmatrix}\mathbf{r}_{l-1}\\ r_{l}\end{pmatrix} \\
&= \left(\mathbf{P}_{l-1}^{-1} - \frac{\mathbf{v}_{l|(l-1)}\mathbf{v}_{l|(l-1)}^{T}}{1+\rho_{l|(l-1)}}\right)
\left(\mathbf{M}_{l-1}^{T}\mathbf{r}_{l-1} + \tilde{\mathbf{m}}_{l}r_{l}\right) \\
&= \mathbf{P}_{l-1}^{-1}\left(\mathbf{M}_{l-1}^{T}\mathbf{r}_{l-1} + \tilde{\mathbf{m}}_{l}r_{l}\right)
- \frac{\mathbf{v}_{l|(l-1)}\mathbf{v}_{l|(l-1)}^{T}}{1+\rho_{l|(l-1)}}
\left(\mathbf{M}_{l-1}^{T}\mathbf{r}_{l-1} + \tilde{\mathbf{m}}_{l}r_{l}\right) \\
&= \hat{\boldsymbol{\alpha}}^{LS}(l-1) + \mathbf{P}_{l-1}^{-1}\tilde{\mathbf{m}}_{l}r_{l}
- \frac{\mathbf{v}_{l|(l-1)}\mathbf{v}_{l|(l-1)}^{T}}{1+\rho_{l|(l-1)}}
\left(\mathbf{M}_{l-1}^{T}\mathbf{r}_{l-1} + \tilde{\mathbf{m}}_{l}r_{l}\right) \\
&= \hat{\boldsymbol{\alpha}}^{LS}(l-1)
- \frac{\mathbf{v}_{l|(l-1)}\mathbf{v}_{l|(l-1)}^{T}}{1+\rho_{l|(l-1)}}\mathbf{P}_{l-1}\hat{\boldsymbol{\alpha}}^{LS}(l-1)
+ \left(\mathbf{P}_{l-1}^{-1} - \frac{\mathbf{v}_{l|(l-1)}\mathbf{v}_{l|(l-1)}^{T}}{1+\rho_{l|(l-1)}}\right)\tilde{\mathbf{m}}_{l}r_{l} \\
&= \left(\mathbf{I}_{p\times p} - \frac{\mathbf{v}_{l|(l-1)}\mathbf{v}_{l|(l-1)}^{T}}{1+\rho_{l|(l-1)}}\mathbf{P}_{l-1}\right)\hat{\boldsymbol{\alpha}}^{LS}(l-1)
+ \left(\mathbf{P}_{l-1}^{-1} - \frac{\mathbf{v}_{l|(l-1)}\mathbf{v}_{l|(l-1)}^{T}}{1+\rho_{l|(l-1)}}\right)\tilde{\mathbf{m}}_{l}r_{l},
\end{aligned}
\qquad (17.6)
$$

with $\mathbf{v}_{l|(l-1)}$ and $\rho_{l|(l-1)}$ formally defined in (17.9) and (17.10) below.
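The rank-one inverse update (17.4)-(17.5) is the workhorse of this derivation and is easy to verify numerically. The following sketch (all dimensions are hypothetical) checks that the Woodbury update of $\mathbf{P}_{l-1}^{-1}$ reproduces the direct inverse of $\mathbf{P}_l = \mathbf{P}_{l-1} + \tilde{\mathbf{m}}_l\tilde{\mathbf{m}}_l^T$:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 5
M_prev = rng.random((40, p))          # M_{l-1}: 40 previous bands (hypothetical)
P_prev = M_prev.T @ M_prev            # P_{l-1} = M_{l-1}^T M_{l-1}
P_prev_inv = np.linalg.inv(P_prev)

m_l = rng.random(p)                   # new band's values across the p signatures

# Woodbury rank-one update of the inverse, Eq. (17.5)
num = np.outer(P_prev_inv @ m_l, m_l @ P_prev_inv)
P_inv = P_prev_inv - num / (1.0 + m_l @ P_prev_inv @ m_l)

# Direct inversion of P_l = P_{l-1} + m_l m_l^T, Eq. (17.1)
P_inv_direct = np.linalg.inv(P_prev + np.outer(m_l, m_l))

print(np.allclose(P_inv, P_inv_direct))   # True
```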

Note that $\hat{\boldsymbol{\alpha}}^{LS}(l)$ in (17.6) is indeed the same as $\hat{\boldsymbol{\alpha}}^{UCLS}(l)$, which can be updated
recursively by three types of information, described as follows:

1. Processed information: $\hat{\boldsymbol{\alpha}}^{LS}(l-1)$, obtained by processing the bands already visited, $\{B_k\}_{k=1}^{l-1}$:

$$\hat{\boldsymbol{\alpha}}^{LS}(l-1) = \left(\mathbf{M}_{l-1}^T\mathbf{M}_{l-1}\right)^{-1}\mathbf{M}_{l-1}^T\mathbf{r}_{l-1} = \mathbf{P}_{l-1}^{-1}\mathbf{M}_{l-1}^T\mathbf{r}_{l-1}. \qquad (17.7)$$

2. New information: $\tilde{\mathbf{m}}_l r_l$, defined by

$$\tilde{\mathbf{m}}_l r_l = \left(m_{l1}, m_{l2}, \ldots, m_{l(p-1)}, m_{lp}\right)^T r_l, \qquad (17.8)$$

which is provided by the current $l$th spectral band, $r_l$, and
$\tilde{\mathbf{m}}_l = \left(m_{l1}, m_{l2}, \ldots, m_{l(p-1)}, m_{lp}\right)^T$, formed by the $l$th spectral bands of the $p$ signature vectors.

3. Innovation information: $\mathbf{v}_{l|(l-1)}$ and $\rho_{l|(l-1)}$, between the $l$th spectral band and all
previous $(l-1)$ spectral bands, denoted by the subscript $l|(l-1)$:

$$\mathbf{v}_{l|(l-1)} = \left(\mathbf{M}_{l-1}^T\mathbf{M}_{l-1}\right)^{-1}\tilde{\mathbf{m}}_l = \mathbf{P}_{l-1}^{-1}\tilde{\mathbf{m}}_l, \qquad (17.9)$$

$$\rho_{l|(l-1)} = \tilde{\mathbf{m}}_l^T\left(\mathbf{M}_{l-1}^T\mathbf{M}_{l-1}\right)^{-1}\tilde{\mathbf{m}}_l = \tilde{\mathbf{m}}_l^T\mathbf{P}_{l-1}^{-1}\tilde{\mathbf{m}}_l = \tilde{\mathbf{m}}_l^T\mathbf{v}_{l|(l-1)}, \qquad (17.10)$$

where $\rho_{l|(l-1)}$ is an inner product of the vector $\tilde{\mathbf{m}}_l$ and the vector $\mathbf{P}_{l-1}^{-1}\tilde{\mathbf{m}}_l$.

Since the processed information is obtained by using only the previously visited
$(l-1)$ spectral bands and has nothing to do with the new information $\tilde{\mathbf{m}}_l r_l$ obtained
from the incoming $l$th spectral band, it is the innovation information $\mathbf{v}_{l|(l-1)}$ and
$\rho_{l|(l-1)}$ that is the key to updating (17.6). Specifically, the inverse of the matrix $\mathbf{P}_{l-1}$ in
(17.6) is calculated only once, for its initial condition, that is, $l-1 = 1$, and the
inverse of $\mathbf{P}_l$ is then updated recursively by (17.1) and (17.5). By virtue of equations
(17.9)-(17.10), $\mathbf{v}_{l|(l-1)}$ and $\rho_{l|(l-1)}$ are generally referred to as innovations information
in statistical signal processing (Poor 1994), and thus (17.6), using $\mathbf{v}_{l|(l-1)}$ and
$\rho_{l|(l-1)}$, is also called an innovations information update equation (see Chap. 3).

17.2.2 Computational Complexity

Equation (17.6) provides an effective procedure to recursively calculate the
unconstrained least-squares abundance estimate $\hat{\boldsymbol{\alpha}}^{LS}(l)$ using two pieces of
processed information, $\hat{\boldsymbol{\alpha}}^{LS}(l-1)$ and $\mathbf{P}_{l-1}^{-1}$, two pieces of innovations information,
$\mathbf{v}_{l|(l-1)}$ and $\rho_{l|(l-1)}$, and the new information, $\tilde{\mathbf{m}}_l$.

Substituting (17.9) and (17.10) into (17.5) yields

$$\mathbf{P}_l^{-1} = \mathbf{P}_{l-1}^{-1} - \frac{\mathbf{v}_{l|(l-1)}\mathbf{v}_{l|(l-1)}^T}{1+\rho_{l|(l-1)}}. \qquad (17.11)$$

Finally, by taking advantage of (17.11), $\hat{\boldsymbol{\alpha}}^{LS}(l)$ in (17.6) can be further simplified to

$$\hat{\boldsymbol{\alpha}}^{LS}(l) = \left(\mathbf{I}_{p\times p} - \frac{\mathbf{v}_{l|(l-1)}\mathbf{v}_{l|(l-1)}^T}{1+\rho_{l|(l-1)}}\mathbf{P}_{l-1}\right)\hat{\boldsymbol{\alpha}}^{LS}(l-1) + \mathbf{P}_l^{-1}\tilde{\mathbf{m}}_l r_l. \qquad (17.12)$$

According to (17.12), the recursive process of calculating $\hat{\boldsymbol{\alpha}}^{LS}(l)$ can be carried out
by the following steps:

1. Initial condition: $\mathbf{P}_1^{-1} = \left[(\mathbf{m}(1))^T\mathbf{m}(1)\right]^{-1}$ and $\hat{\boldsymbol{\alpha}}^{LS}(1) = \mathbf{P}_1^{-1}\mathbf{M}_1^T r_1$, where
$\mathbf{M}_1 = \left[m_1(1)\,m_2(1)\cdots m_p(1)\right]$ is a $1 \times p$ matrix reduced to a $p$-dimensional vector.
2. For each $1 < l \le L$, with $L$ being the total number of bands:
   (a) Input $\tilde{\mathbf{m}}_l$ to compute $\mathbf{v}_{l|(l-1)}$ and $\rho_{l|(l-1)}$ by (17.9) and (17.10), respectively;
   (b) Use 2(a) to calculate $\frac{\mathbf{v}_{l|(l-1)}\mathbf{v}_{l|(l-1)}^T}{1+\rho_{l|(l-1)}}$;
   (c) Find $\mathbf{P}_l$ by (17.1) and $\mathbf{P}_l^{-1}$ by 2(b) via (17.11);
   (d) Calculate $\hat{\boldsymbol{\alpha}}^{LS}(l)$ by (17.12) using $\tilde{\mathbf{m}}_l$ and steps 2(b) and 2(c).
3. Let $l \leftarrow l+1$, and go back to step 2.
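The steps above can be sketched directly in code. The following NumPy implementation is illustrative only, not the book's code: the regularized initial condition $\mathbf{P}_0 = \delta\mathbf{I}$ (a standard recursive-least-squares device) replaces step 1, because $\mathbf{P}_l = \mathbf{M}_l^T\mathbf{M}_l$ is singular until $l \ge p$; all test dimensions and names are hypothetical. The final recursive estimate is checked against the direct least-squares solution (17.2).

```python
import numpy as np

def rhbp_ucls(M, r, delta=1e-8):
    """Recursive band-by-band unconstrained LS unmixing via (17.9)-(17.12).

    M : L x p signature matrix (row l holds the lth band of all p signatures).
    r : length-L pixel vector.
    Starts from P_0 = delta * I (recursive-least-squares regularization, since
    P_l is singular until l >= p) and returns alpha_hat(l) for l = 1, ..., L.
    """
    L, p = M.shape
    P = delta * np.eye(p)          # P_{l-1}
    P_inv = np.eye(p) / delta      # P_{l-1}^{-1}
    alpha = np.zeros(p)
    history = []
    for l in range(L):
        m_l = M[l]                             # new band m~_l
        v = P_inv @ m_l                        # innovation vector, Eq. (17.9)
        rho = m_l @ v                          # innovation scalar, Eq. (17.10)
        K = np.outer(v, v) / (1.0 + rho)
        # Eq. (17.12), with P_l^{-1} = P_{l-1}^{-1} - K from Eq. (17.11)
        alpha = (np.eye(p) - K @ P) @ alpha + (P_inv - K) @ (m_l * r[l])
        P_inv = P_inv - K                      # Eq. (17.11)
        P = P + np.outer(m_l, m_l)             # Eq. (17.1)
        history.append(alpha)
    return history

rng = np.random.default_rng(2)
L, p = 169, 5
M = rng.random((L, p))
r = M @ np.array([0.4, 0.2, 0.2, 0.1, 0.1])    # noise-free mixed pixel

alphas = rhbp_ucls(M, r)
direct = np.linalg.lstsq(M, r, rcond=None)[0]  # one-shot solution, Eq. (17.2)
print(np.allclose(alphas[-1], direct, atol=1e-4))   # True
```

At every intermediate $l$, the recursion yields the ($\delta$-regularized) least-squares estimate using only the first $l$ bands, so the progressive band-varying profiles discussed in this chapter come for free.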
Table 17.1 details the computational complexity of (17.12) to calculate $\hat{\boldsymbol{\alpha}}^{LS}(l)$,
where the third column summarizes the number of mathematical operations used to
compute the specific formulas listed in the second column; note that an inner product
and an outer product of two $p$-dimensional vectors require the same number of $p^2$
multiplications.
Note that we only need to calculate one matrix inverse,
$\mathbf{P}_1^{-1} = \left[(\mathbf{m}(1))^T\mathbf{m}(1)\right]^{-1}$, and $\mathbf{P}_l^{-1}$ for $l \ge 2$ can then be easily calculated by
(17.11) without inverting a matrix. Also, once $\mathbf{v}_{l|(l-1)}$ and $\rho_{l|(l-1)}$ are computed,
we can simply compute $\frac{\mathbf{v}_{l|(l-1)}\mathbf{v}_{l|(l-1)}^T}{1+\rho_{l|(l-1)}}$ and use it to calculate $\hat{\boldsymbol{\alpha}}^{LS}(l)$ by (17.12).

Two notes are worthwhile.

1. Owing to the use of (17.12), the computational complexity of calculating $\hat{\boldsymbol{\alpha}}^{LS}(l)$
using RHBP as each new band $B_l$ comes in is constant. This is a significant advantage over
Table 17.1 Computational complexity of (17.12) to calculate $\hat{\boldsymbol{\alpha}}^{LS}(l)$

- Initial condition $\mathbf{P}_1^{-1} = \left[\mathbf{m}(1)(\mathbf{m}(1))^T\right]^{-1}$: one outer product of the $p$-dimensional vector $\mathbf{m}(1)$ ($p^2$ multiplications) plus one matrix inverse requiring $O(p^3)$.
- Initial condition $\hat{\boldsymbol{\alpha}}^{LS}(1) = \mathbf{P}_1^{-1}\mathbf{M}_1^T r_1$: $p$ inner products of $p$-dimensional vectors ($p^2$ multiplications) plus $p$ multiplications by $r_1$.
- Previously processed information $\mathbf{P}_{l-1} = \mathbf{M}_{l-1}^T\mathbf{M}_{l-1}$, $\hat{\boldsymbol{\alpha}}^{LS}(l-1) = \mathbf{P}_{l-1}^{-1}\mathbf{M}_{l-1}^T\mathbf{r}_{l-1}$, and $\mathbf{P}_{l-1}^{-1}$: available from the $(l-1)$st previous recursion; thus, no calculation is needed.
- Input $\tilde{\mathbf{m}}_l$, giving $\tilde{\mathbf{m}}_l r_l$: $p$ multiplications.
- Innovation information $\mathbf{v}_{l|(l-1)} = \mathbf{P}_{l-1}^{-1}\tilde{\mathbf{m}}_l$ by (17.9): $p$ inner products of $p$-dimensional vectors ($p^2$ multiplications).
- Innovation information $\rho_{l|(l-1)} = \tilde{\mathbf{m}}_l^T\mathbf{v}_{l|(l-1)}$ by (17.10): an inner product of two $p$-dimensional vectors ($p$ multiplications).
- Update of $\mathbf{P}_l = \mathbf{P}_{l-1} + \tilde{\mathbf{m}}_l\tilde{\mathbf{m}}_l^T$ by (17.1): an outer product of the $p$-dimensional vector $\tilde{\mathbf{m}}_l$ ($p^2$ multiplications).
- Update of $\mathbf{P}_l^{-1} = \mathbf{P}_{l-1}^{-1} - \frac{\left(\mathbf{P}_{l-1}^{-1}\tilde{\mathbf{m}}_l\right)\left(\tilde{\mathbf{m}}_l^T\mathbf{P}_{l-1}^{-1}\right)}{1+\tilde{\mathbf{m}}_l^T\mathbf{P}_{l-1}^{-1}\tilde{\mathbf{m}}_l} = \mathbf{P}_{l-1}^{-1} - \frac{\mathbf{v}_{l|(l-1)}\mathbf{v}_{l|(l-1)}^T}{1+\rho_{l|(l-1)}}$ by (17.11): one outer product of $\mathbf{v}_{l|(l-1)}$ ($p^2$ multiplications plus $p^2$ divisions).
- Update of $\hat{\boldsymbol{\alpha}}^{LS}(l) = \left(\mathbf{I}_{p\times p} - \frac{\mathbf{v}_{l|(l-1)}\mathbf{v}_{l|(l-1)}^T}{1+\rho_{l|(l-1)}}\mathbf{P}_{l-1}\right)\hat{\boldsymbol{\alpha}}^{LS}(l-1) + \mathbf{P}_l^{-1}\tilde{\mathbf{m}}_l r_l$ by (17.12): $p^2$ inner products ($p^3$ multiplications) plus $2p$ inner products of $p$-dimensional vectors ($2p^2$ multiplications).

nonprogressive LSMA, where the signature matrix $\mathbf{M}_l$ must be recalculated each
time a new band comes in.
2. One of the significant advantages of RHBP is that it allows users to evaluate
progressive unmixed results to see how much improvement a new band can provide.
In this case, we can determine which bands are more significant than others.

17.3 Discussions on RHBP-LSMA

In light of (17.6) and (17.11), the abundance-unconstrained least-squares solution
$\hat{\boldsymbol{\alpha}}^{LS}(\mathbf{r})$ specified by (17.2) can now be implemented recursively by the recursive
equation (17.6) band by band in real time. Furthermore, owing to the fact that there
are no analytic forms available for finding $\hat{\boldsymbol{\alpha}}^{NCLS}(\mathbf{r})$ and $\hat{\boldsymbol{\alpha}}^{FCLS}(\mathbf{r})$, they must be
found numerically by iterative algorithms, as proposed in Chang and Heinz (2000)
and Heinz and Chang (2001). Fortunately, in this case, with $\hat{\boldsymbol{\alpha}}^{LS}(\mathbf{r})$ in (17.2)
replaced by $\hat{\boldsymbol{\alpha}}^{LS}(l)$ in (17.6), and $\mathbf{M}$ and $\mathbf{M}^T\mathbf{M}$ in (17.3) and FCLS in (2.19) replaced
by (17.1) and (17.7)-(17.10), $\hat{\boldsymbol{\alpha}}^{NCLS}(\mathbf{r})$ and $\hat{\boldsymbol{\alpha}}^{FCLS}(\mathbf{r})$ can also be implemented as
$\hat{\boldsymbol{\alpha}}^{NCLS}(l)$ and $\hat{\boldsymbol{\alpha}}^{FCLS}(l)$ progressively in real time band by band, given that the
computing time involved in (17.6) is negligible. The resulting unconstrained LS,
nonnegativity-constrained least-squares (NCLS), and fully constrained least-
squares (FCLS) methods are referred to as RHBP-UCLS, RHBP-NCLS, and
RHBP-FCLS, respectively. Several comments are worthwhile.
1. To simplify the derivation of (17.11), the band number to be increased, denoted
by $n_\Delta$, was set to 1, that is, $n_\Delta = 1$ in (17.11), where $\hat{\boldsymbol{\alpha}}^{LS}(l)$ is updated from
$\hat{\boldsymbol{\alpha}}^{LS}(l-1)$. However, this will not necessarily be the case. For example, assume
that $n_\Delta$ is the step size used to indicate the number of bands to be added
each time by PHBP. Then $\hat{\boldsymbol{\alpha}}^{LS}(l)$ in (17.11) can be updated by
$\hat{\boldsymbol{\alpha}}^{LS}(l - n_\Delta)$. Thus, when $n_\Delta$ is greater than one, RHBP can be implemented as
uniform BS in a progressive fashion. Specifically, if $n_\Delta = 2$, RHBP
will progressively process spectral unmixing by adding every other band instead
of every band. More generally, if $n_{PUBS}$ is the number of bands determined by
progressive uniform BS and $L$ is the total number of spectral bands, then $n_\Delta
= \lfloor L/n_{PUBS}\rfloor$ is the band step size used by RHBP, where $\lfloor x\rfloor$ is defined as the
largest integer $\le x$. In addition, the band step size $n_\Delta$ can be further made
adaptive as $n_\Delta(l)$, which varies with specific bands, where $\hat{\boldsymbol{\alpha}}^{LS}(l)$ in (17.11)
can be modified and updated by $\hat{\boldsymbol{\alpha}}^{LS}(l - n_\Delta(l))$. Most importantly, the initial
band number in (17.11) is not specified. It does not have to start with the first
band but can be any band of a desired wavelength. In this case, RHBP allows
users to tune particular bands as they desire.
2. Recently, a new approach to BS, referred to as progressive band selection (PBS),
was developed in Chang and Liu (2014). Although PBS performs LSMA in a

progressive manner, there are several distinct and salient differences between
PBS and RHBP. First and foremost is that RHBP is a recursive process using
(17.11) to update only innovation information band by band, while PBS is not
and must reimplement LSMA every time new bands are added. Second,
although both PBS and RHBP can be implemented progressively band by
band, only RHBP can be used in real time because the use of (17.11) allows
RHBP to be processed recursively without revisiting previous bands, whereas
PBS must recalculate the new signature matrix M as new bands are added. Third,
in order for PBS to work, band decorrelation and band prioritization are gener-
ally required and included as a part of PBS. Because band prioritization and band
decorrelation must be done prior to PBS, PBS generally cannot be implemented
in real time. Finally, by comparison with RHBP, which does not require prior
knowledge of the number of bands to be selected, PBS generally needs to know
in advance how many bands must be selected. To address this issue, PBS
includes the concept of band dimensionality allocation in conjunction with VD
to find the best possible band dimensionality for each signature to be unmixed
by LSMA.
3. As was pointed out, RHBP is a spectral band process, not BS, because RHBP,
using equations (17.6)-(17.11) to perform LSMA, does not involve determining
$n_{BS}$ or finding an optimal band set, $\Omega_{BS}$. However, RHBP can be considered a
generalized version of PBS if it is implemented in conjunction with band
prioritization and band decorrelation, along with VD. Since the value of VD,
$n_{VD}$, provides a good estimate bounding $n_{BS}$ from below and above, as shown
in Chang (2013) and Chang and Liu (2014), $n_{VD} \le n_{BS} \le 2n_{VD}$. Using the range
$[n_{VD}, 2n_{VD}]$, RHBP can be performed progressively by starting with the first band
indexed by band number $n_{VD}$ and ending with the last band indexed by
$2n_{VD}$. So an optimal band subset can be found from the subset $\{B_l\}_{l=n_{VD}}^{2n_{VD}}$,
where all bands in $\{B_l\}_{l=n_{VD}}^{2n_{VD}}$ have been prioritized by band prioritization and
decorrelated by band decorrelation. It was shown in Chang (2013) and Chang
and Liu (2014) that $n_{BS}$ varies with the endmembers $\{\mathbf{m}_j\}_{j=1}^{p}$ used to unmix data
samples, but it does fall in the range $n_{VD} \le n_{BS} \le 2n_{VD}$.
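As a small illustration of the band step size discussed in the first comment above, the following computation is hypothetical except that L = 169 matches the HYDICE scene used in this book:

```python
L = 169        # total number of bands (the HYDICE scene used in this book)
n_pubs = 18    # hypothetical number of bands chosen by progressive uniform BS
n_delta = L // n_pubs                    # band step size: floor(L / n_PUBS)
bands = list(range(1, L + 1, n_delta))   # bands visited by RHBP with this step
print(n_delta, len(bands))               # 9 19
```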
Finally, it is worth noting that RHBP was developed primarily to perform LSMA
in a BSQ format, where LSMA-unmixed results can be progressively presented to
data analysts, who can determine when BSQ format should be terminated for data
acquisition. Thus, RHBP does not perform BS or data reduction or compression. It
is quite different from conventional BS, where bands must be selected according to
a criterion of optimality to achieve data reduction prior to LSMA.
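Since $\hat{\boldsymbol{\alpha}}^{NCLS}$ and $\hat{\boldsymbol{\alpha}}^{FCLS}$ discussed in this section have no analytic forms, a brief progressive (PHBP-style, non-recursive) sketch of constrained unmixing over growing band subsets may help fix ideas. This is not the book's implementation: it uses SciPy's nonnegative least-squares solver, and the sum-to-one constraint of FCLS is imposed through the common row-augmentation device, with a hypothetical weight `delta`:

```python
import numpy as np
from scipy.optimize import nnls

def ncls(M, r):
    """Nonnegativity-constrained LS abundances for one pixel (iterative)."""
    return nnls(M, r)[0]

def fcls(M, r, delta=1e3):
    """Fully constrained LS: a weighted row of ones appended to M drives the
    nonnegative solution toward sum-to-one (row-augmentation device)."""
    M_aug = np.vstack([M, delta * np.ones(M.shape[1])])
    r_aug = np.append(r, delta)
    return nnls(M_aug, r_aug)[0]

rng = np.random.default_rng(3)
L, p = 169, 5
M = rng.random((L, p))                          # stand-in signature matrix
r = M @ np.array([0.4, 0.2, 0.2, 0.1, 0.1])     # noise-free mixed pixel

# Progressive (PHBP-style) constrained unmixing over growing band subsets
for l in (10, 60, 169):
    print(l, fcls(M[:l], r[:l]).round(3))
```

For this noise-free pixel the constrained estimates recover abundances near the simulated values at each band subset; the recursive counterparts would instead update these results band by band as described above.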

17.4 Synthetic Image Experiments

The synthetic image simulated in Fig. 1.15 is shown in Fig. 17.1, with five panels in
each row simulated by the same mineral signature and five panels in each column
having the same size.

(Legend: 100 % pure; 50 % signal + 50 % any other four; 50 % signal + 50 % background; 25 % signal + 75 % background)

Fig. 17.1 Set of 25 panels simulated by A, B, C, K, M

Among the 25 panels are five 4 × 4 pure-pixel panels for each row in the first
column, five 2 × 2 pure-pixel panels for each row in the second column, five 2 × 2
mixed-pixel panels for each row in the third column, and five 1 × 1 subpixel panels
for each row in both the fourth and fifth columns, where the mixed and subpanel
pixels were simulated according to the Fig. 17.1 legends. Thus, a total of 100 pure
pixels (80 in the first column and 20 in the second column), referred to as
endmember pixels, were simulated in the data by the five endmembers A, B, C,
K, and M. An area of background (BKG), marked "BKG" in the upper right corner
of Fig. 1.14a, was selected to find its sample mean, that is, the average of all pixel
vectors within BKG, denoted by b and plotted in Fig. 1.14b, to be used in the
simulation of BKG for an image scene with a size of 200 × 200 pixels in Fig. 17.1.
The reason for this background selection is empirical since the selected area "BKG"
seemed more homogeneous than other regions. Nevertheless, other areas could also
be selected for the same purpose. This b-simulated image background was further
corrupted by additive noise to achieve a certain signal-to-noise ratio (SNR),
which was defined in Harsanyi and Chang (1994) as a 50 % signature (i.e., reflectance/radiance) divided by the standard deviation of the noise.
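Under that SNR definition, the noise level per band follows directly from the background signature. A minimal sketch of the background simulation (the array names, the 189-band count of this synthetic scene, and the random stand-in for b are our assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
b = rng.random(189)          # stand-in for the BKG sample-mean signature (189 bands)
snr = 20.0                   # target SNR of 20:1, as in the TI/TE experiments

# SNR = (50 % signature) / sigma  =>  sigma = 0.5 * signature / SNR, per band
sigma = 0.5 * b / snr

# b-simulated 200 x 200 background, corrupted by additive Gaussian noise
background = np.tile(b, (200 * 200, 1))
background += sigma * rng.standard_normal(background.shape)
```

Each band then carries noise whose standard deviation is tied to the background signature, so brighter bands receive proportionally stronger noise.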
Once target pixels and background are simulated, two types of target insertion
can be designed to simulate experiments for various applications. The first type of
target insertion is target implantation (TI), which can be simulated by inserting
clean target panels into a noisy image BKG by replacing their corresponding BKG
pixels, where the SNR is empirically set to 20:1. That is, TI implants clean target
panel pixels into the noise-corrupted image BKG with SNR = 20:1, in which case there
are 100 pure panel pixels in the first and second columns. The second type of target
insertion is target embeddedness (TE), which is simulated by embedding clean
target panels into a noisy image BKG by superimposing target panel pixels over the
BKG pixels, where the SNR is again empirically set to 20:1. That is, TE embeds clean
target panel pixels into the noise-corrupted image BKG with SNR = 20:1, in which
case all 100 pure panel pixels in the first and second columns are no longer pure. In
other words, there is a difference between TI and TE worth noting: TE inserts
targets by adding target pixels to and superimposing them over background pixels
instead of replacing background pixels the way TI does. As a consequence, the
abundance fractions of the pixels into which target pixels are embedded no longer sum
to one.
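The distinction between the two insertion schemes can be made concrete in a couple of lines (a sketch under the same notation; `mask` marking the panel locations and `target` holding the clean panel spectra are our stand-ins):

```python
import numpy as np

def target_implantation(bkg, mask, target):
    """TI: replace the BKG pixels at panel locations with clean target pixels,
    so implanted panel pixels stay pure."""
    scene = bkg.copy()
    scene[mask] = target
    return scene

def target_embeddedness(bkg, mask, target):
    """TE: superimpose target pixels over the BKG pixels, so abundance
    fractions at embedded pixels no longer sum to one."""
    scene = bkg.copy()
    scene[mask] += target
    return scene
```

Under TI the 100 panel pixels in the first two columns remain pure; under TE the same locations become target-plus-background mixtures.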
Since both RHBP-NCLS and RHBP-FCLS unmixed all panel pixels in the first
three columns nearly 100 % accurately using nband = number of bands from 6 to
12, starting with band 1 and a band step size nΔ = 1, their unmixed results are
not included here. Instead, Fig. 17.2 only shows the progressive band-varying
performance of the RHBP-NCLS and RHBP-FCLS-unmixed results for subpixel
panels in the fourth and fifth columns in TI with band step size nΔ = 1, starting
with nband = 6 and running to nband = 12, with band 1 as the first band and band 6/band
12 as the last bands, corresponding to nband = 6 and 12, respectively.
Note that the reason for setting nband = 12 as the upper bound in Figs. 17.2 and 17.3
is that 2nVD, that is, twice the value of VD, is enough to have NCLS and
FCLS work effectively.
As for the TI scenario, where the panel pixels were implanted in BKG pixels,
Fig. 17.2 shows that when nband is equal to or greater than 6, RHBP-FCLS
accurately unmixed all the subpixel panels in the fourth and fifth columns specified
by all five mineral signatures A, B, C, K, and M. Similarly, RHBP-NCLS also
worked well as long as nband was equal to or greater than 8. However, compared to
RHBP-NCLS, RHBP-FCLS performed slightly better. Interestingly, the conclusions
drawn for the TE scenario, where panel pixels were superimposed on BKG
pixels, are otherwise. Since TE did not satisfy ASC, there are no endmembers.
Consequently, the TE scenario is generally used for signal detection rather than
endmember extraction. In this case, finding endmembers becomes a matter of
detecting endmembers. This can be seen in Fig. 17.3, where RHBP-NCLS
performed better than RHBP-FCLS in that RHBP-NCLS produced more accurate
abundance fractions of subpanel pixels in the fourth and fifth columns than RHBP-
FCLS did. It is worth noting that while RHBP-NCLS worked reasonably well in
unmixing subpanel pixels specified by the M signature, RHBP-FCLS did not work
in this case. This may be due to the fact that the M signature's spectral shape is close
to the spectral profile of the BKG signature according to Fig. 1.14b, in which case
FCLS could not unmix the subpanel pixels well, but as a detector RHBP-NCLS
could do so very effectively as long as nband ≥ 7. More surprisingly, RHBP-NCLS
could be used to unmix these subpanel pixels very accurately, a task that could not
be accomplished by RHBP-FCLS. For example, the worst case was subpanel pixels
unmixed by FCLS using the mineral signature A, where the RHBP-FCLS-unmixed
progressive performance seemed to achieve stability after nband ≥ 11, but the
progressive RHBP-FCLS-unmixed abundance fractions were still incorrect. Similarly,
the same phenomenon can be observed for the mineral signature B after nband
≥ 8 and for the mineral signature M after nband ≥ 7. The most stable RHBP-FCLS-
unmixed progressive performance was obtained from the signatures C and K, but

Fig. 17.2 Plots of RHBP-NCLS and RHBP-FCLS-unmixed results of panel pixels in (a) fourth column and (b) fifth column in TI from nband = 6 = nVD to nband = 12 = 2nVD by RHBP

their unmixed abundance fractions were still not accurate. All these problems were
caused by the simulated pixels, which did not satisfy ASC.
Interestingly, if we use the pure mineral signatures in Fig. 1.14b to form the
signature matrix M instead of the panel pixels directly extracted from TE, as we did
for Fig. 17.3, the unmixed results were similar to those obtained in Chang et al.

Fig. 17.3 Plots of RHBP-NCLS and RHBP-FCLS-unmixed results of panel pixels in (a) fourth
column and (b) fifth column in TE from band 6 (nVD) to band 12 (2nVD) by RHBP

(2010a, 2011b), where all abundance fractions were focused on a single mineral
signature because none of the pixels in TE was pure and they all violated ASC. For
more details the interested reader is referred to Chang et al. (2010a, 2011b).
To compare computing times, Figs. 17.4 and 17.5 plot the computing times
of UCLS, NCLS, and FCLS and their progressive counterparts,
RHBP-UCLS, RHBP-NCLS, and RHBP-FCLS, for TI and TE for processing a

Fig. 17.4 Average computing time of UCLS, NCLS, FCLS, RHBP-UCLS, RHBP-NCLS, RHBP-FCLS for TI

Fig. 17.5 Average computing time of UCLS, NCLS, FCLS, RHBP-UCLS, RHBP-NCLS, RHBP-FCLS for TE

single pixel on average over 10 trials, where the y-axis is the computing time in
milliseconds (ms) and the x-axis is nband, starting from band 1 and ending with
band 189, the last band. The computer environment used for the computations is as
follows: operating system: Windows 7 Premium; CPU: i5-480M; memory: DDR3,
8 GB; hard disk drive: 500 GB, 7200 rpm; compiler: Microsoft Visual C++ 2008
SP1.
As shown in Figs. 17.4 and 17.5, the computational cost required for processing
a single pixel by all progressive versions of LSMA techniques, RHBP-UCLS,

RHBP-NCLS, and RHBP-FCLS, is nearly constant for each newly added band. By
contrast, the computing times required by UCLS, NCLS, and FCLS increase
linearly with nband. This means that the more bands are involved in LSMA, the
more computing time each data sample requires, and thus the more significant
the savings offered by the progressive versions become. If a progressive
LSMA technique is used, the computational cost is nearly constant for
each band and is independent of the nband used for data unmixing, in which case
processing LSMA for each new incoming band requires the same computational
complexity.

17.5 Real Image Experiments

The image data to be studied are from the HYperspectral Digital Imagery Collection
Experiment (HYDICE) image scene shown in Fig. 17.6a (and Fig. 1.10a),
which has a size of 64 × 64 pixel vectors with 15 panels in the scene and the ground
truth map in Fig. 17.6b (Fig. 1.10b). It was acquired by 210 spectral bands with a
spectral coverage from 0.4 to 2.5 μm. Low-signal/high-noise bands, bands 1–3 and


Fig. 17.6 (a) HYDICE panel scene containing 15 panels; (b) ground truth map of spatial locations
of the 15 panels; (c) spectra of p1, p2, p3, p4, and p5; (d) ground truth map of 4 undesired signatures

202–210, and water vapor absorption bands, bands 101–112 and 137–153, were
removed. Thus, a total of 169 bands were used in the experiments. The spatial
resolution and spectral resolution of this image scene are 1.56 m and 10 nm,
respectively.
Figure 17.7 shows the RHBP-FCLS and RHBP-NCLS-unmixed abundance fractions
of the 19 R panel pixels on the y-axis and nband from the first 8 bands to all
169 bands on the x-axis. Note that since there are at least nine spectral signatures
present in the data, nband was initialized by starting with the first eight bands. In
addition, no BS was used in these experiments, and bands were simply processed
band by band with a band step size nΔ = 1 according to the original band order. In
this case, bands were tuned purely by their wavelength range, beginning with the
first 8 bands and increasing one band at a time, to observe the progressive RHBP-
FCLS and RHBP-NCLS-unmixed performance of the 19 R panel pixels in the
HYDICE image scene unmixed via RHBP.
As demonstrated in Fig. 17.7, the abundance fractions of the 19 panel pixels
unmixed by RHBP-NCLS seemed to be better than those unmixed
by RHBP-FCLS. However, this conclusion will be reversed (Fig. 17.8) when the
signatures used to unmix the data are extracted in an unsupervised manner directly
from the real scene instead of being the signatures in Fig. 17.6d identified by visual
inspection. Nevertheless, Fig. 17.7 demonstrates that, except for the panel pixels in row
1, the nband required for RHBP-NCLS and RHBP-FCLS to achieve stable performance
is actually smaller than 80, which is fewer than half the total number of bands,
169. This simple example shows that there is no need to use full bands for NCLS
and FCLS to perform well. Of course, if BS is used to select the appropriate bands,
the number of bands to be selected, nBS, will certainly be fewer than that required
by the results presented here without BS.
It was shown in Chang et al. (2010a, 2011b) that unsupervised LSMA generally
performed better than supervised LSMA for hyperspectral imagery when many
spectrally distinct signatures could not be identified by visual inspection or provided
by prior knowledge, as is the case for the HYDICE data in Fig. 17.6d. The experiments in
Chang et al. (2010a, 2011b) suggested that implementing unsupervised LSMA
(ULSMA) was more effective than supervised LSMA (SLSMA). With the main
focus of this chapter being on least squares-based LSMA, the least squares-based
approach developed in Chang et al. (2010a) was selected over the component
analysis-based approach proposed in Chang et al. (2011b). First, VD was used to
estimate the number of signatures employed to unmix this scene, nVD, which was
nine, with a false alarm probability of P_F ≤ 10^−3. Then an automatic target
generation process (ATGP) (Ren and Chang 2003) was used to find target pixels
of interest from the original HYDICE and sphered HYDICE scenes, respectively.
Figure 17.8a shows the nine target pixels found in the HYDICE scene by ATGP that can
be considered a set of background (BKG) pixels, S^BKG = {b_j^ATGP}_{j=1}^9, in which three
panel pixels from rows 1, 3, and 5 are included. Figure 17.8b shows the nine target
pixels extracted from the sphered HYDICE scene by ATGP, which included five

Fig. 17.7 RHBP-NCLS and RHBP-FCLS-unmixed results of 19 R panel pixels in HYDICE scene
from band 8 to band 169


panel pixels extracted from each of the five rows, and can be considered a set of
target pixels, S^target = {t_j^ATGP}_{j=1}^9. Figure 17.8c singles out the five pixels
identified as BKG pixels, S̃^BKG = {b̃_i^ATGP}_{i=1}^5, obtained by removing the four target pixels
with a similarity measure such as the spectral angle mapper (SAM), and Fig. 17.8d
shows a total of 14 pixels obtained by combining the BKG set S̃^BKG
in Fig. 17.8c with the target set S^target in Fig. 17.8b into a BKG–target merged set
S̃^BKG ∪ S^target to be used for spectral unmixing, where the numbers in the figures

Fig. 17.8 ATGP-generated BKG and target pixels: (a) 9 BKG pixels in original data; (b) 9 target
pixels in sphered data; (c) 5 BKG pixels not identified as target pixels; (d) 14 pixels obtained by
combining the pixels in (b, c)

indicate the orders of pixels extracted by ATGP. Using these 14 target pixels to
form a signature matrix M, FCLS was then performed by RHBP band by band.
Since the total number of signatures used was 14, the initial band number, nband, was set
to 13, and FCLS was then performed progressively band by band via (17.6) to unmix
the 19 R panel pixels in Fig. 17.6b until it reached the full band number, 169. In
analogy with Fig. 17.7, Fig. 17.9 also plots the progressive FCLS-unmixed results
of these 19 R panel pixels in terms of their unmixed abundance fractions on the y-
axis and nband from the first 13 bands to the full 169 bands on the x-axis.
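Each point on these progressive curves is a constrained unmixing restricted to the first l bands. The sketch below mimics that loop directly, using scipy.optimize.nnls as an NCLS-style nonnegativity-constrained solver and the standard sum-to-one row augmentation as an FCLS approximation; the synthetic signature matrix, the pixel, and the `delta` weight are our stand-ins, and this direct reimplementation deliberately ignores the recursive updates that make RHBP efficient.

```python
import numpy as np
from scipy.optimize import nnls

def progressive_fcls(M, r, l0, delta=1e-3):
    """Unmix sample r with signature matrix M (L x p) band by band from l0 to L.
    ANC is imposed by nnls; ASC is approximated by appending a sum-to-one row
    while scaling the data rows by a small delta so that row dominates."""
    L, p = M.shape
    results = {}
    for l in range(l0, L + 1):
        A = np.vstack([delta * M[:l], np.ones(p)])   # augmented system
        y = np.append(delta * r[:l], 1.0)
        results[l] = nnls(A, y)[0]                   # nonnegative LS solution
    return results

rng = np.random.default_rng(0)
M = rng.random((169, 14))                  # 14 signatures, 169 bands (toy data)
alpha = np.array([0.6, 0.4] + [0.0] * 12)  # true abundances (sum to one)
r = M @ alpha                              # noiseless mixed pixel
abund = progressive_fcls(M, r, l0=13)      # nband from 13 to 169, as in the text
```

Plotting `abund[l]` against l reproduces the kind of progressive abundance profile shown in Fig. 17.9, one curve per signature.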
The results presented in Fig. 17.9 show that only half of the spectral
bands were required to achieve stable performance for all 19 R panel pixels in terms
of unmixed abundance fractions. Also, the RHBP-constrained LS methods are sensitive to
the number of endmembers if that number is not properly selected. However, once
that number reaches a certain value, performance tends to be robust, as shown in
Fig. 17.9. For example, this number is nVD = 6 for TI and TE and 18 for
the HYDICE data. This phenomenon is also observed in the traditional UCLS (i.e.,
unsupervised unconstrained least-squares orthogonal subspace projection), NCLS,
and FCLS methods described in Chap. 4.
According to the ground truth, the pure panel pixels in the first column are
supposed to have 100 % purity in abundance. In fact, the progressive profiles of
unmixed abundance fractions showed otherwise for some panel pixels, such as p211,
p311, p412, and p511, which are actually not true pure pixels, since p211 and p311 have
85–95 % purity, p412 has 41 % purity, and p511 has 72 % purity. This makes
sense because two panel pixels account for twice the spatial resolution of 1.56 m, that
is, 3.12 m, compared to the panels in the first column, which have a size of only
3 × 3 m. Furthermore, comparing the results in Fig. 17.9 to those in Fig. 17.7, it is
clearly demonstrated that RHBP-FCLS did perform better than RHBP-NCLS in
terms of the accuracy of the unmixed abundance fractions of the 19 R panel pixels.
This conclusion is the complete opposite of that drawn from Fig. 17.7. The key
factor lies in the fact that the signatures used for data unmixing must be generated
directly by an unsupervised algorithm, as in unsupervised LSMA (ULSMA), without
human intervention such as prior knowledge or visual inspection, as in supervised
LSMA (SLSMA). Moreover, the nband required for ULSMA was much smaller than

that required by SLSMA. In addition, the progressive performance of ULSMA was
also better than that of SLSMA when the 14 unsupervised target pixels found by
ATGP were used for LSMA instead of the 9 signatures used for SLSMA. As a
result, those 14 ATGP-extracted target pixels in Fig. 17.8d that were used as
signatures to unmix data sample vectors were unmixed by FCLS into 100 % purity
Fig. 17.9 Unmixed results of 19 R panel pixels in HYDICE scene by unsupervised RHBP-NCLS
and RHBP-FCLS from band 13 to band 169

for some panel pixels, for example, pixels p11, p221, p312, p411, and p521, as shown
in Fig. 17.9.
Three comments on the experiments are worthwhile:
1. Despite the fact that only one real hyperspectral scene was used for the exper-
iments in this chapter, the conclusions drawn from these experiments for this
scene can also be applied to other real data sets. However, in this case, ground
truth must be provided for verification. The presented HYDICE data seemed to
be the best example we had for illustration. For quantitative analysis the exper-
iments conducted in Sect. 17.4 for two types of synthetic images, TI and TE, also
confirmed our conclusions for real data experiments.
2. It is unfortunate that a vivid real-time demonstration of the real HYDICE experiments
   could not be shown in this chapter to illustrate how the progressive unmixed
   results of each data sample vector gradually change as bands are added band
   by band. However, once the band number, nband, reached a certain
   limit, progressive LSMA did not necessarily improve even if more bands
   were added. Nevertheless, unmixed results did indeed progressively improve
   before nband reached a specific number, and this number varies with different
   signatures.
3. As noted, RHBP can be used to process specific bands of interest, such as a range
of wavelengths and visible or infrared bands, in which case RHBP can tune these
   particular bands as desired. Using panel pixel p22 as an illustrative example, its
   unmixed performance was stable with nband between 8 and 20 but became worse
   between 20 and 33, flat between 34 and 52, then began improving again
   between 53 and 70 and fluctuated between 70 and 100. It finally reached
   stability after nband was greater than 100. Thus, in this specific example, the
   visible wavelengths did not improve but rather deteriorated the unmixed performance
   of p22. Instead, the near-infrared wavelengths actually provided useful
   and valuable information for improving unmixed performance. Similar
   evidence was also found in other panel pixels.
To illustrate the additional computational savings resulting from progressive band
processing of FCLS unmixing, Fig. 17.10 plots the computing time in milliseconds
(ms) required for three LSMA techniques, UCLS, NCLS, and FCLS, along with
their corresponding progressive versions, RHBP-UCLS, RHBP-NCLS, and RHBP-
FCLS, on a Windows 7 Premium machine running an i5-480M processor, with
DDR3 RAM, 8 GB memory, a 500 GB 7200 rpm hard disk drive, and the Microsoft
Visual C++ 2008 SP1 compiler.
Similar to Figs. 17.4 and 17.5, Fig. 17.10 also shows that the three
progressive LSMA techniques saved significant amounts of computing time
compared to their counterparts that do not take advantage of RHBP: the
computing time required by the former was nearly constant, whereas that required by
the latter increased linearly with nband.

Fig. 17.10 Computing time in seconds of RHBP of UCLS, NCLS, and FCLS along with their
corresponding progressive versions, RHBP-UCLS, RHBP-NCLS, and RHBP-FCLS

17.6 Conclusions

This chapter developed a new concept of RHBP-LSMA that can perform LSMA
band by band recursively in real time, as if it were a Kalman filter, at the same time
bands are being collected. To the author’s best knowledge no similar technique has
been reported in the literature. Several interesting findings are noteworthy. First,
RHBP does not select bands. Instead, it allows users to tune bands of interest and
process these bands progressively band by band and recursively in real time. As a
result, RHBP does not require one to determine nBS or to find appropriate bands, as
is the case with BS. RHBP can be performed whenever bands are available without
waiting for full bands to be collected. Second, RHBP takes advantage of recursive
equations to update only innovation information without reprocessing previous
bands. As a consequence, RHBP can be implemented in real time band by band
progressively and recursively. With this real-time capability, RHBP is a very
promising technique in satellite data communication, transmission, and compres-
sion, none of which can be accomplished with BS. Third, RHBP-LSMA enables
users to evaluate progressive background suppression resulting from LSMA band
by band during data transmission in the BSQ format. In addition, RHBP also helps
users identify which bands are more effective and crucial in LSMA. Last but not
least, RHBP offers computational savings, which is a significant and tremendous
advantage. This is particularly important for hyperspectral imaging sensors operating
on a space-based platform. As a concluding remark, it should also be noted
that RHBP is application-oriented, in that its utility must be justified by applications.
The RHBP-LSMA investigated in this chapter has demonstrated great potential
in hyperspectral data communication between transmitting and receiving ends
once all necessary preprocessing tasks are accomplished. Most recently, the RHBP
idea was also applied by Wu et al. (2013) to the modified FCLS (MFCLS)
developed in Chang (2003, Chap. 10).
Chapter 18
Recursive Hyperspectral Band Processing
of Growing Simplex Volume Analysis

Abstract Recursive hyperspectral band processing (RHBP) has shown promise in
a variety of applications. For example, it provides progressive hyperspectral target
detection maps (Chaps. 13–15) or progressive unmixed abundance fraction maps
(Chaps. 16 and 17), so that these progressive band-varying profiles can be used to
monitor their interband changes to identify band significance for image analysts.
Interestingly, RHBP can also offer another major benefit for simplex volume (SV)-
based endmember finding algorithms such as N-FINDR, developed by Winter
(1999a, b), and the simplex growing algorithm (SGA), developed by Chang et al.
(2006b). Since the number of endmembers, p, is relatively small compared to the
number of total bands, L, used for data acquisition, it is a general practice for these
algorithms to apply dimensionality reduction (DR) to reduce the dimensionality from
L to p − 1 so that SV can be appropriately calculated by a matrix determinant of full
rank. However, as shown in Chap. 2, SVs found from DR data are not necessarily
true SVs. To resolve this issue, Chaps. 11 and 12 developed two different
true SVs. To resolve this issue, Chaps. 11 and 12 developed two different
approaches to finding endmembers by calculating geometric SVs (GSVs) without
DR. This chapter takes advantage of the techniques in Chaps. 11 and 12 to derive a
third approach, called RHBP of growing simplex volume analysis (RHBP-GSVA),
which extends SGA and orthogonal projection-based SGA (OPSGA) in Chap. 11
and geometric SGA (GSGA) in Chap. 12 to finding endmembers band by band
progressively and recursively. Several benefits can be gained from RHBP-GSVA.
First and foremost is that RHBP-GSVA allows users to find endmembers through
progressive band-varying spectral profiles. Another is that it enables users to
specify those bands that are significant to find different endmembers. Finally and
most importantly, RHBP-GSVA provides a means of understanding the spectral
characteristics of endmembers during the course of the endmember finding process.

18.1 Introduction

Endmember finding is a fundamental task in hyperspectral data exploitation
because endmembers can be used to specify particular spectral classes. Because
the number of endmembers, denoted by nE, is much smaller than the number of data

© Springer International Publishing Switzerland 2017 529


C.-I Chang, Real-Time Recursive Hyperspectral Sample and Band Processing,
DOI 10.1007/978-3-319-45171-8_18

spectral dimensions, L, dimensionality reduction (DR) is generally required. Ideally,
if each endmember can be accommodated and specified by a particular
spectral band, then only nE bands are needed to find all nE endmembers. This
can be illustrated by simplex volume (SV)-based endmember finding algorithms
(EFAs) such as the N-finder algorithm (N-FINDR) developed by Winter (1999a, b),
which finds a simplex of maximal volume with its vertices specified by all
endmembers. As a result, an nE-vertex simplex has only nE − 1 dimensions. This
implies that no more than nE dimensions are required to accommodate an nE-vertex
simplex. The main issue is how to find nE appropriate bands that can accommodate
such an nE-vertex simplex. Band selection (BS) seems to be a good choice for this
purpose. However, BS requires repeatedly solving band optimization problems over
all possible nE-band combinations among all L spectral bands, which is practically
impossible. To mitigate this issue, band prioritization is usually used to rank all
bands according to priorities assigned to each band. However, the bands adjacent to a
band with a high priority may also have high priorities. To avoid selecting
adjacent bands with too much overlapping spectral information, band decorrelation
(BD) is also needed. But a challenging issue is then to select an adequate threshold for
BD. With all these issues remaining unsolved, this chapter investigates a new
concept, called progressive hyperspectral band processing (PHBP), which offers
an alternative solution. Unlike BS, PHBP does not require prior knowledge of nE or
select particular bands. Instead, it processes bands progressively band by band, so
that progressive changes in the interband spectral variations of endmembers of interest
can be captured. Such progressive profiles cannot be offered by BS or by any EFA
using full bands. With this interpretation, a perfect EFA candidate for this
purpose is the simplex growing algorithm (SGA) developed by Chang (2006). It is
also a progressive EFA that generates one endmember after another progressively
by growing simplexes vertex by vertex, with each vertex specified by one particular
endmember. The first attempt at developing a progressive band processing version of
SGA was made by Schultz (2014), where the determinant-based SGA (DSGA)
discussed in Chap. 2 was used to find a maximal SV. However, the resulting
PHBP-SGA suffers from exceedingly large amounts of computing time and also
does not yield true SVs. In this chapter, we take advantage of orthogonal projection-
based SGA (OPSGA), developed in Chap. 11, and geometric SGA (GSGA),
developed in Chap. 12, to develop their PHBP versions, PHBP-OPSGA and
PHBP-GSGA. To facilitate real-time processing capabilities, PHBP-OPSGA and
PHBP-GSGA are further extended to their recursive counterparts, RHBP-OPSGA
and RHBP-GSGA, in the same way that RHSP-OPSGA and RHSP-GSGA are
derived in Chaps. 11 and 12.

18.2 Recursive Hyperspectral Band Processing
of Orthogonal Projection-Based Simplex Growing
Algorithm

As noted in Chap. 2, the main issue arising in the implementation of SGA is the use of
the matrix determinant to calculate SVs, referred to as determinant-based SVs
(DSVs), by finding matrix inverses, which requires an enormous amount of computing
time. Fortunately, Chap. 11 develops an approach, called orthogonal
projection-based SGA (OPSGA), which converts finding DSVs to finding maximal
OP-based SVs. This approach turns out to be the same as using an automatic target
generation process (ATGP) to find its targets. Accordingly, the RHBP-ATGP
developed in Chap. 15 is readily applied to developing RHBP-OPSGA, although it
requires significant modifications.

18.2.1 Recursive Equations for RHBP-OPGSVA


Assume that {m_i}_{i=1}^j is a set of j endmembers found by OPSGA and that
these j endmembers form a j-vertex simplex, S_j = S(m_1, ..., m_{j−1}, m_j). Suppose
that m_{j+1} is the next endmember to be found by SGA. As discussed in Chap. 2, the
SV of S_{j+1} = S(m_1, ..., m_j, m_{j+1}) can be calculated by multiplying the SV of
S_j = S(m_1, ..., m_j), which is considered the base of S_{j+1}, by the OP of m_{j+1}
onto S_j. If m_{j+1} is decomposed into two orthogonal vectors as
m_{j+1} = m⊥_{j+1} + m∥_{j+1}, where m⊥_{j+1} is orthogonal to the base
S_j = S(m_1, ..., m_j) and m∥_{j+1} is parallel to it, then the height of
S_{j+1} = S(m_1, ..., m_j, m_{j+1}) can be calculated by h_{j+1} = ||m⊥_{j+1}||. According
to Theorem 2.1 in Chap. 2, the GSV of S_{j+1} = S(m_1, ..., m_j, m_{j+1}) can be
calculated by

V(S_{j+1}) = GSV(m_1, ..., m_j, m_{j+1}) = (1/j) GSV(m_1, ..., m_j) h_{j+1},   (18.1)

where GSV(m_1, ..., m_j, m_{j+1}) is the GSV of S(m_1, ..., m_j, m_{j+1}). As
a result, finding the maximal GSV of S_{j+1} via (18.1) is the same as finding the
maximal OP, h_{j+1}. Equivalently, OPSGA is specifically designed to find the m_{j+1}
with the maximal OP, h_{j+1} = ||m⊥_{j+1}||, which is the maximal height above
S_j = S(m_1, ..., m_j) and thus yields the maximal GSV of
S_{j+1} = S(m_1, ..., m_j, m_{j+1}). Since this approach can only work for growing
simplexes, in this chapter we use growing simplex volume analysis (GSVA) instead
of SGA to reflect the growing nature of (18.1). Interestingly, finding such a maximal
OP, h_{j+1}, can actually be done using ATGP, as shown by OPSGA in Chap. 11. This
fact provides us with a feasible means of developing RHBP-SGA to be
implemented as RHBP-OPSGA, where OPSGA is used to replace SGA.
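The pyramid-volume relation behind (18.1) can be checked numerically on toy data. In the sketch below, which is our illustration, the Gram-determinant formula serves only as an independent reference for simplex volume:

```python
import numpy as np
from math import factorial

def simplex_volume(V):
    """Volume of the simplex with vertex columns V (k+1 vertices in R^L),
    via the Gram determinant of its edge vectors: sqrt(det(E^T E)) / k!."""
    E = V[:, 1:] - V[:, :1]                       # edge vectors from vertex 1
    k = E.shape[1]
    return np.sqrt(np.linalg.det(E.T @ E)) / factorial(k)

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 4))                   # 4 vertices in R^5: a 3-simplex
base, new = M[:, :3], M[:, 3]                     # base S_j with j = 3, plus m_{j+1}

# Height h_{j+1}: component of m_{j+1} - m_1 orthogonal to the base's span
E = base[:, 1:] - base[:, :1]
P_perp = np.eye(5) - E @ np.linalg.pinv(E)        # projector onto span(E)^⊥
h = np.linalg.norm(P_perp @ (new - base[:, 0]))

j = 3
assert np.isclose(simplex_volume(M), simplex_volume(base) * h / j)   # (18.1)
```

Maximizing the volume of the grown simplex over candidate vertices is therefore equivalent to maximizing the height h alone, which is exactly what OPSGA exploits.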
First, let {m_i}_{i=1}^j be the j endmembers found so far, let
m_j^l = (m_{j1}^l, m_{j2}^l, ..., m_{jl}^l)^T be the l-dimensional vector formed by the
first l bands of the jth endmember m_j, and let m(l) = (m_{1l}, m_{2l}, ..., m_{jl})^T be
the j-dimensional vector formed by the lth band of each of the j endmembers. Now let

U_lj = [m_1^l  m_2^l  ⋯  m_j^l] = [ U_(l−1)j ; m^T(l) ],   (18.2)

that is, U_lj is the l × j matrix whose columns are the j endmembers restricted to the
first l bands, whose first l − 1 rows form U_(l−1)j, and whose last row is m^T(l).

Then, according to (15.8), we can write

[U_lj^T U_lj]^{−1} = ( [U_(l−1)j^T  m(l)] [ U_(l−1)j ; m^T(l) ] )^{−1}
                   = ( U_(l−1)j^T U_(l−1)j + m(l) m^T(l) )^{−1},   (18.3)

where U_(l−1)j is the (l − 1) × j matrix formed by the first l − 1 bands of the
j endmembers. Applying Woodbury's identity in Appendix A,

(A + u v^T)^{−1} = A^{−1} − (A^{−1} u v^T A^{−1}) / (1 + v^T A^{−1} u),   (18.4)

to (18.3), we can obtain recursive equations that allow us to calculate P⊥_{U_lp} in
(15.9) and ||P⊥_{U_lp} r_l||² in (15.10) directly from P⊥_{U_(l−1)p} and m(l), without
recalculating all l bands, as follows:



P⊥
Ulj ¼ Ill  Ulj Ulj
#

0  1
1
"
T
#B Uðl1Þj Uðl1Þj UðTl1Þj ρ  C
l ðl1Þ
Uðl1Þj B C
¼ Ill  B h iC
B C
mT ðlÞ @  1 ρ  ρ T UT ρ  ρ T mðlÞ A
l ðl1Þ lðl1Þ ðl1Þj l ðl1Þ lðl1Þ
1 þ mT ðlÞρ 
l ðl1Þ
2 3
P⊥
Uðl1Þj Uðl1Þj ρ 
l ðl1Þ
6 7
¼6
4  1
T 7
5
 Uðl1Þj UðTl1Þj Uðl1Þj mðlÞ 1  mT ðlÞρ 
l ðl1Þ

2 3
Uðl1Þj ρ  ρ T UT Uðl1Þj ρ  ρ T mð l Þ
l ðl1Þ lðl1Þ ðl1Þj l ðl1Þ lðl1Þ
1 6 7
þ 4 5
1 þ mT ðlÞρ  mT ðlÞρ  ρ T UðTl1Þj mT ðlÞρ  ρ T mðlÞ
l ðl1Þ l ðl1Þ l ðl1Þ l ðl1Þ l ðl1Þ

ð18:5Þ


$$
\begin{aligned}
\left\|\mathbf{P}^{\perp}_{\mathbf{U}_{lp}}\mathbf{r}_l\right\|^2 = \mathbf{r}_l^T\mathbf{P}^{\perp}_{\mathbf{U}_{lp}}\mathbf{r}_l
&= \mathbf{r}_{l-1}^T\mathbf{P}^{\perp}_{\mathbf{U}_{(l-1)p}}\mathbf{r}_{l-1}
- 2r_l\left(\mathbf{U}_{(l-1)p}\boldsymbol{\rho}_{l(l-1)}\right)^T\mathbf{r}_{l-1}
+ \left(1 - \mathbf{m}^T(l)\boldsymbol{\rho}_{l(l-1)}\right)r_l^2\\
&\quad + \frac{\left(\mathbf{r}_{l-1}^T\mathbf{U}_{(l-1)p}\boldsymbol{\rho}_{l(l-1)} + \boldsymbol{\rho}_{l(l-1)}^T\mathbf{m}(l)r_l\right)\left(\boldsymbol{\rho}_{l(l-1)}^T\mathbf{U}_{(l-1)p}^T\mathbf{r}_{l-1} + \boldsymbol{\rho}_{l(l-1)}^T\mathbf{m}(l)r_l\right)}{1 + \mathbf{m}^T(l)\boldsymbol{\rho}_{l(l-1)}},
\end{aligned} \tag{18.6}
$$

where $\boldsymbol{\rho}_{l(l-1)} = \left(\mathbf{U}_{(l-1)j}^T\mathbf{U}_{(l-1)j}\right)^{-1}\mathbf{m}(l)$ is a $j$-dimensional vector.
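As a sanity check on (18.3) and (18.4), the following NumPy sketch (illustrative sizes and variable names, not code from the book) verifies that the rank-one Woodbury update of $(\mathbf{U}_{(l-1)j}^T\mathbf{U}_{(l-1)j})^{-1}$ reproduces the direct inverse $(\mathbf{U}_{lj}^T\mathbf{U}_{lj})^{-1}$ computed from all $l$ bands:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: j endmembers observed over l bands.
j, l = 4, 20
U_prev = rng.normal(size=(l - 1, j))      # U_{(l-1)j}: first l-1 bands of the j endmembers
m_l = rng.normal(size=j)                  # m(l): the l-th band value of each endmember

# Direct inverse using all l bands.
U = np.vstack([U_prev, m_l])              # U_{lj} = [U_{(l-1)j}; m^T(l)], per (18.2)
direct = np.linalg.inv(U.T @ U)

# Recursive update via Woodbury's identity (18.3)-(18.4):
# (B + m m^T)^{-1} = B^{-1} - (B^{-1} m)(B^{-1} m)^T / (1 + m^T B^{-1} m)
B_inv = np.linalg.inv(U_prev.T @ U_prev)  # carried over from the previous band
rho = B_inv @ m_l                         # rho_{l(l-1)} = (U^T U)^{-1} m(l)
recursive = B_inv - np.outer(rho, rho) / (1.0 + m_l @ rho)

assert np.allclose(direct, recursive)
```

Note that the recursive branch never touches the $(l-1)\times j$ data block again; only the cached $j\times j$ inverse and the new band vector $\mathbf{m}(l)$ are needed.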

18.2.2 Recursive Hyperspectral Band Processing of OPSGA

Using the recursive equations derived in (18.3)–(18.6), we are ready to develop a


recursive version of OPSGA, recursive hyperspectral band processing of OPSGA (RHBP-OPSGA), as follows.

Recursive Hyperspectral Band Processing of OPSGA

Outer Loop indexed by j from 1 to p
Inner Loop indexed by l from 1 to L
1. Initial condition:
   Let l ← j. Find an initial jth endmember m_j^{(l)} = argmax_{r_l}{r_l^T r_l}. Set U_j^{(l)} = [m_1^{(l)} m_2^{(l)} ⋯ m_j^{(l)}], and calculate P^⊥_{U_j^{(l)}}.
2. At the lth iteration, find m_j^{(l)} by maximizing r^T P^⊥_{U_{lj}} r via (18.6) over all data sample vectors, {r_l(i)}_{i=1}^N. This can be done in the following steps:
   (a) Initial conditions: Let max_l^j = (r_l(1))^T P^⊥_{U_{lj}} r_l(1) and m_j^{(l)} ← r_l(1). Set i = 2.
   (b) At the ith iteration, calculate (r_l(i))^T P^⊥_{U_{lj}} r_l(i).
   (c) If (r_l(i))^T P^⊥_{U_{lj}} r_l(i) > max_l^j, then set max_l^j = (r_l(i))^T P^⊥_{U_{lj}} r_l(i) and m_j^{(l)} ← r_l(i). Otherwise, continue.
   (d) If i < N, then let i ← i + 1 and go to step 2(b). Otherwise, the iteration is terminated and m_j^{(l)} has been found.
   (e) Let l ← l + 1, and use (18.3) and (18.5) to update U_{lj}^# and P^⊥_{U_{lj}} via the previously calculated U_{(l-1)j}^# and P^⊥_{U_{(l-1)j}}.
End (Inner Loop)
Go to step 2 until l = L.
End (Outer Loop)
Figure 18.1 depicts a flowchart of RHBP-OPSGA.
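The key saving in step 2 is that $\mathbf{r}^T\mathbf{P}^{\perp}_{\mathbf{U}_{lj}}\mathbf{r}$ can be obtained from $(l-1)$-band quantities via (18.6). The following numerical sketch (hypothetical sizes; a NumPy stand-in, not the book's code) checks the recursion against a direct computation with all $l$ bands:

```python
import numpy as np

rng = np.random.default_rng(1)
j, l = 3, 15
U_prev = rng.normal(size=(l - 1, j))   # U_{(l-1)j}
m_l = rng.normal(size=j)               # m(l): new band of each endmember
r_prev = rng.normal(size=l - 1)        # r_{l-1}: sample restricted to first l-1 bands
r_l = 0.7                              # new band value of the sample

def perp(U):
    # Orthogonal-complement projector P_U^perp = I - U (U^T U)^{-1} U^T
    return np.eye(U.shape[0]) - U @ np.linalg.inv(U.T @ U) @ U.T

# Direct computation with all l bands.
U = np.vstack([U_prev, m_l])
r = np.append(r_prev, r_l)
direct = r @ perp(U) @ r

# Recursive computation via (18.6), reusing only (l-1)-band quantities.
B_inv = np.linalg.inv(U_prev.T @ U_prev)
rho = B_inv @ m_l                      # rho_{l(l-1)}
c = m_l @ rho                          # m^T(l) rho_{l(l-1)}
recursive = (r_prev @ perp(U_prev) @ r_prev
             - 2.0 * r_l * (U_prev @ rho) @ r_prev
             + (1.0 - c) * r_l**2
             + (r_prev @ U_prev @ rho + c * r_l) ** 2 / (1.0 + c))

assert np.isclose(direct, recursive)
```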

18.3 Recursive Hyperspectral Band Processing of Geometric Simplex Growing Algorithm

Analogous to the development of RHBP-OPSGA, recursive hyperspectral band processing of GSGA (RHBP-GSGA) can also be derived as follows. According to (12.29)–(12.30), they are reexpressed by

$$
\mathbf{m}_j^l = \left(m_{j1}^l, m_{j2}^l, \ldots, m_{jl}^l\right)^T, \quad \tilde{\mathbf{m}}_j^l = \left(\tilde{m}_{j1}^l, \tilde{m}_{j2}^l, \ldots, \tilde{m}_{jl}^l\right)^T; \tag{18.7}
$$

$$
\mathbf{u}_j^l = \left(u_{j1}^l, u_{j2}^l, \ldots, u_{jl}^l\right)^T, \quad \mathbf{U}_{j-2}^l = \left[\mathbf{u}_2^l\ \mathbf{u}_3^l\ \cdots\ \mathbf{u}_{j-2}^l\right]; \tag{18.8}
$$

where the superscript "GSGA" in (12.29)–(12.30) is omitted for simplicity. Then (18.7) can be represented by

$$
\mathbf{m}_j^l = \left(\mathbf{I} - \mathbf{U}_{j-2}^l\left(\mathbf{U}_{j-2}^l\right)^T\right)\tilde{\mathbf{m}}_j^l. \tag{18.9}
$$
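Equation (18.9) is a projection of $\tilde{\mathbf{m}}_j^l$ onto the orthogonal complement of the space spanned by the columns of $\mathbf{U}_{j-2}^l$. A minimal sketch, assuming $\mathbf{U}_{j-2}^l$ has orthonormal columns (stand-in data and sizes, not from the book):

```python
import numpy as np

rng = np.random.default_rng(3)
l = 12
# Hypothetical orthonormal columns standing in for U_{j-2}^l.
Q, _ = np.linalg.qr(rng.normal(size=(l, 2)))
m_tilde = rng.normal(size=l)           # stand-in for m~_j^l

# (18.9): m_j^l = (I - U U^T) m~_j^l, valid when U has orthonormal columns.
m = (np.eye(l) - Q @ Q.T) @ m_tilde

assert np.allclose(Q.T @ m, 0)         # result is orthogonal to every column of U
```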

Fig. 18.1 Flowchart of RHBP-OPSGA

First, let $\{\mathbf{m}_j\}_{j=1}^{j}$ be the $j$ endmembers, let $\mathbf{m}_j^l = \left(m_{j1}^l, m_{j2}^l, \ldots, m_{jl}^l\right)^T$ be an $l$-dimensional vector formed by the first $l$ bands of the $j$th endmember $\mathbf{e}_j^l$, and let $\mathbf{m}(l) = \left(m_{1l}, m_{2l}, \ldots, m_{jl}\right)^T$ be a $j$-dimensional vector obtained by taking the $l$th band from each of the $j$ endmembers. Now let

$$
\mathbf{U}_{lj} = \left[\mathbf{m}_1^l\ \mathbf{m}_2^l\ \cdots\ \mathbf{m}_j^l\right]
= \begin{bmatrix}
m_{11}^l & m_{21}^l & \cdots & m_{(j-1)1}^l & m_{j1}^l\\
m_{12}^l & m_{22}^l & \ddots & m_{(j-1)2}^l & m_{j2}^l\\
\vdots & \ddots & \ddots & \ddots & \vdots\\
m_{1(l-1)}^l & m_{2(l-1)}^l & \cdots & m_{(j-1)(l-1)}^l & m_{j(l-1)}^l\\
m_{1l}^l & m_{2l}^l & \cdots & m_{(j-1)l}^l & m_{jl}^l
\end{bmatrix}
= \begin{bmatrix}\mathbf{U}_{(l-1)j}\\ \mathbf{m}^T(l)\end{bmatrix}. \tag{18.10}
$$

Using (18.3)–(18.6) and (18.10), we can express

$$
\left(\mathbf{U}_{lj}^T\mathbf{U}_{lj}\right)^{-1}
= \left(\begin{bmatrix}\mathbf{U}_{(l-1)j}^T & \mathbf{m}(l)\end{bmatrix}\begin{bmatrix}\mathbf{U}_{(l-1)j}\\ \mathbf{m}^T(l)\end{bmatrix}\right)^{-1}
= \left(\mathbf{U}_{(l-1)j}^T\mathbf{U}_{(l-1)j} + \mathbf{m}(l)\mathbf{m}^T(l)\right)^{-1}, \tag{18.11}
$$

where

$$
\mathbf{U}_{(l-1)j} = \begin{bmatrix}
m_{11}^l & m_{21}^l & \cdots & m_{(j-1)1}^l & m_{j1}^l\\
m_{12}^l & m_{22}^l & \vdots & m_{(j-1)2}^l & m_{j2}^l\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
m_{1(l-1)}^l & m_{2(l-1)}^l & \cdots & m_{(j-1)(l-1)}^l & m_{j(l-1)}^l
\end{bmatrix}
$$

is an $(l-1)\times j$ matrix, and
$$
\begin{aligned}
\mathbf{P}^{\perp}_{\mathbf{U}_{lj}} &= \mathbf{I}_{l\times l} - \mathbf{U}_{lj}\mathbf{U}_{lj}^{\#}\\
&= \mathbf{I}_{l\times l} - \begin{bmatrix}\mathbf{U}_{(l-1)j}\\ \mathbf{m}^T(l)\end{bmatrix}
\left(\left[\left(\mathbf{U}_{(l-1)j}^T\mathbf{U}_{(l-1)j}\right)^{-1}\mathbf{U}_{(l-1)j}^T\ \ \boldsymbol{\rho}_{l(l-1)}\right]
- \frac{\boldsymbol{\rho}_{l(l-1)}\left[\boldsymbol{\rho}_{l(l-1)}^T\mathbf{U}_{(l-1)j}^T\ \ \boldsymbol{\rho}_{l(l-1)}^T\mathbf{m}(l)\right]}{1 + \mathbf{m}^T(l)\boldsymbol{\rho}_{l(l-1)}}\right)\\
&= \begin{bmatrix}
\mathbf{P}^{\perp}_{\mathbf{U}_{(l-1)j}} & -\mathbf{U}_{(l-1)j}\boldsymbol{\rho}_{l(l-1)}\\
-\left(\mathbf{U}_{(l-1)j}\left(\mathbf{U}_{(l-1)j}^T\mathbf{U}_{(l-1)j}\right)^{-1}\mathbf{m}(l)\right)^T & 1 - \mathbf{m}^T(l)\boldsymbol{\rho}_{l(l-1)}
\end{bmatrix}\\
&\quad + \frac{1}{1 + \mathbf{m}^T(l)\boldsymbol{\rho}_{l(l-1)}}
\begin{bmatrix}
\mathbf{U}_{(l-1)j}\boldsymbol{\rho}_{l(l-1)}\boldsymbol{\rho}_{l(l-1)}^T\mathbf{U}_{(l-1)j}^T & \mathbf{U}_{(l-1)j}\boldsymbol{\rho}_{l(l-1)}\boldsymbol{\rho}_{l(l-1)}^T\mathbf{m}(l)\\
\mathbf{m}^T(l)\boldsymbol{\rho}_{l(l-1)}\boldsymbol{\rho}_{l(l-1)}^T\mathbf{U}_{(l-1)j}^T & \mathbf{m}^T(l)\boldsymbol{\rho}_{l(l-1)}\boldsymbol{\rho}_{l(l-1)}^T\mathbf{m}(l)
\end{bmatrix},
\end{aligned} \tag{18.12}
$$
18.3 Recursive Hyperspectral Band Processing of Geometric Simplex Growing Algorithm 537


$$
\begin{aligned}
\left\|\mathbf{P}^{\perp}_{\mathbf{U}_{lp}}\mathbf{r}_l\right\|^2 = \mathbf{r}_l^T\mathbf{P}^{\perp}_{\mathbf{U}_{lp}}\mathbf{r}_l
&= \mathbf{r}_{l-1}^T\mathbf{P}^{\perp}_{\mathbf{U}_{(l-1)p}}\mathbf{r}_{l-1}
- 2r_l\left(\mathbf{U}_{(l-1)p}\boldsymbol{\rho}_{l(l-1)}\right)^T\mathbf{r}_{l-1}
+ \left(1 - \mathbf{m}^T(l)\boldsymbol{\rho}_{l(l-1)}\right)r_l^2\\
&\quad + \frac{\left(\mathbf{r}_{l-1}^T\mathbf{U}_{(l-1)p}\boldsymbol{\rho}_{l(l-1)} + \boldsymbol{\rho}_{l(l-1)}^T\mathbf{m}(l)r_l\right)\left(\boldsymbol{\rho}_{l(l-1)}^T\mathbf{U}_{(l-1)p}^T\mathbf{r}_{l-1} + \boldsymbol{\rho}_{l(l-1)}^T\mathbf{m}(l)r_l\right)}{1 + \mathbf{m}^T(l)\boldsymbol{\rho}_{l(l-1)}},
\end{aligned} \tag{18.13}
$$

where $\boldsymbol{\rho}_{l(l-1)} = \left(\mathbf{U}_{(l-1)j}^T\mathbf{U}_{(l-1)j}\right)^{-1}\mathbf{m}(l)$ is a $j$-dimensional vector.

Recursive Hyperspectral Band Processing of GSGA

Outer Loop indexed by j from 2 to p
   Since the initial condition for GSGA must begin with two endmembers separated by the maximal distance, the value of j must be at least 2. For j = 2, find two data sample vectors m_1^2 and m_2^2 such that ‖m_2^2 − m_1^2‖^2 = (m_2^2 − m_1^2)^T(m_2^2 − m_1^2) is maximal, that is, find the two points with the maximal distance in the two-dimensional data space. Assume that e_1^2 is selected as the initial endmember. Then we define m̃_2^2 = m_2^2 − m_1^2. For 2 < l, j we define m̃_j^l = m_j^l − m_1^l.
Inner Loop indexed by l from 1 to L
1. Initial condition:
   Let l ← j. Find an initial jth endmember m_j^{(l)} = argmax_{r_l}{r_l^T r_l}. Set U_j^{(l)} = [m_1^{(l)} m_2^{(l)} ⋯ m_j^{(l)}], and calculate P^⊥_{U_j^{(l)}}.
2. At the lth iteration, find m_j^{(l)} by maximizing r^T P^⊥_{U_{lj}} r via (18.13) over all data sample vectors, {r_l(i)}_{i=1}^N. This can be done in the following steps:
   (a) Initial conditions: Let max_l^j = (r_l(1))^T P^⊥_{U_{lj}} r_l(1) and m_j^{(l)} ← r_l(1). Set i = 2.
   (b) At the ith iteration, calculate (r_l(i))^T P^⊥_{U_{lj}} r_l(i).
   (c) If (r_l(i))^T P^⊥_{U_{lj}} r_l(i) > max_l^j, then set max_l^j = (r_l(i))^T P^⊥_{U_{lj}} r_l(i) and m_j^{(l)} ← r_l(i). Otherwise, continue.
   (d) If i < N, then let i ← i + 1 and go to step 2(b). Otherwise, the iteration is terminated and m_j^{(l)} has been found.
3. Let l ← l + 1, and use (18.3) and (18.5) to update U_{lj}^# and P^⊥_{U_{lj}} via the previously calculated U_{(l-1)j}^# and P^⊥_{U_{(l-1)j}}.
End (Inner Loop)
4. Go to step 2 until l = L.
End (Outer Loop)
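The j = 2 initialization above amounts to a maximal-pairwise-distance search over the samples restricted to their first two bands. A brute-force sketch (illustrative NumPy code; the data are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)
R2 = rng.normal(size=(200, 2))       # samples restricted to their first two bands

# Brute-force search for the pair with maximal Euclidean distance (j = 2 init).
d = np.linalg.norm(R2[:, None, :] - R2[None, :, :], axis=2)  # all pairwise distances
i, k = np.unravel_index(np.argmax(d), d.shape)
m1, m2 = R2[i], R2[k]                # stand-ins for m_1^2 and m_2^2
```

For large N an exact search over all N(N-1)/2 pairs is quadratic; in practice a convex-hull or rotating-calipers diameter computation could replace it, but the brute-force form matches the definition in the text.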
A flowchart similar to that in Fig. 18.1 is depicted in Fig. 18.2.

Fig. 18.2 Flowchart of RHBP-GSGA



18.4 Real Image Experiments

The image data to be studied are from the HYperspectral Digital Imagery Collection Experiment (HYDICE) image scene shown in Fig. 18.3a (and Fig. 1.10a), which has a size of 64 × 64 pixel vectors with 15 panels in the scene and the ground truth map in Fig. 18.3b (Fig. 1.10b).
Note that panel pixel p212, marked yellow in Fig. 18.3b, is of particular interest.
Based on the ground truth, this panel pixel is not a pure panel pixel and is marked
yellow as a boundary panel pixel. However, in our extensive and comprehensive
experiments, this yellow panel pixel is always extracted as the one with the most
spectrally distinct signature compared to the R panel pixels in row 2. This indicates
that a signature of spectral purity is not equivalent to a signature of spectral
distinction. In fact, in many cases panel pixel p212 instead of panel pixel p221 is
the one extracted by EFAs to represent the panel signature in row 2. Also, because
of such ambiguity, the panel signature representing panel pixels in the second row is
either p221 or p212, which is always difficult to determine using EFAs. This implies
that the ground truth of the R panel pixels in the second row in Fig. 18.3b may not
be as pure as was thought.
Since GSGA has been shown to be the best among all GSGA variants (Chap. 12), the RHBP-GSVA used for the following experiments is implemented by RHBP-GSGA. Figure 18.4 illustrates a comparative study between GSGA using full bands and RHBP-GSGA. Figure 18.4a shows the results of applying GSGA to the HYDICE scene with the number of endmembers, nE, estimated for this scene as nE = 18 in Chang et al. (2010a, 2011b). Figure 18.4b shows the spatial locations of endmembers found by RHBP-GSGA using the same value of 18 for nE, where the magenta circle denotes the endmembers extracted in the previous bands, the cyan upper triangle highlights the endmembers identified in the current band, and the number next to the triangle indicates the order of the extracted endmembers. As also shown in the figure, RHBP-GSGA produced the same results as GSGA at the last iteration, in which case RHBP-GSGA can be considered a slow-motion version of one-shot operation GSGA. Most interestingly, some moving targets can be unveiled during the process, where the pixels in the magenta circles represent the additional endmembers detected by RHBP-GSGA. Furthermore, a total of eight ground truth pixels (seven R panel pixels and one Y panel pixel) were discovered by RHBP-GSGA, compared with only five ground truth pixels (four R panel pixels and one Y panel pixel) picked up by GSGA.

Fig. 18.3 (a) HYDICE panel scene containing 15 panels; (b) ground truth map of spatial locations of the 15 panels

Fig. 18.4 18 endmembers found by GSGA and RHBP-GSGA, with nE = 18. (a) GSGA. (b) RHBP-GSGA
Let nl denote the number of the first l bands being used for data processing. Figure 18.5 further shows the spatial locations of endmembers found by RHBP-GSGA using different numbers of bands processed, nl, with l starting from 18 bands.
In Fig. 18.5, the red cross indicates the spatial locations of the R panel pixels in the HYDICE scene, the yellow cross shows the spatial locations of the Y panel pixels in the scene, the magenta circle denotes the endmembers extracted in the previous bands, the cyan upper triangle highlights the endmembers identified by the current band, and the numbers next to the triangle indicate the orders of the extracted endmembers. p212 and p521 were the first two panel pixels extracted, with 18 bands collected and received, followed by p311 with nl = 41, p312 with nl = 49, p11 with nl = 50, p411 with nl = 60, p221 with nl = 67, and p412, the last panel pixel identified, with nl = 149. No more panel pixels were extracted after nl = 149.
Figure 18.6 shows progressive magnitude changes in RHBP-GSGA in detecting the 18 endmembers identified by RHBP-GSGA, where the x-, y-, and z-axes denote, respectively, the extracted endmembers, the number of the first bands, nl, being used for data processing, and the magnitudes calculated by RHBP-GSGA. The magenta arrow indicates that a particular panel pixel is being identified for the first time as an endmember. For example, p212 was identified for the first time as an endmember with nl = 18, which was the fourth endmember identified by RHBP-GSGA.

Fig. 18.5 Endmembers identified by RHBP-GSGA using different values of nl

Fig. 18.6 3D plot of progressive magnitude changes detected by RHBP-GSGA

Table 18.1 summarizes the minimal number of bands, nl, required to identify
signatures, orders of extracted signatures, and orders of signatures extracted by
particular bands for all eight panel pixels identified by RHBP-GSGA in the
HYDICE scene.

Table 18.1 Summary of minimal number of bands processed, nl, to identify signatures and orders of extracted signatures by particular bands

Signatures found by RHBP-GSVA | Minimal nl identifying signatures to be found | Order of signatures found | Order of signatures found in particular bands
p11 = t71^(50)    | 50  | 71  | 16
p212 = t4^(18)    | 18  | 4   | 4
p221 = t94^(67)   | 67  | 94  | 11
p311 = t60^(41)   | 41  | 60  | 9
p312 = t68^(49)   | 49  | 68  | 6
p411 = t89^(60)   | 60  | 89  | 18
p412 = t115^(149) | 149 | 115 | 15
p521 = t9^(18)    | 18  | 9   | 9

18.5 Conclusions

This chapter presented a new approach, RHBP, to finding endmembers that allows
users to find endmembers band by band progressively. RHBP extends a well-known
SGA into RHBP-GSVA, which has two derived versions, RHBP-OPSGA and
RHBP-GSGA, corresponding to OPSGA and GSGA in Chaps. 11 and 12, respec-
tively. Since endmembers are generally considered insignificant targets because of
their rare appearance in the data, they may very likely be missed and overwhelmed
by strong targets in many cases when full bands are used for one-shot operations.
The RHBP-OPSGA/RHBP-GSGA developed in this chapter provides advantages
for finding endmembers during its progressive interband processing. In fact, as
shown in experiments, RHBP-GSGA can find more endmembers than GSGA
during its progressive endmember finding process.
Chapter 19
Recursive Hyperspectral Band Processing
of Iterative Pixel Purity Index

Abstract The pixel purity index (PPI) is a very popular endmember finding
algorithm due to its availability in ENvironment for Visualizing Images software.
It is discussed in Chang (Hyperspectral data processing: algorithm design and
analysis. Hoboken, 2013) in great detail, where various versions of PPI are devel-
oped from an algorithmic implementation point of view. Most recently, PPI was
further expanded into the iterative PPI (IPPI) by Chang and Wu (IEEE J Sel Top
Appl Earth Obs Remote Sens 8(6):2676–2695, 2015) with details also discussed in
Chang (Real time progressive hyperspectral image processing: endmember finding
and anomaly detection. New York, 2016). All such developments of PPI and IPPI
are derived according to a data acquisition format, band-interleaved-by-pixel/sample (BIP/BIS), where data sample vectors are collected with full band information
by a hyperspectral imaging sensor. This chapter takes a different approach to
designing IPPI from another data acquisition point of view, called the band-
sequential (BSQ) format, which allows users to process IPPI bandwise without
waiting for full bands of data information to be collected. BIP/BIS and BSQ are the
two types of data acquisition formats generally used to acquire hyperspectral
imagery (Remote sensing: models and methods for image processing, New York,
1997). Basically, IPPI is suitable for both data acquisition formats. Interestingly,
designing an IPPI using the BSQ format has not received attention to date. To make
it work, IPPI must be capable of calculating and updating PPI counts of data sample
vectors band by band. Intuitively, it seems that could be easily done. Unfortunately,
several major challenging issues arise. Unlike many other hyperspectral imaging
algorithms that only deal with sample vectors and bands by visualizing their
progressive changes in algorithmic performance in a three-dimensional (3D) plot,
IPPI has an extra parameter, called skewers, which are randomly generated vectors,
that needs to be specified. By including skewers as a third parameter in addition to
sample vectors and bands, PPI counts will require a four-dimensional plot to
visualize progressive changes in PPI counts in a 3D parameter space specified by
sample vectors, bands, and skewers. In this chapter we introduce a new concept for
implementing IPPI band by band in a progressive manner called progressive
hyperspectral band processing of IPPI (PHBP-IPPI) to address these issues by
developing bandwise calculation of PPI counts of sample vectors sample by sample
and bandwise calculation of PPI counts of sample vectors skewer by skewer.
To further facilitate the use of PHBP-IPPI in practical applications, recursive
equations are also derived for PHBP-IPPI to be used in the development of

© Springer International Publishing Switzerland 2017


C.-I Chang, Real-Time Recursive Hyperspectral Sample and Band Processing,
DOI 10.1007/978-3-319-45171-8_19

recursive hyperspectral band processing of IPPI (RHBP-IPPI), which can process


data not only progressively but also recursively in a more efficient and effective
manner, particularly in hardware design. As a result, many benefits can be gained
from RHBP-IPPI. For example, it provides progressive profiles of IPPI counts of
data samples recursively as more bands are included for data processing, finds
crucial bands according to progressive changes in PPI counts, and detects signifi-
cant bands for specific applications.

19.1 Introduction

One of the most popular endmember extraction algorithms is the pixel purity index
(PPI), developed by Boardman (1994). It searches for a set of vertices of a convex
hull in a given data set, which are supposed to be pure signatures present in the data.
It has been widely used because it is available in the ENvironment for Visualizing
Images (ENVI) software system originally developed by Analytical Imaging and
Geophysics (AIG) (Research Systems, Inc. 2001) and has found many applications
in various areas. Owing to its proprietary nature and limited published results, its
detailed implementation has never been made public. Therefore, most of the people
who use the PPI to find endmembers either use ENVI software or implement their
own versions of the PPI based on what is available in the literature. This chapter
presents our experience with PPI and investigates several issues arising in the
practical implementation of the PPI.
To better take advantage of the PPI, several approaches have been developed to
extend the PPI, such as the fast iterative PPI (FIPPI) (Chang and Plaza 2006),
random PPI (RPPI) (Chang et al. 2010c), and iterative PPI (IPPI) (Chang and Wu
2015). However, to the author’s best knowledge, all the PPI-variant developments
reported in the literature are based on using the full set of spectral bands. This
chapter presents a rather different extension of IPPI, called progressive
hyperspectral band processing (PHBP) of IPPI, from a data acquisition point of
view, band-sequential (BSQ) format (Schowengerdt 1997), where bands are
acquired band by band instead of the commonly used data acquisition format,
band-interleaved-sample/pixel (BIS/BIP) or band-interleaved-line (BIL), which
acquires data samples using full band information sample by sample or line by
line, respectively. As a result, PHBP-IPPI can be performed in real time as a new
band comes in without waiting for all bands being completely collected.
It is important to realize the differences between BIS/BIP and BSQ; the former
can be considered a sequential process, whereas the latter is a progressive process.
In general, BIS/BIP acquires data sample vectors sample by sample sequentially
using all spectral bands with no need to revisit data sample vectors. By contrast,
under the BSQ format, data samples are acquired band by band progressively in the
sense that data sample vectors must be revisited L times, where L is the total number

of spectral bands. In other words, if we interpret BSQ acquiring data samples by one
particular band as a state process, then BSQ can be considered an L-stage progres-
sive process. In the context of the preceding interpretations, the entire PPI, along
with its extensions reported in the literature according to the BIS/BIP format, can be
considered sequential hyperspectral imaging techniques, while PHBP-IPPI using
the BSQ format can be considered a progressive hyperspectral imaging technique
where the spectral resolution is progressively and gradually increased and improved
by adding more bands. Consequently, PHBP-IPPI takes advantage of its progres-
sive nature to capture details in profiling spectral variations as more bands are
added for data processing.
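The difference between the two acquisition formats can be made concrete with a toy data cube: under BIP/BIS the stream yields one full spectral vector per sample, whereas under BSQ it yields one whole band image at a time, so every sample is revisited once per band. (Illustrative sketch; the array layout and sizes are hypothetical.)

```python
import numpy as np

cube = np.arange(2 * 3 * 4).reshape(2, 3, 4)   # toy image: 2 lines, 3 samples, 4 bands

# BIP/BIS: full spectral vectors arrive sample by sample (sequential process).
bip_stream = [cube[y, x, :] for y in range(2) for x in range(3)]

# BSQ: whole bands arrive one at a time; every sample is revisited L times
# (an L-stage progressive process in the chapter's terminology).
bsq_stream = [cube[:, :, b] for b in range(4)]

assert len(bip_stream) == 6 and bip_stream[0].shape == (4,)
assert len(bsq_stream) == 4 and bsq_stream[0].shape == (2, 3)
```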
A key to implementing PHBP-IPPI is dealing with the issue of PPI count calculation, which varies with the number of bands, p, as well as with the total number of
skewers, K, to be used by IPPI. Technically speaking, a skewer is a randomly
generated unit vector pointing to a particular direction with which data samples can
be aligned. How to effectively determine an appropriate value for K remains
unresolved and is a very challenging issue (Chang and Plaza 2006). When it
comes to PHBP-IPPI, a similar issue also arises when determining how many
bands are sufficient for IPPI to find a desired set of endmembers. In addition, a
new challenging issue that is not encountered in IPPI using full band information is
how to process skewers band by band. In other words, developing RHBP-IPPI is
much more difficult than it looks because the PPI count calculation involves
skewers that vary with bands every time a new band is fed in. To address this
issue, two approaches are developed in RHBP-IPPI. One is to fix a set of skewers
for all bands currently being considered, and skewers are only updated by the most
recent band information provided by an incoming band. Another approach is to
vary skewer sets with bands by randomly generating a completely new set of
skewers as a band comes in. More specifically, as a new band comes in, the skewer
sets used by PHBP-IPPI must be independently generated, and in this way, the
skewer sets change band by band. By varying skewer sets, the profile changes in the
PPI counts of data sample vectors captured by PHBP-IPPI provide very valuable
information for data analysis. This is indeed a great benefit resulting from PHBP-
IPPI. In particular, when skewer sets also vary band by band, PHBP-IPPI offers
additional valuable information about the impact of varying skewer sets on changes
in PPI counts. Such advantages allow image analysts to keep track of the signifi-
cance of data sample vectors. Another approach is to help find crucial and signif-
icant bands for data processing according to the progressive changes in PPI counts.
Intuitively, PHBP-IPPI can be considered a band-to-band slow-motion version of
IPPI that can dictate slow varying changes in spectral variation that cannot be
provided by IPPI using full band data processing as a one-shot operation.
To facilitate the use of PHBP-IPPI, recursive equations are also derived to make
PHBP-IPPI a recursive process, called recursive band processing of IPPI (RHBP-
IPPI). With such recursive structures RHBP-IPPI can be implemented more effi-
ciently and effectively in practical applications, specifically in hardware design. It
is believed that RHBP-IPPI may open up many future applications such as satellite
data communication and transmission.

19.2 PPI

PPI is a very simple approach to materializing convexity via orthogonal projection


(OP). It randomly generates a set of unit vectors, referred to as skewers. If the data
sample vectors have their OPs fall at one of the ends of these skewers, they will be
considered potential endmember candidates. In other words, any data sample vector
whose OP lies between end points of these skewers will not be endmembers but
rather mixed sample vectors. Unfortunately, it has several drawbacks, identified in
Chang (2013), that hinder its implementation. One drawback is the prior knowledge
about the number of skewers, K, PPI must generate. Another is its inconsistent
results owing to the use of randomly generated initial endmembers, which are
skewers. A third drawback is the need to know the value t used to set a threshold
on PPI counts generated for data sample vectors. Most importantly, it requires a
visualization tool to manually select endmembers by human intervention. In this
case, it requires skilled, experienced, and well-trained analysts to produce the best
possible sets of endmembers. Additionally, it does not guarantee that PPI will
always be effective unless the number of skewers, K, is sufficiently large. However,
how large is large enough remains unresolved. Owing to the limited details of how
PPI is implemented in ENVI, in what follows, we describe our own version of
interpreting PPI in terms of its design rationale.
Let $\{\mathbf{r}_i\}_{i=1}^N$ be a given set of data sample vectors. For a given value of K we now use a random generator to produce a set of K random unit vectors, referred to as skewers, $\{\mathrm{skewer}_k\}_{k=1}^K$, which cover K different random directions. All the data sample vectors, $\{\mathbf{r}_i\}_{i=1}^N$, are then orthogonally projected on this randomly generated skewer set, $\{\mathrm{skewer}_k\}_{k=1}^K$. According to convexity geometry, an endmember that is
considered a pure signature should occur at end points of some of these skewers
with either maximum projection or minimum projection. For each sample vector ri
we further calculate the number of skewers, denoted by NPPI(ri), at which this
particular sample vector occurs as an end point to tally the PPI count for ri.
Figure 19.1 illustrates how the concept of PPI works where three skewers, skewer1,
skewer2, and skewer3, are indicated by three random unit vectors; the sample
vectors are shown by open circles and three endmembers e1, e2, e3 by solid circles
located at three vertices of the triangles and a cross “+” used to indicate a maximum
or minimum projection of an endmember on a skewer. Because of convexity, all the
sample vectors inside the triangle should have their PPI counts equal to 0 in the
sense that they are mixtures of the three endmembers at the vertices of the triangle
indicated by dashed lines. It should be noted that a maximum or minimum projec-
tion occurring at a skewer is one that yields the maximum or minimum value among
all the sample vectors. Also, a projection can be positive or negative depending on
whether it occurs in the same or opposite direction of a skewer.

Fig. 19.1 Illustration of PPI with three endmembers, e1, e2, and e3

As noted, in order for PPI to be effective, a large number of skewers, K, are


generally required to guarantee that these skewers can cover as many random directions as possible. For example, in Fig. 19.1 the shaded sample vector labeled x outside the triangle has the same NPPI = 1 as does e1, but it is obviously not an
endmember. The occurrence of such an incident is the result of insufficient numbers
of skewers used for PPI. Unfortunately, no guideline has been suggested to deter-
mine how many skewers should be used for PPI. Empirically, this number is
generally hundreds if not thousands for PPI to perform well. As a result, PPI also
suffers from three drawbacks. One is that the range of PPI counts is significantly
increased. Consequently, it becomes more difficult to find an appropriate value t for
thresholding PPI counts to determine a final set of endmembers. A second drawback
is the unreproducibility of final selected endmembers owing to the random nature of
skewers. Accordingly, results are inconsistent. A third drawback is the requirement
of dimensionality reduction (DR) to reduce high-dimensional hyperspectral data.
However, there is also the issue of how many dimensions, q, must be retained to
include all endmembers. Fortunately, a concept recently developed in Chang
(2003a) and Chang and Du (2004), referred to as virtual dimensionality (VD), has
been shown to provide a good estimate of the value of q. Therefore, in the following
PPI algorithm, VD is included to estimate the value of q in the initialization process.
Since the details of the specific steps to implement ENVI’s PPI are not available
in the literature, a MATLAB version of PPI described in what follows is based on
limited published results and our own interpretation. Nevertheless, our algorithm
was verified and validated by PPI with ENVI 3.6, both of which produce the same
results.

MATLAB PPI Algorithm

1. Initialization:
   (a) Use VD to determine the number of dimensions, q, to be retained following DR.
   (b) Apply a maximum noise fraction transform (Green et al. 1988) to reduce the dimensionality of the data set to q component images.
   (c) Randomly generate a set of K unit vectors called skewers, $\{\mathrm{skewer}_k\}_{k=1}^K$, where K is a preassumed sufficiently large positive integer.
2. PPI count calculation:
   For each skewer_k, all the data sample vectors are orthogonally projected onto skewer_k to find the sample vectors whose OPs occur at its two extreme end points, which form an extrema set for this particular skewer_k, denoted by S_extrema(skewer_k). Despite the fact that a different skewer_k generates a different extrema set S_extrema(skewer_k), some sample vectors will very likely appear in more than one extrema set. Define an indicator function of a set S, I_S(r), by

$$
I_S(\mathbf{r}) = \begin{cases}1, & \text{if } \mathbf{r}\in S\\ 0, & \text{if } \mathbf{r}\notin S\end{cases}
\quad\text{and}\quad
N_{\mathrm{PPI}}(\mathbf{r}) = \sum\nolimits_k I_{S_{\mathrm{extrema}}(\mathrm{skewer}_k)}(\mathbf{r}), \tag{19.1}
$$

   where N_PPI(r) is defined as the PPI count of the sample vector r.
3. Candidate selection:
   Find the PPI count N_PPI(r) for each of the sample vectors, r, defined by (19.1).
4. Finding endmembers:
   Let t be a preselected appropriate value used to set a threshold on the PPI counts of all data sample vectors, and extract all the sample vectors with N_PPI(r) ≥ t.
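Step 2 of the algorithm above (projection onto each skewer and tallying of extrema) can be sketched in a few lines. This is a hedged NumPy stand-in for the MATLAB version, with illustrative function and variable names; the toy data place three simplex vertices plus strictly interior mixtures, so by convexity only the vertices should ever be extrema:

```python
import numpy as np

def ppi_counts(R, K, seed=0):
    """Tally PPI counts for data samples R (N x L) using K random skewers.

    A sample's count increments whenever its orthogonal projection is the
    maximum or minimum among all samples along a skewer.
    """
    rng = np.random.default_rng(seed)
    N, L = R.shape
    counts = np.zeros(N, dtype=int)
    for _ in range(K):
        skewer = rng.normal(size=L)
        skewer /= np.linalg.norm(skewer)   # random unit vector
        proj = R @ skewer                  # orthogonal projections of all samples
        counts[np.argmax(proj)] += 1
        counts[np.argmin(proj)] += 1
    return counts

# Toy data: three "endmembers" (triangle vertices) plus interior mixtures.
E = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
rng = np.random.default_rng(1)
a = rng.dirichlet(np.ones(3), size=50)     # strictly positive mixing coefficients
R = np.vstack([E, a @ E])
counts = ppi_counts(R, K=200)

# By convexity, every extremum lies at a vertex of the convex hull.
assert counts[3:].sum() == 0
```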

19.3 IPPI

Although the MATLAB version of PPI described in Sect. 19.2 remedies some of the drawbacks inherited from PPI, other inherent disadvantages make PPI difficult to apply
directly to real-time or causal processing because PPI requires generating all
K skewers prior to the computation of PPI counts for data sample vectors. Also,
according to its original design, PPI is also not iterative. This section presents two
progressive versions of IPPI, referred to as progressive iterative PPI (P-IPPI) and
causal iterative PPI (C-IPPI), to resolve the aforementioned issues. While the idea
of developing IPPI can be traced back to Wu and Chang (2012), the proposed P-IPPI and C-IPPI are actually derived from successive N-FINDR (SC N-FINDR) and sequential N-FINDR (SQ N-FINDR) (Wu et al. 2008), respectively.

19.3.1 P-IPPI

The idea of P-IPPI was inspired by the work of Wu et al. (2008) and Xiong et al.
(2011), where two iterative processes are designed, one for data sample vectors
implemented in an inner loop and the other for skewers in an outer loop. For each
inner loop the iterative process is carried out for each incoming data sample vector.
In the context of the work of Wu et al. (2008) and Xiong et al. (2011), it is
considered a single-pass real-time processing of all data sample vectors for a
given skewer in the inner loop, while the number of skewers, K, determines how
many passes are needed for P-IPPI to complete its task in the outer loop. The most
important advantage of P-IPPI is that there is no need to fix the value of K, as is
required by PPI. Thus, technically speaking, P-IPPI can be run indefinitely until its
performance is no longer improved. Another advantage is that P-IPPI can update the PPI counts of all data sample vectors in real time, skewer by skewer, once a new
skewer is generated. More specifically, the PPI count for each data sample vector
can be easily updated without recalculating its PPI count.
What follows is a step-by-step detailed implementation of P-IPPI, with index
i used to run data sample vectors in the inner loop for each given generated skewer,
while index k is used to run skewers in the outer loop.

P-IPPI

1. Initialization:
   Assume that {r_i}_{i=1}^N are data sample vectors inputted from the data set according
   to 1, 2, ..., N and K is the number of skewers. Let N_PPI(r_i) = 0 and k = 1.

Outer loop: iteratively producing a new skewer

2. Generate a random unit vector skewer_k. Set max(k) = {}, max_value(k) =
   r_1^T skewer_k, min(k) = {}, and min_value(k) = r_1^T skewer_k.

Inner loop: iteratively processing incoming data sample vectors for each skewer
specified by the outer loop

3. For i ≥ 2, check whether

   r_i^T skewer_k = max_value(k).   (19.2)

   If so, max(k) = max(k) ∪ {i}. Otherwise, check whether

   r_i^T skewer_k > max_value(k).   (19.3)

   If so, max(k) = {i} and max_value(k) = r_i^T skewer_k. Otherwise, continue.
4. Repeat step 3 to find min(k) and continue.
5. Let i ← i + 1 and check whether i = N. If not, go to step 3. Otherwise, continue.
6. N_PPI(r_i) ← N_PPI(r_i) + 1 for i ∈ max(k) or i ∈ min(k), and continue.
7. If there is a prescribed number of skewers, K, the algorithm is terminated
   when k reaches K. Otherwise, let k ← k + 1 and go to step 2.
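The steps above can be sketched in Python. This is a minimal illustration, not code from the book; it assumes the data sample vectors are rows of a NumPy array, and the names (p_ippi, ppi_counts, n_skewers) are ours.

```python
import numpy as np

def p_ippi(data, n_skewers, rng=None):
    """Progressive IPPI sketch: outer loop over skewers, inner loop over samples.

    data: (N, L) array whose rows are data sample vectors r_i.
    Returns an (N,) array of PPI counts N_PPI(r_i).
    """
    rng = np.random.default_rng(rng)
    n_samples, n_bands = data.shape
    ppi_counts = np.zeros(n_samples, dtype=int)
    for _ in range(n_skewers):               # outer loop: one new skewer per pass
        skewer = rng.standard_normal(n_bands)
        skewer /= np.linalg.norm(skewer)     # random unit vector (step 2)
        max_set, min_set = {0}, {0}
        max_value = min_value = float(data[0] @ skewer)
        for i in range(1, n_samples):        # inner loop: incoming samples (steps 3-5)
            proj = float(data[i] @ skewer)
            if proj == max_value:            # (19.2): tie joins max(k)
                max_set.add(i)
            elif proj > max_value:           # (19.3): new strict maximum
                max_set, max_value = {i}, proj
            if proj == min_value:            # step 4: same logic for min(k)
                min_set.add(i)
            elif proj < min_value:
                min_set, min_value = {i}, proj
        for i in max_set | min_set:          # step 6: update PPI counts
            ppi_counts[i] += 1
    return ppi_counts
```

Because counts are accumulated one skewer at a time, the loop can be stopped at any K, or continued indefinitely, without recomputing earlier passes.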
550 19 Recursive Hyperspectral Band Processing of Iterative Pixel Purity Index

19.3.2 C-IPPI

If we interchange the two loops implemented in P-IPPI, we can derive a new version of
IPPI, denoted C-IPPI, where the inner loop is now indexed by the kth skewer,
skewer_k, and the outer loop is indexed by the ith data sample vector, r_i. Like P-IPPI,
C-IPPI can also be implemented as a real-time processing algorithm in a causal
manner, in the sense that the only data sample vectors used for data processing are
those that have already been visited. However, there is one significant difference
between C-IPPI and P-IPPI: C-IPPI requires prior knowledge of the value of K,
that is, the total number of skewers C-IPPI must generate prior to its
implementation.

Causal IPPI (C-IPPI)

1. Initialization:
   Assume that {r_i}_{i=1}^N are data sample vectors inputted according to the pixel
   number from 1, 2, ..., N and {skewer_k}_{k=1}^K is given, where K is the total number of
   skewers to be used to generate the IPPI. Set i = 1 and k = 1. Let max_value(1)
   = r_1^T skewer_1 and min_value(1) = r_1^T skewer_1.

Outer loop: iteratively processing incoming data sample vector r_{i+1}

2. If i ≤ N, continue. Otherwise, the algorithm is terminated, in which case find
   c(i) = Σ_{k=1}^K I_{max(k)}(i), where I_{max(k)}(i) is an indicator function defined by

   I_{max(k)}(i) = 1 if i ∈ max(k), and 0 otherwise;   (19.4)

   N_PPI(r_i) = c(i).

Inner loop: iteratively processing all skewers {skewer_k}_{k=1}^K for each inputted
data sample vector r_i with i ≥ 2

3. For k ≥ 2, check whether

   r_i^T skewer_k = max_value(i − 1).   (19.5)

   If so, max_value(i) = max_value(i − 1). Otherwise, check whether

   r_i^T skewer_k > max_value(i − 1).   (19.6)

   If so, max_value(i) = r_i^T skewer_k. Otherwise, continue.
4. Repeat step 3 to find min(k) and min_value(i), and continue.
5. Let k ← k + 1 and check whether k = K. If not, go to step 3. Otherwise, let i ← i + 1
   and go to step 2.
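The loop interchange can be sketched as follows. This is a simplified illustration with our own names: it keeps a single running extreme index per skewer (ties collected by max(k) in the text are not accumulated), and the inner loop over the K skewers is vectorized.

```python
import numpy as np

def c_ippi(data, skewers):
    """Causal IPPI sketch: outer loop over samples, inner loop over a fixed skewer set.

    data: (N, L) array of sample vectors; skewers: (K, L) array of unit vectors.
    Only samples seen so far influence the running extrema, so the pass is causal.
    """
    n_samples = data.shape[0]
    proj0 = skewers @ data[0]                  # projections of r_1 onto every skewer
    max_value, min_value = proj0.copy(), proj0.copy()
    max_idx = np.zeros(len(skewers), dtype=int)
    min_idx = np.zeros(len(skewers), dtype=int)
    for i in range(1, n_samples):              # outer loop: incoming sample r_i
        proj = skewers @ data[i]               # inner loop over all K skewers, vectorized
        grew = proj > max_value
        max_value[grew], max_idx[grew] = proj[grew], i
        shrank = proj < min_value
        min_value[shrank], min_idx[shrank] = proj[shrank], i
    counts = np.zeros(n_samples, dtype=int)    # c(i) via (19.4): count extreme hits
    np.add.at(counts, max_idx, 1)
    np.add.at(counts, min_idx, 1)
    return counts
```

Note that np.add.at is used because a sample may be the extreme for several skewers, and repeated indices must each contribute one count.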

19.3.3 FIPPI

As a special case of P-IPPI and C-IPPI, FIPPI can be implemented as P-IPPI or
C-IPPI by replacing step 3 in P-IPPI or C-IPPI with the following step.

Step 3: For a given set of skewers {skewer_j^{(k)}}_{j=1}^p obtained at iteration k ≥ 0:

(i) Perform P-IPPI via (19.2) and (19.3) for all the data samples for each skewer,
skewer_j^{(k)}, by projecting each data sample vector r_i onto all skewer_j^{(k)} in
{skewer_j^{(k)}} to find N_PPI^{(k)}(r_i) defined by (19.1), where the superscript (k) is
used to indicate that the PPI count of r_i is obtained using the skewer set
{skewer_j^{(k)}}.

Or alternatively:

(ii) Perform C-IPPI via (19.4) and (19.5) for each of the sample vectors {r_i}_{i=1}^N
by projecting each data sample vector r_i onto all skewer_j^{(k)} in {skewer_j^{(k)}} to
find N_PPI^{(k)}(r_i) defined by (19.1), where the superscript (k) is used to indicate
that the PPI count of r_i is obtained using the kth skewer set, {skewer_j^{(k)}}.

From either (i) or (ii) a new skewer set can be generated by finding a joint set
given by

{skewer_j^{(k+1)}} = {r_i : N_PPI^{(k)}(r_i) > 0} ∪ {skewer_j^{(k)}}.   (19.7)
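One FIPPI pass built on (19.7) can be sketched as follows; this is a hypothetical illustration (function and variable names are ours) in which the positive-count samples, normalized to unit vectors, are joined with the current skewer set to grow it for the next iteration.

```python
import numpy as np

def fippi_iteration(data, skewers):
    """One FIPPI pass sketch: project onto the current skewer set, then grow the
    set with every sample whose PPI count is positive, as in (19.7)."""
    proj = data @ skewers.T                       # (N, K) projections
    counts = np.zeros(len(data), dtype=int)
    np.add.at(counts, np.argmax(proj, axis=0), 1) # one max hit per skewer
    np.add.at(counts, np.argmin(proj, axis=0), 1) # one min hit per skewer
    new_dirs = data[counts > 0]                   # samples with N_PPI > 0
    new_dirs = new_dirs / np.linalg.norm(new_dirs, axis=1, keepdims=True)
    # joint set: positive-count samples (as unit vectors) united with old skewers
    grown = np.unique(np.vstack([new_dirs, skewers]), axis=0)
    return counts, grown
```

Iterating this function until the skewer set stops growing reproduces the fixed-point behavior that gives FIPPI its name.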
19.3.4 Generalization of IPPI to Using Skewer Sets

Two issues arise in using skewers when it comes to IPPI, that is, determination of
K skewers and randomness resulting from skewers.
Recall that when P-IPPI is performed, it can be implemented indefinitely as long
as the index k used to dictate the number of skewers is continuously increased one
skewer at a time. The only issue that remains for P-IPPI is when it should be
terminated, that is, when k reaches a specifically assigned value. On the other hand,
C-IPPI, like PPI, assumes that the number of skewers, K, is fixed. It then carries out
IPPI in a causal manner for each of the data samples, starting from the first data
sample, r_1, continuing with the second data sample, r_2, and so forth, until it reaches the
last data sample, r_N, at which point the process is terminated. As noted in the
previous section, P-IPPI and C-IPPI implement two loops indexed by i and k in
reverse order. More specifically, let i and k be two indices used to keep track of the data
samples to be processed and the number of skewers to be processed, respectively.
Then P-IPPI implements two loops, with an outer loop indexed by k to iterate
skewers one at a time and an inner loop indexed by i to iterate all data samples,
{r_i}_{i=1}^N, one sample vector at a time for a given skewer_k. By contrast, C-IPPI also
implements two loops, with an outer loop indexed by i to iterate all data samples,
{r_i}_{i=1}^N, one sample vector at a time and an inner loop indexed by k to iterate all
K skewers, {skewer_k}_{k=1}^K, at the ith iteration for the ith data sample, r_i, where the
number of skewers is fixed at K during the entire k-indexed iterative process
running k from 1 to K. Both P-IPPI and C-IPPI make use of the same index i to
iterate all data samples as well as the same index k to iterate skewers, but in two
different ways: P-IPPI increases the index k one at a time indefinitely in
the outer loop, whereas C-IPPI uses the index k to run through all
skewers in a given fixed set of K skewers, {skewer_k}_{k=1}^K, in the inner loop. Two
main issues arise in either case: (1) skewers are randomly generated and (2) it must be
determined how many skewers are enough. In what follows, we show that two
previously developed algorithms, random PPI (R-PPI) (Chang et al. 2010c) and FIPPI
(Chang and Plaza 2006), can be used to resolve these two issues.
According to P-IPPI, the number of skewers, K, is a varying parameter that can
be used as a variable to allow P-IPPI to run indefinitely while progressively
increasing K by one. This is a tremendous advantage of P-IPPI since P-IPPI
calculates PPI counts for all data sample vectors for a given skewer, skewer_k,
skewer by skewer. As a result, the PPI count of each data sample vector can be
updated as K is increased and does not have to be reimplemented repeatedly,
in contrast to the original PPI, which must be rerun to recalculate the
PPI counts of all data sample vectors. To be more precise, P-IPPI takes advantage of
the previous PPI counts obtained from a smaller value of K and then updates the PPI
counts for a larger value of K until the value of K either reaches a preset number or meets
a certain stopping criterion. Interestingly, a stopping rule similar to the one used by
FIPPI can also be derived for P-IPPI as follows. Specifically, P-IPPI is terminated if

N_PPI^k(r_i) = N_PPI^{k+1}(r_i), with N_PPI^k(r_i) > 0 for 1 ≤ i ≤ N,   (19.8)

where N_PPI^k(r_i) is the PPI count of the ith data sample vector r_i generated using
k skewers, and the index k used to specify the outer loop of P-IPPI is
increased by one.
Now we can further extend P-IPPI by expanding one skewer specified by (19.8)
to a kth skewer set denoted by skewer_set(k) = {skewer_j^{(k)}}. Then the single step of
(19.8) can be replaced by a new middle iterative loop indexed by j, which is
described as follows:

Loop indexed by j

For each skewer_j^{(k)} in skewer_set(k) = {skewer_j^{(k)}},

perform IPPI to find N_PPI^{skewer_j^{(k)}}(r_i) for {r_i}_{i=1}^N.   (19.9)

Note that there are three scenarios in which P-IPPI and C-IPPI can be
implemented jointly as generalizations of IPPI.

19.3.4.1 Joint Implementation of P-IPPI and C-IPPI

A natural extension to P-IPPI and C-IPPI is to have them implemented jointly. Two
versions can be derived. One is to replace IPPI implemented in P-IPPI with C-IPPI
in each pass progressively. The resulting IPPI is called progressive causal-IPPI
(PC-IPPI). The other is to replace IPPI implemented in C-IPPI with P-IPPI for
skewer sets sample by sample. The resulting IPPI is called causal P-IPPI (CP-IPPI)
and can implement IPPI for all skewers as a real-time process rather than one
skewer at a time progressively.

19.3.4.2 Varying Skewer Set C-IPPI (VC-IPPI)

If the IPPI specified in (19.9) is replaced by C-IPPI, it is referred to as varying
skewer set C-IPPI (VC-IPPI), which actually implements three loops: one index, i, is
used to iterate an inner loop operating on all data sample vectors; another index, k, is
used to iterate an outer loop to find the kth skewer set,
skewer_set(k) = {skewer_j^{(k)}}; and a third index, j, is used to iterate a new middle
loop to perform C-IPPI on skewer_set(k) = {skewer_j^{(k)}} via (19.9). Thus, when the
skewer set is a singleton set, that is, skewer_set(k) contains only one element,
skewer^{(k)}, VC-IPPI is reduced to PC-IPPI. Because of that, PC-IPPI is a special
case of VC-IPPI.

19.3.4.3 Growing Skewer Set P-IPPI

Similarly, when the IPPI used in (19.9) is replaced by P-IPPI, it is referred to as growing
skewer set P-IPPI (GP-IPPI), which implements three loops: one index, i, is used to
iterate an inner loop operating on all data sample vectors; another index, k, is used to
iterate an outer loop to find the kth skewer set, skewer_set(k) = {skewer_j^{(k)}}; and
a third index, j, is used to iterate a new middle loop to perform P-IPPI on {r_i}_{i=1}^N for
each skewer, skewer_j^{(k)}, in skewer_set(k) = {skewer_j^{(k)}}.

To summarize, both VC-IPPI and GP-IPPI implement three loops indexed by
three parameters, i, j, and k, where index k is used to iterate the kth skewer set,
denoted by skewer_set(k) = {skewer_j^{(k)}}; index j is used to iterate skewers in
skewer_set(k) = {skewer_j^{(k)}}; and index i is used to iterate all data sample vectors.
If we further use the notation x/y/z to indicate that x, y, z are the indices of the outer,
middle, and inner loops, respectively, then VC-IPPI and GP-IPPI are actually
implemented by three loops in the orders k/i/j and k/j/i, respectively, as illustrated
in Table 19.1.

Table 19.1 Various versions of IPPI using two iterative loops indexed by two parameters, i, k, and
three iterative loops indexed by three parameters, i, j, k

| Outer loop      | Middle loop (j)           | Inner loop                | Corresponding IPPI algorithm |
|-----------------|---------------------------|---------------------------|------------------------------|
| kth skewer      | —                         | Data samples              | P-IPPI (k/i)                 |
| Data samples    | —                         | kth skewer                | C-IPPI (i/k)                 |
| kth skewer_set  | Data samples              | Skewers in kth skewer_set | VC-IPPI (k/i/j)              |
| kth skewer_set  | Skewers in kth skewer_set | Data samples              | GP-IPPI (k/j/i)              |
| kth skewer_set  | Data samples              | Skewers in kth skewer_set | FIPPI (VC-IPPI) (k/i/j)      |
| kth skewer_set  | Skewers in kth skewer_set | Data samples              | FIPPI (GP-IPPI) (k/j/i)      |

One major advantage of VC-IPPI and GP-IPPI over PPI is that they require no
prior knowledge of K. This is a significant benefit in hardware design, such as
field-programmable gate arrays (FPGAs), because several recent efforts to implement
PPI in hardware require a specific value of K for their implementation and cannot
adapt as the value of K changes (Bernabe et al. 2011, 2013). This feature paves the
way toward the feasibility of implementing IPPI in hardware chip design.

19.4 Recursive Hyperspectral Band Processing of IPPI

The VC-IPPI and GP-IPPI developed in the previous section are progressive
algorithms that use full band information. In what follows, we derive recursive
hyperspectral band processing (RHBP) versions that implement IPPI progressively
and recursively band by band.

Assume that {r_i(l)}_{i=1}^N is a data set made up of all data samples with the
first l bands, where, for each 1 ≤ i ≤ N, r_i(l) can be represented by
r_i(l) = (r_{i1}, ..., r_{i(l−1)}, r_{il})^T = (r_i^T(l−1), r_{il})^T. Also suppose that there is a given
set of K skewers denoted by {s_k(l)}_{k=1}^K, with s_k(l) = (s_{k1}, ..., s_{k(l−1)}, s_{kl})^T =
(s_k^T(l−1), s_{kl})^T. Then r_i^T(l)s_k(l) can be updated from r_i^T(l−1)s_k(l−1) progressively
band by band and calculated recursively by r_i^T(l)s_k(l) = r_i^T(l−1)s_k(l−1) + r_{il}s_{kl}
band by band.
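This recursive update is a rank-1 accumulation over bands; a small sketch (class and method names are ours, not from the book) shows that after all L bands the recursively maintained projections equal the full-band projections.

```python
import numpy as np

class RecursiveProjection:
    """Maintain r_i^T(l) s_k(l) band by band via the recursion
    r_i^T(l) s_k(l) = r_i^T(l-1) s_k(l-1) + r_il * s_kl."""

    def __init__(self, n_samples, n_skewers):
        # proj[i, k] holds the running inner product of sample i with skewer k
        self.proj = np.zeros((n_samples, n_skewers))

    def add_band(self, band_values, skewer_components):
        # band_values: (N,) lth band of every sample;
        # skewer_components: (K,) lth component of every skewer.
        # One rank-1 update per band instead of a full re-projection.
        self.proj += np.outer(band_values, skewer_components)
        return self.proj
```

This is why RHBP needs only the newly arrived band to refresh every projection, at O(NK) cost per band rather than O(NKL).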

General Algorithm for Implementing RHBP-IPPI

1. For the lth band
   Initial conditions: N_PPI^{(k)}(r_i(l)) = 0, and K is given and fixed.
   (a) (i) For 1 ≤ k ≤ K
       Normalize the kth skewer, s_k, as ŝ_k and find

       max_{l−1}(k) = max_{1≤i≤N} {r_i^T(l−1)ŝ_k(l−1)},   (19.10)

       min_{l−1}(k) = min_{1≤i≤N} {r_i^T(l−1)ŝ_k(l−1)}.   (19.11)

       (ii) For 1 ≤ i ≤ N
       A. If r_i^T(l)ŝ_k(l) > min_{l−1}(k) and r_i^T(l)ŝ_k(l) < max_{l−1}(k), then
          N_PPI^{(k)}(r_i(l)) = 0. Otherwise, continue.
       B. If r_i^T(l)ŝ_k(l) < min_{l−1}(k), then N_PPI^{(k)}(r_i(l)) = 1. Otherwise, continue.
       C. If r_i^T(l)ŝ_k(l) > max_{l−1}(k), then N_PPI^{(k)}(r_i(l)) = 1. Otherwise, continue.
       D. Let i ← i + 1 and go to step 1(a)(ii)A.
   (b) Let k ← k + 1 and go to step 1(a).
2. Calculate N_PPI(r_i(l)) = Σ_{k=1}^K N_PPI^{(k)}(r_i(l)).
3. Let l ← l + 1 and go to step 1.
The previously described RHBP-IPPI is neither a causal nor a real-time process
because step 1(a) must find max_{l−1}(k) and min_{l−1}(k) via (19.10) and (19.11) over all the
data samples {r_i(l)}_{i=1}^N. In the following sections we make RHBP-IPPI a
progressive algorithm using a fixed skewer set, implemented as RHBP-P-IPPI,
where for each skewer RHBP-P-IPPI can be carried out as a real-time
process, and a causal algorithm (RHBP-C-IPPI) using a fixed skewer set, which
can also be implemented as a real-time process.

19.4.1 RHBP-IPPI Using a Skewer Set Fixed for All Bands

For the first band, l = 1, we randomly generate a set of K skewers, {skewer_k(1)}_{k=1}^K,
where skewer_k(1) = (s_{k1}) is actually a scalar, s_{k1}. These K scalar skewers are then
used to update skewers across all bands. In other words, when the band number is
increased to l, the K-skewer set used for band l is {skewer_k(l)}_{k=1}^K, where each
skewer_k(l) is made up of skewer_k(l) = (skewer_k^T(l−1), s_{kl})^T. More
specifically, once a skewer set is generated at band one, l = 1, the locations of the
skewers are fixed and used for all subsequent bands by appending the new band
information, {s_{kl}}_{k=1}^K, provided by the incoming lth band, to form
{skewer_k(l)}_{k=1}^K.

According to Chang and Wu (2015), if IPPI implements skewers in the outer
loop while implementing data samples in the inner loop, the resulting IPPI is
P-IPPI.

RHBP-P-IPPI Using a Fixed Skewer Set for All Bands

Outer loop from l = 1 to L
  Middle Loop from k = 1 to K
  • skewer_k(l) = (skewer_k^T(l−1), s_{kl})^T
  • Normalize skewer_k(l) to a unit vector, ŝ_k(l).
    Inner Loop from i = 1 to N
    Initial conditions:
    • N_PPI^{(k)}(r_i(l)) = 0
    • Find max_{l−1}(k) by (19.10) and min_{l−1}(k) by (19.11).
    • If r_i^T(l)ŝ_k(l) > min_{l−1}(k) and r_i^T(l)ŝ_k(l) < max_{l−1}(k), then N_PPI^{(k)}(r_i(l)) = 0.
      Otherwise, continue.
    • If r_i^T(l)ŝ_k(l) < min_{l−1}(k), then N_PPI^{(k)}(r_i(l)) = 1. Otherwise, continue.
    • If r_i^T(l)ŝ_k(l) > max_{l−1}(k), then N_PPI^{(k)}(r_i(l)) = 1. Otherwise, continue.
    • Let i ← i + 1.
    End (Inner Loop)
  • Let k ← k + 1.
  End (Middle Loop)
  Calculate N_PPI(r_i(l)) = Σ_{k=1}^K N_PPI^{(k)}(r_i(l)).
  • Let l ← l + 1.
End (Outer Loop)
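The triple-loop structure above can be sketched as follows. This is a simplified illustration with our own names: the skewer and sample loops are vectorized, per-band renormalization of the skewer prefixes is skipped, and the per-band counts are recorded so the band-by-band evolution of the PPI counts is visible.

```python
import numpy as np

def rhbp_p_ippi(data, skewers):
    """RHBP-P-IPPI sketch with a skewer set fixed for all bands: per band l,
    recursively update every projection r_i^T(l) s_k(l) and compare it against
    the extrema found at band l-1, accumulating per-band PPI counts.

    data: (N, L) samples; skewers: (K, L) skewers.
    Returns an (L, N) array of PPI counts obtained at each band.
    """
    n, L = data.shape
    K = skewers.shape[0]
    proj = np.zeros((n, K))                  # r_i^T(l) s_k(l), updated recursively
    counts_per_band = np.zeros((L, n), dtype=int)
    prev_max = prev_min = None
    for l in range(L):                       # outer loop over bands
        proj += np.outer(data[:, l], skewers[:, l])   # recursive band update
        if prev_max is not None:
            # a sample scores for skewer k if it falls outside the band-(l-1) extrema
            hits = (proj > prev_max) | (proj < prev_min)
            counts_per_band[l] = hits.sum(axis=1)
        prev_max = proj.max(axis=0)          # max_l(k), used at the next band
        prev_min = proj.min(axis=0)
    return counts_per_band
```

Only the extrema of the previous band and the running projections are carried over between bands, which is what makes the band-by-band implementation recursive.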
If we swap the inner and middle loops in P-IPPI, the result is C-IPPI.

RHBP-C-IPPI Using a Fixed Skewer Set

Outer loop from l = 1 to L
  Middle Loop from i = 1 to N
    Inner Loop from k = 1 to K
    Initial conditions:
    • skewer_k(l) = (skewer_k^T(l−1), s_{kl})^T
    • Normalize skewer_k(1) to a unit vector, ŝ_k(1).
    • Calculate r_1^T(l)ŝ_k(l) = r_1^T(l−1)ŝ_k(l−1) + r_{1l}ŝ_{kl}
    • max_1(1, k) = r_1^T(1)ŝ_k(1) for all 1 ≤ k ≤ K.
    • N_PPI^{(k)}(r_1(l)) = 1 for all 1 ≤ k ≤ K.
    • i = 2
    • l = 2
    • max_{l−1}(i, k) = max_{1≤j≤i} {r_j^T(l−1)ŝ_k(l−1)}
    • min_{l−1}(i, k) = min_{1≤j≤i} {r_j^T(l−1)ŝ_k(l−1)}
    • max_{l−1} = max_{1≤j≤i} {max_{l−1}(j)}
    • If min_{l−1}(i, k) < r_{i+1}^T(l)ŝ_k(l) < max_{l−1}(i, k), then N_PPI^{(k)}(r_{i+1}(l)) = 0.
      Otherwise, continue.
    • If r_{i+1}^T(l)ŝ_k(l) > max_{l−1}(i, k), then N_PPI^{(k)}(r_i(l)) = 0 and N_PPI^{(k)}(r_{i+1}(l)) = 1.
      Otherwise, continue.
    • If r_{i+1}^T(l)ŝ_k(l) < min_{l−1}(i, k), then N_PPI^{(k)}(r_i(l)) = 0 and N_PPI^{(k)}(r_{i+1}(l)) = 1.
      Otherwise, continue.
    • Let k ← k + 1.
    End (Inner Loop)
  • Let i ← i + 1.
  End (Middle Loop)
  Calculate N_PPI(r_i(l)) = Σ_{k=1}^K N_PPI^{(k)}(r_i(l)).
  Let l ← l + 1.
End (Outer Loop)

19.4.2 RHBP-IPPI Using Varying Skewer Sets with Bands

For each band number, l, we randomly generate a set of K(l) skewers,
{skewer_k(l)}_{k=1}^{K(l)}, with skewer_k(l) = (s_{k1}, s_{k2}, ..., s_{kl})^T, where the size of the
skewer set, K(l), varies with the bands. Even in a special case with K(l) = K for
all bands, the skewer sets are still different from the fixed skewer set because these
K skewers vary and are independently generated band by band, that is,
skewer_k(l) ≠ (skewer_k^T(l−1), s_{kl})^T.

RHBP-P-IPPI Using Varying Skewer Sets with Bands

Outer loop from l = 1 to L
  • Initial condition: l = 1
    ○ Randomly generate a set of K(1) skewers denoted by {skewer_k(1) = s_{k1}}_{k=1}^{K(1)}.
    ○ Normalize each of the skewers in {skewer_k(1)}_{k=1}^{K(1)} to a unit vector,
      {ŝ_k(1)}_{k=1}^{K(1)}.
  • For l from 2 to L
    ○ Randomly generate a set of K(l) skewers denoted by
      {skewer_k(l) = (s_{k1}, ..., s_{kl})^T}_{k=1}^{K(l)}.
    ○ Normalize each of the skewers in {skewer_k(l)}_{k=1}^{K(l)} to a unit vector,
      {ŝ_k(l)}_{k=1}^{K(l)}.
  Middle Loop from k = 1 to K(l)
    Inner Loop from i = 1 to N
    Initial conditions:
    • N_PPI^{(k)}(r_i(l)) = 0
    • Find max_{l−1}(k) by (19.10) and min_{l−1}(k) by (19.11).
    • If r_i^T(l)ŝ_k(l) > min_{l−1}(k) and r_i^T(l)ŝ_k(l) < max_{l−1}(k), then
      N_PPI^{(k)}(r_i(l)) = 0. Otherwise, continue.
    • If r_i^T(l)ŝ_k(l) < min_{l−1}(k), then N_PPI^{(k)}(r_i(l)) = 1. Otherwise, continue.
    • If r_i^T(l)ŝ_k(l) > max_{l−1}(k), then N_PPI^{(k)}(r_i(l)) = 1. Otherwise, continue.
    • Let i ← i + 1.
    End (Inner Loop)
  • Let k ← k + 1.
  End (Middle Loop)
  Calculate N_PPI(r_i(l)) = Σ_{k=1}^{K(l)} N_PPI^{(k)}(r_i(l)).
  Let l ← l + 1.
End (Outer Loop)
Similarly, we can swap the inner and middle loops to derive an RHBP version of
C-IPPI as follows.

RHBP-C-IPPI Using Varying Skewer Sets with Bands

Outer loop from l = 1 to L
  • Initial condition: l = 1
    ○ Randomly generate a set of K(1) skewers denoted by {skewer_k(1) = s_{k1}}_{k=1}^{K(1)}.
    ○ Normalize each of the skewers in {skewer_k(1)}_{k=1}^{K(1)} to a unit vector,
      {ŝ_k(1)}_{k=1}^{K(1)}.
  • For l from 2 to L
    ○ Randomly generate a set of K(l) skewers denoted by
      {skewer_k(l) = (s_{k1}, ..., s_{kl})^T}_{k=1}^{K(l)}.
    ○ Normalize each of the skewers in {skewer_k(l)}_{k=1}^{K(l)} to a unit vector,
      {ŝ_k(l)}_{k=1}^{K(l)}.
  Middle Loop from i = 1 to N
    Inner Loop from k = 1 to K(l)
    Initial conditions:
    • Calculate r_1^T(l)ŝ_k(l) = r_1^T(l−1)ŝ_k(l−1) + r_{1l}s_{kl}
    • max_l(1, k) = r_1^T(l)ŝ_k(l) for all 1 ≤ k ≤ K(l).
    • N_PPI^{(k)}(r_1(l)) = 1 for all 1 ≤ k ≤ K(l).
    • i = 2
    • max_l(i, k) = max_{1≤j≤i} {r_j^T(l)ŝ_k(l)}
    • min_l(i, k) = min_{1≤j≤i} {r_j^T(l)ŝ_k(l)}
    • max_l = max_{1≤j≤i} {max_l(j)}
    • If min_l(i, k) < r_{i+1}^T(l)ŝ_k(l) < max_l(i, k), then N_PPI^{(k)}(r_{i+1}(l)) = 0.
      Otherwise, continue.
    • If r_{i+1}^T(l)ŝ_k(l) > max_l(i, k), then N_PPI^{(k)}(r_i(l)) = 0 and N_PPI^{(k)}(r_{i+1}(l)) = 1.
      Otherwise, continue.
    • If r_{i+1}^T(l)ŝ_k(l) < min_l(i, k), then N_PPI^{(k)}(r_i(l)) = 0 and N_PPI^{(k)}(r_{i+1}(l)) = 1.
      Otherwise, continue.
    • Let k ← k + 1.
    End (Inner Loop)
  • Let i ← i + 1.
  End (Middle Loop)
  Calculate N_PPI(r_i(l)) = Σ_{k=1}^{K(l)} N_PPI^{(k)}(r_i(l)).
  Let l ← l + 1.
End (Outer Loop)

Note that there are at least two significant differences between RHBP-IPPI using
a fixed skewer set and RHBP-IPPI using varying skewer sets. First, the former uses the
same fixed skewer set for all bands, whereas the latter uses skewer sets that vary
with the bands. Second, the size of the fixed skewer set, K, must be
determined a priori for the former, while the latter can adapt skewer sets of size
K(l) to the various bands. Even when K(l) is fixed at K, the skewer sets are still
different since they are randomly regenerated every time a new band is fed in.
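The distinction can be sketched as follows; a hypothetical illustration (the function name and its arguments are ours) of drawing an independent skewer set per band, rather than extending a fixed set by one component per band.

```python
import numpy as np

def varying_skewer_sets(n_bands, sizes, rng=None):
    """Generate an independent skewer set per band: at band l, a fresh set of
    K(l) random unit vectors of length l is drawn, instead of appending one
    component s_kl to the band-(l-1) skewers of a fixed set.

    sizes: sequence of K(l) values, one per band.
    Returns a list of (K(l), l) arrays of unit-norm skewers.
    """
    rng = np.random.default_rng(rng)
    sets = []
    for l, K_l in zip(range(1, n_bands + 1), sizes):
        s = rng.standard_normal((K_l, l))            # fresh draw for band l
        s /= np.linalg.norm(s, axis=1, keepdims=True)  # normalize each skewer
        sets.append(s)
    return sets
```

Because each band's set is drawn independently, the recursive inner-product update no longer applies across bands for these skewers; the projections must be recomputed with each new set, which is the price paid for the added randomness.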

19.5 Synthetic Image Experiments

To substantiate and validate the utility of C-IPPI in applications, the synthetic
images shown in Fig. 19.2, reproduced from Figs. 1.14 and 1.15 of Chap. 1, are used.
Among the 25 panels are five 4 × 4 pure-pixel panels in each row in the first
column, five 2 × 2 pure-pixel panels in each row in the second column, five
2 × 2 mixed-pixel panels in each row in the third column, and five 1 × 1 subpixel
panels in each row in both the fourth and fifth columns, where the mixed and
subpanel pixels were simulated according to the legends in Fig. 19.2. Thus, a total
of 100 pure pixels (80 in the first column and 20 in the second column), referred to
as endmember pixels, were simulated in the data by the five endmembers A, B, C,
K, and M. An area marked "BKG" in the upper right corner of Fig. 1.14a was
selected to find its sample mean, that is, the average of all pixel vectors within the
area "BKG," denoted by b and plotted in Fig. 1.14b, to be used to simulate the
background (BKG) for the image scene with a size of 200 × 200 pixels in Fig. 19.3.
The reason for this background selection is empirical, since the selected area "BKG"
seemed more homogeneous than other regions. Nevertheless, other areas could also
be selected for the same purpose. This b-simulated image background was further

Fig. 19.2 Set of 25 panels simulated by A, B, C, K, and M (legend: 100 %; 50 % signal + 50 % any other four; 50 % signal + 50 % background; 25 % signal + 75 % background)



Fig. 19.3 3D plots of PPI counts versus nl for the five different signatures

corrupted by additive noise to achieve a signal-to-noise ratio of 20:1, which was
defined in Harsanyi and Chang (1994) as 50 % of the signature (i.e., reflectance/radiance)
divided by the standard deviation of the noise. Once the target pixels and
background are simulated, two types of target insertion can be designed to simulate
experiments for various applications.

There are two types of synthetic images used for the experiments. The first type of
target insertion is target implantation (TI), which is simulated by inserting
clean target panels into a clean image background plus additive Gaussian noise,
replacing their corresponding background pixels.

The second type of target insertion is target embeddedness (TE), which is
simulated by embedding clean target panels into a clean image background
plus additive Gaussian noise, superimposing the target pixels over the background
pixels.

19.5.1 TI Experiments

The image scene in Fig. 19.2 was processed by RHBP-IPPI. Figure 19.3 shows the
plots of PPI count versus nl, defined as the number of the first l bands used to process
the five mineral signatures A, B, C, K, and M in the TI scene, where the x-axis denotes
various values of nl, the y-axis indicates the different mineral signatures, and the z-axis
dictates the PPI count value. In this figure, 500 skewers were generated and utilized in
the experiments. Since all 20 pure pixels for each signature in the first two
columns can be extracted simultaneously, only the top left corner pixels in the first
column for each signature are selected and marked A(1, 1), B(1, 1), C(1, 1), K(1, 1),

Fig. 19.4 Endmember candidates found by RHBP-IPPI using different values of nl: (a) nl ¼ 1,
(b) nl ¼ 3, (c) nl ¼ 10, (d) nl ¼ 27, (e) nl ¼ 189 (full bands)

and M(1, 1) for demonstration. As shown in Fig. 19.3, the magenta arrow represents
the first band at which a particular signature is identified as an endmember
candidate. For instance, the signature B(1, 1) was first identified as an endmember
candidate when nl = 27, that is, with the first 27 bands collected and processed.
Figure 19.4 shows the spatial locations of the endmember candidates identified by
RHBP-IPPI using various values of nl; Fig. 19.4a starts with only 1 band, which then
progressively increases to nl = 3 in Fig. 19.4b, nl = 10 in Fig. 19.4c, and nl = 27 in
Fig. 19.4d, and finally reaches the full bands, that is, nl = 189, in Fig. 19.4e, each of
which shows the endmember candidates found by RHBP-IPPI in the particular transition
bands specified by Fig. 19.3. In Fig. 19.4, a pixel marked by a red cross
indicates a ground truth pixel in the TI scene, a pixel marked by a cyan upper triangle
highlights an endmember candidate extracted in the current band, and the
endmember candidates identified in the previous bands are marked by green circles.
Table 19.2 summarizes the five mineral signatures A, B, C, K, and M among the
endmember candidates identified by RHBP-IPPI as nl varies from nl = 1, nl = 3,
nl = 10, and nl = 27 to the full bands, nl = 189. As demonstrated in the table, E_{ns}^{(nl)} in the
second column indicates identification using ns skewers with the first nl bands collected
and processed, where ns is defined as the number of skewers.

Table 19.2 Summary of mineral signatures identified by RHBP-IPPI versus nl for TI scene

| nl  | Signatures identified by RHBP-IPPI |
|-----|------------------------------------|
| 1   | C(1, 1) ∈ E_500^{(1)}, K(1, 1) ∈ E_500^{(1)} |
| 3   | A(1, 1) ∈ E_500^{(3)} |
| 10  | M(1, 1) ∈ E_500^{(10)} |
| 27  | B(1, 1) ∈ E_500^{(27)} |
| 189 | A(1, 1) ∈ E_500^{(189)}, B(1, 1) ∈ E_500^{(189)}, C(1, 1) ∈ E_500^{(189)}, K(1, 1) ∈ E_500^{(189)}, M(1, 1) ∈ E_500^{(189)} |

Fig. 19.5 Comparison of IPPI and RHBP-IPPI: (a) IPPI-identified signatures; (b) orders of signatures identified by RHBP-IPPI

Furthermore, Fig. 19.5a shows the endmember candidates identified by PPI for the
TI scene using 500 skewers with all bands received and processed; the same
results can be achieved by RHBP-IPPI using only nl = 27 bands. Moreover, Fig. 19.5b
plots the order in which all the mineral signatures A, B, C, K, and M were identified as
endmember candidates versus different values of nl. Clearly, with only 27 bands
available, all five signatures were extracted.

As mentioned, 500 skewers were generated and used by RHBP-IPPI to process
the TI scene. However, how to select an appropriate number of skewers, ns, remains
an unsolved problem in the literature. To address this issue, a progressive-skewer
version of RHBP-IPPI is further proposed, called RHBP-Progressive-Skewer-IPPI
(RHBP-PS-IPPI). Figure 19.6 plots PPI counts versus various values of nl as the
number of skewers, ns, increases from 50 to 500 with a step size of 50 for the five
mineral signatures, where the x-axis denotes the value of nl, the y-axis shows the
number of skewers, ns, generated for the RHBP-PS-IPPI process, and the z-axis
dictates the value of the PPI count.

Fig. 19.6 PPI counts versus nl and the number of skewers for each of five mineral signatures: (a) A(1,1), (b) B(1,1), (c) C(1,1), (d) K(1,1), (e) M(1,1) processed by RHBP-PS-IPPI

As shown in Fig. 19.6, the height of each bar represents the value of the PPI
count for a particular combination of nl and ns generated for the process. The
magenta upward-pointing arrow indicates that a specific material was identified
for the first time as an endmember candidate in that particular band by
RHBP-PS-IPPI. For instance, signature B(1,1) is identified for the first time as an
endmember candidate at the 27th band, with nl = 27 available, in RHBP-IPPI
using 500 skewers. However, the RHBP-PS-IPPI process further shows that
B(1,1) could be identified as an endmember candidate using only 300 generated
skewers. This means that the smaller the number of skewers, the lower the
computational complexity.
To further address the causal issue that only visited pixels can be used for data
processing, RHBP-C-IPPI was also developed for this purpose. Figure 19.7 plots
progressive changes in the PPI count with varying np, defined as the number of first
pixels being processed, which increases from np = 2000 to 40,000 with a step
size of 2000 for the five mineral signatures, where the x-axis denotes nl, the y-axis
shows the np included in the process, and the z-axis dictates the value of the PPI
count. As shown in Fig. 19.7, the height of each bar represents the value of the PPI
count for a particular combination of nl and np. The magenta upward-pointing arrow
indicates that a specific material was identified for the first time as an endmember
candidate in that particular band by RHBP-C-IPPI. For instance, signature B(1,1) is
identified for the first time as an endmember candidate by RHBP-C-IPPI using
nl = 27 with 500 skewers and np = 14,000 pixels collected.

Moreover, Fig. 19.8 also shows an RHBP-C-IPPI process, with np ranging
from 4000 to 40,000 with a step size of 4000 at nl = 1. As shown in the
figure, a red cross indicates a ground truth panel pixel location in the TI scene, and a
cyan upper triangle highlights a ground truth pixel identified as an
endmember candidate. The white area in the scene indicates that the pixels in
this area are not yet available for data processing. The beauty of the RHBP-C-
IPPI process is that it can be used to find weak targets in a scene. For instance,
subpixel material A in the third column can be identified as an endmember
candidate by RHBP-C-IPPI only when np = 8000–24,000 pixels are available,
vanishing subsequently.
19.5 Synthetic Image Experiments 567

a
PPI Count Value

150

100

50

0
3

25
Nu

50
m
be
ro

75
fP
ro
ce

100
ss
ed
Ba

125
nd

40000
38000
s

36000
34000
32000
150 30000
28000
26000
24000
22000
20000
18000 ix els
16000 ssed P
f Proce
175 14000
12000 o
Number
10000
8000
6000
4000
2000
0

b
PPI Count Value

15

10

25
27

50
Nu
m
be

75
ro
fP
ro
ce

100
ss
ed
Ba

125
40000
nd

38000
36000
s

34000
32000
150 30000
28000
26000
24000
22000
20000
18000 els
175 14000
16000
cess ed Pix
12000
of Pro
Number
10000
8000
6000
4000
2000
0

Fig. 19.7 Plots of PPI count versus nl and np for each of five mineral signatures: (a) A(1,1), (b) B
(1,1), (c) C(1,1), (d) K(1,1), (e) M(1,1) processed by RHBP-C-IPPI
568 19 Recursive Hyperspectral Band Processing of Iterative Pixel Purity Index

Fig. 19.7 (continued)



Fig. 19.7 (continued)

19.5.2 TE Experiments

Figure 19.9 shows plots of the PPI count versus nl for the five mineral signatures A,
B, C, K, and M in the TE scene, where the x-axis denotes the number of first l bands
processed, nl, the y-axis indicates the different signatures, and the z-axis gives the
PPI count. In this figure, 500 skewers were generated and used in the experiments.
Since all 20 pure pixels for each signature in the first two columns can be
extracted simultaneously, only the pixels in the top left corner of the first column for
each signature are selected and marked A(1,1), B(1,1), C(1,1), K(1,1), and M(1,1)
for demonstration purposes. As shown in Fig. 19.9, the magenta arrow shows the
first band in which a particular signature is identified for the first time as an
endmember candidate. For instance, the signature B(1,1) is identified for the first
time as an endmember candidate by RHBP-IPPI using nl = 21.
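Since a skewer projection is just an inner product across bands, the projection over the first l bands equals the projection over the first l - 1 bands plus the newly received band's contribution. This is the recursion RHBP-IPPI exploits: each incoming band updates all skewer projections in place, after which the PPI counts for that band can be recomputed. The following NumPy sketch is illustrative only; the function and variable names are ours, not the book's:

```python
import numpy as np

def rhbp_ippi(data_bands, skewers):
    """Band-recursive PPI: data_bands is (L, N) -- N pixels delivered band by
    band (BSQ format); skewers is (K, L). Yields the PPI count vector of all
    N pixels after each new band l = 1, ..., L is received."""
    L, N = data_bands.shape
    proj = np.zeros((skewers.shape[0], N))   # running inner products <skewer, pixel>
    for l in range(L):
        # recursive update: add only the newly received band's contribution
        proj += np.outer(skewers[:, l], data_bands[l])
        counts = np.zeros(N, dtype=int)
        # a pixel's PPI count = number of skewers on which it is an extremum
        np.add.at(counts, proj.argmax(axis=1), 1)
        np.add.at(counts, proj.argmin(axis=1), 1)
        yield counts

# toy run: 4 bands, 6 pixels, 50 skewers
rng = np.random.default_rng(0)
X = rng.random((4, 6))
S = rng.standard_normal((50, 4))
history = list(rhbp_ippi(X, S))   # PPI counts after bands 1, 2, 3, 4
```

The last element of `history` agrees with one-shot full-band PPI using the same skewers, while the intermediate elements give band-to-band PPI count profiles of the kind plotted in Fig. 19.9.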
Figure 19.10 shows the spatial locations of the endmember candidates identified
by RHBP-IPPI using different values of nl. Figure 19.10a starts with only one band,
that is, nl = 1, which then progressively increases to nl = 2 in Fig. 19.10b, nl = 9 in
Fig. 19.10c, nl = 21 in Fig. 19.10d, and nl = 134 in Fig. 19.10e, and finally reaches
the full bands, nl = 189, in Fig. 19.10f, each of which shows the endmember
candidates found by RHBP-IPPI in the particular transition bands specified by
Fig. 19.9. In Fig. 19.10, a red cross indicates a ground truth pixel in the TE
scene, a cyan upper triangle highlights an endmember candidate found in the
current band, and endmember candidates identified in previous bands
are marked by green circles.
Table 19.3 summarizes the five mineral signatures A, B, C, K, and M among the
endmember candidates identified by RHBP-IPPI as nl varies over
1 band, 2 bands, 9 bands, 21 bands, 134 bands, and the full 189 bands. As

Fig. 19.8 Example of RHBP-C-IPPI process using different np pixels in band nl = 1: (a) np = 4000, (b) np = 8000, (c) np = 12,000, (d) np = 16,000, (e) np = 20,000, (f) np = 24,000, (g) np = 28,000, (h) np = 32,000, (i) np = 36,000, (j) np = 40,000 (full size of image pixels)

[Fig. 19.9 annotations: K(1,1) ∈ E_500^(1), A(1,1) ∈ E_500^(2), M(1,1) ∈ E_500^(9), B(1,1) ∈ E_500^(21), C(1,1) ∈ E_500^(134)]
Fig. 19.9 3D plots of PPI count versus nl for the five different signatures

Fig. 19.10 Endmember candidates found by RHBP-IPPI using different values of nl: (a) nl = 1, (b) nl = 2, (c) nl = 9, (d) nl = 21, (e) nl = 134, (f) nl = 189 (full bands)

Table 19.3 Summary of mineral signatures identified by RHBP-IPPI versus nl for TE

nl    Signatures identified by RHBP-IPPI
1     K(1,1) ∈ E_500^(1)
2     A(1,1) ∈ E_500^(2)
9     M(1,1) ∈ E_500^(9)
21    B(1,1) ∈ E_500^(21)
134   C(1,1) ∈ E_500^(134)
189   A(1,1) ∈ E_500^(189), B(1,1) ∈ E_500^(189), C(1,1) ∈ E_500^(189), K(1,1) ∈ E_500^(189), M(1,1) ∈ E_500^(189)

[Fig. 19.11b axis, order of signatures identified: (1) K(1,1), (2) A(1,1), (3) M(1,1), (4) B(1,1), (5) C(1,1), versus number of processed bands]
Fig. 19.11 Comparison of IPPI and RHBP-IPPI: (a) PPI-identified signatures; (b) orders of signatures identified by RHBP-IPPI

demonstrated in the table, E_ns^(nl) in the second column indicates the signature identified
using ns skewers with the first l bands collected and processed, nl.
Furthermore, Fig. 19.11a shows the endmember candidates identified by PPI for
the TE scene using 500 skewers with all bands received and processed, where
five mineral signatures and three subpixels could be extracted as endmember
candidates. However, with only 134 bands collected and processed, that is,
nl = 134, all five signatures could be extracted by RHBP-IPPI using the same
number of skewers, 500. Additionally, four subpixels could be extracted as
endmember candidates by RHBP-IPPI from the TE scene, in which case one
additional subpixel could be found compared to PPI (Fig. 19.11a). Moreover,
Fig. 19.11b plots the order of each mineral signature A, B, C, K, and M identified as
endmember candidates versus the number of bands processed, nl. It is clear
that with only 134 bands available, nl = 134, all five signatures could be extracted.
As noted earlier, 500 skewers were generated and used in the RHBP-IPPI
process for the TE scene. However, selecting an appropriate number of skewers

is very challenging. To mitigate this dilemma, a progressive-skewer version, RHBP-PS-IPPI, is
further developed to address the issue of the number of skewers that need to be generated for
RHBP-IPPI. Figure 19.12 plots PPI count versus the number of processed bands, nl, as the
number of skewers increases from 50 to 500 with a step size of 50 for the five
mineral signatures, where the x-axis is used to specify the parameter nl, the y-axis
shows the number of skewers generated for RHBP-PS-IPPI, and the z-axis indicates
the value of the PPI count.

[Fig. 19.12a, b: 3D bar plots of PPI count versus number of processed bands and number of skewers]
Fig. 19.12 Plots of PPI count versus number of bands, nl, and number of skewers, ns, processed by RHBP-PS-IPPI for each of five mineral signatures: (a) A(1,1), (b) B(1,1), (c) C(1,1), (d) K(1,1), (e) M(1,1)

Fig. 19.12 (continued)



Fig. 19.12 (continued)

As shown in the figures, the height of each bar represents the value of the PPI
count for a particular combination of the number of processed bands, nl, and the
number of skewers generated for the process. The magenta upward-pointing arrow
indicates that one specific material was identified for the first time as an endmember
candidate in a particular band by RHBP-IPPI. For instance, the signature B(1,1) was
identified as an endmember candidate by RHBP-IPPI for the first time using
21 bands, that is, nl = 21, with 500 skewers. Interestingly, RHBP-PS-IPPI
needs only 300 skewers to identify all five mineral signatures as
endmember candidates. Thus, the smaller the number of skewers, the lower the
computational complexity. Based on our extensive experiments, the spectrum
of material C is very similar to the background, which makes it difficult to identify.
This is why it is the last material to be identified, using nl = 134, and why it also
requires the largest number of skewers.
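Because a PPI count is a sum of indicator contributions over skewers, growing the skewer set only requires adding the counts contributed by the newly generated skewers to the running totals; nothing computed for earlier skewers is touched. A minimal NumPy sketch of this progressive-skewer bookkeeping (assumed names, not the book's implementation):

```python
import numpy as np

def ps_ppi(data, skewer_batches):
    """Progressive-skewer PPI: data is (L, N); skewer_batches is an iterable of
    (k_i, L) arrays of newly generated skewers. Yields the cumulative PPI
    counts after each batch, since counts are additive over skewers."""
    counts = np.zeros(data.shape[1], dtype=int)
    for batch in skewer_batches:
        proj = batch @ data                        # (k_i, N) projections
        np.add.at(counts, proj.argmax(axis=1), 1)  # pixels extreme on the max side
        np.add.at(counts, proj.argmin(axis=1), 1)  # pixels extreme on the min side
        yield counts.copy()

rng = np.random.default_rng(1)
data = rng.random((10, 8))                                    # 10 bands, 8 pixels
batches = [rng.standard_normal((50, 10)) for _ in range(4)]   # grow 50 -> 200 skewers
grown = list(ps_ppi(data, batches))
```

The cumulative counts after the last batch coincide with a one-shot PPI run over all skewers, which is why the skewer set can be grown in steps until the identified candidates stabilize.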
To further address the causal issue that only visited sample vectors can be
included in data processing, a causal version of RHBP-IPPI, called RHBP-C-IPPI,
was developed. Figure 19.13 plots the progressive changes in the PPI count,
varying nl, as the number of processed pixels, np, increases from 2000 to
40,000 with a step size of 2000 for the five mineral signatures in the TE scene,
where the x-axis denotes the value of nl, the y-axis shows the number of pixels
included in the process, np, and the z-axis indicates the value of the PPI count.
As shown in Fig. 19.13, the height of each bar represents the value of the PPI
count for a particular combination of nl and np. The magenta upward-pointing arrow
indicates that a specific material was identified for the first time as an endmember
candidate in the particular band by RHBP-C-IPPI. For instance, the signature B(1,1) is
identified as an endmember candidate by RHBP-C-IPPI for the first time with

[Fig. 19.13a, b: 3D bar plots of PPI count versus number of processed bands and number of processed pixels]
Fig. 19.13 Plots of PPI count versus nl and np for each of five mineral signatures: (a) A(1,1), (b) B(1,1), (c) C(1,1), (d) K(1,1), (e) M(1,1) processed by RHBP-C-IPPI

Fig. 19.13 (continued)



Fig. 19.13 (continued)

500 skewers using 21 bands available, nl = 21. In fact, RHBP-C-IPPI can identify
the five mineral signatures as endmember candidates using only the first
14,000 pixels collected, np = 14,000.
Moreover, Fig. 19.14 shows an example of implementing RHBP-C-IPPI using
np ranging from 4000 to 40,000 with a step size of 4000 in band 1, that is, nl = 1. As
shown in the figure, a red cross indicates a ground truth pixel location in the TI
scene, and a cyan upper triangle highlights a ground truth pixel identified as an
endmember candidate. The white area in the scene indicates that the pixels in this
area are not yet available for data processing. The beauty of RHBP-C-IPPI is its
ability to capture weak targets in a scene. For instance, subpixel material A in the
third column could be identified as an endmember candidate with np from 8000 to
24,000 pixels but vanishes afterwards.
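The causal constraint, namely that only pixels visited so far may compete for extrema, can be mimicked by keeping each skewer's running maximum and minimum projection as the pixel stream arrives; a later, stronger pixel displaces an earlier extreme, which is exactly how a weak target can surface as an endmember candidate early on and vanish afterwards. An illustrative NumPy sketch under these assumptions (the names are ours):

```python
import numpy as np

def causal_extremes(pixel_stream, skewers):
    """Causal PPI bookkeeping: skewers is (K, L); pixel_stream yields length-L
    pixel vectors in acquisition order. After each new pixel, yields the
    indices of the visited pixels currently holding each skewer's max/min."""
    K = skewers.shape[0]
    best_max = np.full(K, -np.inf); arg_max = np.zeros(K, dtype=int)
    best_min = np.full(K, np.inf);  arg_min = np.zeros(K, dtype=int)
    for j, x in enumerate(pixel_stream):
        p = skewers @ x                      # projections of the new pixel
        hit = p > best_max                   # displace earlier maxima
        best_max[hit] = p[hit]; arg_max[hit] = j
        hit = p < best_min                   # displace earlier minima
        best_min[hit] = p[hit]; arg_min[hit] = j
        yield j, arg_max.copy(), arg_min.copy()

rng = np.random.default_rng(2)
pixels = rng.random((30, 5))                 # 30 pixels, 5 bands
S = rng.standard_normal((40, 5))
trace = list(causal_extremes(pixels, S))
```

A pixel whose index appears in the extreme lists early in the trace but not at the end is a "vanishing" candidate of the kind observed for subpixel material A above.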

19.6 Real Image Experiments

The two real hyperspectral image scenes described in Chap. 1, the HYperspectral
Digital Imagery Collection Experiment (HYDICE) data in Fig. 1.10 and the Airborne
Visible/Infrared Imaging Spectrometer (AVIRIS) data in Figs. 1.3 and 1.4, are used
in the experiments to demonstrate the applicability of C-IPPI in its causal real-time
implementation.
The image scene shown in Fig. 19.15 (also shown in Fig. 1.10a) was used for the
experiments. It was acquired by the airborne HYDICE sensor. It has a size of 64 × 64 pixel
vectors with 15 panels and the ground truth map in Fig. 19.15b (Fig. 1.10b), where
the ith panel signature, denoted by pi, was generated by averaging the red panel

Fig. 19.14 Example of RHBP-C-IPPI process using different np pixels in band nl = 1: (a) np = 4000, (b) np = 8000, (c) np = 12,000, (d) np = 16,000, (e) np = 20,000, (f) np = 24,000, (g) np = 28,000, (h) np = 32,000, (i) np = 36,000, (j) np = 40,000 (full size of image pixels)

[Fig. 19.15c axes: Radiance versus Band; legend P1-P5]
Fig. 19.15 (a) HYDICE panel scene containing 15 panels; (b) ground truth map of spatial locations of the 15 panels; (c) spectra of p1, p2, p3, p4, and p5

center pixels in row i; the spectra are shown in Fig. 19.15c (Fig. 1.11). According to the
HYDICE image scene in Fig. 19.15, there are 19 R panel pixels arranged in a 5 × 3
matrix, with three panel pixels in row 1 and four panel pixels in each of
rows 2-5.
Note that the only pure panel pixels present in the data scene in Fig. 19.15 are
those in column 1 (one panel pixel in row 1 and two panel pixels in each of the other
four rows) and column 2 (one panel pixel in each of the five rows) corresponding to
the five panel signatures p1, p2, p3, p4, and p5 in Fig. 19.15c.
Figure 19.16a, b shows the plots of the PPI count versus nl for the 19 different R
panel pixels with 1000 skewers used by RHBP-IPPI, where the x-axis
specifies nl, the y-axis gives the coordinates of the 19 R panel pixels in row-
by-row order, that is, p11, p12, p13, p211, p221, p22, p23, p311, p312, p32, p33, p411, p412,
p42, p43, p511, p521, p52, and p53, and the z-axis is the corresponding PPI count value.

[Fig. 19.16 annotations (transition bands): p221 ∈ E_1000^(5); p22, p511 ∈ E_1000^(9); p521 ∈ E_1000^(21); p52 ∈ E_1000^(22); p42 ∈ E_1000^(24); p211 ∈ E_1000^(28); p412 ∈ E_1000^(34); p411 ∈ E_1000^(36); p311 ∈ E_1000^(50); p312, p32 ∈ E_1000^(52); p11 ∈ E_1000^(101)]
Fig. 19.16 3D plots of PPI counts versus nl for different R panel pixels: (a) PPI count changes for the 19 R panel pixels; (b) PPI count changes for the 14 pure R panel pixels

Figure 19.16a plots the variation in the PPI count for all 19 R panel pixels, while
Fig. 19.16b plots only the progressive changes in the PPI count for the 14 pure R panel
pixels and excludes the 5 subpixels to make the 3D graphs clearer. In these two
graphs, the height of a line indicates the value of the PPI count. The magenta arrows
in the graph show the transition bands, which provide information about the PPI
count of a particular panel pixel increasing from zero to a nonzero value and can be
used to identify an endmember candidate being extracted for the first time in the
current band. For instance, the PPI count value of panel pixel p11 jumps from 0 to
1, and p11 is identified as an endmember candidate for the first time, when nl = 101 bands
were received and processed. As we can see in the figure, the PPI counts of the
panel pixels in the first column are greater than the PPI counts of the one-pixel-size
panel pixels in the second column. Meanwhile, the PPI counts of the subpixel
panels in the third column remain 0 during the entire process. In other words,
subpixel panels could not be extracted as endmember candidates by RHBP-IPPI. It
is also clear that the PPI counts of both p511 and p521 are greater than those of p52,
and the PPI count of p53 is 0 during the entire process, so it cannot be identified as
an endmember candidate. Also, the PPI counts of the panel pixels tend to
stabilize as more bands are added to the process. Without the progressively
changing values of the PPI counts provided by RHBP-IPPI, we would not be able to
observe such valuable information, which is compromised by subsequent bands.
Furthermore, Fig. 19.17 highlights the panel pixels with PPI counts greater than
0 in their transition bands, where a red cross indicates the spatial location of an R
panel pixel, a cyan upper triangle shows a panel pixel with its PPI count greater than
0 extracted in the current band, and a yellow circle marks a panel pixel found in
previous bands. As we can see, after band 101, that is, nl = 101, 13 out of 19 panel
pixels were identified as endmember candidates.
Figure 19.18 presents a comparative study between IPPI and RHBP-IPPI
applied to the HYDICE scene. Figure 19.18a shows the ground truth pixels identified
as endmember candidates by IPPI, where a red cross indicates the spatial
location of an R panel pixel in the HYDICE scene, and a cyan upper triangle shows
a ground truth pixel identified as an endmember candidate. A total of 7 R panel
pixels were identified as endmember candidates. However, according to
Figs. 19.16a and 19.17, a total of 13 R panel pixels were identified by RHBP-IPPI
as endmember candidates: p11, p211, p221, p22, p311, p312, p32, p411, p412, p42,
p511, p521, and p52. Figure 19.18b further shows the order in which these 13 panel pixels
were found by RHBP-IPPI, where the x-axis denotes nl, the number of bands being
processed, and the y-axis represents the order of panel pixels identified as
endmember candidates. For instance, p221 is the first panel pixel identified
by RHBP-IPPI with only the first five bands, nl = 5, processed, and p22 and
p511 are the second and third panel pixels identified after nl = 9. After nl = 101
bands were processed, the last panel pixel, p11, was extracted.
As noted in the preceding discussions, RHBP-IPPI can identify a set of
endmember candidates band by band progressively. Figure 19.19 plots the number
of distinct endmember candidates extracted by RHBP-IPPI as the number of
processed bands, nl, gradually increases using 1000 skewers, where RHBP-IPPI
identified a total of 373 endmember candidates for the next step in the process. As
more bands are included in the process, the number of distinct endmember candidates
extracted by RHBP-IPPI increases. After 106 bands are processed, nl = 106,
the number of distinct endmember candidates stops varying and achieves the
same results as when the full set of bands is processed.
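The curve just described can be reproduced in principle by accumulating, band by band, the set of pixels that have ever been flagged as extreme on some skewer; the set can only grow and saturates once later bands stop flagging new pixels. A small self-contained NumPy sketch of this bookkeeping (illustrative names only):

```python
import numpy as np

def distinct_candidates(data_bands, skewers):
    """data_bands is (L, N), skewers is (K, L). Returns, for each number of
    processed bands l = 1, ..., L, how many distinct pixels have been flagged
    as endmember candidates (extreme on some skewer) so far."""
    L, N = data_bands.shape
    proj = np.zeros((skewers.shape[0], N))
    ever_found = np.zeros(N, dtype=bool)      # pixels ever flagged up to band l
    sizes = []
    for l in range(L):
        proj += np.outer(skewers[:, l], data_bands[l])  # band-recursive update
        ever_found[proj.argmax(axis=1)] = True
        ever_found[proj.argmin(axis=1)] = True
        sizes.append(int(ever_found.sum()))
    return sizes

rng = np.random.default_rng(3)
X = rng.random((12, 100))                 # 12 bands, 100 pixels
S = rng.standard_normal((200, 12))
curve = distinct_candidates(X, S)         # nondecreasing, then saturates
```

The resulting curve is nondecreasing in the number of processed bands, mirroring the behavior reported for the HYDICE scene, where the count stabilized after 106 bands.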
Furthermore, Fig. 19.20 shows a 3D plot demonstrating how frequently a
particular pixel was picked up by RHBP-IPPI as an endmember candidate, where
the x-axis and y-axis are used to specify the columns and rows of the spatial
locations of the extracted pixels. Note that the z-axis in Fig. 19.20 indicates the

Fig. 19.17 Panel pixels with PPI counts greater than 0 after processing of nl bands: (a) nl = 5, (b) nl = 9, (c) nl = 21, (d) nl = 22, (e) nl = 24, (f) nl = 28, (g) nl = 34, (h) nl = 36, (i) nl = 50, (j) nl = 52, (k) nl = 101, (l) nl = 169

[Fig. 19.18b axis, order of panel pixels identified (first to last): p221; p22, p511; p521; p52; p42; p211; p412; p411; p311; p312, p32; p11]
Fig. 19.18 Comparative study of IPPI and RHBP-IPPI: (a) IPPI-identified signatures; (b) orders of signatures identified by RHBP-IPPI
[Fig. 19.19 axes: number of distinct endmember candidates (0-400) versus number of processed bands]
Fig. 19.19 Plot of number of distinct endmember candidates versus number of processed bands, nl, for HYDICE data

number of times a given pixel at a particular spatial location was identified as an
endmember candidate. For the HYDICE scene, 169 bands is the maximum value
of the z-axis. As also shown in Fig. 19.20, the magenta upward-pointing arrows indicate
the spatial locations of the 13 R panel pixels.

[Fig. 19.20 axes: frequency versus row index and column index (64 × 64); arrows mark p11, p211, p221, p22, p311, p312, p32, p411, p412, p42, p511, p521, p52]
Fig. 19.20 3D plot of pixels extracted by RHBP-IPPI as endmember candidates

In the HYDICE experiments, 1000 skewers were generated and used by RHBP-
IPPI. However, as noted, selecting an appropriate number of skewers remains an
unresolved problem. In this case, a progressive-skewer version of RHBP-IPPI,
RHBP-PS-IPPI, was further developed to address the issue of how many
skewers need to be generated for RHBP-IPPI. Figure 19.21 plots progressive
changes in the PPI count versus the number of processed bands, nl, as the number
of skewers increases from 100 to 1000 with a step size of 100 for R panel pixels p11,
p211, p311, p411, and p511, where the x-axis denotes the number of bands collected
and processed, the y-axis shows the number of skewers generated for RHBP-PS-
IPPI, and the z-axis indicates the value of the PPI count.
As shown in these figures, the height of each bar represents the value of the PPI
count for a particular combination of the number of processed bands, nl, and the number of
skewers, ns, generated for data processing. The magenta upward-pointing arrow
indicates that a specific material signature was identified as an endmember candidate
for the first time in that band by RHBP-PS-IPPI. For instance, R panel pixel p211
is identified for the first time by RHBP-PS-IPPI using 100 skewers as an
endmember candidate when nl = 28, that is, when the first 28 bands were processed. As a
matter of fact, RHBP-PS-IPPI could identify panel pixels in the first column as
endmember candidates using as few as 100 skewers. Table 19.4 summarizes
the number of bands processed, nl, at which a particular pixel was extracted
for the first time as an endmember candidate and the minimum number of skewers
required to extract all the R panel pixels in the first column identified as endmember
candidates.
As seen from the table, to extract all the R panel pixels in the first column, a
minimum of 700 skewers was required. A smaller set of skewers generated in the
process could have reduced the computational complexity. To address the causal

Table 19.4 Summary of all R panel pixels in first column extracted by RHBP-PS-IPPI

R panel pixels    Number of bands processed required      Minimum number of skewers
in first column   for a pixel to be first extracted       required to extract a
                  as an endmember candidate               particular pixel
p11               101                                     500
p211              28                                      100
p221              5                                       500
p311              50                                      700
p312              52                                      100
p411              36                                      700
p412              34                                      400
p511              9                                       600
p521              21                                      700

issue in the data collection process, RHBP-C-IPPI is further proposed. Figure 19.22
plots the changes in the PPI count with different values of nl as the number of
processed pixels, np, increases from 256 to 4096 with a step size of 256 for the 5 R
panel pixels in the first column of the HYDICE scene, where the x-axis denotes the
number of bands collected and processed, the y-axis is the number of pixels
included in the process, np, and the z-axis represents the value of the PPI count.
As shown in Fig. 19.22, the height of each bar represents the value of the PPI count
for a particular combination of the number of processed bands, nl, and the pixels
included in the process, np. The magenta upward-pointing arrow indicates the band in
which a material is identified for the first time as an endmember candidate by
RHBP-C-IPPI. For instance, R panel
pixel p411 is first identified as an endmember candidate with 36 bands available, that is,
nl = 36, by RHBP-C-IPPI using ns = 1000 skewers and 3072 pixels, that is, np = 3072.
Moreover, Fig. 19.23 shows an RHBP-C-IPPI process with the number of
processed pixels ranging from 512 to 4096 with a step size of 512 in band 19. As
shown in the figure, a red cross indicates the ground truth location of an R panel
pixel in the HYDICE scene, and a cyan upper triangle highlights the ground truth
pixel identified as an endmember candidate. The white area in the scene indicates
that the pixels in this area are not yet available. The beauty of the RHBP-C-IPPI
process is that it extracts weak targets in the scene. For instance, R panel pixel p22 in
the second column can be identified as an endmember candidate with
1536-3072 pixels available, and it vanishes afterwards.

19.7 Graphical User Interface

A graphical user interface (GUI) (screenshot shown in Fig. 19.24) was developed
using MATLAB’s GUIDE to aid in algorithm performance analysis. The three
images displayed at the top of the window show a color image of the scene, a

[Fig. 19.21a, b: 3D bar plots of PPI count versus number of processed bands and number of skewers]
Fig. 19.21 Plots of PPI count versus number of processed bands, nl, and different numbers of skewers: (a) p11, (b) p211, (c) p311, (d) p411, (e) p511

Fig. 19.21 (continued)



Fig. 19.21 (continued)

grayscale image of the current band being processed, and the result after the current
band is processed. The bottom window shows the variations in the IPPI counts
of the different signatures as more bands are collected. Once the data are loaded,
the user can start the process by clicking the start button. At each iteration, a
band is received and processed. Upon completion of processing, the current band
and the resulting image are updated to allow the user to observe the results. A final
note on Fig. 19.24 is worthwhile. The plots shown in the bottom window represent a
real-time progressive version of RHBP-IPPI. Unfortunately, this nice feature can
only be demonstrated in a real-time process.

19.8 Conclusions

This chapter derived a new version of PPI for recursive hyperspectral band
processing, referred to as RHBP-IPPI, whose use goes beyond endmember finding,
for which PPI is commonly used in the literature. The use of RHBP-IPPI makes it
possible to find and track spectral variations from band to band so that missing
details in the spectral characterization of data samples can be captured for data
analysis. This is particularly useful and crucial when it comes to finding weak
targets with only small changes in certain specific bands and lower PPI counts,
which can be overwhelmed or dominated by subsequent strong targets found by PPI
with higher PPI counts. This is mainly because PPI using full bands can only show
the final PPI counts of data samples; it cannot provide any information about the
changes in PPI counts of

[Fig. 19.22a, b: 3D bar plots of PPI count versus number of processed bands and number of processed pixels]
Fig. 19.22 PPI count variations versus number of bands, nl, and number of pixels, np, processed by RHBP-C-IPPI for each R panel pixel: (a) p11, (b) p211, (c) p311, (d) p411, (e) p511

Fig. 19.22 (continued)



Fig. 19.22 (continued)



Fig. 19.23 An example of RHBP-C-IPPI process using different np pixels in band nl = 1: (a) np = 4000, (b) np = 8000, (c) np = 12,000, (d) np = 16,000, (e) np = 20,000, (f) np = 24,000, (g) np = 28,000, (h) np = 32,000, (i) np = 36,000, (j) np = 40,000 (full size of image pixels)

Fig. 19.24 GUI design for RHBP-IPPI

data samples from band to band. In addition, the potential of RHBP-IPPI for finding
significant bands through changes in PPI counts was also demonstrated by experiments.
As a further study, RHBP-IPPI can be extended to a real-time implementation for
future hardware design.
Chapter 20
Recursive Band Processing of Fast
Iterative Pixel Purity Index

Abstract As noted in Chap. 19, the performance of the pixel purity index (PPI) is
largely determined by the number of skewers, K, to be used to calculate PPI counts
for data sample vectors. Recently, two approaches were investigated. One is the
iterative PPI (IPPI), developed by Chang and Wu [IEEE J Sel Top Appl Earth Obs
Remote Sens 8(6):2676–2695, 2015], where Chap. 19 develops a progressive
version of the IPPI, called progressive hyperspectral band processing of IPPI
(PHBP-IPPI), and a recursive version of IPPI, called recursive hyperspectral band
processing of IPPI (RHBP-IPPI), both of which vary with bands rather than skewer
sets according to the band-sequential (BSQ) acquisition format so that bands can be
collected band by band while data processing is taking place. Another approach is
the fast iterative PPI (FIPPI) developed by Chang and Plaza [IEEE Geosci Remote
Sens Lett 3(1):63–67, 2006] to address two major issues arising in PPI: the use of
skewers whose number must be determined a priori and inconsistent final outcomes
resulting from the use of skewers, which cannot be reproduced. FIPPI is an iterative
algorithm that iterates each process until it reaches a final set of endmembers. Most
importantly, it is an unsupervised algorithm, as opposed to the PPI, which requires
human intervention to manually select a final set of endmembers. As shown in
Chang and Plaza [IEEE Geosci Remote Sens Lett 3(1):63–67, 2006] and Chang and
Wu [IEEE J Sel Top Appl Earth Obs Remote Sens 8(6):2676–2695, 2015] both the
FIPPI and an IPPI produce very similar results, but FIPPI converges very rapidly
with significant savings in computation. Following a similar treatment derived for
RHBP-IPPI in Chap. 19, this chapter extends FIPPI to an RHBP version for FIPPI,
called recursive hyperspectral band processing of FIPPI (RHBP-FIPPI) in such a
way that data analysts can observe the progressive profiles of interband changes
among bands produced by RHBP-FIPPI. The idea of implementing RHBP-FIPPI is
to use two loops specified by skewers and bands implemented in RHBP-IPPI in
Chap. 19 to process FIPPI. Depending on which one is implemented in the outer
loop, two different versions of RHBP-FIPPI can be designed. When the outer loop
is iterated band by band, it is called RHBP-FIPPI. When the outer loop is iterated by
growing skewers, it is called recursive skewer processing of FIPPI (RSP-FIPPI).
Interestingly, both versions provide different insights into the design of FIPPI but
produce close results. Finally, note that since RHBP is implemented band by
band, it does not require dimensionality reduction, unlike the original FIPPI.

© Springer International Publishing Switzerland 2017
C.-I Chang, Real-Time Recursive Hyperspectral Sample and Band Processing,
DOI 10.1007/978-3-319-45171-8_20

20.1 Introduction

The pixel purity index (PPI) has been widely used for finding endmembers (Chang
2013, 2016). However, its utility has been demonstrated to be more than just for
endmember finding; for example, it can be used to find seed training samples for
multispectral image classification (Chen et al. 2013), to find feasible sets for
potential endmember candidates (Xiong et al. 2011), and so forth. One of the
major issues associated with PPI is its sensitivity to input parameters, namely
K (number of skewers) and t (cutoff threshold value) used in PPI. Another impor-
tant issue is the random procedure employed by PPI to generate so-called skewer
vectors (or skewers), which are then used to compute endmember candidates. A
third issue is its computational complexity. A fourth issue is the requirement of
human intervention to manually select a final set of endmembers by visual inspec-
tion. Most importantly, PPI is not an iterative process and does not guarantee its
convergence in finite runs, despite the fact that it may converge asymptotically, as is
claimed. To address these issues, a fast iterative algorithm for implementing PPI,
referred to as fast iterative pixel purity index (FIPPI), was developed by Chang and
Plaza (2006) and discussed in Sect. 19.3.3. It has several significant advantages
over PPI. First, it takes advantage of a recently developed concept, virtual dimen-
sionality (VD), developed by Chang (2003a) and Chang and Du (2004), to estimate
the number of endmembers, p, required to be generated. Using VD allows us to
replace the two parameters K and t used in PPI so that the algorithm’s sensitivity to
these parameters can be resolved. Second, FIPPI makes use of the automatic target
generation process (ATGP) discussed in Sect. 4.4.2.3 to generate an appropriate set
of initial endmembers that can speed up the algorithm considerably. Third, because
ATGP-generated targets used as initial endmembers are specific, there is no issue of
randomness resulting from skewers. Fourth, FIPPI is an iterative algorithm that
converges very rapidly with tremendous savings in computational time. Most
importantly, PPI requires a visualization tool to manually select a final set of
endmembers. Such a problem is avoided by FIPPI because FIPPI is automatically
terminated and the final set of FIPPI-generated endmembers is consistent. This is
considered one of the most significant advantages of FIPPI over PPI. Additionally,
FIPPI only requires a few iterations to generate its final set of endmembers
compared with PPI, which requires the parameter K to be a very high number, for
example, 10^4 or greater. Most significantly, FIPPI substantially reduces the
computational complexity via an iterative process.
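The skewer mechanism that FIPPI improves upon can be made concrete in a few lines. The following is a minimal sketch of the classic PPI count, not the book's implementation: K random unit skewers are drawn, every sample is projected onto each skewer, and a sample's count is the number of skewers that place it at an extreme position. The cutoff threshold t would then be applied to the returned counts.

```python
import numpy as np

def ppi_counts(data, K=1000, seed=0):
    """Toy sketch of the classic PPI count.

    data : (N, L) array of N sample vectors with L bands.
    K    : number of random skewers (must be chosen a priori).
    Returns an (N,) array of N_PPI(r_i): how many skewers place sample i
    at an extreme (min or max) of the projections.
    """
    rng = np.random.default_rng(seed)
    N, L = data.shape
    counts = np.zeros(N, dtype=int)
    for _ in range(K):
        skewer = rng.standard_normal(L)
        skewer /= np.linalg.norm(skewer)   # unit-norm random skewer
        proj = data @ skewer               # orthogonal projections
        counts[np.argmax(proj)] += 1       # only extreme positions score
        counts[np.argmin(proj)] += 1
    return counts
```

Samples whose counts exceed the cutoff t become endmember candidates; a pixel inside the data simplex is never extreme, so its count stays at zero.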
Since a recursive hyperspectral band processing version of PPI, called recursive
hyperspectral band processing of PPI (RHBP-PPI), is developed in Chap. 19, it
would be highly desirable to adopt a similar treatment to derive a recursive version
of FIPPI that could also process FIPPI band by band recursively according to band
availability without waiting for full bands to be completely acquired. This chapter
explores this idea and further develops a recursive hyperspectral band processing of
FIPPI (RHBP-FIPPI) as a counterpart to RHBP-PPI using two iterative processes to
perform FIPPI with bands in the outer loop and skewers in the inner loop. If these
two loops are swapped by growing skewers in the outer loop and processing bands
20.2 FIPPI 597

in the inner loop, the resulting RHBP-FIPPI is called recursive skewer processing of
FIPPI (RSP-FIPPI). Interestingly, both versions provide different insights into the
design of FIPPI but produce similar results.

20.2 FIPPI

One of the major drawbacks of PPI is its computational complexity. For instance,
the algorithm took more than 50 min to project every data sample vector of a
350 × 350 pixel subset of the 224-band Cuprite AVIRIS image scene onto 10^4
skewers on a PC with an AMD Athlon 2.6 GHz processor and 512 MB RAM. To
reduce computational complexity, most users of the ENvironment for Visualizing
Images (ENVI) software preprocess data by Minimum Noise Fraction transforma-
tion so that the original data’s dimensionality can be reduced to ease computation.
Additionally, PPI is not iterative and can only guarantee that it will produce optimal
results asymptotically. Thus, it is recommended that the algorithm be implemented
using as many skewers as possible in order to obtain optimal results. Also,
according to our experiments, PPI is also very sensitive to the initial values of
parameters K and especially t. Furthermore, ENVI’s PPI makes use of a built-in
algorithm to randomly generate a large set of so-called skewers, and users of the
ENVI’s PPI do not have a choice of their initial sets of skewers. Finally, PPI
requires a supervised procedure to manually select a final set of endmembers,
which largely depends on human interpretation. This section addresses these issues
by developing FIPPI. Interestingly, by virtue of IPPI, as discussed in Chap. 19,
FIPPI can be reinterpreted as a variant of growing skewer set P-IPPI (Sect. 19.3.3).
As noted, one of the major disadvantages resulting from implementing PPI is the
inconsistency in the final set of extracted endmembers caused by randomly generated
skewers. One way to mitigate this issue is to develop a random version of P-IPPI,
called random P-IPPI (RP-IPPI), similar to RC-IPPI described in Chang (2016).
Another way to resolve this issue is to specify a set of specially generated initial
endmembers produced by a certain algorithm, an approach that is in direct opposition
to RP-IPPI. One such algorithm is FIPPI, described in detail in what follows, where
an initial set of endmembers is generated by a specific algorithm, ATGP, in Sect. 4.4.2.3.

FIPPI
Initial conditions: Find the VD using the Harsanyi–Farrand–Chang (HFC)
method, and let it be p (see Chap. 4).
Outer loop from l = p to L
Inner loop
1. Initial condition: Let $\{\text{skewer}_{l,j}^{(0)}\}_{j=1}^{p}$ be an initial set of p skewers generated
by selecting those pixels that correspond to target pixels generated by ATGP.
2. Normalize each $\text{skewer}_{l,j}^{(0)}$ in $\{\text{skewer}_{l,j}^{(0)}\}_{j=1}^{p}$ into a unit vector.
3. Iterative rule: At iteration k, for each $\text{skewer}_{l,j}^{(k)}$, all sample vectors are
projected onto this particular $\text{skewer}_{l,j}^{(k)}$ to find those that are at its extreme
positions to form an extrema set, denoted by $S_{\text{extrema}}(\text{skewer}_{l,j}^{(k)})$. Find the
sample vectors that produce the largest $N_{\text{PPI}}(\mathbf{r}_{l,j}^{(k)})$ and let them be denoted by
$\{\mathbf{r}_{l,j}^{(k)}\}$.
4. Form the joint set
$$\big\{\text{skewer}_{l,j}^{(k+1)}\big\} = \Big\{\arg\max_{\mathbf{r}_{l,j}^{(k)}} N_{\text{PPI}}\big(\mathbf{r}_{l,j}^{(k)}\big)\Big\} \cup \big\{\text{skewer}_{l,j}^{(k)}\big\}. \quad (20.1)$$
5. If $\{\text{skewer}_{l,j}^{(k+1)}\} = \{\text{skewer}_{l,j}^{(k)}\}$, then no new endmembers are added to the
skewer set. In this case, the algorithm is terminated, breaking the inner loop.
Otherwise, let $k \leftarrow k+1$, and go to step 2.
End Inner loop
End Outer loop
Note that in the original design of FIPPI developed by Chang and Plaza (2006),
the new skewer set, $\{\text{skewer}_{l,j}^{(k+1)}\}$, is augmented by including all data sample
vectors $\mathbf{r}_{l,j}^{(k)}$ with $N_{\text{PPI}}\big(\mathbf{r}_{l,j}^{(k)}\big) > 0$, that is,
$$\big\{\text{skewer}_{l,j}^{(k+1)}\big\} = \big\{\mathbf{r}_{l,j}^{(k)}\big\}_{N_{\text{PPI}}(\mathbf{r}_{l,j}^{(k)}) > 0} \cup \big\{\text{skewer}_{l,j}^{(k)}\big\}, \quad (20.2)$$
instead of (20.1), which only includes the data sample vectors $\mathbf{r}_{l,j}^{(k)}$ with
maximal $N_{\text{PPI}}(\mathbf{r}_{l,j}^{(k)})$. The reason for using (20.1) is that the skewer sets then do not grow
too fast, so that we can see the progressive profiles of finding endmembers from growing
skewer sets.
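As a rough illustration of the inner iteration and the joint-set rule (20.1), the following sketch projects all samples onto the current skewers, counts extreme positions, and unions the samples with the largest PPI count into the skewer set until it stops growing. This is a simplified reading of the steps above under the stated rule, not the author's code.

```python
import numpy as np

def fippi(data, init_skewers, max_iter=50):
    """Minimal sketch of the FIPPI inner loop with joint-set rule (20.1).

    data         : (N, L) samples; init_skewers : (p, L), e.g. ATGP targets.
    Returns indices of the samples retained as endmember candidates.
    """
    def unit(v):
        return v / np.linalg.norm(v, axis=1, keepdims=True)

    skewers = unit(np.asarray(init_skewers, dtype=float))
    counts = np.zeros(len(data), dtype=int)
    for _ in range(max_iter):
        proj = data @ skewers.T                 # (N, n_skewers) projections
        counts = np.zeros(len(data), dtype=int)
        for j in range(skewers.shape[0]):
            counts[np.argmax(proj[:, j])] += 1  # extreme positions only
            counts[np.argmin(proj[:, j])] += 1
        winners = np.flatnonzero(counts == counts.max())
        # rule (20.1): union the arg-max samples with the current skewer set
        joint = np.unique(np.vstack([skewers, unit(data[winners])]), axis=0)
        if len(joint) == len(skewers):          # nothing new was added: stop
            break
        skewers = joint
    return np.flatnonzero(counts > 0)           # endmember candidates
```

On a toy data set whose first two rows are the pure signatures and the rest are mixtures, the loop converges in two iterations to exactly those two rows.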
Figure 20.1 depicts a flowchart showing a detailed, step-by-step implementation
of FIPPI.
Note that when the algorithm is terminated in step 5, the vectors corresponding
to the pixels $\mathbf{r}_j^{(k+1)}$ with maximal $N_{\text{PPI}}\big(\mathbf{r}_j^{(k+1)}\big)$ are the desired endmembers, denoted by
$\{\mathbf{e}_j^{(k+1)}\}$. Additionally, in step 1 of initialization, ATGP in Sect. 4.4.2.3 is used to
generate the initial set of p skewers, $\{\text{skewer}_j^{(0)}\}_{j=1}^{p}$. ATGP can be replaced
by any method, including the one used in PPI that randomly generates so-called
skewers, $\{\text{skewer}_j^{(0)}\}$, as long as the initialization method provides a good estimate
of initial skewers. But, as will be shown experimentally, such randomly generated
initial skewers can only slow down the algorithm.

Fig. 20.1 Flowchart showing detailed, step-by-step implementation of FIPPI (pre-processing: find nVD = p and use ATGP to produce an initial set of p skewers; iterate until the skewer set no longer changes)
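For readers who want a concrete picture of the ATGP initialization, the following sketch uses a common formulation of ATGP, in which each new target maximizes the residual after orthogonally projecting out the targets found so far; the details of Sect. 4.4.2.3 may differ, so treat this as an assumption-laden illustration.

```python
import numpy as np

def atgp(data, p):
    """Sketch of an ATGP-style target generation process used to seed FIPPI.

    data : (N, L) samples. Returns the indices of p target pixels: the
    brightest pixel first, then, repeatedly, the pixel with the largest
    residual after projecting out the targets already found.
    """
    X = np.asarray(data, dtype=float)
    idx = [int(np.argmax(np.sum(X**2, axis=1)))]    # brightest pixel first
    for _ in range(p - 1):
        U = X[idx].T                                 # (L, found) target matrix
        # orthogonal-complement projector P = I - U (U^T U)^{-1} U^T
        P = np.eye(X.shape[1]) - U @ np.linalg.pinv(U)
        resid = np.sum((X @ P) ** 2, axis=1)         # residual energies
        idx.append(int(np.argmax(resid)))
    return idx
```

Because the targets are deterministic functions of the data, seeding FIPPI with them removes the randomness issue that plagues skewer-based PPI.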
Several major benefits can be gained from FIPPI. First, it improves computational
efficiency significantly compared to the original PPI since FIPPI is an
iterative algorithm, whereas PPI is not. Second, the parameters K and t are not
required; thus, users need not input these values, avoiding human subjectivity.
Third, the number of endmembers that must be generated, p, can be estimated by
VD, unlike with PPI, which must be carried out on a trial-and-error basis. Fourth,
FIPPI implements a replacement rule to refine each iteration and is
terminated when a set of final endmembers, {ēj}, is identified via an implemented
stopping rule. Fifth and most importantly, FIPPI is fully automated and requires no
human supervision, unlike PPI.

20.3 Recursive Hyperspectral Band Processing of FIPPI

The original FIPPI cannot be processed in a progressive band processing manner. In
this section, following an approach similar to that of extending PPI to RHBP-PPI
derived in Chap. 19, we also extend FIPPI to RHBP-FIPPI. The idea of
implementing RHBP-FIPPI is to use two loops specified by skewers and bands to
process FIPPI. Depending on which one is implemented in the outer loop, two
different versions of RHBP-FIPPI can be designed. When the outer loop is iterated
band by band, it is called RHBP-FIPPI (Sect. 20.3.1). When the outer loop is
iterated by growing skewers, it is called RSP-FIPPI (Sect. 20.3.2). Note that since
RHBP and RSP are implemented band by band, they do not require dimensionality
reduction like the original FIPPI.

20.3.1 Recursive Hyperspectral Band Processing of FIPPI

In what follows we describe a recursive algorithm that iterates the number of
skewers, denoted by nskewer, in the inner loop, indexed by the parameter k, while
iterating the number of the first l bands being used, denoted by nl in the outer loop,
indexed by the parameter l. The resulting algorithm is termed RHBP-FIPPI.

Recursive Hyperspectral Band Processing of FIPPI


Initial Conditions: Find nVD, the number of spectrally distinct signatures, using the
HFC method (1994), and let it be p = nVD.
Outer loop indexed by band parameter l from l = p to L
Inner loop indexed by the skewer parameter k
1. Initial condition: Let $\{\text{skewer}_{l,j}^{(0)}\}_{j=1}^{p}$ be an initial set of p skewers generated
from selecting those pixels that correspond to target pixels generated
by ATGP.
2. Normalize each $\text{skewer}_{l,j}^{(0)}$ in $\{\text{skewer}_{l,j}^{(0)}\}_{j=1}^{p}$ into a unit vector.
3. Iterative rule: At iteration k, for each $\text{skewer}_{l,j}^{(k)}$, all the sample vectors are
projected onto this particular $\text{skewer}_{l,j}^{(k)}$ to find those that are at its extreme
positions to form an extrema set, denoted by $S_{\text{extrema}}(\text{skewer}_{l,j}^{(k)})$. Find the
sample vectors that produce the largest $N_{\text{PPI}}(\mathbf{r}_{l,j}^{(k)})$ and denote them by $\{\mathbf{r}_{l,j}^{(k)}\}$.
4. Form the joint set
$$\big\{\text{skewer}_{l,j}^{(k+1)}\big\} = \Big\{\arg\max_{\mathbf{r}_{l,j}^{(k)}} N_{\text{PPI}}\big(\mathbf{r}_{l,j}^{(k)}\big)\Big\} \cup \big\{\text{skewer}_{l,j}^{(k)}\big\}. \quad (20.1)$$
5. If $\{\text{skewer}_{l,j}^{(k+1)}\} = \{\text{skewer}_{l,j}^{(k)}\}$, then no more new endmembers are added
to the skewer set. In this case, the inner loop is terminated and the data are
outputted; go to the outer loop. Otherwise, let $k \leftarrow k+1$, and go to step 2.
End Inner loop
Let $l \leftarrow l+1$.
End Outer loop
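The two-loop structure above can be sketched as follows, with the band loop outside and the growing skewer set inside. The index bookkeeping and helper names are illustrative, not from the book; skewer sets are tracked by pixel index for simplicity.

```python
import numpy as np

def rhbp_fippi(data, init_idx, p):
    """Structural sketch of RHBP-FIPPI: the outer loop walks over bands
    (l = p, ..., L) as they arrive in BSQ order; for each l, the inner
    loop grows the skewer set until it stabilizes on the first l bands.
    """
    N, L = data.shape
    results = {}
    for l in range(p, L + 1):                  # outer loop: first l bands only
        X = data[:, :l]
        cur = sorted(init_idx)                 # ATGP pixels seed each stage
        while True:                            # inner loop: grow skewer set
            S = X[cur] / np.linalg.norm(X[cur], axis=1, keepdims=True)
            proj = X @ S.T                     # (N, n_skewers) projections
            counts = np.zeros(N, dtype=int)
            np.add.at(counts, np.argmax(proj, axis=0), 1)
            np.add.at(counts, np.argmin(proj, axis=0), 1)
            winners = np.flatnonzero(counts == counts.max())
            joint = sorted(set(cur) | set(winners.tolist()))  # rule (20.1)
            if joint == cur:                   # skewer set unchanged: stop
                break
            cur = joint
        results[l] = np.flatnonzero(counts > 0)  # candidates at band stage l
    return results
```

Because the inner loop only ever adds indices, it must terminate; the per-band results are what produce the interband progressive profiles discussed below.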
Figure 20.2 depicts a flowchart of a detailed, step-by-step implementation of
RHBP-FIPPI.

Fig. 20.2 Flowchart of RHBP-FIPPI (pre-processing: find nVD = p and use ATGP to produce an initial set of p skewers; the recursive growing skewer set process is nested inside the progressive band process)



20.3.2 Recursive Skewer Processing of FIPPI

The following recursive algorithm swaps the inner loop and outer loop
implemented in RHBP-FIPPI by iterating the number of the first l bands being
used, denoted by nl in the inner loop and indexed by the parameter l recursively,
while iterating the number of skewers, denoted by nskewer, in the outer loop, indexed
by the parameter k progressively. Since the outer loop produces skewer sets
progressively, the resulting algorithm is called RSP-FIPPI and is described in
what follows.

Recursive Skewer Processing of FIPPI

Initial Conditions:
1. Find nVD, the number of spectrally distinct signatures, using the HFC method
(1994), and let it be p = nVD.
2. Let $\{\text{skewer}_{l,j}^{(0)}\}_{j=1}^{p}$ be an initial set of p skewers generated from selecting those
pixels that correspond to target pixels generated by ATGP.
3. Normalize each $\text{skewer}_{l,j}^{(0)}$ in $\{\text{skewer}_{l,j}^{(0)}\}_{j=1}^{p}$ into a unit vector.

Outer loop indexed by skewer set parameter k

• Inner loop indexed by band parameter l from l = p to L
1. Receive the information in band l.
2. Iterative rule: At iteration l, for each $\text{skewer}_{l,j}^{(k)}$, all sample vectors are projected
onto this particular $\text{skewer}_{l,j}^{(k)}$ to find those that are at its extreme positions to
form an extrema set, denoted by $S_{\text{extrema}}(\text{skewer}_{l,j}^{(k)})$. Find the sample vectors that
produce the largest $N_{\text{PPI}}(\mathbf{r}_{l,j}^{(k)})$ and let them be denoted by $\{\mathbf{r}_{l,j}^{(k)}\}$.
3. Form the joint set
$$\big\{\text{skewer}_{l,j}^{(k,*)}\big\} = \Big\{\arg\max_{\mathbf{r}_{l,j}^{(k)}} N_{\text{PPI}}\big(\mathbf{r}_{l,j}^{(k)}\big)\Big\} \cup \big\{\text{skewer}_{l,j}^{(k)}\big\}. \quad (20.3)$$
4. If $\{\text{skewer}_{l,j}^{(k,*)}\} = \{\text{skewer}_{l,j}^{(k)}\}$, then no more new endmembers are added to
the skewer set. In this case, the inner loop is terminated and the data are
outputted; go to the outer loop. Otherwise, let $l \leftarrow l+1$, and go to step 2.
• End Inner loop
Let $k \leftarrow k+1$.
End Outer loop
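Swapping the two loops gives the following structural sketch, in which the skewer set found at band l is carried over to band l + 1, and the outer loop stops once a full band sweep adds no new skewers. As with the previous sketch, the bookkeeping is illustrative rather than the book's exact implementation.

```python
import numpy as np

def rsp_fippi(data, init_idx, p, max_outer=10):
    """Sketch of the swapped-loop RSP-FIPPI: the outer loop (indexed by k)
    grows the skewer set progressively, while the inner loop sweeps bands
    l = p, ..., L recursively, reusing the skewer set from the previous band.
    """
    N, L = data.shape
    cur = sorted(init_idx)                     # skewer set carried throughout
    for _ in range(max_outer):                 # outer loop: skewer-set growth
        before = list(cur)
        for l in range(p, L + 1):              # inner loop: band by band
            X = data[:, :l]
            S = X[cur] / np.linalg.norm(X[cur], axis=1, keepdims=True)
            proj = X @ S.T
            counts = np.zeros(N, dtype=int)
            np.add.at(counts, np.argmax(proj, axis=0), 1)
            np.add.at(counts, np.argmin(proj, axis=0), 1)
            winners = np.flatnonzero(counts == counts.max())
            cur = sorted(set(cur) | set(winners.tolist()))  # rule (20.3)
        if cur == before:                      # full sweep added nothing: stop
            break
    return cur
```

In principle the outer loop could run indefinitely as new skewer sets keep arriving; here it is capped and stopped at the first sweep that adds nothing, mirroring the convergence behavior reported in the experiments.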

Fig. 20.3 Flowchart of RSP-FIPPI (pre-processing: find nVD = p and use ATGP to produce an initial set of p skewers; the recursive band process is nested inside the progressive skewer process)

Note that, unlike RHBP-FIPPI, which grows skewer sets recursively in the inner
loop for each fixed nl, RSP-FIPPI grows skewer sets recursively in the inner loop as
nl increases and in the meantime also uses previously generated skewer sets as its
new initial conditions in the outer loop to progressively produce new skewer sets
augmented from previously generated skewer sets. Thus, technically speaking, it
can run indefinitely as long as the outer loop is continuously executed. Furthermore,
since RSP-FIPPI also grows skewer sets recursively in the inner loop, it can also be
considered RHBP-FIPPI. Figure 20.3 depicts a flowchart of a detailed, step-by-step
implementation of RSP-FIPPI.
Interestingly, compared with RHBP-PPI derived in Chap. 19, RHBP-FIPPI and
RSP-FIPPI are much simpler and more easily implemented. A key difference
between RHBP-PPI and RHBP-FIPPI/RSP-FIPPI lies in their required prior
knowledge: RHBP-PPI needs to know the value of nskewer, whereas RHBP-FIPPI/
RSP-FIPPI require knowledge of nVD = p.
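What makes the band-by-band recursion cheap is that inner products decompose over bands: the projection of every sample onto a skewer over the first l bands obeys p_l = p_{l-1} + r_l s_l, so each newly received band costs one multiply-add per (sample, skewer) pair with no reprocessing of earlier bands. The sketch below verifies this update against the full-band projection; the fixed, unnormalized skewers are a simplifying assumption (the running norm needed for unit skewers updates the same way, via ||s||_l^2 = ||s||_{l-1}^2 + s_l^2).

```python
import numpy as np

def band_recursive_projections(data, skewers):
    """Accumulate sample-on-skewer projections one band at a time.

    data    : (N, L) samples; skewers : (m, L) fixed skewer vectors.
    Each pass of the loop consumes one newly received band (BSQ order)
    and applies the rank-1 update p_l = p_{l-1} + r_l * s_l.
    """
    N, L = data.shape
    p = np.zeros((N, skewers.shape[0]))
    for l in range(L):                            # bands arrive one at a time
        p += np.outer(data[:, l], skewers[:, l])  # recursive update
    return p                                      # equals data @ skewers.T
```

After the last band, the recursively accumulated matrix coincides exactly with the one-shot full-band projection, which is why RHBP-FIPPI reproduces FIPPI's result at l = L.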

20.4 Synthetic Image Experiments

The synthetic images shown in Fig. 20.4, which are also described in Figs. 1.14 and
1.15 of Chap. 1, are used to substantiate and validate the utility of RHBP-FIPPI in
applications.
Among the 25 panels are five 4 × 4 pure pixel panels for each row in the first
column, five 2 × 2 pure pixel panels for each row in the second column, five
2 × 2 mixed pixel panels for each row in the third column, and five 1 × 1 subpixel
panels for each row in both the fourth and fifth columns, where the mixed and
subpanel pixels were simulated according to the legends in Fig. 20.4. Thus, a total
of 100 pure pixels (80 in the first column and 20 in the second column), referred to
as endmember pixels, were simulated in the data by the five endmembers, A, B, C,
K, and M. The area marked “BKG” in the upper right corner of Fig. 20.4a was
selected to find its sample mean, that is, the average of all pixel vectors within the
area “BKG,” denoted by b and plotted in Fig. 20.4b, which was used to simulate the
background (BKG) for the image scene with a size of 200 × 200 pixels in Fig. 20.4.
The reason for this background selection was empirical, since the selected “BKG”
area seemed more homogeneous than other regions. Nevertheless, other areas could
also have been selected for the same purpose. This b-simulated image background
was further corrupted by additive noise to achieve a signal-to-noise ratio of 20:1,
which was defined in Harsanyi and Chang (1994) as a 50 % signature (i.e., reflectance/radiance)
divided by the standard deviation of the noise. Once the target
pixels and background are simulated, two types of target insertion can be designed
to simulate experiments for various applications.
There are two types of synthetic images used in experiments. The first type of
target insertion is target implantation (TI), which can be simulated by inserting
clean target panels into a clean image background plus additive Gaussian noise by
replacing their corresponding background pixels.

Fig. 20.4 Set of 25 panels simulated by A, B, C, K, and M (legend: 100 % purity; 50 % signal + 50 % any other four; 50 % signal + 50 % background; 25 % signal + 75 % background)



A second type of target insertion is target embeddedness (TE), which can also be
simulated by embedding clean target panels into a clean image background plus
additive Gaussian noise, by superimposing target pixels over the background pixels.
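The two insertion schemes differ only in how a target pixel meets the background: TI replaces the background pixel, TE adds to it. A hedged sketch follows, where the array shapes, function name, and the exact placement of the noise term are illustrative assumptions; the noise is scaled so that a 50 % signature divided by the noise standard deviation equals the SNR, per Harsanyi and Chang (1994).

```python
import numpy as np

def simulate_scene(background, target, position, snr=20.0, mode="TI", seed=0):
    """Insert one clean target signature into a background cube.

    background : (rows, cols, L) cube; target : (L,) signature.
    mode "TI" replaces the background pixel; mode "TE" superimposes
    the target on it. Additive Gaussian noise is applied to the scene.
    """
    rng = np.random.default_rng(seed)
    scene = background.astype(float).copy()
    r, c = position
    if mode == "TI":
        scene[r, c] = target                   # replace the background pixel
    else:                                      # "TE"
        scene[r, c] = scene[r, c] + target     # superimpose on the background
    sigma = 0.5 * float(np.mean(target)) / snr # 50 % signature / sigma = SNR
    return scene + rng.normal(0.0, sigma, scene.shape)
```

A TE panel pixel is therefore always brighter than its TI counterpart, since the background contribution is retained rather than overwritten.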

20.4.1 TI Experiments

Since there are five pure mineral signatures and one background signature, we assume that
nVD = 6. However, the value of nVD can also be estimated by the target-specified
virtual dimensionality (TSVD) developed in Chap. 4. Figure 20.5a shows the
endmember candidates found for TI by FIPPI with nVD = 6, where pixels
highlighted by red crosses are the spatial locations of the ground truth pixels and
pixels highlighted by cyan upper triangles are endmember candidates
found by FIPPI. Figure 20.5b shows the results of applying RHBP-FIPPI to TI; the
pixels highlighted by yellow circles are endmember candidates extracted by the
previous bands, and pixels highlighted by cyan upper triangles are the endmember
candidates extracted by the current band.
Now, let nl be the number of first l bands used by RHBP-FIPPI. Then all the
pixels marked by yellow circles in Fig. 20.5b were produced by RHBP-FIPPI using
nl with l < L, and the pixels marked by cyan upper triangles were produced by
RHBP-FIPPI using nL, that is, the entire set of full bands; these are exactly the
same as the pixels marked by cyan upper triangles in Fig. 20.5a. In other words,
RHBP-FIPPI produced the pixels marked by yellow circles in Fig. 20.5b that cannot
be found in Fig. 20.5a until RHBP-FIPPI reached the last band, in which case
RHBP-FIPPI used the entire full set of bands to produce an identical set of FIPPI-extracted
pixels, marked by cyan upper triangles in Fig. 20.5a, b. Accordingly,
RHBP-FIPPI can be interpreted as a slow-motion version of the one-shot-operation
FIPPI, where the target pixels circled in yellow represent additional endmember
candidates that were uncovered during the RHBP process but not found by FIPPI.
Furthermore, a comparison of panels a and b in Fig. 20.5 shows that
a total of 81 ground truth pixels can be extracted by RHBP-FIPPI, while only
43 ground truth pixels were picked up by FIPPI.

Fig. 20.5 Comparative study between FIPPI and RHBP-FIPPI applied to TI. (a) Endmember candidates found by FIPPI with nVD = 6. (b) Endmember candidates found by RHBP-FIPPI with nVD = 6
Figure 20.6a also plots the number of endmember candidates identified in each
band, and Fig. 20.6b plots the total number of distinct endmember candidates as the
number of processed bands, nl, is increased. As shown in Fig. 20.6b, a total of
218 distinct endmember candidates were extracted, and Fig. 20.6c plots the spatial
locations of these 218 endmember candidates in the TI scene. After 172 bands were
processed, no additional endmember candidates were extracted.

Fig. 20.6 Endmember candidates extracted by RHBP-FIPPI: (a) number of endmember candidates found by RHBP-FIPPI in each band; (b) number of distinct endmember candidates identified by RHBP-FIPPI versus number of processed bands, nl; (c) spatial locations of RHBP-FIPPI-extracted endmember candidates from TI

Fig. 20.7 Order of ground truth pixels extracted versus number of bands processed, nl
As noted in Fig. 20.5b, 81 ground truth pixels were extracted by RHBP-FIPPI as
endmember candidates. Figure 20.7 shows the order of the ground truth pixels
being extracted versus the number of bands processed, nl. It is clearly shown that
minerals A, C, and K could be extracted by nl = 6. After nl = 28, all five minerals
could be extracted as endmember candidates.
Figure 20.8 further shows the spatial locations of the endmember candidates
extracted by RHBP-FIPPI with nl starting from six bands. In this figure, the spatial
locations of the ground truth pixels, the endmember candidates extracted by the
previous bands, and the endmember candidates extracted by the current band are
highlighted by red crosses, yellow circles, and cyan upper triangles, respectively.
With nl = 6, minerals A, C, and K were extracted first, followed by M with nl = 15,
and, finally, mineral B, extracted by nl = 28. After the first 28 bands were processed,
all five minerals in the TI scene were extracted.
Upon applying RSP-FIPPI to TI, Fig. 20.9 shows the endmember candidates
extracted by RSP-FIPPI with nVD = 6. As shown in the figure, the spatial locations
of the ground truth pixels and the endmember candidates extracted by the current
band are highlighted by red crosses and cyan upper triangles, respectively, where
four of the minerals, A, C, K, and M, could be extracted by RSP-FIPPI.
Furthermore, RSP-FIPPI required only two iterations to terminate the outer loop.
Figure 20.10a plots the number of distinct endmember candidates extracted by
RSP-FIPPI versus the number of processed bands, nl, in each outer iteration.

Fig. 20.8 Endmember candidates identified by RHBP-FIPPI with different numbers of bands
processed: (a) 6 bands, (b) 15 bands, (c) 28 bands, (d) 189 bands (full bands)

As shown in Fig. 20.10b, after the first outer iteration, 82 endmember candidates
were extracted by RSP-FIPPI, after which no additional pixels were extracted as
endmembers. This is why the plot in Fig. 20.10b is flat.

Fig. 20.9 Endmember candidates extracted by RSP-FIPPI from TI

Fig. 20.10 Plots of number of distinct endmember candidates extracted by RSP-FIPPI versus number of bands processed, nl, in each iteration in outer loop. (a) First iteration. (b) Second iteration

20.4.2 TE Experiments

Experiments similar to those conducted for TI were also performed for TE. Once
again we set nVD = 6. Figure 20.11a, b shows endmember candidates extracted by
applying FIPPI and RHBP-FIPPI to TE, respectively, with nVD = 6, where pixels
highlighted by red crosses are the spatial locations of the ground truth pixels and
pixels highlighted by cyan upper triangles are endmember candidates found by
FIPPI. Figure 20.11b shows the results of applying RHBP-FIPPI to TE; the pixels
highlighted by yellow circles are endmember candidates extracted by the previous
bands, and pixels highlighted by cyan upper triangles are the endmember candidates
extracted by the current band. Furthermore, a total of 62 ground truth pixels were
extracted by RHBP-FIPPI compared with only 24 ground truth pixels extracted by
FIPPI.

Fig. 20.11 Comparative study between FIPPI and RHBP-FIPPI applied to TE. (a) Endmember candidates found by FIPPI with nVD = 6. (b) Endmember candidates found by RHBP-FIPPI with nVD = 6

Furthermore, Fig. 20.12a plots the number of endmember candidates identified
in each band, and Fig. 20.12b plots the total number of distinct endmember
candidates as the number of processed bands, nl, is increased. When nl ≥ 159, no
additional endmember candidates were extracted; a total of 196 distinct
endmember candidates were extracted, with their spatial locations shown in
Fig. 20.12c.
Figure 20.13 also plots the order of the ground truth pixels being extracted versus
the number of bands processed, nl, where minerals A and K were extracted using
only the first six bands, that is, nl = 6. Once nl ≥ 141, all five minerals were able to be
extracted as endmember candidates.
Figure 20.14 further shows the spatial locations of all endmember candidates
extracted by RHBP-FIPPI with different numbers of bands processed, nl, beginning
with six bands. In these figures, the spatial locations of the ground truth pixels, the
endmember candidates extracted by the previous bands, and the endmember
candidates extracted by the current band are highlighted by red crosses, yellow circles,
and cyan upper triangles, respectively. When nl = 6, minerals A and K were
extracted, followed by two more minerals, M, extracted by nl = 15, and B,
extracted by nl = 22 bands. Finally, C was the last mineral to be extracted, with
nl = 141.
Figure 20.15 shows the endmember candidates extracted by RSP-FIPPI from TE
using nVD = 6. As shown in the figure, the spatial locations of the ground truth
pixels and the endmember candidates extracted by the current band are highlighted
by red crosses and cyan upper triangles, respectively, where the three minerals A,
K, and M could be extracted by RSP-FIPPI, while minerals B and C were missed.

Fig. 20.12 Endmember candidates extracted by RHBP-FIPPI: (a) number of endmember candidates found by RHBP-FIPPI in each band; (b) number of distinct endmember candidates identified by RHBP-FIPPI versus number of processed bands, nl; (c) spatial locations of endmember candidates found by RHBP-FIPPI from TE

Furthermore, RSP-FIPPI required only two iterations to terminate the outer loop.
Figure 20.16a plots the number of distinct endmember candidates extracted by
RSP-FIPPI versus the number of processed bands, nl, in each outer iteration. As
shown in Fig. 20.16b, after the first outer iteration, 94 endmember candidates were
extracted by RSP-FIPPI, after which no additional pixels were extracted as
endmembers. This is why the plot in Fig. 20.16b is flat.

Fig. 20.13 Order of ground truth pixels extracted versus number of bands processed, nl

Fig. 20.14 Endmember candidates identified by RHBP-FIPPI with different numbers of bands
processed, nl: (a) 6 bands, (b) 15 bands, (c) 28 bands, (d) 141 bands, (e) 189 bands (full bands)

Fig. 20.15 Endmember candidates found in TE by RSP-FIPPI with nVD = 6

Fig. 20.16 Plots of number of distinct endmember candidates extracted by RSP-FIPPI versus number of bands processed, nl, in each iteration in outer loop. (a) First iteration. (b) Second iteration

20.5 Real Image Experiments

The HYperspectral Digital Imagery Collection Experiment (HYDICE) image
shown in Fig. 20.17a (Fig. 1.10a) was used in experiments to demonstrate the
applicability of RHBP-FIPPI. It has a size of 64 × 64 pixel vectors with 15 panels
and the ground truth map in Fig. 20.17b (Fig. 1.10b), where the ith panel signature,
denoted by pi, was generated by averaging the red panel center pixels in row i and
is shown in Fig. 20.17c (Fig. 1.11). According to the HYDICE image scene in
Fig. 20.17b, there are 19 R panel pixels arranged in a 5 × 3 matrix, which has
three panel pixels in row 1 and four panel pixels in each of rows 2–5.

Note that the only pure panel pixels present in the data scene in Fig. 20.17 are
those in column 1 (one panel pixel in row 1 and two panel pixels in each of the other
four rows) and column 2 (one panel pixel in each of the five rows), corresponding to
the five panel signatures, p1, p2, p3, p4, and p5, in Fig. 20.17c.

Fig. 20.17 (a) HYDICE panel scene containing 15 panels; (b) ground truth map of spatial locations of the 15 panels; (c) spectra of p1, p2, p3, p4, and p5

According to Chang (2003), when nVD is estimated using the HFC method to be
9, only three panel pixels in five rows can be found. However, if nVD is increased to
18, then five panel pixels in each row can be found (Chang et al. 2010a, 2011b).
Therefore, in the following sections we show the results of experiments for RHBP-FIPPI
with two cases, nVD = 9 and nVD = 18.

Fig. 20.18 Comparison of endmember candidates extracted by FIPPI and RHBP-FIPPI using nVD = 9: (a) FIPPI, (b) RHBP-FIPPI

20.5.1 RHBP-FIPPI Experimental Results with nVD = 9

Figure 20.18a, b shows endmember candidates extracted from the HYDICE scene by FIPPI and RHBP-FIPPI using nVD = 9. As shown in Fig. 20.18b, the R panel pixels, the Y panel pixels, the endmember candidates extracted by previous bands, and the endmember candidates extracted by the current band are highlighted by red crosses, yellow crosses, magenta circles, and cyan upper triangles, respectively, where RHBP-FIPPI produced the same results as FIPPI when it reached the last band. In particular, many target pixels marked by magenta circles in Fig. 20.18b were extracted by RHBP-FIPPI as additional endmember candidates during band-by-band processing that had not been picked up by FIPPI. Specifically, compared to FIPPI, which extracted only three R panel pixels, p11, p312, and p521, RHBP-FIPPI extracted two additional R panel pixels, p221 and p311, plus one Y panel pixel, p212, which had been missed by FIPPI.
Figure 20.19a further plots the number of endmember candidates extracted by each band, and Fig. 20.19b shows the cumulative distinct endmember candidates extracted by RHBP-FIPPI as the number of processed bands, nl, is increased. As shown in Fig. 20.19b, a total of 57 distinct target pixels were extracted as endmember candidates, with their spatial locations shown in Fig. 20.19c. When nl ≥ 94, no additional endmember candidates were extracted.
As noted in Fig. 20.18b, five R panel pixels and one Y panel pixel were extracted as endmember candidates by RHBP-FIPPI. Figure 20.20 plots the order of these extracted panel pixels versus the number of bands processed, nl. It clearly shows that p521 was the first R panel pixel extracted, using only nl = 18 bands. After nl ≥ 54, all six panel pixels were eventually extracted by RHBP-FIPPI as endmember candidates.


Fig. 20.19 Endmember candidates extracted by RHBP-FIPPI: (a) number of endmember candi-
dates found by RHBP-FIPPI in each band; (b) number of distinct endmember candidates identified
by RHBP-FIPPI versus number of processed bands, nl; (c) spatial locations of RHBP-FIPPI-
extracted endmember candidates from HYDICE scene

Figure 20.21a–h further shows the spatial locations of the endmember candidates identified by RHBP-FIPPI with different numbers of bands processed, nl, starting from the first nine bands. In these figures, the spatial locations of the ground truth pixels, the endmember candidates extracted by the previous bands, and the endmember candidates extracted by the current band are highlighted by red crosses, yellow circles, and cyan upper triangles, respectively. With nl = 18 bands processed, the R panel pixel p521 was first extracted, followed by p221

Fig. 20.20 Orders of ground truth pixels extracted versus number of bands processed, nl

with nl = 27 bands being processed. After nl ≥ 54 bands, all six panel pixels were eventually extracted by RHBP-FIPPI and no additional panel pixels were extracted.

20.5.2 RHBP-FIPPI Experimental Results with nVD = 18

Experiments similar to those presented in Sect. 20.5.1 for nVD = 9 were also conducted for the HYDICE scene using nVD = 18. Figure 20.22a, b shows endmember candidates extracted from the HYDICE scene by FIPPI and RHBP-FIPPI using nVD = 18, respectively, where FIPPI extracted four R panel pixels, p11, p311, p411, and p521, plus one Y panel pixel, p212, versus six R panel pixels, p11, p221, p311, p312, p411, and p521, plus one Y panel pixel, p212, extracted by RHBP-FIPPI; the R panel pixels, Y panel pixels, endmember candidates extracted by previous bands, and endmember candidates extracted by the current band are highlighted by red crosses, yellow crosses, magenta circles, and cyan upper triangles, respectively. The figure also shows that RHBP-FIPPI produced the same results as FIPPI when it reached the last band. Figure 20.18a shows that FIPPI extracted three R panel pixels, p11, p312, and p521, using nVD = 9, whereas FIPPI using nVD = 18 extracted four R panel pixels, of which two, p11 and p521, were identical, one, p311, replaced the p312 found with nVD = 9, and one more, p411, was newly extracted. In particular, many target

Fig. 20.21 Endmember candidates identified by RHBP-FIPPI with different numbers of bands processed, nl: (a) 9 bands, (b) 18 bands, (c) 27 bands, (d) 36 bands, (e) 48 bands, (f) 49 bands, (g) 54 bands, (h) 169 bands (full bands)

pixels marked by magenta circles in Fig. 20.22b were extracted by RHBP-FIPPI as


additional endmember candidates during the interband processing that had not been
picked up by FIPPI. Specifically, RHBP-FIPPI extracted two additional R panel
pixels, p221 and p312, that had been missed by FIPPI.

Fig. 20.22 Comparison of endmember candidates extracted by FIPPI and RHBP-FIPPI using nVD = 18: (a) FIPPI, (b) RHBP-FIPPI

Figure 20.23a further plots the number of endmember candidates extracted by each band, and Fig. 20.23b shows the cumulative distinct endmember candidates extracted by RHBP-FIPPI as the number of processed bands, nl, is increased. As shown in Fig. 20.23b, a total of 101 distinct target pixels were extracted as endmember candidates, and their spatial locations are shown in Fig. 20.23c. When nl ≥ 97, no more endmember candidates were extracted.
As noted in Fig. 20.22b, six R panel pixels and one Y panel pixel were extracted as endmember candidates by RHBP-FIPPI. Figure 20.24 plots the order of these extracted panel pixels versus the number of bands processed, nl. It clearly shows that p521 was the first R panel pixel extracted, using only nl = 18 bands. When nl ≥ 97, all seven panel pixels were eventually extracted by RHBP-FIPPI as endmember candidates.
Figure 20.25a–g further shows the spatial locations of the endmember candidates identified by RHBP-FIPPI with different numbers of bands processed, nl, starting from the first 18 bands. In these figures, the spatial locations of the ground truth pixels, the endmember candidates extracted by the previous bands, and the endmember candidates extracted by the current band are highlighted by red crosses, yellow circles, and cyan upper triangles, respectively. When RHBP-FIPPI began with nl = 18, the R panel pixel p521 was immediately extracted. It was followed by p212 with nl = 26 bands being processed. When nl ≥ 97 bands, all six R panel pixels were eventually extracted by RHBP-FIPPI, and no additional panel pixels were extracted.


Fig. 20.23 Results of RHBP-FIPPI: (a) number of skewers found by RHBP-FIPPI in each band;
(b) number of distinct skewers identified by RHBP-FIPPI versus number of processed bands; (c)
spatial locations of endmember candidates found by RHBP-FIPPI from HYDICE scene

20.5.3 RSP-FIPPI Experimental Results with nVD = 9

Figure 20.26 plots the RSP-FIPPI results applied to the HYDICE scene using nVD = 9. As shown in the figure, the spatial locations of the ground truth pixels and the endmember candidates extracted by the current band are highlighted by red crosses and cyan upper triangles, respectively, where three R panel pixels were extracted as endmember candidates by RSP-FIPPI.

Fig. 20.24 Orders of ground truth pixels extracted versus number of bands processed, nl

Furthermore, RSP-FIPPI required only two iterations to terminate the outer loop.
Figure 20.27a plots the number of distinct endmember candidates extracted by
RSP-FIPPI versus the number of processed bands, nl, in each outer iteration. As
shown in Fig. 20.27b, after the first outer iteration, 13 endmember candidates were
extracted by RSP-FIPPI, and then no additional pixels were extracted as
endmembers. This is why the plot in Fig. 20.27b is flat.

20.5.4 RSP-FIPPI Experimental Results with nVD = 18

Figure 20.28 plots the RSP-FIPPI results applied to the HYDICE scene using nVD = 18. As shown in the figure, the spatial locations of the ground truth pixels and the endmember candidates extracted by the current band are highlighted by red crosses and cyan upper triangles, respectively, where four R panel pixels and one Y panel pixel were extracted as endmember candidates by RSP-FIPPI.
Furthermore, RSP-FIPPI required only two iterations to terminate the outer loop.
Figure 20.29a plots the number of distinct endmember candidates extracted by
RSP-FIPPI versus the number of processed bands, nl, in each outer iteration. As
shown in Fig. 20.29b, after the first outer iteration, 21 endmember candidates were
extracted by RSP-FIPPI, and then no additional pixels were extracted as
endmembers. This is why the plot in Fig. 20.29b is flat.

Fig. 20.25 Endmember candidates identified by RHBP-FIPPI with different numbers of bands processed, nl: (a) 18 bands, (b) 26 bands, (c) 27 bands, (d) 46 bands, (e) 48 bands, (f) 97 bands, (g) 169 bands (full bands)

Finally, Table 20.1 tabulates the number of panel pixels and the number of endmember candidates extracted by FIPPI, RHBP-FIPPI, and RSP-FIPPI using nVD = 9 and 18.

Fig. 20.26 Endmember candidates found by RSP-FIPPI using nVD = 9


Fig. 20.27 Plots of number of distinct endmember candidates extracted by RSP-FIPPI versus
number of bands processed, nl, in each iteration in outer loop. (a) First iteration. (b) Second
iteration

Fig. 20.28 Endmember candidates found by RSP-FIPPI using nVD = 18


Fig. 20.29 Plots of number of distinct endmember candidates extracted by RSP-FIPPI versus
number of bands processed, nl, in each iteration in outer loop. (a) First iteration. (b) Second
iteration

Table 20.1 Number of panel pixels and number of endmember candidates extracted by FIPPI, RHBP-FIPPI, and RSP-FIPPI using nVD = 9 and 18

          FIPPI                       RHBP-FIPPI                  RSP-FIPPI
nVD   Panel pixels  Endmember     Panel pixels  Endmember     Panel pixels  Endmember
      extracted     candidates    extracted     candidates    extracted     candidates
 9    3             10            6             57            3             13
18    5             19            7             101           5             21

20.6 Conclusions

PPI has been around for more than two decades since being first proposed by
Boardman (1994). Despite several issues in its practical implementation, as pointed
out in the introduction and in Chang (2013, 2016), it continues to be one of the most
popular endmember finding algorithms thanks to its availability in the ENVI
software. However, the effectiveness of PPI is largely determined by the number
of skewers, K, used to calculate PPI counts for data sample vectors. To address this
issue, two approaches were investigated. One is IPPI, developed by Chang and Wu (2015) [see Chap. 12 in Chang (2016)]. Chapter 19 presents several progressive versions of IPPI, called progressive hyperspectral band processing of IPPI (PHBP-IPPI), which vary with the bands rather than the skewer sets; PHBP-IPPI is then further extended to recursive hyperspectral band processing of IPPI (RHBP-IPPI). The other is the
algorithm developed by Chang and Plaza (2006), called FIPPI, which allows users
to process PPI progressively by growing skewer sets instead of fixing the number of
skewers at K as PPI does. As a result, potential endmember candidates can be
extracted according to varying skewer sets. Using an idea similar to that used in
Chap. 19 to derive RHBP-IPPI, this chapter extended FIPPI to RHBP-FIPPI, as well
as RSP-FIPPI.
Chapter 21
Conclusions

Abstract Writing a book has the great advantage over writing journal articles in
the sense that the former can set the tone and agenda for carrying out what the
author wants to deliver, as opposed to the latter, which must sacrifice certain things
to accommodate reviewers’ comments. This book comes from a long journey that
started with the author’s first book, Hyperspectral Imaging: Spectral Techniques for
Detection and Classification, published by Kluwer Academic Publishers in 2003
(Chang 2003), followed by a second book, Hyperspectral Data Processing: Algo-
rithm Design and Analysis, published by Wiley in 2013 (Chang 2013), and a third
book, Real Time Progressive Hyperspectral Image Processing: Endmember Find-
ing and Anomaly Detection, published by Springer in 2016 (Chang 2016). The
series finally concludes with this book, Real-Time Recursive Hyperspectral Sample
and Band Processing: Algorithm Architecture and Implementation, to complete a
cycle that accomplishes exactly what the author believes is important and derived
from research conducted in the Remote Sensing Signal and Image Processing
Laboratory (RSSIPL) at the University of Maryland, Baltimore County (UMBC).
The subjects covered in this book focus mainly on the design and development of real-time recursive hyperspectral sample and band processing algorithms for different data acquisition formats: band-interleaved-by-sample (BIS) and band-interleaved-by-pixel (BIP) (Fig. 21.1), band-interleaved-by-line (BIL), and band-sequential (BSQ) (Fig. 21.2), so that data can be processed progressively and recursively at the same time that data streams are being collected, either sample by sample or line by line according to BIS/BIP and BIL, respectively, or band by band according to BSQ. As a consequence, the designed algorithms can be implemented recursively and progressively in a causal and real-time manner and will have great potential for future hyperspectral imaging sensors operated in space for data communication and transmission.

21.1 Introduction

In the early days, when multispectral imagery was used for remote sensing image
processing, research was directed to land-cover/use applications where spatial
domain-based methods provided the necessary analysis tools, such as geographical


Fig. 21.1 Hyperspectral imagery acquired by BIP/BIS format

Fig. 21.2 Hyperspectral imagery acquired by BSQ format

information system (GIS). However, with the advent of hyperspectral imaging sensors, applications have shifted away from spatial domain-based (literal) analyses to spectral-based (nonliteral) analyses (Chang 2003a), where endmember extraction/finding, subpixel target detection, and mixed pixel classification/quantification, which are generally overlooked in multispectral imaging, have become major driving forces in hyperspectral imaging applications. Chang's (2003a) book is believed to be the first book of its kind ever published to look into spectral (nonliteral) detection and classification for subpixel targets and mixed pixel samples from a statistical signal processing point of view, where many statistical signal processing techniques were developed in Chang (2003) to cope with subpixel and mixed pixel issues, such as the constrained energy minimization (CEM) subpixel

target detection technique, the automatic target generation process (ATGP) for
unsupervised target finding, the Reed–Xiaoli technique developed for an anomaly
detector along with its variants, such as the K-based anomaly detector (K-AD) and
R-based anomaly detector (R-AD), orthogonal subspace projection (OSP)-based
mixed pixel classification, fully constrained least-squares (FCLS) mixed pixel
quantification, and virtual dimensionality (VD) for estimating the number of spectrally distinct signatures, all of which were cutting-edge techniques when they first came out and have since become well-known techniques in hyperspectral imaging.
Ten years later, Chang (2013) continued his effort from Chang (2003) by including
new developments in hyperspectral data processing such as endmember extraction,
spectral information compression, and one-dimensional hyperspectral signal
processing plus applications to multispectral imaging and medical imaging in
magnetic resonance imaging (MRI). This book can be considered a sequel to
Chang (2003) and is also believed to be the first comprehensive book to cover
these topics extensively. In particular, endmember extraction, which at the time
attracted considerable interest in hyperspectral data exploitation, has been covered in great detail, where several now well-known algorithms, such as SeQuential
N-FINDR (SQ N-FINDR), SuCcessive N-FINDR (SC N-FINDR), and simplex
growing algorithm (SGA), have been developed. Most recently, Chang (2016) shifted gears to focus on real-time processing of hyperspectral imaging algorithms, where most endmember extraction algorithms developed in Chang (2013) have been rederived as endmember finding algorithms (EFAs) and real-time processing algorithms, implemented sequentially or progressively in a causal manner when no assumption is made about the presence of endmembers. Specifically, techniques developed for anomaly detection in Chang (2003a), along with their versions using local windows, are also revisited from a real-time processing perspective. The
current book can be considered a companion book of Chang (2016), where many
techniques presented in Chang (2016) are rederived as recursive hyperspectral
sample processing (RHSP) algorithms as well as recursive hyperspectral band
processing (RHBP) algorithms. Although theoretically the sequential and progres-
sive algorithms developed in Chang (2016) can be implemented in real time, it is
practically not realistic owing to the high computational complexity caused by
repeatedly calculating and updating causal signature matrices and causal covari-
ance/correlation band matrices. This is very similar to the same issue encountered
in statistical signal processing, where causal Wiener processing can be theoretically
implemented in real time, but technically it cannot because of its requirement to
process all previously visited data samples. To resolve this issue, Kalman filtering is
developed as a real-time causal process that can update results by recursive
equations via only the processed information obtained from previous data sample
vectors and innovations information obtained by new incoming data sample vectors
without reprocessing all data samples. This book takes advantage of this idea and rederives the real-time processing algorithms developed in Chang (2016) from the viewpoint of Kalman filtering as their recursive counterparts. Depending on the information used for updating, two types of information processing are of interest: causal sample processing (Parts II and III) and causal band processing (Parts IV and V).

Note that progressive and recursive processes are two completely different concepts. Technically speaking, a progressive process concerns how an algorithm is implemented and carried out progressively. By contrast, a recursive process is one in which an algorithm takes advantage of its recursive algorithmic architecture to save data storage space and reduce computational complexity. With this interpretation, a progressive process can be considered a special type of data processing according to its implementation, while a recursive process can be viewed as a process of utilizing its algorithmic structure according to recurrence relations.
Finally, for those interested in hyperspectral imaging, Chang (2003), along with
the fundamentals in Chaps. 1–6 of Chang (2013) and Chaps. 1–5 of Chang (2016),
plus Chaps. 1–4 in this book, constitutes a very good collection of materials that
provide readers with a solid background in hyperspectral imaging algorithm design
and that will allow them to pursue advanced research in this fascinating area.

21.2 Recursive Hyperspectral Sample Processing

Recursive hyperspectral sample processing (RHSP) arises from the need to process data collected in the BIS/BIP format in real time (Fig. 21.1), where the same data processing must be repeatedly implemented each time new data sample streams come in during data acquisition.
Since data sample vectors that have already been visited and processed remain the same during the entire course of data processing, they need not be reprocessed over and over again and should be processed only when they are first visited. To address this issue, we need to look into how the algorithms are developed. In particular, two types of sample information are used in algorithm design: sample spectral correlation-based information and signature spectral correlation-based information.

21.2.1 Part II: Sample Spectral Statistics-Based Information

The first type of information is the sample spectral correlation provided by sample spectral statistics; it must be causal and vary with the data sample vectors being processed, such as the sample covariance matrix (K) or sample correlation matrix (R). Two major target detection modes, active hyperspectral target detection by CEM (Chap. 5) and passive hyperspectral target detection by anomaly detectors (Chap. 6), are of interest in this book; both of these modes make use of sample spectral statistics to adapt to data sample vectors to perform sample-varying background suppression. More specifically, assume that $\mathbf{r}_n$ is the data sample vector currently being visited. Then $\{\mathbf{r}_i\}_{i=1}^{n-1}$ represents the set of data samples already visited and processed, and (5.9) in Chap. 5 defines the causal sample correlation matrix (CSCRM) changing with $\mathbf{r}_n$, denoted by $\mathbf{R}(n)$, as

$$\mathbf{R}(n) = (1/n)\sum\nolimits_{i=1}^{n}\mathbf{r}_i\mathbf{r}_i^{T}, \qquad (21.1)$$

whose inverse can be calculated as $\mathbf{R}^{-1}(n) = \left[((n-1)/n)\mathbf{R}(n-1) + (1/n)\mathbf{r}_n\mathbf{r}_n^{T}\right]^{-1}$ through $\mathbf{R}^{-1}(n-1)$ and $\mathbf{r}_n$ without recalculating the previous $n-1$ already visited data sample vectors, $\{\mathbf{r}_i\}_{i=1}^{n-1}$. As a result, $\mathbf{R}^{-1}(n)$ can be further shown by (5.12) to be

$$\mathbf{R}^{-1}(n) = \left[(1-1/n)\mathbf{R}(n-1)\right]^{-1} - \frac{\left[(1-1/n)\mathbf{R}(n-1)\right]^{-1}\left\{(1/\sqrt{n})\mathbf{r}_n\right\}\left\{(1/\sqrt{n})\mathbf{r}_n^{T}\right\}\left[(1-1/n)\mathbf{R}(n-1)\right]^{-1}}{1 + (1/\sqrt{n})\mathbf{r}_n^{T}\left[(1-1/n)\mathbf{R}(n-1)\right]^{-1}(1/\sqrt{n})\mathbf{r}_n}, \qquad (21.2)$$

where $\mathbf{R}^{-1}(n)$ can be updated by the processed information term specified by $\left[(1-1/n)\mathbf{R}(n-1)\right]^{-1}$ in the first term of (21.2), with the second term of (21.2) serving as the innovations information term.
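As a numerical sanity check, (21.2) is the Sherman–Morrison identity applied to $\mathbf{R}(n) = (1-1/n)\mathbf{R}(n-1) + (1/n)\mathbf{r}_n\mathbf{r}_n^{T}$. The NumPy sketch below (function and variable names are illustrative, not from the book) performs the rank-one update and can be verified against direct inversion:

```python
import numpy as np

def update_inverse_correlation(R_inv_prev, r, n):
    """Update R^{-1}(n) from R^{-1}(n-1) and the new sample r_n via (21.2):
    a single rank-one Sherman-Morrison correction, no full re-inversion."""
    A_inv = R_inv_prev / (1.0 - 1.0 / n)      # [(1 - 1/n) R(n-1)]^{-1}
    u = r / np.sqrt(n)                        # (1/sqrt(n)) r_n
    Au = A_inv @ u
    return A_inv - np.outer(Au, Au) / (1.0 + u @ Au)

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 4))              # 50 samples, 4 bands
n0 = 10                                       # seed the recursion with a small batch
R_inv = np.linalg.inv(X[:n0].T @ X[:n0] / n0)
for n in range(n0 + 1, 51):                   # recursive updates for samples 11..50
    R_inv = update_inverse_correlation(R_inv, X[n - 1], n)
# R_inv now matches the directly inverted full correlation matrix R(50)
```

The per-sample cost is O(L²) instead of the O(L³) of a fresh inversion, which is precisely the point of the recursion.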

21.2.1.1 Real-Time CEM

By taking advantage of (21.2), a real-time processing version of CEM (RT-CEM), $\delta_n^{\text{RT-CEM}}(\mathbf{r}_n)$, varying with the current data sample vector $\mathbf{r}_n$, can therefore be derived by (5.13) as follows:

$$\delta_n^{\text{RT-CEM}}(\mathbf{r}_n) = (1-1/n)^{-1}\kappa(n)\,\delta_{n-1}^{\text{RT-CEM}}(\mathbf{r}_n) - \kappa(n)\,\frac{\left((1-1/n)\sqrt{n}\right)^{-2}\left[\mathbf{d}^{T}\mathbf{R}^{-1}(n-1)\mathbf{r}_n\right]\left[\mathbf{r}_n^{T}\mathbf{R}^{-1}(n-1)\mathbf{r}_n\right]}{1 + \left((1-1/n)n\right)^{-1}\mathbf{r}_n^{T}\mathbf{R}^{-1}(n-1)\mathbf{r}_n}. \qquad (21.3)$$

As we can see from (21.3), $\delta_n^{\text{RT-CEM}}(\mathbf{r}_n)$ only needs to be updated recursively by (a) the processed information specified by $\delta_{n-1}^{\text{RT-CEM}}(\mathbf{r}_n)$ and $\mathbf{R}^{-1}(n-1)$, and (b) the new incoming data sample vector $\mathbf{r}_n$, with $\kappa(n) = \left(\mathbf{d}^{T}\mathbf{R}^{-1}(n)\mathbf{d}\right)^{-1}$.
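In code, the effect of (21.3) can be obtained by updating $\mathbf{R}^{-1}(n)$ with the Sherman–Morrison step of (21.2) and re-evaluating the CEM filter $\kappa(n)\,\mathbf{d}^{T}\mathbf{R}^{-1}(n)\mathbf{r}_n$; the sketch below takes that equivalent route rather than transcribing (21.3) literally, and its names are illustrative, not from the book:

```python
import numpy as np

def rt_cem(samples, d, n0=8):
    """Causal RT-CEM sketch: for each new sample r_n, update R^{-1}(n) by the
    rank-one step of (21.2) and emit kappa(n) d^T R^{-1}(n) r_n, never
    re-reading past samples. The first n0 samples seed the initial inverse."""
    X0 = samples[:n0]
    R_inv = np.linalg.inv(X0.T @ X0 / n0)
    out = []
    for n in range(n0 + 1, len(samples) + 1):
        r = samples[n - 1]
        A_inv = R_inv / (1.0 - 1.0 / n)       # processed information
        u = r / np.sqrt(n)                    # innovations information
        Au = A_inv @ u
        R_inv = A_inv - np.outer(Au, Au) / (1.0 + u @ Au)
        kappa = 1.0 / (d @ R_inv @ d)         # kappa(n) = (d^T R^{-1}(n) d)^{-1}
        out.append(kappa * (d @ R_inv @ r))
    return np.array(out)
```

Each emitted score agrees with the batch CEM detector built from all samples seen so far, which is what makes the filter causal and real time.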

21.2.1.2 Real-Time AD

In analogy with (21.3), Chap. 6 also utilizes the same recursive equation for $\mathbf{R}^{-1}(n)$ specified by (21.2) to derive the following recursive real-time version of anomaly detection (RT-AD) in (6.20), which can be implemented in a causal manner in a rather simple form:

$$\delta^{\text{RT-CR-AD}}(\mathbf{r}_n) = (1-1/n)^{-1}\mathbf{r}_n^{T}\left[\mathbf{R}(n-1)\right]^{-1}\mathbf{r}_n - \frac{\left((1-1/n)\sqrt{n}\right)^{-2}\left\{\mathbf{r}_n^{T}\left[\mathbf{R}(n-1)\right]^{-1}\mathbf{r}_n\right\}\left\{\mathbf{r}_n^{T}\left[\mathbf{R}(n-1)\right]^{-1}\mathbf{r}_n\right\}}{1 + \left((1-1/n)n\right)^{-1}\mathbf{r}_n^{T}\left[\mathbf{R}(n-1)\right]^{-1}\mathbf{r}_n}, \qquad (21.4)$$

where $\delta^{\text{RT-CR-AD}}(\mathbf{r}_n)$ only needs to be updated by (a) the processed information $\mathbf{R}^{-1}(n-1)$ and (b) the new incoming data sample vector $\mathbf{r}_n$.
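Because every quantity in (21.4) is the same scalar quadratic form $q = \mathbf{r}_n^{T}\mathbf{R}^{-1}(n-1)\mathbf{r}_n$, RT-AD reduces to a few flops per sample once $q$ is known. A hedged sketch (illustrative names):

```python
import numpy as np

def rt_ad_score(R_inv_prev, r, n):
    """RT-AD per (21.4): the score r_n^T R^{-1}(n) r_n computed from the
    processed information R^{-1}(n-1) and the new sample r_n alone."""
    q = r @ (R_inv_prev @ r)                  # r_n^T R^{-1}(n-1) r_n
    c = 1.0 / (1.0 - 1.0 / n)                 # (1 - 1/n)^{-1}
    # c**2 / n is the ((1-1/n)sqrt(n))^{-2} coefficient of (21.4);
    # c / n is its ((1-1/n)n)^{-1} denominator coefficient.
    return c * q - (c ** 2 / n) * q * q / (1.0 + (c / n) * q)
```

The returned value equals the anomaly score computed with the fully updated $\mathbf{R}^{-1}(n)$, so no matrix need be stored beyond the previous inverse.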

21.2.2 Part III: Signature Spectral Statistics-Based Information

Unlike the first type of sample spectral statistics-based information discussed earlier, the second type of information is signature spectral statistics-based information provided by an undesired signature matrix $\mathbf{U}$, which must also be causal to accommodate newly found desired signatures that can be further annihilated by $\mathbf{P}^{\perp}_{\mathbf{U}}$ given by (4.19) so as to increase target detectability. The algorithms using this type of information include ATGP, OSP, and linear spectral mixture analysis (LSMA).

21.2.2.1 RHSP-ATGP
 
Assume that Up1 ¼ t1   tjþ1   tp1 is a matrix of size L  ðp  1Þ made up of
p–1 previously found targets. Also, let tp be a new L-dimensional
  target vector to be
added to form a new target matrix, Up ¼ Up1 tp ¼ t1 t2   tp1 tp . Then,
according to (7.4)–(7.7), P⊥Up can be updated by (a) the processed information
P⊥Up1 ; (b) new information tp; and (c) innovations information uep1 ¼ Up1 U#p1 tp ,
n h i o1
ep1 ¼ u
tpT u ep1
T
tp , and β ¼ tpT P⊥
Up1 tp via (7.8) as follows:

P⊥ ⊥
Up ¼ I  Up Up ¼ PUp1  β u
#
ep1 u
ep1
T
 2e
u p1 tpT þ tp tpT
  T ð21:5Þ
¼ P⊥Up1  β u ep1  tp u ep1  tp :

By virtue of (21.5), ATGP can be implemented recursively to find new targets


without reprocessing the signature-varying matrix Up as RHSP-ATGP.
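A sketch of the projector update in (21.5): since $\tilde{\mathbf{u}}_{p-1} - \mathbf{t}_p = -\mathbf{P}^{\perp}_{\mathbf{U}_{p-1}}\mathbf{t}_p$, the correction is the normalized outer product of the residual. The toy RHSP-ATGP loop below (function names are illustrative, not the book's implementation) never inverts a matrix:

```python
import numpy as np

def update_perp_projector(P_perp, t):
    """Grow P_perp = I - U U^# by one column t via (21.5); the residual
    e = P_perp t equals t - u_tilde, and beta = (t^T P_perp t)^{-1}."""
    e = P_perp @ t
    return P_perp - np.outer(e, e) / (t @ e)

def rhsp_atgp(data, p):
    """Toy RHSP-ATGP: repeatedly pick the sample with maximal residual norm
    and fold it into P_perp recursively; no matrix inversion is performed."""
    P_perp = np.eye(data.shape[1])
    found = []
    for _ in range(p):
        scores = np.einsum('ij,jk,ik->i', data, P_perp, data)  # r^T P_perp r
        idx = int(np.argmax(scores))
        found.append(idx)
        P_perp = update_perp_projector(P_perp, data[idx])
    return found, P_perp
```

After p steps the accumulated projector matches $\mathbf{I} - \mathbf{U}_p\mathbf{U}_p^{\#}$ built from the selected targets, which is what (21.5) guarantees.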

21.2.2.2 RHSP-OSP

As for OSP, we assume that the signature matrix $\mathbf{M}$ in (4.17) is formed by $p$ target signatures, $\mathbf{m}_1, \mathbf{m}_2, \ldots, \mathbf{m}_p$. Let $\mathbf{d} = \mathbf{m}_p$ be the desired spectral signature and $\mathbf{U}_{p-1} = \left[\mathbf{m}_1\ \mathbf{m}_2 \cdots \mathbf{m}_{p-1}\right]$ the undesired target spectral signature matrix made up of $\mathbf{m}_1, \mathbf{m}_2, \ldots, \mathbf{m}_{p-1}$, which are the spectral signatures of the remaining $p-1$ undesired targets. According to (4.20), the abundance fraction of the $p$th signature $\mathbf{m}_p$ detected by OSP is given by

$$\hat{\alpha}_p^{\text{OSP}}(\mathbf{r}) = \mathbf{d}^{T}\mathbf{P}^{\perp}_{\mathbf{U}_{p-1}}\mathbf{r}, \qquad (21.6)$$

which can be updated by (8.6) to be implemented as RHSP-OSP as follows:

$$\hat{\alpha}_p^{\text{RHSP-OSP}}(\mathbf{r}) = \mathbf{d}^{T}\mathbf{P}^{\perp}_{\mathbf{U}_p}\mathbf{r} = \mathbf{d}^{T}\mathbf{P}^{\perp}_{\mathbf{U}_{p-1}}\mathbf{r} - \beta_{p,p-1}\,\mathbf{d}^{T}\left(\tilde{\mathbf{m}}_{p,p-1} - \mathbf{m}_p\right)\left(\tilde{\mathbf{m}}_{p,p-1} - \mathbf{m}_p\right)^{T}\mathbf{r}, \qquad (21.7)$$

where $\beta_{p,p-1} = \left\{\mathbf{m}_p^{T}\mathbf{P}^{\perp}_{\mathbf{U}_{p-1}}\mathbf{m}_p\right\}^{-1} = \left\{\left[\mathbf{P}^{\perp}_{\mathbf{U}_{p-1}}\mathbf{m}_p\right]^{T}\left[\mathbf{P}^{\perp}_{\mathbf{U}_{p-1}}\mathbf{m}_p\right]\right\}^{-1}$ and $\tilde{\mathbf{m}}_{p,p-1} = \mathbf{U}_{p-1}\mathbf{U}^{\#}_{p-1}\mathbf{m}_p$. By means of (21.7), $\mathbf{P}^{\perp}_{\mathbf{U}_p}$ can be updated via $\mathbf{P}^{\perp}_{\mathbf{U}_{p-1}}$ by recursively including a correction term of $\beta_{p,p-1}\left(\tilde{\mathbf{m}}_{p,p-1} - \mathbf{m}_p\right)\left(\tilde{\mathbf{m}}_{p,p-1} - \mathbf{m}_p\right)^{T}$.

21.2.2.3 RHSP-LSMA

For LSMA we assume that the linear mixture model given by (9.1) is reexpressed as

$$\mathbf{r} = \mathbf{M}_{p+1}\boldsymbol{\alpha} + \mathbf{n}. \qquad (21.8)$$

Then the least-squares (LS) solution to (21.8) is given by

$$\hat{\boldsymbol{\alpha}}^{\text{LS}}(\mathbf{r}) = \left(\mathbf{M}_{p+1}^{T}\mathbf{M}_{p+1}\right)^{-1}\mathbf{M}_{p+1}^{T}\mathbf{r}, \qquad (21.9)$$

with

$$\left(\mathbf{M}_{p+1}^{T}\mathbf{M}_{p+1}\right)^{-1} = \begin{bmatrix} \mathbf{M}_p^{T}\mathbf{M}_p & \mathbf{M}_p^{T}\mathbf{m}_{p+1} \\ \mathbf{m}_{p+1}^{T}\mathbf{M}_p & \mathbf{m}_{p+1}^{T}\mathbf{m}_{p+1} \end{bmatrix}^{-1} = \begin{bmatrix} \left(\mathbf{M}_p^{T}\mathbf{M}_p\right)^{-1} + \beta\,\mathbf{M}_p^{\#}\mathbf{m}_{p+1}\mathbf{m}_{p+1}^{T}\left(\mathbf{M}_p^{\#}\right)^{T} & -\beta\,\mathbf{M}_p^{\#}\mathbf{m}_{p+1} \\ -\beta\,\mathbf{m}_{p+1}^{T}\left(\mathbf{M}_p^{\#}\right)^{T} & \beta \end{bmatrix}. \qquad (21.10)$$

Using (21.10), RHSP-LSMA can be implemented by

$$\hat{\boldsymbol{\alpha}}^{\text{RHSP-LSMA}}_{p+1}(\mathbf{r}) = \begin{bmatrix} \hat{\boldsymbol{\alpha}}^{\text{LS}}_{p}(\mathbf{r}) - \beta\,\mathbf{M}_p^{\#}\mathbf{m}_{p+1}\mathbf{m}_{p+1}^{T}\mathbf{P}^{\perp}_{\mathbf{M}_p}\mathbf{r} \\ \hat{\alpha}^{\text{LS}}_{p+1}(\mathbf{r}) \end{bmatrix} \qquad (21.11)$$

and $\beta = \left\{\mathbf{m}_{p+1}^{T}\left[\mathbf{I} - \mathbf{M}_p\left(\mathbf{M}_p^{T}\mathbf{M}_p\right)^{-1}\mathbf{M}_p^{T}\right]\mathbf{m}_{p+1}\right\}^{-1} = \left\{\mathbf{m}_{p+1}^{T}\mathbf{P}^{\perp}_{\mathbf{M}_p}\mathbf{m}_{p+1}\right\}^{-1}$.

As noted in (21.5), (21.6), and (21.10), the major hurdle in finding these solutions is inverting matrices, $\mathbf{U}_p\mathbf{U}_p^{\#}$ in (21.5) and $\mathbf{M}_p^{T}\mathbf{M}_p$ in (21.10). In many practical applications, exact and precise knowledge about $\mathbf{M}_p$ or $\mathbf{U}_p$ is lacking. In this case, varying the signatures in $\mathbf{M}_p$ and $\mathbf{U}_p$ to find the best possible solutions is necessary, and they must be obtained directly from the data while data processing is ongoing, which requires repeatedly inverting such varying matrices each time new data sample vectors come in. As expected, the computing time grows tremendously. Fortunately, the recursive equations (21.5) (derived for ATGP), (21.7) (derived for OSP), and (21.10) (derived for LSMA) not only avoid computing matrix inverses but also adapt to the varying signature matrices $\mathbf{M}_p$ and $\mathbf{U}_p$ by updating only the newly added signatures without recalculating signatures already processed.
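The block inverse (21.10) can be exercised directly. The sketch below (illustrative names, not from the book) grows $(\mathbf{M}_p^{T}\mathbf{M}_p)^{-1}$ to $(\mathbf{M}_{p+1}^{T}\mathbf{M}_{p+1})^{-1}$ using only products with the stored $p \times p$ inverse:

```python
import numpy as np

def grow_normal_inverse(M, MtM_inv, m_new):
    """Extend (M_p^T M_p)^{-1} to (M_{p+1}^T M_{p+1})^{-1} via (21.10):
    a = M_p^# m_{p+1} and beta = (m_{p+1}^T P_perp m_{p+1})^{-1} fill in the
    new row and column; no (p+1)x(p+1) inversion is performed."""
    a = MtM_inv @ (M.T @ m_new)               # M_p^# m_{p+1}
    beta = 1.0 / (m_new @ (m_new - M @ a))    # inverse Schur complement
    p = M.shape[1]
    out = np.empty((p + 1, p + 1))
    out[:p, :p] = MtM_inv + beta * np.outer(a, a)
    out[:p, p] = out[p, :p] = -beta * a
    out[p, p] = beta
    return out
```

The result agrees with brute-force inversion of the grown normal matrix, which is the property RHSP-LSMA relies on when signatures are appended one at a time.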

21.2.2.4 RHSP of Maximum Likelihood Estimation

The close relationship between maximum likelihood estimation (MLE) and OSP was first explored by Settle (1996) and Chang et al. (1998), where OSP and MLE were shown to produce identical linear spectral unmixing results under the assumption that the $\mathbf{n}$ in (21.8) is additive Gaussian noise. Under the same Gaussian noise assumption it was also shown in Chang (2013) that $\hat{\boldsymbol{\alpha}}^{\text{LS}}(\mathbf{r})$ in (21.9) produced by LSMA is indeed identical to $\hat{\boldsymbol{\alpha}}^{\text{MLE}}_{p+1}(\mathbf{r})$ given by

$$\hat{\boldsymbol{\alpha}}^{\text{MLE}}_{p+1}(\mathbf{r}) = \left(\mathbf{M}_{p+1}^{T}\mathbf{M}_{p+1}\right)^{-1}\mathbf{M}_{p+1}^{T}\mathbf{r} = \mathbf{M}^{\#}_{p+1}\mathbf{r}. \qquad (21.12)$$

Instead of using the least-squares error (LSE) criterion used by RHSP-LSMA to produce new virtual signatures (VSs), a recursive version of MLE, RHSP-MLE, uses the following recursive prediction error equation, given by

$$\operatorname{trace}\left\{\left(\mathbf{M}_{p+1}^{T}\mathbf{M}_{p+1}\right)^{-1}\right\} = \operatorname{trace}\left\{\left(\mathbf{M}_{p}^{T}\mathbf{M}_{p}\right)^{-1}\right\} + \left[\hat{\alpha}^{\text{OP}}_{p+1}\left(\mathbf{m}_{p+1}\right)\right]^{-2} + \left(\frac{\left\|\hat{\boldsymbol{\alpha}}^{\text{LS}}_{\langle\mathbf{M}_p\rangle}\left(\mathbf{m}_{p+1}\right)\right\|}{\left\|\mathbf{P}^{\perp}_{\mathbf{M}_p}\mathbf{m}_{p+1}\right\|}\right)^{2} \qquad (21.13)$$

to find a new VS, which can be obtained by solving

$$\mathbf{t}^{\text{RHSP-MLE}}_{p+1} = \arg\min_{\mathbf{m}_{p+1}}\left\{\left[\hat{\alpha}^{\text{OP}}_{p+1}\left(\mathbf{m}_{p+1}\right)\right]^{-2} + \left(\frac{\left\|\hat{\boldsymbol{\alpha}}^{\text{LS}}_{\langle\mathbf{M}_p\rangle}\left(\mathbf{m}_{p+1}\right)\right\|}{\hat{\alpha}^{\text{OP}}_{p+1}\left(\mathbf{m}_{p+1}\right)}\right)^{2}\right\} \qquad (21.14)$$

or, alternatively,

$$\mathbf{t}^{\text{RHSP-LS/OSP}}_{p+1} = \arg\max_{\mathbf{m}_{p+1}}\left\{\mathbf{m}_{p+1}^{T}\mathbf{P}^{\perp}_{\mathbf{M}_p}\mathbf{m}_{p+1}\right\}. \qquad (21.15)$$

21.2.2.5 RHSP of Growing Simplex Volume Analysis

On the other hand, as for finding the determinant-based SV (DSV) in (2.1), the matrix determinant must be calculated from the following equation:

$$\text{DSV}\left(\mathbf{e}_1, \ldots, \mathbf{e}_p\right) = \frac{\left|\det\begin{bmatrix} 1 & 1 & \cdots & 1 \\ \mathbf{e}_1 & \mathbf{e}_2 & \cdots & \mathbf{e}_p \end{bmatrix}\right|}{(p-1)!}, \qquad (21.16)$$

which involves significant computing time. It is even worse if (21.16) must also be implemented repeatedly to find the maximal DSV, as in N-FINDR and SGA. This is the main reason why many DSV-based EFAs suffer from serious issues in practical implementation. Chapters 11 and 12 resolve this issue by developing two different approaches based on the geometric SV (GSV).

One approach is recursive hyperspectral sample processing of OP-based SGA (RHSP-OPSGA), which transforms the problem of finding the maximal DSV into one of finding the maximal OP to the simplex formed by the previously found endmembers. Let $\mathbf{m}_1, \ldots, \mathbf{m}_p$ be $p$ endmembers and the hyperplane $\langle\tilde{\mathbf{U}}_{p-1}\rangle$ be generated by $\tilde{\mathbf{U}}_{p-1} = \left[\tilde{\mathbf{m}}_2\ \tilde{\mathbf{m}}_3 \cdots \tilde{\mathbf{m}}_{p-1}\right]$, with $\tilde{\mathbf{m}}_j = \mathbf{m}_j - \mathbf{m}_1$ for all $2 \le j \le p$ and $\tilde{\mathbf{m}}_p = \mathbf{m}_p - \mathbf{m}_1$. According to (11.22)–(11.28) in Chap. 11, the maximal OP $\tilde{\mathbf{m}}_p^{\perp}$ can be found by

$$\tilde{\mathbf{m}}_p^{\perp} = \left\|\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{p-1}}\tilde{\mathbf{m}}_p\right\|, \qquad (21.17)$$

where

$$\tilde{\mathbf{m}}_p = \arg\max_{\tilde{\mathbf{r}}}\left\|\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{p-1}}\tilde{\mathbf{r}}\right\| \qquad (21.18)$$

and

$$\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_p} = \mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{p-1}} - \beta\left(\tilde{\mathbf{u}}_{p-1} - \tilde{\mathbf{m}}_p\right)\left(\tilde{\mathbf{u}}_{p-1} - \tilde{\mathbf{m}}_p\right)^{T}, \qquad (21.19)$$

with $\tilde{\mathbf{u}}_{p-1} = \tilde{\mathbf{U}}_{p-1}\tilde{\mathbf{U}}^{\#}_{p-1}\tilde{\mathbf{m}}_p$ and $\beta = \left\{\tilde{\mathbf{m}}_p^{T}\mathbf{P}^{\perp}_{\tilde{\mathbf{U}}_{p-1}}\tilde{\mathbf{m}}_p\right\}^{-1}$. Using (21.18) we can simply calculate the GSV of the $p$-vertex simplex formed by the $p$ endmembers, $\mathbf{m}_1, \ldots, \mathbf{m}_p$, recursively by

$$\text{GSV}\left(\mathbf{m}_1, \ldots, \mathbf{m}_p\right) = (1/(p-1))\,\tilde{\mathbf{m}}_p^{\perp} \cdot \text{GSV}\left(\mathbf{m}_1, \ldots, \mathbf{m}_{p-1}\right). \qquad (21.20)$$

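The recursion (21.20) is essentially Gram–Schmidt: each new height is the residual norm of the new edge vector against the span of the earlier edges, and the GSV accumulates as a product of heights over $(p-1)$. A hedged sketch (illustrative names, not the book's implementation), which for vertices in general position agrees with the determinant formula (21.16):

```python
import numpy as np

def gsv_recursive(vertices):
    """GSV per (21.17)-(21.20): build the simplex one vertex at a time; the
    height is the Gram-Schmidt residual norm of the new edge m_p - m_1, so no
    growing determinant is ever evaluated."""
    m1 = vertices[0]
    basis, vol = [], 1.0
    for p, mp in enumerate(vertices[1:], start=2):
        e = mp - m1                           # edge vector m_tilde_p
        for u in basis:
            e = e - (u @ e) * u               # apply P_perp of earlier edges
        h = np.linalg.norm(e)                 # maximal-OP height of (21.17)
        vol *= h / (p - 1)                    # recursion (21.20)
        basis.append(e / h)
    return vol
```

For the unit triangle with vertices (0, 0), (1, 0), (0, 1) this returns the classic area 0.5, and for a random tetrahedron it matches the edge-matrix determinant divided by 3!.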
Another approach to RHSP-OPSGA was proposed in Chap. 12. It is called recursive hyperspectral sample processing of geometric SGA (RHSP-GSGA), which calculates the SV by multiplying the height by the base of a simplex, where the base of a $p$-vertex simplex formed by $p$ endmembers is the $(p-1)$-vertex simplex formed by the $p-1$ previously found endmembers and the height turns out to be the maximal OP perpendicular to the base. Its equivalence to RHSP-OPSGA is shown in Sect. 12.5.

More specifically, for $p \ge 3$ RHSP-GSGA finds the new endmember $\mathbf{m}_p$ by (12.20)–(12.29) in Chap. 12 as

$$\mathbf{m}_p = \tilde{\mathbf{m}}_p - \sum\nolimits_{i=2}^{p-1}\mathbf{u}_i\mathbf{u}_i^{T}\tilde{\mathbf{m}}_p = \tilde{\mathbf{m}}_p - \mathbf{U}_{p-1}\mathbf{U}_{p-1}^{T}\tilde{\mathbf{m}}_p = \left(\mathbf{I} - \mathbf{U}_{p-1}\mathbf{U}_{p-1}^{T}\right)\tilde{\mathbf{m}}_p, \qquad (21.21)$$

which can be calculated recursively by

$$\mathbf{m}_p = \mathbf{m}_{p-1} + \left(\mathbf{I} - \mathbf{U}_{p-2}\mathbf{U}_{p-2}^{T}\right)\nabla\tilde{\mathbf{m}}_p - \mathbf{u}_{p-1}\mathbf{u}_{p-1}^{T}\tilde{\mathbf{m}}_p, \qquad (21.22)$$

with $\mathbf{u}_{p-1} = \mathbf{m}_{p-1}/\left(\mathbf{m}_{p-1}^{T}\mathbf{m}_{p-1}\right)^{1/2} = \mathbf{m}_{p-1}/\|\mathbf{m}_{p-1}\|$, and the innovations information

$$\nabla\tilde{\mathbf{m}}_p = \tilde{\mathbf{m}}_p - \tilde{\mathbf{m}}_{p-1} = \mathbf{m}_p - \mathbf{m}_{p-1} \qquad (21.23)$$

obtained from $\mathbf{m}_p$ but not contained in $\mathbf{m}_{p-1}$. Specifically, $\mathbf{U}_{p-1}\mathbf{U}_{p-1}^{T}$ can be simply updated by $\mathbf{U}_{p-2}\mathbf{U}_{p-2}^{T}$ and $\mathbf{u}_{p-1}\mathbf{u}_{p-1}^{T}$ by

$$\mathbf{U}_{p-1}\mathbf{U}_{p-1}^{T} = \sum\nolimits_{i=2}^{p-1}\mathbf{u}_i\mathbf{u}_i^{T} = \mathbf{U}_{p-2}\mathbf{U}_{p-2}^{T} + \mathbf{u}_{p-1}\mathbf{u}_{p-1}^{T}. \qquad (21.24)$$

Using (21.21) we can find

$$\mathbf{m}_j^{\text{GSGA}} = \left(\mathbf{I} - \mathbf{U}_{j-2}^{\text{GSGA}}\left(\mathbf{U}_{j-2}^{\text{GSGA}}\right)^{T}\right)\tilde{\mathbf{m}}_j^{\text{GSGA}} \qquad (21.25)$$
21.3 Recursive Hyperspectral Band Processing 637

so that GSV can be calculated recursively by


GSV S m2GSGA ; . . . ; mp1


GSGA
; mpGSGA

ð21:26Þ
¼ hpGSGA  GSV S m2GSGA ; . . . ; mp1
GSGA
;

∇mpGSGA ¼ mpGSGA  mp1


GSGA
; ð21:27Þ
and


T
GSGA GSGA T GSGA
Up1 Up1 ¼ Up2 þ up1
GSGA GSGA
up1 : ð21:28Þ
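The recursions (21.21) and (21.24) amount to growing a set of orthonormal directions and accumulating their rank-one projectors; a small sketch (illustrative NumPy, names hypothetical, assuming orthonormal $\mathbf{u}_i$):

```python
import numpy as np

rng = np.random.default_rng(1)
L, k = 10, 3
U, _ = np.linalg.qr(rng.standard_normal((L, k)))   # orthonormal u_2,...,u_{p-1}
m_tilde = rng.standard_normal(L)                   # candidate endmember m_tilde_p

# (21.21): the height vector m_p = (I - U_{p-1} U_{p-1}^T) m_tilde_p
m_p = m_tilde - U @ (U.T @ m_tilde)

# (21.24): U_{p-1} U_{p-1}^T accumulates one rank-one term u_i u_i^T per endmember,
# so it never has to be rebuilt from scratch
UUt = np.zeros((L, L))
for i in range(k):
    UUt += np.outer(U[:, i], U[:, i])

assert np.allclose(m_p, (np.eye(L) - UUt) @ m_tilde)
```

The residual `m_p` is orthogonal to all previously found directions, which is exactly the height used to grow the GSV.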

Finally, it is worth noting that these recursive processes arise from the need to
avoid the exceedingly high computational complexity of repeatedly inverting
matrices, as in (21.5) and (21.10). Chapter 7 develops a recursive ATGP (RHSP-
ATGP) that makes use of recursive equations to eliminate the need for repeatedly
finding matrix inverses in (21.5). Chapter 8 extends the commonly used OSP to
RHSP-OSP, where the new signatures can be generated by a recursive equation to
form a new matrix, $U_p$, to find OSP-detected abundance fractions by (21.7) without
actually inverting the matrix $U_p^TU_p$ in (21.5). Chapter 9 derives RHSP-LSMA for
LSMA to implement (21.11) without inverting the matrix $M_p^TM_p$ in (21.10). With
an approach similar to RHSP-LSMA, Chap. 10 also derives RHSP-MLE, which
likewise implements (21.11) without inverting $M_p^TM_p$ in (21.10). Finally,
Chaps. 11 and 12 redesign SGA as two new recursive versions, RHSP-OPSGA
(Chap. 11) and RHSP-GSGA (Chap. 12), both of which allow SGA to calculate
GSV recursively without involving the calculation of a determinant.

21.3 Recursive Hyperspectral Band Processing

One of the major issues in hyperspectral data exploitation is how to effectively utilize


the enormous amounts of data provided by hyperspectral sensors as a result of the
significantly improved spectral and spatial resolution. One general and classic
approach is through data compression, which transforms a high-dimensional data
space into a low-dimensional data space or feature space so that data can be
managed more effectively by preserving only the desired information for data
processing, for example, data dimensionality reduction (DR) in Chap. 6 in Chang
(2013). Another approach is through data reduction, which only selects some of the
data of interest to data analysts while completely discarding information not being
selected. As a result, data reduction is an irreversible data process. A widely used
technique to reduce data dimensionality is band selection (BS) which only retains
those bands that can best represent the data. However, both data compression and
data reduction suffer from the same two challenging issues: (1) how to determine
the number of dimensions or bands needed to be preserved and (2) how to
intelligently select appropriate dimensions or bands. As for the first issue, the
concepts of VD developed in Chang (2003, 2013) and Chang and Du (2004) and
target-specified VD (TSVD) developed in Chap. 4 can be used for this purpose.
Regarding the second issue, it is more involved where the criteria used to select
dimensions and bands are crucial. To avoid directly dealing with these two issues,
Chang (2013) suggested several approaches: progressive spectral dimensionality
process in Chap. 20, progressive band dimensionality process in Chap. 21, and
progressive band selection in Chap. 23. This book looks into these three chapters
and focuses on the nature of progressive band processing (PBP) in the band-
sequential (BSQ) data acquisition format.
There are many significant advantages of processing data progressively band by
band. First, PBP provides progressive profiles of changes in spectral signatures of
data sample vectors. As a consequence, such progressively changing spectral
variations allow users to accomplish many tasks that BS cannot do, such as
detecting significant bands, prioritizing bands, determining crucial bands, and
identifying desired bands. Second, PBP offers an alternative to BS where there is
no need to determine the number of bands or find bands. This is because the
progressive spectral profiles produced by PBP can be used to find appropriate
bands according to their spectral characteristics. The third advantage has to do
with the hyperspectral images produced. Since they are called hyperspectral, the
spectral information is more important than the spatial information, specifically
when it comes to analyzing subpixels and mixed pixels, as discussed in Chap. 2 of
Chang (2013). PBP provides additional band spectral information to characterize
data other than the spatial information of data sample vectors. This is particularly
important and critical for hyperspectral imagery.
In general, BSQ cannot be implemented in real time since it collects data band
by band. Nevertheless, it can be considered as PBP that processes data in multiple
stages, stage by stage, with each stage processing one particular band image in real
time. The RHBP presented in Parts IV and V of this book was developed to meet
this requirement. It offers a new look from a BSQ point of view, described in Part IV,
specifically from a real-time processing perspective in data communication
transmission. Instead of criteria to select bands, it selects real applications to determine
bands of interest via PBP. To implement such PBP in real time, the concept of
recursion is further introduced to PBP to derive RHBP, which not only processes
data band by band progressively but also recursively. Thus, Parts IV and V include
chapters that extend PBP to RHBP in various applications. The proposed RHBP is
very similar to RHSP, presented in Parts II and III, except that RHBP works on
spectral band images whenever they are available without waiting for all bands to
be collected, as opposed to RHSP, which processes image sample vectors using full
band information. However, it should be noted that the concepts of PBP and RHBP
are quite different. The goal of PBP is to process data progressively, in the sense
that a one-shot operation can be stretched out into a slower-moving process so
that bandwise profiles provided by progressive changes in the operation can be used
for further detailed data analysis, such as detection and identification of particular
samples and significant bands. On the other hand, RHBP focuses on data to be
updated according to recursive equations specified by certain desired recurrence
relations. Thus, technically speaking, RHBP does not necessarily process data
progressively. Nevertheless, when RHBP is used in this book, it is generally
understood that RHBP is progressive in nature in its recursive process, in which
case RHBP processes data not only recursively to update data but also progressively
to produce progressive profiles of processed data.
In parallel to RHSP described in Parts II and III, we also develop its counterpart
according to the BSQ format (Fig. 21.2), called RHBP, presented in Parts IV and V,
where all algorithms designed and developed in Parts II and III also have counter-
parts in Parts IV and V, respectively.
Analogous to Parts II and III there are also two types of band information of
major interest, band spectral correlation-based information and band spectral
signature-based information. Hyperspectral imaging algorithms generally involve
using either band spectral correlations characterized by sample spectral statistics,
such as sample spectral covariance/correlation matrix used by CEM and AD target
detection algorithms, or spectral correlation characterized by signatures, such as
ATGP, OSP, or EFAs. Thus, Part IV focuses on how to implement RHBP on
sample spectral statistics-based covariance/correlation matrices causally in a pro-
gressive and recursive manner, while Part V is devoted to implementing RHBP on
signature spectral statistics-based correlation matrices causally in a progressive and
recursive manner.

21.3.1 Part IV: Band Spectral Correlation-Based Information

Assume that $\{\mathbf{r}_i\}_{i=1}^N$ is the set of all data sample vectors, let
$X_l = \left[\mathbf{r}_1\ \mathbf{r}_2 \cdots \mathbf{r}_{N-1}\ \mathbf{r}_N\right]$ be the data matrix formed by $\{\mathbf{r}_i\}_{i=1}^N$, with
$\mathbf{r}_i = (r_{i1}, r_{i2}, \ldots, r_{iL})^T$, and let $\mathbf{r}(l)$ be the current data stream acquired up through
the lth band, denoted by $\mathbf{r}(l) = (r_1, r_2, \ldots, r_l)^T$. Now we define a data matrix

$$X_l = \left[\mathbf{r}_1\ \mathbf{r}_2 \cdots \mathbf{r}_{N-1}\ \mathbf{r}_N\right] = \begin{bmatrix} r_{11} & \cdots & r_{(N-1)1} & r_{N1} \\ \vdots & \ddots & \vdots & \vdots \\ r_{1(l-1)} & \cdots & r_{(N-1)(l-1)} & r_{N(l-1)} \\ r_{1l} & \cdots & r_{(N-1)l} & r_{Nl} \end{bmatrix} \qquad (21.29)$$

and $X_l = \begin{bmatrix} X_{l-1} \\ \mathbf{x}^T(l) \end{bmatrix}$, with

$$X_{l-1} = \begin{bmatrix} r_{11} & \cdots & r_{(N-1)1} & r_{N1} \\ \vdots & \ddots & \vdots & \vdots \\ r_{1(l-2)} & \cdots & r_{(N-1)(l-2)} & r_{N(l-2)} \\ r_{1(l-1)} & \cdots & r_{(N-1)(l-1)} & r_{N(l-1)} \end{bmatrix}, \qquad (21.30)$$

where $\mathbf{x}(l) = (r_{1l}, r_{2l}, \ldots, r_{Nl})^T$ is an N-dimensional data sample vector formed by
the data samples of $\{\mathbf{r}_i\}_{i=1}^N$ in the lth band. The causal band correlation matrix
(CBCRM) up to the lth band $B_l$ is defined by $R_{ll} = (1/N)X_lX_l^T$, which can be
expressed as

$$R_{ll} = (1/N)X_lX_l^T = (1/N)\begin{bmatrix} X_{l-1} \\ \mathbf{x}^T(l) \end{bmatrix}\begin{bmatrix} X_{l-1}^T & \mathbf{x}(l) \end{bmatrix} = (1/N)\begin{bmatrix} X_{l-1}X_{l-1}^T & X_{l-1}\mathbf{x}(l) \\ \mathbf{x}^T(l)X_{l-1}^T & \mathbf{x}^T(l)\mathbf{x}(l) \end{bmatrix}. \qquad (21.31)$$

Using (21.31), $R_{ll}^{-1}$ can be updated and calculated recursively band by band by

$$R_{ll}^{-1} = \left[(1/N)X_lX_l^T\right]^{-1} = \begin{bmatrix} R^{-1}_{(l-1)(l-1)} + \dfrac{(1/N)\left[R^{-1}_{(l-1)(l-1)}X_{l-1}\mathbf{x}(l)\right]\left[R^{-1}_{(l-1)(l-1)}X_{l-1}\mathbf{x}(l)\right]^T}{\mathbf{x}^T(l)P^{\perp}_{X^T_{l-1}}\mathbf{x}(l)} & -\dfrac{R^{-1}_{(l-1)(l-1)}X_{l-1}\mathbf{x}(l)}{\mathbf{x}^T(l)P^{\perp}_{X^T_{l-1}}\mathbf{x}(l)} \\[2ex] -\dfrac{\left[R^{-1}_{(l-1)(l-1)}X_{l-1}\mathbf{x}(l)\right]^T}{\mathbf{x}^T(l)P^{\perp}_{X^T_{l-1}}\mathbf{x}(l)} & \dfrac{N}{\mathbf{x}^T(l)P^{\perp}_{X^T_{l-1}}\mathbf{x}(l)} \end{bmatrix}. \qquad (21.32)$$

It is the recursive equation (21.32) that is the key element making RHBP feasible
and practically useful in many real-world applications, not only for real-time
processing but also for hardware design.
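The block form (21.32) can be checked numerically. The following sketch (illustrative NumPy code, all names hypothetical) assembles $R_{ll}^{-1}$ from $R^{-1}_{(l-1)(l-1)}$ and the new band $\mathbf{x}(l)$ and compares it with a direct inverse:

```python
import numpy as np

rng = np.random.default_rng(2)
N, l = 50, 5
X = rng.standard_normal((l, N))           # l bands x N samples (X_l)
X_prev, x = X[:-1, :], X[-1, :]           # X_{l-1} and the new band x(l)

R_prev_inv = np.linalg.inv(X_prev @ X_prev.T / N)   # R^{-1}_{(l-1)(l-1)}

# pieces of (21.32): u = R^{-1}_{(l-1)(l-1)} X_{l-1} x(l), and the Schur
# complement s = x(l)^T P_perp_{X_{l-1}^T} x(l)
u = R_prev_inv @ (X_prev @ x)
s = x @ x - (X_prev @ x) @ u / N

top = np.hstack([R_prev_inv + np.outer(u, u) / (N * s), (-u / s)[:, None]])
bot = np.hstack([(-u / s)[None, :], np.array([[N / s]])])
R_inv = np.vstack([top, bot])             # recursively updated R^{-1}_{ll}

assert np.allclose(R_inv, np.linalg.inv(X @ X.T / N))
```

Only the previous inverse and one new band vector are touched; the full $l \times N$ data matrix never has to be re-inverted.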

21.3.1.1 RHBP-CEM

An immediate application of (21.32) is to implement CEM as recursive
hyperspectral band processing of CEM (RHBP-CEM), where CEM can be
performed band by band progressively and recursively as follows:

$$\delta^{\text{RHBP-CEM}}(\mathbf{r}(l)) = (\kappa_l/\kappa_{l-1})\,\delta^{\text{RHBP-CEM}}(\mathbf{r}(l-1)) + (1/N)\,\kappa_l\,\beta_{l(l-1)}\left(\mathbf{d}^T(l-1)\boldsymbol{\nu}_{l(l-1)} - Nd_l\right)\left(\boldsymbol{\nu}^T_{l(l-1)}\mathbf{r}(l-1) - Nr_l\right), \qquad (21.33)$$

where $\kappa_l = \left(\mathbf{d}^T(l)R_{ll}^{-1}\mathbf{d}(l)\right)^{-1}$ is a scalar, $\boldsymbol{\nu}_{l(l-1)} = R^{-1}_{(l-1)(l-1)}X_{l-1}\mathbf{x}(l)$, and
$\beta_{l(l-1)} = \left\{\mathbf{x}^T(l)\left[P^{\perp}_{X^T_{l-1}}\right]\mathbf{x}(l)\right\}^{-1}$.
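For intuition, the quantity that (21.33) updates recursively is the ordinary CEM detector $\delta_{\mathrm{CEM}}(\mathbf{r}) = \mathbf{d}^TR_{ll}^{-1}\mathbf{r}/\big(\mathbf{d}^TR_{ll}^{-1}\mathbf{d}\big)$ evaluated causally on the first l bands. The sketch below (illustrative NumPy on synthetic data; it uses direct inversion at each l rather than the recursion) produces the kind of band-by-band progressive profile RHBP-CEM provides:

```python
import numpy as np

rng = np.random.default_rng(3)
L, N = 8, 100
X = rng.standard_normal((L, N)) + 2.0     # L bands x N data sample vectors
d = X[:, 0]                               # desired signature (a data sample here)
r = X[:, 7]                               # sample vector to be detected

# delta_CEM on the first l bands, for l = 2,...,L: the band-by-band profile
# that RHBP-CEM would maintain via (21.33) without re-inverting R_ll
profile = []
for l in range(2, L + 1):
    R_ll = X[:l] @ X[:l].T / N            # causal band correlation matrix
    w = np.linalg.solve(R_ll, d[:l])      # R_ll^{-1} d(l)
    profile.append((w @ r[:l]) / (w @ d[:l]))

print(np.round(profile, 3))               # progressive detection profile
```

Plotting such a profile against l is what allows significant bands to be spotted for a given target, which a one-shot full-band CEM cannot show.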

21.3.1.2 RHBP-AD

Since, like CEM, AD uses a sample spectral correlation matrix, we can take
advantage of (21.32) to derive RHBP-AD similarly to how RHBP-CEM is derived
in (21.33) as follows:

$$\delta^{\text{RHBP-R-AD}}(\mathbf{r}(l)) = \delta^{\text{RHBP-R-AD}}(\mathbf{r}(l-1)) + (1/N)\,\beta_{l(l-1)}\left(Nr_l - \eta_{l(l-1)}\right)^2, \qquad (21.34)$$

where $\eta_{l(l-1)} = \mathbf{r}^T(l-1)R^{-1}_{(l-1)(l-1)}X_{l-1}\mathbf{x}(l)$ and $\beta_{l(l-1)} = \left\{\mathbf{x}^T(l)\left[P^{\perp}_{X^T_{l-1}}\right]\mathbf{x}(l)\right\}^{-1}$.

However, deriving an RHBP version of the covariance matrix K-AD requires a
little more work because it involves updating causal sample mean vectors, which
are not included in (21.32), because $K_{ll} = R_{ll} - \boldsymbol{\mu}(l)\boldsymbol{\mu}^T(l)$. To take care of the causal
sample mean vector $\boldsymbol{\mu}(l)$, we need to do the following:

$$\begin{aligned} \delta^{\text{RHBP-K-AD}}(\mathbf{r}(l)) &= (\mathbf{r}(l)-\boldsymbol{\mu}(l))^T K_{ll}^{-1}(\mathbf{r}(l)-\boldsymbol{\mu}(l)) \\ &= \begin{pmatrix} \mathbf{r}(l-1)-\boldsymbol{\mu}(l-1) \\ r_l-\mu_l \end{pmatrix}^{\!T} \begin{bmatrix} A_{(l-1)(l-1)} & \mathbf{b} \\ \mathbf{b}^T & d \end{bmatrix} \begin{pmatrix} \mathbf{r}(l-1)-\boldsymbol{\mu}(l-1) \\ r_l-\mu_l \end{pmatrix} \\ &\quad + \frac{1}{1-\rho_{l(l-1)}} \begin{pmatrix} \mathbf{r}(l-1)-\boldsymbol{\mu}(l-1) \\ r_l-\mu_l \end{pmatrix}^{\!T} \Phi_{ll} \begin{pmatrix} \mathbf{r}(l-1)-\boldsymbol{\mu}(l-1) \\ r_l-\mu_l \end{pmatrix}, \end{aligned} \qquad (21.35)$$

where

$$\Phi_{ll} = \left(R_{ll}^{-1}\boldsymbol{\mu}(l)\right)\left(\boldsymbol{\mu}^T(l)R_{ll}^{-1}\right) = R_{ll}^{-1}\begin{bmatrix} \boldsymbol{\mu}(l-1)\boldsymbol{\mu}^T(l-1) & \boldsymbol{\mu}(l-1)\mu_l \\ \mu_l\boldsymbol{\mu}^T(l-1) & \mu_l^2 \end{bmatrix}R_{ll}^{-1}, \qquad (21.36)$$

with

$$R_{ll}^{-1}\boldsymbol{\mu}(l) = \begin{bmatrix} A_{(l-1)(l-1)} & \mathbf{b} \\ \mathbf{b}^T & d \end{bmatrix}\begin{bmatrix} \boldsymbol{\mu}(l-1) \\ \mu_l \end{bmatrix} = \begin{bmatrix} A_{(l-1)(l-1)}\boldsymbol{\mu}(l-1) + \mathbf{b}\mu_l \\ \mathbf{b}^T\boldsymbol{\mu}(l-1) + d\mu_l \end{bmatrix} \qquad (21.37)$$

and

$$\rho_{l(l-1)} = \boldsymbol{\mu}^T(l)R_{ll}^{-1}\boldsymbol{\mu}(l) = \begin{bmatrix} \boldsymbol{\mu}(l-1) \\ \mu_l \end{bmatrix}^{\!T} \begin{bmatrix} A_{(l-1)(l-1)} & \mathbf{b} \\ \mathbf{b}^T & d \end{bmatrix} \begin{bmatrix} \boldsymbol{\mu}(l-1) \\ \mu_l \end{bmatrix} = \boldsymbol{\mu}^T(l-1)A_{(l-1)(l-1)}\boldsymbol{\mu}(l-1) + 2\boldsymbol{\mu}^T(l-1)\mathbf{b}\mu_l + d\mu_l^2, \qquad (21.38)$$

where $\boldsymbol{\mu}(l-1) = (\mu_1,\ldots,\mu_{l-1})^T$, $\mathbf{b} = -\beta_{l(l-1)}R^{-1}_{(l-1)(l-1)}X_{l-1}\mathbf{x}(l)$,

$$\beta_{l(l-1)} = \left\{\mathbf{x}^T(l)\left[P^{\perp}_{X^T_{l-1}}\right]\mathbf{x}(l)\right\}^{-1}, \qquad (21.39)$$

$d = N\beta_{l(l-1)}$, and

$$A_{(l-1)(l-1)} = R^{-1}_{(l-1)(l-1)} + (1/N)\,\beta_{l(l-1)}\left[R^{-1}_{(l-1)(l-1)}X_{l-1}\mathbf{x}(l)\right]\left[R^{-1}_{(l-1)(l-1)}X_{l-1}\mathbf{x}(l)\right]^T, \qquad (21.40)$$

$$K_{ll}^{-1} = R_{ll}^{-1} + \frac{\Phi_{ll}}{1-\rho_{l(l-1)}}. \qquad (21.41)$$

As we can see from (21.35) to (21.41), calculating the inverse of the causal sample
covariance matrix $K_{ll}$ is not as trivial as we thought.
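At its core, (21.41) is a Sherman-Morrison-type correction of $R_{ll}^{-1}$ by the causal mean. A numerical check of that identity (illustrative NumPy sketch, names hypothetical):

```python
import numpy as np

rng = np.random.default_rng(4)
N, l = 200, 6
X = rng.standard_normal((l, N)) + 1.0     # l bands x N samples
R = X @ X.T / N                           # causal correlation matrix R_ll
mu = X.mean(axis=1)                       # causal sample mean mu(l)
K = R - np.outer(mu, mu)                  # K_ll = R_ll - mu(l) mu(l)^T

Ri = np.linalg.inv(R)
rho = mu @ Ri @ mu                        # rho = mu^T R^{-1} mu, as in (21.38)
Phi = np.outer(Ri @ mu, Ri @ mu)          # Phi_ll = (R^{-1} mu)(mu^T R^{-1}), (21.36)
K_inv = Ri + Phi / (1 - rho)              # covariance inverse via (21.41)

assert np.allclose(K_inv, np.linalg.inv(K))
```

Note that $\rho < 1$ whenever $K_{ll}$ is positive definite, so the correction term is well defined.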

21.3.2 Part V: Signature Spectral Correlation-Based Information

Part V in this section deals with another spectral correlation matrix that has been
used by many hyperspectral imaging algorithms. It is the causal signature correla-
tion matrix that must vary with new signatures added to the signature matrix to
update signature knowledge when a new band comes in. In what follows, we
describe four applications where the algorithms used employ a signature spectral
correlation matrix to update information progressively and recursively.

21.3.2.1 RHBP-ATGP

The first application is unsupervised target detection, which detects unsupervised
targets of interest without prior knowledge. One of the most widely used algorithms
is ATGP, developed by Ren and Chang (2003) and discussed in Sect. 4.4.2.3. Its
key idea is to repeatedly implement a succession of OSPs to find maximal OPs
during each OSP operation. Assume that for $1 \le j \le p$ and $p \le l$,
$U_{(l-1)j} = \left[\mathbf{t}_1^{(l-1)}\ \mathbf{t}_2^{(l-1)} \cdots \mathbf{t}_j^{(l-1)}\right]$ is an undesired target signature matrix made up of the
j previously found targets, $\mathbf{t}_1^{(l-1)}, \mathbf{t}_2^{(l-1)}, \ldots, \mathbf{t}_j^{(l-1)}$, when the first $l-1$ bands are
used:

$$\begin{aligned} P^{\perp}_{U_{lj}} &= I_{l\times l} - U_{lj}U_{lj}^{\#} \\ &= \begin{bmatrix} P^{\perp}_{U_{(l-1)j}} & -U_{(l-1)j}\boldsymbol{\rho}_{l(l-1)} \\ -\left[U_{(l-1)j}\boldsymbol{\rho}_{l(l-1)}\right]^T & 1 - \mathbf{m}^T(l)\boldsymbol{\rho}_{l(l-1)} \end{bmatrix} \\ &\quad + \frac{1}{1+\mathbf{m}^T(l)\boldsymbol{\rho}_{l(l-1)}} \begin{bmatrix} U_{(l-1)j}\boldsymbol{\rho}_{l(l-1)}\boldsymbol{\rho}^T_{l(l-1)}U^T_{(l-1)j} & U_{(l-1)j}\boldsymbol{\rho}_{l(l-1)}\boldsymbol{\rho}^T_{l(l-1)}\mathbf{m}(l) \\ \mathbf{m}^T(l)\boldsymbol{\rho}_{l(l-1)}\boldsymbol{\rho}^T_{l(l-1)}U^T_{(l-1)j} & \mathbf{m}^T(l)\boldsymbol{\rho}_{l(l-1)}\boldsymbol{\rho}^T_{l(l-1)}\mathbf{m}(l) \end{bmatrix}, \end{aligned} \qquad (21.42)$$

where $\boldsymbol{\rho}_{l(l-1)} = \left(U^T_{(l-1)j}U_{(l-1)j}\right)^{-1}\mathbf{m}(l)$ and $\mathbf{m}^T(l)$ is the new row appended to
$U_{(l-1)j}$ by the lth band, that is, $U_{lj} = \begin{bmatrix} U_{(l-1)j} \\ \mathbf{m}^T(l) \end{bmatrix}$.

Then the p targets $\mathbf{t}_1^{(l)}, \mathbf{t}_2^{(l)}, \ldots, \mathbf{t}_p^{(l)}$ found by ATGP using l bands can be obtained by
implementing (21.42) recursively as RHBP of ATGP (RHBP-ATGP), with targets given by

$$\mathbf{t}_j^{(l)} = \arg\max_{1\le i\le N}\left\{\big(\mathbf{r}_l(i)\big)^T P^{\perp}_{U_{l(j-1)}}\,\mathbf{r}_l(i)\right\} \quad\text{for } 1 \le j \le p. \qquad (21.43)$$
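The block update in (21.42) can be sanity-checked numerically. A sketch (illustrative NumPy; every name is hypothetical) builds the projector from its two block terms and compares it with a direct computation:

```python
import numpy as np

rng = np.random.default_rng(5)
l, j = 6, 3
U_prev = rng.standard_normal((l - 1, j))  # U_{(l-1)j}: j targets over l-1 bands
m = rng.standard_normal(j)                # m(l): the row added by band l
U = np.vstack([U_prev, m])                # U_lj

def p_perp(A):
    """Orthogonal-complement projector I - A A^#."""
    return np.eye(A.shape[0]) - A @ np.linalg.pinv(A)

rho = np.linalg.solve(U_prev.T @ U_prev, m)          # rho_{l(l-1)}
s = m @ rho                                          # m(l)^T rho_{l(l-1)}
first = np.block([[p_perp(U_prev), (-U_prev @ rho)[:, None]],
                  [-(U_prev @ rho)[None, :], np.array([[1.0 - s]])]])
v = np.concatenate([U_prev @ rho, [s]])
second = np.outer(v, v) / (1.0 + s)                  # rank-one correction term

assert np.allclose(p_perp(U), first + second)        # matches the block form
```

Only the small $j \times j$ Gram matrix of the previous bands is solved against; the $l \times l$ pseudo-inverse is never recomputed from scratch in the recursive version.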

21.3.2.2 RHBP-OSP

Since the ATGP described in the previous section can be considered an
unsupervised version of the well-known OSP (Wang et al. 2002), the derivations
used to derive RHBP-ATGP should be applicable to deriving RHBP of OSP
(RHBP-OSP).

Now assume that $U_{lp} = \left[\mathbf{m}_1\ \mathbf{m}_2 \cdots \mathbf{m}_p\right] = \begin{bmatrix} U_{(l-1)p} \\ \mathbf{m}^T(l) \end{bmatrix}$ represents an undesired
signature matrix and $\mathbf{d}_l = (d_1,\ldots,d_{l-1},d_l)^T$ is the desired signature. Let $\mathbf{r}_l =
(r_1,\ldots,r_{l-1},r_l)^T$ be a data sample vector acquired by the first l bands. According to
(16.3) and (16.4),

$$\hat{\alpha}_p^{\mathrm{OSP}}(\mathbf{r}_l) = \mathbf{d}_l^T P^{\perp}_{U_{lp}}\mathbf{r}_l, \qquad (21.44)$$

where

$$P^{\perp}_{U_{lp}} = I - U_{lp}U_{lp}^{\#} = I - U_{lp}\left(U_{lp}^TU_{lp}\right)^{-1}U_{lp}^T \qquad (21.45)$$

and

$$\left[U_{lp}^TU_{lp}\right]^{-1} = \left[\begin{bmatrix} U^T_{(l-1)p} & \mathbf{m}(l) \end{bmatrix}\begin{bmatrix} U_{(l-1)p} \\ \mathbf{m}^T(l) \end{bmatrix}\right]^{-1} = \left[U^T_{(l-1)p}U_{(l-1)p} + \mathbf{m}(l)\mathbf{m}^T(l)\right]^{-1}, \qquad (21.46)$$

where

$$U_{(l-1)p} = \begin{bmatrix} m_{11} & m_{21} & \cdots & m_{(p-1)1} & m_{p1} \\ m_{12} & m_{22} & \cdots & m_{(p-1)2} & m_{p2} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ m_{1(l-1)} & m_{2(l-1)} & \cdots & m_{(p-1)(l-1)} & m_{p(l-1)} \end{bmatrix}$$

is an $(l-1)\times p$ matrix and $\mathbf{m}(l) = \left(m_{1l}, m_{2l}, \ldots, m_{pl}\right)^T$ is a p-dimensional vector. Finally, we can
derive an RHBP version of $\hat{\alpha}^{\mathrm{OSP}}_{\mathbf{d}_l}(\mathbf{r}_l)$ in (21.44) in terms of $\hat{\alpha}^{\mathrm{OSP}}_{\mathbf{d}_{l-1}}(\mathbf{r}_{l-1})$ as follows:

$$\begin{aligned} \hat{\alpha}^{\text{RHBP-OSP}}_{\mathbf{d}_l}(\mathbf{r}_l) &= \mathbf{d}_l^T P^{\perp}_{U_{lp}}\mathbf{r}_l = \begin{bmatrix} \mathbf{d}^T_{l-1} & d_l \end{bmatrix} P^{\perp}_{U_{lp}} \begin{bmatrix} \mathbf{r}_{l-1} \\ r_l \end{bmatrix} \\ &= \hat{\alpha}^{\text{RHBP-OSP}}_{\mathbf{d}_{l-1}}(\mathbf{r}_{l-1}) - \mathbf{d}^T_{l-1}U_{(l-1)p}\boldsymbol{\rho}_{l(l-1)}r_l - d_l\boldsymbol{\rho}^T_{l(l-1)}U^T_{(l-1)p}\mathbf{r}_{l-1} \\ &\quad + d_l\left(1 - \mathbf{m}^T(l)\boldsymbol{\rho}_{l(l-1)}\right)r_l \\ &\quad + \frac{1}{1+\mathbf{m}^T(l)\boldsymbol{\rho}_{l(l-1)}}\left(\mathbf{d}^T_{l-1}U_{(l-1)p}\boldsymbol{\rho}_{l(l-1)} + d_l\mathbf{m}^T(l)\boldsymbol{\rho}_{l(l-1)}\right)\left(\boldsymbol{\rho}^T_{l(l-1)}U^T_{(l-1)p}\mathbf{r}_{l-1} + \boldsymbol{\rho}^T_{l(l-1)}\mathbf{m}(l)r_l\right). \end{aligned} \qquad (21.47)$$

21.3.2.3 RHBP-LSMA

As noted in Chang (2007c), OSP can also be used to derive LSOSP, which yields
exactly the same results derived from the abundance-unconstrained LS (UCLS)
method. However, there are several subtle differences between OSP and the UCLS
method. One is that OSP is a detection technique designed based on signal-to-noise
ratio (SNR), while UCLS is an estimation method based on LSE. A second
difference is that OSP detects the abundance fraction of one signature at a time,
which is a scalar value, whereas UCLS estimates an abundance vector as a whole. A
third difference is more crucial because UCLS does not make use of OSP to
perform estimation. As a result, the $P^{\perp}_{U_{lp}}$ in (21.42) and (21.45) used to derive
RHBP-ATGP and RHBP-OSP, respectively, is not found in UCLS. It seems that
$P^{\perp}_{U_{lp}}$ cannot be used for UCLS. Interestingly, with an appropriate interpretation
obtained by replacing $U_{lp}$ in (21.46) with $M_l$, to be defined in (21.48), we can actually take
advantage of (21.46) to derive RHBP of LSMA (RHBP-LSMA) as follows.
Assume that $\{B_k\}_{k=1}^{l-1}$ are the $l-1$ bands already processed and that the current
band, denoted by $B_l$, is yet to be processed. For a given set of p l-dimensional
image endmembers, $\{\mathbf{m}_j(l)\}_{j=1}^p$, where $\mathbf{m}_j(l) = \left(m_{j1}(l), m_{j2}(l), \ldots, m_{jl}(l)\right)^T$ is the
jth l-dimensional signature vector formed by the l spectral bands $\{B_k\}_{k=1}^{l-1}$ and $B_l$,
we then introduce a new p-dimensional vector defined by $\tilde{\mathbf{m}}_l =
\left(m_{l1}, m_{l2}, \ldots, m_{l(p-1)}, m_{lp}\right)^T$, which is composed of only the lth spectral band in
all $\mathbf{m}_j(l)$ for $1 \le j \le p$, and further define a signature matrix up to l spectral bands by
$M_l = \left[\mathbf{m}_1(l)\ \mathbf{m}_2(l) \cdots \mathbf{m}_p(l)\right]$, so that

$$P_l = M_l^TM_l = \begin{bmatrix} M^T_{l-1} & \tilde{\mathbf{m}}_l \end{bmatrix}\begin{bmatrix} M_{l-1} \\ \tilde{\mathbf{m}}_l^T \end{bmatrix} = P_{l-1} + \tilde{\mathbf{m}}_l\tilde{\mathbf{m}}_l^T. \qquad (21.48)$$

According to (9.2) in Chap. 9, the LS solution to LSMA is given by

$$\hat{\boldsymbol{\alpha}}^{\mathrm{LS}}(l) = \hat{\boldsymbol{\alpha}}^{\mathrm{LS}}(\mathbf{r}_l) = \left(M_l^TM_l\right)^{-1}M_l^T\mathbf{r}_l. \qquad (21.49)$$

Substituting (21.48) into (21.49) yields

$$\hat{\boldsymbol{\alpha}}^{\mathrm{LS}}(l) = \left(P_{l-1} + \tilde{\mathbf{m}}_l\tilde{\mathbf{m}}_l^T\right)^{-1}M_l^T\mathbf{r}_l. \qquad (21.50)$$

Now, using Woodbury's identity in Appendix A, we can derive

$$P_l^{-1} = \left(P_{l-1} + \tilde{\mathbf{m}}_l\tilde{\mathbf{m}}_l^T\right)^{-1} = P_{l-1}^{-1} - \frac{\left(P_{l-1}^{-1}\tilde{\mathbf{m}}_l\right)\left(\tilde{\mathbf{m}}_l^TP_{l-1}^{-1}\right)}{1+\tilde{\mathbf{m}}_l^TP_{l-1}^{-1}\tilde{\mathbf{m}}_l}, \qquad (21.51)$$

which can be further used to derive an RHBP version of $\hat{\boldsymbol{\alpha}}^{\mathrm{LS}}(\mathbf{r}_l)$ in (21.49) in terms
of $\hat{\boldsymbol{\alpha}}^{\mathrm{LS}}(\mathbf{r}_{l-1}) = \hat{\boldsymbol{\alpha}}^{\mathrm{LS}}(l-1)$ as follows:

$$\hat{\boldsymbol{\alpha}}^{\text{RHBP-LS}}(l) = \left(I_{p\times p} - \frac{\boldsymbol{\nu}_{l(l-1)}\boldsymbol{\nu}^T_{l(l-1)}}{1+\rho_{l(l-1)}}P_{l-1}\right)\hat{\boldsymbol{\alpha}}^{\mathrm{LS}}(l-1) + \left(P_{l-1}^{-1} - \frac{\boldsymbol{\nu}_{l(l-1)}\boldsymbol{\nu}^T_{l(l-1)}}{1+\rho_{l(l-1)}}\right)\tilde{\mathbf{m}}_l r_l, \qquad (21.52)$$

where

$$\boldsymbol{\nu}_{l(l-1)} = \left(M_{l-1}^TM_{l-1}\right)^{-1}\tilde{\mathbf{m}}_l = P_{l-1}^{-1}\tilde{\mathbf{m}}_l, \qquad (21.53)$$

$$\rho_{l(l-1)} = \tilde{\mathbf{m}}_l^T\left(M_{l-1}^TM_{l-1}\right)^{-1}\tilde{\mathbf{m}}_l = \tilde{\mathbf{m}}_l^TP_{l-1}^{-1}\tilde{\mathbf{m}}_l = \tilde{\mathbf{m}}_l^T\boldsymbol{\nu}_{l(l-1)}. \qquad (21.54)$$
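Equations (21.51)–(21.54) are a standard rank-one (Sherman-Morrison) update. A sketch (illustrative NumPy, names hypothetical) verifying that the updated inverse matches a direct one:

```python
import numpy as np

rng = np.random.default_rng(6)
p = 4
M_prev = rng.standard_normal((7, p))       # M_{l-1}: signatures over first l-1 bands
m_l = rng.standard_normal(p)               # m_tilde_l: new band row of all signatures

P_prev_inv = np.linalg.inv(M_prev.T @ M_prev)   # P_{l-1}^{-1}

# (21.53), (21.54): nu = P_{l-1}^{-1} m_tilde_l and rho = m_tilde_l^T nu
nu = P_prev_inv @ m_l
rho = m_l @ nu

# (21.51): rank-one update of the inverse; no new matrix inversion required
P_l_inv = P_prev_inv - np.outer(nu, nu) / (1.0 + rho)

M_l = np.vstack([M_prev, m_l])             # M_l after appending band l
assert np.allclose(P_l_inv, np.linalg.inv(M_l.T @ M_l))
```

This is exactly what lets RHBP-LSMA fold in each new band with $O(p^2)$ work instead of a fresh $O(p^3)$ inversion.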

21.3.2.4 RHBP-SGA

One of the great potential benefits of RHBP is that it addresses the data dimension-
ality issue, particularly in the application of endmember finding. Since the number
of endmembers to be found in data is assumed to be much smaller than the data
dimensionality, an EFA generally requires DR as a preprocessing method prior to
data processing. Despite the fact that many EFAs have been reported in the
literature, SV-based algorithms have received the most interest, specifically the
N-finder algorithm (N-FINDR) developed by Winter (1999a, b). Unfortunately,
N-FINDR requires excessive computing time and has three practical problems in its
implementation: determining the number of endmembers that need to be found,
finding all endmembers simultaneously as a result of an exhaustive search, and
calculating SV by finding the determinant of an ill-rank matrix. The SGA devel-
oped by Chang et al. (2006) actually solves the first two problems by growing
simplexes to find endmembers one after another instead of finding all endmembers
together at the same time, as N-FINDR was originally designed for. Thus, for SGA
to be effective, the number of endmembers must be estimated reliably. This can be
generally done by VD, developed by Chang (2003) and Chang and Du (2004).
However, the value of VD generally varies with different algorithms (Chap. 5 in
Chang 2013). In addition, Chap. 4 in this book extends VD to TSVD, which can
vary with real targets. Consequently, many possible values can result from various
VD methods. The development of SGA arose precisely out of the need to
allow SGA to adapt its VD values as it grows simplexes. However, SGA also runs
into the same problem as N-FINDR, which calculates SVs by a matrix determinant,
an issue already discussed in Chap. 2 in connection with the use of full bands to
calculate SVs, which requires DR. To avoid using a matrix determinant to calculate
SVs, two approaches, OP-based SGA (OPSGA) in Chap. 11 and Geometric SGA
(GSGA) in Chap. 12, were developed for this purpose. It seems that OPSGA and
GSGA have resolved all three of the aforementioned issues encountered in
N-FINDR. However, we can take another look at this issue. If we assume that
each endmember can only be accommodated by a unique well-characterized spec-
tral band, then the number of endmembers determines the number of bands required
to find endmembers. In other words, this implies that there is no need to use full
bands to find endmembers as all the current EFAs do, including OPSGA and
GSGA. The key issue is finding a spectral band that can be used as a fingerprint
of a particular endmember. An RHBP version of SGA, recursive hyperspectral band
processing of SGA (RHBP-SGA) (Chap. 18) provides a feasible solution. It allows


SGA to be processed band by band progressively to provide profiles of progressive
changes in finding endmembers between bands, while the SGA only needs to be
updated band by band recursively without reprocessing all bands. In order for
RHBP-SGA to be implemented in real time, DR should not be used in data
processing. Thus, RHBP-SGA developed in Chap. 18 is actually RHBP-OPSGA
or RHBP-GSGA, depending on which one is used to replace SGA.

21.3.2.5 RHBP of Iterative Pixel Purity Index

SGA represents one important category of EFAs that use SVs with full abundance
constraints as a criterion to find endmembers. There is another important category
of EFAs that use OP without imposing any abundance constraint as a criterion to
find endmembers. Many algorithms that belong to this category can actually be
derived from the well-known pixel purity index (PPI) developed by Boardman
(1994). For example, there are two general approaches to extending PPI. One is to
redesign PPI as an iterative PPI (IPPI), as developed by Chang and Wu (2015),
where two types of IPPI are further derived as causal IPPI (C-IPPI) and progressive
IPPI (P-IPPI) (Chang 2016) (see also Chap. 19). The other is to reformulate PPI as a
sequential algorithm, such as ATGP or vertex component analysis (VCA), as shown
in Chang et al. (2013). Interestingly, by taking advantage of the concept of VD in
Chap. 4 and ATGP, Chang and Plaza (2006) further derived a fast sequential
algorithm, called fast IPPI (FIPPI) to implement PPI. In fact, using IPPI, we can
interpret FIPPI as a special version of P-IPPI with its initial condition particularly
specified by ATGP-generated target sample vectors. Most recently it was shown in
Chang (2016) and Chang et al. (2016) that ATGP, VCA, and SGA are essentially
the same algorithm. Since ATGP and SGA can be extended to their RHBP versions,
RHBP-ATGP (Chap. 15) and RHBP-SGA (Chap. 18), it would be interesting to
also extend IPPI to RHBP-IPPI in a similar manner. While this may seem
straightforward, it is indeed easier said than done because IPPI involves one more
parameter: a set of randomly generated vectors referred to as skewers. Chapters 19 and
20 address this issue.
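The PPI core operation these algorithms build on can be sketched as follows (illustrative NumPy on synthetic data; the skewer count and data are arbitrary): project every sample onto each random unit skewer and count how often it lands at an extreme.

```python
import numpy as np

rng = np.random.default_rng(7)
L, N, n_skewers = 6, 500, 200
X = rng.standard_normal((L, N))            # data sample vectors as columns

skewers = rng.standard_normal((L, n_skewers))
skewers /= np.linalg.norm(skewers, axis=0) # unit-norm random skewers

proj = skewers.T @ X                       # projections: n_skewers x N
counts = np.zeros(N, dtype=int)            # PPI count per sample
for row in proj:
    counts[np.argmax(row)] += 1            # extreme at the max...
    counts[np.argmin(row)] += 1            # ...and at the min of each skewer

# samples with the largest counts are endmember candidates; IPPI (and an
# RHBP version) would update these counts as skewers or bands are added,
# rather than recomputing them from scratch
candidates = np.argsort(counts)[::-1][:5]
assert counts.sum() == 2 * n_skewers
```

The extra parameter the text mentions is visible here: the whole result depends on the skewer set, which is what a recursive band-by-band version must carry along.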

21.4 Future Work on Hyperspectral Band Processing

No single book, including this one, can cover all the topics in one area. The subjects
presented here were selected to represent a small set of research conducted in the
area of hyperspectral imaging in the Remote Sensing Signal and Image Processing
Laboratory (RSSIPL) at the University of Maryland, Baltimore County (UMBC).
As a result, many interesting topics were unfortunately left out and could not be
included in the book. In what follows, we describe some research currently being
undertaken at the RSSIPL.

21.4.1 Multispectral Imaging by Nonlinear Band Dimensionality Expansion

Multispectral imaging (MSI) has been widely used in many applications, for
example, climate change detection, environmental monitoring, forest and natural
resource management, geological surveying, agriculture land cover/use, and GIS, to
name just a few. However, owing to its low spectral/spatial resolution, MSI also has
limited applications in other areas, such as subpixel target detection, mixed pixel
classification, and quantification. To expand MSI’s capability, hyperspectral imag-
ing (HSI) has recently emerged as a new remote sensing technique that is developed
to address MSI’s drawbacks. In the early stages in the development of HSI
techniques, many MSI techniques were extended for this purpose, most notably,
maximum likelihood classification and principal components analysis. However, as
pointed out in Chang (2013), significant differences exist between MSI and HSI.
A natural extension of MSI to HSI is generally not effective since the two methods
have completely different design philosophies. Specifically, MSI is considered a
spatial domain-based analysis (referred to as literal analysis) technique, as opposed
to HSI, which is considered a spectral domain-based analysis technique (referred to
as nonliteral analysis) (Chang 2003). For example, owing to the large number of
spectral bands used by HSI sensors, dealing with the enormous amount of
hyperspectral data volumes becomes an issue, which is not an issue for MSI. In
this case, data reduction or data compression is generally used. This in turn gives
rise to new challenges, for example, how many bands or dimensions must be
determined and which bands or dimensions need be selected; neither of these issues
comes up in MSI. In addition, the very high spectral resolution provided by HSI
sensors can further reveal many subtle material substances that cannot be detected
by MSI sensors, such as endmembers, subpixel targets, and mixed pixels. None of
the aforementioned issues can be solved by spatial domain-based techniques.
Chang (2003) is believed to be the first to apply statistical signal processing theory
to derive various signal processing techniques for addressing the issues in effective
data dimensionality by VD, subpixel target detection, and mixed pixel classifica-
tion/quantification. It was further expanded comprehensively by Chang (2013) to
address issues in endmember extraction, data compression, and reduction, as well
as real-time processing capabilities in Chang (2016). With such a rich theory
developed for HSI, can we take advantage of well-established spectral domain-based
HSI algorithms to resolve issues that cannot be resolved by MSI, specifically
target detection? To do that, a major issue is coping with the insufficient
number of bands available in multispectral images. This chapter
considers that issue in multispectral target detection.
First, we need to come up with a means of expanding band dimensionality
(BD) from the original set of multispectral images. One idea is derived from a
random process that can generate statistical moments of any order by correlating
itself. Such a band expansion process is referred to as a correlation band expansion
process (CBEP). Let {B_k^CBEP} denote such correlated band images. The first attempt
at using CBEP was made in Ren and Chang (2000) and Chang (2013), where BD
was expanded via nonlinear correlations such as autocorrelation or cross correlation
among bands for multispectral classification. Another idea is to use band ratio
(BR) to create nonlinear band images. The use of BRs comes from the idea of
enhancing the spectral differences between bands as well as reducing the effects of
topography (Jensen 1996). It is performed by dividing one spectral band by another
to produce an image that can provide relative band intensities. Such band ratioed
images generally enhance the spectral differences between bands. Such a band
expansion process is referred to as a band ratioed expansion process (BREP), and
the resulting band set is denoted by {B_k^BREP}. These two nonlinearly generated band
sets are then combined to create a nonlinear band expansion (NBE) set, denoted by
$\Omega_{\text{NBE}} = \{B_k^{\text{CBEP}}\} \cup \{B_k^{\text{BREP}}\}$. Adding $\Omega_{\text{NBE}}$ to the original multispectral band set
$\Omega_{\text{MS}} = \{B_l\}$ yields a new set of expanded multispectral band images,
$\Omega_{\text{EMS}} = \Omega_{\text{NBE}} \cup \Omega_{\text{MS}}$, which should have sufficient band images for
HSI processing.
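A minimal sketch of what CBEP- and BREP-style band expansion could look like for a single pixel vector (illustrative NumPy; the particular correlations and ratios chosen here are examples, not the book's exact set):

```python
import numpy as np

def nonlinear_band_expansion(r, eps=1e-6):
    """Expand a multispectral pixel r = (r_1,...,r_L) with CBEP and BREP bands."""
    r = np.asarray(r, dtype=float)
    L = len(r)
    # CBEP-style bands: auto- and cross-correlation terms r_i * r_j
    cbep = [r[i] * r[j] for i in range(L) for j in range(i, L)]
    # BREP-style bands: band ratios r_i / r_j (eps guards against division by zero)
    brep = [r[i] / (r[j] + eps) for i in range(L) for j in range(L) if i != j]
    return np.concatenate([r, cbep, brep])    # Omega_EMS = MS bands + NBE bands

r = np.array([0.2, 0.5, 0.7, 0.4])            # a 4-band multispectral pixel
expanded = nonlinear_band_expansion(r)
print(len(expanded))                          # 4 + 10 + 12 = 26 expanded bands
```

Applying such a function pixel by pixel turns a 4-band multispectral cube into one with enough (albeit correlated) bands for spectral domain-based HSI algorithms to operate on.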
Since ΩNBE is not a new set of bands but rather created by an expanded ΩMS
through nonlinear functions, the issue of nonlinearity may come up with ΩNBE. To
further explore such nonlinearity, two processes are developed to generalize oper-
ators designed to perform data analysis. Two types of generalization are derived.
One is kernelization, which takes advantage of nonlinear kernels to design kernel-
based detectors that can solve linear nonseparability problems. The other is to
iterate detectors to filter out unwanted data to achieve better performance. For
each detector, there are two ways to perform detection depicted in Fig. 21.3: (a) the
detection band images are fed back to be included as part of NBE iteratively or
(b) the detection band images are only fed back to the detector iteratively. Such a
process is referred to as NBE iterative multispectral imaging (NBE-IMSI).
Many applications of interest can be found via NBE-IMSI, such as subpixel
detection, AD, and mixed pixel classification/quantification.

Fig. 21.3 Diagram showing two ways of performing NBE-IMSI: (a) MS data is expanded by NBE and fed to the detector, with the detection band images fed back into the NBE set; (b) the same chain, with the detection band images fed back only to the detector



21.4.2 Hyperspectral Single Band Selection

BS has been widely used for data reduction while retaining data integrity. It is
different from DR in that DR requires the entire data set to perform some sort of
linear or nonlinear transformation to reduce data volumes, whereas BS
completely discards unselected bands. However, performing BS requires prior
knowledge of the number of bands to be selected, p, followed by solving an
optimization problem to find a set of p optimal bands,
Ωp. Thus, two challenging issues arise in implementing BS. One is determining the
value of p. The other is that if the value of p changes, the set of Ωp also changes. In
other words, we cannot take advantage of previous Ωp to find a subsequent Ωq when
p < q since a different value of p produces a different set of Ωp. As a consequence,
when it comes to different applications that require different values of p, BS must
be reimplemented over and over again. Thus, BS is practically inapplicable. The
same issues also arise in DR as well. To address these dilemmas, the new concept of
a progressive process was recently introduced into DR and BS, referred to as a
progressive dimensionality reduction process (PDRP) in Chap. 20 in Chang (2013)
and progressive band dimensionality process (PBDP) in Chap. 21 in Chang (2013),
where DR and BS can be processed progressively back and forth in a backward or a
forward manner. With the interpretation of PDRP and PBDP, DR and BS were
further extended to progressive spectral dimensionality reduction in Chang and
Safavi (2011) and progressive band selection (PBS) in Chap. 23 in Chang (2013).

21.4.3 Hyperspectral Band Subset Processing

Two new approaches are currently being investigated. One is band fusion (BF),
which fuses results obtained by different band subsets. It has great potential in
applications of satellite communication and transmission when more than one
ground station can be used to receive data from different band sets and process
data individually and separately. Then the obtained results can be further fused
together without reprocessing all the bands.
Another is called band tuning (BT), which allows users to communicate data
band by band back and forth without reimplementing BS. Using a radio as an
example, with a band considered a radio station, BT is similar to allowing a listener
to tune the radio to find a station of interest, as opposed to BS, which only allows
listeners to preselect particular radio stations by pushing preset buttons. Accord-
ingly, BT can be considered a contiguous band process, unlike BS, which can be
considered a discrete band process. Thus, BT is more appropriate for hyperspectral
data that are also acquired by contiguous spectral bands. Two types of BT can be
developed for the purpose of data communication. One is sequential BT (SBT),
which finds a particular band of interest that is crucial in a specific application. This
in turn leads to another new concept, band determination by tuning (BDT), which is
also different from BS in the sense that BDT finds bands of interest via tuning
without prior knowledge, compared to BS, which requires prior knowledge of p and
selects p optimal bands via optimization. Consequently, technically speaking, BF is
used for band determination, not for BS. Another type of BF is to implement BF in
conjunction with PSDP to further develop progressive BF (PBT), in which case
previously tuned bands are always included as part of the band set to be tuned later.
If such PBT is implemented via band prioritization and band decorrelation (Chang
et al. 1999), the resulting PBT turns out to be PBS (Chang and Liu 2014).
BT has several advantages over BS. First and foremost is BDT, which finds bands
of interest without prior knowledge via tuning, and in particular, without having to
know the value of p. The band to be found is determined by a specific application, not
criteria for BS. Second, BT is performed back and forth via tuning by a custom-
designed recursive equation, rather than by the band prioritization used in PBDP or
by solving an optimization problem as in BS, both of which must be completed
before BS can be carried out. Third, BF
has potential applications in satellite communication, where BF can be performed by
ground stations individually via data downlinks, and the results can be updated via
recursive equations by incoming bands without reprocessing already included bands.
Fourth, BT can be implemented as SBT so that a band can be ranked according to a
specific application in the same way that features are identified in Chang et al. (1999).
Finally, BT can also be implemented as PBS, where bands can be selected back and
forth via band prioritization in a progressive manner without reprocessing BS as the
number of bands to be selected, the value of p, changes.
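A minimal sketch of what such a recursive equation can look like: when band l + 1 arrives, the band correlation matrix grows by one row and one column, so its inverse can be updated from the previous inverse via the block (Schur complement) matrix inverse identity instead of being recomputed from scratch. The variable names and the use of the sample correlation matrix X^T X / N are assumptions of this sketch, not the book's RHBP derivations.

```python
import numpy as np

def grow_inverse(A_inv, b, c):
    """Given A^{-1} for the current l x l band correlation matrix A, return
    the inverse of the (l+1) x (l+1) matrix [[A, b], [b^T, c]] obtained when
    one new band arrives, using the block (Schur complement) inverse
    identity -- no full re-inversion needed."""
    u = A_inv @ b
    gamma = 1.0 / (c - b @ u)                # scalar Schur complement inverse
    top_left = A_inv + gamma * np.outer(u, u)
    return np.block([[top_left,            -gamma * u[:, None]],
                     [-gamma * u[None, :], np.array([[gamma]])]])

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))               # 200 pixels, bands arriving one by one
R_inv = np.array([[1.0 / np.mean(X[:, 0] ** 2)]])  # start from the first band
for l in range(1, 5):
    R = X[:, :l + 1].T @ X[:, :l + 1] / len(X)     # correlation over l+1 bands
    R_inv = grow_inverse(R_inv, R[:l, l], R[l, l])
    assert np.allclose(R_inv, np.linalg.inv(R))
```

Because each update costs O(l^2) rather than the O(l^3) of a fresh inversion, bands can be taken in band by band as they arrive.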
Appendix A: Matrix Identities

(a) Woodbury's Matrix Identity

$$
\left(\mathbf{A} + \mathbf{u}\mathbf{v}^T\right)^{-1}
= \mathbf{A}^{-1} - \frac{\mathbf{A}^{-1}\mathbf{u}\mathbf{v}^T\mathbf{A}^{-1}}{1 + \mathbf{v}^T\mathbf{A}^{-1}\mathbf{u}}
\tag{A.1}
$$

(b) Matrix Inverse Identity

$$
\left(\mathbf{A} + \mathbf{B}\mathbf{C}\mathbf{D}\right)^{-1}
= \mathbf{A}^{-1} - \mathbf{A}^{-1}\mathbf{B}\left(\mathbf{D}\mathbf{A}^{-1}\mathbf{B} + \mathbf{C}^{-1}\right)^{-1}\mathbf{D}\mathbf{A}^{-1}
\tag{A.2}
$$

(c) Matrix-Vector Inverse Identity

$$
\begin{bmatrix}
\mathbf{U}^T\mathbf{U} & \mathbf{U}^T\mathbf{d}\\
\mathbf{d}^T\mathbf{U} & \mathbf{d}^T\mathbf{d}
\end{bmatrix}^{-1}
=
\begin{bmatrix}
\left(\mathbf{U}^T\mathbf{U}\right)^{-1} + \beta\,\mathbf{U}^{\#}\mathbf{d}\mathbf{d}^T\left(\mathbf{U}^{\#}\right)^T & -\beta\,\mathbf{U}^{\#}\mathbf{d}\\
-\beta\,\mathbf{d}^T\left(\mathbf{U}^{\#}\right)^T & \beta
\end{bmatrix},
\tag{A.3}
$$

where $\mathbf{U}^{\#} = \left(\mathbf{U}^T\mathbf{U}\right)^{-1}\mathbf{U}^T$ and
$\beta = \left\{\mathbf{d}^T\left[\mathbf{I} - \mathbf{U}\left(\mathbf{U}^T\mathbf{U}\right)^{-1}\mathbf{U}^T\right]\mathbf{d}\right\}^{-1}
= \left(\mathbf{d}^T\mathbf{P}_{\mathbf{U}}^{\perp}\mathbf{d}\right)^{-1}$.

(d) Matrix Inverse Identity (block form)

$$
\begin{bmatrix}
\mathbf{A} & \mathbf{B}\\
\mathbf{B}^T & \mathbf{C}
\end{bmatrix}^{-1}
=
\begin{bmatrix}
\mathbf{A}^{-1} + \mathbf{A}^{-1}\mathbf{B}\boldsymbol{\Gamma}\mathbf{B}^T\mathbf{A}^{-1} & -\mathbf{A}^{-1}\mathbf{B}\boldsymbol{\Gamma}\\
-\boldsymbol{\Gamma}\mathbf{B}^T\mathbf{A}^{-1} & \boldsymbol{\Gamma}
\end{bmatrix},
\tag{A.4}
$$

where $\boldsymbol{\Gamma} = \left(\mathbf{C} - \mathbf{B}^T\mathbf{A}^{-1}\mathbf{B}\right)^{-1}$.
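These identities can be checked numerically. The sketch below verifies the rank-one Woodbury identity (A.1) and the general identity (A.2) for randomly generated, well-conditioned matrices; the dimensions and random seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 5, 3
A = rng.normal(size=(n, n)) + n * np.eye(n)   # keep A well conditioned
u, v = rng.normal(size=n), rng.normal(size=n)

# (A.1) Woodbury rank-one (Sherman-Morrison) identity
Ai = np.linalg.inv(A)
lhs = np.linalg.inv(A + np.outer(u, v))
rhs = Ai - (Ai @ np.outer(u, v) @ Ai) / (1.0 + v @ Ai @ u)
assert np.allclose(lhs, rhs)

# (A.2) general identity for (A + BCD)^{-1}
B = rng.normal(size=(n, k))
C = rng.normal(size=(k, k)) + k * np.eye(k)
D = rng.normal(size=(k, n))
lhs = np.linalg.inv(A + B @ C @ D)
rhs = Ai - Ai @ B @ np.linalg.inv(D @ Ai @ B + np.linalg.inv(C)) @ D @ Ai
assert np.allclose(lhs, rhs)
```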

Glossary

A
AD Anomaly detection, Chap. 6
ALMM Adaptive linear mixing model, Chap. 9
ALSMA Adaptive linear spectral mixture analysis, Chap. 9
ANC Abundance nonnegativity constraint, Chap. 9
ARHBP Adaptive recursive hyperspectral band processing, Chap. 9
ASC Abundance sum-to-one constraint, Chap. 9
ATGP Automatic target generation process, Chap. 4
AVIRIS Airborne visible/infrared imaging spectrometer, Chap. 1
C
CBK-AD Causal band K-RXD, Chap. 14
CBR-AD Causal band R-RXD, Chap. 14
CBRCM Causal band correlation matrix, Chaps. 13 and 14
CEM Constrained energy minimization, Chap. 5
CK-AD Causal K-AD, Chap. 14
CLCRM Causal line correlation matrix, Chap. 5
CR-AD Causal R-AD, Chap. 6
CSCRM Causal sample correlation matrix, Chap. 5
CSCVM Causal sample covariance matrix, Chaps. 5 and 6
D
DSV Determinant-based simplex volume, Chap. 2
Dist-SGA Distance-based simplex growing algorithm
DSGA Determinant-based SGA, Chaps. 2, 11, and 12
DR Dimensionality reduction
E
EFA Endmember finding algorithm
EIDA Endmember identification algorithm


F
FCLS Fully constrained least-squares method, Chap. 9
FPGA Field Programmable Gate Array
G
GSGA Geometric simplex growing algorithm, Chaps. 12 and 18
GSVA Growing simplex volume analysis, Chaps. 12 and 18
GSV Geometric simplex volume, Chap. 2
GSV-OP Geometric simplex volume by orthogonal projection, Chaps. 11 and 12
GSV-SH Geometric simplex volume by simplex height, Chap. 12
GSV-PD Geometric simplex volume by perpendicular distance, Chaps. 11 and 12
H
HFC Harsanyi–Farrand–Chang, Chap. 4
HOS High-order statistics
HSI Hyperspectral imaging
HYDICE HYperspectral Digital Imagery Collection Experiment, Chap. 1
I
IBSI Interband spectral information, Chap. 4
IPPI Iterative pure pixel index, Chap. 18
K
KF-OSP-GSGA Kalman filter–based orthogonal subspace projection geometric
simplex growing algorithm, Chap. 12
KF-OVP-GSGA Kalman filter–based orthogonal vector projection geometric
simplex growing algorithm, Chap. 12
K-AD Anomaly detection using autocovariance matrix K, Chap. 5
L
LCMV Linearly constrained minimum variance, Chap. 5
LCVF Lunar Crater Volcanic Field, Chap. 1
LSE Least-squares error
LSU Linear spectral unmixing
LSMA Linear spectral mixture analysis, Chap. 9
M
MEAC Minimum estimated abundance covariance, Chap. 10
MLE Maximum likelihood estimation, Chap. 10
MVT Minimum volume transform
N
NCLS Nonnegativity constrained least-squares method, Chaps. 4 and 9
N-FINDR N-Finder algorithm
NPD Neyman–Pearson detection/detector
NWHFC Noise-whitened Harsanyi–Farrand–Chang, Chap. 4

O
OP Orthogonal projection
OPSGA Orthogonal projection–based simplex growing algorithm, Chap. 11
OSP Orthogonal subspace projection
P
P-AD Progressive anomaly detection, Chap. 6
PHBP Progressive hyperspectral band processing, Chaps. 14–20
P-CEM Progressive constrained energy minimization, Chap. 5
PKP Progressive skewer set processing, Chaps. 19 and 20
PPI Pixel purity index
PSP Progressive sample processing, Chap. 1
R
R-AD RXD using autocorrelation matrix R, Chap. 5
RHBP Recursive hyperspectral band processing, Chap. 1
RHBP-AD Recursive hyperspectral band processing of anomaly detection, Chap. 14
RHBP-ATGP Recursive hyperspectral band processing of the automatic target
generation process, Chap. 15
RHBP-CEM Recursive hyperspectral band processing of constrained energy
minimization, Chap. 13
RHBP-C-IPPI Recursive hyperspectral band processing of causal iterative pixel
purity index, Chap. 19
RHBP-FIPPI Recursive hyperspectral band processing of fast iterative pixel
purity index, Chap. 20
RHBP-GSGA Recursive hyperspectral band processing of geometric simplex
growing algorithm, Chap. 18
RHBP-GSVA Recursive hyperspectral band processing of growing simplex vol-
ume analysis, Chap. 18
RHBP-LSMA Recursive hyperspectral band processing of linear spectral mixture
analysis, Chap. 17
RHBP-OSP Recursive hyperspectral band processing of orthogonal subspace
projection, Chap. 16
RHBP-PS-IPPI Recursive hyperspectral band processing of progressive-skewer
iterative pixel purity index, Chap. 19
RHBP-SGA Recursive hyperspectral band processing of simplex growing algo-
rithm, Chap. 18
RHSP Recursive hyperspectral sample processing, Chap. 1
RHSP-ATGP Recursive hyperspectral sample processing of the automatic target
generation process, Chap. 7
RHSP-GSGA Recursive hyperspectral sample processing of geometric simplex
growing algorithm, Chap. 12
RHSP-LSMA Recursive hyperspectral sample processing of linear spectral mix-
ture analysis, Chap. 9
RHSP-MLE Recursive hyperspectral sample processing of maximum likelihood
estimation, Chap. 10
RHSP-OPSGA Recursive hyperspectral sample processing of orthogonal
projection–based simplex growing algorithm, Chap. 11
RHSP-OSP Recursive hyperspectral sample processing of orthogonal subspace
projection, Chap. 8
ROC Receiver operating characteristic
RSP Recursive skewer processing, Chap. 20
RXD RX detector, Chap. 5
RT Real time
S
SAM Spectral angle mapper
SC N-FINDR SuCcessive N-FINDR
SGA Simplex growing algorithm, Chap. 2
SID Spectral information divergence
SNR Signal-to-noise ratio
SQ N-FINDR SeQuential N-FINDR
SV Simplex volume
SVGA Simplex volume growing analysis, Chaps. 11 and 12
T
TE Target embeddedness
TI Target implantation
TSVD Target-specified virtual dimensionality, Chap. 4
U
UFCLS Unsupervised fully constrained least-squares, Chap. 4
UNCLS Unsupervised nonnegativity constrained least-squares, Chap. 4
UROSP Unsupervised recursive orthogonal subspace projection, Chap. 8
V
VCA Vertex component analysis, Chap. 11
VD Virtual dimensionality, Chap. 4
VS Virtual signature, Chaps. 9 and 10
References

Acito, N., M. Diani, and G. Corsini. 2009. A new algorithm for robust estimation of the signal
subspace in hyperspectral images in presence of rare signal components. IEEE Transactions on
Geoscience and Remote Sensing 47(11): 3844–3856.
———. 2010. Hyperspectral signal subspace identification in the presence of rare signal compo-
nents. IEEE Transactions on Geoscience and Remote Sensing 48(4): 1940–1954.
Adams, J.B., and M.O. Smith. 1986. Spectral mixture modeling: a new analysis of rock and soil
types at the Viking Lander 1 site. Journal of Geophysical Research 91(B8): 8098–8112.
Adams, J.B., M.O. Smith, and A.R. Gillespie. 1989. Simple models for complex natural surfaces: a
strategy for hyperspectral era of remote sensing. In Proceedings of IEEE international geo-
science and remote sensing symposium ‘89, 16–21.
———. 1993. Image spectroscopy: interpretation based on spectral mixture analysis. In Remote
geochemical analysis: elemental and mineralogical composition, ed. C.M. Pieters and
P.A. Englert, 145–166. Cambridge: Cambridge University Press.
Ambikapathi, A., T.H. Chan, C.-Y. Chi, and K. Keizer. 2013. Hyperspectral data geometry-based
estimation of number of endmembers using p-norm-based pure pixel identification algorithm.
IEEE Transactions on Geoscience and Remote Sensing 51(5): 2753–2769.
Anderson, T.W. 1984. An introduction to multivariate statistical analysis, 2nd ed. New York:
Wiley.
Ashton, E.A., and A. Schaum. 1998. Algorithms for the detection of sub-pixel targets in multi-
spectral imagery. Photogrammetric Engineering and Remote Sensing 64(7): 723–731.
Bajorski, P. 2009. Does virtual dimensionality work in hyperspectral images? In Algorithms and
technologies for multispectral, hyperspectral, ultraspectral imagery XV, SPIE, vol. 7334,
73341J-1–73341J-11, 16–19 April, Orlando, FL.
Basedow, R., P. Silverglate, W. Rappoport, R. Rockwell, D. Rosenberg, K. Shu, R. Whittlesey,
and E. Zalewski. 1992. The HYDICE instrument design. In Proceedings of international
symposium on spectral sensing research, vol. 1, 430–445.
Bates, C., and B. Curtis. 1996. A method for manual endmember selection and spectral unmixing.
Remote Sensing of Environment 55: 229–243.
Bates, C., A.P. Asner, and C.A. Wessman. 2000. Endmember bundles: a new approach to
incorporating endmember variability into spectral mixture analysis. IEEE Transactions on
Geoscience and Remote Sensing 38: 1083–1094.
Bauman, J., E. Blasch, J. Jackson, and G. Sterling. 2005. Real-time ROC: a near-real-time
performance evaluation tool. In SPIE 05, vol. 5807, 380–390.
Bayliss, J., J.A. Gualtieri, and R.F. Cromp. 1997. Analyzing hyperspectral data with independent
component analysis. In Proceedings of SPIE, vol. 3240, 133–143.


Behrens, R.T., and L.L. Scharf. 1994. Signal processing applications of oblique projections
operators. IEEE Transactions on Signal Processing 42(6): 1413–1423.
Berman, M., H. Kiiveri, R. Lagerstrom, A. Ernst, R. Dunne, and J.F. Huntington. 2004. ICE: a
statistical approach to identifying endmembers in hyperspectral images. IEEE Transactions on
Geoscience and Remote Sensing 42(10): 2085–2095.
Bernabe, S., S. Lopez, A. Plaza, R. Sarmiento, and P.G. Rodriguez. 2011. FPGA design of an
automatic target generation process for hyperspectral image analysis. In 2011 I.E. 17th
international conference on parallel and distributed systems (ICPADS), 7–9 December,
Tainan, Taiwan, 1010–1015.
Bernabe, S., S. Lopez, A. Plaza, and R. Sarmiento. 2013. GPU Implementation of an automatic
target detection and classification algorithm for hyperspectral image analysis. IEEE Geosci-
ence and Remote Sensing Letters 10(2): 221–225.
Bioucas-Dias, J.M., and J. Nascimento. 2005. Estimation of signal subspace on hyperspectral
data. In Proceedings of SPIE, vol. 5982, Bruges, Belgium, 191–198.
———. 2008. Hyperspectral subspace identification. IEEE Transactions on Geoscience and
Remote Sensing 46(8): 2435–2445.
Bioucas-Dias, J.M., A. Plaza, N. Dobigeon, M. Parente, Q. Du, P. Gader, and J. Chanussot. 2012.
Hyperspectral unmixing overview: geometrical, statistical, and sparse regression-based
approaches. IEEE Journal of Selected Topics in Applied Earth Observations and Remote
Sensing 5(2): 354–379.
Boardman, J.W. 1989. Inversion of imaging spectrometry data using singular value decomposi-
tion. In Proceedings of IEEE symposium on geoscience and remote sensing, 2069–2072.
———. 1990. Inversion of high spectral resolution data. Proceedings of SPIE 1298: 222–233.
———. 1993. Automated spectral unmixing of AVIRIS data using convex geometry concepts. In
Summaries, fourth JPL airborne geoscience workshop, JPL Publication 93-26, vol. 1, 11–14.
———. 1994. Geometric mixture analysis of imaging spectrometry data. In International geosci-
ence and remote sensing symposium, vol. 4, 2369–2371.
———. 1998. Leveraging the high dimensionality of AVIRIS data for improved sub-pixel target
unmixing and rejection of false positive: mixture tuned matched filtering. In Summaries of
seventh annual JPL earth science workshop, JPL Publication 98-4, vol. 1.
Boardman, J.W., F.A. Kruse, and R.O. Green. 1995. Mapping target signatures via partial
unmixing of AVIRIS data. In Summaries, fifth JPL airborne geoscience workshop, JPL
Publication 23-26.
Boker, L., S.R. Rotman, and D.G. Blumberg. 2008. Coping with mixtures of backgrounds in a
sliding window anomaly detection algorithm. In Proceedings of SPIE, vol. 7113, 711315-
1–711315-12.
Bowles, J.P., and D.B. Gillis. 2007. An optical real-time adaptive spectral identification system. In
Hyperspectral data exploitation, ed. C.-I Chang, 77–106.
Bro, R., and S.D. Jong. 1997. A fast non-negativity-constrained least squares algorithm. Journal of
Chemometrics 11: 393–401.
Broadwater, J., and A. Banerjee. 2009. A Neyman-Pearson approach to estimating the number of
endmembers. In Proceedings of IEEE IGARSS, vol. 4, IV-693–IV-696.
Broadwater, J., J.R. Chellappa, A. Banerjee, and P. Burlina. 2007. Kernel fully constrained least
squares abundance estimates. In 2007 I.E. international geoscience and remote sensing
symposium (IGARSS), 23–28 July, 4041–4044.
Brumbley, C. 1998. Kalman filtering and subspace projection approaches to multispectral and
hyperspectral image classification. Baltimore, MD: Department of Computer Science and
Electrical Engineering, University of Maryland, Baltimore County.
Brumbley, C., and C.-I Chang. 1999. An unsupervised vector quantization-based target signature
subspace projection approach to classification and detection in unknown background. Pattern
Recognition 32(7): 1161–1174.
Cawse, K., M. Sears, A. Robin, S.B. Damelin, K. Wessels, F. van den Bergh, and R. Mathieu.
2010. Using Random Matrix Theory to determine the number of endmembers in a
hyperspectral image. In Hyperspectral image and signal processing: evolution in remote
sensing (WHISPERS), 14–16 June 2010, 1–4.
Cawse-Nicholson, K., A. Robin, and M. Sears. 2013. The effect of correlation determining
intrinsic dimensionality of a hyperspectral image. IEEE Journal of Selected Topics in Applied
Earth Observations and Remote Sensing 6(2): 482–487.
Chai, J.W., J. Wang, and C.-I Chang. 2007. Mixed PCA/ICA transform for hyperspectral image
analysis. Optical Engineering 46(7): 077006-1–077006-13.
Chakravarty, S., and C.-I Chang. 2008a. Block truncation signature coding for hyperspectral
analysis. In ISSSR, August 23–27, San Diego, CA
———. 2008b. Band selection for hyperspectral signature coding. In International symposium
spectral sensing research (ISSSR), June 23–27, Stevens Institute of Technology, NJ.
Chan, T.H., W.-K. Ma, C.-Y. Chi, et al. 2009a. Hyperspectral unmixing from a convex analysis
and optimization perspective. In First workshop on hyperspectral image and signal
processing: evolution in remote sensing, WHISPERS ‘09.
Chan, T.-H., C.-Y. Chi, Y.-M. Huang, and W.-K. Ma. 2009b. A convex analysis-based minimum-
volume enclosing simplex algorithm for hyperspectral unmixing. IEEE Transactions on Signal
Processing 57(11): 4418–4432.
Chang, C.-I. 1998. Further results on relationship between spectral unmixing and subspace
projection. IEEE Transactions on Geoscience and Remote Sensing 36(3): 1030–1032.
———. 1999. Least squares error theory for linear mixing problems with mixed pixel classifica-
tion for hyperspectral imagery. In Recent research developments in optical engineering,
ed. S.G. Pandalai, vol. 2, 241–268. Trivandrum: Research Signpost, India.
———. 2000. An information theoretic-based approach to spectral variability, similarity and
discriminability for hyperspectral image analysis. IEEE Transactions on Information Theory
46(5): 1927–1932.
———. 2002a. Relationship among orthogonal subspace projection, constrained energy minimi-
zation and RX-algorithm. In SPIE conference on algorithms and technologies for multispec-
tral, hyperspectral and ultraspectral imagery VIII, SPIE, vol. 4725, Orlando, FL, 1–5 April.
———. 2002b. Target signature-constrained mixed pixel classification for hyperspectral imagery.
IEEE Transactions on Geoscience and Remote Sensing 40(2): 1065–1081.
———. 2003a. Hyperspectral imaging: techniques for spectral detection and classification.
Dordrecht/New York: Kluwer Academic/Plenum.
———. 2003b. How to effectively utilize information to design hyperspectral target detection and
classification algorithms. In Workshop in honor of Professor David Landgrebe on advances in
techniques for analysis of remotely sensed data, NASA Goddard Visitor Center, Washington
DC, October 27–28.
———. 2005. Orthogonal subspace projection revisited: a comprehensive study and analysis.
IEEE Transactions on Geoscience and Remote Sensing 43(3): 502–518.
———. 2006a. Hand held device detects chemical and biological warfare agents. doi:10.1117/2.
1200612.0507, ISSN 1818-2559. http://newsroom.spie.org/x5237.xml.
———. 2006b. Exploration of virtual dimensionality in hyperspectral image analysis. In Algo-
rithms and technologies for multispectral, hyperspectral, and ultraspectral imagery XII, SPIE
defense and security symposium, Orlando, FL, April 17–21.
———, ed. 2006c. In Recent advances in hyperspectral signal and image processing, ed.
C.-I Chang. Trivandrum: Research Signpost, India.
———. 2006d. Utility of virtual dimensionality in hyperspectral signal/image processing. In
Recent advances in hyperspectral signal and image processing, ed. C.-I Chang. Trivandrum:
Research Signpost, India.
———, ed. 2007a. Hyperspectral data exploitation: theory and applications. New York: Wiley.
———. 2007b. Overview. In Hyperspectral data exploitation: theory and applications, ed. C.-I
Chang, 1–16. New York: Wiley.

———. 2007c. Information-processed matched filters for hyperspectral target detection and
classification. In Hyperspectral data exploitation: theory and applications, ed. C.-I Chang,
47–74. New York: Wiley.
———. 2008a. Hyperspectral imaging: an emerging technique in remote sensing. In International
symposium spectral sensing research (ISSSR), June 23–27, Stevens Institute of Technology, NJ.
———. 2008b. Unsupervised linear hyperspectral unmixing. In International symposium spectral
sensing research (ISSSR), June 23–27, Stevens Institute of Technology, NJ.
———. 2008c. Three dimensional receiver operating characteristic (3D ROC) analysis for
hyperspectral signal detection and estimation. In International symposium spectral sensing
research (ISSSR), June 23–27, Stevens Institute of Technology, NJ.
———. 2009. Virtual dimensionality for hyperspectral imagery. SPIE Newsroom, September 28.
doi:10.1117/2.1200909.1749.
———. 2010. Multiple-parameter receiver operating characteristic analysis for signal detection
and classification. IEEE Sensors Journal 10(3): 423–442 (Invited paper).
———. 2013. Hyperspectral data processing: algorithm design and analysis. Hoboken: Wiley.
———. 2016. Real time progressive hyperspectral image processing: endmember finding and
anomaly detection. New York: Springer.
Chang, C.-I, and C. Brumbley. 1997. An orthogonalization target signature space projection
approach to image classification in unknown background. In 31st Conference on information
sciences and systems, The Johns Hopkins University, 174–178.
———. 1999a. A Kalman filtering approach to multispectral image classification and detection of
changes in signature abundance. IEEE Transactions on Geoscience and Remote Sensing 37(1):
257–268.
———. 1999b. Linear unmixing Kalman filtering approach to signature abundance detection,
signature estimation and subpixel classification for remotely sensed images. IEEE Transac-
tions on Aerospace and Electronics Systems 37(1): 319–330.
Chang, C.-I, and S.-S. Chiang. 2002. Anomaly detection and classification for hyperspectral
imagery. IEEE Transactions on Geoscience and Remote Sensing 40(2): 1314–1325.
Chang, C.-I, and Q. Du. 1999a. A noise subspace projection approach to determination of intrinsic
dimensionality for hyperspectral imagery. In EOS/SPIE symposium on remote sensing, con-
ference on image and signal processing for remote sensing V, SPIE, vol. 3871, Florence, Italy,
September 20–24, 34–44.
———. 1999b. Interference and noise adjusted principal components analysis. IEEE Transactions
on Geoscience and Remote Sensing 37(5): 2387–2396.
———. 2004. Estimation of number of spectrally distinct signal sources in hyperspectral imagery.
IEEE Transactions on Geoscience and Remote Sensing 42(3): 608–619.
Chang, C.-I, and D. Heinz. 2000. Constrained subpixel detection for remotely sensed images.
IEEE Transactions on Geoscience and Remote Sensing 38(3): 1144–1159.
Chang, C.-I, and M. Hsueh. 2006. Characterization of anomaly detection for hyperspectral
imagery. Sensor Review 26(2): 137–146.
Chang, C.-I, and B. Ji. 2006a. Weighted least squares error approaches to abundance-constrained
linear spectral mixture analysis. IEEE Transactions on Geoscience and Remote Sensing 44(2):
378–388.
———. 2006b. Fisher’s linear spectral mixture analysis. IEEE Transactions on Geoscience and
Remote Sensing 44(8): 2292–2304.
Chang, C.-I, and Y. Li. 2016. Recursive band processing of automatic target generation process for
subpixel detection in hyperspectral imagery. IEEE Transactions on Geoscience and Remote
Sensing 54(9): 5081–5094.
Chang, C.-I, and K.-H. Liu. 2014. Progressive band selection for hyperspectral imagery. IEEE
Transactions on Geoscience and Remote Sensing 52(4): 2002–2017.
Chang, C.-I, and A. Plaza. 2006. Fast iterative algorithm for implementation of pixel purity index.
IEEE Geoscience and Remote Sensing Letters 3(1): 63–67.

Chang, C.-I, and H. Ren. 1999. Linearly constrained minimum variance beamforming for target
detection and classification in hyperspectral imagery. In IEEE 1999 international geoscience
and remote sensing symposium, Hamburg, Germany, 28 June–2 July, 1241–1243.
———. 2000. An experiment-based quantitative and comparative analysis of hyperspectral target
detection and image classification algorithms. IEEE Transactions on Geoscience and Remote
Sensing 38(2): 1044–1063.
Chang, C.-I, and H. Safavi. 2011. Progressive dimensionality reduction by transform for
hyperspectral image analysis. Pattern Recognition 44(10): 2760–2773. doi:10.1016/j.patcog.
2011.03.030.
Chang, C.-I, and S. Wang. 2006. Constrained band selection for hyperspectral imagery. IEEE
Transactions on Geoscience and Remote Sensing 44(6): 1575–1585.
Chang, C.-I, and J. Wang. 2008. Real-time implementation of field programmable gate arrays
(FPGA) design in hyperspectral imagery. US Patent, number 7,366,326, April 29.
Chang, C.-I, and Y. Wang. 2015. Anomaly detection using causal sliding windows. IEEE Journal
of Selected Topics in Applied Earth Observations and Remote Sensing 8(7): 3260–3270.
Chang, C.-I, and C.C. Wu. 2015. Design and development of iterative pixel purity index. IEEE
Journal of Selected Topics in Applied Earth Observations and Remote Sensing 8(6):
2676–2695.
Chang, C.-I, and W. Xiong. 2010. High order statistics Harsanyi-Farrand-Chang method for
estimation of virtual dimensionality. In SPIE, vol. 7810, San Diego, CA, August 2–5.
Chang, C.-I, Y. Cheng, M.L.G. Althouse, L. Zhang, and J. Wang. 1992. Multistage image coding:
a top-down gray-level triangle method. In Proceedings of international symposium on spectral
sensing research (ISSSR), Kauai, Hawaii, September 15–20, 497–511.
Chang, C.-J., C.-I Chang, and M.-L. Chang. 1993. Subband multistage predictive coding. In
Proceedings of international conference on signal processing ‘93/Beijing, Beijing, China,
October 26–30, 783–787.
Chang, C.-I, T.-L.E. Sun, and M.L.G. Althouse. 1998a. An unsupervised interference rejection
approach to target detection and classification for hyperspectral imagery. Optical Engineering
37(3): 735–743.
Chang, C.-I, X. Zhao, M.L.G. Althouse, and J.-J. Pan. 1998b. Least squares subspace projection
approach to mixed pixel classification in hyperspectral images. IEEE Transactions on Geosci-
ence and Remote Sensing 36(3): 898–912.
Chang, C.-I, Q. Du, T.S. Sun, and M.L.G. Althouse. 1999. A joint band prioritization and band
decorrelation approach to band selection for hyperspectral image classification. IEEE Trans-
actions on Geoscience and Remote Sensing 37(6): 2631–2641.
Chang, C.-I, J.-M. Liu, B.-C. Chieu, C.-M. Wang, C.S. Lo, P.-C. Chung, H. Ren, C.-W. Yang, and
D.-J. Ma. 2000. A generalized constrained energy minimization approach to subpixel target
detection for multispectral imagery. Optical Engineering 39(5): 1275–1281.
Chang, C.-I, H. Ren, and S.S. Chiang. 2001a. Real-time processing algorithms for target detection
and classification in hyperspectral imagery. IEEE Transactions on Geoscience and Remote
Sensing 39(4): 760–768.
Chang, C.-I, S.-S. Chiang, and I.W. Ginsberg. 2001b. Anomaly detection in hyperspectral imag-
ery. In SPIE conference on geo-spatial image and data exploitation II, Orlando, FL,
20–24 April, 43–50.
Chang, C.-I, Q. Du, S.-S. Chiang, D. Heinz, and I.W. Ginsberg. 2001c. Unsupervised subpixel
target detection in hyperspectral imagery. In SPIE conference on algorithms for multispectral,
hyperspectral and ultraspectral imagery VII, Orlando, FL, 20–24 April, 370–379.
Chang, C.-I, H. Ren, Q. Du, S.-S. Chiang, and A. Ifarragurri. 2001d. An ROC analysis for subpixel
detection. In IEEE international geoscience and remote sensing symposium, Sydney, Australia,
July 9–13.
Chang, C.-I, S.S. Chiang, J.A. Smith, and I.W. Ginsberg. 2002. Linear spectral random mixture
analysis for hyperspectral imagery. IEEE Transactions on Geoscience and Remote Sensing 40
(2): 375–392.

Chang, C.-I, Jing Wang, F. D’Amico, and J.O. Jensen. 2003. Multistage pulse code modulation for
progressive spectral signature coding. In Chemical and biological standoff detection—optical
technologies for industrial and environmental sensing symposia—Photonics West,
27–31 October, 252–261.
Chang, C.-I, H. Ren, C.-C. Chang, J.O. Jensen, and F. D’Amico. 2004a. Estimation of subpixel
target size for remotely sensed imagery. IEEE Transactions on Geoscience and Remote
Sensing 42(6): 1309–1320.
Chang, C.-I, W. Liu, and C.-C. Chang. 2004b. Discrimination and identification for subpixel
targets in hyperspectral imagery. In IEEE International conference on image processing,
Singapore, October.
Chang, C.-I, Jing Wang, C.-C. Chang, and C. Lin. 2006a. Progressive coding for hyperspectral
signature characterization. Optical Engineering 45(9): 097002-1–097002-15.
Chang, C.-I, C.C. Wu, W. Liu, and Y.C. Ouyang. 2006b. A growing method for simplex-based
endmember extraction algorithms. IEEE Transactions on Geoscience and Remote Sensing 44
(10): 2804–2819.
Chang, C.-I, Y. Du, J. Wang, S.-M. Guo, and P. Thouin. 2006c. A survey and comparative study of
entropic and relative entropic thresholding techniques. IEE Proceedings, Vision, Image and
Signal Processing 153(6): 837–850.
Chang, C.-I, H. Ren, C.-I Chang, and B. Rand. 2008. How to design synthetic images to validate
and evaluate hyperspectral imaging algorithms. In SPIE conference on algorithms and tech-
nologies for multispectral, hyperspectral, and ultraspectral imagery XIV, March 16–20,
Orlando, FL.
Chang, C.-I, S. Chakravarty, H. Chen, and Y.C. Ouyang. 2009. Spectral derivative feature coding
for hyperspectral signature. Pattern Recognition 42(3): 395–408.
Chang, C.-I, S. Chakravarty, and C.-S. Lo. 2010a. Spectral feature probabilistic coding for
hyperspectral signatures. IEEE Sensors Journal 10(3): 395–409.
Chang, C.-I, X. Jiao, Y. Du, and M.-L. Chang. 2010b. A review of unsupervised hyperspectral
target analysis. EURASIP Journal on Advances in Signal Processing 2010: Article ID 503752,
26 p. doi:10.1155/2010/503752.
Chang, C.-I, C.C. Wu, C.-S. Lo, and M.-L. Chang. 2010b. Real-time simplex growing algorithms
for hyperspectral endmember extraction. IEEE Transactions on Geoscience and Remote
Sensing 48(4): 1834–1850.
Chang, C.-I, C.-C. Wu, and H.M. Chen. 2010c. Random pixel purity index algorithm. IEEE
Geoscience and Remote Sensing Letters 7(2): 324–328.
Chang, C.-I, W. Xiong, W. Liu, C.C. Wu, and C.C.C. Chen. 2010d. Linear spectral mixture
analysis-based approaches to estimation of virtual dimensionality in hyperspectral imagery.
IEEE Transactions on Geoscience and Remote Sensing 48(11): 3960–3979.
Chang, C.-I, B. Ramakishna, J. Wang, and A. Plaza. 2010e. Exploitation-based hyperspectral
image compression. Journal of Applied Remote Sensing 4: 041760. doi:10.1117/1.3530429.
Chang, C.-I, C.-C. Wu, and C.-T. Tsai. 2011a. Random N-finder algorithm. IEEE Transactions on
Image Processing 20(3): 641–656.
Chang, C.-I, X. Jiao, Y. Du, and H.M. Chen. 2011b. Component-based unsupervised linear
spectral mixture analysis for hyperspectral imagery. IEEE Transactions on Geoscience and
Remote Sensing 49(11): 4123–4137.
Chang, C.-I, W. Xiong, H.M. Chen, and J.W. Chai. 2011c. Maximum orthogonal subspace
projection to estimating number of spectral signal sources for hyperspectral images. IEEE
Journal of Selected Topics in Signal Processing 5(3): 504–520.
Chang, C.-I, S. Wang, K.H. Liu, and C. Lin. 2011d. Progressive band dimensionality expansion
and reduction for hyperspectral imagery. IEEE Journal of Selected Topics in Applied Earth
Observations and Remote Sensing 4(3): 591–614.
Chang, C.-I, C.H. Wen, and C.C. Wu. 2013. Relationship exploration among PPI, ATGP and VCA
via theoretical analysis. International Journal of Computational Science and Engineering 8(4):
361–367.

Chang, C.-I, W. Xiong, and C.H. Wen. 2014a. A theory of high order statistics-based virtual
dimensionality for hyperspectral imagery. IEEE Transactions on Geoscience and Remote
Sensing 52(1): 188–208.
Chang, C.-I, S.Y. Chen, L. Zhao, and C.C. Wu. 2014b. Endmember-specified virtual dimension-
ality in hyperspectral imagery. In 2014 I.E. international geoscience and remote sensing
symposium (IGARSS), Quebec, Canada, July 13–18.
Chang, C.-I, R. Schultz, M. Hobbs, S.-Y. Chen, Y. Wang, and C. Liu. 2015a. Progressive band
processing of constrained energy minimization. IEEE Transactions on Geoscience and Remote
Sensing 53(3): 1626–1637.
Chang, C.-I, C.C. Wu, K.H. Liu, H.M. Chen, C.C.C. Chen, and C.H. Wen. 2015b. Progressive
band processing of linear spectral unmixing for hyperspectral imagery. IEEE Journal of
Selected Topics in Applied Earth Observations and Remote Sensing 8(7): 2583–2597.
Chang, C.-I, Y. Li, R. Schultz, M. Hobbs, and W.M. Liu. 2015c. Progressive band processing of
anomaly detection. IEEE Journal of Selected Topics in Applied Earth Observations and
Remote Sensing 8(7): 3558–3571.
Chang, C.-I, H.C. Li, M. Song, C. Liu, and L.F. Zhang. 2015d. Real-time constrained energy
minimization for subpixel detection. IEEE Journal of Selected Topics in Applied Earth
Observations and Remote Sensing 8(6): 2545–2559.
Chang, C.-I, H.C. Li, and M. Song. 2017. Recursive geometric simplex growing analysis for
finding endmembers in hyperspectral imagery. IEEE Journal of Selected Topics in Applied
Earth Observations and Remote Sensing. To appear.
Chang, C.-I, L.-C. Lee, and D. Paylor. 2015e. Virtual dimensionality analysis for hyperspectral
imagery. In Satellite data compression, communication and processing XI (ST127), SPIE
International symposium on SPIE sensing technology + applications, Baltimore, MD,
20–24 April.
Chang, C.-I, Y. Li, and C.C. Wu. 2015f. Band detection in hyperspectral imagery by pixel purity
index. In 7th workshop on hyperspectral image and signal processing: evolution in remote
sensing, (WHISPERS), Tokyo, Japan, 2–5 June.
Chang, C.-I, C. Gao, and S.Y. Chen. 2015g. Recursive automatic target generation process. IEEE
Geoscience and Remote Sensing Letters 12(9): 1848–1852.
Chang, C.-I, W. Xiong, and S.Y. Chen. 2016a. Convex cone volume analysis for finding
endmembers in hyperspectral imagery. International Journal of Computational Science and
Engineering 12(2/3): 209–236.
Chang, C.-I, S.Y. Chen, H.C. Li, and C.-H. Wen. 2016b. A comparative analysis among ATGP,
VCA and SGA for finding endmembers in hyperspectral imagery. IEEE Journal of Selected
Topics in Applied Earth Observations and Remote Sensing (99):1–27.
Chang, C.-I, H.C. Li, C.C. Wu and M. Song. 2016c. Recursive geometric simplex growing
analysis for finding endmembers in hyperspectral imagery. IEEE Journal of Selected Topics
in Applied Earth Observations and Remote Sensing (99):1–13.
Chaudhry, F., C. Wu, W. Liu, C.-I Chang, and A. Plaza. 2006. Pixel purity index-based algorithms
for endmember extraction from hyperspectral imagery. In Recent advances in hyperspectral
signal and image processing, ed. C.-I Chang, 29–61. Trivandrum: Research Signpost, India.
Chen, S., C.F.N. Cowan, and P.M. Grant. 1991. Orthogonal least squares learning algorithm for
radial basis function networks. IEEE Transactions on Neural Networks 2: 302–309.
Chen, X., J. Chen, X.X. Jia, B. Somer, J. Wu, and P. Coppin. 2011. A quantitative analysis of
virtual endmembers’ increased impact on collinearity effect in spectral unmixing. IEEE Trans-
actions on Geoscience and Remote Sensing 49(8): 2945–2956.
Chen, S.-Y., D. Paylor, and C.-I Chang. 2013. Anomaly-specified virtual dimensionality. In SPIE
conference on satellite data compression, communication and processing IX (OP 405), San
Diego, CA, August 25–29.
Chen, S.-Y., Y. Wang, C.C. Wu, C. Liu, and C.-I Chang. 2014a. Real time causal processing of
anomaly detection in hyperspectral imagery. IEEE Transactions on Aerospace and Electronic
Systems 50(2): 1511–1534.
Chen, S.Y., D. Paylor, and C.-I Chang. 2014b. Anomaly discrimination in hyperspectral imagery.
In Satellite data compression, communication and processing X (ST146), SPIE international
symposium on SPIE sensing technology + applications, Baltimore, MD, 5–9 May.
Chen, S.Y., Y.C. Ouyang, and C.-I Chang. 2014c. Recursive unsupervised fully constrained least
squares methods. In 2014 IEEE international geoscience and remote sensing symposium
(IGARSS), Quebec, Canada, July 13–18.
Chen, S.Y., Y.-C. Ouyang, C. Lin, H.-M. Chen, C. Gao, and C.-I Chang. 2015. Progressive
endmember finding by fully constrained least squares method. In 7th workshop on
hyperspectral image and signal processing: evolution in remote sensing (WHISPERS),
Tokyo, Japan, 2–5 June.
Cheng, Y. 1993. Multistage pulse code modulation (MPCM). M.S. thesis, Department of Electrical
Engineering, University of Maryland, Baltimore County, Baltimore, MD.
Chiang, S.-S., and C.-I Chang. 1999. Target subpixel detection for hyperspectral imagery using
projection pursuit. In EOS/SPIE symposium on remote sensing, conference on image and
signal processing for remote sensing V, SPIE, vol. 3871, Florence, Italy, September 20–24,
107–115.
———. 2001. Discrimination measures for target classification. In IEEE 2001 international
geoscience and remote sensing symposium, Sydney, Australia, July 9–13.
Chiang, S.-S., C.-I Chang, and I.W. Ginsberg. 2000. Unsupervised hyperspectral image analysis
using independent components analysis. In IEEE 2000 international geoscience and remote
sensing symposium, Hawaii, USA, July 24–28.
———. 2001. Unsupervised subpixel target detection for hyperspectral images using projection
pursuit. IEEE Transactions on Geoscience and Remote Sensing 39(7): 1380–1391.
Chowdhury, A., and M.S. Alam. 2007. Fast implementation of N-FINDR algorithm for
endmember determination in hyperspectral imagery. In Proceedings of SPIE, vol. 6565,
656526-1–656526-7.
Christophe, E., D. Leger, and C. Mailhes. 2005. Quality criteria benchmark for hyperspectral
imagery. IEEE Transactions on Geoscience and Remote Sensing 43(9): 2103–2114.
Conese, C., and F. Maselli. 1993. Selection of optimum bands from TM scenes through mutual
information analysis. ISPRS Journal of Photogrammetry and Remote Sensing 48(3): 2–11.
Cover, T., and J. Thomas. 1991. Elements of information theory. New York: Wiley.
Craig, M.D. 1994. Minimum-volume transforms for remotely sensed data. IEEE Transactions on
Geoscience and Remote Sensing 32(3): 542–552.
Dennison, P.E., and D.A. Roberts. 2003. Endmember selection for multiple endmember spectral
mixture analysis using endmember average RMSE. Remote Sensing of Environment 87:
123–135.
Dowler, A., and M. Andrews. 2011. On the convergence of N-FINDR and related algorithms: to
iterate or not to iterate? IEEE Geoscience and Remote Sensing Letters 8(1): 4–8.
Du, B., and L. Zhang. 2011. Random selection based anomaly detector for hyperspectral imagery.
IEEE Transactions on Geoscience and Remote Sensing 49(5): 1578–1589.
Du, Q. 2000. Topics in hyperspectral image analysis. Ph.D. dissertation, Department of Computer
Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, MD.
Du, Q. 2012. A new sequential algorithm for hyperspectral endmember extraction. IEEE Geosci-
ence and Remote Sensing Letters 9(4): 695–699.
Du, Q., and C.-I Chang. 1998. Radial basis function neural networks approach to hyperspectral
image classification. In 1998 Conference on information science and systems, 721–726.
Princeton, NJ: Princeton University.
———. 1999. An interference rejection-based radial basis function neural network approach to
hyperspectral image classification. In International joint conference on neural network,
Washington DC, 2698–2703.
———. 2000. A hidden Markov model-based spectral measure for hyperspectral image analysis.
In SPIE Conference algorithms for multispectral, hyperspectral, and ultraspectral imagery VI,
Orlando, FL, 375–385.
———. 2001a. A linear constrained distance-based discriminant analysis for hyperspectral image
classification. Pattern Recognition 34(2): 2001.
———. 2001b. An interference subspace projection approach to subpixel target detection. In SPIE
conference on algorithms for multispectral, hyperspectral and ultraspectral imagery VII,
Orlando, FL, 20–24 April, 570–577.
———. 2004a. Linear mixture analysis-based compression for hyperspectral image analysis.
IEEE Transactions on Geoscience and Remote Sensing 42(4): 875–891.
———. 2004b. A signal-decomposed and interference-annihilated approach to hyperspectral
target detection. IEEE Transactions on Geoscience and Remote Sensing 42(4): 892–906.
———. 2007. Rethinking the effective assessment of biometric systems.
http://newsroom.spie.org/x17545.xml. doi:10.1117/2.1200711.0815.
———. 2008. 3D combination curves for accuracy and performance analysis of positive bio-
metrics identification. Optics and Lasers in Engineering 46(6): 477–490.
Du, Q., and J.E. Fowler. 2007. Hyperspectral image compression using JPEG2000 and principal
component analysis. IEEE Geoscience and Remote Sensing Letters 4: 201–205.
Du, Q., and R. Nekovei. 2005. Implementation of real-time constrained linear discriminant
analysis to remote sensing image classification in hyperspectral imagery. Pattern Recognition
38(4): 1–12.
Du, Q., and R. Nekovei. 2009. Fast real-time onboard processing of hyperspectral imagery for
detection and classification. Journal of Real-Time Image Processing 4(3): 273–286.
Du, Q., C.-I Chang, D.C. Heinz, M.L.G. Althouse, and I.W. Ginsberg. 2000. Hyperspectral image
compression for target detection and classification. In IEEE 2000 international geoscience and
remote sensing symposium, Hawaii, USA, July 24–28.
Du, Q., H. Ren, and C.-I Chang. 2003. A comparative study for orthogonal subspace projection
and constrained energy minimization. IEEE Transactions on Geoscience and Remote Sensing
41(6): 1525–1529.
Du, Y., C.-I Chang, and P. Thouin. 2003b. An automatic system for text detection in single video
images. Journal of Electronic Imaging 12(3): 410–422.
Du, Q., W. Zhu, and J. Fowler. 2004a. Anomaly-based JPEG2000 compression of hyperspectral
imagery. IEEE Geoscience and Remote Sensing Letters 5(4): 696–700.
Du, Y., C.-I Chang, C.-C. Chang, F. D’Amico, and J.O. Jensen. 2004b. A new hyperspectral
measure for material discrimination and identification. Optical Engineering 43(8): 1777–1786.
Du, Z., M.K. Jeong, and S.G. Kong. 2007. Band selection of hyperspectral images for automatic
detection of poultry skin tumors. IEEE Transactions on Automation Science and Engineering
4(3): 332–339.
Du, Q., N. Raksuntorn, and N.H. Younan. 2008a. Variants of N-FINDR algorithm for endmember
extraction. In Proceedings of SPIE, vol. 7109, 71090G-1–71090G-8, 15–18 September.
Du, Q., N. Raksuntorn, N.H. Younan, and R.L. King. 2008b. Endmember extraction algorithms for
hyperspectral image analysis. Applied Optics 47(28): F77–F84.
Duda, R.O., and P.E. Hart. 1973. Pattern classification and scene analysis. New York: Wiley.
Eches, O., N. Dobigeon, and J.-Y. Tourneret. 2010. Estimating the number of endmembers in
hyperspectral images using the normal compositional model and a hierarchical Bayesian
algorithm. IEEE Journal of Selected Topics in Signal Processing 4(3): 582–591.
Epp, S.S. 1995. Discrete mathematics with applications, 2nd ed. Pacific Grove, CA: Brooks/Cole.
Fano, R.M. 1961. Transmission of information: a statistical theory of communication. New York:
Wiley.
Farrand, W., and J.C. Harsanyi. 1997. Mapping the distribution of mine tailings in the Coeur
d’Alene river valley, Idaho, through the use of constrained energy minimization technique.
Remote Sensing of Environment 59: 64–76.
Filippi, A.M., and R. Archibald. 2009. Support vector machine-based endmember extraction.
IEEE Transactions on Geoscience and Remote Sensing 47(3): 771–791.
Fisher, K., and C.-I Chang. 2011. Progressive band selection for satellite hyperspectral data
compression and transmission. Journal of Applied Remote Sensing 4(1): 041770.
Frost III, O.L. 1972. An algorithm for linearly constrained adaptive array processing. Proceedings
of the IEEE 60: 926–935.
Fukunaga, K. 1982. Intrinsic dimensionality extraction. In Classification, pattern recognition and
reduction of dimensionality, Handbook of Statistics, vol. 2, ed. P.R. Krishnaiah and
L.N. Kanal, 347–360. Amsterdam: North-Holland.
———. 1990. Statistical pattern recognition, 2nd ed. New York: Academic.
Gao, C., and C.-I Chang. 2014. Recursive automatic target generation process for unsupervised
hyperspectral target detection. In 2014 IEEE international geoscience and remote sensing
symposium (IGARSS), Quebec, Canada, July 13–18.
Gao, C., S.Y. Chen, and C.-I Chang. 2014. Fisher’s ratio-based criterion for finding endmembers in
hyperspectral imagery. In Satellite data compression, communication and processing X
(ST146), SPIE international symposium on SPIE sensing technology + applications, Balti-
more, MD, 5–9 May.
Gao, C., Y. Li, and C.-I Chang. 2015a. Finding endmember classes in hyperspectral imagery. In
Satellite data compression, communication and processing XI (ST127), SPIE international
symposium on SPIE sensing technology + applications, Baltimore, MD, 20–24 April.
Gao, C., S.-Y. Chen, H.M. Chen, C.C. Wu, C.H. Wen, and C.-I Chang. 2015b. Fully abundance-
constrained endmember finding for hyperspectral images. In 7th Workshop on hyperspectral
image and signal processing: evolution in remote sensing (WHISPERS), Tokyo, Japan,
2–5 June.
Garzon, E.M., I. Garcia, and A. Plaza. 2012. Anomaly detection based on a parallel kernel RX
algorithm for multicore platforms. Journal of Applied Remote Sensing 6: 061503-1–061503-10.
Gelb, A., ed. 1974. Applied optimal estimation. Cambridge: MIT Press.
Geng, X., K. Sun, L. Ji, and Y. Zhao. 2014. A fast volume-gradient-based band selection method
for hyperspectral image. IEEE Transactions on Geoscience and Remote Sensing 52(11):
7111–7119.
Gersho, A., and R.M. Gray. 1992. Vector quantization and signal compression. New York: Kluwer
Academic Publishers.
Gillespie, A.R., M.O. Smith, J.B. Adams, S.C. Willis, A.F. Fischer III, and D.E. Sabol. 1990.
Interpretation of residual images: spectral mixture analysis of AVIRIS images, Owens valley,
California. In Proceedings of 2nd AVIRIS workshop, 243–270.
Goetz, A.F.H., and J.W. Boardman. 1989. Quantitative determination of imaging spectrometer
specifications based on spectral mixing models. In Proceedings of IEEE international geosci-
ence and remote sensing symposium’89, 1036–1039.
Golub, G.H., and C.F. Van Loan. 1989. Matrix computations, 2nd ed. Baltimore: Johns Hopkins
University Press.
Gonzalez, R.C., and R.E. Woods. 2007. Digital image processing, 3rd ed. Upper Saddle River:
Prentice-Hall.
Green, A.A., M. Berman, P. Switzer, and M.D. Craig. 1988. A transformation for ordering
multispectral data in terms of image quality with implications for noise removal. IEEE
Transactions on Geoscience and Remote Sensing 26: 65–74.
Greg, I. 2010. An evaluation of three endmember extraction algorithms, ATGP, ICA-EEA and
VCA. In Hyperspectral image and signal processing: evolution in remote sensing (WHIS-
PERS), 1–4.
Gruninger, J., A.J. Ratkowski, and M.L. Hoke. 2004. The sequential maximum angle convex cone
(SMACC) endmember model. In Proceedings of SPIE, algorithms and technologies for
multispectral, hyperspectral, and ultraspectral imagery X, vol. 5425, 1–14.
Guilfoyle, K. 2003. Application of linear and nonlinear mixture models to hyperspectral imagery
analysis using radial basis function neural networks. Ph.D. dissertation, Department of
Computer Science and Electrical Engineering, University of Maryland, Baltimore County,
Baltimore, MD.
Guilfoyle, K., M.L.G. Althouse, and C.-I Chang. 2001. A quantitative and comparative analysis of
linear and nonlinear spectral mixture models using radial basis function neural networks. IEEE
Transactions on Geoscience and Remote Sensing 39(10): 2314–2318.
———. 2002. Further results on linear and nonlinear mixture models for analyzing hyperspectral
imagery. In SPIE conference on algorithms and technologies for multispectral, hyperspectral
and ultraspectral imagery VIII, SPIE 4725, Orlando, FL, 1–5 April.
Harsanyi, J.C. 1993. Detection and classification of subpixel spectral signatures in hyperspectral
image sequences. Ph.D. dissertation, Department of Electrical Engineering, University of
Maryland, Baltimore County, Baltimore, MD.
Harsanyi, J.C., and C.-I Chang. 1994. Hyperspectral image classification and dimensionality
reduction: an orthogonal subspace projection approach. IEEE Transactions on Geoscience
and Remote Sensing 32(4): 779–785.
Harsanyi, J.C., W. Farrand, and C.-I Chang. 1994a. Detection of subpixel spectral signatures in
hyperspectral image sequences. In Annual meeting, proceedings of American society of
photogrammetry and remote sensing, Reno, 236–247.
Harsanyi, J.C., W. Farrand, J. Hejl, and C.-I Chang. 1994b. Automatic identification of spectral
endmembers in hyperspectral image sequences. In International symposium on spectral sens-
ing research ‘94 (ISSSR), San Diego, July 10–15, 267–277.
Haskell, K.H., and R.J. Hanson. 1981. An algorithm for linear least squares problems with equality
and nonnegativity constraints generalized. Mathematical Programming 21: 98–118.
Heinz, D., and C.-I Chang. 2001. Fully constrained least squares linear mixture analysis for
material quantification in hyperspectral imagery. IEEE Transactions on Geoscience and
Remote Sensing 39(3): 529–545.
Huete, A.R. 1986. Separation of soil-plant spectral mixtures by factor analysis. Remote Sensing of
Environment 19: 237–251.
Heylen, R., and P. Scheunders. 2013. Hyperspectral intrinsic dimensionality estimation with
nearest-neighbor distance ratios. IEEE Journal of Selected Topics in Applied Earth Observa-
tions and Remote Sensing 6(2): 570–579.
Hofmann, T., B. Schölkopf, and A.J. Smola. 2008. Kernel methods in machine learning. Annals of
Statistics 36(3): 1171–1220.
Honeine, P., and C. Richard. 2012. Geometric unmixing of large hyperspectral images: a
barycentric coordinate approach. IEEE Transactions on Geoscience and Remote Sensing 50(4):
2185–2195.
Hsueh, M. 2004. Adaptive causal anomaly detection. M.S. thesis, Department of Computer Science
and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, MD.
———. 2007. Reconfigurable computing for algorithms in hyperspectral image processing. Ph.D.
dissertation, Department of Computer Science and Electrical Engineering, University of
Maryland, Baltimore County, Baltimore, MD.
Hsueh, M., and C.-I Chang. 2004. Adaptive causal anomaly detection for hyperspectral imagery.
In IEEE international geoscience and remote sensing symposium, Alaska, September 20–24.
———. 2008. Field programmable gate arrays for pixel purity index using blocks of skewers for
endmember extraction in hyperspectral imagery. International Journal of High Performance
Computing Applications 22(4): 408–423.
Huang, R., and M. He. 2005. Band selection based feature weighting for classification of
hyperspectral data. IEEE Geoscience and Remote Sensing Letters 2(2): 156–159.
HYMSMO. 1998. HYperspectral MASINT support to military operations program.
Hyvarinen, A., and E. Oja. 1997. A fast fixed-point algorithm for independent component analysis.
Neural Computation 9(7): 1483–1492.
———. 2000. Independent component analysis: algorithms and applications. Neural Networks 13:
411–430.
Hyvarinen, A., J. Karhunen, and E. Oja. 2001. Independent component analysis. New York:
Wiley.
Ifarraguerri, A., and C.-I Chang. 1999. Hyperspectral image segmentation with convex cones.
IEEE Transactions on Geoscience and Remote Sensing 37(2): 756–770.
Ifarraguerri, A. 2000. Hyperspectral image analysis with convex cones and projection pursuit.
Ph.D. dissertation, Department of Computer Science and Electrical Engineering, University of
Maryland, Baltimore County, Baltimore, MD.
Ifarraguerri, A., and C.-I Chang. 2000. Multispectral and hyperspectral image analysis with
projection pursuit. IEEE Transactions on Geoscience and Remote Sensing 38(6): 2529–2538.
Jensen, J.R. 1996. Introductory digital image processing: A remote sensing perspective, 2nd
ed. Upper Saddle River, NJ: Prentice-Hall.
Jenson, S.K., and F.A. Waltz. 1979. Principal components analysis and canonical analysis in
remote sensing. In Proceedings American Society for photogrammetry 45th annual meeting,
337–348.
Ji, B. 2006. Constrained linear spectral mixture analysis. Ph.D. dissertation, Department of
Computer Science and Electrical Engineering, University of Maryland, Baltimore County,
Baltimore, MD.
Ji, B., and C.-I Chang. 2006. Principal components analysis-based endmember extraction algo-
rithms. In Recent advances in hyperspectral signal and image processing, ed. C.-I Chang,
63–91. Trivandrum, Kerala: Research Signpost, India.
Ji, B., C.-I Chang, J.O. Jensen, and J.L. Jensen. 2004. Unsupervised constrained linear Fisher’s
discriminant analysis for hyperspectral image classification. In 49th Annual meeting, SPIE
international symposium on optical science and technology, imaging spectrometry IX
(AM105), vol. 5546, 344–353. Denver, CO, August 2–4.
Jiao, X. 2010. Unsupervised hyperspectral target detection and classification. Ph.D. dissertation,
Department of Computer Science and Electrical Engineering, University of Maryland, Balti-
more County, Baltimore, MD.
Jiao, X., and C.-I Chang. 2008. Kernel-based constrained energy minimization (KCEM). In SPIE
conference on algorithms and technologies for multispectral, hyperspectral, and ultraspectral
imagery XIV, March 16–20, Orlando, FL.
Jimenez, L.O., and D.A. Landgrebe. 1999. Hyperspectral data analysis and supervised feature
reduction via projection pursuit. IEEE Transactions on Geoscience and Remote Sensing 37(6):
2653–2667.
Johnson, P., M. Smith, S. Taylor-George, and J. Adams. 1983. A semiempirical method for
analysis of the reflectance spectra of binary mineral mixtures. Journal of Geophysical
Research 88: 3557–3561.
Johnson, W.R., D.W. Wilson, W. Fink, M. Humayun, and G. Bearman. 2007. Snapshot
hyperspectral imaging in ophthalmology. Journal of Biomedical Optics 12(1): 014036-
1–014036-7.
Kailath, T. 1968. An innovations approach to least squares estimation, Part I: linear filtering in
additive white noise. IEEE Transactions on Automatic Control 13(6): 646–655.
Kailath, T. 1980. Linear systems. Upper Saddle River: Prentice Hall.
Kanaev, A.V., E. Allman, and J. Murray-Krezan. 2009. Reduction of false alarms caused by
background boundaries in real time subspace RX anomaly detection. In Proceedings of SPIE,
vol. 7334, 733405.
Kelly, E.J. 1986. An adaptive detection algorithm. IEEE Transactions on Aerospace and Elec-
tronic Systems 22: 115–127.
Keshava, N., and J.F. Mustard. 2002. Spectral unmixing. IEEE Signal Processing Magazine 19(1):
44–57.
Khazai, S., S. Homayouni, A. Safari, and B. Mojaradi. 2011. Anomaly detection in hyperspectral
images based on an adaptive support vector method. IEEE Geoscience and Remote Sensing
Letters 8(4): 646–650.
Kraut, S., and L. Scharf. 1999. The CFAR adaptive subspace detector is a scale-invariant GLRT.
IEEE Transactions on Signal Processing 47(9): 2538–2541.
Kraut, S., L. Scharf, and L.T. McWhorter. 2001. Adaptive subspace detector. IEEE Transactions
on Signal Processing 49(12): 3005–3014.
Kullback, S. 1959/1968. Information theory and statistics. New York: Wiley.
Kuybeda, O., D. Malah, and M. Barzohar. 2007. Rank estimation and redundancy reduction of
high-dimensional noisy signals with preservation of rare vectors. IEEE Transactions on Signal
Processing 55(12): 5579–5592.
Kwan, C., B. Ayhan, G. Chen, J. Wang, B. Ji, and C.-I Chang. 2006. A novel approach for spectral
unmixing, classification, and concentration estimation of chemical and biological agents. IEEE
Transactions on Geoscience and Remote Sensing 44(2): 409–419.
Kwon, H., and N.M. Nasrabadi. 2005a. Kernel RX-algorithm: a nonlinear anomaly detector for
hyperspectral imagery. IEEE Transactions on Geoscience and Remote Sensing 43(2):
388–397.
———. 2005b. Kernel orthogonal subspace projection for hyperspectral signal classification.
IEEE Transactions on Geoscience and Remote Sensing 43(12): 2952–2962.
Kwon, H., S.Z. Der, and N.M. Nasrabadi. 2003. Adaptive anomaly detection using subspace
separation for hyperspectral imagery. Optical Engineering 42(11): 3342–3351.
Landgrebe, D.A. 2003. Signal theory methods in multispectral remote sensing. Hoboken: Wiley.
Lawson, C.L., and R.J. Hanson. 1995. Solving least squares problems, CAM, vol. 15. Philadelphia,
PA: SIAM.
Leadbetter, M. 1987. Extremes and related properties of random sequences and processes.
New York: Springer.
Lee, L.C., and C.-I Chang. 2016. An information theoretical approach to multiple band selection
for hyperspectral imagery. In 2016 IEEE geoscience and remote sensing symposium, Beijing,
China, July 10–15.
Lee, D.D., and H.S. Seung. 1999. Learning the parts of objects by non-negative matrix factoriza-
tion. Nature 401: 788–791.
Lee, J.B., A.S. Woodyatt, and M. Berman. 1990. Enhancement of high spectral resolution remote
sensing data by a noise-adjusted principal components transform. IEEE Transactions on
Geoscience and Remote Sensing 28(3): 295–304.
Lee, T.W., M.S. Lewicki, and M. Girolami. 1999. Blind source separation of more sources than
mixtures using overcomplete representations. IEEE Signal Processing Letters 6(4): 87–90.
Lee, L.-C., D. Paylor, and C.-I Chang. 2015. Anomaly discrimination and classification for
hyperspectral imagery. In 7th Workshop on hyperspectral image and signal processing:
evolution in remote sensing (WHISPERS), Tokyo, Japan, 2–5 June.
Li, H.C. 2016. Simplex volume analysis for finding endmembers in hyperspectral imagery.
Ph.D. dissertation, Department of Computer Science and Electrical Engineering, University of
Maryland, Baltimore County, Baltimore, MD.
Li, H.C., and C.-I Chang. 2015a. An orthogonal projection approach to simplex growing algorithm
for finding endmembers in hyperspectral imagery. In 7th Workshop on hyperspectral image
and signal processing: evolution in remote sensing (WHISPERS), Tokyo, Japan, 2–5 June.
———. 2015b. Linear spectral unmixing using least squares error, orthogonal projection and
simplex volume for hyperspectral Images. In 7th Workshop on hyperspectral image and signal
processing: evolution in remote sensing (WHISPERS), Tokyo, Japan, 2–5 June.
———. 2016a. Progressive band processing of fast iterative pixel purity index. In Remotely sensed
data compression, communications, and processing XII, Part of SPIE commercial + scientific
sensing and imaging, 17–21 April.
———. 2016b. Real time hyperspectral anomaly detection via band-interleaved by line. In
Remotely sensed data compression, communications, and processing XII, part of SPIE
commercial + scientific sensing and imaging, 17–21 April.
———. 2016c. Recursive orthogonal projection-based simplex growing algorithm. IEEE Trans-
actions on Geoscience and Remote Sensing 54(7): 3780–3793. doi:10.1109/TGRS.2016.
2527737.
Li, H.C., M. Song, and C.-I Chang. 2014a. Finding analytical solutions to abundance fully-
constrained linear spectral mixture analysis. In 2014 IEEE international geoscience and remote
sensing symposium (IGARSS), Quebec, Canada, July 13–18.
Li, Y., S.Y. Chen, C. Gao, and C.-I Chang. 2014b. Endmember variability resolved by pixel purity
index in hyperspectral imagery. In Satellite data compression, communication and processing
X (ST146), SPIE international symposium on SPIE sensing technology + applications, Balti-
more, MD, 5–9 May.
Li, H.C., M. Song, and C.-I Chang. 2015a. Simplex volume analysis for finding endmembers in
hyperspectral imagery. In Satellite data compression, communication and processing XI
(ST127), SPIE international symposium on SPIE sensing technology + applications, Balti-
more, MD, 20–24 April.
Li, Y., H.C. Li, C. Gao, and M. Song. 2015b. Progressive band processing of pixel purity index for
finding endmembers in hyperspectral imagery. In Satellite data compression, communication
and processing XI (ST127), SPIE international symposium on SPIE sensing technology
+ applications, Proceedings of SPIE, vol. 9501, 95010U-1–95010U-10, Baltimore, MD,
20–24 April.
Li, Y., C. Gao, and C.-I Chang. 2015c. Progressive band processing of automatic target generation
process. In 7th Workshop on hyperspectral image and signal processing: evolution in remote
sensing (WHISPERS), Tokyo, Japan, 2–5 June.
Li, H.C., C.-I Chang, and L. Wang. 2016. Constrained multiple band selection for hyperspectral
imagery. In 2016 IEEE geoscience and remote sensing symposium, Beijing, China, July 10–15.
Liu, W. 2005. Supervised and unsupervised classification for Purdue Indian Pines test site.
M.S. thesis, Department of Computer Science and Electrical Engineering, University of
Maryland, Baltimore County, MD.
———. 2008. Unsupervised hyperspectral target recognition. Ph.D. dissertation, Department of
Computer Science and Electrical Engineering, University of Maryland, Baltimore County,
MD.
———. 2011. Progressive band prioritization and selection for linear spectral mixture analysis.
Ph.D. dissertation, Department of Computer Science and Electrical Engineering, University of
Maryland, Baltimore County, Baltimore, MD.
Liu, W., and C.-I Chang. 2004. A nested spatial window-based approach to target detection for
hyperspectral imagery. In IEEE international geoscience and remote sensing symposium,
Alaska, September 20–24.
———. 2008. Multiple window anomaly detection for hyperspectral imagery. In IEEE interna-
tional geoscience and remote sensing symposium, Boston, MA, July 6–11.
———. 2011. Dynamic band selection for hyperspectral imagery. In International geoscience and
remote sensing symposium, 24–29 July, Vancouver, Canada.
———. 2013. Multiple window anomaly detection for hyperspectral imagery. IEEE Journal of
Selected Topics in Applied Earth Observations and Remote Sensing 6(2): 664–658.
Liu, W., C.-I Chang, S. Wang, J. Jensen, J. Jensen, H. Hnapp, R. Daniel, and R. Yin. 2005. 3D
ROC analysis for detection software used in water monitoring. In OpticsEast, chemical and
biological standoff detection III (SA03), Boston, MA, October 23–26.
Liu, W., C.-C. Wu, and C.-I Chang. 2007. An orthogonal subspace projection-based estimation of
virtual dimensionality for hyperspectral data exploitation. In Algorithms and technologies for
multispectral, hyperspectral, and ultraspectral imagery XIII, SPIE defense and security sym-
posium, Orlando, FL, April 9–13.
Liu, K., E. Wong, Y. Du, C.C.C. Chen, and C.-I Chang. 2012. Kernel-based linear spectral mixture
analysis. IEEE Geoscience and Remote Sensing Letters 9(1): 129–133.
Lopez, S., P. Horstrand, G.M. Callico, J.F. Lopez, and R. Sarmiento. 2012a. A low-computational-
complexity algorithm for hyperspectral endmember extraction: modified vertex component
analysis. IEEE Geoscience and Remote Sensing Letters 9(3): 502–506.
———. 2012b. A novel architecture for hyperspectral endmember extraction by means of the
modified vertex component analysis (MVCA) algorithm. IEEE Journal of Selected Topics in
Applied Earth Observations and Remote Sensing 5(6): 1837–1848.
Luo, B., J. Chanussot, S. Douté, and L. Zhang. 2013. Empirical automatic estimation of the
number of endmembers in hyperspectral images. IEEE Geoscience and Remote Sensing Letters
10(1): 24–28.
Ma, W.-K., J.M. Bioucas-Dias, T.-H. Chan, N. Gillis, P. Gader, A.J. Plaza, A. Ambikapathi, and
C.-Y. Chi. 2014. A signal processing perspective on hyperspectral unmixing. IEEE Signal
Processing Magazine 31(1): 67–81.
Malinowski, E.R. 1977. Determination of the number of factors and experimental error in a data
matrix. Analytical Chemistry 49: 612–617.
Manolakis, D., and G. Shaw. 2002. Detection algorithms for hyperspectral imaging applications.
IEEE Signal Processing Magazine 19(1): 29–43.
Manolakis, D., C. Siracusa, and G. Shaw. 2001. Hyperspectral subpixel target detection using the
linear mixture model. IEEE Transactions on Geoscience and Remote Sensing 39(7):
1392–1409.
Mausel, P.W., W.J. Kramber, and J.K. Lee. 1990. Optimum band selection for supervised
classification of multispectral data. Photogrammetric Engineering and Remote Sensing 56(1):
55–60.
Mavrovouniotis, M.L., A.M. Harper, and A. Ifarraguerri. 1994a. Classification of pyrolysis mass
spectra of biological agents using convex cones. Journal of Chemometrics 8: 305–33.
———. 1994b. Convex-cone analysis of the time profiles of pyrolysis mass spectra of biological
agents. U.S. Army Edgewood Research, Development and Engineering Center, Technical
Report ERDEC-CR-130.
Mavrovouniotis, M.L., A.M. Harper, and A. Ifarraguerri. 1996. A method for extracting patterns
from pyrolysis mass spectra, Computer assisted analytical spectroscopy, 189–240. Chichester:
Wiley.
Metz, C.E. 1978. ROC methodology in radiological imaging. Investigative Radiology 21:
720–723.
Miao, L., and H. Qi. 2007. Endmember extraction from highly mixed data using minimum volume
constrained nonnegative matrix factorization. IEEE Transactions on Geoscience and Remote
Sensing 45(3): 765–777.
Moon, T.K., and W.C. Stirling. 2000. Mathematical methods and algorithms for signal processing.
Upper Saddle River: Prentice-Hall.
Mustard, J.F., and C.M. Pieters. 1987. Abundance and distribution of ultramafic microbreccia in
Moses Rock dike: quantitative application of mapping spectroscopy. Journal of Geophysical
Research 92(B10): 10376–10390.
Nascimento, J.M.P., and J.M. Dias. 2005. Vertex component analysis: a fast algorithm to unmix
hyperspectral data. IEEE Transactions on Geoscience and Remote Sensing 43(4): 898–910.
Neville, R.A., K. Staenz, T. Szeredi, J. Lefebvre, and P. Hauff. 1999. Automatic endmember
extraction from hyperspectral data for mineral exploration. In Proceedings of 4th international
airborne remote sensing conference and exhibition/21st Canadian symposium on remote
sensing, Ottawa, Ontario, Canada, 21–24.
Oja, E., and M. Plumbley. 2004. Blind separation of positive sources by globally convergent
gradient search. Neural Computation 16: 1811–1825.
Pauca, V.P., J. Piper, and R.J. Plemmons. 2006. Nonnegative matrix factorization for spectral data
analysis. Linear Algebra and its Applications 416(1): 29–47.
Paylor, D. 2014. Second order statistics target-specified virtual dimensionality. Department of
Computer Science and Electrical Engineering, University of Maryland, Baltimore County, MD.
Paylor, D., and C.-I Chang. 2013. Second-order statistics-specified virtual dimensionality. In SPIE
conference on algorithms and technologies for multispectral, hyperspectral and ultraspectral
imagery XIX (DS122), 29 April–3 May 2013, Baltimore, MD.
———. 2014. A theory of least squares target-specified virtual dimensionality in hyperspectral
imagery. In Satellite data compression, communication and processing X (ST146), SPIE
international symposium on SPIE sensing technology + applications, Baltimore, MD,
5–9 May.
Pesses, M.E. 1999. A least squares-filter vector hybrid approach to hyperspectral subpixel
demixing. IEEE Transactions on Geoscience and Remote Sensing 37(2): 846–849.
Plaza, A., and C.-I Chang. 2005. An improved N-FINDR algorithm in implementation. In
Conference on algorithms and technologies for multispectral, hyperspectral, and ultraspectral
imagery XI, SPIE symposium on defense and security, SPIE, vol. 5806, Orlando, FL,
28 March–1 April.
———. 2006. Impact of initialization on design of endmember extraction algorithms. IEEE
Transactions on Geoscience and Remote Sensing 44(11): 3397–3407.
———, ed. 2007a. High performance computing in remote sensing. Boca Raton: CRC Press.
———. 2007b. Specific issues about high-performance computing in remote sensing, non-literal
analysis versus image-based processing. In High-performance computing in remote sensing,
ed. A. Plaza and C.-I Chang. Boca Raton: CRC Press.
———. 2007c. Clusters versus FPGAs for real-time processing of hyperspectral imagery. Inter-
national Journal of High Performance Computing Applications 22(4): 366–385.
Plaza, A., P. Martínez, R. Pérez, and J. Plaza. 2004. A quantitative and comparative analysis of
endmember extraction algorithms from hyperspectral data. IEEE Transactions on Geoscience
and Remote Sensing 42(3): 650–663.
Plaza, A., D. Valencia, C.-I Chang, and J. Plaza. 2006. Parallel implementation of endmember
extraction algorithms from hyperspectral data. IEEE Geoscience and Remote Sensing Letters 3
(7): 334–338.
Poor, H.V. 1994. An introduction to detection and estimation theory, 2nd ed. New York: Springer.
Pudil, P., J. Novovicova, and J. Kittler. 1994. Floating search methods in feature selection. Pattern
Recognition Letters 15: 1119–1125.
Qian, S.-E. 2004. Hyperspectral data compression using fast vector quantization algorithm. IEEE
Transactions on Geoscience and Remote Sensing 42(8): 65–74.
Qian, S.-E., A.B. Hollinger, D. Williams, and D. Manak. 1996. Fast three-dimensional data
compression of hyperspectral imagery using vector quantization with spectral-feature-based
binary coding. Optical Engineering 35(11): 3242–3249.
Rabiner, L., and B.-H. Juang. 1993. Fundamentals of speech recognition. Upper Saddle River:
Prentice-Hall.
Ramakrishna, B. 2004. Principal components analysis (PCA)-based spectral/spatial hyperspectral
image compression. MS thesis, Department of Computer Science and Electrical Engineering,
University of Maryland, Baltimore County, Baltimore, MD.
Ramakrishna, B., J. Wang, A. Plaza, and C.-I Chang. 2005. Spectral/spatial hyperspectral image
compression in conjunction with virtual dimensionality. In Conference on algorithms and
technologies for multispectral, hyperspectral, and ultraspectral imagery XI, SPIE symposium
on defense and security, SPIE, vol. 5806, Orlando, FL, 28 March–1 April.
Ramey, N.I., and M. Soumekh. 2006. Hyperspectral anomaly detection within the signal sub-
space. IEEE Geoscience and Remote Sensing Letters 3(3): 312–316.
Reed, I.S., and X. Yu. 1990. Adaptive multiple-band CFAR detection of an optical pattern with
unknown spectral distribution. IEEE Transactions on Acoustics, Speech and Signal Processing
38(10): 1760–1770.
Ren, H. 1998. A comparative study of mixed pixel classification versus pure pixel classification for
multi/hyperspectral imagery. Department of Computer Science and Electrical Engineering,
University of Maryland, Baltimore County, MD.
———. 2000. Unsupervised and generalized orthogonal subspace projection and constrained
energy minimization for target detection and classification in remotely sensed imagery.
Department of Computer Science and Electrical Engineering, University of Maryland, Balti-
more County, MD.
Ren, H., and C.-I Chang. 1998. A computer-aided detection and classification method for
concealed targets in hyperspectral imagery. In IEEE 1998 international geoscience and remote
sensing symposium, Seattle, WA, July 5–10, 1016–1018.
———. 1999. A constrained least squares approach to hyperspectral image classification. In 1999
conference on information science and systems, 551–556. Baltimore, MD: Johns Hopkins
University.
———. 2000a. A generalized orthogonal subspace projection approach to unsupervised multi-
spectral image classification. IEEE Transactions on Geoscience and Remote Sensing 38(6):
2515–2528.
———. 2000b. Target-constrained interference-minimized approach to subpixel target detection
for hyperspectral imagery. Optical Engineering 39(12): 3138–3145.
———. 2003. Automatic spectral target recognition in hyperspectral imagery. IEEE Transactions
on Aerospace and Electronic Systems 39(4): 1232–1249.
Ren, H., Q. Du, J. Wang, C.-I Chang, and J. Jensen. 2006. Automatic target recognition for
hyperspectral imagery using high order statistics. IEEE Transactions on Aerospace and
Electronic Systems 42(4): 1372–1385.
Research Systems, Inc. 2001. ENVI user’s guide. Boulder, CO: Research Systems, Inc.
Resmini, R.S., M.E. Kappus, W.S. Aldrich, J.C. Harsanyi, and M. Anderson. 1997. Mineral
mapping with HYperspectral Digital Imagery Collection Experiment (HYDICE) sensor data
at Cuprite, Nevada, USA. International Journal of Remote Sensing 18(17): 1553–1570.
Richards, J.A., and X. Jia. 1999. Remote sensing digital image analysis. New York: Springer.
Roberts, D.A., M. Gardner, R. Church, S.L. Ustin, G. Scheer, and R.O. Green. 1998. Mapping
chaparral in the Santa Monica mountains using multiple endmember spectral mixture model.
Remote Sensing of Environment 65: 267–279.
Robinove, C.J. 1982. Computation with physical values from Landsat digital data. Photogram-
metric Engineering and Remote Sensing 48: 781–784.
Roger, R.E. 1994. A fast way to compute the noise-adjusted principal components transform
matrix. IEEE Transactions on Geoscience and Remote Sensing 32(1): 1194–1196.
———. 1996. Principal components transform with simple, automatic noise adjustment. Interna-
tional Journal of Remote Sensing 17(14): 2719–2727.
Roger, R.E., and J.F. Arnold. 1996. Reliably estimating the noise in AVIRIS hyperspectral
imagers. International Journal of Remote Sensing 17(10): 1951–1962.
Rogge, D.M., B. Rivard, J. Zhang, A. Sanchez, J. Harris, and J. Feng. 2007. Integration of
spatial–spectral information for the improved extraction of endmembers. Remote Sensing of
Environment 110(3): 287–303.
Rosario, D. 2012. A semiparametric model for hyperspectral anomaly detection. Journal of
Electrical and Computer Engineering 2012: Article ID 425947, 30 p.
Sabol, D.E., J.B. Adams, and M.O. Smith. 1992. Quantitative sub-pixel spectral detection of
targets in multispectral images. Journal of Geophysical Research 97: 2659–2672.
Safavi, H., and C.-I Chang. 2008. Projection pursuit-based dimensionality reduction. In SPIE
conference on algorithms and technologies for multispectral, hyperspectral, and ultraspectral
imagery XIV, March 16–20, Orlando, FL.
Safavi, H., K. Liu, and C.-I Chang. 2011. Dynamic dimensionality reduction for hyperspectral
imagery. In SPIE conference on algorithms and technologies for multispectral, hyperspectral
and ultraspectral imagery XVII, 25–29 April 2011, Orlando, FL.
Scharf, L.L. 1991. Statistical signal processing. Boston: Addison-Wesley.
Schowengerdt, R.A. 1997. Remote sensing: models and methods for image processing, 2nd
ed. New York: Academic.
Schultz, R.C., S.Y. Chen, Y. Wang, C. Liu, and C.-I Chang. 2013. Progressive band processing of
anomaly detection. In SPIE conference on satellite data compression, communication and
processing IX (OP 405), San Diego, CA, August 25–29.
Schultz, R.C., M. Hobbs, and C.-I Chang. 2014. Progressive band processing of simplex growing
algorithm for finding endmembers in hyperspectral imagery. In Satellite data compression,
communication and processing X (ST146), SPIE international symposium on SPIE sensing
technology + applications, Baltimore, MD, 5–9 May.
Serpico, S.B., and L. Bruzzone. 2001. A new search algorithm for feature selection in
hyperspectral remote sensing images. IEEE Transactions on Geoscience and Remote Sensing
39(7): 1360–1367.
Settle, J.J. 1996. On the relationship between spectral unmixing and subspace projection. IEEE
Transactions on Geoscience and Remote Sensing 34(4): 1045–1046.
Settle, J.J., and N.A. Drake. 1993. Linear mixing and estimation of ground cover proportions.
International Journal of Remote Sensing 14(6): 1159–1177.
Shimabukuro, Y.E., and J.A. Smith. 1991. The least-squares mixing models to generate fraction
images derived from remote sensing multispectral data. IEEE Transactions on Geoscience and
Remote Sensing 29: 16–20.
Singer, R.B., and T.B. McCord. 1979. Mars: large scale mixing of bright and dark surface
materials and implications for analysis of spectral reflectance. In Proceedings of 10th lunar and
planetary science conference, 1835–1848.
Smith, M.O., P.E. Johnson, and J.B. Adams. 1985. Quantitative determination of mineral types and
abundances from reflectance spectra using principal components analysis. Journal of Geo-
physical Research 90: C797–C904.
Smith, M.O., J.B. Adams, and D.E. Sabol. 1994a. Spectral mixture analysis—new strategies for
the analysis of multispectral data. In Image spectroscopy—a tool for environmental observa-
tions, ed. J. Hill and J. Mergier, 125–143. Brussels and Luxembourg.
Smith, M.O., D.A. Roberts, J. Hill, W. Mehl, B. Hosgood, J. Verdebout, G. Schmuck, C. Koechler,
and J.B. Adams. 1994b. A new approach to quantifying abundances of materials in multispec-
tral images. In Proceedings of IEEE international geoscience and remote sensing sympo-
sium’94, 2372–2374, Pasadena, CA.
Song, M., and C.-I Chang. 2015. A theory of recursive orthogonal subspace projection for
hyperspectral imaging. IEEE Transactions on Geoscience and Remote Sensing 53(6):
3055–3072.
Song, M., Y. Li, C.-I Chang, and L. Zhang. 2014a. Recursive orthogonal vector projection
algorithm for linear spectral unmixing. In IEEE GRSS WHISPERS 2014 conference (workshop
on hyperspectral image and signal processing: evolution in remote sensing), Lausanne,
Switzerland, June 24–27.
Song, M., H.C. Li, C.-I Chang, and Y. Li. 2014b. Gram-Schmidt orthogonal vector projection for
hyperspectral unmixing. In 2014 IEEE international geoscience and remote sensing symposium
(IGARSS), 2934–2937, Quebec, Canada, July 13–18.
Song, M., S.-Y. Chen, H.C. Li, H.M. Chen, C.C.C. Chen, and C.-I Chang. 2015. Finding virtual
signatures for linear spectral unmixing. IEEE Journal of Selected Topics in Applied Earth
Observations and Remote Sensing 8(6): 2704–2719.
Stark, H., and J. Woods. 1994. Probability, random processes, and estimation for engineers, 3rd
ed. Upper Saddle River: Prentice-Hall.
Stearns, S.D., B.E. Wilson, and J.R. Peterson. 1993. Dimensionality reduction by optimal band
selection for pixel classification of hyperspectral imagery. Applications of Digital Image
Processing XVI, SPIE 2028: 118–127.
Stein, P. 1966. A note on the volume of a simplex. American Mathematical Monthly 73(3):
299–301.
Stein, D.W., S.G. Beaven, L.E. Hoff, E.M. Winter, A.P. Schaum, and A.D. Stocker. 2002.
Anomaly detection from hyperspectral imagery. IEEE Signal Processing Magazine 19(1):
58–69.
Stellman, C.M., G.G. Hazel, F. Bucholtz, J.V. Michalowicz, A. Stocker, and W. Scaaf. 2000. Real-
time hyperspectral detection and cuing. Optical Engineering 39(7): 1928–1935.
Swayze, G.A. 1997. The hydrothermal and structural history of the Cuprite Mining District,
southwestern Nevada: an integrated geological and geophysical approach. Ph.D. dissertation,
University of Colorado Boulder.
Swets, J.A., and R.M. Pickett. 1982. Evaluation of diagnostic systems: methods from signal
detection theory. New York: Academic.
Tarabalka, Y., T.V. Haavardsholm, I. Kasen, and T. Skauli. 2009. Real-time anomaly detection in
hyperspectral images using multivariate normal mixture models and GPU processing. Journal of
Real-Time Image Processing 4(3): 287–300.
Thai, B., and G. Healey. 1999. Invariant subpixel material identification in hyperspectral imagery.
In SPIE, vol. 3717.
———. 2002. Invariant subpixel material detection in hyperspectral imagery. IEEE Transactions
on Geoscience and Remote Sensing 40(3): 599–608.
Theiler, J., D. Lavenier, N.R. Harvey, S.J. Perkins, and J.J. Szymanski. 2000a. Using blocks of
skewers for faster computation of pixel purity. In Proceedings of SPIE, vol. 4132, 61–71.
Theiler, J., J. Frigo, M. Gokhle, and J.J. Szymanski. 2000b. FPGA implementation of pixel purity
index method. In Proceedings of SPIE, vol. 4132.
Tompkins, S., J.F. Mustard, C.M. Pieters, and D.W. Forsyth. 1997. Optimization of targets for
spectral mixture analysis. Remote Sensing of Environment 59: 472–489.
Tou, J.T., and R.C. Gonzalez. 1974. Pattern recognition principles, 92–94. Reading, MA: Addi-
son-Wesley.
Tu, T.M. 2000. Unsupervised signature extraction and separation in hyperspectral images: a noise-
adjusted fast independent component analysis approach. Optical Engineering 39(4): 897–906.
Tu, T.M., C.-H. Chen, and C.-I Chang. 1997. A posteriori least squares orthogonal subspace
projection approach to weak signature extraction and detection. IEEE Transactions on Geo-
science and Remote Sensing 35(1): 127–139.
Tu, T.M., H.C. Shy, C.-H. Lee, and C.-I Chang. 1999. An oblique subspace projection to mixed
pixel classification in hyperspectral images. Pattern Recognition 32(8): 1399–1408.
Tu, T.-M., C.H. Lee, C.S. Chiang, and C.P. Chang. 2001. A visual disk approach for determining
data dimensionality in hyperspectral imagery. Proceedings of National Science Council 25(4):
219–231.
Tzou, K.H. 1987. Progressive image transmission: a review and comparison of techniques. Optical
Engineering 26(7): 581–589.
Valencia, D., A. Plaza, M.A. Vega-Rodriguez, and R.M. Perez. 2005. FPGA design and imple-
mentation of a fast pixel purity index algorithm for endmember extraction in hyperspectral
imagery. In SPIE East, Boston.
Van Veen, B.D., and K.M. Buckley. 1988. Beamforming: a versatile approach to spatial filtering.
IEEE ASSP Magazine 5: 4–24.
Wang, J. 2006a. Applications of independent component analysis to hyperspectral data exploita-
tion. Ph.D. dissertation, Department of Computer Science and Electrical Engineering, Univer-
sity of Maryland, Baltimore County, Baltimore, MD.
Wang, S. 2006b. Statistical signal processing applications to hyperspectral signature character-
ization. Ph.D. dissertation, Department of Computer Science and Electrical Engineering,
University of Maryland, Baltimore County, Baltimore, MD.
Wang, J., and C.-I Chang. 2004. A uniform projection-based unsupervised detection and classi-
fication for hyperspectral imagery. In IEEE international geoscience and remote sensing
symposium, Alaska, September 20–24.
———. 2005. Dimensionality reduction by independent component analysis for hyperspectral
image analysis. In IEEE International geoscience and remote sensing symposium, Seoul,
Korea, July 25–29.
———. 2006a. A low probability detector-based unsupervised background suppression, target
detection and classification for hyperspectral imagery. In Recent advances in hyperspectral
signal and image processing, ed. C.-I Chang, 141–169. Trivandrum: Research Signpost, India.
———. 2006b. Independent component analysis-based dimensionality reduction with applica-
tions in hyperspectral image analysis. IEEE Transactions on Geoscience and Remote Sensing
44(6): 1586–1600.
———. 2006c. Applications of independent component analysis in endmember extraction and
abundance quantification for hyperspectral imagery. IEEE Transactions on Geoscience and
Remote Sensing 44(9): 2601–2616.
———. 2007. Variable-number variable-band selection for feature characterization in
hyperspectral signatures. IEEE Transactions on Geoscience and Remote Sensing 45(9):
2979–2992.
———. 2016. Multiple band selection for anomaly detection in hyperspectral imagery. In 2016
IEEE geoscience and remote sensing symposium, Beijing, China, July 10–15.
Wang, L., and M. Goldberg. 1988. Progressive image transmission by transform coefficient
residual error quantization. IEEE Transactions on Communications 36(1): 75–87.
———. 1989. Progressive image transmission using vector quantization on images in pyramid
form. IEEE Transactions on Communications 37(12): 1339–1349.
Wang, C.M., C.C. Chen, S.-C. Yang, Y.-N. Chung, P.C. Chung, C.W. Yang, and C.-I Chang. 2002.
An unsupervised orthogonal subspace projection approach to MR image classification.
Optical Engineering 41(7): 1546–1557.
Wang, J., C.-I Chang, C.-C. Chang, and C. Lin. 2004a. Binary coding for remotely sensed imagery.
In 49th annual meeting, SPIE international symposium on optical science and technology,
imaging spectrometry IX (AM105), Denver, CO, August 2–4.
Wang, S., C.-I Chang, J.L. Jensen, and J.O. Jensen. 2004b. Spectral abundance fraction estimation
of materials using Kalman filters. In Optics east, chemical and biological standoff detection II
(OE120), Philadelphia, PA, October 25–28.
Wang, J., C.-I Chang, H.-M. Chen, C.C.C. Chen, J.W. Chai, and Y.C. Ouyang. 2005. 3D ROC
analysis for medical diagnosis evaluation. In 27th annual international conference of IEEE
engineering in medicine and biology society (EMBS), September 1–4, 2005, Shanghai, China.
Wang, L., X. Jia, and Y. Zhang. 2007. A novel geometry-based feature-selection technique for
hyperspectral imagery. IEEE Geoscience and Remote Sensing Letters 4(1):
171–175.
Wang, Y., L. Guo, and N. Liang. 2009. Using a new search strategy to improve the performance of
N-FINDR algorithm for endmember determination. In 2nd International congress on signal
and image processing, Tianjin, China.
Wang, S., C.-M. Wang, M.-L. Chang, and C.-I Chang. 2010. New applications of Kalman filtering
approach to hyperspectral signature estimation, identification and abundance quantification.
IEEE Sensors Journal 10(3): 547–563.
Wang, L., F. Wei, D. Liu, and Q. Wang. 2013a. Fast implementation of maximum simplex
volume-based endmember extraction in original hyperspectral data space. IEEE Journal of
Selected Topics in Applied Earth Observations and Remote Sensing 6(2): 516–521.
Wang, L., D. Liu, and Q. Wang. 2013b. Geometric method of fully constrained least squares linear
spectral mixture analysis. IEEE Transactions on Geoscience and Remote Sensing 51(6):
3558–3566.
Wang, Y., R.C. Schultz, S.Y. Chen, C. Liu, and C.-I Chang. 2013c. Progressive constrained
energy minimization for subpixel detection. In SPIE conference on algorithms and technolo-
gies for multispectral, hyperspectral and ultraspectral imagery XIX, 29 April–3 May 2013,
Baltimore, MD.
Wang, Y., S.Y. Chen, C. Liu, and C.-I Chang. 2014a. Background suppression issues in anomaly
detection for hyperspectral imagery. In Satellite data compression, communication and
processing X (ST146), SPIE international symposium on SPIE sensing technology
+ applications, Baltimore, MD, 5–9 May.
Wang, Y., C.H. Zhao, and C.-I Chang. 2014b. Anomaly detection using sliding causal windows. In
2014 IEEE international geoscience and remote sensing symposium (IGARSS), Quebec, Canada,
July 13–18.
Winter, M.E. 1999a. Fast autonomous spectral endmember determination in hyperspectral data. In
Proceedings of 13th international conference on applied geologic remote sensing, vol. II,
337–344, Vancouver, BC, Canada.
———. 1999b. N-FINDR: an algorithm for fast autonomous spectral endmember determination in
hyperspectral data. In Image spectrometry V, Proceedings of SPIE, vol. 3753, 266–277.
———. 2004. A proof of the N-FINDR algorithm for the automated detection of endmembers in a
hyperspectral image. In Proceedings of SPIE, vol. 5425, 31–41.
Winter, E.M., M.J. Schlangen, A.B. Hill, C.G. Simi, and M.E. Winter. 2002. Tradeoffs for real-
time hyperspectral analysis. In Proceedings of SPIE, vol. 4725, Algorithms and technologies
for multispectral, hyperspectral, and ultraspectral imagery VIII, 366–371.
Wong, W.W. 2003. Application of linear algebra. http://convexoptimization.com/TOOLS/gp-r.pdf.
Wu, C.-C. 2006. Exploration of methods of estimation on number of endmember. MS thesis,
Department of Computer Science and Electrical Engineering, University of Maryland, Balti-
more County, Baltimore, MD.
———. 2009. Design and analysis of maximum simplex volume-based endmember extraction
algorithms. Ph.D. dissertation, Department of Computer Science and Electrical Engineering,
University of Maryland, Baltimore County, Baltimore, MD.
Wu, C.-C., and C.-I Chang. 2006. Automatic algorithms for endmember extraction. In SPIE
conference imaging spectrometry XI, SPIE symposium on optics and photonics, vol. 6302,
13–17 August, San Diego, CA.
———. 2012. Iterative pixel purity index. In 4th IEEE GRSS workshop on hyperspectral image
and signal processing—evolution in remote sensing (WHISPERS), 12–14 June, Shanghai,
China.
Wu, C.-C., W. Liu, H. Ren, and C.-I Chang. 2007. A comparative study and analysis between
vertex component analysis and orthogonal subspace projection for endmember extraction. In
Algorithms and technologies for multispectral, hyperspectral, and ultraspectral imagery XIII,
SPIE defense and security symposium, Orlando, FL, April 9–13.
Wu, C.C., S. Chu, and C.-I Chang. 2008. Sequential N-FINDR algorithm. In SPIE conference on
imaging spectrometry XIII, August 10–14, San Diego.
Wu, C.-C., C.S. Lo, and C.-I Chang. 2009. Improved process for use of a simplex growing
algorithm for endmember extraction. IEEE Geoscience and Remote Sensing Letters 6(3):
523–527.
Wu, C.-C., H.M. Chen, and C.-I Chang. 2010. Real-time N-finder processing algorithms. Journal of
Real-Time Image Processing 7(2): 105–129. doi:10.1007/s11554-010-0151-z.
Wu, C.C., G.S. Huang, K.H. Liu, and C.-I Chang. 2013. Real-time progressive band processing of
modified fully abundance-constrained spectral unmixing. In 2013 IEEE international geoscience
and remote sensing symposium, 21–26 July, Melbourne, Australia.
Xiong, W. 2011. Estimation of effective spectral dimensionality for hyperspectral imagery.
Department of Computer Science and Electrical Engineering, University of Maryland, Balti-
more County, Baltimore, MD.
Xiong, W., C.T. Tsai, C.W. Yang, and C.-I Chang. 2010. Convex cone-based endmember extraction
for hyperspectral imagery. In SPIE, vol. 7812, San Diego, CA, August 2–5.
Xiong, W., C.-C. Wu, C.-I Chang, K. Kalpakis, and H.M. Chen. 2011. Fast algorithms to
implement N-FINDR for hyperspectral endmember extraction. IEEE Journal of Selected
Topics in Applied Earth Observations and Remote Sensing 4(3): 545–564.
Yang, S., J. Wang, C.-I Chang, J.L. Jensen, and J.O. Jensen. 2004. Unsupervised image classifi-
cation for remotely sensed imagery. In 49th Annual meeting, SPIE international symposium on
optical science and technology, imaging spectrometry IX (AM105), Denver, CO, August 2–4.
Yang, H., J. An, and C. Zhu. 2014. Subspace-projection-based geometric unmixing for material
quantification in hyperspectral imagery. IEEE Journal of Selected Topics in Applied Earth
Observations and Remote Sensing 7(6): 1966–1975.
Yu, X., I.S. Reed, and A.D. Stocker. 1993. Comparative performance analysis of adaptive
multispectral detectors. IEEE Transactions on Signal Processing 41(8): 2639–2656.
Zare, A., and P. Gader. 2007. Sparsity promoting iterated constrained endmember detection in
hyperspectral imagery. IEEE Geoscience and Remote Sensing Letters 4(3): 446–450.
Zenzo, S.D., R. Bernstein, S.D. Degloria, and H.C. Kolsky. 1987a. Gaussian maximum likelihood and
contextual classification algorithms for multicrop classification. IEEE Transactions on Geo-
science and Remote Sensing GE-25(6): 805–814.
Zenzo, S.D., R. Bernstein, S.D. Degloria, and H.C. Kolsky. 1987b. Gaussian maximum likelihood
and contextual classification algorithms for multicrop classification experiments using the-
matic mapper and multispectral scanner sensor data. IEEE Transactions on Geoscience and
Remote Sensing GE-25: 815–824.
Zhang, Y., and M.G. Amin. 2001. Array processing for nonstationary interference suppression in
DS/SS communications using subspace projection techniques. IEEE Transactions on Signal
Processing 49(12): 3005–3014.
Zhang, X., and C.H. Chen. 2002. New independent component analysis method using higher order
statistics with applications to remote sensing images. Optical Engineering 41: 1717–1728.
Zhao, L., S.Y. Chen, M. Fan, and C.-I Chang. 2014. Endmember-specified virtual dimensionality
in hyperspectral imagery. In 2014 IEEE international geoscience and remote sensing symposium
(IGARSS), Quebec, Canada, July 13–18.
Zortea, M., and A. Plaza. 2009. A quantitative and comparative analysis of different
implementations of N-FINDR: a fast endmember extraction algorithm. IEEE Geoscience
and Remote Sensing Letters 6(4): 787–791.
Index

A BIL, 199–205
Abundance nonnegativity constraint (ANC), CBCRM, 423–425
265, 320, 358 CK-AD, 161
Abundance sum-to-one constraint computational complexity, 169–171
(ASC), 265–267, 320, 358 b-simulated image, 171–172
Adaptive GSVA (AGSVA), 336–337 Intel i5-2500 3.3 GHz CPU and 8 GB
Adaptive linear mixing model (ALMM), 270 RAM, 177–178
Adaptive recursive hyperspectral sample mixed and subpanel pixels, 171–172
processing of LSMA per pixel vector, 178–179
(ARHSP-LSMA), 264 target embeddedness, 174–177, 180
Gaussian distribution, 280 target implanted, 172–175, 180
implementation, 276 computing time analysis, 439–441, 446
linear spectral unmixing-based finding CR-AD, 161
signatures, 278–279 features, 158
OSP-based finding signatures, 277–278 GUI design, 444, 447
posteriori probability distribution, 281 HYDICE image, 185–193
randomized decision rule, 281 area under curve, 434, 435, 437–439
Airborne Visible InfraRed Imaging AVIRIS, 435, 440
Spectrometer (AVIRIS), 343, 578 detected abundance fractions, 429–431
anomaly detection, 181–185 K-AD detection map, 160–161, 436,
CEM, 149–156 442, 445–446
virtual dimensionality processed bands, 429, 432–436
LCVF scene, 18–19, 115 R-AD detection map, 161, 435, 436,
2OS-and HOS-based methods, 116 441, 443–444
reflectance and radiance data, 112–114 scene and ground truth map, 428–429
scene, Cuprite, 15–18, 112 3D plots, 429, 432
signal energy and strength, MOCA and matrix inverse calculation
ATGP, 116–118 causal sample covariance/correlation
Analytical Imaging and Geophysics (AIG), 544 matrix, 162–164
Anomaly detection (AD) sample correlation matrix identity, 166
AVIRIS data, 181–185 Woodbury matrix identity, 165
background suppression, 182–184, PHBP-AD, 422–423
192–199 real-time causal K-AD, 168–169

Anomaly detection (AD) (cont.)
real-time causal R-AD, 167–168 problem, 216
recursive equation, 631–632 computational complexity, 223–225
RHBP-K-AD, 426–428 flowchart, 213–214
RHBP-R-AD, 425–426 matrix inverse identity, 211–212
sample spectral correlation MOSP, 215
matrix, 641–642 Neyman–Pearson detection theory,
ASC. See Abundance sum-to-one constraint 216–217
(ASC) stopping rule, 221–223
Automatic target detection and classification target signal sources, 212–213
algorithm (ATDCA). See Automatic TSVD, 215–216
target generation process (ATGP) SGA, 452
Automatic target generation process (ATGP), synthetic image data, 217–220
293, 321, 359, 486 target subspace, 210
advantages, 452 TE experiments, 462–463, 467–470
bit plane coding, 452 TI experiments, 462–467
BSQ format, 452 VCA, 452
computing time analysis, 477–479
GUI, 478–480
hyperspectral target detection, 452, 476 B
implementation, 454–455 Band fusion (BF), 650
information, 632 Band-inter-leaved-by-pixel (BIP), 133–135,
OPCI, 211 199–205, 484, 544–545, 628, 630
2OS-based theory, 95–97 Band-interleaved-by-sample (BIS), 484,
OSP, 643 544–545, 628, 630
PHBP-ATGP, 452–453 Band selection (BS), 9, 505, 530
PPI, 453–454 Band-sequence (BSQ) format, 3, 4, 128–129,
real image experiments, 473 484, 544–545, 628
generated target pixels, 472, 474 Band tuning (BT), 650–651
ground truth map, 470, 471 Bit plane coding, 484
HYDICE image, 470, 471
minimal nl, 469, 472, 474
Neyman–Pearson detection theory, 476 C
number of signatures, 475–476 Causal band correlation matrix (CBCRM)
orders of panel pixels extraction, 475 anomaly detection, 423–425
spatial and spectral resolution, 470 definition, 401–402
3D detection maps, 471, 472 information types, 403
3D histogram, 476 inverse of, 402–403
remote sensing imagery, 452 Causal K-AD (CK-AD), 161
RHBP-ATGP Causal R-AD (CR-AD), 161
advantage, 477 Causal sample correlation matrix (CSCRM),
end loop, 458–459 631, 640
flowchart, 459 CCA. See Convex cone analysis (CCA)
inner loop, 458 CCVA. See Convex cone volume analysis
l bands processing, 460 (CCVA)
MRHBP-ATGP, 460, 461 Computer processing time analysis
number of processed bands, 460 cumulative computing time, 350, 353
outer loop, 458 OP-based algorithm, 354
recursive equations, 455–458 processing times, 353, 354
target vector, 460–461 VCA, 354–356
RHSP-ATGP Constrained energy minimization (CEM)
advantages, 214–215 active detection, 126
advantages, 125–126, 631
AVIRIS data, 149–156 optimal Kalman gain, 53–54
BIL format, 133–135 orthogonality principle, 54–56
BSQ format, 128–129 posteriori state estimates, 51–53
causal process, 7 prediction, 56–58
CBCRM, 401–403 priori state estimates, 51–53
computational complexity, 135–137 state equation, 50
FIR, 126–127
GUI, 419–420
HYDICE data, 137, 140–148 E
detected abundance fractions, 407–411 Earth Observer 1 (EO-1) satellite, 414
ground truth map, 405–406 Endmember extraction algorithms (EEAs),
panel signatures, 406–407 74, 262
ROC curves, 411–415 Endmember finding algorithm (EFA), 31–32,
3D progressive plots, 407, 412–414 358, 629
hyperion data ENvironment for Visualizing Images (ENVI)
areas of interest, 414–417 software system, 544
computational time, 418–419
EO-1 satellite, 414
progressive performance, 419 F
Westinghouse Bay signature, 418 False alarm probability, 240
hyperspectral image, 127 Fast iterative pixel purity index (FIPPI),
implementation, 640–641 551, 597–599
linearly constrained optimization problem, Finite impulse filter (FIR), 126–127
127–128 Fully constrained least-squares (FCLS)
mixed pixel issues, 628–629 method, 94–95
near real time, 125
PHBP-CEM, 400–401
recursive formula, 403–405 G
RT-CEM Gauss–Markov random process, 6
BIP/BIS format, 129–132 Geographical information system (GIS), 628
data sample/line, 129 Geometric simplex growing algorithm
detection maps, 129–130 (GSGA), 636
subpixel detection, 399–400, 628–629 ASC and ANC, 358
time analysis, 148–149 ATGP, 359–360
Convex cone analysis (CCA), 320 computational complexity
Convex cone volume analysis (CCVA), 320 Dist-SGA, 379
Cuprite Mining District image data, 24–25 DSGA, 360, 379
Gram–Schmidt orthogonalization
process, 380
D OPSGA, 360, 379–380
Determinant-based SGA (DSGA), 322, 361 recursive GSGA, 380
Determinant-based SV (DSV), 33, 321, 359, recursive OPSGA, 380
635 Cramer’s rule, 364
Discrete-time Kalman filtering DSV, 359, 365
information, 49–50 EFAs, 358
KF-SCSP end loop, 538
advantages, 61 equation, 534, 536–538
KF-SSE, 62–64, 68–69 flowchart, 538
KF-SSI, 64–67, 69–70 GSOP, 360, 366–370
KF-SSQ, 67–68, 70–71 GSV, 362–363
one-dimensional signal processing, 61 j-dimensional vector, 535
LSMA, 58–60 l-dimensional vector, 535
Geometric simplex growing algorithm (GSGA) (cont.)
  N-FINDR, 358
  number of endmembers, 378
  outer loop, 537
  p-dimensional abundance vector, 364
  PPI, 358
  real image experiments
    computer processing time analysis, 392–395
    HYDICE data, 381–384
    radiance data, 388–392
    reflectance data, 384–388
  recursive hyperspectral sample processing
    orthogonal subspace projection-based RHSP-GSGA, 370–373
    orthogonal vector projection-based RHSP-GSGA, 373–376
    RHSP-GSGA vs. RHSP-OPSGA, 376–377
  SVD, 359
Geometric simplex volume (GSV), 635–636
  calculation, 321
  Cayley–Menger determinant, 37
  corollary, 42–43
  vs. DSV, 44–46
  hypertetrahedron, 34
  initial condition, 40–42
  orthogonal projection, 39–40
  parallelotope, 37–39
  three-dimensional four-vertex simplex, 34–36
  three-dimensional parallelotope, 35–36
  three-endmember-vertex simplex, 38–39
  two-dimensional three-vertex simplex, 34–36
Gram–Schmidt orthogonalization process (GSOP), 360
Graphical user interface (GUI), 586, 589, 594
  anomaly detection, 444, 447
  ATGP, 478–480
  RHBP-CEM, 419–420
Growing simplex volume analysis (GSVA)
  band prioritization, 530
  band selection, 530
  N-FINDR, 530
  OPSGA, recursive equations, 531–533
  real image experiments
    endmembers identification, 540, 541
    ground truth map, 539
    GSGA vs. RHBP-GSGA, 539, 540
    HYDICE image, 539
    progressive magnitude changes, 540, 541
  recursive hyperspectral band processing
    GSGA, 534–538
    OPSGA, 533–535
    SGA, 530
    SV-based endmember finding algorithms, 530

H
Harsanyi–Farrand–Chang (HFC) method, 74–76, 81–82, 282, 283
High-order statistics (HOS)-specified target, 97–99
Hyperion data
  areas of interest, 23–24, 414–417
  computational time, 418–419
  EO-1 satellite, 414
  progressive performance, 419
  Westinghouse Bay signature, 418
Hyperspectral band subset selection, 650–651
HYperspectral Digital Imagery Collection Experiment (HYDICE), 137, 140–148, 244–255, 281, 578, 613–614
  abundance fractions, 284, 286
  anomaly detection, 185–193
    area under curve, 434, 435, 437–439
    AVIRIS, 435, 440
    detected abundance fractions, 429–431
    K-AD detection map, 436, 442, 445–446
    processed bands, 429, 432–436
    R-AD detection map, 435, 436, 441, 443–444
    scene and ground truth map, 428–429
    3D plots, 429, 432
  ATGP, UNCLS, and UFCLS, 283
  background signatures, 23
  CEM
    detected abundance fractions, 407–411
    ground truth map, 405–406
    panel signatures, 406–407
    ROC curves, 411–415
    3D progressive plots, 407, 412–414
  noise-whitened HFC, 282
  panel pixels extract, 284, 285
  panel scene, 281–282
  real image experiments, 519
  spatial locations, 21
  spectral signatures, 22
  target pixels extract, 284, 285
  TSVD, 110–112
  values of nRHSP-LSMA, 281–282
Hypertetrahedron, 34
I
Innovation information, 5
Interband spectral information (IBSI), 78, 87–88
Iterative pixel purity index (IPPI), 647
  background, 604
  C-IPPI, implementation of, 550–553
  FIPPI, 551, 597–599
  HYDICE image, 613–614
    nVD = 9, 615–618, 620, 623
    nVD = 18, 24, 615–622
  P-IPPI
    advantage, 549, 552
    growing skewer set, 553–554
    implementation, 549, 551, 553
  recursive hyperspectral band processing, 600–601
  recursive skewer processing, 602–603
  RHBP
    fixed skewer set, 555–557
    general algorithm, 555
    varying skewer sets with bands, 557–560
  TE experiments
    cyan upper triangles, 609
    endmember candidates, 608–611
    ground truth pixels, 610, 612
    number of processed bands, 611, 613
    spatial locations, 610, 612, 613
  TI experiments, 605–608
  VC-IPPI, 553

K
Kalman filter (KF), 361, 485
Kalman filter-based spectral signature estimator (KF-SSE), 62–64
Kalman filter-based spectral signature identifier (KF-SSI), 64–67, 69–70
Kalman filter-based spectral signature quantifier (KF-SSQ), 67–68, 70–71
Kalman-filter spectral characterization signal processing (KF-SCSP)
  advantages, 61
  KF-SSE, 62–64, 68–69
  KF-SSI, 64–67, 69–70
  KF-SSQ, 67–68, 70–71
  one-dimensional signal processing, 61
K-based anomaly detector (K-AD), 160–161

L
Least-squares orthogonal subspace projection (LSOSP), 294, 485
  UFCLS algorithm, 94–95
  ULSOSP algorithm, 92–93
  UNCLS algorithm, 94
Linear spectral mixture analysis (LSMA), 229–230
  ANC, 265, 267–269
  ARHSP-LSMA, 264
    Gaussian distribution, 280
    implementation, 276
    linear spectral unmixing-based finding signatures, 278–279
    OSP-based finding signatures, 277–278
    posteriori probability distribution, 281
    randomized decision rule, 281
    TSVD, 279
    ULSMA, 275
  ASC, 265–267
  band selection, 505
    benefits, 507
    computational complexity, 510–511
    innovation information, 509
    processed information, 509
    recursive process, 509
  EEAs, 262
  FCLS algorithm, 269
  HYDICE, 281
    abundance fractions, 284, 286
    ATGP, UNCLS, and UFCLS, 283
    noise-whitened HFC, 282
    panel pixels extract, 284, 285
    panel scene, 281–282
    target pixels extract, 284, 285
    values of nRHSP-LSMA, 281–282
  implementation, 634
  Kalman filter, 58–60
  least-squares, 265, 633
  LSU, 263
  N-FINDR, 262
  OSP, 265, 644–646
  PHBP-LSMA, 506
  PHBP vs. BS, 506
  real image experiments, 519
    ATGP-generated BKG and target pixels, 520, 522–523
    computing time, 526, 527
    ground truth map, 519
    RHBP-NCLS and RHBP-FCLS
      abundance fractions, 520–523
      19 R panel pixels, 523–525
    spatial and spectral resolution, 520
  RHBP-LSMA, 506, 507, 512–513
  RHSP-LSMA, 264
    ALMM, 270
    single signatures, 270–273
Linear spectral mixture analysis (LSMA) (cont.)
    two signature-varying matrices, 274–275
  SLSMA and ULSMA, 262
  synthetic image experiments
    computing times, 517–519
    RHBP-NCLS and RHBP-FCLS-unmixed results, 515–517
    simulation, 513–514
    target embeddedness, 514–515
    target implantation, 514, 515
  UCLS, 644–646
  update equation, 507–509
  VD, 262–263
  VE and VS, 263
  virtual dimensionality, 74, 506
Linear spectral unmixing (LSU), 58, 263, 505, 506
LSOSP. See Least-squares OSP (LSOSP)
Lunar Crater Volcanic Field (LCVF) scene, 115

M
Matrix identities, 653
Matrix-inverse identity, 653
Matrix-vector inverse identity, 653
Maximal orthogonal complement algorithm (MOCA), 77, 82–86, 338
Maximal orthogonal subspace projection (MOSP), 215, 338
Maximum likelihood estimation (MLE)
  endmember, 303
  LSMA, 634
  MEAC, 290
  OSP, 289–290
  real image experiments, 310–312
  recursive prediction error equation, 634
  RHSP-LS-based algorithm, 297–299
  RHSP-MLE, 299–301
  SNR, 290
  stopping rule, 301–303
  synthetic image
    MEAC, 307–308
    nVS estimates, 306
    set of 25 panels, 304–305
    target embeddedness, 305
    target implantation, 305
    VSs, 290, 304, 308–310, 635
  unmixed error analysis
    FCLS, 313–316
    HFC method, 313
    sudden drops, 313, 317
    whitened data, 313
  VD, 290, 304
  virtual signatures
    least-squares estimation, 291–292
    MLE-based algorithm, 295–297
    MLE error matrix, 295
    OSP-based algorithm, 293–295
Minimax-singular value decomposition (MX-SVD), 84–85
Minimum estimated abundance covariance (MEAC), 290
Minimum volume transform (MVT), 320
Multispectral imaging (MSI), 648–649

N
NBE iterative multispectral imaging (NBE-IMSI), 649
Neyman–Pearson detector (NPD), 6, 101, 216–217, 340, 476
N-finder algorithm (N-FINDR), 262, 358, 530, 646
Noise-whitened HFC (NWHFC) method, 282
Notations and terminology, 28

O
OP-based Pixel Purity Index (PPI), 358
Orthogonal projection-based SGA (OPSGA), 321, 328–329
  DSV, 323
  finding endmembers, 324
  GSV, 325–326
  recursive equations, 531–533
  recursive hyperspectral band processing, 533–535
  three-endmember simplex, 327
  2D and 3D simplexes, 323–324
  2D three-vertex simplex, 323
Orthogonal projection correlation index (OPCI), 211
Orthogonal subspace projection (OSP), 265, 452
  ATGP, 486, 643–644
  band-interleaved-line, 484
  band-interleaved-pixel, 484
  band-interleaved-sample, 484
  BSQ, 484
  computational complexity, 255–257
  computing time analysis, 499–501
  graphical user interface design, 502
  history, 484
Orthogonal subspace projection (OSP) (cont.)
  HYDICE, 244–255
  implementation, 633
  LSMA, 229–230
  LSOSP, 485
  PHBP version, 484–485
  progressive image processing, 484
  real image experiments, 496–499
  RHBP-OSP
    flow chart, 489–490
    recursive equations, 486–489
  RHSP-OSP
    automatic stopping rule, 241–244
    computational complexity, 235–236
    implementation, 230
    issues, 237
    recursive update equations, 231–235
    time-consuming process, 235
    unsupervised target signal sources, 237–238
    unwanted target signal sources, 238–240
  signal signatures, 228–229
  signal-to-noise ratio, 485
  stages, 484
  synthetic image experiments
    simulation, 490–491
    TE experiments, 494–496
    TI experiments, 491–494

P
Pixel purity index (PPI), 453–454
  advantage, 544
  BIS/BIP vs. BSQ, 544–545
  disadvantages, 546, 547
  endmembers, 544, 546, 547
  ENVI software system, 544
  IPPI (see Iterative pixel purity index (IPPI))
  MATLAB algorithm, 548
  PHBP, 544–545
  real image experiments
    endmember candidates vs. number of processed bands, 582, 584
    ground truth map, 578, 580
    HYDICE panel scene, 578, 580
    IPPI and RHBP-IPPI, 582, 584
    panel pixels, 578, 580, 582, 583
    pixels extraction, 585
    PPI count vs. nl, 580–581
    PPI count vs. number of bands and pixels, 586, 590–592
    PPI count vs. number of processed bands and skewers, 585, 587–589
    RHBP-C-IPPI process, 586, 593
    R panel pixels, in first column, 585, 586
  RHBP
    efficient and effective, 545
    fixed skewer set, 545
    varying skewer sets with bands, 545
  skewers, 546, 547
  virtual dimensionality, 547
Processed data information, 5
Progressive hyperspectral band processing (PHBP), 9–10, 484–485, 544–545
Progressive hyperspectral band processing of ATGP (PHBP-ATGP), 452–453
Progressive hyperspectral band processing of CEM (PHBP-CEM), 400–401
Progressive hyperspectral band processing of LSMA (PHBP-LSMA), 506
Progressive hyperspectral imaging (PHSI), 3–4
Progressive iterative PPI (P-IPPI)
  advantage, 549, 552
  growing skewer set, 553–554
  implementation, 549, 551, 553
Projection vector generation algorithm (PVGA), 97–98

R
Real-time causal K-AD, 168–169
Real-time causal R-AD, 167–168
Real-time constrained energy minimization (RT-CEM)
  BIP/BIS format, 129–132
  data sample/line, 129
  detection maps, 129–130
Receiver operating characteristic (ROC) analysis, 411–415
Recursive hyperspectral band processing, 600–601
Recursive hyperspectral band processing of LSMA (RHBP-LSMA), 506
Recursive hyperspectral sample processing of ATGP (RHSP-ATGP)
  advantages, 214–215
  binary composite hypothesis testing problem, 216
  computational complexity, 223–225
  flowchart, 213–214
  matrix inverse identity, 211–212
  MOSP, 215
  Neyman–Pearson detection theory, 216–217
Recursive hyperspectral sample processing of ATGP (RHSP-ATGP) (cont.)
  stopping rule, 221–223
  target signal sources, 212–213
  TSVD, 215–216
Recursive hyperspectral sample processing of LSMA (RHSP-LSMA), 264
  ALMM, 270
  single signatures, 270–273
  two signature-varying matrices, 274–275
Recursive hyperspectral sample processing of OSP (RHSP-OSP)
  automatic stopping rule, 241–244
  computational complexity, 235–236
  implementation, 230
  issues, 237
  recursive update equations, 231–235
  time-consuming process, 235
  unsupervised target signal sources, 237–238
  unwanted target signal sources, 238–240
Recursive skewer processing, 602–603
Recursive update algorithm (RUA), 234–235

S
Second-order-statistics (2OS)-based theory, 77
  ATGP-specified targets, 8, 95–97
  LSOSP
    UFCLS algorithm, 94–95
    ULSOSP algorithm, 92–93
    UNCLS algorithm, 94
  OSP-specified targets, 8
    linear detection system, 90
    pixel vector, 89
    pseudo-inverse, 89
    signal-to-noise ratio, 90–92
    UOSP algorithm, 92
SeQuential N-FINDR (SQ N-FINDR), 320
Shannon's information theory, 340
Signal-to-noise ratio (SNR), 290
Simplex growing algorithm (SGA), 452, 530
  ATGP, 321
  CCA and CCVA, 320
  computer processing time analysis
    cumulative computing time, 350, 353
    OP-based algorithm, 354
    processing times, 353, 354
    VCA, 354–356
  DSGA, 322
  DSV calculation, 321
  endmember finding algorithms, 320
  GSGA, 646–647
  GSVA-based algorithms
    AGSVA, 336–337
    1-GSVA, 335–336
  MVT deflates, 320
  N-FINDR, 646
  number of endmembers
    binary composite hypothesis testing problem, 339
    binary hypothesis testing problem, 341
    cumulative distribution function, 340
    MOSP and MOCA, 338
    Neyman–Pearson detection problems, 338, 340
    posteriori probability distribution, 340
    Shannon's information theory, 340
  OPSGA, 323, 328–329, 646–647
    finding endmembers, 324
    GSV, 325–326
    three-endmember simplex, 327
    2D and 3D simplexes, 323–324
    2D three-vertex simplex, 323
  real image experiments
    AVIRIS image scene, 343, 345–346
    comparative plots, 350–351
    Cuprite radiance data, 346–347
    Cuprite reflectance data, 346–347
    EIDA, 345–346
    endmember pixels, 344, 347–350
    HYDICE data, 342–343
    SAM/SID of closest endmembers, 350, 353
    USGS Web site, 343
  recursive OP-simplex growing algorithm
    recursive GSV calculation, 329–331
    RHSP-OPSGA, 333–335
    RHSP-OPSGA equations, 331–333
  simplex volume analysis, 323
Simplex volume (SV)-based endmember finding algorithms (EFAs), 530
Simplex volume (SV) calculation, 323
  DSV, 33
  EFAs, 31–32
  eigenanalysis methods, 32
  GSV
    Cayley–Menger determinant, 37
    corollary, 42–43
    vs. DSV, 44–46
    hypertetrahedron, 34
    initial condition, 40–42
    orthogonal projection, 39–40
    parallelotope, 37–39
Simplex volume (SV) calculation (cont.)
  GSV (cont.)
    three-dimensional four-vertex simplex, 34–36
    three-dimensional parallelotope, 35–36
    three-endmember-vertex simplex, 38–39
    two-dimensional three-vertex simplex, 34–36
  HYDICE, 44, 45
  PCA-DSV vs. PCA-GSV, 44–46
  3D space, 43–44
Singular value decomposition (SVD), 359
Spectral Angle Mapper (SAM), 87
SuCcessive N-FINDR (SC N-FINDR), 320
Supervised LSMA (SLSMA), 262

T
Target embeddedness (TE) experiments, 27, 108–109, 174–177, 180
  ATGP, 467–470
  cyan upper triangles, 609
  endmember candidates, 608–610
  ground truth pixels, 610, 612
  LSMA, 514–515
  number of processed bands, 611, 613
  OSP, 494–496
  PPI
    endmember candidates, by RHBP-IPPI, 569, 571
    IPPI vs. RHBP-IPPI, 572
    mineral signatures, 569, 572
    PPI count vs. nl, 569, 571
    PPI count vs. nl and ns and pixels, 575–578
    PPI count vs. number of processed bands and skewers, 573–575
    RHBP-C-IPPI process, 578, 579
  spatial locations, 610, 612, 613
Target implantation (TI) experiments, 27, 108–109, 172–175, 180, 605–608
  ATGP, 464–467
  LSMA, 514, 515
  OSP, 491–494
  PPI
    endmember candidates, 562
    IPPI vs. RHBP-IPPI, 563
    mineral signatures, 562, 563
    PPI count vs. nl, 561, 562
    PPI count vs. nl and np, 566–569
    PPI count vs. nl and number of skewers, 564–566
    RHBP-C-IPPI process, 566, 570
    RHBP-IPPI, 560, 561
Target-specified virtual dimensionality (TSVD). See Virtual dimensionality (VD)

U
Unmixed error analysis
  FCLS, 313–316
  HFC method, 313
  sudden drops, 313, 317
  whitened data, 313
Unsupervised FCLS (UFCLS) algorithm, 94–95
Unsupervised least-squares OSP method (ULSOSP) algorithm, 92–93
Unsupervised LSMA (ULSMA), 262
Unsupervised NCLS (UNCLS) algorithm, 94
Unsupervised OSP (UOSP) algorithm, 92

V
Varying skewer set C-IPPI (VC-IPPI), 553
Vector component analysis (VCA), 452
Virtual dimensionality (VD), 262–263, 506, 547
  ATGP-NPD, 77–78
  AVIRIS data
    LCVF scene, 115
    2OS- and HOS-based methods, 116
    reflectance and radiance data, 112–114
    scene, Cuprite, 112
    signal energy and strength, MOCA and ATGP, 116–118
  EEAs, 74
  eigenanalysis
    Bayesian detector, 78
    HFC method, 81–82, 85–86
    IBSI, 78
    LSE/MSE, 75–76
    LSMA, 80
    MOCA, 82–86
    NPD, 78, 80
  estimation techniques, 78, 79
  HFC method, 74–76
  HYDICE data, 110–112
  MOCA, 76–77
  Neyman–Pearson detection, 76
  SSE/HySime method, 75
  synthetic image, 107–109
  targets of interest
    HOS-specified target, 97–99
    nVD value, 86
Virtual dimensionality (VD) (cont.)
  targets of interest (cont.)
    L feature vectors, 100
    Neyman–Pearson detection theory, 101
    2OS-based theory (see Second-order-statistics (2OS)-based theory)
    spatial targets, 87–88
    spectral targets, 87–88
  target-specified binary hypothesis testing
    ATGP, 103–105
    data characterization-driven, 105–107
    data representation-driven, 105–107
    eigenvalues/eigenvectors, 102
    estimation techniques, 102
    ICA-HFC, 102
    k-MOCA, 101
Virtual endmember (VE), 263
Virtual signature (VS), 263
  least-squares estimation, 291–292
  MLE-based algorithm, 295–297
  MLE error matrix, 295
  OSP-based algorithm, 293–295

W
Woodbury's matrix identity, 653
