
Hyperspectral Image Remote Sensing Classification Using RotBoost

I Gusti Ayu Agung Diatri Indradewi 1,*, Ni Luh Wiwik Sri Rahayu Ginantra 2, and Made Suci Ariantini 3

1 STMIK STIKOM Indonesia, Informatics, Denpasar, Bali
2 STMIK STIKOM Indonesia, Informatics, Denpasar, Bali
3 STMIK STIKOM Indonesia, Informatics, Denpasar, Bali

* Corresponding author: diatri.indradewi@stiki-indonesia.ac.id

Abstract. In machine learning, the classification of hyperspectral data poses several challenges. One such challenge is the very high dimensionality of the data. The solution considered here is ensemble learning, whose benefit is that it can improve the classification performance on hyperspectral data. One ensemble learning method is RotBoost, a combination of the Rotation Forest and AdaBoost methods. In this paper, the performance of the RotBoost method is evaluated by measuring its accuracy. In addition, this paper investigates the effect of the number of base classifiers and the number of boosting iterations on this accuracy.

1 Introduction

Optical remote sensing has developed from grayscale imagery to multispectral and hyperspectral imagery. The development of hardware technology also supports the availability of images with high spatial, spectral, and temporal resolution that are very useful for various applications. Compared to RGB color images, which have 3 bands, hyperspectral images have hundreds of spectral bands and are therefore able to provide more information than other types of images. However, hyperspectral imagery poses its own challenges, including a high number of dimensions, a large number of output classes, and limited reference data. Therefore, a good method is needed to classify hyperspectral data. Ensemble learning is one promising solution, considering its potential to significantly improve accuracy. One of the existing ensemble learning methods is random forest [1]. The random forest method can properly handle high-dimensional data such as hyperspectral data. An improvement of the random forest method, namely rotation forest [2], can improve the accuracy and diversity of the individual classifiers in the ensemble.

2 Related Works

[1] classified hyperspectral data using the random forest and AdaBoost tree-based methods. In their research, random forest and the AdaBoost tree-based method produced almost the same accuracy, while in terms of training time random forest performed better. [3] performed classification using the rotation forest method with CART as the base classifier, where the training data were projected into a new feature space using principal component analysis (PCA) and several other methods; their experiments resulted in better accuracy than bagging, random forest, and AdaBoost.

The RotBoost method, proposed by [4], is formed by combining the rotation forest and AdaBoost methods. RotBoost significantly increased predictive accuracy compared to the rotation forest and AdaBoost methods when tested on 36 UCI datasets, and it also produced better performance than bagging and MultiBoost. The purpose of this study is to classify hyperspectral data, specifically the Indian Pines (AVIRIS) data [5], using the RotBoost algorithm with CART as the base classifier and PCA to project the data into the new feature space. This study also looks for the optimal number of rotation forest base classifiers (S) and AdaBoost iterations (T).

3 Material and Method

3.1 Proposed Method

In this research we compare the RotBoost and Rotation Forest methods. Our first step is to collect the dataset from AVIRIS; then we split the dataset into training and validation sets using cross validation.


After splitting the data, the RotBoost and Rotation Forest methods are applied to the classification task to obtain the prediction results. Finally, we compute and compare the performance of each algorithm. Our proposed method is illustrated in Fig. 1.

[Flowchart: Start → Data Collection (AVIRIS) → Split dataset into training and validation using cross validation → Apply RotBoost and Rotation Forest → Prediction Result → Compare Performance Results → End]
Fig. 1. Our Proposed Method
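
For concreteness, the evaluation pipeline in Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: fit_rotation_forest and predict_rotation_forest are hypothetical helpers (sketched after Table 1 below), and a RotBoost variant can be plugged in the same way.

```python
# Sketch of the Fig. 1 pipeline: stratified 5-fold cross validation,
# fit each method on the training folds, score on the validation fold.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import accuracy_score

def cross_validate(fit, predict, X, y, n_splits=5, seed=0):
    """Return per-fold overall accuracy for one method."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, val_idx in skf.split(X, y):
        model = fit(X[train_idx], y[train_idx])
        scores.append(accuracy_score(y[val_idx], predict(model, X[val_idx])))
    return np.asarray(scores)

# Usage, once X (pixels x bands) and y (labels) are loaded (Sect. 3.3):
#   acc = cross_validate(fit_rotation_forest, predict_rotation_forest, X, y)
#   print(acc.mean(), acc.std())
```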

3.2 Algorithm Comparison

3.2.1 Rotation Forest

Rotation forest [2] is a method for building an ensemble classifier based on feature extraction. To obtain the training data for each base classifier, we perform principal component analysis (PCA) on K randomly divided subsets of the feature set. All principal components are retained so that the variability information in the data is preserved, and the K axis rotations form the new features for the base classifier. The idea of the rotation forest is to improve both the accuracy and the diversity of the individual classifiers in the ensemble; this diversity is encouraged through the feature extraction performed for each base classifier.

Suppose a training sample x = [x_1, ..., x_n]^T consists of n features, and the training set X is an N × n matrix. The class labels Y = [y_1, ..., y_N]^T take values from the set {w_1, ..., w_c}, where c is the number of available classes. The classifiers in the ensemble are denoted D_1, ..., D_L, and the feature set is denoted F. The pseudocode of the rotation forest ensemble method is given in Table 1.

Table 1. Rotation Forest algorithm

Input
- X: training set (N × n matrix)
- Y: labels of the training set (N × 1 matrix)
- L: number of classifiers in the ensemble
- K: number of feature subsets
- {w_1, ..., w_c}: set of class labels

Training Phase
for i = 1 ... L
- Find the rotation matrix R_i:
  - Split the feature set F into K subsets F_{i,j} (for j = 1 ... K)
  - for j = 1 ... K
    - Take the columns of X corresponding to the features in F_{i,j}; denote this data set X_{i,j}
    - Eliminate a random subset of the classes from X_{i,j}
    - Draw a bootstrap sample of 75% of the objects in X_{i,j}; denote the new set X'_{i,j}
    - Perform PCA on X'_{i,j} to obtain the coefficients C_{i,j}
  - Arrange the rotation matrix R_i from the coefficients C_{i,j} according to the following equation:

    R_i = \begin{bmatrix}
      C_{i,1}^{(1)}, C_{i,1}^{(2)}, \ldots, C_{i,1}^{(M_1)} & [0] & \cdots & [0] \\
      [0] & C_{i,2}^{(1)}, C_{i,2}^{(2)}, \ldots, C_{i,2}^{(M_2)} & \cdots & [0] \\
      \vdots & \vdots & \ddots & \vdots \\
      [0] & [0] & \cdots & C_{i,K}^{(1)}, C_{i,K}^{(2)}, \ldots, C_{i,K}^{(M_K)}
    \end{bmatrix}

    where M_j = n / K is the number of principal components obtained for subset j
  - Rearrange the columns of R_i to match the original order of the features in F; denote the result R_i^a
- Build classifier D_i using (X R_i^a, Y) as the training set

Classification Phase
- For a sample x, let d_{i,j}(x R_i^a) denote the probability assigned by classifier D_i to the hypothesis that x belongs to class w_j. Calculate the confidence for each class w_j with the average combination rule

    \mu_j(x) = \frac{1}{L} \sum_{i=1}^{L} d_{i,j}(x R_i^a), \quad j = 1, \ldots, c

- Assign x to the class with the greatest confidence
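
To make the Table 1 procedure concrete, the following is a minimal Python sketch of rotation forest using numpy and scikit-learn, with CART (DecisionTreeClassifier) as the base classifier and PCA for the projection, as in this paper. The helper names and the size guard on the bootstrap sample are our own simplifications, not the authors' implementation.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier

def build_rotation(X, y, K, rng):
    """Build one rearranged rotation matrix R_i^a (Table 1, training phase)."""
    n = X.shape[1]
    subsets = np.array_split(rng.permutation(n), K)  # random split of F into F_{i,j}
    R = np.zeros((n, n))
    for subset in subsets:
        # Keep a random non-empty subset of the classes (diversity step)
        classes = np.unique(y)
        kept = rng.choice(classes, size=rng.integers(1, classes.size + 1),
                          replace=False)
        rows = np.flatnonzero(np.isin(y, kept))
        # Bootstrap ~75% of those objects to form X'_{i,j}; the size guard
        # ensures PCA can return a full square coefficient block C_{i,j}
        size = max(len(subset) + 1, int(0.75 * rows.size))
        boot = rng.choice(rows, size=size, replace=True)
        pca = PCA().fit(X[np.ix_(boot, subset)])     # keep all components
        # Writing C_{i,j} into the rows/columns of the original feature
        # positions realizes the rearranged matrix R_i^a directly
        R[np.ix_(subset, subset)] = pca.components_.T
    return R

def fit_rotation_forest(X, y, L=10, K=10, seed=0):
    """Train L CART trees, each on the rotated data X R_i^a."""
    rng = np.random.default_rng(seed)
    ensemble = []
    for _ in range(L):
        R = build_rotation(X, y, K, rng)
        ensemble.append((R, DecisionTreeClassifier(random_state=0).fit(X @ R, y)))
    return ensemble

def predict_rotation_forest(ensemble, X):
    """Average the posteriors d_{i,j}(x R_i^a) and take the argmax class."""
    probs = np.mean([tree.predict_proba(X @ R) for R, tree in ensemble], axis=0)
    return ensemble[0][1].classes_[probs.argmax(axis=1)]
```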
3.2.2 RotBoost

AdaBoost [6] is a machine learning approach based on the idea of building an accurate prediction rule by combining several rules that are relatively weak and inaccurate. The method maintains a set of weights D_1(i) (i = 1, 2, ..., N) over the original dataset L, initially set to equal values. In each subsequent iteration the weights are adjusted so that the weights of samples misclassified by the previous classifier are increased, while the weights of correctly classified samples are lowered.

Based on the descriptions of rotation forest and AdaBoost above, the RotBoost classification technique, which combines the two methods, is described by the pseudocode in Table 2.

Table 2. RotBoost algorithm

Input
- X: training set (N × p matrix)
- Y: labels of the training set (N × 1 matrix)
- L: training set, L = {(x_i, y_i)}_{i=1}^{N} = [X Y]
- K: number of subsets of attributes (or M: number of input attributes in each subset)
- W: base classifier
- S: number of base classifiers
- T: number of AdaBoost iterations
- x: data point to be classified

Training Phase
1. for s = 1, 2, ..., S
2. Calculate the rotation matrix R_s^a as done in the Rotation Forest method, and use L^a = [X R_s^a, Y] as the training set. Initialize the weight distribution on L^a as D_1(i) = 1/N (i = 1, 2, ..., N)
3. for t = 1, 2, ..., T
   a. Based on the distribution D_t, draw N samples at random from L^a with replacement to obtain a new set L_t^a
   b. Use W on L_t^a to train a classifier C_t^a, and calculate the error of C_t^a as

      \varepsilon_t = \Pr_{i \sim D_t}(C_t^a(x_i) \neq y_i) = \sum_{i=1}^{N} I(C_t^a(x_i) \neq y_i) \, D_t(i)

   c. If \varepsilon_t > 0.5, reset D_1(i) = 1/N (i = 1, 2, ..., N) and go to step (a); if \varepsilon_t = 0, set \varepsilon_t = 10^{-10} to continue the iteration
   d. Choose \alpha_t = \frac{1}{2} \ln\left(\frac{1 - \varepsilon_t}{\varepsilon_t}\right)
   e. Update the distribution D_t on L^a as

      D_{t+1}(i) = \frac{D_t(i)}{Z_t} \times \begin{cases} e^{-\alpha_t}, & \text{if } C_t^a(x_i) = y_i \\ e^{\alpha_t}, & \text{if } C_t^a(x_i) \neq y_i \end{cases}

      where Z_t is a normalization factor chosen so that D_{t+1} is a probability distribution on L^a
4. endfor

Output
The class label of x is predicted by the final ensemble classifier C^* as

    C^*(x) = \arg\max_{y \in \Phi} \sum_{s=1}^{S} I(C_s(x) = y)

where \Phi denotes the set of class labels.
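
The inner loop of Table 2 can then be sketched as follows, reusing the hypothetical build_rotation helper from the rotation forest sketch above. The resampling, weighted error, α_t, and weight update follow steps (a) to (e); prediction uses the unweighted majority vote of the Output rule.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_rotboost(X, y, S=10, T=10, K=10, seed=0):
    """Sketch of Table 2: S rotation rounds, each running T AdaBoost steps."""
    rng = np.random.default_rng(seed)
    N = y.size
    ensemble = []                                 # (rotation, classifier) pairs
    for _ in range(S):
        R = build_rotation(X, y, K, rng)          # rotation matrix R_s^a
        Xr = X @ R                                # L^a uses the rotated features
        D = np.full(N, 1.0 / N)                   # step 2: uniform weights D_1
        t = 0
        while t < T:
            idx = rng.choice(N, size=N, replace=True, p=D)        # step (a)
            clf = DecisionTreeClassifier(random_state=0).fit(Xr[idx], y[idx])
            miss = clf.predict(Xr) != y
            eps = D[miss].sum()                                   # step (b)
            if eps > 0.5:                                         # step (c)
                D = np.full(N, 1.0 / N)                           # reset, retry
                continue
            eps = max(eps, 1e-10)                                 # guard eps = 0
            alpha = 0.5 * np.log((1.0 - eps) / eps)               # step (d)
            D = D * np.exp(np.where(miss, alpha, -alpha))         # step (e)
            D = D / D.sum()                       # Z_t: renormalize distribution
            ensemble.append((R, clf))
            t += 1
    return ensemble

def predict_rotboost(ensemble, X, classes):
    """Majority vote over all base classifiers, as in the Output rule."""
    votes = np.zeros((X.shape[0], classes.size))
    for R, clf in ensemble:
        hits = np.searchsorted(classes, clf.predict(X @ R))
        votes[np.arange(X.shape[0]), hits] += 1
    return classes[votes.argmax(axis=1)]
```

Here classes is the sorted label array, e.g. np.unique(y).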
3.3 AVIRIS Data Collection

To evaluate the performance of the RotBoost method, we experiment using hyperspectral data of a vegetation area in Indian Pines, Indiana, USA, provided by NASA's Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) [5]. The data consist of 200 spectral bands after removing 20 water absorption bands. The AVIRIS scene measures 145 × 145 pixels with a resolution of 20 m/pixel. A detailed description of the existing classes and the frequency of each class in the data is given in Table 3.

Table 3. The frequency of each class in AVIRIS data

#   Class                         Number of Samples
1   Alfalfa                       46
2   Corn-notill                   1428
3   Corn-mintill                  830
4   Corn                          237
5   Grass-pasture                 483
6   Grass-trees                   730
7   Grass-pasture-mowed           28
8   Hay-windrowed                 478
9   Oats                          20
10  Soybean-notill                972
11  Soybean-mintill               2455
12  Soybean-clean                 593
13  Wheat                         205
14  Woods                         1265
15  Building-Grass-Trees-Drives   386
16  Stone-steel-towers            93
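
The pixel matrix and labels used in the experiments can be derived from the published scene. A small loading sketch follows; the file and variable names are assumptions based on the commonly distributed .mat form of the data set [5], not something specified in this paper.

```python
# Load the Indian Pines cube (145 x 145 x 200) and ground truth (145 x 145),
# flatten to one spectrum per pixel, and drop unlabeled pixels.
import numpy as np
from scipy.io import loadmat

cube = loadmat("Indian_pines_corrected.mat")["indian_pines_corrected"]  # assumed names
gt = loadmat("Indian_pines_gt.mat")["indian_pines_gt"]

X = cube.reshape(-1, cube.shape[-1]).astype(float)  # (21025, 200) spectra
y = gt.ravel()
mask = y > 0                                        # 0 marks unlabeled background
X, y = X[mask], y[mask]
print(np.bincount(y)[1:])                           # per-class counts, cf. Table 3
```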
3.4 Tools

In this research we use a computer with a minimum specification of an i5 processor and 8 GB of RAM. We use MATLAB to test the algorithms.

4 Result and Discussion

To evaluate the accuracy of each algorithm, this study uses a confusion matrix. The accuracy value is used to determine the closeness of a measurement to the true value. The accuracy is calculated using the following formula:

    \text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}    (1)

where TP is True Positive, TN is True Negative, FP is False Positive, and FN is False Negative.
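
For the multi-class setting used here, Eq. (1) generalizes to the sum of the confusion-matrix diagonal divided by the total number of samples. A small sketch:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def overall_accuracy(y_true, y_pred):
    """Overall accuracy from the confusion matrix: correctly classified
    samples (the diagonal) over all samples, the multi-class form of Eq. (1)."""
    cm = confusion_matrix(y_true, y_pred)
    return np.trace(cm) / cm.sum()
```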
The performance measured is the overall accuracy obtained with 5-fold cross validation. To compare the accuracy of RotBoost and Rotation Forest, we experiment using the following parameters:

1. Number of input features in each subset: M = 10
2. Number of base classifiers: S = 10
3. Number of AdaBoost iterations: T = 10
4. Feature extraction method: PCA
5. Base classifier: CART

The overall accuracy of the RotBoost and Rotation Forest methods can be seen in Table 4. Comparing the accuracy and standard deviation of the two methods shows that RotBoost achieves better accuracy than Rotation Forest in classifying the AVIRIS data.

Table 4. Overall accuracy of RotBoost and Rotation Forest

Method             RotBoost         Rotation Forest
Overall accuracy   0.87 ± 0.0077    0.76 ± 0.0108

In addition to comparing RotBoost and Rotation Forest, we also conduct experiments to find the most optimal S and T values. Six configurations of S and T values are tested: (S = 50, T = 2), (S = 25, T = 4), (S = 10, T = 10), (S = 4, T = 25), (S = 25, T = 25), and (S = 2, T = 50). The overall accuracy and standard deviation of each configuration can be seen in Table 5. The accuracy of RotBoost under the different S and T configurations does not differ significantly, except for the configuration S = 50, T = 2, which shows a considerable difference compared to the other configurations.

Table 5. Overall accuracy of RotBoost with different S and T configurations

Configuration
S = 50, T = 2
S = 25, T = 4
S = 10, T = 10
S = 4, T = 25
S = 25, T = 25
S = 2, T = 50

Table 4 shows that the RotBoost method obtains the best result, with an overall accuracy of 0.87 ± 0.0077. This indicates that, for predictions on the AVIRIS dataset, the RotBoost algorithm has a better prediction rate than Rotation Forest.

[Bar chart: overall accuracy of RotBoost vs. Rotation Forest, y-axis from 0.7 to 0.9]

Fig. 2. Performance Measure Graph for each Algorithm

5 Conclusion

The RotBoost method, which forms an ensemble classifier, has been introduced to classify AVIRIS hyperspectral data. The RotBoost method combines the rotation forest method with AdaBoost. The base classifier used is CART, and the data projection method used to form the rotation matrix is PCA. The experiments compared the accuracy of the RotBoost and Rotation Forest methods for hyperspectral classification on AVIRIS data. The experimental results show that RotBoost produces better accuracy than Rotation Forest, while the S and T parameter values have little influence on RotBoost accuracy.

References

[1] J. C.-W. Chan and D. Paelinckx, "Evaluation of Random Forest and Adaboost Tree-Based Ensemble Classification and Spectral Band Selection for Ecotope Mapping Using Airborne Hyperspectral Imagery," Remote Sensing of Environment, vol. 112, pp. 2999–3011, 2008.

[2] J. J. Rodriguez, L. I. Kuncheva, and C. J. Alonso, "Rotation Forest: A New Classifier Ensemble Method," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 10, pp. 1619–1630, 2006.

[3] J. Xia, P. Du, X. He, and J. Chanussot, "Hyperspectral Remote Sensing Image Classification Based on Rotation Forest," IEEE Geoscience and Remote Sensing Letters, vol. 11, no. 1, pp. 239–243, 2014.

[4] C.-X. Zhang and J.-S. Zhang, "RotBoost: A Technique for Combining Rotation Forest and AdaBoost," Pattern Recognition Letters, vol. 29, pp. 1524–1536, 2008.

[5] M. F. Baumgardner, L. L. Biehl, and D. A. Landgrebe, "220 Band AVIRIS Hyperspectral Image Data Set: June 12, 1992 Indian Pine Test Site 3," Purdue University Research Repository, 2015.

[6] Y. Freund and R. E. Schapire, "Experiments with a New Boosting Algorithm," in Machine Learning: Proceedings of the Thirteenth International Conference, pp. 148–156, 1996.
