RotBoost
I Gusti Ayu Agung Diatri Indradewi 1,*, Ni Luh Wiwik Sri Rahayu Ginantra 2, and Made Suci Ariantini 3
1 STMIK STIKOM Indonesia, Informatics, Denpasar, Bali
2 STMIK STIKOM Indonesia, Informatics, Denpasar, Bali
3 STMIK STIKOM Indonesia, Informatics, Denpasar, Bali
Abstract. The classification of hyperspectral data in machine learning faces several challenges, one of which is the huge dimensionality of the data. A solution offered to overcome this challenge is ensemble learning, whose benefit is that it can improve the classification performance on hyperspectral data. One ensemble learning method is RotBoost, a combination of the Rotation Forest and AdaBoost methods. In this paper, the performance of the RotBoost method is evaluated by measuring its accuracy. In addition, this paper also investigates the effect of the number of base classifiers and Boosting iterations on this accuracy.
1 Introduction

Optical remote sensing has developed from grayscale imagery to multispectral and hyperspectral imagery. The development of hardware technology also supports the availability of images of high spatial, spectral, and temporal resolution that are very useful for various applications. Compared to RGB color images, which have 3 bands, hyperspectral images have hundreds of spectral bands and are thus able to provide more information than other types of images. However, hyperspectral imagery poses its own challenges, including a high number of dimensions, a high number of output classes, and limited reference data. Therefore, a good method is needed to classify hyperspectral data. Ensemble learning is one promising solution considering its potential to significantly improve accuracy. One of the existing ensemble learning methods is random forest [1]. The random forest method can handle high-dimensional data such as hyperspectral data properly. An improvement of the random forest method, namely rotation forest [2], can improve the accuracy and the diversity of the individuals in ensemble classifiers.

[1] classifies hyperspectral data using the random forest and AdaBoost tree-based methods. In their research, random forest and the AdaBoost tree-based method produce almost the same accuracy; in terms of training time, random forest performs better. [3] performed classification using the rotation forest method with CART as the base classifier, with the training data projected into a new feature space using principal component analysis (PCA) and several other methods. The experiments carried out resulted in better accuracy than bagging, random forest, and AdaBoost.

The RotBoost method proposed by [4] is formed by combining the rotation forest and AdaBoost methods. RotBoost significantly increases predictive accuracy compared to the rotation forest and AdaBoost methods when tested on 36 UCI datasets, and it also performs better than bagging and MultiBoost. The purpose of this study is to classify hyperspectral data, especially the Indian Pines (AVIRIS) data [5], using the RotBoost algorithm with CART as the base classifier and PCA to project the data into the new feature space. This study will also look for the optimal number of base classifiers (S) and AdaBoost iterations (T).

3 Material and Method

In this research we make comparisons between the RotBoost and Rotation Forest methods. Our first step is to collect the AVIRIS dataset; we then split the dataset into training and validation sets using cross validation. After splitting the data, the RotBoost and Rotation Forest methods are applied to the classification task to obtain the prediction results.
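The training/validation split described above can be sketched as a k-fold cross-validation partition. The following minimal Python sketch is illustrative only (the fold count, seed, and helper name are assumptions, not values from the paper); it shows how every sample is used for validation exactly once:

```python
import random

def kfold_split(n_samples, k, seed=0):
    """Partition sample indices into k cross-validation folds.

    Returns a list of (train_indices, validation_indices) pairs, one per
    fold, so that every sample serves as validation data exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # randomize fold assignment
    folds = [indices[i::k] for i in range(k)]     # k roughly equal folds
    splits = []
    for i in range(k):
        validation = sorted(folds[i])
        train = sorted(idx for j in range(k) if j != i for idx in folds[j])
        splits.append((train, validation))
    return splits

# Example: 12 samples, 3 folds -> 8 training and 4 validation samples per fold.
splits = kfold_split(12, 3)
```

Each classifier under comparison would then be trained on the training indices of every fold and scored on the corresponding validation indices.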
* Corresponding author: diatri.indradewi@stiki-indonesia.ac.id
Title of the conference
[Fig. 1: research methodology flowchart, ending in the prediction result.]

In AdaBoost, a weight is assigned to each sample (i = 1, 2, …, N) from the original dataset L, and the values are set equal at first. In each subsequent iteration, the weights are adjusted so that the weight of a sample that was misclassified by the previous classifier is increased and the weight of a correctly classified sample is lowered.

Based on the explanation of rotation forest and AdaBoost above, the RotBoost classification technique that combines the two methods is described by the pseudocode in Table 2.

Table 2. RotBoost Algorithm

Input
  X: training set (N × p matrix)
  Y: labels of the training set (N × 1 matrix)
  L: training set, L = {(x_i, y_i)} = [X Y]
  K: number of attribute subsets (or M: number of input attributes in each subset)
  W: base classifier
  S: number of base classifiers
  T: number of AdaBoost iterations

Training phase: for s = 1, 2, …, S
  1. Use the rotation forest procedure to compute a rotation matrix R_s from the K attribute subsets of the training set.
  2. Calculate the rotated training set L_s = [X R_s, Y] and initialize the weight distribution on L_s as
       D_1(i) = 1/N   (i = 1, 2, …, N)
  3. For t = 1, 2, …, T
     a. Based on distribution D_t, extract a training subset from L_s and train a classifier C_t with W.
     b. Compute the weighted error ε_t of C_t and set α_t = (1/2) ln((1 − ε_t)/ε_t).
     c. Update the distribution as
          D_{t+1}(i) = (D_t(i)/Z_t) × exp(−α_t)  if C_t(x_i) = y_i
          D_{t+1}(i) = (D_t(i)/Z_t) × exp(α_t)   otherwise
        where Z_t is the normalization factor chosen so that D_{t+1} remains a distribution on L_s.
  4. Endfor

Output
  The class labels of X are predicted by the final ensemble classifier C* as
    C*(x) = argmax_{y ∈ Φ} Σ_{s=1}^{S} I(C_s(x) = y)

3.3 AVIRIS Data Collection

To find out the performance of the RotBoost method, we experiment using hyperspectral data of a vegetation area in Indian Pines, Indiana, USA, provided by NASA's Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) [5]. The data consists of 200 spectral bands after removing 20 water absorption bands. The AVIRIS data measures 145 × 145 pixels with a resolution of 20 m/pixel. A detailed description of the number of classes and the frequency of each class in the data can be seen in Table 3.
Table 3. The frequency of each class in AVIRIS data

  #   Class                         Number of Samples
  1   Alfalfa                          46
  2   Corn-notill                    1428
  3   Corn-mintill                    830
  4   Corn                            237
  5   Grass-pasture                   483
  6   Grass-trees                     730
  7   Grass-pasture-mowed              28
  8   Hay-windrowed                   478
  9   Oats                             20
  10  Soybean-notill                  972
  11  Soybean-mintill                2455
  12  Soybean-clean                   593
  13  Wheat                           205
  14  Woods                          1265
  15  Building-Grass-Trees-Drives     386
  16  Stone-steel-towers               93

3.4 Tools

In this research we use a computer with a minimum specification of an i5 processor and 8 GB of RAM. We use MATLAB to test the algorithms.

4 Result and Discussion

To evaluate the accuracy of each algorithm, this study uses a confusion matrix. The accuracy value is used to determine the closeness of a measurement to the true value. The accuracy is calculated using the following formula:

  Accuracy = (TP + TN) / (TP + FP + FN + TN)    (1)

The base classifier used is CART. The overall accuracy values of the RotBoost and Rotation Forest methods can be seen in Table 4. The comparison of the accuracy and standard deviation of both methods shows that the RotBoost method gets better accuracy than the Rotation Forest method in classifying AVIRIS data.

Table 4. Overall accuracy of RotBoost and Rotation Forest

  Method             RotBoost         Rotation Forest
  Overall accuracy   0.87 ± 0.0077    0.76 ± 0.0108

In addition to comparing RotBoost and Rotation Forest, we also conduct experiments to find the most optimal S and T values. There are six S and T configurations, consisting of (S = 50, T = 2), (S = 25, T = 4), (S = 10, T = 10), (S = 25, T = 25), (S = 4, T = 25), and (S = 2, T = 50). The overall accuracy value and standard deviation of each configuration can be seen in Table 5. The accuracy of the RotBoost method across the different S and T configurations does not show a significant difference, except for the configuration S = 50, T = 2, which shows a considerable difference compared to the other configurations.

Table 5. Overall accuracy of RotBoost with different S and T configurations

  Configuration   Overall Accuracy RotBoost
  S=50, T=2       …
  S=25, T=4       …
  S=10, T=10      …
  S=4, T=25       …
  S=25, T=25      …
  S=2, T=50       …

Table 4 above shows that the RotBoost method gets the best result, with an overall accuracy of 0.87 ± 0.0077. This concludes that, in the case of predictions on the AVIRIS dataset, the RotBoost algorithm has a better prediction rate than Rotation Forest.
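To make the weight-update rule of Table 2 and the accuracy measure of equation (1) concrete, the sketch below implements the AdaBoost inner loop on one-dimensional toy data, with threshold decision stumps standing in for CART. It is a simplified illustration under those assumptions, not the authors' MATLAB implementation; the dataset and the number of iterations are invented for the example:

```python
import math

def train_stump(X, y, D):
    """Pick the threshold/polarity stump with the lowest weighted error."""
    best = (float("inf"), None)  # (weighted error, (threshold, polarity))
    for threshold in sorted(set(X)):
        for polarity in (1, -1):
            err = sum(d for x, label, d in zip(X, y, D)
                      if (polarity if x >= threshold else -polarity) != label)
            if err < best[0]:
                best = (err, (threshold, polarity))
    return best

def stump_predict(params, x):
    threshold, polarity = params
    return polarity if x >= threshold else -polarity

def adaboost(X, y, T):
    """AdaBoost loop: raise weights of misclassified samples each round,
    as in Table 2: D_{t+1}(i) = (D_t(i)/Z_t) * exp(-a_t) if correct,
    and (D_t(i)/Z_t) * exp(a_t) otherwise."""
    N = len(X)
    D = [1.0 / N] * N            # D_1(i) = 1/N: equal weights at first
    ensemble = []
    for _ in range(T):
        err, params = train_stump(X, y, D)
        err = max(err, 1e-10)    # guard against division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, params))
        D = [d * math.exp(-alpha if stump_predict(params, x) == label else alpha)
             for d, x, label in zip(D, X, y)]
        Z = sum(D)               # normalization factor: keep D a distribution
        D = [d / Z for d in D]
    return ensemble

def predict(ensemble, x):
    score = sum(alpha * stump_predict(params, x) for alpha, params in ensemble)
    return 1 if score >= 0 else -1

# Toy data: labels are -1 for x <= 5 and +1 for x >= 6 (separable).
X = [1, 2, 3, 4, 6, 7, 8, 9, 5]
y = [-1, -1, -1, -1, 1, 1, 1, 1, -1]
ensemble = adaboost(X, y, T=10)
predictions = [predict(ensemble, x) for x in X]
# Accuracy as in equation (1): correct predictions over all samples.
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
```

Within RotBoost, this loop would run S times, each time on a training set rotated by the PCA-based rotation matrix of steps 1-2 in Table 2, and the S resulting classifiers would vote as in the Output step.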