Procedia Computer Science 197 (2022) 668–676
Sixth Information Systems International Conference (ISICO 2021)
Rice phenology classification based on random forest algorithm for data imbalance using Google Earth Engine
Hady Suryono, Heri Kuswanto*, Nur Iriawan
Department of Statistics, Faculty of Mathematics, Computation and Data Science, Institut Teknologi Sepuluh Nopember, Surabaya 60111, East
Java, Indonesia
Abstract
The agricultural sector is one of the most important sectors in the world and contributes significantly to the achievement of the
Sustainable Development Goals (SDGs). Within the SDGs, attention to food security is focused on the second goal,
zero hunger (SDG 2). Accurate rice production data are required to measure the level of food security. Such data
can be obtained by monitoring the growth phases of the rice plant, a task known as rice phenology classification. We used the Random Forest
(RF) algorithm on the Google Earth Engine (GEE) platform to classify rice phenology in Lamongan Regency, East Java, in 2019
from Landsat-8 satellite imagery, where the class distribution is imbalanced. To deal with the imbalance, an
oversampling technique was applied to the minority classes. Reference data for training the classification model were collected
from the Area Sampling Framework survey published by Statistics Indonesia in 2019. The results showed that the overall accuracy
(OA) of the RF algorithm trained on the oversampled dataset was 81.46% and the kappa statistic (κ) was 0.76,
outperforming the RF technique without oversampling.
© 2021 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0)
Peer-review under responsibility of the scientific committee of the Sixth Information Systems International Conference.
Keywords: Sustainable development goals; rice phenology; classification; google earth engine; random forest; data imbalance.
1. Introduction
The Sustainable Development Goals (SDGs) adopted by countries around the world have significant implications for
national development planning in Indonesia in 2015-2030. Among the SDGs, attention to food
security is focused on the second goal, namely zero hunger (SDG 2): realizing food security, improving nutrition, and
encouraging sustainable agriculture [1]. The agricultural sector is one of the principal sectors in the world
and contributes significantly to the achievement of the SDGs.
* Corresponding author. Tel.: +62 818-513-223.

E-mail address: heri_k@statistika.its.ac.id
1877-0509 © 2021 The Authors. Published by ELSEVIER B.V.

This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0)
Peer-review under responsibility of the scientific committee of the Sixth Information Systems International Conference.
10.1016/j.procs.2021.12.201
The availability of accurate rice production data is very important for measuring the level of food security. Monitoring of
food crops, especially rice, is carried out in support of SDG 2. Rice phenology classification is a major
step in the accurate monitoring and prediction of rice yields.
Statistics Indonesia (BPS) and the Agency for the Assessment and Application of Technology (BPPT) use an
Area Sampling Framework (ASF) survey to monitor rice phenology. Samples cover 5 percent of the total rice
field area and are collected monthly at predetermined coordinates to estimate the
monthly rice harvest. Because only the sampled locations are observed, the survey provides no information on individual
variation in the non-sample areas.
Remote sensing data can be a solution to ASF sampling problems. In agriculture, remote sensing data is used to
model rice growth because it has many spatiotemporal features that can support vegetation detection and plant-related
indexes [2], classification of plant types [3], estimated harvested area, and rice mapping [4]. Landsat-8 has been used
to differentiate plant types by adding spectral-temporal pattern information based on the growth phase. This pattern
occurs because the phases of rice growth are sequential in one planting period (120 days) so that the vegetation index
values are interrelated [2].
Modeling remote sensing data for rice phenology classification requires processing and managing
very large volumes of unstructured satellite image data. This is known as the "Geo Big Data" problem, which
requires new technologies and resources that are capable of handling large amounts of satellite imagery [5]. In particular,
the emergence of cloud computing resources, such as the Google Earth Engine (GEE), can solve the Geo Big Data problem
[6]. The increase in the volume and variety of remote sensing data poses new challenges
in handling data sets, so a new approach is needed to extract relevant and accurate information from remote sensing
data [7]. The Random Forest algorithm can be used as such an approach to map agricultural land through GEE [8].
However, in classification, the number of class members is usually not equal [9], which means that the class
distribution is not uniform [10]. Under such conditions, most machine learning classifiers are biased towards
the majority classes: they tend to predict the majority classes and neglect the minority classes
[11], treating minority observations as noise and ignoring them in the classification [12]. As a result, the prediction
accuracy for the minority class is much lower than for the majority class [7, 13]. The classification of
imbalanced data sets has been identified as a major problem in machine learning [14], and various techniques have
been developed to overcome it. One option is to pre-process the data to build a balanced dataset [15]. The approach
taken here is to apply oversampling, which
has the potential to improve overall predictive performance. Therefore, we propose a Random Forest algorithm with
oversampling on the GEE cloud computing platform to classify rice phenology from Landsat-8 satellite imagery.
2. Methods
This study used the Random Forest method with oversampling, with data processing on GEE, to address
both the Geo Big Data problem in remote sensing and the data imbalance problem in rice phenology classification.
2.1. Study Area
Lamongan is a regency in East Java Province located on path/row 119/065 of Landsat-8 imagery. Geographically,
Lamongan Regency lies between 6° 51′ 54″ and 7° 23′ 6″ south latitude and between 112° 4′
41″ and 112° 33′ 12″ east longitude. Lowland at an altitude of 0-25 meters covers
50.17% of the regency, land at an altitude of 25-100 meters covers 45.68%, and the remaining 4.15% lies more than 100
meters above sea level. Lamongan Regency is bordered by the Java Sea in the north, Gresik Regency in the east,
Mojokerto Regency and Jombang Regency in the south, and Bojonegoro Regency and Tuban Regency in the west
(Fig. 1).
Fig. 1. Geographical location of Lamongan Regency
2.2. Landsat
This study used Landsat 8 surface reflectance (SR) data acquired in 2019 from GEE, a shapefile (shp) map of rice
fields from the National Land Agency (BPN), shp maps of district, sub-district, and village boundaries from
Statistics Indonesia (BPS), and Area Sampling Framework (ASF) data from BPS. ASF samples
were used as labels for the training data. The sample area in ASF is called a segment and measures 300 m x 300 m. One
segment consists of 9 sub-segments measuring 100 m x 100 m.
2.2.1. Landsat archive data in the GEE
Landsat-8 is one of the satellites launched to carry out earth monitoring. It carries two instruments: the Operational
Land Imager (OLI), which records 9 spectral bands, and the Thermal Infrared Sensor (TIRS), which records 2 thermal
bands. The OLI bands operate
in the wavelength range of 0.433-2.300 μm and provide images with a maximum resolution of 15 m [8]. Each spectral
band in the satellite image produces different reflectance values at different locations.
GEE is a geospatial technology innovation that provides online access to Landsat-8 data [16]. Based on a review of the
literature, this study used 7 bands for rice phenology classification: aerosol, blue, green, red, NIR, SWIR1, and SWIR2
(Table 1).
2.2.2. Vegetation Indices (VI) composites extraction
Vegetation indices (VI) are spectral transformations of two or more bands designed to enhance the contribution of
vegetation properties and allow reliable spatial and temporal inter-comparisons of terrestrial photosynthetic activity
and canopy structural variations [17]. Several studies have used vegetation indices from satellite
image data to determine the growth phase of rice plants [2]. These studies used various types of satellite imagery, such
as MODIS and Landsat-7, the predecessor of Landsat-8. The vegetation indices derived from
satellite images to detect rice phenology include the Normalized Difference Vegetation Index (NDVI), Enhanced
Vegetation Index (EVI), Normalized Difference Built-Up Index (NDBI), and Normalized Difference Water Index
(NDWI) (Table 1). ρNIR and ρRED are the surface bidirectional reflectance factors for the respective bands [17].
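The four indices can be sketched for a single pixel as plain functions of surface reflectance. This is a minimal illustration rather than the authors' GEE code: the function names and the example reflectance values are hypothetical, and the EVI is written in its standard form with the blue band in the denominator, as in Huete et al. [17].

```python
# Per-pixel vegetation indices from surface reflectance values (0-1 scale).
# Function names and the sample pixel below are illustrative assumptions.

def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)

def evi(nir, red, blue):
    """Enhanced Vegetation Index (standard form with the blue band)."""
    return 2.5 * (nir - red) / (1.0 + nir + 6.0 * red - 7.5 * blue)

def ndbi(swir1, nir):
    """Normalized Difference Built-Up Index."""
    return (swir1 - nir) / (swir1 + nir)

def ndwi(nir, swir1):
    """Normalized Difference Water Index (NIR/SWIR1 form used in this paper)."""
    return (nir - swir1) / (nir + swir1)

# Hypothetical reflectances for a densely vegetated pixel.
pixel = {"blue": 0.04, "red": 0.08, "nir": 0.40, "swir1": 0.20}

print("NDVI:", round(ndvi(pixel["nir"], pixel["red"]), 4))
print("EVI: ", round(evi(pixel["nir"], pixel["red"], pixel["blue"]), 4))
print("NDBI:", round(ndbi(pixel["swir1"], pixel["nir"]), 4))
print("NDWI:", round(ndwi(pixel["nir"], pixel["swir1"]), 4))
```

With the NIR/SWIR1 definitions used here, NDBI is simply the negative of NDWI, so the two indices carry the same information with opposite sign.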
2.2.3. Area Sampling Framework (ASF)
The Area Sampling Framework (ASF) is a method developed by BPS assisted by BPPT to calculate monthly rice
harvested areas. This method is carried out using an area framework in sampling [18].
Table 1. Types and uses of the Landsat-8 bands used in the study, and the vegetation indices (VI) derived from them.

| Band name | Landsat 8 spectral range (µm) | Band applications |
|---|---|---|
| Aerosol | 0.43-0.45 | Coastal and aerosol studies |
| Blue | 0.45-0.51 | Bathymetric mapping, distinguishing soil from vegetation, and deciduous from coniferous vegetation |
| Green | 0.53-0.59 | Emphasizes peak vegetation, which is useful for assessing plant vigor |
| Red | 0.63-0.67 | Discriminates vegetation slopes |
| NIR | 0.85-0.88 | Emphasizes biomass content and shorelines |
| SWIR1 | 1.57-1.65 | Discriminates moisture content of soil and vegetation; penetrates thin clouds |
| SWIR2 | 2.11-2.29 | Improved ability to track moisture content of soil and vegetation and thin cloud penetration |

| Vegetation index (VI) | Equation |
|---|---|
| EVI | EVI = 2.5(ρNIR - ρRED)/(1 + ρNIR + 6ρRED - 7.5ρBLUE) |
| NDVI | NDVI = (ρNIR - ρRED)/(ρNIR + ρRED) |
| NDBI | NDBI = (ρSWIR1 - ρNIR)/(ρSWIR1 + ρNIR) |
| NDWI | NDWI = (ρNIR - ρSWIR1)/(ρNIR + ρSWIR1) |
Sampling was carried out in areas measuring 300 m x 300 m, referred to as
segments, across the entire study area. One segment consists of 9 sub-segments measuring 100 m x 100 m. The ASF survey in
Lamongan Regency is carried out monthly on 208 segments, giving 1,872 sub-segments per month and a total of 22,464
sub-segment observations for one year. The monthly estimates of the harvested area depend on rice phenology data from all sub-segments
resulting from the field survey [19]. The rice growth phases recorded in the ASF consist of 6 rice phenology categories.
The definition and visual display of the rice phenology category can be seen in Table 2.
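As a quick check of the sample counts quoted for Lamongan Regency, assuming 12 monthly survey rounds in the year:

$$208 \times 9 = 1{,}872 \ \text{sub-segments per month}, \qquad 1{,}872 \times 12 = 22{,}464 \ \text{sub-segment observations per year}.$$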
Table 2. Definition of rice phenology in ASF survey.

| No | Rice phenology | Definition | Days after planting | Visual display |
|---|---|---|---|---|
| 1 | Early vegetative | The rice growth phase starts from the beginning of growth until the maximum tillers | 1-35 days | [image] |
| 2 | Late vegetative | The late vegetative phase starts when tillering begins, which extends from the appearance of the first tiller until the maximum number of tillers is reached | 35-55 days | [image] |
| 3 | Generative | The growth phase starts from panicle out, ripening, until before harvest | 55-105 days | [image] |
| 4 | Harvesting | The phase when the rice is being harvested or has been harvested | | [image] |
| 5 | Preparation | The phase in which the paddy fields begin to be cultivated in preparation for rice growth | | [image] |
| 6 | Puso (crop failure) | If there is an attack by plant-disturbing organisms or a disaster so that rice production is less than 11% of normal | | [image] |
2.3. Google Earth Engine
Google Earth Engine (GEE) is a web portal that provides global time-series satellite imagery (spanning over 40 years),
cloud-based computing, and algorithms for processing data [6]. Available data come from several satellites and sensors, such as the
Moderate Resolution Imaging Spectroradiometer (MODIS); the National Oceanic and Atmospheric Administration
Advanced Very High Resolution Radiometer (NOAA AVHRR); Sentinels 1, 2, and 3; and the Advanced Land Observing Satellite. We processed
and analyzed the satellite image data on GEE, which enables fast cloud-based parallel
processing on Google servers [8]. Fig. 2 shows the conceptual framework for rice phenology classification.
[Flowchart boxes: ASF label data of 2019; oversampling; create ASF shapefile; training samples; validation samples; Landsat 8 data of 2019 on GEE; cloud masking; spectral bands b1-b7; calculating NDVI, EVI, NDBI, NDWI; Random Forest classifier on GEE; rice phenology class; accuracy assessment]
Fig. 2. The conceptual framework
2.4. Oversampling
Oversampling is a sampling technique that balances the class distribution by increasing the number of minority class
instances. It can be done by random sampling or by duplication. Applying sampling to imbalanced data
reduces the degree of imbalance so that classification can be carried out appropriately [20]. In this paper, duplication
oversampling is used: all instances of the minority classes with the fewest data, namely
Preparation and Puso (Class 4 and Class 5), are replicated directly to approximate the number of instances in the majority class.
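Duplication oversampling as described above can be sketched in a few lines. This is an illustrative version under stated assumptions, not the authors' implementation: the function name, the toy labels, and the rule of adding whole copies until the class size is as close as possible to (without exceeding) the largest class are all assumptions.

```python
from collections import Counter

def duplicate_oversample(samples, labels, minority_classes):
    """Replicate every instance of each minority class in whole copies so
    that its count approaches the size of the largest class."""
    counts = Counter(labels)
    target = max(counts.values())  # size of the majority class
    out_samples, out_labels = list(samples), list(labels)
    for cls in minority_classes:
        members = [s for s, y in zip(samples, labels) if y == cls]
        if not members:
            continue
        # Number of additional full copies of the class to append.
        extra_copies = target // len(members) - 1
        for _ in range(extra_copies):
            out_samples.extend(members)
            out_labels.extend([cls] * len(members))
    return out_samples, out_labels

# Toy data: class 0 is the majority; classes 4 and 5 are minorities.
labels = [0] * 10 + [4] * 3 + [5] * 2
samples = list(range(len(labels)))  # stand-ins for feature vectors

new_samples, new_labels = duplicate_oversample(samples, labels, minority_classes=[4, 5])
print(sorted(Counter(new_labels).items()))
```

In this toy example class 4 grows from 3 to 9 instances and class 5 from 2 to 10, while the majority class is left unchanged.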
2.5. Random Forest
Random forest (RF) is a popular method for classification and regression based on an ensemble of decision trees
(DT). RF extends the CART method by applying bagging and random feature selection to DT, that
is, randomly selecting several features at each iteration [21]. RF grows a large number of trees, which together
resemble a forest. The randomization ensures that the predictions of the individual trees are only weakly correlated.
The classification decision is made by majority vote among all trees. Combining predictions from multiple
models in an ensemble works best when the predictions of the sub-models are independent.
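The three ingredients named above (bagging, random feature selection, and majority voting) can be sketched independently of any tree-growing code. The helper names are hypothetical, and the choices of 11 candidate features (7 bands plus 4 vegetation indices) and 3 features per split are assumptions made only for illustration.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Bagging step: draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]

def random_feature_subset(n_features, k, rng):
    """Random feature selection: k distinct candidate feature indices."""
    return sorted(rng.sample(range(n_features), k))

def majority_vote(tree_predictions):
    """Final class is the label predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

rng = random.Random(0)             # fixed seed so the sketch is repeatable
rows = list(range(10))             # stand-ins for training samples
sample = bootstrap_sample(rows, rng)
features = random_feature_subset(n_features=11, k=3, rng=rng)

# Hypothetical class predictions from five trees for one pixel.
votes = [2, 2, 3, 0, 2]
print(majority_vote(votes))  # 2
```

Each tree would be trained on its own bootstrap sample and restricted to a fresh feature subset at each split, which is what keeps the trees weakly correlated.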
2.6. Assessment Metrics
Prediction performance was evaluated in terms of accuracy, precision, and recall. The confusion matrix
summarizes the prediction results and serves as a performance measure for classification problems (Table 3). It tabulates the
number of correct and incorrect predictions for each class, where AP and AN denote the actual positive and negative test
data, and PP and PN denote the predicted positive and negative classes [22].
TP (true positives) is the number of positive-class members correctly predicted as positive, TN (true negatives) is the number
of negative-class members correctly predicted as negative, FP (false positives) is the number of negative-class members
incorrectly predicted as positive, and FN (false negatives) is the number of positive-class members incorrectly predicted
as negative. The evaluation metrics used to assess the predictions are defined below in Equations (1)-(3).
Table 3. Confusion matrix.

| | Actual Positive (AP) | Actual Negative (AN) |
|---|---|---|
| Predicted Positive (PP) | True Positives (TP) | False Positives (FP) |
| Predicted Negative (PN) | False Negatives (FN) | True Negatives (TN) |
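$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$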
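The three metrics can be computed directly from the four confusion-matrix counts. The sketch below uses hypothetical function names and, as a worked case, the one-vs-rest counts for the Preparation class (Class 4) read off Table 5, where rows are predicted classes and columns are actual classes.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Share of actual positives that are predicted as positive."""
    return tp / (tp + fn)

# One-vs-rest counts for Class 4 from Table 5:
# 19 correct, 4 other-class samples predicted as Class 4,
# 2 Class 4 samples predicted as another class, 259 samples in total.
tp, fp, fn = 19, 4, 2
tn = 259 - tp - fp - fn

print(f"precision = {precision(tp, fp):.2%}")
print(f"recall    = {recall(tp, fn):.2%}")
print(f"accuracy  = {accuracy(tp, tn, fp, fn):.2%}")
```

The precision and recall obtained this way match the 82.61% and 90.48% listed for Class 4 with oversampling in Table 4. The one-vs-rest accuracy printed here is a per-class figure and is not the same quantity as the overall accuracy of the six-class model.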
The kappa statistic [23] provides a quantitative measure of the magnitude of agreement between
observers, or of the consistency between two measurement methods, on nominal scales. The kappa coefficient measures
the degree of agreement in classifying objects into mutually exclusive categories.
The equation for the kappa statistic is:

$$\kappa = \frac{p_o - p_e}{1 - p_e}, \qquad p_e = \sum_{i=1}^{k} p_{i+}\, p_{+i} \tag{4}$$

where $p_o$ is the proportion of agreements and $p_e$ is the expected proportion of agreements by chance, computed from
the row and column marginal proportions $p_{i+}$ and $p_{+i}$ of the confusion matrix.
$k$ is the number of different nominal values for the performance indicator of interest.
3. Results
Research on detecting rice plants from satellite images on several islands in Indonesia [2] found
that the EVI value follows a characteristic spectral pattern over one rice planting period.
Lamongan Regency is one of the rice-producing regions of East Java Province. The distribution of ASF sample
sub-segments in Lamongan Regency in 2019 can be seen in Fig. 3(a). The ASF survey covers 208 segments
per month and 22,464 sub-segment observations per year, with minority class ratios between 0.01 and 0.05. Fig. 3(c) shows the
observation points in one sample segment (9 sub-segments).
The spectral patterns of the EVI and NDVI indices show low values in the initial phase of
rice planting. From the early vegetative to the late vegetative phase, the EVI and NDVI values climb steeply
until they reach their peak, then fall again through the generative phase until harvest. At
harvest time, the EVI and NDVI values are as low as in the initial planting phase and the preparation
phase. Examples of EVI and NDVI spectral patterns over one rice planting period in 2019 can be seen in Fig. 3(b).
Fig. 3. (a) the Google Earth Engine interactive development environment; (b) examples of EVI and NDVI spectral
patterns in 1 rice planting period in 2019; (c) distribution of ASF samples in Lamongan Regency in 2019.
Table 4. Prediction performance of the random forest algorithm with and without the oversampling technique for handling imbalanced data.

| Technique | Class | Accuracy (%) | Precision (%) | Recall (%) |
|---|---|---|---|---|
| Random forest with oversampling | Early vegetative (0) | 81.46% | 71.93% | 78.85% |
| | Late vegetative (1) | | 88.10% | 68.52% |
| | Generative (2) | | 66.67% | 87.18% |
| | Harvest (3) | | 94.52% | 85.19% |
| | Preparation (4) | | 82.61% | 90.48% |
| | Puso/crop failure (5) | | 84.62% | 91.67% |
| Random forest without oversampling | Early vegetative (0) | 74.31% | 78.57% | 80.49% |
| | Late vegetative (1) | | 78.89% | 76.34% |
| | Generative (2) | | 56.79% | 80.70% |
| | Harvest (3) | | 95.29% | 67.50% |
| | Preparation (4) | | 23.08% | 42.86% |
| | Puso/crop failure (5) | | 22.22% | 66.67% |
Table 4 displays the results in terms of accuracy, precision, and recall. RF without oversampling achieves an overall
accuracy of 74.31%. The precision of the model in predicting Class 4 (Preparation) and
Class 5 (Puso/crop failure) is only 23.08% and 22.22%, respectively. The recall for Class 4, a minority class, is also low,
at only 42.86%. This shows that this model cannot effectively predict Class 4 and Class 5. The oversampling technique
is used in this paper to handle the imbalanced data and thereby improve the predictive performance for the minority classes. RF
with oversampling achieves an overall accuracy of 81.46%. The Producer's Accuracy (PA), reported as precision in Table 4,
is 82.61% for Class 4 and 84.62% for Class 5.
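Table 5. Confusion matrix of the predictions of the random forest algorithm with the oversampling technique.

| Predicted class | Actual 0 | Actual 1 | Actual 2 | Actual 3 | Actual 4 | Actual 5 | Producer's Accuracy |
|---|---|---|---|---|---|---|---|
| 0 | 41 | 9 | 1 | 4 | 2 | 0 | 71.93% |
| 1 | 4 | 37 | 1 | 0 | 0 | 0 | 88.10% |
| 2 | 3 | 7 | 34 | 6 | 0 | 1 | 66.67% |
| 3 | 0 | 1 | 3 | 69 | 0 | 0 | 94.52% |
| 4 | 4 | 0 | 0 | 0 | 19 | 0 | 82.61% |
| 5 | 0 | 0 | 0 | 2 | 0 | 11 | 84.62% |
| User's Accuracy | 78.85% | 68.52% | 87.18% | 85.19% | 90.48% | 91.67% | |

Overall Accuracy: 81.46%
Kappa Statistic: 0.76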
Table 5 shows that Class 4 and Class 5 can be mapped with good user's accuracy (90.48% and 91.67%,
respectively). The model therefore classifies rice phenology effectively, with a kappa statistic of 0.76 indicating
substantial strength of agreement.
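The overall accuracy and kappa statistic can be recomputed from the counts in Table 5. The sketch below assumes the usual Cohen's kappa, with chance agreement taken from the row and column totals; the function names are illustrative.

```python
# Rows = predicted class 0-5, columns = actual class 0-5 (counts from Table 5).
confusion = [
    [41, 9, 1, 4, 2, 0],
    [4, 37, 1, 0, 0, 0],
    [3, 7, 34, 6, 0, 1],
    [0, 1, 3, 69, 0, 0],
    [4, 0, 0, 0, 19, 0],
    [0, 0, 0, 2, 0, 11],
]

def overall_accuracy(matrix):
    """Observed agreement: diagonal count divided by total count."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def cohen_kappa(matrix):
    """Kappa = (p_o - p_e) / (1 - p_e), with p_e from the marginal totals."""
    total = sum(sum(row) for row in matrix)
    p_o = overall_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_o - p_e) / (1 - p_e)

print(f"OA    = {overall_accuracy(confusion):.4f}")
print(f"kappa = {cohen_kappa(confusion):.4f}")
```

With 211 of 259 validation samples on the diagonal, this gives an overall accuracy of about 0.8147 and a kappa of about 0.767, in line with the reported 81.46% and 0.76 up to rounding.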
4. Conclusion
In summary, oversampling is an appropriate technique for handling imbalanced
data in rice phenology classification in Lamongan Regency. Based on the experiment, the Random Forest classification
algorithm with oversampling outperforms the RF classification algorithm without oversampling in terms
of accuracy in classifying rice phenology, as shown by its overall accuracy and kappa statistic (81.46%
and 0.76). The model also succeeded in classifying the minority classes, Class 4 and Class 5, contained in the ASF
data. The results show that the proposed method achieves the better performance of the two configurations compared for rice phenology
classification, supporting food security data that contribute to the achievement of the SDGs. Further study
using other models is an essential step in classifying imbalanced Geo Big Data. Applying other methods, such as SVM
and NN, might improve model performance.
Acknowledgements
The authors would like to thank Statistics Indonesia (BPS) and Institut Teknologi Sepuluh Nopember (ITS) for
providing the opportunity and support for the author to study in the Doctoral Program of the Department of Statistics.
Furthermore, the authors also thank the other parties who contributed to the completion of this paper.
References
[1] FAO. (2016) “Food and Agriculture Key To Achieving The 2030 Agenda For Sustainable Development.” FAO.
[2] Dirgahayu, D., I. M. Parsa, Silvia, Sri Harini, Soko Budoyo, Krisna Indriawan, Muchlisin Arief, Wawan Harsanugraha, Heru Noviar, Johannes
Manalu, Joko Santo, Ernawati. (2015) “Litbang Pemanfaatan Data Penginderaan Jauh Untuk Pemantauan Pertumbuhan Tanaman Padi Di
Lahan Sawah (Studi Kasus Pulau Kalimantan) [Title in English: R&D Utilization of Remote Sensing Data for Monitoring Rice Plant Growth
in Rice Fields (Case Study on Kalimantan Island)].” Jakarta, Pusfatja LAPAN.
[3] Azar, Ramin, Paolo Villa, Daniela Stroppiana, Alberto Crema, Mirco Boschetti, and Pietro Alessandro Brivio. (2016) “Assessing In-Season
Crop Classification Performance Using Satellite Data: A Test Case In Northern Italy.” European Journal of Remote Sensing, 49: 361–380.
[4] Zhang, Chengkang, Hongyan Zhang, Juan Du, and Liangpei Zhang. (2018) “An Automated Paddy Rice Extent Extraction with Time Stacks
of Sentinel Data: A Case Study In Jianghan Plain, Hubei, China.” 7th International Conference on Agro-geoinformatics (Agro-
geoinformatics) 8: 1–6.
[5] Shelestov, Andrii, Mykola Lavreniuk, Nataliia Kussul, Alexei Novikov, and Sergii Skakun. (2017) “Exploring google earth engine platform
for big data processing: Classification of multi-temporal satellite imagery for crop mapping.” Frontiers in Earth Science, 5: 17.
[6] Mutanga, Onisimo, and Lalit Kumar. (2019) “Google Earth Engine Applications.” Remote Sensing 11: 591.
[7] Kussul, Nataliia, Guido Lemoine, Francisco Javier Gallego, Sergii V. Skakun, Mykola Lavreniuk, and Andrii Yu Shelestov. (2016) “Parcel-
Based Crop Classification in Ukraine Using Landsat-8 Data and Sentinel-1A Data.” IEEE Journal of Selected Topics in Applied Earth
Observations and Remote Sensing 9 (6): 2500–2508.
[8] Teluguntla, Pardhasaradhi, Prasad S. Thenkabail, Adam Oliphant, Jun Xiong, Murali Krishna Gumma, Russell G. Congalton, Kamini Yadav,
and Alfredo Huete. (2018) “A 30-m landsat-derived cropland extent product of Australia and China using random forest machine learning
algorithm on Google Earth Engine cloud computing platform.” ISPRS Journal of Photogrammetry and Remote Sensing 144: 325–340.
[9] Bak, Britta Anker, and Jens Ledet Jensen. (2016) “High Dimensional Classifiers in The Imbalanced Case.” Computational Statistics & Data
Analysis 98: 46-59.
[10] Maurya, Chandresh Kumar, Durga Toshniwal, and Gopalan Vijendran Venkoparao. (2016) “Online sparse class imbalance learning on big
data.” Neurocomputing 216: 250-260.
[11] Japkowicz, Nathalie, and Shaju Stephen. (2002) “The Class Imbalance Problem: A Systematic Study.” Intelligent Data Analysis 6: 429–449.
[12] El-Banna, Mahmoud. (2017) “Modified Mahalanobis Taguchi System for Imbalance Data Classification.” Computational Intelligence and
Neuroscience.
[13] Pouyanfar, Samira, and Shu-Ching Chen. (2017) “Automatic Video Event Detection for Imbalance Data Using Enhanced Ensemble Deep
Learning.” International Journal of Semantic Computing 11 (1): 85-109.
[14] Yang, Qiang, and Xindong Wu. (2006) “10 Challenging problems in data mining research.” International Journal of Information Technology
& Decision Making 5 (04): 597–604.
[15] Belgiu, M., (2016) “Random Forest in Remote Sensing: A Review of Applications and Future Directions.” ISPRS Journal of Photogrammetry
and Remote Sensing 114: 24-31.
[16] Google Earth Engine. (2021, 9th March) “Google Earth Engine.” [Online]. Available: https://earthengine.Google.com.
[17] Huete, Alfredo, Kamel Didan, Tomoaki Miura, E. Patricia Rodriguez, Xiang Gao, and Laerte G. Ferreira. (2002) “Overview of the radiometric
and biophysical performance of the MODIS vegetation indices.” Remote Sensing of Environment, 83 (1–2): 195–213.
[18] Jinguji, Issei. (2015) “Dot Sampling Method For Area Estimation.” Crop Monitoring For Improved Food Security, FAO & ADB.
[19] Badan Pusat Statistik. (2015) “Pedoman Pelaksanaan Uji Coba Sistem Kerangka Sampel Area (KSA) 2015 [Title in English: Guidelines for
the Implementation of the 2015 Area Sampling Framework System (KSA) Trial].” BPS, Jakarta.
[20] Laurikkala, Jorma. (2001) “Improving Identification of Difficult Small Classes by Balancing Class Distribution.” Conference on Artificial
Intelligence in Medicine in Europe: 63-66. Springer, Berlin, Heidelberg, 2001.
[21] Breiman, L. (2001) “Random forests.” Machine learning, 45 (1): 5–32.
[22] Visa, Sofia, Brian Ramsay, Anca L. Ralescu, and Esther Van Der Knaap. (2011) “Confusion matrix-based feature selection.” MAICS 710:
120–127.
[23] Viera, Anthony J., and Joanne M. Garrett. (2005) “Understanding interobserver agreement: The Kappa Statistic.” Family Medicine, 37: 360-
363.
