Convolutional Neural Network Applied to Tree Species Identification Based on Leaf Images
ABSTRACT
We identified tree species based on leaf images with a convolutional neural network (CNN). We sampled approximately 200 to 300 leaves per tree from five tree species at the Kyoto University campus. Twenty to thirty 1.0 × 1.0 cm (256 × 256 pixel) leaf images were taken per leaf, from which 10,000 leaf images (2,000 × 5 individual tree species) were prepared as sample data. Color, grayscale, and binary images were used as image types. We constructed 36 learning models based on differences in learning patterns, image types, and learning iterations. Performance of the proposed models was evaluated using the Matthews correlation coefficient (MCC). Both training and test data showed high classification accuracy. The mean MCC of the five tree species ranged from 0.881 to 0.998 for the training data and from 0.851 to 0.994 for the test data. Classification accuracy was generally high for color images and low for grayscale images. We found many cases in which Cinnamomum camphora was misclassified as Quercus myrsinifolia, or Quercus myrsinifolia was misclassified as Quercus glauca; most Quercus glauca, Ilex integra, and Pittosporum tobira trees were correctly classified using the training data; and misclassification of Ilex integra using the test data was very low.
Keywords: convolutional neural network, image type, leaf image, Matthews correlation coefficient, tree identification
Triggs, 2005; Yamasaki, 2010a) as image features for venation patterns. It is difficult to use this type of information because several image processing functions are necessary to extract leaf image features to apply the classification model.

Since McCulloch and Pitts (1943) proposed the formal neuron model, studies of artificial intelligence and its applications have greatly progressed. Particularly, with the evolution of computer resources and techniques after 2000, results obtained using "deep learning" techniques have attracted attention (Yamashita, 2016). Convolutional neural networks (CNNs), which were proposed by LeCun et al. (1998), have contributed substantially to the image-recognition field (LeCun et al., 1998; Okaya, 2015; Yamashita, 2016). A notable example of a CNN, AlphaGO, which was developed by the Google DeepMind Company, was the first Go computer program superior to a professional human Go player in Tagai-sen (no handicap) using the deep Q-network method (Silver et al., 2016, 2017; Ohtsuki, 2019). Deep learning produces superior results in image recognition because of innovations in advanced computing hardware as well as the use of a method that differs substantially from past image-recognition methods. Previous image-recognition methods extract image features in advance, which are inputted as training data into a learning model. The types of image features used have important effects on classification accuracy; however, image features are difficult to specify because they vary between objects. Thus, extracting image features is greatly affected by the experience of researchers or developers (Yamashita, 2016). By contrast, deep learning can extract image features by itself and is therefore highly accessible, which means non-experts in image recognition can easily use deep learning (Makino and Nishizaki, 2018). In addition, most applications that perform deep learning use free-license software available to the public. Therefore, researchers in various fields can perform image recognition using deep learning.

In tree identification with machine learning or deep learning, the former uses venation patterns extracted from leaves based on the National Cleared Leaf Collection, housed at the Smithsonian Institution (Wilf et al., 2016), whereas the latter is based on point cloud data from laser-scanned forest data (Mizoguchi et al., 2016) or tree species identification using fine-tuning based on aerial photography images collected by a drone (Nakane and Wakatsuki, 2018). A previous study performed tree identification based on venation patterns but used machine learning with leaf-image features (Wilf et al., 2016). Moreover, even if tree identification uses deep learning, such as that performed by Mizoguchi et al. (2016) and Nakane and Wakatsuki (2018), the classification is approximate and varies widely. Few reports are available on tree identification by deep learning based on venation patterns. However, this method would be advantageous for identifying tree species by mobile terminals, as it is not necessary to use the whole leaf image. This study was conducted to identify tree species based on leaf images with a CNN. We divided one section of a leaf image into dozens of pieces of sample data that were inputted into the CNN model. We constructed various learning models based on differences in learning patterns, image types, and learning iterations, and verified the classification accuracy of the models for both training and test data.

MATERIALS AND METHODS

Study Site
We sampled 200–300 leaves from each of five tree species, Cinnamomum camphora, Ilex integra, Pittosporum tobira, Quercus glauca, and Quercus myrsinifolia, at Kyoto University campus. We chose these five species as our test species because they had greater intraspecific variance in leaf shape and lower classification accuracy than other species in a previous study (Minowa et al., 2019). Tree species planted at Kyoto University campus were used primarily so that repeated tree sampling could be performed at the same sites using the same sampling protocol (Minowa et al., 2019); thus, samples can be compared more easily in future studies.

Leaf Image Processing
Leaves sampled at the Kyoto University campus were scanned using a GT-X970 (EPSON Co., Ltd., Suwa, Japan) with a color image resolution of 650 dpi. We used ImageJ 1.50 software (NIH, 2014), an open-source image-processing program, for leaf-image processing, which was performed as follows. We randomly extracted 20–30 1.0 × 1.0 cm (256 × 256 pixel) leaf images from one piece of a leaf, excluding the edges, and prepared 10,000 leaf images (= 2,000 × 5 tree species) as sample data. Color, grayscale, and binary images were used as image types. Figure 1 illustrates the three image types for the five tree species. Although our goal was to develop an auto-tree-identification system using mobile terminals for analyzing leaf images photographed with mobile devices, we used scanned images mainly because their resolution is higher than that of photographs taken with mobile devices. The results of this study may become a benchmark for analyses using mobile devices. Moreover, grayscale and binary images were used mainly because color images are extremely useful when photography conditions or environments are maintained, such as in indoor photography. However, color images are not always useful in uncontrolled environments such as outdoors, as they depend greatly on the hardware characteristics of individual cameras, parameters such as sensitivity, and lighting color (Sato, 2011). For example, light- and shade-based techniques such as Haar-like features are mainstream approaches used for facial recognition (Papageorgiou et al., 1998), which uses grayscale. Moreover, because binarization is important in preprocessing for image recognition, binary images are used. Thus, we used grayscale and binary images in addition to color images in this study.

CNN and GoogLeNet
A CNN is a neural network model mainly used in the image recognition field. In the 2000s, histograms of oriented gradients and scale-invariant feature transform (Lowe, 1999; Yamasaki, 2010b) were commonly used for image features in classification problems that used support vector machine classifiers (Vapnik and Lermer, 1963; Yamasaki, 2010a). However, in 2012, at the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), a CNN with an AlexNet algorithm won the championship because of its overwhelming classification accuracy (Sanchez and Perronnin, 2011). A CNN is a deep neural network consisting of deep layers in several steps: convolution layers, pooling layers, and a fully connected layer (Fig. 2). The convolution layer extracts local image features from a small area using convolution calculation for the input image. The pooling layer compresses the image features extracted by the convolution layer. After many repetitions of these processes, the fully connected layer runs as the final process and the output is the final result (Yamashita, 2016).

In the present study, we used GoogLeNet as the CNN algorithm. GoogLeNet is the 2014 ILSVRC champion (Szegedy et al., 2015). Similar to the network in network (NIN) algorithm proposed by Liu et al. (2014), GoogLeNet utilizes a micro-network that can conduct a full connection between feature maps rather than the activation function. GoogLeNet consists of 22 layers, including inception modules, which are composed of multiple convolution or pooling layers (Yamashita, 2016).

Learning Environment and Models for the CNN
The learning environment for the CNN was a computer with a Linux operating system (Ubuntu 16.04 LTS), an Intel Core i7-7700K central processing unit, and an NVIDIA GeForce GTX 1080Ti graphics processing unit. We used CUDA 8.0 and cuDNN 5.1 to support deep learning on the graphics processing unit (NVIDIA, 2019; Yamashita, 2016; Shimizu, 2017). The learning model for the CNN was built with DIGITS 5.0.0 (NVIDIA website), which enables web-based learning, and Caffe 0.15.13 (NVIDIA, 2019) as the learning framework. Many previous studies have used DIGITS and Caffe for applications such as automatic recognition of the microstructure of steel materials (Adachi et al., 2016), medical-field methodology (Izaki, 2017), and automated classification of coronary angiography (Hasegawa et al., 2018). Morino (2017) also recommended that scientists and technicians use Caffe as the CNN method in deep learning for image analyses. Caffe is a representative framework in the deep learning field; it allows advanced tuning of hyper-parameters and speeds up learning and calculations. In addition, Caffe can mount DIGITS, which can be used with web-based methods. Thus, we used DIGITS, Caffe, and GoogLeNet as tools for deep learning.

Simulation Conditions and Evaluation Performance of the Learning Models
We divided the 10,000 leaf images into 10 equal sets, each including all tree species, and set up four learning model types (Fig. 3). Learning model-1 (LM-1) used one set (= 1,000 images) as training data and verified itself. Learning model-2 (LM-2) estimated the nine remaining sets as test data using the parameters learned by LM-1. Learning model-3 (LM-3) used nine sets (= 9,000 images) as training data and verified itself. Finally, learning model-4 (LM-4) estimated the one remaining set as test data using the parameters learned by LM-3. We conducted 10 iterations without duplication. Although deep learning does not require image features to be prepared in advance, we set up both LM-1 and LM-3 primarily because researchers or developers need to input a large amount of image data as training data into deep learning models. Large amounts of training data must be used in deep learning, but the actual amount necessary is not specified. Using LM-1 with LM-3, we hypothesized that the data requirements to identify tree species based on leaf images could be determined. We set all GoogLeNet hyper-parameters to their default values (Table 1). The numbers of epochs in this study were 50, 100, and 500. Finally, we constructed 36 learning models (= 4 learning model types × 3 image types × 3 epoch types) based on the learning patterns, image types, and learning iterations.

The performance of the proposed models was evaluated with the Matthews correlation coefficient (MCC) (Eq. (1)), an index for determining whether a classification is conducted without bias. The MCC ranges from −1 to 1 (Motoda et al., 2006; Witten and Frank, 2011). In Eq. (1), both true positives (TP) and true negatives (TN) are accurate classifications according to the classifier; the former are the positive examples whereas the latter are the negative examples for each piece of training data. A false positive (FP) occurs when the outcome is incorrectly predicted as 'yes' (or positive) when it is actually 'no' (or negative). A false negative (FN) occurs when the outcome is incorrectly predicted as negative when it is actually positive (Witten and Frank, 2011). Here, the MCC is the average over the ten iterations for each tree species.

    MCC = (TP × TN − FP × FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))    (1)

RESULTS

Classification Accuracy for Both Training and Test Data
Table 2 shows the classification accuracy based on differences in learning patterns, image types, and learning iterations. The classification accuracy for the training data ranged from 0.881 (LM-1, grayscale, 50 epochs) to 0.998 (LM-1, color, 500 epochs) (Fig. 4). Similarly, that of the test data ranged from 0.815 (LM-2, grayscale, 50 epochs) to 0.994 (LM-4, color, 500 epochs) (Fig. 4).
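Eq. (1) maps directly onto code. The sketch below (Python, used here purely for illustration; the study's own pipeline used DIGITS and Caffe) computes the MCC from the four confusion counts. Returning 0 when the denominator vanishes is an assumed convention, not something the paper specifies:

```python
import math

def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion counts (Eq. 1)."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: if any marginal sum is zero, define MCC as 0.
    return num / den if den else 0.0

# An error-free classification gives MCC = 1.
print(mcc(tp=50, fp=0, tn=50, fn=0))   # → 1.0
```

Because the numerator rewards agreement on both positives and negatives, the MCC stays low for classifiers that merely predict the majority class, which is why it suits the balanced-but-multiway setting here.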
Table 2 Classification accuracy according to the differences in learning patterns, image types, and learning iterations. Values are Matthews correlation coefficients (MCC) per species.

(1) Color image
Learning  Training  Test    Epochs  C. camphora  I. integra  P. tobira  Q. glauca  Q. myrsinifolia
model     data      data
LM-1      1,000     1,000   50      0.966        1.000       0.996      0.962      0.928
                            100     0.981        1.000       1.000      0.999      0.981
                            500     0.996        1.000       1.000      0.999      0.996
LM-2      1,000     9,000   50      0.890        0.991       0.905      0.936      0.812
                            100     0.922        0.994       0.894      0.944      0.834
                            500     0.934        0.996       0.920      0.941      0.862
LM-3      9,000     9,000   50      0.990        1.000       1.000      0.999      0.989
                            100     0.989        1.000       1.000      0.999      0.988
                            500     0.993        1.000       1.000      1.000      0.993
LM-4      9,000     1,000   50      0.985        0.999       1.000      0.998      0.982
                            100     0.983        0.999       0.994      0.991      0.981
                            500     0.989        0.999       0.998      0.997      0.986

(2) Grayscale image
Learning  Training  Test    Epochs  C. camphora  I. integra  P. tobira  Q. glauca  Q. myrsinifolia
model     data      data
LM-1      1,000     1,000   50      0.825        0.978       0.997      0.860      0.747
                            100     0.854        0.989       0.996      0.906      0.828
                            500     0.936        0.998       0.999      0.967      0.931
LM-2      1,000     9,000   50      0.687        0.958       0.946      0.813      0.672
                            100     0.721        0.957       0.911      0.863      0.769
                            500     0.793        0.967       0.937      0.890      0.827
LM-3      9,000     9,000   50      0.951        0.998       0.999      0.969      0.950
                            100     0.969        0.999       1.000      0.979      0.969
                            500     0.988        0.999       0.999      0.992      0.988
LM-4      9,000     1,000   50      0.936        0.993       0.997      0.954      0.929
                            100     0.955        0.996       0.999      0.962      0.955
                            500     0.980        0.997       0.999      0.983      0.975

(3) Binary image
Learning  Training  Test    Epochs  C. camphora  I. integra  P. tobira  Q. glauca  Q. myrsinifolia
model     data      data
LM-1      1,000     1,000   50      0.899        0.975       0.997      0.909      0.835
                            100     0.950        0.989       0.999      0.945      0.916
                            500     0.974        0.999       1.000      0.981      0.966
LM-2      1,000     9,000   50      0.866        0.924       0.984      0.834      0.754
                            100     0.888        0.926       0.972      0.883      0.847
                            500     0.898        0.941       0.975      0.894      0.871
LM-3      9,000     9,000   50      0.972        0.996       0.999      0.969      0.955
                            100     0.986        0.998       1.000      0.976      0.970
                            500     0.993        0.999       1.000      0.985      0.982
LM-4      9,000     1,000   50      0.961        0.974       0.993      0.950      0.931
                            100     0.972        0.983       0.997      0.960      0.954
                            500     0.980        0.975       0.996      0.964      0.959

Note: An MCC of 1.00 (shown underlined in the original tables) indicates perfect classification.
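The per-species values in Table 2 are one-vs-rest MCCs: for each species, the 5 × 5 confusion matrix is collapsed to binary counts (that species vs. all others) before applying Eq. (1), and the paper then averages over the ten iterations. A sketch of that collapse (the 3 × 3 matrix is a toy illustration, not data from this study):

```python
import math

def per_class_mcc(conf: list[list[int]], k: int) -> float:
    """One-vs-rest MCC for class k; conf[i][j] = count of actual i predicted j."""
    total = sum(sum(row) for row in conf)
    tp = conf[k][k]
    fn = sum(conf[k]) - tp                 # actual k, predicted as another class
    fp = sum(row[k] for row in conf) - tp  # another class, predicted as k
    tn = total - tp - fn - fp
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0  # zero-denominator convention assumed

# Toy 3-class example: class 2 is classified perfectly.
conf = [[8, 2, 0],
        [1, 9, 0],
        [0, 0, 10]]
print(per_class_mcc(conf, 2))   # → 1.0
```

Reading the matrix row-wise as "actual" and column-wise as "predicted" matches the misclassification counts reported in the next section (e.g., C. camphora predicted as Q. myrsinifolia).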
Both the training and test data showed remarkably high classification accuracy. For each learning model, when the number of epochs in LM-1 was low, the MCC value was relatively low. The MCC in LM-1 tended to improve as the number of epochs increased. The MCC in LM-2 was lower than that of other learning models. Similar to LM-1, the MCC tended to improve as the number of epochs increased. The MCC in LM-3 indicated extremely high classification accuracy for all image types. For color images in LM-3, 100 epochs (MCC = 0.990) showed lower accuracy than 50 epochs (MCC = 0.993). The MCC values in LM-4 were similar to those of LM-3. Color images had the highest classification accuracy among all learning models. In addition, the MCC for all epoch numbers was over 0.9 when using color images, although there was much more test data than training data in LM-2. In both LM-1 and LM-2, the classification accuracy of binary images was higher than that of grayscale images, while in LM-3 and LM-4, the classification accuracy of grayscale images was similar to that of binary images, except for the case of 500 epochs in LM-3.

Classification accuracy for each tree species varied according to the image type. Using color images, the MCC of I. integra in LM-1 and both I. integra and P. tobira in LM-3 were 1.00 for all numbers of epochs (Table 2). The classification accuracy of Q. myrsinifolia was lower than that of other tree species for all learning model types; in particular, that of LM-2 ranged from 0.812 (50 epochs) to 0.862 (500 epochs), which was much lower than in other cases. Using grayscale images, only the MCC of P. tobira in LM-3 (100 epochs) was 1.00 (Table 2). Most MCC values for both I. integra and P. tobira were over 0.99, and the lowest was 0.911 (P. tobira, LM-2, 100 epochs). Similar to the color images, the classification accuracy in LM-2 was relatively low overall, and MCC values for both C. camphora and Q. myrsinifolia with 50 epochs were much lower than those of other tree species (0.687 and 0.672, respectively). Using binary images, some MCC values for P. tobira were 1.00 (Table 2); this tree species tended to be identified with higher accuracy than other species for both color and grayscale images. The MCC values of C. camphora, Q. glauca, and Q. myrsinifolia were lower than those of other tree species in the same image types, while most MCCs in binary images were higher than those of the grayscale images.

Tendency for Misclassification of Each Tree Species
Figure 5 illustrates the misclassification patterns according to tree species. Using color images, LM-1 correctly classified all tree species except for C. camphora, which was misclassified as Q. myrsinifolia (12 misclassifications), and Q. myrsinifolia, which was misclassified as Q. glauca (2). The misclassification in LM-2 using color images was greater than that of other learning models. In LM-3 using color images, all tree species were correctly classified except for C. camphora, which was misclassified as Q. myrsinifolia (196), and Q. myrsinifolia, which was misclassified as Q. glauca (7). In LM-4 using color images, Q. glauca was correctly classified. Overall, C. camphora and Q. myrsinifolia tended to be misclassified as Q. myrsinifolia and Q. glauca, respectively; most Q. glauca, I. integra, and P. tobira species were correctly classified for all training data, while misclassifications of I. integra for the test data were very low.

For grayscale images, misclassifications occurred in all tree species in LM-1 and LM-2. Generally, classifications using grayscale images were similar to those using color images; however, misclassification rates were higher than those of color images, and there were numerous misclassification types not observed when using color images.

In LM-1 using binary images, I. integra and P. tobira were correctly classified. Conversely, LM-2 had many cases of misclassification overall. Classifications made using binary images were similar to those using color and grayscale images, and misclassification rates were lower than those of grayscale images. There were some misclassification types that were not observed in the color images; for example, P. tobira and Q. glauca were misclassified as I. integra.

In general, there were numerous cases of misclassification: C. camphora was misclassified as Q. glauca or Q. myrsinifolia, and Q. myrsinifolia as Q. glauca, for both the training and test data, respectively. The misclassification of I. integra was very limited, although there was some misclassification as Q. glauca. Both P. tobira and Q. glauca tended to be misclassified based on the image types.

DISCUSSION

… compared to the results of the present study, although the learning conditions differed (e.g., algorithms, image types). However, the authors suggested that a high classification accuracy cannot be assured when the number of tree species is low. Nevertheless, the classification accuracy of both the training and test data was high in the present study.

Among image types, classification accuracy was highest for color images and lowest for grayscale images. The amount of information decreased in the order of color, grayscale, and binary. Thus, we expected that the classification accuracy would improve according to this order. In the present study, however, grayscale images provided lower accuracy than the other image types. In addition, when users actually photograph a leaf with a smartphone, color images may be greatly influenced by the photo environment (Sato, 2011). Thus, when developing a tree identification system for mobile terminals, binary images should be considered because they provide robust results under different environmental conditions at the cost of slightly lower accuracy than color images (Sato, 2011).
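The image types compared throughout this study differ only in a deterministic preprocessing step applied to the same capture. A minimal pixel-level sketch of the color → grayscale → binary conversion (the BT.601 luminance weights and the fixed threshold of 128 are illustrative assumptions; the paper does not state its conversion parameters):

```python
def to_gray(rgb: tuple[int, int, int]) -> int:
    """Luminance of an RGB pixel (BT.601 weights, assumed)."""
    r, g, b = rgb
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def to_binary(gray: int, threshold: int = 128) -> int:
    """Threshold a grayscale value to 0/1 (fixed global threshold, assumed)."""
    return 1 if gray >= threshold else 0

# One leaf-green pixel through the pipeline.
pixel = (60, 120, 40)
g = to_gray(pixel)       # 0.299*60 + 0.587*120 + 0.114*40 ≈ 93
print(g, to_binary(g))   # → 93 0
```

In practice an adaptive method such as Otsu's threshold would be preferable outdoors, since a fixed threshold is exactly the kind of parameter that varies with the lighting conditions the discussion warns about.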
CONCLUSIONS
LITERATURE CITED

Adachi, Y., Taguchi, M. and Hirokawa, S. (2016) Microstructure recognition by deep learning. Tetsu-to-hagané 102: 722–729 (in Japanese with English abstract)
BIOME (2017) https://biome.co.jp/ (accessed 26 July 2019)
Dalal, N. and Triggs, B. (2005) Histograms of oriented gradients for human detection. Proc. CVPR: 886–893
Du, J.X., Wang, X.F. and Zhang, G.J. (2007) Leaf shape based plant species recognition. Appl. Math. Comput. 185: 883–893
Gouveia, F., Filipe, V., Reis, M., Couto, C. and Cruz, J.B. (1997) Biometry: the characterization of chestnut-tree leaves using computer vision. Proc. ISIE: 757–760
Hasegawa, A., Lee, Y., Takeuchi, Y. and Ichikawa, K. (2018) Automated classification of calcification and stent on computed tomography coronary angiography using deep learning. Jpn. J. Radiol. Tech. 74: 1138–1143 (in Japanese with English abstract)
Izaki, T. (2017) The latest information of deep learning and its application in medical field. Jpn. J. Imaging Inf. Sci. Med. 34: 32–38 (in Japanese with English abstract)
Kumar, N., Belhumeur, P.N., Biswas, A., Jacobs, D.W., Kress, W.J., Lopez, I. and Soares, J.V.B. (2012) Leafsnap: A computer vision system for automatic plant species identification. Lect. Notes Comput. Sci. 7573: 502–516
LeCun, Y., Bottou, L., Bengio, Y. and Haffner, P. (1998) Gradient-based learning applied to document recognition. Proc. IEEE 86: 2278–2324
Lee, C.L. and Chen, S.Y. (2006) Classification of leaf images. Int. J. Imaging Syst. Tech. 16: 15–23
Liu, M., Chen, Q. and Yan, S. (2014) Network in network. Proc. ICLR: 1–10
Lowe, D.G. (1999) Object recognition from local scale-invariant features. Proc. ICCV: 1150–1157
Makino, K. and Nishizaki, H. (2018) Sansu & Razupai karahajimeru deep learning [Deep learning to begin with the arithmetic and Raspberry Pi]. CQ shuppansha, Tokyo, 208 pp (in Japanese)
McCulloch, W.S. and Pitts, W. (1943) A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 5: 115–133
Minowa, Y., Akiba, T. and Nakanishi, Y. (2011) Classification of a leaf image using a self-organizing map and tree based model. J. For. Plan. 17: 31–42
Minowa, Y. and Asao, K. (in press) Tree species identification based on venation pattern of leaf images photographed with a mobile device in the outdoors. Jpn. J. For. Plann. (in Japanese with English abstract)
Minowa, Y., Takami, T., Suguri, M., Ashida, M. and Yoshimura, Y. (2019) Identification of tree species using a machine learning algorithm based on leaf shape and venation pattern. Jpn. J. For. Plann. 53: 1–13 (in Japanese with English abstract)
Mizoguchi, T., Ishii, A., Nakamura, H., Inoue, T. and Takamatsu, H. (2016) Individual tree species classification on laser scan of forest by deep learning. JSPE Meeting: 491–492 (in Japanese with English abstract)
Morino, S. (2017) Tutorial for deep learning and GPU. J. Robot. Soc. Jpn. 35: 203–208 (in Japanese)
Motoda, H., Tsumoto, S., Yamaguchi, T. and Numao, M. (2006) Data mining no kiso [Fundamentals of data mining]. Ohmsha, Tokyo, 285 pp (in Japanese)
Nakane, H. and Wakatsuki, Y. (2018) Startup of deep learning application to environmental research. Kochi Univ. of Tech. Bull. 15: 111–120 (in Japanese with English abstract)
Nam, Y. and Hwang, E. (2005) A shape-based retrieval scheme for leaf image. Lect. Notes Comput. Sci. 3767: 876–887
NIH (2004) ImageJ, https://imagej.nih.gov/ij/ (accessed 14 February 2018)
NVIDIA (2019) CUDA Toolkit, https://developer.nvidia.com/cuda-toolkit/ (accessed 6 February 2019)
Ohtsuki, T. (2019) Saikyo igo AI AlphaGo kaitai shinsyo [Strongest Go, AlphaGo, new book of anatomy]. Shoeisha, Tokyo, 323 pp (in Japanese)
Okaya, T. (2015) Shinso gakusyu [Deep learning]. Kodansha, Tokyo, 176 pp (in Japanese)
Papageorgiou, C.P., Oren, M. and Poggio, T. (1998) A general framework for object detection. Proc. ICCV: 555–562
Sanchez, J. and Perronnin, F. (2011) High-dimensional signature compression for large-scale image classification. Proc. CVPR: 1665–1672
Sato, Y. (2011) Pattern processing for image recognition. J. Robot. Soc. Jpn. 29: 439–442 (in Japanese)
Shen, Y., Zhou, C. and Lin, K. (2005) Leaf image retrieval using a shape based method. Proc. AIAI: 711–719
Shimizu, R. (2017) Hajimete no Shinso gakusyu programing [The first deep learning programing]. Gijyutsu-hyoronsha, Tokyo, 190 pp (in Japanese)
Shindo, T., Yang, L.L., Hoshino, Y. and Cao, Y. (2018) A fundamental study on plant classification using image recognition by AI. The 61st Japan Joint Automatic Control Conference: 175–179
Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., Dieleman, S., Grewe, D., Nham, J., Kalchbrenner, N., Sutskever, I., Lillicrap, T., Leach, M., Kavukcuoglu, K., Graepel, T. and Hassabis, D. (2016) Mastering the game of Go with deep neural networks and tree search. Nature 529: 484–489
Silver, D., Schrittwieser, J., Simonyan, K., Antonoglou, I., Huang, A., Guez, A., Hubert, T., Baker, L., Lai, M., Bolton, A., Chen, Y., Lillicrap, T., Hui, F., Sifre, L., Driessche, G., Graepel, T. and Hassabis, D. (2017) Mastering the game of Go without human knowledge. Nature 550: 354–359
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V. and Rabinovich, A. (2015) Going deeper with convolutions. Proc. CVPR: 1–12
Vapnik, V. and Lermer, A. (1963) Pattern recognition using generalized portrait method. Automat. Rem. Contr. 24: 774–780
Wang, Z., Chi, Z., Feng, D. and Wang, Q. (2000) Leaf image retrieval with shape features. Lect. Notes Comput. Sci. 1929: 477–487
Wilf, P., Zhang, Z., Chikkerur, S., Little, S.A., Wing, S.L. and Serre, T. (2016) Computer vision cracks the leaf code. PNAS 113: 3305–3310
Witten, I.H. and Frank, E. (2011) Data mining, practical machine learning tools and techniques. 3rd ed. Morgan Kaufmann Publishers, California, 629 pp
Yamasaki, T. (2010a) Histogram of oriented gradients (HOG). J. Inst. Image Inform. TV. Engnr. 64: 322–329 (in Japanese)
Yamasaki, T. (2010b) Scale-invariant feature transform (SIFT) and bag of features (BoF). J. Inst. Image Inform. TV. Engnr. 64: 530–537 (in Japanese)
Yamashita, T. (2016) Irasuto de manabu deep learning [An illustrated guide to deep learning]. Kodansha, Tokyo, 207 pp (in Japanese)

(Received 2 July 2019)
(Accepted 7 April 2020)
(J-STAGE Advance Published 22 April 2020)