[Fig. 3 diagram: raw data (ground and aerial views) → labeled data → 2. compute CNN features → 3. classify regions → bounding-box predictions (Algae? Yes / Algae? No).]
Fig. 3. The procedure undertaken to develop the proposed computer vision system for algae detection.
methods that use UAVs, satellites or airplanes for algae monitoring. The second class of monitoring techniques makes use of water-based robots, USVs and ships. A third approach that has been proposed is to build a community-based algae monitoring system composed of citizen scientists using ground-based cameras such as smartphones [22]. However, all these techniques capture images from different orientations and use computer vision pipelines that exploit the advantages provided by their respective hardware platforms, making it highly improbable that a vision system developed for aquatic platforms could work with images taken from the aerial view, or vice versa, because the images would differ significantly from each other, as shown in Figure 2.

However, using only one kind of platform severely restricts our ability to monitor bodies of water of different sizes effectively. A ground-based algae monitoring system would be incapable of detecting the growth of algal blooms at the center of a large lake, while aerial or aquatic robot platform based methods would restrict algae monitoring capabilities to only those communities that possess that particular hardware. Hence, a computer vision system that enables algae monitoring to be executed by different platforms, as depicted in Figure 1, would democratize the process of algae monitoring and make it available to communities all across the globe.

The following subsections describe the steps shown in Figure 3, which reflect the procedure we have undertaken to develop our computer vision system.

A. Dataset Development

The first step in the application of machine learning algorithms is the preparation of a dataset. Because there is no publicly available dataset containing images of algae in an outdoor environment, a new benchmark dataset must be developed. Each image in the dataset should be labeled with annotation software to generate files containing the coordinates of the bounding boxes, indicating the location of algae in the image. To ensure that the complexity of the images in the dataset is equivalent to that of the outdoor environment, it is recommended that a significant number of background objects such as trees, grass, plants, roads and buildings be present in the images, rather than just having the region of interest, in this case algae, as the main part of the image.

Subsequently, the images and their corresponding annotations should be randomly assigned to training, validation and testing datasets. Typically, we can assign 70% of the images to the training set, 20% to the validation set and 10% to the test set.

B. Object Detection Algorithm

Initially, the general lack of algae images on the Internet made it infeasible to apply deep learning algorithms, due to the unavailability of a large amount of annotated data akin to conventional datasets such as COCO [25] or PASCAL-VOC [26]. However, a machine learning paradigm known as transfer learning enabled us to overcome the limitations posed by the size of our dataset.

1) Transfer Learning: Transfer learning is a paradigm in machine learning that allows the domains, distributions, and tasks of the training and testing datasets to differ [27]. This implies that a Convolutional Neural Network (CNN) that has been trained on a large labeled dataset from a different domain can be applied for feature detection in a completely new domain. This is made possible by the fact that the lower layers of the network detect general features such as edges and blobs that comprise an image, while the higher layers capture domain-specific features. This enables a pre-trained network to learn from a much smaller dataset to recognize a completely new class of objects, as only the latter layers of the network have to be trained [28].

2) Deep Learning Models: An algae monitoring system that could conceivably be used from mobile platforms such as USVs, UAVs and airplanes has to detect and locate algae at near real-time speeds with high accuracy. As the Faster R-CNN, Single Shot Detector (SSD), and Region-based Fully Convolutional Network (R-FCN) models have shown near real-time object detection on conventional datasets such as COCO and PASCAL-VOC with very high accuracy, they are applicable to our envisaged scenario. The working of these three models is described below.
• Faster R-CNN: Faster R-CNN [29] was developed to remove the limitations of the R-CNN [30] and Fast R-CNN [31] models. It makes use of a region proposal network to generate box proposals from an intermediate feature map. These box proposals are then applied on the same
feature map to extract proposal regions, on which class predictions and bounding box refinements are made.
• R-FCN: The R-FCN network has a shared convolutional architecture that makes use of position-sensitive score maps to compromise between translation invariance for image classification and translation variance for object detection [32].
• Single Shot Detector: The single shot multibox detector was recently developed in [33]. It is a single-stage feed-forward convolutional network that makes use of pre-defined anchor boxes to train the network and to make predictions about the class within each anchor box and the offset by which the anchor needs to be shifted to fit the ground-truth box.

3) Learning from Custom Dataset: The procedures followed to enable the chosen deep learning models to detect algae are as follows:

Algorithm 1 Training a neural network to detect algae
Input: A file containing the pre-trained weights of the model; a label map; a configuration file for the pre-trained model
Output: A file containing the weights of the newly trained model
1: repeat
2:   Prepare an annotated dataset and split it into training, validation and testing datasets
3:   Convert the dataset annotations into the appropriate input format
4:   Fine-tune the hyperparameters of the neural network
5:   Monitor the training loss and mean average precision on the validation dataset
6:   If the mAP graph converges, stop training and observe the final validation mAP
7: until validation mAP > satisfactory mAP
8: Obtain the mAP of the trained network on the test dataset
9: Deploy the model into production
10: Set a confidence threshold and visualize the results in the image

a) Convert data: The input data, i.e., the images and bounding box coordinates, need to be converted into appropriate formats depending upon how the data is being ingested and the deep learning framework being utilized. In the case of Tensorflow [34], the input data should be converted into tfrecords, while for MXNET [35], a recordIO file should be used. This ensures faster processing as compared to when the data is read directly from disk.
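As a concrete illustration of this conversion step, the sketch below bundles one image and its annotations into a single serialized record. It is a framework-agnostic stand-in: a real pipeline would emit a tf.train.Example into a .tfrecord file (TensorFlow) or a RecordIO entry (MXNet), and the filename and box values here are hypothetical.

```python
import json

def make_record(image_path, boxes, label="algae"):
    """Bundle one image with its annotations into one serialized record.

    boxes holds (xmin, ymin, xmax, ymax) pixel coordinates, one tuple per
    annotated algae region. In TensorFlow this record would instead be a
    tf.train.Example written to a .tfrecord file; in MXNet, a RecordIO entry.
    """
    return json.dumps({
        "image_path": image_path,
        "classes": [label] * len(boxes),   # one class label per box
        "boxes": [list(b) for b in boxes],
    })

record = make_record("lake_001.jpg", [(34, 50, 210, 180)])  # hypothetical image
```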
b) Data Augmentation: To mitigate the effects of having a small dataset and to replicate external environmental conditions such as variable illumination, fluctuating contrast, and blurring, the dataset should be augmented by applying transformations such as randomly changing the brightness, contrast, hue, color, and saturation.
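A minimal, dependency-free sketch of such photometric augmentation on a grayscale pixel list follows; a real pipeline would use, e.g., tf.image's random brightness/contrast/hue/saturation ops on full RGB tensors, and the jitter ranges below are illustrative assumptions. Note that purely photometric transforms leave the bounding-box annotations unchanged.

```python
import random

def augment(pixels, seed=None):
    """Randomly jitter brightness and contrast of a grayscale image,
    given as a flat list of 0-255 intensities, to mimic variable
    outdoor illumination. Results are clamped back into [0, 255]."""
    rng = random.Random(seed)
    shift = rng.uniform(-0.2, 0.2) * 255   # additive brightness shift
    gain = rng.uniform(0.8, 1.2)           # multiplicative contrast gain
    return [min(255, max(0, int(round((p - 128) * gain + 128 + shift))))
            for p in pixels]
```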
c) Label Maps: Machine learning algorithms cannot work with categorical features (e.g., class labels for objects of interest in images), and hence categorical features should be mapped to numerical values. This is usually done by using a label map. Since our dataset annotations belong to only one class (i.e., algae), in our label map the algae class is mapped to the ID 1.
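In the TensorFlow Object Detection API, the label map is a small .pbtxt file; for a single-class dataset like ours it reduces to one entry (file name and path conventions vary by setup):

```
item {
  id: 1
  name: 'algae'
}
```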
d) Fine tune hyperparameters in configuration files: Pre-trained neural networks are made available with two components: the weights of the model, and the configuration file that determines the meta-architecture of the model, the feature extractor present in the model, the training parameters and the evaluation metrics. To customize the pre-trained networks for our dataset, a learning rate decaying by 10% every 5000 steps is applied, and we change the final layer to reflect that there is only one class of objects in our dataset.
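Reading "decaying by 10% every 5000 steps" as a 10% reduction at each 5000-step boundary, the schedule can be sketched as follows; the base rate of 3e-4 is an illustrative assumption, not a value from our configuration files.

```python
def learning_rate(step, base_lr=3e-4, decay=0.9, every=5000):
    """Step-wise exponential decay: the rate is multiplied by `decay`
    (i.e., reduced by 10%) each time `every` training steps elapse."""
    return base_lr * decay ** (step // every)
```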
e) Begin training and monitor evaluation metrics: After customizing the configuration files, the training of the neural network models can be initialized, which fine-tunes the weights in the latter layers of the neural networks, enabling them to detect algae. During the training of the model, after a specified number of steps, an intermediate trained model is stored and evaluated on the validation dataset. Observing the different metric values, such as the training loss and the mean average precision (mAP) on the validation dataset, enables us to infer:
• When the model has stopped learning, so that training can be stopped
• Whether subsequent iterations of training should be conducted, by optimizing hyperparameters or using a larger dataset
f) Evaluating the trained model on the test dataset and deploying into production: The trained model would be evaluated on the test set to examine its classification and detection accuracy on a completely new set of data, allowing us to infer whether the chosen model is applicable to be used as an algae monitoring system.
g) Visualizing results generated by the neural network: For each frame/image the neural network populates the following arrays:
• Boxes – this array contains the normalized coordinates of each predicted bounding box.
• Scores – this array contains the confidence scores for each of the predicted boxes.
• Classes – this array contains the class label for each of the predicted boxes.
• Number of detections – this array contains the total number of detections made per image.
From these individual arrays, we create a list of all the predicted bounding boxes having a confidence score higher than 50%. Each list item contains the class label, the normalized box coordinates and the confidence score for that bounding box, as shown in Figure 4. By applying the following equations for each bounding box, we convert the normalized coordinates into image coordinates.
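The conversion itself is a per-box rescale from normalized [0, 1] coordinates to pixel coordinates; a sketch, assuming the TensorFlow detection convention of [ymin, xmin, ymax, xmax] box order:

```python
def to_pixel_coords(box, width, height):
    """Convert one normalized [ymin, xmin, ymax, xmax] box (values in
    [0, 1]) into integer pixel coordinates for drawing on the image."""
    ymin, xmin, ymax, xmax = box
    return (int(round(xmin * width)), int(round(ymin * height)),
            int(round(xmax * width)), int(round(ymax * height)))

# a hypothetical detection covering the lower-right quadrant of a 640x480 frame
left, top, right, bottom = to_pixel_coords([0.5, 0.5, 1.0, 1.0], 640, 480)
```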
[TABLE I: Distribution of images across the training, validation, and test sets (D. = Detection). Metric columns: Accuracy | Precision | Recall | Valid. mAP | Test mAP | FPS (GPU) | FPS (CPU).]
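The random split of annotated images into training, validation and test sets (70/20/10, as recommended in the dataset development step) can be sketched as follows; the filenames are placeholders.

```python
import random

def split_dataset(filenames, seed=0):
    """Shuffle the annotated images and assign 70% to training,
    20% to validation and 10% to test."""
    rng = random.Random(seed)
    shuffled = list(filenames)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train, n_val = int(0.7 * n), int(0.2 * n)
    return (shuffled[:n_train],                  # training set
            shuffled[n_train:n_train + n_val],   # validation set
            shuffled[n_train + n_val:])          # test set

train, val, test = split_dataset([f"img_{i:03d}.jpg" for i in range(100)])
```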
The mAP is obtained by integrating the precision-recall curve [36],

mAP = ∫₀¹ p(x) dx    (1)

where p(x) is the precision-recall curve. To determine the precision-recall curve, the true positive and false positive values of the predictions can be computed by using the Intersection over Union (IoU) criterion,

IoU = area(Box_pred ∩ Box_gt) / area(Box_pred ∪ Box_gt)    (2)

where Box_pred and Box_gt denote the predicted and ground-truth bounding boxes, respectively. Then, a threshold for the IoU is designated (e.g., 0.5), and if the IoU exceeds the threshold, the detection is marked as a correct detection. Among multiple detections of the same object, only one is considered a correct detection, while the others are considered false detections. After obtaining the true positive and false positive values, a precision-recall curve is generated, based on which the mAP can be calculated.
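Equations (1) and (2) can be sketched directly. The code below computes the IoU of axis-aligned (xmin, ymin, xmax, ymax) boxes and approximates the integral in Eq. (1) by trapezoidal integration over sampled (recall, precision) points, a simplification of the interpolated AP used by the PASCAL-VOC and COCO evaluation tooling.

```python
def iou(a, b):
    """Intersection over Union of two (xmin, ymin, xmax, ymax) boxes, Eq. (2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def average_precision(recalls, precisions):
    """Approximate Eq. (1), the area under the precision-recall curve,
    by trapezoidal integration over sampled (recall, precision) points."""
    ap = 0.0
    for i in range(1, len(recalls)):
        ap += (recalls[i] - recalls[i - 1]) * (precisions[i] + precisions[i - 1]) / 2
    return ap
```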
The mAP values obtained by evaluating all three trained models on the validation and test datasets are presented in Table II (see columns 5 and 6). From the table, we can observe that all three models show reasonably acceptable accuracy in algal bloom localization. In particular, it reveals that both Faster R-CNN and R-FCN have higher detection accuracy than SSD on both the test and validation datasets. Also, we can observe that the mAP on the test dataset is lower than the mAP on the validation dataset. We believe this resulted from the fact that we had chosen the most complex images in our entire dataset, with significant amounts of background clutter (i.e., trees, buildings, roads, etc.), to be a part of our test dataset. Nevertheless, despite the complexity inherent to our test dataset, we observe from Figure 5 and Figure 6 that algal blooms are well detected regardless of the orientation and location from which the image was taken.

3) Speed of Detection: Since we envision that our algae monitoring system can be applied to fast-moving mobile platforms such as UAVs, USVs and airplanes, it is necessary that our system can perform algae detection and localization in real time. The results regarding the speed with which each model can detect algae in an input image are shown in Table II (see the last two columns). In contrast to the classification and detection accuracy, SSD outperforms the other two models in terms of detection speed both when using a CPU and a GPU. However, the detection speeds for each model, particularly when implemented on a GPU, are satisfactory for use in algal monitoring systems, as algal blooms grow in static or slow-moving water bodies.

C. Discussion

Despite using state-of-the-art object detection models, we observed that there were a few occasions in which either algae was not detected or the surrounding vegetation was detected as algae, which could be a topic for further study. Some examples are presented in Figure 7. The most likely cause for such incidences is that the only defining characteristic of algae is its green color, which is also the most common color found in an outdoor environment. However, a larger dataset enabling the neural network to learn more intricate features would allow algae to be detected with more consistency.

V. CONCLUSION AND FUTURE WORK

In this work, our objective was to develop a computer vision system that can detect and locate algal blooms in real time, from cameras placed on aerial, aquatic or ground-based platforms. In order to attain our objective, we compared state-of-the-art object detectors such as Faster R-CNN, R-FCN, and SSD. Our final conclusion was that an algae monitoring system based on the R-FCN model would be sufficiently robust, accurate and fast to enable effective, real-time algae monitoring. We expect that such a system will significantly facilitate researchers, local administrators and civilians in monitoring water bodies and promptly curbing any excessive algal growth.

Future work will focus on improving the current performance by developing a larger dataset and conducting field tests. In addition, we will use this computer vision system to facilitate the working of a multi-robot ecosystem composed of autonomous UAVs and USVs, which would be responsible for monitoring water bodies and the removal of HABs, as shown in Figure 1.

VI. ACKNOWLEDGEMENTS

The author would like to acknowledge the help and support provided by the SMART Lab members, notably Wonse Jo and Dr. Ramviyas Parasuraman, in the development of this computer vision system.

REFERENCES

[1] “General algae information,” http://www.ecy.wa.gov/programs/wq/plants/algae/lakes/AlgaeInformation.html, accessed: 2017-09-30.
[Fig. 5. Images that show detection results for each model from a ground view, drawn in green; detections are labeled with confidence scores (e.g., “Algae: 99%”). Panels include (b) algae detection by R-FCN and (c) algae detection by SSD.]

[Fig. 6. Images that show detection results for each model from an aerial view, drawn in green.]
[Fig. 7. Images that show incorrect detection or the inability to detect algae: (a) incorrect detection of surrounding vegetation as algae; (b) inability to detect algae from ground and aerial views.]

[2] “Ocean’s living carbon pumps: When viruses attack giant algal blooms, global carbon cycles are affected,” https://www.sciencedaily.com/releases/2014/10/141021101510.htm, accessed: 2017-10-12.
[3] M. Y. Menetrez, “An overview of algae biofuel production and potential environmental impact,” Environmental Science & Technology, vol. 46, no. 13, pp. 7073–7085, 2012.
[4] H. Fallowfield and M. Garrett, “The treatment of wastes by algal culture,” Journal of Applied Microbiology, vol. 59, no. s14, 1985.
[5] D. M. Anderson, “Approaches to monitoring, control and management of harmful algal blooms,” Ocean and Coastal Management, vol. 52, no. 7, pp. 342–347, 2009.
[6] S. R. Bhat and S. G. P. Matondkar, “Algal blooms in the seas around India: networking for research and outreach,” Current Science, vol. 87, no. 8, pp. 1079–1083, 2004.
[7] T. Okaichi, Sustainable Development in the Seto Inland Sea, Japan. Terra Scientific Publishing Company, 1997.
[8] U.S. Environmental Protection Agency, “A compilation of cost data associated with the impacts and control of nutrient pollution,” Tech. Rep., May 2015.
[9] K. D. Hambright, X. Xiao, and A. R. Dzialowski, “Remote sensing of water quality and harmful algae in Oklahoma lakes [extension from 2013 (final report)],” Oklahoma Water Resources Research Institute, 2015.
[10] S. Jung, D. Kim, K. Kim, and H. Myung, “Image-based algal blooms detection using local binary pattern,” 2016.
[11] Y. Wang, R. Tan, G. Xing, J. Wang, X. Tan, and X. Liu, “Energy-efficient aquatic environment monitoring using smartphone-based robots,” ACM Transactions on Sensor Networks, vol. 12, no. 3, 2016.
[12] C. S. Tan, P. Y. Lau, S.-M. Phang, and T. J. Low, “A framework for the automatic identification of algae (Neomeris vanbosseae M.A. Howe): U3S,” in Computer and Information Sciences (ICCOINS), 2014 International Conference on. IEEE, 2014, pp. 1–6.
[13] G. A. Carvalho, P. J. Minnett, L. E. Fleming, V. F. Banzon, and W. Baringer, “Satellite remote sensing of harmful algal blooms: A new multi-algorithm method for detecting the Florida red tide (Karenia brevis),” Harmful Algae, vol. 9, no. 5, pp. 440–448, 2010.
[14] L. Bennett, “Algae, Cyanobacteria Blooms, and Climate Change,” Climate Institute, Tech. Rep., April 2017.
[15] Y. H. Cheung and M. H. Wong, “Properties of animal manures and sewage sludges and their utilisation for algal growth,” Agricultural Wastes, vol. 3, no. 2, pp. 109–122, 1981.
[16] G. Hallegraeff and C. Bolch, “Transport of diatom and dinoflagellate resting spores in ships’ ballast water: implications for plankton biogeography and aquaculture,” Journal of Plankton Research, vol. 14, no. 8, pp. 1067–1084, 1992.
[17] ——, “Transport of toxic dinoflagellate cysts via ships’ ballast water,” Marine Pollution Bulletin, vol. 22, no. 1, pp. 27–30, 1991.
[18] K. G. Sellner, G. J. Doucette, and G. J. Kirkpatrick, “Harmful algal blooms: causes, impacts and detection,” Journal of Industrial Microbiology and Biotechnology, vol. 30, no. 7, pp. 383–406, 2003.
[19] L. Peperzak, “Climate change and harmful algal blooms in the North Sea,” Acta Oecologica, vol. 24, no. 1, pp. 139–144, May 2003.
[20] H. B. Glasgow, J. M. Burkholder, R. E. Reed, A. J. Lewitus, and J. E. Kleinman, “Real-time remote monitoring of water quality: a review of current applications, and advancements in sensor, telemetry, and computing technologies,” Journal of Experimental Marine Biology and Ecology, vol. 300, no. 1–2, pp. 409–448, 2004.
[21] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[22] A. C. Kumar and S. M. Bhandarkar, “A deep learning paradigm for detection of harmful algal blooms,” in Applications of Computer Vision (WACV), 2017 IEEE Winter Conference on. IEEE, 2017, pp. 743–751.
[23] D. Blondeau-Patissier, J. F. Gower, A. G. Dekker, S. R. Phinn, and V. E. Brando, “A review of ocean color remote sensing methods and statistical techniques for the detection, mapping and analysis of phytoplankton blooms in coastal and open oceans,” Progress in Oceanography, vol. 123, pp. 123–144, 2014.
[24] K. F. Flynn and S. C. Chapra, “Remote sensing of submerged aquatic vegetation in a shallow non-turbid river using an unmanned aerial vehicle,” Remote Sensing, vol. 6, no. 12, pp. 12815–12836, 2014.
[25] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, “Microsoft COCO: Common objects in context,” in European Conference on Computer Vision. Springer, 2014, pp. 740–755.
[26] M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman, “The PASCAL visual object classes (VOC) challenge,” International Journal of Computer Vision, vol. 88, no. 2, pp. 303–338, 2010.
[27] S. J. Pan and Q. Yang, “A survey on transfer learning,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 10, pp. 1345–1359, 2010.
[28] H.-C. Shin, H. R. Roth, M. Gao, L. Lu, Z. Xu, I. Nogues, J. Yao, D. Mollura, and R. M. Summers, “Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning,” IEEE Transactions on Medical Imaging, vol. 35, no. 5, pp. 1285–1298, 2016.
[29] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: towards real-time object detection with region proposal networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137–1149, 2017.
[30] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 580–587.
[31] R. Girshick, “Fast R-CNN,” arXiv preprint arXiv:1504.08083, 2015.
[32] J. Dai, Y. Li, K. He, and J. Sun, “R-FCN: Object detection via region-based fully convolutional networks,” in Advances in Neural Information Processing Systems, 2016, pp. 379–387.
[33] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, “SSD: Single shot multibox detector,” in European Conference on Computer Vision. Springer, 2016, pp. 21–37.
[34] J. Huang, V. Rathod, D. Chow, C. Sun, M. Zhu, M. Tang, A. Korattikara, A. Fathi, I. Fischer, Z. Wojna, Y. Song, S. Guadarrama, J. Uijlings, V. Kovalevskyi, and K. Murphy, “Tensorflow object detection API,” https://github.com/tensorflow/models/blob/master/research/object_detection.
[35] “Large scale image classification,” https://mxnet.apache.org/tutorials/vision/large_scale_classification.html, accessed: 2018-02-21.
[36] S. Tang and Y. Yuan, “Object detection based on convolutional neural network.”