Ship Classification Based On Convolutional Neural Networks

Ships and Offshore Structures

ISSN: (Print) (Online) Journal homepage: [Link]/journals/tsos20

Ship classification based on convolutional neural networks

Yang Yang, Kaifa Ding & Zhuang Chen

To cite this article: Yang Yang, Kaifa Ding & Zhuang Chen (2022) Ship classification based
on convolutional neural networks, Ships and Offshore Structures, 17:12, 2715-2721, DOI:
10.1080/17445302.2021.2016271

Published online: 11 Jan 2022.

SHIPS AND OFFSHORE STRUCTURES
2022, VOL. 17, NO. 12, 2715–2721

Ship classification based on convolutional neural networks


Yang Yang, Kaifa Ding and Zhuang Chen
School of Naval Architecture & Ocean Engineering, Dalian University of Technology, Dalian, People’s Republic of China

ABSTRACT

The main bottleneck limiting the use of traditional ship classification methods is the manual extraction of features from ship images before classification. To solve this problem, a ship classification method based on a convolutional neural network (CNN) is proposed in this paper. A CNN model can autonomously extract image features, avoiding complex feature selection and extraction processes. Because the number of ship samples is insufficient, transfer learning was applied: the model was pre-trained using the ImageNet dataset, which effectively alleviated over-fitting during training. Experiments showed that the CNN model had an accuracy of 98% in ship classification using the SHIP-3 dataset. The CNN was robust to external environmental challenges such as illumination, with the accuracy of ship classification in foggy and night-time conditions reaching 75%, greatly exceeding the performance of traditional machine learning algorithms.

ARTICLE HISTORY
Received 21 July 2021
Accepted 6 December 2021

KEYWORDS
Ship classification; convolutional neural network (CNN); support-vector machine (SVM); transfer learning; environmental factors

CONTACT Yang Yang yyang@[Link]
© 2022 Informa UK Limited, trading as Taylor & Francis Group

Nomenclature

Symbol | Definition | Unit
n | Batch size parameter |
d | The number of neurons | –
l | The current layer | –
$w_{ij}^{(l-1)}$ | The connection parameter between the j-th unit of the l−1 layer and the i-th unit of the l layer | –
$x_{kj}^{(l-1)}$ | The input value of the j-th neuron of the k-th sample of the l−1 layer | –
$b_{i}^{(l-1)}$ | The offset of the i-th neuron of the l−1 layer | –
$z_{ki}^{(l)}$ | The output value of the i-th unit of the k-th sample of the l layer | –
TP | True positive | –
FN | False negative | –
FP | False positive | –
Accuracy | The ratio of the number of samples accurately predicted by the model to the total number of samples | %
Precision | The proportion of the samples judged positive by the classifier that are true positives | %
Recall | The proportion of positive cases correctly judged to the total number of positive cases | %
F1-score | Balanced F score | %
lr | Learning rate | –

1. Introduction

Along with the integration of the international market and the globalisation of economic activities, maritime traffic has become increasingly busy, and the number of ships entering and leaving port has increased considerably. The identification and classification of ships is conducive not only to maritime traffic management but also to the development of artificial intelligence in the field of ship and ocean engineering and the acceleration of intelligent port construction.

In the current ship classification field, a combination of feature extraction and traditional machine learning algorithms has been used. The representation and shape of the local target in an image can be described using the gradient or direction density of the edges. Based on these characteristics, the histogram of oriented gradients (HOG) features of a ship can be extracted, and a support-vector machine (SVM) classifier can be constructed based on these HOG features to complete the ship classification (Qi et al. 2015; Lin et al. 2018). To avoid the effects of wave ripples, the features in the minimum boundary rectangle (MBR) of the ship can be extracted and classified using SVM (Lang et al. 2016). The aforementioned studies used manually extracted features to represent ship images and combined traditional machine learning algorithms to classify the ships. The manually extracted features could visually reflect the ship image information to a certain extent. Moreover, extracting features from the image could effectively reduce the data dimension and the calculation overhead. However, when a ship image is obtained from a camera in a port or waterway, the background of the image may be complicated, making feature extraction difficult. For these reasons, traditional machine learning algorithms cannot achieve high accuracy in ship classification. In addition, it is necessary to manually extract features from the images before classification. The more valid the information contained in the features, the higher the classification accuracy, and feature extraction methods vary between fields of expertise. Compared with the raw image, the extracted features lose some useful information. Consequently, it can be difficult to achieve optimal results by manually extracting features.

In recent years, deep learning has become a research hotspot, with breakthroughs having been made in image recognition, natural language processing (Greff et al. 2017), and speech recognition (Graves et al. 2013). Deep learning models (Lecun et al. 2015) can learn the inherent rules of sample data independently and obtain high-level semantic features with more information and stronger characterisation by fusing the extracted underlying features. In the field of image classification, common deep learning algorithms include the AlexNet, VGG (Simonyan and Zisserman 2015), and ResNet (He et al. 2015) models. The AlexNet model has five convolutional layers and three fully connected layers. Techniques such as ReLU, dropout, and local response normalisation (LRN) were also applied in convolutional neural networks (CNNs) for the first time in this model. Compared with the AlexNet model, the VGG model uses several consecutive 3 × 3 convolution kernels to replace the larger convolution kernels in the AlexNet model, improving the

effect of the neural network for the same receptive field. The ResNet model identified the 'degradation' problem and introduced the 'shortcut connection' to address it, effectively alleviating the difficulty of training very deep neural networks. Algorithms in the field of object detection can be divided into two categories: the first is one-stage detection, represented by the YOLO series (Redmon et al. 2016) and SSD (Liu et al. 2016), which is characterised by high speed and low accuracy; the second is two-stage detection, represented by R-CNN (Girshick et al. 2014), which is characterised by high accuracy and low speed. In this study, we propose a ship classification method based on CNN, a deep learning algorithm. Moreover, transfer learning can be used in ship classification models to avoid the over-fitting caused by an insufficient number of images and to improve the accuracy of the model on small ship datasets. Considering the influence of environmental factors, ship classification under foggy and night-time conditions was studied. The feasibility and effectiveness of a ship classification method based on CNN were verified by comparison with traditional machine learning algorithms.

2. Convolutional neural network background

CNNs (Khan et al. 2020) are hierarchical models composed of one or more convolutional layers and pooling layers, followed by nonlinear activation function layers, with fully connected layers on top, as shown in Figure 1. Compared with traditional machine learning algorithms, CNNs use images as inputs, avoiding the complex feature extraction and data reconstruction used in traditional machine learning algorithms. The images are convolved by the CNNs. This convolution operation preserves the local spatial structure of the images, enabling CNNs to extract more valid information from them. Consequently, CNNs have clear advantages over traditional machine learning algorithms in image classification and target detection. Images are fed to the first layer, comprising a convolutional layer and a pooling layer, which applies a transformation and sends the processed images to the next layer; this process is repeated until the last layer produces the predicted values. The error between the predicted values and the true values is then calculated according to the task (regression or classification). The error is propagated backward by the backpropagation algorithm to update the parameters, and this process is repeated until the model converges.

However, a model that converges on the training set may not perform well on the test set. To improve the generalisation ability and to reduce over-fitting, LRN, dropout (Hinton et al. 2012; Srivastava et al. 2014), batch normalisation (Ioffe and Szegedy 2015), and other techniques can be used in CNNs. LRN is a method for improving accuracy during deep learning training. The LRN layer imitates the lateral suppression mechanism of the biological nervous system, creating a competitive mechanism among local neurons that suppresses those with weaker responses and improving the generalisation ability of the model. Dropout temporarily discards a portion of the neural network units, each with a certain probability, during the training of a deep learning network. Dropout can alleviate over-fitting effectively and achieves a regularisation effect to a certain extent. Batch normalisation normalises the input data of each layer during neural network training. The basic idea is as follows: for each hidden-layer neuron, the input distribution, which gradually drifts towards the saturation region of the nonlinear function, is forced back to a standard normal distribution with a mean of 0 and a variance of 1, so that the input value of the nonlinear transformation function falls into a region sensitive to the input, thereby avoiding the vanishing gradient problem.

2.1. Convolutional layer

Convolution is the core operation of a CNN. The parameters in the convolution kernel are obtained by random initialisation, usually uniform, Xavier (Glorot and Bengio 2010), or Gaussian, and are then updated by backpropagation. The convolution operation involves dot products between the parameters in the convolution kernel and the pixel values in an area of the same size in the input image. The result is then output at the corresponding position. The characteristics of the convolution operation are local connections and weight sharing. Figure 2 shows a convolution process with an input image of 5 × 5, a 3 × 3 convolution kernel, a step size of 1, and an offset of zero.

2.2. Pooling layer

After the convolution operation, each pixel in the output contains information about a portion of the area in the input image. Consequently, the output contains redundant information. To improve the computational efficiency, the output of the convolution operation needs to be pooled, which reduces the data dimension and the computational overhead. Figure 3 shows the maximum pooling operation with an input image of 6 × 6, a pooled area of 2 × 2, and a step size of 2.

2.3. Fully connected layer

The convolutional layer and pooling layer map the input image into the feature space, and the fully connected layer maps the features into the label space. In the fully connected layer, each neuron is connected to all neurons of the previous layer, and neurons in the same layer are not connected. The equation for a fully connected layer can be expressed as follows:

\[ z_{ki}^{(l)} = \sum_{j=1}^{d} w_{ij}^{(l-1)} x_{kj}^{(l-1)} + b_{i}^{(l-1)} \tag{1} \]

where d is the number of neurons in the l−1 layer; l represents the current layer; $w_{ij}^{(l-1)}$ is the connection parameter between the j-th unit of the l−1 layer and the i-th unit of the l layer; $x_{kj}^{(l-1)}$ is the input value of the j-th neuron of the k-th sample of the l−1 layer; $b_{i}^{(l-1)}$ is the offset of the i-th neuron of the l−1 layer; and $z_{ki}^{(l)}$ is the output value of the i-th unit of the k-th sample of the l layer.

3. Methodology

To compare the effects of CNNs and traditional machine learning algorithms in ship classification, typical models were selected from both fields. The AlexNet (Krizhevsky et al. 2017) model was the winner of the 2012 ImageNet Large Scale Visual Recognition Challenge (ILSVRC); it is regarded as the start of deep learning research and as a representative CNN model. The AlexNet model has five convolutional layers and three fully connected layers. The ReLU activation function (Nair and Hinton 2010) was used to replace the traditional sigmoid and tanh activation functions, and LRN was used to improve the generalisation ability of the model. In general, the more complex and deeper the model, the more finely the sample space can be divided into regions of different categories, and the better the classification performance of the model tends to be. To observe the influence of network depth on the ship classification task, a 19-layer VGG-19

Figure 1. The structure of CNNs.

Figure 2. Convolution operation.

model was selected. The ReLU activation function and the dropout method were used in the VGG-19 model. Moreover, as a classic machine learning algorithm, SVM has been widely used in ship classification; it constructs an optimal classification plane that not only separates the samples correctly but also maximises the classification margin.

Computing full gradient descent (GD) is computationally expensive. To improve the computational efficiency, CNNs are trained using stochastic gradient descent (SGD) and backpropagation. SGD divides the training samples into multiple mini-batches and feeds one mini-batch to the model at each training step; one pass over all the mini-batches is called an epoch. Considering the internal storage form of a computer, the number of samples contained in a mini-batch is usually a power of 2. If the number of samples in a mini-batch is too large, the calculation efficiency is low and GPU resources are heavily occupied. If the number of samples is too small, the gradient is easily affected by a single sample, and convergence can be slow.

The mini-batch size used in this study was 32. The learning rate affects the convergence of the model: a large learning rate tends to cause the model to diverge, whereas a small learning rate slows convergence. The learning rate was therefore set to 0.00001 in this study. For the model to be fully trained, the number of training epochs was chosen to be 100 (Bengio 2012).

To accelerate the training and improve the robustness of the model, transfer learning was adopted (Li et al. 2009; Pan and Yang 2010). Transfer learning trains new models by migrating some or all of the model parameters that have been trained on other datasets to new datasets. When the dataset types are similar, the robustness of a model using transfer learning tends to be better than that of a randomly initialised model. In CNNs, a lower layer extracts general features, such as the corners of an image, and a higher layer extracts specific features, such as the colour and shape of an image. The extraction of general features is less dependent on the dataset, so the parameters that extract the general features from the ImageNet dataset can be used for feature extraction in other datasets. The model was initialised using the weight parameters pre-trained on the ImageNet dataset, after which the model weight parameters were fine-tuned on the experimental dataset.

Figure 3. Pooling operation.

4. Experimental results

Based on the accumulation of laboratory data, a visible-light ship image classification dataset was constructed and named SHIP-3. The dataset contains images of three types of ships: bulk carriers, container ships, and cruise ships, each with 289 images (Figure 4). To ensure a balance of the three sample types, each type of ship image was considered separately when dividing the dataset. Twenty percent of the images were randomly selected from each ship class for testing, while the remaining 80% were used for model training and validation.

Figure 4. SHIP-3 dataset.

Data augmentation can effectively expand a ship dataset with few samples, reducing the possibility of over-fitting and improving the generalisation ability of the model. In this study, the ship image dataset was augmented (Wang et al. 2020) by random cropping and horizontal flipping. Figure 5 shows an original image, Figure 6 shows a randomly cropped version of it, and Figure 7 shows the horizontally flipped version of it. With data augmentation, 1010 images were obtained for each category, for a total of 3030 images. The distribution of the SHIP-3 dataset is presented in Table 1.
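The two augmentation operations named above, random cropping and horizontal flipping, can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the image is represented as a nested list of pixel values, and the function names, the crop size, and the seed are assumptions introduced here, since the paper does not state them.

```python
import random


def random_crop(image, crop_h, crop_w, rng):
    """Return a crop_h x crop_w window taken at a uniformly random position.

    `image` is a list of rows; each row is a list of pixel values.
    The crop size is a hypothetical parameter (not given in the paper).
    """
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop_h)    # randint bounds are inclusive
    left = rng.randint(0, width - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def horizontal_flip(image):
    """Mirror the image left to right by reversing every row."""
    return [row[::-1] for row in image]


# Toy 4 x 5 "image" whose pixel value encodes its (row, column) position.
rng = random.Random(0)  # fixed seed only to make the sketch repeatable
toy_image = [[10 * r + c for c in range(5)] for r in range(4)]
cropped = random_crop(toy_image, 2, 3, rng)
flipped = horizontal_flip(toy_image)
```

In practice each augmented copy keeps the class label of its source image, which is why the operations enlarge the training and validation sets without changing the class balance.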
Figure 5. Original image.
Figure 6. Random cropping.
Figure 7. Horizontal flipping.

To evaluate the generalisation ability of the proposed models, three evaluation metrics widely used for this type of task were chosen: accuracy, the F1-score, and the confusion matrix. Accuracy is defined as the ratio of the number of samples accurately predicted by the model to the total number of samples. The F1-score is defined by precision and recall, as follows:

\[ \text{Precision} = \frac{TP}{TP + FP} \tag{2} \]

\[ \text{Recall} = \frac{TP}{TP + FN} \tag{3} \]

\[ \text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4} \]

where a true positive (TP) is a test result indicating that a given condition exists when it really does exist. A false negative (FN) is a test result indicating that a condition does not hold when in fact it does; in other words, an existing condition was missed. A false positive (FP), commonly called a 'false alarm', is a test result indicating that a given condition exists when it does not. The confusion matrix is a model evaluation metric that can be



Table 1. Distribution of the SHIP-3 dataset.

Type of ship | Training set | Validation set | Test set
Bulk cargo ship | 810 | 200 | 58
Container | 810 | 200 | 58
Cruise | 810 | 200 | 58

Table 2. Results of different transfer learning methods in the SHIP-3 dataset.

Model | Ways | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%)
AlexNet | Random initialisation | 92.86 | 92.53 | 92.53 | 92.69
AlexNet | Fine-tuning fc layers | 96.87 | 97.11 | 96.87 | 96.87
AlexNet | Global fine-tuning | 98.49 | 98.47 | 98.47 | 98.48
VGG-19 | Random initialisation | 92.82 | 92.72 | 92.72 | 92.77
VGG-19 | Fine-tuning fc layers | 96.56 | 96.55 | 96.55 | 96.56
VGG-19 | Global fine-tuning | 98.86 | 98.85 | 98.85 | 98.85
Figure 9. Accuracy curve of VGG-19.

visualised and can directly reflect the classification performance of the model.

To observe the influence of transfer learning on the CNN, three different methods were adopted to initialise the neural network weights of the AlexNet and VGG-19 models:

(1) The weights pre-trained on the ImageNet dataset were loaded and frozen, and only the weights of the fully connected layers were fine-tuned using the samples from the SHIP-3 dataset. The weight parameters of the frozen part of the network were frozen by setting the lr of that part to 0.
(2) The weights pre-trained on the ImageNet dataset were loaded, and global fine-tuning was performed.
(3) The network weights were randomly initialised.

The results of training the model using the three different initialisation methods described above are shown in Table 2. It can be seen that during the entire training period, the accuracy of the CNN is much higher than that of purely random guessing (33.3%), the accuracy of the transfer learning methods is higher than that of the random initialisation method, and the accuracy obtained using global fine-tuning is the highest. This is because the lower layer in the CNN extracts primarily the edge information of the image.

We used transfer learning to pre-train the model with the ImageNet dataset, and then used the trained weight parameters with the SHIP-3 dataset directly. Consequently, the model could use the trained weight parameters to quickly extract the general information of the ship image in the lower layer. Moreover, the number of parameters in the CNN was large. It should also be noted that training the model with a small dataset leads to over-fitting, making the model less robust.

The accuracy curves versus the number of iterations using different initialisation methods are shown in Figures 8 and 9. The model combined with transfer learning converges within a small number of iterations. Compared with transfer learning, the model using random initialisation requires more epochs to train. The accuracy of the model using random initialisation increases with increasing iterations, finally converging to 0.9 in approximately 60 epochs. Meanwhile, more epochs lead to an increase in computational overhead and a waste of computing resources.

Figure 8. Accuracy curve of AlexNet.

Compared with the ImageNet dataset, the SHIP-3 dataset has fewer samples. If random initialisation is adopted directly, the parameters of the model cannot be completely fitted, making the model's generalisation ability inadequate. Consequently, for datasets that are not large, transfer learning techniques can solve the problem of few samples and many parameters, improving the robustness of the model.

4.1. Result with SHIP-3 dataset

Transfer learning can improve the robustness of the model and shorten calculation times. Consequently, both the AlexNet and VGG-19 models adopted in this section were combined with transfer learning to initialise the weight parameters. At the same time, the ResNet-50, with its strong classification ability, was selected for comparison and was also combined with transfer learning. The classification results of the three models with the SHIP-3 dataset are presented in Table 3.

Table 3. Results of different models in the SHIP-3 dataset.

Model | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%)
AlexNet | 98.49 | 98.47 | 98.47 | 98.48
VGG-19 | 98.86 | 98.85 | 98.85 | 98.85
ResNet | 99.44 | 99.43 | 99.43 | 99.43
SVM | 81.81 | 82.10 | 81.80 | 81.88

The CNN proposed in this paper achieved higher accuracy with the SHIP-3 dataset, and the model was also more robust. This is because traditional machine learning algorithms require the manual extraction of features from the ship image before classification. The extracted features are passed into the model as inputs and classified by the classifier, the accuracy of the model depending largely on the

features extracted manually. Manual feature extraction is not required in a CNN: the features used for ship classification are extracted from the ship images automatically by the convolution kernels in the convolutional layers. The backpropagation algorithm was used to update the convolution kernel parameters to optimise feature extraction. Consequently, by automatically learning the features in an image, CNNs exhibit a higher accuracy in ship image classification than traditional machine learning algorithms.

In the case of transfer learning, the ResNet model with a 50-layer structure is more accurate on the ship classification dataset than the VGG-19 model with a 19-layer structure, while the accuracy of VGG-19 is slightly higher than that of the AlexNet model with an 8-layer structure. It can be seen that a deeper network facilitates the extraction and abstraction of features while improving the nonlinear capabilities of the model.

Because a confusion matrix can reflect the results visually, the results of the SVM and AlexNet models are given in the form of a confusion matrix. The confusion matrix obtained by the SVM using the SHIP-3 dataset is shown in Figure 10. Most ship samples are classified correctly, with just a few being misclassified. Bulk cargo ships and container ships are easily confused with each other under the SVM model. The confusion matrix obtained using the AlexNet model is shown in Figure 11. The classification accuracy of the AlexNet model is higher than that of the SVM. Ship samples are classified correctly, and the model has higher precision and recall. In the AlexNet model, bulk cargo ships and container ships are easily confused. This is because the similarity between these two types of ships is higher. Because of the inadequate number of samples, the model cannot fully learn the features of the two types of ships. It is known from statistics that the more samples are available, the closer the sample information is to the overall information. Therefore, increasing the number of training samples of these two types of ships could reduce the occurrence of this phenomenon.

Figure 10. Confusion matrix of SVM model.
Figure 11. Confusion matrix of AlexNet model.

4.2. Classification results in the case of fog and nighttime

Changes in lighting and weather conditions can make ship images harder to recognise, as shown in Figure 12. To examine the classification effect of the CNN with ship images taken in complex environments, the CNN was used to classify the ship images above, and its results were compared with the SVM classification results in this section.

The SVM and CNN models were used to classify the ship images under complex weather conditions. The accuracy curves are shown in Figure 13. The details of the ship image were lost because of illumination and other factors. The difficulty of ship classification increased, which decreased the classification accuracy of the SVM and CNN models with the dataset. Owing to the decline in image clarity, the difficulty of manually extracting features was greatly increased. The extracted features did not completely cover the

Figure 12. Dataset for different weather conditions.



information of the ship's image, so the accuracy of the SVM model was only about 0.5. The CNN model proposed in this paper still achieved an accuracy of approximately 65% to 75% with the dataset above, the accuracy of the VGG-19 model being close to 0.75. The CNN model automatically extracts the feature details and passes them on in the form of multilayer feature maps. After the loss function is obtained, the weight parameters are updated using the backpropagation algorithm. Consequently, compared with the SVM model, the CNN could better learn the features from the image.

Figure 13. The curve of the accuracy versus the number of iterations using different models.

5. Conclusions

In this paper, we propose a ship classification method based on CNN. Based on the theory and framework of the proposed method, we classified a variety of ships and considered the impact of complex marine environmental conditions on ship image classification. The feasibility and effectiveness of ship classification based on CNNs were verified by comparison with traditional machine learning algorithms. The main conclusions that could be drawn are as follows:

(1) A CNN method combined with transfer learning was proposed to classify ship images. It can be seen from the experimental data that the classification accuracy of the model combined with transfer learning was higher than that of the random initialisation model, and the calculation overhead was significantly reduced. Consequently, transfer learning on a small dataset can improve the robustness of the ship classification model.
(2) The experiment showed the accuracy of the CNN model in ship classification to be better than that of the SVM model. The SVM model requires features to be extracted manually from the ship images before classification, and the quality of the extracted features significantly affects the classification accuracy. The CNN model updates the feature parameters through a backpropagation algorithm, which better extracts the features. The confusion matrix showed that bulk cargo ships and container ships were easily misclassified owing to the insufficient number of samples and incomplete sample information. This phenomenon could be reduced by adding training samples.
(3) In bad weather conditions such as foggy days, the identification of ships was affected because the ship images lost many details. The classification accuracy of the CNN and SVM models for ship images dropped significantly. However, the classification accuracy of the CNN model for the same ship images was significantly higher than that of the SVM model, and the CNN exhibited better robustness.

Disclosure statement

We declare that we have no financial and personal relationships with other people or organizations that can inappropriately influence our work; there is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled. The authors report no conflicts of interest. The authors alone are responsible for the content and writing of this article.

Funding

This research was financially supported by the National Science Foundation of China (grant number 51979036).

ORCID

Yang Yang [Link]

References

Bengio Y. 2012. Practical recommendations for gradient-based training of deep architectures. Lect Notes Comput Sci. 7700(1-3):437–478.
Girshick R, Donahue J, Darrell T, Malik J. 2014. Rich feature hierarchies for accurate object detection and semantic segmentation. arXiv preprint arXiv:1311.2524.
Glorot X, Bengio Y. 2010. Understanding the difficulty of training deep feedforward neural networks. J Mach Learn Res. 9:249–256.
Graves A, Mohamed AR, Hinton G. 2013. Speech recognition with deep recurrent neural networks. 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. 6645–6649.
Greff K, Srivastava RK, Koutník J, Steunebrink BR, Schmidhuber J. 2017. LSTM: a search space odyssey. IEEE Trans Neural Networks Learn Syst. 28(10):2222–2232.
He KM, Zhang XY, Ren SQ, Sun J. 2015. Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385.
Hinton G, Deng L, Yu D, Dahl GE, Mohamed AR, Jaitly N, Senior A, Vanhoucke V, Nguyen P, Sainath TH, Kingsbury B. 2012. Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process Mag. 29(6):82–97.
Ioffe S, Szegedy C. 2015. Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167.
Khan A, Sohail A, Zahoora U, Qureshi AQ. 2020. A survey of the recent architectures of deep convolutional neural networks. arXiv preprint arXiv:1901.06032v7.
Krizhevsky A, Sutskever I, Hinton GE. 2017. ImageNet classification with deep convolutional neural networks. Adv Neural Inf Process Syst. 60(6):1106–1114.
Lang HT, Zhang J, Zhang X, Meng JM. 2016. Ship classification in SAR image by joint feature and classifier selection. IEEE Geosci Remote Sens Lett. 13(2):212–216.
Lecun Y, Bengio Y, Hinton G. 2015. Deep learning. Nature. 521:436–444.
Li B, Yang Q, Xue X. 2009. Transfer learning for collaborative filtering via a rating-matrix generative model. Proceedings of the 26th International Conference on Machine Learning, ICML 2009. June, Montreal, Quebec, Canada, 617–624.
Lin H, Song S, Yang J. 2018. Ship classification based on MSHOG feature and task-driven dictionary learning with structured incoherent constraints in SAR images. Remote Sens (Basel). 10(2):190.
Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, Berg AC. 2016. SSD: single shot MultiBox detector. Lect Notes Comput Sci. 9905:21–37.
Nair V, Hinton GE. 2010. Rectified linear units improve restricted Boltzmann machines. International Conference on Machine Learning. June 21–24, Haifa, Israel.
Pan SJ, Yang Q. 2010. A survey on transfer learning. IEEE Trans Knowl Data Eng. 22(10):1345–1359.
Qi S, Ma J, Lin J, Li YS, Tian JW. 2015. Unsupervised ship detection based on saliency and S-HOG descriptor from optical satellite images. IEEE Geosci Remote Sens Lett. 12(7):1451–1455.
Redmon J, Divvala S, Girshick R, Farhadi A. 2016. You only look once: unified, real-time object detection. arXiv preprint arXiv:1506.02640v5.
Simonyan K, Zisserman A. 2015. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556v6.
Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R. 2014. Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res. 15(1):1929–1958.
Wang YQ, Yao QM, Kwok J, Ni LM. 2020. Generalizing from a few examples: a survey on few-shot learning. arXiv preprint arXiv:1904.05046v3.
