sensors

Letter
Automatic Recognition of Sucker-Rod Pumping
System Working Conditions Using Dynamometer
Cards with Transfer Learning and SVM
Haibo Cheng 1,2,3,4 , Haibin Yu 1,2,3, *, Peng Zeng 1,2,3 , Evgeny Osipov 5 , Shichao Li 1,2,3
and Valeriy Vyatkin 5,6
1 State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences,
Shenyang 110016, China; chenghaibo@sia.cn (H.C.); zp@sia.cn (P.Z.); lishichao@sia.cn (S.L.)
2 Key Laboratory of Networked Control Systems, Chinese Academy of Sciences, Shenyang 110016, China
3 Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China
4 University of Chinese Academy of Sciences, Beijing 100049, China
5 Department of Computer Science, Electrical and Space Engineering, Luleå University of Technology,
97187 Luleå, Sweden; evgeny.osipov@ltu.se (E.O.); valeriy.vyatkin@ltu.se (V.V.)
6 Department of Electrical Engineering and Automation, Aalto University, 02150 Espoo, Finland
* Correspondence: yhb@sia.cn

Received: 17 August 2020; Accepted: 29 September 2020; Published: 3 October 2020

Abstract: Sucker-rod pumping systems are the most widely applied artificial lift equipment in the
oil and gas industry. Accurate and intelligent recognition of the working conditions of pumping systems
has a major impact on oilfield production benefits and efficiency. The shape of the dynamometer
card reflects the working conditions of sucker-rod pumping systems, and different conditions can
be indicated by their typical card characteristics. In traditional identification methods, however,
features are manually extracted based on specialist experience and domain knowledge. In this
paper, an automatic fault diagnosis method is proposed to recognize the working conditions of
sucker-rod pumping systems with massive dynamometer card data collected by sensors. Firstly,
AlexNet-based transfer learning is adopted to automatically extract representative features from
various dynamometer cards. Secondly, with the extracted features, an error-correcting output codes
(ECOC) model-based SVM is designed to identify the working conditions and improve the fault diagnosis
accuracy and efficiency. The proposed AlexNet-SVM algorithm is validated against a real dataset
from an oilfield. The results reveal that the proposed method reduces the need for human labor and
improves the recognition accuracy.

Keywords: working condition recognition; sucker-rod pumping system; dynamometer card;
convolutional neural network; transfer learning; support vector machine

1. Introduction
The significance of oil as a global energy source is difficult to overemphasize. The increased
demand for oil is forcing petroleum engineers to determine new strategies to increase oil production.
To exploit oil efficiently and rapidly from underground reservoirs, researchers have studied artificial
lift methods. The sucker-rod pumping system, as the most widely applied type of artificial lift method
for production wells, has been commonly adopted in the oil and gas industry.
The sucker-rod pumping system is composed of several components, some of which operate
aboveground, while other parts operate underground, i.e., down the well. The basic and primary
elements include the prime mover, pumping unit, sucker rod, and downhole pump. Figure 1 shows
a detailed schematic of a typical production unit with the major components of the sucker-rod
pumping system.

Sensors 2020, 20, 5659; doi:10.3390/s20195659 www.mdpi.com/journal/sensors

Figure 1. Schematic of a typical production unit with the major components of the sucker-rod pumping
system. Dynamometer is installed on the rod of the pumping unit, and used to measure load on the
polished rod and plot the load in relation to the rod displacement as the pumping unit moves through
a stroke cycle.

Thousands of such production units are distributed across an oilfield and operate there for decades.
Therefore, it is difficult to monitor and manage these production units. Moreover, many unexpected
exceptions may occur in the sucker-rod pumping system during long-term operation due to the
complexity of the production environment. The issue is how to automatically recognize and diagnose
the system working conditions in real time. Currently, this work is mainly performed by manually
analyzing the shape of dynamometer cards.
Dynamometer card diagnosis is the most widely adopted method to evaluate the working conditions of
the sucker-rod pumping system. The dynamometer card is a closed curve that represents the
relationship between the displacement and load of the sucker-rod pump. The card shape generally
reflects the condition of the pumping well, and different conditions can be indicated by their typical
characteristics on the cards [1–3]. However, the manual recognition and diagnosis of working
conditions is a time-consuming and labor-intensive task, and this process usually requires specialized
domain knowledge. Moreover, the massive data collected by sensors, such as dynamometer cards,
temperature, and pressure data, are more difficult for domain experts to process. The accurate and
efficient identification and diagnosis of the working conditions of sucker-rod pumping wells by
utilizing oilfield big data is a new challenge for researchers.
With the rapid development of smart manufacturing and artificial intelligence technologies,
several articles have attempted to review and summarize the intelligent methods for the diagnosis of
machine abnormalities and faults. Jiao et al. [4] reviewed convolutional neural network based
machine fault diagnosis approaches. A typical convolutional network based fault diagnosis framework
was proposed. It is composed of data collection, model construction, and feature learning and
decision making. Gao et al. [5,6] summarized four different fault diagnosis methods, which are
model-based, signal-based, knowledge-based and hybrid/active approaches, respectively. These papers
also introduced the advantages and constraints of each technique. In [7], fault diagnosis and
remaining useful life estimation were presented to deal with these problems. Liu et al. [8] presented
a comprehensive review of artificial intelligence algorithms in rotating machinery fault diagnosis,
from both the views of theory background and industrial applications. Zhao et al. [9] summarized four
kinds of deep learning methods and their applications to machine health monitoring. Lei et al. [10]
presented a review of condition monitoring and fault diagnosis of planetary gearboxes. Hoang et al. [11]
presented a survey on deep learning based bearing fault diagnosis, in which three deep learning
methods and their applications were summarized.
In this paper, a dynamometer card based automatic fault diagnosis method is proposed to classify
the working conditions of sucker-rod pumping systems. Two steps are involved in this method. Firstly,
AlexNet-based transfer learning is applied to automatically extract representative features from various
dynamometer cards. Secondly, with these extracted features, an error-correcting output codes (ECOC)
model-based SVM is designed to identify the working conditions and improve the fault diagnosis
accuracy and efficiency.
This paper is organized as follows: Section 2 introduces the related works on dynamometer
card-based intelligent fault diagnosis methods for sucker-rod pumping systems. The fault diagnosis
problem is stated in Section 3. Section 4 details the proposed hybrid AlexNet-SVM method. In Section 5,
the diagnosis results obtained with the proposed method are discussed. Finally, conclusions and future
works are provided in Section 6.

2. Related Works
Dynamometer card-based intelligent fault diagnosis of sucker-rod pumping systems is a new
and challenging topic in the oil and gas industry. Over the last few decades, many advanced
diagnosis methods have been proposed to analyze the downhole working conditions of production
wells. Generally, feature extraction and fault classification are two major processes for the intelligent
identification and diagnosis of sucker-rod pumping systems, as shown in Figure 2. According to the
feature extraction method, the fault diagnosis problem can be divided into two categories: machine
learning-based methods with manual feature extraction and deep learning-based methods with
automatic feature extraction.

Figure 2. The process of intelligent fault diagnosis for sucker-rod pumping systems.

2.1. Machine Learning-Based Fault Diagnosis Methods with Manual Feature Extraction
Expert systems have been applied to diagnose the working conditions of sucker-rod pumping
systems [12–15], and pattern recognition was first introduced into the fault diagnosis approach of
dynamometer cards [15]. Expert system and pattern recognition could effectively reduce expertise
required for fault diagnosis of pumping wells with considerable savings in labor hours. However,
the accuracy of expert systems greatly depends on the domain knowledge and experience of experts.
With the rapid development of artificial intelligence techniques, neural networks have been widely
applied in the intelligent fault recognition of dynamometer cards [16,17]. The application of neural
networks in pattern recognition requires a large number of training samples to ensure the accuracy of
the recognition effect. In oilfields, the pumping system needs to be immediately shut down under
certain conditions, especially serious faults, such as sucker-rod breakage. Therefore, the number of
samples corresponding to these cases is limited, and it is difficult to meet the training requirements of
neural networks.
SVM, as a powerful and flexible supervised machine learning algorithm, has been adopted to
identify the working conditions of sucker-rod pumping wells [2,18,19], and it can effectively solve the
problem of constructing high-dimensional data models with limited samples. However, it is difficult to
use traditional SVMs to resolve multiclass classification problems. Moreover, the representative features
of the dynamometer cards must be extracted before fault classification. The feature extraction process
is primarily based on mechanism analysis, prior knowledge, and expert experience, which means the
accuracy is greatly influenced by external factors, and cannot be strictly guaranteed.
During manual feature extraction, high-dimensional dynamometer card data are transformed into
a series of low-dimensional features by utilizing the above machine learning methods. Some valid and
useful information will inevitably be lost. Therefore, certain working conditions cannot be accurately
or correctly identified. In addition, this process largely depends on diagnostic expertise and prior
knowledge, and it is time-consuming and labor-intensive [20]. The challenge is how to extract the
features automatically and identify the faults accurately for sucker-rod pumping systems in oilfields.

2.2. Deep Learning-Based Fault Diagnosis Methods with Automatic Feature Extraction
To solve the problem of information loss caused by manual feature extraction, deep learning-based
fault diagnosis methods with automatic feature extraction have been proposed over the past few
years. Deep learning-based methods automatically extract image features from dynamometer cards by
utilizing advanced deep neural networks. Convolutional neural networks (CNNs), as artificial deep
learning neural networks, have been applied for fault diagnosis of sucker-rod pumping systems in
recent years.
Zhao et al. [1] proposed data-based CNN and image-based CNN methods for fault diagnosis of rod
pumping system, and compared the proposed methods with traditional machine learning algorithms.
The results demonstrated that CNN-based approach is superior to the conventional approaches without
any need of manual feature extraction that requires domain expertise. In [21], the potential of using
artificial neural networks in well fault diagnosis was reviewed and analyzed. VGG16, ResNet34,
and ResNeXt50 were used to recognize beam pump conditions based on a pump card shape. In [22],
a fourteen-layer CNN diagnosis model was proposed to recognize working conditions of sucker rod
pumping wells based on big data deep learning. In [23], Peng applied artificial intelligence in sucker rod
pumping wells. A deep neural network-based method was used to realize intelligent dynamometer card
generation, diagnosis, and failure detection. In this method, CNN and autoencoders were adopted to
get the feature representation of the dynamometer cards and classify the working conditions of sucker
rod pumping wells. The accuracy and efficiency have greatly been improved by these approaches.
However, the conducted studies using deep learning have mostly relied on neural networks trained
from scratch, which generally requires numerous epochs or iterations for a deep neural network to
converge [21]. Some methods need a considerable amount of time to train the model and classify the pattern.
Therefore, they cannot meet the real-time requirement of industrial application.

3. Problem Statement
The horsehead equipment, sucker rod and downhole pump are the major and closely related
components involved in the production process of oil and gas, as shown in Figure 1. With the rod
moving up and down, the traveling valve (TV) attached to the rod and the standing valve (SV) at the
bottom of the pump open and close periodically, which drives the oil and gas to the surface from the
underground reservoir [23]. The journey of the rod traveling from the upper dead point to the lower
dead point and back up again is called a stroke. The dynamometer card displays the load on the sucker
rod over a stroke. This is the major approach to evaluate and diagnose the working conditions of
sucker-rod pumping systems. The shape of the card indicates the working conditions and performance
of the pump. Figure 3 shows a theoretical card under a static load.

Figure 3. Theoretical dynamometer card under static load [23].
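As an illustration of how one stroke of sampled data could be turned into a card image, the following minimal sketch draws the closed displacement/load curve onto a small binary grid. The function name `rasterize_card`, the grid size, and the four-point example card are assumptions made for illustration; the preprocessing actually used to produce the card images is not specified in this part of the text.

```python
def rasterize_card(displacement, load, size=32):
    """Draw one stroke (a closed displacement/load curve) on a size x size binary grid."""
    if len(displacement) != len(load) or len(displacement) < 2:
        raise ValueError("need two equally long sequences with at least 2 samples")

    def normalize(values):
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0  # avoid division by zero for a flat signal
        return [(v - lo) / span for v in values]

    xs = normalize(displacement)  # horizontal axis: displacement
    ys = normalize(load)          # vertical axis: load
    image = [[0] * size for _ in range(size)]
    n = len(xs)
    steps = 2 * size  # dense enough that no pixel along a segment is skipped
    for i in range(n):
        # Connect sample i to the next one; the last sample connects back
        # to the first, so the drawn curve is closed.
        x0, y0 = xs[i], ys[i]
        x1, y1 = xs[(i + 1) % n], ys[(i + 1) % n]
        for s in range(steps + 1):
            t = s / steps
            col = round((x0 + t * (x1 - x0)) * (size - 1))
            row = round((1.0 - (y0 + t * (y1 - y0))) * (size - 1))  # row 0 = highest load
            image[row][col] = 1
    return image


# Hypothetical four-point card, for illustration only (not measured data).
card = rasterize_card([0.0, 0.5, 3.0, 2.5], [20.0, 60.0, 60.0, 20.0], size=16)
```

Normalizing both axes makes the image depend only on the card shape, which is the property the diagnosis relies on; whether absolute load levels should also be preserved is a design choice left open here.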

In this paper, the 7 most commonly known fault conditions are investigated. Therefore, there are
8 classification categories in this research, in which the normal operation condition (NOC) is regarded
as a separate class. These categories are NOC, downstroke pump bumping (DPB), upstroke pump
bumping (UPB), combination of leaking standing and traveling valves (CST), gas interference (GIF),
insufficient liquid supply (ILS), sand production (SAP), and abnormal dynamometer card (ADC).
The ADC category indicates all the abnormal shapes of the dynamometer card caused by the data
losses and errors in the sensor sampling and transmission processes.
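For reference, the eight categories can be written down as a label set. The sketch below is only a bookkeeping aid: the abbreviations come from the text, while the integer codes and the helper names are arbitrary assumptions.

```python
from enum import Enum


class WorkingCondition(Enum):
    """The eight categories named in the text; the integer codes are arbitrary."""
    NOC = 0  # normal operation condition
    DPB = 1  # downstroke pump bumping
    UPB = 2  # upstroke pump bumping
    CST = 3  # combination of leaking standing and traveling valves
    GIF = 4  # gas interference
    ILS = 5  # insufficient liquid supply
    SAP = 6  # sand production
    ADC = 7  # abnormal dynamometer card (data loss or transmission errors)


# The seven fault conditions are all categories except normal operation.
FAULT_CONDITIONS = [c for c in WorkingCondition if c is not WorkingCondition.NOC]


def is_fault(condition):
    """Return True for every category other than normal operation."""
    return condition is not WorkingCondition.NOC
```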
Figure 4 shows the typical working conditions of sucker-rod pumping systems, where the
horizontal axis denotes the displacement, and the vertical axis denotes the load. As shown in this
figure, each of these categories exhibits specific features reflected in the dynamometer card shape.
Figure 4h shows one of the ADC possibilities.

Figure 4. Typical working conditions of the sucker-rod pumping system, where the horizontal axis
denotes the displacement, and the vertical axis denotes the load: (a) normal operation condition,
(b) downstroke pump bumping, (c) upstroke pump bumping, (d) combination of leaking standing and
traveling valves, (e) insufficient liquid supply, (f) gas interference, (g) sand production, (h) abnormal
dynamometer card.

4. Methodology
In this paper, a hybrid AlexNet-SVM-based model is proposed to diagnose the various fault
categories of sucker-rod pumping systems. AlexNet-based transfer learning is applied to automatically
extract useful and representative features from dynamometer cards. With the extracted features,
an ECOC model-based SVM is designed to classify the working conditions of the pumping system and
improve the pattern recognition efficiency.
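To make the second stage more concrete, the sketch below shows how an ECOC scheme can reduce an eight-class problem to a set of two-class problems and then recover a class from the binary outputs. The one-vs-one coding design and the disagreement-count decoding are assumptions chosen for illustration; the text up to this point does not state which coding design or decoding scheme is used, and the binary learner outputs are simulated rather than produced by trained SVMs.

```python
from itertools import combinations


def one_vs_one_coding(num_classes):
    """Build a coding matrix M[class][learner] with entries +1, -1, or 0 (class unused)."""
    pairs = list(combinations(range(num_classes), 2))
    matrix = [[0] * len(pairs) for _ in range(num_classes)]
    for learner, (pos, neg) in enumerate(pairs):
        matrix[pos][learner] = 1   # positive class for this binary learner
        matrix[neg][learner] = -1  # negative class for this binary learner
    return matrix, pairs


def decode(matrix, binary_outputs):
    """Pick the class whose code word disagrees least with the binary outputs (+1/-1)."""
    def disagreements(code_word):
        return sum(1 for m, out in zip(code_word, binary_outputs) if m != 0 and m != out)
    return min(range(len(matrix)), key=lambda c: disagreements(matrix[c]))


# Eight working-condition classes give 8 * 7 / 2 = 28 binary learners.
matrix, pairs = one_vs_one_coding(8)

# Simulated learner outputs for a sample of class 4: learners that involve
# class 4 answer correctly; all other learners answer +1 arbitrarily.
true_class = 4
outputs = [-1 if neg == true_class else 1 for pos, neg in pairs]
predicted = decode(matrix, outputs)
```

In a full pipeline, each entry of `outputs` would come from one binary SVM evaluated on the feature vector extracted by the network.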

4.1. Convolutional Neural Networks and Transfer Learning
CNNs are typical feedforward neural networks with convolutional computations and deep
structures. CNNs, as some of the most representative deep learning models, have been widely
applied in many fields, with numerous related applications, including image classification [24–28],
natural language processing [29,30], face recognition [31,32], video analysis [33,34], and pedestrian
detection [35,36].
A typical CNN architecture is shown in Figure 5. It consists of an input layer, an output layer,
and multiple hidden layers. The hidden layers are composed of a series of convolutional layers (Conv),
pooling layers, and fully connected layers (FC). The Conv layer is the key functional block of a CNN,
which convolves a filter matrix with values from a receptive field of neurons [37] and finally extracts
representative features from input data.
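The operation performed by a Conv layer can be illustrated with a minimal sketch in plain Python. The function below slides a small filter over an image without padding and sums the element-wise products in each receptive field (the unflipped form commonly used in CNNs). The toy image and the vertical-edge filter are assumptions for illustration and are not taken from the network used in this paper.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0.0
            # The receptive field is the kh x kw window whose top-left corner is (r, c).
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# Toy 4 x 4 image with a vertical edge between the second and third columns.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-written filter that responds to dark-to-bright transitions from left to right.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
feature_map = conv2d_valid(image, vertical_edge)
```

The resulting feature map is nonzero only in the column where the window straddles the edge, which is the sense in which a filter extracts a feature. In a trained CNN the filter values are learned rather than written by hand.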

Figure 5. Typical CNN architecture.

Transfer learning imitates the human visual system by taking full advantage of prior knowledge
in different but related domains when executing new tasks in a given domain and resolves relevant
cross-domain learning problems [38]. In transfer learning, representative information is extracted
from data in the related domain by a pretrained model, and the pretrained model then transfers the
useful information for reuse on a new target problem. Generally, transfer learning provides three
kinds of benefits for performance improvements [38–40], including (1) a higher start with an improved
performance at the initial points; (2) a higher slope with a faster performance growth; and (3) a higher
asymptote, producing a better final performance. Deep learning generally requires a large amount of
data to train deep neural networks and learn the knowledge [41]. However, transfer learning trains
networks with comparatively little data because of the pretrained model. This is very significant
since most real-world problems and tasks typically do not have millions of labeled data to train such
complex models.

4.2. Support Vector Machine


The SVM is a successful and important supervised machine learning method used to address
classification problems on the basis of the principles of empirical risk minimization and structural risk
minimization [42]. SVMs have been widely applied in many areas [43,44]. The basic idea of the SVM is
to seek the optimal hyperplane in the feature or sample space under the maximum margin principle.
The SVM was initially proposed to solve two-class problems. Given a sample dataset including
$N$ points $\{x_k, y_k\}_{k=1}^{N}$, where $x_k \in \mathbb{R}^n$ are the input data, and $y_k \in \{\pm 1\}$ is the target output, the optimal
hyperplane is defined as:

$$\omega^T x_k + b = 0 \tag{1}$$

where $\omega$ is the weight vector and $b$ is the bias term. The parameters $\omega$ and $b$ are determined as:

$$y_k\left(\omega^T x_k + b\right) \ge 1 - \xi_k \tag{2}$$

where $\xi_k$ is the slack variable, and $\xi_k \ge 0$. The objective function to be minimized is:

$$\Phi(\omega, \xi) = \frac{1}{2}\|\omega\|^2 + C\sum_{k=1}^{N}\xi_k \tag{3}$$

where $C$ is a penalty coefficient, and $C \ge 0$. Considering the Lagrangian multiplier method, the solution
to the optimal hyperplane can be determined as:

$$Q(\lambda) = \sum_{i=1}^{N}\lambda_i - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N}\lambda_i \lambda_j y_i y_j K(x_i, x_j) \tag{4}$$

where $\lambda_i$ is the Lagrange multiplier and $K(x_i, x_j)$ is the kernel function. Kernel functions are designed to
solve the inner product operation in high-dimensional space, thus addressing the problem of nonlinear
classification. These kernel functions can be of different types, such as linear, polynomial, Sigmoid,
hyperbolic tangent and radial basis function.
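The quantities in Equations (2)–(4) can be evaluated directly on a toy dataset. The sketch below is a numerical illustration only, not a solver: the data points, the weight vector, the multipliers, and the value of `gamma` are all assumptions chosen by hand, and the slack is taken as the smallest value that satisfies Equation (2).

```python
import math


def linear_kernel(x, z):
    """Plain inner product."""
    return sum(a * b for a, b in zip(x, z))


def rbf_kernel(x, z, gamma=0.5):
    """Radial basis function kernel; gamma is an illustrative default."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def primal_objective(w, b, xs, ys, C):
    """Phi(w, xi) of Eq. (3), using the smallest slack allowed by Eq. (2)."""
    slacks = [max(0.0, 1.0 - y * (linear_kernel(w, x) + b)) for x, y in zip(xs, ys)]
    return 0.5 * linear_kernel(w, w) + C * sum(slacks), slacks


def dual_objective(lambdas, xs, ys, kernel):
    """Q(lambda) of Eq. (4) for a given kernel function."""
    n = len(xs)
    pair_sum = sum(
        lambdas[i] * lambdas[j] * ys[i] * ys[j] * kernel(xs[i], xs[j])
        for i in range(n)
        for j in range(n)
    )
    return sum(lambdas) - 0.5 * pair_sum


# Two separable points: both margins equal 1, so no slack is needed.
xs, ys = [(1.0, 0.0), (-1.0, 0.0)], [1, -1]
phi, slacks = primal_objective((1.0, 0.0), 0.0, xs, ys, C=1.0)
q = dual_objective([0.5, 0.5], xs, ys, linear_kernel)

# Adding a point on the wrong side of the margin forces a positive slack.
phi_noisy, slacks_noisy = primal_objective(
    (1.0, 0.0), 0.0, xs + [(0.2, 0.0)], ys + [-1], C=1.0
)
```

For the two separable points, the hand-picked weight vector and multipliers give the same value for `phi` and `q`, as expected when both are optimal. In the second call the third point contributes a slack of 1.2, which the penalty coefficient `C` weights in the objective.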

4.3. Proposed AlexNet-SVM Method


In this study, an AlexNet-based CNN network is proposed to automatically extract the
representative features from various dynamometer cards. AlexNet, as a well-known and successful
deep CNN for image classification, attained the highest accuracy during the ImageNet Large Scale
Visual Recognition Challenge in 2012 [24]. It is mainly composed of eight layers, including five Conv
layers and three FC layers, as depicted in Figure 6.
Considering training time with gradient descent, the saturating nonlinearities are much slower
than the non-saturating nonlinearity f (x) = max (0, x) [24]. In AlexNet, therefore, the Rectified Linear
Units layer (ReLU) layer is adopted as the activation function layer after every main layer, except the last
FC layer, to improve the training time and learning performance of the neural network. Normalization
layers follow the first two Conv layers. Max-pooling layers follow the normalization layers as well
as the fifth Conv layer. They are adopted to downsample and reduce the size of the neural network.
To reduce the overfitting degree in the FC layers, dropout layers are designed after the first two FC
layers. After the last FC layer, a Softmax layer is applied to produce the distribution over the input data.

Figure 6. The main layers of AlexNet.
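The contrast between saturating and non-saturating nonlinearities, and the role of the Softmax layer, can be illustrated with a small sketch. It is written in plain Python under the assumption that tanh stands in for a saturating activation. The function names are chosen for this example only.

```python
import math


def relu(x):
    """Non-saturating nonlinearity f(x) = max(0, x)."""
    return max(0.0, x)


def relu_grad(x):
    """Gradient of ReLU: constant 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def tanh_grad(x):
    """Gradient of tanh, which shrinks toward 0 as |x| grows (saturation)."""
    return 1.0 - math.tanh(x) ** 2


def softmax(scores):
    """Turn raw scores into a probability distribution over classes."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

For a large positive input such as 5.0, the tanh gradient is close to zero while the ReLU gradient stays at 1, which is the usual explanation for faster gradient-descent training with ReLU.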

The SVM is a widely applied classification method to solve two-class problems. However,
the working condition recognition problem of sucker-rod pumping systems is a multiclass problem.
Thus, an ECOC model-based multiclass SVM is proposed. The ECOC approach is a meta-method that
combines many binary classifiers. To solve the multiclass problem, it reduces the classification problem
with three or more classes to a set of binary classification problems. In the ECOC model, the error
caused by poor choices of input features, finite training data, and flaws in the training algorithms can
be reduced by employing redundant error-correcting bits [45]. The ECOC model requires a coding
design, which determines the classes that the binary learners are trained on, and a decoding scheme,
which determines how the results of the binary classifiers are aggregated. The common coding designs
include one-versus-all, one-versus-one, binary complete, ternary complete, ordinal, dense random,
and sparse random designs.
To combine the advantages of these two methods, a hybrid AlexNet-SVM method is proposed in
this paper, as shown in Figure 7. The proposed method is an image-based algorithm. It takes as input
the dynamometer cards collected from the oilfield. The input images need to be adjusted from the original
size to 227 × 227 × 3 to accommodate the input pixel requirement of AlexNet. This algorithm consists
of automatic feature extraction and fault classification processes. For feature extraction, the seven-layer
AlexNet-based neural network is adopted to automatically extract useful and representative features
from dynamometer cards. Five Conv and two FC layers are involved in this process. The Conv1
layer filters the 227 × 227 × 3 input image with 96 kernels of size 11 × 11 × 3. The Conv2 layer filters
the output of Conv1 layer with 256 kernels of size 5 × 5 × 48. The number of kernels for Conv3,
Conv4, and Conv5 layers are 384, 384, and 256, and the sizes of corresponding kernels are 3 × 3 × 256,
3 × 3 × 129, and 3 × 3 × 129, respectively. The FC6 and FC7 layers have 4096 neurons each. For fault
classification, ECOC model-based SVM is adopted to identify the faults of sucker-rod pumping system.
SVM takes as input the output of the FC7 layer. Using this method, the network classifies and outputs
the estimated working conditions of the pumping system.
The AlexNet-based network is first trained on the base dataset and target, and then the learned
features are transferred to the proposed method to realize the working condition recognition target based
on our dataset. With the transfer learning approach, the first seven layers are copied to the first seven
layers of our network. The SVM parameters are randomly initialized and trained toward the target
task. However, the parameters and features are fine-tuned to the new task to improve performance.
In this process, AlexNet-based transfer learning automatically extracts the useful and representative
features from various dynamometer cards. With these extracted features, the ECOC model-based SVM
is designed to classify the working conditions of the pumping systems and improve the fault diagnosis
accuracy and efficiency.
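The spatial sizes that result from the convolutional layers described for the feature extractor can be checked with a short calculation. The sketch below is a hedged illustration: the strides, paddings, and 3 × 3 max-pooling windows are assumptions taken from the commonly cited AlexNet configuration and are not stated in this section. The names `conv_out`, `LAYERS`, and `trace_shapes` are hypothetical.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output width of a convolution or pooling window along one axis."""
    return (size + 2 * pad - kernel) // stride + 1


# (name, kernel, stride, pad, output channels or None for pooling).
# Strides and paddings are ASSUMED from the usual AlexNet configuration.
LAYERS = [
    ("conv1", 11, 4, 0, 96),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool5", 3, 2, 0, None),
]


def trace_shapes(input_size=227, channels=3):
    """Return (name, height, width, channels) after each layer."""
    shapes = []
    size = input_size
    for name, kernel, stride, pad, out_channels in LAYERS:
        size = conv_out(size, kernel, stride, pad)
        if out_channels is not None:
            channels = out_channels  # pooling keeps the channel count
        shapes.append((name, size, size, channels))
    return shapes


def flattened_size(shapes):
    """Number of values passed from the last pooling layer to FC6."""
    _, height, width, channels = shapes[-1]
    return height * width * channels
```

Under these assumptions, a 227 × 227 × 3 input shrinks to 55 × 55 × 96 after Conv1 and to 6 × 6 × 256 after the last pooling layer, so FC6 receives 9216 values and maps them to 4096 neurons.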

Figure 7. Architecture of the proposed AlexNet-SVM method. This method consists of automatic
feature extraction and classification processes. For feature extraction, AlexNet-based transfer learning
automatically extracts the useful and representative features from various dynamometer cards.
With these extracted features, the ECOC model-based SVM is designed to classify the working
conditions of the pumping systems.

The organization and schematic diagram of the methodology proposed in this study for working
condition recognition of pumping systems are depicted in Figure 8. This process is categorized into
five parts: data acquisition, data preprocessing, dynamometer card generation, feature extraction,
and working condition classification.

Figure 8. Workflow of the methodology proposed for working conditions recognition of pumping
system.

• Raw displacement and load data are collected by card collection sensors, called dynamometers,
during daily operations in an oilfield. A dynamometer is a valuable device used on sucker-rod
pumps that measures load on the polished rod and plots the load in relation to the rod displacement
as the pumping unit moves through a stroke cycle. Dynamometer data can be used to select
equipment, recognize operating conditions, and reduce trouble of installed equipment.
• The collected data are then preprocessed and normalized to the range of [0,1] to eliminate any
mutual effects between the extremely large and small values in the dataset. The common data
normalization methods include Z-score, log scaling, and Min-Max normalization.
• The dynamometer card is generated based on the normalized load and displacement data. As seen
in Figure 3, the horizontal axis is the displacement of the sucker-rod pumping system, and the
vertical axis is the load. The dynamometer card is a plot of load versus displacement on the rod. It is
useful for surveillance purposes. The card shape reflects the operating condition of the pumping
well, and different conditions can be indicated by their typical characteristics on the cards.
• The AlexNet-based transfer learning technique is applied to extract the useful and representative
features from the generated dynamometer cards. With this method, the knowledge gained from a
pretrained neural network can be applied to a different but related problem, such as the working
condition recognition of sucker-rod pumping system in this study. The neural network need not
be trained from scratch, which speeds up the training process of the network.
• Finally, the working conditions of the pumping system are classified by the ECOC-based
SVM method.

5. Experiment and Results


This section describes the experiment conducted and the results obtained in terms of working
condition recognition of the pumping system using AlexNet-based transfer learning and
ECOC-based SVM.
All the data used in this experiment were collected from a real oilfield in northern China, and the
data were measured by sensors attached to sucker rods. These sensors are dynamometers installed on
the rod. They are small in size and light in mass. Therefore, they can be installed on the equipment to
be measured in a very simple and convenient manner. In the oilfield, dynamometers are used to measure
load on the polished rod and displacement of the pumping unit. The collected data are then sent to the
data center via a wireless network. Based on the displacement and load data, dynamometer cards were
generated, as shown in Figure 9.

Figure 9. Data collection of sucker-rod pumping system. These sensors are dynamometers. They are
used to collect displacement and load data. Based on these data, the dynamometer cards can be
generated, where the horizontal axis is the displacement of the sucker-rod pumping system, and the
vertical axis is the load.

In this study, eight different working conditions are considered, including NOC, DPB, UPB, CST,
GIF, ILS, SAP, and ADC (refer to Section III). For each working condition, 1000 samples are selected
for recognition and classification purposes, and all the samples are randomly divided into training
and testing datasets. Eighty percent of the total dataset is adopted to train the proposed model, and
the remaining 20% is adopted to test the model. To accelerate the training process, an NVIDIA
GeForce MX150 with MATLAB is employed in this experiment.
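The random 80/20 division described above can be sketched as follows. This is a minimal illustration rather than the authors' MATLAB procedure: the sample identifiers, the fixed seed, and the function name `split_dataset` are assumptions made for the example.

```python
import random

# The eight working-condition labels named in the text.
CLASSES = ["NOC", "DPB", "UPB", "CST", "GIF", "ILS", "SAP", "ADC"]


def build_samples(per_class=1000):
    """Create placeholder (label, index) pairs standing in for card images."""
    return [(label, idx) for label in CLASSES for idx in range(per_class)]


def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle all samples and cut them into training and testing parts."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]
```

With 1000 samples for each of the eight conditions, this yields 6400 training samples and 1600 testing samples. A plain shuffle does not guarantee equal class counts in each part. A per-class (stratified) split would, but the text does not say which variant was used.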
The displacement (D) and load (L) data are collected from different wells by different sensors in
the oilfield. To remove discrepancies between the different wells and sensors, the acquired data are
normalized using the Min-Max normalization method. The normalization method can be described as:

$$D_i^* = \frac{D_i - \min(D)}{\max(D) - \min(D)} \tag{5}$$

$$L_i^* = \frac{L_i - \min(L)}{\max(L) - \min(L)} \tag{6}$$

where $D_i^*$ and $L_i^*$ are the normalized displacement and load, respectively, for $i = 1, 2, \ldots, 200$, and
max(D), max(L), min(D) and min(L) are the maximum displacement, maximum load, minimum
displacement and minimum load, respectively.
After the displacement and load data are normalized to the range of [0,1], the data can be
transformed into various dynamometer cards. To implement AlexNet-based transfer learning method,
the input images are adjusted from the original size to 227 × 227 to accommodate the input pixel
requirement of AlexNet. As shown in Figure 7, the extracted features from FC7 of AlexNet are selected
as the input of the SVM classifier, which is a compromise between the classification accuracy and
computational complexity.
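The Min-Max normalization in Equations (5) and (6) can be sketched in a few lines. The function name and the sample values are hypothetical. The guard against a constant series is an addition for robustness that the paper does not discuss.

```python
def min_max_normalize(values):
    """Scale a series to [0, 1] following Equations (5) and (6)."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant series has zero range; the equations are undefined here.
        raise ValueError("cannot normalize a constant series")
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical displacement and load readings from one stroke cycle.
displacement = [0.0, 1.5, 3.0]
load = [20.0, 30.0, 60.0]
normalized_displacement = min_max_normalize(displacement)
normalized_load = min_max_normalize(load)
```

The same function is applied separately to the displacement and load series of each card, so that cards from different wells and sensors share the [0,1] range on both axes.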
For the multiclass SVM, the linear kernel function is adopted to accelerate the training
process. Moreover, linear kernel function is less prone to overfitting than non-linear functions.
The one-versus-one code design is employed to design the ECOC, and the code length can be
determined as:
$$\frac{K(K-1)}{2} \tag{7}$$
where K is the number of classes. Hence, each code has a length of 28 in this study, and the coding
design is provided in Table 1.

Table 1. A 28-bit ECOC for the eight-class working condition recognition problem.

Code Word   Class 0   Class 1   Class 2   Class 3   Class 4   Class 5   Class 6   Class 7
f0             1        −1         0         0         0         0         0         0
f1             1         0        −1         0         0         0         0         0
f2             1         0         0        −1         0         0         0         0
f3             1         0         0         0        −1         0         0         0
f4             1         0         0         0         0        −1         0         0
f5             1         0         0         0         0         0        −1         0
f6             1         0         0         0         0         0         0        −1
f7             0         1        −1         0         0         0         0         0
...           ...       ...       ...       ...       ...       ...       ...       ...
f20            0         0         0         1         0        −1         0         0
f21            0         0         0         1         0         0        −1         0
f22            0         0         0         1         0         0         0        −1
f23            0         0         0         0         1        −1         0         0
f24            0         0         0         0         1         0        −1         0
f25            0         0         0         0         1         0         0        −1
f26            0         0         0         0         0         1        −1         0
f27            0         0         0         0         0         1         0        −1
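A one-versus-one coding design of the kind summarized in Table 1 can be generated and decoded with the following sketch. It is a minimal illustration under stated assumptions: class pairs are enumerated in lexicographic order, which reproduces rows f0 to f7 but may index later rows differently from the table. The decoding rule counts disagreements over the non-zero code entries and is one simple choice, not necessarily the loss used by the authors.

```python
from itertools import combinations


def one_vs_one_code(num_classes):
    """Build the coding matrix: one row per binary learner, one column per class.

    Each learner separates one class pair: +1 marks the positive class,
    -1 the negative class, and 0 the classes that learner ignores.
    """
    rows = []
    for positive, negative in combinations(range(num_classes), 2):
        row = [0] * num_classes
        row[positive] = 1
        row[negative] = -1
        rows.append(row)
    return rows


def decode(code, predictions):
    """Return the class whose code word disagrees least with the learners.

    predictions holds one +1/-1 output per binary learner, in row order.
    Entries coded 0 are skipped because that learner never saw the class.
    """
    num_classes = len(code[0])
    losses = []
    for cls in range(num_classes):
        loss = sum(
            1
            for row, pred in zip(code, predictions)
            if row[cls] != 0 and row[cls] != pred
        )
        losses.append(loss)
    return losses.index(min(losses))
```

For K = 8 classes the matrix has K(K − 1)/2 = 28 rows, in agreement with Equation (7).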

To evaluate the performance and efficiency of the proposed AlexNet-SVM method for working
condition recognition, its classification results are compared to those of VGG16, ResNet34,
and the classical AlexNet algorithm, in which Softmax is applied in the output layer to classify the
image input. The overall classification accuracy rate is defined by the total number of correctly
classified samples divided by the total number of all samples. The overall accuracy rate is summarized
in Table 2. The overall classification accuracy of the proposed AlexNet-SVM method is higher than
99%. This is of great importance considering the diversity and complexity of the dynamometer cards
generated by the massive sensor data. As can be seen from Table 2, the best method is AlexNet-SVM,
and the overall classification accuracy is 99.50%. For other methods, the classification accuracy
increases along with the depth of the CNN networks, such as eight-layer AlexNet, sixteen-layer VGG16,
and thirty-four-layer ResNet34. In this case, SVM-based classification method achieves better accuracy
than FC-based method within a given time step.

Table 2. Overall classification accuracy of the proposed method.

Method     AlexNet   VGG16    ResNet34   AlexNet-SVM
Accuracy   95.64%    96.48%   97.59%     99.50%

In industrial applications, real-time capability is a significant performance index. Using this
method, four images can be identified per second, which means that the proposed method meets the
real-time requirement. To show more details about the working conditions recognition of sucker-rod
pumping system, the confusion matrix of the proposed AlexNet-SVM method is given in Figure 10.
The ratios on the diagonal of the confusion matrix are the proportions of the samples that were
correctly classified in each operating condition, and the off-diagonal ratios are the proportions of
misclassified samples. As shown in this figure, most of the ratios on the diagonal are greater than 0.99,
especially for classes NOC, DPB, UPB, CST, ILS, and ADC, which shows that most of the samples are
correctly classified. The UPB condition can be exactly recognized based on these samples.
However, the proposed method misclassifies 1.25% of samples of the SAP as the NOC and 0.5% of
samples of the NOC as the SAP. The reason may be that the shapes of the NOC and the SAP
conditions are irregular quadrilaterals. The results are in accordance with the fact that there is only a
small difference between some of the NOC and SAP dynamometer cards, which makes it more difficult
to distinguish the two working conditions than other conditions. Most of the classes are only
misclassified from one class, such as NOC, CST, ILS and ADC. In contrast, some other conditions are
most likely to be misclassified as SAP, including NOC, GIF, and ILS.

Figure 10. Confusion matrix of the proposed method on dynamometer card dataset.
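The overall accuracy rate and the row-wise ratios shown in a confusion matrix can be computed as in the sketch below. The labels and predictions are a made-up toy example with two classes and do not reproduce the values in Figure 10. The function names are chosen for illustration.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count samples by (true class, predicted class)."""
    index = {name: i for i, name in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        counts[index[truth]][index[guess]] += 1
    return counts


def row_normalize(counts):
    """Convert counts to ratios so that each true-class row sums to 1."""
    ratios = []
    for row in counts:
        total = sum(row)
        ratios.append([c / total if total else 0.0 for c in row])
    return ratios


def overall_accuracy(counts):
    """Correctly classified samples divided by all samples."""
    correct = sum(counts[i][i] for i in range(len(counts)))
    total = sum(sum(row) for row in counts)
    return correct / total


# Toy example: four NOC and four SAP cards, with one error in each class.
toy_true = ["NOC"] * 4 + ["SAP"] * 4
toy_pred = ["NOC", "NOC", "NOC", "SAP", "SAP", "SAP", "SAP", "NOC"]
toy_counts = confusion_matrix(toy_true, toy_pred, ["NOC", "SAP"])
```

In this toy case the diagonal ratios are both 0.75 and the off-diagonal ratios are 0.25, which mirrors how the diagonal and off-diagonal entries of the confusion matrix are read.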

To analyze the effect of the network structure and extracted features on the classification accuracy,
the outputs of layers FC6 and FC8 of AlexNet are selected as the input of the SVM classifier. The results
are compared to the proposed AlexNet-SVM method, in which the input of SVM is the output of layer
FC7 (as shown in Figure 7), and the accuracy rates are summarized in Table 3. The structures of FC6-SVM
and FC7-SVM attain higher accuracy rates than that of FC8-SVM because the first two structures exhibit
a more abstract feature input, while the FC8 layer skews more towards concrete classes.

Table 3. Overall classification accuracy of the three different structures.

Structure FC6-SVM FC7-SVM FC8-SVM


Accuracy 99.19% 99.50% 94.87%

As for computation complexity and time cost, CNN generally needs more training time than
traditional machine learning methods due to a huge number of weight parameters to be trained,
especially when the CNN network has a deeper structure. Besides, more computing resources, such as
hardware configuration, are needed for a faster training process. Some compromises need to be made
before training a deep learning network. However, with the fast development of hardware, such as
graphics processing units, deep learning-based methods are becoming more reliable and efficient for
industrial applications.
6. Conclusions and Future Work


Pumping wells are widely distributed across oilfields and operate for decades in these fields.
Thus, it is difficult to monitor and recognize the working conditions of these production units. Moreover,
certain unexpected exceptions may occur in the sucker-rod pumping system during long-term operation
due to the complexity of the production environment. The traditional manual recognition methods are
time-consuming and labor-intensive. Therefore, an automatic fault diagnosis method is proposed to
recognize the working conditions of sucker-rod pumping systems with massive dynamometer card data
collected by sensors. In this method, AlexNet-based transfer learning is implemented to automatically
extract the useful and representative features from various dynamometer cards. With these extracted
features, an ECOC model-based SVM is designed to classify the working conditions and improve
the fault diagnosis accuracy. The overall classification accuracy exceeds 99%, which demonstrates
that the proposed AlexNet-SVM method is an efficient method for working condition recognition of
sucker-rod pumping systems. In addition, three different network structures are compared to analyze
the effect of the network structure and extracted features on the classification accuracy. The proposed
method could be generalized to all possible working conditions of sucker-rod pumping systems as
long as a relevant dataset is provided. This is the first time that deep learning and a traditional
machine learning method have been combined for this problem, which can effectively reduce the need
for human labor and improve the recognition accuracy.
In future work, we plan to collect more data on special working conditions to monitor and
recognize more types of working conditions of pumping wells. Various sensor data, such as power,
current, and temperature data, will be considered to improve the classification accuracy and algorithm
performance. Moreover, it would be noteworthy to study the dynamic changes in working conditions
of sucker-rod pumping systems in a timely and accurate manner with various sensor data.

Author Contributions: Conceptualization, H.C., H.Y., and P.Z.; methodology, H.C., E.O. and V.V.; software, H.C.;
validation, H.C.; formal analysis, H.C.; investigation, H.C.; data curation, S.L.; writing—original draft preparation,
H.C.; writing—review and editing, H.C., H.Y., P.Z., and E.O.; visualization, H.C.; supervision, H.Y., P.Z.; funding
acquisition, H.Y. and P.Z. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded in part by the National Natural Science Foundation of China under grant
61533015, in part by the National Natural Science Foundation of China under Grant 61803368, in part by the
China Postdoctoral Science Foundation under Grant 2019M661156, also in part by the Liaoning Provincial Natural
Science Foundation of China under Grant 20180540114.
Acknowledgments: We would like to thank the anonymous reviewers and academic editor for their comments
and suggestions. H.C. would like to thank the Joint PhD Training Program of the University of Chinese
Academy of Sciences.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Zhao, H.; Wang, J.; Gao, P. A deep learning approach for condition-based monitoring and fault diagnosis of
rod pump system. Serv. Trans. Internet Things 2017, 1, 32–42. [CrossRef]
2. Li, K.; Gao, X.-W.; Tian, Z.; Qiu, Z. Using the curve moment and the PSO-SVM method to diagnose downhole
conditions of a sucker rod pumping unit. Pet. Sci. 2013, 10, 73–80. [CrossRef]
3. Xu, P.; Xu, S.; Yin, H. Application of self-organizing competitive neural network in fault diagnosis of suck
rod pumping system. J. Pet. Sci. Eng. 2007, 58, 43–48. [CrossRef]
4. Jiao, J.; Zhao, M.; Lin, J.; Liang, K. A comprehensive review on convolutional neural network in machine
fault diagnosis. Neurocomputing 2020, 417, 36–63. [CrossRef]
5. Gao, Z.; Cecati, C.; Ding, S.X. A Survey of Fault Diagnosis and Fault-Tolerant Techniques—Part I:
Fault Diagnosis With Model-Based and Signal-Based Approaches. IEEE Trans. Ind. Electron. 2015,
62, 3757–3767. [CrossRef]
6. Gao, Z.; Cecati, C.; Ding, S.X. A Survey of Fault Diagnosis and Fault-Tolerant Techniques Part II:
Fault Diagnosis with Knowledge-Based and Hybrid/Active Approaches. IEEE Trans. Ind. Electron.
2015, 62, 3757–3767. [CrossRef]
7. Djeziri, M.A.; Benmoussa, S.; Zio, E. Review on Health Indices Extraction and Trend Modeling for Remaining
Useful Life Estimation. In Artificial Intelligence Techniques for a Scalable Energy Transition; Springer Science and
Business Media LLC: Berlin/Heidelberg, Germany, 2020; Volume 8, pp. 183–223.
8. Liu, R.; Yang, B.; Zio, E.; Chen, X. Artificial intelligence for fault diagnosis of rotating machinery: A review.
Mech. Syst. Signal Process. 2018, 108, 33–47. [CrossRef]
9. Zhao, R.; Yan, R.; Chen, Z.; Mao, K.; Wang, P.; Gao, R.X. Deep learning and its applications to machine health
monitoring. Mech. Syst. Signal Process. 2019, 115, 213–237. [CrossRef]
10. Lei, Y.; Lin, J.; Zuo, M.J.; He, Z. Condition monitoring and fault diagnosis of planetary gearboxes:
A review. Measurement 2014, 48, 292–305. [CrossRef]
11. Hoang, D.-T.; Kang, H.-J. A survey on Deep Learning based bearing fault diagnosis. Neurocomputing 2019,
335, 327–335. [CrossRef]
12. Derek, H.J.; Jennings, J.W.; Morgan, S.M. Sucker Rod Pumping Unit Diagnostics Using an Expert System.
In Proceedings of the SPE Permian Basin Oil and Gas Recovery Conference, Midland, TX, USA, 10–11 March
1988. SPE 17318.
13. Foley, W.; Svinos, J. Expert Adviser Program for Rod Pumping (includes associated paper 19367). J. Pet. Technol.
1989, 41, 394–400. [CrossRef]
14. Martinez, E.R.; Moreno, W.J.; Castillo, V.J.; Moreno, J.A. Rod pumping expert system. In Proceedings of the
SPE Petroleum Computer Conference, New Orleans, LA, USA, 11–14 July 1993. SPE 26246.
15. Schnitman, L.; Albuquerque, G.; Correa, J.; Lepikson, H.; Bitencourt, A. Modeling and Implementation of
A System for Sucker Rod Downhole Dynamometer Card Pattern Recognition. In Proceedings of the SPE
Annual Technical Conference and Exhibition, Denver, CO, USA, 5–8 October 2003.
16. Rogers, J.D.; Guffey, C.G.; Oldham, W.J.B. Artificial Neural Networks for Identification of Beam Pump
Dynamometer Load Cards. In Proceedings of the SPE Annual Technical Conference and Exhibition,
New Orleans, LA, USA, 23–26 September 1990. SPE 20651.
17. Nazi, G.; Ashenayi, K.; Lea, J.; Kemp, F. Application of Artificial Neural Network to Pump Card Diagnosis.
SPE Comput. Appl. 1994, 6, 9–14. [CrossRef]
18. Tian, J.; Gao, M.; Li, K.; Zhou, H. Fault Detection of Oil Pump Based on Classify Support Vector Machine.
In Proceedings of the 2007 IEEE International Conference on Control and Automation, Guangzhou, China,
30 May–1 June 2007; Institute of Electrical and Electronics Engineers (IEEE): New York, NY, USA; pp. 549–553.
19. Zhou, B.; Wang, Y.; Liu, W.; Liu, B. Identification of Working Condition From Sucker-Rod Pumping Wells
Based On Multi-View Co-Training and Hessian Regularization of SVM. In Proceedings of the 2018 14th
IEEE International Conference on Signal Processing (ICSP), Beijing, China, 12–16 August 2018; Institute of
Electrical and Electronics Engineers (IEEE): New York, NY, USA; pp. 969–973.
20. Lei, Y.; Jia, F.; Lin, J.; Xing, S.; Ding, S.X. An Intelligent Fault Diagnosis Method Using Unsupervised Feature
Learning towards Mechanical Big Data. IEEE Trans. Ind. Electron. 2016, 63, 3137–3147. [CrossRef]
21. Sharaf, S.A. Beam Pump Dynamometer Card Prediction using Artificial Neural Networks. KnE Eng. 2018, 3,
198–212. [CrossRef]
22. Wang, X.; He, Y.; Li, F.; Dou, X.; Wang, Z.; Xu, H.; Fu, L. A Working Condition Diagnosis Model of Sucker Rod
Pumping Wells Based on Big Data Deep Learning. In Proceedings of the International Petroleum Technology
Conference, Beijing, China, 26–28 March 2019; pp. 1–10.
23. Peng, Y. Artificial Intelligence Applied in Sucker Rod Pumping Wells: Intelligent Dynamometer Card
Generation, Diagnosis, and Failure Detection Using Deep Neural Networks. In Proceedings of the SPE
Annual Technical Conference and Exhibition, Calgary, AB, Canada, 30 September–2 October 2019; Society of
Petroleum Engineers (SPE): Richardson, TX, USA, 2019.
24. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks.
In Proceedings of the Advances in Neural Information Processing Systems (NIPS), Lake Tahoe, NV, USA,
3–6 December 2012; pp. 1097–1105.
25. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition.
In Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA,
7–9 May 2015.
26. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A.
Going Deeper with Convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; Institute of Electrical and Electronics
Engineers (IEEE): New York, NY, USA; pp. 1–9.
27. Bengio, Y.; Lee, N.-H.; Bornschein, J.; Mesnard, T.; Lin, Z. Towards Biologically Plausible Deep Learning.
Nature 2015, 521, 436–444.
28. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June
2016; Institute of Electrical and Electronics Engineers (IEEE): New York, NY, USA; pp. 770–778.
29. Zhao, L.; Qiu, X.; Zhang, Q.; Huang, X. Sequence Labeling with Deep Gated Dual Path CNN. IEEE/ACM
Trans. Audio Speech Lang. Process. 2019, 27, 2326–2335. [CrossRef]
30. Sun, K.; Li, Y.; Deng, D.; Li, Y. Multi-Channel CNN Based Inner-Attention for Compound Sentence Relation
Classification. IEEE Access 2019, 7, 141801–141809. [CrossRef]
31. Yang, Z.; Nevatia, R. A Multi-Scale Cascade Fully Convolutional Network Face Detector. In Proceedings of
the 2016 23rd International Conference on Pattern Recognition (ICPR), Cancún, Mexico, 4–8 December 2016;
Institute of Electrical and Electronics Engineers (IEEE): New York, NY, USA; pp. 633–638.
32. Wu, W.; Yin, Y.; Wang, X.; Xu, D. Face Detection with Different Scales Based on Faster R-CNN.
IEEE Trans. Cybern. 2019, 49, 4017–4028. [CrossRef]
33. Wu, Z.; Wang, X.; Jiang, Y.-G.; Ye, H.; Xue, X. Modeling Spatial-Temporal Clues in a Hybrid Deep
Learning Framework for Video Classification. In Proceedings of the 23rd ACM International Conference on
Multimedia—MM ’15, Brisbane, Australia, 26–30 October 2015; pp. 461–470.
34. Zhou, Z.; Chen, J.; Yang, C.-N.; Sun, X. Video Copy Detection Using Spatio-Temporal CNN Features.
IEEE Access 2019, 7, 100658–100665. [CrossRef]
35. Tomé, D.; Monti, F.; Baroffio, L.; Bondi, L.; Tagliasacchi, M.; Tubaro, S. Deep Convolutional Neural Networks
for pedestrian detection. Signal Process. Image Commun. 2016, 47, 482–489. [CrossRef]
36. Li, J.; Liang, X.; Shen, S.; Xu, T.; Feng, J.; Yan, S. Scale-aware Fast R-CNN for Pedestrian Detection.
IEEE Trans. Multimedia 2017, 20, 985–996. [CrossRef]
37. Zhao, Z.-Q.; Zheng, P.; Xu, S.-T.; Wu, X. Object Detection with Deep Learning: A Review. IEEE Trans. Neural
Netw. Learn. Syst. 2019, 30, 3212–3232. [CrossRef] [PubMed]
38. Shao, L.; Zhu, F.; Li, X. Transfer Learning for Visual Categorization: A Survey. IEEE Trans. Neural Netw.
Learn. Syst. 2015, 26, 1019–1034. [CrossRef] [PubMed]
39. Olivas, E.S.; Guerrero, J.D.M.; Sober, M.M.; Lopez, S. Handbook of Research on Machine Learning Applications
and Trends: Algorithms, Methods and Techniques; Information Science IGI Publishing: Hershey, PA, USA, 2009;
pp. 242–264.
40. Tommasi, T.; Orabona, F.; Caputo, B. Safety in Numbers: Learning Categories from few Examples with Multi
Model Knowledge Transfer. In Proceedings of the 2010 IEEE Computer Society Conference on Computer
Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; pp. 3081–3088.
41. Alnujaim, I.; Alali, H.; Khan, F.; Kim, Y. Hand Gesture Recognition Using Input Impedance Variation of Two
Antennas with Transfer Learning. IEEE Sens. J. 2018, 18, 4129–4135. [CrossRef]
42. Vapnik, V.N. The Nature of Statistical Learning Theory; Springer Science and Business Media LLC:
Berlin/Heidelberg, Germany, 1995.
43. He, Y.; Du, C.Y.; Li, C.B.; Wu, A.G.; Xin, Y. Sensor Fault Diagnosis of Superconducting Fault Current Limiter
With Saturated Iron Core Based on SVM. IEEE Trans. Appl. Supercond. 2014, 24, 1–5. [CrossRef]
44. Wu, X.; Zuo, W.; Lin, L.; Zhang, B.; Zhang, K. F-SVM: Combination of Feature Transformation and SVM
Learning via Convex Relaxation. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 5185–5199. [CrossRef]
[PubMed]
45. Dietterich, T.G.; Bakiri, G. Solving Multiclass Learning Problems via Error-Correcting Output Codes. J. Artif.
Intell. Res. 1995, 2, 263–286. [CrossRef]

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
