Abstract—Though handwriting recognition has been a well-explored research area for decades, a few sub-areas of this field have still not received much attention from researchers. Examples include the recognition of hand-drawn graphics components such as circuit components and diagrams. Complete digitization of such handwritten documents is not possible without automatic conversion of the said circuit diagrams. Besides, to date, in most cases of commercial circuit design, the concerned people manually enter the components into simulation software such as Cadence or Spice to analyze the circuit and judge its performance. This work attempts to move one step towards automating this process by recognizing the hand-drawn circuit components, which is considered the most important step for this automation. The present endeavour is to design a two-stage convolutional neural network (CNN)-based model that recognizes the hand-drawn circuit components. In the first stage, all the similar-looking (i.e., similar in shape and structure) circuit components are clustered into a single group using visual perception and input from the confusion matrix of a single-stage CNN-based classification, and in the later stage, the circuit components belonging to the same group are classified into their actual classes. The proposed model has been evaluated on a self-made database where 20 different classes of hand-drawn circuit components are considered. The experimental outcome shows that the proposed two-stage classification model provides an accuracy of 97.33%, which is much higher than that of the single-stage method, which provides an accuracy of 86.00%.
Index Terms—Circuit component recognition, Hand-drawn circuit, Analog and digital, CNN model
I. INTRODUCTION
An electrical or electronic circuit diagram is a graphical representation containing various symbols used to represent circuit components, such as a resistor and a battery, which are connected by lines representing the wires in a real circuit. Engineers, scientists, as well as students and researchers, have to deal with circuit diagrams quite often, and they require deep analysis of the diagrams under scrutiny as well. This involves recognition or identification of the symbols as the first step. The modern scenario showcases an ardent inclination of researchers as well as tech giants towards digitization of documents [1, 2]. Digitization is the conversion of a document into a digital format, where the data are organized into bits. The biggest reform that digitization has brought about is the preservation of off-line data. It becomes especially important in large-scale applications where manufacturers have to deal with and keep track of huge data, typically millions of datasets. Since rough analysis for large-scale applications in the electrical or electronic domain is made from hand-drawn diagrams, and often in bulk, it becomes imperative to design a model for the detection and the recognition of the individual components, which are extracted from the digitized version of the raw (paper-based) images. Modern research works elaborate on the need for digitization of images and emphasize computer-based analysis.
Besides, concerned people can interpret the symbols of the hand-drawn circuit diagram using their knowledge and experience; however, this task thereafter requires manual intervention to enter the hand-drawn components into CAD tools like Multisim, CircuitMaker, and Cadence, followed by related processes like simulation and analysis of the circuit parameters. The overall process is very cumbersome and time-consuming. Also, the effort put into this rises with the complexity of the diagrams under consideration. So it would be much simpler if a method were devised such that components are recognized directly from hand-drawn schematics, reducing both time and human resources. It is to be noted that to recognize the entire hand-drawn circuit diagram, one has to extract the circuit components [3, 4] first from the circuit diagram and can then apply the circuit component recognition process.
Keeping the above facts in mind, the present work deals with the classification of 20 different hand-drawn analog and digital circuit components (e.g., ammeter, voltmeter, inductor, resistor and different gates) using a convolutional neural network (CNN)-based approach, which involves grouping of visually similar components using a CNN model at the first stage, then designing CNN models for each group to classify the components of a group into their exact classes. It is noteworthy that the task of recognizing hand-drawn circuit components in offline mode possesses inherent challenges owing to variation in drawing style, with people following a random pen-up and pen-down fashion while drawing, often leading to erratic diagrams which may turn out to be difficult for even the veterans to comprehend. Moreover, the low quality of paper and ink, the natural noise and the noise appearing while acquiring the images are the typical challenges in this domain. Mainly these issues motivate us to design a deep learning-based model for the said task, as conventional feature engineering-based methods may fail to yield the desired result.
II. RELATED WORK
Although hand-drawn circuit component recognition is not a well-explored research area, a literature survey reveals that some researchers have attempted it in the past; a few of those attempts are briefly described here. Dewangan and Dhole [5] use a K-nearest neighbor (KNN) classifier to build a system that directly reads the electrical circuit components from a hand-drawn circuit image. The feature vector is prepared using a shape-based feature extraction process considering the geometric properties of the circuit components. Analog components, divided into 10 classes, are used to obtain an accuracy of around 90%. However, many major circuit components like digital gates and transformers are not taken into consideration, and the number of circuit components used in the dataset is not specified.
In another work, Feng et al. [6] rely on a two-dimensional dynamic programming (2D-DP) technique allowing symbol hypothesis generation, which can correctly segment and recognize interspersed symbols. Besides, as discriminative classifiers usually have limited capability to reject outliers, some domain-specific knowledge is included to circumvent the errors due to untrained patterns corresponding to erroneous segmentation hypotheses. With a point-level online measurement, the experiment shows that the proposed approach can achieve an accuracy of more than 90%. However, very few components (only 9) are used and the components are drawn using a digital pen on a digital surface.
A system of offline circuit recognition and simulation using digital image processing is proposed by Angadi and Naika [7]. The model consists of four stages, namely pre-processing, segmentation, support vector machine (SVM) [8]-based circuit component classification and simulation, i.e., the authors propose a complete system. In this work, different shape-based features like average component height, inclination, and entropy of the components are used. However, no data regarding the number of components used, the accuracy, and the size of the dataset are available.
Moetesum and Younus [9] present an effective technique for the segmentation and recognition of electronic components from hand-drawn circuit diagrams. Segmentation is carried out by using a series of morphological operations [10] on the binarized images of circuits and discriminating between three categories of components (closed shapes, components with connected lines, disconnected components). Each segmented component is characterized by computing the Histogram of Oriented Gradients (HOG) descriptor [11], while classification is carried out using an SVM classifier [12]. A segmentation accuracy of 87.70% and a classification rate of 92.00% are realized, demonstrating the effectiveness of the proposed technique. However, the dataset is quite small, consisting of 35 components per class with 10 classes in total. Circuit components like digital gates and transistors are not taken into consideration.
In the work by Veena and Naik [13], the scanned image of a diagram is pre-processed to remove noise and converted to binary level, and morphological operations are applied to obtain a clean, connected representation using thinned lines. The diagram is comprised of nodes (a node is any point on a circuit where the terminals of two or more circuit elements meet), connections and components. Using an appropriate threshold on a spatially varying object pixel density, nodes and components of the image are segmented. By the use of shape-based features and an SVM classifier, components and nodes are classified. However, the accuracy of classification is not mentioned. Also, no information is available about the size of the dataset and the total number of classes used.
Patare and Joshi [14] propose a method to recognize components in a hand-drawn digital logic circuit diagram. This system uses a region-based segmentation method to segment the circuit sketch and classifies each component using an SVM that uses the Fourier descriptor as the feature vector. However, an average circuit recognition accuracy of only 83% is achieved.
Edward and Chandran [15] propose a method in which the scanned image of a diagram is pre-processed to remove noise and then converted to a bi-level intensity image. Morphological operations are applied to obtain a clean, connected representation using thinned lines. The diagram comprises nodes (a node is any point on a circuit where the terminals of two or more circuit elements meet), connections, and components. Nodes and components are segmented using appropriate thresholds on a spatially varying object pixel density. Connection paths are traced using a pixel stack. Nodes are classified using syntactic analysis. Components are classified using a combination of invariant moments, scalar pixel-distribution features, and vector relationships between straight lines in polygonal representations. A node recognition accuracy of 92% and a component recognition accuracy of 86% are achieved on a database comprising 107 nodes and 449 components. However, only 9 different components are taken into consideration and the size of the dataset is quite small. Digital components are not taken into consideration in this work.
A topology-based segmentation method to segment the circuit sketch and classify each component using the Fourier descriptor as a feature vector for an SVM is proposed by Liu and Xiao [16]. An accuracy rate of over 90% is achieved for each component. However, only 5 classes are considered and the dataset consists of only 55 components per class.
Dreijer [17] aims to create an alternative to the common schematic capture process through the use of an interactive pen-based interface to the capturing software. Sketches are interpreted through a process of vectorizing the user's strokes into primitive shapes, extracting information on intersections between primitives, and using a naïve Bayesian classifier to identify symbol components. Training data are generated by mutating a single definition of each symbol. The symbols are divided into 14 classes, each class consisting of 100 symbols. The overall accuracy obtained was average. The system confuses diodes with boxes and the NOT gate with the DC source, thus having very low accuracies of 29% and 50%, respectively.
Naika et al. [18] try to recognize hand-drawn electronic
components (analog only) using HOG-based features and subsequently use an SVM classifier. The authors have experimented on a dataset having 2000 isolated circuit component images, and the proposed method yields a 96.90% recognition rate on a 10-class problem. In their work, they have only used some electrical components while completely ignoring the digital part and important analog components like the ammeter and voltmeter.
An artificial neural network (ANN)-based model is used by Rabbani et al. [19] to make a system that can directly read the electrical symbols from a hand-drawn circuit image. The recognition process involves two steps: the first step is shape-based feature extraction, and the second one is classification using an ANN that uses a backpropagation algorithm. The ANN is trained and tested with different hand-drawn electrical circuit component images. The results show that their proposal is viable, but the accuracy obtained is much lower and the dataset used is very small in size.
Recently, Roy et al. [20] have proposed a method for the recognition of hand-drawn electrical and electronic circuit components, with both analog and digital components included. In this method, the pre-processed images of circuit components are used for training and testing a recognition model using a feature set consisting of a texture-based feature descriptor, called HOG, and shape-based features that include centroid distance, tangent angle, and chain code histogram [21]. Besides, the texture-based feature is optimized using a feature selection algorithm called ReliefF and then classified using a sequential minimal optimization (SMO) classifier [22]. It is to be noted that the current work is an extension of this work.
The present authors have come across some works which are aligned to the current work. These include sketch symbol recognition (e.g., Deufemia et al. [23]) and identifying hand-drawn graphics components from the textual parts (e.g., Avola et al. [24, 25]) in an online handwritten document using machine learning-based approaches. The authors of the work [24] have used discriminative features like entropy, band ratio, X scan, intersection and projection, and an SVM as the classifier. The work [25] uses features like curvature and linearity along with the 6 features used in the work [24], and an extreme learning machine (ELM) as the classifier, to perform classification of drawing symbols and texts. In another work, Deufemia et al. [23] propose a two-stage clustering-based approach for labelling different types of sketched symbols. In the first stage, the authors use a latent-dynamic conditional random field (LDCRF) to analyze the features of unsegmented stroke sequences based on spatiotemporal information of the strokes, and then select the strokes of symbol part(s) using the contextual information. In the later stage, they group the previously labelled strokes into symbol labels using a distance-based clustering technique.
III. MOTIVATION AND CONTRIBUTIONS
The above discussions reflect that, though a few attempts have been made by researchers in the past, circuit component recognition with substantial efficiency is still much needed. Some machine learning approaches along with conventional feature engineering-based approaches (e.g., [5, 7, 15]) have also been applied to this problem and recorded a promising outcome. However, on the downside, these methods have compromised on the number of circuit components (e.g., 5 in [16], 9 in [15], and 10 in [9]) and on the variety of hand-drawn circuit components (e.g., 55 samples per class in [16], 449 samples in total in [15] and 35 samples per class in [9]). Additionally, while classifying the circuit components, these works do not take any measure for similar-looking circuit components (e.g., OR and NOR gate, PNP and NPN transistor, and ammeter and voltmeter) that might generate high inter-class misclassification. Therefore, framing a more generic model for the recognition of offline hand-drawn circuit components is indeed required.
In the work [18], the authors have recorded satisfactory recognition accuracy using a texture-based feature (i.e., HOG) for a 10-class problem. In this work, the texture representation is extracted using a customized CNN. In this context, it should be mentioned that there are many recently published works where authors have claimed that a CNN can identify texture-based features with ease [26–28]. In these research articles, it has been mentioned that convolutional layers in any CNN model generate different feature maps, and these feature maps can be thought of as filter banks whose complexity increases with depth. These feature maps are powerful tools to extract texture features [29] and have been widely used in texture analysis. Additionally, the authors of the work [30] have conducted different experiments to provide evidence that a CNN relies on object textures rather than global object shapes as commonly assumed. Hence, the authors have introduced Shape-ResNet (a modified version of ResNet-50) to overcome the texture biases of commonly used CNN models. These works establish the fact that CNN-based deep learning models can be used to extract texture features from an object. More such claims can be found in the survey paper [31]. However, not every CNN architecture may work well for all types of image classification problems. Therefore, here a customized CNN model is designed.
Considering the facts mentioned above, a CNN-based model is designed to extract texture features and eventually recognize hand-drawn circuit components satisfactorily. Concisely, the highlights of the present work are as follows:
• Grouping of similar-looking circuit components using a single-stage CNN-based classification and visual perception.
• Designing CNN models for each group obtained in the first stage (i.e., for classification of circuit components within a group).
• Obtaining more than 10% higher recognition accuracy than that obtained while recognizing the circuit components with a single-stage classification using the CNN model.
Neural Computing and Applications (2021) 33:13367–13390
IV. PRESENT WORK
The present work deals with the classification of hand-drawn electrical and electronic circuit components using a two-stage framework where the CNN model is used as the backbone architecture. In the first stage, the authors perform group-level classification (i.e., classifying a component as one of the predefined groups), while in the second stage group-specific classification (i.e., intra-group classification) is performed using the same CNN architecture. It is to be noted that the predefined set of groups for visually similar-looking circuit components (e.g., AND gate and NAND gate, or Ammeter and Voltmeter) is formed with the help of the confusion matrix obtained by a single-stage classification model and visual exploration. In this section, first the data preparation technique is described and then the customized CNN model designed here. The entire process is summarized in Algorithm 1, while the associated processes are described in the following subsections.
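The two-stage flow described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `build_groups` and `two_stage_predict`, the `threshold` parameter, and the callables standing in for trained CNNs are all hypothetical. The paper also relies on visual inspection when forming the groups, which a script cannot reproduce; here classes are merged purely when their mutual confusion in a single-stage confusion matrix reaches the threshold.

```python
from typing import Callable, Dict, List, Sequence


def build_groups(confusion: Sequence[Sequence[int]], threshold: int) -> List[List[int]]:
    """Merge classes i and j when confusion[i][j] + confusion[j][i] >= threshold.

    `threshold` is a hypothetical tuning knob, not a value taken from the paper.
    """
    n = len(confusion)
    parent = list(range(n))

    def find(x: int) -> int:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if confusion[i][j] + confusion[j][i] >= threshold:
                parent[find(j)] = find(i)

    groups: Dict[int, List[int]] = {}
    for c in range(n):
        groups.setdefault(find(c), []).append(c)
    return sorted(groups.values())


def two_stage_predict(
    image: object,
    group_classifier: Callable[[object], int],
    class_classifiers: Dict[int, Callable[[object], int]],
    groups: List[List[int]],
) -> int:
    """Stage 1 picks a group index; stage 2 picks a class inside that group."""
    g = group_classifier(image)
    members = groups[g]
    if len(members) == 1:
        # Assumption of this sketch: a singleton group needs no second-stage model.
        return members[0]
    return members[class_classifiers[g](image)]


# Toy single-stage confusion matrix (made-up counts): classes 0 and 2 are confused.
toy_confusion = [
    [40, 0, 5, 0],
    [0, 45, 0, 0],
    [6, 0, 39, 0],
    [0, 1, 0, 44],
]
toy_groups = build_groups(toy_confusion, threshold=5)
print(toy_groups)
```

In practice the two callables would wrap trained CNN models; they are left abstract here so that the control flow of the two stages stays visible.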
A. Data preparation
To evaluate any recognition model, a suitable database is a prerequisite. However, to the best of the authors' knowledge, no such dataset is publicly available for hand-drawn circuit component recognition. Therefore, the authors have prepared an in-house dataset containing circuit component images that are drawn by engineering students, research scholars, and faculty members. Components belonging to 20 different classes have been collected in a preformatted sheet, similar to the works [32, 33]. Also, there is no constraint on the ink colour used for drawing the samples. A sample of the filled-in datasheet is shown in Fig. 1. The class index for each circuit component as defined is shown in Table 1. In this context, it is to be mentioned that the components are drawn by the contributors in a completely unconstrained fashion, thus bearing several drawing styles for the components, which have, in turn, helped the authors to establish the robustness of the proposed model. For each circuit component, 150 sample images are collected, which have been resized to 64 * 64 pixels.
Fig. 1. A sample filled-in datasheet containing hand-drawn electrical and electronic circuit component images
Algorithm 1: Classification of the circuit component images
Input: Images of the circuit components
Output: Predicted class for the input images
B. Customized CNN
It is already mentioned that the authors have classified 20 different hand-drawn electrical and electronic circuit components using a simple CNN-based model, which is hereafter called the customized CNN. In this section, the overall architecture of this customized CNN model is described. In this network, first, five convolution layers, each followed by a pooling layer, are used, and then the final feature map is flattened into a linear form, which is hereafter called the flatten vector. Finally, two fully connected (FC) layers are used, of which the last layer maps to the output classes. The CNN model is described in brief in the following subsections.
1) Convolution Layer: Convolutional layers are the fundamental building blocks of any CNN model [34, 35]. The parameters of each layer consist of a set of learnable filters. During the forward pass, each filter is convolved with the input image using a kernel. The output volume of a convolutional layer is obtained by stacking the feature maps generated using all filters along the depth dimension. It helps in extracting features from the input image. While extracting the features, it preserves the spatial relationship between the pixels by extracting image features using small squares of the input kernel [36].
In this work, the original binarized input image of dimension M * N * 1, where M and N are the original image height and width, respectively, is fed to the initial convolutional layer having n_1 filters. As a result, it generates n_1 feature maps at the end of the first convolutional layer. Later, in each convolutional layer, feature maps of dimension M_i * N_i * n_j (i = 2, 4, 6, 8; j = 1, 2, 3, 4 as shown in Fig. 2) are fed, where M_i, N_i and n_j denote the height and width of each feature map, and the number of feature maps, respectively. The values of M_i and N_i are suitably chosen depending upon the original image height (M) and width (N). A filter with dimension m_j * m_j * n_j is used, where m_j and n_j (j = 1, 2, 3, 4, 5) represent the side length of a square-shaped filter and the number of filters (each producing one feature map, sometimes called an activation map) in each convolutional layer. In the performed experiments, the authors set m_j = 3 for all j while varying the value of n_j. They
TABLE I
VARIOUS CLASSES OF ANALOG AND DIGITAL CIRCUIT COMPONENTS THAT ARE COLLECTED UNDER THE SCOPE OF PRESENT WORK
Class  Component name     Class  Component name  Class  Component name   Class  Component name
1      AC source          2      Ammeter         3      AND Gate         4      Capacitor
5      DC source          6      Ground          7      Inductor         8      NAND Gate
9      NOR Gate           10     NOT Gate        11     NPN Transistor   12     OR Gate
13     PN Junction Diode  14     PNP Transistor  15     Power Supply     16     Resistor
17     Switch             18     Transformer     19     Voltmeter        20     Zener diode
also set M = N = 64 for the simplicity of the model. For the convenience of the common readers, the detailed architecture of the convolutional layers is shown in Table 2.
Fig. 2. Architecture of the customized CNN model used here. The variables M_i and N_i (i = 1, 2, ..., 10) represent the height and width of feature maps, respectively, whereas n_j (j = 1, 2, ..., 5) represents the number of feature maps (i.e., the number of filters) at this stage. M and N represent the height and width of the input image at the beginning, respectively. The variables k_j and m_j (j = 1, 2, ..., 5) represent the side length of the square-shaped pooling mask (at the jth pooling) and kernel (at the jth convolutional layer), respectively. The value of C (i.e., the number of output classes) is varied as per requirement
2) Pooling Layer: The pooling layer is used for reducing the spatial dimension (i.e., height and width) of any feature map and, in doing so, helps in reducing the computational power required to process the data through dimensionality reduction. Besides, it is useful for generating rotational and positional invariant dominant features from input images, thus lessening the chance of overfitting during training. The commonly used variants of the pooling strategy are max-pooling and avg-pooling [23]. In the proposed work, the avg-pooling technique is used with kernel dimension k_j * k_j (where j = 1, 2, ..., 5). The value of k_j may be varied but, in the performed experiments, the commonly used value, i.e., k_j = 2 for all j, is chosen.
3) Fully connected layer: At the end of the convolution layers, all the feature maps are flattened into a vector and then fed to the FC layers, which are similar to traditional multilayer perceptron models with an activation function [37] such as SoftMax, sigmoid, ReLU or tanh in the output layer. In the FC layers, the MSELoss loss is minimized by the Adam optimizer with gradient 1e4 and the corresponding learning rate is 0.0005. The output layer is another vector whose length equals the number of classes to be recognized by the underlying CNN model. During training, the batch size is set to 10 whereas the number of epochs is 150. Information on the FC layers of the proposed customized CNN model is provided in Table 2.
4) Activation Function: In a neural network, an activation function decides whether a neuron should be activated or not, based on the weighted sum of its inputs plus a bias. A neural network without an activation function is essentially just a linear regression model. The activation function introduces non-linearity into the output of a neuron and thereby makes a neural network model capable of learning and performing more complex tasks. In this section, two such commonly used activation functions are described.
SoftMax: The SoftMax function is often used in the final layer of a neural network-based classifier. SoftMax is a function that takes as input a vector of K real numbers (say, z = (z_1, z_2, ..., z_K) ∈ R^K), and normalizes it into a probability distribution consisting of K probabilities (say, σ(z)_i) proportional to the exponentials of the input numbers, as shown in Eq. (1). Some components in the last fully connected layer could be negative or greater than one and might not sum to 1. However, after applying SoftMax, each component will be in the interval [0, 1] and the components will add up to 1, so that they can be interpreted as probabilities. The graph of this function is shown in Fig. 3a.
\sigma(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}} \quad \text{for } i = 1, \ldots, K \text{ and } z = (z_1, z_2, \ldots, z_K) \qquad (1)
5) ReLU: ReLU is one of the most popular activation functions used in deep learning. If the input to the function is negative, then the output is 0, while for positive inputs the output is the same as the input. The mathematical form of the function is shown in Eq. (2). The graph of this function is shown in Fig. 3b. The main advantage of using ReLU in deep learning is that not all the neurons in a neural network are activated at the same time. Those getting negative input are deactivated by the ReLU function.
f(x) = \max(0, x) \qquad (2)
TABLE II
ARCHITECTURE OF THE CUSTOMIZED CNN MODEL THAT IS USED THROUGHOUT THIS WORK
Layer  Input dimension  Feature maps (i.e., n_j)  Output dimension
Conv1  64 * 64 * 1      8                         64 * 64 * 8
Pool1  64 * 64 * 8      -                         32 * 32 * 8
Conv2  32 * 32 * 8      16                        32 * 32 * 16
Pool2  32 * 32 * 16     -                         16 * 16 * 16
Conv3  16 * 16 * 16     32                        16 * 16 * 32
Pool3  16 * 16 * 32     -                         8 * 8 * 32
Conv4  8 * 8 * 32       64                        8 * 8 * 64
Pool4  8 * 8 * 64       -                         4 * 4 * 64
Conv5  4 * 4 * 64       128                       4 * 4 * 128
Pool5  4 * 4 * 128      -                         2 * 2 * 128
FC1    512              -                         64
FC2    64               -                         C
and 68.40 GB disk space, and Python 3 is used as the interpreter. Experiments are performed using 70% and 30% of the entire dataset as training and test samples, i.e., 105 and 45 samples per circuit component are used as train and test samples, respectively. During the training of the CNN models, a set of data augmentation techniques, namely rotation, dilation, erosion, and skeletonization, is applied to the train samples. The batch size and number of epochs are set to 10 and 200, respectively, for all the following experiments.
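The layer dimensions listed in Table II can be checked with a short shape walk-through. The sketch below is illustrative only and makes two assumptions that are inferred from the table rather than stated in the text: each 3 * 3 convolution uses stride 1 with padding that preserves height and width, and each 2 * 2 pooling uses stride 2. The helper names are hypothetical.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int]  # (height, width, channels)


def conv_same(shape: Shape, n_filters: int) -> Shape:
    """Convolution assumed to preserve height and width; only the depth changes."""
    h, w, _ = shape
    return (h, w, n_filters)


def pool(shape: Shape, k: int = 2) -> Shape:
    """k x k pooling with stride k: height and width shrink by a factor of k."""
    h, w, c = shape
    return (h // k, w // k, c)


def trace_shapes(input_shape: Shape, filters: List[int]) -> List[Tuple[str, Shape]]:
    """Return the output shape after every Conv/Pool pair, in order."""
    shape = input_shape
    trace = []
    for idx, n in enumerate(filters, start=1):
        shape = conv_same(shape, n)
        trace.append((f"Conv{idx}", shape))
        shape = pool(shape)
        trace.append((f"Pool{idx}", shape))
    return trace


trace = trace_shapes((64, 64, 1), [8, 16, 32, 64, 128])
for name, shape in trace:
    print(name, shape)

h, w, c = trace[-1][1]
flatten_length = h * w * c  # length of the flatten vector fed to FC1
print("Flatten", flatten_length)
```

Under these assumptions the final pooled volume is 2 * 2 * 128, which flattens to the 512 inputs listed for FC1.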
TABLE III
SHOWS THE CONFUSION MATRIX OF GROUP-LEVEL CLASSIFICATION OF THE CIRCUIT COMPONENTS
TABLE IV
SHOWS THE CONFUSION MATRIX WHEN CIRCUIT COMPONENTS OF GROUP 1 ARE CLASSIFIED
TABLE V
SHOWS THE CONFUSION MATRIX OF THE CIRCUIT COMPONENTS BELONGING TO GROUP 3
           AC source  Ammeter  Voltmeter
AC source  44         0        1
Ammeter    2          43       0
Voltmeter  3          4        37
Fig. 6. Overall confusion matrix obtained by applying the present two-stage CNN-based model while classifying the 20 different hand-drawn circuit components. Intra-group misclassifications of the first three groups (Group 1: yellow colour, Group 2: green colour and Group 3: light blue colour) are marked therein (color figure online)
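Per-class recall, precision, specificity and F1-score follow directly from the counts of a confusion matrix. As an illustration, the sketch below applies the usual one-vs-rest definitions to the Group 3 counts of Table V. It is a hedged example, not the authors' evaluation script: the function name is hypothetical, and rows are assumed to be true classes and columns predicted classes, which the table itself does not state.

```python
from typing import Dict, List

LABELS = ["AC source", "Ammeter", "Voltmeter"]
# Counts copied from Table V; rows assumed to be true classes, columns predictions.
GROUP3 = [
    [44, 0, 1],
    [2, 43, 0],
    [3, 4, 37],
]


def per_class_metrics(cm: List[List[int]]) -> List[Dict[str, float]]:
    """One-vs-rest metrics per class (assumes no empty row or column)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    out = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # class-k samples predicted as another class
        fp = sum(cm[r][k] for r in range(n)) - tp  # other classes predicted as class k
        tn = total - tp - fn - fp
        recall = tp / (tp + fn)
        precision = tp / (tp + fp)
        out.append({
            "recall": recall,
            "precision": precision,
            "specificity": tn / (tn + fp),
            "f1": 2 * precision * recall / (precision + recall),
        })
    return out


metrics = per_class_metrics(GROUP3)
for label, m in zip(LABELS, metrics):
    print(label, {name: round(value, 4) for name, value in m.items()})
```

If the matrix is in fact stored with predictions along the rows, recall and precision simply swap roles; the F1-score is unaffected.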
G. Tuning of parameters
As no work has been done before in this particular domain using a deep learning model, a comparative analysis with other deep learning models that have been used to solve a similar problem cannot be performed. However, in this section, more experiments are performed by fitting different CNN architectures in the proposed work. In this context, it is to be mentioned that, considering the CNN model described in Table 2 as the base model, many more models can be obtained by removing/adding convolutional layers from/to a standard CNN model. However, for simplicity, two such variations are tried, which are described in the following subsections.
1) Performance by changing the pooling type: In the customized CNN model, the avg-pool strategy is used. Therefore, in this experiment, avg-pool is substituted by the max-pool strategy
Fig. 7. Comparative recognition accuracy for each circuit component using single-stage and two-stage classification models on the present dataset
3) Performances with varying number of epochs: In the third experiment, the number of epochs is varied to check
TABLE VI
SHOWS THE CONFUSION MATRIX OF THE CIRCUIT COMPONENTS BELONGING TO GROUP 4
Inductor           0  0  0  45  0   0   0   0   0   0   0
NOT Gate           0  0  0  1   43  1   0   0   0   0   0
PN Junction Diode  0  0  0  0   0   44  0   0   1   0   0
Power Supply       0  0  0  0   0   0   45  0   0   0   0
Resistor           0  0  0  0   0   0   0   45  0   0   0
Switch             0  0  0  0   2   1   0   0   45  0   0
Transformer        0  0  0  0   0   0   0   0   0   45  0
Zener diode        0  0  0  0   0   2   0   0   1   0   42
Fig. 8. Comparative recall, precision, specificity and F1-scores for each circuit
component using single-stage and two-stage classification models
the characteristics of the models that can be constructed using the model descriptions of Sects. 5.7.1 and 5.7.2. In total, 6 different models have been constructed. The comparative test accuracies at different iterations of all these models are shown in Figs. 13, 14, 15, 16, 17. In these figures, a model "Conv X Y-pool" indicates that it uses X convolutional layers and the Y (avg/max) type of pooling strategy. From Figs. 14, 15, 16, 17, 18, it is observed that in the customized CNN model (i.e., 5 convolution layers with average pooling layers), the accuracy increases rapidly up to a certain number of epochs and then stabilizes at a constant value. Increasing the number of epochs further does not improve the results. So, from these results it can be concluded that the customized CNN used here is more consistent than its alternatives. Besides, these results show that the network using 5 convolutional layers and an average pooling strategy provides the best result in all the cases.
Fig. 9. Overall confusion matrix obtained by applying the present two-stage CNN-based classification model on the augmented dataset. Intra-group misclassifications of the first three groups (Group 1: yellow colour, Group 2: green colour and Group 3: light blue colour) are marked therein (color figure online)
H. Comparison with state-of-the-art methods
To compare the performance of the proposed model with state-of-the-art methods, more experiments are performed. For this reason, 12 different methods (among these, nine existing methods, i.e., Dewangan and Dhole [5], Rabbani et al. [19], Naika et al. [18], Liu and Xiao [16], Angadi and Naika [7], Moetesum et al. [9], Veena and LakshmanNaik [13], Patare and Joshi [14] and Roy et al. [20], and three standard deep learning models, namely AlexNet [40], ResNet [41] and MobileNet [42]) have been implemented.
These methods have been trained and tested on the same data (the train-test split is described in Sect. 5.1) for all the cases. The comparative accuracy results are shown in Table 9. This table also includes the performance of the present customized CNN when it is applied following a single-stage approach (i.e., no group-level classification performed), termed "single-stage CNN." Comparisons are also made in terms of average recall, precision, F1-score, accuracy and specificity. The same experiments have been repeated on the augmented dataset as well, and the comparative results are shown in Table 10. Here also,
Fig. 10. Comparative recognition accuracy for each circuit component using single-stage and two-stage classification models on the present augmented dataset
Fig. 13. Comparative results by varying the pooling type
TABLE VIII
COMPARISON OF THE PROPOSED METHOD WITH SOME STATE-OF-THE-ART METHODS AND SINGLE-STAGE CNN MODEL ON THE AUGMENTED DATASET
REFERENCES