Professional Documents
Culture Documents
a r t i c l e i n f o a b s t r a c t
Article history: Images of handwritten digits are different from natural images as the orientation of a digit, as well as sim-
Received 31 December 2019 ilarity of features of different digits, makes confusion. On the other hand, deep convolutional neural net-
Revised 20 February 2020 works are achieving huge success in computer vision problems, especially in image classification. Here,
Accepted 1 March 2020
we propose a task-oriented model called Bengali handwritten numeral digit recognition based on densely
Available online xxxx
connected convolutional neural networks (BDNet). BDNet is used to classify (recognize) Bengali hand-
written numeral digits. It is end-to-end trained using ISI Bengali handwritten numeral dataset. During
Keywords:
training, untraditional data preprocessing and augmentation techniques are used so that the trained
Bengali digit recognition
CNN
model works on a different dataset. The model has achieved the test accuracy of 99.78% (baseline was
Dataset 99.58%) on the test dataset of ISI Bengali handwritten numerals. So, the BDNet model gives 47.62% error
Deep learning reduction compared to previous state-of-the-art models. Here we have also created a dataset of 1000
Handwritten numerals images of Bengali handwritten numerals to test the trained model, and it giving promising results.
Image classification Codes, trained model and our own dataset are available at C :https://github.com/Sufianlab/BDNet.
Ó 2020 The Authors. Production and hosting by Elsevier B.V. on behalf of King Saud University. This is an
open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
https://doi.org/10.1016/j.jksuci.2020.03.002
1319-1578/Ó 2020 The Authors. Production and hosting by Elsevier B.V. on behalf of King Saud University.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Please cite this article as: A. Sufian, A. Ghosh, A. Naskar et al., BDNet: Bengali Handwritten Numeral Digit Recognition based on Densely connected Con-
volutional Neural Networks, Journal of King Saud University – Computer and Information Sciences, https://doi.org/10.1016/j.jksuci.2020.03.002
2 A. Sufian et al. / Journal of King Saud University – Computer and Information Sciences xxx (xxxx) xxx
Please cite this article as: A. Sufian, A. Ghosh, A. Naskar et al., BDNet: Bengali Handwritten Numeral Digit Recognition based on Densely connected Con-
volutional Neural Networks, Journal of King Saud University – Computer and Information Sciences, https://doi.org/10.1016/j.jksuci.2020.03.002
A. Sufian et al. / Journal of King Saud University – Computer and Information Sciences xxx (xxxx) xxx 3
model (Akhand et al., 2016), proposed by M. A. H. Akhand et al. recognition (Ghosh et al., 2019). Therefore, many successive
used pre-processing by using a simple rotation based approach state-of-the-art models came within a short span of time since
to produce patterns and it also makes all images of ISI handwritten 2012 (Sultana et al., 2018; Lim and Keles, 2018). In this subsection,
database into the same resolution, dimension, and size. CNN struc- we briefly reviewed the development of deep learning especially
ture of this model has two convolutional layers with 5 5 sized Convolutional Neural Networks in the field of image classifications.
local receptive fields and two sub-sampling layers with 2 2 sized CNN is a special type of multi-layer neural network inspired by the
local averaging areas along with input and output layers. The input vision mechanism of the animal (Ghosh et al., 2019). Hubel and
layer contains 784 receptive fields for 28 28 pixels image. The Wiesel experimented and said that visual cortex cells of animal
first convolutional operation produces six feature maps. Convolu- detect light in the small receptive field (Fukushima, 1980). Kuni-
tion operation with kernel spatial dimension of 5 reduces 28 spa- hiko Fukushima got motivation from this experiment and pro-
tial dimension to 24 (i.e., 28 þ 1 5) spatial dimension. posed a multi-layered neural network, called NEOCOGNITRON
Therefore, each first level feature map size is 24 24. The accuracy (Hubel and Wiesel, 1968), capable of recognizing visual patterns
rate of the testing is 98.45% on ISI handwritten Bengali numerals. hierarchically through learning. This model is considered as the
In Choudhury et al. (2018), A. Choudhury et al. proposed a his- inspiration for CNN. A classical CNN model is composed of one or
togram of oriented gradient (HOG) and color histogram for the more blocks of convolutional and sub-sampling or pooling layer,
selection of feature algorithms. Here, HOG is used as the feature then single or multiple fully connected layers, and an output layer
set to represent each numeral item at the feature space and SVM function as shown in Fig. 2. The benefits of using CNN are auto-
is used to produce the output from input. The test accuracy of this mated features extraction, parameter sharing and many more
algorithm is 98.05% on CMATERDB 3.1.1 dataset (which is a bench- (Lecun et al., 1998; Krizhevsky et al., 2012; Nanni et al., 2017).
mark Bengali handwritten numeral database created by CMATER Classical CNN is modified in many different ways according to tar-
lab of Jadavpur University, India). M. M. Hasan et al. proposed a get domain (Sultana et al., 2018; Liang et al., 2017; Baldominos
Bengali handwritten digit recognition model based on ResNet (He et al., 2018; Sultana et al., 2019). Yann LeCun et al. introduced
et al., 2016). Their ensemble model from their six best models, the first complete CNN model, called LeNet-5 (Lecun et al., 1998),
applied on the NumtaDB dataset (Alam, 2020), achieved to classify English handwritten digit images. It has 7 layers among
99.3359% test accuracy. In Noor et al. (2018) R. Noor et al. proposed which 3 convolutions, 2 average pooling, 1 fully connected, and 1
an ensemble model based Convolutional Neural Network for recog- output layer. They used SIGMOID function as the activation func-
nizing Bengali handwritten numerals. They train their model in tion for non-linearity before an average pooling layer. The output
many noisy conditions using customized NumtaDB dataset layer used Euclidean Radial Basis Function(RBF) for classification
(Alam, 2020). In all cases, their model achieved more than 96% test of MNIST (Leun, 2020) dataset. The weights of each layer were
accuracy on this NumtaDB dataset. A recent Bengali handwritten trained using the back-propagation algorithm (Rumelhart et al.,
numeral recognition work (Rabby et al., 2019) was there, proposed 1986). AlexNet (Krizhevsky et al., 2012) was the first CNN based
by AKM S. A. Rabby et.al. Here authors used a lightweight CNN model which won the ILSVRC challenge (Russakovsky et al.,
model to classify the handwritten numeral digits. This model is 2015) in 2012 with a significant reduction of error. AlexNet’s error
trained using ISI handwritten Bengali numeral (Bhattacharya and rate was 16.4% whereas the second best error rate was 26.17%. This
Chaudhuri, 2009) and CMATERDB 3.1.1 databases with 20% data model was proposed by Alex Krizhevsky et al. and it is trained by
for validation. Their work achieved validation and test accuracy using ImageNet dataset (Deng et al., 2009), this dataset contains
as 99.74% and 99.58% on ISI handwritten numerals, which is the 15 million high resolution labeled images over 22 thousand cate-
previous best accuracy on that dataset. Their model also shows gories. AlexNet has 11 trainable layers, and the structure is almost
good accuracy on other benchmark datasets such as CMATERDB similar to LeNet-5, but here max-pooling used instead of the
3.1.1 etc. M.S. Islam et.al proposed Bayanno-Net (Islam et al., average pooling, ReLU activation in place of the SIGMOID function,
2019), where they tried to recognize Bangla handwritten numerals softmax function in place of RBF, and 11 11 in-place of 5 5
using CNN. They worked on NumtaDB dataset (Alam, 2020). Their filter size in the first layer. In addition, for the first time dropout
model achieved 97% accuracy. A recent review article (Hoq et al., strategy, Srivastava et al. (2014) and GPU were used to train the
2020) written by M.N. Hoq et.al given a comparative overview of model. In Zeiler and Fergus (2013) Zeiler and Fergus presented
existing classification algorithms to recognized Bengali handwrit- ZFNet which was the winner of the ILSVRC challenge in 2013.
ten numerals. The building blocks of ZFNet is almost similar to AlexNet with
few changes such as the first layer filter size is 7 7 instead of
2.2. Advancements of deep learning for image classification 11 11 in AlexNet. Authors of ZFNet explained how CNN works
with the help of Deconvolutional Neural Networks (DeconvNet).
After the success of AlexNet (Krizhevsky et al., 2012), a deep DeconvNet is just the opposite of CNN. The error rate of ZFNet
learning-based model for image classification, many researchers was 11.7%. K. Simonyan and A. Zisserman proposed VGGNet
shifted to this area of research of computer vision and pattern (Simonyan and Zisserman, 2014), which is like a deeper version
Please cite this article as: A. Sufian, A. Ghosh, A. Naskar et al., BDNet: Bengali Handwritten Numeral Digit Recognition based on Densely connected Con-
volutional Neural Networks, Journal of King Saud University – Computer and Information Sciences, https://doi.org/10.1016/j.jksuci.2020.03.002
4 A. Sufian et al. / Journal of King Saud University – Computer and Information Sciences xxx (xxxx) xxx
of AlexNet. Here, authors used small filters 3 3 sizes for all layers. between channels. SENet has won the ILSVRC-2017 challenge with
They have used a total 6 different CNN configurations with differ- an error rate of 2.252%. Recently a new concept in convolutional
ent weight layers. This VGGNet secured 2nd place in ILSVRC chal- neural networks proposed through CapsNet (Sabour et al., 2017).
lenge in 2014 with an error rate of 7.3% just 0.6% more than the Here sub-sampling concepts are abolished and classification is pos-
error rate of the winner GoogLeNet (Szegedy et al., 2015). GoogLe- sible of images of different orientations. Though the algorithm is
Net, Going Deeper with Convolutions (Szegedy et al., 2015), is pro- not light for dynamic routing among capsules but it is extremely
posed by Christian Szegedy et al. which was a research team of promising (Patrick et al., 2019).
Google. The Structure of GoogLeNet is different from traditional
CNN, it is wider and deeper than previous models but computa-
tionally efficient. Through inception architecture, multiple parallel 3. BDNet Model Details
filters with different sizes are used, and for this, problems of van-
ishing gradient and over-fitting were tackled. Fully connected lay- The network architecture of the BDNet model has shown in Fig. 3.
ers are not used in GoogLeNet but the average pooling layer is used The BDNet consists of three Dense Blocks and two Transition Blocks
before the classifier. This model won the ILSVRC challenge 2014 followed by Batch Normalization(BN), Rectified Linear Unit(ReLU)
with an error rate of 6.7%. The increasing layer could give more activation, Average Pooling(Avg. POOLING), Fully Connected(FC)
accuracy but will suffer from vanishing gradient problem. To tackle Layer, Softmax Function with Output Layer. Each Dense Block is
this problem, Kaiming He et al. from Microsoft Research proposed made up of 6 bottleneck blocks. Structure of each bottleneck
ResNet (He et al., 2016). ResNet is a very deep model where each block is as follows: . . .! BatchNorm ! ReLU ! ConV2dð1 1Þ !
layer has a residual block with a skip connection to the layer before BatchNorm ! ReLU ! ConV2dð3 3Þ ! . . .. The number of bottle-
the previous layer. ResNet is the winner of the ILSVRC challenge neck blocks(NBL) per dense block has calculated using Eq. (1).
with an error rate of 3.57% and this is a success of beyond the
1 n4
human level. Gao Hunag et al. proposed DenseNet (Huang et al., NBL ¼ b c ð1Þ
2 3
2017), where every layer is connected to all previous layers of
the model. DenseNet overcomes the vanishing gradient problem where n is the number of layers of the network model. Dense connec-
as well as it collects required features of all layers and propagates tivity is present among bottleneck blocks of each dense block i.e. out-
to all successive layers in feed-forward fashions for features reuse. put of each bottleneck block is forwarded to all other successive
Therefore, this model requires less number of parameters to blocks for features propagation. The number of feature maps that will
achieve accuracy, so it is computationally efficient. Inspired by be forwarded depends on the growth rate, and here the growth rate is
the success of ResNet, Jie Hu et al. proposed SENet (Hu et al., 12. In between two dense blocks, we have used one transition block
2018) with the main focus to increase channel relationship which consists of: BatchNorm ! ReLU ! ConV: ! Av g:Polling. To
between successive layers. SENet has added ‘‘Squeeze-and-Excita make the model compact we reduce the number of feature maps.
tion” (SE) block into each block (ResNet Block), and for this, the Some hyper-parameters of the model has explained below and others
model adaptively re-calibrates channel-wise feature responses which are directly related to the training in Section 5.
Please cite this article as: A. Sufian, A. Ghosh, A. Naskar et al., BDNet: Bengali Handwritten Numeral Digit Recognition based on Densely connected Con-
volutional Neural Networks, Journal of King Saud University – Computer and Information Sciences, https://doi.org/10.1016/j.jksuci.2020.03.002
A. Sufian et al. / Journal of King Saud University – Computer and Information Sciences xxx (xxxx) xxx 5
Fig. 4. Typical Bengali handwritten numeral digits and corresponding printed values.
Number of hidden layers and units: It is preferably good to 4. Dataset and Preprocessing of the Dataset
add more layers when the test error is no longer decreasing in
existing layers. A small number of layers may lead to The Bengali language is mainly derived from the Brahmi script
under-fitting, on the other hand, having more layers is usually and Devanagari script in the 11th Century AD. The structural view
not suitable with appropriate regularization. But adding more lay- of each character and numeral of this language are very complex.
ers makes the model more complex and computation time will So, train a model using the Bengali digit is more difficult compared
increase. After careful experiments, we have used 39 hidden layers, to the English numeral digit as the English digits have a less com-
one fully connected(FC) layer in our model. Then we have used plex structure. In addition, English numerals datasets are easily
softmax function to output 10 classes. Model details are shown available in terms of quantity and quality such as MNIST (Leun,
in Fig. 3. Here, softmax function transforms predicted scores to 2020) but it is not easy for Bengali numeral datasets. Bengali digit
predicted probability scores using the following Eq. (2). also has some high similarity features for different numerals such
as numeral 1 (in Bengali) and numeral 9 (in Bengali) has high sim-
ezi
y^i ¼ ð2Þ ilarity features, similarly numeral 5 (in Bengali) and 6 (in Bengali)
X
10
has high similarity features. The typical Bengali handwritten
e zj
j¼1
numerals and corresponding printed values as shown in Fig. 4.
where y^i denotes prediction score of i-th digit or class. 4.1. Used dataset
Optimizer: BDNet trained through back-propagation
(Rumelhart et al., 1986) using optimizer. Here, weights updated The ISI Bengali numeral off-line handwritten dataset
using SGD (Stochastic Gradient Descent) (Bottou et al., 2010). The (Bhattacharya and Chaudhuri, 2009) is one of the largest popular
mathematical loss function JðwÞ as in Eq. (3) and it is basically datasets of handwritten Bengali numerals. This dataset consists
the difference between the updated internal parameters of a model of 23392 black and white image data written by 1106 persons col-
which are used for computing the values (yi ) from the set of inputs lected from postal mail and job application forms. Among these
(xi ) used in the model and the desired output (y^i ). 23392 data, 19392 are training data and 4000 are testing data.
The entire dataset represents 10 classes for numeral 0 to 9. Some
X
n X
n
typical data items of this dataset shown in Fig. 5.
ðyi y^i Þ
2
JðwÞ ¼ 1n J i ðwÞ ¼ 1n ð3Þ
i¼1 i¼1
4.2. Preprocessing of the datasets
Here x denotes the value of a pixel. ReLU removes negative values 4.3. Own dataset
from an activation map by setting them to zero. It increases the
nonlinear properties of the decision function and removes the We have also created our own dataset of 1000 images. Among
chances of vanishing gradient of BDNet without affecting the recep- these 1000 images, 100 images per digit are there for each Bengali
tive fields of the convolution layer. numeral digit from digits zero to nine. This dataset is created by 5
Please cite this article as: A. Sufian, A. Ghosh, A. Naskar et al., BDNet: Bengali Handwritten Numeral Digit Recognition based on Densely connected Con-
volutional Neural Networks, Journal of King Saud University – Computer and Information Sciences, https://doi.org/10.1016/j.jksuci.2020.03.002
6 A. Sufian et al. / Journal of King Saud University – Computer and Information Sciences xxx (xxxx) xxx
Table 1
Used Dataset Distribution.
5. Training Details
Fig. 5. Sample handwritten ISI numeral image data.
Table 2
Used system specifications.
Resources Specifications
CPU R
Intel XeonR CPU @2:3GHz with 45 MB Cache
RAM 12.72 GB available
DISK 1 TB (Partially used)
GPU 1Nvidia Tesla T4 having 2560 CUDA cores,
16 GB(14.72 GB available) GDDR6 VRAM
Languages & Packages Python with PyTorch (Paszke et al., 2017)
Training & Validation Time 37.68 Seconds per Epoch
Fig. 6. Pre-processing steps.
Please cite this article as: A. Sufian, A. Ghosh, A. Naskar et al., BDNet: Bengali Handwritten Numeral Digit Recognition based on Densely connected Con-
volutional Neural Networks, Journal of King Saud University – Computer and Information Sciences, https://doi.org/10.1016/j.jksuci.2020.03.002
A. Sufian et al. / Journal of King Saud University – Computer and Information Sciences xxx (xxxx) xxx 7
Please cite this article as: A. Sufian, A. Ghosh, A. Naskar et al., BDNet: Bengali Handwritten Numeral Digit Recognition based on Densely connected Con-
volutional Neural Networks, Journal of King Saud University – Computer and Information Sciences, https://doi.org/10.1016/j.jksuci.2020.03.002
8 A. Sufian et al. / Journal of King Saud University – Computer and Information Sciences xxx (xxxx) xxx
current best result on the ISI handwritten numeral dataset. Details A total of 9 images are wrongly classified among all the test
of the testing results described in subSection 6.4. images of the dataset which are shown in Table 4. In the confusion
matrix, it has shown that which wrongly predicted image is classi-
fied into which class. After careful observation of the patterns of
6.4. Analysis through confusion matrix these 9 images, the possible reasons behind these wrong classifica-
tion could be understood.
Testing result of the BDNet on the test dataset of ISI handwrit-
ten Bengali numeral can be presented in the confusion matrix as 6.5. Standard deviation of test results
shown in Table 3. Clearly, we can see that among 10 numeral digits
of 4000 test images, only 9 were wrongly predicted or classified. To find how spread out the test accuracy in case of training the
Among 400 test images of the digit 0, one is incorrectly predicted model multiple times with the same hyper-parameter configura-
as 3, and one is 7. Similarly, 3 images of the digit 1, 1 image of tions, we calculate the standard deviation (r) by using Eq. (7).
the digit 2, 2 images of the digit 5, and 1 image of the digit 9 were vffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
u N
predicted incorrectly. The details of wrong predictions are shown u1 X
in the confusion matrix in Table 3. The F1 score for this confusion
r¼t ðxi lÞ2 ð7Þ
N i¼1
matrix is 0.99775.
Please cite this article as: A. Sufian, A. Ghosh, A. Naskar et al., BDNet: Bengali Handwritten Numeral Digit Recognition based on Densely connected Con-
volutional Neural Networks, Journal of King Saud University – Computer and Information Sciences, https://doi.org/10.1016/j.jksuci.2020.03.002
A. Sufian et al. / Journal of King Saud University – Computer and Information Sciences xxx (xxxx) xxx 9
Table 3
Confusion matrix of the test result of the ISI handwritten Bengali numerals test dataset.
Predicted Class
Numerals 0 1 2 3 4 5 6 7 8 9 Accuracy(%)
Actual Class 0 398 0 0 1 0 0 0 1 0 0 99.50
1 0 397 1 0 0 0 0 0 0 2 99.25
2 0 0 399 0 0 1 0 0 0 0 99.75
3 0 0 0 400 0 0 0 0 0 0 100.00
4 0 0 0 0 400 0 0 0 0 0 100.00
5 0 0 0 0 1 398 1 0 0 0 99.50
6 0 0 0 0 0 0 400 0 0 0 100.00
7 0 0 0 0 0 0 0 400 0 0 100.00
8 0 0 0 0 0 0 0 0 400 0 100.00
9 0 0 0 0 0 1 0 0 0 399 99.75
Table 4 Table 6
Wrongly Classified Images. Notable Bengali handwritten numerals recognition models and corresponding test
accuracy in the benchmark dataset (Bhattacharya and Chaudhuri, 2009).
Digit Images Actual Class Predicted Class
0 3 Models Test accuracy
5 6
Please cite this article as: A. Sufian, A. Ghosh, A. Naskar et al., BDNet: Bengali Handwritten Numeral Digit Recognition based on Densely connected Con-
volutional Neural Networks, Journal of King Saud University – Computer and Information Sciences, https://doi.org/10.1016/j.jksuci.2020.03.002
10 A. Sufian et al. / Journal of King Saud University – Computer and Information Sciences xxx (xxxx) xxx
Table 7
Confusion matrix of the test result on own test dataset.
Predicted Class
Numerals 0 1 2 3 4 5 6 7 8 9 Accuracy(%)
Actual Class 0 100 0 0 0 0 0 0 0 0 0 100
1 0 100 0 0 0 0 0 0 0 0 100
2 0 0 100 0 0 0 0 0 0 0 100
3 0 0 0 98 0 0 1 0 1 0 98
4 0 0 0 0 100 0 0 0 0 0 100
5 1 0 0 0 0 95 3 1 0 0 95
6 0 0 0 0 0 0 100 0 0 0 100
7 0 0 1 1 0 0 0 98 0 0 98
8 0 0 1 0 0 0 1 0 98 0 98
9 0 1 0 0 0 0 0 0 0 99 99
which are also worked on this benchmark dataset (Bhattacharya tion through image classification. The BDNet is trained using ISI
and Chaudhuri, 2009). Before BDNet, the previous two best models Bengali handwritten numerals dataset and the trained model has
(Liu and Suen, 2009; Rabby et al., 2019) achieved the test accuracy achieved a new benchmark accuracy on a test dataset. The trained
of 99.40% and 99.58% (authors of Rabby et al., 2019 claimed it) BDNet also tested using our own dataset to see the generalization
respectively whereas BDNet achieved 99.78%. All the notable mod- of the model, where a good result has been found.
els and corresponding test accuracies are shown in Table 6. Graph-
ical comparison of said models has shown in Fig. 11 where X-axis Declaration of Competing Interest
presents the models and Y-axis showed the corresponding test
accuracy in the benchmark dataset (Bhattacharya and Chaudhuri, The authors declare that they have no known competing finan-
2009). cial interests or personal relationships that could have appeared
to influence the work reported in this paper.
6.7. Test result on our own dataset
Acknowledgment
Our own test dataset is described in subSection 4.3 which has
First of all, the authors of the BDNet are thankful to CVPR Unit,
10 classes with 100 images per class. This test dataset is not used
Indian Statistical Institute, Kolkata, for providing the dataset for
during training for evaluating the generalization capability of the
academic research. The authors also thankful to the editor and
trained BDNet on a new dataset. As a result, the BDNet has got
reviewers of this journal for mentioned some comments on initial
98.80% test accuracy. The entire result has been shown in the con-
submission to implement in the revised version.
fusion matrix mentioned in Table 7.
7. Conclusion
References
Aim of the work was to propose a practical task-oriented model
to recognize Bengali handwritten numerals with good accuracy, Akhand, M.A.H., Ahmed, M., Rahman, M.M.H., 2016. Convolutional neural network
based handwritten bengali and bengali-english mixed numeral recognition. I.J.
and through this paper, we propose BDNet. BDNet is a densely con- Image Graph. Signal Process. 9. https://doi.org/10.5815/ijigsp.2016.09.06. 40–
nected deep CNN model for handwritten Bengali numeral recogni- 40.
Please cite this article as: A. Sufian, A. Ghosh, A. Naskar et al., BDNet: Bengali Handwritten Numeral Digit Recognition based on Densely connected Con-
volutional Neural Networks, Journal of King Saud University – Computer and Information Sciences, https://doi.org/10.1016/j.jksuci.2020.03.002
A. Sufian et al. / Journal of King Saud University – Computer and Information Sciences xxx (xxxx) xxx 11
Alam, S., Reasat, T., Doha, R.M., Humayun, A.I. Numtadb-assembled bengali Liu, C.-L., Suen, C.Y., 2009. A new benchmark on the recognition of handwritten
handwritten digits. URL:http://arxiv.org/abs/1806.02452. bangla and farsi numeral characters. Pattern Recogn. 42 (12), 3287–3295.
Bag, S., Harit, G., 2013. A survey on optical character recognition for bangla and https://doi.org/10.1016/j.patcog.2008.10.007.
devanagari scripts. Sadhana 38 (1), 133–168. https://doi.org/10.1007/s12046- Maas, A.L., Hannun, A.Y., Ng, A.Y., 2013. Rectifier nonlinearities improve neural
013-0121-9. network acoustic models. In: Proceedings of the 30th International Conference
Baldominos, A., Saez, Y., Isasi, P., 2018. Evolutionary convolutional neural networks: on Machine Learning. Atlanta, Georgia, USA.
an application to handwriting recognition. Neurocomputing 283, 38–52. Majumder, P., Mitra, M., Parui, S.K., Bhattacharyya, P., 2007. Initiative for indian
https://doi.org/10.1016/j.neucom.2017.12.049. language ir evaluation. In: The First International Workshop on Evaluating
Basu, S., Sarkar, R., Das, N., Kundu, M., Nasipuri, M., Basu, D.K., 2005. Handwritten Information Access (EVIA), pp. 14–16.
bangla digit recognition using classifier combination through ds technique. In: Nanni, L., Ghidoni, S., Brahnam, S., 2017. Handcrafted vs. non-handcrafted features
Pattern Recognition and Machine Intelligence, pp. 236–241. for computer vision classification. Pattern Recogn. 71, 158–172. https://doi.org/
Bhattacharya, U., Chaudhuri, B.B., 2009. Handwritten numeral databases of indian 10.1016/j.patcog.2017.05.025.
scripts and multistage recognition of mixed numerals. IEEE Trans. Pattern Anal. Nasir, M.K., Uddin, M.S., 2013. Hand written bangla numerals recognition for
Mach. Intell. 31 (3), 444–457. https://doi.org/10.1109/TPAMI.2008.88. automated postal system.https://doi.org/10.9790/0661-0864348.
Bottou, L., 2010. Large-scale machine learning with stochastic gradient descent. In: Noor, R., Mejbaul Islam, K., Rahimi, M.J., 2018. Handwritten bangla numeral
Lechevallier, Y., Saporta, G. (Eds.), Proceedings of COMPSTAT’2010, Physica- recognition using ensembling of convolutional neural network. In: 21st ICCIT.
Verlag HD, Heidelberg. pp. 177–186. pp. 1–6.https://doi.org/10.1109/ICCITECHN.2018.8631944.
Choudhury, A., Rana, H.S., Bhowmik, T., 2018. Handwritten bengali numeral Pal, U., Chaudhuri, B.B., 2000. Automatic recognition of unconstrained off-line
recognition using hog based feature extraction algorithm. In: 2018 5th bangla handwritten numerals. In: Tan, T., Shi, Y., Gao, W. (Eds.), Advances in
International Conference on Signal Processing and Integrated Networks Multimodal Interfaces—ICMI 2000. Springer, Berlin Heidelberg, Berlin,
(SPIN), pp. 687–690. https://doi.org/10.1109/SPIN.2018.8474215. Heidelberg, pp. 371–378. https://doi.org/10.1007/3-540-40063-X_49.
Das, N., Sarkar, R., Basu, S., Kundu, M., Nasipuri, M., Basu, D.K., 2012. A genetic Pal, U., Chaudhuri, B.B., Belaid, A., 2006. A complete system for bangla handwritten
algorithm based region sampling for selection of local features in handwritten numeral recognition. IETE J. Res. 52 (1), 27–34. https://doi.org/10.1080/
digit recognition application. Appl. Soft Comput. 12 (5), 1592–1606. https://doi. 03772063.2006.11416437.
org/10.1016/j.asoc.2011.11.030. Pal, U., Jayadevan, R., Sharma, N., 2012. Handwriting recognition in indian regional
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L., 2009. Imagenet: a large-scale scripts: a survey of offline techniques 11(1), 1:1–1:35.https://doi.org/10.1145/
hierarchical image database. In: The IEEE Conference on Computer Vision and 2090176.2090177.
Pattern Recognition (CVPR). Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison,
Dutta, A., Chaudhury, S., 1993. Bengali alpha-numeric character recognition using A., Antiga, L., Lerer, A., 2017. Automatic differentiation in pytorch. In: 31st
curvature features. Pattern Recogn. 26 (12), 1757–1770. https://doi.org/ Conference on Neural Information Processing Systems (NIPS 2017). Long Beach,
10.1016/0031-3203(93)90174-U. CA, USA.
Fukushima, K., 1980. Neocognitron: a self-organizing neural network model for a Patrick, M.K., Adekoya, A.F., Mighty, A.A., Edward, B.Y. Capsule networks a survey. J.
mechanism of pattern recognition unaffected by shift in position. Biol. Cybern. King Saud Univ. Comput. Inf. Sci.https://doi.org/10.1016/j.jksuci.2019.09.014.
36 (4), 193–202. Rabby, A.S.A., Abujar, S., Haque, S., Hossain, S.A., 2019. Bangla handwritten digit
Ghosh, A., Sufian, A., Sultana, F., Chakrabarti, A., De, D., 2019. Fundamental concepts recognition using convolutional neural network. In: Abraham, A., Dutta, P.,
of convolutional neural network. In: et al., Recent Trends and Advances in Mandal, J.K., Bhattacharya, A., Dutta, S. (Eds.), Emerging Technologies in Data
Artificial Intelligence and Internet of Things. Springer, Cham. doi: 10.1007/978- Mining and Information Security. Springer, Singapore, pp. 111–122.
3-030-32644-9_36. Rahman, M.M., Akhand, M.A.H., Islam, S., Shill, P.C., Rahman, M.M.H., 2015. Bangla
Gu, J., Wang, Z., Kuen, J., Ma, L., Shahroudy, A., Shuai, B., Liu, T., Wang, X., Wang, G., handwritten character recognition using convolutional neural network. I.J.
Cai, J., Chen, T., 2018. Recent advances in convolutional neural networks. Image Graph. Signal Process. 8, 42–49. https://doi.org/10.5815/
Pattern Recogn. 77, 354–377. https://doi.org/10.1016/j.patcog.2017.10.013. ijigsp.2015.08.05.
He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residual learning for image recognition. Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1986. Learning representations by back-
In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR). propagating errors. Nature 323. https://doi.org/10.1038/323533a0. 533–436.
Hoq, M.N., Islam, M.M., Nipa, N.A., Akbar, M.M., 2020. A comparative overview of Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy,
classification algorithm for bangla handwritten digit recognition. In: A., Khosla, A., Bernstein, M., Berg, A.C., Fei-Fei, L., 2015. Imagenet large scale
Proceedings of International Joint Conference on Computational Intelligence, visual recognition challenge. Int. J. Comput. Vision 115 (3), 211–252. https://
pp. 265–277. doi.org/10.1007/s11263-015-0816-y.
Huang, G., Liu, Z., van der Maaten, L., Weinberger, K.Q., 2017. Densely connected Sabour, S., Frosst, N., Hinton, G.E., 2017. Dynamic routing between capsules. In:
convolutional networks. In: The IEEE Conference on Computer Vision and Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S.,
Pattern Recognition (CVPR). Garnett, R. (Eds.), Advances in Neural Information Processing Systems 30.
Hubel, D.H., Wiesel, T.N., 1968. Receptive fields and functional architecture of Curran Associates Inc, pp. 3856–3866.
monkey striate cortex. J. Physiol. (London) 195, 215–243. https://doi.org/ Shopon, M., Mohammed, N., Abedin, M.A., 2017. Image augmentation by blocky
10.1007/BF00344251. artifact in deep convolutional neural network for handwritten digit recognition.
Hu, J., Shen, L., Sun, G., 2018. Squeeze-and-excitation networks. In: The IEEE In: 2017 IEEE International Conference on Imaging, Vision Pattern Recognition
Conference on Computer Vision and Pattern Recognition (CVPR). (icIVPR). pp. 1–6.
Islam, M.S., Foysal, M.F.A., Noori, S.R.H., 2019. Bayanno-net: Bangla handwritten Simonyan, K., Zisserman, A. Very deep convolutional networks for large-scale image
digit recognition using convolutional neural networks. 2019 IEEE Region 10 recognition, CoRR abs/1409.1556. URL:http://arxiv.org/abs/1409.1556.
Symposium (TENSYMP), 23–27. https://doi.org/10.1109/ Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R., 2014.
TENSYMP46218.2019.8971167. Dropout: a simple way to prevent neural networks from overfitting. J. Mach.
Krizhevsky, A., Sutskever, I., Hinton, G.E., 2012. Imagenet classification with deep Learn. Res. 15, 1929–1958.
convolutional neural networks. In: Pereira, F., Burges, C.J.C., Bottou, L., Sultana, F., Sufian, A., Dutta, P., 2018. Advancements in image classification using
Weinberger, K.Q. (Eds.), Advances in Neural Information Processing Systems convolutional neural network. In: 2018 Fourth International Conference on
25. Curran Associates Inc, pp. 1097–1105. Research in Computational Intelligence and Communication Networks
Lecun, Y., Bottou, L., Bengio, Y., Haffner, P., 1998. Gradient-based learning applied to (ICRCICN), pp. 122–129. https://doi.org/10.1109/ICRCICN.2018.8718718.
document recognition. Proceedings of the IEEE, vol. 86, pp. 2278–2324. https:// Sultana, F., Sufian, A., Dutta, P. A review of object detection models based on
doi.org/10.1109/5.726791. convolutional neural network, CoRR abs/1905.01614. URL:http://arxiv.org/abs/
LeCun, Y., Bengio, Y., Hinton, G., 2015. Deep learning. Nature 521, 436–444. https:// 1905.01614.
doi.org/10.1038/nature14539. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke,
Leun, Y. The mnist database of handwritten digits. URL:https://ci.nii.ac.jp/naid/ V., Rabinovich, A., 2015. Going deeper with convolutions. Comput. Vis. Pattern
10027939599/en/. Recogn. URL:http://arxiv.org/abs/1409.4842.
Liang, T., Xu, X., Xiao, P., 2017. A new image classification method based on Wen, Y., He, L., 2012. A classifier for bangla handwritten numeral recognition.
modified condensed nearest neighbor and convolutional neural networks. Expert Syst. Appl. 39 (1), 948–953. https://doi.org/10.1016/j.eswa.2011.07.092.
Pattern Recogn. Lett. 94, 105–111. https://doi.org/10.1016/j.patrec.2017.05.019. Zeiler, M.D., Fergus, R. Visualizing and understanding convolutional networks. CoRR
Lim, L.A., Keles, H.Y., 2018. Foreground segmentation using convolutional neural abs/1311.2901.
networks for multiscale feature encoding. Pattern Recogn. Lett. 112, 256–262.
https://doi.org/10.1016/j.patrec.2018.08.002.
Please cite this article as: A. Sufian, A. Ghosh, A. Naskar et al., BDNet: Bengali Handwritten Numeral Digit Recognition based on Densely connected Con-
volutional Neural Networks, Journal of King Saud University – Computer and Information Sciences, https://doi.org/10.1016/j.jksuci.2020.03.002