Professional Documents
Culture Documents
Ahmad Et Al. - 2023
Ahmad Et Al. - 2023
https://doi.org/10.1007/s00521-023-08200-0(0123456789().,-volV)
(0123456789().,-volV)
ORIGINAL ARTICLE
Received: 28 January 2022 / Accepted: 6 January 2023 / Published online: 24 January 2023
© The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2023
Abstract
The new COVID-19 emerged in a town in China named Wuhan in December 2019, and since then, this deadly virus has
infected 324 million people worldwide and caused 5.53 million deaths by January 2022. Because of the rapid spread of this
pandemic, different countries are facing the problem of a shortage of resources, such as medical test kits and ventilators, as
the number of cases increased uncontrollably. Therefore, developing a readily available, low-priced, and automated approach
for COVID-19 identification is the need of the hour. The proposed study uses chest radiography images (CRIs) such as
X-rays and computed tomography (CTs) to detect chest infections, as these modalities contain important information about
chest infections. This research introduces a novel hybrid deep learning model named Lightweight ResGRU that uses residual
blocks and a bidirectional gated recurrent unit to diagnose non-COVID and COVID-19 infections using pre-processed CRIs.
Lightweight ResGRU is used for multi-modal two-class classification (normal and COVID-19), three-class classification
(normal, COVID-19, and viral pneumonia), four-class classification (normal, COVID-19, viral pneumonia, and bacterial
pneumonia), and COVID-19 severity types’ classification (i.e., atypical appearance, indeterminate appearance, typical
appearance, and negative for pneumonia). The proposed architecture achieved f-measure of 99.0%, 98.4%, 91.0%, and
80.5% for two-class, three-class, four-class, and COVID-19 severity level classifications, respectively, on unseen data. A
large dataset is created by combining and changing different publicly available datasets. The results prove that radiologists
can adopt this method to screen chest infections where test kits are limited.
Keywords SARS-CoV-2 · COVID-19 · Deep learning · Chest radiography images · Residual blocks · COVID-19 severity
levels
1 Introduction
123
9638 Neural Computing and Applications (2023) 35:9637–9655
123
Neural Computing and Applications (2023) 35:9637–9655 9639
features. Using these learned patterns, it builds a model that ● A novel and hybrid model is showcased that combines
takes input and classifies data as accurately as possible. the properties of residual blocks and bidirectional gated
This research paper presents a novel and hybrid archi- recurrent unit (Bi-GRU) for the early detection of chest
tecture named Lightweight ResGRU. The proposed archi- infection with low false-negative rate and false-positive
tecture uses residual blocks (RB) based on 2-dimensional rate.
CNN, pooling layers, dense layers, batch normalization ● A lightweight approach is proposed in which skip
layers, and a bidirectional gated recurrent unit (Bi-GRU) for
connections are used to jump over some network layers,
the prediction of non-COVID chest infections (i.e., bacterial
minimizing the number of parameters and making the
and viral) and COVID-19 infection and its different severity
model lightweight.
levels. The residual block is the building block in the pro-
posed architecture because it speeds up model training and ● A larger dataset is curated and used for model training to
propagates lower-level features via skip connections [18]. improve generalizability and prevent the problem of
Bi-GRU is a sequential processing model, a type of recur- model over-fitting. An external cohort/cross-dataset is
rent neural network (RNN). It is used in the architecture to also used to assess the performance of the suggested
compute the dependencies and connections between input model on an external dataset obtained from a different
data features [19], which improves classification accuracy. It source.
also facilitates addressing the issue of intra-class similarities ● Multi-modal CRIs (X-Rays and CTs) are used to train
(i.e., similarities between images of different classes), as it the model for binary classification (normal and COVID-
keeps the information of previous and forward images of a 19). Thus, radiologists and medical experts can use
current image in the data and predicts a class based on this either X-ray or CT to screen for COVID-19 pneumonia.
information. The classification is categorized in terms of ● The suggested model predicts the distinct severity levels
two-class multi-modal classification (normal and COVID- of COVID-19 (i.e., negative for pneumonia, atypical,
19), which uses X-rays and CTs for the training of the indeterminate, and typical) proposed by RSNA, which,
model, three-class classification (normal, COVID-19, and to the best of our knowledge, has not been published by
viral pneumonia), four-class classification (normal, COVID- any research study.
19, viral pneumonia, and bacterial pneumonia), and SARS-
CoV-2 severity levels’ classification (negative for pneu- The rest of this article is divided into the following
monia, atypical, indeterminate, and typical). Lightweight sections: After the introduction section, Sect. 2 will outline
ResGRU uses different benchmark datasets [20]–[23] that a literature survey of existing studies. Section 3 describes
contain X-rays and CTs of healthy and non-healthy (patients the datasets’ details and the proposed system's complete
suffering from chest infections) for training, validation, architecture. Experimental setups are described in Sect. 4.
testing, and external cohort validation (cross-dataset eval- The results of the proposed study are presented in Sect. 5.
uation). Accuracy, precision, sensitivity, specificity, f-mea- Section 6 displays the model's performance on the cross-
sure, false-negative rate (FNR), false-positive rate (FPR), dataset. Section 7 discusses and analyzes Lightweight
and confusion matrix are used to evaluate the proposed ResGRU and its comparison with existing research studies.
system. The links to datasets and source codes are available Finally, the conclusion of the research is presented in
at https://github.com/Mughees-Ahmad/Lightweight_ Sect. 8.
ResGRU. The following are the main contributions of the
paper:
123
9640 Neural Computing and Applications (2023) 35:9637–9655
123
Neural Computing and Applications (2023) 35:9637–9655 9641
Most researchers have focused on two-class (normal and 3.1 Dataset creation
COVID-19) and three-class (normal, COVID-19, and
pneumonia) classifications. While less concentrated on the In this research study, we started by generating a large
four-class (normal, COVID-19 pneumonia, bacterial pneu- dataset on COVID-19 prediction by using different standard
monia, and viral pneumonia) classification. Researchers datasets that are publicly available. The dataset is arranged
have used a limited and small amount of data for network by integrating and altering four different publicly available
training, which can lead to the over-fitting of the model benchmark repositories on the internet. These different
[36]. Furthermore, most researchers used a two-split dataset repositories named Chest Radiography Database
(training and validation sets) approach for model training contains (normal: 10,192, COVID-19: 3,616, viral pneu-
and testing, which may lead to biased high accuracy results monia: 1345, and bacterial pneumonia: 6012) X-ray images
[37]. Most of them did not use multi-modalities (X-rays and [21], “SARS-CoV-2 Ct-Scan” dataset consists (nor-
CTs) for model training and inference in their studies. mal: 1229, and COVID-19: 1252) CT images [23], and a
Hence, the limitations of the existing advanced studies have “Curated Dataset for COVID-19 Posterior-Anterior Chest
encouraged us to develop an efficient technique for Radiography Images (X-Rays)” which is used as an external
detecting different types of chest infections, including cohort [38] to test the model performance on an external
SARS-CoV-2. The study aims to differentiate between source dataset comprises (normal: 3270, COVID-19: 1281,
different severity levels of COVID-19 (negative for pneu- viral pneumonia: 1656, and bacterial pneumonia: 3001)
monia, atypical, indeterminate, and typical), which, to the X-ray images [20], SIIM-FISABIO-RSNA COVID-19
best of our knowledge, has not been published yet. The Detection contains (atypical appearance: 483, indeterminate
proposed method aims to provide a hybrid deep learning appearance: 1108, typical appearance: 3007, and negative
(DL) method based on multi-modal images that can be a for pneumonia: 1736) X-ray images [22].
cost-effective and novel prediction mechanism for this After combining the datasets, the combined dataset is
virus. This study can distinguish between COVID-19 chest divided into four datasets i.e., a two-class classification
infection and other lung infections (bacterial or viral), and it dataset (normal and COVID-19) that contains multi-modal
uses a large dataset for model training to enhance the CRIs (X-rays and CTs), a three-class classification dataset
generalizability of the system. Therefore, the proposed (normal, COVID-19, and viral pneumonia), a four-class
model can help doctors to determine the type of chest classification dataset, and a four-class severity level classi-
infection more accurately and quickly without requiring fication (negative for pneumonia, atypical, indeterminate,
extensive physical examinations. and typical).
123
9642 Neural Computing and Applications (2023) 35:9637–9655
Table 1 Images for four-class classification Table 2 Images for three-class classification
Class No. of Images Class No. of Images
123
Neural Computing and Applications (2023) 35:9637–9655 9643
Table 3 The number of images for two-class (Multi-Modal) Table 4 SIIM-FISABIO-RSNA augmented dataset
classification
Class No. of Images
Class X-Rays CTs No. of Images
Typical appearance 3007
Normal 10,192 1229 11,421 Indeterminate appearance 2800
COVID-19 3616 1252 4868 Atypical appearance 2694
Total 16,289 Negative for pneumonia 2678
Total 11,179
Normal 3270
3.2 External cohort: for external/cross-dataset
COVID-19 1281
validation
Viral pneumonia 1656
Bacterial pneumonia 3001
The model's performance is evaluated against an external
Total 9208
cohort to reduce the possibility of biased performance
toward the training data [38]. The “Curated Dataset for
COVID-19 Posterior-Anterior Chest Radiography Images
(X-Rays)” was used as an external cohort [38] to check the model simpler. Second, normalization is used in deep
generalizability of the model on an external dataset (normal: learning architectures to stabilize model training while
3270, COVID-19: 1281, viral pneumonia: 1656, and bac- accelerating training [41]. The image pixel values have been
terial pneumonia: 3001) [20]. A subset (i.e., 2700 random adjusted to the range [0–1] from [0–255] by dividing every
images) of this dataset is used as a cross-dataset to assess pixel by 255. The images used in this study are grayscale.
the performance of the proposed model. The detail of the
dataset is described in Table 5. 3.3.2 Data augmentation
123
9644 Neural Computing and Applications (2023) 35:9637–9655
123
Neural Computing and Applications (2023) 35:9637–9655 9645
3.4 Lightweight ResGRU: our proposed model block, a direct link skips 2–3 layers in-between in a DNN,
as shown in Fig. 7.
In this section, the proposed hybrid Lightweight ResGRU Let H(x) denotes the residual mapping where x is the
for the detection of COVID-19 and non-COVID-19 infec- input. Moreover, the stacked nonlinear layers use another
tions by using CRIs is described. The proposed model function, F(x):=H(x)−x. Then the residual mapping is
contains different components: residual block, batch nor- calculated as F(x)?x. Residual mapping H(x) can be easily
malization layer, pooling layer, flatten layer, dense layer, optimized compared to the original mapping F(x). A neural
time-distributed layer, Bi-GRU, and activation functions, as network with skip connections is shown in Fig. 7.
shown in Fig. 6. The details of the different modules of the F ð xÞ ¼ H ð xÞ x ð5Þ
model are described in the following sub-sections.
H ð xÞ ¼ F ð xÞ þ x ð6Þ
3.4.1 Batch normalization layer Hence, residual blocks are based on convolution operation.
The convolution operation with an image (X) and mask (M)
After the input layer, every two successive residual blocks is defined in Eq. (7).
and a dense layer, the batch normalization layer is applied. XX
The proposed model has five batch normalization layers. ðX M Þði; jÞ ¼ M ða; bÞX ði a; j bÞ ð7Þ
a b
The batch normalization layer standardizes the inputs of a
layer for each batch before feeding them to the next layer in Here, * denotes the convolution operation being used in the
the network. It accelerates and stabilizes the DNN training residual learning block. The window or mask (M) slides
process [43]. Batch normalization is usually applied after over the image (X) with a certain stride value. The model
the activation function, which yields better results. contains six residual blocks in total.
Let μb, and σ2b represent the batch's mean and variance,
as given in Eqs. (1 and 2). Normalize the layer inputs using 3.4.3 Pooling layer and dropout layer
the batch statistics i.e., mean and variance, and then stan-
dardize the hidden units by scaling and shifting using the The max pooling layer is applied frequently in the Light-
learned scaling parameters. weight ResGRU, which shrinks the image size and reduces
1X n the network's complexity. The max pooling layer stops
lb ¼ xi ðBatch MeanÞ ð1Þ over-fitting and accelerates the model's training. It proceeds
n a¼1
with the maximum value of the feature map (F) under the
1X n mask (M). The hyper-parameters required for the max-
r2b ¼ ðxi lb Þ ðBatchVarianceÞ ð2Þ pooling layer are a mask (M) and stride (S), as depicted in
n a¼1
Eq. (8).
whereas, Eq. (3) shows the batch normalization formula,
and Eq. (4) shows scaling and shifting using the learned
scaling parameter γ and shift parameter β.
x i lb
xi ¼ qffiffiffiffiffiffiffiffiffiffiffiffi
ffi ð3Þ
r2b
yi ¼ cxi þ b ð4Þ
123
9646 Neural Computing and Applications (2023) 35:9637–9655
Max Poolingðx; yÞ ¼ max max M ða þ x; b þ yÞ and GRU layer in the presented model. A hybrid of residual
a¼0;1;::;Sb¼0;1;::;S layers and a Bi-GRU layer is used in the model. The Bi-
; x; y ¼ 0; 1; 2; . . .; N GRU keeps the information about previous and future
ð8Þ images, allowing for more accurate forecasting [44].
where ‘S’ is stride, and x and y are rows and columns of the
3.4.6 Flatten and dense layers
input image, respectively.
It is worth noting that after every two consecutive
The flatten layer reduces the feature map to a single column,
residual blocks and a batch normalization layer, the max-
which can then be transferred to the next layer for further
pooling layer is added. Therefore, there are a total of three
processing. The dense layer is identified as the fully con-
max-pooling layers in the model. After pooling layers,
nected layer in neural networks. In the dense layer, the
dropout is the immediate next layer, which is used in the
previous layer's output is converted into a single vector. It
model to prevent it from over-fitting. It is a regularization
determines which feature map best matches a given class by
technique that randomly drops neurons and their incoming
considering the previous layer's output. It gives the correct
and outgoing edges from a layer, which helps to learn the
possibilities for the various classes by using the activation
features in different manners and improves the model's
function.
performance.
3.4.7 Activation function
3.4.4 Time distributed layer
ReLU, sigmoid, and softmax activation functions are
The time-distributed layer is suitable when working with
applied in the proposed model. ReLU is an activation
time-series data or video frames. It allows each input to
function used in the middle layers of the DNN. In contrast,
have its own layer. For example, rather than having many
sigmoid and softmax are used at the output layer for two-
“models” for each input, we can have “one model” for each
class and multi-class classification, respectively.
input. Then Bi-GRU can assist with managing the data in
“time” and checking the images’ previous and future ReLU ¼ maxð0; xÞ ð11Þ
sequences. 1
Softmax ¼ f ð xÞ ¼ ð12Þ
1 þ ex
3.4.5 Bi-GRU layer
In Eqs. (11) and (12), x is a number, where n x n.
After flattening the features, a bidirectional gated recurrent The combination of these layers and activation functions
unit (Bi-GRU), a type of recurrent neural network, is used makes our architecture distinctive and novel for detecting
(RNN). Bi-GRU is a sequential model which calculates the non-COVID and COVID-19 pneumonia with different
continuity and dependency of features in sequential data, e. categories of COVID-19. The recommended model is small
g., textual data. Hence, the Bi-GRU layer is used in the and lightweight compared to other proposed models
proposed model after the feature extraction phase. It com- described in the discussions section, as fewer parameters are
putes the dependency and connection between features of used in this model. It has also performed better than other
the input data [19]. It then passes these features to the fully published studies on a huge dataset.
connected layer in the model based on these characteristics
for classification. 3.4.8 Discussion on the proposed model
Let us divide the input image x into patches M * N
An overview of our proposed Lightweight ResGRU model
; xm;n Rwp hp c where, wp andhp , c are the height, width, and
is described in Fig. 6 and Table 6. According to Fig. 6 and
the number of channels of the patch respectively and
Table 6, the proposed model contains six residual blocks
M ¼ wwp , N ¼ hhp , and by inputting xm;n in a Bi-GRU layer in
and one Bi-GRU layer. Residual blocks have many
a sequence. advantages over simple CNN architectures. One of the
!
significant advantages is that while increasing the depth of
S i;j ¼ f FWD S Fm;n1 ; xm;n and n ¼ 1; 2; 3; . . .; N ð9Þ
the neural network, it accelerates the training process of the
network by using skip connections. The skip connection in
S i;j ¼ f REV S Rm;nþ1 ; xm;n and n ¼ N ; . . .; 3; 2; 1 ð10Þ
the residual block is the backbone that provides direct and
Here f FWD and f REV check the dependency of the features short paths from the early layers and connects them 2–3
of input data in both forward and reverse sequences, layers deeper in the neural network, as shown in Fig. 8,
respectively, as shown in Eqs. (9) and (10). There is one Bi- which contributes to faster training as well as rapid
123
Neural Computing and Applications (2023) 35:9637–9655 9647
Table 6 A summary of
Layer (type) Output shape Param #
Lightweight ResGRU for
COVID-19 Detection conv2d_1 (Conv2D) (None, 224, 224, 64) 1792
batch_normalization_1 (BatchNormalization) (None, 224, 224, 64) 256
resnet_block_1 (ResnetBlock) (None, 224, 224, 64) 148,736
resnet_block_2 (ResnetBlock) (None, 224, 224, 64) 148,736
batch_normalization_2 (BatchNormalization) (None, 224, 224, 64) 256
max_pooling2d_1 (MaxPooling2D) (None, 112, 112, 64) 0
dropout_1 (Dropout) (None, 112, 112, 64) 0
resnet_block_3 (ResnetBlock) (None, 56, 56, 128) 526,976
resnet_block_4 (ResnetBlock) (None, 28, 28, 128) 608,896
batch_normalization_3 (BatchNormalization) (None, 28, 28, 128) 512
max_pooling2d_2 (MaxPooling2D) (None, 14, 14, 128) 0
dropout_2 (Dropout) (None, 14, 14, 128) 0
resnet_block_5 (ResnetBlock) (None, 7, 7, 256) 2,102,528
resnet_block_6 (ResnetBlock) (None, 4, 4, 256) 2,430,208
batch_normalization_4 (BatchNormalization) (None, 4, 4, 256) 1024
max_pooling2d_3 (MaxPooling2D) (None, 4, 2, 256) 0
dropout_3 (Dropout) (None, 4, 2, 256 0
time_distributed (TimeDistributed) (None, 4, 512) 0
bidirectional (Bidirectional) (None, 4, 64) 104,832
flatten_1 (Flatten) (None, 256) 0
dense_1 (Dense) (None, 256) 65,792
batch_normalization_5 (BatchNormalization) (None, 256) 1024
dropout_4 (Dropout) (None, 256) 0
dense_2 (Dense) (None, 4) 257
Total params: 6,141,825
Trainable params: 6,133,121
Non-trainable params: 8,704
convergence of the model to higher accuracy. There are two 4 Experimental setup
significant purposes of skip connections in the residual
block: Firstly, it prevents the vanishing gradient problem; as This part defines the experimental arrangements used to
a result, the network weights and biases will update effec- implement and train the proposed architecture. Experiments
tively. Secondly, it overcomes the issue of accuracy satu- are performed on the Google Collaboratory Pro in the
ration in much deeper neural networks [18]. Python programming language using the TensorFlow
On the other hand, the Bi-GRU layer is used after the framework. The training process used 25 GB of RAM with
feature extraction of the input data. This layer aims to a T4 GPU. In the training period of the system, the pro-
compute the dependencies and connections between input posed Lightweight ResGRU was trained end to end through
data features [19], improving classification accuracy. It also backpropagation using the stochastic gradient descent
makes addressing the problem of intra-class similarities (i. (SGD) optimizer [45] with an initial learning rate of 0.001,
e., similarities between images of different classes) easier and the batch size during the training phase was set to 8.
because it stores the information of the previous and next The input size of images is [224, 224, 3]. The model was
images of a current image in the data and predicts a class trained on 50 epochs, which were appropriate for the con-
based on this information. vergence of the model. Models with the best validation
Table 6 contains details of the end-to-end Lightweight accuracy were chosen for testing. Furthermore, the model's
ResGRU model, including descriptions of the layers, acti- data splitting and performance evaluation are described in
vations, and learnable weights. The proposed model was the following sub-sections.
trained over 50 epochs with an initial learning rate of 0.001
using the stochastic gradient descent (SGD) optimizer.
123
9648 Neural Computing and Applications (2023) 35:9637–9655
123
Neural Computing and Applications (2023) 35:9637–9655 9649
123
9650 Neural Computing and Applications (2023) 35:9637–9655
123
Neural Computing and Applications (2023) 35:9637–9655 9651
Table 9 No. of images used for and chest infections because they contain more detailed
Subject No. of images
cross-dataset validation information about the disease. The main signs of the
2 Class 600 COVID-19 epidemic, which the WHO first reported in late
3 Class 900 2019, are severe coughing and breathing difficulties, and
4 Class 1200 X-rays and CT scans are commonly used to diagnose such
Total 2700 symptoms. However, there are some bottlenecks and chal-
lenges to using imaging-based solutions in which the
quantity and quality of data are two concepts that have a
scientists to work with medical staff. Chest X-rays and CTs direct impact on the deep learning model's performance.
are utilized to identify numerous diseases, and CRIs are Deep learning models need a hefty amount of data for the
more extensively used to study brain tumors, heart diseases, training to perform better and more successfully. The
123
9652 Neural Computing and Applications (2023) 35:9637–9655
proposed study generates a large dataset to address these (weights and biases) than other architectures, as shown in
challenges by integrating and modifying different available Table 11. Hence, the proposed model with just around 6
benchmark datasets. million parameters is lightweight and has a simple and small
Our proposed model shows that, by using a hybrid model architecture, thereby reducing the computational cost of the
containing residual blocks and Bi-GRU, a better model can deep learning model, as presented in Table 6.
be built for diagnosing chest infections because it predicts Table 11 sums up the studies on the automatic identifi-
by keeping the information of previous and future images in cation of COVID-19 based on chest radiography images
the data. It can be seen from the results that there is an acute and compares them with the proposed model. In Table 11, it
sensitivity, which means lower false-negative results. It is can be shown that all the researchers have used deep
worth mentioning that a high false-negative prediction is learning architectures to diagnose chest pneumonia using
perilous for society and patients. In that case, the infected CRIs. However, most of them used pre-trained models and
people are declared healthy, which is dangerous for healthy modified their architectures using transfer learning [46].
people because the infection can spread from these healthy- Most researchers trained deep learning models on a small
declared people. dataset, which may result in overfitting and misleadingly
A comparison of various state-of-the-art models reveals good results [36]. Researchers have used a validation
that the proposed model has fewer layers and parameters dataset instead of a separate unseen test dataset to evaluate
123
Neural Computing and Applications (2023) 35:9637–9655 9653
Table 11 Accuracy, dataset size, and parameters comparison with state-of-the-art existing models
Reference Year Model (s) 4 class 3 class 2 class COVID-19 Modality Parameters Number of
Severity (∼Millions) Images
[47] 2022 Pre-trained ImageNet Models SARS-CoV-2 Ct-Scan 70:15:15 97.60% 92.90%
Proposed Model – Lightweight ResGRU SARS-CoV-2 Ct-Scan 70:15:15 98.56% 97.17%
deep learning models, leading to biased results and high results showed that the proposed architecture achieved
accuracy scores [37]. On the other hand, the proposed study better results even when using the same dataset. The model
used an extensive dataset of around forty thousand images is trained on the SARS-CoV-2 Ct-Scan dataset [23], which
for training, validation, testing, and cross-dataset validation Kogilavani et al. [47] used to predict COVID-19 chest
of the deep learning model. It assessed the model with infection. The dataset is distributed into three splits, i.e.,
unseen datasets (test data and cross-dataset) for unbiased training, validation, and testing, in the ratio of 70:15:15
evaluation. The suggested model predicts the four distinct after augmentation. Thus, the suggested lightweight model
severity levels of COVID-19 (i.e., negative for pneumonia, is superior to other models as it performs well on large
atypical, indeterminate, and typical), which, to the best of unseen datasets (test dataset and cross-dataset). It can pre-
our knowledge, is the first work to conduct a classification dict the COVID-19 infection and its severity level. Light-
of COVID-19 severity levels. weight ResGRU achieved an overall testing accuracy of
The proposed model is trained and evaluated on a dataset 98.56% and a sensitivity of 97.17%. Table 12 shows the
used by another recent study to assess the efficacy of the results of Lightweight ResGRU compared with the study
recommended hybrid architecture. The comparison of the [47], where the same dataset was used in both cases.
123
9654 Neural Computing and Applications (2023) 35:9637–9655
8 Conclusion and future works 2. Wu F et al (2020) A new coronavirus associated with human
respiratory disease in China. Nature 579(7798):265–269. https://
doi.org/10.1038/s41586-020-2008-3
The proposed study used a novel and hybrid model to 3. “Coronavirus (COVID-19) events as they happen.” https://www.
diagnose non-COVID and COVID-19 chest infections, who.int/emergencies/diseases/novel-coronavirus-2019/events-as-
including their different severity types, using chest radiog- they-happen (accessed Dec. 02, 2021).
4. “WHO Coronavirus (COVID-19) Dashboard | WHO Coronavirus
raphy images (X-rays and CTs). It is a hybrid model based (COVID-19) Dashboard With Vaccination Data.” https://covid19.
on multiple residual blocks and Bi-GRU that detects clinical who.int/ (accessed Dec. 02, 2021).
features and the dependency of those features, respectively. 5. Bhatt T, Kumar V, Pande S, Malik R, Khamparia A, Gupta D
As a result, it provides promising classification results. The (2021) A Review on COVID-19. Stud Comput Intell 924
(April):25–42. https://doi.org/10.1007/978-3-030-60188-1_2
presented deep learning model named Lightweight ResGRU 6. Hussain E, Hasan M, Rahman MA, Lee I, Tamanna T, Parvez MZ
is proficient in the diagnosis of 2 class multi-modal (X-rays, (2021) “CoroDet: A deep learning based classification for
CTs) classification (normal and COVID-19), 3 class clas- COVID-19 detection using chest X-ray images”, Chaos. Solitons
sification (normal, COVID-19, and viral pneumonia), 4 and Fractals 142:110495. https://doi.org/10.1016/j.chaos.2020.
110495
class classification (normal, COVID-19, viral pneumonia, 7. Das AK, Kalam S, Kumar C, Sinha D (2021) “TLCoV- An
and bacterial pneumonia). It can also classify different automated Covid-19 screening model using transfer learning from
severity types (atypical, indeterminate, typical, and negative chest X-ray images”, Chaos. Solitons and Fractals 144:541.
for pneumonia) of COVID-19, which, to the best of our https://doi.org/10.1016/j.chaos.2021.110713
8. Lee EYP, Ng MY, Khong PL (2020) COVID-19 pneumonia: what
knowledge, has not been published yet. The proposed has CT taught us? Lancet Infect Dis 20(4):384–385. https://doi.
model achieved 99.0%, 98.4%, 91.0%, and 80.5% f-mea- org/10.1016/S1473-3099(20)30134-1
sure and 0.009, 0.08, 0.02, and 0.19 FNR for two-class, 9. Benmalek E, Elmhamdi J, Jilbab A (2021) Comparing CT scan
three-class, four-class classification, and COVID-19 sever- and chest X-ray imaging for COVID-19 diagnosis. Biomed Eng
Adv 1:100003. https://doi.org/10.1016/j.bea.2021.100003
ity classification respectively, on an extensive test dataset. 10. Naguib M, Moustafa F, Salman MT, Saeed NK, Al-Qahtani M
Another contribution of the study is the acquisition and (2020) The use of radiological imaging alongside reverse tran-
creation of a large dataset for chest infection prediction. The scriptase PCR in diagnosing novel coronavirus disease 2019: a
performance of the proposed Lightweight ResGRU shows narrative review. Future Microbiol 15(10):897–903. https://doi.
org/10.2217/fmb-2020-0098
the power of this method over the existing research studies. 11. Chan JFW et al (2020) A familial cluster of pneumonia associated
The future aim is to overcome resource restrictions, making with the 2019 novel coronavirus indicating person-to-person
us capable of training the model on a larger dataset. We transmission: a study of a family cluster. Lancet 395(10223):514–
expect to further increase the model's performance, espe- 523. https://doi.org/10.1016/S0140-6736(20)30154-9
12. Kong W, Agarwal PP (2020) Chest imaging appearance of covid-
cially for the external cohort, by using a significantly high 19 infection. Radiol Cardiothorac Imaging 2(1):2–5. https://doi.
number of images in the training phase. Another objective is org/10.1148/ryct.2020200028
to use the object detection algorithm on CRIs to localize the 13. Prokop M et al (2020) CO-RADS: a categorical CT assessment
infected region of the lungs, which can provide a better scheme for patients suspected of having COVID-19-definition and
evaluation. Radiology 296(2):E97–E104. https://doi.org/10.1148/
understanding of the patient’s condition to the radiologists. radiol.2020201473
14. de Jaegere TMH, Krdzalic J, Fasen BACM, Kwee RM (2020)
Radiological society of north america chest ct classification sys-
Funding The authors have no relevant financial or non-financial tem for reporting covid-19 pneumonia: Interobserver variability
interests to disclose. and correlation with reverse-transcription polymerase chain reac-
tion. Radiol Cardiothorac Imaging 2:3. https://doi.org/10.1148/
Data availability The links to the datasets created and analyzed during ryct.2020200213
the current study can be found in the [Lightweight ResGRU] reposi- 15. Yang S, Gao T, Wang J, Deng B, Lansdell B, Linares-Barranco B
tory, [https://github.com/Mughees-Ahmad/Lightweight_ResGRU]. (2021) Efficient spike-driven learning with dendritic event-based
processing. Front Neurosci 15:55. https://doi.org/10.3389/fnins.
2021.601109
Declarations 16. Holzinger A, Langs G, Denk H, Zatloukal K, Müller H (2019)
“Causability and explainability of artificial intelligence in medi-
Conflict of interests The authors have no competing interests to cine”, Wiley Interdiscip. Rev Data Min Knowl Discov 9(4):1–13.
declare. https://doi.org/10.1002/widm.1312
17. Krizhevsky BA, Sutskever I, Hinton GE (2012) Cnn实际训练的.
Commun ACM 60(6):84–90
References 18. He K, Zhang X, Ren S, Sun J (2016) “Deep residual learning for
image recognition. Proc IEEE Comput Soc Conf Comput Vis
Pattern Recognit 2015:770–778. https://doi.org/10.1109/CVPR.
1. Huang C et al (2020) Clinical features of patients infected with
2016.90
2019 novel coronavirus in Wuhan, China. Lancet 395
19. Yin Q, Zhang R, Shao X (2019) CNN and RNN mixed model for
(10223):497–506. https://doi.org/10.1016/S0140-6736(20)30183-
image classification. MATEC Web Conf 277:02001. https://doi.
5
org/10.1051/matecconf/201927702001
123
Neural Computing and Applications (2023) 35:9637–9655 9655
20. U. SAIT et al., “Curated Dataset for COVID-19 Posterior-Anterior 37. “What is the Difference Between Test and Validation Datasets?”
Chest Radiography Images (X-Rays).,” vol. 1, 2020, doi: https:// https://machinelearningmastery.com/difference-test-validation-
doi.org/10.17632/9XKHGTS2S6.1. datasets/ (accessed May 19, 2022).
21. “COVID-19 Radiography Database | Kaggle.” https://www.kag 38. Kleppe A, Skrede OJ, De Raedt S, Liestøl K, Kerr DJ, Danielsen
gle.com/tawsifurrahman/covid19-radiography-database (accessed HE (2021) Designing deep learning studies in cancer diagnostics.
May 19, 2021). Nat Rev Cancer 21(3):199–211. https://doi.org/10.1038/s41568-
22. “SIIM-FISABIO-RSNA COVID-19 Detection | Kaggle.” https:// 020-00327-9
www.kaggle.com/c/siim-covid19-detection (accessed Dec. 06, 39. M. de la I. Vayá et al., “BIMCV COVID-19?: a large annotated
2021). dataset of RX and CT images from COVID-19 patients,” pp. 1–
23. “SARS-COV-2 Ct-Scan Dataset | Kaggle.” https://www.kaggle. 22, 2020, [Online]. Available: http://arxiv.org/abs/2006.01174
com/plameneduardo/sarscov2-ctscan-dataset (accessed May 19, 40. “Medical Imaging Data Resource Center (MIDRC) - RSNA
2021). International COVID-19 Open Radiology Database (RICORD)
24. Ardakani AA, Kanafi AR, Acharya UR, Khadem N, Mohammadi Release 1c - Chest x-ray Covid? (MIDRC-RICORD-1c) - The
A (2020) Application of deep learning technique to manage Cancer Imaging Archive (TCIA) Public Access - Cancer Imaging
COVID-19 in routine clinical practice using CT images: Results of Archive Wiki.” https://wiki.cancerimagingarchive.net/pages/view
10 convolutional neural networks. Comput. Biol. Med. page.action?pageId=70230281 (accessed May 19, 2021).
121:103795. https://doi.org/10.1016/j.compbiomed.2020.103795 41. Swati ZNK et al (2019) Brain tumor classification for MR images
25. Shibly KH, Dey SK, Islam MTU, Rahman MM (2020) COVID using transfer learning and fine-tuning. Comput Med Imaging
faster R-CNN: a novel framework to diagnose novel coronavirus Graph 75:34–46. https://doi.org/10.1016/j.compmedimag.2019.
disease (COVID-19) in X-Ray images. Informatics Med Unlocked 05.001
20:100405. https://doi.org/10.1016/j.imu.2020.100405 42. Lecun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521
26. Javor D, Kaplan H, Kaplan A, Puchner SB, Krestan C, Baltzer P (7553):436–444. https://doi.org/10.1038/nature14539
(2020) Deep learning analysis provides accurate COVID-19 43. S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep
diagnosis on chest computed tomography. Eur J Radiol network training by reducing internal covariate shift,” 32nd Int.
133:109402. https://doi.org/10.1016/j.ejrad.2020.109402 Conf. Mach. Learn. ICML 2015, 1: 448–456, 2015.
27. Wang S et al (2021) A deep learning algorithm using CT images to 44. D. M. Ibrahim, N. M. Elshennawy, and A. M. Sarhan, “Since
screen for Corona virus disease (COVID-19). Eur Radiol 31 January 2020 Elsevier has created a COVID-19 resource centre
(8):6096–6104. https://doi.org/10.1007/s00330-021-07715-1 with free information in English and Mandarin on the novel
28. Ismael AM, Şengür A (2020) Deep learning approaches for coronavirus COVID- 19. The COVID-19 resource centre is hosted
COVID-19 detection based on chest X-ray images. Expert Syst on Elsevier Connect, the company ’ s public news and informa-
Appl 164(September):2021. https://doi.org/10.1016/j.eswa.2020. tion,” no. January, 2020.
114054 45. Liu Y, Gao Y, Yin W (2020) An improved analysis of stochastic
29. Dash AK, Mohapatra P (2021) A Fine-tuned deep convolutional gradient descent with momentum. Adv Neural Inf Process Syst
neural network for chest radiography image classification on 2:1–11
COVID-19 cases. Multimed Tools Appl 15:1. https://doi.org/10. 46. Weiss K, Khoshgoftaar TM, and Wang DD, A survey of transfer
1007/s11042-021-11388-9 learning, vol. 3, no. 1. Springer International Publishing, 2016.
30. Ozturk T, Talo M, Yildirim EA, Baloglu UB, Yildirim O, Rajendra doi: https://doi.org/10.1186/s40537-016-0043-6.
Acharya U (2020) Automated detection of COVID-19 cases using 47. Kogilavani SV et al. (2022) “COVID-19 Detection Based on Lung
deep neural networks with X-ray images. Comput Biol Med Ct Scan Using Deep Learning Techniques,” Comput Math
121:10379. https://doi.org/10.1016/j.compbiomed.2020.103792 Methods Med, vol. 2022,. doi: https://doi.org/10.1155/2022/
31. Islam MZ, Islam MM, Asraf A (2020) A combined deep CNN- 7672196.
LSTM network for the detection of novel coronavirus (COVID- 48. El Asnaoui K, Chawki Y (2020) Using X-ray images and deep
19) using X-ray images. Informatics Med. Unlocked 20:100412. learning for automated detection of coronavirus disease. J Biomol
https://doi.org/10.1016/j.imu.2020.100412 Struct Dyn 39(10):1–12. https://doi.org/10.1080/07391102.2020.
32. Demir F (2021) DeepCoroNet: A deep LSTM approach for 1767212
automated detection of COVID-19 cases from chest X-ray images. 49. Gilanie G et al (2021) Coronavirus (COVID-19) detection from
Appl Soft Comput 103:107160. https://doi.org/10.1016/j.asoc. chest radiology images using convolutional neural networks.
2021.107160 Biomed Signal Process Control 6:102490. https://doi.org/10.1016/
33. Turkoglu M (2021) COVIDetectioNet: COVID-19 diagnosis j.bspc.2021.102490
system based on X-ray images using features selected from pre- 50. Kumar A, Tripathi AR, Satapathy SC, Zhang YD (2022) SARS-
learned deep features ensemble. Appl Intell 51(3):1213–1226. Net: COVID-19 detection from chest x-rays by combining graph
https://doi.org/10.1007/s10489-020-01888-w convolutional network and convolutional neural network. Pattern
34. Luz E et al (2022) Towards an effective and efficient deep learning Recognit 122:108255. https://doi.org/10.1016/j.patcog.2021.
model for COVID-19 patterns detection in X-ray images. Res 108255
Biomed Eng 38(1):149–162. https://doi.org/10.1007/s42600-021-
00151-6 Publisher's Note Springer Nature remains neutral with regard to
35. Elkorany AS, Elsharkawy ZF (2021) COVIDetection-Net: a tai- jurisdictional claims in published maps and institutional affiliations.
lored COVID-19 detection from chest radiography images using
deep learning. Optik (Stuttg) 231:166405. https://doi.org/10.1016/ Springer Nature or its licensor (e.g. a society or other partner) holds
j.ijleo.2021.166405 exclusive rights to this article under a publishing agreement with the
36. “Why Aren’t My Results As Good As I Thought? You’re Prob- author(s) or other rightsholder(s); author self-archiving of the accepted
ably Overfitting.” https://machinelearningmastery.com/arent- manuscript version of this article is solely governed by the terms of such
results-good-thought-youre-probably-overfitting/ (accessed May publishing agreement and applicable law.
19, 2022).
123