
Multimedia Tools and Applications

https://doi.org/10.1007/s11042-023-15582-9

Optimized quaternion radial Hahn Moments application to deep learning for the classification of diabetic retinopathy

Mohamed Amine Tahiri1 · Hicham Amakdouf2 · Mostafa El mallahi3 · Hassan Qjidaa1

Received: 17 August 2021 / Revised: 26 April 2022 / Accepted: 21 April 2023


© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023

Abstract
This paper proposes a new hybrid method for the classification of fundus images provided by the Asia-Pacific Tele-Ophthalmology Society, combining the discrete quaternion moment approach, the artificial intelligence approach, and machine learning in order to automatically distinguish the stage of diabetic retinopathy using reduced databases divided into five classes. The proposed method is based on two main phases: a preprocessing phase, which uses the new quaternion radial Hahn moment invariants, optimized by the ant colony algorithm, to calculate the moments of the original n × n images; and a second phase devoted to feeding the calculated moments into the proposed convolutional neural network model. The present work will contribute to creating new neural network architectures that take advantage of the descriptive capability of the new 2D radial Hahn moments in quaternions. The K-fold cross-validation method is used to measure the proposed model's performance. Finally, graphical measures such as receiver operating characteristic and precision-recall curves, plus a confusion matrix, are presented. Furthermore, numerical measures are adopted for the f1-score, loss, and precision. On 1795 images, the AUC reached 94.58%, 97.02%, 94.87%, 97.83%, and 96.54% for the five classes of healthy, mild, moderate, severe, and proliferative, respectively. These results prove that the proposed method can be used to detect and classify diabetic retinopathy at an early stage.

Keywords Quaternion Radial Hahn Moments · Invariant moment · Ant colony optimization · Deep learning · Classification · Diabetic retinopathy

1 Introduction

Human beings are connected to the environment through their senses. Sight, or vision, remains the most valuable, as almost half of the human brain is engaged in vision-related activities. The loss of this sense can lead to difficulties in communicating with the outside world. This abnormality can be caused by certain medications or by certain conditions that affect the brain or other parts of the body, including diabetes [55], hypertension [6], inborn errors of metabolism [47], and multiple sclerosis [40].

* Mohamed Amine Tahiri


mohamedamine.tahiri@usmba.ac.ma
Extended author information available on the last page of the article


The present paper focuses on Diabetic Retinopathy (DR), which is considered an ocular manifestation of diabetes [50]. There are five phases of DR, which can be classified as follows: mild, moderate, severe, proliferative, or non-diseased (NO-DR). In addition, DR is characterized by red lesions that appear as small dots that may be too fine to notice [63]. Consequently, detecting DR remains a challenge, as human readers often submit their exams a day or two later, resulting in delayed treatment. Therefore, researchers have developed new and effective automated screening methods to combat this problem. In this regard, previous studies are cited below.
S. Lal et al. propose in [35] a method for the classification of original and perturbed DR images based on adversarial training and feature fusion. S. Maqsood et al. [38] propose a method to detect DR hemorrhages in diabetic patients, based on deep learning, feature fusion, and multi-logistic regression. P. Kandhasamy et al. propose in [32] a new diagnostic system to determine the severity of diabetic retinopathy, using a multilevel segmentation algorithm, a support vector machine with selective features, and a genetic algorithm. S. Padinjappurathu et al. present in [49] a method for diagnosing DR that relies on the extraction and fusion of ophthalmoscopic features such as co-occurrence, run-length matrix, and Ridgelet transform coefficients. G. Saman et al. propose in [51] an approach that extracts the features of interest from DR images on the one hand, and classifies their evolution using the support vector machine on the other. S. Gupta et al. propose in [28] a new method for smartphone-based DR detection using an optimized hybrid machine learning approach. T. Antary et al. propose in [1] a new method based on applying an encoder network to embed the retinal image in a high-level representation space, where the combination of medium- and high-level features is used to achieve good results. W. Chen et al. propose in [16] a method for detecting diabetic retinopathy based on shallow integrated convolutional neural networks. G. Saxena et al. developed in [52] an agent comprising several CNN models built via machine learning approaches for the classification of digital DR images. S. Gayathri et al. proposed in [25] a lightweight CNN model for DR classification from fundus images.
It should be noted that deep learning-based methods for automatic screening tasks face two constraints. On the one hand, their efficiency is conditioned by the collection of a balanced and sufficient database, which is not always possible in the medical field, where hospital conditions are not always appropriate. The presence of an insufficient database thus limits the effectiveness of these approaches, and data augmentation, e.g. [34, 43, 45, 60], is widely used to combat this problem by artificially expanding the training dataset with label-preserving rotations, translations, and zooming in order to overcome overfitting [33]. This strategy yields high estimation rates but requires a large storage space. On the other hand, their efficiency is conditioned by the use of deeper architectures such as VGG-19 [5], ResNet [20], or GoogLeNet [20], which can pose a problem as the number of parameters grows; this can only be mitigated by a larger training set, in order to reduce overfitting.
Motivated by the idea of automatic detection of DR while seeking to overcome the aforementioned obstacles, we propose in this work a new hybrid classification method that combines the discrete quaternion moment approach [7, 22, 44], the artificial intelligence approach [41], and machine learning [10] to automatically distinguish the stage of diabetic retinopathy using reduced databases. In contrast to state-of-the-art techniques, we mainly focus on using the discrete quaternion moment approach and a simple CNN model. Therefore, we do not need to identify characteristic features, such as blood vessels or microaneurysms, in the images, which reduces prediction errors.

In concrete terms, the methodology followed in this paper is as follows. First, a collection of images was selected from the database provided by the Asia Pacific Tele-Ophthalmology Society (APTOS) in the 2019 Blindness Detection Competition. Then, several operations were performed manually to build a reduced and balanced database consisting of 1795 images divided into five classes: healthy, mild, moderate, severe, and proliferative. Second, in the preprocessing phase, we use the new quaternion radial Hahn moment invariants (QRHMI), based on the fast and stable calculation of the orthogonal discrete Hahn polynomials (HP) characterized by the parameters $(\eta_1, \phi_1)$; the HP parameters are optimized by the ant colony metaheuristic algorithm (ACO) with the aim of obtaining an accurate classification. This step is devoted to the computation of the moments of each image, using orders up to n = 128 to build matrices of size 128 × 128 × 1 (image moments). It should be highlighted that this patch size was chosen after considering sizes of 32, 64, 256, and 512. Finally, we feed the image moments calculated in the previous phase into a new, simple convolutional neural network model, called hereafter the convolutional neural network of Hahn invariant moments (CNN-IHM), in order to automatically distinguish the stage of diabetic retinopathy. The four-fold cross-validation method was used to measure the performance of the proposed CNN-IHM architecture. First, we present the loss and accuracy curves of our CNN-IHM model when classifying two reduced datasets (the first contains 1795 images; the second contains the computed image moments of the same images). Next, we present the confusion matrix [39], receiver operating characteristic (ROC) [13], and precision-recall (PRC) [56] curves for each class, as well as a table containing the performance measures precision, f1-score, and recall.
Before presenting the structure of this paper, it is essential to highlight its strong points: 1) the use of new quaternion radial Hahn moment invariants to calculate moments that, on the one hand, preserve information and, on the other hand, avoid the need for data augmentation; 2) the fast and stable calculation of the Hahn polynomials, which overcomes the problem of numerical fluctuations; 3) the optimization of the HP parameters, which allows us to take full advantage of the QRHMI; 4) accurate classification despite the small database.
The present paper is organized as follows: In the second section, we present the theo-
retical framework for the computation of QRHMI using ACO. In the third section, we
offer a description of the data, with particular emphasis on the preprocessing, as well as
a description of the method of classification. In the fourth section, we present the experi-
mental results and analysis. In the last section, we present the conclusion of this work.

2 Theoretical framework

The main concern of the section at hand is the incorporation of two primary elements: on the one hand, the theoretical structure of the QRHMI; on the other, the optimal choice of their parameters $(\eta_1, \phi_1)$ via ACO in order to take full advantage of the invariant moments.

2.1 Computation of the quaternion radial Hahn moments invariants


The $n$th-order discrete orthogonal HP $h_n^{(\eta_1,\phi_1)}(r,N)$, with parameters $\eta_1, \phi_1 > -1$, of one discrete variable $r$ defined in the region $[0, N-1]$, is represented using the hypergeometric function as [19]:


$$h_n^{(\eta_1,\phi_1)}(r,N) = \frac{(-1)^n\,(\phi_1+1)_n\,(N-n)_n}{n!}\;{}_3F_2\!\left(\left.\begin{matrix} -n,\; \eta_1-r,\; n+\eta_1+\phi_1+1 \\ \phi_1+1,\; 1-N \end{matrix}\right|\,1\right) \tag{1}$$

To ensure numerical stability, the weighted Hahn polynomial $\tilde{h}_n^{(\eta_1,\phi_1)}(r,N)$ is presented as

$$\tilde{h}_n^{(\eta_1,\phi_1)}(r,N) = h_n^{(\eta_1,\phi_1)}(r,N)\,\sqrt{\frac{w_h(r)}{\rho_h(n)}} \tag{2}$$

where the weight function $w_h(r)$ is defined as:

$$w_h(r) = \frac{\Gamma(N+\eta_1-r)\,\Gamma(N+\phi_1+1+r)}{\Gamma(N-n-r)\,\Gamma(r+1)} \tag{3}$$

and the square norm $\rho_h(n)$ is defined as:

$$\rho_h(n) = \frac{\Gamma(\eta_1+n+1)\,\Gamma(\phi_1+n+1)\,(\eta_1+\phi_1+n+1)_N}{(\eta_1+\phi_1+2n+1)\;n!\,(N-n-1)!} \tag{4}$$

The orthogonality property of the normalized orthogonal polynomials can be rewritten as

$$\sum_{r=0}^{N-1} h_n^{(\eta_1,\phi_1)}(r)\,h_m^{(\eta_1,\phi_1)}(r)\,w_h(r) = \rho_h(n)\,\delta_{nm} \tag{5}$$

In mathematics, the Kronecker delta $\delta_{nm}$, named after Leopold Kronecker, is a function of two variables [53]. It equals 1 if the orders are equal and 0 otherwise:

$$\delta_{nm} = \begin{cases} 0 & \text{if } n \neq m \\ 1 & \text{if } n = m \end{cases} \tag{6}$$
The $\tilde{h}_n^{(\eta_1,\phi_1)}(r,N)$ satisfy a recurrence relation with respect to the order $n$ and a recurrence with respect to the variable $r$, respectively [58] (Table 1).

$$\tilde{h}_n^{(\eta_1,\phi_1)}(r,N) = A\,\sqrt{\frac{\rho_h(n-1)}{\rho_h(n)}}\;\tilde{h}_{n-1}^{(\eta_1,\phi_1)}(r,N) - B\,\sqrt{\frac{\rho_h(n-2)}{\rho_h(n)}}\;\tilde{h}_{n-2}^{(\eta_1,\phi_1)}(r,N) \tag{7}$$

Table 1  Coefficients of the recurrence relations of $\tilde{h}_n^{(\eta_1,\phi_1)}(r,N)$ with respect to the order $n$ and to the variable $r$ [58]

$$A = \frac{(\eta_1+\phi_1+2n-1)(\eta_1+\phi_1+2n)}{n\,(\eta_1+\phi_1+n)}\left[r-\frac{\eta_1-\phi_1+2(N-1)}{4}-\frac{(\phi_1^2-\eta_1^2)(\eta_1+\phi_1+2N)}{4\,(\eta_1+\phi_1+2(n-1))(\eta_1+\phi_1+2n)}\right]$$

$$B = \frac{(N-n+1)(\eta_1+n-1)(\phi_1+n-1)(\eta_1+\phi_1+N+n-1)(\eta_1+\phi_1+2n-1)(\eta_1+\phi_1+2n)}{n\,(\eta_1+\phi_1+n)(\eta_1+\phi_1+2n-1)(\eta_1+\phi_1+2n-2)}$$

$$C = \frac{(r-1)\left[2(N+\eta_1-r+1)-(\eta_1+\phi_1+2)\right]-n\,(\eta_1+\phi_1+n+1)+(N-1)(\phi_1+1)}{(r-1)\left[(N+\eta_1-r+1)-(\phi_1+\eta_1+2)\right]+(\phi_1+1)(N-1)}$$

$$D = \frac{(r-1)(N+\eta_1-r+1)}{(r-1)\left[(N+\eta_1-r+1)-(\eta_1+\phi_1+2)\right]+(N-1)(\phi_1+1)}$$


$$\tilde{h}_n^{(\eta_1,\phi_1)}(r,N) = C\,\sqrt{\frac{w_h(r)}{w_h(r-1)}}\;\tilde{h}_n^{(\eta_1,\phi_1)}(r-1,N) - D\,\sqrt{\frac{w_h(r)}{w_h(r-2)}}\;\tilde{h}_n^{(\eta_1,\phi_1)}(r-2,N) \tag{8}$$

It should be noted that the method adopted to stabilize the values of the HP is based on combining the two classical recurrence relations (7) and (8) [57]; a similar algorithm has previously been applied to stabilize the values of the Tchebichef polynomials [3].
Before presenting the theoretical framework of the new quaternion radial Hahn moment invariants, we highlight the robustness and stability of the HP at large orders, N = 2000, as shown in Fig. 1, which means that they can be used to compute the moments of large images.
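To make the definitions above concrete, the following is a minimal Python sketch of Eqs. (1)-(4), evaluating a weighted Hahn polynomial directly from its hypergeometric definition. It is an illustrative sketch, not the authors' code: the helper name `hahn_weighted` is ours, the method described in the text uses the recurrences (7)-(8) rather than direct evaluation, and this direct form is only practical for moderate $n$ and $N$.

```python
# Sketch of Eqs. (1)-(4): weighted Hahn polynomial via the hypergeometric
# definition. Assumes SciPy and mpmath; gamma terms are handled in log space
# to delay overflow, the usual stabilization trick.
import math
import numpy as np
from mpmath import hyp3f2
from scipy.special import gammaln, poch

def hahn_weighted(n, r, N, eta, phi):
    """Value of the weighted Hahn polynomial h~_n^(eta,phi)(r, N)."""
    # Eq. (1): (-1)^n (phi+1)_n (N-n)_n / n! * 3F2(-n, eta-r, n+eta+phi+1; phi+1, 1-N; 1)
    lead = (-1) ** n * poch(phi + 1, n) * poch(N - n, n) / math.factorial(n)
    h = lead * float(hyp3f2(-n, eta - r, n + eta + phi + 1, phi + 1, 1 - N, 1))
    # Eq. (3): log of the weight function w_h(r)
    log_w = (gammaln(N + eta - r) + gammaln(N + phi + 1 + r)
             - gammaln(N - n - r) - gammaln(r + 1))
    # Eq. (4): log of the squared norm rho_h(n)
    log_rho = (gammaln(eta + n + 1) + gammaln(phi + n + 1)
               + math.log(poch(eta + phi + n + 1, N))
               - math.log(eta + phi + 2 * n + 1)
               - gammaln(n + 1) - gammaln(N - n))
    # Eq. (2): h~ = h * sqrt(w_h(r) / rho_h(n))
    return h * np.exp(0.5 * (log_w - log_rho))
```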
The first works proposing to encode color images using quaternions, in particular to avoid marginal treatment of the colors, were the papers [22, 44]. They proposed using the Cartesian representation of quaternions to encode the color pixels of images. Thus, a color image of dimension N × N is represented by a matrix of quaternions of dimension N × N. In this regard, it is worth noting that a color contains only three components in RGB space, so the color information can be carried by the imaginary part of the quaternions. The pixel of an image $f(r,\theta)$ with coordinates $(r,\theta)$ is thus coded as $f(r,\theta) = f_R(r,\theta)\,i + f_G(r,\theta)\,j + f_B(r,\theta)\,k$, with $f_R(r,\theta)$, $f_G(r,\theta)$, and $f_B(r,\theta)$ the red, green, and blue components, respectively, where the radius $r$ varies from 0 to N/2 and the angle $\theta$ varies from 0 to 2π.
In this regard, we introduce the new radial Hahn moments, which can be written as shown in Eq. (9):

$$\begin{aligned} \Phi_H M_{nm} &= \frac{1}{6N}\sum_{r=0}^{N/2}\sum_{\theta=0}^{6N} \tilde{h}_n^{(\eta_1,\phi_1)}(r,N)\,f(r,\theta)\,e^{\mu m\theta} \\ &= \frac{1}{6N}\sum_{r=0}^{N/2}\sum_{\theta=0}^{6N} \tilde{h}_n^{(\eta_1,\phi_1)}(r,N)\left(f_R(r,\theta)\,i + f_G(r,\theta)\,j + f_B(r,\theta)\,k\right)e^{\mu m\theta} \\ &= \frac{1}{6N}\Bigg[\, i\sum_{r=0}^{N/2}\sum_{\theta=0}^{6N} f_R(r,\theta)\,\tilde{h}_n^{(\eta_1,\phi_1)}(r,N)\,e^{\mu m\theta} + j\sum_{r=0}^{N/2}\sum_{\theta=0}^{6N} f_G(r,\theta)\,\tilde{h}_n^{(\eta_1,\phi_1)}(r,N)\,e^{\mu m\theta} \\ &\qquad\quad + k\sum_{r=0}^{N/2}\sum_{\theta=0}^{6N} f_B(r,\theta)\,\tilde{h}_n^{(\eta_1,\phi_1)}(r,N)\,e^{\mu m\theta} \Bigg] \end{aligned} \tag{9}$$

Fig. 1  The 2D and 3D plots show the stability of the HP at the large order N = 2000


where

$$\sum_{r=0}^{N/2}\sum_{\theta=0}^{6N} f(r,\theta)\,\tilde{h}_n^{(\eta_1,\phi_1)}(r,N)\,e^{\mu m\theta} = \sum_{r=0}^{N/2}\sum_{\theta=0}^{6N} f(r,\theta)\,\tilde{h}_n^{(\eta_1,\phi_1)}(r,N)\left[\cos(m\theta) + \frac{i+j+k}{\sqrt{3}}\sin(m\theta)\right] \tag{10}$$

with $\mu$ a unit pure quaternion chosen as $\mu = \frac{i+j+k}{\sqrt{3}}$ and $\theta$ the angle whose values vary between zero and $2\pi$. In theory, the $\hat{\Phi}_H M_{nm}$ of an RGB image in polar pixels are given by
$$\hat{\Phi}_H M_{nm} = \frac{1}{6N}\Bigg\{ i\left[\operatorname{Re}\big(\Phi_H M_{nm}(f_R)\big) + \tfrac{i+j+k}{\sqrt{3}}\operatorname{Im}\big(\Phi_H M_{nm}(f_R)\big)\right] + j\left[\operatorname{Re}\big(\Phi_H M_{nm}(f_G)\big) + \tfrac{i+j+k}{\sqrt{3}}\operatorname{Im}\big(\Phi_H M_{nm}(f_G)\big)\right] + k\left[\operatorname{Re}\big(\Phi_H M_{nm}(f_B)\big) + \tfrac{i+j+k}{\sqrt{3}}\operatorname{Im}\big(\Phi_H M_{nm}(f_B)\big)\right] \Bigg\} \tag{11}$$

Following mathematical simplification, the above equation can be written as follows:

$$\hat{\Phi}_H M_{nm} = A_{nm} + i\,B_{nm} + j\,C_{nm} + k\,D_{nm} \tag{12}$$

where

$$\begin{aligned} A_{nm} &= -\tfrac{1}{\sqrt{3}}\left\{\operatorname{Im}[\Phi_H M_{nm}(f_R)] + \operatorname{Im}[\Phi_H M_{nm}(f_G)] + \operatorname{Im}[\Phi_H M_{nm}(f_B)]\right\} \\ B_{nm} &= \operatorname{Re}[\Phi_H M_{nm}(f_R)] + \tfrac{1}{\sqrt{3}}\left\{\operatorname{Im}[\Phi_H M_{nm}(f_G)] - \operatorname{Im}[\Phi_H M_{nm}(f_B)]\right\} \\ C_{nm} &= \operatorname{Re}[\Phi_H M_{nm}(f_G)] + \tfrac{1}{\sqrt{3}}\left\{\operatorname{Im}[\Phi_H M_{nm}(f_B)] - \operatorname{Im}[\Phi_H M_{nm}(f_R)]\right\} \\ D_{nm} &= \operatorname{Re}[\Phi_H M_{nm}(f_B)] + \tfrac{1}{\sqrt{3}}\left\{\operatorname{Im}[\Phi_H M_{nm}(f_R)] - \operatorname{Im}[\Phi_H M_{nm}(f_G)]\right\} \end{aligned} \tag{13}$$
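As an illustration of Eqs. (9)-(13), the sketch below computes the per-channel radial moments of a polar-resampled RGB image and assembles them into the four quaternion components. It is a sketch under stated assumptions: `hahn_weighted` is the helper from the previous sketch, the input channels are assumed already resampled onto the polar grid of Eq. (9) (radii 0..N/2, 6N angular samples), and names such as `radial_moment` are ours.

```python
# Sketch of Eqs. (9)-(13). channel[r, t]: one colour channel on the polar
# grid, with r = 0..N/2 and t indexing 6N angular samples of theta in [0, 2*pi).
import numpy as np

def radial_moment(channel, n, m, N, eta, phi):
    """Real/imaginary parts of Phi_H M_nm for one channel (Eqs. 9-10),
    taken w.r.t. the unit pure quaternion mu = (i+j+k)/sqrt(3)."""
    thetas = 2 * np.pi * np.arange(channel.shape[1]) / channel.shape[1]
    h = np.array([hahn_weighted(n, r, N, eta, phi)
                  for r in range(channel.shape[0])])
    re = h @ (channel * np.cos(m * thetas)).sum(axis=1) / (6 * N)
    im = h @ (channel * np.sin(m * thetas)).sum(axis=1) / (6 * N)
    return re, im

def quaternion_moment(fR, fG, fB, n, m, N, eta, phi):
    """Assemble the components A, B, C, D of Eq. (12) via Eq. (13)."""
    (reR, imR), (reG, imG), (reB, imB) = (
        radial_moment(c, n, m, N, eta, phi) for c in (fR, fG, fB))
    s = 1.0 / np.sqrt(3.0)
    A = -s * (imR + imG + imB)
    B = reR + s * (imG - imB)
    C = reG + s * (imB - imR)
    D = reB + s * (imR - imG)
    return A, B, C, D
```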

In mathematical terms, calculating moments and their transformations using Eq. (12) poses problems of information loss when objects are rotated, translated, or scaled. For this reason, we present the invariant moments, given their beneficial effect on classification; a similar calculation was applied to guarantee the invariance of the Krawtchouk polynomials [7].
The translational invariance of the $\hat{\Phi}_H M_{nm}$ can easily be obtained by shifting the 2D color image to its geometric center before computing the $\hat{\Phi}_H M_{nm}$; the scale and rotational invariance of the $\hat{\Phi}_H M_{nm}$ is computed as follows.
Let $f^{sr}(r,\alpha)$ be the scaled and rotated version of the image function $f(r,\alpha)$ with scale factor $\beta$ and rotation angle $\alpha_0$; we have

$$f^{sr}(r,\alpha) = f\!\left(\frac{r}{\beta},\,\alpha+\alpha_0\right) \tag{14}$$
Mathematically, the $\hat{\Phi}_H^{sr} M_{nm}$ of the scaled and rotated color image is

$$\hat{\Phi}_H^{sr} M_{nm} = \frac{1}{6N}\sum_{r'=0}^{N/2}\sum_{\alpha'=0}^{6N} f\!\left(\frac{r}{\beta},\,\alpha+\alpha_0\right)\tilde{h}_n^{(\eta_1,\phi_1)}(r,N)\,e^{\mu m\alpha} \tag{15}$$

By letting $r' = \frac{r}{\beta}$ and $\alpha' = \alpha + \alpha_0$, Eq. (15) can be written as


$$\hat{\Phi}_H^{sr} M_{nm} = \frac{1}{6N}\sum_{r'=0}^{N/2}\sum_{\theta'=0}^{6N} f(r',\alpha')\,\tilde{h}_n^{(\eta_1,\phi_1)}(r',N)\,e^{\mu m(\alpha'-\alpha_0)} = \frac{e^{\mu m\alpha_0}}{6N}\sum_{k=0}^{n}\sum_{i=k}^{n} \beta^{\,i+1}\,b_{ni}\,d_{ik}\,\hat{H}_{km} \tag{16}$$

where $\tilde{h}_n^{(\eta_1,\phi_1)}\!\left(\frac{r}{\beta}, N\right)$ is defined as

$$\tilde{h}_n^{(\eta_1,\phi_1)}\!\left(\frac{r}{\beta}, N\right) = \sum_{k=0}^{n}\sum_{i=k}^{n} \beta^{\,i}\,b_{ni}\,d_{ik}\,\tilde{h}_k^{(\eta_1,\phi_1)}(r,N) \tag{17}$$

where $b_{ni}$ and $d_{ik}$ are given by:

$$b_{ni} = \frac{(n+1)!\,(1-N)_n}{\sqrt{\rho(n,N)}\,(n-i)!\,(i!)^2\,(1-N)_i} \tag{18}$$

$$d_{ik} = \frac{(-1)^{i+k}\,\sqrt{\rho(k,N)}\,(2k+1)\,(i!)^2\,(1-N)_i}{(i+k+1)!\,(i-k)!\,(1-N)_k} \tag{19}$$

where $\rho(n,N)$ is defined as

$$\rho(n,N) = \frac{\Gamma(2n-1)\dbinom{N+n}{2n+1}}{N^n} \tag{20}$$

We can define the $\hat{\Phi}_H MI_{nm}$ as follows:

$$\hat{\Phi}_H MI_{nm} = \frac{e^{\mu m\,\arg(\hat{H}_{01})}}{6N}\sum_{k=0}^{n}\sum_{i=k}^{n} \hat{H}_{00}^{-(i+1)}\,b_{ni}\,d_{ik}\,\hat{H}_{km} \tag{21}$$

Then, $\hat{\Phi}_H MI_{nm}$ is the scaling- and rotation-invariant version of $\hat{\Phi}_H M_{nm}$ for any order $n, m$.
After presenting $\hat{\Phi}_H MI_{nm}$, it should be noted that, theoretically, the optimal choice of the parameters $\eta_1$ and $\phi_1$ is important in order to fully benefit from the advantages of the moments. To achieve this objective, we adopt an algorithm belonging to the metaheuristic algorithms based on swarm intelligence [12], which allows us to choose the optimal parameters among thousands of possibilities. It should be noted that artificial intelligence techniques have proven capable of generating extraordinary results in many fields [9, 37].
In principle, ants use pheromone trails, e.g., between nest and food source, to mark their path. A colony will then select the shortest way to an exploitable source, without any individual getting a global view of the route [62]. In general, the procedure of the ACO algorithm can be described as follows: ants are initially positioned at the nest, and each ant chooses a possible route as a solution. In fact, each ant builds a feasible solution by repeatedly applying a stochastic greedy search called the state transition rule, Eq. (22) [21]. Once all ants have completed their tours, the amount of pheromone is updated via Eq. (23) [21]. The pheromone update rules deposit more pheromone on the edges the ants should favor.


$$P_{ij} = \frac{\left[\xi_{ij}(h)\right]^{\alpha}\left[\frac{1}{d_{ij}}\right]^{\beta}}{\sum_{j\in \mathrm{Nodes}_{\mathrm{permitted}}} \left[\xi_{ij}(h)\right]^{\alpha}\left[\frac{1}{d_{ij}}\right]^{\beta}} \tag{22}$$

where $\xi_{ij}$ is the quantity of pheromone, $h$ is the iteration index, and $d_{ij}$ is the heuristic distance between $i$ and $j$.

$$\xi_{ij}(h+1) = (1-\rho)\,\xi_{ij}(h) + \Delta\xi_{ij}(h) \tag{23}$$
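For clarity, the two rules can be written in a few lines. This is a generic sketch of Eqs. (22)-(23) with our own function names, not the authors' implementation.

```python
# Eq. (22): stochastic state-transition rule; Eq. (23): pheromone update.
import numpy as np

def transition_probabilities(xi, d, permitted, alpha=0.8, beta=0.8):
    """Probability of moving to each permitted node j, given pheromone xi[j]
    and heuristic distance d[j] (Eq. 22)."""
    scores = xi[permitted] ** alpha * (1.0 / d[permitted]) ** beta
    return scores / scores.sum()

def update_pheromone(xi, delta_xi, rho=0.3):
    """Evaporate a fraction rho of the pheromone, then deposit delta_xi (Eq. 23)."""
    return (1.0 - rho) * xi + delta_xi
```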
As part of our justification for selecting ACO as the coefficient optimization algorithm, we compared its convergence performance to that of previously published algorithms, including the artificial bee colony (ABC), particle swarm optimization (PSO), the genetic algorithm (GA), and the sine cosine algorithm (SCA). Figure 2 depicts the convergence curves obtained on six test functions (listed in Table 2): unimodal reference functions (F1, F4), multimodal reference functions (F9, F13), and fixed-dimensional multimodal reference functions (F21, F22). In terms of convergence speed, the ACO clearly provides the best answers when compared to the other algorithms on the six test functions studied.
Traditionally, researchers base their selection of the QRHMI parameters $(\eta_1, \phi_1)$ on eliminating cases after a certain number of experiments, which has limited the effectiveness of $\hat{\Phi}_H MI_{nm}$. In this context, ACO was used to select the optimal parameters $V_{\mathrm{optimal}} = \{\eta_1^{\mathrm{opti}}, \phi_1^{\mathrm{opti}}\}$ that are then used to calculate the moments of the images to be classified, which increases the classification accuracy. To demonstrate the effectiveness of the algorithm, we adopted an arbitrary application consisting of the reconstruction of color DR images, using the MSE as a metric. First, we chose DR test images $f(x,y)$ belonging to the APTOS database, taking into account the initial parameters NI, NA, NS, $\alpha$, $\beta$, and $\rho$ and the objective function MSE to be minimized. In the second step, we compute the Nmax-order moments of the chosen images using Eq. (22) with $\eta_1 = 0$, $\phi_1 = 0$, then reconstruct the image and evaluate the value of the MSE objective function; note that in this step, candidate values are sampled according to Eq. (22). Third, we determine the best MSE objective value found by the best ants and then update the pheromone using Eq. (23). The steps of the algorithm are repeated iteratively until the maximum number of iterations is reached. Finally, the algorithm outputs the optimal parameters $V_{\mathrm{optimal}} = \{\eta_1^{\mathrm{opti}}, \phi_1^{\mathrm{opti}}\}$. The proposed optimal parameter selection procedure is conveniently described in the pseudo-code below.


Fig. 2  Convergence results for the studied problems F1, F4, F9, F13, F21, and F22: best score obtained so far versus iteration, comparing ACO, PSO, ABC, GA, and SCA


Pseudo-code 1: Optimization of the $\hat{\Phi}_H MI_{nm}$ parameters by ACO

Inputs: test image $f(x,y)$, moment order Nmax, image size $N \times N$
Outputs: the optimal parameters $V_{\mathrm{optimal}} = \{\eta_1^{\mathrm{opti}}, \phi_1^{\mathrm{opti}}\}$

Initialization of parameters
• NI = 150: maximum number of iterations
• NA = 10: number of ants running per iteration
• NS = 5: number of best ants that deposit pheromone
• VL = [0, 0]: lower limit of the critical variables
• VU = [N, N]: upper limit of the critical variables
• α = 0.8: a parameter of the algorithm that controls the influence of the amount of pheromone when an ant makes a choice (higher alpha gives pheromone more weight)
• β = 0.8: a parameter that controls the influence of the distance to the next node in the ant's choice (higher beta gives distance more weight)
• ρ = 0.3: pheromone evaporation coefficient

Step 1: Select candidate $(\eta_1, \phi_1)$ values of the proposed polynomials using Eq. (22): the starting solution population is generated randomly in the range [0, N], and the MSE function is evaluated for every generated solution.
Step 2: Determine the best value of the objective function (MSE) found by the best ants, then update the pheromone using formula (23). Meanwhile, the best vector of critical variables selected so far is also characterized by a good MSE that deserves further exploration; this information can be used to strengthen particular search areas by reinforcing the pheromones on the optimal path.
Step 3: The algorithm's steps are iteratively repeated until the maximum number of iterations is reached. Finally, the reconstructed image is saved, and the reconstruction performance is evaluated using the objective function MSE.
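A condensed, runnable rendering of Pseudo-code 1 is given below, under loudly stated assumptions: `reconstruction_mse` is a hypothetical objective that computes the Nmax-order moments with candidate parameters, reconstructs the image, and returns the MSE; and the continuous $(\eta_1, \phi_1)$ search space is discretized so that the pheromone table of Eqs. (22)-(23) applies, a common continuous-ACO adaptation rather than the authors' exact implementation.

```python
# Sketch of Pseudo-code 1: ACO search for (eta_1, phi_1) minimizing the
# reconstruction MSE. reconstruction_mse(image, eta, phi, Nmax) is assumed.
import numpy as np

def aco_optimize(image, N, Nmax, reconstruction_mse,
                 NI=150, NA=10, NS=5, rho=0.3, grid=50,
                 rng=np.random.default_rng(0)):
    levels = np.linspace(0.0, N, grid)       # discretized range [VL, VU] = [0, N]
    xi = np.ones((2, grid))                  # one pheromone vector per parameter
    best, best_mse = None, np.inf
    for _ in range(NI):                      # Step 3: iterate NI times
        ants = []
        for _ in range(NA):                  # Step 1: each ant samples a candidate
            idx = [rng.choice(grid, p=xi[k] / xi[k].sum()) for k in (0, 1)]
            eta, phi = levels[idx[0]], levels[idx[1]]
            ants.append((reconstruction_mse(image, eta, phi, Nmax), idx, (eta, phi)))
        ants.sort(key=lambda a: a[0])
        xi *= 1.0 - rho                      # Step 2: evaporation, Eq. (23)
        for mse, idx, _ in ants[:NS]:        # the NS best ants deposit pheromone
            for k in (0, 1):
                xi[k, idx[k]] += 1.0 / (1.0 + mse)
        if ants[0][0] < best_mse:
            best_mse, best = ants[0][0], ants[0][2]
    return best, best_mse                    # V_optimal = (eta_opt, phi_opt)
```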

To validate the invariance of $\hat{\Phi}_H MI_{nm}$, we use a few images from the dataset. The selected orders of invariants ($\hat{\Phi}_H MI_{0,1}$, $\hat{\Phi}_H MI_{1,0}$, $\hat{\Phi}_H MI_{0,2}$, $\hat{\Phi}_H MI_{2,0}$, $\hat{\Phi}_H MI_{0,3}$, $\hat{\Phi}_H MI_{3,0}$) are calculated for each image with $V_{\mathrm{optimal}} = \{10.861, 11.568\}$; the results of the invariance simulation are shown in Tables 3, 4 and 5.
First, Table 3 presents the partial results of the rotation-invariant moments using an image of severe DR of size 1000 × 1000. The image is rotated about its center by the angles α = 15°, α = 70°, α = 130°, and α = 150°, respectively. Next, Table 4 presents the partial results of the translation-invariant moments using an image of mild DR with translations to the top left (T1), top right (T2), bottom left (T3), and bottom right (T4), respectively. Finally, Table 5 presents the results of the invariance with respect to scaling using a NO-DR image of size 1000 × 1000, transformed to sizes of 800, 600, 400, and 100, respectively. It should be noted that, to measure the invariance of the moments $\hat{\Phi}_H MI_{nm}$, we also use the relative deviation $\sigma/\mu$ of the moment values [18]. The experimental results show that the deviation is very low compared to the other works cited in Table 6, which reveals that the invariant moments $\hat{\Phi}_H MI_{nm}$ are very stable under various forms of transformation.
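The deviation measure behind Tables 3-5 amounts to a one-line computation; the sketch below assumes a hypothetical wrapper `moment_invariant(image, n, m)` implementing Eq. (21).

```python
# sigma/mu over the moment values of the original and transformed images:
# 0 means perfect invariance, and the small values of Tables 3-5 indicate
# near-perfect stability.
import numpy as np

def invariance_score(images, n, m, moment_invariant):
    vals = np.array([abs(moment_invariant(img, n, m)) for img in images])
    return vals.std() / vals.mean()
```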
Having described in this section the theoretical structure of the QRHMI and how the ACO metaheuristic algorithm was used to determine the optimal choice of the parameters $V_{\mathrm{optimal}} = \{10.861, 11.568\}$, we describe in the next section the database used, the proposed CNN model, and the method used to classify and determine the stage of the DR.


Table 2  Unimodal, multimodal, and fixed-dimensional multimodal test functions

F1: $f(x)=\sum_{i=1}^{n} x_i^2$; dimensions 30, 100, 500, 1000; range [−100, 100]
F4: $f(x)=\max_i\left\{|x_i|,\ 1\le i\le n\right\}$; dimensions 30, 100, 500, 1000; range [−100, 100]
F9: $f(x)=\sum_{i=1}^{n}\left[x_i^2-10\cos(2\pi x_i)+10\right]$; dimensions 30, 100, 500, 1000; range [−5.12, 5.12]
F13: $f(x)=0.1\left\{\sin^2(3\pi x_1)+\sum_{i=1}^{n}(x_i-1)^2\left[1+\sin^2(3\pi x_i+1)\right]+(x_n-1)^2\left[1+\sin^2(2\pi x_n)\right]\right\}+\sum_{i=1}^{n}u(x_i,5,100,4)$; dimensions 30, 100, 500, 1000; range [−50, 50]
F21: $f(x)=-\sum_{i=1}^{5}\left[(X-a_i)(X-a_i)^T+c_i\right]^{-1}$; dimension 4; range [0, 1]
F22: $f(x)=-\sum_{i=1}^{7}\left[(X-a_i)(X-a_i)^T+c_i\right]^{-1}$; dimension 4; range [0, 1]

Table 3  The invariance of $\hat{\Phi}_H MI_{n,m}$ with respect to rotation

Rotated image   MI(0,1)      MI(1,0)      MI(0,2)       MI(2,0)       MI(0,3)       MI(3,0)
Original        0.04916863   0.35777659   0.22975457    0.40258262    0.31245639    0.59114079
α = 15°         0.04916863   0.35777659   0.21975456    0.40258293    0.31245685    0.59114093
α = 70°         0.04916863   0.35777659   0.22975452    0.40258234    0.31245694    0.59114011
α = 130°        0.04916863   0.35777659   0.21975458    0.40258253    0.31245602    0.59114085
α = 150°        0.04916863   0.35777659   0.21975454    0.40258201    0.31245639    0.59114079
σ/μ             0            0            0.024478686   8.46846E-07   1.20717E-06   5.60749E-07

Table 4  The invariance of $\hat{\Phi}_H MI_{n,m}$ with respect to translation

Translated image   MI(0,1)      MI(1,0)      MI(0,2)      MI(2,0)       MI(0,3)       MI(3,0)
T1                 0.32665533   0.30559705   0.39961843   0.49708870    0.78012888    0.95916724
T2                 0.32665533   0.30559705   0.39961843   0.43911010    0.71952250    0.92863627
T3                 0.32665533   0.30559705   0.39961843   0.48087861    0.76290853    0.95833392
T4                 0.32665533   0.30559705   0.39961843   0.44729751    0.73007217    0.95305168
σ/μ                0            0            0            0.058896176   0.037701137   0.01512416

Table 5  The invariance of $\hat{\Phi}_H MI_{n,m}$ with respect to scaling

Scaling factor          MI(0,1)      MI(1,0)      MI(0,2)      MI(2,0)      MI(0,3)       MI(3,0)
Original (1000 × 1000)  0.63012511   0.73986378   0.26373845   0.43764506   0.24969677    0.47143489
S1 (800 × 800)          0.63012511   0.73986378   0.26373845   0.43741992   0.24974273    0.47110418
S2 (600 × 600)          0.63012511   0.73986378   0.26373845   0.43780205   0.24923072    0.47114472
S3 (400 × 400)          0.63012511   0.73986378   0.26373845   0.43791464   0.24969473    0.47183926
S4 (100 × 100)          0.63012511   0.73986378   0.26373845   0.43789791   0.24969769    0.471176523
σ/μ                     0            0            0            0.00047188   0.000858854   0.000652835


Table 6  The average σ/μ (%) of $\hat{\Phi}_H MI_{n,m}$'s invariance to scaling, translation, and rotation

Type of invariance                       Scaling          Translation      Rotation
Literature [7]                           0.57369487537    0.45964026548    0.58945902010
Literature [8]                           0.39847846829    0.13797502698    0.28937917398
Using the parameters chosen in [57]      0.63798269629    0.37832609973    0.48813701376
$\hat{\Phi}_H MI_{nm}$                   0.00066118966    0.0186202455     0.00408021679

3 Data, pre‑treatment, and method description

In this section, we propose a new method based on the ACO-optimized QRHMI as the first layer of the new CNN-IHM architecture. The CNN-IHM is distinguished by its efficiency in feature extraction as well as its stability under a variety of transformations, including rotation, scaling, and translation. Because of these characteristics, the CNN-IHM can be used in many computer vision applications.

3.1 Data and Pre‑treatment

Convolutional neural network models demonstrate high performance in image-based applications such as medical image classification. However, balanced and sufficient data is required to train a CNN model efficiently [48], and this is not always possible, as hospital conditions are not always appropriate. In this paper, we focus on a large set of digital retinal fundus images classified into five groups: mild, moderate, severe, proliferative, and NO-DR, as shown in Fig. 3. These images were provided by APTOS in 2019.

Fig. 3  Severity stages of DR: (a) NO-DR, (b) mild, (c) moderate, (d) severe, and (e) proliferative


Fig. 4  The flowchart that summarizes the hybrid classification method

The dataset includes about 3600 images. A preprocessing step was used to resize all images to the same size of 1000 × 1000 × 3. The images were taken under varying conditions and differ in brightness and in the tools used. In addition, this database is very unbalanced, with around 49% of the images in the first class and only 8% in the fourth and fifth classes, which leads to under-fitting and over-fitting [24, 33]. It should be noted that this imbalance problem is encountered in all DR databases, which prompted us to look for a classification method that works with very small and uniformly distributed databases.
In this section, however, we focus on the proposed hybrid method. To automatically detect the presence of diabetic retinal disease using only a small database, the hybrid method combines Hahn's discrete radial quaternion moment approach, the artificial intelligence approach, and machine learning. The steps of the proposed method are summarized in the flowchart (Fig. 4).
In practice, our hybrid DR image classification method follows these steps: First, a collection of images was selected from the APTOS blindness detection competition database. Then, several manual operations were performed to create a reduced and balanced database consisting of 1795 images divided into five categories: healthy, mild, moderate, severe, and proliferative (see Fig. 3). Second, in the preprocessing phase (Fig. 5), we use the new QRHMI optimized by the ACO metaheuristic algorithm, with the aim of obtaining an accurate classification; this step is devoted to the computation of the moments of each image using the order n = 128 to construct image moments of size 128 × 128 × 1, as sketched below. Finally, our proposed convolutional neural network model (CNN-IHM) (Fig. 6) takes the image moments computed in the previous phase as input. This model is explained in the next section.
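A minimal sketch of this glue step, reusing the hypothetical `moment_invariant` wrapper from Section 2, is:

```python
# Phase 1 of the pipeline: map every fundus image to its 128 x 128 x 1
# matrix of optimized QRHMI moments, the input expected by CNN-IHM.
import numpy as np

def preprocess_dataset(images, order=128):
    feats = []
    for img in images:
        M = np.array([[abs(moment_invariant(img, n, m))
                       for m in range(order)] for n in range(order)])
        feats.append(M[..., None])           # shape (128, 128, 1)
    return np.stack(feats)
```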

3.2 Architecture of the proposed CNN-IHM model

The proposed CNN-IHM model consists of three convolutional layers (CNVL), three pooling layers, five dropout layers, one flatten layer, and four fully connected (FC) layers, as shown in Fig. 6.

Fig. 5  Preprocessing data using QRHMI

Fig. 6  The proposed model architecture CNN-IHM (feature-map sizes 126 × 126, 64 × 64, and 28 × 28; a 1372-neuron flatten layer; FC layers of 128, 64, and 16 neurons)

The architecture diagram of the proposed CNN-IHM model was generated using the Draw.io tool. Our model was implemented and trained using TensorFlow in the Jupyter notebook environment. The proposed CNN-IHM architecture can be considered a modified version of an existing design: the CNN was designed using LeNet [17, 20] as the initial basis. The input of the CNN-IHM model accepts the output of the first stage (Fig. 5).
The CNN-IHM architecture starts with a CNVL that takes a matrix of size 128 × 128 as input. The first layer, CNVL-1, uses 3 × 3 kernel filters with a 2 × 2 stride and 10 filters. The output of CNVL-1 is max-pooled using a pooling layer with stride 2 × 2, reducing the feature maps to 61 × 61. This output is fed into a subsequent layer, CNVL-2, with seven filters of kernel size 3 × 3; its output is max-pooled with stride 2 × 2, leading to a size of 28 × 28. Subsequently, the output of CNVL-2 is fed into a third convolution layer, CNVL-3, with 3 × 3 kernel filters; its output is max-pooled with stride 2 × 2, leading to a size of 14 × 14. It should be highlighted that each of the three CNVLs is followed by a dropout layer of 0.005, 0.006, and 0.007, respectively, and a ReLU activation function [14, 36].
The tensor is then flattened into a 1372-neuron linear unit, and the fully connected layers are applied. The first FC layer reduces the 1372-neuron tensor to 128 neurons, with a weight count that reaches 175,744, and applies ReLU activation to the output. The second FC layer reduces the 128 neurons to 64 neurons, with a weight count of up to 8256, and uses ReLU to activate the output neurons. The third FC layer converts the 64 neurons into 16 neurons (1040 weights), using the same ReLU activation. The result of the FC layers is a tensor with 16 neurons; these are mapped to a number of neurons equal to the number of classes, 5, with a softmax activation function [4].
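A Keras sketch of this topology is given below. Two points are assumptions on our part: the quoted 2 × 2 stride is interpreted as the pooling stride (with 3 × 3 convolutions at stride 1, the feature maps run 126 → 61 → 28 → 14 and the flatten reaches exactly the reported 1372 units, while the FC weight counts 175,744, 8256, and 1040 also match the text), and CNVL-3's filter count, which the text does not state, is taken as 7, the value consistent with 1372 = 7 × 14 × 14.

```python
# Sketch of the CNN-IHM topology as described in the text (shapes noted
# per layer); training settings follow Table 7 (Adam, lr = 0.0005).
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn_ihm(num_classes=5):
    model = models.Sequential([
        layers.Input(shape=(128, 128, 1)),
        layers.Conv2D(10, 3, activation="relu"),  # CNVL-1 -> 126x126x10
        layers.MaxPooling2D(2),                   # -> 63x63
        layers.Dropout(0.005),
        layers.Conv2D(7, 3, activation="relu"),   # CNVL-2 -> 61x61x7
        layers.MaxPooling2D(2),                   # -> 30x30
        layers.Dropout(0.006),
        layers.Conv2D(7, 3, activation="relu"),   # CNVL-3 -> 28x28x7
        layers.MaxPooling2D(2),                   # -> 14x14
        layers.Dropout(0.007),
        layers.Flatten(),                         # 7 * 14 * 14 = 1372 neurons
        layers.Dense(128, activation="relu"),     # 175,744 weights
        layers.Dense(64, activation="relu"),      # 8,256 weights
        layers.Dense(16, activation="relu"),      # 1,040 weights
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(5e-4),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```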


4 Experiments and results

The following section presents the metrics used to evaluate the classification method. First, we present the loss and accuracy curves of our CNN-IHM model when classifying two reduced datasets: the first contains 1436 images (Datasets 1), and the second contains the image moments computed from these images (Datasets 2). Afterwards, we present the confusion matrix, the ROC curves of each class, and a table containing the performance measures f1-score and recall. Finally, we make a comparison with other classification algorithms.
The technique adopted to estimate the performance of our model is characterized by a parameter k, which determines the number of folds of the dataset; after several experiments we chose k = 4. Each fold can appear in the training set (k − 1 times). Following this approach, we divide the dataset into two parts, training data and test data, in the ratio 80:20: we keep 80% of the total base to train the model, randomly dividing this fraction into four equal parts, and reserve 20% for testing in order to evaluate the performance of the classifier.
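A sketch of this protocol, assuming `X` holds the 128 × 128 × 1 moment matrices and `y` the five class labels, and reusing the `build_cnn_ihm` sketch above (the stratified split is our addition, a natural choice for a balanced dataset):

```python
# 80:20 train/test split, then k = 4 cross-validation on the training part.
from sklearn.model_selection import KFold, train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0)
kf = KFold(n_splits=4, shuffle=True, random_state=0)
for fold, (tr, va) in enumerate(kf.split(X_train)):
    model = build_cnn_ihm()
    model.fit(X_train[tr], y_train[tr], epochs=30,
              validation_data=(X_train[va], y_train[va]))
```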
The performance of our CNN-IHM during training on both databases (Datasets 1 and 2) is represented graphically by the loss and accuracy curves, which concretely show the evolution of loss and accuracy as a function of the training period. Figure 7 shows the learning traces measured with the training dataset Datasets 1 for 30 epochs using the Adam optimizer with a learning rate of 0.0005. Figure 8 shows the learning traces for Datasets 2, keeping the same parameters, as shown in Table 7. In this regard, it should be noted that the number of epochs was determined using the following criteria: i) the same error value is found across different epochs; ii) the validation metric starts to decrease over several consecutive iterations; iii) the two curves, namely the loss curve and the accuracy curve, are stable.
The curves in Fig. 7(a) and (b) show that the loss curve decreases while the accuracy increases. Based on these curves, and in contrast to Fig. 8(a) and (b), we start getting good results around the fifteenth epoch. We can thus see that our model struggles with Datasets 1: it does not learn enough and does not predict correctly. It should be noted that the time spent during the learning period on Datasets 1 greatly exceeds that of the second dataset, as shown in Table 7. In conclusion, to get a good result using reduced datasets of raw images, we would have to use deeper architectures with thousands of parameters, which makes the classification task more complex.

Fig. 7  (a) Plot of accuracy vs. epoch. (b) Plot of loss vs. epoch (Datasets 1)

Fig. 8  (a) Plot of accuracy vs. epoch. (b) Plot of loss vs. epoch (Datasets 2)
Next, we show the efficiency of the classifier on the test samples using the confusion matrix, which can be considered an error matrix for determining the efficiency of our method: each column of the matrix represents the number of occurrences of an estimated class, while each row represents the number of occurrences of a real class. In practice, the test is performed on a test set of 365 images, distributed over 5 classes as follows: 74 NO-DR, 74 mild, 72 moderate, 72 severe, and 73 proliferative.
Specifically, Fig. 9 shows the confusion matrix obtained using the test samples. Among the 74 NO-DR samples, 68 were classified correctly; among the 74 mild samples, 69 were classified correctly; and for the other classes, moderate, severe, and proliferative, 67, 68, and 71 were classified correctly, respectively, which shows the ability of the classifier to distinguish between subjects. To authenticate the effectiveness of our method, we present the following evaluation parameters: f1-score, recall, and the receiver operating characteristic (ROC) [54] curve of each class.
Table 8 summarizes the results obtained using the proposed classification method. Precision measures the ability of our system to reject irrelevant solutions (the proportion of found solutions that are relevant). Recall measures the ability of the system to return all relevant solutions (the proportion of relevant solutions that are found) [38]. The f1-score combines precision and recall into a single metric by computing the harmonic mean of the two; it is in fact a special case of the more general function $F_\eta$ [1]:

Table 7  Parameter tuning in the CNN-IHM model

Parameters of the CNN-IHM model     1436 image moments (Datasets 2)    1436 images (Datasets 1)
Fully connected layers              4                                  4
Convolution layers                  3                                  3
Learning rate                       0.0005                             0.0005
Epochs                              30                                 30
Time (on CPU, approximately)        9 min                              53 min
Image resolution                    128 × 128 × 1                      1024 × 1024 × 3


Fig. 9  The confusion matrix of the proposed method on a set of image moments

$$F_\eta = (1+\eta^2)\,\frac{\mathrm{precision}\times\mathrm{recall}}{\eta^2\times\mathrm{precision}+\mathrm{recall}} \tag{24}$$
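In code, Eq. (24) is a one-liner; η = 1 recovers the usual f1-score reported in Table 8.

```python
def f_eta(precision, recall, eta=1.0):
    # Eq. (24); eta weights recall relative to precision.
    return (1 + eta**2) * precision * recall / (eta**2 * precision + recall)
```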

The proposed classification method was evaluated using two curves: the receiver operating characteristic (ROC) and the precision-recall curve (PRC) for each class. The ROC is a simple line graph that summarizes the capability of the classifier: the Y-axis represents the sensitivity, also known as the true positive rate, and the X-axis represents 1-specificity, also known as the false positive rate. The PRC is simply a graph with precision values on the Y-axis and recall values on the X-axis; to put it another way, the PRC has TP/(TP + FP) on the Y-axis and TP/(TP + FN) on the X-axis. To measure the effectiveness of our classification method from the two curves, we rely on the following criteria: the
Table 8  Classification report for the proposed method

Class               precision    recall    f1-score    Support
Normal (NO-DR)      0.92         0.92      0.92        74
Mild DR             1.00         0.93      0.97        74
Moderate DR         1.00         0.93      0.96        72
Severe DR           0.96         0.94      0.95        72
Proliferative DR    0.85         0.97      0.90        73

13
Multimedia Tools and Applications

PRC curves should be as high as possible in the upper right corner; the ROC curves should be as high as possible in the upper left corner; and the area under the curve must be large [56].
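Per-class curves of this kind are typically obtained one-vs-rest from the predicted class probabilities; a short sketch with scikit-learn (where `y_score` is assumed to be the softmax output of the model on the test set) is:

```python
# One-vs-rest ROC, PRC, and AUC per class, as plotted in Figs. 10-11.
from sklearn.metrics import auc, precision_recall_curve, roc_curve
from sklearn.preprocessing import label_binarize

y_bin = label_binarize(y_test, classes=range(5))
for c in range(5):
    fpr, tpr, _ = roc_curve(y_bin[:, c], y_score[:, c])
    prec, rec, _ = precision_recall_curve(y_bin[:, c], y_score[:, c])
    print(f"class {c}: AUC = {auc(fpr, tpr):.4f}")
```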
The ROC and PRC curves for the five classes are shown separately in Fig. 10. The area under the curve (AUC) values for the classes NO-DR, mild, moderate, severe, and proliferative were 0.9458, 0.9702, 0.9487, 0.9783, and 0.9654, respectively. The mean accuracy values for the same classes were 0.9074, 0.9453, 0.9563, 0.9813, and 0.9148, respectively.
Before proceeding to the AUC-based comparison between our hybrid classification approach and other classification methods, we first present the ROC curves of all classes in a single figure (Fig. 11).
The proposed method is compared with other models and techniques for the classification of diabetic retinopathy images, as shown in Table 9. The Residual Neural Network (ResNet), developed by He et al. [29], can greatly accelerate the training of ultra-deep neural networks and improve model accuracy. AlexNet [27] is one of the first deep CNN models to significantly advance the accuracy of image classification; it contains nearly 60 million parameters and 650,000 neurons, with 5 convolutional layers. GoogLeNet uses about 22 layers of depth; its network structure is more complex than AlexNet because of the addition of "Inception" modules, each of which contains six convolution operations and one pooling operation. The Zoom-in-Net technique [61] generates attention maps using only image-level supervision; the attention map is a heat map that indicates which pixels play a more important role in decision making. The CNN + hand-crafted features technique [42] is a red-lesion detection method based on the combination of deep knowledge and domain knowledge. The Bag of Visual Words (BoVW) method uses a maximum-margin classifier in a flexible framework capable of detecting the most common DR-related lesions [46]. The method used in [26] is based on the classification of a feature vector that is generated uniquely for each image, based on three types of analysis: exudate probability distribution, color analysis, and wavelet analysis. The method used in [2] is based on the automatic recognition of the severity level for the diagnosis of diabetic retinopathy using deep visual features.
Table 9 compares our results with those of other authors, who used a wide variety of techniques, ranging from feature extraction and deep learning networks to hybrid methods, performed on different datasets. The AUCs for the GoogLeNet [59], AlexNet [59], and ResNet [59] models are 92.72%, 93.42%, and 93.65%, respectively. The common point between these models is the increase in complexity to improve performance, which generates a huge number of parameters; this large number of parameters significantly increases the amount of computation. On the other hand, methods based on the identification of characteristic elements, such as blood vessels or microaneurysms, in the images considerably increase error propagation. Particularly positive results were obtained with our hybrid method, with an AUC value equal to 95.43%, thus outperforming the other approaches presented.
Next, we make a comparison based on the AUC value calculated using the same approach proposed in this paper, but with other discrete orthogonal polynomials (DOP) known in the literature; the polynomials used in this comparison belong to the families of discrete orthogonal polynomials [11, 15, 30, 31, 64]. Table 10 shows the AUC values obtained from the same test samples. DOP 1 yields AUC = 89.56% using the quaternion discrete radial Tchebichef moments [8]. DOP 2 yields AUC = 87.57% using the same polynomials used in this paper, but without optimization. DOP 3 yields AUC = 88.56% using the radial Meixner moment invariants [23]. From these results, we can confirm that the proposed hybrid method for classifying

Fig. 10  ROC and PRC curves for the five classes


Fig. 11  The ROC curves of all classes

the DR level outperforms the other polynomials; this is attributable to the Hahn polynomial used, which is considered one of the best shape descriptors.

5 Conclusion

A novel hybrid method for the classification of APTOS-provided fundus images is proposed in this paper. The method combines the discrete quaternion moment approach, the artificial intelligence approach, and machine learning to automatically distinguish the stage of diabetic retinopathy using reduced databases.
Table 9  Comparison of various DR classification models and methods with the proposed method

Work               Number of images used    Classifier / method                                       AUC
Literature [59]    35,126                   GoogLeNet + hyperparameter tuning                         92.72%
Literature [59]    35,126                   AlexNet + hyperparameter tuning                           93.42%
Literature [59]    35,126                   ResNet + hyperparameter tuning                            93.65%
Literature [61]    89,000                   Zoom-in-Net                                               92.10%
Literature [42]    1200                     CNN + hand-crafted features                               89.32%
Literature [1]     88,702                   Multi-scale attention network                             84.72%
Literature [26]    1200                     A unique feature vector created for each image and        88.00%-94.00%
                                            used to classify it
Literature [46]    2797                     Bag-of-Visual-Words representations + SVM                 94.20%
Literature [2]     750                      Method based on deep visual characteristics               92.40%
Proposed method    1795 image moments       Combines the discrete quaternion moment approach, the     95.43%
                                            artificial intelligence approach, and machine learning

Table 10  AUC comparison of other discrete orthogonal polynomials using the same approach

Discrete orthogonal polynomials    Number of used images    AUC
DOP 1 [8]                          1795                     89.56%
DOP 2                              1795                     87.57%
DOP 3 [23]                         1795                     88.56%
Proposed method                    1795                     95.43%

This method is based on two major steps: the first is dedicated to the calculation of the image moments using the new QRHMI; the second is reserved for classifying the image moments by applying the proposed CNN-IHM model. In contrast to state-of-the-art techniques, we mainly focused on using the discrete quaternion moment approach and a simple CNN model. Therefore, we do not need to identify characteristic features, such as blood vessels or microaneurysms, in the images, which reduces error propagation. The proposed method was evaluated using the AUC and tested on 1795 images. The partial AUC performance on the NO-DR, mild, moderate, severe, and proliferative classes using the image moments was 94.58%, 97.02%, 94.87%, 97.83%, and 96.54%, respectively. The total AUC performance reached 95.43%, a better figure than those of the other approaches. Although the proposed technique was found to be effective, it suffers from a time consumption problem due to the repetitive application of the ACO during the optimization of the QRHMI parameters. Therefore, our team will focus its research on finding more satisfying solutions to this problem, so that the method can be used and applied online.

Declarations
Conflicts of Interest The authors declare no conflict of interest.

References
1. Al-Antary MT, Arafa Y (2021) Multi-scale attention network for diabetic retinopathy classification. IEEE Access 9. https://doi.org/10.1109/ACCESS.2021.3070685
2. Abbas Q, Fondon I, Sarmiento A, Jiménez S, Alemany P (2017) Automatic recognition of sever-
ity level for diagnosis of diabetic retinopathy using deep visual features. Med Biol Eng Comput
55(11):1959–1974. https://​doi.​org/​10.​1007/​s11517-​017-​1638-6
3. Abdulhussain SH, Ramli AR, Al-Haddad SAR, Mahmmod BM, Jassim WA (2017) On Computa-
tional Aspects of Tchebichef Polynomials for Higher Polynomial Order. IEEE Access 5:2470–2478.
https://​doi.​org/​10.​1109/​ACCESS.​2017.​26692​18
4. Adem K, Kiliçarslan S, Cömert O (2019) Classification and diagnosis of cervical cancer with soft-
max classification with stacked autoencoder. Expert Syst Appl 115:557–564. https://​doi.​org/​10.​
1016/j.​eswa.​2018.​08.​050
5. Ahmed KT, Jaffar S, Hussain MG, Fareed S, Mehmood A, Choi GS (2021) Maximum Response
Deep Learning Using Markov, Retinal Primitive Patch Binding with GoogLeNet VGG-19 for Large
Image Retrieval. IEEE Access 9:41934–41957. https://​doi.​org/​10.​1109/​ACCESS.​2021.​30635​45
6. Almobarak AO et al (2020) The prevalence and risk factors for systemic hypertension among Suda-
nese patients with diabetes mellitus: A survey in diabetes healthcare facility. Diabetes Metab Syndr
Clin Res Rev 14(6):1607–1611. https://​doi.​org/​10.​1016/j.​dsx.​2020.​08.​010
7. Amakdouf H, Zouhri A, EL Mallahi M, and Qjidaa H, (2020) “Color image analysis of quater-
nion discrete radial Krawtchouk moments,” Multimed. Tools Appl.https://​doi.​org/​10.​1007/​
s11042-​020-​09120-0

13
Multimedia Tools and Applications

8. Amakdouf H, Zouhri A, El Mallahi M, Tahiri A, Chenouni D, Qjidaa H (2021) Artificial intelligent


classification of biomedical color image using quaternion discrete radial Tchebichef moments. Mul-
timed Tools Appl 80(2):3173–3192. https://​doi.​org/​10.​1007/​s11042-​020-​09781-x
9. Bacanin N, Stoean R, Zivkovic M, Petrovic A, Rashid TA, Bezdan T (2021) Performance of a novel chaotic firefly algorithm with enhanced exploration for tackling global optimization problems: application for dropout regularization. Mathematics 9(21). https://doi.org/10.3390/math9212705
10. Batta M (2020) Machine learning algorithms a review. Int J Sci Res (IJ) 9(1):381. https://​doi.​org/​
10.​21275/​ART20​203995
11. Bencherqui A, Daoui A, Karmouni H and Qjidaa H (2022) Optimal reconstruction and compres-
sion of signals and images by Hahn moments and artificial bee Colony (ABC ) algorithm, Multime-
dia Tools and Applications
12. Beni G, Wang J (1993) Swarm intelligence in cellular robotic systems. In: Robots and biological
systems: towards new bionics?. Springer, Berlin, Heidelberg, p 703–712. https://​doi.​org/​10.​1007/​
978-3-​642-​58069-7_​38
13. Bewick V, Cheek L, Ball J (2004) Statistics review 13: Receiver operating characteristics curves.
Crit Care 8(6):508–512. https://​doi.​org/​10.​1186/​cc3000
14. Chen Z, Ho P-H (2019) Global-connected network with generalized ReLU activation. Pattern Rec-
ogn 96:106961. https://​doi.​org/​10.​1016/j.​patcog.​2019.​07.​006
15. Chen Y et al (2019) Single-pixel compressive imaging based on the transformation of discrete
orthogonal Krawtchouk moments. Opt Express 27(21):29838. https://​doi.​org/​10.​1364/​oe.​27.​029838
16. Chen W, Yang B, Li J, Wang J (2020) An approach to detecting diabetic retinopathy based on inte-
grated shallow convolutional neural networks. IEEE Access 8:178552–178562. https://​doi.​org/​10.​
1109/​ACCESS.​2020.​30277​94
17. Choi KS, Shin JS, Lee JJ, Kim YS, Kim SB, Kim CW (2005) In vitro trans-differentiation of rat
mesenchymal cells into insulin-producing cells by rat pancreatic extract. Biochem Biophys Res
Commun 330(4):1299–1305. https://​doi.​org/​10.​1016/j.​bbrc.​2005.​03.​111
18. Chong CW, Raveendran P, Mukundan R (2003) Translation invariants of Zernike moments. Pattern
Recognit 36(8):1765–1773. https://​doi.​org/​10.​1016/​S0031-​3203(02)​00353-9
19. Daoui A, Yamni M, El Ogri O, Karmouni H, Sayyouri M, Qjidaa H (2020) “New Algorithm for
Large-Sized 2D and 3D Image Reconstruction using Higher-Order Hahn Moments.” Circuits Syst
Signal Process 39(9):4552–4577. https://​doi.​org/​10.​1007/​s00034-​020-​01384-z
20. Deeba K, Amutha B (2020) ResNet-deep neural network architecture for leaf disease classification.
Microprocess Microsyst :103364. https://​doi.​org/​10.​1016/j.​micpro.​2020.​103364
21. Dorigo M, Maniezzo V, Colorni A (1996) Ant system: optimization by a colony of cooperating agents. IEEE Trans Syst Man Cybern Part B Cybern 26(1):29–41. https://doi.org/10.1109/3477.484436
22. Dubey V (2014) Quaternion Fourier transform for colour images. IJCSIT 5(3):4411–4416. https://www.ijcsit.com
23. El Mallahi M, Zouhri A, Qjidaa H (2018) Radial Meixner Moment Invariants for 2D and 3D Image
Recognition. Pattern Recognit Image Anal 28(2):207–216. https://​doi.​org/​10.​1134/​S1054​66181​
80201​28
24. Gavrilov AD, Jordache A, Vasdani M, Deng J (2018) Preventing model overfitting and underfitting
in convolutional neural networks. Int J Softw Sci Comput Intell (IJSSCI) 10(4):19–28. https://​doi.​
org/​10.​4018/​ijssci.​20181​00102
25 Gayathri S, Gopi VP, Palanisamy P (2020) A lightweight CNN for Diabetic Retinopathy classifica-
tion from fundus images. Biomed Signal Process Control 62(August):102115. https://​doi.​org/​10.​
1016/j.​bspc.​2020.​102115
26. Giancardo L et al (2012) Exudate-based diabetic macular edema detection in fundus images using
publicly available datasets. Med Image Anal 16(1):216–226. https://​doi.​org/​10.​1016/j.​media.​2011.​
07.​004
27. Gonzalez TF (2007) Handbook of approximation algorithms and metaheuristics, pp 1–1432. https://doi.org/10.1201/9781420010749
28. Gupta S, Thakur S and Gupta A (2022) Optimized hybrid machine learning approach for smart-
phone based diabetic retinopathy detection
29. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. Proc IEEE Conf Comput Vis Pattern Recognit, pp 770–778. https://doi.org/10.1109/CVPR.2016.90

13
Multimedia Tools and Applications

30. Idan ZN, Abdulhussain SH, Al-Haddad SAR (2020) A New Separable Moments Based on Tch-
ebichef-Krawtchouk Polynomials. IEEE Access 8:41013–41025. https://​doi.​org/​10.​1109/​ACCESS.​
2020.​29773​05
31. Jahid T, Karmouni H, Hmimid A, Sayyouri M, Qjidaa H (2019) Fast computation of Charlier
moments and its inverses using Clenshaw’s recurrence formula for image analysis. Multimed Tools
Appl 78:12183–12201. https://​doi.​org/​10.​1007/​s11042-​018-​6757-z
32. Kandhasamy JP, Balamurali S, Kadry S, Ramasamy LK (2020) Diagnosis of diabetic retinopathy
using multi level set segmentation algorithm with feature extraction using SVM with selective fea-
tures. Multimed Tools Appl 79(15–16):10581–10596. https://​doi.​org/​10.​1007/​s11042-​019-​7485-8
33. Khoshgoftaar TM, Allen EB (2001) Controlling overfitting in classification-tree models of software
quality. Empir Softw Eng 6(1):59–79. https://​doi.​org/​10.​1023/A:​10098​03004​576
34. Kusrini K et al (2020) Data augmentation for automated pest classification in Mango farms. Com-
put Electron Agric 179:105842. https://​doi.​org/​10.​1016/j.​compag.​2020.​105842
35. Lal S et al (2021) Adversarial attack and defence through adversarial training and feature fusion for
diabetic retinopathy recognition. Sensors 21(11):1–21. https://​doi.​org/​10.​3390/​s2111​3922
36. Lin G, Shen W (2018) Research on convolutional neural network based on improved Relu piecewise
activation function. Procedia Comput Sci 131:977–984. https://​doi.​org/​10.​1016/j.​procs.​2018.​04.​239
37. Malakar S, Ghosh M, Bhowmik S, Sarkar R, Nasipuri M (2020) A GA based hierarchical fea-
ture selection approach for handwritten word recognition. Neural Comput Appl 32(7):2533–2552.
https://​doi.​org/​10.​1007/​s00521-​018-​3937-8
38. Maqsood S, Damaševičius R, and Maskeliūnas R (2021) “Hemorrhage detection based on 3d cnn
deep learning framework and feature fusion for evaluating retinal abnormality in diabetic patients,”
Sensors 21(11):. https://​doi.​org/​10.​3390/​s2111​3865.
39. Markoulidakis I, Rallis I, Georgoulas I, Kopsiaftis G, Doulamis A, Doulamis N (2021) Multiclass
confusion matrix reduction method and its application on net promoter score classification prob-
lem. Technologies 9(4):81. https://​doi.​org/​10.​3390/​techn​ologi​es904​0081
40. Ohira M, Ito D, Shimizu T, Shibata M, Ohde H, Suzuki N (2009) Retinopathy: An overlooked
adverse effect of interferon-beta treatment of multiple sclerosis. Keio J Med 58(1):54–56. https://​
doi.​org/​10.​2302/​kjm.​58.​54
41. Oke SA (2008) A literature review on artificial intelligence. Int J Inf Manag Sci 19(4):535–570
42. Orlando JI, Prokofyeva E, del Fresno M, Blaschko MB (2018) An ensemble deep learning based
approach for red lesion detection in fundus images. Comput Methods Programs Biomed 153:115–
127. https://​doi.​org/​10.​1016/j.​cmpb.​2017.​10.​017
43. Park S, Baek Lee S, Park J (2020) Data augmentation method for improving the accuracy of human
pose estimation with cropped images. Pattern Recognit Lett 136:244–250. https://​doi.​org/​10.​1016/j.​
patrec.​2020.​06.​015
44. Pei SC, Cheng CM (1997) Novel block truncation coding of image sequences for limitedcolor dis-
play. Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes Bioinformatics)
1311:164–171. https://​doi.​org/​10.​1007/3-​540-​63508-4_​119
45. Perl YS et al (2020) Data augmentation based on dynamical systems for the classification of brain
states. Chaos Solitons Fractals 139:110069. https://​doi.​org/​10.​1016/j.​chaos.​2020.​110069
46. Pires R, Jelinek HF, Wainer J, Valle E and Rocha A (2014) “Advancing bag-of-visual-words repre-
sentations for lesion classification in retinal images,” PLoS One, 9(6):. https://​doi.​org/​10.​1371/​journ​al.​
pone.​00968​14.
47 Poh S, Mohamed Abdul RBB, Lamoureux EL, Wong TY, Sabanayagam C (2016) Metabolic syndrome
and eye diseases. Diab Res Clin Pract 113:86–100. https://​doi.​org/​10.​1016/j.​diabr​es.​2016.​01.​016
48. Rahman MM, Davis DN (2013) Addressing the Class Imbalance Problem in Medical Datasets. Int J
Mach Learn Comput 2013:224–228. https://​doi.​org/​10.​7763/​ijmlc.​2013.​v3.​307
49. Ramasamy LK, Padinjappurathu SG, Kadry S, Damaševičius R (2021) Detection of Diabetic Retin-
opathy Using a Fusion of Textural and Ridgelet Features of Retinal Images and Sequential Minimal
Optimization Classifier. PeerJ Comput Sci 7:1–21. https://​doi.​org/​10.​7717/​PEERJ-​CS.​456
50. Reddy SSK (2020) Diagnosis of Diabetes Mellitus in Older Adults. Clin Geriatr Med 36(3):379–384.
https://​doi.​org/​10.​1016/j.​cger.​2020.​04.​011
51. Saman G et al (2020) Automatic detection and severity classification of diabetic retinopathy. Multimed
Tools Appl 79(43–44):31803–31817. https://​doi.​org/​10.​1007/​s11042-​020-​09118-8
52 Saxena G, Verma DK, Paraye A, Rajan A, Rawat A (2020) Improved and robust deep learning agent for
preliminary detection of diabetic retinopathy using public datasets. Intell-Based Med 3:100022. https://​
doi.​org/​10.​1016/j.​ibmed.​2020.​100022
53. Sayyouri M, Hmimid A, Qjidaa H (2013) Improving the performance of image classification by Hahn
moment invariants. J Opt Soc Am A 30(11):2381. https://​doi.​org/​10.​1364/​josaa.​30.​002381

13
Multimedia Tools and Applications

54 Shaban K et al (2020) A convolutional neural network for the screening and staging of diabetic retin-
opathy. PLoS One 15(6 June):1–13. https://​doi.​org/​10.​1371/​journ​al.​pone.​02335​14
55. Shiferaw WS et al (2020) Glycated hemoglobin A1C level and the risk of diabetic retinopathy in
Africa: A systematic review and meta-analysis. Diabetes Metab Syndr Clin Res Rev 14(6):1941–1949.
https://​doi.​org/​10.​1016/j.​dsx.​2020.​10.​003
56. Taheri M, Lim N, Lederer J (2016) Balancing statistical and computational precision and applications to penalized linear regression with group sparsity, pp 233–240. [Online]. Available: http://arxiv.org/abs/1609.07195
57. Tahiri MA, Karmouni H, Sayyouri M and Qjidaa H (2022)“2D and 3D image localization, compres-
sion and reconstruction using new hybrid moments,” Multidimens Syst Signal Processhttps://​doi.​org/​
10.​1007/​s11045-​021-​00810-y
58. Tahiri MA, Karmouni H, Sayyouri M and Qjidaa H (2020) “Stable Computation of Hahn Polynomials
for Higher Polynomial Order,” In 2020 International Conference on Intelligent Systems and Computer
Vision (ISCV), pp. 0–6 https://​doi.​org/​10.​1109/​ISCV4​9265.​2020.​92041​18
59. Wan S, Liang Y, Zhang Y (2018) Deep convolutional neural networks for diabetic retinopathy detec-
tion by image classification. Comput Electr Eng 72:274–282. https://​doi.​org/​10.​1016/j.​compe​leceng.​
2018.​07.​042
60. Wang J, Perez L et al (2017) The effectiveness of data augmentation in image classification using deep learning. Convolutional Neural Networks Vis Recognit 11(2017):1–8. https://doi.org/10.48550/arXiv.1712.04621
61. Wang Z, Yin Y, Shi J, Fang W, Li H, Wang X (2017) Zoom-in-net: Deep mining lesions for diabetic
retinopathy detection. Lect Notes Comput Sci (including Subser Lect Notes Artif Intell Lect Notes
Bioinformatics) 10435(LNCS):267–275. https://​doi.​org/​10.​1007/​978-3-​319-​66179-7_​31
62. Wen X (2020) Modeling and performance evaluation of wind turbine based on ant colony optimization-
extreme learning machine. Appl Soft Comput J 94:106476. https://​doi.​org/​10.​1016/j.​asoc.​2020.​106476
63. Wu B, Zhu W, Shi F, Zhu S, Chen X (2017) Automatic detection of microaneurysms in retinal fundus
images. Comput Med Imaging Graph 55:106–112. https://​doi.​org/​10.​1016/j.​compm​edimag.​2016.​08.​001
64. Yamni M et al (2020) Fractional Charlier moments for image reconstruction and image watermarking.
Signal Processing 171:107509. https://​doi.​org/​10.​1016/j.​sigpro.​2020.​107509

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under
a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted
manuscript version of this article is solely governed by the terms of such publishing agreement and applicable
law.

Authors and Affiliations

Mohamed Amine Tahiri1 · Hicham Amakdouf2 · Mostafa El mallahi3 · Hassan Qjidaa1


Hicham Amakdouf
hicham.amakdouf@usmba.ac.ma
Mostafa El mallahi
mostafa.elmallahi@usmba.ac.ma
Hassan Qjidaa
hassan.qjidaa@usmba.ac.ma
1
CED-ST, STIC, Laboratory of Electronic Signals and Systems of Information LESSI, Dhar El
Mahrez Faculty of Science, Sidi Mohamed Ben Abdellah-Fez University, Fez, Morocco
2
Sidi Mohamed Ben Abdellah University, Institute of Sports Sciences, Fez, Morocco
3
High Normal School, Department of Mathematics and Computer Sciences, Laboratory of Computer Sciences and Interdisciplinary Physics, Sidi Mohamed Ben Abdellah University, Fez, Morocco

