
Accelerated Blood Vessel Enhancement in Retinal

Fundus Image based on Reconfigurable Hardware


Yuyao Wang  (  ywang10@lamar.edu )
Lamar University

Research Article

Keywords: Retinal blood vessel enhancement, matched filter, FPGA parallel acceleration

Posted Date: April 26th, 2023

DOI: https://doi.org/10.21203/rs.3.rs-2839197/v1

License:   This work is licensed under a Creative Commons Attribution 4.0 International License.  
Read Full License

Additional Declarations: No competing interests reported.


Accelerated Blood Vessel Enhancement in Retinal Fundus
Image based on Reconfigurable Hardware
Yuyao Wang, Lamar University
Beaumont, Texas 77710
ywang10@lamar.edu

Abstract— Retinal blood vessel extraction and enhancement is an intensively researched topic, as it is an irreplaceable component in ocular disease screening systems. The matched filter method has been proven superior to edge detection algorithms for blood vessel extraction and enhancement in that it can extract a blood vessel along its path and retain blood vessel depth information even when vessels are prone to be indistinguishable from the background. This work implements the matched filter method in the Verilog Hardware Description Language, taking advantage of the high customizability and parallel computation capabilities of FPGAs. The first proposed design method employs an innovative resource-efficient technique based on the matched filter; it can be applied in situations where budget and physical resources are limited. The second design method is a time-efficient processing technique that provides further improvement in that it eliminates the gap incurred in convolution between two rows of data. As verified via simulation, it offers a continuous output with about a 9% increase in processing speed compared to the first proposed technique.

Keywords— Retinal blood vessel enhancement, matched filter, FPGA parallel acceleration.

I. INTRODUCTION

Eye-related diseases such as glaucoma and diabetic retinopathy (DR) are prevalent and widespread today. One of the most common causes of visual impairment in working-age adults is DR, which occurs in diabetic individuals; DR can result in blindness in older people worldwide if proper treatment is not received [1]. Glaucoma is also one of the most frequent causes of permanent blindness. Many people in the U.S. are unaware of the presence of glaucoma, although blindness is the third most dreaded health concern after cancer and heart attacks [2]. Since blindness can be avoided via careful management and treatment, raising public awareness is crucial in this fight. Optical coherence tomography (OCT) and scanning laser polarimetry (SLP) are two commonly used advanced imaging techniques for diagnosing glaucoma. OCT is a powerful method for imaging tissue because it can produce cross-sectional pictures with excellent axial resolution. It is notably helpful in the field of ophthalmology, since the retina can be imaged: looking through the back of the eye is made possible by the transparency of the visual media [3-6]. Meanwhile, measurements of the retinal nerve fiber layer (RNFL) can be obtained using SLP, which is based on the birefringence of the retinal ganglion cell axons. These two techniques, however, are highly dependent on the medical staff's expertise and experience and are not cost-friendly. Retinal blood vessel extraction and enhancement can be deployed not only for early diagnosis of ocular diseases, e.g., diabetic retinopathy and glaucoma; it is also necessary for treatment planning and execution and for evaluating the outcomes of performed surgeries [7][8].

Some research efforts have included detailed comparisons of various blood vessel segmentation algorithms [9-11]. Moccia, S. et al. point out [9] that the use of deep learning algorithms for segmenting vessels is gaining significant popularity. Kumar, K. S. et al. also indicate [10] that the most frequently used methods in their survey depend on machine learning and deep learning, where the two are considered as one group. As opposed to machine learning algorithms, deep learning directly extracts an appropriate internal representation of an image. This contrasts with machine learning algorithms, where the feature extraction process requires specialized expertise to understand which features are the most appropriate. Deep learning also exploits both the rising processing power of GPUs and the abundance of data. Imran, A. et al. presented a comprehensive review of supervised and unsupervised methods [11]. The former include the Support Vector Machine (SVM) method, neural network methods, etc., while the latter include the matched filter method, mathematical morphology, vessel tracking, and model-based methods. The review also reports that the average accuracy of deep learning techniques has increased to around 98%; this effectiveness and performance have made deep learning one of the most popular techniques for vessel segmentation. Recent advances in deep learning-based methods for retinal blood vessel segmentation and enhancement are illustrated in [12-15].

Some researchers also utilize deformable models for blood vessel extraction and enhancement [16][17]. Since edge-based deformable models are predominantly controlled by external forces derived from intensity or gradient, segmentation can become challenging if curve initialization is performed at a distance from the boundary of interest. In addition, the boundary leakage problem is another issue for edge-based deformable models, which is particularly problematic when working with noisy images or images exhibiting non-uniform intensity. A common issue with algorithms using edge detectors is their neglect of a blood vessel's characteristic of having two parallel edges. Therefore, low contrast between blood vessels and background is to be expected without further processing. Shukla, A. K. et al. introduced a fractional filter-based retinal vascular segmentation technique [18]. The weighted fractional derivative was used to create the fractional filter. The local covariance matrix's eigenvalue mappings and a fractional filter are both used in the suggested method. The centerline pixels and vessel borders acquired by the difference eigenvalue map are thresholded and refined
over the structures discovered by the initial eigenvalue maps. Ooi, A. Z. H. et al. [19] proposed a revised and improved version of the Canny operator together with a GUI to perform blood vessel extraction. While the GUI offers intuitive operation with a mouse and keyboard, it inherits the drawback of traditional edge detection, where the operation is executed sequentially. Dash, S. et al. suggested the combined use of the discrete wavelet transform (DWT) and the Tyler Coye algorithm to automatically extract blood vessels from fundus images [20]. An improved method is also proposed in which Gamma correction is integrated into the first combined model. This method takes advantage of all three channels of a retinal image, unlike many other approaches in which only the green channel is used, and is therefore reported to provide optimal accuracy as well as specificity. Before blood vessels can be detected with the Tyler Coye algorithm, image pre-processing steps are needed, and image data in RGB format are combined and transformed to the YIQ domain. The reported superior performance of this method comes at the cost of lengthy execution time. In addition, it is reported to produce unevenness in the vessel walls.

Another major branch of blood vessel extraction and enhancement combines deep learning techniques with various filters. For example, Tchinda, B. S. et al. [21] proposed the combined use of edge detection filters and a neural network, where the feature vectors are obtained using a series of filters, i.e., the Sobel filter, Canny filter, Roberts filter, Laplacian of Gaussian, and morphological transformation, before being fed into the neural network for training. This data preparation process consumes a considerable amount of time. Li, M. et al. [22] adopted a deep learning approach and proposed a Pixel-wise Adaptive Filter, built on U-Net, for further refinement of the previously generated coarse segmentation map. The method proposed by Erwin, Safmi et al. [23] uses CLAHE and a median filter in series to increase grayscale contrast and enhance image quality, respectively; data augmentation is accomplished by mathematical manipulation, i.e., horizontal flip, vertical flip, and vertical reverse flip, to increase the diversity of the retinal image data from the preceding median filter stage. The augmentation results are then fed into the U-Net neural network for training. Ghani, A. et al. [24] proposed the combined use of frontend data preprocessing and backend image classification to make predictions about retinal fundus images. The data preprocessing procedure operates sequentially using an adaptive thresholding technique. In summary, techniques relying on general-purpose CPUs or GPUs have the drawbacks of being executed serially with little parallel operation, being costly to implement, and being cumbersome in terms of portability.

An increasingly appealing trend for blood vessel extraction and enhancement is to provide hardware acceleration. Bendaoudi, H. et al. [25] presented two matched filter-based blood vessel segmentation architectures, one of which is implemented in the VHDL hardware description language. Though the VHDL code was generated automatically using their developed tool, which brings efficiency at the design stage, it inherits from the convolution operation the drawback of having an interruption, or gap, between two rows of data. Hajabdollahi, M. et al. [26] proposed an artificial neural network-based system. To provide real-time processing and ease of implementation on FPGA, the network was simplified, which requires a compensation method for the deteriorated accuracy. Xiang, W. et al. [27] proposed and implemented a matched filter-based design on an FPGA platform for a vein imaging system. However, it consumed an excessive amount of physical resources. It also did not fully exploit the advantage of hardware acceleration and thus had an inherent gap when producing output data, which weakens its ability to be deployed in physically constrained applications and to offer real-time data processing. To tackle the two issues mentioned above, this work presents two techniques. The first can be considered the baseline retinal blood vessel enhancement algorithm based on the matched filter, while the second offers further accelerated processing speed.

This work is organized as follows. Section II presents the matched filter methodology. Section III illustrates the structure of the first proposed matched filter-based technique, and Section IV details the second proposed technique. Results and discussion are provided in Section V. Section VI summarizes the conclusions of this paper.

II. MATCHED FILTER METHODOLOGY

One-dimensional matched filter design is a typical demodulation approach in signal processing systems, where only the magnitude of a signal s(t) changes with respect to time. A matched filter is created when a predefined mask or template is correlated with an unknown signal to determine whether the template is present in it. From the perspective of improving the SNR at the output, a matched filter is thus the linear filter that maximizes the output SNR in the presence of additive stochastic noise. It can be deployed in a typical communication system where N inputs are fed into it: if the ith filter produces the maximum response, it can be determined that input Si(t) was transmitted through the system.

Extending the matched filter approach to two-dimensional images necessitates considering target orientation information. The output of a matched filter reaches its maximum value when an image feature is aligned with the matched filter at an angle θ ± π/2. Therefore, the algorithm first rotates the filter through all possible angles from 0 to 180 degrees to cover all possible blood vessel orientations, then records only the maximum output for each image pixel.

Chaudhuri, S. et al. introduced a two-dimensional matched filter for detecting blood vessels in retinal images [28]. The matched filter was developed based on three key features of retinal blood vessels: (a) their modest degree of curvature makes a succession of piecewise linear segments a reasonable approximation for them; (b) due to their inferior reflectivity compared to the remainder of the retinal image, they appear darker than the background; (c) their width falls within a predetermined range. Examination of the gray-level distributions in retinal fundus pictures reveals that a Gaussian function can roughly model the intensity distribution of a blood vessel. It is a fact that the shape of the optimally matched filters has to come as close as possible to the geometry of the vessel.
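The one-dimensional detection principle described at the start of this section can be illustrated with a short sketch: a bank of candidate templates is correlated against the received signal, and the template with the largest peak response is declared the transmitted input. This is an illustrative sketch rather than code from the paper; the two waveforms and the noise level are invented for the demo.

```python
import numpy as np

def matched_filter_detect(signal, templates):
    """Correlate a received signal with each candidate template and
    return the index of the template giving the largest peak response.
    Each template is mean-removed and unit-normalized first."""
    best_idx, best_peak = -1, -np.inf
    for i, t in enumerate(templates):
        k = t - t.mean()
        k = k / (np.linalg.norm(k) + 1e-12)
        response = np.correlate(signal, k, mode="valid")
        peak = response.max()
        if peak > best_peak:
            best_idx, best_peak = i, peak
    return best_idx

# Two candidate waveforms; transmit the second one plus noise.
rng = np.random.default_rng(0)
s0 = np.sin(np.linspace(0, 2 * np.pi, 50))            # one sine cycle
s1 = np.sign(np.sin(np.linspace(0, 6 * np.pi, 50)))   # square wave
received = np.concatenate([np.zeros(20), s1, np.zeros(20)])
received += 0.3 * rng.standard_normal(received.size)
print(matched_filter_detect(received, [s0, s1]))      # → 1
```

The same maximum-response comparison reappears below in two dimensions, where the "bank" becomes the set of rotated kernels.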
The mathematical expression of the ideal filter can be stated as Equation (1):

K(x, y) = −exp(−x² / (2σ²)),   |x| ≤ 3σ, |y| ≤ L/2        (1)

In this equation, the width of a blood vessel is characterized by σ, and the length of a vessel segment is denoted by L. The values of σ and L are taken as 2 and 9, respectively, as pre-defined parameters in this work. As retinal blood vessels are arbitrarily oriented at any angle θ (0 ≤ θ ≤ π), the matched filters should also be rotated to align with target blood vessels. The proposed design employs 12 matched filters to accurately estimate target blood vessels, with a 15° angle difference between two adjacent filters. Assume Pᵢ = [x, y] is a discrete point in the matched filter kernel and θᵢ is the orientation of the ith kernel, matched to a blood vessel at angle θᵢ. The weighting coefficients are computed based on the assumption that each Gaussian kernel is centred at the origin [0, 0]. The rotation matrix can be expressed as (2):

Rᵢ = [ cos θᵢ   −sin θᵢ ;  sin θᵢ   cos θᵢ ]        (2)

The corresponding location after rotation is given by the matrix multiplication P̄ᵢ = [u, v] = Pᵢ ∗ Rᵢ. In light of the work of Chaudhuri, S. et al. and the fact that the Gaussian function can be visualized as an infinitely long curve with two tails, we trim the infinite-length Gaussian function to a region N, defined as

N = {(u, v) : |u| ≤ 3σ, |v| ≤ L/2}        (3)

where u and v are the updated coordinates generated by the rotation of the ith kernel. If the number of points in the region N is A, the average value of the kernel is calculated using Equation (4):

mᵢ = (1/A) Σ_{Pᵢ∈N} Kᵢ(x, y)        (4)

The average value of the convolution kernel coefficients must be zero so that the background grayscale characteristics of the image remain unchanged. As a result, the final filter convolution kernel can be derived by adjusting Equation (1), subtracting mᵢ:

K′ᵢ(x, y) = Kᵢ(x, y) − mᵢ        (5)

Based on the above theory, 12 convolution kernels were designed. The first convolution kernel, at 0 degrees, was designed as a 17*17 grayscale image, shown as (a) in Figure 1. With a 15-degree angular resolution, the other 11 kernels can be obtained, shown as (b) to (l) in Figure 1.

The matched filter-based retinal blood vessel enhancement algorithm was first designed in MATLAB. The retinal fundus image is shown in Figure 2. The original colored image was first converted into grayscale, as shown in Figure 3 (a). After the conversion, the grayscale image went through the matched filter operation, and the final results were normalized and are shown in Figure 3 (b). This confirms the feasibility and effectiveness of matched filter-based algorithms for retinal blood vessel enhancement. The subsequent research effort is based on this established algorithm for further improvement.

Fig 1. The 12 designed convolution kernels, shown as (a)-(l)

Fig 2. Original retinal blood vessel image
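The kernel construction in Equations (1)-(5) can be sketched in a few lines: sample the negative Gaussian on a grid, rotate the coordinates for each of the 12 orientations, restrict to the support region N, and subtract the in-support mean so each kernel sums to zero. This is an illustrative sketch, not the paper's MATLAB or Verilog code; the function name and the floating-point representation (the hardware would use fixed-point coefficients) are assumptions.

```python
import numpy as np

def matched_filter_kernels(sigma=2.0, L=9, n_angles=12, size=17):
    """Build zero-mean matched-filter kernels per Eqs. (1)-(5):
    K(x, y) = -exp(-x^2 / (2 sigma^2)) on the support |u| <= 3*sigma,
    |v| <= L/2, rotated every 180/n_angles degrees, then shifted so the
    in-support mean of each kernel is zero."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    kernels = []
    for i in range(n_angles):
        theta = i * np.pi / n_angles            # 0°, 15°, ..., 165°
        c, s = np.cos(theta), np.sin(theta)
        # rotate each grid point into the kernel's reference frame
        u = c * xs + s * ys
        v = -s * xs + c * ys
        support = (np.abs(u) <= 3 * sigma) & (np.abs(v) <= L / 2)
        k = np.where(support, -np.exp(-u**2 / (2 * sigma**2)), 0.0)
        A = support.sum()                       # points in region N
        m = k[support].sum() / A                # Eq. (4): mean m_i
        k[support] -= m                         # Eq. (5): zero-mean kernel
        kernels.append(k)
    return kernels

ks = matched_filter_kernels()
print(len(ks), ks[0].shape)                     # → 12 (17, 17)
```

Because the out-of-support entries are zero and the in-support mean is removed, every kernel sums to zero, which is exactly the background-preserving property Equation (5) is meant to guarantee.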
III. RESOURCE-EFFICIENT DESIGN BASED ON MATCHED FILTER

1. Improvement of the first matched filter-based technique

The matched filter is an effective method for blood vessel extraction and enhancement: when the image data are convolved with a pre-defined kernel, only the pixels that best match the kernel produce a maximum response and are retained, while background image data are suppressed.

Fig 3. (a) original grayscale image; (b) blood vessel enhanced image; (c) sliced grayscale image; (d) enhanced sliced image

A previous effort by Xiang, W. et al. [27] proposed a near-infrared vein imaging system, which includes a matched filter-based blood vessel enhancement model implemented on FPGA. The structure of their matched filter design can be summarized as shown in Figure 4. The procedure starts with reading image data from an external CMOS image sensor; the data are then stored in FPGA resources and made ready to go through the matched filters in parallel. Each matched filter has a separate shift register to store the input image data for the convolution operation. It is a convenient design method, given that the matched filter can be used as an intact functional unit and instantiated in a top design module according to the number of convolution kernels needed. From the perspective of module reuse and parallel computation, four matched filter modules can run in parallel, which greatly increases the processing speed of blood vessel enhancement. However, one of the drawbacks is that the shift registers used to read and store input data pixel by pixel and the data windows used for convolution are identical across the four modules, making them hardware redundancies that could rule out the matched filter in certain physically constrained circumstances. Therefore, in the proposed design technique I, the shift register and the data window for convolution are both designed as shared resources among the different convolution operations. This is feasible because the convolution kernel is the only difference among the parallel convolution operations. Using the same input pixel data to convolve with different kernels thus eliminates excessive hardware overhead and achieves physical efficiency while keeping the functionality unchanged.

2. Procedure of MRMF

The first FPGA-based retinal blood vessel enhancement design structure is the Minimal Resource Matched Filter (MRMF) technique, the structure of which is shown in Figure 5. Firstly, the image data is fed into the FPGA directly from an external imaging system, such as a fundus camera, or from an external file stored on a hard drive. A shift register is then utilized as a buffer to read the input image data pixel by pixel and pass the data on to the downstream window data module when needed. The window data module is designed to be the same size as the convolution kernel, and its content is updated from the preceding shift register module during the convolution operation. The downstream convolution module is designed with 12 convolutions running in parallel, each of which takes a predefined independent kernel as its input, yielding a better angular resolution than four convolution kernels would. All responses of the 12 convolutions are sent to the Max module simultaneously for value comparison, which discards all but the largest response as the final enhanced pixel result, eliminating background and noisy pixels. In this design method the convolution module includes all 12 convolution operations with their corresponding kernels, so it only needs to be instantiated once in the top design module. In this way, it presents the distinctive advantage of avoiding excessive usage of shift registers and regular data registers while keeping the functional unit in one module. However, this comes at the cost of some flexibility and portability: since all convolution operations are designed in the same module, it is time consuming to adapt the module to other blood vessel enhancement scenarios with different parameters or even different numbers of kernels.

Fig 4. Structure of matched filter design in [27]
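The MRMF dataflow described above (one shared data window feeding parallel convolutions, followed by a Max module) can be modeled behaviourally in software. This is a sketch of the dataflow only, not the Verilog implementation; the tiny 12*12 image and the two toy zero-mean line kernels stand in for the real fundus image and the 12 Gaussian kernels.

```python
import numpy as np

def mrmf_enhance(image, kernels):
    """Behavioural model of the MRMF datapath: a single shared data
    window is convolved with every kernel, and only the maximum of the
    parallel responses is kept as the enhanced output pixel."""
    k = kernels[0].shape[0]                    # square kernel size
    H, W = image.shape
    out = np.zeros((H - k + 1, W - k + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            window = image[r:r + k, c:c + k]   # one shared window
            responses = [float((window * kern).sum()) for kern in kernels]
            out[r, c] = max(responses)         # Max module
    return out

# Toy demo: a dark vertical line on a bright background, with one
# vertically and one horizontally oriented zero-mean line kernel.
img = np.full((12, 12), 200.0)
img[:, 6] = 60.0
kern_v = np.array([[1.0, -2.0, 1.0]] * 3)      # responds to dark vertical line
kern_h = kern_v.T
out = mrmf_enhance(img, [kern_v, kern_h])
print(out.max(), int(np.argmax(out[0])))       # → 840.0 5
```

Only the windows centred on the dark line produce a large response, and only from the correctly oriented kernel; everywhere else the zero-mean kernels cancel the flat background, which is the pixel-retention behaviour described in the text.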

Fig 5. Structure diagram for the proposed MRMF design

3. Simulation and verification of MRMF

It is evident from the structure diagram that sharing the shift registers and window data registers achieves minimum hardware consumption, as all modules with the same functionality are designed to use shared physical resources. The convolution kernel is rotated every 15°, with a total of 12 convolution operations per pixel, to maintain a better angular resolution. With an input image size of 96*96, the shift register size is 9*96. The sizes of the shift registers and the convolution kernel are both reconfigurable depending on the image resolution and size, making the design applicable to most on-site applications. Based on this methodology, a matched filter-based retinal blood vessel enhancement design was implemented in the Verilog hardware description language. The input retina image is taken from the DRIVE dataset [29]. The image simulation results from Modelsim are shown in Figure 6(a) and Figure 7(c), while Figure 6(b) and Figure 7(d) are results produced purely in MATLAB. The response generated from the Modelsim simulation is fed directly into MATLAB for image analysis and demonstration. The comparison indicates that the simulation results from the FPGA are identical to those from MATLAB.

Fig 6. First 16 simulation results comparison between Modelsim and MATLAB: (a) first 16 results from the Modelsim simulation; (b) first 16 results from the MATLAB simulation

Fig 7. Last 16 simulation results comparison between Modelsim and MATLAB: (c) last 16 results from the Modelsim simulation; (d) last 16 results from the MATLAB simulation

IV. PARALLEL CONTINUOUS MATCHED FILTER TECHNIQUE

1. Improvement of the proposed PCMF

Another drawback of the matched filter-based technique proposed by Xiang, W. et al. [27] is that the convolution process is inherently interrupted between two rows of image data, as shown in Figure 8. This 8-clock-cycle interruption is due to the nature of the convolution operation. For a convolution kernel of size k*k, this interruption of the data flow typically lasts k-1 clock cycles, while the convolution operation itself can produce one valid output every clock cycle. As convolution is conducted using a sliding window, it takes k-1 clock cycles for the window to be filled with new data and become ready for the convolution of the next row. The simulation result shown in Figure 8 shows that the last data from the first row is 596; the output then remains unchanged for the following clock cycles while the shift register content is being updated.

Fig 8. Simulation from Modelsim showing the 8-clock-cycle gap between two rows

Fig 9. Parallel Continuous Matched Filter technique
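The cost of the per-row stall can be tallied with a short sketch. Assuming the k−1-cycle gap occurs at each of the N−1 row boundaries (exact pipeline-fill details omitted), the arithmetic reproduces the roughly 9% speed increase quoted in the abstract for the 96*96 image and 9*9 kernel; the function name is an invention for this sketch.

```python
def convolution_cycles(W=96, F=9, with_gap=True):
    """Clock cycles to convolve a W x W image with an F x F kernel at
    one valid output per active cycle.  A conventional sliding-window
    design stalls F - 1 cycles while the window refills at each row
    boundary; an interleaved design such as PCMF removes that stall.
    (Initial pipeline fill is ignored in both cases.)"""
    N = W - F + 1                  # output rows/cols, Eq. (6) with S=1, P=0
    active = N * N                 # one output pixel per active cycle
    gap = (F - 1) * (N - 1) if with_gap else 0
    return active + gap

baseline = convolution_cycles(with_gap=True)     # 7744 + 8*87 = 8440
pcmf = convolution_cycles(with_gap=False)        # 7744
print(baseline, pcmf, round(100 * (baseline / pcmf - 1), 1))  # → 8440 7744 9.0
```

The relative saving grows with kernel size, which is the trend discussed around Figure 15 in Section V.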
To eliminate the interruption of convolution between two rows of data, the Parallel Continuous Matched Filter technique, abbreviated PCMF hereafter, is proposed. It mainly tackles the issue of interrupted output inherent in the convolution process. The PCMF structure is shown in Figure 9. Ten shift registers, SR1 to SR10, take turns reading data from the corresponding RAM so that seamless convolution can be conducted.

2. Procedure of the proposed PCMF

The region of interest (ROI) in the target retinal images is decreased by eight pixels so that the pixels closest to the border can be discarded. This is reasonable because the number of pixels used in calculating the local average intensity in the middle of the image is greater than the number used closest to the border. The step size is 1 and the padding size is 0 for the convolution operations. With shift registers of size 9*96 and a convolution kernel size of 9*9, the resulting image after convolution is 88*88, a total of 7744 pixels, as suggested by Equation (6), where N indicates that the output image size after convolution is N*N, W indicates that the input image size is W*W, F means the convolution kernel size is F*F, the convolution step is S, and the padding is P:

N = (W − F + 2P)/S + 1        (6)

Shift registers are used to shift out input image data during convolution, and the window data module takes the corresponding values in the shift register to perform convolution with a predefined kernel. Each shift register and window data pair can produce a continuous convolution result for one row of input data, which includes 88 output pixels. However, before starting this operation, it takes the first shift register 864 clock cycles to be filled with 9*96 pixels of image data and become ready for convolution. Therefore, to reuse the first shift register for another row of convolution, one has to wait at least 864 clock cycles before the stored data can be replaced with updated data, as the convolution kernel size is 9*9 and a continuous 9 rows of image data must be in the shift register. To eliminate the timing gap between two rows of convolution, 10 shift registers are required to take turns doing the convolution, so that the outputs from 10 independent convolutions consume 88*10=880 clock cycles, more than enough for a reused shift register to be filled with updated data.

3. Controlling shift registers and convolutions

The shift registers must take turns reading the image data and producing the output needed for convolution in a well-organized manner; otherwise, the 10 shift registers would easily fall out of step. The proposed design structure is built on three counters, cnt0, cnt1 and cnt2, as shown in Figure 10. The first counter, cnt0, counts 88 clock cycles so that two adjacent shift registers are 88 clock cycles apart: the second shift register starts to read image pixel data 88 clock cycles after the first one, so that the second row of convolution output can follow the first, and the two rows of results are produced without any gap in between. The second counter, cnt1, counts the repetitions of 88 clock cycles. When it reaches its maximum value of 9, all ten shift registers have started to read input pixel data and register one has already finished generating the first row of output, so the first shift register and data window pair can be reused for the convolution of the 11th row. The third and last counter, cnt2, counts the rounds each shift register goes through. As the final output includes 88 rows of data, the first eight shift registers are each used 9 times, with the last two used 8 times as they are not needed in the final round, eventually producing a total of 88*10*8+88*8=88*88=7744 pixels as expected.

Controlling the read-enable signal of each shift register and the start and stop of each convolution would be tedious if they were designed independently as separate enable signals. Therefore, they are designed as one multi-bit signal, with each bit controlling a corresponding shift register. For example, the read signal shown in Figure 11 is designed to change simultaneously with cnt1 when cnt2 is zero, because that is the time frame in which the ten shift registers read input pixel data and no output is produced. With the three-counter structure, one can control the shift registers and convolution operations without a separate, lengthy state machine. As mentioned before, it takes 864 clock cycles for a shift register to be filled and ready to start convolution. Therefore, the first shift register reads input pixel data while cnt1 is less than 9 and stops reading when cnt0 is 71 and cnt1 is 9, i.e., after 864 clock cycles. Likewise, each subsequent shift register is delayed by 88 clock cycles relative to the previous one. The first shift register starts its convolution the moment cnt2 changes from 0 to 1, and it takes 88 cycles to finish; after that, the convolutions from the other 9 shift registers follow. While the first shift register is doing convolution, it is simultaneously reading another row of input data, so that when the 10th shift register finishes producing output, the first one can start the convolution for the 11th row of data. The same pattern of reading and convolving is repeated until the full image is processed.

Fig 10. Structure of counters

Fig 11. Designing the read enable signal for the shift registers
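The bookkeeping performed by the three counters can be sanity-checked with a short sketch: cnt0 counts cycles within a row (0-87), cnt1 selects which of the ten shift registers produces the current row, and cnt2 counts reuse rounds. This is a behavioural simplification of the scheme in Figure 10, not the RTL; the function name is an invention for this sketch.

```python
def pcmf_schedule(cycle):
    """Map a global output-cycle index to the three PCMF counters.
    cnt0: cycle within the current row (0-87).
    cnt1: which of the ten shift registers produces this row (0-9).
    cnt2: how many times that register has been reused."""
    cnt0 = cycle % 88
    row = cycle // 88
    cnt1 = row % 10
    cnt2 = row // 10
    return cnt0, cnt1, cnt2

# The 11th output row (row index 10) is produced by shift register 1
# (index 0) on its first reuse, matching the description in the text.
print(pcmf_schedule(10 * 88))          # → (0, 0, 1)
```

Evaluating the last output cycle, 88*88 − 1, gives cnt1 = 7 and cnt2 = 8, consistent with the first eight registers being used 9 times and the last two only 8 times.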
4. Simulation and verification of PCMF technique is superior to traditional CPU-based software
With an FPGA RAM block able to offer two ports reading implementations on MATLAB in terms of processing speed.
simultaneously, this design technique would require a total of
5 independent RAMs, each of which would support two
independent shift registers.
The first step in this PCMF technique is to read the sliced
image data and store the data in five separate RAMs on
FPGA. Each RAM is configured to be a dual-port RAM so
that two separate convolution operations can be conducted one
after another. Subsequently, the convolution operation is
scheduled to start the moment shift register1 has received all
864 pixels of data, and the result is sent to the Max module for
value comparison so that the maximum value of the 12
convolution results can be retained as the final result of the
corresponding input image pixel. Given that 12 different input
values in the Max module would require 4 different
comparisons for a final maximum value, it would take the
Max module 5 clock cycles to produce a max value for a set of
12 input data from the upstream module, with one clock cycle
timing slack introduced for data synchronization. To produce
an output data for every clock cycle, the Max module has been designed to operate at a 5x higher frequency than the convolution module. It can be implemented on an FPGA without effort by instantiating a built-in PLL module. The simulated results can be seen in Figure 12, where out_cnt is the output pixel counter and out_data is the generated output pixel data. The last output pixel data in the first row is 596 when out_cnt is 88, and the pixel data in the second row follow immediately, with no gap in between.

Fig 12. Output data indicating no gap between two rows of data

V. RESULTS COMPARISON

The first proposed Minimal Resource Matched Filter (MRMF) technique is superior to the design in [27] because it improves overall physical efficiency. The Parallel Continuous Matched Filter (PCMF) technique offers further improvement compared to MRMF as it eliminates the gap when generating output between two rows of output pixel data. In Figure 13, the processing time needed is recorded and compared between the MRMF method implemented on FPGA and a pure software matched filter method implemented in MATLAB for various numbers of pictures. Processing time for the pure software method was recorded during simulation in MATLAB R2020a with a parallel pool set to 64 workers on an AMD EPYC 7B12 64-core server processor. The MATLAB simulation was repeated 10 times for a given number of pictures, and the average processing time was recorded along with positive and negative variations shown as red bars on the plot. Meanwhile, the processing time for the FPGA is calculated based on a clock frequency of 100 MHz. It shows that the proposed MRMF technique offers highly accelerated processing compared with the CPU-based software implementation.

Fig 13. Processing speed comparison between CPU and FPGA

Most of the previously matched filter-based techniques were proposed for processing one-dimensional signals [30][31], the structure of which is shown in Figure 14 (a). Figure 14 (b) and (c) compare the structures of [27] and the proposed PCMF technique, revealing that PCMF exploits further parallelism in the matched filter structure by using on-chip RAM as an image data buffer. By using five separate RAMs in a single convolution module, continuous convolution without a gap can be achieved, as previously verified and shown in Figure 12. A processing speed comparison for different kernel sizes has also been conducted and is shown in Figure 15. The input image is 96*96 pixels, and the output is 88*88 pixels. As indicated in the figure, the processing time for a traditional matched filter-based FPGA implementation increases linearly as kernel size increases. The proposed PCMF technique offers constant processing time for a given input image as it eliminates the gap between two rows of data during convolution. Therefore, the extra time consumed by a traditional matched filter without the PCMF method, represented by the green line in Figure 15, increases with increasing kernel size.
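The trend in Figure 15 can be illustrated with a small cycle-count model. This is only a sketch under assumed numbers: it takes the 96*96 input from the text, supposes a traditional streaming implementation stalls for (kernel - 1) clock cycles at every output-row boundary while its shift-register window refills, and supposes PCMF's RAM-based buffering removes that stall entirely. The function name and the exact stall model are illustrative assumptions, not taken from the paper.

```python
def processing_cycles(image: int = 96, kernel: int = 9, pcmf: bool = False) -> int:
    """Illustrative cycle count for enhancing one image at one pixel per clock."""
    rows_out = image - kernel + 1            # valid-convolution output rows (88 for 96/9)
    row_stall = 0 if pcmf else kernel - 1    # assumed window-refill stall per output row
    # PCMF case: streaming the whole image, 96*96 = 9216 cycles regardless of kernel.
    return image * image + rows_out * row_stall

# Traditional time grows with kernel size; the PCMF model stays constant.
for k in (5, 9, 13):
    print(k, processing_cycles(kernel=k), processing_cycles(kernel=k, pcmf=True))
```

Under these assumptions the traditional count rises with kernel size while the PCMF count does not, which is the qualitative behavior the green line in Figure 15 describes.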
(a) 1-D matched filter structure [30][31]

(b) 2-D matched filter structure [27] (Input Buffer feeding parallel Conv(N*M) blocks and a Max Value stage)

Fig 16. Computational efficiency comparison between PCMF and [27]
(c) structure of the proposed PCMF

Fig 14. Structure comparison between matched filter techniques

Fig 15. Computational efficiency comparison for different kernel sizes

A further comparison, shown in Figure 16, is made between PCMF and [27], where the core frequency of the FPGA is set to 100 MHz, so the processing time for both techniques can be calculated using a clock period of 10 ns. The improvement in processing speed is calculated to be (T[27] - TPCMF)/T[27] = 8.99%, where TPCMF and T[27] are the processing times for the PCMF technique and [27], respectively. More speed improvement is expected if the image size is smaller or the convolution kernel size is larger, as indicated in Figure 15.

VI. RESULTS CONCLUSION

Some researchers have suggested a two-dimensional matched filter for retinal blood vessel segmentation and enhancement, which has been evaluated as an effective technique. However, very few publications target the issue of excessive resource consumption of matched filters implemented on FPGA, or of unlocking the full processing-speed potential of a hardware-implemented matched filter.

This work presents two matched filter-based retinal blood vessel enhancement techniques. The first, the Minimal Resource Matched Filter technique, aims to reduce excessive physical resource exploitation for budget- or resource-constrained applications while maintaining the advantage of accelerated parallel convolution from implementing the matched filter algorithm on FPGA. Contrary to the previous technique, which employs different shift registers for different convolutions even when the input pixel data is the same, one can design the shift register as a shared resource among the different convolution kernels. Therefore, minimal resource consumption can be achieved. The second proposed technique, the Parallel Continuous Matched Filter, offers further speed improvement over the first in that it eliminates the inherent gap incurred in a convolution operation between two rows of data. This technique uses five dual-port RAM blocks for input data storage. Two shift registers are employed for each RAM block to perform convolution in a pipelined manner, which eliminates the gap between two rows of data. Simulations have been carried out in Modelsim and MATLAB as a comparison.

It has been revealed that the two proposed FPGA-based techniques offer highly accelerated processing of retinal blood vessel enhancement, especially compared with a traditional CPU-based software implementation. The PCMF technique offers a further improvement of around 9% in processing speed compared to the first proposed Minimal Resource Matched Filter technique. It can be deployed where real-time processing and instant evaluation results are desired and the additional hardware requirement is an acceptable tradeoff.
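The quoted improvement figure is simply the relative difference of the two measured processing times; a minimal sketch of the arithmetic, where the 10 ns clock period comes from the text but the cycle counts below are made up purely for illustration:

```python
CLOCK_PERIOD_NS = 10  # 100 MHz FPGA core clock, as stated in the comparison

def speed_improvement(t_ref: float, t_pcmf: float) -> float:
    """Relative processing-speed improvement of PCMF over a reference design."""
    return (t_ref - t_pcmf) / t_ref

# Hypothetical cycle counts, chosen only to demonstrate the calculation:
ref_cycles, pcmf_cycles = 10_000, 9_101
print(f"reference {ref_cycles * CLOCK_PERIOD_NS} ns vs "
      f"PCMF {pcmf_cycles * CLOCK_PERIOD_NS} ns -> "
      f"{speed_improvement(ref_cycles, pcmf_cycles):.2%} faster")  # -> 8.99% with these counts
```

With these made-up counts the ratio happens to land near the paper's reported 8.99%; the actual figure depends on the measured processing times of both designs.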
Future research endeavors shall be made to reduce the physical resources consumed by PCMF and to design a collaborative system where the proposed technique shall be used as a hardware acceleration module.

Declarations

Ethical Approval
No human or animal studies.

Competing interests
No conflict of interest.

Funding
No funding was received for this work.

Availability of data and materials
Data sharing not applicable to this article, as no new datasets were generated or analyzed during this study.

REFERENCES

[1] Chung, Y. C., Xu, T., Tung, T. H., Chen, M., & Chen, P. E. (2022). Early screening for diabetic retinopathy in newly diagnosed type 2 diabetes and its effectiveness in terms of morbidity and clinical treatment: A nationwide population-based cohort. Frontiers in Public Health, 10.
[2] https://www.glaucomapatients.org/basic/statistics/
[3] Aumann, S., Donner, S., Fischer, J., & Müller, F. (2019). Optical coherence tomography (OCT): principle and technical realization. High Resolution Imaging in Microscopy and Ophthalmology: New Frontiers in Biomedical Optics, 59-85.
[4] Israelsen, N. M., Petersen, C. R., Barh, A., Jain, D., Jensen, M., Hannesschläger, G., ... & Bang, O. (2019). Real-time high-resolution mid-infrared optical coherence tomography. Light: Science & Applications, 8(1), 11.
[5] Kyei, S., Owusu-Afriyie, B., Tagoh, S., Kwarteng, M. A., Nsiah, P., & Guramatunhu, S. (2021). Clinical and sociodemographic characteristics of glaucoma patients at a tertiary referral facility in Zimbabwe. Malawi Medical Journal, 33(1), 15-20.
[6] Cennamo, G., Reibaldi, M., Montorio, D., D'Andrea, L., Fallico, M., & Triassi, M. (2021). Optical coherence tomography angiography features in post-COVID-19 pneumonia patients: a pilot study. American Journal of Ophthalmology, 227, 182-190.
[7] Lavanya, R. (2021, May). Combined diagnosis of diabetic retinopathy and glaucoma using non-linear features. In 2021 5th International Conference on Computer, Communication and Signal Processing (ICCCSP) (pp. 1-6). IEEE.
[8] Mookiah, M. R. K., Acharya, U. R., Chua, C. K., Min, L. C., Ng, E. Y. K., Mushrif, M. M., & Laude, A. (2013). Automated detection of optic disk in retinal fundus images using intuitionistic fuzzy histon segmentation. Proceedings of the Institution of Mechanical Engineers, Part H: Journal of Engineering in Medicine, 227(1), 37-49.
[9] Moccia, S., De Momi, E., El Hadji, S., & Mattos, L. S. (2018). Blood vessel segmentation algorithms—review of methods, datasets and evaluation metrics. Computer Methods and Programs in Biomedicine, 158, 71-91.
[10] Kumar, K. S., & Singh, N. P. (2022). Analysis of retinal blood vessel segmentation techniques: a systematic survey. Multimedia Tools and Applications, 1-55.
[11] Imran, A., Li, J., Pei, Y., Yang, J. J., & Wang, Q. (2019). Comparative analysis of vessel segmentation techniques in retinal images. IEEE Access, 7, 114862-114887.
[12] Atli, I., & Gedik, O. S. (2021). Sine-Net: A fully convolutional deep learning architecture for retinal blood vessel segmentation. Engineering Science and Technology, an International Journal, 24(2), 271-283.
[13] Boudegga, H., Elloumi, Y., Akil, M., Bedoui, M. H., Kachouri, R., & Abdallah, A. B. (2021). Fast and efficient retinal blood vessel segmentation method based on deep learning network. Computerized Medical Imaging and Graphics, 90, 101902.
[14] Chen, C., Chuah, J. H., Ali, R., & Wang, Y. (2021). Retinal vessel segmentation using deep learning: a review. IEEE Access, 9, 111985-112004.
[15] Yang, Y., Wan, W., Huang, S., Zhong, X., & Kong, X. (2022). RADCU-Net: residual attention and dual-supervision cascaded U-Net for retinal blood vessel segmentation. International Journal of Machine Learning and Cybernetics, 1-16.
[16] Samant, P., Bansal, A., & Agarwal, R. (2020). A hybrid filtering-based retinal blood vessel segmentation algorithm. In Computer Vision and Machine Intelligence in Medical Image Analysis: International Symposium, ISCMM 2019 (pp. 73-79). Springer Singapore.
[17] Jin, Q., Meng, Z., Pham, T. D., Chen, Q., Wei, L., & Su, R. (2019). DUNet: A deformable network for retinal vessel segmentation. Knowledge-Based Systems, 178, 149-162.
[18] Shukla, A. K., Pandey, R. K., & Pachori, R. B. (2020). A fractional filter based efficient algorithm for retinal blood vessel segmentation. Biomedical Signal Processing and Control, 59, 101883.
[19] Ooi, A. Z. H., Embong, Z., Abd Hamid, A. I., Zainon, R., Wang, S. L., Ng, T. F., ... & Ibrahim, H. (2021). Interactive blood vessel segmentation from retinal fundus image based on Canny edge detector. Sensors, 21(19), 6380.
[20] Dash, S., & Senapati, M. R. (2020). Enhancing detection of retinal blood vessels by combined approach of DWT, Tyler Coye and Gamma correction. Biomedical Signal Processing and Control, 57, 101740.
[21] Tchinda, B. S., Tchiotsop, D., Noubom, M., Louis-Dorr, V., & Wolf, D. (2021). Retinal blood vessels segmentation using classical edge detection filters and the neural network. Informatics in Medicine Unlocked, 23, 100521.
[22] Li, M., Zhou, S., Chen, C., Zhang, Y., Liu, D., & Xiong, Z. (2022, March). Retinal vessel segmentation with pixel-wise adaptive filters. In 2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI) (pp. 1-5). IEEE.
[23] Erwin, Safmi, A., Desiani, A., Suprihatin, B., & Fathoni. (2022). The augmentation data of retina image for blood vessel segmentation using U-Net convolutional neural network method. International Journal of Computational Intelligence and Applications, 21(01), 2250004.
[24] Ghani, A., See, C. H., Sudhakaran, V., Ahmad, J., & Abd-Alhameed, R. (2019). Accelerating retinal fundus image classification using artificial neural networks (ANNs) and reconfigurable hardware (FPGA). Electronics, 8(12), 1522.
[25] Bendaoudi, H., Cheriet, F., Manraj, A., Ben Tahar, H., & Langlois, J. P. (2018). Flexible architectures for retinal blood vessel segmentation in high-resolution fundus images. Journal of Real-Time Image Processing, 15(1), 31-42.
[26] Hajabdollahi, M., Karimi, N., Reza Soroushmehr, S. M., Samavi, S., & Najarian, K. (2018). Retinal blood vessel segmentation for macula detachment surgery monitoring instruments. International Journal of Circuit Theory and Applications, 46(6), 1166-1180.
[27] Xiang, W., Li, D., Sun, J., Liu, J., Zhou, G., Gao, Y., & Cui, X. (2021). FPGA-based two-dimensional matched filter design for vein imaging systems. IEEE Journal of Translational Engineering in Health and Medicine, 9, 1-10.
[28] Chaudhuri, S., Chatterjee, S., Katz, N., Nelson, M., & Goldbaum, M. (1989). Detection of blood vessels in retinal images using two-dimensional matched filters. IEEE Transactions on Medical Imaging, 8(3), 263-269.
[29] https://www.kaggle.com/datasets/andrewmvd/drive-digital-retinal-images-for-vessel-extraction
[30] Fadhil, M., & Farhan, H. (2019). Design a prototype FPGA model for target detection by radars passive based on synthetic-aperture radar algorithm. Journal of Engineering Science and Technology, 14(3), 1542-1557.
[31] Choi, Y., Jeong, D., Lee, M., Lee, W., & Jung, Y. (2021). FPGA implementation of the range-Doppler algorithm for real-time synthetic aperture radar imaging. Electronics, 10(17), 2133.