
KU Leuven

Biomedical Sciences Group

Faculty of Medicine

Department of Imaging and Pathology

Medical Physics and Quality Assessment

A SIMULATION FRAMEWORK FOR VIRTUAL CLINICAL TRIALS IN CHEST RADIOGRAPHY

Sunay RODRÍGUEZ PÉREZ

Jury:

Supervisor: Prof. Dr. Nicholas W. Marshall
Co-Supervisors: Prof. Dr. ir. Hilde Bosmans
Dr. Lara Struelens
Chair: Prof. Dr. Tania Roskams
Secretary: Prof. Dr. Tom Depuydt
Jury members: Prof. Dr. Klaus Bacher
Prof. Dr. Mathias Prokop
Prof. Dr. Josep Sempau
Prof. Dr. Walter De Wever
Prof. Dr. Tom Depuydt

Dissertation presented in partial fulfilment of the requirements for the degree of Doctor in Biomedical Sciences

July 2021

Table of contents

List of abbreviations
General Introduction
Motivation for chest radiography
Technical aspects and clinical request of CXR
Image quality and optimization in digital chest radiography
Methods for optimization
Objectives of the thesis

Chapter 1 Chest radiography at UZ Leuven hospital. Classical optimization study of chest radiography
1.1 Chest radiography at UZ Leuven
1.1.1 Methods
1.1.2 Results
1.2 Classical optimization study
1.2.1 Methods
1.2.1.1 Optimization variables
1.2.1.2 Image dataset acquisition
1.2.1.3 Figure of merit (FOM)
1.2.2 Results
1.2.3 Discussion
1.3 General conclusions

Chapter 2 Physical thorax phantom Lungman, characterization and validation for CXR studies
2.1 Introduction
2.2 Methods
2.2.1 PMMA equivalence of the Lungman phantom
2.2.2 Comparison of the Lungman phantom and real patients in terms of EI, KAP and exposure time
2.2.3 Creation of a voxel model of the Lungman phantom. Comparison of organ absorbed dose using Kyoto Kagaku tissue equivalent and ICRP materials
2.3 Results
2.3.1 PMMA equivalence of the Lungman
2.3.2 Comparison of the Lungman phantom to patients using EI, KAP and exposure time
2.3.3 Voxelized model of the Lungman phantom. Comparison of organ dose for ICRP and Kyoto Kagaku tissue equivalent materials
2.4 Discussion
2.5 Conclusion

Chapter 3 A new approach to dose and image quality surveys in chest radiography
3.1 The combined use of KAP and EI for improving outlier selection in dose monitoring for projection radiology
3.1.1 Introduction
3.1.2 Methods
3.1.2.1 Collecting the data
3.1.2.2 Expressing exposure index in terms of detector air kerma
3.1.2.3 Outlier selection
3.1.3 Results
3.1.3.1 Expressing EI in terms of DAK
3.1.3.2 Outliers
3.1.4 Conclusions
3.2 Survey of chest radiography systems: any link between contrast detail measurements and visual grading analysis?
3.2.1 Introduction
3.2.2 Methods
3.2.2.1 Contrast detail (c-d) test object
3.2.2.2 Anthropomorphic phantom
3.2.2.3 Image acquisition
3.2.2.4 Image quality assessment
3.2.3 Results
3.2.3.1 Acquisition settings
3.2.3.2 IAK values for Lungman and TO20
3.2.3.3 TO20 test object score
3.2.3.4 VGAS
3.2.3.5 VGC score – comparison against a reference system
3.2.3.6 Correlation between TO20 and VGAS
3.2.4 Discussion
3.2.5 Conclusions
3.3 General conclusions of the Chapter

Chapter 4 Methodology and validation of a simulation platform for Virtual Clinical Trials in chest radiography
4.1 Introduction
4.2 Methods
4.2.1 Creation of hybrid images
4.2.1.1 Creation of scatter projections
4.2.1.2 Creation of primary projections
4.2.2 Adding real sharpness and noise characteristics to the hybrid images
4.2.2.1 Detector characterization: Signal transfer properties
4.2.2.2 Detector characterization: Modulation Transfer Function
4.2.2.3 Detector characterization: Noise Power Spectrum
4.2.2.4 Noise separation: generation of 2D noise coefficients
4.2.2.5 Scaling PV in hybrid image
4.2.2.6 Sharpness modification routine: adding detector MTF to the simulated images
4.2.2.7 Noise modification routine: creation of noise image
4.2.3 Validation
4.2.3.1 Image validation dataset: measurements of Signal Difference to Noise Ratio
4.2.3.2 Validation sharpness modification routine
4.2.3.3 Validation noise modification routine
4.2.3.4 Validation of antiscatter grid simulation
4.3 Results
4.3.1 Detector characterization
4.3.2 Creation of synthetic radiographic images including real detector characteristics
4.3.3 Validation: SDNR measurements
4.3.4 Validation: sharpness modification routine
4.3.5 Validation: noise modification routine
4.3.5.1 Noise simulation in rib equivalent materials
4.3.6 Validation: grid simulation
4.4 Discussion
4.4.1 Limitations and comparison to other simulation frameworks
4.5 Conclusions

Chapter 5 Modelling of anthropomorphic chest phantoms and associated clinical tasks
5.1 Modelling of anthropomorphic phantoms including clinical tasks for use in Virtual Clinical Trials
5.1.1 Methods
5.1.1.1 Realistic Anthropomorphic Flexible (RAF) phantom
5.1.1.2 Modifications to the RAF phantom
5.1.1.3 Creation of clinical tasks within the phantom models
5.1.1.4 Voxelization of the phantoms
5.1.1.5 Creation of radiographic images
5.1.1.6 Task realism
5.1.2 Results
5.1.2.1 Comparison of organ volumes, mesh vs voxels
5.1.2.2 Realism study
5.1.3 Discussion
5.1.3.1 Comparison to the Lungman phantom
5.1.3.2 Library of computational models including clinical tasks
5.1.3.3 Realism evaluation
5.2 Methodology to create 3D models of Covid-19 pathologies for Virtual Clinical
5.2.1 Introduction
5.2.2 Materials and methods
5.2.2.1 RAF Phantom
5.2.2.2 Modelling of the pathologies from CT scans
5.2.2.3 Creation of voxelized phantoms
5.2.2.4 Generating radiographic images using a simulation framework
5.2.2.5 Assessment of task realism
5.2.3 Results
5.2.3.1 Pathology models
5.2.3.2 Mesh and volume comparison
5.2.3.3 Simulated radiographic images
5.2.3.4 Assessment of task realism
5.2.3.5 BMI and pathology modifications
5.2.4 Discussion
5.3 General Conclusions

Chapter 6 Virtual clinical trial in chest radiography
6.1 Methods
6.1.1 Anthropomorphic computational models
6.1.2 Image simulation
6.1.2.1 Scatter images
6.1.2.2 Primary images
6.1.2.3 Adding sharpness and noise
6.1.3 Organ dose calculations
6.1.4 Image dataset
6.1.5 Observer study using FROC
6.1.6 Statistical analysis
6.2 Results
6.2.1 Simulated images
6.2.2 Organ doses
6.2.3 Reading study
6.2.3.1 JAFROC analysis – dose modalities
6.2.3.2 JAFROC analysis – tube voltage/grid use modalities
6.2.3.3 Catheter localization vs dose
6.2.3.4 Noise level acceptability vs dose
6.3 Discussion
6.4 Practical conclusions

Conclusions and future work

References
Summary
Acknowledgments
Curriculum Vitae

List of abbreviations

Acronym Term
AAPM American Association of Physicists in Medicine
AEC Automatic Exposure Control
AFROC Alternative Free-response Receiver Operating Characteristic
AI Artificial Intelligence
AP Anterior Posterior
AUC Area Under the Curve
BMI Body Mass Index
CI Confidence Interval
CNR Contrast to Noise Ratio
CR Computed Radiography
CT Computed Tomography
CXR Chest Radiography
DAK Detector Air Kerma
DCC Dose Conversion Coefficient
DI Deviation Index
DQE Detective Quantum Efficiency
DR Digital Radiography
DRL Diagnostic Reference Level
EI Exposure Index
ESAK Entrance Surface Air Kerma
EXI Siemens Exposure Index
FFD Free Form Deformation
FFT Fast Fourier Transform
FN False Negative
FOM Figure of Merit
FOV Field of View
FP False Positive
FPD Flat Panel Digital Detector
FPF False Positive Fraction
FROC Free-response Receiver Operating Characteristic paradigm
GGO Ground Glass Opacity
HD Hausdorff Distance
HU Hounsfield Unit
HVL Half Value Layer
IAK Incident Air Kerma
ICC Intra-Class Correlation
ICS Image quality Criteria Score
JAFROC Jackknife Free-response Receiver Operating Characteristic analysis
KAP Kerma Area Product
LAT Lateral
LUT Look Up Table
MC Monte Carlo
MPV Mean Pixel Value
MTF Modulation Transfer Function
NNPS Normalized Noise Power Spectrum
NPS Noise Power Spectrum
PA Posterior Anterior
PACS Picture Archiving and Communication System
PMMA Poly(methyl methacrylate)
PV Pixel Value
PVC Polyvinyl chloride
QC Quality Control
RAF Realistic Anthropomorphic Flexible
REX Reached Exposure Value
ROC Receiver Operating Characteristic
ROI Region of Interest
SDNR Signal Difference to Noise Ratio
SE Standard Error
SID Source-to-Image receptor Distance
SPR Scatter to Primary Ratio
TEI Target Exposure Index
TN True Negative
TO20 Contrast detail test object produced by Leeds Test Objects
TP True Positive
Tp Transmission of Primary radiation
Tt Transmission of Total radiation
TQM Dose management platform
VCT Virtual Clinical Trial
VGA Visual Grading Analysis
VGAS Visual Grading Analysis Score
VGC Visual Grading Characteristics
VSC Vlaams Supercomputer Centrum

General Introduction

Motivation for chest radiography

Chest radiography (CXR) is one of the most frequently performed diagnostic procedures in clinical
practice. According to ICRU Report 70, CXRs constitute approximately 25% of all X-ray examinations performed
in radiology departments [1]. Despite the availability of alternative diagnostic techniques such as Computed
Tomography (CT), which produces volumetric (3D) datasets, plain radiography remains the core imaging
examination for the chest. Its main advantages are short acquisition times, rapid reading and interpretation, relatively
low cost and wide availability. Compared to modalities like CT, the dose delivered to the patient is low [2], and,
furthermore, the images are easily archived and occupy little space on the picture archiving and communication
system (PACS). Erect examinations of the thorax are a fast way to rule out a broad range of respiratory diseases
or to monitor response to a certain treatment. Additionally, mobile systems are widely used in Intensive Care
Units (ICU) to monitor patients in critical condition, without having to transport them to the radiology department.

Chest radiography involves the imaging of a wide range of tissues with different X-ray attenuations, from high
density in the bones to low density in the lung tissue. This results in a wide dynamic range that must be
accommodated by the X-ray detector. Furthermore, the imaging system must be able to render extremely fine
details in the lung tissue and bone structure, and at the same time show low contrast structures in the lungs. This
presents a great challenge in terms of X-ray technique selection, the imaging receptor and the applied image
processing. Additionally, there is usually a combination of clinical reasons for which a chest radiograph is
requested. All these factors mean that thorax imaging remains a challenging examination today, despite all the
accumulated experience over decades of CXR studies [3].

The guidelines for good radiographic technique were established by the European Commission in 1996 [4]. For
Posterior Anterior (PA) and Lateral (LAT) chest projections, they suggest a tube voltage of 125 kVp, a total
filtration equal to or greater than 3 mm of aluminium (Al), and exposure times of less than 20 ms and 40 ms for PA
and LAT projections, respectively [4]. This guidance was published explicitly for screen-film (S/F) (i.e., analogue)
techniques and remains influential to this day despite the dramatic change in technology currently used for chest
imaging.

Nowadays, most radiology departments use exclusively digital X-ray detectors for projection imaging. Digital
detectors offer many advantages over S/F technology, including increased dynamic range, immediate availability
of images on the hospital imaging network and extremely flexible post-processing options. However, the
introduction of digital detectors has the potential to increase patient dose [5,6]. An increase in radiation exposure
might go unnoticed as this reduces image noise and consequently improves image quality [5,6]. In S/F imaging,
such an increase in exposure would have led to a loss of contrast and increase in optical density, possibly rendering
the film unreadable. Notwithstanding all the benefits of digital radiography with flat panel detectors and the scope
for optimization in chest X-ray imaging, many current clinical protocols are derived from a combination of the
European guidance [4] for S/F imaging, manufacturer’s advice and pre-set protocols installed on the system,
together with local protocols established over many years, probably using the older S/F technology.
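
Why an overexposure can go unnoticed on a digital detector can be illustrated with a minimal sketch. It assumes an idealised, quantum-limited detector in which the photon count per pixel follows Poisson statistics, so that relative noise falls with the inverse square root of the exposure. The function names and the baseline photon count are illustrative assumptions and are not taken from this thesis; real detectors also show electronic and structural noise.

```python
import math


def relative_quantum_noise(mean_photons: float) -> float:
    """Relative noise (standard deviation / mean) of a Poisson photon count."""
    if mean_photons <= 0:
        raise ValueError("mean_photons must be positive")
    # Poisson statistics: variance equals the mean, so sigma / mean = 1 / sqrt(mean).
    return 1.0 / math.sqrt(mean_photons)


def noise_change_factor(exposure_factor: float) -> float:
    """Factor by which relative quantum noise changes when exposure is scaled."""
    if exposure_factor <= 0:
        raise ValueError("exposure_factor must be positive")
    return 1.0 / math.sqrt(exposure_factor)


if __name__ == "__main__":
    baseline_photons = 10_000.0  # hypothetical mean photon count per pixel
    for factor in (0.5, 1.0, 2.0, 4.0):
        noise = relative_quantum_noise(baseline_photons * factor)
        print(f"exposure x{factor}: relative noise {noise:.4f}")
```

Under these assumptions, each increase in exposure only makes the image look less noisy, so image appearance alone gives no warning of overexposure.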

Technical aspects and clinical request of CXR

There are three standard projections in chest radiography: PA, LAT and Anterior Posterior (AP). In the PA
configuration the patient stands erect facing the entrance plane of the image receptor/grid assembly (known as the
Bucky) with their back facing the X-ray source. In the LAT projection the patient stands with the left side
facing the Bucky. AP projections are commonly acquired in patients with limited mobility who are confined to a
bed or a chair; in this case the detector is placed behind the patient. The choice of projection results in marked
differences in the images in terms of the anatomy displayed, for example in AP the heart is magnified with respect
to PA [7].
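
The larger heart size in AP follows from projection geometry and can be sketched as below. The example assumes a point X-ray source and uses hypothetical distances chosen only for illustration: the same source-to-image distance for both projections, with the heart about 5 cm from the detector in PA and about 15 cm in AP. The function name is likewise an assumption.

```python
def geometric_magnification(sid_cm: float, object_to_detector_cm: float) -> float:
    """Magnification for a point source: SID / (SID - object-to-detector distance)."""
    if sid_cm <= 0:
        raise ValueError("sid_cm must be positive")
    if not 0 <= object_to_detector_cm < sid_cm:
        raise ValueError("object must lie between the source and the detector")
    return sid_cm / (sid_cm - object_to_detector_cm)


if __name__ == "__main__":
    # Hypothetical distances: the heart lies anteriorly, so it is closer to the
    # detector in PA (chest against the Bucky) than in AP (back against the detector).
    pa = geometric_magnification(sid_cm=180.0, object_to_detector_cm=5.0)
    ap = geometric_magnification(sid_cm=180.0, object_to_detector_cm=15.0)
    print(f"PA magnification: {pa:.3f}")
    print(f"AP magnification: {ap:.3f}")
```

In practice, AP images of bedridden patients are often acquired at a shorter source-to-image distance, which would increase the difference further; that effect is not modelled here.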

Lists of indications for chest X-ray in symptomatic and asymptomatic patients have been reported in many
publications [1,8]. Some of the more common indications are:

• Known or suspected thoracic abnormality in patients in primary care and general practice, in secondary or
tertiary care (hospitalized patients), and at follow-up.
• Screening for lung cancer in the asymptomatic (high risk) population.
• Symptoms or signs associated with the respiratory and cardiac systems, such as chest pain, dyspnoea or
cough.
• Daily monitoring of patients in the ICU, patients with life supporting devices, and patients who have
undergone cardiac or thoracic surgery.
• Assessment of patients with chest trauma.
• Routine use as a complement to physical examination on hospital admission, in pre-employment or
preoperative assessment, and for occupational lung disease.
• Investigation of occult pulmonary infection, as recommended in post-transplant patients.
• For neonates and children, radiography remains the primary diagnostic study for the chest.
• Ruling out metastasis in patients with extrathoracic malignancies.

Based on the clinical assessment and/or evaluation of the chest radiograph, further examination of the chest with
other imaging modalities may be indicated. This typically happens when chest radiography fails to reveal or
characterize an abnormality.

Image quality and optimization in digital chest radiography

Optimization in diagnostic radiology is intended to ensure that the image quality is sufficient to achieve the correct
decision regarding patient treatment, while keeping the patient exposure as low as possible. While optimization
in S/F imaging focused on achieving the proper exposure in terms of dose magnitude, latitude of the S/F
combination and the achieved optical density, in digital imaging there are possibilities to tune the acquisition
parameters to specific clinical tasks. Examples include the use of lower or higher dose levels depending on the
request [9]. Optimization in digital radiology should therefore always be considered in relation to the task in question.

Achieving a sufficient image quality such that the physician makes consistent, correct clinical decisions is a
complex subject. This can be explained by the different levels of image quality required by the wide range of
clinical tasks in the chest, among them retrocardiac abnormalities, subtle nodules, bone fractures and catheter
positioning/placement [1]. An example is shown in Figure 1. On the left is a noisy, low
contrast chest X-ray image where the clinical task is the visualization of a catheter. On the right is a
good quality, high contrast chest X-ray image where the clinical task is the detection of a lung mass. In both
examples, the quality of the image can be considered satisfactory or appropriate, in the sense that the images
display the necessary information that was requested.

Technical parameters such as tube voltage, X-ray tube filtration, scattered radiation rejection method (i.e., air gap
or antiscatter grid), and automatic exposure control (AEC) selection have a direct influence on the image quality
and consequently on the dose delivered to the patient. The influence on diagnostic performance of these
parameters and others like the image receptor, image processing and display is the focus of many optimization
studies. To study the influence of these parameters on clinical performance, numerous images comparing different
exposure settings are often required. Many methods exist to perform these types of studies and these are covered
briefly in the section below.

Methods for optimization

First, there are clinical studies, which offer the gold standard in terms of tasks, realism, and observer performance
determined by the radiologist. However, a number of factors may preclude their use. For example, a large number
of patients are required to generate statistically meaningful results, which can be difficult for the less common
pathologies. This makes optimization a time-consuming process in terms of case selection. Case reading may also
be problematic for chest radiology procedures, given the broad range of chest tasks that need to be sampled. A
further limitation of clinical studies is that results may only apply to patient groups of similar body habitus or
pathological condition, as the same patient cannot be imaged repeatedly without clinical justification. This can
affect the study design, making it difficult to highlight which exposure parameter is more likely to affect the
diagnostic performance, since anatomical differences and patient positioning can affect the outcome of the study.

An alternative may be to use physical phantoms as surrogates for patients. However, physical phantoms allow
little variability in the anatomy and the modelling of clinical tasks is very limited. Among the works using physical
phantoms, Tingberg and Sjostrom [10] used an anthropomorphic chest phantom to acquire images at different
tube voltages. They performed a Visual Grading Analysis (VGA) study using subjective scores based on the
CEC Quality Criteria [4] and found that image quality increased with the reduction of tube voltage at a fixed
effective dose. Numerous authors have performed optimization studies based on overall image quality criteria to
investigate the effect of different exposure parameters like dose level, patient thickness and tube voltage [11–13].
Metz et al. [14] performed a receiver operating characteristic (ROC) study in which the influence of tube voltage
and dose level on diagnostic performance was investigated. Images of an anthropomorphic physical phantom with
inserted nodules and interstitial and reticular lesions were evaluated by radiologists. They found that reduction of
dose in the mediastinum caused a significant decrease in detectability, whereas in the lung region this effect was
not seen. They also found 120 kVp to be the most effective tube voltage for detectability and effective dose.

Figure 1. PA chest X-ray. a) Low contrast image for catheter visualization. b) High contrast image for lung nodule detection.

On the other hand, medical physics approaches tend to favour the measurement of technical parameters that
characterize the X-ray tube, grid and detector performance. X-ray tube performance is assessed using the peak
tube voltage, the exposure time and half value layer (HVL). Detector performance can be characterized using
parameters such as modulation transfer function (MTF), noise power spectrum (NPS) and detective quantum
efficiency (DQE). The MTF is used as a quantitative measure to describe the resolution or sharpness of the system.
The MTF quantifies the degradation of contrast (or modulation) transfer from detector input to output as a function
of spatial frequency. The NPS is used to characterize the noise texture in the image and is defined as the variance
within an image divided among various spatial frequency components. Finally, the DQE is the combined effect
of the sharpness and noise performance of the system [15,16]. Overall imaging performance of the system and/or
the detector can be measured using test objects, typically constructed from homogeneous poly(methyl
methacrylate) (PMMA) plates in combination with metal details or inserts, to quantify the small detail
detectability and signal difference to noise ratio (SDNR). These technical measures can be made with excellent
precision and reproducibility but have the disadvantage of being far away from anatomical structures in real
patient images and the associated imaging tasks. Consequently, the results of these methods may not directly
predict clinical task performance. Following this line of research is the work of Launders [17] in which an
amorphous selenium digital detector was evaluated. DQE and threshold contrast were used as metrics to compare
system performance at different tube voltages. As a result, an optimal tube voltage between 90 and 110 kV was
proposed. In the work of Dobbins et al. [18], tissue contrast to bone contrast was used along with a figure of merit
(FOM) equal to squared signal to noise ratio (SNR) divided by the incident exposure to the patient as dose metric.
They found that 120 kVp and 0.2 mm copper filtration gave the best overall performance for both metrics used.
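
The way the DQE combines sharpness and noise can be illustrated with the commonly used relation DQE(f) = MTF(f)² / (q · NNPS(f)), where NNPS is the NPS normalized by the squared mean signal and q is the incident photon fluence. The sketch below is illustrative only: the function name and all numerical values are assumptions, not measurements from this work.

```python
def dqe(mtf, nnps, fluence):
    """Detective quantum efficiency at a single spatial frequency.

    mtf     : modulation transfer function value (dimensionless, 0..1)
    nnps    : normalized noise power spectrum (mm^2)
    fluence : incident photon fluence q (photons/mm^2)
    """
    if nnps <= 0 or fluence <= 0:
        raise ValueError("nnps and fluence must be positive")
    return mtf ** 2 / (fluence * nnps)


# Hypothetical values, chosen only to illustrate the calculation.
q_assumed = 25000.0  # photons/mm^2
value = dqe(mtf=0.5, nnps=2.0e-5, fluence=q_assumed)
print(f"DQE = {value:.2f}")
```

For a fixed fluence, a lower MTF or a higher NNPS both reduce the DQE, which is why the metric is read as a combined measure of sharpness and noise.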

To overcome many of the challenges and shortcomings of these methods, computational modelling can be applied
in a method now called a Virtual Clinical Trial (VCT) [19]. In a VCT, computer simulations can be used to model
imaging systems including the X-ray source and detector and they can also include accurate, objective, and
detailed measurements of detector imaging performance, using the MTF and NPS. Realistic computational models
of anatomy can be used as input in the simulations as patient surrogates. These phantoms can simulate the
variability in the human anatomy [20–22]; additionally, different pathologies and devices can be included as
clinical tasks [23,24]. The simulated images of the anthropomorphic models can then be evaluated by human or
model observers [25–27]. As a result, VCTs are a practical, inexpensive, and flexible way to investigate diagnostic
performance in controlled environments where the ground truth is always known. These methods are not free
from difficulties of course, in that the imaging system must be accurately modelled and the anatomy and pathology
must be sufficiently realistic. VCTs have been successfully applied in different imaging modalities. In CT
imaging, these methods have been used to evaluate organ doses for different acquisition protocols [28] and to test
different reconstruction techniques [29–31]. In breast imaging VCTs have been conducted to evaluate lesion
detectability for different imaging modalities [32].

Regarding simulation platforms to produce synthetic images, there are a few specifically developed for chest
radiography. Among them, Sandborg et al. [33] have used a Monte Carlo framework to simulate radiographic
images of computational models, which were used to optimize chest radiography in a S/F system based on CEC
image quality criteria. This framework lacks detailed detector characteristics in the resultant images. The
framework was also applied by Ullman et al. [34] to investigate nodule detection in an anthropomorphic
computational phantom. In that case, the transmission of primary and scattered photons through the grid was not
simulated but calculated analytically following the work of Day and Dance [35]. Ullman et al. used nodule contrast
relative to bone contrast and a FOM equal to SNR² divided by the effective dose to evaluate the effect of exposure
settings on nodule detection. They found that, except for the thickest patients, the air gap provided better SNR
than the grid-in technique at a lower effective dose. Regarding tube voltage, no clear recommendation was made: they
found that the FOM was higher between 90 and 120 kVp while the tissue to bone contrast indicated 120 to
150 kVp to be better.

An approach to simulating images for the study of chest radiography systems has also been presented by Moore
et al. [36] in which patient CT images are used as input for the simulations. This framework lacks detailed
characterization of scattered radiation. Instead, the scatter fractions are measured experimentally using physical
phantoms; deviations between the phantoms and patients may limit the accuracy of the simulated images.
Additionally, for this type of measurements it is difficult to evaluate the change in scatter conditions across the
image.

Other X-ray simulators have been created for different imaging modalities, such as neonatal
imaging [37], cone beam CT [38], mammography [39] and CT [40]. These simulators do not include the simulation of
the antiscatter grid, as it is not required in some of these imaging applications, and often lack detailed modelling
of the detector characteristics.

From the information collected above it can be seen that the literature on chest optimization studies is extensive, as
might be expected given the frequency and importance of this examination; it also covers different imaging
technologies and methods for optimization. A broad range of conclusions exists regarding the optimal parameters
for chest radiography. There is no clear consensus on the tube voltage technique to be used (high or low) nor on
the scatter rejection method to be preferred [11,18,34,41–43]. It is also clear that the outcomes of these studies
depend somewhat on the different ways in which image quality is defined and measured. In the past few years,
the notion of a task-based definition of image quality has emerged [44], together with the view that studies should
be linked to the clinical task to be answered. This is something that is missing in a number of these studies. Computer simulations
have proven to be an effective and versatile tool for the simulation of X-ray images. However, there is need for a
more comprehensive tool for chest radiography, one that can accurately simulate the scatter and primary
contributions to the image formation and is able to incorporate real detector characteristics like sharpness and
noise in the image.

Objectives of the thesis

The main objective of the thesis is the creation of a simulation platform that can be used to produce physically
correct synthetic radiographic images of computational anthropomorphic models. The framework should allow
the study of different elements of the imaging chain that influence clinical task performance. Along the same line,
realistic anatomical models should be included.

Ideally, chest X-ray examinations should be indicated with a full reference to the clinical request, something that
was missing in earlier studies. Because of the absence of clinical tasks in the optimisation process and the rapid
technological evolution in imaging today, CXR protocols should be updated in the light of this development. Such
methods are also required if the most is to be made of newly developed CXR X-ray detector technology [45].
Simulation platforms represent a fast and efficient tool that can be applied to optimize imaging systems, resulting
in more efficient and effective patient imaging.

This thesis describes the development of a set of tools that have been validated and applied to the study of diagnostic
performance in CXR, but in fact they set the groundwork for simulating a wide range of new devices and
applications. The thesis objectives were:

• Investigate the current clinical protocols used for chest radiography in adult patients.
• Evaluate different exposure parameters using a simple experimental approach for optimization of
exposure techniques. Compare the results with the current clinical protocols used for chest radiography.
• Identify a suitable physical anthropomorphic phantom to be used as surrogate of real patients for
experimental measurements to assess dose and image quality.
• Evaluate different standard methods for dose and image quality monitoring and their suitability for
optimization studies in chest radiography.
• Develop a methodology for the simulation of the imaging chain, allowing the generation of synthetic
radiographic images, including realistic levels of sharpness and noise for a given X-ray detector.
• Validate the imaging chain simulation using standard image quality metrics.
• Develop realistic anthropomorphic computational models of the thorax. The models should be able to
represent different body types and include a range of clinical tasks commonly found in chest X-ray.
• Perform a Virtual Clinical Trial for chest radiography using the tools created to study the influence of
different exposure techniques in diagnostic performance.

These objectives will be addressed in six chapters structured as follows:

Chapter 1: ‘Chest radiography at UZ Leuven hospital. Classical optimization study of chest radiography’.
Includes a survey of four room-based X-ray systems at UZ Leuven Hospital. The survey looks at the current
technical set up and exposure parameter selection for chest posterior anterior (PA) projection imaging.
Additionally, the exposure indicator (EI) and kerma area product (KAP) values for adult PA thorax examinations
were monitored and compared in the systems under study. Subsequently, a standard physics test object-based
experimental approach was applied in one of the X-ray systems. The study investigated the influence of exposure
parameters and antiscatter grid use on the visibility of image details with different material compositions over a
homogeneous background. The outcome of the method is measured in terms of a FOM that includes a measure of
image quality, i.e., the signal difference to noise ratio (SDNR), and a patient dose metric, in this case lung dose
calculated from Monte Carlo simulations. The FOM values were used to establish the beam quality and scatter
condition combination that gave the highest detectability per unit dose. Higher FOM values were found for the low
tube voltage technique, in contrast to the clinical protocols used. This type of measurement can be easily
performed and can provide insight into the influence of the radiation quality on SDNR; however, the object used
does not reflect actual thorax anatomy.

Chapter 2: ‘Physical thorax phantom Lungman, characterization and validation for CXR studies’. In the search
for a more realistic test object, the validation of a physical CXR anthropomorphic object is then performed. The
Kyoto Kagaku Lungman thorax anthropomorphic phantom was validated for dose assessment purposes in chest
radiography. The validation is performed by comparing the Lungman phantom with a range of patients in terms
of kerma area product, exposure time and exposure delivered to the X-ray detector (i.e., using the Exposure Index),
for chest PA and LAT examinations. A computational voxel model of the Lungman was created by segmenting
the CT images of the phantom. This model was used in Monte Carlo simulations to compare the organ doses
obtained using the tissue equivalent materials of the physical phantom with real tissue composition reported by
ICRP Publication 89 [46]. The Lungman phantom proved to be appropriate for dose and AEC performance
evaluation of X-ray systems. However, a more realistic and flexible computational model was necessary.

Chapter 3: ‘A new approach to dose and image quality surveys in chest radiography’. Different methods of
evaluating dose and image quality in chest radiography were evaluated. First, a new method for dose monitoring
is introduced. The method uses KAP and EI values to identify outliers in chest PA examinations performed in
adult patients rather than just the KAP values, as is commonly done. The examinations were performed in the
room-based X-ray systems described in Chapter 1. Then, a survey of chest radiography systems in Flanders was
performed. The image quality of the systems was compared using a contrast-detail object and the Lungman
anthropomorphic thorax phantom, validated in Chapter 2. The correlation between contrast-detail analysis and
VGA from the phantom images was studied. The study showed the need to optimize a number of systems in the
chest X-ray departments included in the survey. No correlation was found between the results obtained from
technical image quality and clinical image quality as perceived by radiologists. These methods used to evaluate
image quality were to some extent quantitative but open ended, suggesting the need for a more comprehensive
tool for task-based optimization of CXR.

The first step towards the creation of a more comprehensive tool for task-based optimization was described in
Chapter 4: ‘Methodology and validation of a simulation platform for Virtual Clinical Trials in chest radiography’.
The work presented covers one of the main objectives of the thesis: the creation of a simulation platform that can
be used to study different elements of the imaging chain and their influence on image quality and doses delivered
to the patients. The methodology to simulate synthetic radiographic images consists of three main stages: (1)
creation of noise-free hybrid simulated images (the term hybrid is used because Monte Carlo simulations and ray
tracing techniques are combined in a single image), (2) addition of measured real detector characteristics to the
hybrid images, to obtain realistic levels of sharpness and noise and (3) a validation step, where simulated and
experimental data are compared. The methodology is validated using a standard image quality metric (SDNR)
measured in real and simulated images. Additional validation of the noise and sharpness modification routine and
the modelling of a moving antiscatter grid were also carried out. With a validated platform the next step was the
development of realistic models of human anatomy.

Chapter 5: ‘Modelling of anthropomorphic chest phantoms and associated clinical tasks’, describes the second
crucial stage for task-based Virtual Clinical Trials, which is the creation of realistic patient models including a
range of pathologies. The methodology to create a library of anthropomorphic thorax phantoms representing
different body types was described. The phantoms were based on an existing polygonal mesh model, the Realistic
Anthropomorphic Flexible (RAF) phantom [22]. This phantom was used over the computational model of the
Lungman because of its more realistic background and flexibility. A set of lesions and devices commonly found
in thorax exams were modelled and included within these phantoms. A similar methodology based on patient CT
image segmentation and mesh modelling was then implemented to create 3D computational models of pathologies
associated with Covid-19 disease. The anatomy of the models and of the simulated clinical tasks were validated
by experienced radiologists. Additionally, the images were uploaded for analysis to an AI software which served
as extra validation.

Finally, the simulation framework from Chapter 4 and the computational models from Chapter 5 are applied in
Chapter 6: ‘Virtual clinical trial in chest radiography’. An image dataset was created from simulated radiographic
images of the computational phantoms generated at different exposure settings. Dose conversion coefficients for
the different exposures were also calculated for each of the phantom types. A free response receiver operating
characteristic (FROC) analysis was performed to evaluate the performance of four radiologists in detecting a series
of clinical tasks commonly found in chest radiography. The effect of different tube voltages, grid use and dose
levels on diagnostic performance was studied.

The manuscript closes with a general conclusion and an outlook.

Chapter 1
Chest radiography at UZ Leuven hospital.
Classical optimization study of chest
radiography

This chapter introduces some basic approaches to the subject of optimization in chest radiography and is used to
establish the starting point to the PhD work. There are two sections to this chapter. The first part includes a small
survey of four digital radiography (DR) X-ray rooms in the radiology department in UZ Leuven hospital where
chest radiography examinations (CXR) are performed. The system survey was performed in terms of exposure
settings used, Kerma Area Product (KAP) and Exposure Indicator.

In the second part, a standard physics test object-based experimental approach was applied to study the influence
of exposure parameters and antiscatter methods on the visibility of image details with different material
compositions. The method relies on measurements of signal difference to noise ratio (SDNR) at various exposure
settings and calculation of lung doses. The SDNR is used as a simple metric to quantify object detectability while
lung dose is used as the patient dose metric. A Figure of Merit (FOM) was established from these two parameters
in order to determine the exposure and antiscatter technique combination that provides better detectability per unit
dose.

1.1 Chest radiography at UZ Leuven

1.1.1 Methods

A survey was performed with the objective of collecting exposure data from chest radiography exams carried out
in adult patients over a period of time. In UZ Leuven, chest X-ray exams are performed in several X-ray systems
from different manufacturers. Among them are a Canon CXDI-11 (room 1), a Carestream DRX Evolution (room 2),
an Oldelft Triathlon DR (room 3) and a Siemens Axiom Luminos dRF (room 4). The four systems use Caesium
Iodide (CsI) indirect-conversion flat-panel digital detectors (FPD). In addition to these systems, CXR
examinations are also performed at the Emergency Department using an Oldelft Triathlon system also with a CsI
FPD. Regarding mobile X-ray systems, there are eleven of these devices, mainly used in the Intensive Care Unit
(ICU). Among them is a Siemens Mobilett XP Hybrid, used with Computed Radiography (CR) cassettes with
needle-structure storage phosphors. Finally, there are ten mobile units that use wireless flat panel digital
radiography detectors: three Carestream DRX Revolution mobile, four Agfa DX-D100 and three Oldelft Mobile
DR.

To examine various aspects of current chest X-ray use, four X-ray rooms used for CXR examinations in the
Radiology Department were monitored for a 12-month period (April 2015-April 2016). Patient exposure
information from these systems was collected using Total Quality Monitoring software (TQM) (Qaelum NV,
Belgium). The software retrieves data relevant to the examination from the DICOM header of each patient image.
These parameters are stored in a database and can be filtered as a function of examination, study type and system,
for user defined periods of time. The exposure settings (tube voltage (kVp) and tube current-time product (mAs)) for chest
exams in these rooms were recorded, together with the number of examinations performed in the selected period.
Additionally, parameters like Kerma Area Product and Exposure Indicator were studied.

KAP is a parameter used for monitoring the exposure delivered to the patient during a radiographic exam. KAP
is the product of air kerma and irradiated area and is usually expressed in units of cGy·cm². KAP is measured by
placing an ionization chamber (KAP meter) at the collimator; this ionization chamber is larger than the X-ray
beam in order to cover the entire radiation field. KAP is independent of the distance to the source, as the irradiated
area increases with the square of the distance from the focus, while the air kerma decreases with the inverse square
of that distance.
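
The distance independence of KAP can be checked numerically. The following is a minimal sketch, assuming an ideal point source and neglecting attenuation and scatter in air; the function names and reference values are hypothetical and serve only to illustrate the inverse square argument.

```python
def kerma_at(distance_cm, kerma_ref, ref_distance_cm=100.0):
    """Air kerma at a given focus distance (inverse square law, ideal point source)."""
    return kerma_ref * (ref_distance_cm / distance_cm) ** 2


def field_area_at(distance_cm, area_ref, ref_distance_cm=100.0):
    """Irradiated area at a given focus distance for a diverging beam."""
    return area_ref * (distance_cm / ref_distance_cm) ** 2


def kap(distance_cm, kerma_ref, area_ref):
    """Kerma area product: the two distance dependences cancel."""
    return kerma_at(distance_cm, kerma_ref) * field_area_at(distance_cm, area_ref)


# Hypothetical reference values at 100 cm: 0.01 cGy over a 30 cm x 30 cm field.
for d in (50.0, 100.0, 180.0):
    print(f"distance {d:5.1f} cm -> KAP {kap(d, kerma_ref=0.01, area_ref=900.0):.3f} cGy·cm²")
```

In this idealized model the product is the same at every distance, which is why the KAP meter can be mounted at the collimator rather than at the patient.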

The exposure index (EI) is a parameter defined and used in digital radiography to report the radiation exposure at
the detector entrance for a given image. Ideally, the EI provides feedback to the technologists regarding
appropriate radiographic techniques. This is particularly necessary in digital systems, where image processing
makes it difficult to assess visually whether the detector is correctly exposed, as opposed to screen-film systems
where an under or over exposed detector is immediately visible. However, the exposure indicators used for many
X-ray detectors are specific for each manufacturer and even for every imaging system, which leads to differences
in scale [47,48]. The International Electrotechnical Commission (IEC) has proposed a unified indicator called
Exposure Index (EI) [48]; some efforts to harmonize these values have also been made by the American
Association of Physicists in Medicine (AAPM) task group [47]. The new standard for EI [47,48] defines a linear
relationship between EI and the exposure at the detector entrance, in place of the non-linear (inverse or
logarithmic) relationships previously used by manufacturers. Hence EI changes as the exposure level at the
detector input changes, giving an indication of the exposure. However, the EI on its own does not give an
indication whether the image has been correctly exposed or not; two additional parameters must be considered.
The first is the target exposure index (TEI) that is generally set for each examination by the manufacturer.
Different TEI values can be set depending on the body part to be examined (chest, extremities, abdomen, etc.)
and can also vary between different systems, depending on the filtration, sensitivity of the detector plate, etc. The
second is the deviation index (DI), which expresses whether the detector is exposed as predefined, towards a
specific TEI. The DI is calculated using equation 1.1.
$$DI = 10 \cdot \log_{10}\left(\frac{EI}{TEI}\right) \tag{1.1}$$
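
Equation 1.1 can be evaluated with a few lines of code. The sketch below is illustrative: the function names are assumptions, and the EI values in the example are hypothetical rather than taken from the survey data.

```python
import math


def deviation_index(ei, tei):
    """Deviation index from equation 1.1: DI = 10 * log10(EI / TEI)."""
    if ei <= 0 or tei <= 0:
        raise ValueError("EI and TEI must be positive")
    return 10.0 * math.log10(ei / tei)


def exposure_ratio(di):
    """Invert equation 1.1 to recover EI / TEI from a DI value."""
    return 10.0 ** (di / 10.0)


# Hypothetical example: a detector exposed at the target, and at half the target.
tei_example = 200.0
print(f"DI at target      = {deviation_index(200.0, tei_example):.2f}")
print(f"DI at half target = {deviation_index(100.0, tei_example):.2f}")
print(f"EI/TEI for DI=-3  = {exposure_ratio(-3.0):.2f}")
```

A DI of zero means the detector was exposed exactly at the target, and each step of about 3 units corresponds to a factor of two in detector exposure.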

1.1.2 Results

An overview of the number of examinations performed in each room is shown in Table 1.1. From a total of 70,433
radiographs acquired over the 12-month period for the four rooms, 48.5% corresponded to chest studies. The largest
number of exams was performed with the Canon system in Room 1, with 99% corresponding to chest exams.
Only PA and Lateral CXR views are acquired with this system since the room configuration does not permit
imaging of patients confined in a bed or chair. The remaining systems, Carestream, Oldelft and Siemens, were
used for a range of radiographic exams and, for these rooms, CXR represented respectively 14%, 13% and 18%
of the total number of examinations.

The settings (in terms of tube voltage and tube current-time product) for CXR are also listed in Table 1.1 for bucky
exams, where the patient stands at the detector. Exposure time is controlled by the Automatic Exposure Control (AEC)
device, which is programmed to achieve a constant exposure at the detector, irrespective of patient size; the exposure
is measured using ionization chambers. The AEC systems have three sensing regions, which are selected according
to the projection: for lateral (LAT) projections the central chamber (mediastinum region) is selected while for
posterior-anterior (PA) projections the right and left upper chambers (lung field) are selected. As can be seen, the
default clinical settings for bucky exams are harmonized. Regarding additional tube filtration, only the Oldelft
system uses 0.1 mm Cu. All the rooms utilize antiscatter grids; grid specifications are given in Table 1.2. As can be
seen, the Siemens system has a second antiscatter grid with smaller focus distance for examinations performed in the
bed. Differences can be observed in the grid design, with the grids of the Canon and Oldelft systems having 40
lp/cm, compared to 80 lp/cm for the Carestream and Siemens systems. There is also some variation in the focus
distance, ranging from 125 to 180 cm. All the grids use Al interspace, except those of the Siemens system, where
cellulose fibre (i.e., paper) is used. The source to image receptor distance (SID) is 180 cm in all systems except
the Siemens, which uses 150 cm.

Table 1.1: Exposure settings (kVp and mAs) for the four dedicated X-ray rooms, for examinations on the average patient
performed in the bucky (PA and LAT). Total of radiography exams and CXR examinations performed in the systems
investigated over a 12-month period, with the percentage of CXR exams.

System              SID bucky (cm)   kVp (bucky)   mAs (bucky)   Extra Cu filtration   Total (n)   Number of CXR exams   % CXR
Room 1 Canon        180              125           AEC           -                     27910       27653                 99.1%
Room 2 Carestream   180              120           AEC           -                     4634        656                   14.2%
Room 3 Oldelft      180              125           AEC           0.1 mm                18631       2444                  13.1%
Room 4 Siemens      150              120           AEC           -                     19258       3405                  17.7%
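
The percentages in Table 1.1 follow directly from the exam counts. The short sketch below recomputes them from the tabulated totals; the dictionary layout and function name are illustrative choices, not part of the survey software.

```python
# Exam counts copied from Table 1.1: (total radiographs, CXR exams).
exams = {
    "Room 1 Canon": (27910, 27653),
    "Room 2 Carestream": (4634, 656),
    "Room 3 Oldelft": (18631, 2444),
    "Room 4 Siemens": (19258, 3405),
}


def cxr_percentage(total, cxr):
    """Share of chest exams among all radiographs, in percent."""
    return 100.0 * cxr / total


for room, (total, cxr) in exams.items():
    print(f"{room:18s} {cxr_percentage(total, cxr):5.1f}% CXR")

total_all = sum(total for total, _ in exams.values())
cxr_all = sum(cxr for _, cxr in exams.values())
print(f"All rooms: {cxr_all} of {total_all} radiographs "
      f"({cxr_percentage(total_all, cxr_all):.1f}% CXR)")
```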

Table 1.2: Grid information of the systems surveyed

System                     Grid ratio   Line pairs per cm (lp/cm)   Focus distance (cm)   Interspace material
Room 1 Canon               12:1         40                          180                   Aluminium
Room 2 Carestream          12:1         80                          140                   Aluminium
Room 3 Oldelft             12:1         40                          180                   Aluminium
Room 4 Siemens (2 grids)   15:1         80                          150                   Cellulose fibre
                           15:1         80                          125                   Cellulose fibre

1.1.2.1 Kerma Area Product

In Figure 1.1, KAP values for PA projections of the different systems are shown. The data presented correspond
to three months selected at random within the monitored 12-month period (in this case from January to March 2016).
As can be seen, the larger KAP values fall within the reference levels for chest in the Belgian data surveys by the
Federal Agency of Nuclear Control (FANC). The Diagnostic Reference Level (DRL) reports show KAP values
below 30 cGy·cm² and 10 cGy·cm² for the 75th and 25th percentiles, respectively, of all the centres in
Belgium [49]. The median values of the KAP distributions are 8.7, 5.6, 6.1 and 4.2 cGy·cm² for Room 1, Room 2,
Room 3 and Room 4, respectively. From all the exams performed during the monitored period, 59% had KAP
values below 10 cGy·cm² and 99% below 30 cGy·cm².
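
The summary statistics quoted above (median and fraction of exams below the two reference values) can be obtained from a list of KAP values as sketched below. This is not the TQM software's implementation; the function name is an assumption and the sample values are hypothetical, not survey data.

```python
import statistics


def summarize_kap(kap_values, low=10.0, high=30.0):
    """Median KAP and percentage of exams below two reference values (cGy·cm²)."""
    n = len(kap_values)
    if n == 0:
        raise ValueError("at least one KAP value is required")
    return {
        "median": statistics.median(kap_values),
        "pct_below_low": 100.0 * sum(v < low for v in kap_values) / n,
        "pct_below_high": 100.0 * sum(v < high for v in kap_values) / n,
    }


# Hypothetical KAP values in cGy·cm², for illustration only.
sample = [3.5, 5.0, 7.5, 9.0, 12.0, 18.0, 25.0, 32.0]
summary = summarize_kap(sample)
print(summary)
```

The default thresholds of 10 and 30 cGy·cm² mirror the 25th and 75th percentile reference values cited in the text.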

Figure 1.1: KAP values for PA projections of the four systems investigated over a three-month period (January to March
2016).

1.1.2.2 Exposure Indicator

The mean, median, minimum and maximum of the Exposure Indicator values for PA, AP and LAT projections
were determined for the four systems. These data are shown in Table 1.3 (mean) and Table 1.4 (minimum, median
and maximum). We can see that the systems use different exposure indicators. For example, Canon and Oldelft
use the Reached Exposure Value (REX) [unitless], which is a function of the brightness and contrast as selected
by the operator [47]. Since the brightness and contrast are not changed by the operators at the acquisition console
in our institution, the REX can be used to monitor the exposure at the detector. Siemens uses an Exposure Index
(EXI), which is the average Pixel Value (PV) in the central segment of a 3x3 matrix positioned in the centre of
the field of a For Processing image [units = 100*µGy]. Carestream uses an exposure indicator known as the
Exposure Index (i.e., EI) [units = 100*µGy], which represents the average pixel value of the clinical region of
interest (i.e., segmented anatomy); this EI therefore depends on the part of the body examined [48]. More
detailed distributions of the Exposure Indicator values within the monitored period are shown in Figure 1.2 for
PA projection (a, b and c) and for Lateral projection (d, e and f). Rooms 1 and 3 were plotted together since they
both use REX. We can also see that Rooms 2 and 4 have similar distributions.

Table 1.3: Mean exposure indicator results for AP, PA and LAT projections for the four systems.

Mean Exposure Indicator

Chest projection   Room 1 (REX)   Room 2 (EI) (100*µGy)   Room 3 (REX)   Room 4 (EXI) (100*µGy)
AP                 -              144.1                   372.5          144.2
PA                 178.9          110.6                   209.8          112.0
Lateral            357.6          162.6                   314.0          162.3

Table 1.4: Minimum, median and maximum exposure indicator results for AP, PA and LAT projections for the four systems.

Minimum – Median – Maximum Exposure Indicator

Chest projection   Room 1           Room 2 (100*µGy)   Room 3           Room 4 (100*µGy)
AP                 -                37 - 137 - 252     83 - 340 - 993   81 - 275 - 387
PA                 58 - 172 - 725   61 - 108 - 222     20 - 208 - 513   51 - 108 - 307
Lateral            64 - 349 - 991   72 - 157 - 326     16 - 306 - 698   72 - 160 - 313

Figure 1.2: Exposure Indicator distribution for the four X-ray rooms. a, b and c) Exposure indicators for PA projections for
Room 2 (EI), Room 4 (EXI) and Rooms 1 and 3 (REX), respectively. d, e and f) Exposure indicators for LAT projections for
Room 2 (EI), Room 4 (EXI) and Rooms 1 and 3 (REX), respectively.

The deviation index was also studied for the exams included in the survey. However, the DI calculation is not
always possible, especially on older systems, since it was not until 2008 that these parameters were established
[47,48]. Not all manufacturers have included this in the system software and in fact it may not even be possible
to upgrade the software versions. Therefore, this analysis was only feasible for the Carestream system. This system
sets a target EI of 226 for chest and we would expect a mean DI close to zero. Figure 1.3 shows the DI value
distribution reported for LAT and PA projections for the Carestream system. For LAT and PA projections the
mean deviation indexes were -1.71 and -3.26 respectively. The negative values indicate that the average EI is
lower than the TEI. Through equation 1.1, these DI values correspond to approximately 33% and 53% under-exposure, respectively. From the Figure
it can be seen that the detector is systematically used at lower exposures than the target value input by the
manufacturer. This reflects local optimization of the AEC target exposure values.

Figure 1.3: Deviation index distribution for the Carestream system.

1.2 Classical optimization study

As a next phase in the preparation of the PhD project, a classical experimental approach was applied for technique
optimization. The end point of an optimization process is given by the figure of merit (FOM). The FOM should include
a measure of image quality and of exposure or dose. As a measure of quality, the Signal Difference to Noise Ratio
(SDNR) was calculated. SDNR is considered an appropriate metric for optimization in digital systems, as it combines
the X-ray contrast of some relevant object with a measure of the noise in the image, and it is therefore expected to
correlate to some extent with detectability [50,51]. Thus, this part of the study combines experimental
measurements of SDNR at different beam qualities with lung dose calculations using Monte Carlo simulations.

1.2.1 Methods

1.2.1.1 Optimization variables

In this study three exposure technique variables were considered: the tube voltage, the filtration (thickness and
material), and the degree of scattered radiation (grid in and grid out). Regarding the imaged object, different tasks
or detail types were studied (aluminium, nodule equivalent and bone equivalent materials) on a PMMA background.

The FOM was established to find the settings providing maximum SDNR per unit of lung dose (Dlungs). SDNR is
proportional to the square root of the exposure at a given beam quality and therefore the FOM was defined as
SDNR²/Dlungs. This normalization results in a FOM that is independent of dose for a quantum noise limited system,
but that remains sensitive to changes in beam energy and scattered radiation, i.e., parameters that influence large
area signal and noise in the image. Here, Dlungs is the absorbed dose in the lungs, chosen because of the
importance of this organ in chest radiography.
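
The dose independence of this FOM can be made explicit with a short derivation. The sketch below assumes a quantum noise limited detector with a linear response; the symbol K (air kerma at the detector) and the constants k_1 and k_2 are introduced here purely for illustration and do not appear elsewhere in the text.

```latex
\[
\mathrm{SDNR} = k_1 \sqrt{K}, \qquad D_{lungs} = k_2 K
\quad\Longrightarrow\quad
\mathrm{FOM} = \frac{\mathrm{SDNR}^2}{D_{lungs}} = \frac{k_1^2 K}{k_2 K} = \frac{k_1^2}{k_2}
\]
```

Under these assumptions the exposure level K cancels, so any remaining variation in the FOM comes from k_1 and k_2, which depend on beam quality, scatter and the detail imaged.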

1.2.1.2 Image dataset acquisition

The measurements were conducted on the Carestream DRX Evolution system (Carestream, New York). The
system uses a 35 × 43 cm² CsI based flat panel detector with a pixel spacing of 139 µm and is generally operated
under AEC. The inherent filtration of the X-ray tube is 3.03 mm Al for large focus. Figure 1.4a shows the
experimental setup used in the study. The object imaged consisted of PMMA slabs with a total thickness of 90 mm
and three clinical tasks represented by different materials. The PMMA thickness selected is equivalent to the lung
region [52]. The task simulating materials were placed between the PMMA slabs, with 40 mm of PMMA on the
tube side and 50 mm on the detector side. The tasks consisted of a 1 × 1 cm² aluminium square of 2 mm
thickness, a sphere of 12 mm diameter made of tissue equivalent material (CT number: +100) mimicking a lung
nodule, and a bone equivalent insert (CT number: +200) 2 cm thick. The object was placed at a patient
equivalent position, using the same SID as in clinical practice in the selected room (180 cm).

The Carestream “Pattern” program was used for image acquisition. This is a program with minimal image
processing that can be used for technical studies such as evaluation of the detector or evaluation of the AEC
function [53]. The images were acquired at tube voltages ranging from 60 to 140 kVp in steps of 10 kVp. Three
different filter configurations were used: no extra filtration, additional filters of 0.1 mm Cu + 1 mm Al and 0.2 mm
Cu + 1 mm Al. For simplicity they will be referred to throughout the text as: no filter, 0.1 mm Cu and 0.2 mm Cu.
For each beam quality, seven exposures were made using tube load (i.e., tube current time product (mAs)) values
that covered detector air kerma (DAK) values from 0.7 µGy to 20 µGy. All the images were acquired with the
antiscatter grid in place (Gin) and with the grid removed (Gout). All the images were then exported as DICOM
For Processing with a linear lookup table (LUT).

Figure 1.4: a) Scheme of experimental setup for the SDNR measurements, b) ROIs used for SDNR calculation for the
different material details in the acquired images.

SDNR was calculated for each material insert following the protocol defined by the European Reference
Organization for Quality Assured Breast Screening and Diagnostic Services (EUREF) [54] (equation 1.2). A
region of interest (ROI) of 5 × 5 mm² was placed at the centre of each task and four further ROIs of the same
dimensions were placed in the background around each task. The background ROIs were positioned so that the
distance between the centroid of the detail ROI and that of each background ROI was 10 mm (except for the bone
material, where that distance was increased considering the size of the detail) (see Figure 1.4b). The ROIs were
used to calculate the mean pixel value and standard deviation.

\[ \mathrm{SDNR} = \frac{\left| \mathrm{MPV}_{bkg} - \mathrm{MPV}_{task} \right|}{\sigma_{bkg}} \tag{1.2} \]

where MPVtask is the mean pixel value in the task region, MPVbkg is the mean pixel value in the background
(PMMA) and σbkg is the standard deviation in the background. Once the SDNR values had been measured as a
function of tube load (mAs) for a given exposure setup, i.e., energy, Cu filter setting and grid use, the following
steps were used to estimate SDNR for a given target mean pixel value (MPV).

1. The SDNR values were plotted as a function of MPV measured in the background of the image, using a
   ROI at the centre of the image
2. A power curve of the form $\mathrm{SDNR} = a \cdot \mathrm{MPV}^{b}$ was fitted to the data
   a. From the curve fit coefficients, SDNR can be found at a given MPV, i.e., at the target MPV used
      by the AEC
3. A linear curve fit $\mathrm{MPV} = a + b \cdot \mathrm{mAs}$ was performed
   a. From this, the mAs required to generate the target MPV can be calculated
4. The normalized output of the X-ray tube at 100 cm ($K_{\mu Gy/mAs}$) was measured as a function of tube
   voltage using a calibrated R-100 Piranha dosimeter and a 2nd order polynomial curve fit applied
   ($K_{\mu Gy/mAs} = a + b \cdot kV + c \cdot kV^{2}$)
   a. From this, the air kerma at some point can be calculated for a given mAs and tube voltage
5. From these steps, given a target MPV, the SDNR, mAs and the entrance surface air kerma (ESAK) at
   the phantom entrance plane can be estimated
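
The five steps above can be sketched in a few lines of Python. This is a minimal illustration and not the analysis code used in the study: the function names, the choice of fitting the power curve by linear regression in log-log space, the synthetic noise-free input data and the 165 cm source-to-entrance distance are all assumptions made for the example.

```python
import math


def linfit(x, y):
    """Ordinary least-squares line y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


def quadfit(x, y):
    """Least-squares quadratic, returned as a callable (x rescaled to [-1, 1])."""
    mid, half = (max(x) + min(x)) / 2, (max(x) - min(x)) / 2
    u = [(xi - mid) / half for xi in x]
    s = [sum(ui ** k for ui in u) for k in range(5)]
    rhs = [sum(yi * ui ** k for ui, yi in zip(u, y)) for k in range(3)]
    a = [[s[i + j] for j in range(3)] + [rhs[i]] for i in range(3)]
    for col in range(3):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for row in range(3):
            if row != col:
                f = a[row][col] / a[col][col]
                a[row] = [p - f * q for p, q in zip(a[row], a[col])]
    c = [a[i][3] / a[i][i] for i in range(3)]
    return lambda xq: sum(ck * ((xq - mid) / half) ** k for k, ck in enumerate(c))


def estimate_at_target(mas, mpv, sdnr, kv_cal, output_cal, kv, target_mpv, distance_cm):
    """Return (SDNR, mAs, ESAK in µGy) at the target MPV, following steps 2 to 5."""
    # Step 2: SDNR = a * MPV**b, fitted as a straight line in log-log space (assumption).
    log_a, b = linfit([math.log(v) for v in mpv], [math.log(v) for v in sdnr])
    sdnr_target = math.exp(log_a) * target_mpv ** b
    # Step 3: MPV = offset + gain * mAs, inverted to find the mAs at the target MPV.
    offset, gain = linfit(mas, mpv)
    mas_target = (target_mpv - offset) / gain
    # Step 4: tube output (µGy/mAs at 100 cm) as a quadratic function of tube voltage.
    output_100cm = quadfit(kv_cal, output_cal)(kv)
    # Step 5: inverse square correction from 100 cm to the phantom entrance plane.
    esak = output_100cm * mas_target * (100.0 / distance_cm) ** 2
    return sdnr_target, mas_target, esak


# Synthetic, noise-free demonstration data (NOT measured values).
mas = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
mpv = [10.0 + 450.0 * m for m in mas]
sdnr = [0.4 * v ** 0.5 for v in mpv]
kv_cal = [float(v) for v in range(60, 150, 10)]
output_cal = [-20.0 + 0.5 * v + 0.002 * v ** 2 for v in kv_cal]

sdnr_t, mas_t, esak_t = estimate_at_target(
    mas, mpv, sdnr, kv_cal, output_cal, kv=120.0, target_mpv=504.0, distance_cm=165.0
)
print(f"SDNR = {sdnr_t:.2f}, mAs = {mas_t:.2f}, ESAK = {esak_t:.1f} µGy")
```

With measured data, a non-linear least-squares fit of the power curve may give slightly different coefficients from the log-log regression used here, since the two approaches weight the data points differently.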

1.2.1.3 Figure of merit (FOM)

The target MPV (see Table 1.5) was established for each tube voltage from exposures made under AEC control
of the 90 mm phantom, with no additional filtration and the central AEC chamber set to control the acquisition.
The target MPV was calculated as the average PV at the image centre, measured using a ROI of 10×10 mm². The
MPV was used to quantify the signal rather than the PV linearized to some quantity such as air kerma, because
this quantity relates directly to the energy absorbed in the X-ray sensitive layer of the detector. This applies for
the case of linear response with no PV offset. The use of air kerma as a linearization parameter would require the
use of a response function measured for each beam quality, as the relationship between energy of the X-ray
photons, the air kerma and the X-ray energy absorbed in the X-ray detector changes as beam quality changes. The
SDNR and ESAK at the target MPV were calculated for all tube voltages and filtrations using the steps and
equations described in Section 1.2.1.2.

In order to calculate the lung dose (D), the ESAK values were multiplied by filter and tube voltage specific lung
factors that were calculated using the Monte Carlo code PCXMC (STUK, Finland) [55]. The software calculates
the mean absorbed dose (µGy) averaged over the organ volume for each beam quality used. An adult
hermaphrodite mathematical phantom model is used in these simulations (see Figure 1.5). The phantom was
irradiated in a PA position following a standard clinical procedure. A total of 20000 particles were simulated for
a maximum energy of 150 kVp to cover the range of tube voltages that were used in the experimental setup.
The statistical uncertainties obtained in the simulations were below 1.2%. The lung factors were determined for the
same experimental conditions used in the measurements. The SID was set to 180 cm and the beam dimensions at the
skin surface were 36 × 30 cm². The beam quality (tube voltage and filtration) was adjusted in each case and the
lung factors were calculated for an ESAK of 1 µGy.

For each beam quality, a lung dose conversion factor (Lungfactor) was obtained, which was then multiplied by the
corresponding ESAK to obtain the lung dose (Dlungs) for each specific setting. The figure of merit was then
calculated as follows:

\[ \mathrm{FOM} = \frac{\mathrm{SDNR}^2}{\mathrm{Lung}_{factor} \cdot \mathrm{ESAK}} = \frac{\mathrm{SDNR}^2}{D_{lungs}} \tag{1.3} \]

The FOM for each of the settings studied was then normalized to a reference FOM, corresponding to the FOM
for the PA clinical protocol of the Carestream system: 120 kVp, no added filtration and antiscatter grid in place.
\[ \mathrm{Relative\ FOM} = \frac{\mathrm{FOM}}{\mathrm{FOM}(\text{Clinical PA protocol})} \tag{1.4} \]
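
As a worked illustration of equations 1.3 and 1.4, the short sketch below evaluates the FOM for the nodule task with no added filter and grid in, using the lung factor, ESAK and SDNR values listed in Table 1.5. The function names are chosen here for illustration only.

```python
def lung_dose(lung_factor, esak_ugy):
    """Lung dose in µGy: lung factor (per unit ESAK) times ESAK."""
    return lung_factor * esak_ugy


def figure_of_merit(sdnr, lung_factor, esak_ugy):
    """Equation 1.3: SDNR squared per unit lung dose."""
    return sdnr ** 2 / lung_dose(lung_factor, esak_ugy)


# Nodule task, no added filter, grid in (values taken from Table 1.5).
fom_reference = figure_of_merit(sdnr=9.74, lung_factor=0.73, esak_ugy=42.1)  # 120 kVp
fom_80kvp = figure_of_merit(sdnr=14.62, lung_factor=0.54, esak_ugy=85.1)     # 80 kVp

# Equation 1.4: FOM relative to the clinical PA protocol (120 kVp, no filter, grid in).
relative_fom_80kvp = fom_80kvp / fom_reference
print(f"FOM(120 kVp) = {fom_reference:.2f}, FOM(80 kVp) = {fom_80kvp:.2f}, "
      f"relative FOM = {relative_fom_80kvp:.2f}")
```

A relative FOM above 1 indicates a setting that delivers more SDNR² per unit lung dose than the clinical protocol for that task.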

Figure 1.5: PCXMC main screen with the setups to reproduce the experimental setup.

1.2.2 Results

1.2.2.1 SDNR measurements

Figure 1.6 shows measured SDNR values as a function of MPV for 120 kVp and grid out. The dashed lines are
the power curves fitted to the data for each material (see steps 1 and 2 in Section 1.2.1.2). The procedure was
repeated for all the configurations investigated for each of the clinical tasks; for simplicity, only one graph is
displayed. Using the fitting coefficients, the SDNR values at the target MPV were calculated for all imaging
configurations and clinical tasks. These values are shown in Table 1.5.

Figure 1.6: SDNR calculated for nodule, Al and bone as a function of MPV. Data correspond to 120 kVp, no added filtration
and grid out. Power curves (dashed lines) were fitted to the data.

Table 1.5: For each exposure setting used the values of lung dose and SDNR for the three tasks are shown. The lung dose (D) is calculated by the multiplication of ESAK and the lung factor
calculated in PCXMC. The mAs corresponding to each tube voltage and target MPV is also shown.

No added filter GRID IN GRID OUT

target Lung ESAK SDNR SDNR SDNR ESAK SDNR SDNR SDNR
kV MPV factor mAs (µGy) D (µGy) Al Nodule Bone mAs (µGy) D (µGy) Al Nodule Bone
60 610 0.40 19.2 173.7 68.78 17.02 20.99 39.49 6.0 16.4 6.5 9.38 11.16 21.01
70 650 0.47 9.8 125.5 58.86 13.72 17.74 33.57 3.4 16.1 7.5 8.24 10.36 19.59
80 592 0.54 5.0 85.1 45.74 10.54 14.62 27.89 1.9 13.6 7.3 6.80 9.31 17.35
90 551 0.60 3.0 63.2 37.79 8.64 12.61 23.94 1.2 12.3 7.4 6.06 8.44 15.64
100 538 0.65 2.1 54.1 35.15 7.50 11.46 22.40 0.9 11.5 7.5 5.47 8.02 14.70
110 521 0.69 1.5 46.9 32.55 6.78 10.51 20.55 0.6 10.9 7.6 5.17 7.62 13.68
120 504 0.73 1.1 42.1 30.84 6.23 9.74 19.26 0.5 10.5 7.7 4.78 7.24 13.14
130 480 0.76 0.9 37.2 28.48 5.65 9.17 18.05 0.4 10.0 7.6 4.38 6.71 12.41
140 460 0.79 0.7 34.3 27.23 5.34 8.80 17.29 0.3 9.5 7.5 4.21 6.53 12.39

0.1mm Cu GRID IN GRID OUT

target Lung ESAK SDNR SDNR SDNR ESAK SDNR SDNR SDNR
kV MPV factor mAs (µGy) D (µGy) Al Nodule Bone mAs (µGy) D (µGy) Al Nodule Bone
60 610 0.55 29.5 136.6 75.15 17.40 18.53 35.12 9.8 45.4 24.9 8.27 10.63 19.51
70 650 0.64 13.8 100.6 64.40 13.91 16.31 30.92 5.0 36.3 23.2 7.21 9.99 18.10
80 592 0.71 6.7 69.0 49.02 10.79 13.48 25.04 2.6 27.0 19.2 6.06 8.84 16.27
90 551 0.77 3.8 52.8 40.68 8.76 11.58 21.66 1.6 22.2 17.1 5.42 8.08 14.70
100 538 0.82 2.6 45.2 37.04 7.82 10.86 20.11 1.1 19.9 16.3 4.83 7.61 13.87
110 521 0.86 1.8 39.8 34.27 7.01 10.04 18.30 0.8 18.0 15.4 4.45 7.11 13.12
120 504 0.88 1.4 36.3 31.97 6.36 9.42 17.08 0.6 16.7 14.7 4.17 6.85 12.58
130 480 0.91 1.0 33.0 30.03 5.77 8.58 15.55 0.5 15.3 13.9 3.99 6.48 12.04
140 460 0.93 0.8 30.6 28.44 5.55 8.40 15.46 0.4 14.4 13.4 3.80 6.33 11.61

0.2mm Cu GRID IN GRID OUT

target Lung ESAK SDNR SDNR SDNR ESAK SDNR SDNR SDNR
kV MPV factor mAs (µGy) D (µGy) Al Nodule Bone mAs (µGy) D (µGy) Al Nodule Bone
60 610 0.65 41.2 113.2 73.59 15.06 17.58 35.79 13.5 37.0 24.0 7.68 10.06 18.73
70 650 0.74 17.4 83.4 61.73 12.28 16.04 30.03 6.4 30.8 22.8 6.73 9.72 17.76
80 592 0.82 8.0 58.1 47.62 9.64 13.42 25.23 3.2 23.3 19.1 5.72 8.58 15.73
90 551 0.87 4.5 44.9 39.08 8.05 11.65 21.98 1.9 19.3 16.8 5.01 7.85 14.25
100 538 0.91 2.9 39.0 35.46 7.10 10.67 19.81 1.3 17.5 15.9 4.60 7.34 13.40
110 521 0.94 2.1 34.9 32.77 6.52 9.95 18.69 0.9 16.0 15.0 4.25 6.98 12.98
120 504 0.96 1.5 32.1 30.85 5.91 9.44 17.37 0.7 15.1 14.5 3.99 6.65 12.04
130 480 0.98 1.2 29.3 28.73 5.40 8.81 16.23 0.5 13.8 13.5 3.70 6.33 11.80
140 460 0.99 0.9 27.3 26.99 4.99 8.14 15.17 0.4 13.2 13.0 3.57 5.98 11.25

Figure 1.7 displays the SDNR values as a function of tube voltage for the target MPV for grid in and grid out; the
ESAK values corresponding to the SDNR values can be seen on the secondary vertical axis. When working at fixed
PV, the grid out technique leads to lower ESAK values than grid in, as observed in the figures. Although the dose is
lower, the SDNR values drop considerably and it is likely that this will reduce the detectability of a given object. This
illustrates the importance of working with a FOM that takes into account both SDNR and dose. When working at fixed
target MPV, the SDNR is reduced on average by 35%, 32% and 34% for Al, nodule and bone respectively when the
grid is removed. This reduction is more pronounced at lower energies; for example, for no additional filtration and the
Al detail, the SDNR reduction following the removal of the grid ranges from 45% at 60 kVp to 21% at 140 kVp. This can
be explained by the power dependency of SDNR on PV and thus on mAs (Figure 1.6). At lower energies, grid in
acquisitions require the mAs to be increased by up to a factor of 3 to arrive at the same PV as for grid out. As the energy
increases this factor reduces to approximately 2.
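
The energy dependence of the SDNR loss can be checked directly against Table 1.5. The sketch below uses the Al detail with no added filter; the values are transcribed from the table and the variable names are illustrative.

```python
# Values transcribed from Table 1.5 (no added filter, Al detail).
kv = [60, 70, 80, 90, 100, 110, 120, 130, 140]
sdnr_grid_in = [17.02, 13.72, 10.54, 8.64, 7.50, 6.78, 6.23, 5.65, 5.34]
sdnr_grid_out = [9.38, 8.24, 6.80, 6.06, 5.47, 5.17, 4.78, 4.38, 4.21]
mas_grid_in = [19.2, 9.8, 5.0, 3.0, 2.1, 1.5, 1.1, 0.9, 0.7]
mas_grid_out = [6.0, 3.4, 1.9, 1.2, 0.9, 0.6, 0.5, 0.4, 0.3]

# Percentage SDNR loss when the grid is removed at the same target MPV.
sdnr_loss_pct = [100.0 * (1.0 - out / gin) for gin, out in zip(sdnr_grid_in, sdnr_grid_out)]
# Factor by which the mAs rises with the grid in to reach the same target MPV.
mas_factor = [gin / out for gin, out in zip(mas_grid_in, mas_grid_out)]

for v, loss, factor in zip(kv, sdnr_loss_pct, mas_factor):
    print(f"{v:3d} kVp: SDNR loss {loss:4.1f} %, mAs factor {factor:3.1f}")
```

Because the mAs values in the table are rounded to one decimal place, the mAs factors at the higher tube voltages are only approximate.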

An alternative scheme can now be considered, in which a target SDNR is used for grid in and grid out techniques
instead of a target PV. Table 1.6 shows example values at 80 kVp and 120 kVp for the nodule equivalent task.
The table shows the mAs, MPV, ESAK, SDNR and FOM values when working at a fixed MPV of 592 and 504, or at a
fixed SDNR of approximately 14.6 and 9.7, for 80 and 120 kVp respectively. When working at fixed SDNR, the PV for
grid out increases by factors of 2.7 and 2.0 at 80 kVp and 120 kVp respectively, compared to the grid in case.
Additionally, there is an increase in mAs and consequently in ESAK, but both remain below the values reached for grid
in. This means that imaging with the grid out is more efficient than a grid in technique, as demonstrated by the FOM
values. Specifically, the FOM for grid out is 7% and 19% higher than for grid in at 80 kVp and 120 kVp, respectively.
These results are a clear example of how the AEC works and of the importance of the FOM in this type of study.
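
Because grid in and grid out are compared at the same tube voltage and filtration, the lung factor is the same in both cases and cancels from the FOM ratio. The worked example below uses only the grid in rows and the fixed-SDNR grid out rows of Table 1.6; the subscripts "in" and "out" are introduced here to denote grid in and grid out.

```latex
\[
\frac{\mathrm{FOM}_{out}}{\mathrm{FOM}_{in}}
  = \left( \frac{\mathrm{SDNR}_{out}}{\mathrm{SDNR}_{in}} \right)^{2}
    \frac{\mathrm{ESAK}_{in}}{\mathrm{ESAK}_{out}}
\]
\[
\text{80 kVp:}\quad \left( \frac{14.65}{14.62} \right)^{2} \frac{85.1}{80.2} \approx 1.07,
\qquad
\text{120 kVp:}\quad \left( \frac{9.78}{9.74} \right)^{2} \frac{42.1}{35.8} \approx 1.19
\]
```

At matched SDNR the first factor is close to 1, so the gain in FOM is essentially the reduction in ESAK achieved by removing the grid.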

Figure 1.7 shows that, for the same filter configuration, SDNR values increase at the lower tube voltages. The addition
of extra filtration led to a general reduction in the SDNR, as expected from the loss of contrast with higher energy
X-ray beams. The SDNR for the bone insert is always higher than for the 2 mm Al and the nodule; bone has a higher
linear attenuation coefficient, leading to increased contrast.

Table 1.6: Example of FOM calculation at a fixed MPV or fixed SDNR for grid in and grid out technique at 80 and 120 kVp.

80 kVp

mAs MPV ESAK (µGy) SDNR Nodule FOM (SDNR²/D)


GRID IN 5.04 592 85.1 14.62 4.65
GRID OUT 1.88 592 31.8 9.31 5.045
GRID OUT 4.75 1620 80.2 14.65 4.95

120 kVp

mAs MPV ESAK (µGy) SDNR Nodule FOM (SDNR²/D)


GRID IN 1.14 504 42.1 9.74 3.08
GRID OUT 0.50 504 18.5 7.25 3.89
GRID OUT 0.96 1080 35.8 9.78 3.67

Figure 1.7: SDNR values as a function of tube voltage for a target PV for all filter configurations, for grid in (a) and grid out (b).
The ESAK values corresponding to the SDNR value at the target MPV can be seen in the secondary vertical axis.

1.2.2.2 Lung dose calculation

Figure 1.8a shows the relationship between pixel value and mAs for 120 kVp and no additional filtration (step 3
Section 1.2.1.2). Figure 1.8b shows the measured output at 100 cm from the source as a function of tube voltage for
the filter configurations (step 4 Section 1.2.1.2). Polynomial curves were fitted to the output data and used to establish
the output at any desired tube voltage. The output corrected by the inverse square distance and multiplied by the mAs
necessary to achieve the target PV (obtained from PV vs mAs) gave the ESAK values used in the FOM calculation
(step 5 Section 1.2.1.2).

Figure 1.8: a) Relationship between PV and mAs for 120 kVp and 90 mm PMMA phantom with no additional filtration. b) Output
of the Carestream system, measured without backscatter and no object in the beam for the range of tube voltage studied for the
three filter configurations. Polynomial curves are fitted to the data (dotted lines).

Target PV, mAs and ESAK values for all beam qualities used, grid in and grid out, can be found in Table 1.5. As
expected, for a fixed PV the ESAK is lower for grid out compared to grid in and it decreases for the harder beams,
i.e., when additional filters are added and/or kV is increased. The lung factor calculated in PCXMC, also shown in
Table 1.5, showed the opposite behaviour to ESAK. An increase can be seen at higher tube voltage and with additional
filtration, caused by higher energy photons that deposit their energy in the lung tissue, several centimetres deep within
the patient. The lung factor showed an average increase of 21% and 29% for the addition of 0.1 mm Cu and 0.2 mm
Cu, respectively. The increase in the lung dose value after additional filtration is larger at lower tube voltages than at
higher tube voltages. This is because the increase in the number of photons after additional filtration is more
pronounced at low tube voltage compared to higher tube voltages. The lung dose (Dlungs) was finally calculated by
multiplying the lung factor from PCXMC by the ESAK; values for the lung dose are also shown in Table 1.5.

1.2.2.3 Figure of merit

Figure 1.9 shows the relative FOM with respect to the reference FOM (120 kVp, no added filter and grid
in) as a function of tube voltage for the Al detail (a), nodule (b) and bone (c). The different curves represent the
relative FOM for all filter configurations and for grid in and grid out, for a constant target PV at the detector.

As observed, there is a general trend towards higher FOM at lower tube voltages for all clinical tasks; this increase
is more pronounced for the Al target, with a FOM 3 times higher at 60 kVp compared to the clinical protocol. For the
2 mm Al detail the FOM increases as tube voltage decreases for all filter and grid configurations. For the nodule, the
FOM is higher below 90 kVp for all configurations, while for the bone insert this is the case for tube voltages below
80 kVp. The introduction of additional filtration does not improve the FOM in any of the cases compared to no
additional filtration. Additionally, it can be seen that the curve with the highest FOM values corresponds to no added
filter and grid out for all details. Figure 1.10 (a and b) shows the ratio of the FOM for grid in to the FOM for grid out
for the nodule and bone detail, respectively. As observed, above 80 kVp the FOM for grid out is generally higher
than for grid in, as previously discussed in Section 1.2.2.1.

Figure 1.9: Relative FOM with respect to the reference FOM (i.e., 120 kVp, no added filter and grid in) for Al detail (a), nodule
(b) and bone (c) as a function of tube voltage, for all filter configurations and for grid in (left) and grid out (right).

Figure 1.10: Ratio of FOM for grid in and FOM for grid out as a function of tube voltage for nodule (a) and bone (b)
equivalent details, for all filter configurations.

1.2.3 Discussion

The results in Figure 1.7 demonstrate an increase of SDNR for low tube voltages. There are several potential reasons
for this increase. One is that the quantum absorption efficiency of most digital detectors is higher at low photon
energies, resulting in a higher conversion of incident quanta to detector signal and relatively lower quantum noise
compared to higher energies [56]. Furthermore, reducing photon energies increases photoelectric absorption and
reduces Compton scatter in the patient or test object, leading to a higher contrast. As expected, SDNR values decrease
with the addition of copper filtration, owing to the loss of contrast for the harder beams for the reasons explained
above. The results obtained are in line with previous studies [11,18,51,56–58].

From the relative FOM in Figure 1.9, it can be seen that the introduction of additional filtration does not improve the
FOM in any of the cases compared to the use of no additional filtration.

The FOM curves were highest for no additional filtration and grid out for all clinical tasks. This is in line with the
results from Doyle et al [59], where the FOM for the lung region was higher for the grid out than for the grid in technique.
In the work of Doyle et al, effective dose was used to quantify patient harm or risk in the FOM. However, in the same
study the opposite behaviour was seen for higher attenuating areas like the heart and sub-diaphragm, i.e., the
FOM was higher for grid in than for grid out. This could be expected since only 90 mm of PMMA were required
to represent low attenuating areas like the lungs, where the scatter-to-primary ratio (SPR) is below 1.0, compared to
areas like the mediastinum where SPR generally lies above 1.0. A study by Ullman et al. [60] found large variations
in the SPR across the thorax region; for example, SPR values for a 24 cm thick patient were 0.5 and 2.5 in the lungs
and behind the spine, respectively. Thus, the use of a grid is less beneficial for the lung area and consequently for
the 90 mm PMMA block used to simulate the lung regions in this study. The FOM values obtained are a clear example
of this. To better illustrate how the scatter changes across the thorax region, Figure 1.11 (a and b) shows Monte Carlo
simulated images of a homogeneous 9 cm PMMA block and an anthropomorphic thorax phantom, respectively (the
simulation and the phantom will be described in detail in Chapters 4 and 5, respectively). The pixel values in both
images correspond to SPR values displayed in the calibration bar at the lower left corner. The images were generated
at 120 kVp and grid in. The SPR variation across the thorax region is clearly visible, an effect which is not obtained
with a homogeneous block of PMMA. Although the 90 mm of PMMA is equivalent to the lung region, the SPR is
different from that of the lung region in the phantom. As will be described in Chapter 2, the PMMA
equivalence was calculated by matching the average PV, i.e., in terms of photon attenuation only, and not the scatter
fraction in the region.
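
The link between SPR and the benefit of a grid can be illustrated with the standard SNR improvement factor for a quantum noise limited detector, Tp/sqrt(Tt), where Tp is the primary transmission of the grid and Tt its total transmission. The sketch below is illustrative only: the grid transmissions are hypothetical values, not measurements of the grid used in this study, and detector noise sources other than quantum noise are ignored. Only the two SPR values are taken from the text above [60].

```python
import math


def total_transmission(t_primary, t_scatter, spr):
    """Fraction of the total (primary + scattered) radiation transmitted by the grid."""
    return (t_primary + t_scatter * spr) / (1.0 + spr)


def snr_improvement_factor(t_primary, t_scatter, spr):
    """SNR with grid divided by SNR without grid, at equal exposure of the object."""
    return t_primary / math.sqrt(total_transmission(t_primary, t_scatter, spr))


# Hypothetical grid transmissions, chosen only for illustration.
T_PRIMARY, T_SCATTER = 0.70, 0.10

sif_lung = snr_improvement_factor(T_PRIMARY, T_SCATTER, spr=0.5)   # lungs [60]
sif_spine = snr_improvement_factor(T_PRIMARY, T_SCATTER, spr=2.5)  # behind the spine [60]
print(f"lungs: {sif_lung:.2f}, behind the spine: {sif_spine:.2f}")
```

With these assumed transmissions the factor is slightly below 1 at the lung SPR and clearly above 1 behind the spine, which is the qualitative behaviour described above: the grid pays off only where scatter dominates.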

Figure 1.11: SPR distributions in simulated radiographic images of (a) 9 cm thick PMMA block and (b) a thorax phantom. Both
images were generated at 120 kVp and grid in.

This highlights one of the shortcomings of this study, the fact that the PMMA used was equivalent to the lung area
only in terms of X-ray attenuation and no additional PMMA blocks were used to represent higher attenuating areas
like the mediastinum. Additionally, the homogeneous PMMA cannot represent the large variations in SPR of the chest
which is a limitation for the evaluation of grid use.

As with the SDNR, an increase in the FOM was also observed as tube voltage was reduced for all details, which is
consistent with previous work. For example, a study by Doyle et al [51] used the contrast to noise ratio (CNR) as a
metric for optimization of beam quality for digital CXR. The study used a geometrical chest phantom, and a FOM
defined as CNR² divided by effective dose was calculated for each beam quality. The FOM was highest at 60-70 kVp.
Tingberg and Sjostrom [56] used an anthropomorphic phantom in a study where tube voltage was varied while keeping
the dose constant. They found that clinical image quality per unit of effective dose assessed by Visual Grading
Analysis (VGA) improved with the reduction of tube voltage.

Chotas et al. [58] studied the SNR at a fixed patient risk (effective dose) as a function of tube voltage. Images of an
anthropomorphic chest phantom and a geometrical phantom were acquired using a photostimulable storage phosphor.
The study found better SNR at lower tube voltage in the lung regions, while in the mediastinum and sub-diaphragm
areas the tube voltage only marginally affected the SNR.

Conversely, Dobbins et al. [18] recommended 120 kVp with the addition of 0.2 mm Cu as the optimal exposure setting.
Their study considered a combination of SNR per unit exposure, which improved at lower tube voltage, and the ratio
of tissue contrast to bone contrast, which increased at higher tube voltage. Their result could potentially be influenced
by the use of entrance air kerma as the measure of patient exposure, which decreases for harder beams, for
example when using extra filtration. Interestingly, they suggested that scatter in the mediastinum decreases with lower
kV. Figure 1.12 shows an example of the results obtained if we were to calculate the FOM as SDNR²/ESAK. As the
figure shows, using ESAK rather than lung dose still leads to relatively higher FOM at lower tube voltages; however,
the addition of Cu filtration becomes beneficial for the FOM, in contrast to what was observed in Figure 1.9.

Figure 1.12: Relative FOM for aluminium and grid in; in this case the FOM = SDNR²/ESAK.

A clinical study performed by Uffmann et al. [11] acquired images of patients at 90, 121 and 150 kVp, and the images
were scored by five radiologists. The results showed that the visibility of anatomical structures and image quality in
general were rated superior in images acquired with the lower tube voltage for the same effective dose to the patient.
Among the reasons to decrease the tube voltage is the increased efficiency of the detector at lower energies,
which leads to a higher number of photons detected. At lower tube voltage there is less scatter in the image due to the
increase of the photoelectric effect and the reduction of Compton scattered radiation. Additionally, at lower energies the
photon scattering is more isotropic, so scattered photons are more likely to be intercepted by the grid or absorbed
before reaching the detector.

While the SDNR (a parameter related to object detectability) of the tasks increased at low tube voltages, there are
some drawbacks associated with the use of lower energies. For example, reduced penetration makes the ribs more
noticeable in the image, and this can interfere with the detection of subtle lesions in low attenuation areas, like lung
nodules [18]. One could argue that this is a reason to use high tube voltage for chest radiography with DR
detectors. This may be the case for S/F systems, but it does not necessarily apply to DR imaging for several reasons.
The density difference between lung parenchyma and ribs was found to be 50% less for DR than for screen-film
in the range of 80-140 kV [61]. This is possible thanks to DR post-processing, which enables
compression of the dynamic range to reduce density differences between the tissues. Additionally, post-processing
algorithms like bone suppression techniques could be applied for an improved detection of details obscured by the
ribs.

Another potential drawback associated with the use of low tube voltages is the need to increase the tube loading
(mAs). As noted, this increases the skin dose due to higher ESAK and KAP. Additionally, exposure time will be
longer and could exceed the recommended 20 ms for chest PA. This limit is set to avoid motion blurring of fine
structures in the image, one of the most common reasons for image rejection in radiography [62]. To investigate how
far the tube voltage can be reduced without exceeding 20 ms, an anthropomorphic chest phantom
'Lungman' (Kyoto Kagaku, Tokyo, Japan), representative of a standard BMI patient, was imaged on the Carestream
system of the UZ Leuven hospital. Images were acquired with the phantom at the standard patient equivalent position,
using the clinical protocol under AEC control. Exposure time from the AEC was recorded for the range of tube
voltages studied (60 to 140 kVp in steps of 10 kVp), with grid in and grid out configurations. The exposure time
remained below 20 ms for tube voltages as low as 80 kVp for grid in and 70 kVp for grid out. For grid out at 60 kVp,
the AEC exposure time was 23 ms, just above the limit. Note that Lungman represents a patient of normal thickness;
many patients would have a greater projected thickness and therefore a longer exposure time. The Lungman
phantom will be described in Chapter 2.

1.3 General conclusions

The first part of this chapter showed that the four X-ray rooms used for CXR imaging in the radiology department use
a consistent CXR protocol for upright bucky examinations, i.e., a high tube voltage technique, with antiscatter grid,
under AEC control. These protocols were developed using initial experience with film screen systems but then adapted
over time as different generations of digital image receptors were adopted and integrated within the department, first
CR imaging and now CsI-based flat panel detectors. Differences in the Exposure Indicator calibrations were observed,
as the different manufacturers use different definitions and methods of calculation. Patient dose monitoring showed
that the KAP values of 99% of the chest examinations were below the national DRL set for chest radiography.

In the second part of the chapter, a simple experimental approach for technique optimization was applied to the
CsI FPD based Carestream system. Although all systems at UZ Leuven use a high tube voltage for CXR, the
SDNR and FOM results indicated that lower tube voltages are favourable for a patient lung equivalent of 90 mm of
PMMA, evaluated at a typical pixel value in the image. Moreover, the highest FOM values were obtained for grid
out images at this PMMA thickness, which corresponds to the approximate X-ray attenuation in the lung region.

Physics test object measurements like the one used can be easily performed with good reproducibility; however, this
type of test object, with details in a homogeneous background, does not reflect actual thorax anatomy. Although it is
clear that a lower tube voltage was beneficial for the FOM established, working practice is different. It may well be
that the use of a lower tube voltage reduces the detectability of some lesions in specific parts of the chest X-ray. This
was certainly the case with film-screen image receptors, and maybe the use of traditional settings has not been
scrutinized in sufficient detail now that FPDs are the norm. It should be noted that this simple approach, using SDNR
in simple task materials, provides valuable insight into the influence of the radiation quality on SDNR and can serve
as a base for more comprehensive studies like the one presented later in the thesis. However, a more clinical approach
would be needed to explain the current working practice and/or convince the medical team to change practice.

Regarding the possible use of lower tube voltages, limitations exist on the permissible increase in exposure time
without excessive motion blurring and on the visualization of details behind the ribs. Overall, this study shows that
there is potential for optimization in CXR imaging, investigating familiar variables such as tube voltage, grid use and
filtration. Although tissues like nodule and bone were used for the tasks, there is also the need for more realistic test
objects with a relevant anatomical background, depicting some of the common clinical tasks found in chest
radiography. This is especially the case when trying to balance clinical task performance against actual dose level.
The development of such anthropomorphic models including a range of clinical tasks is described later in the thesis.
The next chapters describe the validation of a physical CXR anthropomorphic test object in terms of dosimetric
accuracy, followed by its application in a survey of CXR in Belgium.

Chapter 2
Physical thorax phantom Lungman,
characterization and validation for CXR studies

In the search for a chest radiography (CXR) test object that better resembles patient anatomy, we found the
anthropomorphic phantom Lungman. The Lungman is a physical thorax phantom that is used throughout the thesis work. In
Chapter 1 it was used to test the Automatic Exposure Control (AEC) of the Carestream system at different tube
voltages. In Chapter 3, it will be used for image quality evaluation in a Visual Grading Analysis study. Because of
the importance of the phantom, its characterization and validation were performed; the results obtained are presented
in this Chapter.

First, the equivalence of the phantom in centimetres of PMMA was established. Then, the phantom exposure data were
compared with those from a range of patients for chest examinations. Last, the creation of a computational model of
the Lungman phantom is described. This model was used in Monte Carlo simulations to compare the tissue equivalent
materials of the physical phantom with real tissue composition reported by ICRP Publication 89 [46].

The work presented in this chapter belongs to the publication “Characterization and validation of the thorax phantom
Lungman for dose assessment in chest radiography optimization studies” Journal of Medical Imaging, 2018. Part of
this work can also be found in the conference proceedings “Validation study of the thorax phantom Lungman for
optimization purposes” SPIE Medical Imaging 2017: Physics of Medical Imaging.

2.1 Introduction

Phantoms consist of one or more tissue equivalent materials that are combined to simulate the interaction of ionizing
radiation inside the body [63]. They can represent different levels of anatomical accuracy, from simple homogeneous
blocks to more detailed structures, including internal features mimicking organs. Besides tissue equivalent materials,
phantoms can also contain inserts of real human tissue [64]. The development of physical phantoms has continued
over time and a wide variety of phantoms are now available [65]. In diagnostic radiology, phantoms are widely used
for the optimization of image quality versus dose [18,43,66,67].

Physical phantoms are used, where relevant, as surrogates for patients. This also applies to chest X-ray imaging,
given that new acquisition techniques are developed and deserve detailed studies. Phantoms described in the literature
include simple homogeneous types composed of tissue equivalent materials, such as poly(methyl methacrylate)
(PMMA) blocks [68], or more complex phantoms such as the ‘Duke’ chest phantom [69]. The latter contains copper
and aluminium sheets resembling basic anatomical structures, along with circular details used to assess object
detectability. This phantom has been previously used to optimize beam quality selection in chest radiography [59].

In recent years, there has been an evolution in the test objects used in optimization studies, progressing from phantoms
with simple homogeneous background to more anthropomorphic phantoms [34,70–75]. The optimization process can
benefit from the use of anthropomorphic phantoms as they more closely resemble the body anatomy and human tissue
composition. This trend is reinforced by developments taking place in the field of 3D printing technology. With this
rapid progress, efforts have been made to include structures that simulate real clinical tasks [76–78]. The use of these
phantoms as surrogates for real patients for dosimetry studies requires validation of the phantom performance. If the
phantom is used for optimizing the detection of certain pathology, then the phantom must not only simulate real human
anatomy and pathology, but also the interaction of radiation in the tissue. From this, typical minimum dose levels
required for diagnostic tasks can be found.

For dose assessment purposes and optimization studies performed as a function of beam quality, the phantom should
be made of materials with X-ray absorption properties close to those of human tissue. In this way, results remain
accurate as X-ray energy is changed. To optimize system settings such as AEC, the phantom should produce similar
exposure parameters as those used for real patients [79]. Parameters that can be used to evaluate this are air kerma
area product (KAP) and dose at the detector level assessed by exposure index (EI). This may also require that the
phantom is realistic enough that segmentation algorithms function as they would do for patient images. Finally, similar
image morphology and noise may also be required to optimize processing and display.

The Lungman anthropomorphic chest phantom (Kyoto Kagaku, Tokyo, Japan) (Figure 2.1) is designed to be used in
plain radiography and in CT scanning for protocol optimization. The phantom is supplied with two extra layers of
tissue equivalent material (chest plates). These are 27 mm thick, allowing a total additional body thickness of 54 mm
to be simulated. The internal structures of the phantom are easily removed so that inserts can be added to represent
clinical tasks. Although the Lungman has been used to study chest radiography [66,75,80], no publications were found
describing how closely the phantom represents patients. Therefore, the aim of this work was to validate the Lungman
phantom for dose assessment purposes in chest radiography. The procedure applied was divided into three steps:

(1) Establish the PMMA equivalent thickness of the lung and mediastinum regions of the Lungman. This will allow
benchmarking against standard (homogeneous) test objects.

(2) Compare KAP, exposure time and EI data for images acquired with the phantom to those of real patients under
automatic exposure control.

(3) Create a voxelized model of Lungman so that a detailed dosimetric comparison can be made between the phantom
materials and the materials in ICRP Publication 89 [46].

Figure 2.1: Anthropomorphic physical phantom Lungman.

2.2 Methods

2.2.1 PMMA equivalence of the Lungman phantom

Three configurations were used when estimating the PMMA equivalence of the Lungman phantom. The first was
without tissue layers, then with one layer placed at the back of the phantom and finally with two layers (one front and
one back). Images of the different configurations were acquired for tube voltages ranging between 60 kVp and 120
kVp in steps of 10 kVp. At a given tube voltage, a Posterior Anterior (PA) image of the Lungman phantom was
acquired under AEC control and the delivered tube current-time product (mAs) recorded. Subsequently, the AEC was
de-selected and the mAs set to ~1.5 times the mAs delivered by the AEC. This reduced the influence of noise on the
measured data and ensured that the detector was not saturated during the acquisitions. These mAs settings (at the
specific tube voltage) were used to acquire three Lungman images: standard Lungman, Lungman plus one layer and
Lungman plus two layers. The Carestream ‘Pattern’ program was used for all acquisitions. This is a DICOM ‘For
Processing’ image type provided by the vendor that is suitable for technical evaluations i.e. with all the necessary
detector corrections but without clinical image processing. The phantom was removed and PMMA blocks of
dimension 26 x 30 cm2 were positioned in its place. Images were then acquired for thicknesses from 10 cm to
20 cm in steps of 2 cm, using the mAs used for Lungman. This ensures a fixed relationship between X-ray attenuation
for the Lungman materials and PMMA, as quantified by the pixel value (PV). The remaining settings were those used
in routine clinical practice: a source to image receptor distance of 180 cm and filtration of 2.96 mm of Aluminum.
The images were exported from the system in DICOM format with a linear look up table. Subsequently, the mean PV
was measured for a 10 x 10 mm2 region of interest (ROI) at the centre of the PMMA images. The PV was plotted
against PMMA thickness for the different tube voltages and the data points were fitted with an exponential function
(Equation 2.1), where TPMMA is PMMA thickness in centimetres.

$$PV = A \, e^{-B \, T_{PMMA}} \tag{2.1}$$

Using the inverse of the above relationship, the PV data in images of the Lungman acquired at the same exposure
parameters (tube voltage and mAs) were converted to equivalent centimetres of PMMA.
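
The calibration and inversion described above can be sketched as follows. This is a minimal illustration, not the fitting routine used in the study: the fit method is not stated in the text, so a straight-line least-squares fit in ln(PV) is assumed here, and the function names and calibration numbers are hypothetical.

```python
import math

def fit_exponential(thickness_cm, pixel_values):
    """Fit PV = A * exp(-B * T) by least squares on ln(PV) versus T (assumed method)."""
    n = len(thickness_cm)
    log_pv = [math.log(pv) for pv in pixel_values]
    mean_t = sum(thickness_cm) / n
    mean_y = sum(log_pv) / n
    s_tt = sum((t - mean_t) ** 2 for t in thickness_cm)
    s_ty = sum((t - mean_t) * (y - mean_y) for t, y in zip(thickness_cm, log_pv))
    slope = s_ty / s_tt
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -slope  # A, B

def pmma_equivalent_cm(pixel_value, a, b):
    """Invert Equation 2.1: T_PMMA = ln(A / PV) / B."""
    return math.log(a / pixel_value) / b

# Synthetic, noise-free calibration data (illustrative values, not measurements).
thicknesses = [10, 12, 14, 16, 18, 20]
true_a, true_b = 5000.0, 0.25
pvs = [true_a * math.exp(-true_b * t) for t in thicknesses]

a, b = fit_exponential(thicknesses, pvs)
# Convert a pixel value from a phantom image to centimetres of PMMA.
t_equiv = pmma_equivalent_cm(true_a * math.exp(-true_b * 13.6), a, b)
```

One fit is needed per tube voltage, since A and B change with beam quality and mAs. A fit in log space weights the data points differently from a non-linear fit on the raw pixel values, so the two can give slightly different parameters for noisy data.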

2.2.2 Comparison of the Lungman phantom and real patients in terms of EI, KAP and exposure time

In order to compare phantom exposure data with typical patient data, the phantom was imaged using a Carestream
DRX Evolution radiographic system (Carestream, New York, USA) used for chest X-rays in our hospital. The system
uses a Caesium Iodide (CsI) based flat panel detector and is generally operated under automatic exposure control.
Phantom images were acquired using the standard adult chest program. PA projections were taken under standard
clinical conditions: tube voltage of 120 kVp, source-to-detector distance of 180 cm and the anti-scatter grid in position.
The left and right (lung field) chambers of the AEC device were used. Exposure index, KAP and exposure time values
were recorded and compared to the values from real patients. The acquisitions were then repeated for the phantom
with the additional tissue layers attached.

Exposure Index, KAP and exposure time data of patients were then surveyed for this X-ray room. The survey included
1795 adult patients undergoing chest PA examinations over a two-year period, from March 2015 to March
2017. The calibration of the integrated KAP meter is verified annually during the medical physics test. The parameter
EI is calculated by the system from pixel value data in a relevant image region i.e. the segmented anatomy. For this
system, EI is calibrated against detector air kerma (DAK) using the RQA5 beam quality [81]. EI gives an indication
of the radiation exposure at the detector entrance used for a given image and it may depend on the segmentation region
used and the body part imaged [48],[47]. These data were extracted from DICOM headers using an automated dose
management platform (TQM, Qaelum NV, Belgium). Finally, the PMMA thicknesses that matched the three phantom
configurations (Section 2.3.1) were imaged at a patient equivalent position in order to compare the EI and KAP values
obtained for the PMMA blocks with those from Lungman and the patients.

2.2.3 Creation of a voxel model of the Lungman phantom. Comparison of organ absorbed dose using
Kyoto Kagaku tissue equivalent and ICRP materials

In the final validation step, a computational model of the Lungman phantom was created. The model was obtained
from the segmentation of a 3D dataset acquired on a Siemens Definition Flash CT scanner using the following
parameters: 120 kVp, 450 mA, pixel size of 0.95 x 0.95 mm2 and slice thickness of 0.6 mm. CT image segmentation
was performed using 3D Slicer [82] and Fiji [83], both open source software packages for visualization and
image analysis. Thresholding, region growing, edge detection and model fitting were used for the delineation of the
organs. The segmented organs were then exported as OBJ files, a format that contains the information of the vertices
and the vertex normals of the 3D objects. For use in Monte Carlo simulations with the PENELOPE/penEasy Monte Carlo
transport code [84],[85], the OBJ files were then voxelized using software developed by Lombardo [86]. An algorithm
was developed in the Python programming language to transform the output from the previous software to a format
that can be read by penEasy. This format consisted of a two-column file containing information about material index
and density, as well as a header section.
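
The conversion step can be illustrated with a short sketch that writes one material index and density pair per voxel. The header layout, the function name and the voxel ordering are assumptions made for illustration only; the actual header fields required by penEasy are not given in the text and should be taken from the penEasy documentation.

```python
import io

def write_voxel_file(stream, material_ids, densities, shape, voxel_size_cm):
    """Write a hypothetical header followed by 'material_index density' lines."""
    nx, ny, nz = shape
    n_voxels = nx * ny * nz
    if len(material_ids) != n_voxels or len(densities) != n_voxels:
        raise ValueError("voxel lists must contain nx*ny*nz entries")
    # Hypothetical header section: not the real penEasy header.
    stream.write(f"# voxels: {nx} {ny} {nz}\n")
    stream.write("# voxel size (cm): {:.4f} {:.4f} {:.4f}\n".format(*voxel_size_cm))
    stream.write("# columns: material_index density(g/cm^3)\n")
    # One line per voxel, in the order the flattened lists were supplied.
    for mat, rho in zip(material_ids, densities):
        stream.write(f"{mat:d} {rho:.5f}\n")

# Tiny 2 x 2 x 1 example with illustrative material indices and densities.
buffer = io.StringIO()
write_voxel_file(buffer, [1, 1, 2, 3], [0.0012, 0.0012, 1.06, 1.9],
                 shape=(2, 2, 1), voxel_size_cm=(0.1, 0.1, 0.1))
lines = buffer.getvalue().splitlines()
```

Writing to a stream rather than a fixed path keeps the sketch easy to check; passing an open file object instead of `io.StringIO` produces the file on disk.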

The voxel model of the phantom was used in Monte Carlo simulations to assess the suitability of the Kyoto Kagaku
tissue-equivalent materials for dosimetry studies. To this end, two versions of the voxel phantom were created: one
using Kyoto Kagaku tissue equivalent materials and another with material composition given in ICRP Publication 89.
The absorbed dose in each of the organs was then compared for the two versions. The phantoms were irradiated with
a cone beam covering the entire surface of the phantom in PA and anterior posterior (AP) projections, for tube voltages
ranging from 60 kV to 125 kV. The energy distribution of the source was simulated using the model of Boone [87]
for a Tungsten anode X-ray tube. Spectra for the different energies were provided in penEasy as external text files
with 1 keV sampling intervals. The energy deposition tally was used, which scores the energy deposited per simulated
history in each organ. The output from the tally was divided by the mass corresponding to each organ to obtain the
organ dose for both material types (i.e. Kyoto Kagaku and ICRP). The results were then expressed as a relative
deviation from the ICRP material case:

$$R_{dev} = \frac{D_{KYOTO} - D_{ICRP}}{D_{ICRP}} \tag{2.2}$$
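
As a small worked example of Equation 2.2, the sketch below divides a tallied energy deposition by the organ mass and then forms the relative deviation. The function names and all numerical values are hypothetical and serve only to show the arithmetic.

```python
def organ_dose(energy_deposited, organ_mass):
    """Organ dose per simulated history: deposited energy divided by organ mass."""
    return energy_deposited / organ_mass

def relative_deviation(d_kyoto, d_icrp):
    """Equation 2.2: deviation of the Kyoto Kagaku dose relative to the ICRP dose."""
    return (d_kyoto - d_icrp) / d_icrp

# Illustrative numbers only (energy in eV per history, mass in g).
d_kyoto = organ_dose(energy_deposited=220.0, organ_mass=200.0)
d_icrp = organ_dose(energy_deposited=250.0, organ_mass=250.0)
r_dev = relative_deviation(d_kyoto, d_icrp)  # positive: Kyoto Kagaku dose is higher
```

With this sign convention a positive value means the phantom material overestimates the dose obtained with the ICRP material, and a negative value means it underestimates it.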

Table 2.1 shows the mass ratios of the Kyoto Kagaku materials used in the Lungman phantom and the corresponding
ICRP materials. Chemical composition of the Kyoto-Kagaku tissue equivalent materials in the Lungman was provided
by the manufacturer. Due to the complexity of the phantom airways and the spatial resolution of CT images, the
smallest structures could not be fully segmented. However, as observed in Figure 2.1, the lung region is composed of
air and bronchi simulating material, thus the latter is the main contribution to the average lung density in the phantom.
To compensate for the lack of true density distribution in the voxel model of the phantom, three different approaches
were considered to model the lung material.

Table 2.1: Ratio between Kyoto Kagaku and ICRP 89 organ masses.

Organ            Kyoto Kagaku/ICRP organ mass ratio
Thoracic wall    1.117
Ribs             0.701
Diaphragm        1.020
Lungs            0.357
Vertebrae        1.023
Heart            1.028
Bronchi          1.029
Trachea          1.025

The first step was to calculate the average density in the lung region of the CT images. The lung borders were identified
and segmented from the surrounding tissue. The average attenuation in the lungs was then calculated by averaging
Hounsfield Unit data within this volume. To convert this value to density (g/cm3), ‘k-Wave’, a MATLAB toolbox for
simulation of acoustic wave fields, was employed. This toolbox uses the experimental data reported by Schneider
[88].
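
The averaging and conversion step can be sketched as below. This is not the k-Wave routine: it assumes a simple piecewise-linear calibration, and the calibration points are illustrative placeholders rather than the data of Schneider [88]. The function names are hypothetical.

```python
def mean_hu(hu_values, lung_mask):
    """Average the Hounsfield Units of the voxels flagged as lung."""
    selected = [hu for hu, inside in zip(hu_values, lung_mask) if inside]
    return sum(selected) / len(selected)

def hu_to_density(hu, calibration):
    """Piecewise-linear interpolation of density (g/cm3) from sorted (HU, density) points."""
    if hu <= calibration[0][0]:
        return calibration[0][1]
    if hu >= calibration[-1][0]:
        return calibration[-1][1]
    for (h0, d0), (h1, d1) in zip(calibration, calibration[1:]):
        if h0 <= hu <= h1:
            return d0 + (d1 - d0) * (hu - h0) / (h1 - h0)

# Illustrative calibration: air, water and an arbitrary dense point (placeholder values).
ILLUSTRATIVE_CALIBRATION = [(-1000.0, 0.0012), (0.0, 1.0), (1000.0, 1.6)]

hu_values = [-900.0, -800.0, -700.0, 40.0]   # flattened CT voxels (made-up numbers)
lung_mask = [True, True, True, False]        # segmentation result for the same voxels
average_hu = mean_hu(hu_values, lung_mask)
average_density = hu_to_density(average_hu, ILLUSTRATIVE_CALIBRATION)
```

Averaging the HU values first and converting once, as described in the text, is only equivalent to converting each voxel and then averaging when all lung voxels fall on the same linear segment of the calibration.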

In deciding upon the method to model the lungs, a preliminary study was carried out in which three methods were
evaluated. The first approach assumed that the entire lung region was homogeneously filled (i.e., segmented bronchi
excluded from the model) with a mixture of soft tissue equivalent material and air. The density used for this was
calculated previously from the average attenuation in the lungs. The second approach was to include the segmented
bronchi and to set lung material to that used for the first approach (mixture of soft tissue and air). However, in this
case the mixture density was equal to the average lung density (calculated from the average HU) minus the density
contribution given by the bronchi actually segmented in the model. The third approach was to use air in the lungs (as
in the physical phantom) and keep the segmented bronchi model as a separate material. For the latter two approaches,
the simulated material was based on the tissue equivalent material reported by the manufacturer.
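
One way to read the density correction in the second approach is as a mass balance over the lung region. The text does not give the formula, so the sketch below is an assumed interpretation: the mixture density is chosen so that the mixture plus the segmented bronchi reproduce the average lung density. The function name and the numbers are hypothetical.

```python
def mixture_density(avg_lung_density, lung_volume, bronchi_density, bronchi_volume):
    """Density of the filling mixture so that total lung mass is conserved (assumed model).

    lung_volume is the whole lung region, including the segmented bronchi.
    Densities in g/cm3, volumes in cm3.
    """
    mixture_volume = lung_volume - bronchi_volume
    if mixture_volume <= 0:
        raise ValueError("bronchi volume must be smaller than the lung volume")
    mixture_mass = avg_lung_density * lung_volume - bronchi_density * bronchi_volume
    return mixture_mass / mixture_volume

# Illustrative values only.
rho_mix = mixture_density(avg_lung_density=0.2, lung_volume=4000.0,
                          bronchi_density=1.0, bronchi_volume=400.0)
# Check: mixture mass plus bronchi mass equals the average density times the lung volume.
total_mass = rho_mix * (4000.0 - 400.0) + 1.0 * 400.0
```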

The reference materials and proportions of mixtures used for the comparison were taken from ICRP Publication 89
[46]. Weighted mixtures of trabecular and cortical bones were employed for the bones. Specifically for the vertebrae,
a proportion of 25% cortical and 75% trabecular was used, while in the ribs, 94% corresponded to cortical and 6% to
trabecular bone [46]. For the thoracic wall, the material used was residual tissue i.e. a mixture between soft tissue and
fat. For the trachea, the material reported in the ICRP was used in the external walls while the inside was filled with
air. The diaphragm was filled with skeletal muscle, and the heart with a mixture of 60% blood and 40% heart wall
striated muscle. As mentioned before, three methods were considered for simulating the lungs. For the first (i.e.
homogeneous mixture), the lung material was set as a mixture of air, lung tissue, vessels, bronchi and blood with the
average density and composition reported in the ICRP. In the second approach (including segmented bronchi and material
mixture in the lungs) the density of the lung mixture was adjusted to include the contribution from the segmented
bronchi. In the third approach, air was used to surround the bronchi material. For the last two cases the bronchi were
filled with bronchus material.

The three approaches used for lung modelling were compared in terms of relative deviation of the organ dose between
the Kyoto Kagaku and the ICRP materials. The first and second approaches gave the same relative deviation for the
lung dose. Between the second and third approaches, the relative deviation of the dose in the bronchi varied by less
than 2%. Therefore, the second approach was considered the most suitable for modelling the lung, since the segmented
bronchi were present and the average density of the lung region corresponded to that of the physical phantom.

2.3 Results

2.3.1 PMMA equivalence of the Lungman

Figure 2.2 shows the relationship between the average pixel value for the PMMA images and PMMA thickness, with
tube voltages as a parameter. The results of these curve fits were then applied to the Lungman images acquired at the
same tube voltage (Equation 2.1). To establish the PMMA equivalence in different regions of the phantom (i.e. lungs
and mediastinum), three ROIs were positioned at locations corresponding to the AEC chamber positions marked on
the Bucky (Figure 2.3). The mean PV in the PMMA equivalent images from these regions gave the equivalence in
centimetres of PMMA of the right and left lungs and the mediastinum at different energies.

Figure 2.4 illustrates the PMMA equivalence of the Lungman phantom (including the tissue layers) in the mediastinum
(a) and lung regions (b). The PMMA equivalence curves for the three versions of the phantom have a similar trend.
There is a slight increase in PMMA equivalence at higher tube voltages, but overall, there is not a strong dependence
on energy. For Lungman with no attachments, PMMA equivalence in the mediastinum varied from 13.5 cm to
13.7 cm, while for the lungs (averaged between right and left lungs) the results ranged from 9.5 cm to 10.0 cm. These
results are consistent with data reported by Dobbins et al. [89], where a PMMA phantom of 9.3 cm thickness was used
to simulate the lung tissue thickness for chest radiography. For the phantom data with increased thickness, PMMA
equivalence in the mediastinum region ranged from 16.3 cm to 16.9 cm and from 18.6 cm to 19.2 cm for one and two
additional layers, respectively. In the lung region, these equivalences ranged from 13.2 cm to 13.9 cm and from
16.1 cm to 16.9 cm.

Figure 2.2: Relationship between Pixel Value and PMMA thickness (fixed mAs for each tube voltage), for the range of
tube voltages studied. [Plot: Pixel Value against PMMA thickness (cm), one curve per tube voltage from 60 kVp to
120 kVp.]

Figure 2.3: Image of the Lungman phantom converted to PMMA equivalence. The ROIs drawn correspond to the AEC.

Figure 2.4: PMMA equivalence for the mediastinum region (a) and the lungs region (b) in the Lungman phantom and the extra
tissue layers as a function of peak tube voltage. [Plots: PMMA thickness (cm) against tube voltage (kVp); series:
Lungman, Lungman + 1 layer, Lungman + 2 layers.]

2.3.2 Comparison of the Lungman phantom to patients using EI, KAP and exposure time

Patient data distributions of KAP, EI and exposure time are presented in Figures 2.5(a-c) respectively, along with
values obtained for the Lungman phantom with and without the tissue layers. The EI, KAP and exposure time for the
Lungman configurations were compared with the median value of each parameter distribution. Relative differences
between the EI median value of the patient distribution and the Lungman phantom ranged from 3% to 8%. The smallest
difference was obtained for the Lungman with one tissue layer attached. For the KAP and exposure time distributions,
the closest values to the patient median value were found for the Lungman without any extra layers, with 14%
deviation in both cases. Once the extra layers were attached, differences of 30% and 100% were obtained for the
Lungman with one and two layers respectively.

Figure 2.5: KAP (a), EI (b) and exposure time (c) distributions for a range of patients undergoing chest PA examinations
with the Carestream system, compared with the corresponding values for the Lungman phantom. [Histograms: frequency (%)
against KAP (dGy*cm2), Exposure Index (100*µGy) and exposure time (ms); series: Patients, Lungman, Lungman + 1 layer,
Lungman + 2 layers.]

These results suggest that the use of Lungman in chest optimization procedures will give EI, KAP and exposure time
values that correspond to those found in typical patient distributions of these parameters. They also indicate that the
phantom together with the chest plate is necessary to represent a wider population. As can be seen in the histograms
of Figures 2.5(a-c), the KAP and the exposure time increased when the tissue layers were placed in the phantom (as
expected), however, the EI value decreased.

To further study this behaviour, the 2D plot in Figure 2.6 was used, in which patient EI data were plotted as a function
of KAP for the same examination (blue point cloud). In order to indicate local peak densities, a Gaussian-kernel
density estimator was used to apply colour mapping to the data. This type of 2D plot can be employed to study AEC
device performance as well as to detect radiographic or technical problems in the system. For a given field size, the
AEC device is expected to increase the KAP according to patient thickness in order to keep EI constant. However, as
illustrated in the graph, the EI values spread over a range of values rather than being close to some specific target
value. A number of factors contribute to this variation in EI value, including variable collimation [90] and
segmentation by the EI algorithm. There is also the known dependence of EI on patient thickness [91]: patients with
different thickness and anatomy will produce a change in the beam quality, which then influences the EI value.
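
The density-based colour mapping can be sketched with a small Gaussian-kernel estimator. The implementation used for Figure 2.6 is not stated, so this pure-Python version is an assumption: it uses a product kernel with one bandwidth per axis and Scott's rule for the bandwidth factor, and the data points are made up. A library routine such as `scipy.stats.gaussian_kde` would normally be used for large point clouds.

```python
import math
from statistics import stdev

def gaussian_kde_2d(xs, ys, bandwidth_factor=None):
    """Evaluate a 2D Gaussian kernel density estimate at each data point."""
    n = len(xs)
    if bandwidth_factor is None:
        bandwidth_factor = n ** (-1.0 / 6.0)  # Scott's rule for two dimensions
    hx = bandwidth_factor * stdev(xs)
    hy = bandwidth_factor * stdev(ys)
    norm = 1.0 / (n * 2.0 * math.pi * hx * hy)
    densities = []
    for x0, y0 in zip(xs, ys):
        total = 0.0
        for x, y in zip(xs, ys):
            total += math.exp(-0.5 * (((x0 - x) / hx) ** 2 + ((y0 - y) / hy) ** 2))
        densities.append(norm * total)
    return densities

# Made-up (KAP, EI) pairs: four clustered examinations and one outlier.
kap = [1.0, 1.1, 0.9, 1.05, 3.0]
ei = [400.0, 410.0, 390.0, 405.0, 150.0]
densities = gaussian_kde_2d(kap, ei)
# The density values can be passed to a scatter plot as the colour of each point.
```

Points in the dense part of the cloud receive high values and isolated examinations receive low values, which is what makes outliers stand out in the colour-mapped plot.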

In Figure 2.6, the EI and KAP values corresponding to the Lungman phantom and the PMMA slabs with the lung
equivalent thickness of the Lungman are plotted. The PMMA equivalent blocks were imaged at a patient equivalent
position using the routine clinical settings for thorax imaging. The three PMMA thicknesses used were 10 cm, 14 cm
and 17 cm, corresponding to the lung region PMMA equivalence of the Lungman phantom alone and with one and two
chest plates, respectively. While the KAP values for the PMMA are close to those obtained with the Lungman, values
for EI lie some distance from the centre of the patient distribution. This is most likely due to segmentation errors in
the EI calculation, the accuracy of which may also depend on body part being imaged [47]. These results suggest that
anthropomorphic rather than homogeneous phantoms should be used if EI values are to be investigated.

Figure 2.6: Patients point cloud of EI as a function of the KAP. EI and KAP values for the Lungman phantom with extra tissue
layers and the PMMA equivalence thickness of the lung region of the three phantom configurations are also displayed.

2.3.3 Voxelized model of the Lungman phantom. Comparison of organ dose for ICRP and Kyoto
Kagaku tissue equivalent materials

Eight organs were selected for segmentation from the CT dataset: vertebrae, diaphragm, heart, trachea, bronchus,
lungs, thoracic wall and ribs. The thoracic wall and the ribs were divided into posterior and anterior subsections.
Figure 2.7a illustrates the segmentation result. The voxel model was generated using a resolution of 1.0×1.0×1.0 mm3
(Figure 2.7b), which represented a compromise between a voxel size sufficient for dosimetry studies and a reasonably
low computational load.

Figures 2.8a and 2.8b illustrate the relative deviations between organ absorbed dose using Kyoto Kagaku materials
with respect to the ICRP materials as a function of tube voltage, for AP and PA irradiations, respectively.

In the thoracic wall, for the subsection closer to the beam, the organ dose for the Kyoto Kagaku material
underestimates the dose for the ICRP material. The opposite is observed for the subsection farther from the beam
which is influenced to a greater extent by the remaining organs. The deviations for this organ do not show a strong
X-ray energy dependency, with differences no larger than 17%.

Notable differences were seen in the rib doses, with the largest differences found in the subsection farther from the
source (i.e., ribs anterior for PA irradiation and ribs posterior for AP irradiation), where overestimation reached 41%.
For this material, the deviations in dose are strongly influenced by the energy of the beam with larger differences at
lower energies. On the other hand, for the subsections nearer to the beam the discrepancies were smaller;
underestimations from 2% to 12% were obtained for the dose in the phantom material compared to the dose in the
ICRP material.

(a)
(b)
Figure 2.7: (a) Rendering of segmented organs used for the voxel model of the Lungman phantom, (b) voxel model of the
phantom with a resolution of 1.0×1.0×1.0 mm3.

For the vertebrae in the PA irradiations, dose was overestimated compared to the ICRP materials by 9%. For the AP
projection, these differences were greater (from 23% to 32%), with smaller deviations at higher energies. This can
be attributed to the influence of surrounding tissue on vertebrae dose being larger for AP irradiations.

In the diaphragm, the relative deviation has little energy dependence, and furthermore the incident beam direction
does not have a strong effect. In this case underestimations with respect to ICRP went from 21% to 23% for PA and
from 24% to 27% for AP.

For the heart in the AP projection, differences in absorbed dose between the two materials were less than 4%. For PA
irradiations differences were larger and showed stronger energy dependence, ranging from 5% to 18%. This can be
attributed to the difference in dose deposition in the surrounding organs, especially the ribs and vertebrae being more
prominent in PA irradiations.

For the trachea in PA irradiations, dose underestimations ranged from 21% to 24%; in the AP projection these
underestimations increased, ranging from 23% to 32%. It appears that differences for small organs such as this are
greatly influenced by the composition of surrounding organs.

For the lungs, doses for the phantom using Kyoto materials gave overestimations compared to the ICRP materials.
These deviations went from 3% to 9% for AP data and from 5% to 13% for the PA data. This was expected since the
average density of the physical phantom was low compared to the value reported in ICRP. However, these differences
are reasonable and do not depend greatly on beam energy. For the case of the bronchi, the dose in the Kyoto Kagaku
material underestimated the values obtained using ICRP material. These underestimations were not greatly influenced
by the direction or the energy of the beam, with values between 16% and 21%.

Figure 2.8: Relative deviations in the organ absorbed dose using Kyoto Kagaku materials with respect to ICRP materials as
a function of tube voltage, (a) for AP irradiations and (b) for PA irradiations. [Plots: relative deviation of organ
dose against tube voltage (kVp); series: diaphragm, thoracic wall (anterior and posterior), lungs, ribs (anterior and
posterior), vertebrae, heart, bronchi, trachea.]

2.4 Discussion

This work has evaluated the suitability of the Lungman thorax phantom for use in dose assessment in chest radiography
optimization studies using a three-step procedure, one that could also be applied to other phantoms. PMMA
equivalence of the Lungman was first established. For the lung region of the Lungman with no added layers, the
PMMA equivalence varied from 9.5 to 10.0 cm for different energy spectra, which is in line with literature values.
PMMA equivalence was also established for thicker versions of the phantom, where one and two chest plates were
attached to the phantom. These equivalence estimates should allow comparison of Lungman data to studies that used
PMMA as phantom material.

By comparing phantom data with surveyed patient data it was found that the Lungman and the Lungman with chest
plates covered the range of EI, KAP and exposure time values found in patient data. However, when using the PMMA
equivalence corresponding to the lungs, the EI values obtained lay far from the patient distribution. This demonstrates
the need for anthropomorphic phantoms containing realistic patient structure when assessing AEC performance with
parameters that are sensitive to image content, such as EI.

In the last step, a voxel model of the phantom was developed and used in Monte Carlo simulations to compare the
Kyoto Kagaku tissue equivalent materials with those in ICRP Publication 89. The comparison was performed in terms
of organ absorbed doses for different energy spectra. Although differences in absorbed dose for the ribs could be as
high as 41%, this occurred in the region farther from the X-ray source. This can be attributed to the surrounding
materials having a stronger influence on doses due to scatter radiation. For the lungs, a crucial organ when using the
phantom for optimization studies, these deviations were less than 13%. The differences found in absorbed dose can
be attributed to the influence of surrounding organs; this is the case for small organs like the trachea. Additionally,
differences in material composition clearly create discrepancies and reflect the difficulty in manufacturing tissue
equivalent materials. Notwithstanding these problems, the voxel model of the phantom could be used in dosimetric
studies. On the other hand, when physical survey measurements of X-ray installations are required, the physical
version of the Lungman phantom is the obvious alternative to simulations.

The main limitation of the study lies in the accuracy of the voxel phantom representation of the true phantom. In the
creation of the voxel phantom via segmentation of tomographic images, the CT voxel size sets a clear limit on
resolution of the acquired dataset and the level of detail possible in the voxel phantoms. The knock-on effect was a
difficulty in the accurate segmentation of all the airway structures within the phantom. Ultimately this led to errors in
the density calculation and therefore in the final dose calculation.

2.5 Conclusion

The findings of this study suggest that the Lungman phantom can be considered an appropriate anthropomorphic
phantom for dose and AEC performance evaluation of X-ray systems in the diagnostic imaging energy range. In the
next Chapter the phantom will be used in a survey of CXR in Belgium.

Chapter 3
A new approach to dose and image quality
surveys in chest radiography

In this chapter different methods of evaluating dose and image quality are studied. The first part of the Chapter
describes a new method for dose monitoring, in which air kerma area product (KAP) and exposure index (EI) [47][48]
values are used to identify outliers in chest X-ray Posterior-Anterior (PA) examinations performed in adult patients.
The examinations were performed in the room-based X-ray systems described in Chapter 1. The text is largely based
on the digital poster “Combined use of KAP and EI for improving outlier selection in dose monitoring for projection
radiology” presented at the European Congress of Radiology (ECR) 2018, Vienna, Austria.

In the second part of the chapter, a survey of chest radiography systems in Flanders is presented. The image quality
of the systems was compared using a contrast-detail object and the Lungman anthropomorphic thorax phantom,
previously validated in Chapter 2. The correlation between contrast-detail analysis and visual grading analysis (VGA)
of the phantom images was studied. The text is taken from the publication “Survey of chest radiography systems: Any
link between contrast detail measurements and visual grading analysis?” European Journal of Medical Physics, 2020.
These results were also presented as “Snapshot of chest radiography in Flanders: Any link between physical and
clinical image quality?” at the Belgian Hospital Physicists Association (BHPA) Symposium, held in 2019 in Aalst,
Belgium.

3.1 The combined use of KAP and EI for improving outlier selection in dose monitoring for
projection radiology

3.1.1 Introduction

This work proposes the simultaneous use of the DICOM header tags EI (0018,1411) and KAP (0018,115E) as a means
of improving specificity in patient outlier selection for routine dose and quality monitoring in projection radiography.
Generally, in patient dose surveys, outliers are identified based on the Diagnostic Reference Levels (DRL). In Belgium
the KAP reference levels reported for chest are below 3.0 dGy·cm² and 1.0 dGy·cm² for the 75th and 25th
percentiles, respectively, of all Belgian centres [49]. During dose surveys, cases with a higher KAP than expected
may then be scrutinized further for optimization activities. As a result, the cases signalled as outliers are generally
obese patients, but should they be classified as such? In the present study, we hypothesized that a combined use of EI
and KAP would lead to an improved selection of outliers, going beyond the high KAP examinations that correspond to
obese patients. For examinations controlled by the Automatic Exposure Control (AEC), the KAP and EI provide
information about the dose delivered by the X-ray tube and the exposure at the detector, respectively. Thus, for
example, cases within normal KAP values but with EI values higher than expected will also be signalled as outliers,
rather than just obese patients signalled by high KAP values.

3.1.2 Methods

3.1.2.1 Collecting the data

Three of the four digital radiographic systems surveyed in Chapter 1 were included in the study, all used on a daily
basis to perform chest X-ray examinations. The systems were: Carestream DRX Evolution, Oldelft Triathlon and
Siemens Axiom Luminos. The survey included 5300 examinations from adult patients who underwent digital chest
radiography during a two-year period (from February 2015 to February 2017) on any of the mentioned systems. Only
Posterior Anterior (PA) projections were included in the survey, and therefore data from bed side imaging and Lateral
projections were excluded. To extract and export the examination and X-ray acquisition data, dose monitoring
software TQM (Qaelum NV, Belgium) was used.

3.1.2.2 Expressing exposure index in terms of detector air kerma

The EI gives an indication of the exposure level at the detector entrance for a given image and is based upon image
noise level [47][48]. However, different manufacturers often use proprietary ways of estimating this factor, which
leads to systematic differences in EI values. The IEC and AAPM task group 116 have established methods and
recommendations to standardize the EI, which should be consistent between manufacturers. The three systems under
study were from different vendors, and as seen below their EI definition differs:

- Oldelft implements the Reached Exposure Value (REX), which depends on the exposure at the detector, and
also on the brightness and contrast as selected by the operator. As brightness and contrast are not changed by
the operators at the acquisition console in our institution, the REX can be used to monitor the exposure at the
detector. The REX is unitless.
- Carestream implements an Exposure Index (EI), which is the average pixel value (PV) of the image region
considered relevant for the chosen exam, i.e., segmented anatomy. The EI is reported in units of 100*µGy.
- Siemens implements an Exposure Index (EXI), which is the average PV in the central segment of a 3x3
matrix positioned in the centre of the field of a For Processing image. The EXI is reported in units of
100*µGy.

In order to use the EI values to compare the detector air kerma (DAK), the EI values had to be transformed
to a common metric. The relationship between EI and DAK was therefore established for the three systems. DAK was
measured at the detector position with the R100 solid state dosimeter of a Barracuda system. The RQA5 beam quality (21 mm
aluminium at the tube exit and 6.1 mm Al half value layer (HVL)) [15] and six different tube current-time product
(mAs) levels (0.5, 1, 2, 4, 8 and 10) were used. RQA5 is the standard beam quality recommended for calibration
of the EI by the International Electrotechnical Commission (IEC) [48]. Once the DAK had been measured at the different
exposure levels, each mAs level was programmed again and flat field images were acquired with the X-ray detector in
position and the antiscatter grid removed; the FOV was adjusted to cover the detector area. The exposure index reported
by the system for the flat field images was then plotted as a function of DAK and a linear fit was applied to the data.
The fitting coefficients were then used to estimate DAK values from the EI values extracted from the patient DICOM
images.
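
The calibration step described above can be illustrated with a minimal sketch. It assumes a straight-line relation EI = slope × DAK + intercept fitted by ordinary least squares, which is then inverted for patient images. The function names are hypothetical and the calibration readings are invented for illustration; they are not measurements from this study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def ei_to_dak(ei, slope, intercept):
    """Invert the calibration line to estimate DAK from a reported EI."""
    return (ei - intercept) / slope


# Hypothetical flat-field calibration points (NOT data from this study):
# DAK measured at six exposure levels and the EI reported for each image.
dak_measured = [0.5, 1.0, 2.0, 4.0, 8.0, 10.0]            # in µGy
ei_reported = [55.0, 105.0, 205.0, 405.0, 805.0, 1005.0]  # vendor units

slope, intercept = fit_line(dak_measured, ei_reported)
dak_patient = ei_to_dak(305.0, slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f} DAK={dak_patient:.2f}")
```

One pair of coefficients would be needed per system, since each vendor reports its exposure indicator on a different scale.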

3.1.2.3 Outlier selection

The newly calculated exposure indices and the KAP values were then plotted in a 2D scatter graph. Colour mapping
was applied to the data to reveal the point cloud density using a Gaussian-kernel density estimator. The median values
of the patients' KAP and EI distributions were identified for each system, as well as the interval containing 98% of the
patients. As a first approach, the cases outside these ranges were considered outliers and candidates for further
investigation. Then, the cases with high KAP but EI values within 10% of the median EI value for the system (i.e.,
obese patients with a normal EI) were discarded and marked as normal. This operation was performed for two reasons:
(1) to reduce the number of outliers to a reasonable number for further study and (2) to give a more relevant selection
than just the collection of obese patients, allowing improved dose management.
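
The selection rule can be sketched as follows. This is only one possible reading of the procedure: it assumes a one-sided KAP limit at the 98th percentile and a symmetric relative band around the median EI that contains 98% of the examinations, both computed with a nearest-rank percentile. The function names and the synthetic data are hypothetical.

```python
import math
import statistics


def percentile(values, fraction):
    """Nearest-rank percentile; `fraction` lies between 0 and 1."""
    ordered = sorted(values)
    rank = max(1, math.ceil(fraction * len(ordered)))
    return ordered[rank - 1]


def select_outliers(kap, ei_dak, coverage=0.98, ei_tolerance=0.10):
    """Return the indices of examinations flagged as outliers.

    Assumptions of this sketch: the KAP limit is the `coverage` percentile,
    and the EI band is the relative deviation from the median EI that
    contains `coverage` of the examinations.
    """
    ei_median = statistics.median(ei_dak)
    rel_dev = [abs(e - ei_median) / ei_median for e in ei_dak]
    ei_band = percentile(rel_dev, coverage)
    kap_limit = percentile(kap, coverage)

    flagged = []
    for idx, (k, dev) in enumerate(zip(kap, rel_dev)):
        high_kap = k > kap_limit
        outside = high_kap or dev > ei_band
        # High KAP with an EI close to the median is treated as a large
        # patient handled correctly by the AEC, so it is marked as normal.
        normal_ei = dev <= ei_tolerance
        if outside and not (high_kap and normal_ei):
            flagged.append(idx)
    return flagged


# Synthetic example (NOT patient data): 100 exams, three of them unusual.
kap = [1.0] * 100
ei_dak = [2.0] * 100
ei_dak[0] = 4.0                 # normal KAP, very high detector exposure
kap[1], ei_dak[1] = 5.0, 2.1    # high KAP, EI within 10% of the median
kap[2], ei_dak[2] = 6.0, 2.6    # high KAP and high EI

flagged = select_outliers(kap, ei_dak)
print(flagged)
```

In this toy data set the second examination has the same profile as an obese patient with a normal EI and is therefore not flagged, while the other two unusual examinations are.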

3.1.3 Results

3.1.3.1 Expressing EI in terms of DAK

Figure 3.1 shows the relationship between the EI reported by the system for the flat field images and the measured
DAK. The point clouds generated by plotting the EI expressed in terms of DAK (EI(DAK)) against KAP are
shown in Figure 3.2. In an ideal system we would expect a spread of the KAP values and an approximately constant
EI value, since the AEC should aim for a constant exposure at the detector regardless of patient thickness. As can be
seen, the Carestream and the Siemens systems present distributions with a similar shape. This is not the case for the
Oldelft system, for which a larger range in the DAK metric is observed for a given KAP range. We have no explanation
for this behaviour, which could be linked to the definition of REX established by the manufacturer.

3.1.3.2 Outliers

For the selection of the outliers, the first step was to identify the intervals containing 98% of the patients (~2.3 standard
deviations). For the Carestream system, this range corresponded to ±50% from the EI median, while for the Siemens
and Oldelft systems the ranges corresponded to ±65% and ±80%, respectively. These values are expected given the point
cloud distributions in Figure 3.2, where the Carestream and the Siemens systems show tighter distributions. The median
EI expressed in DAK for the Carestream, Siemens and Oldelft systems was 1.4, 3.52 and 1.74 µGy, respectively. The KAP
analysis showed that in 98% of the examinations, the KAP values were below 1.25, 1.5 and 1.8 dGy·cm² for the
Carestream, Siemens and Oldelft systems, respectively. All were below the national diagnostic reference level (DRL) of 3.0
dGy·cm² (for 75% of the institutions in Belgium) [49]. In the 2D EI/KAP graphs (Figure 3.3) the red squares represent
the outliers identified using the method applied to the Carestream (a), Siemens (b) and Oldelft (c) systems. The median
(pink dashed lines) and the lower and upper deviations from the EI (green dashed lines) are also shown.

Figure 3.1: Exposure Index as reported by the system plotted against air kerma at the detector input plane (DAK) for the
Carestream, Siemens and Oldelft systems at RQA5.

Figure 3.2: Exposure Index expressed in terms of DAK (EI(DAK)) plotted against KAP for the Carestream, Siemens
and Oldelft systems.

While a DRL approach would only signal 12 cases as outliers, the new approach listed a total of 204 outliers.
The outliers were separated into four categories: (1) high EI and high KAP (i.e., high exposure), (2) high KAP
and low EI (values outside the ±10% interval from the median EI, which corresponds approximately to a deviation
index of ±0.5 of the target range considered acceptable), (3) low KAP and low EI (i.e., low exposure) and (4) high EI
and low KAP (i.e., high detector exposure). Several cases from each category were selected for follow-up by an
experienced radiologist, who examined a total of 35 images. During this review, a technical (i.e., imaging system
related) problem of focus-detector misalignment was detected (n=3). Operator problems detected included lateral
projections acquired as PA (n=5), incorrect tube voltage setting and blank images (n=5), incorrect detector exposure
(n=9), generally involving underexposed images with a noise level not acceptable for diagnostic applications, and
incorrect patient positioning (large part of the abdomen visible) and/or poor collimation (n=6). Other problems found
were a KAP value reported as zero (n=2) and obese patients (n=5). The latter five were cases with high KAP and
relatively normal EI values that fell outside the ±10% range from the median of the KAP. Had outliers been chosen in
the classical way, taking only the exams with KAP above 3.0 dGy·cm², 106 cases would have been signalled.

Figure 3.3: 2D plot of EI(DAK) versus KAP for the (a) Carestream, (b) Siemens and (c) Oldelft system. Outlier cases are
marked with red squares. Pink dashed line marks the median EI and two green dashed lines mark the limits of ±50%, ±65% and
±80% from the median respectively.

3.1.4 Conclusions

The combined use of KAP and EI helped to identify outliers caused by technical and radiographic issues while avoiding
false alarms such as those generated by obese patients. Dose and quality management platforms may benefit
from the combined use of different DICOM tags to find relevant cases for further investigation and/or quality
improvement projects.

3.2 Survey of chest radiography systems: any link between contrast detail measurements and
visual grading analysis?

3.2.1 Introduction

One of the main difficulties in the optimization process of CXR is to find a practical definition and means to evaluate
image quality. A practical approach to study imaging performance is using test objects, typically constructed from
homogeneous poly(methyl methacrylate) (PMMA) plates in combination with metal details or inserts, to quantify the
small detail detectability and signal difference to noise ratio (SDNR) [92], as was illustrated in Chapter 1. There has
been a long history of test objects dedicated to the measurement of the contrast-detail (c-d) performance of the detector
or the imaging system, using for example the Leeds TO20, CDRAD or CDMAM test objects [93–95]. These technical
measures are reproducible and are used to compare the overall image quality of different systems [96]. C-d analysis
combines metrics like contrast, noise and spatial resolution with the performance of the human observer. C-d analysis
has become more widespread with strict c-d based image quality criteria applied in screening mammography QC
protocols, and expressed in terms of threshold contrasts for certain disc diameters [97]. This is based on the observation
that small detail detectability is a crucial marker of system performance in mammography. There are no such minimum
image quality criteria currently available for chest radiography, probably due to the wide range of tasks to be visualised
by chest radiography systems.

Most test objects have a homogeneous background, a narrow dynamic range and a lack of anatomical realism. As a
result, it can be difficult to establish a clear link to clinical performance. Anthropomorphic phantoms are available,
which are more representative of real anatomy. These phantoms are used as surrogates in optimization studies. Sets
of images can be acquired at different exposure settings, without repeated exposures to patients. The output of these
studies can characterize observer performance, using analyses like Receiver Operating Characteristic (ROC), Visual
Grading Analysis (VGA) [98] or Visual Grading Characteristics (VGC) [99].

This work uses both a contrast detail object and anthropomorphic phantoms to evaluate the image quality for a group
of chest digital radiography systems situated across Belgium. First, technical image quality of the systems was
evaluated using the TO20 c-d test object, followed by an assessment using Visual Grading Analysis of images acquired
with the Lungman anthropomorphic phantom. Radiologists were asked to apply a set of image quality criteria to the
phantom images. We investigated the correlation between contrast-detail measurements and clinical image quality as
perceived by the observers and examined whether there was a need for optimizing some of the systems. Finally, we
reflect on the effectiveness of these methods for optimization in chest radiography.

3.2.2 Methods

3.2.2.1 Contrast detail (c-d) test object

Contrast-detail detectability was measured using the TO20 test object, from Leeds Test Objects. The TO20 test object
has 144 circular details of different thickness and 12 diameters (ranging from 0.25 mm to 11.1 mm) (Figure 3.4). This
test object is normally imaged with a typical calibration X-ray spectrum, generated at tube potentials between 60 kV
and 80 kV, with 1 mm, 1.5 mm or 2 mm copper filtration placed at the X-ray tube. Observers count up to the last disc
considered visible for each diameter and the number of discs is converted to a threshold contrast for the chosen beam
quality via tables provided by Leeds Test Objects. The resulting threshold contrast-detail detectability curve
provides an estimate of X-ray detector imaging performance [93] and is often used in quality control (QC) protocols
[100].

A different approach was used here in which the TO20 test object was imaged in combination with PMMA positioned
at the Bucky as described in a previous study [50], using the clinical protocols for chest PA examinations. This is how
the CDRAD test object is typically used [94,101,102] and is considered more representative of clinical practice. The
thicknesses of the PMMA blocks were equivalent to the three available thicknesses of the Lungman chest phantom
(without extensions and with 2 available extensions to increase chest thickness), and with the attenuation evaluated in
the lung region [103]. The PMMA thicknesses used are 9 cm, 13 cm and 16 cm, corresponding to the standard
Lungman thickness and Lungman with the additional plates [103]. For the 9 cm thickness, the TO20 was placed in
between 5 cm (bucky side) and 4 cm (beam side) PMMA; for the 13 cm thickness, TO20 was in between 7 cm (bucky
side) and 6 cm (beam side) and, finally, for the 16 cm thickness, there was 8 cm PMMA on both sides of TO20. For
these imaging conditions, the threshold contrasts are no longer known and therefore the analysis is performed using
the number of visible discs, similar to the study described by Neitzel et al. [104].

Figure 3.4: Contrast detail (c-d) test object TO20.

3.2.2.2 Anthropomorphic phantom

The Kyoto Kagaku Lungman thorax phantom is an anthropomorphic phantom made of tissue equivalent materials that
can be used for plain radiography and computed tomography [103,105–107]. The different structures mimic different
organs including the diaphragm, bronchi, heart, trachea, thoracic wall and the bones of the thoracic cage (Figure 3.5).
The phantom was imaged with supplied spheres of different diameters (10 mm and 12 mm) and attenuation
coefficients (Hounsfield units of approximately -800, -630 and +100) in order to mimic lung nodules [108]. In this
study, images were acquired of the standard size Lungman, the Lungman with one chest plate (placed on the back)
and the Lungman with two chest plates (one placed at the front and one at the back). The standard Lungman
represents an adult male of approximately 65.4 kg, while the addition of one and two chest plates to the phantom
results in thicker phantoms that are representative of heavier persons [108]. While it is recognized that adding a single
plate (either at the front or the rear of the phantom) is not anatomically accurate, this method enables patients of
different thickness to be simulated.

Figure 3.5: (a) Picture of anthropomorphic thorax phantom Lungman including additional chest plates. (b) Posterior Anterior
projection of the Lungman phantom.

3.2.2.3 Image acquisition

Twenty-two radiographic systems routinely used for chest imaging were included in the survey. The systems
comprised eight manufacturers: Philips, Siemens, Fuji, Shimadzu, Oldelft, Stephanix, Canon and Carestream. Six
different acquisition parameters, corresponding to the standard clinical protocol, were recorded for each system: tube
voltage, focus size, filtration, collimation, AEC selection and source to image-receptor distance (SID) for the Posterior
Anterior (PA) view. All images were acquired with the antiscatter grid in place and with the AEC engaged.

The three thicknesses of the Lungman were imaged in a patient-equivalent position for a PA examination, following
the clinical protocol. For the c-d test object, three images were acquired for each system at each PMMA configuration.
For a given PMMA thickness, the TO20 was rotated by 90 degrees between acquisitions. In this way the discs were
imaged in different areas of the X-ray detector, which reduced the influence of structure noise on the image scoring
process. The beam was collimated to cover the PMMA plates used with the TO20, which measured 26 × 30 cm².
The AEC cells controlling the exposure were set to the clinical default for chest PA examinations for each system.
The incident air kerma (IAK) was measured without backscatter at the entrance of the couch or vertical bucky with a
calibrated R100 Piranha dosemeter. This value was then corrected for distance to estimate the IAK for the Lungman
and the PMMA. All the images were exported as ‘For Presentation’ in DICOM format.
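
The distance correction is not detailed in the text. A minimal sketch, assuming a simple inverse-square-law correction from the measurement plane to the entrance surface of the phantom (point source, air attenuation neglected), is given below. The function name and the distances are hypothetical.

```python
def correct_iak_for_distance(iak_measured, focus_to_dosemeter_cm, focus_to_entrance_cm):
    """Scale an air kerma reading to another distance from the focal spot.

    Assumes the inverse square law holds between the two planes.
    """
    return iak_measured * (focus_to_dosemeter_cm / focus_to_entrance_cm) ** 2


# Hypothetical example: 50 µGy measured 145 cm from the focus, phantom
# entrance surface located 125 cm from the focus.
iak_entrance = correct_iak_for_distance(50.0, 145.0, 125.0)
print(round(iak_entrance, 2))
```

Because the entrance surface of a thicker phantom lies closer to the focus, the same reading at the bucky translates into a larger IAK for the Lungman with chest plates than for the standard Lungman.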

3.2.2.4 Image quality assessment

3.2.2.4.1 TO20 test object score

The c-d images were read by two experienced observers (medical physicists) on a radiology display calibrated to the
DICOM Grayscale Standard Display Function (GSDF) standard [109]. For each disc diameter the consensus reading
on the total number of visible discs was recorded. The number of discs for a given diameter was not converted to a
threshold contrast, rather the number of visible discs was used as a marker of the contrast-detail detectability [104].

3.2.2.4.2 Visual Grading Analysis

The Lungman images were evaluated by three radiologists, the first with 30 years of experience while the remaining
two were junior radiologists. The scoring criteria were adapted from the European guidelines for Image quality [110]
but with additional criteria regarding the visibility of lung nodules inserted in the phantom as well as the noise and the
overall quality of the image (Table 3.1). Images were scored one by one using a five-point Likert scale (Table 3.2).
Images were displayed using the ViewDEX software [111]. Readers were able to adapt the window level and the
magnification of the image in the interface of the software and no time limit was applied. Three sets of 22 images
were used, corresponding to the three Lungman thicknesses. Before starting the study, a short training session
containing five images was performed for the readers to get accustomed to the software and the scoring system.

The results from the VGA study are presented using the VGA score (VGAS), i.e. the mean value of all the scores
given to a certain image, calculated using equation 3.1 [98]:

\[ VGAS = \frac{\sum_{i,o} S_c}{N_i \, N_o \, N_c} \qquad (3.1) \]

where S_c are the individual scores given by observer o to image i, N_i is the number of images, N_o the number of
observers and N_c the number of image quality criteria.
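
As a small worked illustration of equation 3.1, the sketch below averages all scores over images, observers and criteria. The nested-list layout and the scores are hypothetical; only four criteria are used instead of the twelve in Table 3.1 to keep the example short.

```python
def vgas(scores):
    """Mean score, where scores[i][o][c] is the grade given to image i
    by observer o for criterion c."""
    n_images = len(scores)
    n_observers = len(scores[0])
    n_criteria = len(scores[0][0])
    total = sum(s for image in scores for observer in image for s in observer)
    return total / (n_images * n_observers * n_criteria)


# Hypothetical grades for one image, three observers and four criteria
# (1 = confident that the criterion is fulfilled, 5 = confident it is not).
one_image = [
    [
        [1, 2, 2, 1],  # observer 1
        [2, 2, 3, 1],  # observer 2
        [1, 1, 2, 2],  # observer 3
    ]
]
score = vgas(one_image)
print(round(score, 3))
```

With the confidence scale of Table 3.2, a lower VGAS indicates a better perceived image quality.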

To assess agreement between the readers, the Intra-Class Correlation (ICC) coefficient was calculated. As an absolute
reading was performed, a two-way mixed model and a consistency agreement were considered most suitable for the
analysis. Based on the 95% confidence interval of the ICC estimate, ICC values less than 0.5 were indicative of poor
agreement, between 0.5 and 0.75 moderate, from 0.75 to 0.9 good and above 0.9 excellent [112].
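
The reader agreement calculation can be sketched from the two-way ANOVA mean squares. The text does not state whether single or average measures were used; the sketch below assumes the single-measure form, often written ICC(3,1), and applies the interpretation bands to the point estimate rather than to its confidence interval. Function names and the example scores are hypothetical.

```python
def icc_consistency(ratings):
    """ICC(3,1): two-way mixed model, consistency, single rater.

    ratings[i][j] is the score given to image i by reader j.
    """
    n = len(ratings)        # number of images
    k = len(ratings[0])     # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


def agreement_label(icc):
    """Interpretation bands quoted in the text."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


# Hypothetical VGAS values for five images scored by three readers.
example = [
    [1.8, 2.1, 2.0],
    [2.5, 2.9, 2.6],
    [1.5, 1.9, 1.6],
    [3.0, 3.2, 3.3],
    [2.2, 2.4, 2.5],
]
icc = icc_consistency(example)
print(round(icc, 3), agreement_label(icc))
```

A consistency definition ignores systematic offsets between readers, so a reader who scores every image one grade higher than the others does not reduce the coefficient.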

Spearman’s correlation was used to investigate the correlation between VGAS and IAK and between VGAS and TO20
scoring. Correlation strength was evaluated using the following criteria: from 0 to 0.29 very weak, from 0.30 to 0.49
weak, from 0.50 to 0.69 moderate, from 0.70 to 0.89 strong and from 0.90 to 1.00 very strong [113].
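
A minimal sketch of the correlation analysis is given below. It computes Spearman's coefficient as the Pearson correlation of the ranks (ties receive their average rank) and labels its absolute value with the strength bands listed above; applying the bands to the absolute value is an assumption of this sketch. The function names and the data are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = shared
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def strength(rho):
    """Strength bands quoted in the text, applied to |rho|."""
    r = abs(rho)
    if r < 0.30:
        return "very weak"
    if r < 0.50:
        return "weak"
    if r < 0.70:
        return "moderate"
    if r < 0.90:
        return "strong"
    return "very strong"


# Hypothetical IAK (µGy) and VGAS values for five systems.
iak = [21, 47, 58, 76, 163]
vgas_scores = [2.8, 2.4, 2.6, 1.7, 2.0]
rho = spearman(iak, vgas_scores)
print(round(rho, 2), strength(rho))
```

A rank-based coefficient suits these data because the VGAS is built from ordinal grades and the IAK values are strongly skewed between systems.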

3.2.2.4.3 Visual Grading Characteristics (VGC) analysis

Visual grading characteristics analysis was carried out with the absolute VGAS and with the method proposed by Bath
et al. [110]. VGC allows a comparison of two systems, designated A and B. First a (2×n) frequency table was
generated, with n=5 corresponding to the number of confidence levels. This gave the number of test results in each
rating category for the two systems being compared. The points on the VGC curve are the image quality scores (ICS),
calculated directly from the fulfilment of the criterion for the two systems while gradually increasing the decision
threshold. The curve starts at ICS_A = ICS_B = 0 and its last point corresponds to the image quality criteria being
scored as 1 for both systems (ICS_A = ICS_B = 1). Finally, the area under the curve (AUC) was calculated using the
trapezoidal rule; if the 95% confidence interval includes 0.5 then there is no statistically significant difference
between the two systems.
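
The construction of the VGC curve and its area can be sketched as follows. The sketch assumes that the decision threshold is relaxed from grade 1 (criterion fulfilled) towards grade 5, so that each point is the cumulative proportion of ratings for systems A and B. It does not compute the confidence interval of the AUC. The function names and the rating counts are hypothetical.

```python
def vgc_points(counts_a, counts_b):
    """Cumulative rating proportions for systems A (x) and B (y).

    counts_a[g] and counts_b[g] are the numbers of ratings in grade g + 1.
    """
    total_a, total_b = sum(counts_a), sum(counts_b)
    points = [(0.0, 0.0)]
    cum_a = cum_b = 0
    for a, b in zip(counts_a, counts_b):
        cum_a += a
        cum_b += b
        points.append((cum_a / total_a, cum_b / total_b))
    return points


def trapezoid_auc(points):
    """Area under a piecewise-linear curve using the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical 2 x 5 frequency table: 36 ratings per system
# (3 images x 12 criteria) distributed over grades 1 to 5.
reference = [10, 10, 8, 5, 3]   # system A
tested = [4, 8, 10, 8, 6]       # system B

auc = trapezoid_auc(vgc_points(reference, tested))
print(round(auc, 3))
```

With this convention an AUC below 0.5 means that system B received fewer favourable grades than the reference, and two identical rating distributions give exactly 0.5.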

Only two devices can be compared at a given time using this method; therefore system T was designated as the
reference, chosen because this system is heavily used for chest examinations at UZ Leuven. Furthermore, system
T had the highest score for the TO20 reading, suggesting that there are no major technical problems with this system.
The VGC analysis was carried out using the VGAS for the three Lungman images for each device and reader
separately.

Table 3.1: List of image quality criteria used in the VGA scoring.

List of image quality criteria

1 Reproduction vascular pattern in the whole lung, particularly the peripheral vessels

2 Visually sharp reproduction of the trachea and proximal bronchi

3 Visually sharp reproduction of the borders of the heart and aorta

4 Visualization of the retrocardiac lung and mediastinum

5 Visually sharp reproduction of the diaphragm and costophrenic angles

6 Quality reproduction of the Hilar region

7 Visualization of the spine through the heart shadow and bones in general

8 Lung attenuation and penetration of the ribs

9 Quality of lung nodule visible in the image

10 Overall quality of the image

11 Diagnostic acceptable image noise

12 Confidence for diagnostic applications

Table 3.2: Confidence level scale.

Grade Confidence level scale

1 Confident that the criterion is fulfilled

2 Somewhat confident that the criterion is fulfilled

3 Indecisive whether the criterion is fulfilled or not

4 Somewhat confident that the criterion is NOT fulfilled

5 Confident that the criterion is NOT fulfilled

3.2.3 Results

3.2.3.1 Acquisition settings

The acquisition settings for the chest PA protocols of the devices studied are listed in Table 3.3; information on the
detector has also been included. Note that to keep the results anonymous in the graphs that follow, each system was
assigned a random letter. Additional copper filtration (0.1 mm) was used by 36% of the systems. The SID ranged
from 150 to 200 cm, with 68% of the systems using 150 cm. With peak tube voltages ranging from 110 kV to 133 kV
(125 kV median, used by 59% of the systems), these settings can be considered a high voltage technique. All systems
used a grid as the anti-scatter method. Acquisitions were performed using the AEC in all cases. For PA protocols the
right and left AEC cells were activated, except for one system, for which the large AEC cell at the centre was used.

Table 3.3: List of radiographic systems surveyed and the acquisition settings for the thorax PA protocol.

Radiographic system                 kVp  SID (cm)  Extra filtration    AEC           Detector type/manufacturer  Pixel size (µm)
Philips Diagnost 96                 117  150       -                   R, L          CR/not stated               100
Siemens AXIOM Luminos dRF           125  150       -                   R, L          DR/Trixell                  136
Siemens AXIOM Multix M              125  150       -                   R, L          DR/Fuji EVO                 150
Fuji Sonialvisions G4               125  150       -                   R, L          DR/not stated               139
Siemens AXIOM Luminos dRF           125  150       0.1 mm Cu           Large Centre  DR/Trixell                  136
Siemens AXIOM Aristos FX Plus       125  150       -                   R, L          DR/Trixell                  143
Stephanix Oldelft Triathlon D²RS    125  177       -                   R, L          DR/Canon                    160
Stephanix Oldelft Triathlon D²RS    125  160       -                   R, L          DR/Canon                    160
Oldelft Triathlon EVO DR            125  200       0.1 mm Cu           R, L          DR/Canon                    160
Shimadzu Fuji Sonialvision Safire2  112  150       -                   R, L          DR/not stated               132
Shimadzu Fuji DR Velocity           125  180       -                   R, L          DR/not stated               200
Fuji Sonialvisions G4               125  150       0.1 mm Cu           R, L          DR/not stated               139
Fuji Sonialvisions G4               110  150       0.1 mm Cu           R, L          DR/not stated               139
Fuji FDR EVO                        120  150       -                   R, L          DR/Fuji EVO                 150
Siemens Iconos R200                 133  150       -                   R, L          DR/Agfa DR 14s              148
Philips Diagnost Optimus DIDI1      117  150       0.1 mm Cu, 1 mm Al  R, L          DR/not stated               144
Philips Diagnost Optimus DIDI2      117  150       0.1 mm Cu, 1 mm Al  R, L          DR/not stated               144
Siemens Luminos                     125  150       0.1 mm Cu           R, L          DR/Trixell                  148
Siemens Iconos R200                 117  150       -                   R, L          DR/Agfa DR 14e              150
Canon CXDI-11                       125  200       -                   R, L          DR/Canon                    160
Carestream DRX Evolution            120  180       -                   R, L          DR/Carestream               139
Oldelft Triathlon DR                125  200       0.1 mm Cu           R, L          DR/Canon                    160

3.2.3.2 IAK values for Lungman and TO20

IAK for the standard size Lungman ranged from 21 to 163 µGy, from 47 to 336 µGy for the Lungman with one tissue
layer and from 76 to 616 µGy for the Lungman with two tissue layers. On average, the IAK increased by a factor of
approximately 2 or 3 when adding one or two extra layers, respectively. Figure 3.6 shows the IAK values for the three
Lungman sizes and TO20 configurations for each of the systems surveyed. The minimum, maximum, mean and median
values for these data are summarized in Table 3.4. As can be seen, there is generally close agreement between the IAK
obtained for each Lungman size and the corresponding TO20 with PMMA slabs.

[Bar chart: IAK (µGy) per radiographic system (A to V) for TO20+9cm, TO20+13cm, TO20+16cm, Lungman, Lungman+1 and Lungman+2.]

Figure 3.6: Comparison of IAK values for the three Lungman sizes and their corresponding lung equivalent PMMA thicknesses
used for imaging the TO20.

Table 3.4: Minimum, maximum, mean and median values for the IAK values obtained for the TO20 imaged within 9, 13 and 16
cm PMMA and the three Lungman sizes.

IAK (µGy)  TO20+9cm  Lungman  TO20+13cm  Lungman+1  TO20+16cm  Lungman+2
min        31        21       64         47         102        76
max        136       163      315        337        615        617
mean       65        66       141        121        257        213
median     59        58       121        97         205        164

3.2.3.3 TO20 test object score

From the TO20 scoring, large differences were observed in the total number of visible discs: from 38 to 83 (at 9 cm
PMMA), from 31 to 78 (13 cm) and from 22 to 72 discs (16 cm), with averages of 72, 65 and 59, respectively. During
the reading, artefacts like vertical lines and structures such as AEC cells were seen in images of 27% of the systems
(e.g. Figure 3.9a). Variations in the IAK were also observed: for 9 cm PMMA, IAK values ranged from 31 µGy to
136 µGy, for 13 cm from 64 µGy to 315 µGy and for 16 cm PMMA from 102 µGy to 615 µGy (i.e. by a factor of 4.4,
4.9 and 6, respectively). There is an average reduction of 10% in object detectability (i.e. number of visible discs) as
the PMMA thickness is increased from 9 cm to 13 cm and from 13 cm to 16 cm.

Figure 3.7 shows for all systems the fraction of visible discs out of the total (144), along with the IAK values. The
fraction of visible discs varies from 35% to 58%, with a median value of 47%. System F is an outlier, with just 15%
of discs visible for the largest PMMA thickness. An average uncertainty of 10.9% (excluding system F) was obtained
for the TO20 reading. This uncertainty is the relative deviation in scores, calculated as the ratio of the standard
deviation to the mean of the number of discs seen by 2 observers reading 3 images for a given system. The value reported
is the average for all systems and PMMA thicknesses.
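
The reading uncertainty can be illustrated with a short sketch that interprets the relative deviation as the standard deviation divided by the mean of the six readings (2 observers × 3 images). The use of the sample standard deviation is an assumption, and the disc counts are hypothetical.

```python
import statistics


def relative_deviation(disc_counts):
    """Standard deviation of the readings divided by their mean."""
    return statistics.stdev(disc_counts) / statistics.mean(disc_counts)


# Hypothetical disc counts for one system and one PMMA thickness:
# two observers, three images each.
readings = [70, 74, 66, 72, 68, 70]
cv = relative_deviation(readings)
print(f"{100 * cv:.1f}%")
```

Averaging this quantity over all systems and PMMA thicknesses gives a single figure of the kind reported above.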

[Chart: fraction of visible discs (left axis, 0 to 0.70) and IAK in µGy (right axis, 0 to 700) per radiographic system (A to V); series: 9cm PMMA + TO20, 13cm PMMA + TO20, 16cm PMMA + TO20, IAK_9cm, IAK_13cm and IAK_16cm.]

Figure 3.7: Fraction of visible discs from the total number of discs present in the TO20 test object imaged between 9, 13 and 16 cm
PMMA for 22 systems. The IAK for each configuration is also represented with dashed lines. Systems are anonymized by letter.

Figures 3.8a, 3.8b and 3.8c show the fraction of visible discs out of the total as a function of IAK for all the systems
for 9, 13 and 16 cm PMMA, respectively. The variability in the results reflects the different AEC operating levels of
the systems and also that they have different antiscatter grids and detectors. This leads to a range in the fraction of
total discs visible for the same IAK. We can estimate the expected relationship in terms of number of visible discs for
the Leeds contrast-detail test objects [104], as they are designed with a factor √2 change in contrast and in diameter
between successive discs (for a given beam quality). As there are 12 diameters, a factor 2 increase in IAK should
increase the number of visible discs by 12, for quantum noise limited systems. This expected relation between visible
discs and IAK is shown as a black dotted line in Figures 3.8a, 3.8b and 3.8c. The expected curve is shifted along
the y-axis so that it matches the average number of discs of the five systems with the lowest IAK. We expect some
deviations from this expected curve for a number of reasons. First, these systems have different X-ray detectors with
different DQE(u) values (representing imaging performance), leading to different numbers of visible discs for the
same IAK [96]. This will be mitigated to some extent by the fact that all the detectors, but one, are flat panel DR types,
and we can expect fairly similar imaging performance. Second, differences in tube voltage will also cause some
variation in the number of visible discs for the same IAK value. The rather narrow range of tube voltages (110 kV to
133 kV; 125 kV median) will limit this variation. Still, it is clear that there are large differences from this expected
relationship for some of the higher IAK systems.
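
The expected relationship described above can be written as a short function. This is a sketch under the stated idealisation (12 additional discs per doubling of IAK, out of 144 discs); the anchor point used in the example is hypothetical and stands in for the average of the five lowest-IAK systems.

```python
import math

TOTAL_DISCS = 144        # discs in the TO20 test object (from the text)
DISCS_PER_DOUBLING = 12  # one extra disc per diameter for each doubling of IAK

def expected_fraction(iak, iak_ref, fraction_ref):
    """Expected visible fraction at `iak`, anchored at the point (iak_ref, fraction_ref)."""
    extra_discs = DISCS_PER_DOUBLING * math.log2(iak / iak_ref)
    return fraction_ref + extra_discs / TOTAL_DISCS

# Hypothetical anchor: 45% of discs visible at an IAK of 50 µGy.
for iak in (50, 100, 200):
    print(iak, round(expected_fraction(iak, 50.0, 0.45), 3))
```

The function is not clipped to the range 0 to 1, so it only makes sense over the limited IAK range of the survey.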

[Figure 3.8 plots: visible discs / total discs versus IAK, each with the expected relationship and a logarithmic fit. (a) 9 cm PMMA: y = 0.0333ln(x) + 0.3724, R² = 0.0545. (b) 13 cm PMMA: y = 0.0594ln(x) + 0.1705, R² = 0.2016. (c) 16 cm PMMA: y = 0.0649ln(x) + 0.0665, R² = 0.2521.]
Figure 3.8: Fraction of visible discs versus incident air kerma (IAK) for the TO20 imaged with 9 cm (a), 13 cm (b) and 16 cm (c)
PMMA for 22 systems. The expected curves for a quantum noise dominated system (dotted black line) are also shown.

Figure 3.9: (a) TO20 image for system F, showing the vertical line noise and the presence of AEC sensing regions, (b) the Lungman
image of system F confirms the presence of vertical line noise.

3.2.3.4 VGAS

Figures 3.10a, 3.10b and 3.10c show the VGAS (mean score given to each Lungman image) for readers 1, 2 and 3
respectively, for each Lungman thickness and system (N_i = 1 and N_O = 1 in equation 3.1). For reader 1, VGAS ranged
from 1 to 2.6 (mean = 1.6 and median = 1.4), for reader 2 from 1 to 4.0 (mean = 2.2 and median = 2.0) and for reader 3
from 1 to 4.3 (mean = 2.4 and median = 2.3).

Figure 3.11 compares the VGAS (mean score given to each system) for the three readers. The VGAS values comprise
the scores for the three images of all Lungman sizes (N_i = 3 and N_O = 1 in equation 3.1). As shown, the VGAS stayed
below 3 for most of the systems, meaning that all the radiologists judged the quality of the systems to be generally
good. In fact all the images were scored as suitable for diagnostic applications. It can also be observed that reader 1
gave lower scores to the images (i.e. thought that image quality criteria were more often fulfilled) compared to readers
2 and 3. The IAK for the standard Lungman is also plotted in the figure.

The reader ICC was calculated for the three phantom thicknesses. Only a moderate agreement was found between the
three readers, with ICC = 0.70 (95% CI: 0.49-0.85), therefore the average of the three readers was not used in further
analysis. There was no correlation between VGAS and IAK values for any of the Lungman thicknesses.

[Figure 3.10 plots: VGAS (1.0 to 4.5) per radiographic system (A to V) for the Lungman, Lungman +1 and Lungman +2 phantoms. Panels: (a) Reader 1, (b) Reader 2, (c) Reader 3.]

Figure 3.10: VGAS for the 22 systems surveyed (1 = best, 5 = worst). VGAS are shown for the three Lungman sizes: standard
Lungman (no layers), Lungman + 1 tissue layer and Lungman + 2 tissue layers. Reader 1 (a), reader 2 (b) and reader 3 (c).

[Figure 3.11 plot: left axis, VGAS (0.0 to 4.0); right axis, IAK (µGy, 0 to 180); x-axis, radiographic system (A to V); series: Reader 1, Reader 2, Reader 3 and IAK.]

Figure 3.11: VGAS (mean score for each system) given by each reader. IAK for the standard size Lungman is also shown (dashed
line).

3.2.3.5 VGC score – comparison against a reference system

Using system T as a reference the AUCs were calculated for all the systems. Table 3.5 shows the AUC including the
95% confidence interval for the three readers. For each reader a set of systems showed a significant quality difference
with respect to the reference T. In these cases, the 95% CI did not include 0.5 and these are indicated in bold in
Table 3.5. For readers 1, 2 and 3 respectively, 43% (9 cases), 24% (5 cases) and 38% (8 cases) of the systems had
image quality significantly poorer than system T (i.e. the whole CI lay above 0.5). For systems A, F, K and U, poorer
image quality than system T was found by all three readers. Readers 2 and 3 found respectively 10% (2 cases) and 19%
(4 cases) of the systems significantly better than the reference system (i.e. the whole CI lay below 0.5).
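
The decision rule used above can be sketched as a small function. The function name and the returned labels are illustrative only; the three example intervals are taken from Table 3.5, where a higher VGAS means poorer quality, so an AUC above 0.5 indicates a system rated poorer than the reference.

```python
def classify_vgc(ci_low, ci_high, chance=0.5):
    """Classify a system against the reference from the 95% CI of its VGC AUC."""
    if ci_low > chance:
        return "poorer than reference"
    if ci_high < chance:
        return "better than reference"
    return "no significant difference"

# (lower, upper) bounds of the 95% CI, read from Table 3.5.
examples = {
    "A, reader 1": (0.52, 0.77),
    "D, reader 1": (0.36, 0.63),
    "O, reader 3": (0.13, 0.35),
}
for label, (low, high) in examples.items():
    print(label, "->", classify_vgc(low, high))
```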

Table 3.5: AUC including 95% CI, listed as [upper-lower], for the three readers. Comparison between 21 of the systems
surveyed and system T (defined as the reference system).

Radiographic system Reader 1 Reader 2 Reader 3


A 0.64[0.77-0.52] 0.75[0.86-0.64] 0.86[0.95-0.78]
B 0.63[0.76-0.51] 0.48[0.61-0.35] 0.62[0.75-0.49]
C 0.59[0.72-0.46] 0.41[0.54-0.28] 0.27[0.39-0.16]
D 0.49[0.63-0.36] 0.50[0.63-0.37] 0.48[0.61-0.34]
E 0.66[0.79-0.54] 0.38[0.51-0.25] 0.42[0.55-0.28]
F 0.82[0.91-0.72] 0.73[0.85-0.62] 0.85[0.94-0.76]
G 0.70[0.82-0.58] 0.45[0.58-0.31] 0.72[0.84-0.60]
H 0.67[0.79-0.54] 0.49[0.63-0.36] 0.78[0.88-0.67]
I 0.55[0.68-0.42] 0.34[0.47-0.22] 0.39[0.52-0.26]
J 0.44[0.57-0.31] 0.53[0.66-0.40] 0.46[0.60-0.33]
K 0.75[0.86-0.64] 0.72[0.84-0.61] 0.88[0.96-0.79]
L 0.62[0.75-0.49] 0.42[0.55-0.29] 0.29[0.40-0.17]
M 0.52[0.65-0.39] 0.41[0.54-0.28] 0.29[0.41-0.17]
N 0.82[0.92-0.72] 0.56[0.69-0.43] 0.71[0.83-0.59]
O 0.45[0.59-0.32] 0.34[0.46-0.21] 0.24[0.35-0.13]
P 0.43[0.56-0.29] 0.48[0.62-0.35] 0.56[0.69-0.42]
Q 0.47[0.60-0.33] 0.55[0.68-0.41] 0.62[0.75-0.49]
R 0.60[0.73-0.46] 0.43[0.56-0.30] 0.43[0.57-0.30]
S 0.52[0.65-0.39] 0.47[0.60-0.33] 0.48[0.62-0.35]
U 0.80[0.90-0.70] 0.83[0.93-0.73] 0.89[0.97-0.81]
V 0.47[0.60-0.33] 0.65[0.78-0.52] 0.72[0.84-0.60]

3.2.3.6 Correlation between TO20 and VGAS

Spearman correlation was calculated for the VGAS (i.e. mean score given to each system by each reader) and the total
TO20 scores. The VGAS of readers 2 and 3 did not show a correlation with the TO20 results, while a moderate
correlation was found for reader 1 (rS = -0.556 with p<0.01).
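
For reference, the Spearman coefficient is the Pearson correlation of the ranked data, with tied values sharing their average rank. The sketch below uses only the Python standard library and does not compute the p-value. The VGAS values and disc totals are hypothetical, and this is not necessarily the software used for the analysis in this chapter.

```python
from statistics import mean

def average_ranks(values):
    """1-based ranks of `values`, with ties sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

vgas = [1.2, 1.5, 1.9, 2.4, 2.4, 3.0]    # hypothetical mean scores per system
discs = [230, 215, 220, 190, 200, 170]   # hypothetical total TO20 scores
print(round(spearman(vgas, discs), 3))
```

A negative coefficient, as found for reader 1, means that systems with more visible discs tended to receive lower (better) VGAS.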

Figure 3.12 plots the TO20 results for each system (i.e. fraction of visible discs) against the VGAS of each
reader.

[Figure 3.12 plot: visible discs / total discs (0.20 to 0.55) versus VGAS (1.0 to 4.0); series: Reader 1, Reader 2 and Reader 3.]

Figure 3.12: Relationship between TO20 scoring (i.e. fraction of visible discs) and VGAS for the three readers. The y-axis
corresponds to the total number of visible discs and the x-axis to the total VGAS for each reader.

3.2.4 Discussion

Variations in acquisition settings were seen across the systems, but in general they all employed a high tube voltage
technique with an anti-scatter grid and a SID of 150 cm or greater. Apart from the higher tube current-time product
(mAs) values delivered by the AEC, there was no change in the acquisition protocols for larger size patients. This is
consistent with the results of Al-Murshedi et al [114] who found that the tube voltage was only adapted for one system
out of 17 in a survey performed in the UK. Large variations in incident air kerma were observed, leading to the
question whether the higher values are justified. Currently, the medical physics service does not impose a target
detector IAK for a particular detector type (e.g. CsI flat panel) for the Chest PA examination – and therefore the IAK
at the detector varies from centre to centre. As far as we know, there are no such reference values in literature. The
higher IAKs on some systems, without an associated increase in contrast-detail detectability, point to the need to
examine the AEC setup at these centres in a future step.

It is well known that large variations in patient dose are often seen in patient dose surveys, regardless of the underlying
image receptor technology (screen film/CR or DR)[115]. With CR and DR detectors, there is also the potential for
dose creep to occur, given the wider dynamic range found for these detectors. One of the tasks of the medical physicists
is to routinely survey performance of equipment, identify outliers operating at unjustified higher dose levels and take
corrective action. This clearly needs further attention for the systems in this survey. This study was in part set up to
find a useful procedure for this task.

The TO20 results showed clear differences between some of the systems. The data in Figures 3.8a, 3.8b and 3.8c show
that systems with higher phantom IAK do not consistently perform better than systems delivering lower doses. For all
three PMMA thicknesses, the increase in the number of discs does not follow the trajectory of the expected quantum
noise curve for the higher IAK units. Some deviation from this relationship is expected, due to differences in detector
DQE(u) and tube voltage, however there are a number of units that are probably not working in a quantum noise
dominated region and whose setup should be investigated further.

We can also see that system F has a very low score and this was due to vertical lines present in the image (Figure 3.9a).
These were not noticed at the time of acquisition as they only became clearly visible at a narrow window width setting.
One explanation could be the (temporary) use of an incorrect grid, but the lines are more consistent with some form
of dark noise present in the image. After checking further with the X-ray department, these lines were not visible in
later clinical images acquired on the unit, so this was probably some kind of intermittent fault. All of the detectors are
subject to routine service, including a manufacturer (detector) specific calibration routine performed at the recommended
frequency (typically semi-annually, annually, or after a service intervention). This particular fault was resolved
without an additional detector calibration. It is interesting to note that while the VGAS results for this system were at
the low end of the range, especially for Readers 2 and 3, VGAS was not as clear an outlier as was the case for the c-d
scores. This suggests that c-d test objects, which generate effectively homogeneous images with low contrast stimuli,
are more sensitive to this type of image artefact than images of phantoms containing higher-contrast, anatomy-mimicking
structures. Although the readers commented on the line structures in the image, they were able to
‘see through’ the periodic artefact present in the image (Figure 3.9b).

The design of the TO20 test object is such that one additional disc should be visible (at all diameters) if the dose is
doubled, assuming that the system is quantum noise dominated: this would give an increase of 12 discs in the total
score. For example, for 9 cm PMMA, the highest and lowest scores (excluding the outlier system F) were system T
(83 discs) and E (66 discs), giving a difference of 17 discs. A dose increase of a factor of ~2.5 would be needed for
system E to achieve the same low contrast detectability as system T. However, such a dose increase does not
necessarily guarantee the same number of visible discs, since the SNR and corresponding object detectability is also
affected by other noise sources such as structure and electronic noise present in the image [116]. For example, the
reading was affected by the presence of AEC cells in some of the images; system F suffered from this artefact too
(Figure 3.9a). This effect of the AEC cells was not visible in the Lungman images and none of the readers noticed
them during the scoring process – a result of the large dynamic range of the Lungman phantom (Figure 3.9b). This
suggests that the Lungman phantom is not very sensitive to subtle changes in large area image uniformity, but may be
promising to evaluate other aspects such as image processing techniques that require an anthropomorphic image to a
certain extent.
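
The dose argument above can be made explicit with the idealised rule of 12 extra discs per doubling of dose. The sketch below assumes a purely quantum noise dominated system; under that assumption the 17-disc gap between systems T and E corresponds to a factor of roughly 2.7, of the same order as the ~2.5 quoted above. The function name is illustrative.

```python
def dose_factor_for_extra_discs(extra_discs, discs_per_doubling=12):
    """Dose multiplier needed to gain `extra_discs` in a quantum noise limited system."""
    return 2 ** (extra_discs / discs_per_doubling)

# 9 cm PMMA: system T scored 83 discs and system E scored 66 discs (from the text).
factor = dose_factor_for_extra_discs(83 - 66)
print(round(factor, 2))
```

As the text notes, structure and electronic noise mean that such an increase would not necessarily close the gap in practice.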

Factors leading to the differences in number of visible discs and IAK include the tube voltage, the operating point
(chosen target exposure at the X-ray detector) and the efficiency of the antiscatter grid. The total and primary
transmission of the grid are extremely influential in terms of patient dose and the ability of the system to transfer the
SDNR to the output image [117,118]. These parameters are not always visible/displayed on the antiscatter grids and
are not routinely recorded or assessed in the QC protocols used at UZ Leuven. Following this study, it has been
considered to include a more detailed evaluation of the antiscatter grid at the Commissioning stage. Other factors to
consider for the lack of correlation between the number of visible discs and IAK may be the different image processing
algorithms used in the systems. However, several studies where contrast detail test objects were evaluated under
different image processing techniques did not find significant differences in the scoring results [102,119,120].

From the VGAS (Figures 3.10a, 3.10b and 3.10c) we can see that the quality perceived by the readers generally
decreased with the increase of the phantom thickness, as would be expected. This is also reflected in the TO20 contrast-
detail scores, where the fraction of visible discs decreased with increasing PMMA thickness (Figure 3.7). Going from
9 cm to 13 cm PMMA, the mean number of discs reduced by 7, while going from 9 cm to 16 cm, on average 13 fewer
discs were seen.

As can be seen from the data in Figure 3.11, no correlation was found between the IAK delivered by the system and
the image quality evaluation given by the readers, as was the case for the c-d analysis. This could be the result of using
a global VGAS, rather than examining features whose score is expected to be limited by noise. To test this, the
relationship between IAK and single image quality criteria was also evaluated for criteria in the lung field and
mediastinum as well as the noise criteria and the overall image quality. However, no correlation was observed with
IAK for any of these criteria.

The link between the phantom surrogate clinical image quality as perceived by the radiologists and the number of
visible discs from the contrast detail object gave moderate correlation for reader 1 and none for readers 2 and 3. This
could be explained by the fact that VGA is sensitive to even more factors than the c-d score, such as image processing.
Sund et al. found a stronger link between the clinical impression of image quality and image processing than with
quantitative metrics such as DQE [121]. The influence of image processing is obviously important but also complex
and may depend on the imaging task, the anatomy, the type of study (e.g. a detection study or a VGA study), the test
object and even the readers and the type of image processing with which they are familiar. For example, Zanca et
al found that image processing type had a significant effect on microcalcification detection in mammography [122]
while Warren et al found no such effect [119]. In a recent VGA based study, Smet et al [123] did not find a significant
influence of image processing on VGAS, while dose level did significantly change VGAS.

Among the limitations of the study, we note that TO20 was imaged against a homogeneous block of PMMA of constant
thickness, while patients and the Lungman phantom generate a strongly inhomogeneous exposure. The AEC chambers
are selected as used clinically (e.g. L and R), but this results in a homogeneously exposed image for TO20 at the target
level, while for a patient and for Lungman, the target exposure will be achieved in the lung region and regions such
as the mediastinum and diaphragm will have a lower exposure. This implies that the TO20 images at the 13 cm and
16 cm thickness are generally imaged at the target exposure level of the system and we obtain no information on the
c-d detectability in lower signal regions of the image (corresponding to thicker anatomy, e.g. in the mediastinum),
where the electronic noise may form a larger fraction of the total image noise. To obtain this information, additional,
manually controlled acquisitions would be needed, with an mAs set to give the detector IAK found in the mediastinum or
diaphragm regions of patients or Lungman.

Some aspects of this work warrant further study. The chest PA protocols used did not include adjusted settings for
larger patients. The only change in the acquisition settings was an increase in mAs delivered by the AEC, such that
the signal from the detector (quantified by the exposure index) was held constant. Dedicated exposure settings for
larger patients typically involve a programmed increase of tube voltage, with the aim of increasing penetration through
the thicker body, which in turn helps to limit the increase in IAK and exposure time due to the thicker subject.
However, a simple tube voltage increase will not counteract the reduction in object contrast (or ‘image quality’ in
general) found for the thicker patient. Additional changes are likely to be required, such as increasing exposure at the
detector in order to hold SDNR for larger patients close to the level of that for thinner patients. This was described
previously for digital mammography, where SDNR or object detectability for larger breasts (6 or 7 cm thick) was held
at the same level as for thinner breasts (4.5 cm thick), by increasing exposure at the x–ray detector[39]. Whether
changing detector IAK to hold the SDNR at some target level would work in chest imaging is not clear and needs
further investigation. Compared to breast imaging, chest images have a strongly inhomogeneous radiation pattern. In
addition, the modulation of DAK to achieve some target SDNR requires the system or detector to be operating in
the quantum noise limited region and, as can be seen from Figure 3.8, this is probably not the case for some of the
systems surveyed.

Another area requiring additional work is whether agreed image quality levels can be developed and applied.
Currently, there are no target image quality levels, quantified either by VGAS or by a contrast-detail score, for chest
radiography imaging. Our analysis also showed that it is very difficult to impose any values from surveys as performed
in this work, even though two widely accepted evaluation methods were applied. While a detailed optimization of
every system was not the aim of the work, it is not clear how to proceed from here. There are probably systems in
which dose levels should be decreased, and perhaps even systems where dose should be increased. The formulation
and evaluation of critical radiological tasks may help in answering this. However, while there is consensus on
important anatomical structures [110], there are in fact no agreed reference tasks/pathologies. Task performance
evaluation in chest radiography is a complex subject due to the multiplicity of tasks performed in chest x–ray
projection imaging and the range of absorption coefficients from the different organs like lungs and ribs. One can
question how far the detectability of subtle lesions, or reader performance for some other critical task, is tested in a
VGA study. The development of a more dedicated, application-specific framework would enable task performance
for some common examinations to be quantified and thus the dose levels used in chest X-ray imaging to be justified.

3.2.5 Conclusions

This work has surveyed the imaging performance of a group of 22 X-ray systems used for chest X-ray imaging in
Belgium, using contrast-detail and VGA methods. While the VGC analysis showed clear differences in image quality
between some of the systems surveyed, conflicting results were also seen. Systems with excellent low contrast
detectability were rated poorly in the VGA analysis, and vice versa. The VGAS of only one of the three readers showed
a moderate correlation with the number of discs visible in the contrast-detail test object.

Acquisition parameters for the 22 units were reasonably similar e.g. high tube voltage technique (125 kV median
value), with antiscatter grid in place, however a wide range in phantom IAK was found. Furthermore, these changes
in dose did not correlate with the image quality characterized by the VGA study, even when studying image features
that are expected to be sensitive to image noise. Similar behaviour was observed when evaluating the relationship
between the number of discs seen in the c-d phantom and IAK, where systems with a higher IAK did not have the
improved detectability, characterized by number of visible discs, expected for quantum noise dominated systems.

Results were more consistent when looking at the influence of PMMA or Lungman phantom thickness. Reduced
performance was seen for both the number of visible discs and VGAS as PMMA or phantom thickness was increased.
While IAK increased with increasing thickness, as expected for AEC controlled systems, there was no compensation
in technique factors (except for mAs) such that technical image quality matched that for the standard patient/thickness.

These results show that there is a need to optimize a number of the systems in the chest X-ray departments of the survey.
Dedicated protocols should be used for overweight or obese patients, to ensure a task specific metric provides results
that are largely independent of patient thickness.

3.3 General conclusions of the Chapter

This chapter has demonstrated that the combined use of DAP and EI for dose monitoring presents a more effective
method to track outliers than simply comparing measured doses against DRLs. The newer method at least includes
some measure of the signal at the image receptor (i.e., the EI) and therefore an indication of image quality by proxy.
The standard methods used to survey image quality - technical approach using a contrast-detail test object and a VGA
rating scale method used with anthropomorphic phantom images - are to some extent quantitative but open ended.
The results presented in this Chapter suggest that there is a need for a more comprehensive and flexible tool for
optimization of chest projection imaging systems. We could draw the hypothesis that a study including information
on the task or clinical question to be answered will be a better approach. The next chapter describes the development
and validation of a platform aimed at task-based optimization studies for chest X-ray radiography.

Chapter 4
Methodology and validation of a simulation
platform for Virtual Clinical Trials in chest
radiography

In the previous chapters we have seen that the classical methods of dose and image quality evaluation in chest
radiography may have limitations when used for optimization or evaluation of task performance. The survey
performed has shown that there is scope for optimization and a need for a more comprehensive approach. In line with
the principles of Medical Physics 3.0 [124], the decision was taken to design a performance evaluation study that
approached the clinical reality of the radiologists, i.e. utilizing a task-based optimization. The Virtual Clinical Trial
(VCT) method was chosen for this [19], in which images of computational anthropomorphic models are simulated for
the range of parameters under investigation. This chapter describes the design and implementation of a VCT total
simulation framework that can be used to generate synthetic radiographic images of realistic anatomical models
including relevant clinical tasks.

The imaging chain is simulated using a combination of ray tracing methods and Monte Carlo transport code to generate
radiography images. Detector sharpness and noise characteristics, quantified for a real X-ray image receptor, are added
to the simulated images. The modelling is implemented and validated using a simple homogeneous test object.

The content of this chapter belongs to the manuscript ‘Methodology and validation of a simulation platform for Virtual
Clinical Trials in chest radiography’ ready to be submitted to the journal Medical Physics.

4.1 Introduction

Optimization in chest radiography should guarantee an appropriate balance between the image quality required to
successfully perform the task in question and the dose used to generate the image. Several optimization studies can be
found in literature with varied conclusions depending on the measures of image quality employed [10,11,18,42,43,74].
A summary of these results was previously discussed in the Introduction Chapter. Currently, a high energy technique
is applied in the routine clinical practice. This technique was inherited to some extent from screen/film (S/F) imaging
and may not have been updated after the introduction of digital detectors. This technique was probably chosen for the
overall performance, with many types of lesions shown in very different anatomical realities and visible without any
dedicated, locally enhancing image processing. Furthermore, radiologists are used to viewing and working with CXR
images acquired at high energy e.g., 120 kV. It can be questioned whether a high energy technique with anti-scatter
grid remains the most effective choice of acquisition parameters, given the newer flat panel digital detectors (FPDs).

To study the influence of all the acquisition parameters with potential impact on the diagnostic performance, numerous
images of the pathologies should be acquired where the settings are varied. In CXR this becomes very challenging
due to the wide range of clinical tasks. This makes studies using patients impractical, due to ethical limitations and
challenging case selection. To overcome these challenges, Virtual Clinical Trials (VCTs) can be
implemented [19]. In a VCT, computational modelling is used to create physically correct, synthetic radiographic
images of realistic anthropomorphic phantoms. VCT methods have been used extensively in mammography imaging
and are now being used in a wide range of X-ray based modalities plus other modalities such as PET and SPECT,
MRI and ultrasound imaging [19]. These methods can be described either as partial, in which lesion models or imaging
tasks are simulated into real patient images [125–127], or as total simulation, in which the X-ray imaging system
components are simulated along with computational models of the anatomy and lesions [36,128]. Computer
simulations have the potential to investigate numerous elements in the imaging chain making them a valuable tool.
This includes beam quality, field size, dose levels and different scatter rejection methods. The anthropomorphic
computational models can represent the variability in the human anatomy [20–22] and the different pathologies and
devices can be included [23,24]. The simulated images of the anthropomorphic models can then be evaluated by real
or model observers [25–27], in order to investigate the influence of different parameters on lesion detectability. This
makes computer simulations a powerful tool for optimization in diagnostic radiology.

The aim of this work was to create a simulation framework that can be used to study different elements of the imaging
chain as well as their influence on image quality and doses delivered to the patient. The work consists of three main
stages: (1) creation of noise-free ‘hybrid’ simulated images, called hybrid because Monte Carlo simulations and ray
tracing techniques are combined into a final image, (2) addition of measured real detector characteristics to the hybrid
images, to obtain realistic degrees of sharpness and noise and (3) a validation step, where simulated and experimental
data are compared.

4.2 Methods

4.2.1 Creation of hybrid images

The Monte Carlo (MC) method uses probability distributions, derived from material cross sections, to simulate the
interaction of radiation with matter: energy loss, angular deflection (scattering), absorption and production of
secondary particles [84]. This makes MC methods a powerful tool for studying radiation transport and associated
imaging applications. However, because of the random sampling of many single particle events, Monte Carlo
simulations require long computational times to achieve good statistics. On the other hand, ray tracing algorithms are
capable of producing highly detailed noise free 2D projections from 3D objects in relatively short computational
times [129]. Ray tracing methods compute the attenuation of X-ray photons through an object, but scattered radiation is
not considered. In the first stage of the simulation these two techniques were combined to create a hybrid image. The
main motivation was the need to obtain a high-resolution and completely noise-free image to which real
detector characteristics could be added afterwards. Using only MC simulations was considered impractical
because of the long computational times, even with supercomputing resources, so the hybrid approach was
chosen. Monte Carlo simulations were implemented to obtain the image formed by the scattered radiation while ray
tracing techniques were used to generate the image formed by the primary (unscattered) radiation.

4.2.1.1 Creation of scatter projections

Monte Carlo calculations were implemented using the PENELOPE/penEasy [84,85] transport code. In this work the
simulation parameters were defined to reproduce an X-ray examination from one of the radiography rooms of our
institution. System specifications corresponded to a Carestream DRX Evolution general purpose X-ray system
(Carestream, New York, USA). The source was modelled as a cone beam positioned at 140 cm from the detector. The
Boone polynomial method was used to generate the required X-ray spectra [87], while attenuation coefficients to
apply the X-ray tube filtration were taken from the material database from PENELOPE [84].

The Carestream system has a Mitaya focused grid, which was simulated according to the manufacturer specifications:
lead strips with aluminium interspacing, 12:1 ratio and 80 line-pairs per centimetre. The grid septa were inclined to
simulate the grid focus distance of 140 cm. To simulate the moving grid, the grid geometry was shifted four times
in the positive direction of the y-axis to cover a full line-pair (lead strip and Al interspace). The size of the
(simulated) grid was large enough to cover the full detector each time. The carbon fibre top and bottom layers of 32 µm
each were also simulated. The CsI detector was simulated with dimensions 43×35 cm² and defined as a perfectly
absorbing material: all photons arriving at the detector were absorbed. Additionally, a collimator was modelled
close to the source to simulate the desired field of view (FOV), an important aspect when reproducing the scatter
fraction produced by the object.

In this study, the object to be imaged was a simple test object consisting of a homogeneous PMMA block (26×30 cm² and
9 cm thick) and an aluminium detail (1×1 cm² and 2 mm thick). Different types of objects can of course be
defined using a quadric geometry (like planes, ellipsoids, cylinders, cones, etc), or in the case of human-like anatomy,
voxelized phantoms can be loaded in the simulation.

To generate the output images, the Pixel Image Detector Tally was used, which computes the energy deposited in the
detection plane per unit area and per particle simulated (eV/cm 2/history). This tally discriminates between detected
photons according to their interaction history in the phantom, enabling scattered and primary photon images to be
obtained. The scattered radiation can be further divided by the interaction type prior to absorption in the detector,
namely as a Rayleigh, Compton or Multi-scatter event. Definition of the tally requires the pixel size of the output
image to be specified. Given that scattered radiation is a long-distance effect in the image [130], containing little
information on detailed structures, a pixel dimension of 5×5 mm2 was used for characterization of the spatial
distribution of scattered radiation reaching the detector [37].

The Monte Carlo code therefore generates low resolution scatter images where the pixel values (PV) correspond to
energy deposited per unit area and number of histories [eV/cm2/history]. The final low spatial resolution scatter images
were re-binned to the real image dimension using bilinear interpolation, i.e., to a physical size of 43×35 cm2 with pixel
size of 139 µm.
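The rebinning of the coarse scatter image to the full detector grid can be sketched in Python as follows. This is a minimal illustration, not the code used in the thesis: the function name `upsample_bilinear` is hypothetical, the 86 x 70 input grid assumes the 43×35 cm2 detector tallied in 5 mm pixels, and the orientation (43 cm side mapped to 3072 pixels) is likewise an assumption.

```python
import numpy as np

def upsample_bilinear(img, out_shape):
    """Resample a coarse 2D image onto a finer grid (separable linear interpolation)."""
    in_rows, in_cols = img.shape
    out_rows, out_cols = out_shape
    # Output pixel centres expressed in input-pixel coordinates
    r = (np.arange(out_rows) + 0.5) * in_rows / out_rows - 0.5
    c = (np.arange(out_cols) + 0.5) * in_cols / out_cols - 0.5
    # np.interp clamps outside the sampled range, so borders are edge-replicated
    tmp = np.stack([np.interp(c, np.arange(in_cols), row) for row in img])
    return np.stack([np.interp(r, np.arange(in_rows), col) for col in tmp.T], axis=1)

# Hypothetical coarse scatter image (eV/cm2/history) on 5 mm pixels
scatter_lowres = np.random.default_rng(0).uniform(1.0, 2.0, size=(86, 70))
scatter_full = upsample_bilinear(scatter_lowres, (3072, 2560))
```

Because linear interpolation never overshoots its samples, the upsampled scatter field stays within the range of the Monte Carlo tally values.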

4.2.1.2 Creation of primary projections

The primary images were generated using a ray tracing code implemented in Python. Projections were calculated
analytically via Siddon's algorithm [129]. The result from the ray tracing is a high resolution, noise free primary
image of the object with the same dimensions as the real image (2560 x 3072 pixels, with pixel size of 139 µm), where
the pixel values represent the probability that a photon arrives in the detector without undergoing energy loss or
angular deflection.

The software takes as input a voxelized object in a 3D matrix format, where each element of the matrix corresponds
to the material index of each element forming the object. In this specific application, a 260x300x128 matrix was
created to represent the PMMA and the Al detail. A voxel resolution of 10x10x2 mm3 was chosen according to the
smallest dimensions present in the object, given by the Al detail. Monochromatic rays are generated from a point source
and traced through the object to the centre of each detector pixel. The algorithm calculates the attenuation through the
object using equation 4.1.

$$A(x, y, \varepsilon_i) = \exp\left(-\sum \mu(\varepsilon_i, x, y, z)\, l(x, y, z)\right) \tag{4.1}$$

where l(x,y,z) is the length traversed by the ray within the voxel with index (x,y,z) and µ(ɛi,x,y,z) is the linear attenuation
coefficient of the voxel material at energy ɛi. The output from the ray tracing step is a stack of 2D images (A(x,y,ɛi))
corresponding to each energy bin of the spectrum. Energy bins of 5 keV were chosen in this study since there were no
significant differences in the results with respect to the use of 1 keV bins (differences below 2%). Higher energy
resolution can be used if desired.
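Equation 4.1 can be illustrated with a simplified sketch. It assumes parallel rays travelling along the z-axis, so that every voxel contributes the same path length; the thesis instead traces diverging rays with Siddon's algorithm, which yields a different l for each voxel. The function name, the material labelling and the attenuation coefficients are placeholders chosen for illustration, not tabulated data.

```python
import numpy as np

def attenuation_stack(material_idx, mu_table, voxel_len_cm):
    """Return A[e, x, y] = exp(-sum_z mu[e, material] * l) for parallel rays along z."""
    # mu_table[e, m]: linear attenuation coefficient (1/cm) of material m in energy bin e
    mu_volume = mu_table[:, material_idx]            # shape (n_energy, nx, ny, nz)
    path_integral = mu_volume.sum(axis=-1) * voxel_len_cm
    return np.exp(-path_integral)

# Material indices: 0 = air, 1 = PMMA, 2 = Al (hypothetical labelling)
volume = np.ones((4, 4, 90), dtype=int)              # 9 cm of PMMA in 1 mm voxels
mu_table = np.array([[0.0, 0.20, 0.80],              # placeholder values, energy bin 1
                     [0.0, 0.15, 0.50]])             # placeholder values, energy bin 2
A = attenuation_stack(volume, mu_table, voxel_len_cm=0.1)
```

The result is one attenuation mask per energy bin, which is the stack of 2D images referred to in the text.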

To combine scatter [unit = eV/cm2/history] and primary [unit = probability] images, the latter were multiplied by
Monte Carlo calculated flood images (i.e., without object) representing the initial photon intensity at a specific energy
(I0(εi)) [unit = eV/cm2/history]. Flood images were generated using penEasy, thus with the same units as the scatter
images. Mono-energetic beams for identical source position, collimation and source-to-image receptor distance (SID)
as applied to create the scatter images (section 4.2.1.1) were used. These were then combined to create a stack of 2D
images binned at the same energies used in the ray tracing (5 keV). The simulations were stopped when the relative
uncertainty reported by the MC code reached 0.5%. To reduce computation time, a 5×5 mm2 pixel size was
used; the images were then rebinned to the real dimension using bilinear interpolation. The flood images were
weighted by the probability of emission at a given energy bin from the selected spectrum. For the simulation of images
with an antiscatter grid, the flood images were filtered by the primary transmission of the grid for the selected beam
quality. The primary image was then calculated following equation 4.2.

$$P(x, y) = \sum_{i=1}^{N_\varepsilon} F(x, y, \varepsilon_i)\, A(x, y, \varepsilon_i) = \sum_{i=1}^{N_\varepsilon} I_0(\varepsilon_i)\, e^{-\mu(\varepsilon_i)\, l} \tag{4.2}$$

where P is the primary projection, F(x,y,ɛi) is the weighted flood image at energy ɛi coming from Monte Carlo and
A(x,y,ɛi) is the ray-trace image representing the attenuation mask at energy ɛi (calculated from equation 4.1). The hybrid
image is then formed by adding the primary image (P) to the interpolated scatter image generated with penEasy.
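The combination of flood, attenuation and scatter images can be sketched as below. The array names and the small constant test images are hypothetical; the sketch only assumes that the flood stack has already been weighted by the spectrum and that all images share the same grid and units.

```python
import numpy as np

def hybrid_image(flood_stack, attenuation_stack, scatter):
    """Energy-integrated primary image (equation 4.2) plus the interpolated scatter image."""
    primary = np.sum(flood_stack * attenuation_stack, axis=0)
    return primary + scatter

# Two hypothetical energy bins on a 4 x 4 grid
flood = np.stack([np.full((4, 4), 3.0), np.full((4, 4), 1.0)])    # weighted F(x, y, e_i)
atten = np.stack([np.full((4, 4), 0.5), np.full((4, 4), 0.25)])   # A(x, y, e_i)
scatter = np.full((4, 4), 0.4)
hybrid = hybrid_image(flood, atten, scatter)
```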

4.2.2 Adding real sharpness and noise characteristics to the hybrid images

The addition of real detector sharpness and noise characteristics to the simulated hybrid images was based on the
methods described by [131] and [66]. Image sharpness was characterized using the presampling MTF and detector
noise by the NPS.

4.2.2.1 Detector characterization: Signal transfer properties

Measurements of detector performance were carried out on the Carestream system. This device is equipped with a
43×35 cm2 CsI digital flat panel detector (FPD) with a pixel size of 139 µm. The response function of the detector
was measured for tube voltages of 60, 80 and 120 kVp, each with 10 cm and 20 cm PMMA placed at the tube
exit. The acquisitions were carried out for an SID of 140 cm, large focus and no antiscatter grid. The X-ray field size
was set to irradiate the entire X-ray detector plane.

A calibrated R100 Piranha solid-state detector was used to measure the air kerma at the entrance of the bucky.
Acquisitions were done for a range of tube current-time product (mAs) settings to cover target dose levels between 1/3.2
and 3.2 times 2.5 µGy [132]. The air kerma values measured with the dosimeter were corrected using the inverse square
law to give the air kerma at the detector entrance (DAK). Using the same mAs and kV settings, the FPD was set
in place and flood images were acquired in DICOM ‘For Processing’ format. The average PV and the variance were
then computed from a 10x10 mm2 region of interest (ROI) at the centre of the flood images. The response function of
the detector was obtained by plotting the measured average PV against DAK; a linear fit was applied, and the fit
coefficients recorded. The response function was used to linearize the data prior to assessment of sharpness and noise,
an important condition for linear system analysis.
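A minimal sketch of the response-function fit and the subsequent linearization is given below. The DAK levels and pixel values are invented for illustration and the helper name `linearize` is hypothetical; only the linear model PV = A + B*DAK is taken from the text.

```python
import numpy as np

dak = np.array([0.8, 1.4, 2.5, 4.4, 8.0])     # hypothetical DAK levels (µGy)
mean_pv = 150.0 * dak - 10.0                  # hypothetical mean PV in the central ROI

# Linear response function: PV = intercept + slope * DAK
slope, intercept = np.polyfit(dak, mean_pv, 1)

def linearize(image_pv):
    """Convert 'For Processing' pixel values to DAK using the fitted response."""
    return (np.asarray(image_pv, dtype=float) - intercept) / slope
```

After this step the pixel values are proportional to DAK, which is the condition required for the MTF and NPS analysis.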

4.2.2.2 Detector characterization: Modulation Transfer Function

The presampling Modulation Transfer Function (MTF) of the detector was measured using the edge method [15,133]
for each of the beam qualities studied. The test device consisted of an attenuating Tantalum square with dimensions
50x50 mm2 and thickness 1 mm, built to have straight, sharp edges. The edge was positioned perpendicular to the X-
ray beam at the centre of the receptor and was oriented with a slight twist with respect to the detector pixel matrix
(2° to 4°). In this way, a super-sampled edge spread function (ESF) can be constructed [133,134] for the horizontal
and vertical directions; some smoothing was applied to the ESF using a monotonic conditioning method [135]. The
images were acquired with an incident DAK of approximately 8 µGy to keep the noise level low. The presampling
MTF was calculated using software implemented in the IDL programming language [116].

4.2.2.3 Detector characterization: Noise Power Spectrum

The detector NPS was estimated for the beam qualities previously mentioned as a function of DAK. The NPS was
calculated from the homogeneous images acquired to measure the response function. A 1024×1024 pixels ROI at the
image centre was extracted. A 2D polynomial was fitted and subtracted from this ROI with the aim of suppressing
low frequency trends in the image that are not characteristic of detector performance [15,136]. Records of 128 x 128
pixels were then extracted with an overlap of 50% in the x and y directions. A Fast Fourier Transform (FFT) was
applied to each record and the squared modulus of the FFT result was added to the NPS ensemble. Finally, the
spectra contributing to the ensemble were averaged to form the final NPS. The 2D NPS for each DAK level was saved
as a text file of floating-point values for use in the polynomial noise separation step. In addition, a 1D NPS was estimated
by forming the radial average of the 2D data since the NPS was found to be isotropic. The on-axis NPS data (0°
and 90°) were excluded from the radial average, as these values are biased by background trends [15,137]. These
calculations were performed by the previously mentioned software [116].
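The NPS estimation steps described above can be sketched as follows. This is an illustrative reimplementation, not the IDL software [116]: the polynomial order used for detrending (second order here) and the normalization by pixel area over the number of pixels per record are assumptions, and the function names are hypothetical.

```python
import numpy as np

def detrend(roi):
    """Subtract a fitted 2nd-order 2D polynomial to suppress low-frequency trends."""
    rows, cols = roi.shape
    y, x = np.mgrid[:rows, :cols]
    x = x.ravel() / cols
    y = y.ravel() / rows
    basis = np.column_stack([np.ones(x.size), x, y, x * x, x * y, y * y])
    coeff, *_ = np.linalg.lstsq(basis, roi.ravel(), rcond=None)
    return roi - (basis @ coeff).reshape(roi.shape)

def nps_2d(roi, pixel_mm, rec=128):
    """Average |FFT|^2 over half-overlapping rec x rec records of the detrended ROI."""
    flat = detrend(np.asarray(roi, dtype=float))
    step = rec // 2                                   # 50% overlap in x and y
    spectra = []
    for r in range(0, flat.shape[0] - rec + 1, step):
        for c in range(0, flat.shape[1] - rec + 1, step):
            record = flat[r:r + rec, c:c + rec]
            spectra.append(np.abs(np.fft.fft2(record)) ** 2)
    # Assumed normalization: pixel area divided by the number of pixels per record
    return np.mean(spectra, axis=0) * pixel_mm ** 2 / rec ** 2

# Hypothetical linearized flood image: white noise with variance 4
rng = np.random.default_rng(1)
flood = 100.0 + 2.0 * rng.standard_normal((512, 512))
nps = nps_2d(flood, pixel_mm=0.139)
```

With this normalization, white noise of variance σ2 gives a flat NPS of approximately σ2 times the pixel area, which provides a quick sanity check.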

4.2.2.4 Noise separation: generation of 2D noise coefficients

The image application method described by [131] requires the separation of the noise power spectrum into the three
noise components: electronic, quantum and detector structured noise. To achieve this, a 2nd order polynomial in DAK was
fitted to the 2D NPS at each spatial frequency (u,v), with a weighting 1/DAK² applied to the residuals [138]. The
fitting coefficients in (4.3) correspond to the 2D noise coefficient arrays: electronic noise (e(u,v)), quantum noise
(q(u,v)) and structure noise (s(u,v)).

$$NPS(u, v) = e(u, v) + q(u, v) \cdot DAK + s(u, v) \cdot DAK^2 \tag{4.3}$$

The use of these noise coefficients to generate noise of the appropriate texture is described in Section 4.2.2.7.
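The per-frequency polynomial fit of equation 4.3 can be sketched as a single weighted least-squares problem solved for all frequencies at once. The interpretation of the weighting (1/DAK² applied to the squared residuals, i.e. each row scaled by 1/DAK) is an assumption, as is the function name; clipping negative coefficients to zero follows the treatment reported later for the fitted coefficients.

```python
import numpy as np

def separate_noise(nps_stack, dak):
    """Fit NPS(u,v) = e + q*DAK + s*DAK^2 at every frequency; returns (e, q, s)."""
    dak = np.asarray(dak, dtype=float)
    n, ny, nx = nps_stack.shape
    design = np.column_stack([np.ones_like(dak), dak, dak ** 2])
    w = 1.0 / dak                       # square root of the assumed 1/DAK^2 weight
    lhs = design * w[:, None]
    rhs = nps_stack.reshape(n, -1) * w[:, None]
    coeff, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    coeff = np.clip(coeff, 0.0, None)   # negative coefficients set to zero
    e, q, s = coeff.reshape(3, ny, nx)
    return e, q, s

# Synthetic check with known, hypothetical coefficients
dak_levels = [0.8, 2.5, 7.0, 20.0]
e0, q0, s0 = np.full((4, 4), 0.5), np.full((4, 4), 2.0), np.full((4, 4), 0.01)
stack = np.stack([e0 + q0 * d + s0 * d ** 2 for d in dak_levels])
e, q, s = separate_noise(stack, dak_levels)
```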

4.2.2.5 Scaling PV in hybrid image

Before applying real system characteristics, the first step was to establish the target detector air kerma (DAKt) for the
simulated hybrid image. For a standard simulation, this could be simply set to the desired DAK level to be simulated.
In the present study, the DAKt was calculated from the real images of the test object. Four 5×5 mm2 ROIs positioned
in the background at the centre of the image were used to measure the pixel value, which was converted to DAKt using
the response function. The same set of ROIs was used in the hybrid image to calculate a calibration coefficient, applied
to scale the pixel values in the hybrid image so that they match the DAKt. The result of these operations was a scaled
hybrid image at the relevant beam quality (equation 4.4).

$$H_{DAK_t}(x, y) = H(x, y) \cdot \frac{DAK_t}{DAK_S} \tag{4.4}$$

where HDAKt(x,y) is the scaled hybrid image at a given beam quality, H(x,y) is the hybrid image, DAKt is the target
DAK given by the real image and DAKS is the DAK in the simulated hybrid image.
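Equation 4.4 amounts to a single multiplicative calibration, sketched below. The ROI positions, the image values and the function name are hypothetical; the 36-pixel ROI width assumes 5 mm at a 139 µm pixel pitch.

```python
import numpy as np

def scale_to_target(hybrid, dak_target, roi_slices):
    """Scale the hybrid image so that its background ROIs average to the target DAK."""
    dak_sim = np.mean([hybrid[s].mean() for s in roi_slices])
    return hybrid * dak_target / dak_sim

hybrid = np.full((100, 100), 4.0e-3)                 # hypothetical hybrid image
rois = [(slice(10, 46), slice(10, 46)), (slice(10, 46), slice(54, 90)),
        (slice(54, 90), slice(10, 46)), (slice(54, 90), slice(54, 90))]
scaled = scale_to_target(hybrid, dak_target=2.5, roi_slices=rois)
```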

4.2.2.6 Sharpness modification routine: adding detector MTF to the simulated images

The sharpness modification stage requires a 2D version of the detector MTF and this was constructed by fitting the
two measured 1D presampling MTFs (for horizontal and vertical directions across the detector) with 9th order
polynomial curves. A 2D MTF was computed using weighting matrices [139]. The FFT of the hybrid image was
calculated and multiplied by the 2D MTF followed by inverse FFT transformation back to the spatial domain [37].
This gave a noise free and blurred image (equation 4.5).

$$I_{blur} = \mathcal{F}^{-1}\left[\mathcal{F}\left(H_{DAK_t}(x, y)\right) \cdot MTF(x, y)\right] \tag{4.5}$$

where Iblur is the blurred hybrid image and MTF(x,y) is the 2D MTF formed from the horizontal and vertical MTF
measured in the Carestream system.
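The frequency-domain filtering of equation 4.5 can be sketched as follows. For simplicity the 2D MTF is built here as a radially symmetric surface from a single 1D curve, which replaces the weighting-matrix construction of [139]; the Gaussian MTF shape is a placeholder and not the measured Carestream MTF, and both function names are hypothetical.

```python
import numpy as np

def radial_mtf(shape, pixel_mm, mtf_1d):
    """Evaluate a 1D MTF curve on the radial frequency of the FFT grid (DC at [0, 0])."""
    fy = np.fft.fftfreq(shape[0], d=pixel_mm)
    fx = np.fft.fftfreq(shape[1], d=pixel_mm)
    f = np.hypot(*np.meshgrid(fy, fx, indexing="ij"))
    return mtf_1d(f)

def apply_mtf(image, mtf_2d):
    """Blur an image by multiplying its FFT with the 2D MTF (equation 4.5)."""
    return np.real(np.fft.ifft2(np.fft.fft2(image) * mtf_2d))

# Hypothetical sharp edge and placeholder Gaussian MTF
edge = np.zeros((64, 64))
edge[:, 32:] = 1.0
mtf = radial_mtf(edge.shape, 0.139, lambda f: np.exp(-(f / 3.0) ** 2))
blurred = apply_mtf(edge, mtf)
```

Because the MTF equals 1 at zero frequency, the mean signal level is preserved while the edge transition is smoothed. Note that FFT filtering is periodic, so in practice the image borders may need padding.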

4.2.2.7 Noise modification routine: creation of noise image

In order to generate realistic noise images at the DAKt we followed the work of [131]. The total noise image was
generated from three separately weighted noise images, representing the detector quantum (Iq(x,y)), electronic
(Ie(x,y)) and structured noise (Is(x,y)). Three Gaussian white noise images with zero mean (µ) and unit variance (σ2)
were produced. In the spatial frequency domain, they represent images of constant magnitude and random phase. For
a given noise image, the FFT of the white noise image was multiplied by the square root of the corresponding 2D noise
coefficients and the result was transformed via the inverse FFT back to the spatial domain (equations 4.6 a-c).

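$$I_e(x, y) = \mathcal{F}_2^{-1}\left[\mathcal{F}_2(N_{g1}) \cdot \sqrt{NPS_e}\right] \tag{4.6a}$$

$$I_q(x, y) = \mathcal{F}_2^{-1}\left[\mathcal{F}_2(N_{g2}) \cdot \sqrt{NPS_q}\right] \tag{4.6b}$$

$$I_s(x, y) = \mathcal{F}_2^{-1}\left[\mathcal{F}_2(N_{g3}) \cdot \sqrt{NPS_s}\right] \tag{4.6c}$$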

where NPSe, NPSq and NPSs are the 2D noise coefficients corresponding to the electronic, quantum and structure
noise sources. Ng1-3 are Gaussian random noise images, Ie(x,y), Iq(x,y) and Is (x,y) are the images corresponding to the
electronic, quantum and structured noise coefficients.

With the assumption that electronic noise is an additive noise source, quantum noise scales with the square root of the
DAK and the structure noise scales with the DAK, the total noise image (Inoise(x,y)) was formed. Ie was added as a
constant, while Iq and Is were scaled on a pixel by pixel basis by the square root of the hybrid image and by the hybrid
image, respectively, both normalized to DAK (HDAKt(x,y)) (equation 4.7).

$$I_{noise}(x, y) = I_e(x, y) + I_q(x, y) \cdot \sqrt{H_{DAK_t}(x, y)} + I_s(x, y) \cdot H_{DAK_t}(x, y) \tag{4.7}$$

The blurred and the total noise images were then summed to form the synthetic image (equation 4.8). Itotal was then
converted from DAK to system pixel value using the relevant response function.

$$I_{total}(x, y) = I_{noise}(x, y) + I_{blur}(x, y) \tag{4.8}$$
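Equations 4.6 to 4.8 can be sketched as below. The sketch assumes that the coefficient arrays are stored in FFT ordering (zero frequency at index [0, 0]) and that the NPS was normalized by pixel area over the number of pixels, which is why the coefficients are divided by the pixel area before filtering; other NPS conventions would need a different scale factor. Function names and the flat coefficient values are hypothetical.

```python
import numpy as np

def coloured_noise(coeff_2d, pixel_mm, rng):
    """Filter unit-variance white noise so that its power spectrum follows coeff_2d."""
    white = rng.standard_normal(coeff_2d.shape)
    shaped = np.fft.fft2(white) * np.sqrt(coeff_2d / pixel_mm ** 2)
    return np.real(np.fft.ifft2(shaped))

def add_detector_noise(hybrid_dak, blurred, e, q, s, pixel_mm, rng):
    """Build the total noise image (equation 4.7) and add it to the blurred image (4.8)."""
    i_e = coloured_noise(e, pixel_mm, rng)
    i_q = coloured_noise(q, pixel_mm, rng)
    i_s = coloured_noise(s, pixel_mm, rng)
    noise = i_e + i_q * np.sqrt(hybrid_dak) + i_s * hybrid_dak
    return blurred + noise

# Uniform hypothetical image at DAK = 2.5 with flat (white) noise coefficients
rng = np.random.default_rng(2)
shape, pixel_mm, dak_level = (256, 256), 0.139, 2.5
e = np.full(shape, 1.0e-3)
q = np.full(shape, 2.0e-3)
s = np.full(shape, 1.0e-4)
uniform = np.full(shape, dak_level)
total = add_detector_noise(uniform, uniform, e, q, s, pixel_mm, rng)
```

For flat coefficients the variance of the added noise should approach (e + q*DAK + s*DAK²) divided by the pixel area, mirroring equation 4.3.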

4.2.3 Validation

4.2.3.1 Image validation dataset: measurements of Signal Difference to Noise Ratio

Signal difference to noise ratio (SDNR), measured from both real and simulated images of a simple test object, was
used to validate the imaging chain simulation. This parameter allows validation of the large area contrast, sharpness
and noise in the simulated images. The test object was composed of PMMA blocks with dimensions 26x30 cm2 and
two thicknesses: 10 and 20 cm, plus a 2 mm thick Al square of dimension 1x1 cm2. The Al was placed on top of 10 cm
of PMMA; additional PMMA slabs were added on top of the Al square when needed. The object was imaged at an SID
of 140 cm in the Carestream DRX Evolution system using tube voltages of 60, 80 and 120 kVp. Images were acquired
with and without the antiscatter grid. For each tube voltage, three images were acquired, one with the mAs delivered
by the automatic exposure control (AEC) (mAsAEC) and two additional images with mAsAEC * 1.5 and mAsAEC / 1.5.
The X-ray beam was collimated to cover just the PMMA area (see Figure 4.1). To generate the simulated images of
the object the methodology previously described was implemented for the same acquisition settings as in the real
images.

SDNR was computed from five square regions of interest of 5 x 5 mm2, one positioned within the detail and the
remaining four within the background region, 10 mm from the target detail (see Figure 4.1). SDNR was calculated
using equation 4.9:

$$SDNR = \frac{PV_{obj} - PV_{bkg}}{SD_{bkg}} \tag{4.9}$$

where PVobj is the mean pixel value for the object (i.e., Al square) and PVbkg and SDbkg are the average PV and the
average standard deviation from the four ROIs drawn in the background (PMMA) respectively.
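The SDNR measurement of equation 4.9 can be sketched as follows. The ROI width of 36 pixels approximates 5 mm at a 139 µm pixel pitch; the test image, the ROI positions and the function names are hypothetical. With this sign convention an attenuating detail in a DAK-proportional image gives a negative value.

```python
import numpy as np

def roi(img, centre, size):
    """Square ROI of size x size pixels around the given (row, column) centre."""
    r0, c0 = centre[0] - size // 2, centre[1] - size // 2
    return img[r0:r0 + size, c0:c0 + size]

def sdnr(img, obj_centre, bkg_centres, size=36):
    """Signal difference to noise ratio (equation 4.9) from one object and several background ROIs."""
    pv_obj = roi(img, obj_centre, size).mean()
    bkg = [roi(img, c, size) for c in bkg_centres]
    pv_bkg = np.mean([b.mean() for b in bkg])
    sd_bkg = np.mean([b.std(ddof=1) for b in bkg])
    return (pv_obj - pv_bkg) / sd_bkg

# Hypothetical image: background 100 with noise SD 2, detail 10 units lower
rng = np.random.default_rng(3)
img = 100.0 + 2.0 * rng.standard_normal((400, 400))
img[164:236, 164:236] -= 10.0                      # 72 x 72 pixel detail (about 1x1 cm2)
centres = [(200, 92), (200, 308), (92, 200), (308, 200)]
value = sdnr(img, (200, 200), centres)
```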

Figure 4.1: ROIs used to measure SDNR in the real and simulated images. The squared ROI at the centre was placed on the Al
and the remaining four in the PMMA.

The uncertainty for the SDNR values measured from the real images was estimated by repeating a set of measurements
five times. The image set corresponded to 120 kVp and 20 cm PMMA. After each acquisition, the experimental setup
was dismantled and set up again, which also allowed the influence of the geometric setup on the measurement
variability to be evaluated. The standard deviation and standard error for grid in and grid out measurements
were calculated and reported.

4.2.3.2 Validation sharpness modification routine

In addition to SDNR measurements, the noise and sharpness modification routines were also validated.
To validate the sharpness modification routine, a blur free, high resolution edge image is needed. This was obtained
from a real edge image acquired on a Siemens mammography system with 0.085 mm pixel size. By dividing by the
measured MTF of the Siemens system in the frequency domain, a blur free image was obtained. The sharpness
characteristics measured from the Carestream system (Sections 4.2.2.2 and 4.2.2.6) were then applied by multiplying
by the Carestream MTF, followed by an inverse Fourier transform (equation 4.10).

$$I_{blurred\_edge} = \mathcal{F}^{-1}\left(\mathcal{F}(I_{edge}) \cdot \frac{MTF_S}{MTF_O}\right) \tag{4.10}$$

where Iblurred_edge is the edge image blurred with the MTF of the Carestream system and Iedge is the edge image from the
mammography system. MTFO is the 2D MTF of the original system (i.e., mammography system) and MTFS is the 2D
MTF of the system that we want to mimic (i.e., Carestream system), so that the original blur is divided out and the
target blur is applied. The MTF from Iblurred_edge was measured using
previously mentioned software [116] and compared to the MTF measured from the real images acquired in the
Carestream system.

4.2.3.3 Validation noise modification routine

To validate the noise modification routine, uniform images of dimensions 2560 x 3072 pixels and pixel spacing of
0.139 mm were generated. The PV assigned to the images corresponded to the target DAK given by the real flat field
images from Section 4.2.2.1 and 4.2.2.3. The noise modification routine was applied to the uniform images (Section
4.2.2.7) and then the NNPS was measured [116] and compared to the real flat field images acquired for the NNPS
calculation (Section 4.2.2.3). Validation was performed for 60, 80 and 120 kVp with target DAK values ranging from
0.7 µGy to 30 µGy approximately.

The method used for the NPS simulation had been introduced for mammography beam qualities [131]. This raises the
question whether the same method can be used in the chest where organs with different X-ray attenuation properties
are present. Consequently, we investigated whether the NPS measured in PMMA with attenuation equivalent to the
lung region can predict the magnitude and texture of the noise in low signal sections of the same image, where the
electronic noise represents a larger fraction of the total noise. In the case of the chest, this could be in the rib region,
where there can be strong changes in beam quality compared to the lung region.

To assess the accuracy of the noise simulation, images of lung and rib equivalent materials were acquired. For the
lungs 10 cm PMMA was used [52] and for the ribs a strip of Al 6 mm thick [140] was placed on top of the PMMA
blocks. Real images of the test object were acquired at 60 and 120 kVp, three dose levels, 140 cm SID and no
antiscatter grid. The experimental setup was reproduced in the computer simulations to generate the synthetic images.
In the simulation, noise was added to the entire image using the NNPS measured from homogenous PMMA acquired
as explained in Section 4.2.2.3. NNPS was then measured in the Al region of the real and simulated images as
described in Section 4.2.2.3.

4.2.3.4 Validation of antiscatter grid simulation

As a means of validating the grid simulation, primary transmission (Tp) and total transmission (Tt) were measured
experimentally for the Mitaya grid. The measurements were made for a range of PMMA thicknesses (10, 13, 16 and
20 cm), 120 kVp and 140 cm SID. Tp was measured with the PMMA plates placed at the tube exit. Two circular lead
collimators of different radii (5.2 mm, 3.3 mm) were aligned, one at the tube exit port (before the PMMA) and the
other at the exit side of the PMMA. Beam collimation was reduced to the size of the largest pinhole diameter. The
narrow beam geometry and the air gap between the PMMA and the detector ensured that minimal scattered radiation
reached the X-ray detector. The images were acquired with and without the grid and exported as DICOM ‘For Processing’. Following
image linearization, mean PV for grid in (PV+) and grid out images (PV-) were measured using a circular ROI of
2 mm diameter placed at the centre of the pinhole collimator.

To estimate Tt, the PMMA blocks were placed at a patient equivalent position and for each PMMA configuration
images were acquired with and without the grid. The same ROI was used to measure PV+ and PV- from linearized
‘For Processing’ images.

Images of the PMMA blocks with thicknesses 10, 13, 16 and 20 cm and 120 kVp were simulated using penEasy. The
geometry setup modelled corresponded to that of the Tt measurements, i.e., no pinhole collimators and PMMA placed
at a patient equivalent position. This allowed measurement of both Tt and Tp because the penEasy software can
classify the tallied photons as primary or scatter. For the Tp calculation, images were formed using un-scattered
photons only, while for the Tt the total image (primary + scatter) was employed. PV+ and PV- were then measured
with a circular ROI of 2 mm placed at the centre of the simulated images, as for the real images.

Tp and Tt were calculated using equations 4.11 and 4.12 respectively, where PVp+ and PVp− are the mean PV values
measured in the primary images with grid in and grid out, respectively, and PVt+ and PVt− are the mean PV values
measured in the total images with grid in and grid out, respectively. The values obtained for real and simulated images
were compared.

$$T_p = \frac{PV_p^{+}}{PV_p^{-}} \tag{4.11}$$

$$T_t = \frac{PV_t^{+}}{PV_t^{-}} \tag{4.12}$$

4.3 Results

4.3.1 Detector characterization

Figure 4.2 displays two examples of the response function of the detector for 60 and 120 kVp and 10 cm PMMA. The
linear model fitted the function well (R2 > 0.99). The fitting coefficients obtained for each beam quality are shown in
Table 4.1. As can be seen, the curve gradient increases with increasing tube voltage and PMMA thickness. A small
offset (intercept) is also observed in all cases.

Figure 4.2: Relationship between average PV and DAK for 60 and 120 kVp and 10 cm PMMA. Linear fitting equation and R2
values of the data are also shown.

Table 4.1: Linear fitting coefficients of the response function for six beam qualities used in the study, PV = A + B*DAK.

Coeff.   10 cm, 60 kVp   10 cm, 80 kVp   10 cm, 120 kVp   20 cm, 60 kVp   20 cm, 80 kVp   20 cm, 120 kVp
A        -11.64          -6.50           -18.48           -4.40           -14.52          -6.70
B        134.22          164.88          211.73           601.52          827.30          927.42
R2       0.999           0.999           1.000            1.000           1.000           0.999

Presampling MTF was found to be isotropic with regard to the two directions across the detector, and therefore the
two curves were averaged. Figure 4.3 shows average MTF curves for beam qualities of 60, 80 and 120 kVp, and 10 cm
PMMA. As expected there is no difference in the MTF shape with beam quality, consistent with results for a different
CsI/a-Si detector [141].

The NNPS curves measured for 10 cm PMMA and tube voltages of 60, 80 and 120 kVp are shown in Figure 4.4.
Three dose levels of approximately 0.78, 2.50 and 7.00 µGy are shown. As illustrated, the different beam qualities do
not change the shape of the NNPS, a consequence of the shape of the MTF curve being independent of beam energy.
There is a clear difference in the magnitude of the NNPS as DAK is changed. Additionally, a small change in NNPS
shape with DAK level is seen, especially at low DAK values (~0.7 µGy/image). This is an indication that the dominant
noise source at low exposures (i.e. electronic noise) has a different power spectrum compared to the NNPS at higher
DAK levels [142].

Figure 4.3: Presampling MTF measured at 60, 80 and 120 kVp and 10 cm PMMA.

Figure 4.4: NNPS measured at 60, 80 and 120 kVp and 10 cm PMMA. For each beam quality the NNPS measured at three dose
levels is shown.

Following the procedure outlined in Section 4.2.2.4, the 1D NNPS was separated into electronic, quantum and structured
noise components. Polynomial fitting coefficients with negative values were set to zero. The electronic component
(Figure 4.5a) is mostly flat in shape at spatial frequencies above 2 mm-1 and thus consistent with a white noise source
that is not subject to filtering by the MTF [143]. The quantum component (Figure 4.5b) is filtered by the MTF of the
detector [144] [145] and decreases with spatial frequency, consistent with correlation of the quantum noise by the CsI
phosphor. Furthermore, the shape of the quantum coefficient curves does not change as energy is changed. The
structured noise (Figure 4.5c) shows little variation with beam quality, consistent with the study made at
mammography energies [131]. The large ramp in the curves at low spatial frequencies (<0.5 mm-1) is generated by
slowly changing structures in the image.

Figure 4.5: Electronic (a), quantum (b) and structure (c) noise coefficients for 60, 80 and 120 kVp and 10 cm PMMA.

4.3.2 Creation of synthetic radiographic images including real detector characteristics

Figure 4.6 shows example images generated using the methods described in sections 4.2.1 and 4.2.2 for the simulation
of the imaging chain, and the real image acquired in the Carestream system. Figure 4.6a shows the scatter and primary
images obtained following the methodology in section 4.2.1.1 and 4.2.1.2. The images were then combined to form
an ideal hybrid image of the test object. In this case, the images were created at 120 kVp and 10 cm PMMA. Figure
4.6b shows the blurred image, a result of filtering the blur-free hybrid image with the 2D MTF (section 4.2.2.6). Figure
4.6c shows the images corresponding to the different noise sources combined using equation 4.7 to form the total
noise image (section 4.2.2.7). Finally, the blurred and total noise image are added to form the final synthetic image,
which is then converted from DAK to PV using the corresponding response function (Figure 4.6d).

4.3.3 Validation: SDNR measurements

Figure 4.7 plots the SDNR for real and simulated images, for grid out (a-c) and grid in (d-f). As expected, the
SDNR decreases as PMMA thickness increases from 10 to 20 cm and as tube voltage increases. The SDNR for a given beam
quality increases when mAs (dose) is increased. The relative differences between the SDNR in the real and simulated
images are also plotted in the graphs. For grid in, the average difference is 6.5% with values ranging from 0.5 to 19%.
For grid out the average difference is 5.1%, slightly lower than for grid in, with values ranging from 1.6% to 12%.
Higher relative differences are observed for 60 kVp in both cases. The standard deviation (std) and standard error (SE)
were calculated from the repeatability measurements. The standard deviation was 0.127 and 0.061 for grid in and grid
out measurements, respectively. The standard error was 3.3% and 1.6% for grid in and grid out measurements,
respectively.

Figure 4.6: Illustration of the various steps carried out to form the final synthetic image of the PMMA blocks with an Al detail
placed at the centre. (a) shows the primary image and the scatter image used to form the blur and noise-free energy integrated
hybrid image. (b) shows the blurred image after filtering of the hybrid image with the 2D MTF. (c) shows the total noise image
formed using the different images corresponding to the three noise sources: electronic, quantum and structure. (d) shows the final
synthetic image converted from DAK to PV using the response function. (e) shows the real image of the test object acquired in the
Carestream system.

Figure 4.7: SDNR values measured in simulated and real images of an Al detail and PMMA blocks 10 and 20 cm thick.
Comparison of images acquired with grid out (a, b and c) and grid in (d, e and f) for 60, 80 and 120 kVp respectively. The relative
differences (%) for each setting are also shown on the secondary vertical axis.

4.3.4 Validation: sharpness modification routine

Figure 4.8 shows the comparison between the measured and simulated MTF. Relative deviations below 3% were
obtained for frequencies up to 4 mm-1 (below the Nyquist frequency, i.e., 3.59 mm-1).

Figure 4.8: Comparison of measured and simulated MTF for a synthetic edge image.

4.3.5 Validation: noise modification routine

Figure 4.9 shows the NNPS curves measured in the simulated images (dashed line) and in the real images (solid
line). The NNPS curves shown are for 60 (a), 80 (b) and 120 kVp (c) and four air kerma levels, from 0.7 to 30 µGy
approximately. As observed, there is a good correspondence between the measured NNPS in the real and the simulated
images. For frequencies between 1 mm-1 and the Nyquist frequency (3.59 mm-1), relative differences remain below
12%. Larger differences were observed for the lower spatial frequencies (<0.5 mm-1) with maximum deviations of
18%. On average, the differences found for 60, 80 and 120 kVp were 3.6%, 4.8% and 4.3%, respectively.

Figure 4.9: NNPS measured in real and simulated images for 60, 80 and 120 kVp at different dose levels.

4.3.5.1 Noise simulation in rib equivalent materials

The resulting NNPS measured in the 6 mm thick Al (~rib equivalent) for real and simulated images are shown in
Figure 4.10. As can be seen the NNPS for the real and simulated images showed good agreement, suggesting that
changes in beam quality going from lung to bone regions do not significantly affect the shape and magnitude of the
NNPS. Maximum relative differences between the NNPS in real and simulated images were 20% and 10% for 60 and
120 kVp respectively, between 1 mm-1 and the Nyquist frequency. The average difference for both tube voltages
(60 and 120 kVp) was 10%. These are similar to the differences seen for the NNPS in Figure 4.9.

Figure 4.10: NNPS measured in rib equivalent material (Aluminium) in real and simulated images for 60 (a) and 120 kVp (b) at
different dose levels. The noise modification was carried out using the 2D noise coefficients measured on the lung equivalent
material (PMMA).

4.3.6 Validation: grid simulation

Figure 4.11 shows the Tp and Tt values obtained for 120 kVp and the range of PMMA thicknesses (10, 13, 16 and
20 cm), indicating close agreement between real and simulated values. For both cases, Tt values decrease as PMMA
thickness increases, expected due to the increase of scattered radiation. For the measurements with (real) grid, Tt
values ranged from 0.30 to 0.37 and for the simulated grid from 0.32 to 0.41. On the other hand, Tp values are roughly
constant as PMMA thickness changes, with values between 0.67 and 0.68 for the real grid and from 0.71 to 0.72 for the
simulated grid. Relative differences between real and simulated images varied between 5% and 6% for Tp and 9% to
13% for Tt. As illustrated, Tp and Tt values for the simulated images were consistently higher compared to the real
images.

Figure 4.11: Primary (Tp) and total (Tt) transmission measured in the Mitaya antiscatter grid and the simulated model of the same
grid.

As with the SDNR values, reproducibility was evaluated by five consecutive measurements of Tp and Tt for 20 cm,
in between which the experimental setup was dismantled. For the Tt values, a standard deviation of 0.004 and standard
error of 0.2% was found, while for Tp the standard deviation was 0.04 and standard error 1.8%.

The values calculated are in line with the results reported by Chan and Doi [146] and Mizuta et al. [147] for similar
beam qualities and antiscatter grids with similar properties.

4.4 Discussion

The simulation platform has two main steps. First, a perfect hybrid image is created by combining a ray tracing
algorithm with Monte Carlo (MC) simulations in penEasy. The MC simulations are used to generate low resolution
scatter images. Since the image formed by scattered radiation does not contain important anatomical information, a
pixel size of 5 x 5 cm2 was considered sufficient to characterize the scatter field [37]. The primary images combine
the attenuation mask of the object, calculated analytically using a ray tracing algorithm that applies the
Beer-Lambert law, with flood images representing the initial photon energy, calculated in penEasy with no object in
the beam. This step yields primary and scatter images in the same units, so that they can be summed to obtain the
so-called hybrid image. These images are generated for a perfect energy integrating detector. It is worth noting that
noise-free images are essential before the second step, in which realistic levels of sharpness and noise are added,
characterized using the MTF and NNPS measured from a real detector. In this way we can ensure that the blurring and
noise present in the final images correspond to the MTF and NNPS measured in the real detector. Although a full MC
approach is possible, it is likely to be too slow computationally to obtain completely noise free images; the hybrid
approach, in which MC is combined with ray tracing, was therefore chosen [19]. A limitation of this approach is the
need for additional simulations to be able to combine the images. Furthermore, ray tracing methods tend to produce
very sharp images which do not look entirely realistic. This could be overcome by a supersampling technique in which
several rays arrive at the same detector pixel, or by generating rays from a sort of pencil beam to simulate the
geometric unsharpness produced by the focal spot [36].
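
The combination of primary and scatter images described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names, the nearest-neighbour upsampling of the low resolution scatter image and all numerical values are hypothetical.

```python
import math

def primary_image(flood, mu_t):
    """Scale a flood image by the Beer-Lambert attenuation exp(-sum(mu * t)).

    flood: 2D list, energy per pixel with no object in the beam.
    mu_t:  2D list, line integral of attenuation along each ray (from ray tracing).
    """
    return [[f * math.exp(-a) for f, a in zip(frow, arow)]
            for frow, arow in zip(flood, mu_t)]

def upsample(scatter, factor):
    """Nearest-neighbour upsampling of a low resolution scatter image (assumed scheme)."""
    out = []
    for row in scatter:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out

def hybrid_image(flood, mu_t, scatter_lowres, factor):
    """Sum primary and scatter images that are expressed in the same units."""
    prim = primary_image(flood, mu_t)
    scat = upsample(scatter_lowres, factor)
    return [[p + s for p, s in zip(prow, srow)]
            for prow, srow in zip(prim, scat)]

# Toy 4 x 4 detector: the right half of the field is behind an attenuating object.
flood = [[100.0] * 4 for _ in range(4)]
mu_t = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
scatter_lowres = [[5.0, 7.0], [5.0, 7.0]]
hybrid = hybrid_image(flood, mu_t, scatter_lowres, factor=2)
```

In practice a smoother interpolation of the scatter field would probably be preferred over nearest-neighbour upsampling; it is used here only to keep the sketch short.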

Validation was first performed by comparing SDNR values measured in real and simulated images of a simple test
object (Figure 4.7). Large area contrast and noise were also verified for the grid in and grid out cases. Additional
validations were performed for the MTF and NNPS modification (Figures 4.8 and 4.9). In both cases good agreement was
found between the quantities measured in real and simulated images. For the NNPS, the validation step ensured that
not only the correct standard deviation was imparted to the image but also the correct noise texture. This validation
covers all the elements of our simulated imaging chain; steps such as image processing and display are not
included.
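
As an illustration of the comparison between real and simulated images, the sketch below computes an SDNR and a relative difference. It assumes the common definition of SDNR as the absolute difference between the mean pixel values of the background and the detail, divided by the standard deviation of the background; the ROI values are invented for the example.

```python
from statistics import mean, pstdev

def sdnr(detail_roi, background_roi):
    """Signal difference to noise ratio (assumed definition):
    |mean(background) - mean(detail)| / std(background)."""
    return abs(mean(background_roi) - mean(detail_roi)) / pstdev(background_roi)

def relative_difference(simulated, measured):
    """Deviation of a simulated value from the measured one, in percent."""
    return 100.0 * (simulated - measured) / measured

# Hypothetical pixel values in a PMMA background ROI and under the Al detail.
background = [100.0, 102.0, 98.0, 101.0, 99.0]
detail = [90.0, 91.0, 89.0, 90.0, 90.0]
value = sdnr(detail, background)

# Hypothetical pair of simulated and measured SDNR values.
deviation = relative_difference(10.5, 10.0)
```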

An important aspect of the validation concerned the antiscatter grid, which is often cumbersome to model because the
complexity of its geometry can make the simulations very time consuming. The data showed that a Mitaya focused grid
could be simulated with reasonably good accuracy. The validation was carried out by measuring and comparing primary
and total transmission for the simulated and real grids (Figure 4.10). The simulated grid showed higher Tp and Tt
values compared to the real grid. This could be attributed to small differences in the geometry, for example the glue
used in the manufacturing of real grids, which is not present in the simulated grid. There is also potential for
misalignment of the grid during the measurement of Tp and Tt on the actual imaging system.

The simulations were run on a desktop computer with an Intel® Core™ i5-4590 processor with 4 cores at 3.30 GHz. The
ray tracing algorithm runs quickly, in this case in less than 10 minutes, although this depends on the complexity and
resolution of the object being imaged. The Monte Carlo simulations of the scatter images were the more time-consuming
step of the process, even with pixel sizes of 5x5 mm2, in particular for the images including the antiscatter grid.
For the scatter images, the simulations were split into parallel processes using different random seeds and the
results were then combined. Using four cores, the simulations for the grid in configuration took approximately one
week to achieve a combined uncertainty below 3% for the scatter images. This computational time could be reduced to
hours if supercomputing were used; nevertheless, the result shows that the simulations are achievable using a typical
PC from a research lab. The time required for this step will also vary with the type of object being imaged and the
computing power available. Applying the sharpness and noise characteristics to the images is quick, taking only a few
seconds per image.
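
The combination of parallel runs with different random seeds can be sketched as below. This is a minimal example that assumes every run simulates the same number of histories, so that the runs can be averaged with equal weight; the function name and the numerical values are hypothetical and do not come from penEasy.

```python
import math

def combine_runs(estimates, sigmas):
    """Combine independent Monte Carlo runs of equal size.

    estimates: per-run mean value of one scatter pixel.
    sigmas:    per-run standard uncertainty of that mean.
    Returns the combined mean and its standard uncertainty.
    """
    n = len(estimates)
    combined = sum(estimates) / n
    # Independent runs: variances add, then divide by n for the mean.
    sigma = math.sqrt(sum(s * s for s in sigmas)) / n
    return combined, sigma

# Four runs (one per core), each with a 6% uncertainty on a pixel value near 10.
estimates = [10.2, 9.8, 10.1, 9.9]
sigmas = [0.6, 0.6, 0.6, 0.6]
pixel, pixel_sigma = combine_runs(estimates, sigmas)
relative_uncertainty = 100.0 * pixel_sigma / pixel
```

With equal runs the uncertainty falls as 1/√N, so four runs halve the per-run uncertainty in this example.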

4.4.1 Limitations and comparison to other simulation frameworks.

The motivation behind this study was to create a simulation framework to be used in optimization studies that model
clinical tasks. While this is likely to involve complex images, validation steps require simple, well controlled input,
such as the homogeneous PMMA backgrounds and Al inserts used in the study here. These test objects have been
used in optimization studies as surrogates of more realistic anatomy by [148]. This simple object was considered
appropriate to assess the capacity of the framework to reproduce SDNR and scattered radiation, as a technical image
quality metric. The study by Elangovan et al. [127] used contrast to noise ratio (CNR), measured for an Al detail in
a PMMA background. Deviation between measured and simulated images in our study was slightly higher than that
found by Elangovan et al., where the maximum deviation for CNR was 9%. One reason for this may be the far larger energy
range and PMMA thickness range considered in this study. An alternative validation method is to simulate a threshold
contrast-detail test object, as detail detectability is governed by the system sharpness, noise and scattered radiation
present in the image. This approach has been adopted by Smans et al. [37], using the CDRAD test object for computed
radiography imaging plates. Both Shaheen et al. [149] and Mackenzie et al. [131] used gold discs in the CDMAM
mammography contrast-detail test object as a validation step.

Although realistic levels of sharpness and noise were added to the images, the modelling of the CsI detector was
simplified by making it a perfect absorbent material. Badano and Sempau [150] describe the simulation of X-ray
photon interactions and the subsequent light photon cascades generated within the CsI scintillation layer. These
simulations can provide insight into the design and development of X-ray detectors [151], but detailed information on
detector parameters is required. The method outlined here allows the rapid simulation of an X-ray detector that already
exists and has been characterized using standard methods. Moreover, this model can go beyond existing detectors, by
testing theoretical MTFs. Additionally, the impact of NNPS and MTF can be studied separately.

The availability of raw data is crucial to set up the noise and sharpness modification process. Thus, the system should
allow the extraction of ‘For Processing’ images with linearized pixel values (PV) for the measurement of the MTF and
NNPS. This is sometimes difficult to achieve in older digital systems. If the application is a detectability study with
radiologists as observers, clinical image processing must be applied to the images. This can be identified as a
limitation of the simulation platform. In the next stage of the study, anatomically realistic phantoms will be used as
input in the simulations to obtain more realistic images; in this case, clinical processing will be provided by
manufacturers under collaboration agreements.

An approach to simulating images for the study of chest radiography systems was presented by Moore et al. [36]. In that
platform, patient CT images form the input. Patient CT images have an advantage in terms of anatomical variation
compared to computational phantoms; however, the latter can provide more flexibility in terms of resolution and
addition of clinical tasks. The main difference between the simulation from Moore et al. and this work concerns the
scatter images. In the work of Moore et al. [36], the scatter images are measured experimentally using an
anthropomorphic physical phantom and then added to the simulated projection. This could potentially limit the range of
patient body mass indexes (BMIs) studied, as the scatter fraction corresponds to the physical phantom, which in turn
depends on the BMI used as reference. On the other hand, scattered radiation from any imaging object can be
characterized in detail using the Monte Carlo simulation implemented in the current work. Regarding the noise addition
step, in a more recent study the simulation method described by Moore et al. [152] was updated and the method
introduced by [153] for the NPS simulation was implemented. Characterization of the detector for the measurement of the
NNPS was performed using copper filtration, rather than the PMMA described in this work.

In terms of overall methodology, the framework presented in this work is similar to those from [37] for neonatal
imaging and from [38] for cone beam CT. However, neither of these frameworks includes the antiscatter grid simulation.
Another difference is that these studies use the total NPS for noise simulation, while here the NPS is split into
electronic, quantum and structure noise components. This can become important for low dose simulations, since correct
modelling of the electronic noise is crucial at lower dose acquisitions and for regions of high attenuation, where
the ratio of electronic noise to quantum noise can be relatively high.
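
To illustrate why the split matters at low dose, the sketch below uses a simple polynomial variance model, var(D) = e + q·D + s·D², in which the three terms stand for electronic, quantum and structure noise. The model form is a commonly used simplification and is an assumption here, as are the coefficients; it is not claimed to be the decomposition applied to the NNPS in this work.

```python
def noise_components(dose, electronic, quantum, structure):
    """Variance split under an assumed polynomial model:
    var(D) = electronic + quantum * D + structure * D**2."""
    parts = {
        "electronic": electronic,
        "quantum": quantum * dose,
        "structure": structure * dose ** 2,
    }
    parts["total"] = sum(parts.values())
    return parts

def electronic_to_quantum(dose, electronic, quantum):
    """Ratio of electronic to quantum variance at a given detector dose."""
    return electronic / (quantum * dose)

# Hypothetical coefficients (arbitrary units) and two detector dose levels.
coeffs = dict(electronic=4.0, quantum=10.0, structure=0.5)
high = noise_components(8.0, **coeffs)
low = noise_components(0.5, **coeffs)
ratio_high = electronic_to_quantum(8.0, coeffs["electronic"], coeffs["quantum"])
ratio_low = electronic_to_quantum(0.5, coeffs["electronic"], coeffs["quantum"])
```

With these illustrative numbers the electronic term is 5% of the quantum term at the higher dose but 80% at the lower dose, which is the regime where a total-NPS model would be least reliable.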

4.5 Conclusions

In this work, a methodology to simulate synthetic radiographic images has been described and validated. The
validation of the simulation platform was carried out by implementing a model of a Carestream X-ray system available
in our hospital used for thorax imaging, with and without grid. A ray tracing technique was used to generate noise free
primary images and Monte Carlo software penEasy for the generation of the scatter images. Additionally, real noise
and sharpness characteristics were added to the images, given by the NNPS and MTF measured in a real CsI digital
detector. Validation was performed against experimental measurements of SDNR in a test object consisting of PMMA
and a small Al detail. The validation was carried out for different PMMA thicknesses, tube voltages, dose levels and
with the antiscatter grid in and out. Average differences between SDNR measured in real and simulated images were
6% and 5% for grid in and grid out respectively. Additional validation was carried out for the noise and sharpness
modification process as well as for the grid geometry. For frequencies between 1 mm-1 and the Nyquist frequency, the maximum
difference for the MTF was 12%. For the NNPS in real and simulated images, average deviations found for all tube
voltages remained below 5%. Maximum relative differences for the grid parameters Tp and Tt were 6% and 13%
respectively. The results obtained can be considered satisfactory for the validation of the framework. The simulation
framework developed here has been shown to be accurate, flexible in terms of imaging component specification, and
has many potential applications, including the generation of images for use in VCTs.

Chapter 5
Modelling of anthropomorphic chest phantoms
and associated clinical tasks

Following the validation of the Lungman physical phantom for applications in chest radiography (Chapter 2), the next
step was to use the computational version of Lungman that we had developed. Initial work used the three Lungman
versions described in Chapter 2 to represent different patient sizes in the computational models. Additionally, clinical
tasks were modelled within the phantom. After discussion with radiologists, we decided to work towards a more realistic
anatomy for the next stage of task-based optimization and virtual clinical trials, by using the Realistic
Anthropomorphic Flexible (RAF) phantom [86]. This phantom was judged a promising candidate for the creation of
new and improved anthropomorphic computational chest models. The development of realistic computational models
to represent human anatomy, including pathologies and medical devices related to chest radiography (CXR), is the
main subject of this chapter.

The first part of this chapter describes the creation of computational anthropomorphic phantoms to represent different
body types, for example male and female phantoms with different Body Mass Indexes (BMI). Then, a set of lesions
and devices commonly found in thorax exams were modelled and included within these phantoms. The models
presented in the chapter are the result of a series of iterations, in which the phantoms and pathologies were continually
improved. This was possible thanks to the feedback from an experienced radiologist and the increased expertise in
mesh modelling from the developer side. Part of this work was presented as ‘Creation of a set of computational
phantoms for clinical task-based optimization studies in chest radiography’ at the Belgian Hospital Physicists
Association (BHPA) Symposium, held in 2019 in Aalst, Belgium.

In the second part, a similar methodology based on patient CT image segmentation and mesh modelling is
implemented to create 3D computational models of pathologies associated with Covid-19. The text is taken from the
publication “Methodology to create 3D models of Covid-19 pathologies for Virtual Clinical Trials” published in the
Journal of Medical Imaging, Special Issue on Covid-19 Medical Imaging Research, January 2021 [24].

5.1 Modelling of anthropomorphic phantoms including clinical tasks for use in Virtual
Clinical Trials

5.1.1 Methods

5.1.1.1 Realistic Anthropomorphic Flexible (RAF) phantom

The RAF phantom is a full body male phantom developed by Lombardo et al. [86] using polygonal mesh modelling.
This type of geometrical representation is widely used in computer graphic modelling [154] and makes use of
interconnected collections of polygons sharing vertices and edges.

The RAF represents an adult male, 176 cm tall and with a Body Mass Index (BMI) of 24 kg/m2. The phantom has
been validated against the ICRP Publication 110 phantom [155]. It contains a detailed model of human anatomy
(Figure 5.1a) and is ideally suited for imaging applications. The polygonal mesh format of the phantom allows further
modification of the anatomy to simulate, for example, different organ sizes or new elements such as the inclusion of
lesions. The posture of the phantom can also be modified; this type of flexibility is important for the accurate modelling
of patient position during radiographic examinations, as this can range from an erect chest Posterior Anterior (PA) to
a bedside Anterior Posterior (AP) examination. Moreover, if a voxelized version is necessary for applications
including Monte Carlo simulation and ray tracing, the voxel resolution of the phantom can be set to match the
requirements of the specific application.

5.1.1.2 Modifications to the RAF phantom

The RAF phantom (Figure 5.1a) was used as a base to create a set of computational models for chest imaging
applications and therefore only the organs in the thorax region of the RAF phantom were utilized in this study. The
main modifications described below include the variations of the external body shape, variations of the lung
background and creation of new organ models.

To study the influence of body size and shape in chest radiography, three additional versions of the standard RAF
phantom (BMI=24 kg/m2) were created, representing an overweight male (BMI=29 kg/m2), an obese male
(BMI=40 kg/m2) and a female with a BMI of 27 kg/m2 (Figure 5.1d). The open-source software MakeHuman
(available at http://www.makehumancommunity.org/) was used to create prototype human bodies with the same BMIs
as the new models. These prototypes were used as a reference to modify the external shape of the existing skin mesh of
the RAF. This was done to avoid alterations in the mesh topology that could lead to artifacts while setting the posture
of the phantom and later in the voxelization step. The graphics modelling software 3ds Max (Autodesk, USA) was used
for this purpose. The Soft Selection tool was employed, as this allows the vertices of the meshes to be deformed in a
natural way. Figure 5.1d shows an example of the soft selection: the red colour represents the meshes explicitly
selected, while the remaining colours represent the meshes in the vicinity. As the selected meshes are transformed
(i.e. translated, rotated and/or scaled), the elements in the vicinity are also transformed in a smooth manner. The
effect on the neighbouring meshes weakens with distance from the selected meshes and according to the ‘strength’ of
the selection. For the female version of the phantom, all the internal organs were kept unchanged; the only difference
was the addition of 3D volumes representing the breasts (see Figure 5.1b).
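
The behaviour of a soft selection, in which neighbouring vertices follow the selected ones with a weight that decreases with distance, can be sketched as follows. The cosine falloff profile, the function names and the vertex coordinates are assumptions made for illustration; the actual falloff curve used by 3ds Max is not reproduced here.

```python
import math

def soft_weight(distance, falloff):
    """Weight 1 at a selected vertex, decreasing smoothly to 0 at `falloff`.
    The cosine profile is an assumption for illustration."""
    if distance >= falloff:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * distance / falloff))

def soft_translate(vertices, selected, offset, falloff):
    """Translate each vertex by `offset` scaled by its soft selection weight."""
    moved = []
    for v in vertices:
        # Distance to the nearest explicitly selected vertex.
        d = min(math.dist(v, s) for s in selected)
        w = soft_weight(d, falloff)
        moved.append(tuple(c + w * o for c, o in zip(v, offset)))
    return moved

# Four vertices on a line; only the first one is explicitly selected.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
selected = [(0.0, 0.0, 0.0)]
moved = soft_translate(vertices, selected, offset=(0.0, 0.0, 1.0), falloff=2.0)
```

The selected vertex moves by the full offset, the vertex halfway to the falloff distance moves by half of it, and vertices at or beyond the falloff distance stay in place.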

Figure 5.1: a) Picture of the Realistic Anthropomorphic Flexible (RAF) mesh whole body phantom, b) skin modification procedure,
with from left to right, the standard male, the modified overweight male and a female version adapted for chest imaging
applications, c) new organs segmented from CT images of overweight patient, d) overweight mesh version of the RAF phantom
including new internal organs coming from segmentation.

When designing the obese and overweight versions, new organ models were created to provide the phantoms with
some variability in the internal anatomy and not only in the external body shape. New organ models were created for
the lungs, heart, diaphragm, trachea and main bronchi. These organs were first segmented from CT images of a real
patient with a BMI of 29 kg/m2, obtained from a freely available thorax CT database [156]. The in-plane voxel size of
this dataset was 0.8 × 0.8 mm2 and slice thickness was 1.25 mm. Segmentation was performed using 3D Slicer software
[82]. The diaphragm, lungs and heart were segmented using a combination of Thresholding, Sphere Brush (a
volumetric brush that applies modifications to slices above and below the current slice) and Island Identifier (creates
a unique label value for each connected region in the current label map) tools. The trachea and main bronchi were
segmented using the Grow from Seed effect, which starts from a segment drawn within the anatomical structure and
then applies region growing to extend the segmentation to the full organ. The segmented organs (Figure 5.1c) were
exported directly to a mesh using the OBJ file format, which can be loaded into the 3ds Max application. The remaining
organs, such as the bones, the remainder of the airways and the skin, were not segmented; instead, the 3D models
already available in the RAF were adapted to the dimensions of the new organs to speed up the development of the
phantoms. Further improvement of the segmented volumes was performed in 3ds Max, where imperfections from the CT
segmentation were corrected by surface smoothing. This was done carefully, so as not to modify the volume of the
segmented organs by more than 15%.
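
A check of the volume constraint on a smoothed organ mesh can be sketched as below. The example computes the volume of a closed triangle mesh from signed tetrahedra and compares it before and after a modification. The tetrahedron and the uniform shrinkage standing in for smoothing are hypothetical; only the 15% tolerance comes from the text.

```python
def mesh_volume(vertices, triangles):
    """Volume of a closed, consistently oriented triangle mesh,
    summed from signed tetrahedra formed with the origin."""
    total = 0.0
    for i, j, k in triangles:
        (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = vertices[i], vertices[j], vertices[k]
        total += (x1 * (y2 * z3 - y3 * z2)
                  - y1 * (x2 * z3 - x3 * z2)
                  + z1 * (x2 * y3 - x3 * y2))
    return abs(total) / 6.0

def volume_change_percent(before, after):
    """Absolute volume change relative to the original volume, in percent."""
    return 100.0 * abs(after - before) / before

# Unit tetrahedron, and a uniformly shrunk copy standing in for a smoothed organ.
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tris = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
smoothed = [(0.96 * x, 0.96 * y, 0.96 * z) for x, y, z in verts]

change = volume_change_percent(mesh_volume(verts, tris), mesh_volume(smoothed, tris))
within_tolerance = change <= 15.0  # tolerance stated in the text
```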

An important feature for chest imaging that was missing in the RAF was a more detailed lung
background with bronchial trees, pulmonary arteries and pulmonary veins [157]. To add these tissues to the phantom,
segmentation was performed from CT images. The segmentation of the lung structures was performed using 3D Slicer,
following a semiautomatic procedure illustrated in Figure 5.2. First, a mask was created in the lung region using
Threshold followed by the Island tool to remove residual volumes thresholded outside the lungs. Using the Smoothing
tool with the option of Closing (filling sharp corners and holes smaller than a specified kernel size), the interior holes
in the lungs were filled. Using this volume as a mask, Threshold was again applied with the aim of isolating the
pulmonary structures. These steps produced a first rough model of the lung structures. Further refinement was then
performed manually using the Scissors, Sphere Brush and Smoothing tools. Figure 5.2 shows an example of the 3D
volume obtained following the described procedure. Note that these figures were created for visualization purposes
only. The segmented volume was imported into 3ds Max where it was further refined and fitted within the phantom
lungs using anatomical landmarks.
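
The automatic part of the procedure above can be sketched on a toy 2D image as follows. This is not the 3D Slicer implementation: the threshold ranges and pixel values are hypothetical, connectivity is restricted to four neighbours, and the Closing step is replaced by a simpler hole filling that removes every interior region not connected to the image border.

```python
from collections import deque

def threshold(image, lo, hi):
    """Binary mask of pixels with lo <= value <= hi."""
    return [[lo <= v <= hi for v in row] for row in image]

def components(mask):
    """4-connected components of True pixels, as lists of (row, col)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    found = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, comp = deque([(r, c)]), []
                while queue:
                    y, x = queue.popleft()
                    comp.append((y, x))
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                found.append(comp)
    return found

def keep_largest(mask):
    """Keep only the largest island (removes residual regions outside the lungs)."""
    big = set(max(components(mask), key=len))
    return [[(r, c) in big for c in range(len(mask[0]))] for r in range(len(mask))]

def fill_holes(mask):
    """Fill False regions that are not connected to the image border."""
    rows, cols = len(mask), len(mask[0])
    inverse = [[not v for v in row] for row in mask]
    filled = [[True] * cols for _ in range(rows)]
    for comp in components(inverse):
        if any(r in (0, rows - 1) or c in (0, cols - 1) for r, c in comp):
            for r, c in comp:
                filled[r][c] = False
    return filled

# Toy image in HU-like values: soft tissue, a lung block, one vessel, one stray air pixel.
T, L, V = 40, -800, 30
image = [[T] * 7 for _ in range(7)]
for r in range(1, 6):
    for c in range(1, 5):
        image[r][c] = L
image[3][2] = V   # vessel inside the lung
image[1][6] = L   # air outside the lung, to be removed by the island step

lung = fill_holes(keep_largest(threshold(image, -1000, -500)))
dense = threshold(image, -100, 200)
vessels = [(r, c) for r in range(7) for c in range(7) if lung[r][c] and dense[r][c]]
```

The second threshold is applied only inside the filled lung mask, so the surrounding soft tissue, which has similar values to the vessels, is excluded.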

Figure 5.2: Illustration of the main steps followed for the segmentation of pulmonary structures. The model shown was created for
visualization purposes only.

5.1.1.3 Creation of clinical tasks within the phantom models

The ultimate aim of the PhD project is to perform task-based optimization studies, and thus modelling of common
clinical tasks found in chest X-ray imaging was a crucial step. This section describes lesion and device modelling
within the RAF phantom models. Several variations of the same pathology were produced to simulate a realistic range
of phantom models, 11 in total, featuring pathologies like lung nodules, rib fractures, pneumothorax, pleural effusion
and medical devices such as catheters. Task modelling was supported by information found in the literature, real
radiographic images featuring the types of lesions/devices and anatomical references of interest. Real images were
especially useful to select the position of the tasks within the model, to make them clinically realistic. The realism of
the tasks was then evaluated in an observer study.

5.1.1.3.1 Catheters

Catheters are thin tubes, designed to be inserted into the body. They are typically made of PVC or silicone. This study
includes the simulation of central venous catheters, which are commonly placed in veins in the neck (internal jugular
vein), chest (subclavian vein), arms or groins. This type of catheter is used to administer medication and/or fluids.

Figure 5.3: a) Illustration of the path deform modifier, in which a tube is deformed along the yellow line to obtain the desired
shape. b) subclavian catheter placement within the phantom and c) jugular catheter placement, the catheters are coloured in light
grey.

The catheters were modelled using tube shaped cylinders which were later deformed to mimic the shape and position
of these devices inside the patients. To place the catheters, the Path Deform modifier was used, which deforms an
object along a spline path. The object can be moved or stretched along the path and also rotated or twisted about the
path. Figure 5.3a illustrates the effects of the Path deform modifier applied over a tube which is deformed following
a user-defined line (in yellow). Stretching the tube along the line generates the desired shape. Figures 5.3b and 5.3c
show part of the phantom model including a subclavian and a jugular catheter, respectively. To create the wall of the
catheter the Shell modifier was employed, which modifies the object thickness to a specific value. The distance
between the inner and outer layer of the object was set to 1 mm.

5.1.1.3.2 Lung nodules

Lung nodules are usually small lesions in the lungs that can be benign or malignant. Nodules can have different sizes,
shapes and densities. According to their size, they can be labelled as miliary nodules (<2 mm), pulmonary
micronodules (2-7 mm), pulmonary nodules (7-30 mm) and pulmonary masses (>30 mm). The nodules are described
as spiculated when the margins with the lung tissue appear spiky and uneven. Smooth or lobulated margins are also
possible; the latter occur when the margins appear like a cluster of overlapping rounded nodules. Additionally,
nodules can be solid or calcified, partly solid or non-solid; the last type of lesion is also called a ground glass nodule [158].
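
The size classes listed above can be written as a small lookup. The handling of the boundary values (exactly 2, 7 and 30 mm) is an assumption, since the ranges in the text do not state which class they belong to.

```python
def classify_nodule(diameter_mm):
    """Size class of a lung nodule from its diameter in mm.
    Boundary handling at 2, 7 and 30 mm is an assumption."""
    if diameter_mm < 2:
        return "miliary nodule"
    if diameter_mm < 7:
        return "pulmonary micronodule"
    if diameter_mm <= 30:
        return "pulmonary nodule"
    return "pulmonary mass"

# Example diameters spanning the four classes.
labels = [classify_nodule(d) for d in (1.5, 5, 12, 45)]
```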

Simulation of lung nodules with different shapes began with the creation of perfect spheres covering a range of
diameters. Irregularities on the surface were introduced using the Noise modifier, which simulates random variations
in an object’s shape. A second method, illustrated in Figure 5.4, was used to create nodules with spiculated margins.
First, random faces of the sphere were selected and the Extrude tool was applied to push the faces out (Figures 5.4a
and 5.4b). Then, the TurboSmooth tool was applied to smooth the extruded faces and achieve
the spiculated effect (Figure 5.4c). A range of nodules simulated within the phantom is displayed in Figure 5.5: the
first simulation method was applied for the nodules shown in Figures 5.5a to 5.5c, while the second method, for
spiculated masses, was applied in Figures 5.5d and 5.5e. To generate the nodule shapes, the work from Solomon and Samei [23] was used as
reference.
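
The first method, a sphere with random surface irregularities, can be sketched as below. The sketch displaces each vertex of a sampled sphere radially by a uniform random fraction of the radius. This is a simplified stand-in for illustration and is not the algorithm of the 3ds Max Noise modifier; the function name and parameters are hypothetical.

```python
import math
import random

def noisy_sphere(radius, strength, n_lat=8, n_lon=16, seed=0):
    """Sample vertices on a sphere and displace each one radially by a random
    fraction of the radius in [-strength, +strength] (assumed noise model)."""
    rng = random.Random(seed)
    vertices = []
    for i in range(1, n_lat):
        theta = math.pi * i / n_lat
        for j in range(n_lon):
            phi = 2.0 * math.pi * j / n_lon
            r = radius * (1.0 + strength * rng.uniform(-1.0, 1.0))
            vertices.append((r * math.sin(theta) * math.cos(phi),
                             r * math.sin(theta) * math.sin(phi),
                             r * math.cos(theta)))
    return vertices

# A smooth 5 mm radius nodule and an irregular one with up to 20% radial variation.
smooth = noisy_sphere(5.0, strength=0.0)
irregular = noisy_sphere(5.0, strength=0.2)
smooth_radii = [math.hypot(*v) for v in smooth]
irregular_radii = [math.hypot(*v) for v in irregular]
```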

Figure 5.4: Illustration of the process to create spiculated masses. a) a range of faces are selected from a sphere object, b) the
Extrude modifier is used to push the faces out and c) the TurboSmooth modifier is employed to smooth the surface. The spiculated
mass in c) was created for visualization purposes.

Figure 5.5: Lesion model with different levels of Noise applied (a, b and c) and with different Spiculation applied (d and e).

5.1.1.3.3 Rib Fractures

Rib fractures occur when a strong force is directed towards the ribs and causes a break. This results in chest pain that
gets worse when the patient breathes in. This type of fracture is common in trauma patients and can cause life-
threatening complications.

Figure 5.6: Illustration representing the modelling procedure of two types of rib fractures.

The procedure to model rib fractures is illustrated in Figure 5.6. First, the area where the fracture will be modelled
is selected and the number of vertices of the mesh is increased using the TurboSmooth tool. The increase in vertices is
controlled by the number of iterations set by the user. This is done to ensure greater modelling freedom and improved
geometrical precision in the lesion modelling, ultimately leading to more realistic simulated fractures. Figure 5.6a
shows the vertices of the rib mesh as blue points; some areas have a clearly higher density of points. Figure 5.6
shows two types of fractures: in the first, the fracture is modelled by slightly deforming the shape of the rib, and in
the second by detaching the bone from the rest of the rib. In the first case, the modifications are
performed using the Soft Selection tool (Figure 5.6b), previously described in Section 5.1.1.2. The result is an
angulation of the ribs at multiple positions (Figure 5.6c). In Figures 5.6d-5.6f, the fractured bone is separated from
the rest of the ribs: first the area to detach is selected (in red), and then rotation and/or translation are applied
until the desired gap between bones is achieved. Radiographic images and illustrations of rib fractures were used as
reference in the modelling.

5.1.1.3.4 Pneumothorax and pleural effusion

Pneumothorax occurs when air leaks into the space between the lungs and chest wall. This air then pushes on the
exterior of the lung(s) causing the lung to collapse. This can occur in the whole lung or just partially.

Pleural effusion occurs when fluid accumulates between the tissues lining the lung and the chest wall (pleural space).
The presence of fluid inhibits the expansion of the lungs, which can make breathing difficult. Various types of fluids
can accumulate in the pleural space, including blood (haemothorax) and pus.

The same method was used to model pneumothorax and pleural effusions. First, the meshes in the region to be
modified were selected (see red selection in Figure 5.7). Then the Relax modifier was applied and the volume reduced.
The Relax tool smooths the mesh as the vertices move toward the centroid of the transformation, thus reducing the
volume. In Figure 5.7, the yellow overlay represents the original lung volume before modification and the blue line is
the new model simulating the pneumothorax. To obtain the lesion region, the ‘relaxed’ lung volume was subtracted from
the original lung volume, and the resulting volume was then added to the phantom using a different material index.

Figure 5.7: Illustration representing the modelling of a pneumothorax; the same procedure is applied for pleural effusions.

5.1.1.4 Voxelization of the phantoms

The polygonal mesh models of the different organs and pathologies were then exported separately and loaded into a
voxelization software developed by Lombardo et al. [86]. The latter software tool is based on the work of Laine [159]
and is optimized such that the mass and thickness of the organs is preserved. The algorithm uses a conservative eight-
separating voxelization method, by calculating intersections between the triangles and quads from the meshes with a
cube used as test object [159]. Figure 5.8 displays the software interface. The parameters used in the voxelization
software were the voxel resolution and the coordinates of the bounding box. The voxel resolution used for the models
was 0.5x0.5x0.5 mm3, which was a compromise that maintained fine details within the phantom anatomy and yet kept
the computational load reasonably low. Once all the organs and pathologies were voxelized, they were combined and
assigned an ID number using a script developed for ImageJ [160]. The process started from the thoracic wall and
proceeded in order of increasing priority through the bones, diaphragm, lungs and heart, up to the pulmonary veins,
vessels and bronchi (highest priority); where structures overlapped, the ID of the structure with higher priority
overwrote the other. The clinical tasks were then added to the hierarchy in the following order: nodules,
pneumothorax, pleural effusion and catheters, with the last task added to the model having top priority. In the case
of the catheter, priority was first given to the external and then the internal layer. The rib fractures were added
together with the rest of the ribs and bones.
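
The priority-based combination of voxelized structures can be sketched as follows. This is an illustration in Python, not the ImageJ script used in this work; the ID numbers, the box-shaped structures and the helper names are hypothetical.

```python
def combine_by_priority(shape, layers):
    """Merge voxelized structures into one label volume.

    layers: list of (material_id, set of voxel indices), lowest priority first.
    Later layers overwrite earlier ones where they overlap.
    """
    nx, ny, nz = shape
    volume = [[[0] * nz for _ in range(ny)] for _ in range(nx)]
    for material_id, voxels in layers:
        for x, y, z in voxels:
            volume[x][y][z] = material_id
    return volume

def box(x0, x1, y0, y1, z0, z1):
    """Set of voxel indices inside an axis-aligned box (end indices exclusive)."""
    return {(x, y, z)
            for x in range(x0, x1) for y in range(y0, y1) for z in range(z0, z1)}

# Hypothetical IDs; the order follows the hierarchy in the text (thoracic wall first).
layers = [
    (1, box(0, 6, 0, 6, 0, 6)),   # thoracic wall
    (2, box(1, 5, 1, 5, 1, 5)),   # lungs
    (3, box(2, 4, 2, 4, 2, 4)),   # nodule, added after the organs
]
volume = combine_by_priority((6, 6, 6), layers)
lung_voxels = sum(v == 2 for plane in volume for row in plane for v in row)
```

Because each layer simply overwrites the previous labels, the order of the list alone determines which structure wins in overlapping voxels.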

Figure 5.8: Graphic interface of voxelization software: the voxel resolution and bounding box size are entered as parameters.

5.1.1.5 Creation of radiographic images

Synthetic radiographic images of these phantoms were created using the simulation framework previously described
in Chapter 4. The materials and proportions of the mixtures used for the different phantom organs were taken from
ICRP Publication 89 [46] and 110 [155] and are shown in Table 5.1.

For the lung nodules, the chemical composition and density reported by Ullman et al. [161] were used, while the
catheter external wall was modelled using the PVC composition [162]. To simulate the pneumothorax and the pleural
effusion, air and blood were used, respectively.

Table 5.1: Density of materials used in the phantom organs.

Material                               Density (g/cm3)
Thoracic wall (residual tissue)        0.95
Diaphragm (Skeletal muscle)            1.04
Bones (75% spongiosa, 25% cortical)    1.28
Cartilage                              1.10
Lungs (mix)                            0.26
Heart (blood and muscle mix)           1.05
Breast                                 1.02
Bronchi                                1.03
Blood                                  1.06
Nodules                                1.03
Catheter (PVC)                         1.41

5.1.1.6 Task realism

Realism of the phantoms and clinical tasks was assessed using a dedicated reader study. The study comprised a set of
11 images corresponding to each of the phantom versions including the range of clinical tasks created. The images
were generated at 120 kVp, i.e. the same tube voltage as in the clinical protocol for chest PA, and at a dose level of
8 µGy at the detector. Clinical image processing was then applied using the MUSICA image processing software (Agfa,
Belgium). For this realism study, the dose level was set higher than the standard clinical dose level so that noise
would not interfere with the evaluation of the tasks.

A thoracic radiologist with 30 years of experience scored each image using the four criteria listed in Table 5.2.
Questions 3 and 4 were position dependent and required the reader to first mark the location of the clinical task in
the image. The realism criteria were scored using a five-point scale; the confidence levels are described in Table 5.3.
Additionally, the reader could leave comments on the tasks and the image in general in a notes panel. Images were
displayed using Viewdex software [111]. Contrast and brightness levels could be adjusted if desired and no time limit
was imposed. More than one lesion or device was present in the images and they were all marked with arrows to avoid
the risk of a missed task by the reader. This was considered appropriate since the main objective of the study was to
evaluate the realism of the tasks and not the ability of the reader to localize them within the image.

Table 5.2: Realism criteria used in the reader study.

Question   Realism criteria                                 Localization dependent
1          Realism of the lung background                   no
2          Realism of the mediastinum region                no
3          Realism of the lesions in terms of appearance    yes
4          Realism of the lesions in terms of position      yes

Table 5.3: Confidence levels used by the reader to score the images according to the realism criteria.

Five-point scale   Confidence level
1                  Not at all realistic: critical elements that affect the realism
2                  Poor: obvious elements that may affect the realism
3                  Adequate: minor elements that did not affect the realism
4                  Good: minimal unrealistic elements
5                  Very realistic: no unrealistic elements

As an additional validation step, the images of the phantoms including the clinical tasks were uploaded to the Lunit
Insight CXR AI algorithm (Lunit, South Korea) [163]. Of the clinical tasks included in the phantom models, the
algorithm can detect nodules, pneumothorax and pleural effusion; detection of rib fractures and catheter
localization could therefore not be assessed with it. The algorithm was trained on a dataset of real
chest X-ray images. The images were uploaded as DICOM files, from which the AI
generated a heat map indicating the probability that a given clinical task lies within a region of the
image. The AI report also provided an abnormality score for each lesion type, reflecting the estimated
likelihood that the detected lesion is present.

5.1.2 Results

Figure 5.9 illustrates the voxelized versions of the standard BMI and overweight male phantoms without tasks: (a)
shows several axial slices of the standard RAF, in which the organs were given Hounsfield Unit (HU) equivalent
values for visualization purposes; (b) shows a frontal view of the standard RAF; and (c) shows the overweight
male phantom in axial, sagittal and frontal views.

A total of 15 phantom models were created: four normal cases (i.e., without lesions) and 11 abnormal cases
with a wide range of lesion and device combinations. The tasks were paired to obtain a range of different
combinations while reducing the number of phantoms and, consequently, images. This is explained further
in Chapter 6. The modelled clinical tasks comprised 19 lung nodules, 4 catheters, 5 rib fractures, 4
pneumothoraces and 2 pleural effusions. The lung nodule models were generated with different shapes (Figure 5.5) and
with diameters ranging from 0.5 to 3.5 cm. They were placed in different positions in the lungs, resulting in a range
of visibilities for the lesions.

Figure 5.9: a) Axial slices of the voxelized version of the standard RAF male phantom adapted for chest imaging applications (the
organs were assigned HU equivalent values for visualisation purposes); b) Frontal rendered view of the standard RAF male
phantom; c) Voxelized version of the overweight phantom. The voxel resolution of both models is 0.5 × 0.5 × 0.5 mm³.

The four normal (without lesion) phantoms corresponded to all the simulated body types: standard male, female,
overweight male and obese male. The first two have the same anatomy, except for the breast in the female and the
outer skin. The latter two share the same internal anatomy, which was partly segmented from a CT image dataset, each
with a different skin representing the external body shape.

Simulated radiographic images of the 11 phantom versions including clinical tasks are displayed in Figure 5.10. For
each case, a projection is also shown with the voxels containing the pathology highlighted in green.

Figure 5.10: Simulated radiographs of the 11 abnormal phantom models. For each case, a projection with the clinical
tasks highlighted in green is shown to the left.

5.1.2.1 Comparison of organ volumes, mesh vs voxels

To study the effect of the organ voxelization, the volumes of the mesh and voxel organs were compared. This also
served as justification of the voxel resolution used. In Tables 5.4 and 5.5, the volume comparison for all the organ
models can be seen, listing the relative difference between the mesh and voxel model. The standard male and female
organs are in Table 5.4 and the overweight and obese male organs are in Table 5.5. As observed, the voxelization
using the selected voxel resolution did not cause significant changes in the organ volume compared to the mesh model,
with a maximum difference of 9% and an average absolute difference of 1.4%.
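
The relative differences in Tables 5.4 and 5.5 can be approximated with a short calculation. The sketch below assumes the sign convention (mesh - voxel) / mesh, which is inferred from the skin, bone and breast rows rather than stated in the text; the function name is illustrative. Because the tabulated volumes are rounded, not every row is expected to reproduce exactly.

```python
def relative_difference(voxel_cm3: float, mesh_cm3: float) -> float:
    """Relative difference in percent, assuming (mesh - voxel) / mesh."""
    return 100.0 * (mesh_cm3 - voxel_cm3) / mesh_cm3


# Volumes (cm3) copied from Table 5.4: (voxelized organ, mesh organ)
organs = {
    "Skin female": (32104.0, 32082.0),
    "Bones (ribs + spine)": (1473.4, 1436.3),
    "Breast": (1522.7, 1523.0),
}

for name, (voxel, mesh) in organs.items():
    print(f"{name}: {relative_difference(voxel, mesh):+.2f}%")
```

Under this assumed convention, a negative value means that the voxelized organ is larger than its mesh counterpart.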

Table 5.4: Organ volumes and relative difference between the mesh and voxel models of the organs for the standard male and
female phantom versions.

Organ name | Voxelized organ volume (cm³) | Mesh organ volume (cm³) | Relative difference
Skin female 32104.0 32082.0 -0.07%
Skin male standard BMI 29385.9 29359.4 -0.09%
Diaphragm 307.2 305.7 -0.48%
Breast 1522.7 1523.0 0.02%
Bones (ribs + spine) 1473.4 1436.3 -2.58%
Lungs (mix) 4409.8 4408.6 -0.03%
Heart 596.0 595.7 -0.04%
Heart + Aorta + Vena cava 884.1 887.1 0.35%
Pulmonary veins 53.8 58.2 -8.20%
Pulmonary arteries 63.2 69.9 -8.91%
Bronchi + trachea 144.2 142.5 -1.21%

Table 5.5: Organ volumes and relative difference between the mesh and voxel models of the organs for the overweight and
obese male phantom versions.

Organ name | Voxelized organ volume (cm³) | Mesh organ volume (cm³) | Relative difference
Skin male overweight 35106.9 35080.1 -0.08%
Skin male obese 47833.9 47831.4 -0.01%
Diaphragm 406.7 404.5 -0.56%
Bones (ribs + spine) 1602.8 1610.0 0.45%
Lungs (mix) 5236.9 5236.0 -0.02%
Heart 840.9 840.5 -0.05%
Heart + Aorta + Vena cava 1112.3 1115.2 0.26%
Pulmonary veins 100.1 94.9 -5.52%
Pulmonary arteries 97.2 93.0 -4.53%
Bronchi + trachea 178.7 173.8 -2.83%
5.1.2.2 Realism study

Table 5.6 shows the reader scores. For the lung background, the reader classified 100% of the images as at least
“adequate” and 91% at least “good”. For the mediastinum region, all images were classified as at least “good”.
Regarding the lesions, 97% were classified as at least “adequate” and 69% as at least “good”. Lesion position was
classified in all cases as at least “adequate” and in 94% as at least “good”.
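
The percentages above are the share of scores at or above a threshold on the five-point scale. A minimal sketch follows; the score list is hypothetical and chosen only to illustrate the arithmetic for 11 images, and it is not the reader's actual data.

```python
def percent_at_least(scores: list[int], threshold: int) -> float:
    """Percentage of scores that are greater than or equal to threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    return 100.0 * sum(s >= threshold for s in scores) / len(scores)


# Hypothetical lung-background scores for 11 images (NOT the reader's data).
example_scores = [5, 4, 4, 5, 4, 4, 3, 4, 5, 4, 4]

print(round(percent_at_least(example_scores, 3)))  # share rated at least "adequate"
print(round(percent_at_least(example_scores, 4)))  # share rated at least "good"
```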

The reader was asked to leave comments for each lesion that would help with further development of the models. As
a result, the catheter material composition and some of the fractures were further improved. Figure 5.10 shows the
modelled radiographic images of the phantoms used in the reading after the improvements mentioned.

Table 5.6: Percentage of cases rated by the radiologist reader as at least adequate and at least good for each of the
realism criteria in the images.

Lung background Mediastinum region Lesion appearance Lesion position

% at least adequate (>=3) 100% 100% 97% 100%

% at least good (>=4) 91% 100% 69% 94%

Figure 5.11 shows the output of the AI algorithm for the image dataset used in the task realism reading study. Model 2
was excluded from the analysis as it did not contain tasks detectable by the algorithm. The algorithm detected
at least one of the tasks present in each image, except for Model 10 (containing nodules
and a catheter), where no nodules were detected. Overall, the algorithm detected 10 of the 25 tasks present in the
models: all pleural effusions and pneumothoraces were detected (100%), while the detectability of lung
nodules was 21% (calculated as the number of true positive localizations divided by the number of
nodules present). The average abnormality score reported by the software was 29% for the
nodules, 75% for the pneumothoraces and 49% for the pleural effusions. From the images, it is evident that the algorithm
detected the less subtle nodules. The algorithm also generated some false positive findings in
models 7, 8, 9 and 11, all of which occurred in the larger BMI phantoms.
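
The detection rates quoted above follow from simple counts. In the sketch below, the number of detected nodules (4) is not stated in the text; it is inferred from the 10 detections overall minus the 4 pneumothoraces and 2 pleural effusions that were all found.

```python
# Tasks present in the phantom library that the algorithm can report
# (catheters and rib fractures are excluded).
present = {"lung nodule": 19, "pneumothorax": 4, "pleural effusion": 2}

# Detected counts; the nodule count is an inference, see the lead-in.
detected = {"lung nodule": 4, "pneumothorax": 4, "pleural effusion": 2}


def detection_rate(n_detected: int, n_present: int) -> float:
    """True-positive localizations divided by tasks present, in percent."""
    return 100.0 * n_detected / n_present


for task in present:
    print(f"{task}: {detection_rate(detected[task], present[task]):.0f}%")

overall = detection_rate(sum(detected.values()), sum(present.values()))
print(f"overall: {overall:.0f}%")
```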

Figure 5.11 panels: a) Model 1: lung nodules; b) Model 3: pneumothorax and pleural effusion; c) Model 4: lung nodules;
d) Model 5: pneumothorax; e) Model 6: lung nodules; f) Model 7: pneumothorax and lung nodules; g) Model 8: pleural
effusion; h) Model 9: lung nodules; i) Model 11: pneumothorax.

Figure 5.11: Output of the AI algorithm for models 1 (a), 3-9 (b-h) and 11 (i). Left: original image with the pathology highlighted
in green, right: heat map generated by the AI algorithm. Note that the algorithm does not detect catheters or rib fractures and
therefore these tasks are not expected to appear in the output.

5.1.3 Discussion

5.1.3.1 Comparison to the Lungman phantom

As mentioned in the introduction of the chapter, the first computational phantom used in the simulations was a voxel
model of the Lungman phantom. This phantom was previously validated in Chapter 2 for dosimetric applications in
chest radiography. However, the main shortcoming of the physical phantom, and thus of the computational version, is the
limited realism of the imaged anatomy, specifically in the lung background and mediastinum area. The flexibility of this
approach is also limited, which prompted the creation of the new models with more realistic chest anatomy, as
described in this chapter. Figures 5.12a and 5.12b show 3D mesh models of the Lungman and RAF
phantoms, respectively, while Figures 5.12c and 5.12d show the corresponding simulated images. The improved level
of anatomical detail and realism of the RAF models compared to the Lungman phantom is clearly seen.

Figure 5.12: Polygonal mesh model of the a) Lungman phantom and b) the RAF phantom. Simulated radiographic images of c)
the Lungman and d) the standard RAF for comparison.

5.1.3.2 Library of computational models including clinical tasks

The modelling of a library containing 15 anthropomorphic chest phantoms with a range of clinical tasks has been
described.

Different segmentation techniques and polygonal mesh modelling were implemented to create the presented models.
Both segmentation and polygonal mesh modelling have a steep learning curve, which can be considered a downside
of the approach if other users were to reproduce the method. Although some semi-automatic or reproducible steps can
be carried out, the polygonal mesh modelling requires considerable user intervention and could not be automated. On
the other hand, the creation of the clinical tasks is a straightforward process, and the methods described
could be more easily applied by other users who have a polygonal mesh phantom available. It should also be noted
that the phantom and organ modelling was improved over time with frequent feedback from an experienced radiologist
to arrive at the final models presented in this chapter.

In order to generate radiographic images of the phantom using the simulation platform, the mesh
phantoms must be voxelized. One way to validate the choice of voxel resolution was to compare the volumes of the
voxelized organs to those of the original mesh model. The voxelized organ volumes were generally larger than
the mesh organ volumes, which is due to the conservative nature of the algorithm used [86]. In only four cases was
the voxelized organ volume smaller than that of the corresponding mesh model, but these differences were negligible,
with values below 0.5%. The largest differences were observed for the pulmonary
structures, which might be due to the complex shape of the mesh models resulting in overlapping
polygonal faces within the same mesh. However, for the chosen voxel resolution of 0.5 × 0.5 × 0.5 mm³, these
differences were below 9%, which does not compromise the realism of the imaged lung background and does not
significantly affect simulated doses. Decreasing the voxel size to, for example, 0.3 × 0.3 × 0.3 mm³ did not bring
considerable improvement in these differences, but it increased the phantom memory burden in simulations to about
460% of its original value (a factor of approximately 4.6). For the standard male phantom, this marginally higher
resolution increased the number of voxels from 364.2 million to 1.68 billion. Thus, 0.5 × 0.5 × 0.5 mm³ was
considered an optimal resolution in terms of anatomical accuracy and computational load.
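
The growth in voxel count follows from its cubic dependence on voxel size. The sketch below assumes isotropic voxels and a fixed phantom bounding volume; the function name is illustrative, and the result is close to, but not exactly, the 1.68 billion voxels quoted in the text.

```python
def voxel_count_scale(old_size_mm: float, new_size_mm: float) -> float:
    """Factor by which the voxel count grows for isotropic voxels."""
    return (old_size_mm / new_size_mm) ** 3


factor = voxel_count_scale(0.5, 0.3)
voxels_at_0_5_mm = 364.2e6  # standard male phantom, value from the text
voxels_at_0_3_mm = voxels_at_0_5_mm * factor

print(f"scale factor: {factor:.2f}")
print(f"voxels at 0.3 mm: {voxels_at_0_3_mm / 1e9:.2f} billion")
```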

5.1.3.3 Realism evaluation

The phantoms and the clinical tasks were validated in a realism study by one radiologist. The participation of a single
reader can be regarded as a limitation; however, given the vast experience of the reader, we considered the results
sufficient to validate the models. Images of the models with the area of the pathology or device marked were scored
by the reader. The realism criteria were related to the lung background and mediastinum region, while
the simulated tasks were scored in terms of appearance and position. The use of highlighted tasks prevented
false-negative localizations during the reading process. Some final minor adjustments were made to the phantom
following the suggestions from the reader, but in general the results of the reading were considered satisfactory. In
addition to the human observer evaluation, the images were evaluated by an AI algorithm, which was
able to detect 40% of the tasks present in the images, including nodules, pneumothorax and pleural effusions. This
percentage was dominated by the detection of lung nodules: the AI was not able to detect the more
subtle nodules due to their size and position within the chest. In fact, other studies have shown that the detection of
nodules is largely affected by the anatomical noise in the images [121,164–168]. Detectability improves when
the anatomical noise is removed, as in CT. In general, nodules detected by the AI had diameters above 2.3 cm.

5.2 Methodology to create 3D models of Covid-19 pathologies for Virtual Clinical

5.2.1 Introduction

During the ongoing Covid-19 outbreak, thoracic imaging by means of computed tomography (CT) and/or planar
chest X-ray (CXR) is being used as a key tool in early diagnosis and disease monitoring, in particular to establish
severity and potentially quantify progression of the disease. Reverse transcription polymerase chain reaction (RT-PCR)
results are considered the gold standard for diagnosis of Covid-19, but as thoracic imaging has been able to positively
confirm cases after false negative RT-PCR tests [169], imaging is often used in clinical practice. Access to chest CT or CXR is
especially useful where there is a high influx of symptomatic patients, a shortage of tests and potentially long waiting
times for the results.

While CT is the preferred imaging modality for diagnosing Covid-19 patients, CXR is widely used for follow-up of
the disease, with CXR performed daily on patients in the Intensive Care Unit. CXR systems have many logistical
advantages over CT, among them their wide availability and short examination times. A CT examination requires
transport of infectious patients to dedicated rooms followed by extensive and time-consuming decontamination of the
system. In contrast, portable CXR devices can be transported to areas designated for Covid-19 patients within the
hospital and even to external locations including care homes. Early research has shown that the sensitivity of CXR for
Covid-19 is in the range of 70% [170], while a figure of approximately 90% has been found for CT [171].

The rationale for this study is that an attempt should be made to improve the performance of CXR when used in the
diagnosis of Covid-19. Different technical approaches are available for this, including new detectors, improved
scatter rejection techniques, rib suppression software and/or dual energy acquisitions. Given the urgency of the
situation, we have developed a Virtual Clinical Trial (VCT) [19,127,172] platform that will allow dedicated
optimization studies without the difficulties associated with clinical studies on (critically ill) patients. The objective
of this work was therefore to create a series of Covid-19 models which can be used to generate simulated chest X-ray
images for future VCTs. The method for producing the Covid-19 lesion models is described in detail; nine models
were then evaluated for realism by radiologists. Alongside the radiologist rating, an Artificial Intelligence (AI) based
software tool, developed for real patient cases, was used to assess the virtual cases for the presence of Covid-19
pathology, providing an additional estimate of realism.

5.2.2 Materials and methods

5.2.2.1 RAF Phantom

In this work the modelling also started from the polygonal mesh RAF phantom [86], and the different versions of the
phantom were again used. However, unlike in the approach described previously, only the external shape of the
phantom was modified to represent different body types; the internal organs were kept unchanged. A
study by Lemanowicz et al. [173] found the level of patient obesity to have the highest correlation with the chest soft
tissue thickness, thus it was considered acceptable to keep the internal organs unchanged for a first set of models.

5.2.2.2 Modelling of the pathologies from CT scans

Ethical approval was requested for a retrospective study making use of patient CT images; the ethical committee
waived the need for dedicated patient consent. A 3D modelling methodology was implemented to create the Covid-19 disease
within the RAF phantom. CT images of patients with suspected Covid-19, confirmed by CT scan, were used as reference
for the development of the pathology models. The CT scans had been acquired with a low dose thorax protocol using
120 kVp, 1.2 mm pitch and 46 mAs (tube current modulation off). CT image voxel size ranged from 0.68 mm to 1.00
mm while slice thickness was 3 mm.

A range of cases was selected by a radiologist to ensure the different stages of the disease would be covered. Areas of
ground glass opacities (GGO) and consolidation were segmented manually using ImageJ [160] (Figure 5.13a). The
segmentation was carried out by a medical physicist but closely guided by a radiologist. The segmented pathology
was then converted to a binary stack image in which the pathology was coloured white and the background black.
Next, the marching cubes algorithm [174] was applied to extract polygonal meshes from the iso-surfaces of the three-
dimensional pathological structures (Figure 5.13b). This conversion from CT voxels to meshes was done using the
3D visualization library in ImageJ [175]. A resampling factor of 1 was chosen for the marching cubes algorithm so
that the meshes would replicate the voxel structures without a loss of resolution. The mesh volume of the pathology
was then exported as an OBJ file containing the coordinates of the vertices in 3D space; this in turn defined the shape
and size of the surface of the segmented pathology.
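
To illustrate what such an OBJ export contains, the sketch below serializes a triangle mesh as Wavefront OBJ text. It is a minimal illustration only: in this work the export was done with ImageJ, and the function name and the single-triangle mesh are hypothetical.

```python
def mesh_to_obj(vertices, faces) -> str:
    """Serialize a triangle mesh as Wavefront OBJ text.

    vertices: iterable of (x, y, z) coordinates
    faces: iterable of (i, j, k) zero-based vertex indices
    """
    lines = [f"v {x:.4f} {y:.4f} {z:.4f}" for x, y, z in vertices]
    # OBJ face indices are one-based, so shift each index by one.
    lines += [f"f {i + 1} {j + 1} {k + 1}" for i, j, k in faces]
    return "\n".join(lines) + "\n"


# A single triangle as a stand-in for a lesion surface.
triangle_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
triangle_faces = [(0, 1, 2)]
print(mesh_to_obj(triangle_vertices, triangle_faces))
```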

The OBJ lesion model was imported into 3ds Max, where two steps were used to correct the meshes and generate
consistent models. First, the voxel scaling effect due to the segmentation process in anisotropic CT data was smoothed
and artefacts generated by the marching cubes algorithm were removed using the TurboSmooth and Relax modifiers. These
modifiers were applied sparingly to avoid loss of model volume. Second, the shape of the pathology was modified to
fit the RAF phantom lungs, while respecting the volume of the segmented disease (Figure 5.13c). This step was carried
out using the Free Form Deformation (FFD) modifier with non-isotropic scaling. The deformations required manual
intervention since the shapes of the lung and its pathology change from case to case.
The original spatial distribution of the pathology over the lung was preserved and changes in the volume ratio
disease/lung (Vsegmented_pathology/ Vpatient_lung) were kept below 15%. The set of models represented the typical
distributions of the disease: predominantly in the lower lobe, multifocal, peripheral and bilateral [176]. In order to
quantitatively assess differences between the segmented lesion and the final model used in the phantom, the Hausdorff
distance (HD) was calculated for the mesh models using MeshLab software [177], which samples a set of points
over one mesh and, for each point, finds the closest point on the reference mesh.
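
The sampled-distance idea can be sketched in a few lines. As a simplification, the example below uses point-to-point distances between two small point sets rather than point-to-surface distances on real meshes; the point coordinates and function names are hypothetical and this is not the MeshLab implementation.

```python
import math


def closest_distance(point, reference_points):
    """Distance from one point to the nearest point of the reference set."""
    return min(math.dist(point, q) for q in reference_points)


def sampled_distances(sample_points, reference_points):
    """Closest-point distances for every sampled point (one-sided)."""
    return [closest_distance(p, reference_points) for p in sample_points]


def summarize(distances):
    """Return (mean, maximum); the maximum is the one-sided Hausdorff distance."""
    return sum(distances) / len(distances), max(distances)


# Toy point sets in cm, standing in for points sampled on two meshes.
segmented = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
adapted = [(0.0, 0.0, 0.1), (1.2, 0.0, 0.0), (0.0, 1.0, 0.0)]

mean_d, max_d = summarize(sampled_distances(segmented, adapted))
print(f"mean distance: {mean_d:.2f} cm, maximum: {max_d:.2f} cm")
```

The mean of the sampled distances corresponds to the kind of "mean HD" summary reported per case, while the maximum is the classical one-sided Hausdorff distance.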

The modelling methodology enables the representation of different grades of severity of lung involvement by changing
the X-ray attenuation in the simulated pathology or by changing its size and distribution. As a proof of concept for the
development of different lesion severity from the reference set, the mesh of one of the segmented lesions was modified
by changing its size and shape.

An example of a model in which the X-ray attenuation of the pathology was changed to study the impact on lesion
detection was also simulated. Finally, the influence of BMI on lesion detection is also illustrated for one disease model
simulated into the different BMI and gender realizations of the phantom. These examples are shown in the results,
Section 5.2.3.5.

Figure 5.13: Workflow of the methodology followed to model the pathologies, (a) Patient CT slice with segmented pathology
(green), (b) 3D surface of the pathology converted to mesh format, (c) slice of the 3D mesh model of the pathology (green) inside
RAF phantom lungs, (d) Slice of voxelized RAF phantom with highlighted pathology (green).

5.2.2.3 Creation of voxelized phantoms

The polygonal mesh models were then exported separately and loaded into voxelization software as described in
Section 5.1.4. The voxel resolution used for the models was also 0.5 × 0.5 × 0.5 mm³. The process of adding all the
organs follows the same method previously described in Section 5.1.4 (see Figure 5.13d) with some modifications to
include the new modelled tasks. The GGO ID was added after the lungs, since GGO represent an area of increased
opacity in which bronchial structures and vessels are still visible. The consolidation ID was added last, as
consolidations obscure all pulmonary structures within some region [178] (see Figure 5.14).
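
The effect of the insertion order can be illustrated with a simple label-painting sketch in which later structures overwrite earlier ones. The label IDs, the masks and the assumption that vessels are painted between the GGO and the consolidation are illustrative interpretations of the description above, not the actual voxelization software.

```python
# Hypothetical label IDs; the actual IDs used in the phantom are not given.
AIR, LUNG, GGO, VESSEL, CONSOLIDATION = 0, 1, 2, 3, 4


def paint(volume, mask, label):
    """Overwrite the voxels selected by mask with label (later calls win)."""
    return [label if m else v for v, m in zip(volume, mask)]


# A single row of 8 voxels keeps the example readable.
volume = [AIR] * 8
volume = paint(volume, [0, 1, 1, 1, 1, 1, 1, 0], LUNG)
volume = paint(volume, [0, 0, 1, 1, 1, 1, 0, 0], GGO)            # after the lungs
volume = paint(volume, [0, 0, 0, 1, 0, 1, 0, 0], VESSEL)         # visible inside GGO
volume = paint(volume, [0, 0, 0, 0, 0, 1, 1, 0], CONSOLIDATION)  # added last

print(volume)
```

In this toy row, the vessel voxel inside the GGO keeps its own label, whereas the vessel voxel inside the consolidation is overwritten.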

Finally, the volume ratio of the segmented pathology to that of the patient lung (Vsegmented_pathology/Vpatient_lung) was
compared with the volume ratio of the voxelized pathology model to that of the phantom lung
(Vfinal_pathology/Vphantom_lung). This served as a validation of the method to ensure that the pathology/lung volume ratio was kept similar,
regardless of differences between the patients and the RAF chest phantom.

Figure 5.14: Slice of mesh model of the lung containing GGO and consolidation regions. GGO are regions of increased opacity in
the lungs where pulmonary structures are still visible while in the consolidation the pulmonary structures are obscured.

5.2.2.4 Generating radiographic images using a simulation framework

Radiographic images were generated using the methodology described in Chapter 4. The materials and proportions of
the mixtures used for the different phantom organs were taken from ICRP Publication 89 [46] and 110 [155]. GGO
and consolidations are areas of increased levels of X-ray attenuation due to the presence of fluids in the lungs. To
obtain realistic pathology densities, the Hounsfield Unit (HU) histograms of the corresponding segmented lesions in
the CT images were measured and subsequently converted to density (g/cm³) using the data of Schneider [88]. A
random distribution of densities was used in the pathology region of the phantom. A random number generator was
used to assign different attenuation coefficients to each voxel in the pathology region: each density value was given
the probability of the corresponding HU in the CT images. To achieve this, the probability distribution was taken from
the normalized HU histogram of the segmented lesion.
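
The sampling step can be sketched as follows. The histogram values are hypothetical, and the linear HU-to-density mapping is only a placeholder for the Schneider calibration data used in this work; both are assumptions made for illustration.

```python
import random


def hu_to_density(hu: float) -> float:
    """Placeholder linear HU-to-density map in g/cm3 (NOT the Schneider data)."""
    return max(0.001, 1.0 + hu / 1000.0)


def sample_densities(hu_bins, counts, n_voxels, seed=0):
    """Draw one density per voxel with probability given by the HU histogram."""
    total = sum(counts)
    probabilities = [c / total for c in counts]  # normalized histogram
    rng = random.Random(seed)
    sampled_hu = rng.choices(hu_bins, weights=probabilities, k=n_voxels)
    return [hu_to_density(hu) for hu in sampled_hu]


# Hypothetical histogram: HU bin centres and voxel counts of a segmented lesion.
hu_bins = [-800, -600, -400, -200, 0]
counts = [10, 40, 30, 15, 5]

densities = sample_densities(hu_bins, counts, n_voxels=10_000)
print(f"mean density: {sum(densities) / len(densities):.3f} g/cm3")
```

Sampling each voxel independently reproduces the HU histogram of the lesion but not its spatial texture, which is consistent with the random distribution of densities described above.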

Radiographic images of all the models were generated at 120 kVp, grid in and 180 cm source to detector distance,
settings commonly used in thorax PA examinations. A detector air kerma level of approximately 8 µGy was simulated,
which is higher than in the clinical protocol. A PA exposure setting with grid in and a higher dose level were selected
to obtain high quality images, in which the pathology was clearly visualized. Finally, clinical image processing
corresponding to an adult thorax examination was applied to the images using image processing software MUSICA
(Agfa, Belgium).

5.2.2.5 Assessment of task realism

In order to assess the realism of the pathologies in the RAF phantom, the simulated radiographic images were
presented to three thorax radiologists who had been trained during the outbreak to diagnose patients with suspected
Covid-19. A reader study was set up in which the observers were asked to score nine images, one for each of the
Covid-19 models inserted within the standard BMI male RAF. Scoring was carried out according to three realism criteria:
question 1, realism of the lung background; question 2, realism of the lesions in terms of appearance; and question 3,
realism of the lesions in terms of position within the lungs. A five-point scale was used for all criteria: (1) not at all
realistic: critical elements that affect the realism, (2) poor: obvious elements that may affect the realism, (3) adequate:
minor elements that did not affect the realism, (4) good: minimal unrealistic elements and (5) very realistic: no unrealistic
elements. In addition, the readers were asked to write a description of the pathology that was seen. Images were
displayed using Viewdex software [111]. Contrast and brightness levels could be adjusted if desired and no time limit
was imposed.

As an additional validation of the models, the images were uploaded to the AI software Lunit INSIGHT CXR for
Covid-19 (Lunit, South Korea) [178]. This software can detect areas of consolidation or GGO in the chest and is
intended to support the interpretation of CXR of suspected Covid-19 cases. The software analyses the images and
reports an abnormality score given by the likelihood of the presence of the detected lesion, i.e. low (0-15%), moderate
(16-50%) and high (51-100%).
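
The three likelihood bands can be expressed as a small helper. The function name is illustrative, and the handling of non-integer scores between band edges (for example 15.5%) is an assumption, since the bands are quoted for whole percentages.

```python
def abnormality_category(score_percent: float) -> str:
    """Map an abnormality score (0-100%) to the reported likelihood band."""
    if not 0 <= score_percent <= 100:
        raise ValueError("score must lie between 0 and 100")
    if score_percent <= 15:
        return "low"
    if score_percent <= 50:
        return "moderate"
    return "high"


for score in (10, 29, 49, 75, 90):
    print(score, abnormality_category(score))
```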

5.2.3 Results

5.2.3.1 Pathology models

Nine Covid-19 disease models were created, covering the typical distributions of lung involvement
characteristic of Covid-19 pneumonia. Eight of the models were created from the segmented lesions in CT images
and one extra case was created by modifying the mesh of one of the segmented models. The HU histograms obtained
for the lesions of the eight segmented CT datasets are shown in Figure 5.15. The differences in HU are related to the
stage of the disease. While GGOs lie in the range -800 to -400 HU, higher opacities such as consolidations reach 100 HU,
consistent with data reported by Lanza et al. [179].

Figure 5.16 shows a comparison of different slices of the CT data (a, b and c) to the corresponding slice of the
voxelized phantom (d, e and f) for Case 4. The pathology voxels are highlighted in green in all the images.

Figure 5.15: HU distribution within the lesions for the pathologies modelled from segmentation of CT images (cases 1-8).

Figure 5.16: Comparison of CT slices of a patient (Case 4) (a, b and c) and the respective slices of the voxelized RAF phantom (d,
e and f). Voxels corresponding to the pathology are highlighted in green.

Different stages of the disease were modelled, from the more subtle initial stage to more advanced stages. Figure
5.17 (a-c) shows the mesh models of the RAF phantom lungs including the pathologies from cases 6, 4 and 7,
respectively. As illustrated, the level of lung involvement changes from case to case; the ratios of pathology volume
to lung volume of the voxelized versions of these models can be found in Table 5.7.

Figure 5.17: Mesh models of the pathologies (green) modelled within the lungs of the RAF phantom. Different levels of lung
involvement can be seen, from subtle in (a): case 6, to more prominent in (b): case 4, and (c): case 7.

5.2.3.2 Mesh and volume comparison

Table 5.7 shows the volume ratios of pathology/lungs for the developed models. As can be seen, the percentage of
lung involvement ranges from 2.2% to 38.2%. The pathology/lung volume ratios for the segmented CT images and the
relative differences compared to the voxelized models are also shown. As observed, the differences between
the initially segmented lesion and the corresponding final phantom version did not exceed 16% in magnitude. Volume
ratios were usually smaller when the pathology was placed in the phantom. No comparison is shown for Case 9 since this
model is a modification of Case 4, thus no reference volume ratio is available.
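
The relative deviation listed in Table 5.7 can be reproduced from the two volume ratios. The sketch below assumes the deviation is defined as (phantom ratio - patient ratio) / patient ratio, which matches the tabulated signs and values for the cases shown; the function name is illustrative.

```python
def relative_deviation(patient_ratio: float, phantom_ratio: float) -> float:
    """Deviation of the phantom volume ratio from the patient ratio, in percent."""
    return 100.0 * (phantom_ratio - patient_ratio) / patient_ratio


# (patient pathology/lung ratio, phantom pathology/lung ratio) in percent,
# copied from Table 5.7 for three of the cases.
cases = {1: (12.2, 13.1), 4: (14.0, 12.6), 8: (34.2, 36.7)}

for case, (patient, phantom) in cases.items():
    print(f"case {case}: {relative_deviation(patient, phantom):+.0f}%")
```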

The mean Hausdorff distance (HD) between the original mesh and the mesh adapted to the RAF phantom can be found
in Table 5.7. Mean HD values did not exceed 0.64 cm.

Table 5.7: Volume ratios of pathology to lung for the segmented CT images and those for the corresponding phantom models.
Relative deviation between the developed models and the patient data is also shown. Mean Hausdorff distance calculation for the
mesh models of cases 1-8. No CT data is available for Case 9 since it is a modified version of Case 4.

Case number Vsegmented_pathology/ Vpatient_lung Vfinal_pathology/ Vphantom_lung Relative deviation Mean HD (cm)

1 12.2% 13.1% 7% 0.64±1.02

2 13.2% 12.7% -4% 0.42±0.53

3 39.0% 38.2% -2% 0.21±0.31

4 14.0% 12.6% -10% 0.16±0.34

5 2.6% 2.3% -12% 0.21±0.29

6 2.6% 2.2% -16% 0.34±0.66

7 34.6% 29.3% -15% 0.18±0.24

8 34.2% 36.7% 7% 0.25±0.20

9 - 9.8% -

5.2.3.3 Simulated radiographic images

Figure 5.18 (a-i) shows the set of simulated radiographic images of the RAF phantom featuring the nine pathology
models. For each case, a projection is also shown with the voxels containing the pathology highlighted in green. As
observed, the pathologies are in all cases bilateral, often located in the periphery of the lungs and involving several
lobes. The degree of spread of the disease is easily noticeable in each of the cases.

Figure 5.18: Simulated images of the RAF phantom including the Covid-19 models created: (a-i) cases 1 to 9, respectively.
Below each image an equivalent version with pathology regions highlighted in green is shown.

5.2.3.4 Assessment of task realism

Table 5.8 presents the percentage of cases marked as at least ‘adequate’ or at least ‘good’ by each individual observer,
together with the average over all three observers, for the realism criteria used in the reader study. For questions 1, 2
and 3 (Q1, Q2 and Q3), the average percentages of cases marked at least ‘adequate’ were 100%, 92% and 96%, respectively.
The average percentages of cases marked at least ‘good’ were 59%, 54% and 65% for Q1, Q2 and Q3, respectively. The mean
value for the three quality criteria was at least ‘adequate’ for 96% of the cases and at least ‘good’ for 50% of the cases.

For readers 1, 2 and 3 respectively, 92%, 96% and 100% of all the scores given to the images were at least ‘adequate’:
they detected only minor unrealistic elements that did not affect the general realism of the models. Moreover, 67%,
52% and 59% of all scores were at least ‘good’ for readers 1, 2 and 3 respectively. A moderate agreement was found
between the readers with Intra-Class Correlation Coefficient (ICC) of 0.5 (95% CI = 0.10-0.84).

Radiologists 1, 2 and 3 detected 78%, 89% and 89% of all the Covid-19 pathologies, respectively. Case 6 was missed
by all three readers and Case 5 was missed by reader 1. Although reader 3 had marked an area of opacities in this lung,
he was uncertain about its presence. These two cases, in fact, represent the more subtle simulated pathologies. Further
investigation of Case 5 revealed that this patient tested negative for Covid-19 in three PCR tests; it therefore represents
a rare case where CT and PCR arrived at different conclusions. This highlights the difficulty in radiological practice,
where lesions with spatial distributions and HU distributions typical of Covid-19 can in fact belong to
negative cases.

Table 5.8: Percentage of cases rated by radiologists as at least adequate and at least good for each of the realism criteria in the
images. Q1: Realism of lung background, Q2: Realism of lesion (appearance), Q3: Realism of lesion (position).

                 Percentage of cases marked at least adequate    Percentage of cases marked at least good
Reader           Q1      Q2      Q3      Mean Q1-3               Q1      Q2      Q3      Mean Q1-3
Radiologist 1    100%    88%     88%     88%                     88%     50%     63%     63%
Radiologist 2    100%    89%     100%    100%                    0%      67%     89%     56%
Radiologist 3    100%    100%    100%    100%                    89%     44%     44%     33%
Average          100%    92%     96%     96%                     59%     54%     65%     50%

The AI algorithm was able to identify consolidation areas in 56% of the cases. The algorithm successfully identified
pathologic areas in cases 3, 4, 7, 8 and 9. For the rest of the cases no lesions were detected. Figure 5.19 (a-e) shows
simulated images with the pathology highlighted in green and below the corresponding output from the AI software
for cases 3, 4, 7, 8 and 9 respectively. The heat map displayed from the AI software output represents the likelihood
of the presence of the lesion as detected by the software. The abnormality scores reported by the AI were 90%, 93%,
77%, 92%, and 92% respectively for cases 3, 4, 7, 8 and 9.

a) Case 3. Abnormality score 90% b) Case 4. Abnormality score 93% c) Case 7. Abnormality score 77%
d) Case 8. Abnormality score 92% e) Case 9. Abnormality score 92%
Figure 5.19: Simulated images for cases 3 (a), 4 (b), 7 (c), 8 (d) and 9 (e). The upper part shows the image with the pathology
highlighted in green and the lower part the corresponding results from the AI software. The heat maps in the bottom images
represent the likelihood that the detected lesion is suggestive of Covid-19.

5.2.3.5 BMI and pathology modifications

Figure 5.20 presents the images from the RAF phantom with different body types, namely female (a), overweight
male (b) and obese male (c). The same pathology, taken from case 4, was used for all these images (Figure 5.18d). As
observed, the visualization of the pathology is affected by the increased size of the phantom and by the presence of
breasts. These images were analysed by the AI Covid-19 detection software and the corresponding output is shown
below each case. The different body types had a clear influence on the detection of the pathology, as can be seen in
the heat maps of the AI software. The abnormality scores were 88%, 78% and 56% for the female, overweight male
and obese male phantoms respectively, compared to 93% obtained for the same case in the standard BMI male.

Figure 5.21 shows the X-ray images of two versions of case 8: (a) represents the original pathology and (b) a
modified version where lower attenuation in the lungs was simulated to represent a more subtle case. These images
were uploaded to the AI software for analysis; the respective results are displayed below each simulated image. Heat
map intensity falls for the more subtle case, with the abnormality score going from 92% in the original model to 77%
in the modified subtler version.

a) Female. Abnormality score 88% b) Overweight male. Abnormality score 78% c) Obese male. Abnormality score 56%
Figure 5.20: Images of the RAF phantom (with the pathology of case 4) with different body types: (a) female (BMI=29), (b) male
overweight (BMI=29), (c) male obese (BMI=40). Below each image the respective output from the AI software is displayed.

5.2.4 Discussion

In the second part of this chapter, we have established a methodology to develop computational models of Covid-19
patients. For the models presented in this work, it was sufficient to segment the pathology and the lungs of each patient
instead of having to segment all the organs present in the thorax: this reduced the overall modelling time. Depending
on the complexity of the pathology and for an experienced developer, segmentation took between 2 and 6 hours while
3D modelling ranged from 3 to 6 hours. Generation of the input files and images took from 4 to 5 hours, which was
mostly computational time. The average human input needed to create one model was about 9 hours. Having a solid
simulation framework and expertise with the creation of computational models played an important role in terms of
development time. Tools to automatically segment the pathology from CT images are improving in accuracy and
availability, and this could improve the simulation process by making the creation of the models faster. However, this
work represents a proof of concept of the creation of models of Covid-19 patients. If many more models were required for
a VCT or any other study, then the use of the latest segmentation tools may have to be explored [180,181].

a) Original model Case 8. Abnormality score 92% b) Modified model Case 8. Abnormality score 73%

Figure 5.21: Example of different stages of the disease for Case 8: (a) the original model and (b) a more subtle stage with less
attenuation in the lungs. The corresponding AI output is shown below.

The changes in the ratio of pathology volume to lung volume between the patient and the final model of the phantom
remained below 16%. This was considered acceptable because the shape of the pathology was largely preserved; note
that 16% represents a deviation of less than one voxel in each direction for the model selected. The differences can be
ascribed to the surface smoothing used to eliminate artifacts from CT, the discretization error introduced by the
marching cubes algorithm and by the conservative nature of the voxelization algorithm[182]. The Hausdorff distance,
used to compare the meshes of the segmented pathology and the model adapted to the phantom, had mean values
below 0.64 cm. These differences can be considered satisfactory and were expected since they are attributed to the
mesh modification applied when fitting to the lungs of the phantom and the smoothing to eliminate the staircase effect
from the CT.
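
The mesh comparison described above can be illustrated with a minimal sketch. It assumes that each mesh is reduced to a list of vertex coordinates in cm; the function names and the sample points are hypothetical, and the brute-force search is only practical for small point sets. It is not the tool used in this work.

```python
import math

def closest_distances(source, target):
    """Distance from every point in `source` to its nearest point in `target`."""
    return [min(math.dist(p, q) for q in target) for p in source]

def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance: the largest closest-point distance in either direction."""
    return max(max(closest_distances(a, b)), max(closest_distances(b, a)))

def mean_surface_distance(a, b):
    """Mean of the closest-point distances taken in both directions."""
    d = closest_distances(a, b) + closest_distances(b, a)
    return sum(d) / len(d)

# Hypothetical vertices (cm) of a segmented lesion and of the lesion adapted to the phantom
segmented = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
adapted = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.0), (0.0, 1.2, 0.0)]

print(hausdorff_distance(segmented, adapted))
print(mean_surface_distance(segmented, adapted))
```

For real meshes with many vertices, a spatial index (for example a k-d tree) would replace the nested loops.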

In order to demonstrate the flexibility of the modelling methodology, the lesion from Case 9 was created by modifying
the mesh of Case 4. This allowed the creation of an additional model without the need of additional segmentation, but
by modifying the shape and size of a pre-existing mesh. When performing this procedure, reported findings from
Covid-19 pneumonia and the progression of the disease over time should be verified to ensure correct modelling. Case
9 was included in the validation dataset for the reading study and was classified as at least ‘good’ by the three readers.

Overweight and obese versions of the male phantom and a female version were developed to extend the range of
patient types simulated. An example of the effect of BMI and body shape on the pathology from Case 4 was
demonstrated in Figure 5.20 using the AI software, where it appears that increasing the BMI reduced the abnormality
score. Another illustration of the potential scope of the method and the models was the realization of different grades
of severity of lung involvement. An example was presented (Figure 5.21) in which the attenuation in the simulated
pathology of Case 8 was modified, resulting in a more subtle case. This modification was reflected in the abnormality
score reported by the AI software, which decreased from 92% to 77%. These data suggest that a greater range of BMI
values and disease severities can be modelled if required for the VCT.

One of the potential uses of our simulation platform, in combination with the Covid-19 models, is to assess the
influence of X-ray acquisition parameters on the visibility of the models. For example, different tube voltages, X-ray
spectral filtering, dose levels and the presence or absence of antiscatter methods could be investigated by generating
images with these characteristics. The models could also be exploited as a means of evaluating new or improved
X-ray detector performance by applying the measured characteristics of a new detector, with or without a (new) grid.
There is also the potential to investigate antiscatter grids with different design parameters, for example using a higher
ratio [183].

The realism of the models was assessed by a reader study in which three thorax radiologists classified the images of
the phantoms including the lesions. The mean realism score (mean value of the three scoring criteria) was
at least 3 (i.e., ‘adequate’) for 100% of the cases for readers 2 and 3, while for reader 1 this value dropped to
92%. For reader 1, the mean score below 3 was given to Case 6, which in fact corresponded to a missed lesion by this
reader. On the other hand, 67%, 52% and 59% of the cases had mean scores above or equal to 4 (i.e., ‘good’) for
readers 1, 2 and 3 respectively. The fact that the readers missed some of the pathologies is considered acceptable as
these cases had very subtle lesions and could have been missed on CXR. This is consistent with the modelling of
realistic pathologies, but also shows the need for optimization of planar X-ray imaging for the detection of Covid-19
if high sensitivity is required, for example if CXR would be used for triage. Although the percentage of cases with
mean scores 3 or above (‘adequate’) is similar for the three readers (92% to 100%), the ICC showed only a moderate
agreement between the readers. This is due to the differences in the scores of the single cases, given by the readers’
subjectivity and interpretation for this type of study. Additional validation was performed using the Lunit AI algorithm
for the detection of Covid-19 disease in chest X-rays. The algorithm could be applied to the same phantom images
and was able to identify 56% of the cases.

A major advantage of using models for VCT type studies is that the ground truth of the models is always known: the
exact position and attenuation of the pathology is known. The VCT approach can be used for several types of studies.
(1) It was shown that at least the current AI algorithm could be tested for sensitivity in our cases. (2) New technologies
could also be investigated, for example dual energy imaging, a promising technique with good potential for this type
of application as bone structures can be subtracted from the images, providing a clearer view of the lung field. The
3D models can also be used in VCTs carried out using other imaging modalities such as CT and to compare modalities.
Case 5 in particular, highlights the potential of VCTs to study sensitivity and specificity of imaging devices.

Current limitations of the modelling include the lack of anatomical variation in the thorax phantom except for the
different versions representing thicker patients. Additionally, the method involves several manual steps, some of them
requiring a steep learning curve for less experienced developers. An example is the adaptation of the pathology inside
the existing phantom, which can be time consuming and requires manual intervention. One way to speed up the
creation of the models is to use automatic segmentation software to identify the pathology in the CT
images [179,184].

A study using a similar methodology has recently been published by Abadi et al. [185]. In their paper, additional
structures within the underlying lung parenchyma were simulated by enlarging the size of the pulmonary lobules to
represent crazy paving regions. This can also be implemented in the RAF model if required; however, crazy paving
regions are barely visible in plain radiography, which is the current focus for the models created in the present work.
For the more common manifestations of the disease like GGO and consolidations, Abadi et al. combined fluid with
the texture in secondary pulmonary lobules to match the mean linear attenuation coefficient measured for the
segmented abnormalities. In this work we simulated the variation in X-ray attenuation within a diseased region by
using the HU histogram obtained in the segmented pathology.

5.3 General Conclusions

A library of 24 anthropomorphic chest phantoms was created, featuring different body types and a range of clinical
tasks commonly found in chest exams plus a set dedicated to Covid-19 disease. The realism of the models was assessed
by readers who classified simulated radiographic images of the phantoms. Additionally, the images were uploaded for
analysis to an AI software which served as extra validation.

The methodology used to create the phantoms made use of polygonal mesh modelling, which provided great flexibility
for modelling different lesions and devices and to modify the anatomy of the phantoms. The library of phantoms can
be extended by modelling variations or by adding new types of tasks. The results of this phantom and task development
were put into practice in the second part of the chapter, where tissue models segmented from CT datasets of patients
were integrated within the original anatomy of the RAF phantoms, to create nine realistic models of the Covid-19
disease associated pathologies.

Although these models form the basis for optimization studies of CXR they could also be used in other imaging
applications, including CT. To the author's knowledge, there are no other phantom libraries that depict this range of
clinical tasks for chest radiography.

The models created in the first part of the Chapter represent an essential step in the design of a Virtual Clinical Trial.
The models can be used as input in the previously described simulation platform. The simulated x-ray images can then
be used in a reading study to evaluate task detection under different exposure settings.

Chapter 6
Virtual clinical trial in chest radiography

This final chapter presents a Virtual Clinical Trial in which diagnostic performance in chest radiography is studied
with respect to different acquisition parameters. The VCT applies the tools described previously in the thesis, namely
the simulation framework, whose development and validation were described in Chapter 4, and the anthropomorphic
phantom library including a range of clinical tasks described in Chapter 5. Images of 11 phantoms including clinical
tasks were generated under different exposure conditions, characterized by their tube voltage, antiscatter method and
dose level. Clinical image processing was applied to the simulated images and they were subsequently scored via a
free-response observer study by four readers. The statistical analysis of the results was done using the jackknife-
alternative free-response receiver operating characteristic (JAFROC) method. Additionally, organ dose calculations
were performed for the different settings.

6.1 Methods

6.1.1 Anthropomorphic computational models

The study included 11 phantoms with clinical tasks as described in the first part of Chapter 5. The phantoms were
voxelized using two resolutions, one of 0.5x0.5x0.5 mm3 used in the ray tracing simulations and one of
1.0x1.0x1.0 mm3 used for the Monte Carlo (MC) simulations. More detail on this is given in the following section.
Table 6.1 contains a summary of the number of images generated for each phantom body type and the number of
images containing the simulated tasks.

6.1.2 Image simulation

Phantom images were generated using the simulation framework described in Chapter 4. The same methodology was
applied for this study but with some minor differences in the simulation procedure, related to the local scatter variation
in a thorax phantom and to the simulated dose levels.

6.1.2.1 Scatter images

As described in Chapter 4, the scattered radiation was simulated with Monte Carlo methods using
PENELOPE/penEasy [84] [85]. Since this step requires long computational times, some measures were taken to reduce them.
Figure 6.1 a) shows an example of the image of the scattered radiation generated by one of the thorax phantoms. It is
obvious from this figure that scatter images contain limited anatomical information, thus it was assumed the presence
of the clinical tasks would make a negligible contribution to the total scattered radiation. Assuming this allows one
scattered radiation image to be used for all phantom models with the same BMI. Four phantoms were then used in the
MC simulations to generate the scattered radiation fields, corresponding to the four body types: female, standard male,
overweight male and obese male. A resolution of 1.0x1.0x1.0 mm3 was used in the voxelization of the phantoms to
be used in Monte Carlo simulations. This is larger than the voxels chosen for the ray-tracing images but had only a
small influence on the volume of organs and was considered sufficient for simulating scattered radiation.

The tallyPixelImageDetector was used to generate the images. This tally allows photons to be filtered according to their
trajectory before arriving at the detector, into primary (unscattered), single Compton scatter, single Rayleigh
scatter, multiple-scatter or secondary photons. In the original implementation of PENELOPE, all these different images
cannot be obtained in a single run, and consequently, the penEasy source code was modified to replicate the
tallyPixelImageDetector, which allowed images to be obtained corresponding to all of these interaction types in a
single simulation. For the scatter images, pixels of 5x5 mm2 were used, as described in Chapter 4.

All MC simulations were run using the resources and services from the Genius Tier-2 Cluster from the Flemish
Supercomputer Centre (VSC for its name in Dutch). The jobs were submitted to the Skylake thin nodes, with 2 Xeon
Gold 6140 CPUs@2.3 GHz (Skylake), 18 cores each and 192 GB RAM.

Given that penEasy does not implement parallel computing, the jobs were replicated with different random seeds and
submitted using the following script:

#PBS -l nodes=1:ppn=1
#PBS -l walltime=72:00:00
#PBS -l pmem=1gb
#PBS -A default_project
#PBS -t 0-11
cd $VSC_DATA/MALE_PHANTOM/M_80kVp
cd ${PBS_ARRAYID}_80/
penEasy.x < penEasy_grid.in > penEasy.out

The first five lines specify how many nodes and cores will be used for a single job, the run time, the memory
occupied by the job, the project ID and lastly an array ID used to identify the folders of the jobs to be launched in
parallel. The remaining lines locate the directories with the input files and finally the program penEasy.x is executed.

In order to simulate the effect of the moving grid, parallel runs were performed for each tube voltage and phantom in which
five shift sequences of 21.5 µm (lead strip thickness) were applied to the grid volume. The five shift sequences were
applied to cover the interspace thickness. The simulations were stopped when the total statistical uncertainty was below
3% in the total scatter images. For the grid-in configurations, 12 parallel jobs were submitted; the resulting images
were then combined automatically with a script developed in Python, according to the type of interaction filter used.
The total uncertainty in the combined images was calculated following the method described by Badal and Sempau
[186]. The total simulation time to obtain the images for a single configuration was approximately 72 hours, depending
on the phantom used. For the simulations without grid, 2 to 3 parallel jobs were sufficient to achieve the same
uncertainty values in the same amount of time.
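
As an illustration of how images from independently seeded jobs can be merged, the sketch below pools per-pixel means weighted by the number of histories and propagates the per-pixel uncertainties assuming statistically independent runs. This is a generic pooling rule and not necessarily the exact formulation of [186]; the function name, the data layout (flat lists per image) and the numbers are assumptions made for the example.

```python
import math

def combine_runs(runs):
    """Pool independent Monte Carlo runs.

    runs: list of (histories, mean_image, sigma_image) tuples, where the images
    are flat lists of per-history means and their absolute uncertainties.
    Returns the history-weighted mean image and its propagated uncertainty.
    """
    total_histories = sum(n for n, _, _ in runs)
    n_pixels = len(runs[0][1])
    mean = [sum(n * m[i] for n, m, _ in runs) / total_histories
            for i in range(n_pixels)]
    sigma = [math.sqrt(sum((n * s[i]) ** 2 for n, _, s in runs)) / total_histories
             for i in range(n_pixels)]
    return mean, sigma

# Two hypothetical jobs with different random seeds (values are illustrative only)
job_a = (100, [10.0, 20.0], [1.0, 2.0])
job_b = (100, [12.0, 20.0], [1.0, 2.0])

mean, sigma = combine_runs([job_a, job_b])
relative = [s / m for s, m in zip(sigma, mean)]
print(mean, sigma, relative)
```

With equal numbers of histories, combining two runs reduces the absolute uncertainty by a factor of the square root of two, which is why adding parallel jobs shortens the wall-clock time needed to reach a given uncertainty.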

Table 6.1: Image dataset characteristics.

Characteristics                          Number of images    Percentage
Simulated images                         296
Number of patients                       11
Sex                  Female              81                  24%
                     Male                215                 76%
BMI                  Standard            165                 56%
                     Overweight          83                  28%
                     Obese               48                  16%
Clinical task type   Nodules             161                 54%
                     Catheter            99                  33%
                     Fractures           162                 55%
                     Pneumothorax        106                 36%
                     Pleural effusion    56                  18%
Image resolution     0.139 x 0.139 mm2

6.1.2.2 Primary images

As described in Chapter 4, the primary projections were obtained using a ray tracing algorithm to calculate the
attenuation of radiation through the object and from flood images (no object) calculated in MC for calibration
purposes. A new step was applied in cases where the antiscatter grid was simulated. For these cases, the primary image
should take into account the primary transmission of the grid (Tp), i.e., the fraction of primary photons
that pass through the grid and reach the detector. For the homogeneous PMMA phantom, a single Tp value
was applied to all the images. In the chest, the situation is different as the scatter conditions can vary considerably
across the image and this has consequences for the grid Tp and total transmission (Tt) values. Figure 6.1 b) and c)
show simulated images of the RAF chest phantom in which the pixel intensity corresponds to the Tp and Tt values
respectively; the values are indicated by the colour bar in the image. These types of images were generated for each
of the exposure settings and applied to the primary images (formed by the ray tracing scaled by the MC flood image).
They were generated from the same MC run used for the scatter images using the modifications made to the penEasy
software. One of the PixelImageDetector tallies was set to filter the unscattered or primary radiation. In this case, the
pixel size was set to 1.0 x 1.0 mm2 in accordance with the phantom resolution since the primary images depict fine
details. The Tp images were calculated by dividing the primary image with grid in by the primary image with grid
out; the equivalent was done for Tt using the total images (scatter + primary) for grid in and grid out. The resulting
Tp images were then interpolated to the dimensions of the real image (2560 x 3072 pixels, with a pixel size of
139 µm) and filtered to remove any remaining noise. It is worth noting that these MC primary images are only used to
account for the local transmission of the grid, after interpolation and filtering. Therefore, the simulation time used for
the scatter images is sufficient to generate these images.
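
The pixel-wise division that yields the Tp and Tt maps can be sketched as follows. Images are represented as nested lists for simplicity; the function names and the pixel values are hypothetical, and the interpolation and noise filtering steps mentioned above are omitted.

```python
def divide_images(numerator, denominator):
    """Pixel-wise ratio of two images of equal size; empty pixels map to 0."""
    return [[n / d if d > 0 else 0.0 for n, d in zip(row_n, row_d)]
            for row_n, row_d in zip(numerator, denominator)]

def add_images(a, b):
    """Pixel-wise sum of two images of equal size."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

# Hypothetical MC images (arbitrary units) for one exposure setting
primary_grid_in = [[60.0, 30.0]]
primary_grid_out = [[100.0, 50.0]]
scatter_grid_in = [[10.0, 5.0]]
scatter_grid_out = [[100.0, 100.0]]

# Tp: primary with grid in divided by primary with grid out
tp_map = divide_images(primary_grid_in, primary_grid_out)

# Tt: total (primary + scatter) with grid in divided by total with grid out
tt_map = divide_images(add_images(primary_grid_in, scatter_grid_in),
                       add_images(primary_grid_out, scatter_grid_out))

# Applying the local Tp to a ray-traced primary image (values illustrative)
ray_traced_primary = [[200.0, 80.0]]
primary_with_grid = [[p * t for p, t in zip(row_p, row_t)]
                     for row_p, row_t in zip(ray_traced_primary, tp_map)]
print(tp_map, tt_map, primary_with_grid)
```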

Figure 6.1: a) Scatter image of the standard male phantom obtained from Monte Carlo simulations, b) and c) Standard male
phantom image, where pixel intensity represents the grid primary (b) and total (c) transmission. These images were obtained by
dividing MC simulated images with grid in by images with grid out for the primary (b) and total radiation, i.e., primary+scatter (c).

The ray tracing algorithm was run in parallel on a Dell workstation (Core i9 9960X @ 4.4 GHz, 16 cores, 128 GB
RAM). For each phantom model including the tasks, the ray tracing images were generated for monoenergetic beams,
ranging from 0 to 140 keV in 5 keV bins. This generated a stack of 2D images which were then scaled by the flood
images generated in MC using the same energy bins to form a primary image at any given tube voltage within the
range, following the procedure described in Chapter 4, Section 4.2.1. Images of the 11 phantoms including clinical
tasks were generated; in this case the phantom resolution was set to 0.5x0.5x0.5 mm3. Hence, the primary images
contain the anatomical information and carry the fine details of the phantom. The ray tracing images
were generated with a pixel size of 0.5x0.5 mm2 and then interpolated to the real image dimensions, with a pixel
size of 0.139x0.139 mm2, before adding the real detector characteristics.
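
One plausible reading of this scaling step is a weighted sum over energy bins, sketched below. The details of Section 4.2.1 are not reproduced here, so the assumption that each monoenergetic ray-traced image is a transmission map and that the MC flood signal per energy bin acts as its weight is made only for illustration; all names and values are hypothetical.

```python
def polyenergetic_primary(transmission_stack, flood_signal_per_bin):
    """Weighted sum of monoenergetic transmission images.

    transmission_stack: one flat image per energy bin (ray-traced transmission).
    flood_signal_per_bin: flood (no object) detector signal per energy bin for
    the chosen tube voltage; bins above the tube voltage would carry zero weight.
    """
    n_pixels = len(transmission_stack[0])
    return [sum(w * image[i] for w, image in zip(flood_signal_per_bin, transmission_stack))
            for i in range(n_pixels)]

# Three hypothetical energy bins and a two-pixel image
transmission_stack = [[0.1, 0.5],   # lowest energy bin: strongest attenuation
                      [0.2, 0.6],
                      [0.3, 0.7]]   # highest energy bin
flood_signal_per_bin = [1.0, 2.0, 1.0]

primary = polyenergetic_primary(transmission_stack, flood_signal_per_bin)
print(primary)
```

Because the monoenergetic stack is computed once per phantom, a primary image for another tube voltage only requires a different set of flood weights.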

6.1.2.3 Adding sharpness and noise

Realistic imaging characteristics measured from a real CsI flat panel digital detector (FPD) were applied to the images.
The main steps in this part of the simulation are:

1. Scaling of the simulated images to the desired target detector air kerma (DAK)
2. Blurring of the image using the measured detector-specific Modulation Transfer Function (MTF)
3. Creation of a noise image relevant to the real noise level using the measured Normalized Noise Power
Spectrum (NNPS)

The target DAK values were set according to the automatic exposure control (AEC) programming, which aims for a
DAK of ~2.5 µGy at the lung region (see also Chapter 4). This was verified on experimental Posterior Anterior (PA)
projections of real patients and the Lungman phantom. Based on this reference value, three other target DAK levels
were used in the simulation, obtained by halving the reference AEC value once and twice and by doubling it. The dose
values used were the following: 0.62, 1.25, 2.5 and 5.0 µGy. To scale the hybrid images to the target DAK levels, the average PV was
calculated from two regions equivalent to the AEC sensing regions placed in the lung field of the phantom. The
correction factor calculated from this was applied to the remainder of the image. This step was different from that
used for the PMMA phantom where a region at the centre of the image was used to scale the image.
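
The scaling step can be sketched as below, under the simplifying assumption that the pixel value corresponding to the target DAK is already known from the detector response function. The ROI coordinates, the function names and the image values are hypothetical.

```python
REFERENCE_DAK_UGY = 2.5

# Reference value halved twice, halved once, unchanged and doubled;
# the thesis reports the lowest level rounded to 0.62 µGy
dak_levels = [REFERENCE_DAK_UGY * 2 ** k for k in (-2, -1, 0, 1)]

def roi_mean(image, roi):
    """Mean pixel value in a rectangular ROI given as (row0, row1, col0, col1)."""
    row0, row1, col0, col1 = roi
    values = [image[r][c] for r in range(row0, row1) for c in range(col0, col1)]
    return sum(values) / len(values)

def scale_to_target(image, aec_rois, target_pv):
    """Scale the whole image so that the mean PV over the AEC-like ROIs equals target_pv."""
    current_pv = sum(roi_mean(image, roi) for roi in aec_rois) / len(aec_rois)
    factor = target_pv / current_pv
    return [[value * factor for value in row] for row in image], factor

# Tiny hypothetical image with one ROI in each "lung field"
image = [[10.0, 20.0, 30.0, 40.0],
         [10.0, 20.0, 30.0, 40.0]]
aec_rois = [(0, 2, 0, 1), (0, 2, 3, 4)]

scaled, factor = scale_to_target(image, aec_rois, target_pv=50.0)
print(dak_levels, factor)
```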

The detector characterization was performed for each of the beam qualities used following the procedure described in
Chapter 4, Section 4.2.2 for the Carestream DRX detector. The detector response function, MTF and NNPS needed
for this were measured from 60 to 140 kVp in steps of 20 kVp with a 9 cm thick block of PMMA at the X-ray tube.

6.1.3 Organ dose calculations

Organ doses within the phantom were calculated by MC simulation. The tallyEnergyDeposition was used, which
scores the energy deposited in the different materials used in the simulation per number of histories (eV/hist). To
calculate the phantom organ doses, the energy deposited in the organs was converted from units of eV to units of Joule
and divided by the organ mass in kg. Following the recommendations from ICRP publication 116 [187] and ICRU
report 95 [188], the organ dose conversion coefficients (DCCs) were calculated by normalizing by the photon fluence.
The photon fluence, in units of photons/cm2, is defined as dN/dA, where dN is the number of photons that enter an
imaginary sphere of cross-sectional area dA. This normalization also removes the dependency on the number of histories
simulated. The photon fluence was calculated using the tallyParticleCurrentSpectrum for each tube
voltage studied. This tally scores the number of particles, per unit energy interval and per history, entering a detector
material. In the present case, the detector was a hypothetical material with a surface of 20x20 cm2 located at 100 cm
from the source. To avoid repeated counting of particles re-entering the detection material, the latter was defined as a
perfect absorber, thus all particles entering were absorbed. The organ DCCs were reported as a function of tube voltage
for each of the phantom types, i.e., female, standard male, overweight male and obese male.

A specific example for the calculation of organ doses for an AEC controlled PA exposure was also performed. First,
the Boone spectrum model was calibrated to reproduce the same air kerma [µGy/mAs] as the Carestream system for
the tube voltages studied. The fluence calculated from the calibrated spectrum [photons/cm2] was then multiplied by
the tube load (mAs) values specific to each kVp. These mAs values were taken from AEC controlled PA exposures
made on the Carestream system using the Lungman phantom, for grid in and grid out acquisitions. The organ doses
reported correspond to target DAK level of 2.5 µGy.
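
The unit conversions behind the DCCs and the AEC example can be sketched as follows. The sketch assumes that the spectrum tally output is integrated over energy by summing bin values times the bin width; all function names and numerical values are hypothetical and serve only to show the chain of conversions.

```python
import math

EV_TO_JOULE = 1.602176634e-19  # exact SI definition of the electronvolt

def dose_per_history(energy_ev_per_history, organ_mass_kg):
    """Absorbed dose per simulated history, in Gy/history."""
    return energy_ev_per_history * EV_TO_JOULE / organ_mass_kg

def fluence_per_history(spectrum_per_kev, bin_width_kev, detector_area_cm2):
    """Photon fluence per history (photons/cm2) from a per-keV, per-history spectrum."""
    return sum(spectrum_per_kev) * bin_width_kev / detector_area_cm2

def dose_conversion_coefficient(dose_gy_per_history, fluence_cm2_per_history):
    """Organ dose per unit photon fluence, in Gy cm2; independent of the number of histories."""
    return dose_gy_per_history / fluence_cm2_per_history

def organ_dose(dcc_gy_cm2, fluence_per_mas, tube_load_mas):
    """Organ dose (Gy) for an exposure with a given tube load."""
    return dcc_gy_cm2 * fluence_per_mas * tube_load_mas

# Illustrative numbers only
dose_hist = dose_per_history(energy_ev_per_history=1000.0, organ_mass_kg=1.0)
fluence_hist = fluence_per_history([0.001, 0.002], bin_width_kev=5.0,
                                   detector_area_cm2=20.0 * 20.0)
dcc = dose_conversion_coefficient(dose_hist, fluence_hist)
dose = organ_dose(dcc, fluence_per_mas=1.0e6, tube_load_mas=2.0)
print(dcc, dose)
```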

6.1.4 Image dataset

The virtual phantom was placed in a Posterior Anterior position with respect to the detector, equivalent to a thorax PA
exam. Three acquisition parameters were varied in the images: antiscatter grid use, tube voltage and dose. For the
grid, two options were evaluated: antiscatter grid in place (Gin) and antiscatter grid removed (Gout). Five tube voltage
values were used, ranging from 60 to 140 kVp in steps of 20 kVp, together with the four dose levels previously mentioned.

For the grid in and grid out techniques, the source-to-image receptor distance (SID) was 150 cm, corresponding with
the focus distance of the antiscatter grid. There was a gap of 5 cm between the exit of the phantom and the detector
input plane. The antiscatter grid was placed at 1 cm from the detector input plane. The simulated grid had a ratio of
15:1, cotton fibre interspace, 80 lines per cm and top and bottom carbon fibre covers.

The scatter rejection method was selected depending on tube voltage used, following good radiographic practice [4].
For example, at 60 kVp the grid in technique was not assessed because this setting is unlikely in clinical practice due
to long exposure times and associated patient motion blurring in the image. Thus, grid in technique was only used
from 80 kVp to 140 kVp. On the other hand, the grid out technique was only set for tube voltages from 60 to 100 kVp
and not at 120 and 140 kVp: at such high energies, the scattered radiation is more likely to arrive at the detector and this
would lead to a large amount of scatter in the image, while at lower energies scattered radiation is more likely to be
absorbed in the patient. Table 6.2 presents a summary of the exposure setting combinations used in the data set. The
dose levels are not specified because for each tube voltage and scatter condition, all dose levels were tested.
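
The selection rules above can be written out as a short enumeration. The variable and function names are hypothetical; the voltage ranges and dose levels are those stated in the text.

```python
TUBE_VOLTAGES_KVP = [60, 80, 100, 120, 140]
DAK_LEVELS_UGY = [0.62, 1.25, 2.5, 5.0]

def allowed_scatter_methods(kvp):
    """Grid in (Gin) from 80 to 140 kVp, grid out (Gout) from 60 to 100 kVp."""
    methods = []
    if kvp >= 80:
        methods.append("Gin")
    if kvp <= 100:
        methods.append("Gout")
    return methods

exposure_settings = [(kvp, method, dak)
                     for kvp in TUBE_VOLTAGES_KVP
                     for method in allowed_scatter_methods(kvp)
                     for dak in DAK_LEVELS_UGY]

for setting in exposure_settings[:4]:
    print(setting)
print(len(exposure_settings))
```

This gives seven voltage and scatter-rejection combinations, matching the seven columns of Table 6.2, each evaluated at the four dose levels.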

Table 6.2: Number of images generated for each phantom type for the combinations of tube voltage and scatter rejection method.

                   60 kVp    80 kVp          100 kVp         120 kVp    140 kVp
Phantom type       Gout      Gin     Gout    Gin     Gout    Gin        Gin
Female             9         12      12      12      12      12         12
Male standard      12        12      12      12      12      12         12
Male overweight    12        12      12      11      12      12         12
Male obese         -         8       8       8       8       8          8

6.1.5 Observer study using FROC

A total of 296 images were assessed by four radiology residents: one 5th year resident, one 4th year resident and two
in their 3rd year of residency. These readers were separate from the radiologists who had contributed to the phantom
development. The characteristics of the image dataset can be seen in Table 6.1. The reading was divided into five
sessions of randomly selected images, with 59 or 60 images in each session. The order in which the sessions
were read was different for all the readers. All readers scored all the images. Additionally, a short training session with
10 images was held at the beginning of the study, for the readers to get accustomed to the scoring system. The
reading of the images was performed under normal clinical viewing conditions in a reading room for thorax
radiography, with ambient light between 6 and 12 lux, on a Barco Coronis Uniti 12 MP diagnostic display.

The readers were told that images could contain zero, one or several clinical tasks and were informed about the type
of clinical tasks that could be expected. All images were interpreted using the free-response receiver operating
characteristic (FROC) paradigm. In this type of method, the observer does not have limitations regarding the number
of suspicious regions that can be marked up in the images (each called a ‘mark’). Additionally, the reader gives a
confidence level that a lesion or clinical finding is present in the marked location (called a ‘rating’).

Viewdex software [111] was used to display the images (see Figure 6.2). The observer could change the window/level
setting and the image zoom; no time limitations were imposed. When a task was localized in the image, the position
was recorded and marked with a cross. Then, two questions were activated in the task panel of the software, the first
was to select the type of finding suspected from a predefined list (i.e., lung nodules, catheters, rib fractures,
pneumothorax, and pleural effusion). The second question was to select the confidence level for the selected finding
using a four-point scale: 1= slightly confident that the lesion is present, 2 = somewhat confident that the lesion is
present, 3 = fairly confident that the lesion is present, 4 = completely confident that the lesion is present.

The third question was if the noise level was acceptable for diagnostic applications. The readers answered with ‘yes’
or ‘no’. This question was not localization dependent, i.e., had to be answered once for each image, with or without
marks.

There were some requirements for the task localization: for lung nodules, the readers were asked to click at the centre;
for catheters, the readers had to click at the catheter tip. For the case of multiple rib fractures, one localization was
required for each fracture region, while for the pneumothorax and pleural effusion the readers were asked to click
anywhere within the region. The reader sessions were carried out over the course of 7 to 10 days. The readers spent
between 8.1 and 11.7 hours reading all the images and a single session lasted 2 hours on average, not counting
pauses.

Figure 6.2: Example of the Viewdex scoring software used during the reader study.

6.1.6 Statistical analysis

The marks made by the readers were compared with the localization of the simulated tasks and were classified as false
positive (FP), true positive (TP), false negative (FN) and true negative (TN) results. To speed up the process, a script
was developed to be used in Fiji (ImageJ). The script used the localization in pixels of each mark and created a mark
in the image. All marks were then displayed over a masked image with the clinical task region highlighted and were
manually classified as TP or FP. The marks and ratings were formatted and used as input in the JAFROC software
(Version 4.2.1). The software performs the analysis of the data following the jackknife-alternative free-response
receiver operating characteristic method (JAFROC) [189,190].
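In the study the marks were classified manually with the help of a Fiji script. As a minimal sketch of how an equivalent automatic check could look, the Python example below labels each mark as TP or FP depending on whether it falls inside a binary mask of the clinical task region. The function name `classify_marks`, the mask layout and the toy coordinates are illustrative assumptions, not part of the original workflow.

```python
from typing import List, Tuple

Mark = Tuple[int, int]  # (row, column) of a reader mark, in pixels


def classify_marks(marks: List[Mark], task_mask: List[List[int]]) -> List[str]:
    """Label each mark 'TP' if it lies inside the task region, otherwise 'FP'."""
    n_rows = len(task_mask)
    n_cols = len(task_mask[0]) if n_rows else 0
    labels = []
    for row, col in marks:
        inside_image = 0 <= row < n_rows and 0 <= col < n_cols
        inside_task = inside_image and task_mask[row][col] == 1
        labels.append("TP" if inside_task else "FP")
    return labels


# Toy 5x5 mask with a 2x2 task region (1 = inside the simulated task).
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
labels = classify_marks([(1, 1), (2, 2), (4, 4), (0, 3)], mask)
print(labels)
```

Such a check would only reproduce the manual classification if the mask follows the same acceptance region the readers were instructed to click in (e.g., the nodule centre or the catheter tip).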

Two types of analyses were performed. In the first analysis, all 296 images were included as abnormal images,
comprising all types of clinical tasks and all types of non-lesion localizations (false positive). The figure of merit
(FOM) used for this analysis was JAFROC1. This represents the empirical probability that lesion rating (true positive)
exceeds highest rated non-lesion rating (false positives); comparisons are made between lesion ratings and highest
rated non-lesion rating on all images. This figure of merit is recommended only if there are a few or no normal cases
(i.e., a case with no clinical task).

In the second analysis, each clinical task was studied individually. The images were split into normal and abnormal
depending on the task. For example, images containing nodules were classified as abnormal and the rest as normal.
The ratings were thus corresponding only to the specific task, including the false positive cases. In this analysis the
FOM was JAFROC2, representing the empirical probability that lesion rating exceeds the highest rated non-lesion
rating on normal images (i.e., images without the specific task); comparisons are made between lesion ratings and the
highest rated non-lesion rating on normal images only. For each clinical task analysed, the statistical analysis was
performed in two parts in which the modalities being compared were, 1: tube voltage and grid use and 2: dose level
at the detector.
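The two figures of merit described above can be sketched as a Wilcoxon-type comparison between lesion ratings and highest-rated non-lesion ratings. The Python example below is a simplified illustration and not the JAFROC software itself: it assumes equal lesion weights, assigns minus infinity to unmarked lesions and to images without non-lesion marks, and counts ties as 0.5. The data layout (`lesion_ratings`, `nl_ratings`) is hypothetical.

```python
import math
from typing import Dict, List, Sequence

NEG_INF = -math.inf  # rating used for unmarked lesions and for images without non-lesion marks


def psi(nl: float, ll: float) -> float:
    """Wilcoxon kernel: 1 if the lesion rating exceeds the non-lesion rating, 0.5 for ties."""
    if ll > nl:
        return 1.0
    if ll == nl:
        return 0.5
    return 0.0


def afroc_fom(highest_nl: Sequence[float], lesion_ratings: Sequence[float]) -> float:
    """Empirical probability that a lesion rating exceeds a highest non-lesion rating."""
    total = sum(psi(nl, ll) for nl in highest_nl for ll in lesion_ratings)
    return total / (len(highest_nl) * len(lesion_ratings))


def highest_nl_rating(case: Dict[str, List[float]]) -> float:
    return max(case["nl_ratings"], default=NEG_INF)


# Toy data set: two abnormal images (with lesions) and two normal images.
cases = [
    {"lesion_ratings": [4, NEG_INF], "nl_ratings": [2]},  # one lesion found, one missed
    {"lesion_ratings": [3], "nl_ratings": []},
    {"lesion_ratings": [], "nl_ratings": [1, 3]},         # normal image with two FP marks
    {"lesion_ratings": [], "nl_ratings": []},             # normal image without marks
]
lesions = [r for c in cases for r in c["lesion_ratings"]]

# JAFROC1: highest non-lesion rating taken from all images.
jafroc1 = afroc_fom([highest_nl_rating(c) for c in cases], lesions)
# JAFROC2: highest non-lesion rating taken from normal images only.
normal = [c for c in cases if not c["lesion_ratings"]]
jafroc2 = afroc_fom([highest_nl_rating(c) for c in normal], lesions)
print(round(jafroc1, 3), round(jafroc2, 3))
```

The only difference between the two calls is the set of images that contributes non-lesion ratings, which mirrors the distinction drawn in the text between the first and the second analysis.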

The software performs significance testing using the ANOVA technique by Dorfman-Berbaum-Metz [191,192]. The
output of the software provides different analyses in which the readers or the cases are treated as fixed or random. In
this study we used random readers and fixed cases (referring to the images used), which means that the results apply
to the population of readers and to the cases used in this study.

Observer averaged figures of merit, 95% confidence intervals (CI) and the inter-modality differences (i.e., 1- tube
voltage and grid use and 2- dose level) were generated in the analysis. The relevant statistics are the F-statistics and
associated p-values.

6.2 Results

6.2.1 Simulated images

Figure 6.3 shows the output images of the different steps of the image simulation process for the standard male
phantom, 120 kVp and grid in. Image a) is the primary image generated from the ray tracing attenuation map after
scaling with the MC flood image. Images b) to d) represent scatter images obtained from MC for photons that
underwent a single Rayleigh interaction, a single Compton interaction and multiple scatter interactions, respectively.
The calibration bar present in the images (with units of eV/cm2/history) shows the contribution of scatter to
the simulated images. Compton scattering is the main scattering process at the simulated spectrum energy (see the PV
calibration bar at the bottom left of the images). The Compton image clearly contains less anatomical information due
to the larger angular deflection of the photons. Rayleigh scatter, on the other hand, leads to smaller angular deflections,
preserving more detail within the image, although there is still a clear loss of contrast. Images a) to d) were combined
to form the hybrid noise-free image. Image e) represents the hybrid image normalized to the target DAK
value and further blurred by the detector MTF, and f) is the total noise image formed from the detector NNPS. Images e)
and f) were added to form the final simulated image, which was later converted from DAK to PV using the response function
of the detector.
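The sequence of steps above can be sketched in a few lines of NumPy. This is an illustrative outline only and not the framework's implementation: the function name `simulate_image`, the normalization to the mean signal, the omission of pixel-area factors in the NNPS scaling and the linear response function (`gain`, `offset`) are all assumptions made for the example.

```python
import numpy as np


def simulate_image(primary, scatter_parts, target_dak, mtf_2d, nnps_2d, rng,
                   gain=1.0, offset=0.0):
    """Hybrid image -> DAK normalization -> MTF blur -> NNPS noise -> pixel values."""
    # 1) Hybrid noise-free image: primary plus all scatter components.
    hybrid = primary + sum(scatter_parts)
    # 2) Normalize so that the mean signal equals the target DAK (assumed rule).
    hybrid = hybrid * (target_dak / hybrid.mean())
    # 3) Blur with the detector MTF in the frequency domain.
    blurred = np.real(np.fft.ifft2(np.fft.fft2(hybrid) * mtf_2d))
    # 4) Correlated noise: white Gaussian noise shaped by sqrt(NNPS), scaled by the signal.
    #    Pixel-area and array-size normalization factors are omitted in this sketch.
    white = rng.standard_normal(hybrid.shape)
    relative_noise = np.real(np.fft.ifft2(np.fft.fft2(white) * np.sqrt(nnps_2d)))
    noise = relative_noise * target_dak
    # 5) Add noise and convert DAK to PV with an assumed linear response function.
    return gain * (blurred + noise) + offset


# Toy inputs (all values hypothetical).
rng = np.random.default_rng(0)
shape = (64, 64)
primary = np.full(shape, 1.0)
primary[16:48, 16:48] = 0.5                        # a more attenuating region
scatter_parts = [np.full(shape, 0.05),             # single Rayleigh
                 np.full(shape, 0.30),             # single Compton
                 np.full(shape, 0.20)]             # multiple scatter
fx = np.fft.fftfreq(shape[0])
f2 = fx[:, None] ** 2 + fx[None, :] ** 2
mtf = np.exp(-f2 / (2 * 0.1 ** 2))                 # Gaussian MTF, equal to 1 at zero frequency
nnps = np.full(shape, 1e-4)                        # flat (white) NNPS

image = simulate_image(primary, scatter_parts, 2.5, mtf, nnps, rng)
print(image.shape, float(image.mean()))
```

Because the MTF equals 1 at zero frequency, the blurring step leaves the mean signal unchanged, so the normalization to the target DAK survives step 3.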

Examples of the final simulated images converted to PV are displayed in Figure 6.4. Figure 6.4 a) and b) correspond
to 80 and 140 kVp respectively, with the same DAK and grid. As can be seen, the bones in the image at 80 kVp are
more noticeable than at 140 kVp, as expected due to higher attenuation for lower beam energies. Figures 6.4 c) and d)
represent an enlarged section of the phantom images, generated at the same tube voltage but different dose levels,
representing a DAK of 0.62 µGy and 5.0 µGy respectively. Figures 6.4 e) and f) correspond to the same tube voltage
and dose level but with grid in and grid out. A reduction of contrast can be noticed in the image with no grid.

Figure 6.3: Images obtained from the different steps of the simulation procedure. The images correspond to the standard male
phantom and 120 kVp, a) primary image b) single Rayleigh interaction scatter, c) single Compton scatter, d) Multiple scatter image,
e) Hybrid image blurred by the detector MTF and f) enlarged section of total noise image formed from the detector NNPS at
2.5 µGy.

Figure 6.4: Final simulated images converted to PV at different exposure parameters for comparison, a) and b) correspond
respectively to 80 and 140 kVp with same dose and grid in. c) and d) correspond to an enlarged section of images obtained at the
same tube voltage, grid in and dose levels of 0.62 and 5.0 µGy respectively. e) and f) correspond to images at 100 kVp, 5 µGy with
grid in and grid out respectively.

6.2.2 Organ doses

Figure 6.5 displays the DCCs for the organs in the four phantoms representing the different body types as a function
of tube voltage. As can be seen, the DCCs generally increase with tube voltage in all cases. It should
be noted that the tally scores the energy deposited in each material with a given label. For example, the cartilage
includes the cartilage in between the vertebral discs and also the costal cartilage, that is, the cartilage that connects the
sternum to the ribs. The same applies to the bronchi and tracheal wall, which are reported together, and the bones,
which include the ribs and the spine. The thoracic wall refers to all the tissue surrounding the phantom. Organs closer
to the beam have higher DCCs, with the maximum values reached by the bones and the lowest values by the breast in the case
of the female phantom, for the PA projection. For the male phantoms, the lowest DCCs were obtained for the cartilage,
which includes the costal cartilage as previously mentioned and is farther from the beam in the PA projection.

The DCCs for the female and standard BMI male phantoms are very similar, which was expected since the phantoms
have the same internal organs, except for the breast and thoracic wall. This similarity is also seen for the overweight
and obese male phantoms, which have the same internal organs. It can also be seen that the DCCs for the thoracic
wall increase for the obese phantom compared to the others, as expected, due to the greater thickness of the phantom.
The values of the graphs are listed in Table 6.3.

Figure 6.5: Dose conversion coefficients: organs absorbed dose per photon fluence (pGy∙cm2) for all phantom types as a function
of tube voltage for PA examination.

An example of the use of these DCCs is shown in Figures 6.6 and 6.7. In this example the DCCs were used to calculate
organ doses for PA examinations with mAs values delivered by the AEC from the Carestream system for a target
DAK of 2.5 µGy. The mAs values delivered by the AEC for each tube voltage were used to take into account grid in
and grid out in the calculation i.e., the mAs was higher for grid in.
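Since the DCCs are expressed as absorbed dose per unit photon fluence, the calculation in this example reduces to a product of the DCC, the photon fluence per mAs and the delivered mAs. The Python sketch below illustrates the arithmetic. The DCCs are the lung values of the standard male phantom from Table 6.3 and the mAs values are those quoted in the text, but the fluence per mAs values are placeholders invented for the example and should be replaced by the output of the actual X-ray system.

```python
PGY_TO_UGY = 1e-6  # 1 pGy = 1e-6 µGy


def organ_dose_ugy(dcc_pgy_cm2: float, fluence_per_mas: float, mas: float) -> float:
    """Organ dose in µGy from a DCC [pGy·cm2], a fluence per mAs [photons/cm2/mAs] and the mAs."""
    fluence = fluence_per_mas * mas           # photons per cm2
    return dcc_pgy_cm2 * fluence * PGY_TO_UGY


# Lung DCCs for the standard male phantom (Table 6.3) and AEC-delivered mAs values.
settings = {
    120: {"dcc": 1.05e-01, "mas": 0.9, "fluence_per_mas": 1.0e6},  # fluence: placeholder
    140: {"dcc": 1.10e-01, "mas": 0.8, "fluence_per_mas": 1.3e6},  # fluence: placeholder
}

doses = {
    kvp: organ_dose_ugy(s["dcc"], s["fluence_per_mas"], s["mas"])
    for kvp, s in settings.items()
}
for kvp, dose in doses.items():
    print(f"{kvp} kVp: lung dose = {dose:.4f} µGy")
```

With these placeholder fluences the dose at 140 kVp comes out higher than at 120 kVp, because the small reduction in mAs does not compensate for the increase in fluence per mAs and in DCC.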

The organ doses obtained for the four phantoms for grid in examinations are shown in Figure 6.6, for tube voltages
between 80 kVp and 140 kVp. As can be seen, organ doses increase as tube voltage decreases, due to lower
transmission and the consequently increased mAs needed to reach the target detector DAK. This
trend is seen up to 120 kVp. At 140 kVp a slight increase in the organ doses is observed. The explanation is that,
once the tube load has dropped to about 1 mAs, the further reduction in tube load when tube voltage is increased is very limited. For example, at 120 kVp
the system uses 0.9 mAs and at 140 kVp it uses 0.8 mAs. These are values delivered by the (real) Carestream system
and they constitute a limit arising from the machine function. Since the mAs does not decrease proportionally to the
increase in the photon fluence, we obtain a slight increase of the organ doses at 140 kVp. This behaviour is seen for
the standard BMI and overweight phantoms. For the obese phantom this behaviour is not seen because the mAs values
at 120 kVp and 140 kVp are still above 1 mAs; thus the dose continues to decrease at 140 kVp. As
expected, the organ doses follow the same behaviour as the DCCs, with the bones having the highest dose and organs
farther from the beam, like breast and cartilage, having the lowest doses. Additionally, it can be seen that the doses for
the thicker phantoms increase with respect to the thinner phantoms.

For grid out examinations (Figure 6.7), the organ doses were calculated for 60, 80 and 100 kVp only, in line with the
exposure settings used in the current study. As expected, for a constant DAK, the doses to the organs are smaller for
grid out and they decrease with the increase of kVp.

Table 6.3: Organ DCCs (pGy∙cm2) for all phantom types as a function of tube voltage for PA examination.

Dose conversion coefficients [pGy∙cm2] FEMALE


kV Thoracic wall Diaphragm Bones Cartilage Lungs Heart Pulmonary vessels Bronchi + trachea Breast
60 1.03E-01 5.79E-02 3.88E-01 3.26E-02 9.49E-02 4.31E-02 9.38E-02 1.07E-01 5.90E-03
80 1.01E-01 6.78E-02 3.94E-01 4.02E-02 1.03E-01 5.43E-02 1.03E-01 1.19E-01 8.09E-03
100 1.05E-01 7.68E-02 4.02E-01 4.64E-02 1.11E-01 6.35E-02 1.13E-01 1.31E-01 1.00E-02
120 1.09E-01 8.38E-02 4.04E-01 5.10E-02 1.18E-01 7.03E-02 1.20E-01 1.40E-01 1.16E-02
140 1.15E-01 9.00E-02 4.07E-01 5.50E-02 1.25E-01 7.60E-02 1.27E-01 1.49E-01 1.30E-02
Dose conversion coefficients [pGy∙cm2] MALE STANDARD

kV Thoracic wall Diaphragm Bones Cartilage Lungs Heart Pulmonary vessels Bronchi + trachea n/a
60 1.07E-01 3.18E-02 3.45E-01 2.79E-02 8.27E-02 3.75E-02 8.25E-02 9.40E-02
80 1.04E-01 3.75E-02 3.53E-01 3.44E-02 9.04E-02 4.77E-02 9.16E-02 1.06E-01
100 1.08E-01 4.26E-02 3.61E-01 3.98E-02 9.84E-02 5.60E-02 1.00E-01 1.17E-01
120 1.13E-01 4.66E-02 3.64E-01 4.39E-02 1.05E-01 6.21E-02 1.07E-01 1.25E-01
140 1.18E-01 4.98E-02 3.65E-01 4.71E-02 1.10E-01 6.69E-02 1.13E-01 1.33E-01
Dose conversion coefficients [pGy∙cm2] MALE OVERWEIGHT
kV Thoracic wall Diaphragm Bones Cartilage Lungs Heart Pulmonary vessels Bronchi + trachea n/a
60 1.27E-01 3.93E-02 3.80E-01 2.37E-02 9.22E-02 4.30E-02 1.26E-01 9.62E-02
80 1.16E-01 3.61E-02 3.74E-01 2.55E-02 1.02E-01 5.46E-02 1.40E-01 1.07E-01
100 1.21E-01 4.15E-02 3.85E-01 2.96E-02 1.11E-01 6.49E-02 1.54E-01 1.19E-01
120 1.26E-01 4.57E-02 3.90E-01 3.27E-02 1.19E-01 7.27E-02 1.65E-01 1.27E-01
140 1.31E-01 4.91E-02 3.92E-01 3.52E-02 1.25E-01 7.87E-02 1.74E-01 1.35E-01

Dose conversion coefficients [pGy∙cm2] MALE OBESE


kV Thoracic wall Diaphragm Bones Cartilage Lungs Heart Pulmonary vessels Bronchi + trachea n/a
60 1.76E-01 2.19E-02 2.87E-01 1.63E-02 7.27E-02 3.40E-02 9.98E-02 7.70E-02
80 1.72E-01 2.80E-02 3.15E-01 2.15E-02 8.53E-02 4.66E-02 1.18E-01 9.11E-02
100 1.80E-01 3.33E-02 3.34E-01 2.57E-02 9.62E-02 5.68E-02 1.33E-01 1.03E-01
120 1.88E-01 3.73E-02 3.45E-01 2.88E-02 1.05E-01 6.45E-02 1.45E-01 1.13E-01
140 1.96E-01 4.05E-02 3.51E-01 3.12E-02 1.11E-01 7.04E-02 1.55E-01 1.20E-01

Figure 6.6: Organ doses calculated using the DCCs for the male (standard BMI), female, overweight and obese male phantoms as
a function of tube voltage for grid in examinations.

Figure 6.7: Organ doses calculated using the DCCs for the male (standard BMI), female and overweight male phantoms for grid
out examinations for tube voltages of 60, 80 and 100 kVp.

6.2.3 Reading study

Table 6.4 shows the average true positive fraction (TPF) for each of the tasks separately. The results are shown for each reader and averaged
over all modalities for the specific task. For the nodules, the TPF ranges from 51% to 66% with an average over all
readers of 57%. The rib fracture TPF ranges from 42% to 82% with an average of 65%. The pleural effusion TPF ranges
from 57% to 98% with an average of 83%. The TPF for the pneumothorax was the highest, at approximately 100%.
The catheter was similarly visualized in 100% of the cases, as expected; however, the catheter tip was
not located as accurately, as will be seen in the following results.

Table 6.4: Average true positive mark fractions for all readers and their average for each of the clinical tasks studied.

Clinical task      Reader 1   Reader 2   Reader 3   Reader 4   Average
Nodules            66%        58%        54%        51%        57%
Fractures          68%        82%        67%        42%        65%
Pneumothorax       99%        100%       99%        100%       100%
Pleural effusion   57%        98%        95%        80%        83%
Catheter           100%       100%       100%       100%       100%

Figure 6.8 shows the TPF as a function of dose for each reader (columns) and the readers' average (line) for the different
analyses. No graphs are presented for the pneumothorax or catheter since the TPF was at or very close to 100% in most of
the cases. As expected from the values in Table 6.4, there are visible differences between the readers' TPFs.
Reader 4 generally has the lowest TPF, except for the pleural effusion. Reader 1, on the other hand, had the highest
TPF for the detection of nodules and the lowest for the detection of pleural effusion. Reader 2 and reader 3 showed similar
TPF for the nodules. Reader 2 had the highest TPF for the fractures for all dose levels. In general, for the detection of
all the tasks, readers 1-3 had similar performance, while reader 4 had the lowest performance of all the readers. The
readers' performance was investigated further by analysing the false positive fraction (FPF), calculated as the number
of false positive marks divided by the number of images.
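The two fractions used in this section follow directly from mark counts. The short Python sketch below shows the calculation for a single task and dose level; the reader counts, the number of lesions and the number of images are hypothetical values chosen for illustration.

```python
def true_positive_fraction(lesions_marked: int, lesions_total: int) -> float:
    """Lesions correctly marked divided by the total number of lesions."""
    return lesions_marked / lesions_total


def false_positive_fraction(fp_marks: int, n_images: int) -> float:
    """False positive marks divided by the number of images read."""
    return fp_marks / n_images


# Hypothetical counts for one clinical task at one dose level.
LESIONS_TOTAL = 50
N_IMAGES = 60
reader_counts = {
    "Reader 1": {"lesions_marked": 33, "fp_marks": 6},
    "Reader 2": {"lesions_marked": 29, "fp_marks": 3},
}

tpf = {r: true_positive_fraction(c["lesions_marked"], LESIONS_TOTAL)
       for r, c in reader_counts.items()}
fpf = {r: false_positive_fraction(c["fp_marks"], N_IMAGES)
       for r, c in reader_counts.items()}
mean_tpf = sum(tpf.values()) / len(tpf)
print(tpf, fpf, mean_tpf)
```

Note that the FPF defined this way is a number of false marks per image, so it is not bounded by 1 when a reader places several false marks on the same image.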

Figure 6.9 shows the FPF as a function of dose for each reader for the different clinical tasks, except for the catheter
since, as expected, there were no FP marks. All readers marked FP nodules in the images. Reader 2
had the highest FPF for the fractures; interestingly, this reader also had the highest TPF for the fractures among
all readers. For the pneumothorax, only reader 4 had false positive marks in the images, and together with reader 1 had
the highest number of FP marks for the pleural effusion. The highest FPF values were obtained for the
pleural effusion tasks.

Figure 6.8: True positive fraction (lesions correctly marked/total lesions) as a function of dose for all clinical tasks, nodules, pleural
effusion and fractures. The results are shown per reader (column) and for the reader’s average (line).

Figure 6.9: False positive fraction (false positive marks/total number of images) as a function of dose for the nodules, fractures,
pneumothorax and pleural effusion lesions. The results are shown per reader.

6.2.3.1 JAFROC analysis – dose modalities

Table 6.5 shows the figure of merit for the JAFROC analysis (AFROC area under the curve) for all clinical tasks
considered together and for nodules, rib fractures, pneumothorax and pleural effusion separately. The modalities
considered in this part of the study were the four dose levels used in the simulations. The overall significance of the
different analyses (i.e., the different clinical tasks analysed together or individually) is reported with the degrees of
freedom, the F-statistic and the p-value. A p-value below 0.05 means that at least one pair of the modalities
compared (in this case different dose levels) is significantly different.

The areas under the alternative free-response receiver operating characteristic (AFROC) curves have similar values across dose levels.
For all tasks together, the FOM ranged from 0.736 to 0.774. Looking at the tasks individually, the AUC for
nodules ranged from 0.763 to 0.789, for fractures from 0.774 to 0.841, for pleural effusion from 0.829 to 0.858 and
for pneumothorax from 0.955 to 0.997, which was expected due to the high TPF obtained. The lowest FOM was
obtained at 0.62 µGy in all analyses except for the pleural effusion tasks. For the analysis of the fractures and
pneumothorax considered separately, p-values of 0.0005 and 0.0008 were obtained with F=16.78 and F=14.64
respectively, meaning that there were significant differences in at least one pair of modalities. This can be
further examined in the results shown in Table 6.5.

Table 6.5: Figure of merit for the JAFROC analysis (AFROC area) for the dose levels shown for all clinical tasks considered
together and for the nodules, rib fractures, pneumothorax, and pleural effusion separately. The numerator and the denominator of
the degrees of freedom, the F-statistics and the p-value are also presented for each analysis. A p-value smaller than 0.05 means that
at least one pair of modalities compared in the test are significantly different. The * indicates significant difference.

FoM (AFROC area) - Dose Levels

Dose level                   All tasks   Nodules   Fractures   Pneumothorax   Pleural effusion
0.62 µGy                     0.736       0.763     0.774       0.955          0.854
1.25 µGy                     0.774       0.789     0.830       0.997          0.858
2.5 µGy                      0.765       0.779     0.841       0.992          0.829
5.0 µGy                      0.768       0.778     0.835       0.985          0.832
Degrees of freedom
(numerator, denominator)     (3, 9)      (3, 12)   (3, 9)      (3, 9)         (3, 9)
F-statistic                  2.93        2.41      16.78       14.64          0.36
p-value                      0.092       0.118     0.0005*     0.0008*        0.8

Table 6.6 shows the comparison of the FOMs for the different dose levels for all tasks analysed together and nodules,
rib fractures, pneumothorax and pleural effusion separately. A p-value smaller than 0.05 and a 95% confidence interval
(CI) that does not include zero mean a significant difference between the two modalities being compared. Modalities
considered significantly different are indicated with an asterisk symbol and with italic font.
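The decision rule stated above (p-value below 0.05 and a 95% CI that excludes zero) can be written as a small helper. The Python sketch below applies it to the p-values and confidence intervals reported for the "All tasks" analysis in Table 6.6; the function name `is_significant` is an illustrative choice.

```python
from typing import Tuple


def is_significant(p_value: float, ci: Tuple[float, float], alpha: float = 0.05) -> bool:
    """True when p < alpha and the confidence interval does not include zero."""
    lower, upper = ci
    excludes_zero = lower > 0 or upper < 0
    return p_value < alpha and excludes_zero


# "All tasks" column of Table 6.6: (p-value, 95% CI of the FOM difference).
all_tasks = {
    "0.62-1.25": (0.0235, (-0.0686, -0.0063)),
    "0.62-2.5": (0.0672, (-0.0598, 0.0025)),
    "0.62-5.0": (0.0484, (-0.0625, -0.0002)),
    "1.25-2.5": (0.536, (-0.0222, 0.0400)),
    "1.25-5.0": (0.6692, (-0.0250, 0.0372)),
    "2.5-5.0": (0.8445, (-0.0339, 0.0283)),
}

significant_pairs = [pair for pair, (p, ci) in all_tasks.items() if is_significant(p, ci)]
print(significant_pairs)
```

A negative interval for a pair such as 0.62 µGy versus 1.25 µGy indicates that the first modality in the pair has the lower FOM.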

Table 6.6: Comparison of different dose levels investigated for all clinical tasks considered together and for the nodules, rib fractures, pneumothorax, and pleural effusion separately.
A p-value smaller than 0.05 and a CI that does not include zero mean a significant difference between the two modalities being compared. Modalities considered not equal are
signalled with * and with italic font.

Combined Dose Levels

Dose levels compared   All tasks (p-value, 95% CI)   Nodules (p-value, 95% CI)   Fractures (p-value, 95% CI)   Pneumothorax (p-value, 95% CI)   Pleural Effusion (p-value, 95% CI)

0.62µGy-1.25µGy 0.0235 (-0.0686,-0.0063)* 0.0341 (-0.0498,-0.0024)* 0.0006 (-0.0795,-0.031)* 0.0002 (-0.0580,-0.0264)* 0.9124 (-0.0829,0.0750)

0.62µGy-2.5µGy 0.0672 (-0.0598,0.0025) 0.1794 (-0.0389,0.0084) 0.0001 (-0.0906,-0.0424)* 0.0005 (-0.0528,-0.0212)* 0.4867 (-0.0536,0.1043)

0.62µGy-5.0µGy 0.0484 (-0.0625,-0.0002)* 0.2 (-0.0381,0.0092) 0.0003 (-0.0850,-0.036)* 0.0021 (-0.0455,-0.0140)* 0.5504 (-0.0573,0.1006)

1.25µGy-2.5µGy 0.536 (-0.0222,0.0400) 0.3252 (-0.0127,0.0345) 0.3245 (-0.0351,0.0129) 0.4746 (-0.0105,0.021) 0.4235 (-0.0497,0.1082)

1.25µGy-5.0µGy 0.6692 (-0.0250,0.0372) 0.2944 (-0.0120,0.0353) 0.6181 (-0.0295,0.0185) 0.1083 (-0.0033,0.0282) 0.4819 (-0.0533,0.1046)

2.5µGy-5.0µGy 0.8445 (-0.0339,0.0283) 0.9436 (-0.0229,0.0244) 0.6117 (-0.0184,0.0296) 0.327 (-0.0085,0.0230) 0.9188 (-0.0826,0.0753)

As observed, some modality pairs showed significant differences in all analyses except the pleural
effusion analysis. In these analyses, the modality performing significantly lower was 0.62 µGy. For the clinical
tasks considered together, 0.62 µGy showed significant differences with 1.25 and 5.0 µGy. For the nodules,
differences were found between 0.62 µGy and 1.25 µGy. For the rib fractures and pneumothorax analyses, significant
differences were found between 0.62 µGy and the rest of the dose levels. Thus, there is a decrease in detectability when
using a dose level of 0.62 µGy, while above 1.25 µGy the information content does not improve further. We can also
see that the overall p-value for all clinical tasks and for the nodules was above 0.05 although some modality
pairs still showed significant differences; this is due to sampling effects of the methods and is considered normal [193].

6.2.3.2 JAFROC analysis – tube voltage/grid use modalities

Table 6.7 shows the figure of merit for the JAFROC analysis (AFROC area under the curve) for all clinical tasks
considered together, for the nodules and rib fractures separately and pneumothorax and pleural effusion together. For
these analyses, the modalities compared were a combination of tube voltages and grid use. The seven modalities
compared were 60 kVp (grid out), 80 kVp (grid in and grid out), 100 kVp (grid in and grid out), 120 kVp (grid in) and 140 kVp (grid in).
The overall significance of each of the four analyses (i.e., all clinical tasks, nodules, rib fractures and pneumothorax
together with pleural effusion) is also shown with the degrees of freedom, F-statistic and p-value.

Since the pneumothorax and pleural effusion are the least prevalent clinical tasks, they were analysed together. This
was not the case for the dose levels since only four modalities were compared, while here we are comparing
seven modalities; thus a larger number of abnormal images was necessary.

The analyses for the nodules (p= 0.0019), fractures (p= 0.0022), and pneumothorax + pleural effusion (p= 0.0008)
yielded p-values below 0.05, which means at least one pair of modalities compared is significantly different.

The FOM values presented in Table 6.7 and their 95% CI are displayed in Figure 6.10. Some trends can be
identified from the plotted data. For the nodule analysis, there is a general trend towards higher FOM at higher tube
voltage, while the opposite is observed for the rib fracture analysis, with decreasing FOM at higher tube voltage. For
the pneumothorax and pleural effusion, higher FOM values were obtained for grid out techniques. For all clinical tasks
there is no marked trend, but it can be seen that the FOM is generally lower at 60 kVp.

Table 6.8 displays the comparison of the modalities formed by tube voltage and grid use combinations. The p-value
and 95% CI are listed for all tasks analysed together, nodules and rib fractures separately and pneumothorax and
pleural effusion together. A p-value smaller than 0.05 and a 95% confidence interval (CI) that does not include zero
mean a significant difference between the two modalities being compared. Modalities considered not equal are
indicated with an asterisk symbol and with italic font.

For the analysis of all the tasks, the overall p-value was above 0.05, but three modality pairs showed
significant differences. They all involve 60 kVp with grid out, which performed significantly poorer than 80 kVp (grid
in and grid out) and 100 kVp grid out. For the nodules, 60 kVp with grid out had a significantly lower FOM than all the
other modalities. For the rib fracture analysis, 80 kVp with grid in or grid out showed significantly better AUC
than modalities like 60 kVp-grid out and 140 kVp-grid in. It is surprising that 60 kVp is seen to have poorer
performance for the fractures when a higher value might be expected from the trends seen for the other combinations.
It is possible that this is a result of the small number of images present in the study given that this setting was not used
for the obese phantom, causing a statistical imbalance in the analysis. In clinical practice such a low kVp would have
led to unacceptably long exposure times in an obese patient; therefore, it was not included in the image dataset. For the
pneumothorax and pleural effusion analysis, the 60 kVp modality performed significantly poorer than the other
combinations, followed by 100 kVp with grid in, which performed significantly lower than 80 and 100 kVp grid out.

Table 6.7: Figure of merit for the JAFROC analysis (AFROC area) for the tube voltage and grid use combination. The analyses
included are all clinical tasks considered together, nodules and rib fractures separately, and pneumothorax and pleural effusion together. The
numerator and the denominator of the degrees of freedom, the F-statistics and the p-value are also presented for each analysis. A
p-value smaller than 0.05 means that at least one pair of modalities compared in the test are significantly different. The * indicates
significant difference.

FoM (AFROC area) - kV and Grid In / Grid Out

kV and grid use compared     All tasks   Nodules   Fractures   Pneumothorax + Pleural Eff
60kV_go                      0.714       0.701     0.757       0.842
80kV_gi                      0.765       0.771     0.853       0.908
80kV_go                      0.773       0.764     0.858       0.916
100kV_gi                     0.741       0.772     0.803       0.882
100kV_go                     0.771       0.792     0.825       0.930
120kV_gi                     0.756       0.789     0.804       0.910
140kV_gi                     0.748       0.796     0.779       0.918
Degrees of freedom
(numerator, denominator)     (6, 18)     (6, 18)   (6, 18)     (6, 18)
F-statistic                  1.69        5.67      5.49        6.60
p-value                      0.1797      0.0019*   0.0022*     0.0008*

Figure 6.10: FOM for the JAFROC analysis (AFROC AUC and 95% CI) for all clinical tasks considered together, nodules and rib
fractures separately and pneumothorax with pleural effusion. Modalities compared were a combination of tube voltage and grid
use. The FOM values of the graphs are listed in Table 6.7.

Table 6.8: Comparison of different combinations of tube voltage and grid use investigated. The analyses included are all clinical tasks considered together, nodules and rib fractures separately,
and pneumothorax and pleural effusion together. A p-value smaller than 0.05 and a CI that does not include zero mean a significant difference between the two modalities being
compared. Modalities considered not equal are signalled with * and with italic font.

Combined kV and Grid In / Grid Out

kV and grid use compared   All lesions (p-value, 95% CI)   Nodules (p-value, 95% CI)   Fractures (p-value, 95% CI)   Pneumothorax + Pleural Eff (p-value, 95% CI)

100kV_gi - 100kV_go 0.2022 (-0.0769,0.0174) 0.3053 (-0.0607,0.02013) 0.5002 (-0.0876,0.0431) 0.008 (-0.08323,-0.0144)*

100kV_gi - 120kV_gi 0.5267 (-0.0617,0.0327) 0.3802 (-0.0577,0.0231) 0.9646 (-0.0668,0.0639) 0.1045 (-0.06239,0.0064)

100kV_gi - 140kV_gi 0.7799 (-0.0535,0.0408) 0.2269 (-0.0645,0.0163) 0.4707 (-0.0415,0.0892) 0.04 (-0.07064,-0.0018)*

100kV_gi - 60kV_go 0.243 (-0.0200,0.0743) 0.0017 (0.0302,0.1111)* 0.1737 (-0.0202,0.1105) 0.0282 (0.00466,0.0734)*

100kV_gi - 80kV_gi 0.3032 (-0.0710,0.0233) 0.9894 (-0.0401,0.0407) 0.1277 (-0.1159,0.0148) 0.1233 (-0.0608,0.0079)

100kV_gi - 80kV_go 0.1705 (-0.0792,0.0151) 0.7093 (-0.0331,0.0477) 0.0942 (-0.1210,0.0097) 0.0469 (-0.0693,-0.0005)*

100kV_go - 120kV_gi 0.5064 (-0.0319,0.0624) 0.8781 (-0.0374,0.0434) 0.5287 (-0.0445,0.0861) 0.2194 (-0.0135,0.0552)

100kV_go - 140kV_gi 0.3122 (-0.0238,0.0705) 0.8467 (-0.0442,0.0366) 0.1647 (-0.0192,0.1114) 0.452 (-0.0218,0.0469)

100kV_go - 60kV_go 0.0209 (0.0096,0.1040)* 0.0002 (0.0505,0.1314)* 0.0436 (0.0019,0.1327)* 0 (0.0534,0.1222)*

100kV_go - 80kV_gi 0.795 (-0.0412,0.0531) 0.2994 (-0.0198,0.0610) 0.3915 (-0.0937,0.0370) 0.189 (-0.0120,0.0567)

100kV_go - 80kV_go 0.9183 (-0.0495,0.0448) 0.1688 (-0.0128,0.0680) 0.3127 (-0.0987,0.0319) 0.4074 (-0.0205,0.0482)

120kV_gi - 140kV_gi 0.7217 (-0.0390,0.0553) 0.7291 (-0.0472,0.0336) 0.4439 (-0.0401,0.0906) 0.6206 (-0.0426,0.0261)

120kV_gi - 60kV_go 0.0804 (-0.0055,0.08882) 0.0002 (0.0475,0.1284)* 0.1603 (-0.0188,0.1119) 0.0007 (0.0326,0.1014)*

120kV_gi - 80kV_gi 0.6836 (-0.0565,0.0378) 0.3733 (-0.0228,0.0580) 0.139 (-0.1145,0.0162) 0.9271 (-0.0328,0.0359)

120kV_gi - 80kV_go 0.4444 (-0.0647,0.0296) 0.2174 (-0.0158,0.0650) 0.103 (-0.1195,0.0111) 0.6765 (-0.0413,0.0274)

140kV_gi - 60kV_go 0.1533 (-0.0137,0.0806) 0.0001 (0.0543,0.13524)* 0.5191 (-0.0441,0.0866) 0.0002 (0.0409,0.1097)*

140kV_gi - 80kV_gi 0.4477 (-0.0646,0.0297) 0.2221 (-0.0161,0.0648) 0.0262 (-0.1398,-0.0090)* 0.5583 (-0.0246,0.0441)

140kV_gi - 80kV_go 0.2677 (-0.0729,0.0215) 0.1205 (-0.0090,0.0718) 0.0178 (-0.1448,-0.0141)* 0.9375 (-0.0331,0.0357)

60kV_go - 80kV_gi 0.036 (-0.0981,-0.0037)* 0.0018 (-0.1108,-0.03)* 0.0046 (-0.1610,-0.0303)* 0.0008 (-0.0999,-0.0311)*

60kV_go - 80kV_go 0.0168 (-0.1063,-0.0119)* 0.004 (-0.1038,-0.0229)* 0.0029 (-0.1661,-0.0353)* 0.0003 (-0.1084,-0.0396)*

80kV_gi - 80kV_go 0.7174 (-0.0554,0.0389) 0.7192 (-0.0334,0.0474) 0.8776 (-0.0704,0.0603) 0.6115 (-0.0428,0.0259)

6.2.3.3 Catheter localization vs dose

Figure 6.11 shows the influence of dose on localization accuracy of the catheter tip. The figure shows the distance
in millimetres from the readers’ marks to the actual location of the catheter tip as a function of dose. The distance
shown is along the vertical direction of the image, i.e., along the length of the catheter. A red solid symbol shows the
average distance at each dose level: 6.3, 2.0, 2.8 and 1.8 mm for dose levels of 0.62,
1.25, 2.5 and 5.0 µGy respectively.

The largest deviations from the true catheter location were seen at 0.62 µGy with a maximum distance of 20.1 mm
and mean distance of 6.3 mm from the catheter tip. Given that catheter placement did not lead to any false positive
marks in the images, the results of this task were not analysed using the JAFROC method. Despite the dose level
affecting the localization of the catheter tip, there were almost no differences in the average rating given to the
images acquired at the different dose levels. For 0.62 µGy the average rating was 3.97 while for the rest of the
dose levels all marks were given the maximum rating of 4.0.

Figure 6.11: Catheter tip localization as a function of dose level simulated in the image. The y-axis shows the distance in mm
from the reader’s mark in the image to the actual localization of the catheter tip. The values are shown for each reader
individually and for their average.

6.2.3.4 Noise level acceptability vs dose

Figure 6.12 shows the percentage of images scored as acceptable for diagnostic applications as a
function of the air kerma at the detector simulated in the images. The results are shown for all readers separately
and for the average of all readers (red curve). Readers 1, 3 and 4 classified less than 30% of the images simulated
at 0.62 µGy as suitable for diagnostic purposes. As expected, this percentage increases with dose and reaches a
plateau from 2.5 µGy onwards. On average, 27%, 66%, 79% and 82% of the images were considered suitable at
0.62, 1.25, 2.5 and 5.0 µGy respectively. It can also be seen that the curve for Reader 2 has a different shape
compared to the others; in this case the dose level did not strongly affect the percentage of images marked as
suitable for diagnosis. This can be explained by the subjective nature of this question, which depends on the
reader’s tolerance or intolerance to noise in the image. Note that this is the type of question typically used in
VGA analysis.

Figure 6.12: Percentage of images marked as acceptable for diagnostic applications as a function of dose level simulated in
the image. The values are shown for each reader individually and for their average.

6.3 Discussion

A set of 11 computational models including a range of clinical tasks were used as input in a simulation framework
for chest radiography to create synthetic images at different exposure settings. For the creation of the scattered
radiation distributions in the images, Monte Carlo simulations were employed. The scatter image was calculated
only once for the phantoms with the same body type, as the presence of the clinical tasks was assumed to have a
negligible effect on the scatter contribution. For the calculation of the scatter image, the phantom voxel resolution
was reduced to 1.0x1.0x1.0 mm3 and the detector pixel size to 5x5 mm2 [37]. This phantom resolution was also
used for the organ dose calculation. The reduction of phantom resolution did not cause significant changes in the
organ volumes. The average difference in the organ volumes between 1.0x1.0x1.0 mm3 and 0.5x0.5x0.5 mm3
voxel resolution was around 5%. These differences can be explained by the conservative nature of the algorithm,
which tries to preserve the thickness in small structures.

Monte Carlo simulations were used to study the grid primary and total transmission (Tp and Tt) across the image
and local scatter conditions in the thorax region were seen to influence these grid parameters. Thus, images with
Tp as pixel values were created for each tube voltage and phantom type and applied to the primary images to
account for the grid efficiency. This is more useful than applying a single grid Tp to the entire image, which is
the standard method when studying grid efficiency [147,194].
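
The difference between a position-dependent Tp image and a single Tp value can be sketched as follows. This is a minimal illustration using nested lists; the function name and all numerical values are hypothetical and do not reproduce the implementation used in the framework.

```python
def apply_grid_transmission(primary, tp_map):
    """Multiply each primary pixel by the local grid primary transmission."""
    if len(primary) != len(tp_map) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(primary, tp_map)
    ):
        raise ValueError("primary image and Tp map must have the same shape")
    return [
        [p * t for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(primary, tp_map)
    ]


# Hypothetical 2x2 primary image and Tp map (Tp varies across the image).
primary = [[100.0, 80.0], [60.0, 40.0]]
tp_map = [[0.70, 0.68], [0.66, 0.64]]

with_map = apply_grid_transmission(primary, tp_map)

# Conventional alternative: one Tp value applied to every pixel.
single_tp = 0.67
with_single = [[p * single_tp for p in row] for row in primary]
print(with_map, with_single)
```

With a single value, every pixel is attenuated by the same factor; with the map, regions of different local scatter conditions receive their own correction.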

Organ dose conversion coefficients (DCCs) were calculated for the four phantom types as a function of tube voltage.
As expected, the DCCs decreased with the energy in the spectrum and were higher for organs closer to the beam. The
DCCs for the male of standard BMI and the female phantom are very similar, which is expected given that they
have the same internal anatomy and the skin differs only in the presence of the breast. Bigger differences are
observed between the overweight and obese male phantoms, which can be explained by the thoracic wall being
much thicker in the obese phantom. This led to higher DCCs for the thoracic wall and lower DCCs for the
remaining organs of the obese phantom compared to the overweight phantom. Compared to the DCCs from
ICRP publication 116 [4], we found the same order of magnitude for the standard BMI phantom. However, any
direct comparison of values must be treated with caution. Different irradiation conditions were used, i.e., a parallel
monoenergetic beam in ICRP versus a cone beam and an X-ray spectrum in the current study. Furthermore, the
organs in RAF have slightly different shapes.

The expected organ doses for PA examinations on the Carestream system simulated in our study were estimated.
For this, we used the air kerma and mAs output from the Carestream system for AEC controlled exposures, i.e.,
real clinical exposure conditions. For a constant signal at the detector of 2.5 µGy, organ doses were found to
decrease with increasing tube voltage up to 120 kVp for all phantoms except the obese phantom. As expected,
the grid out technique showed considerable reductions in organ dose compared to grid in, with average
reductions of 75%.
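
The dose estimate described above can be sketched in a few lines. The sketch assumes that a DCC is expressed as organ dose per unit incident air kerma, which is an assumption made for illustration; the DCC and air kerma values are hypothetical and were chosen only to show how a relative reduction is obtained.

```python
def organ_dose(dcc, incident_air_kerma):
    """Organ dose as DCC (dose per unit air kerma) times the incident air kerma."""
    return dcc * incident_air_kerma


def percent_reduction(reference, value):
    """Relative reduction of value with respect to reference, in percent."""
    return 100.0 * (reference - value) / reference


# Hypothetical numbers: the same DCC with a lower incident air kerma for grid out,
# because less exposure is needed to reach the same detector signal without the grid.
dcc_lung = 0.5            # assumed, organ dose per unit incident air kerma
kerma_grid_in = 100.0     # assumed incident air kerma (µGy)
kerma_grid_out = 25.0     # assumed incident air kerma (µGy)

dose_in = organ_dose(dcc_lung, kerma_grid_in)
dose_out = organ_dose(dcc_lung, kerma_grid_out)
print(dose_in, dose_out, percent_reduction(dose_in, dose_out))
```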

A total of 296 images were scored by four radiology residents. Several types of JAFROC analyses were performed
to compare the effect of the modalities on the detection of the different tasks. The analysis was performed
for all clinical tasks together, and for each clinical task separately, except for the catheter. For the analysis of the
individual tasks, the images were classified as normal or abnormal depending on the task present. This is
considered valid for this type of study because all the images contained more than one clinical task and, in
most cases, different types of clinical tasks.

The results of the different analyses showed no significant difference between three of the dose levels used (i.e.,
1.25, 2.5 and 5.0 µGy). On the other hand, 0.62 µGy showed a significant difference with at least one other
modality, except in the analysis of the pleural effusion. This means that a 50% dose reduction, from 2.5 µGy to
1.25 µGy, can be achieved without affecting the detectability of the clinical tasks. Previous studies in pulmonary
nodule detection have found similar results in chest radiography, consistent with anatomical noise being the
dominant factor affecting the detection of nodules [12,13]. Studies with chest tomosynthesis have already shown
improved detectability of nodules compared to plain radiography [14,15].

The results in Figure 6.12 show a clear influence of the noise level on the readers’ perception when
classifying the images as acceptable or not for diagnostic applications. For three of the four readers, the noise
level had a large influence on the overall quality perception of the image, and on average 27%, 66%, 79% and
82% of the images were classified as suitable for diagnosis at 0.62, 1.25, 2.5 and 5.0 µGy, respectively.
The scoring by reader 2 is somewhat different from that of the other readers: the simulated dose level
did not drastically affect the percentage of images marked as suitable for diagnosis by this reader. This may be
related to the subjectivity involved in this type of scoring. Kroft et al. also found differences in reader
opinion when asked about the noise in patient images in which different dose levels were simulated [16].

This difference between readers was examined more closely to see whether it was reflected in the AFROC FOM
values. Figure 6.13 shows the FOM values for reader 1 and reader 2 for the analyses of the different clinical tasks
as a function of dose level. For reader 1, 0.62 µGy generally has the lowest FOM, while for reader 2 the FOM at
0.62 µGy is the lowest only in the rib fracture analysis. This highlights the complexity of the relationship between
the perceived dose, the overall quality of the image and decision making in chest radiography.

Figure 6.13: FOM values (AFROC AUC) for reader 1 and reader 2 for the dose modalities compared and for the different
analyses of the clinical tasks.

Turning to the catheter tip data, it was observed that tip localization accuracy only suffered at the lowest dose
level simulated, with markings up to 20 mm from the catheter tip. In fact, the distance between mark and tip was
lower at 1.25 µGy and then increased slightly at 2.5 µGy, which is the detector air kerma level generally used in
clinical practice. From this study we could conclude that the dose could be reduced by 50% without affecting the
catheter localization in the mediastinum region. This can be applied if the main reason to perform the X-ray
examination is to check the localization of this type of device, for example in intensive care patients. Similar
results for catheter detection in the mediastinum were found by Eisenhuber et al. in a study using CR detectors [17].

In general, the results suggest that a reduction of dose from 2.5 µGy to 1.25 µGy is possible without decreasing
the diagnostic performance. A study by Strotzer et al. [18] showed that a reduction of dose from 2.5 µGy to
1.8 µGy was possible without decreasing the overall image quality. In their study, images were scored according to
subjective image quality criteria linked to the visualization of anatomical structures. On the other hand, Metz et
al. [19] found that the dose could be reduced to 1.25 µGy only for follow up of lesions in the lung region, but not
for the mediastinum region.

In the second JAFROC analysis, the modalities compared were combinations of tube voltage and grid use, analysed
for all clinical tasks together, for nodules and rib fractures separately, and for pneumothorax together with pleural
effusion. Settings such as 100 kVp with grid out and 80 kVp with grid in or grid out had FOM values among the
highest in the all-tasks and nodule analyses. There were no significant differences compared with 120 kVp and
grid in (i.e., the setting commonly used in clinical protocols). These results have some relevance for portable chest
radiographs, especially the data for grid out techniques, since these types of examinations are usually acquired
without the antiscatter grid.

It was also observed for the analysis of the pneumothorax with the pleural effusion, that a grid out technique at
80 kVp and 100 kVp was beneficial in the detection of the tasks compared to grid in modalities.

For the rib fracture analysis, modalities with lower tube voltage had higher FOMs. Significant differences were
found between 80 kVp grid in and grid out when compared to 100 and 140 kVp with grid in. This was expected
due to the higher contrast of the bones at lower energies.

For the nodules, a general trend towards higher FOM at higher tube voltage was observed. This is generally
explained as ‘higher kV reduces the rib contrast, thus reducing the anatomical noise that can affect the detection
of lung nodules’ (personal communication with staff members in the hospital). Metz et al. [19] found a similar
trend in the detection of lesions over the pulmonary areas when increasing tube voltages. However, except for the
60 kVp and grid out setting, the comparison between the rest of the modalities did not show statistically significant
differences.

In Chapter 1, a standard experimental approach was applied for technique optimization in a test object consisting
of a PMMA block with inserts of different materials. The results of that study showed the highest FOM for lower
tube voltages, which differs from the results found in the present work. This suggests there are differences
between studies with physical test objects and image quality metrics like SDNR, and detectability studies using
anthropomorphic phantoms in which images are scored by radiologists. There are also other types of studies, in
which patient or phantom images are scored following subjective image quality criteria of anatomical features in
the image [18,20–22]. However, these studies lack a clear link to the clinical task to be answered, which is not
ideal when studying the influence of different exposure parameters. It is also important to consider that in free
response studies such as the one presented here, the results are also influenced by the false positive lesions marked
on the image.

Metz et al. published a study in which a physical anthropomorphic phantom and receiver operating characteristic
analysis [19] were used to study the influence of tube voltage and dose on diagnostic performance. An important
difference with our study is that they used physical phantoms with limited anatomical noise, especially
in the lung region. In addition, only one body size was used, which is often the case when physical phantoms are
used. This highlights the main advantage of the computational phantoms used in this study, a result of the
flexibility provided by the mesh modelling.

There are a number of limitations associated with the work presented here. First, the sessions were scored within
a short period of time; ideally, several days or a week between sessions would have decreased the learning
effect of the readers. This was due to practical limitations at our institution and could not be avoided.

The fact that all images were abnormal, i.e., contained pathology of some sort, could also be noted as a limitation.
The observers were not told this upfront. Because of time constraints, the number of images was kept as low as
possible and therefore images of the phantom at different acquisition settings without clinical tasks were not
shown to the observers. Even though the readers were blinded to the fact that all images were abnormal, it is likely
they noticed that every image contained some kind of clinical task after reading the first sessions. However,
because all the images contained more than one clinical task, the data could be analysed by separating the images
into normal and abnormal according to a specific task. Readers were also blinded to the fact that there was more
than one task in all images.

A further factor influencing the results of the study is that the readers were not experienced in reading studies with
phantoms. This was perhaps one of the reasons that they did not make full use of the confidence scale, leading to
very similar FOM values in some of the cases. Differences in reader scoring were also observed, as shown in the
TPF and FPF graphs (Figure 6.8 and Figure 6.9). Overall, readers 1-3 performed similarly while reader 4
had lower scores. Another limiting factor could be that the tasks may have been too obvious, although the TPF in
Figure 6.8 shows that this was not the case, especially for nodules and rib fractures.

A surprising result was that, in general, there was no significant difference between the grid in and grid out
techniques in this study. Note that there is a small gap of 5 cm between patient exit and the detector for the geometry
used in this study. In projection radiography, the effect of scatter is a loss of contrast and one would consequently
expect a reduction in detectability; however, this was not found in this study. A number of factors may have
contributed to this, for example the readers’ lack of experience, leading them to not make full use of the
confidence scale, or a learning effect due to the small variability in the anatomy. Another important factor to
consider is the image processing, which has the potential to restore image contrast for grid out techniques. A study
by Neitzel et al. [23] found no significant differences in lesion detection when working with grid in or grid out;
they suggested that proper image processing can compensate for the effect of higher scatter in the image. However,
in their study a small air gap of 15 cm was used for the grid out technique. Another study by Moore et al. [24]
used a range of image quality criteria to score images with different tube voltages and scatter rejection methods
in a relative Visual Grading Analysis study. They found that tube voltages below 100 kVp and grid out provided
superior quality for average size patients.

In our case, further investigation is needed to establish whether imaging without the grid and without a true air
gap (i.e., just removing the grid) results in clinically acceptable images. Also relevant is a comparison of different
image processing techniques to evaluate how effective they are in enhancing the local contrast in the image when
the antiscatter grid is removed.

Because the analysis treated readers as random and cases as fixed, the results presented generalize to the
reader population but apply only to the image dataset presented in this study. It was decided to use this
type of analysis given that the images were from phantoms and not real patients. Although clinical testing will
always be required for validation, the results obtained with these anthropomorphic phantoms including clinical
tasks can be used as a basis for further analysis, constituting an important first step towards optimization studies.

To our knowledge, this is the first VCT in the field of chest radiography to include a large range of clinical tasks.
With the introduction of new technologies like dual energy techniques [195] or dual energy detectors [45], one
could see how this type of framework could contribute to the improvement and evaluation of this imaging
technique. The work presented sets the groundwork to explore optimal acquisition techniques or new detectors
for a wider range of applications, for example new detectors with small pixel size [196] and different antiscatter
grids [183].

6.4 Practical conclusions

• A VCT study was used to study the effect of different exposure parameters in PA thorax examinations
using anthropomorphic phantoms including five clinical tasks: nodules, catheters, rib fractures,
pneumothorax and pleural effusion.
• In general, the results suggest that a reduction of dose from 2.5 µGy to 1.25 µGy is possible without
decreasing the diagnostic performance.
• For grid out techniques, 100 kVp and 80 kVp did not show significant differences in diagnostic
performance. Organ doses were on average 10% lower at 100 kVp compared to 80 kVp. More
investigation on this is necessary, in which AP projections are simulated and DCCs for this type of
irradiation are calculated to apply the results to bedside imaging.
• For a constant target DAK and grid in technique, 120 kVp gave the lowest organ doses with no significant
decrease in diagnostic performance. In terms of tube voltage, this justifies the choice made in practice.
• Further analysis of the effect of the antiscatter grid is necessary. A comparison of different image
processing techniques in the enhancement of contrast for grid out technique is relevant.

Conclusions and future work

Chest radiography (CXR) remains the core imaging examination for the chest despite the availability
of imaging techniques like chest tomosynthesis and Computed Tomography. CXR has many advantages, such as
relatively low cost, widespread availability, and reduced dose to the patient. Chest radiography images are used
in the management of numerous clinical problems, from pulmonary diseases and bone fractures to the
visualization of catheters, amongst many other tasks. Acquisition protocols used today have at their root those
used in screen film systems, and in many cases have not been optimized for the latest generation of digital flat
panel detectors (FPDs). In digital radiography, the optimization process should be referred to the clinical task to be
performed, which was the fundamental premise behind our project. As with all uses of radiation for medical
purposes, optimization should find a balance between the image quality necessary to perform a given imaging task
and the dose delivered to the patient [197].

Numerous research articles have demonstrated that the projected patient anatomy is the main factor limiting the
detection of lesions in a chest radiograph [121,164–168], hence the term ‘anatomical noise’ derived from their
work. Therefore, after the first technical studies have been performed, the next step is to include realistic
anatomical information in the images generated for diagnostic performance evaluation. Computer
simulations have been used in the field of chest radiography [17,33,34,42,152,198], with noteworthy results
obtained. However, in some cases the simulated images lack a realistic anatomical background.

This thesis has been devoted to the design and implementation of a simulation platform for CXR that can be used
to evaluate image quality and dose. The platform, together with a set of realistic computational models of patient
anatomy, was applied in a Virtual Clinical Trial (VCT) to study the diagnostic performance of different readers
as a function of different exposure parameters, for a range of common clinical CXR tasks. This chapter
summarizes and discusses the main findings and conclusions of the presented work.

A wide variety of methods exist to optimize imaging technique. These range from measurements of signal
difference to noise ratio (SDNR) and threshold contrast-detail detectability to Visual Grading Analysis (VGA)
studies, which evaluate the overall perceived impression of image quality, graded or rated via the visualization of
normal anatomical structures by the imaging system. The different methods were put into practice in the thesis,
with the conclusion that a more detailed platform, with a clear link to the clinical tasks performed, is needed.

From Chapter 1 and Chapter 3, we observed that the clinical protocols for thorax Posterior Anterior (PA)
examinations use a high tube voltage technique and an antiscatter grid, not only within UZ Leuven but also
at several other clinical sites in Flanders. These protocols were developed from initial experience with film screen
systems and then adapted over time as different generations of digital image receptors were adopted and integrated
within the radiology department, first Computed Radiography (CR) imaging and now CsI-based FPDs.

In Chapter 1, experimental measurements were performed for technique optimization using a figure of merit
(FOM) defined to account for detectability, in terms of SDNR, and a measure of patient exposure (lung dose).
The results from this study showed that, for systems set up to work at constant detector air kerma, a higher FOM
was obtained at lower tube voltage without antiscatter grid. This ‘optimal working point’ is not used in practice
(see the survey in Chapter 3). Physics test object measurements like this one are straightforward and have good
reproducibility. They provide valuable insight into the influence of the radiation quality on SDNR and can serve as
a basis for more comprehensive studies. We have interpreted this finding as follows: test objects made from details
in a homogeneous background may not be sufficiently representative of the actual thorax anatomy and might not
reflect or predict the results obtained in working practice, with ‘anatomical noise’ and a large dynamic range in
the thorax.

In search of a more realistic test object, the thorax anthropomorphic phantom Lungman was first validated for
dosimetric applications in chest radiography. When imaged using clinical settings for a thorax PA exam, the
phantom showed Exposure Index and Kerma Area Product values in line with those found in patients on the same
system. An important outcome of this study was the PMMA equivalence of the lung and mediastinum regions of
the phantom, which was used in other studies throughout the thesis. Additionally, a voxel model of the Lungman
phantom was created and used to compare the tissue equivalent materials of the phantom against ICRU defined
material composition.

The Lungman phantom was then used in Chapter 3 to evaluate image quality in a survey of chest radiography
including 22 digital X-ray systems. The images of three versions of the Lungman, in which chest plates were added
to simulate different thicknesses, were read by three radiologists in a VGA study. The image scoring was based
on the CEC image quality criteria, which focus on the visualization of different anatomical features in the chest
[110]. Additionally, a contrast detail test object (Leeds TO20 [199]) was used to make a technical evaluation
of the image quality of each system. Across the working dose range used for the systems in the study, poor
correlation was found between technical (contrast-detail) and clinical (VGA scoring) image quality.
Furthermore, systems with higher dose did not show the improved detectability, characterized by the
number of visible discs in the TO20, that would be expected for quantum noise dominated systems. The study
showed the need to optimize a number of systems in the chest X-ray departments included in the survey. In
summary, no correlation was found between the results obtained from the VGA study and the contrast detail test
object [200].

One of the main contributions of the thesis has been the development of a methodology to simulate synthetic
radiographic images, which was described and validated in Chapter 4. The simulation platform used a ray tracing
technique to generate noise free primary images and PENELOPE/penEasy Monte Carlo simulations [84,85] for
the generation of the scatter images. The simulation included the modelling of the antiscatter device. Noise and
sharpness characteristics were then added to the resulting hybrid image. These characteristics were measured from
a real CsI digital detector, which was fully characterized in terms of response function (relationship between PV
and detector air kerma), Modulation Transfer Function (MTF), characterizing the sharpness, and Normalized Noise
Power Spectrum (NNPS), characterizing the noise magnitude and texture. In contrast to other simulation
platforms that use the total NNPS to add the noise characteristics to the image [37,38], this work split the NNPS
into electronic, quantum and structure noise components, a method previously used for mammography
imaging [131]. For low dose simulations, the correct modelling of the electronic noise is crucial, given that this
noise source dominates when the X-ray signal at the detector is low. This can also happen in specific regions of
the image, where high X-ray attenuation means that the ratio of electronic noise to quantum noise can be
relatively high. This step, and the modelling of the antiscatter grid, sets this simulation framework apart from
others developed for chest radiography.
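
The role of electronic noise at low dose can be illustrated with a short sketch. It assumes a simple polynomial variance model for linearized images, in which the electronic noise variance is constant, the quantum noise variance scales with detector air kerma and the structure noise variance scales with its square. This model and the coefficient values are assumptions for illustration only; they are not the values measured for the detector in this work.

```python
def noise_components(air_kerma, electronic, quantum_coeff, structure_coeff):
    """Variance of each noise source at a given detector air kerma (assumed model)."""
    return {
        "electronic": electronic,
        "quantum": quantum_coeff * air_kerma,
        "structure": structure_coeff * air_kerma ** 2,
    }


def electronic_fraction(air_kerma, electronic, quantum_coeff, structure_coeff):
    """Share of the total variance that is due to electronic noise."""
    parts = noise_components(air_kerma, electronic, quantum_coeff, structure_coeff)
    return parts["electronic"] / sum(parts.values())


# Hypothetical coefficients in arbitrary units.
coeffs = {"electronic": 0.5, "quantum_coeff": 4.0, "structure_coeff": 0.01}

for kerma in (0.62, 1.25, 2.5, 5.0):  # detector air kerma levels in µGy
    print(kerma, round(electronic_fraction(kerma, **coeffs), 3))
```

Under this model the electronic share of the variance grows as the detector air kerma falls, which is why a total NNPS measured at one dose level cannot simply be rescaled to a much lower one.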

The validation of the simulation platform was carried out by implementing a model of a Carestream X-ray system
available in our hospital used for thorax imaging. Validation was first performed by comparing SDNR values
measured in real and simulated images of a test object consisting of PMMA and a small aluminium detail. Large
area contrast and noise were verified as a function of PMMA thickness, tube voltage and different dose levels, for
grid in and grid out cases. Average differences between SDNR measured in real and simulated images were 6%
and 5% for grid in and grid out respectively. Additional validation of the noise and sharpness modification process
showed maximum differences between the simulated and measured MTF of 12% up to the Nyquist frequency.
For the NNPS in real and simulated images, average deviations found for all tube voltages remained below 5%.
Moreover, as a means of validating the grid simulation, primary transmission (Tp) and total transmission (Tt)
were measured experimentally in the physical antiscatter grid and compared to those for the simulated grid. It was
found that relative differences for Tp and Tt were below 6% and 13% respectively. The results obtained were
considered satisfactory; the framework was considered validated.
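
The SDNR comparison used in the validation can be sketched as follows. The sketch assumes the common definition of SDNR as the absolute difference between the mean background and mean detail signal divided by the background standard deviation; the exact definition used in Chapter 4 may differ, and the pixel values below are hypothetical.

```python
import statistics


def sdnr(background, detail):
    """Signal difference to noise ratio from two regions of interest (assumed definition)."""
    signal_difference = abs(statistics.mean(background) - statistics.mean(detail))
    return signal_difference / statistics.stdev(background)


def relative_difference_percent(measured, simulated):
    """Absolute difference between simulated and measured value, relative to measured."""
    return 100.0 * abs(simulated - measured) / measured


# Hypothetical pixel values from background and aluminium detail regions.
real_bg, real_detail = [98.0, 102.0, 100.0, 101.0, 99.0], [90.0, 91.0, 89.0]
sim_bg, sim_detail = [97.5, 102.5, 100.0, 101.0, 99.0], [90.5, 90.0, 89.5]

sdnr_real = sdnr(real_bg, real_detail)
sdnr_sim = sdnr(sim_bg, sim_detail)
print(sdnr_real, sdnr_sim, relative_difference_percent(sdnr_real, sdnr_sim))
```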

Besides adding the measured MTF and NPS to the image in a separate step, the modelling of the CsI detector
was simplified by treating it as a perfect absorber. Future work could include the modelling of X-ray photon
interactions and the subsequent light photon cascades generated within the CsI scintillation layer [150,201].
Nevertheless, the method used in the simulation platform developed here allows the rapid simulation of an X-ray
detector that already exists and has been characterized using standard methods. For example, the benefit of
improved quantum detection promised by detectors using perovskite as a detection medium [202] in terms of
improved task performance could be fairly quickly investigated with this platform. It may be possible to use
cascaded systems analysis of X-ray detectors [143,203,204] as a means of obtaining the MTF and NNPS data
required for the simulation platform. However, some points of caution must be noted. The Lubberts effect, which
is the influence of the depth of interaction on the transfer of signal and noise through the CsI phosphor, has not
been treated analytically and is not currently included in the cascaded models [205]. The imaging chain
simulation could also be used to produce images for detectors at the prototype stage, providing insight into the
development or prospective performance of new imaging technologies. It is important to highlight that the
proposed method depends on the availability of raw data: the system should allow the extraction of For Processing
images with linearized PV, necessary for the measurements of MTF and NPS.

The proposed simulation framework was designed for Virtual Clinical Trials (VCTs) in chest radiography. It
should allow the exploration of a wide range of system configurations and rule out suboptimal scenarios. This
should also ease the optimization of the newest detectors currently being brought to the market.
For example, the potential of hybrid photon counting detectors [206] or perovskite detectors [207,208]
could be examined in a clinical setting.

Another important tool created during the PhD project was a library of 24 realistic anthropomorphic chest
phantoms described in Chapter 5. Different segmentation techniques and polygonal mesh modelling were
implemented to create the presented models. The models created were based on an existing polygonal mesh
phantom, the Realistic Anthropomorphic Flexible (RAF) phantom [86]. This phantom depicts a detailed model
of human anatomy and is therefore ideally suited for imaging applications. The polygonal mesh format gives great
flexibility to the phantom and allows further modification of the anatomy, a valuable characteristic when
designing models for this work. For VCT studies in chest radiography, a set of features was further developed for
the phantom. Modifications included a more realistic lung background and modification to the external shape of
the phantom. The latter allowed the representation of male and female patients as well as overweight and obese
male models. To simulate variability in the anatomy, CT image segmentation was used to create new organ models
which were included in the obese and overweight versions. For task-based optimization, a set of lesions and/or
devices commonly found in chest radiography were modelled within the phantoms. The clinical task range
included pulmonary nodules, catheters, rib fractures, pneumothorax and pleural effusion. The methodology used
to create the phantoms provided great flexibility for modelling different lesions and devices and for modifying the
anatomy of the phantoms, so the library can be extended with new types of tasks and phantom models.

This knowledge was put into practice in the second part of Chapter 5, when a similar methodology based on
patient CT image segmentation and mesh modelling was implemented to create 3D computational models of
pathologies associated with Covid-19. Nine of the phantom models in the library display disease characteristics
related to Covid-19. A VCT including the Covid-19 models was out of the scope of this thesis and remains as a
future application of the developed models.

The anatomy of the models and of the simulated clinical tasks was first validated by experienced radiologists.
Additionally, the images were uploaded for analysis to an AI software, which served as extra validation. Regarding
the phantoms, work remains on the modelling of different anatomies to represent a wider range of patients and
on a wider range of lesions and devices to answer different clinical questions. Future development of the phantom
could include further improvement of the organs, for example the modelling of the bone marrow structure
inside the bones, which could be an interesting feature for dosimetry studies. Another option could be further
automating the method used to create the pulmonary structures, so that different lung backgrounds could be
simulated in the phantoms. Also, more subtle lung pathologies or pathologies at a more microscopic scale could
be added to the phantom.

Although developed for chest radiography applications, these phantoms are not limited to this modality and can
also be employed in other imaging techniques, for example in chest tomosynthesis or CT, or for dosimetry studies.
To our knowledge, there are no other phantom libraries that depict this range of clinical tasks for chest
radiography.

Finally, in Chapter 6 of the thesis, the tools developed were applied in a VCT in which diagnostic performance
for a specific collection of tasks in chest radiography was studied with respect to different acquisition parameters.
A set of 11 computational models, including the range of clinical tasks simulated in Chapter 5, was used as input
in the simulation framework to create synthetic radiographic images. The exposure settings covered tube
voltages of 60 to 140 kVp in steps of 20 kVp, grid in and grid out, and four air kerma levels at the detector of 0.62,
1.25, 2.5 and 5.0 µGy. All images created were abnormal, i.e., there was always a clinical task present. The readers
scored the images following the free-response receiver operating characteristic (FROC) paradigm. In this type of
method, the observers can mark as many lesions as they consider visible, and each mark is rated with a confidence level.
The readers were blinded to the fact that all images were abnormal. The reading study was analysed
for all clinical tasks considered together and for nodules, rib fractures, pneumothorax and pleural effusion
separately.
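
The acquisition settings described above can be enumerated in a short sketch. It assumes a full factorial design (every tube voltage combined with every grid state and every detector air kerma level), which the text does not state explicitly; the names TUBE_VOLTAGES_KVP, GRID_STATES, DETECTOR_AIR_KERMA_UGY and acquisition_settings are illustrative only.

```python
from itertools import product

# Values taken from the text; combining them as a full factorial is an assumption.
TUBE_VOLTAGES_KVP = list(range(60, 141, 20))      # 60 to 140 kVp in steps of 20 kVp
GRID_STATES = ["grid in", "grid out"]
DETECTOR_AIR_KERMA_UGY = [0.62, 1.25, 2.5, 5.0]   # each level is about twice the previous one


def acquisition_settings():
    """Return every (kVp, grid, detector air kerma) combination as a dict."""
    return [
        {"kvp": kvp, "grid": grid, "air_kerma_uGy": kerma}
        for kvp, grid, kerma in product(
            TUBE_VOLTAGES_KVP, GRID_STATES, DETECTOR_AIR_KERMA_UGY
        )
    ]


settings = acquisition_settings()
# 5 tube voltages x 2 grid states x 4 dose levels
print(len(settings))
```

Under this assumption, each phantom would be imaged once per entry in the list, so that every setting is compared on identical anatomy.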

The Monte Carlo simulations were used for organ dose calculations for the different settings. Organ dose
conversion coefficients (DCC) were calculated for each of the phantom types imaged (male, female, overweight
and obese male). Applying the DCC for typical clinical settings of the simulated system, it was found that at a
fixed dose at the detector, the organ doses decreased for higher tube voltages and were lower for grid out
acquisitions. While this tendency was not in itself a surprise, these values represent up-to-date systems and were
also calculated for normal, overweight and obese models.
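
As an illustration of how a conversion coefficient is applied, the sketch below multiplies a DCC by a normalization quantity to obtain an organ dose. The choice of normalization quantity (here an incident air kerma in mGy), the function name organ_dose_mgy, and all numerical values are placeholders chosen for illustration; they are not results of this work.

```python
def organ_dose_mgy(dcc_mgy_per_mgy: float, incident_air_kerma_mgy: float) -> float:
    """Organ dose = conversion coefficient x normalization quantity.

    Assumes the DCC is expressed as organ dose per unit incident air kerma;
    the thesis may normalize to a different quantity.
    """
    if dcc_mgy_per_mgy < 0 or incident_air_kerma_mgy < 0:
        raise ValueError("DCC and air kerma must be non-negative")
    return dcc_mgy_per_mgy * incident_air_kerma_mgy


# Placeholder numbers only: a hypothetical organ DCC and incident air kerma.
example_dose = organ_dose_mgy(dcc_mgy_per_mgy=0.5, incident_air_kerma_mgy=0.1)
print(f"{example_dose:.3f} mGy")
```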

Regarding the results of the reading study, the comparison of the different detector air kerma levels showed no
significant difference between 1.25, 2.5 and 5.0 µGy for any of the tasks. This means that the dose can be reduced by
50%, from the standard working detector air kerma level of 2.5 µGy to 1.25 µGy, without affecting
the detectability of the clinical tasks. This is in line with what other studies have found, suggesting that the detection
of lesions is mostly limited by the anatomical noise in the image. In contrast, reader classification of the images
as acceptable for diagnostic applications was largely influenced by the noise level in the image, except for reader
2, who showed more tolerance to the presence of noise. The accuracy of catheter tip localization was
only affected at 0.62 µGy, where marks deviated by up to 20 mm from the tip; localization was accurate for the
remaining dose levels. Thus, a dose reduction of 50% is possible without affecting catheter localization in the
mediastinum region, something to be considered if the clinical request is the visualization of this type of device.
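
The localization accuracy discussed above can be expressed as the distance between a reader's mark and the known position in the phantom. The sketch below assumes that marks and ground truth are given in pixel coordinates with an isotropic pixel pitch; the function names, the pixel pitch and the acceptance radius are hypothetical and are not taken from the reading study.

```python
import math


def localization_error_mm(mark_px, truth_px, pixel_pitch_mm):
    """Euclidean distance in mm between a reader mark and the true position."""
    dx = (mark_px[0] - truth_px[0]) * pixel_pitch_mm
    dy = (mark_px[1] - truth_px[1]) * pixel_pitch_mm
    return math.hypot(dx, dy)


def is_lesion_localization(mark_px, truth_px, pixel_pitch_mm, acceptance_radius_mm):
    """FROC-style scoring: a mark counts as a localization if it falls
    within the acceptance radius of the true position."""
    error = localization_error_mm(mark_px, truth_px, pixel_pitch_mm)
    return error <= acceptance_radius_mm


# Hypothetical example: 0.1 mm pixels and a mark offset by (30, 40) pixels.
error = localization_error_mm((1030, 2040), (1000, 2000), pixel_pitch_mm=0.1)
print(round(error, 3))
```

Because the ground truth position is known exactly in a simulated phantom, this distance can be computed for every mark without a separate truthing step.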

For the analysis of the tube voltage and grid use, we found that in general there was no significant difference
between the grid in and grid out techniques. This was unexpected since, according to the results obtained in Chapter 1,
the detectability at a fixed dose at the detector decreases for grid out compared to grid in. It is possible that some
aspects of the study design may have caused this. An alternative reason might be that the readers lacked
experience in this type of study. An important aspect to consider is the image processing, which has the potential
to restore the contrast in the image for grid out techniques. A study by Neitzel et al. [41] found no significant
differences in lesion detection when working with grid in or grid out; they suggested that proper image processing
can compensate for the effect of higher scatter on image contrast. However, in their study, an air gap of 15 cm
was used for the grid out technique. Another study, by Moore et al. [42], used image quality criteria to score images
with different tube voltages and scatter rejection methods in a relative Visual Grading Analysis study. They found
that tube voltages below 100 kVp and grid out provided superior quality for average size patients. Recently, scatter
reduction based on Artificial Intelligence techniques has been introduced [209]. Our study may support
investigating such approaches.

This, together with the established observation that anatomical noise is the limiting factor in the detection of lesions, led us
to question whether the increase in dose due to the use of the antiscatter grid is fully justified in digital imaging. Is proper
image processing able to compensate for contrast reduction in the image? Of course, this may depend on the
image processing technique that is applied and remains to be investigated.

Some practical conclusions were derived from the results.

• For the grid in technique, 120 kVp gave lower organ doses with no significant decrease in diagnostic
performance compared to lower tube voltages, justifying the choice made in practice regarding tube
voltage.
• For grid out techniques, no significant difference in diagnostic performance was found between 80 and
100 kVp; however, organ doses were on average 10% lower at 100 kVp than at 80 kVp. More
investigation of this is necessary, together with the calculation of DCC for anterior-posterior irradiation.
• A reduction of dose from 2.5 µGy to 1.25 µGy is potentially possible without decreasing the diagnostic
performance.
• Further analysis of the effect of the antiscatter grid is necessary. A comparison of different image
processing techniques in terms of how they can restore the contrast for grid out techniques is also
relevant.

Aspects beyond the localization and classification of clinical tasks were outside the scope of this work. The effect
of reader experience and background (for example residents, radiographers, radiologists or AI algorithms) on
lesion localization remains to be studied.

Although clinical testing will always be required for validation, the results obtained in this study using
anthropomorphic phantoms including clinical tasks can be used as a starting point for further analysis, constituting
an important first step towards optimization studies. The use of phantoms over real patients has some clear
advantages. First, the person designing the study has full control over the number of clinical tasks and their
localization, thus having knowledge of the ground truth, which is not the case for images of real patients.
Additionally, images of the same phantom can be generated under an extensive range of exposure conditions, which is not
possible with real patients. This gives more power to the study design, since it is more likely that the differences
in lesion detection are due to the exposure settings and not to other factors like anatomical differences or variations
in positioning of the patients. The result is a better indication of the parameters that are more likely to affect
diagnostic performance. For example, in a clinical study with real patients (that cannot be imaged repeatedly), the
different exposure settings would have been compared for different patients. Ideally these patients would have the
same type of pathology with similar localization and if possible similar body habitus, making case selection
extremely challenging and time consuming. To this we can add the ethical barriers to be faced. Achieving these
requirements can be very difficult or even impossible and underlines the potential of the VCT method.

VCTs are playing an increasing role in the performance evaluation and optimization of imaging technologies in
the context of medical imaging [172]. Recently, the FDA has made some efforts to standardize and approve VCTs
[210]; however, there is still much to be done. With the rapid growth of computer technology, simulations will
become faster and more widely available, not only in academic settings but also in the clinic. They will also be
able to handle more complex geometries. Allied to this, phantom development will lead to improvements in the
sophistication and the realism of both the anatomy and pathology.

The simulation platform presented in this thesis has been successfully used to generate synthetic images of
anthropomorphic phantoms including a range of clinical tasks in a VCT for chest radiography. To our knowledge,
this is the first VCT in the field of chest radiography to include this range of clinical tasks. While the first
straightforward research project following from these results is to explore the grid in more detail, there is a definite future
in the simulation of tasks in newer or competing modalities, such as chest tomosynthesis, dual energy
subtraction techniques, (low dose) chest CT, phase contrast imaging, and multimodality imaging. In addition, the
phantom could be made more realistic by including cardiac, respiratory or involuntary motions and by extrapolating
to younger age groups. Finally, the phantom is realistic enough to support the development and/or testing of AI
algorithms. The use of synthetic images in training datasets can be especially useful for rare pathologies, where
data availability is limited. Some works have used synthetic radiographs and mammograms in training sets with
promising results [211,212]; however, studies are limited at the moment, especially with images including
abnormalities or pathologies. A study by Salehinejad et al. [211] used a mix of synthesized and real images to
train a neural network to detect chest pathologies. They found higher accuracy when using a mixed dataset of real
and synthetic images compared to only real images. The improvement was attributed to the augmentation of
images depicting less common pathologies, which provided a better balance in the dataset. Moreover, when using
synthetic images, the ground truth is always known, thus images can be accurately labeled and segmented. In
general, the realistic lesions developed in the thesis work are very promising in this regard, as pointed out in [213].
However, challenges remain when using synthetic images to train AI algorithms: a larger set of patient models
will be needed to create the more diverse dataset of highly realistic, high-resolution images required for this type of
application.
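
The balancing effect described above can be illustrated with a small calculation: given the number of real images per pathology class, determine how many synthetic images would bring every class up to the size of the largest one. The class names and counts are invented for illustration, and this is not the procedure used in [211].

```python
def synthetic_images_needed(real_counts):
    """Number of synthetic images per class needed to match the largest class."""
    if not real_counts:
        return {}
    target = max(real_counts.values())
    return {label: target - count for label, count in real_counts.items()}


# Invented counts of real images per class (illustrative only).
real_counts = {"normal": 5000, "nodule": 1200, "pneumothorax": 300}
print(synthetic_images_needed(real_counts))
```

Matching the largest class is only one possible target; a smaller target would reduce the share of synthetic images in the training set.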

References

[1] ICRU. ICRU Report 70: Image Quality in Chest Radiography. 2003.
[2] Zhang Y, Li X, Segars WP, Samei E. Comparison of patient specific dose metrics between chest
radiography, tomosynthesis, and CT for adult patients of wide ranging body habitus. Med Phys
2014;41:023901. https://doi.org/10.1118/1.4859315.
[3] Heitzman ER. Thoracic Radiology : The Past 50 Years. Radiology 2000;214.
[4] Commission of the European Communities. European guidelines on quality criteria for diagnostic
radiographic images. Report EUR 16260. Geneva: EC: 1996.
[5] Gibson DJ, Davidson RA. Exposure Creep in Computed Radiography. A Longitudinal Study. Acad
Radiol 2012;19:458–62. https://doi.org/10.1016/j.acra.2011.12.003.
[6] Seibert JA, Morin RL. The standardized exposure index for digital radiography: An opportunity for
optimization of radiation dose to the pediatric population. Pediatr Radiol 2011;41:573–81.
https://doi.org/10.1007/s00247-010-1954-6.
[7] Campbell R. The WHO manual of diagnostic imaging: radiographic anatomy and interpretation of the
musculoskeletal system. Clin Radiol 2004;59:113. https://doi.org/10.1016/S0009-9260(03)00335-0.
[8] Coblents CL, Matzinger F, Samson LM, Scherer J, Stolberg HO, Weisbrod G. Standards for Chest
Radiography. 2000.
[9] Busch HP, Faulkner K. Image quality and dose management in digital radiography: a new paradigm for
optimisation. Radiat Prot Dosimetry 2005;117:143–7. https://doi.org/10.1093/rpd/nci728.
[10] Tingberg A, Sjöström D. Optimisation of image plate radiography with respect to tube voltage. Radiat
Prot Dosimetry 2005;114:286–93. https://doi.org/10.1093/rpd/nch536.
[11] Uffmann M, Neitzel U, Prokop M, Kabalan N, Weber M, Herold CJ, et al. Flat-panel-detector chest
radiography: effect of tube voltage on image quality. Radiology 2005;235:642–50.
https://doi.org/10.1148/radiol.2352031730.
[12] Bacher K, Smeets P, Vereecken L, De Hauwere A, Duyck P, De Man R, et al. Image quality and radiation
dose on digital chest imaging: Comparison of amorphous silicon and amorphous selenium flat-panel
systems. Am J Roentgenol 2006;187:630–7. https://doi.org/10.2214/AJR.05.0400.
[13] Strotzer M, Völk M, Fründ R, Hamer O, Zorger N, Feuerbach S. Routine chest radiography using a flat-
panel detector: Image quality at standard detector dose and 33% dose reduction. Am J Roentgenol
2002;178:169–71. https://doi.org/10.2214/ajr.178.1.1780169.
[14] Metz S, Roggel R, Engelke C, Woertler K, Renger B, Rummeny EJ, et al. Chest Radiography with a
Digital Flat-Panel Detector : Experimental Receiver 2005.
[15] International Electrotechnical Commission. Medical electrical equipment – Characteristics of digital X-
ray imaging devices – Part 1: Determination of the detective quantum efficiency. IEC 62220-1 2003;2003.
[16] Samei E. Performance of digital radiographic detectors: quantification and assessment methods. Adv
Digit Radiogr RSNA Categ Course Diagnostic Radiol Phys 2003 2003;27710:37–47.
[17] Launders JH, Cowen AR, Bury RF, Hawkridge P. Towards image quality, beam energy and effective
dose optimisation in digital thoracic radiography. Eur Radiol 2001;11:870–5.
https://doi.org/10.1007/s003300000525.
[18] Dobbins JT, Samei E, Chotas HG, Warp RJ, Baydush AH, Floyd CE, et al. Chest radiography:
optimization of X-ray spectrum for cesium iodide-amorphous silicon flat-panel detector. Radiology
2002;226:221–30. https://doi.org/10.1148/radiol.2261012023.
[19] Abadi E, Segars WP, Tsui BMW, Kinahan PE, Frangi AF, Maidment A, et al. Virtual clinical trials in
medical imaging : a review. J Med Imaging 2020;7:042805-1–40.
[20] Segars WP, Sturgeon G, Mendonca S, Grimes J, Tsui BMW. 4D XCAT phantom for multimodality
imaging research. Med Phys 2010;37:4902–15. https://doi.org/10.1118/1.3480985.
[21] Abadi E, Segars WP, Sturgeon GM, Roos JE, Ravin CE, Samei E. Modeling Lung Architecture in the
XCAT Series of Phantoms: Physiologically Based Airways, Arteries and Veins. IEEE Trans Med Imaging
2018;37:693–702. https://doi.org/10.1109/TMI.2017.2769640.
[22] Lombardo PA, Vanhavere F, Lebacq AL, Struelens L, Bogaerts R. Development and Validation of the
Realistic Anthropomorphic Flexible (RAF) Phantom. Health Phys 2018;114:489–99.
https://doi.org/10.1097/HP.0000000000000805.
[23] Solomon J, Samei E. A generic framework to simulate realistic lung, liver and renal pathologies in CT
imaging. Phys Med Biol 2014;59:6637–57. https://doi.org/10.1088/0031-9155/59/21/6637.
[24] Rodríguez Pérez S, Coolen J, Marshall NW, Cockmartin L, Biebaû C, Desmet J, et al. Methodology to
create 3D models of COVID-19 pathologies for virtual clinical trials. J Med Imaging n.d.;8:1–17.
https://doi.org/10.1117/1.JMI.8.S1.013501.
[25] Nam JG, Park S, Hwang EJ, Lee JH, Jin KN, Lim KY, et al. Development and validation of deep learning-
based automatic detection algorithm for malignant pulmonary nodules on chest radiographs. Radiology
2019;290:218–28. https://doi.org/10.1148/radiol.2018180237.
[26] Qin C, Yao D, Shi Y, Song Z. Computer-aided detection in chest radiography based on artificial
intelligence: A survey. Biomed Eng Online 2018;17:1–23. https://doi.org/10.1186/s12938-018-0544-y.
[27] Wang X, Peng Y, Lu L, Lu Z, Bagheri M, Summers RM. ChestX-ray: Hospital-Scale Chest X-ray
Database and Benchmarks on Weakly Supervised Classification and Localization of Common Thorax
Diseases. Adv Comput Vis Pattern Recognit 2019:369–92. https://doi.org/10.1007/978-3-030-13969-
8_18.
[28] Sahbaee P, Segars WP, Samei E. Patient-based estimation of organ dose for a population of 58 adult
patients across 13 protocol categories. Med Phys 2014;41:72104. https://doi.org/10.1118/1.4883778.
[29] Gong C, Han C, Gan G, Deng Z, Zhou Y, Yi J, et al. Low-dose dynamic myocardial perfusion CT image
reconstruction using pre-contrast normal-dose CT scan induced structure tensor total variation
regularization. Phys Med Biol 2017;62:2612–35. https://doi.org/10.1088/1361-6560/aa5d40.
[30] Zhang Y, Ma J, Iyengar P, Zhong Y, Wang J. A new CT reconstruction technique using adaptive
deformation recovery and intensity correction (ADRIC). Med Phys 2017;44:2223–41.
https://doi.org/10.1002/mp.12259.
[31] Abadi E, Segars WP, Harrawood B, Kapadia A, Samei E. Virtual clinical trial in action: textured XCAT
phantoms and scanner-specific CT simulator to characterize noise across CT reconstruction algorithms.
In: Lo JY, Schmidt TG, Chen G-H, editors. Med. Imaging 2018 Phys. Med. Imaging, vol. 10573, SPIE;
2018, p. 304–9. https://doi.org/10.1117/12.2294599.
[32] Elangovan P, Rashidnasab A, Mackenzie A, Dance DR, Young KC, Bosmans H, et al. Performance
comparison of breast imaging modalities using a 4AFC human observer study. In: Hoeschen C, Kontos
D, editors. Med. Imaging 2015 Phys. Med. Imaging, vol. 9412, SPIE; 2015, p. 450–6.
https://doi.org/10.1117/12.2081878.
[33] Sandborg M, McVey G, Dance DR, Alm Carlsson G. Schemes for the optimization of chest radiography
using a computer model of the patient and x-ray imaging system. Med Phys 2001;28:2007–19.
https://doi.org/10.1118/1.1405840.
[34] Ullman G, Sandborg M, Dance DR, Hunt RA, Alm Carlsson G. Towards optimization in digital chest
radiography using Monte Carlo modelling. Phys Med Biol 2006;51:2729–43.
https://doi.org/10.1088/0031-9155/51/11/003.
[35] Dance DR, Day GJ. The computation of scatter in mammography by Monte Carlo methods. Phys Med
Biol 1984;29:237–47. https://doi.org/10.1088/0031-9155/29/3/003.
[36] Moore CS, Liney GP, Beavis AW, Saunderson JR. A method to produce and validate a digitally
reconstructed radiograph-based computer simulation for optimisation of chest radiographs acquired with
a computed radiography imaging system. Br J Radiol 2011;84:890–902.
https://doi.org/10.1259/bjr/30125639.
[37] Smans K, Vandenbroucke D, Pauwels H, Struelens L, Vanhavere F, Bosmans H. Validation of an image
simulation technique for two computed radiography systems: An application to neonatal imaging. Med
Phys 2010;37:2092. https://doi.org/10.1118/1.3377772.
[38] Zhang G, Pauwels R, Marshall N, Shaheen E, Nuyts J, Jacobs R, et al. Development and validation of a
hybrid simulation technique for cone beam CT: Application to an oral imaging system. Phys Med Biol
2011;56:5823–43. https://doi.org/10.1088/0031-9155/56/18/004.
[39] Salvagnini E, Bosmans H, Van Ongeval C, Van Steen A, Michielsen K, Cockmartin L, et al. Impact of
compressed breast thickness and dose on lesion detectability in digital mammography: FROC study with
simulated lesions in real mammograms. Med Phys 2016;43:5104–16. https://doi.org/10.1118/1.4960630.
[40] Abadi E, Harrawood B, Sharma S, Kapadia A, Segars WP, Samei E. DukeSim: A realistic, rapid, and
scanner-specific simulation framework in computed tomography. IEEE Trans Med Imaging 2019;38:1–
18. https://doi.org/10.1109/TMI.2018.2886530.
[41] Neitzel U, Pralow T, Schaefer-Prokop C, Prokop M. Influence of scatter reduction on lesion signal-to-
noise ratio and lesion detection in digital chest radiography 1998;3336:337–47.
https://doi.org/10.1117/12.317033.
[42] Moore CS, Avery G, Balcam S, Needler L, Swift A. Use of a digitally reconstructed radiograph-based
computer simulation for the optimisation of chest radiographic techniques for computed radiography
imaging systems 2012:1–10. https://doi.org/10.1259/bjr/47377285.
[43] Bernhardt TM, Rapp-Bernhardt U, Lenzen H, Roehl FW, Diederich S, Papke K, et al. Low-voltage digital
selenium radiography: detection of simulated interstitial lung disease, nodules, and catheters--a phantom
study. Radiology 2004;232:693–700. https://doi.org/10.1148/radiol.2323030187.
[44] International Commission on Radiation Units and Measurements. ICRU Rep. No. 54. Medical imaging -
the assessment of image quality. Bethesda, MD: 1996.
[45] Maurino SL, Badano A, Cunningham IA, Karim KS. Theoretical and Monte Carlo optimization of a
stacked three- layer flat-panel x-ray imager for applications in multi-spectral medical imaging. Proc SPIE
Int Soc Opt Eng., vol. 9783, Bellingham: SPIE, Bellingham, WA; 2016, p. 97833Z.
https://doi.org/10.1117/12.2217085.
[46] ICRP. Basic Anatomical and Physiological Data for Use in Radiological Protection: Reference Values.
ICRP Publication 89. Ann ICRP 2002;32.
[47] Shepard SJ, Wang J, Flynn M, Gingold E, Goldman L, Krugh K, et al. An exposure indicator for digital
radiography: AAPM Task Group 116 (executive summary). Med Phys 2009;36:2898–914.
https://doi.org/10.1118/1.3266686.
[48] IEC. Medical electrical equipment – Exposure index of digital X-ray imaging systems – Part 1: Definitions
and requirements for general radiography. IEC 62494-1 2008.
[49] Federal Agency for Nuclear Control (FANC). Diagnostic reference levels in radiology 2011.
http://fanc.fgov.be/nl/professionelen/medische-professionelen/radiologische-
toepassingen/diagnostische-referentieniveaus-de.
[50] Marshall NW. An examination of automatic exposure control regimes for two digital radiography
systems. Phys Med Biol 2009;54:4645–70.
[51] Doyle P, Martin CJ, Gentle D. Application of contrast-to-noise ratio in optimizing beam quality for digital
chest radiography: comparison of experimental measurements and theoretical simulations. Phys Med Biol
2006;51:2953–70. https://doi.org/10.1088/0031-9155/51/11/018.
[52] Rodríguez Pérez S, Marshall NW, Struelens L, Bosmans H. Characterization and validation of the thorax
phantom Lungman for dose assessment in chest radiography optimization studies. J Med Imaging
2018;5:1. https://doi.org/10.1117/1.jmi.5.1.013504.
[53] Mackenzie A, Doshi S, Doyle P, Hill A, Honey I, Marshall N, et al. Measurement of the performance
characteristics of diagnostic x-ray systems: digital imaging systems. IPEM Report 32 (Part VII). York:
2010.
[54] Perry N, Broeders M, de Wolf C, Törnberg S, Holland R, von Karsa L. European guidelines for
quality assurance in breast cancer screening and diagnosis. Fourth edition - Summary document. Ann
Oncol 2008.
[55] Tapiovaara M. PCXMC : a PC-based Monte Carlo program for calculating patient doses in medical X-
ray examinations. Finnish Centre for Radiation and Nuclear Safety; 1997.
[56] Tingberg A, Sjostrom D. Search for optimal tube voltage for image plate radiography. In: Chakraborty
DP, Krupinski EA, editors., 2003, p. 187. https://doi.org/10.1117/12.479982.
[57] Launders JH, Cowen AR. A comparison of the threshold detail detectability of a screen-film combination
and computed radiology under conditions relevant to high-kVp chest radiography. Phys Med Biol
1995;40:1393–8. https://doi.org/10.1088/0031-9155/40/8/008.
[58] Chotas HG, Floyd CE, Dobbins JT, Ravin CE. Digital chest radiography with photostimulable storage
phosphors: signal-to-noise ratio as a function of kilovoltage with matched exposure risk. Radiology
1993;186:395–8. https://doi.org/10.1148/radiology.186.2.8421741.
[59] Doyle P, Martin CJ, Gentle D. Dose-image quality optimisation in digital chest radiography. Radiat Prot
Dosimetry 2005;114:269–72. https://doi.org/10.1093/rpd/nch546.
[60] Ullman G, Sandborg M, Dance DR, Hunt R, Carlsson GA. Distributions of scatter-to-primary and signal-
to-noise ratios per pixel in digital chest imaging. Radiat Prot Dosimetry 2005;114:355–8.
https://doi.org/10.1093/rpd/nch530.
[61] Oda N, Nakata H, Murakami S, Terada K, Nakamura K, Yoshida A. Optimal beam quality for chest
computed radiography. Invest Radiol 1996;31:126–31. https://doi.org/10.1097/00004424-199603000-
00002.
[62] Foos DH, Sehnert WJ, Reiner B, Siegel EL, Segal A, Waldman DL. Digital radiography reject analysis:
data collection methodology, results, and recommendations from an in-depth investigation at two
hospitals. J Digit Imaging 2009;22:89–98. https://doi.org/10.1007/s10278-008-9112-5.
[63] George Xu X. Computational Phantoms for Organ Dose Calculations in Radiation Protection and
Imaging. In: DeWerd LA, Kissick M, editors. Phantoms Med. Heal. Phys. Devices Res. Dev., New York,
NY: Springer New York; 2014, p. 225–62. https://doi.org/10.1007/978-1-4614-8304-5_12.
[64] White DR, Buckland-Wright JC, Griffith R V, Rothenberg LN, Showwalter CK, Williams G, et al. Report
48. Phantoms and computational models in therapy, diagnosis and protection. J Int Comm Radiat Units
Meas 1992;os25.
[65] Hintenlang D, Moloney W, Winslow J. Physical Phantoms for Experimental Radiation Dosimetry. In: Xu
XG, Eckerman KF, editors. Handb. Anat. Model. Radiat. Dosim., New York, Tennessee: Taylor &
Francis; 2009, p. 389–409. https://doi.org/10.1201/EBK1420059793-c16.
[66] Båth M, Håkansson M, Tingberg A, Månsson LG. Method of simulating dose reduction for digital
radiographic systems. Radiat Prot Dosimetry 2005;114:253–9. https://doi.org/10.1093/rpd/nch540.
[67] Moore CS, Wood TJ, Beavis AW, Saunderson JR. Correlation of the clinical and physical image quality
in chest radiography for average adults with a computed radiography imaging system. Br J Radiol
2013;86. https://doi.org/10.1259/bjr.20130077.
[68] Toroi P, Young KC, Marchal G. Experimental investigation on the choice of the tungsten / rhodium anode
/ filter combination for an amorphous selenium-based digital mammography system. Eur Radiol
2007;17:2368–75. https://doi.org/10.1007/s00330-006-0574-x.
[69] Chotas HG, Floyd CE, Dobbins JT, Ravin CE. Digital chest radiography with photostimulable storage
phosphors: signal-to-noise ratio as a function of kilovoltage with matched exposure risk. Radiology
1993;186:395–8.
[70] Vassileva J. A phantom for dose-image quality optimization in chest radiography. Br J Radiol
2002;75:837–42. https://doi.org/10.1259/bjr.75.898.750837.
[71] Pina DR, Duarte SB, Netto TG, Trad CS, Brochi MAC, de Oliveira SC. Optimization of standard patient
radiographic images for chest, skull and pelvis exams in conventional x-ray equipment. Phys Med Biol
2004;49:N215.
[72] Rapp-Bernhardt U, Bernhardt TM, Lenzen H, Esseling R, Roehl FW, Schiborr M, et al. Experimental
Evaluation of a Portable Indirect Flat-Panel Detector for the Pediatric Chest: Comparison with Storage
Phosphor Radiography at Different Exposures by Using a Chest Phantom. Radiology 2005;237:485–91.
https://doi.org/10.1148/radiol.2372040672.
[73] Ullman G, Dance DR, Sandborg M, Carlsson GA, Svalkvist A, Båth M. A Monte Carlo-based model for
simulation of digital chest tomosynthesis. Radiat Prot Dosimetry 2010;139:159–63.
https://doi.org/10.1093/rpd/nc0000.
[74] Sandborg M, Tingberg A, Ullman G, Dance DR, Alm Carlsson G. Comparison of clinical and physical
measures of image quality in chest and pelvis computed radiography at different tube voltages. Med Phys
2006;33:4169–75. https://doi.org/10.1118/1.2362871.
[75] Ma WK, Hogg P, Tootell A, Manning D, Thomas N, Kane T, et al. Anthropomorphic chest phantom
imaging – The potential for dose creep in computed radiography. Radiography 2013;19:207–11.
https://doi.org/10.1016/j.radi.2013.04.002.
[76] Williams DB, Siewerdsen JH, Tward DJ, Paul NS, Dhanantwari AC, Shkumat NA, et al. Optimal kVp
selection for dual-energy imaging of the chest : Evaluation by task-specific observer preference tests. Med
Phys 2007;34:3916–25. https://doi.org/10.1118/1.2776239.
[77] Solomon J, Samei E. Quantum noise properties of CT images with anatomical textured backgrounds
across reconstruction algorithms: FBP and SAFIRE. Med Phys 2015;41:091908.
https://doi.org/10.1118/1.4893497.
[78] Cockmartin L, Marshall NW, Zhang G, Lemmens K, Shaheen E, Van Ongeval C, et al. Design and
application of a structured phantom for detection performance comparison between breast tomosynthesis
and digital mammography. Phys Med Biol 2017;62:758–80. https://doi.org/10.1088/1361-6560/aa5407.
[79] Conway BJ, Butler PF, Duff JE, Fewell TR, Gross RE, Jennings RJ, et al. Beam quality independent
attenuation phantom for estimating patient exposure from x-ray automatic exposure controlled chest
examinations. Med Phys 1984;11:827–32.
[80] Renger B, Brieskorn C, Toth V, Mentrup D, Jockel S, Lohofer F, et al. Evaluation of dose reduction
potentials of a novel scatter correction software for bedside chest x-ray imaging. Radiat Prot Dosimetry
2016;169:60–7. https://doi.org/10.1093/rpd/ncw031.
[81] IEC. Medical diagnostic X-ray equipment - Radiation conditions for use in the determination of
characteristics. IEC 61267 2005.
[82] Fedorov A, Beichel R, Kalpathy-Cramer J, Finet J, Fillion-Robin J-C, Pujol S, et al. 3D Slicer as an Image
Computing Platform for the Quantitative Imaging Network. Magn Reson Imaging 2012;30:1323–41.
[83] Schindelin J, Arganda-Carreras I, Frise E, Kaynig V, Longair M, Pietzsch T, et al. Fiji: an open-source
platform for biological-image analysis. Nat Meth 2012;9:676–82.
[84] Salvat F, Fernández-Varea JM, Acosta E, Sempau J. PENELOPE - A code system for Monte Carlo
simulation of electron and photon transport. Work Proc 2006:384. https://doi.org/10.1.1.78.4492.
[85] Sempau J, Badal A, Brualla L. A PENELOPE-based system for the automated Monte Carlo simulation of
clinacs and voxelized geometries—application to far-from-axis fields. Med Phys 2011;38:5887.
https://doi.org/10.1118/1.3643029.
[86] Lombardo PA, Vanhavere F, Lebacq AL, Struelens L, Bogaerts R. Development and validation of the
Realistic Flexible Anthropomorphic ( RAF ) polygonal mesh phantom 2015:1–13.
[87] Boone JM, Seibert JA. An accurate method for computer-generating tungsten anode x-ray spectra from
30 to 140 kV. Med Phys 1997;24:1661–70. https://doi.org/10.1118/1.597953.
[88] Schneider U, Pedroni E, Lomax A. The calibration of CT Hounsfield units for radiotherapy treatment
planning. Phys Med Biol 1996;41:111–24. https://doi.org/10.1088/0031-9155/41/1/009.
[89] Dobbins JT, Rice JJ, Beam CA, Ravin CE. Threshold perception performance with computed and screen-
film radiography: implications for chest radiography. Radiology 1992;183:179–87.
[90] Mothiram U, Hons DR, Ed MS, Brennan PC, Lewis SJ, Moran B, et al. Digital radiography exposure
indices : A review 2014. https://doi.org/10.1002/jmrs.49.
[91] Takaki T, Takeda K, Murakami S, Ogawa H. Evaluation of the effects of subject thickness on the exposure
index in digital radiography. Radiol Phys Technol 2016;9:116–20. https://doi.org/10.1007/s12194-015-
0341-2.
[92] Williams MB, Raghunathan P, More MJ, Seibert JA, Kwan A, Lo JY, et al. Optimization of exposure
parameters in full field digital mammography. Med Phys 2008;35:2414–23.
[93] Cowen AR, Workman A, Haywood JM, Clarke OF. A set of X-ray test objects for image quality control
in digital subtraction fluorography. II: Application and interpretation of results. Br J Radiol
1987;60:1011–8.
[94] Aufrichtig R. Comparison of low contrast detectability between a digital amorphous silicon and a screen-
film based imaging system for thoracic radiography. Med Phys 1999;26:1349–58.
[95] Thijssen MAO. Bepaling en bewaking van de beeldkwaliteit in de radiodiagnostiek. Radboud
University, 1993.
[96] Van Peteghem N, Bosmans H, Marshall NW. NPWE model observer as a validated alternative for contrast
detail analysis of digital detectors in general radiography. Phys Med Biol 2016;61:N575–91.
https://doi.org/10.1088/0031-9155/61/21/N575.
[97] European Commission. The European protocol for the quality control of the physical and technical aspects
of mammography screening: part B. Digital mammography European Guidelines for Breast Cancer
Screening 4th edn 2006.
[98] Månsson LG. Methods for the evaluation of image quality: a review. Radiat Prot Dosim 2000;90:89–99.
[99] Båth M, Månsson LG. Visual grading characteristics (VGC) analysis: A non-parametric rank-invariant
statistical method for image quality evaluation. Br J Radiol 2007;80:169–76.
https://doi.org/10.1259/bjr/35012658.
[100] IPEM. Report 91: Recommended Standards for the Routine Performance Testing of Diagnostic X-Ray
Systems. York, UK: 2005.
[101] De Crop A, Bacher K, Van Hoof T, Smeets P V., Smet BS, Vergauwen M, et al. Correlation of Contrast-
Detail Analysis and Clinical Image Quality Assessment in Chest Radiography with a Human Cadaver
Study. Radiology 2012;262:298–304. https://doi.org/10.1148/radiol.11110447.
[102] Weir A, Salo EN, Janeczko AJ, Douglas J, Weir NW. Evaluation of CDRAD and TO20 test objects and
associated software in digital radiography. Biomed Phys Eng Express 2019;5.
https://doi.org/10.1088/2057-1976/ab285b.
[103] Rodríguez Pérez S, Marshall NW, Struelens L, Bosmans H. Characterization and validation of the thorax
phantom Lungman for dose assessment in chest radiography optimization studies. J Med Imaging
2018;5:1. https://doi.org/10.1117/1.JMI.5.1.013504.
[104] Neitzel U, Maack I, Gunther-Kohfahl S. Image quality of digital chest radiography system based on a
selenium detector. Med Phys 1993;21:509–16.
[105] Lee D, Choi S, Lee H, Kim D, Choi S, Kim H-J. Quantitative evaluation of anatomical noise in chest
digital tomosynthesis, digital radiography, and computed tomography. J Instrum 2017;12:T04006–
T04006. https://doi.org/10.1088/1748-0221/12/04/T04006.
[106] Ma WK, Hogg P, Tootell A, Manning D, Thomas N, Kane T, et al. Anthropomorphic chest phantom
imaging - The potential for dose creep in computed radiography. Radiography 2013;19:207–11.
https://doi.org/10.1016/j.radi.2013.04.002.
[107] Robins M, Solomon J, Sahbaee P, Sedlmair M, Roy Choudhury K, Pezeshk A, et al. Techniques for virtual
lung nodule insertion: Volumetric and morphometric comparison of projection-based and image-based
methods for quantitative CT. Phys Med Biol 2017;62:7280–99.
https://doi.org/10.1088/1361-6560/aa83f8.
[108] Kyoto Kagaku Co. LTD. Multipurpose Chest Phantom N1 ‘LUNGMAN’ product catalog. Kyoto, Japan
n.d. https://www.kyotokagaku.com/products/detail03/pdf/ph-1_catalog.pdf.

[109] Samei E, Badano A, Chakraborty D, Compton K, Cornelius C, Corrigan K, et al. Assessment of Display
Performance for Medical Imaging Systems, Report of the American Association of Physicists in Medicine
(AAPM) Task Group 18. 2005.
[110] European Commission. European guidelines on quality criteria for diagnostic radiographic images. 1996.
[111] Håkansson M, Svensson S, Zachrisson S, Svalkvist A, Båth M, Månsson L. VIEWDEX: an efficient and
easy-to-use software for observer performance studies. Radiat Prot Dosimetry 2010;139:42–51.
https://doi.org/10.1093/rpd/ncq057.
[112] Koo TK, Li MY. A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for
Reliability Research. J Chiropr Med 2016;15:155–63. https://doi.org/10.1016/j.jcm.2016.02.012.
[113] Hinkle D, Wiersma W, Jurs S. Applied statistics for the behavioral sciences. Houghton Mifflin College
Division; 2003.
[114] Al-Murshedi S, Hogg P, England A. Relationship between body habitus and image quality and radiation
dose in chest X-ray examinations: A phantom study. Phys Medica 2019;57:65–71.
https://doi.org/10.1016/j.ejmp.2018.12.009.
[115] Hart D, Hillier MC, Shrimpton PC. HPA-CRCE-034 - Doses to Patients from Radiographic and
Fluoroscopic X-ray Imaging Procedures in the UK – 2010 Review. 2012.
[116] Marshall NW. A comparison between objective and subjective image quality measurements for a full field
digital mammography system. Phys Med Biol 2006;51:2441–63.
https://doi.org/10.1088/0031-9155/51/10/006.
[117] Fetterly KA, Schueler BA. Experimental evaluation of fiber-interspaced antiscatter grids for large patient
imaging with digital x-ray systems. Phys Med Biol 2007;52:4863–80.
https://doi.org/10.1088/0031-9155/52/16/010.
[118] Scott AW, Yester MV, Barnes GT. High-ratio grid considerations in mobile chest radiography. Med Phys
2012;39:3142–53.
[119] Warren LM, Mackenzie A, Cooke J, Given-Wilson RM, Wallis MG, Chakraborty DP, et al. Effect of
image quality on calcification detection in digital mammography. Med Phys 2012;39:3202–13.
[120] Zanca F, Bosmans H, Jacobs J, Michielsen K, Sisini F, Nens J, et al. Contrast-detail comparison between
unprocessed and processed CDMAM images. Proc SPIE Med Imaging 2009 Phys Med Imaging
2009;7258:1514–23. https://doi.org/10.1117/12.811732.
[121] Sund P, Båth M, Kheddache S, Månsson LG. Comparison of visual grading analysis and determination
of detective quantum efficiency for evaluating system performance in digital chest radiography. Eur
Radiol 2004;14:48–58. https://doi.org/10.1007/s00330-003-1971-z.
[122] Zanca F, Jacobs J, Van Ongeval C, Claus F, Celis V, Geniets C, et al. Evaluation of clinical image
processing algorithms used in digital mammography. Med Phys 2009;36:765.
https://doi.org/10.1118/1.3077121.
[123] Smet MH, Breysem L, Mussen E, Bosmans H, Marshall NW, Cockmartin L. Visual grading analysis of
digital neonatal chest phantom X-ray images: Impact of detector type, dose and image processing on
image quality. Eur Radiol 2018. https://doi.org/10.1007/s00330-017-5301-2.
[124] Samei E. Medical Physics 3.0: Ensuring Quality and Safety in Medical Imaging. Health Phys
2019;116:247–55. https://doi.org/10.1097/HP.0000000000001022.
[125] Carton A, Bosmans H, Van Ongeval C, Souverijns G, Rogge F, Van Steen A, et al. Development and
validation of a simulation procedure to study the visibility of micro calcifications in digital mammograms.
Med Phys 2003;30:2234–40. https://doi.org/10.1118/1.1591193.
[126] Shaheen E, De Keyzer F, Bosmans H, Dance DR, Young KC, Van Ongeval C. The simulation of 3D mass
models in 2D digital mammography and breast tomosynthesis. Med Phys 2014;41.
https://doi.org/10.1118/1.4890590.
[127] Elangovan P, Warren LM, Mackenzie A, Rashidnasab A, Diaz O, Dance DR, et al. Development and
validation of a modelling framework for simulating 2D-mammography and breast tomosynthesis images.
Phys Med Biol 2014;59:4275–93. https://doi.org/10.1088/0031-9155/59/15/4275.
[128] Hadjipanteli A, Elangovan P, Mackenzie A, Looney PT, Wells K, Dance DR, et al. The effect of system
geometry and dose on the threshold detectable calcification diameter in 2D-mammography and digital
breast tomosynthesis. Phys Med Biol 2017;62:858–77. https://doi.org/10.1088/1361-6560/aa4f6e.
[129] Siddon RL. Fast calculation of the exact radiological path for a three-dimensional CT array. Med Phys
1985;12.
[130] Chan HP, Doi K. The validity of Monte Carlo simulation in studies of scattered radiation in diagnostic
radiology. Phys Med Biol 1983;28:109–29.
[131] Mackenzie A, Dance DR, Workman A, Yip M, Wells K, Young KC. Conversion of mammographic
images to appear with the noise and sharpness characteristics of a different detector and x-ray system.
Med Phys 2012;39:2721. https://doi.org/10.1118/1.4704525.
[132] IEC. IEC 62220-1 Medical electrical equipment - Characteristics of digital X-ray imaging devices - Part
1-1: Determination of the detective quantum efficiency - Detectors used in radiographic imaging. 2015.
[133] Samei E, Flynn MJ, Reimann DA. A method for measuring the presampled MTF of digital radiographic
systems using an edge test device. Med Phys 1998;25:102–13. https://doi.org/10.1118/1.598165.
[134] Fujita H. A simple method for determining the modulation transfer function in digital radiography. IEEE
Trans Med Imaging 1992;11:34–9. https://doi.org/10.1109/42.126908.
[135] Maidment ADA, Albert M. Conditioning data for calculation of the modulation transfer function. Med
Phys 2003;30:248–53. https://doi.org/10.1118/1.1534111.
[136] Williams MB, Mangiafico PA, Simoni PU. Noise power spectra of images from digital mammography
detectors. Med Phys 1999;26:1279–93. https://doi.org/10.1118/1.598623.
[137] Dobbins JT, Ergun DL, Rutz L, Hinshaw DA, Blume H, Clark DC. DQE(f) of four generations of
computed radiography acquisition devices. Med Phys 1995;22:1581–93.
https://doi.org/10.1118/1.597627.
[138] Monnin P, Bosmans H, Verdun FR, Marshall NW. Comparison of the polynomial model against explicit
measurements of noise components for different mammography systems. Phys Med Biol 2014;59:5741–61.
https://doi.org/10.1088/0031-9155/59/19/5741.
[139] Konstantinidis AC, Olivo A, Speller RD. Technical note: further development of a resolution modification
routine for the simulation of the modulation transfer function of digital x-ray detectors. Med Phys
2011;38:5916–20. https://doi.org/10.1118/1.3644845.
[140] Chotas HG, Floyd E, Allan G, Ravin E. Quality control phantom for digital chest radiography. Radiology
1997;202.
[141] Granfors PR, Aufrichtig R, Possin GE, Giambattista BW, Huang ZS, Liu J, et al. Performance of a 41 x
41 cm2 amorphous silicon flat panel x-ray detector designed for angiographic and R&F imaging
applications. Med Phys 2003;30:2715–26. https://doi.org/10.1118/1.1609151.
[142] Marshall NW, Smet M, Hofmans M, Pauwels H, De Clercq T, Bosmans H. Technical characterization of
five x-ray detectors for paediatric radiography applications. Phys Med Biol 2017;62.
[143] Siewerdsen JH, Antonuk LE, el-Mohri Y, Yorkston J, Huang W, Boudry JM, et al. Empirical and
theoretical investigation of the noise performance of indirect detection, active matrix flat-panel imagers
(AMFPIs) for diagnostic radiology. Med Phys 1997;24:71–89. https://doi.org/10.1118/1.597919.
[144] Nishikawa RM, Yaffe MJ, Holmes RB. Effect of finite phosphor thickness on detective quantum efficiency.
Med Phys 1989;16.
[145] Rossmann K. Point Spread-Function, Line Spread Function, and Modulation Transfer Function. Radiol
Clin North Am 1969;93:257–72. https://doi.org/10.1148/93.2.257.
[146] Chan HP, Doi K. Investigation of the performance of antiscatter grids: Monte Carlo simulation studies.
Phys Med Biol 1982;27:785–803.
[147] Mizuta M, Sanada S, Akazawa H, Kasai T, Abe S, Ikeno Y, et al. Comparison of anti-scatter grids for
digital imaging with use of a direct-conversion flat-panel detector. Radiol Phys Technol 2012;5:46–52.
https://doi.org/10.1007/s12194-011-0134-1.

[148] Samei E, Dobbins JT, Lo JY, Tornai MP. A framework for optimising the radiographic technique in
digital X-ray imaging. Radiat Prot Dosimetry 2005;114:220–9. https://doi.org/10.1093/rpd/nch562.
[149] Shaheen E, Van Ongeval C, Cockmartin L, Zanca F, Marshall N, Jacobs J, et al. Realistic simulation of
microcalcifications in breast tomosynthesis. vol. 6136 LNCS. 2010. https://doi.org/10.1007/978-3-642-
13666-5_32.
[150] Badano A, Sempau J. MANTIS: combined x-ray, electron and optical Monte Carlo simulations of indirect
radiation imaging systems. Phys Med Biol 2006;51:1545–61.
https://doi.org/10.1088/0031-9155/51/6/013.
[151] Star-Lack J, Sun M, Meyer A, Morf D, Constantin D, Fahrig R, et al. Rapid Monte Carlo simulation of
detector DQE(f). Med Phys 2014;41. https://doi.org/10.1118/1.4865761.
[152] Moore CS, Wood TJ, Saunderson JR, Beavis AW. A method to incorporate the effect of beam quality on
image noise in a digitally reconstructed radiograph (DRR) based computer simulation for optimisation of
digital radiography. Phys Med Biol 2017;62:7379–93. https://doi.org/10.1088/1361-6560/aa81fb.
[153] Mackenzie A, Dance DR, Diaz O, Young KC. Image simulation and a model of noise power spectra
across a range of mammographic beam qualities. Med Phys 2014;41. https://doi.org/10.1118/1.4900819.
[154] Xu XG. An exponential growth of computational phantom research in radiation protection, imaging, and
radiotherapy: a review of the fifty-year history. Phys Med Biol 2014;59:R233–302.
https://doi.org/10.1088/0031-9155/59/18/R233.
[155] ICRP. ICRP Publication 110. Adult Reference Computational Phantoms. Ann ICRP 2009;39.
[156] Reeves AP, Biancardi AM, Yankelevitz D, Fotin S, Keller BM, Jirapatnakul A, et al. A public image
database to support research in computer aided diagnosis. Proc 31st Annu Int Conf IEEE Eng Med Biol
Soc Eng Futur Biomed EMBC 2009 2009:3715–8. https://doi.org/10.1109/IEMBS.2009.5334807.
[157] Rodríguez Pérez S, Marshall NW, Struelens L, Bosmans H. Characterization and validation of the thorax
phantom Lungman for dose assessment in chest radiography optimization studies. J Med Imaging
2018;5:1. https://doi.org/10.1117/1.JMI.5.1.013504.
[158] Loverdos K, Fotiadis A, Kontogianni C, Iliopoulou M, Gaga M. Lung nodules: A comprehensive review
on current approach and management. Ann Thorac Med 2019;14:226–38.
https://doi.org/10.4103/atm.ATM_110_19.
[159] Laine S. A topological approach to voxelization. Comput Graph Forum 2013;32:77–86.
https://doi.org/10.1111/cgf.12153.
[160] Schneider CA, Rasband WS, Eliceiri KW. NIH Image to ImageJ: 25 years of Image Analysis. Nat
Methods 2012;9:671–5.
[161] Ullman G, Sandborg M, Hunt R, Dance DR, Carlsson GA. Implementation of pathologies in the Monte
Carlo model in chest and breast imaging 2003.
[162] IAEA. XMuDat: Photon attenuation data on PC. 1998.
[163] Lunit INSIGHT CXR n.d. https://www.lunit.io/en/products/insight-cxr.
[164] Samei E, Flynn MJ, Eyler WR. Detection of subtle lung nodules: relative influence of quantum and
anatomic noise on chest radiographs. Radiology 1999;213:727–34.
https://doi.org/10.1148/radiology.213.3.r99dc19727.
[165] Keelan BW, Topfer K, Yorkston J, Sehnert WJ, Ellinwood JS. Relative impact of detector noise and
anatomical structure on lung nodule detection. Med Imaging 2004 Image Perception, Obs Performance,
Technol Assess 2004;5372:230. https://doi.org/10.1117/12.536733.
[166] Båth M, Håkansson M, Börjesson S, Kheddache S, Grahn A, Ruschin M, et al. Nodule detection in digital
chest radiography: introduction to the RADIUS chest trial. Radiat Prot Dosimetry 2005;114:85–91.
https://doi.org/10.1093/rpd/nch575.
[167] Håkansson M, Båth M, Börjesson S, Kheddache S, Grahn A, Ruschin M, et al. Nodule detection in digital
chest radiography: Summary of the radius chest trial. Radiat Prot Dosimetry 2005;114:114–20.
https://doi.org/10.1093/rpd/nch574.

[168] Håkansson M, Båth M, Börjesson S, Kheddache S, Flinck A, Ullman G, et al. Nodule detection in digital
chest radiography: effect of nodule location. Radiat Prot Dosimetry 2005;114:92–6.
https://doi.org/10.1093/rpd/nch524.
[169] Yang W, Sirajuddin A, Zhang X, Liu G, Teng Z, Zhao S, et al. The role of imaging in 2019 novel
coronavirus pneumonia (COVID-19). Eur Radiol 2020. https://doi.org/10.1007/s00330-020-06827-4.
[170] Wong HYF, Lam HYS, Fong AHT, Leung ST, Chin TWY, Lo CSY, et al. Frequency and Distribution of
Chest Radiographic Findings in COVID-19 Positive Patients. Radiology 2019:201160.
https://doi.org/10.1148/radiol.2020201160.
[171] Caruso D, Zerunian M, Polici M, Pucciarelli F, Polidori T, Rucci C, et al. Chest CT Features of COVID-
19 in Rome, Italy. Radiology 2020:201237. https://doi.org/10.1148/radiol.2020201237.
[172] Badano A, Graff CG, Badal A, Sharma D, Zeng R, Samuelson FW, et al. Evaluation of Digital Breast
Tomosynthesis as Replacement of Full-Field Digital Mammography Using an In Silico Imaging Trial.
JAMA Netw Open 2018;1:1–12.
[173] Lemanowicz A, Leszczyński W, Rusak G, Białecki M, Ratajczak P. Chest adipose tissue distribution in
patients with morbid obesity. Polish J Radiol 2018;83:e68–75. https://doi.org/10.5114/pjr.2018.73406.
[174] Lorensen WE, Cline HE. Marching Cubes: A High Resolution 3D Surface Construction Algorithm.
SIGGRAPH Comput Graph 1987;21:163–169. https://doi.org/10.1145/37402.37422.
[175] Schmid B, Schindelin J, Cardona A, Longair M, Heisenberg M. A high-level 3D visualization API for
Java and ImageJ. BMC Bioinformatics 2010;11:274.
[176] Jacobi A, Chung M, Bernheim A, Eber C. Portable chest X-ray in coronavirus disease-19 (COVID-19):
A pictorial review. Clin Imaging 2020;64:35–42. https://doi.org/10.1016/j.clinimag.2020.04.001.
[177] Cignoni P, Rocchini C, Scopigno R. Metro: Measuring Error on Simplified Surfaces. Comput Graph
Forum 1998;17:167–74. https://doi.org/10.1111/1467-8659.00236.
[178] Lunit. Lunit Insight CXR for Covid-19 2020. https://insight.lunit.io/covid19 (accessed May 20, 2020).
[179] Lanza E, Muglia R, Bolengo I, Santonocito OG, Lisi C, Angelotti G, et al. Quantitative Chest CT analysis
in COVID-19 to predict the need for oxygenation support and intubation 2020.
https://doi.org/10.21203/rs.3.rs-30481/v1.
[180] Saood A, Hatem I. COVID-19 lung CT image segmentation using deep learning methods: U-Net versus
SegNet. BMC Med Imaging 2021;21:19. https://doi.org/10.1186/s12880-020-00529-5.
[181] Abd Elaziz M, Al-qaness MAA, Abo Zaid EO, Lu S, Ibrahim RA, Ewees AA. Automatic clustering
method to segment COVID-19 CT images. PLoS One 2021;16:e0244416.
[182] Lombardo PA. Development of the Realistic Anthropomorphic Flexible phantom for applications in
dosimetry. KU Leuven, 2018.
[183] Fetterly KA, Schueler BA. Physical evaluation of prototype high-performance anti-scatter grids: potential
for improved digital radiographic image quality. Phys Med Biol 2009;54:N37–42.
https://doi.org/10.1088/0031-9155/54/2/N02.
[184] Wang G, Liu X, Li C, Xu Z, Ruan J, Zhu H, et al. A Noise-robust Framework for Automatic Segmentation
of COVID-19 Pneumonia Lesions from CT Images. IEEE Trans Med Imaging 2020;39:1–1.
https://doi.org/10.1109/tmi.2020.3000314.
[185] Abadi E, Paul Segars W, Chalian H, Samei E. Virtual Imaging Trials for Coronavirus Disease (COVID-
19). Am J Roentgenol 2020:1–7. https://doi.org/10.2214/ajr.20.23429.
[186] Badal A, Sempau J. A package of Linux scripts for the parallelization of Monte Carlo simulations. Comput
Phys Commun 2006;175:440–50. https://doi.org/10.1016/j.cpc.2006.05.009.
[187] Petoussi-Henss N, Bolch WE, Eckerman KF, Endo A, Hertel N, Hunt J, et al. ICRP Publication 116 - The
first ICRP/ICRU application of the male and female adult reference computational phantoms. Phys Med
Biol 2014;59:5209–24. https://doi.org/10.1088/0031-9155/59/18/5209.
[188] ICRU, ICRP. Operational Quantities for External Radiation Exposure ICRU Report 95. J Int Comm
Radiat Units Meas 2020. https://doi.org/10.1093/jicru/os29.2.Report57.
[189] Chakraborty DP, Yoon H-J. JAFROC analysis revisited: figure-of-merit considerations for human
observer studies. Med Imaging 2009 Image Perception, Obs Performance, Technol Assess
2009;7263:72630T. https://doi.org/10.1117/12.810859.
[190] Chakraborty DP, Berbaum KS. Observer studies involving detection and localization: Modeling, analysis,
and validation. Med Phys 2004;31:2313–30. https://doi.org/10.1118/1.1769352.
[191] Dorfman DD, Berbaum KS, Metz CE. Receiver operating characteristic rating analysis. Generalization to
the population of readers and patients with the jackknife method. Invest Radiol 1992;27:723–31.
[192] Hillis SL, Berbaum KS, Metz CE. Recent developments in the Dorfman-Berbaum-Metz procedure for
multireader ROC study analysis. Acad Radiol 2008;15:647–61.
https://doi.org/10.1016/j.acra.2007.12.015.
[193] Chakraborty D. Manual JAFROC Analysis Software. 2007.
[194] Aichinger H, Dierker J, Joite-Barfus S, Sabel M. Radiation Exposure and Image Quality in X-Ray
Diagnostic Radiology. Berlin: Springer Berlin Heidelberg; 2004.
[195] Richard S, Siewerdsen JH, Jaffray DA, Moseley DJ, Bakhtiar B. Generalized DQE analysis of
radiographic and dual-energy imaging using flat-panel detectors. Med Phys 2005;32:1397.
https://doi.org/10.1118/1.1901203.
[196] Zhao C, Vassiljev N, Konstantinidis AC, Speller RD, Kanicki J. Three-dimensional cascaded system
analysis of a 50 μm pixel pitch wafer-scale CMOS active pixel sensor x-ray detector for digital breast
tomosynthesis. Phys Med Biol 2017;62:1994–2017. https://doi.org/10.1088/1361-6560/aa586c.
[197] Samei E, Järvinen H, Kortesniemi M, Simantirakis G, Goh C, Wallace A, et al. Medical imaging dose
optimisation from ground up: Expert opinion of an international summit. J Radiol Prot 2018;38:967–89.
https://doi.org/10.1088/1361-6498/aac575.
[198] Winslow M, Xu XG, Yazici B. Development of a simulator for radiographic image optimization. Comput
Methods Programs Biomed 2005;78:179–90. https://doi.org/10.1016/j.cmpb.2005.02.004.
[199] Cowen AR, Haywood JM, Workman A, Clarke OF. A set of X-ray test objects for image quality control
in digital subtraction fluorography. I: Design considerations. Br J Radiol 1987;60:1001–9.
[200] Rodríguez Pérez S, Marshall NW, Binst J, Coolen J, Struelens L, Bosmans H. Survey of chest radiography
systems: Any link between contrast detail measurements and visual grading analysis? Phys Medica
2020;76:62–71. https://doi.org/10.1016/j.ejmp.2020.06.014.
[201] Star-Lack J, Sun M, Meyer A, Morf D, Constantin D, Fahrig R, et al. Rapid Monte Carlo simulation of
detector DQE(f). Med Phys 2014;41:031916. https://doi.org/10.1118/1.4865761.
[202] Shrestha S, Fischer R, Matt GJ, Feldner P, Michel T, Osvet A, et al. High-performance direct conversion
X-ray detectors based on sintered hybrid lead triiodide perovskite wafers. Nat Photonics 2017;11:436–40.
https://doi.org/10.1038/nphoton.2017.94.
[203] Rabbani M, Shaw R, Van Metter R. Detective quantum efficiency of imaging systems with amplifying
and scattering mechanisms. J Opt Soc Am A 1987;4:895–901.
[204] Tanguay J, Yun S, Kim HK, Cunningham IA. Detective quantum efficiency of photon-counting x-ray
detectors. Med Phys 2015;42:491–509. https://doi.org/10.1118/1.4903503.
[205] Badano A, Gagne RM, Gallas BD, Jennings RJ, Boswell JS, Myers KJ. Lubberts effect in columnar
phosphors. Med Phys 2004;31:3122–31. https://doi.org/10.1118/1.1796151.
[206] Förster A, Brandstetter S, Schulze-Briese C. Transforming X-ray detection with hybrid photon counting
detectors. Philos Trans R Soc A Math Phys Eng Sci 2019;377. https://doi.org/10.1098/rsta.018.041.
[207] Datta A, Zhong Z, Motakef S. A new generation of direct X-ray detectors for medical and synchrotron
imaging applications. Sci Rep 2020;10:1–10. https://doi.org/10.1038/s41598-020-76647-5.
[208] Rowlands JA. Medical imaging: Material change for X-ray detectors. Nature 2017;550:47–8.
https://doi.org/10.1038/550047a.
[209] Lee H, Lee J. A deep learning-based scatter correction of simulated X-ray images. Electron 2019;8.
https://doi.org/10.3390/electronics8090944.
[210] Sharma D, Graff CG, Badal A, Zeng R, Sawant P, Sengupta A, et al. In silico imaging tools from the
VICTRE clinical trial. Med Phys 2019;46:3924–8. https://doi.org/10.1002/mp.13674.

[211] Salehinejad H, Colak E, Dowdell T, Barfett J, Valaee S. Synthesizing Chest X-Ray Pathology for Training
Deep Convolutional Neural Networks. IEEE Trans Med Imaging 2019;38:1197–206.
https://doi.org/10.1109/TMI.2018.2881415.
[212] Korkinof D, Rijken T, O’Neill M, Yearsley J, Harvey H, Glocker B. High-Resolution Mammogram
Synthesis using Progressive Generative Adversarial Networks 2018.
[213] Sorin V, Barash Y, Konen E, Klang E. Creating Artificial Images for Radiology Applications Using
Generative Adversarial Networks (GANs) - A Systematic Review. Acad Radiol 2020;27:1175–85.
https://doi.org/10.1016/j.acra.2019.12.024.

Summary

Chest radiography (CXR) remains the core imaging examination for the chest despite the availability of
diagnostically superior imaging techniques such as chest tomosynthesis and Computed Tomography (CT). CXR has
many advantages, including relatively low cost, fast acquisition times, widespread availability, and reduced dose
to the patient. Chest radiographs are used in numerous clinical tasks, such as the management of pulmonary
diseases, the assessment of bone fractures and the visualization of catheters, amongst many others. Acquisition
protocols used nowadays have their roots in those used for screen-film systems, and in many cases have not been
optimized for the latest generation of digital flat panel detectors. In digital radiography, the optimization process
should be tied to the clinical task to be performed, which is the fundamental premise behind our project. As with
all uses of radiation for medical purposes, optimization should find a balance between the image quality necessary
to perform a given imaging task and the dose delivered to the patient.

The main objective of the PhD thesis is the creation of a simulation framework that can be used in Virtual Clinical
Trials (VCTs) in chest radiography. To allow a task-based optimization, the imaging chain should also comprise
realistic anthropomorphic models including a wide range of tasks. Finally, the framework should allow the study
of different elements of the imaging chain that influence clinical task performance and organ dose.

The simulation platform was developed to meet these requirements. It uses a ray tracing technique to generate
noise-free primary images and PENELOPE/penEasy Monte Carlo simulations to generate the scatter images. The
antiscatter grid is also included in the simulations. Real detector sharpness and noise characteristics are added
to the simulated images via the Modulation Transfer Function (MTF) and the Normalized Noise Power Spectrum
(NNPS), respectively. The methodology was first validated by comparing the signal difference to noise ratio
(SDNR) in real and simulated images. The test object consisted of a poly(methyl methacrylate) (PMMA) block and
a small aluminium detail. Large area contrast and noise were verified as a function of PMMA thickness, tube
voltage and dose level, with the antiscatter grid in and out. The noise and sharpness modification process was
validated separately. Lastly, primary transmission (Tp) and total transmission (Tt) were measured experimentally
for the physical antiscatter grid and compared to those for the simulated grid. Average differences between SDNR
measured in real and simulated images were 6% and 5% for grid in and grid out, respectively. Maximum differences
of 12% were found between the simulated and measured MTF up to the Nyquist frequency. Average deviations below
5% were found between the NNPS measured in real and simulated images. Relative differences for Tp and Tt were
below 6% and 13%, respectively.
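
The SDNR comparison used in this validation can be illustrated with a minimal sketch. It assumes the common definition SDNR = |mean(background) - mean(detail)| / std(background) and rectangular regions of interest; the function names, the ROI positions and the synthetic test image are illustrative assumptions and not the implementation used in the thesis.

```python
import numpy as np

def sdnr(image, detail_roi, background_roi):
    """Signal difference to noise ratio between two regions of interest.

    Assumed definition: |mean(background) - mean(detail)| / std(background).
    Each ROI is a tuple of slices (rows, cols) indexing the 2D image.
    """
    detail = image[detail_roi]
    background = image[background_roi]
    return abs(background.mean() - detail.mean()) / background.std(ddof=1)

def relative_difference_percent(simulated, measured):
    """Relative difference of a simulated value with respect to a measurement."""
    return 100.0 * (simulated - measured) / measured

# Synthetic stand-in for a PMMA block with a small aluminium detail
# (hypothetical values): uniform background of 100 with Gaussian noise
# (sigma = 2) and a detail region whose mean pixel value is 10 lower.
rng = np.random.default_rng(0)
image = 100.0 + rng.normal(0.0, 2.0, size=(128, 128))
detail_roi = (slice(56, 72), slice(56, 72))
background_roi = (slice(8, 40), slice(8, 40))
image[detail_roi] -= 10.0

sdnr_value = sdnr(image, detail_roi, background_roi)
print(f"SDNR of the synthetic image: {sdnr_value:.2f}")
```

In a validation of this kind, the same function would be applied to the real and the simulated image of the test object, and the two SDNR values compared with `relative_difference_percent`.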

Several image segmentation techniques and polygonal mesh modelling were used in the development of the
computational phantoms. The models were based on an existing polygonal mesh phantom, the Realistic
Anthropomorphic Flexible (RAF) phantom. The polygonal mesh format of the RAF allowed the modelling of a
more realistic lung background and of different body types. For use in task-based optimization, a set of lesions
and devices commonly found in chest radiography was modelled within the phantoms. A library of 24 realistic
anthropomorphic chest phantoms was created to model different types of patients: male and female, with
different Body Mass Indexes. The clinical tasks modelled were pulmonary nodules, catheters, rib fractures,
pneumothorax and pleural effusion. A set of models depicting Covid-19 disease was also included. The realism of
the models' anatomy and of the simulated clinical tasks was first validated by experienced radiologists. The
images were then uploaded for analysis to an AI software package, which served as additional validation.

The validated simulation framework and computational phantoms were used in a Virtual Clinical Trial to
investigate the influence of different acquisition parameters on diagnostic performance. Posterior-anterior
projections of 11 phantoms including clinical tasks were generated for a range of exposure conditions (i.e., dose
level, tube voltage and antiscatter grid use). For the grid, two options were evaluated: antiscatter grid in place and
removed. Five tube voltages were used, ranging from 60 to 140 kVp in steps of 20 kVp, and four dose levels:
0.62, 1.25, 2.5 and 5.0 µGy. Clinical image processing was applied to the simulated images, which were
subsequently scored via a free-response observer study by four radiologists. The statistical analysis of the results
was done using the jackknife-alternative free-response receiver operating characteristic (JAFROC) method.
Additionally, organ dose calculations were performed for the different settings. The results of this study lead
to several practical conclusions:

• A dose reduction of 50%, from a working level of 2.5 µGy to 1.25 µGy, can be achieved without decreasing
the diagnostic performance for the clinical tasks studied.

• For grid out techniques, 100 kVp and 80 kVp did not show significant differences in diagnostic
performance. Organ doses were on average 10% lower at 100 kVp than at 80 kVp. Further investigation is
necessary, in which anterior-posterior projections are simulated and organ doses are calculated, before the
results can be applied to bedside imaging.

• For a constant target detector air kerma (DAK) and grid in technique, 120 kVp gave lower organ doses with
no significant decrease in diagnostic performance. In terms of tube voltage, this justifies the choice made in practice.

• In general, no significant difference was found between images acquired with and without the antiscatter grid.
Further analysis of the effect of the antiscatter grid is necessary. A comparison of different image processing
techniques for contrast enhancement in grid out images would be relevant.
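
The size of the exposure matrix described for the Virtual Clinical Trial can be checked with a short sketch. It assumes a full factorial design, in which every phantom is imaged under every combination of grid setting, tube voltage and dose level; the summary does not state this explicitly, and the variable names are illustrative.

```python
from itertools import product

# Acquisition parameters as listed in the summary.
grid_options = ["grid in", "grid out"]
tube_voltages_kvp = list(range(60, 141, 20))  # 60 to 140 kVp in steps of 20 kVp
dose_levels_ugy = [0.62, 1.25, 2.5, 5.0]
n_phantoms = 11

# Assumption: full factorial design over the three acquisition parameters.
conditions = list(product(grid_options, tube_voltages_kvp, dose_levels_ugy))
n_conditions = len(conditions)
n_images = n_conditions * n_phantoms

print(f"{n_conditions} exposure conditions, {n_images} images under a full factorial design")
```

Under this assumption, each additional acquisition parameter multiplies the number of images to simulate and read, which is one reason the observer study limits the number of phantoms and dose levels.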

The simulation platform developed was successfully applied to generate synthetic images of anthropomorphic
phantoms including a range of clinical tasks in a VCT for chest radiography. Although developed for CXR, the
simulation platform and the computational phantoms can be used in a wider range of applications. To our
knowledge, this is the first VCT in the field of chest radiography to include this range of clinical tasks.

Acknowledgments

“Every mountain top is within reach if you just keep climbing”

The realization of this work would not have been possible without the support and contributions (direct or indirect)
from many people, whom I would like to acknowledge today.

First, I would like to express my gratitude to my supervisors Dr. Lara Struelens, Prof. Dr. Hilde Bosmans and
Prof. Dr. Nicholas Marshall.

To Nick, thank you for your constant support and dedication to this work. For always providing your help and
ideas, especially in the detector modelling. Thank you for always agreeing to have a ‘short’ talk to discuss new
and/or unusual results. For being an endless source of knowledge, always with the right reference in mind.

To Hilde, thank you for all the guidance, for proposing new ideas in every meeting and for steering the work in
the right direction always with an eye on the big picture. Thank you for encouraging me to take on challenging
projects and to always smile. Thank you for making me believe in myself and the work I was doing in the moments
of greater doubt.

To Lara, thank you for all the help and support throughout all these years, since the first time I came to SCK for
an internship as a master's student. You were the first person who encouraged me to apply for this PhD. Thank you
for always finding time to review my texts and presentations and for always asking the right questions.

I would like to express my gratitude to Prof. Dr. Tania Roskams for chairing the defence and to the members of
the jury: Prof. Dr. Josep Sempau, Prof. Dr. Klaus Bacher, Prof. Dr. Mathias Prokop, Prof. Dr. Walter De Wever
and Prof. Dr. Tom Depuydt, for the time dedicated to reading the manuscript and for their valuable comments and
suggestions, which helped improve the level of this work.

Moreover, I would like to acknowledge the financial support of the SCK·CEN for making this PhD project
possible.

I wish to thank all the people whose assistance was critical to the successful completion of this project. To Prof.
Johan Coolen for his invaluable help and advice regarding the modelling of the phantoms and the clinical tasks,
thank you for always finding time in your busy schedule to review my images. To Dirk Vandenbroucke from
Agfa, for his prompt help applying clinical processing to the images. To all the readers who participated in the
different observer studies, Dr. Charlotte Biebaû, Dr. Jeroen Desmet, Dr. Rolf Symons, Dr. Kathleen Dhont, Dr.
Bernard Sneyers, Dr. Laurence Verhaeghe and Dr. Kristof Coursier. To Frederik De Keyzer for his help with the
statistical analysis of the reading study. To Lesley Cockmartin for sharing all her experience with observer studies
and for all the help with the ethical approval process and with finding images for my models. I also had the
opportunity to guide two wonderful students, Philippe Moussalli and Dayana Castillo; thank you both for your
contribution to this work.

Thank you to all my colleagues from the Medical Physics & Quality Assessment group. Even though I did not spend
much time in Leuven, you always made me feel welcome and part of the group: thank you to Janne, Lesley, Mitko,
Liesbeth, An, Michiel, Gati, Stoyko, Joke, Annelies and Kim. Special thanks to Janne and Liesbeth, who were always
willing to help me when I was not present in Leuven; you spared me many trips from Mol. To Joke for going all
around Flanders with me to perform measurements. Thank you to Linda and Ingrid for helping me with all the
practical arrangements of the thesis. To Mitko and Lesley for the help with the public and online defence.

To all my colleagues from SCK-CEN, with whom I hope to continue working for many more years. Thank you
to Filip, Clarita, Dayana, Luana, Giulia, Ingrid, Diane, Jérémie, Mahmoud and Cristian. To the zie Giulia and
Dayana, thank you for being there and for all our pizza nights with Caso Cerrado. To Luana and Edilaine, it was
great to have you as friends when moving to Mol; thank you for all the support and all the nights of bowling and
sushi. Here I also include Luis and Ioannis. To Clarita, it was so nice to share an office for many years and to be able
to speak Spanish away from home; thank you for all your encouragement and all the advice regarding living in
Belgium. To Ingrid and Diane, thank you for always taking care of us.

To all my Cuban friends here in Belgium, thank you for all the shared moments in which I could feel as if I were
back home. To Wally, Marlo, Wilfredo and Myriam, you took me in when I first arrived in Belgium and made
me feel like family; for that I am forever grateful. To Ana and Piet, Ibrahin and Amadys, Yamiel and Ivelisse,
Gustavo and my dear poison Diana.

Of course, my greatest gratitude goes to my mum and my dad; without them I would not have reached my goals.
Thank you for your love and your unconditional support, even from afar; this achievement is also yours. To all my family
scattered around the world, thank you for the affection and the support. To Meiby and Cynthia, for all these years of friendship.
I would like to make a special dedication to my grandmother and my tata.

Thanks also to Silvana, Michele, Andrea and Francesco for welcoming me into their family.

To Pasquale, thank you for allowing me to use the RAF and for introducing me to the world of the phantoms.
More importantly, thank you for being there for me every step of the way. Thank you for all your love, understanding
and patience. I know it has not always been easy, but thanks to you and to our Aurora I found the motivation and
strength to keep going.

My last words go to Aurora, who came to revolutionise my life: you are my greatest source of inspiration.

Sunay

Curriculum Vitae

Sunay Rodríguez Pérez is a PhD student at the Katholieke Universiteit Leuven and the Belgian Nuclear Research
Centre (SCK-CEN), Belgium. She received her BSc degree in Nuclear Physics in 2012 and her MSc degree in Nuclear
Physics in 2014, both from the Higher Institute of Technologies and Applied Sciences (InSTEC), Havana,
Cuba. Her PhD focuses on the creation of a simulation framework for task-based optimization studies in chest
radiography.

Publications

[1] Rodríguez Pérez, S., Coolen, J., Marshall, N. W., Cockmartin, L., Biebaû, C., Desmet, J., Bosmans,
H. (2021). Methodology to create 3D models of COVID-19 pathologies for virtual clinical trials. J Med
Imaging, 8(Suppl 1), 013501. doi:10.1117/1.JMI.8.S1.013501

[2] Rodríguez Pérez S., Marshall N. W., Binst J., Coolen J., Struelens, L. and Bosmans H. “Survey of
chest radiography systems: Any link between contrast detail measurements and visual grading
analysis?”, 2020, Physica Medica, Vol 76.

[3] Rodríguez Pérez, S., Moussalli, P., Bosmans, H., Struelens, L., & Marshall, N. System detective
quantum efficiency (DQESYS) as an index of performance for chest radiography system (bucky and
bedside) at four patient equivalent thicknesses. Proc. SPIE 10948, Medical Imaging 2019: Physics of
Medical Imaging, 1094842 (1 March 2019); doi: 10.1117/12.2512167

[4] Rodríguez Pérez, S., Marshall N. W., Struelens, L. and Bosmans H. “Characterization and validation
of the thorax phantom Lungman for dose assessment in chest optimization studies”, Feb 2018, Journal
of Medical Imaging, Vol 5, Num 1.

[5] Vignero, J., Rodríguez Pérez, S., Marshall, N. W. & Bosmans, H. “Minimizing the scatter
contribution and spatial spread due to the absorption grating G2 in grating-based phase-contrast
imaging”, Mar 2018, Proceedings SPIE Volume 10573, Medical Imaging 2018: Physics of Medical
Imaging; 105734H.

[6] Saldarriaga Vargas, C., Rodríguez Pérez, S., Baete, K., Pommé, S., Paepen, J., Van Ammel,
R., Struelens, L. “Intercomparison of 99mTc, 18F and 111In activity measurements with radionuclide
calibrators in Belgian hospitals”, Jan 2018, Physica Medica, Vol 45

[7] Rodríguez Pérez, S., Marshall N. W., Struelens, L. and Bosmans H. “Validation study of the thorax
phantom Lungman for optimization purposes” Proceedings SPIE 10132, Medical Imaging 2017:
Physics of Medical Imaging, 1013258.

[8] Monnin P., Verdun F. R., Bosmans H., Rodríguez Pérez, S. and Marshall N. W. “A comprehensive
model for x-ray projection imaging system efficiency and image quality characterization in the
presence of scattered radiation” 2017 Physics in Medicine & Biology, Volume 62, Number 14.

Oral and Poster Presentations

[1] Rodríguez Pérez, S., Castillo Seoane D., Struelens L., Bosmans H., Coolen J., Marshall N.
W. Creation of a set of computational phantoms including clinical task for optimization studies in chest
radiography. Workshop on Computational Phantoms, July 2019. Poster Presentation.
[2] Rodríguez Pérez S., Marshall N. W., Binst J., Coolen J., Struelens, L. and Bosmans H. Snapshot of
chest radiography in Flanders: Any link between physical and clinical image quality? BHPA 2019,
February 2019, Aalst, Belgium. Oral Presentation.
[3] Rodríguez Pérez, S., Struelens L., Castillo Seoane D., Coolen J., Bosmans H., Marshall N. W.
Creation of a set of computational phantoms for clinical task-based optimization studies in chest
radiography. BHPA 2019, February 2019, Aalst, Belgium. Oral Presentation.
[4] Rodríguez Pérez, S., Moussalli, P., Bosmans, H., Struelens, L., & Marshall, N. System detective
quantum efficiency (DQESYS) as an index of performance for chest radiography system (bucky and
bedside) at four patient equivalent thicknesses. SPIE Medical Imaging, San Diego, California, USA,
February 2019. Poster Presentation.
[5] Smet M., Breysem L., Breysem M., Bosmans, H., Marshall N. and Rodríguez Pérez, S. Impact of
detector pixel size on clinical image quality of digital neonatal chest X-ray images. European Congress
of Radiology (ECR 2019), Vienna, Austria, February 2019. Oral presentation.
[6] Rodríguez Pérez, S., Bosmans H., Struelens L., Coolen J., Marshall N. W. Combined use of DAP and
EI for improving outlier selection in dose monitoring for projection radiology. European Congress of
Radiology (ECR 2018), Vienna, Austria, February 2018. Poster presentation.
[7] Vignero, J., Rodríguez Pérez, S., Marshall, N. W. & Bosmans, H. Minimizing the scatter contribution
and spatial spread due to the absorption grating G2 in grating-based phase-contrast imaging. SPIE
Medical Imaging, Houston, Texas, USA, 10-15 February 2018. Poster presentation.
[8] Vignero, J., Rodríguez Pérez, S., Marshall, N. W. & Bosmans, H. Contribution of coherent and
incoherent scatter in grating-based phase-contrast imaging. International Conference on Monte Carlo
Techniques (MCMA), Naples, Italy, 15-18 October 2017. Oral presentation.
[9] Rodríguez Pérez, S., Marshall N. W., Struelens, L. and Bosmans H. A simulation platform for virtual
clinical trials in chest X-ray imaging. Latin-American Symposium on Nuclear Physics and
Applications (LASNPA-WONP-NURT) La Habana, Cuba, October 2017. Oral presentation.
[10] Rodríguez Pérez, S., Marshall N. W., Struelens, L. and Bosmans H. Validation study of the thorax
phantom Lungman for optimization purposes. SPIE Medical Imaging, February 2017, Orlando, USA.
Poster Presentation.
[11] Rodríguez Pérez, S., Marshall N. W., Struelens, L. and Bosmans H. Implementation and
validation of a methodology for image simulation to be used in optimization of chest radiography.
BHPA 2017, February 2017, Ghent, Belgium. Oral Presentation.
[12] Rodríguez Pérez, S., Marshall N. W., Struelens, L. and Bosmans H. Implementation and validation
of a simulation framework for optimization studies in chest radiography. Training School: Advanced 3D
Imaging for biology samples, Maxima Project, August 2016, Leuven, Belgium. Oral Presentation.
