Cheng.Ni
Radiotherapy Business Unit
Shanghai United Imaging
Healthcare Co Ltd
Shanghai, China
cheng_ni_sjtu@163.com
Abstract—Accurate dose prediction has been shown to improve radiotherapy planning efficiency. Recently, deep neural networks have been applied in this area and have made some progress. However, existing deep-learning-based methods cannot accurately predict the dose distribution for tumors whose location varies between patients, such as in lung cancer. This article proposes a new deep neural network, CAD-UNet, that combines 3D U-Net, dense connection, and the SE-net architecture. Spatial distance information is used as a special input channel in addition to contour information. The Dice similarity coefficient of the planning target volume (PTV) region was added to the mean squared error (MSE) loss function. A cohort of 192 VMAT plans for lung cancer patients was selected for this study. The trained CAD-UNet and HD-UNet were evaluated on the test cases. The dose parameters derived from the predicted dose distribution were used to generate new plans in the treatment planning system (TPS). The results showed that CAD-UNet can successfully predict the dose distribution of lung cancer cases in VMAT, outperforming HD-UNet in PTV region homogeneity. Regenerated plans based on the predicted dose showed improvements in the DVHs of organs-at-risk (OAR). These improvements showed that CAD-UNet has the potential to guide dosimetrists in the radiotherapy planning stage.

Keywords—CAD-UNet; deep neural network; radiotherapy; dose prediction; lung cancer

I. INTRODUCTION
Lung cancer is the most commonly diagnosed cancer and the most common cause of cancer death [1]. Radiation therapy is an important treatment modality for lung cancer and is often used as adjuvant therapy. About half of lung cancer patients are advised to receive radiotherapy, which can improve local control, survival, and quality of life. The treatment outcome of radiotherapy for early non-small cell lung cancer (NSCLC) is comparable to that of surgical treatment, but without an operative wound [2]. Intensity-modulated radiation therapy (IMRT) and volumetric modulated arc therapy (VMAT) are the most commonly used techniques. Compared to IMRT, VMAT can achieve a better dose distribution, namely higher target dose coverage and lower organs-at-risk (OAR) dose [3, 4]. However, VMAT planning is time-consuming and often requires experienced dosimetrists to tune the dose-volume constraints manually by trial and error.

To overcome this problem, several methods have been proposed for the automatic treatment planning process, including Erasmus-iCycle [5], Auto-Planning Engine [6], and the knowledge-based planning (KBP) strategy [7]. All of these methods need accurately predicted dose-volume histograms (DVHs) or voxel-wise dose distributions that serve as optimization constraints or optimization goals for IMRT or VMAT planning. Because voxel-wise dose prediction provides more spatial information and the DVH can easily be calculated from it, voxel-wise dose prediction seems to have more potential for automatic planning in radiotherapy. There are numerous studies on voxel-wise dose prediction [8-11]. These studies all used neural networks with images or 3D volumes as input and predicted three-dimensional dose distributions. However, most studies focus on cases in which tumor sites are relatively similar among patients. Barragán-Montero [11] added beam angle information as input to predict the lung cancer dose distribution of IMRT plans and showed relatively high precision, but for VMAT plans the beam angle information is not as important as for IMRT plans. This study aims to predict the VMAT dose distribution for tumors at various locations without user-specified beam angles.
In this study, we proposed a new model named Channel Attention Densely-connected U-Net (CAD-UNet), combining 3D U-Net, the SE-net module, and dense connection into a novel deep neural network. In addition to contour information of
Authorized licensed use limited to: San Francisco State Univ. Downloaded on June 22,2021 at 01:42:54 UTC from IEEE Xplore. Restrictions apply.
regions of interest (ROI), spatial distance information is added. The Dice similarity coefficient of the PTV was combined with the mean squared error (MSE) global loss function. The predicted results were then compared with the HD-UNet [10] results. Finally, we compared plans regenerated using the predicted DVHs with the original manual plans.
II. METHODS
A. Deep neural network
The new deep neural network architecture was built by combining 3D U-Net with a concatenation operator for feature incorporation. The architecture for dose prediction is illustrated in Fig. 1.
To utilize the channels of each layer more efficiently, the SE-dense-block was proposed, as shown in Fig. 2. The SE-dense-block incorporates the SE-block [16] and dense connection. This block uses a Squeeze-and-Excitation module to add channel attention, followed by a standard 3D convolution; the previous feature sets are then concatenated with the new feature sets. In the whole architecture shown in Fig. 1, the SE-dense-block alternates with a standard convolutional layer. The numbers of new filters were 32, 64, 128, 256, and 512, respectively, with the feature map size halved after each max-pooling layer. Besides the basic layers of U-Net, a GroupNorm layer [14] was added after each convolutional layer. The Rectified Linear Unit (ReLU) [13] was used as the activation function after each GroupNorm layer, except after the last convolutional layer. The proposed model was implemented in Keras [12] with TensorFlow as the backend.
Fig. 2. SE-dense-block
Fig. 3. Spatial distance channel
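The data flow of the SE-dense-block described above (channel attention, then a convolution, then concatenation of previous and new feature sets) can be illustrated with a minimal framework-free sketch. This is not the authors' Keras implementation: the function names, the flattened list-of-lists layout, the weight matrices `w_reduce` and `w_expand`, and the `conv` callable standing in for the 3D convolution are all assumptions made for illustration, and GroupNorm and ReLU after the convolution are omitted.

```python
import math


def squeeze_excite(channels, w_reduce, w_expand):
    """Re-weight feature channels with squeeze-and-excitation attention.

    channels: list of C feature maps, each flattened to a list of floats.
    w_reduce: R x C weight rows of the bottleneck layer (hypothetical values).
    w_expand: C x R weight rows of the gating layer (hypothetical values).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(fmap) / len(fmap) for fmap in channels]
    # Excitation, step 1: bottleneck dense layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w_reduce]
    # Excitation, step 2: dense layer followed by a sigmoid gate per channel.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w_expand]
    # Scale: multiply every voxel of a channel by that channel's gate.
    return [[gate * v for v in fmap] for gate, fmap in zip(gates, channels)]


def se_dense_block(channels, w_reduce, w_expand, conv):
    """Channel attention, then convolution, then dense concatenation."""
    attended = squeeze_excite(channels, w_reduce, w_expand)
    new_features = conv(attended)  # stand-in for the standard 3D convolution
    # Dense connection: keep the previous feature maps next to the new ones.
    return channels + new_features


# Toy usage: two input feature maps and a one-filter stand-in "convolution".
features = [[1.0, 3.0], [2.0, 2.0]]
toy_conv = lambda maps: [[a + b for a, b in zip(*maps)]]
out = se_dense_block(features, [[0.0, 0.0]], [[0.0], [0.0]], toy_conv)
print(len(out))  # 3 feature maps: 2 previous + 1 new
```

Because the output keeps the previous feature maps, the channel count grows with every block, which is why the text speaks of the number of new filters per level.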
The Adam optimization algorithm was used to minimize the loss function value between the predicted dose and the original plan dose. The He_normal [15] initialization method was used to initialize the network parameters. The training batch size, learning rate, and number of epochs were set to 2, 10⁻⁶, and 100, respectively.
The network input includes 7 channels, and the output is the 3D predicted dose distribution. The first 6 channels are the 3D mask cubes for 6 ROIs: PTV, heart, lungs, spinal cord, lungs minus the target, and body. Notably, the value of each voxel in the PTV mask was set to the prescription dose value, while the voxel value of the other masks was set to 1. The last channel carries spatial distance information to strengthen the PTV location information. This channel represents the shortest Euclidean distance from each voxel inside the body to the center point of the PTV. The values of voxels belonging to the PTV were set to 0. This channel was then normalized, as shown in Fig. 3.
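One possible way to build the spatial distance channel is sketched below in plain Python. Several details are assumptions, because the text does not specify them: the PTV center point is taken as the centroid of the PTV voxels, voxels outside the body are set to 0, normalization divides by the maximum distance, and the default voxel spacing (3 mm slices, 2.5 mm in-plane) is borrowed from the plan database description. The function name `distance_channel` is hypothetical.

```python
import math


def distance_channel(body_mask, ptv_mask, spacing=(3.0, 2.5, 2.5)):
    """Build a normalized distance-to-PTV-center channel.

    body_mask, ptv_mask: nested lists indexed [z][y][x] holding 0 or 1.
    spacing: voxel size in mm along (z, y, x); assumed values.
    """
    ptv_voxels = [(z, y, x)
                  for z, plane in enumerate(ptv_mask)
                  for y, row in enumerate(plane)
                  for x, value in enumerate(row) if value]
    if not ptv_voxels:
        raise ValueError("PTV mask is empty")
    # Assumption: the "center point" is the centroid of the PTV voxels.
    center = tuple(sum(v[axis] for v in ptv_voxels) / len(ptv_voxels)
                   for axis in range(3))

    channel = [[[0.0 for _ in row] for row in plane] for plane in body_mask]
    for z, plane in enumerate(body_mask):
        for y, row in enumerate(plane):
            for x, inside in enumerate(row):
                # PTV voxels and voxels outside the body stay at 0.
                if inside and not ptv_mask[z][y][x]:
                    channel[z][y][x] = math.sqrt(sum(
                        ((index - c) * size) ** 2
                        for index, c, size in zip((z, y, x), center, spacing)))

    # Assumption: normalize to [0, 1] by dividing by the largest distance.
    peak = max(v for plane in channel for row in plane for v in row)
    if peak > 0:
        channel = [[[v / peak for v in row] for row in plane]
                   for plane in channel]
    return channel


# Toy usage: a single row of five body voxels with the PTV in the middle.
body = [[[1, 1, 1, 1, 1]]]
ptv = [[[0, 0, 1, 0, 0]]]
print(distance_channel(body, ptv))  # [[[1.0, 0.5, 0.0, 0.5, 1.0]]]
```

For full-size volumes the nested loops would be replaced by array operations; the loops are kept here only to make each step explicit.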
Fig. 1. The deep neural network architecture. The black numbers above the blocks represent the number of feature maps at the corresponding layer.
Because the PTVs of different cases were scattered all over the lungs and the PTV volumes were relatively small compared to the lungs, the Dice similarity coefficient of the PTV region, shown in (1), was introduced as part of the loss function during training to better identify the PTV location.
$$PTV_{dice} = \frac{2\,Mask_{pred} \cap Mask_{PTV}}{Mask_{pred} + Mask_{PTV}} \qquad (1)$$
$$loss = \frac{1}{n}\sum_{i=1}^{n}\left(D_{pred,i} - D_{original,i}\right)^{2}\left[1 - \left(PTV_{dice}\right)\right] \qquad (2)$$
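As a rough illustration of how (1) and (2) fit together, the sketch below computes a Dice coefficient from two binary masks and uses it to scale the voxel-wise MSE. It assumes that (2) multiplies the MSE term by 1 − PTV_dice, and it takes the predicted mask as a given input because the text does not state how that mask is derived from the predicted dose. The function names are hypothetical, and a trainable Keras loss would need a differentiable formulation rather than this plain-Python version.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity of two binary masks given as flat 0/1 sequences."""
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Two empty masks are treated as a perfect match.
    return 2.0 * intersection / total if total else 1.0


def dice_weighted_mse(d_pred, d_original, mask_pred, mask_ptv):
    """MSE between dose cubes, scaled by (1 - PTV Dice) as read from (2)."""
    n = len(d_pred)
    mse = sum((p - o) ** 2 for p, o in zip(d_pred, d_original)) / n
    return mse * (1.0 - dice_coefficient(mask_pred, mask_ptv))


# Toy usage with four voxels: MSE is 1.0 and the masks overlap with Dice 0.5.
loss = dice_weighted_mse(
    d_pred=[1.0, 2.0, 3.0, 4.0],
    d_original=[1.0, 2.0, 5.0, 4.0],
    mask_pred=[0, 1, 1, 0],
    mask_ptv=[0, 0, 1, 1],
)
print(loss)  # 0.5
```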
D_pred represents the predicted dose cube and D_original represents the original dose cube; i is the index of the voxel and n is the total number of voxels.
B. Plan database and Evaluation
In total, 192 lung cancer cases were used in this study: 152 cases for training, 20 cases for validation, and 20 cases for testing. In this database, the numbers of patients with a tumor in the left and the right lung were 90 and 102, respectively. All radiotherapy plans were made using the treatment planning system (TPS) from Shanghai United Imaging Healthcare, with the initial optimization constraints given in TABLE Ⅰ followed by several manual adjustments. All plans satisfied the clinical requirement that 95% of the PTV region be covered by the prescription dose, 50 Gy. All structure contours and VMAT plan dose distributions were resized to 256 × 256 at each CT slice, and the voxel resolution was set to 2.5 mm × 2.5 mm. The CT slice thickness is 3 mm. Because of the large computational cost of the 3D model, patch input was used, with the patch size set to 256 × 256 × 16 for each input channel. Patch input also enlarges the training set and plays the role of data augmentation. Hence, no other data augmentation was used in pre-processing.
To evaluate the accuracy of the predicted dose, the dose distribution, dose-volume histograms (DVHs), homogeneity index (HI) [17] of the PTV, and conformity index (CI) [18] were compared among the CAD-UNet predicted dose, the HD-UNet [10] predicted dose, and the original plan dose. The definitions of HI and CI are given in (3) and (4). Dose-volume constraints derived from the predicted DVHs, combined with the initial optimization constraints, were used to re-plan the test cases.
$$HI = \frac{D_{2} - D_{98}}{D_{50}} \qquad (3)$$
$$CI = \frac{PIV}{TV} \qquad (4)$$
where D_n is the dose received by more than n percent of the volume, PIV is the prescription isodose volume, and TV is the target volume.
III. RESULTS
After 100 epochs, the global loss value between the predicted dose and the original dose decreased from 1.4156 to 0.1486 for our CAD-UNet, compared with a decrease from 1.5214 to 0.1889 for HD-UNet. For the 20 test cases, the MSE values were 0.1647 ± 0.0982 and 0.5864 ± 0.2584, and the average Dice similarity coefficients were 93.6543% ± 4.6547% and 92.5844% ± 6.1549%, for the two models respectively, taking the original plan dose as the baseline.
Fig. 4 shows the dose distributions from the original plans, the CAD-UNet predictions, and the HD-UNet predictions for two test cases with different tumor locations; the unit of the color bar is Gy. Test case 1 and test case 2 have PTVs in the left lung and the right lung, respectively. As observed from Fig. 4, both models can predict the dose distribution for cases with various PTV locations.
In addition, the CI and HI of the PTV were calculated for the two models' results and the original plans to evaluate the predicted PTV dose. As observed in Fig. 5, the test case results of our model showed better dose homogeneity and conformity index in the PTV region than the HD-UNet results. The mean ± standard deviation of HI for the CAD-UNet results, the HD-UNet results, and the original dose distributions were 0.0993 ± 0.0369, 0.1969 ± 0.0132, and 0.0450 ± 0.0064, and those of CI were 0.9353 ± 0.0620, 0.6194 ± 0.0774, and 0.9650 ± 0.0124, respectively. The P-values between the two models' results for CI and HI were 0.00003182 and 0.00000297, respectively.
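The HI and CI definitions in (3) and (4) can be sketched in plain Python as follows. The helper names are hypothetical, and a few choices are assumptions rather than details from the text: D_n is read from the sorted voxel doses as the dose reached by at least n percent of the voxels, the prescription isodose volume is counted over the whole dose grid, and volumes are measured in voxel counts (which cancels in the ratio when all voxels have the same size).

```python
import math


def dose_at_volume(doses, percent):
    """Return D_n: the lowest dose among the hottest `percent` % of voxels."""
    if not doses:
        raise ValueError("structure contains no voxels")
    ranked = sorted(doses, reverse=True)
    # Number of voxels making up `percent` % of the structure (at least one).
    count = max(1, math.ceil(len(ranked) * percent / 100))
    return ranked[count - 1]


def homogeneity_index(ptv_doses):
    """HI = (D2 - D98) / D50, computed over the voxels inside the PTV."""
    d2 = dose_at_volume(ptv_doses, 2)
    d98 = dose_at_volume(ptv_doses, 98)
    d50 = dose_at_volume(ptv_doses, 50)
    return (d2 - d98) / d50


def conformity_index(dose, ptv_mask, prescription):
    """CI = PIV / TV, with both volumes expressed as voxel counts."""
    piv = sum(1 for d in dose if d >= prescription)  # prescription isodose volume
    tv = sum(1 for m in ptv_mask if m)               # target volume
    return piv / tv


# Toy usage: 100 PTV voxels with doses 1..100 and a 10-voxel dose grid.
ptv_doses = [float(i) for i in range(1, 101)]
print(dose_at_volume(ptv_doses, 2))   # 99.0
print(dose_at_volume(ptv_doses, 98))  # 3.0
grid = [52, 51, 50, 49, 10, 55, 5, 0, 50, 48]
mask = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
print(conformity_index(grid, mask, prescription=50))  # 1.25
```

A lower HI indicates a more uniform PTV dose, while a CI close to 1 indicates that the prescription isodose volume is close in size to the target volume.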
Fig. 7. Comparison of DVHs derived from the original and re-generated plan dose. (Solid line) original dose; (dotted line) re-generated plan dose; (upper) test case 1; (lower) test case 2.
to better identify the spatial relationship between the PTV and the skin and other OARs.
Another important innovation of the proposed model is the use of the SE-dense-block to strengthen the weights of some channels and increase training efficiency. The HD-UNet results showed unsmooth dose distributions in the PTV and other regions. Using dense connection alone was found to involve too much redundant information in the training process, and more training epochs might be needed to eliminate it. Although HD-UNet performed better than the original Dense-Net, it still could not predict lung cases well, because the PTVs in lung cases are at various locations and the network needs to concentrate more on information from the PTV region. In other radiotherapy tumor cases, the tumor locations were similar among different patients [8-11]. Although the input channels include spatial information about the PTV region, this was not enough for predicting complicated cases. Nguyen [10] used HD-UNet to predict dose for head and neck cancer, in which the tumor locations were almost in the middle of the head in the 2D image. The HI of their results was 0.08 ± 0.02, which is close to the ground truth of 0.06 ± 0.04. However, as shown in Fig. 5, HD-UNet could not achieve similarly good results for the lung cases in this study. The newly proposed CAD-UNet brought the values of HI and CI closer to the ground truth than HD-UNet did. Although the standard deviation of our HI and CI results was a little larger than that of HD-UNet, our worst result is still better than the best HD-UNet result.
The Fig. 6 results indicated that, by taking the predicted DVHs into consideration, physicists could achieve a lower OAR dose with less manual tuning while maintaining the PTV dose index. This might be because some original plans were not pushed to the physical limits, due to lack of planner experience and insufficient planning time. Our predicted dose has the potential to serve as a guideline for planners to achieve a better radiotherapy plan in a shorter time.
The predicted dose is still not very precise, especially in areas away from the PTV region, which could lead to poor results for auto-planning methods that use voxel dose rather than DVHs as input. This might have several causes. One of the main reasons is the quality of the training data: the judgment of radiotherapy plan quality is subjective and patient-specific, and differs from physicist to physicist and patient to patient. Another reason is the limited training cohort size of fewer than 200 cases, which may impair the network's ability to predict complicated cases.
In conclusion, this study proposes a new deep learning network, CAD-UNet, that combines densely connected channel attention with the U-Net architecture. In addition to contour information as input, spatial distance information was added as an extra input channel, and the PTV Dice was added as part of the global loss function. The results demonstrate better homogeneity of the dose distribution in PTV regions than HD-UNet, and the DVH improvement from the original plans to the re-generated plans indicates that the predicted dose has the potential to guide doctors, physicists, and dosimetrists in the radiotherapy planning stage.

ACKNOWLEDGMENT
Thanks to Dr. Supratik Bose for his insightful advice on experiment design and manuscript revising.

REFERENCES
[1] Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68(6):394-424.
[2]
[3] Mathieu D, Campeau MP, Bahig H, Larrivee S, Vu T, Lambert L, et al. Long-term quality of life in early-stage non-small cell lung cancer patients treated with robotic stereotactic ablative radiation therapy. Pract Radiat Oncol. 2015;5(4):e365-73.
[4] Otto K. Volumetric modulated arc therapy: IMRT in a single gantry arc. Med Phys. 2008;35:310-17.
[5] Yao CH, Chang TH, Tsai MJ, et al. Dose verification of volumetric modulation arc therapy by using a NIPAM gel dosimeter combined with a parallel-beam optical computed tomography scanner [J]. J Radioanal Nucl Chem. 2017;311(2):1277-1286.
[6] Breedveld S, Storchi PRM, Voet PWJ, Heijmen BJM. iCycle: Integrated, multicriterial beam angle, and profile optimization for generation of coplanar and noncoplanar IMRT plans. doi:10.1118/1.3676689.
[7] Xhaferllari I, Wong E, Bzdusek K, Lock M, Chen JZ. Automated IMRT planning with regional optimization using planning scripts. J Appl Clin Med Phys. 2013;14:176–191.
[8] Dosimetric features-driven machine learning model for DVH prediction in VMAT treatment planning.
[9] Ma M, Kovalchuk N, Buyyounouski MK, Xing L, Yang Y. Incorporating dosimetric features into the prediction of 3D VMAT dose distributions using deep convolutional neural network. Phys Med Biol. 2019;64(12):125017.
[10] Song Y, Hu J, Liu Y, Hu H, Huang Y, Bai S, et al. Dose prediction using a deep neural network for accelerated planning of rectal cancer radiotherapy. Radiother Oncol. 2020;149:111-6.
[11] Nguyen D, Jia X, Sher D, Lin MH, Iqbal Z, Liu H, Jiang SB. Three-dimensional radiotherapy dose prediction on head and neck cancer patients with a hierarchically densely connected U-net deep learning architecture. Phys Med Biol. 2019.
[12] Barragán-Montero AM, Nguyen D, Lu W, Lin MH, Norouzi-Kandalan R, Geets X, Sterpin E, Jiang SB. Three-dimensional dose prediction for lung IMRT patients with deep neural networks: robust learning from heterogeneous beam configurations. Med Phys. 2019.
[13] Chollet F. Keras: deep learning library for Theano and TensorFlow. 2015. https://keras.io
[14] Glorot X, Bordes A, Bengio Y. Deep sparse rectifier neural networks. 2010.
[15] Wu Y, He K. Group normalization. In The European Conference on Computer Vision (ECCV), 2018.
[16] He K, Zhang X, Ren S, Sun J. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In ICCV, 2015.
[17] Hu J, Shen L, Sun G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7132–7141, 2018.
[18] International Commission on Radiation Units and Measurements. ICRU 83. Prescribing, recording and reporting photon beam intensity-modulated radiation therapy (IMRT). Report No: 83. Washington DC. 2010.
[19] Feuvret L, Noël G, Mazeron JJ, Bey P. Conformity index: a review. Int J Radiat Oncol Biol Phys. 2006;64:333–42. doi: https://doi.org/10.1016/j.ijrobp.2005.09.028