
Paper

IONIZING RADIATION ALTERS THE TRANSITION/TRANSVERSION RATIO IN THE EXOME OF HUMAN GINGIVA FIBROBLASTS

Neetika Nath,1,2✝ Lisa Hagenau,1✝ Stefan Weiss,1 Ana Tzvetkova,1,2 Lars R. Jensen,1 Lars Kaderali,2
Matthias Port,3 Harry Scherthan,3 and Andreas W. Kuss1

Abstract: Little is known about the mutational impact of ionizing radiation (IR) exposure on a genome-wide level in mammalian tissues. Recent advancements in sequencing technology have provided powerful tools to perform exome-wide analyses of genetic variation. This also opened up new avenues for studying and characterizing global genomic IR-induced effects. However, genotypes generated by next generation sequencing (NGS) studies can contain errors, which may significantly impact the power to detect signals in common and rare variant analyses. These genotyping errors are not explicitly detected by the standard Genotype Analysis ToolKit (GATK) and Variant Quality Score Recalibration (VQSR) tool and thus remain a potential source of false-positive variants in whole exome sequencing (WES) datasets. In this context, the transition-transversion ratio (Ti/Tv) is commonly used as an additional quality check. In the case of IR experiments, this is problematic when Ti/Tv itself might be influenced by IR treatment. It was the aim of this study to determine a suitable threshold for variant filters for NGS datasets from irradiated cells in order to achieve high data quality using Ti/Tv, while at the same time being able to investigate radiation-specific effects on the Ti/Tv ratio for different radiation doses. By testing a variety of filter settings and comparing the obtained results with publicly available datasets, we observe that a coverage filter setting of depth (DP) 3 and genotype quality (GQ) 20 is sufficient for high quality single nucleotide variant (SNV) calling in an analysis combining GATK and VQSR and that Ti/Tv values are a consistent and useful indicator for data quality assessment for all tested NGS platforms. Furthermore, we report a reduction in Ti/Tv in IR-induced mutations in primary human gingiva fibroblasts (HGFs), which points to an elevated proportion of transversions among IR-induced SNVs and thus might imply that mismatch repair (MMR) plays a role in the cellular damage response to IR-induced DNA lesions.
Health Phys. 119(1):109–117; 2020

Key words: analysis, statistical; dose; exposure, radiation; genetic effects

1Department of Functional Genomics, Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, Greifswald, Germany; 2Institute of Bioinformatics, University Medicine Greifswald, Greifswald, Germany; 3Bundeswehr Institute for Radiobiology, University of Ulm, München, Germany.
✝Equal contribution
The authors declare no conflicts of interest.
For correspondence contact: Andreas W. Kuss, Universitätsmedizin Greifswald, Interfakultäres Institut für Genetik und Funktionelle Genomforschung, Felix-Hausdorff-Str. 8, 17475, Greifswald; kussa@uni-greifswald.de; phone: +49-3834-420-5814; fax: +49-3834-420-5809
(Manuscript accepted 15 November 2019)
Supplemental digital content is available in the HTML and PDF versions of this article on the journal’s website www.health-physics.com.
0017-9078/20/0
Copyright © 2020 Health Physics Society. Unauthorized reproduction of this article is prohibited.
Health Physics, July 2020, Volume 119, Number 1
DOI: 10.1097/HP.0000000000001251

INTRODUCTION

It is well known that ionizing radiation (IR) causes lesions in the DNA, such as single strand breaks (SSB) and double strand breaks (DSB) (Lomax et al. 2013; Sutherland et al. 2000). Cells possess a number of mechanisms with which DNA damage can be repaired; however, many of these mechanisms are error-prone (Rodgers and McVey 2016). Mutations caused by cellular repair mechanisms range from large-scale (chromosomal rearrangements, large deletions, copy number variations) to small-scale (short insertions/deletions and point mutations) (Lomax et al. 2013). Point mutations, where a single base in the DNA sequence is substituted by another, can be categorized into two different types: transitions refer to a pyrimidine-pyrimidine (C↔T) or purine-purine substitution (A↔G), whereas transversions refer to pyrimidine-purine substitutions or vice versa. The transition/transversion ratio (Ti/Tv) of spontaneously occurring mutations in the whole human genome has been shown to be around 2.1 (DePristo et al. 2011); however, depending on the region of the genome under investigation, a varying range of Ti/Tv ratios has been observed (Wang et al. 2015). One of the most common transition mutations is the deamination of methylated cytosine to thymine. Methylated cytosine occurs in high concentrations in CpG islands, which are commonly located in exonic regions. Accordingly, with a value of around 3 (Bainbridge et al. 2011; Guo et al. 2012), the Ti/Tv ratio in the human exome is higher than in the genome. Previous studies show a varying range of Ti/Tv ratios in the human genome that is dependent on the region of the genome under investigation (Wang et al. 2015). Even though transversions and transitions, and particularly changes in CpG islands, may have
epigenetic consequences, the influence of radiation exposure on the Ti/Tv ratio in healthy tissues has rarely been investigated. An early study observed a high proportion of transversions in a mouse cell line containing a mutation reporter gene after X irradiation with 10–40 Gy (Yuan et al. 1995). In radiation-associated second malignancies, a relatively low number of mutations was found; however, the effect on the Ti/Tv ratio was not reported (Behjati et al. 2016). This lack of information might be owing to the fact that during single nucleotide variant (SNV) discovery with modern high throughput sequencing technologies, the Ti/Tv ratio is commonly used as a quality check (Wang et al. 2015), where a lower-than-expected ratio is considered to indicate that the SNV set under investigation contains false positive calls (DePristo et al. 2011). This complicates matters when Ti/Tv itself might be influenced by the experiment, as could be the case in studies that aim to detect mutations induced by IR. Therefore, it was the aim of this study to investigate the possibility of distinguishing IR-specific effects on the Ti/Tv ratio in primary human gingiva fibroblasts (HGFs).

A popular method for SNP discovery is the Genotype Analysis ToolKit (GATK) best practices workflow (DePristo et al. 2011). It contains a standard variant quality filter (VQSR) that removes low quality variants from the dataset but does not take genotype quality (GQ) into account. GQ is a Phred-scaled value representing the degree of confidence that the called genotype is the true genotype, with higher values reflecting more accurate genotype calls. Depth (DP) values represent the number of reads passing the quality control used to calculate the genotype at a specific site in a specific sample. Higher DP values generally lead to more accurate genotype calls. Filtering steps based on these metrics improve the overall variant calling quality (Carson et al. 2014).

Therefore, in order to achieve high whole exome sequencing (WES) data quality using Ti/Tv, while at the same time being able to investigate radiation-specific effects on the Ti/Tv ratio for different radiation doses, a filtering strategy was developed: in addition to VQSR, this strategy also takes GQ and DP into account using a coverage filter based on these genotype-level quality metrics and the Ti/Tv ratio as additional quality control to obtain a high quality variant set. To determine the best parameters for DP and GQ empirically, the control dataset was used; i.e., only those cells that had not been exposed to radiation. The parameters determined by this procedure were then applied to the whole dataset.

After that, a sample filter removing SNVs present in the control dataset (for further details see Nath et al. 2018) was employed to distinguish radiation-specific Ti/Tv changes in samples exposed to varying doses of radiation.

To the best of our knowledge, this is the first study investigating changes in the Ti/Tv ratio of irradiated HGFs.

METHODS

Cell culture
Primary HGFs were obtained from Provitro AG (Berlin, Germany) and cultured and treated as previously described in Weissmann et al. (2016). Cells were irradiated with 240 kV X rays at 13 mA (YXLON Maxishot, Hamburg, Germany) filtered with 3 mm beryllium at a dose rate of 1 Gy min−1 as described earlier (Weissmann et al. 2016).

Study design and DNA-sequencing
After exposure of the cells to different doses of X radiation (0, 0.5, 2, and 10 Gy), they were cultured along with non-irradiated control cells for 16 h (repair interval). DNA was extracted and purified using the NucleoSpin Tissue Kit by Macherey Nagel (Düren, Germany) according to the manufacturer’s instructions and subjected to WES using three different NGS platforms, each time according to the instructions provided by the manufacturer of the respective sequencing system. Fig. 1 illustrates our study design. Table 1 summarizes key information for the obtained WES datasets, including sequencing platform, average read length, bioinformatics tools used for mapping and variant calling, as well as average depth (including standard deviation) for each sample after variant calling.

Reference sample
To compute the reference Ti/Tv ratio from variants from human genome and exome samples, variant data from the third phase of the 1000 Genomes Project were downloaded (Auton et al. 2015). In the third phase of the 1000 Genomes Project, genotypes have been determined by use of whole-genome sequencing and WES in parallel. A subpopulation of 49 Utah residents with Northern and Western European ancestry (CEU) was selected to investigate the Ti/Tv ratio. We used VCFtools for selecting the genotypes from the reference dataset (Danecek et al. 2011). Sample IDs are listed in the supplementary material (please see Supplemental Digital Content 1, http://links.lww.com/HP/A183). We excluded the genetic variants lying outside the defined exome target to calculate the exome specific Ti/Tv ratio. The exome coordinates were downloaded from ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/exome_pull_down_targets/.

Bioinformatics workflow
First, Trimmomatic was used to remove adapters from the reads and to distinguish low-quality reads in the dataset (Bolger et al. 2014). Second, TopHat (for NextSeq and S5 datasets) or Lifescope (for SOLiD dataset) was used to map reads to the human genome reference sequence (hg19). Third, the mapped exome datasets were preprocessed using packages from the GATK and Picard (McKenna et al. 2010; Broad Institute 2018). Reads originating from a single DNA fragment were marked as duplicates using the MarkDuplicates tool. The reads were then assigned to read groups and to
sample identifiers using AddOrReplaceReadGroups. Afterward, reads were locally realigned using RealignerTargetCreator and IndelRealigner in order to minimize the number of mismatching bases that could have appeared due to the presence of insertions and deletions (InDels) in the dataset. Subsequently, quality scores were optimized with VQSR, which applies machine learning to model errors empirically and adjusts the quality scores accordingly. Lastly, HaplotypeCaller was used for variant calling, including SNVs and InDels. For further analysis, only SNVs with heterozygous (0/1) and homozygous (1/1) genotype were considered.

Fig. 1. Study workflow: HGFs were exposed to different x-ray doses (0.5, 2, and 10 Gy), and subsequently cultured for 16 h (repair interval) before whole exome sequencing was performed and subsequent data analysis carried out. For comparison, publicly available datasets were used (“Reference data”) and a literature review was performed.

Coverage filter. Variants were further filtered based on the quality metrics DP and GQ. The absolute minimum requirement to call a heterozygous variant is three reads, two of which show the same non-reference base (in order to exclude sequence artefacts affecting only one read); hence, the lowest value for DP was set at 3. All datasets were analyzed

Table 1. Information related to the exome datasets used in this study.

| Sequencing platform | SOLiD 5500 xl | NextSeq r1 (Illumina, San Diego, CA) | NextSeq r2 (Illumina) | Ion S5 XL (Thermo Fisher, Waltham, MA) |
|---|---|---|---|---|
| Bioinformatics workflow (mapping and variant calling) | Lifescope/Cashaw -> HaplotypeCaller | TopHat -> HaplotypeCaller | TopHat -> HaplotypeCaller | TopHat -> HaplotypeCaller |
| Average read length | 75 bp forward reads and 35 bp reverse reads | 150 bp | 150 bp | 200 bp |
| Sequencing read type | Paired end | Paired end | Paired end | Single end |
| DP average (±stdev) by radiation dose in Gy (16 h repair interval) | | | | |
| 0 Gy | 35.3 (± 37.5) | 14.3 (± 25.3) | 34.2 (± 38.5) | 3.3 (± 1.7) |
| 0.5 Gy | 47.5 (± 43.7) | 16.3 (± 28.3) | 39.6 (± 46.2) | 3.9 (± 2.3) |
| 2 Gy | 44.7 (± 43.8) | 16.1 (± 26.7) | 47.8 (± 66.8) | 2.7 (± 1.2) |
| 10 Gy | 71.1 (± 56.1) | 17.9 (± 30.1) | 39.6 (± 49.1) | 3.3 (± 1.7) |
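The coverage filter described in the Methods (a minimum DP and a minimum GQ per genotype call, restricted to 0/1 and 1/1 genotypes) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `GenotypeCall` record and its field names are hypothetical and not part of GATK or VCFtools, the thresholds are treated as inclusive (DP >= 3, GQ >= 20), and only unphased genotype strings are handled. The helper `gq_to_error_probability` uses the standard Phred definition, under which GQ 20 corresponds to a 1% chance that the called genotype is wrong.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenotypeCall:
    """One genotype call for one sample (hypothetical record, not a GATK type)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    genotype: str  # unphased genotype string, e.g. "0/1" or "1/1"
    dp: int        # read depth used for the genotype call (DP)
    gq: int        # Phred-scaled genotype quality (GQ)


def gq_to_error_probability(gq: float) -> float:
    """Phred scale: GQ = -10 * log10(P(called genotype is wrong))."""
    return 10 ** (-gq / 10)


def passes_coverage_filter(call: GenotypeCall, min_dp: int = 3, min_gq: int = 20) -> bool:
    """Keep heterozygous (0/1) and homozygous-alternate (1/1) calls that meet
    both thresholds. Inclusive comparison is an assumption of this sketch."""
    is_variant_genotype = call.genotype in ("0/1", "1/1")
    return is_variant_genotype and call.dp >= min_dp and call.gq >= min_gq


# Made-up example calls, for illustration only.
calls = [
    GenotypeCall("chr1", 1000, "A", "G", "0/1", dp=12, gq=45),  # kept
    GenotypeCall("chr1", 2000, "C", "A", "0/1", dp=2, gq=30),   # DP too low
    GenotypeCall("chr2", 3000, "G", "T", "1/1", dp=8, gq=13),   # GQ too low
    GenotypeCall("chr2", 4000, "T", "C", "0/0", dp=30, gq=60),  # reference genotype
]
kept = [c for c in calls if passes_coverage_filter(c)]
```

Passing other values for `min_dp` and `min_gq` would allow the same function to be reused for each DP/GQ combination that was tested.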


using coverage/quality filter settings with DP values ranging from 3 to 20 in combination with GQ values of 10, 13, and 20.

Determination of Ti/Tv ratio. Ti/Tv ratios are only an approximate measure of quality: a certain range of Ti/Tv ratios is associated with fewer false positives. For example, the Ti/Tv ratios in high quality exome variant datasets range between 2.40 and 3.23 (see Table 2 for reference). Ti/Tv ratios for on-target variants were computed using the following formula:

\[
\mathrm{Ti/Tv} = \frac{\operatorname{count}\big(\text{changes}(A \leftrightarrow G \mid C \leftrightarrow T)\big)}{\operatorname{count}\big(\text{changes}(A \leftrightarrow T/C \mid G \leftrightarrow T/C)\big)}. \tag{1}
\]

Sample filter. All variants that were present in the non-irradiated control samples were removed from the irradiated samples. Variants that were identified in the non-irradiated controls were assumed not to be IR-induced (Nath et al. 2018).

RESULTS

Literature review for Ti/Tv ratios
First, a review of the pertinent literature was performed, which included a wide range of information on Ti/Tv from 2011 to 2018. The reported Ti/Tv ratios for exome data range between 2.4 and 3.23 (Table 2). Some of the studies included Ti/Tv ratios for whole genome datasets as well, where these values ranged from 2.00 to 2.15. This is in concordance with previous observations that the Ti/Tv ratio is strongly influenced by the selected genome target region. While only two of the studies used a sequencing platform other than Illumina, the reported Ti/Tv ratios for both genome and exome are similar to the values reported by the Illumina-based studies. This suggests that the Ti/Tv ratio is mostly platform-independent. Therefore, the Ti/Tv ratio can be considered to be a stable indicator for quality assessment in SNV datasets.

Distribution of Ti/Tv in public reference datasets
In addition to our literature review (Table 2), 49 exome datasets from phase three of the 1000 Genomes Project (Auton et al. 2015) were downloaded and analyzed, revealing Ti/Tv ratios between 2.6 and 2.8 (Fig. 2a). This is also within the range of previously observed Ti/Tv values from the literature (Table 2).

Ti/Tv based quality assessment in study datasets
In order to empirically determine suitable coverage filter parameters for the exome sequences of primary HGFs, the Ti/Tv ratios for non-irradiated control samples (“0 Gy”, Fig. 2b) were calculated for a range of DP and GQ values. Based on minimum requirements for heterozygous variant calling and overall read coverage, DP values from 3 to 20 and GQ values of 10, 13 and 20 were selected. Fig. 2b displays the distribution of Ti/Tv ratios for all filter settings in the control datasets. Over all filtering settings, the mean Ti/Tv values of the control exome datasets ranged between 2.32 (±0.02) and 2.42 (±0.06) for the NextSeq and SOLiD platforms, while the low coverage dataset (S5) showed a mean Ti/Tv ratio of 1.39 (±0.7) (Fig. 2b). The observation pertaining to the low coverage dataset is due to the decreasing numbers of variants that remain at filter settings with increasing stringency (compare Fig. 3).

The Ti/Tv ratios of variants with a low stringency filtering setting (DP 3, GQ 20) yielded values from 2.2 to 2.6 and were thus comparable to literature findings (Table 2). For data from the NextSeq and S5 platforms, an average increase in Ti/Tv ratio by 0.1 and 0.02, respectively, was observed, indicating higher quality variant calls. The SOLiD dataset was not affected by low stringency filtering and did not show any changes in Ti/Tv values.

Fig. 3 shows the Ti/Tv ratios and numbers of variants for different coverage filter settings for all datasets. The differences between irradiated and control samples were minimal, indicating that the coverage filter was suitable to use on datasets from IR experiments.

Using the Ti/Tv ratio as a quality measure, we found that our filter based on GQ and DP metrics improved the overall quality of our variant call dataset. A coverage filter with DP 3 and GQ 20 was sufficient to obtain high quality variant calls, allowing the inclusion of low coverage datasets in the analysis in principle.

Suitability of quality filtering for data from different sequencing platforms
While it could be shown that the filtering procedure can be used for data from all three sequencing platforms, platform-specific differences were nevertheless visible (Figs. 2 and 3). SOLiD data showed the least variability in response to changes in filtering conditions with regard to Ti/Tv values as well as to the number of variants (Fig. 2). This points to a particularly low number of false positive calls in these datasets. For NextSeq data, the Ti/Tv ratio increased after applying a low stringency filter to the unfiltered dataset (Fig. 3), but any stricter filtering did not change the Ti/Tv ratio further. The number of variants, however, decreased slightly. In contrast, the low coverage dataset (S5 dataset) unsurprisingly showed more pronounced changes in the Ti/Tv ratio when stricter filtering conditions were applied. This was due to the small number of variants with high coverage and thus the rapidly decreasing number of overall variants contributing to the Ti/Tv calculation with increasing coverage filter stringency.

Distribution of Ti/Tv before and after sample filtering
In order to study whether IR has an influence on the Ti/Tv ratio, coverage filtering with the optimized settings of DP 3 and GQ 20 was performed. The genotypes

Table 2. Literature review of reported Ti/Tv values from various studies published between 2011 and 2018, including information on sequencing platforms, variant calling, and filtering practices as well as sample provenance.

| Publication | Ti/Tv ratio, whole genome | Ti/Tv ratio, exome | SNP calling | SNP filtering | Data source | Sequencing platform | Samples | Population |
|---|---|---|---|---|---|---|---|---|
| Gudbjartsson et al. 2015 | 2.14 | 2.41 | GATK 2.3.9 | GATK best practices, other, simple tandem repeats | sequenced for study | Illumina | 2636 | Icelandic |
| Tennessen et al. 2012 | n.a. | 2.93 | glfMultiples | read depth, coverage, quality and others | sequenced for study | Illumina | 2240 | European (n = 1351), African (n = 1088) |
| Stubbs et al. 2012 | 2.07 | 2.99 | Complete Genomics (proprietary software) | Complete Genomics | sequenced for study/database | Complete Genomics | 31 | European |
| Gioia et al. 2018 | 2.10 | n.a. | GATK haplotype caller | GATK variant quality score recalibration (VQSR) scheme | sequenced for study | Illumina | 1 | European |
| Wang et al. 2015 | 2.06 | 2.81 | GATK Unified Genotyper | GATK best practices | 1000 Genomes Phase 1 | Illumina, SOLiD | 1092 | Mixed |
| Bainbridge et al. 2011 | 2.00 | 3.00 | GATK | GATK quality score recalibration, minimum quality score, coverage, mapping quality | sequenced for study | SOLiD, Illumina | 5 | Hispanic, European |
| Zook et al. 2014 | 2.04 | 2.60 | GATK UnifiedGenotyper and HaplotypeCaller v2.6 | VQSR, manual curation of systematic sequencing errors | 1000 Genomes, Broad (subset) | Illumina | 1 | European (HapMap NA12878) |
| DePristo et al. 2011 | 2.15 (2.05) | 3.27 (2.57) | GATK | GATK best practices (article establishes workflow) | 1000 Genomes | Illumina | 1 | European (HapMap NA12878) |
| Tang et al. 2016 | n.a. | 2.48 to 2.56 | GATK 2.8 | GATK best practices, VQSR | sequenced for study | Illumina | 72 | Australian-aboriginal |
| Guo et al. 2012 | n.a. | 2.81 | GATK Unified Genotyper | genotype quality (GQ) and depth, using Ti/Tv ratio close to expected values | sequenced for study | Illumina | 22 | Asian |
| | n.a. | 2.30 | GATK Unified Genotyper | | sequenced for study | Illumina | 6 | Asian |
| | n.a. | 3.23 | GATK Unified Genotyper | | 1000 Genomes (Pilot 3) | Illumina | 6 | Caucasian |
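The Ti/Tv values compared in Table 2 follow the definition in Eq. (1): transitions are A↔G and C↔T substitutions, and all other single-base substitutions are transversions. A minimal sketch of that computation is given below. It assumes SNVs are available as (reference base, alternate base) pairs; the function names are illustrative and are not taken from the tools used in the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def is_transition(ref: str, alt: str) -> bool:
    """Purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T) substitution."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def is_transversion(ref: str, alt: str) -> bool:
    """Any substitution between a purine and a pyrimidine."""
    ref, alt = ref.upper(), alt.upper()
    return ref in BASES and alt in BASES and ref != alt and not is_transition(ref, alt)


def titv_ratio(snvs) -> float:
    """Eq. (1): number of transitions divided by number of transversions."""
    snvs = list(snvs)
    transitions = sum(1 for ref, alt in snvs if is_transition(ref, alt))
    transversions = sum(1 for ref, alt in snvs if is_transversion(ref, alt))
    if transversions == 0:
        raise ValueError("Ti/Tv is undefined without transversions")
    return transitions / transversions


# Made-up SNV list for illustration: four transitions and two transversions.
example_snvs = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C"), ("A", "C"), ("G", "T")]
example_ratio = titv_ratio(example_snvs)  # 4 / 2 = 2.0
```

Because there are twice as many possible transversions as transitions, a set of random substitutions would be expected to give a ratio near 0.5, which is why false positive calls tend to pull the observed ratio down.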


identified in control samples were then subtracted from the datasets of irradiated samples in the sample-filtering step (Nath et al. 2018). This led to a removal of approximately 80% of SNVs from all exome datasets (Table 3).

Fig. 2. (a) Boxplot showing the Ti/Tv ratios in 49 public exome datasets from the 1000 Genomes Project (Auton et al. 2015) central European population sample (box plot) and the unfiltered Ti/Tv values of non-irradiated samples from this study (symbols as indicated to the right of the plot); (b) boxplot representing the distribution of Ti/Tv ratios (y-axis) that were obtained after different filter cutoffs for depth (DP) and genotype quality (GQ). The x-axis shows the different sequencing platforms (NextSeq, SOLiD, Ion S5) used for generating WES data for non-IR samples.

Fig. 3. Heatmap representing the Ti/Tv ratios for different depth (DP) and genotype quality (GQ) filtering cutoffs in exome sequencing datasets from three different sequencing platforms (facets) after treatment with different doses of IR and a subsequent repair interval of 16 h. Heatmap coloring indicates Ti/Tv ratios of 2–2.5; variant numbers after filtering are represented by the size of the circle.

Interestingly, there was no consistent irradiation dose dependence of SNV counts. However, a reduction in Ti/Tv ratio from 2.2–2.5 to 1.0–1.9 was observed for all samples and for all sequencing platforms (Fig. 4), which

Table 3. SNV counts before and after sample filtering in exome datasets used in this study.

| Sequencing machine | SOLiD 5500 xl Genetic Analyzer | NextSeq r1 (Illumina) | NextSeq r2 (Illumina) | Ion S5 System (Thermo Fisher) |
|---|---|---|---|---|
| Bioinformatics workflow (mapping and variant calling) | Lifescope -> HaplotypeCaller | TopHat -> HaplotypeCaller | TopHat -> HaplotypeCaller | TopHat -> HaplotypeCaller |
| SNV counts before / after sample filter, by radiation dose in Gy (16 h repair interval) | | | | |
| 0.5 Gy | 28,032 / 1,487 | 154,516 / 38,838 | 115,288 / 14,068 | 337,693 / 9,940 |
| 2 Gy | 27,601 / 1,260 | 144,836 / 30,355 | 143,110 / 37,499 | 21,091 / 2,731 |
| 10 Gy | 28,996 / 2,499 | 157,003 / 38,966 | 120,231 / 20,685 | 29,786 / 5,836 |
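The sample filter behind the before/after counts in Table 3 amounts to a set subtraction: every variant seen in the non-irradiated control is removed from the irradiated sample. The sketch below illustrates the idea under explicit assumptions: variants are plain (chromosome, position, ref, alt) tuples, and two variants count as identical when all four fields match. Whether the study also compared genotypes is not stated here, so that choice is hypothetical. The last lines rework one cell of Table 3 into a removal fraction.

```python
def variant_key(chrom: str, pos: int, ref: str, alt: str):
    """Identity of a variant for matching purposes (an assumption of this sketch)."""
    return (chrom, pos, ref.upper(), alt.upper())


def sample_filter(irradiated, control):
    """Return the irradiated-sample variants that are absent from the control."""
    control_keys = {variant_key(*v) for v in control}
    return [v for v in irradiated if variant_key(*v) not in control_keys]


def fraction_removed(before: int, after: int) -> float:
    """Share of SNVs removed by the sample filter."""
    return (before - after) / before


# Made-up variants for illustration only.
control = [("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")]
irradiated = [
    ("chr1", 100, "A", "G"),  # also in control, so removed
    ("chr2", 200, "C", "A"),  # same site, different alternate base, so kept
    ("chr3", 300, "G", "T"),  # not in control, so kept
]
candidates = sample_filter(irradiated, control)

# NextSeq r1, 0.5 Gy (Table 3): 154,516 SNVs before and 38,838 after the filter.
nextseq_r1_removed = fraction_removed(154_516, 38_838)  # roughly three quarters
```

The variants that remain are the candidates for IR-induced SNVs on which the post-filter Ti/Tv ratio is computed.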

indicates an increase in the number of transversions among the IR-induced SNVs.

Fig. 4. Ti/Tv ratio for non-IR and IR-exposed samples before and after sample filtering for three different sequencing platforms. After sample filtering, DP=3 and GQ=20 were selected as coverage filter parameters.

DISCUSSION

In this study, we analyzed the IR-induced mutational spectrum of IR-exposed HGFs by WES. We show that a combination of a DP value of 3 and a GQ value of 20 after GATK HaplotypeCaller variant calling is sufficient for quality filtering of high throughput sequencing data for the analysis of IR-induced sequence alterations. The results were platform-independent and allowed the inclusion of a low coverage dataset in the analysis.

Comparison of the Ti/Tv ratios for exome data from this study with previous findings (Table 2) and the results obtained from publicly available datasets (Fig. 2a) showed that our dataset yielded similar results (Fig. 3). The large range for Ti/Tv values in the previous studies involving exome data can partly be explained by the different parameter settings used for the respective bioinformatics workflows, the varying filtering procedures applied, and/or the nature of the investigated variants. Some studies included values for previously unobserved and thus unconfirmed variants, which could have resulted in lower Ti/Tv ratios as compared to datasets comprising mostly or exclusively known variants. In such cases, the authors assumed that this effect was the result of sequencing or variant calling errors among the previously unconfirmed variants. Other variables include the regions for which the Ti/Tv ratio was calculated (e.g., exonic or consensus coding sequence), the data source and the number of samples included in the respective study.

Nonetheless, our findings for untreated as well as for irradiated HGFs prior to sample filtering (i.e., the removal of SNVs that are already present in untreated cells) agree with these previous results.

Due to the high sequencing accuracy of the SOLiD platform, datasets generated with the SOLiD sequencer were already at a high quality level so that coverage filtering had little impact on data quality, whereas data from the NextSeq sequencer were improved. Filtering the data from the S5 sequencer produced less consistent results, which can be explained by the low overall coverage of this dataset. It can be concluded that the choice of sequencing technique has an influence on the filtering strategy required.

Our WES results showed no consistent correlation between radiation dose and SNV count. One possible explanation for this might be a DNA repair bias at euchromatic (transcribed) loci, where DNA repair mechanisms are highly
efficient, while in heterochromatic regions this seems to be less so. As WES focuses on transcribed areas of the genome, it misses untranscribed and thus less efficiently repaired areas with a higher variant load. Hence, the variant counts observed by our exome-based analysis may not include all IR-induced changes.

After removal of all SNVs present in untreated cells, a reduction in Ti/Tv ratios was observed. This result was platform-independent and could be reproduced in all investigated biological replicates. At this point, however, the lower coverage dataset (S5) was less conclusive than those from the other two platforms, showing that for this kind of analysis a higher coverage should be aimed for, if possible. The lower Ti/Tv ratio for IR-induced SNVs may in part be explained by a relatively small proportion of transitions arising from spontaneous deamination of 5-methylcytosine residues. These are at least partially responsible for the normally observed surplus of transitions, and thus higher Ti/Tv values, in unirradiated cells. Our experiments were carried out with a 16-h repair interval, so that the cells had little time to accumulate transitions of this kind. Still, in all cases the results clearly demonstrate that IR leads to a relative increase in the number of transversions among the observed IR-induced SNVs. This corresponds with recent studies on plant genomes and transcriptomes where, after neutron irradiation, a lower Ti/Tv ratio was observed in IR-induced mutations as compared to spontaneous mutations (Zhou et al. 2019; Li et al. 2016; Shirasawa et al. 2019).

Our finding is also in agreement with the earlier findings of Yuan et al. (1995) in murine cells and could point to a more prominent contribution of the mismatch repair (MMR) system to the IR-induced DNA-damage response in mammalian cells than previously assumed: MMR strongly favors the repair of transitions over transversions (Lujan et al. 2012), which could account for an increase in the number of transversions. Still, very little is known so far about a possible involvement of the MMR system in the context of IR-induced DNA damage (Martin et al. 2010). In this respect, it is of note that the upregulation of the MMR system in inhabitants of the high natural background radiation area of Kerala was published while this paper was in press (Bakhtiari et al. 2019; https://www.ncbi.nlm.nih.gov/pubmed/31323602). Therefore, additional studies are required to further elucidate this question.

CONCLUSION

Taken together, our results provide strong evidence that a coverage filter setting of DP 3 and GQ 20 is sufficient for high quality SNV calling in datasets from IR experiments and that Ti/Tv ratios are a consistent and useful indicator for data quality assessment for all tested NGS platforms. Furthermore, we report a drop in Ti/Tv ratios in IR-induced mutations in HGFs, which points to an elevated proportion of transversions among IR-induced SNVs and thus might imply that MMR plays a role in the cellular damage response to IR-induced DNA lesions.

Acknowledgments: We thank Jessica Müller, Christian Sperling, and Corinna Jensen for excellent technical help. AWK received funding for this project through the German Federal Ministry of Defense (E/U2AD/CF520/DF554).

REFERENCES

Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, Korbel JO, Marchini JL, McCarthy S, McVean GA, Abecasis GR. A global reference for human genetic variation. 1000 Genomes Project Consortium. Nature 526(7571):68–74; 2015.
Bainbridge MN, Wang M, Wu Y, Newsham I, Muzny DM, Jeffries JL, Albert TJ, Burgess DL, Gibbs RA. Targeted enrichment beyond the consensus coding DNA sequence exome reveals exons with higher variant densities. Genome Biology 12(7):1–12; 2011.
Behjati S, Gundem G, Wedge DC, Roberts ND, Tarpey PS, Cooke SL, Van Loo P, Alexandrov LB, Ramakrishna M, Davies HM, Stebbings L, Menzies A, Jones D, Shepherd R, Butler AP, Teague JW, Jorgensen M, Khatri Bhavisha PN, Shlien A, Futreal P, Colin S, Eales RA, Easton D, Foster C, Neal DE, Brewer DS, Hamdy F, Lu YJ, Lunch AG, Massi CE, Ng A, Whitaker HC, Yu Y, Zhang H, Bancroft E, Berney D, Camacho N, Corbishley C, Dadaev T, Dennis N, Dudderidge T, Edwards S, Risher C, Ghori J, Gnanapragasam VJ, Greenman C, Hawkins S, Hazell S, Howat W, Karaszi K, Kay J, Kote-Jarai Z, Kremeer B, Livni N, Luxton H, Matthews L, Mayer E, Merson S, Nicol D, Ogden C, O’Meara S, Pelvender G, Shah NC, Tavare S, Thomas S, Thompson A, Verrill C, Warren A, Zamora J, McDermott U, Bova GS, Richardson AL, Adrienne L, Adrienne JF, Stratton MR, Campbell PJ. Mutational signatures of ionizing radiation in second malignancies. Nature Communications 7:12605; 2016.
Belfield EJ, Gan X, Mithani A, Brown C, Jiang C, Franklin K, Alvey E, Wibowo A, Jung M, Bailey K, Kalwani S, Ragoussis J, Mott R, Harberd NP. Genome-wide analysis of mutations in mutant lineages selected following fast-neutron irradiation mutagenesis of Arabidopsis thaliana. Genome Research 22(7):1306–1315; 2012.
Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30(15):2114–2120; 2014.
Broad Institute. Picard Toolkit, GitHub Repository. Cambridge, MA: Broad Institute; 2018. Available at http://broadinstitute.github.io/picard/. Accessed on 30 April 2020.
Carson AR, Smith EN, Matsui H, Braekkan SK, Jepsen K, Hansen JB, Frazer KA. Effective filtering strategies to improve data quality from population-based whole exome sequencing studies. BMC Bioinformatics 15(1):125; 2014.
Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, McVean G, Durbin R. The variant call format and VCFtools. Bioinformatics 27(15):2156–2158; 2011.
DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, Del Angel G, Rivas MA, Hanna M, McKenna A, Fennell TJ, Kernytsky AM, Altshuler D, Daly MJ. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nature Genetics 43:491–498; 2011.
Gioia L, Siddique A, Head SR, Salomon DR, Su AI. A genome-wide survey of mutations in the Jurkat cell line. BMC Genomics 19(1):334; 2018.
Gudbjartsson DF, Helgason H, Gudjonsson SA, Zink F, Oddson A, Gylfason A, Hjartarson E, Sigurdson GTh, Stacey SN, Frigge ML, Holm H, Saemundsdottir J, Gunnlaugur TG, Sverrison JTh, Gretarsdottir S, Walters GB, Rafnar T, Thjodleifsson B, Bjornsson ES, Olafsson S, Thorindksottir H, Steingrimsdottir T, Gudmundsdottir TS, Bjornsdottir G, Jonsson JG, Sigurdson A, Bjornsdottir G, Jonsson JJ, Thorarensen O, Ludvigsson P, Gudbjartsson H, Eyjolfsson GL, Sigurdardottir O, Olafsson I, Arnar DO, Thorsteinsdottir U, Helgason A, Sulem P, Stefansson K. Large-scale whole-genome sequencing of the Icelandic population. Nature Genetics 47:435; 2015.
Guo Y, Long J, He J, Li CI, Cai Q, Shu XO, Zheng W, Li C. Exome sequencing generates high quality data in non-target regions. BMC Genomics 13(1):194; 2012.
Li G, Chern M, Jain R, Martin JA, Schackwitz WS, Jiang L, Vega-Sánchez ME, Lipzen AM, Barry KW, Schmutz J, Ronald PC. Genome-wide sequencing of 41 rice (Oryza sativa L.) mutated lines reveals diverse mutations induced by fast-neutron irradiation. Molecular Plant 9(7):1078–1081; 2016.
Lomax ME, Folkes LK, O’Neill P. Biological consequences of radiation-induced DNA damage: relevance to radiotherapy. Adv Clin Radiobiol 25(10):578–585; 2013.
Lujan SA, Williams JS, Pursell ZF, Abdulovic-Cui AA, Clark AB, Nick McElhinny SA, Kunkel TA. Mismatch repair balances leading and lagging strand DNA replication fidelity. PLOS Genetics 8(10):e1003016; 2012.
Martin LM, Marples B, Coffey M, Lawler M, Lynch TH, Hollywood D, Marignol L. DNA mismatch repair and the DNA damage response to ionizing radiation: making sense of apparently conflicting data. Cancer Treatment Rev 36(7):518–527; 2010.
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research 20(9):1297–1303; 2010.
Nath N, Esche J, Muller J, Jensen LR, Port M, Stanke M, Kaderali L, Scherthan H, Kuss AW. Exome sequencing discloses ionizing-radiation-induced DNA variants in the genome of human gingiva fibroblasts. Health Phys 115(1):151–160; 2018.
Rodgers K, McVey M. Error-prone repair of DNA double-strand breaks. J Cell Phys 23(1):15–24; 2016.
Shirasawa K, Hirakawa H, Nunome T, Tabata S, Isobe S. Genome-wide survey of artificial mutations induced by ethyl methanesulfonate and gamma rays in tomato. Plant Biotech J 14(1):51–60; 2019.
Stubbs A, McClellan EA, Horsman S, Hiltermann SD, Palli I, Nouwens S, Koning AHJ, Hoogland F, Reumers J, Hejsman D, Swagemakers S, Kremer A, Meijerink J, Lambrechts D, van der Spek PJ. Huvariome: a web server resource of whole genome next-generation sequencing allelic frequencies to aid in pathological candidate gene selections. J Clin Bioinformatics 2(1):1–14; 2012.
Sutherland BM, Bennett PV, Sidorkina O, Laval J. Clustered DNA damages induced in isolated DNA and in human cells by low doses of ionizing radiation. Proc Natl Acad Sci USA 97(1):103; 2000.
Tang D, Anderson D, Francis RW, Syn G, Jamieson SE, Lassmann T, Blackwell JM. Reference genotype and exome data from an Australian Aboriginal population for health-based research. Scientific Data 3:160023; 2016.
Tennessen JA, Bigham AW, O’Connor TD, Fu W, Kenny EE, Gravel S, McGee S, Do R, Liu X, Jun G, Kang HM, Jordan D, Leal SM, Gabriel S, Rieder MJ, Abecasis G, Altshuler D, Nickerson DA, Boerwinkle E, Sunyaev S, Bustamante CD, Bamshad MJ, Akey JM. Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science 337(6090):64; 2015.
Wang J, Raskin L, Samuels DC, Shyr Y, Guo Y. Genome measures used for quality control are dependent on gene function and ancestry. Bioinformatics 31(3):318–323; 2015.
Weissmann R, Kacprowski T, Peper M, Esche J, Jensen LR, van Diepen I, Port M, Kuss AW, Scherthan H. Transcriptome alterations in x-irradiated human gingiva fibroblasts. Health Phys 111:75–84; 2016.
Yuan J, Yeasky TM, Rhee MC, Glazer PM. Frequent T:A→G:C transversions in x-irradiated mouse cells. Carcinogenesis 16(1):83–88; 1995.
Zook JM, Chapman B, Wang J, Mittelman D, Hoffmann O, Hide W, Salit M. Integrating human sequence data sets provides a resource of benchmark SNP and indel genotype calls. Nature Biotech 32:246–251; 2014.
