Occupational Benzene Exposure, Metabolism, and Health Impacts

Occupational Exposure Assessment, Metabolism, and Health Effects of Benzene Exposure
BY
MADHAWA SARANADASA
B.A., University of Pennsylvania, 2008
M.S., Vanderbilt University, 2012
THESIS
Submitted as partial fulfillment of the requirements
for the degree of Doctor of Philosophy in Public Health Sciences
in the Graduate College of the
University of Illinois at Chicago, 2020
Chicago, Illinois
Defense Committee:
Leslie Stayner, Chair and Advisor

Maria Argos, Epidemiology and Biostatistics
Mary E. Turyk, Epidemiology and Biostatistics
Lorraine Conroy, Environmental and Occupational Health and Safety
R. Jeffrey Lewis, ExxonMobil Biomedical Sciences, Inc
ACKNOWLEDGMENTS
I am extremely grateful for the guidance and support of my committee members: Drs.
Leslie Stayner, Maria Argos, Jeff Lewis, Mary Turyk and Lorraine Conroy. I would especially
like to thank Leslie, Jeff, and Maria for each providing me with wonderful opportunities that
have shaped my career.
I also want to thank Aria for her amazing support, encouragement and positive attitude
that helped guide me through my thesis.
Finally, I want to thank my family: Cooper, Nishada, mom, and dad. This thesis is
dedicated to mom and dad who have done so much to support me.
TABLE OF CONTENTS
CHAPTER
I. INTRODUCTION
    A. Overview
    B. Production and Uses
    C. Sources of Exposure
    D. Health Effects
    E. Mechanisms of Toxicity
    F. Specific Aims
        1. Aim 1
        2. Aim 2
        3. Aim 3
II. EXPOSURE RECONSTRUCTION IN A COHORT OF REFINERY WORKERS
    A. Background
    B. Methods
        1. Exposure Data
        2. Work History Data
        3. Data Linkage
        4. Exposure Reconstruction
            a. Data Aggregation
            b. Model Building
            c. Exposure Prediction
    C. Results
        1. Description of Linked Data
        2. Model Building And Predictions
    D. Discussion
III. METABOLISM OF OCCUPATIONAL BENZENE EXPOSURE
    A. Background
    B. Methods
        1. Study Population
        2. Exposure Assessment
        3. Urinary Metabolites
        4. Genetic Polymorphisms
        5. Other Study Variables
        6. Statistical Analysis
    C. Results
    D. Discussion
IV. HEMATOLOGIC EFFECTS OF ENVIRONMENTAL BTEXS EXPOSURE
    A. Background
    B. Methods
        1. Study Population
        2. Assessment of VOC Exposure
        3. Assessment of Blood Parameters
        4. Assessment of Covariates
        5. Statistical Analysis
    C. Results
    D. Discussion
V. DISCUSSION
    A. Summary and Discussion of Aims
    B. Conclusions
CITED LITERATURE
APPENDICES
    APPENDIX A
    APPENDIX B
    APPENDIX C
VITA
LIST OF TABLES
TABLE
I. WORK HISTORY CHARACTERISTICS OF THE FOUR REFINERY STUDY
II. TOTAL NUMBER OF EXPOSURE RECORDS AVAILABLE FOR EACH STANDARDIZED JOB TITLE
III. SINGLE NUCLEOTIDE POLYMORPHISMS ASSOCIATED WITH BENZENE METABOLISM
IV. CHARACTERISTICS OF THE STUDY SAMPLE
V. DISTRIBUTION OF GENETIC POLYMORPHISMS
VI. GEOMETRIC MEANS OF EXPOSURE AND METABOLITES BY SEX AND BMI CATEGORY
VII. GEOMETRIC MEANS OF EXPOSURE AND OUTCOMES BY SERUM COTININE AND ALCOHOL USE
VIII. PARAMETER ESTIMATES FROM LINEAR REGRESSION MODELS
IX. DETERMINATION OF SAMPLE SIZE
X. CHARACTERISTICS OF THE STUDY SAMPLE
XI. GEOMETRIC MEANS OF EXPOSURES BY SELECTED COVARIATES
XII. ARITHMETIC MEANS OF OUTCOME BY SELECTED COVARIATES
LIST OF FIGURES
FIGURE
1. Overview of benzene metabolism
2. Exposure reconstruction overview
3. Total number of exposure records available over the study duration
4. Selected standardized job title-specific exposure profiles over the study duration
5. Observation time and exposure records by standardized job title
6. Proportion of variability exposure by model with different specifications of calendar time
7. Model predictions and exposure data from selected standardized job titles
8. Hypothetical career timeline and exposure reconstruction
9. Distribution of cumulative exposure estimates
10. Overview of benzene metabolism
11. Distributions of exposure and outcome
12. Scatterplots of metabolite concentrations versus benzene exposure
13. Dose-response curves predicted from linear models
14. Genetic interactions in dose-response curves predicted from linear models
15. Benzene exposure by muconic acid deciles
16. Distributions of exposure and outcome at low exposure
17. Dose-response curves at low dose benzene exposure
18. Distributions of BTEXS blood concentrations
19. Distributions of blood parameters
20. Correlation between BTEXS elements
21. Correlation between blood parameters
22. Single-exposure linear regression results for white blood cells
23. Single-exposure linear regression results for red blood cells
24. Single-exposure linear regression results for hemoglobin
25. Single-exposure linear regression results for platelet count
26. Multiple-exposure linear regression results for white blood cells
27. Multiple-exposure linear regression results for red blood cells
28. Multiple-exposure linear regression results for hemoglobin
29. Multiple-exposure linear regression results for platelet count
30. Association between BTEXS and blood parameters assessed by qgComp
31. Association between BTEXS and WBC count assessed by BKMR
32. Association between BTEXS and platelet count assessed by BKMR
33. Association between BTEXS and RBC count assessed by BKMR
34. Association between BTEXS and hemoglobin assessed by BKMR
35. [...] qgComp among nonsmokers
36. [...] qgComp among current smokers
37. Association between BTEXS and hematologic parameters assessed by BKMR among nonsmokers
38. Association between BTEXS and hematologic parameters assessed by BKMR among smokers
LIST OF ABBREVIATIONS
ACGIH American Conference of Governmental Industrial Hygienists

AIHA American Industrial Hygiene Association
ANOVA Analysis of variance
BKMR Bayesian kernel machine regression
BMI Body mass index
BTEXS Benzene, toluene, ethylbenzene, xylene, and styrene
CA Catechol
CV Coefficient of variation
CYP2E1 Cytochrome P450 2E1
DNA Deoxyribonucleic acid
GC-MS Gas chromatography-mass spectrometry
GSTT1 Glutathione S-transferase theta-1
GuLF Gulf Long-term Follow-up
HQ Hydroquinone
IARC International Agency for Research on Cancer
LOD Limit of detection
MA Muconic acid
MPO Myeloperoxidase
NCI-CAPM National Cancer Institute and the Chinese Academy of Preventive Medicine
NHANES National Health and Nutrition Examination Survey
NIOSH National Institute for Occupational Safety and Health
NQO1 NADPH:quinone oxidoreductase
OSHA Occupational Safety and Health Administration
PH Phenol
ppm Parts per million
qgComp Quantile g-computation

ROS Reactive oxygen species
SEG Similar exposure group
SNP Single nucleotide polymorphism
TLV Threshold limit value
US United States
VOC Volatile organic chemical
SUMMARY
Benzene is a ubiquitous hydrocarbon that has several natural and anthropogenic sources. It is a
natural constituent of crude oil and can be found throughout the petroleum industry. It is also used as a
precursor for the manufacture of complex organic chemicals and can be found in rubbers, lubricants,
dyes, detergents, and pesticides. Finally, it is also a product of hydrocarbon combustion and can be found
in the burning of fossil fuels and tobacco smoke. Given the variety of sources, human exposure to
benzene is prevalent and has been associated with a variety of adverse health effects. When absorbed in
the body, benzene is metabolized into toxic intermediates that accumulate in the bone marrow and
generate reactive oxygen species. This eventually leads to damage to hematopoietic progenitor cells and
culminates in perturbations of hematologic parameters and several types of blood tissue malignancies.
This work aimed to examine several aspects of benzene toxicity including occupational exposure, kinetics
of metabolism and environmental exposure. Our current understanding of occupational benzene exposure
is limited to a handful of relatively old cohorts and there are few studies investigating benzene
metabolism and environmental exposure. We expect that the findings of this work will inform future
occupational exposure limits to help protect workers and deepen our understanding of environmental
exposure to protect the general public.
Aim 1 of this work established an exposure reconstruction methodology for the Four Refinery
Cohort. This is a cohort of US refinery workers that contained work history records derived from payroll
information between 1979 and 2010 and benzene exposure records derived from an industrial hygiene
monitoring program between 1976 and 2007. These data sources used different classification schemes
that were linked together by developing a standardized job title. To predict benzene exposure at a given
point during the study, a weighted linear regression model was then developed that defined exposure as a
function of standardized job title, refinery site, and calendar time. Both standardized job title and refinery
site were specified as categorical variables. Calendar time was specified using a natural spline with two
knots to allow for non-linear changes in exposure levels throughout the study duration. The model was
then applied to individual work histories to reconstruct a timeline of exposure for each worker and derive
estimates of cumulative exposure. These exposure estimates can be used in future epidemiological
investigations to evaluate the health effects of occupational exposure to benzene in a modern context.
Aim 2 of this work modeled the dose-response relationship between occupational benzene
exposure and the production of metabolites using data from the Shanghai Health Study. Personal benzene
exposure was assessed using passive samplers and post-shift metabolite concentrations were assessed in
urinary specimens for a total of 185 factory workers. Regression models were constructed to characterize
the association between benzene exposure and four metabolites: catechol, hydroquinone, phenol and
muconic acid. The modeled relationships exhibited a clear dose-response with non-linearities that are
consistent with enzyme kinetics. Specifically, the rate of metabolite production increased as benzene
exposure increased until saturation, which appeared to be around 70 mg/m³ for all metabolites. The
models were further extended to assess whether functional polymorphisms in genes related to benzene
metabolism modified the dose-response relationship. Single nucleotide polymorphisms were assessed in
four genes that code for enzymes involved in benzene metabolism: NQO1, MPO, CYP2E1 and GSTT1.
Interactions were detected between benzene exposure and the MPO G463A polymorphism in the
dose-response for two metabolites: hydroquinone and muconic acid. In both cases, individuals with the variant
allele were predicted to have higher urinary metabolite concentrations with increasing benzene exposure,
relative to individuals with the wild-type genotype. These findings further elucidate the kinetics of
benzene toxicity which can help to inform benzene regulatory standards and protect workers.
Aim 3 of this work evaluated the association between environmental exposure to a mixture of volatile
organic chemicals (benzene, toluene, ethylbenzene, xylene, and styrene) and hematologic parameters
among a subset of participants in the National Health and Nutrition Examination Survey. We specifically
focused on implementing mixture methods to understand the overall health effects of the mixture and to
further identify the individual components that are driving these effects. Two mixture methods were used:
Bayesian kernel machine regression and quantile g-computation. In the quantile g-computation analysis,
the overall mixture was associated with increases in red blood cell count and hemoglobin. For both of
these outcomes, benzene and toluene contributed to these associations and toluene consistently had the
highest positive weight. In the Bayesian kernel machine regression analysis, the overall mixture was
associated with an increase in hemoglobin and toluene was the only element to have a relatively high
posterior inclusion probability. These findings are helpful for risk identification efforts and provide
mechanistic insight into the health effects of exposure to this mixture of volatile organic chemicals.
In sum, this work investigated occupational exposure assessment, molecular mechanisms of
benzene toxicity and hematologic manifestations of toxicity to provide an integrated understanding of the
health effects of human exposure to benzene. These results can help to inform occupational exposure
limits and environmental risk characterization.
I. INTRODUCTION
A. Overview
Benzene is a clear, colorless, volatile, and highly flammable liquid. It is an aromatic hydrocarbon
whose molecular structure is composed of six carbon atoms joined to form a ring with one hydrogen
atom attached to each carbon. Benzene is a ubiquitous chemical that has several sources. It is a natural
constituent of crude oil and can be found throughout the petroleum industry. It is also used as a precursor
for the manufacture of complex organic chemicals and can be found in rubbers, lubricants, dyes,
detergents, and pesticides. Finally, it is a product of hydrocarbon combustion and can be found in
volcanic eruptions, forest fires, burning of fossil fuels, and tobacco smoke. Because of the multitude of
sources, human exposure to benzene is prevalent and has been associated with a variety of adverse health
effects. According to the International Agency for Research on Cancer (IARC), there is sufficient
evidence for the carcinogenicity of benzene, specifically for acute myeloid leukemia and acute
nonlymphocytic leukemia[1]. There is also limited evidence for the carcinogenicity of benzene, specifically
for acute lymphocytic leukemia, chronic lymphocytic leukemia, multiple myeloma, and non-Hodgkin’s
lymphoma. The remainder of this introduction will elaborate on four aspects of benzene toxicity to further
illustrate its adverse impacts on human health: Production and Uses, Sources of Exposure, Health Effects,
and Mechanisms of Toxicity.
B. Production and Uses
Benzene can be produced in several ways. It can be prepared by cracking and fractional
distillation of crude oil, catalytic reforming of cycloparaffins or hydrodealkylation of toluene[2]. By
volume, benzene is one of the most produced chemicals in the world[3]. In 2012, the global production of
benzene was approximately 42.9 million tonnes, and the greatest producers of benzene were China, the United
States (US), South Korea, Japan, and Germany. About half of all benzene produced is consumed by
China, the US, and western Europe. Benzene is a highly produced and consumed chemical because it
serves as both a starting and intermediate material for many industrial compounds such as styrene,
phenol, cyclohexane, aniline, alkylbenzenes and chlorobenzenes. These compounds are then used in the
synthesis of plastics, rubbers, polymers, resins, dyes, and detergents. For example, the production of
styrene for downstream use in polystyrene, styrene copolymers, latex, and resins accounts for about 52%
of all benzene consumption. Historically, benzene has also been used as a degreaser for metals, a solvent
for organic materials, and an additive in unleaded gasoline[4].
C. Sources of Exposure
Because of its widespread production and use, occupational and environmental exposure to
benzene is ubiquitous. Occupational exposure is particularly prevalent within the upstream and
downstream components of the petroleum industry. Upstream petroleum activities include the
exploration, drilling and extraction of crude oil from oil wells and transport to refineries via pipelines or
tank ships. These processes generally take place in closed systems to minimize benzene exposure;
however, exposure is possible whenever the system is opened. Potential exposures therefore occur during
cleaning and maintenance of tanks and separators, pipeline pigging operations and storage tank
gauging[5]. Downstream petroleum activities include all refining operations as well as the distribution
and retail of petroleum products. Within refining operations, there is potential for exposure among
process technicians, laboratory technicians, and dock workers in a variety of tasks including sampling,
opening of vessels for maintenance and the loading of petrol for distribution[6]. After refining, petroleum
products are distributed through a transport chain and there is potential for benzene exposure at each point
where products are stored or transferred. Finally, there is potential for exposure during the retail of
petroleum products. For example, exposure to ambient benzene is possible at gasoline stations during
refueling; however, levels have steadily decreased over time due to decreases in the benzene content of
gasoline and the advent of vapor recovery systems[7].
In addition to occupational exposures, environmental exposure is possible through natural and
anthropogenic sources. Natural sources include forest fires, volcanic eruptions and other naturally
occurring combustion of hydrocarbons. Anthropogenic activities are the major source of environmental
exposure and include tobacco smoke, industrial emissions and the burning of coal and oil. Benzene can
also be found in microenvironments associated with motor vehicle operations, including proximity to
traffic, parking garages and gasoline stations[8]. Because benzene is used as a precursor for the
manufacture of complex organic chemicals such as rubbers, lubricants, dyes, detergents, and pesticides,
trace amounts can also be found in a variety of consumer products[9].
D. Health Effects
The extensive sources of occupational and environmental exposure, in combination with potent
toxicity, position benzene as a significant threat to human health. According to IARC, there is sufficient
evidence for the carcinogenicity of benzene for acute myeloid leukemia and acute nonlymphocytic
leukemia. There is also limited evidence for the carcinogenicity of benzene for acute lymphocytic
leukemia, chronic lymphocytic leukemia, multiple myeloma, and non-Hodgkin’s lymphoma. Given the
prevalence of benzene in petroleum refining and industrial manufacturing, the majority of this evidence is
derived from occupational studies among oil workers and rubber manufacturing workers.
Among occupational studies within the petroleum industry, a Norwegian cohort of offshore oil
workers is one of the largest and most influential[10]. The cohort was composed of 27,919 workers who
were employed in the offshore oil industry between 1981 and 2003. Work history records were obtained
through the Norwegian Registry of Employers and Employees. The cohort was followed until 2001 for
cancer incidence using the Cancer Registry of Norway. Leveraging the work history records, the
investigators developed a job-exposure matrix to quantitatively assess each worker’s benzene exposure
during their employment in the offshore oil industry. Using these estimates and a stratified case-cohort
design, the study detected a strong dose-response relationship between cumulative benzene exposure and
both acute myeloid leukemia and multiple myeloma. There was also a suggestive association between
average benzene exposure and chronic lymphocytic leukemia. The key advantages of this study were its
prospective design, comprehensive incidence data and quantitative exposure estimates.
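
The idea of a job-exposure matrix can be illustrated with a minimal sketch. It is not the method of the cited study: the job titles, calendar periods, and exposure intensities below are hypothetical values chosen only to show how a work history can be combined with a matrix of exposure intensities to yield a cumulative exposure estimate in ppm-years.

```python
# Hypothetical job-exposure matrix: (job title, calendar period) -> mean
# benzene exposure intensity in ppm. All values are invented for illustration.
JEM = {
    ("process operator", "1980s"): 0.30,
    ("process operator", "1990s"): 0.10,
    ("deck worker", "1980s"): 0.05,
    ("deck worker", "1990s"): 0.02,
}


def period_of(year: int) -> str:
    """Map a calendar year to its decade label, e.g. 1985 -> '1980s'."""
    return f"{year // 10 * 10}s"


def cumulative_exposure(work_history):
    """Sum intensity x duration over all job segments, in ppm-years.

    Each segment is (job_title, start_year, end_year), with end_year
    exclusive. Every year worked contributes one year at the intensity
    the matrix assigns to that job title and calendar period.
    """
    total = 0.0
    for job, start_year, end_year in work_history:
        for year in range(start_year, end_year):
            total += JEM[(job, period_of(year))]
    return total


# Five years as a deck worker, then five years as a process operator.
history = [("deck worker", 1985, 1990), ("process operator", 1990, 1995)]
print(round(cumulative_exposure(history), 2))  # 5 * 0.05 + 5 * 0.10
```

In this sketch, the same job title receives a different intensity in each decade, which is how a matrix of this kind can reflect changes in exposure levels over calendar time.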
Another prominent study within the petroleum industry is a pooled analysis of three cohort
studies of petroleum distribution workers from Australia, Canada, and the United Kingdom[11]. The
pooled cohort was composed of workers who performed a variety of tasks related to petroleum
distribution and who were employed between 1950 and 1999. Exposure was harmonized between the
three cohorts and six exposure metrics were derived: 1) cumulative exposure, 2) duration of employment,
3) average exposure intensity, 4) maximum exposure intensity, 5) peak exposure, and 6) dermal exposure.
Cancer incidence was assessed through 2006 using hospital records, histopathology reports and cancer
registry data. The study further independently verified diagnostic classifications using two
hematopathologists who were blinded to the exposure assessment. The analysis was a nested case-control
study using 370 cases and 1587 matched controls. The study detected a monotonic dose-response
relationship between cumulative benzene exposure and myelodysplastic syndrome. The key advantages of
this study were the quantitative exposure estimates and independent review of the diagnostic
classification.
Outside of the petroleum industry, occupational studies on benzene exposure tend to focus on
manufacturing processes where benzene was used as a solvent. One of the largest of these studies was a
collaboration between the National Cancer Institute and the Chinese Academy of Preventive Medicine
(NCI-CAPM)[12]. The NCI-CAPM cohort is composed of 74,828 exposed and 35,805 unexposed
Chinese workers from 712 different factories. The exposed workers completed tasks in painting, rubber,
chemical manufacturing, and shoe-making processes where benzene was used as a solvent. The unexposed
workers were selected from processes where benzene was not present. The cohort was initially followed
between 1972 and 1987 and follow-up was further extended to 1999 using factory records, hospital
records, and death certificates. Relative to the unexposed workers, benzene-exposed workers experienced
increased risk for all-cause mortality, lung cancer mortality, incidence of myelodysplastic syndrome/acute
myeloid leukemia, incidence of chronic myeloid leukemia and incidence of non-Hodgkin’s lymphoma.
Although there was no quantitative exposure assessment, the key advantages of this study were the long
follow-up time and the large cohort size that included diverse workers from several different
manufacturing processes.
One of the most contentious and influential series of studies on occupational benzene exposure
centers around the Pliofilm cohort[13–17]. This cohort is composed of workers from three rubber
manufacturing plants in Ohio where benzene was used as a solvent. A total of 1696 workers were
followed up for mortality between 1940 and 1996. One of the major benefits of this study was that
benzene was the only solvent used in the manufacturing processes; therefore, confounding by potential
co-exposures was minimized. Quantitative exposure assessment for this cohort was also possible using
industrial hygiene monitoring data; however, incomplete records across the study duration have introduced
considerable uncertainty in the exposure reconstruction process. Competing strategies for exposure
reconstruction have been introduced by various investigators that have led to different distributions of
benzene exposure estimates. Consequently, different groups have drawn different conclusions on the risks
of benzene. A recent investigation reassessed exposure with a probabilistic approach using the industrial
hygiene data and additional information derived from interviews with former workers[14]. The group
concluded that workers in the highest exposure category had significantly elevated risks of acute
nonlymphocytic leukemia and acute myelocytic leukemia.
Given the evidence that benzene is associated with certain types of leukemia and the potential for
exposure in both an occupational and environmental context, several regulatory agencies have
promulgated benzene regulatory standards to protect human health. The US Environmental Protection
Agency has enforced a maximum contaminant level of benzene in drinking water of 5 parts per billion.
Additionally, the Occupational Safety and Health Administration has implemented several benzene
regulations to protect workers. The current eight-hour time-weighted average concentration that cannot be
exceeded for a workday is 1 part per million (ppm). There is also a short-term exposure limit of 5 ppm,
which is the maximum 15-minute time-weighted average exposure that cannot be exceeded any time
during the workday.
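As an illustration of how the two occupational limits described above operate together, the following minimal sketch computes an eight-hour time-weighted average from a set of samples and checks it against both limits. The sample values, the `Sample` class, and the function names are hypothetical, and treating unsampled time as zero exposure is an assumption made only for this example.

```python
from dataclasses import dataclass

PEL_TWA_PPM = 1.0  # eight-hour time-weighted average limit stated in the text
STEL_PPM = 5.0     # 15-minute short-term exposure limit stated in the text


@dataclass
class Sample:
    minutes: float  # sampling duration
    ppm: float      # mean benzene concentration over that duration


def eight_hour_twa(samples):
    """Time-weighted average over a 480-minute shift.

    Assumption for illustration: any unsampled time has zero exposure.
    """
    return sum(s.minutes * s.ppm for s in samples) / 480.0


def exceeds_stel(samples):
    """True if any sample of 15 minutes or less is above the short-term limit."""
    return any(s.minutes <= 15 and s.ppm > STEL_PPM for s in samples)


# Hypothetical shift: two long samples and one short task-based sample.
shift = [Sample(240, 0.5), Sample(15, 6.0), Sample(225, 0.2)]
twa = eight_hour_twa(shift)
within_pel = twa <= PEL_TWA_PPM
stel_exceeded = exceeds_stel(shift)
```

In this hypothetical shift the full-shift average remains under 1 ppm even though the short task exceeds 5 ppm, which is the situation the short-term limit is intended to capture.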
E. Mechanisms of Toxicity
It is generally accepted that the toxic effects of benzene require metabolism and subsequent
generation of reactive oxygen species. Figure 1 presents a simplified schematic of benzene metabolism.
The first step in metabolism is the conversion of benzene into benzene oxide. This is the only known entry
point into metabolism and is catalyzed by cytochrome P450 2E1 (CYP2E1)[18, 19]. From benzene oxide,
metabolism can proceed down several different pathways. First, it can continue through detoxification
Figure 1. Overview of benzene metabolism

pathways which are mediated primarily by glutathione S-transferase theta-1 (GSTT1) to form S-phenyl-
mercapturic acid[20]. Second, it can be converted to muconic acid by CYP2E1[21]. Finally, it can
proceed down a pathway that ultimately yields reactive oxygen species (ROS)-generating metabolites.
This pathway starts with the nonenzymatic rearrangement of benzene oxide into phenol. From there,
phenol is converted to either hydroquinone or catechol by CYP2E1. Catechol and hydroquinone are then
further oxidized by myeloperoxidase (MPO) to produce 1,2- and 1,4-benzoquinone, respectively[22]. In
both cases, the reverse reaction is catalyzed by NAD(P)H:quinone oxidoreductase 1 (NQO1)[23]. The
benzoquinones produced by MPO accumulate in the bone marrow where they promote the production of
ROS. ROS cause damage to DNA, tubulin, histone proteins and other DNA associated proteins in bone
marrow stem cells and early progenitor cells[24]. Accumulation of oxidative damage reduces the capacity
of these progenitor cells to produce circulating blood cells (hematotoxicity) and eventually leads to
oncogenic transformation (leukemogenicity).
F. Specific Aims
1. Aim 1
Aim 1 is to quantitatively estimate cumulative benzene exposure in an occupational cohort of
refinery workers from four locations in the US. To do this, we will develop a methodology to reconstruct
exposure for each member of the cohort in a two-step process that uses industrial hygiene monitoring data
from 1976 to 2007 and work history records from 1979 to 2010. In the first step, the industrial hygiene
monitoring data will be used to develop a model to predict benzene exposure given the calendar year,
refinery location and job title. In the second step, the model will be applied to individual work histories to
derive quantitative estimates of cumulative exposure. These cumulative exposure estimates will serve as
the foundation for future epidemiologic investigations to assess the association between occupational
benzene exposure and adverse health effects in this cohort.
2. Aim 2
Aim 2 is to characterize the dose-response relationship between benzene exposure and the
production of urinary metabolites in an occupational cohort of factory workers in Shanghai, China. Four
urinary metabolites will be assessed: phenol, hydroquinone, catechol and muconic acid. To detect
potential non-linearities in the dose-response relationship, flexible natural splines will be used to model
urinary metabolite concentrations as a function of benzene exposure. Modeling the dose-response
relationship between exposure and the production of metabolites will help in the risk characterization of
benzene.
3. Aim 3
Aim 3 is to evaluate the association between environmental exposure to a prevalent, highly
correlated mixture of volatile organic chemicals (VOCs) and hematologic parameters in a subset of
participants from the National Health and Nutrition Examination Survey. Understanding the health effects
of a correlated mixture has many inherent analytic challenges. To address these challenges, we will
implement two mixture methods: quantile G-computation and Bayesian kernel machine regression. These
efforts will help characterize the health effects of the overall VOC mixture and identify the individual
constituents of the mixture that are driving these effects.

II. EXPOSURE RECONSTRUCTION IN A COHORT OF REFINERY WORKERS
A. Background
Exposure to benzene has been associated with lymphohematopoietic cancers in many
occupational cohorts[25–29]. One of the foundational cohorts that provides clear evidence of this
association is the Pliofilm rubber worker study, where benzene was used in the Pliofilm manufacturing
process as the principal solvent to dissolve rubber. Rinsky and colleagues used data from this cohort to
conduct a series of epidemiologic investigations that are pivotal to the modern understanding of benzene
leukemogenicity[15, 30, 31]. These studies demonstrate a dose-response relationship between benzene
exposure and leukemia mortality that has been used as evidence to set many regulatory standards,
including the Occupational Safety and Health Administration’s (OSHA) permissible exposure limit, the
American Conference of Governmental Industrial Hygienists’ (ACGIH) threshold limit value and the US
Environmental Protection Agency’s cancer potency factor.
Because the Pliofilm cohort has been so influential in an extensive history of benzene regulatory
action, the analysis of the Pliofilm cohort has been intensely scrutinized and reanalyzed by other
investigators. One of the major points of contention in debates and competing analyses has been how
exposure was retrospectively assessed given an incomplete record of historical industrial hygiene
measurements. The Pliofilm cohort had two main components to reconstruct exposure: 1) task-specific
individual work histories between 1940 and 1976; and 2) area-specific, historical airborne benzene
measurements obtained as part of an industrial hygiene program. In the first iteration of the exposure
reconstruction, Rinsky and colleagues collapsed the tasks from the work history and the areas from the
exposure data into “similar exposure groups”[31]. A job-exposure matrix was then developed which
could associate each time point in an individual’s work history with a specific exposure level. However,
gaps in exposure data were pervasive in the study and work history could not always be directly assigned
an exposure value. In these instances, benzene exposure was imputed as the interpolation between the
closest available values before and after the time point where exposure was missing. Exposure estimates
derived from this approach were subsequently used to show a strong dose-response relationship between
cumulative benzene exposure and leukemia mortality.
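The interpolation step described above can be sketched as follows. This is only an illustration: it assumes linear interpolation between the nearest measured years, which the text does not specify, and the function name and exposure values are hypothetical.

```python
def interpolate_missing(levels):
    """Fill gaps in a {year: ppm or None} series.

    Each missing year is linearly interpolated between the closest
    measured years before and after it (linearity is an assumption).
    Years without a measured value on both sides are left as None.
    """
    known = sorted(y for y, v in levels.items() if v is not None)
    filled = dict(levels)
    for year, value in levels.items():
        if value is not None:
            continue
        before = [y for y in known if y < year]
        after = [y for y in known if y > year]
        if not before or not after:
            continue  # cannot interpolate at the edges of the record
        y0, y1 = before[-1], after[0]
        frac = (year - y0) / (y1 - y0)
        filled[year] = levels[y0] + frac * (levels[y1] - levels[y0])
    return filled


# Hypothetical exposure group with measurements only in 1946 and 1950.
series = {1946: 40.0, 1947: None, 1948: None, 1949: None, 1950: 20.0}
filled = interpolate_missing(series)
```

Because every imputed value depends only on its two neighboring measurements, long gaps in the record are filled entirely by the few samples that bracket them.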
This approach was later criticized in a reanalysis performed by Crump et al. [13, 32]. The authors
argued that interpolation based solely on available data in times with missing data would unreliably
estimate exposure. They instead argued that whenever data was not available, workplace concentrations
should be informed by the regulatory context of the time. As such, Crump et al. estimated exposure in
times with missing data as proportional to the closest available ACGIH threshold limit value (TLV).
Exposure estimates constructed using this TLV approach were significantly higher, and the
resulting estimates of the toxic effects of benzene were consequently lower. A
separate approach to the exposure reconstruction by Williams and Paustenbach further incorporated
additional information obtained through interviews with former workers to adjust the exposure data based
on workplace practices, accuracy of monitoring devices, and historical implementation of engineering
controls[14, 17]. The authors also applied probabilistic models to account for the variability and
uncertainty in the reconstruction. This method produced a third set of exposure estimates that, when
applied to the mortality data, suggested a previously unforeseen threshold phenomenon in the association
between benzene exposure and risk of certain leukemias.
The reanalysis and debate surrounding the Pliofilm cohort illustrate how the structure and
availability of the exposure data affected how investigators implemented an exposure reconstruction.
Differences in reconstruction implementations then led to different findings when assessing the health
effects associated with occupational benzene exposure. Although the merits of competing Pliofilm
analyses continue to be debated [33], operation of the processes ceased in 1976 and exposures were
relatively high compared to modern standards. This highlights the need to study contemporary cohorts
and to delineate robust exposure reconstruction methods given the structure of the exposure data in
question. In the current study, an exposure reconstruction methodology was developed for a cohort of
workers at four US refineries by combining historical industrial hygiene data with detailed work histories
from 1976 to 2010. The exposure estimates generated from this analysis can be used for future
epidemiological investigations into the health effects of modern occupational benzene exposure.
B. Methods
1. Exposure Data
This study used data collected as part of an ongoing historical benzene exposure assessment
program conducted at four US oil refineries located in Baton Rouge, Louisiana; Beaumont, Texas;
Baytown, Texas; and Joliet, Illinois. An overview of the sampling strategy and characterization of the
historical benzene exposure data within each refinery has been previously published [34–38]. Analysis
has also been performed to assess the feasibility and viability of pooling data across the refineries and to
characterize the pooled dataset [39]. Briefly summarizing these previous efforts, exposure monitoring
began at the refineries in 1976 and this analysis used data up to 2007. Personal benzene sampling was
conducted by industrial hygienists to target specific areas and tasks within the refining process. Samples
were collected according to standard operating procedures involving the use of either 150-mg charcoal
sorbent tubes or passive organic vapor badges. Samples were analyzed by a laboratory accredited by the
American Industrial Hygiene Association (AIHA) according to methods consistent with internal standard
operating procedures. A time-weighted average for each sample was calculated based on the sample
collection duration. Measurements were then pooled across the four refineries and assigned to one of 50
job titles and one of 48 work areas, creating a total of 362 unique job title-area combinations (referred to
as worker exposure groups). These categories were developed by a panel of industrial hygienists to pool
activities across the refineries that have similar exposure potential.
Values below the limit of detection (LOD) were imputed using the regression on order statistics method
for a lognormal distribution. A linear regression model was fitted relating the detected values within the
dataset to the quantiles of an assumed lognormal distribution. Values below the LOD were then
extrapolated from the linear regression model. This method has been shown to produce robust estimates
of mean and standard deviation using data with modest departures from the lognormal distribution,
multiple limits of detection and high proportions of values below the LOD [40, 41].
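The general idea of regression on order statistics can be sketched as follows. This is a simplified illustration and not the procedure used in the study: it assumes a single detection limit, assumes that all non-detects rank below all detected values, and uses Blom plotting positions. The function name and the measurement values are hypothetical.

```python
import math
from statistics import NormalDist


def ros_impute(detected, n_censored):
    """Impute values below a single LOD by regression on order statistics.

    Simplifying assumptions (not from the source): one detection limit,
    censored values occupy the lowest ranks, and Blom plotting
    positions (rank - 0.375) / (n + 0.25).
    """
    n = len(detected) + n_censored
    # Standard normal quantile for every rank in the combined sample.
    z = [NormalDist().inv_cdf((rank - 0.375) / (n + 0.25))
         for rank in range(1, n + 1)]
    z_censored, z_detected = z[:n_censored], z[n_censored:]

    # Ordinary least squares of log(detected value) on normal quantile.
    y = [math.log(v) for v in sorted(detected)]
    x_bar = sum(z_detected) / len(z_detected)
    y_bar = sum(y) / len(y)
    slope = (sum((x - x_bar) * (v - y_bar) for x, v in zip(z_detected, y))
             / sum((x - x_bar) ** 2 for x in z_detected))
    intercept = y_bar - slope * x_bar

    # Extrapolate the fitted line to the censored ranks and back-transform.
    return [math.exp(intercept + slope * q) for q in z_censored]


# Hypothetical measurements in ppm: five detects plus three non-detects.
detects = [0.12, 0.20, 0.35, 0.60, 1.10]
imputed = ros_impute(detects, n_censored=3)
```

The imputed values are used only to summarize the distribution (for example, its mean and standard deviation) and should not be interpreted as estimates of individual measurements.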
The benzene monitoring program implemented a targeted sampling strategy to continuously
ensure employee exposure remained below exposure limits and to comply with OSHA standards.
Therefore, sampling was focused on workers who had a higher potential of exposure based on their task
and work area. As a result, the dataset overrepresents workers and work activities with higher benzene
exposure.
2. Work History Data
This study also used work history records from the four refineries obtained through electronic
databases. A deidentified database extract of work history records from 1979 to 2010 was obtained that
contained information on employment dates, employment location (i.e. one of the four refineries), and
payroll titles used for payroll processing.
3. Data Linkage
Figure 2 depicts an overview of the exposure reconstruction process. Broadly, the reconstruction
process generates estimates of historical benzene exposure for each worker by merging the pooled
exposure data and the work history data. These data sources were not initially designed to be combined;
therefore, the job titles and work areas in the exposure data followed a different
classification scheme than the payroll titles in the work history data. The 362 unique
combinations of job title and work area were considered as distinct “worker exposure groups”. The work
history data contained 7,318 unique payroll titles. In order to coherently merge these entities, a linkage
process was conducted by industrial hygiene experts familiar with the exposure
monitoring program and process operations at the four refinery sites (Figure 2A). The 362 unique worker
exposure groups and the 7,318 unique payroll titles were collapsed into 52 mutually exclusive
“standardized job titles”. These standardized job titles were then mapped back to the exposure data and
work history data to be used as an identifier variable for linkage. Although a total of 52 standardized job
titles were mapped back to the work history data, only 39 standardized job titles were mapped back to the
Figure 2. Exposure reconstruction overview

exposure data. This means that 13 standardized job titles in the work history data could not be linked to
the exposure data.
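The linkage logic can be illustrated with a small sketch in which both data sources are mapped to a shared set of standardized job titles and the titles that appear only in the work history are identified. All payroll titles, worker exposure groups, and mappings shown here are hypothetical and are not taken from the study's coding scheme.

```python
# Hypothetical lookup tables assigned by expert review (illustrative only).
payroll_to_standard = {
    "PROC TECH A - REFORMER": "Process Technician/Reformer",
    "MACHINIST 1ST CLASS": "Machinist",
    "CAFETERIA ATTENDANT": "Food Services",  # hypothetical unlinked title
}
exposure_group_to_standard = {
    # key: (job title, work area) pair, i.e. a worker exposure group
    ("Reformer Operator", "Reformer Unit"): "Process Technician/Reformer",
    ("Machinist", "Machine Shop"): "Machinist",
}

work_history_titles = set(payroll_to_standard.values())
exposure_titles = set(exposure_group_to_standard.values())

# Standardized titles usable as a join key between the two sources.
linked = work_history_titles & exposure_titles
# Titles present in the work history with no exposure records behind them.
unlinked = work_history_titles - exposure_titles
```

In the study, the analogous set difference corresponds to the 13 standardized job titles that appear in the work history data but not in the exposure data.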
4. Exposure Reconstruction
After the two data sources were linked, historical exposure was reconstructed for each member of
the cohort using a modelling approach (Figure 2B and 2C). This approach is based on previous exposure
reconstruction implementations that build a statistical model that can predict exposure given a set of
predictor variables within the work history data. As an example, this approach has been used in the
sterilization industry to estimate retrospective exposure to ethylene oxide [42]. The exposure
reconstruction process was divided into three components: data aggregation, model building, and model
prediction.
a. Data Aggregation
Exposure data was aggregated by standardized job title, refinery site, and calendar year
to summarize potentially dense exposure data and to stabilize exposure reconstruction
estimates. For each unique stratum of standardized job title, refinery site, and calendar year, the
arithmetic mean of exposure values was calculated. Each stratum was also assigned a weight
designed to be indicative of the sampling reliability within the stratum. The weight was a
function of the coefficient of variation (CV = standard deviation/mean) and the sample size (n)
of the stratum’s exposure values and took the following form:
\[ \mathrm{Weight} = \frac{1}{CV^{2} / \log(n + 1)} = \frac{\log(n + 1)}{CV^{2}} \]
In strata where the standard deviation could not be calculated (n < 3), the CV was imputed as
the global CV across all exposure data.
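The aggregation step can be sketched as follows. The sketch reads the weight as log(n + 1) / CV², which is one interpretation of the formula above and should be treated as an assumption; the record layout, function name, and example values are likewise hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def aggregate(records, min_n_for_sd=3):
    """Aggregate (job_title, site, year, ppm) records into weighted strata.

    Assumed weight: log(n + 1) / CV**2. Strata with fewer than
    `min_n_for_sd` samples borrow the CV of the whole dataset.
    Note: a stratum whose values are all identical has CV = 0 and
    would need separate handling; it is not covered in this sketch.
    """
    strata = defaultdict(list)
    for job, site, year, ppm in records:
        strata[(job, site, year)].append(ppm)

    all_values = [v for values in strata.values() for v in values]
    global_cv = stdev(all_values) / mean(all_values)

    summary = {}
    for key, values in strata.items():
        n = len(values)
        m = mean(values)
        cv = stdev(values) / m if n >= min_n_for_sd else global_cv
        summary[key] = {"mean": m, "n": n,
                        "weight": math.log(n + 1) / cv ** 2}
    return summary


# Hypothetical records: one well-sampled stratum and one single-sample stratum.
records = [
    ("Machinist", "Baytown", 1985, 0.2),
    ("Machinist", "Baytown", 1985, 0.4),
    ("Machinist", "Baytown", 1985, 0.6),
    ("Machinist", "Baytown", 1985, 0.8),
    ("Electrician", "Joliet", 1990, 0.1),
]
summary = aggregate(records)
```

Under this reading, strata with more samples and less relative variability receive more weight in the subsequent regression.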
b. Model Building
Weighted linear regression was used to model the relationship between the aggregated
exposure data in a stratum and the corresponding standardized job title, refinery site, and
calendar year. In order to normalize the residuals of the model, a logarithmic transformation
was applied to the exposure measurements. A logarithmic transformation has the additional
benefit of permitting only positive estimates of exposure during prediction. The model was
specified with both standardized job title and refinery site as nominal categorical predictor
variables. Several model forms were evaluated to determine the appropriate way to specify
calendar time in the model. A continuous linear term for calendar time assumes a constant year-to-year
change in exposure values. Given the implementation of various engineering controls
and regulations throughout the time frame of the study, nonlinear relationships between
calendar time and exposure values were also considered. Specifically, calendar time was
specified using a quadratic term or a natural spline with an increasing number of knots. The
final model specification was chosen by a balance of the adjusted R² (i.e. the model’s ability to
describe the variability in the exposure values), the stability of the predictions over the
observed study duration, and the overall complexity of the model.
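A minimal sketch of a weighted regression on log-transformed exposure is shown below. It is not the study's model: calendar time enters as a single linear term for brevity (the quadratic and natural-spline forms discussed above are omitted), the small linear solver stands in for a statistical library, and the strata, weights, and reference levels are hypothetical.

```python
import math


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_wls(rows, y, w):
    """Weighted least squares via the normal equations (X'WX) b = X'Wy."""
    p = len(rows[0])
    xtwx = [[sum(wi * r[j] * r[k] for r, wi in zip(rows, w))
             for k in range(p)] for j in range(p)]
    xtwy = [sum(wi * r[j] * yi for r, yi, wi in zip(rows, y, w))
            for j in range(p)]
    return solve(xtwx, xtwy)


def design_row(job, site, year):
    """Intercept, one dummy per non-reference level, centred calendar year."""
    return [1.0, float(job == "Machinist"), float(site == "Joliet"),
            (year - 1990) / 10.0]


# Hypothetical aggregated strata: (job, site, year, mean ppm, weight).
strata = [
    ("Machinist", "Baytown", 1980, 0.80, 5.0),
    ("Machinist", "Joliet", 1985, 0.50, 3.0),
    ("Electrician", "Baytown", 1990, 0.10, 2.0),
    ("Electrician", "Joliet", 1995, 0.05, 4.0),
    ("Machinist", "Baytown", 2000, 0.20, 6.0),
    ("Electrician", "Baytown", 1982, 0.30, 1.0),
]
rows = [design_row(job, site, year) for job, site, year, _, _ in strata]
log_ppm = [math.log(ppm) for _, _, _, ppm, _ in strata]
weights = [weight for *_, weight in strata]
beta = fit_wls(rows, log_ppm, weights)


def predict_ppm(job, site, year):
    """Back-transform the linear predictor; the result is always positive."""
    linear = sum(b * x for b, x in zip(beta, design_row(job, site, year)))
    return math.exp(linear)
```

Because the model is fitted on the logarithmic scale, exponentiating the linear predictor guarantees positive exposure estimates for any combination of job title, site, and year.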
c. Exposure Prediction
After the model was finalized, a reconstructed history of exposure was estimated for
each worker in the cohort by applying the model to the corresponding work history data
(Figure 2C). First, a timeline of worker history was specified using the start and end dates of
employment. The timeline was then populated with events whenever there was evidence of a
change in standardized job title or refinery site. History of exposure was then predicted by
applying the model to the values of standardized job title, refinery site, and calendar time along
the timeline.
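The prediction step can be sketched as follows: each segment of a worker's timeline is split by calendar year, the model prediction for that year is multiplied by the time spent in the segment, and the products are summed to give cumulative exposure in ppm-years. The annual resolution, the use of 365.25 days per year, the `toy_predict` function, and the example timeline are all assumptions made for illustration.

```python
from datetime import date


def cumulative_exposure(timeline, predict):
    """Cumulative exposure in ppm-years for one worker.

    timeline: list of (start_date, end_date, job_title, site), with
              end_date exclusive.
    predict:  function (job_title, site, year) -> exposure in ppm.
    """
    total = 0.0
    for start, end, job, site in timeline:
        for year in range(start.year, end.year + 1):
            # Portion of this calendar year covered by the segment.
            seg_start = max(start, date(year, 1, 1))
            seg_end = min(end, date(year + 1, 1, 1))
            years = (seg_end - seg_start).days / 365.25
            if years > 0:
                total += predict(job, site, year) * years
    return total


def toy_predict(job, site, year):
    """Hypothetical stand-in for the fitted model (values are invented)."""
    base = {"Process Technician/General": 0.10, "Machinist": 0.30}[job]
    return base * (0.5 if year >= 1989 else 1.0)


# Hypothetical career with one change of standardized job title.
timeline = [
    (date(1985, 1, 1), date(1990, 1, 1), "Machinist", "Baytown"),
    (date(1990, 1, 1), date(1995, 1, 1), "Process Technician/General", "Baytown"),
]
total_ppm_years = cumulative_exposure(timeline, toy_predict)
```

Summing exposure multiplied by duration in this way is a step-function approximation of the area under the reconstructed exposure curve.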
C. Results
1. Description of Linked Data
After the work history data and exposure data were successfully linked via a set of standardized
job titles, both datasets were summarized to understand how patterns within the data might affect the
exposure reconstruction process. Table I presents characteristics of the work history data after the data
TABLE I.
WORK HISTORY CHARACTERISTICS OF THE FOUR REFINERY STUDY
Characteristic                                  n        %
Total workers                                5129
Workers with process technician history      2287    100.0
Refinery site
    Baytown                                   787     34.4
    Baton Rouge                               648     28.3
    Beaumont                                  554     24.2
    Joliet                                    284     12.4
    Multiple sites                             14      0.6
Number of job titles held
    1                                         495     21.6
    2                                        1069     46.7
    3                                         410     17.9
    4                                         202      8.8
    5                                          68      3.0
    6                                          28      1.2
    7                                          14      0.6
    8                                           1      0.0
Career duration (years)
    Mean                                     13.2
    Median                                   10.8
    Min                                         1
    Max                                        32
linkage process translated 7,318 unique payroll titles into 52 unique standardized job titles. There was a
total of 5,129 workers who contributed at least one year of work history data at any of the four refineries
between 1979 and 2010. For the purposes of developing and evaluating the exposure reconstruction
methodology, this analysis focused only on workers who had any work history corresponding to
standardized job titles containing the phrase “Process Technician”. This subsample of 2,287 individuals
represents workers who have the highest potential for benzene exposure in the cohort. Within this
subsample, the largest share of workers spent their entire career in Baytown (34.4%). This was followed by
Baton Rouge (28.3%), Beaumont (24.2%) and Joliet (12.4%). These relative proportions are indicative of
the scale and volume of production at each of the four refinery sites [39]. Notably, only 14 workers
(0.6%) had evidence of working at multiple sites in their career. Career structure also varied
considerably across the cohort. The mean career duration was 13.2 years with a minimum of 1
year and maximum of 32 years. The largest share of workers (46.7%) held two distinct standardized job titles
throughout their career and the maximum number of standardized job titles held was eight.
The exposure data was similarly summarized after the linkage process mapped 362 unique
worker exposure groups to 39 unique job titles out of the 52 that were initially developed. Figure 3
presents the total number of exposure records pooled across all job titles and refinery sites for each year
of the study. Notably, the Occupational Safety and Health Administration ratified a new benzene standard
in 1989, lowering the permissible exposure limit from a time-weighted average of 10 ppm to 1 ppm for an
8-hour workday. This figure illustrates that overall sampling throughout the program was much more
prevalent leading up to 1989 as controls were implemented and validated to comply with the new
standard. Table II presents the number of exposure records available for each job title. This table
demonstrates that due to the nature of the targeted sampling strategy implemented at the four refinery
sites, job titles with higher potential for benzene exposure were sampled more frequently. In order to
understand sampling patterns within job titles further, Figure 4 presents the exposure profiles over time
for 16 randomly selected strata after the data was stratified by job title and refinery site (exposure values
are presented on the logarithmic scale). These exposure profiles further reinforce key aspects of the
Figure 3. Total number of exposure records available over the study duration
targeted nature of the benzene monitoring program. First, it is apparent that some job titles such as
“Pipefitter/Welder” had higher benzene exposure values and these job titles were sampled more
frequently as a result. Second, job titles displayed heterogeneity in their sampling pattern over time.
Whereas “Electrician” was sampled relatively uniformly over the duration of the study,
“Pipefitter/Welder” had a marked increase in sampling leading up to 1989.
In the ideal linkage scenario, job titles that contribute the majority of observed person-time would
also have a high density of exposure information to reliably support a reconstruction of historical
exposure. To understand how the patterns observed in the work history data and exposure data could
interact during the exposure reconstruction, Figure 5 compares the proportion of observation time
contributed by each job title to the prevalence of that job title in the exposure data (restricted to the 20 job
TABLE II.
TOTAL NUMBER OF EXPOSURE RECORDS AVAILABLE FOR
EACH STANDARDIZED JOB TITLE
Standardized Job Title n
Machinist 2013
Pipefitter/Welder 1649
Laboratory Technician 1403
Process Technician/Reformer 1033
Process Technician/Oil Movements 747
Process Technician/Coker 593
Process Technician/Pipestill 501
Process Technician/Catalytic Cracker 388
Process Technician/Desulfurization 372
Process Technician/Light Ends Unit 325
Process Technician/Tank Farm 287
Process Technician/Hydrofiner 243
Process Supervisor 235
Electrician 219
Instrument Technician 206
Process Technician/General 162
Process Technician/Waste Treatment 157
Process Technician/Dewaxing Area 123
Process Technician/Hydrocracker 112
Environmental/Safety Inspector 75
Field Supervisor 62
Process Technician/Lube Blending and Storage 61
Process Technician/Lube Extraction Unit 60
Process Technician/Utilities 57
Building Trades 50
Process Technician/Lube Rack 49
Industrial Hygienist 41
Mobile Equipment Operator 39
Engineer 31
Process Technician/Laboratory 24
Process Technician/WCLA 20
Garage Mechanic 18
Office Administration 16
Laboratory Supervisor 15
Inspector 13
Material Specialist 10
Maintenance Supervisor 9
Security Officer 6
Maintenance-Superintendent 4
Figure 4. Selected standardized job title-specific exposure profiles over the study duration
Figure 5. Observation time and exposure records by standardized job title

titles that contributed the most observation time to the study). This figure highlights some potential
weaknesses in how the two data sources overlap. For example, although “Machinist” and
“Pipefitter/Welder” were the most prevalent job titles in the exposure data (representing 17.6% and 14.4%
of all exposure records, respectively), the cohort only contributed 3.5% and 0.9% of observation time
towards these jobs. Similarly, “Process Technician/General” represented 51.4% of observation time in the
cohort but only corresponded to 1.4% of all exposure records. As previously mentioned, while the linkage
process generated 52 unique standardized job titles in the work history data, only 39 could be mapped to
the exposure data (see Figure 2A for an overview of the linkage process). This is a potential limitation of
the linkage, especially if the remaining 13 job titles in the work history that do not link to any records in
the exposure data contribute a significant amount of observed person-time. However, these 13 job titles
only contributed 3% of the overall observed person-time and in these instances, workers were assumed to
have no benzene exposure.
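As an arithmetic check, the shares of exposure records quoted above can be recomputed from the counts in Table II. The variable names are illustrative; the counts are those listed in the table.

```python
# Counts of exposure records per standardized job title, from Table II.
named_counts = {
    "Machinist": 2013,
    "Pipefitter/Welder": 1649,
    "Process Technician/General": 162,
}
# Remaining 36 job titles in Table II, in table order.
other_counts = [
    1403, 1033, 747, 593, 501, 388, 372, 325, 287, 243, 235, 219, 206,
    157, 123, 112, 75, 62, 61, 60, 57, 50, 49, 41, 39, 31, 24, 20, 18,
    16, 15, 13, 10, 9, 6, 4,
]

total_records = sum(named_counts.values()) + sum(other_counts)

# Percentage of all exposure records contributed by each named job title.
shares = {title: 100.0 * n / total_records
          for title, n in named_counts.items()}
```

Rounded to one decimal place, the recomputed shares agree with the percentages reported in the text for these three job titles.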
2. Model Building And Predictions
After the work history data and exposure data were characterized using the standardized job title,
a model was developed to predict benzene exposure given information from an individual’s work history.
For appropriate interpolation within the model, only variables shared between the work history data and
exposure data were considered, namely calendar time, refinery site, and standardized job title. Exposure
data was aggregated and weights for each stratum were calculated as previously described (Section 4.a).
Weighted linear regression was then used to model benzene exposure as a function of standardized job
title, refinery site, and calendar year. Standardized job title and refinery site were both specified as
categorical variables. Five distinct specifications were considered for calendar time: a linear term, a
quadratic term, and a natural spline with two, three, or four knots. Figure 6 compares the proportion of variability
explained by the model (adjusted R²) between the different specifications of calendar time. All models
explained a substantial amount of variability in the weighted data and increasing the complexity of
calendar time specification did not appreciably change performance. Specifically, the adjusted R² for the
model with a linear term was 0.861 compared to 0.865 for the model using a natural spline with four
Figure 6. Proportion of variability explained by model with different specifications of calendar time
knots. The final prediction model used a natural spline with two knots to specify calendar time for two
reasons. First, this specification represents a balance between a modest increase in R² relative to using a
linear term and lower model complexity relative to more complicated models. Second, given the
punctuated changes in regulatory benzene standards over the course of the study, it is important to allow some
non-linearity in the relationship between calendar time and benzene exposure.
To visualize how the prediction model fits the exposure data, Figure 7 presents predictions
derived from the model along with the corresponding exposure data. Data was stratified by standardized
job title and refinery site. Nine strata were randomly selected. The empirical exposure data over the study
duration for each stratum and a trend line derived from model predictions within the stratum were plotted.
The trend lines demonstrate the same profile in benzene exposure over time but have different magnitudes
depending on the stratum. This is due to the nature of the model, where calendar time is specified with the
Figure 7. Model predictions and exposure data from selected standardized job titles
same coefficients independent of the standardized job title or refinery site. Notably, these profiles model a
decrease in exposure levels starting in approximately 1989, the year the current benzene standard was
ratified.
The finalized prediction model was then applied to individual work histories to reconstruct
exposure. Figure 8A illustrates an example where work history data was used to generate a career
timeline for a hypothetical worker (although the worker history presented here is not actual data, it is
representative of how career timelines are structured). Each segment of the timeline contains a start date,
end date, standardized job title, and refinery site. Figure 8B illustrates how exposure values are predicted
along the timeline using the model and the given values of calendar year, standardized job title, and
refinery site. The result is a profile of reconstructed exposure over the study duration. The profile depicts
shifts in exposures as the worker changes job title and a general decrease in exposure starting in 1989 (as
described previously). An estimate of cumulative exposure can be derived from the exposure timeline by
calculating the area under the curve (for this example, the cumulative benzene exposure estimate was 1.998
ppm-years). To summarize the exposure reconstruction across the entire cohort, cumulative exposure
estimates were calculated for each worker and presented in Figure 9. The distribution of cumulative
exposure estimates appears lognormal with a median of 0.39 ppm-years, a minimum of 0.004 ppm-years,
and a maximum of 7.16 ppm-years. Based on the distribution, the maximum value of 7.16 ppm-years appears to be
an outlier; the next highest value is 3.44 ppm-years.
D. Discussion
This study developed a framework for occupational benzene exposure reconstruction by linking
exposure data and work history data from a new cohort of US refinery workers (i.e. the Four Refinery
cohort). Two disparate data sources were utilized: 1) detailed work histories for each worker derived from
payroll information at the four refineries between 1979 and 2010; and 2) historical benzene exposure data
derived from an industrial hygiene monitoring program between 1976 and 2007. These data sources used
different classification schemes that were linked together by developing a standardized job title. To
Figure 8. Hypothetical career timeline and exposure reconstruction

Figure 9. Distribution of cumulative exposure estimates
predict benzene exposure at a given point during the study, a weighted linear regression model was then
developed that defined exposure as a function of standardized job title, refinery site, and calendar time.
Both standardized job title and refinery site were specified as categorical variables. Calendar time was
specified using a natural spline with two knots to allow for non-linear changes in exposure levels
throughout the study duration. The model was then applied to individual work histories to reconstruct a
timeline of exposure for each worker and derive estimates of cumulative exposure. These exposure
estimates can be used in future epidemiological investigations to evaluate the health effects of
occupational exposure to benzene in a modern context.
The contemporary nature of the Four Refinery cohort is a valuable distinction from previous
occupational benzene cohorts. Thus far, one of the most influential cohorts in understanding the effects of
benzene exposure and setting regulations has been the Pliofilm rubber worker study [13, 15, 17, 30, 31].
This study involved rubber manufacturing operations between 1940 and 1976, a period with a much weaker
regulatory context than today. As a result, the Pliofilm cohort observed very high exposures and there are few
cohorts established to provide information on lower level benzene exposure. One key advantage of the
Four Refinery cohort is that observation is ongoing and occurs in a modern regulatory context, with the
majority of observation time taking place after the most recent regulatory standard of 1 ppm. This makes
the cohort ideally suited to assess the health effects of long-term exposure to modern levels of
occupational benzene. Another unique benefit is the rich exposure data available for this study. The
industrial hygiene monitoring program established at the four refineries provides a single source of
systematically collected, dense, task-specific exposure data that has been used to reconstruct exposure. In
contrast, a common caveat among previous cohorts is the lack of exposure data that limits many aspects
of the analysis and subsequent interpretations. For example, studies of the Pliofilm cohort have
had to assemble exposure data from various sources including the National Institute for Occupational
Safety and Health (NIOSH), the Ohio Department of Health, the University of North Carolina, and Goodyear,
sometimes with no information on how or why samples were collected. Although exposures were
observed for about thirty years, the first two decades contain less than 1% of all available exposure data
[13, 14, 17]. This lack of information for the majority of the study is a limitation that has been handled
differently across investigations. As a result, these investigations have reached distinct and
conflicting conclusions regarding the toxic effects of occupational benzene exposure. Utilizing a rich
source of exposure data, the Four Refinery cohort is well-positioned to avoid these methodological issues.
Another important difference between the Four Refinery cohort and the Pliofilm cohort is the use
of a regression model to estimate exposures. Previous investigations of the Pliofilm cohort relied on
exposure estimates obtained by interpolating between available values. This was motivated by the lack of
exposure data for the majority of the study and results in a simplistic and uncertain exposure
reconstruction. In contrast, the density of information in the Four Refinery cohort allowed for a
regression-based approach that predicts exposure based on multiple, uniquely specified variables.
Calendar time was specified non-linearly using a spline with two knots and the shape of the spline was
consistent across all standardized job titles and refinery sites. While it is conceivable that different job
titles could have unique exposure profiles over time, industrial hygiene programs aimed at reducing
exposure were often implemented globally. Therefore, all job titles across refineries would likely have
similar changes in exposures over time.
Although the dense and systematically collected exposure data is a key advantage of this study, the
targeted nature of the industrial hygiene monitoring program has important consequences. First, the
sampling strategy was initially designed to focus on jobs where there was a greater potential of exposure
due to the nature of the task. As a result, high exposure jobs are overrepresented in the dataset. For the
purposes of monitoring exposures to protect vulnerable workers, this is an intended behavior of the
monitoring program. However, high exposure jobs are among the least prevalent in the cohort in terms of
observed person-time contributed to the study and an exposure reconstruction would ideally benefit from
increased sampling in the job titles that contributed the most person-time. This disparity should be
considered a potential weakness inherent to any exposure reconstruction that retrofits industrial hygiene
monitoring data. A second consequence is the decrease in sampling frequency
over time. Sampling was more frequent leading up to 1989, when new regulatory actions were being
ratified and exposures were vigorously monitored for compliance. Since then, exposure levels have been
controlled and sampling has been performed at a surveillance level. As a result, more recent observation
time relies on less exposure data; however, this is likely because contemporaneous exposures are well
characterized and controlled.
Another limitation to consider is the specificity of the work history records. In contrast to the task
and area variables of the industrial hygiene data, the job title variable from the work history data was not
explicitly developed to capture task-specific information that could be used to infer exposure. Instead, the
work history data was initially structured for payroll administration and it was at the discretion of the
administrators to create payroll titles that were informative of the specific tasks performed. In this study,
expertise from individuals familiar with the monitoring program and the administration at the four sites
was consulted to link between the payroll title and the appropriate task-specific industrial hygiene
exposure data. Nevertheless, there were instances where the nomenclature of the payroll titles was
ambiguous. For example, while much of the work history data containing the phrase “Process
Technician” could be further classified into a specific area or task (e.g., “Process Technician/Hydrofiner”
or “Process Technician/Lube Extraction Unit”), some of this data could not be further categorized. These
instances were classified as “Process Technician/General” to capture status as a process technician but
without further task-specific information. This work history data was subsequently associated with
generalized process technician exposure data present in the exposure dataset. However, the potential for
exposure differs markedly among different types of process technicians, and this generalization may
represent loss of information. The use of the “Process Technician/General” job title is therefore an
inherent limitation of the structure of the work history data and may pose a risk of misclassification in the
exposure reconstruction.
In summary, this study delineates an exposure reconstruction procedure by linking industrial
hygiene monitoring data and work history records from a new cohort of US refinery workers. This
exposure reconstruction will be important for future epidemiologic investigations of the cohort to assess
the health effects associated with benzene. For example, the cumulative exposure estimates generated
from the reconstructed exposure can be paired with mortality data obtained through the National Death
Index to investigate whether long-term occupational exposure to contemporary levels of benzene
is associated with all-cause or cause-specific mortality. Similarly, cancer incidence data
obtained through state cancer registries can be linked to the cohort to monitor leukemia incidence.
Additionally, the exposure reconstruction thus far has been developed using exposure data that is
informative of daily averages in order to estimate long-term exposures. The linkage scheme developed in
this work can be further extended with additional data to investigate the role of peak benzene exposures
in health effects. Continuing to follow the Four Refinery cohort along these lines will provide
valuable new insights into the health effects of occupational benzene exposure.
III. METABOLISM OF OCCUPATIONAL BENZENE EXPOSURE
A. Background
Benzene is a toxic hydrocarbon that serves as an essential component for the manufacture of
many synthetic organic chemicals. Because of its widespread use, exposure is possible through
occupational and environmental pathways[43]. Occupational exposure takes place in petroleum refining,
rubber manufacturing, and other industrial processes[44]. The most common sources of environmental
exposure include tobacco smoke, fossil fuel combustion, and trace amounts in a diverse array of
consumer products[9, 45, 46]. Both occupational and environmental exposures to benzene are associated
with a range of hematotoxic effects. Immediate exposure results in damage to hematopoietic progenitor
cells and can manifest as alterations in hematologic parameters[47]. Prolonged exposure can lead to
aplastic anemia, leukemia, and other hematologic cancers[48].
It is generally accepted that the hematotoxic and leukemogenic effects of benzene require
metabolism and subsequent generation of reactive oxygen species[49]. Figure 10 presents a simplified
schematic of benzene metabolism, which takes place primarily in the liver. The first step in metabolism is
the conversion of benzene into benzene oxide. This is the only known entry point into metabolism and is
catalyzed by cytochrome P450 2E1 (CYP2E1)[18, 19]. CYP2E1 also catalyzes several other downstream
metabolic reactions. From benzene oxide, metabolism can proceed down several different pathways. First,
it can continue through detoxification pathways which are mediated primarily by glutathione S-
transferase theta-1 (GSTT1)[20]. Second, it can be converted to muconic acid by CYP2E1[21]. Finally, it
can proceed down a pathway that ultimately yields reactive oxygen species (ROS)-generating
metabolites. This pathway starts with the nonenzymatic rearrangement of benzene oxide into phenol.
From there, phenol is converted to either hydroquinone or catechol by CYP2E1. Catechol and
hydroquinone are then further oxidized by myeloperoxidase (MPO) to produce 1,2- and 1,4-
benzoquinone, respectively[22]. In both cases, the reverse reaction is catalyzed by NADPH:quinone
oxidoreductase (NQO1)[23]. The benzoquinones produced by MPO accumulate in the bone marrow
where they promote the production of ROS. ROS cause damage to DNA, tubulin, histone proteins and
Figure 10. Overview of benzene metabolism

other DNA associated proteins in bone marrow stem cells and early progenitor cells[24]. Accumulation of
oxidative damage reduces the capacity of these progenitor cells to produce circulating blood cells
(hematotoxicity) and eventually leads to oncogenic transformation (leukemogenicity).
Given that the enzymes involved in metabolism ultimately lead to the production of ROS, it has
been hypothesized that functional polymorphisms in the upstream genes predispose some individuals to
benzene toxicity. Specifically, there is a growing body of evidence to suggest that polymorphisms in
CYP2E1, GSTT1, MPO and NQO1 may modify the toxic effects of benzene[22]. For example,
individuals with a variant allele of MPO were found to be at lower risk of hematotoxicity due to benzene
exposure, presumably because diminished function of MPO leads to less production of toxic
benzoquinones[50]. Similarly, individuals with a variant allele of NQO1 were found to be at higher risk
of hematotoxicity, likely because diminished function of NQO1 leads to more accumulation of
benzoquinones. In the case of GSTT1, individuals with a null phenotype lacking any GSTT1 function
were found to be at higher risk of benzene poisoning likely due to decreased efficiency in detoxification
pathways[51].
Because the toxic effects of benzene are related to its metabolism, studying the kinetics of
metabolite production is important to understand the onset of adverse effects in the context of
occupational exposure. A handful of studies have reported the relationship between benzene exposure and
the urinary concentrations of phenol, catechol, hydroquinone and muconic acid in occupationally exposed
workers. One prominent example is the study of 390 factory workers in Tianjin, China, where benzene
exposure was the result of shoe manufacturing processes[52–59]. In the Tianjin study, post-shift urinary
concentrations of benzene metabolites were modelled as a function of air benzene concentrations obtained
for each worker. Models were constructed using linear regression with natural splines to characterize the
dose-response relationship between benzene and each of the metabolites assessed[52]. These models were
extended to test if the dose-response curves were different among individuals who had differing
polymorphisms in genes related to benzene metabolism. The Tianjin study provided valuable initial
insights into the kinetics of benzene metabolism because it is the only study to characterize the production
of benzene metabolites across a wide range of exposures and suggest that production may be different
among genetic subpopulations. In order to investigate this phenomenon in a separate population, the current
study uses the Shanghai Health Study to investigate the dose-response relationship between benzene
exposure and the production of metabolites as well as to determine if there is effect modification via
genetic polymorphisms.
B. Methods
1. Study Population
This analysis uses data from the Shanghai Health Study, which is a cross-sectional study of
workers from five factories in and around Shanghai, China[47, 60, 61]. Each factory contained different
manufacturing processes: two specialized in rubber production, one specialized in shoe production, one
specialized in insulation production and one specialized in pharmaceutical synthesis. All of the factories
used benzene in their production processes. Data was collected on a total of 1046 workers in several on-
site visits between September 2003 and June 2007.
2. Exposure Assessment
Occupational exposure to benzene was estimated for the entire study sample through a
combination of direct assessment in a random subsample and imputation. First, all of the 1046 workers
were assigned to one of 133 similar exposure groups (SEGs). These SEGs were designed to classify
workers based on similar job, location, task, work schedule and materials used. Next, a random sample of
734 workers wore an organic vapor badge on their lapel during their work shift to sample workplace
benzene concentrations. Badges were analyzed at the Fudan University School of Public Health
Analytical Laboratory in accordance with NIOSH Methods 1501 and 4000. The limit of detection (LOD)
for benzene was 0.1 mg/m3; observations below the LOD were imputed as the LOD
divided by the square root of two. Several samples were collected for each worker across multiple shifts, with
an average of four shifts assessed per worker (range: 1 to 14 shifts). For the sampled
workers (n = 734), their individual-level benzene measurement was the arithmetic mean of their samples.
For the workers who were not sampled (n = 312), their individual-level benzene measurement was
imputed as the arithmetic mean of all samples within their corresponding SEG.
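The assignment rules described above (substitution of LOD divided by the square root of two for non-detects, a per-worker arithmetic mean for sampled workers, and an SEG-level mean for unsampled workers) can be sketched in a few lines of Python. This is a minimal sketch, assuming a simple dictionary layout; the function names, worker identifiers, and SEG labels are hypothetical and are not taken from the study's data management code.

```python
import math
from statistics import mean

LOD = 0.1  # mg/m3, limit of detection for benzene reported in the text


def substitute_lod(value, lod=LOD):
    """Return LOD / sqrt(2) for a non-detect (None or below LOD), else the value."""
    if value is None or value < lod:
        return lod / math.sqrt(2)
    return value


def assign_exposures(samples, seg_of_worker):
    """Assign one benzene estimate (mg/m3) to every worker.

    samples: dict worker_id -> list of badge measurements (sampled workers only).
    seg_of_worker: dict worker_id -> SEG label, covering every worker.
    """
    cleaned = {
        worker: [substitute_lod(v) for v in values]
        for worker, values in samples.items()
        if values
    }

    # Pool all samples collected within each SEG.
    seg_samples = {}
    for worker, values in cleaned.items():
        seg_samples.setdefault(seg_of_worker[worker], []).extend(values)

    estimates = {}
    for worker, seg in seg_of_worker.items():
        if worker in cleaned:
            estimates[worker] = mean(cleaned[worker])  # worker's own samples
        elif seg in seg_samples:
            estimates[worker] = mean(seg_samples[seg])  # mean of all SEG samples
        else:
            estimates[worker] = None  # no samples available in this SEG
    return estimates


# Hypothetical example: w1 and w2 were sampled, w3 was not.
example = assign_exposures(
    {"w1": [0.05, 0.3], "w2": [1.0]},
    {"w1": "SEG-1", "w2": "SEG-1", "w3": "SEG-1"},
)
print(example)
```

Note that the SEG mean here pools individual samples rather than averaging worker-level means, following the wording "all samples within their corresponding SEG".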
3. Urinary Metabolites
A total of 142 workers from the initial sample of 1046 had blood and urine specimens collected
after one of their work shifts. The urine specimens were analyzed at a laboratory using gas
chromatography-mass spectrometry (GC-MS) to determine the urinary concentrations of four benzene
metabolites: phenol, catechol, hydroquinone and muconic acid. GC-MS was performed according to
previously described protocols using a PerkinElmer Autosystem XL GC/Turbo MS with an
autosampler.
4. Genetic Polymorphisms
The blood specimens collected were used to assess genetic polymorphisms associated with
benzene metabolism. First, genomic DNA was isolated from blood using a Qiagen QIAamp DNA
isolation kit according to the manufacturer’s protocol. Next, the presence of specific single nucleotide
polymorphisms (SNPs) was assessed in genomic DNA by restriction fragment length polymorphism
analysis. A total of four SNPs associated with benzene metabolism were assessed and are summarized in
Table III. The NQO1 C609T variant allele is predicted to be more toxic because loss-of-function of
NQO1 leads to higher levels of benzoquinones. The MPO G462A variant allele is predicted to be less
toxic because loss-of-function of MPO leads to lower levels of benzoquinones. The CYP2E1 C1019T
variant allele is predicted to be less toxic because loss-of-function of CYP2E1 leads to less benzene
entering several metabolic pathways. Finally, the GSTT1 null allele is predicted to be more toxic because
loss-of-function of GSTT1 leads to fewer benzene intermediates entering detoxification pathways.
5. Other Study Variables
The study also contained a questionnaire component that was administered in-person by a trained
interviewer prior to blood and urine specimen collection. The questionnaire queried a broad range of
topics and this analysis used information on age, sex, body mass index (BMI), current alcohol use and
current tobacco use. Urinary cotinine was also used as a measure of smoking status.
TABLE III.
SINGLE NUCLEOTIDE POLYMORPHISMS ASSOCIATED
WITH BENZENE METABOLISM
Gene     Polymorphism   Effect
NQO1     C609T          Loss of function, accumulation of toxic benzoquinones
MPO      G462A          Loss of function, accumulation of benzoquinone precursors
CYP2E1   C1019T         Loss of function, less entry of benzene into metabolic pathways
GSTT1    null           Loss of function, less activity of detoxification pathways
6. Statistical Analysis
The goal of this study was to characterize the dose-response kinetics between occupational
benzene exposure and the production of urinary metabolites in a subsample of the Shanghai Health Study.
To understand the distributions of key variables in the analysis, descriptive statistics were calculated
overall and stratified by sex. Next, the distributions of benzene exposure and the four metabolites were
visualized to understand the appropriate statistical tests to use in the analysis and to determine if
transformations were required during modeling procedures. Finally, a preliminary assessment of
confounding within the association between benzene exposure and the production of urinary metabolites
was conducted. Specifically, the geometric means of the exposure and outcomes were calculated at each
level of relevant covariates to determine if the covariates were associated with both the exposure and
outcomes. Because the distributions of the exposure and outcomes were not normal, the Kruskal-Wallis
test was conducted as a non-parametric method to determine if the distributions were different across
levels of the covariates.
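The preliminary confounding assessment described above can be sketched as follows. This is a minimal illustration using only the Python standard library: the function names and the toy data are hypothetical, and the Kruskal-Wallis statistic is computed with midranks but without the tie correction. In practice a library routine such as `scipy.stats.kruskal` would return both the statistic and its p-value.

```python
from statistics import geometric_mean


def geometric_means_by_level(values, levels):
    """Geometric mean of `values` within each level of a covariate."""
    groups = {}
    for value, level in zip(values, levels):
        groups.setdefault(level, []).append(value)
    return {level: geometric_mean(vals) for level, vals in groups.items()}


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (midranks for ties, no tie correction).

    groups: list of lists, one list of observations per covariate level.
    Under the null hypothesis H is approximately chi-square distributed
    with len(groups) - 1 degrees of freedom.
    """
    pooled = sorted((v, g) for g, vals in enumerate(groups) for v in vals)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += midrank
        i = j
    weighted = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)


# Hypothetical benzene exposures (mg/m3) at two levels of a covariate.
exposure = [1.0, 100.0, 4.0, 9.0]
level = ["no", "no", "yes", "yes"]
print(geometric_means_by_level(exposure, level))
print(kruskal_wallis_h([[1.0, 100.0], [4.0, 9.0]]))
```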

After descriptive statistics summarized the study sample and preliminarily assessed for
confounding, the shape of the dose-response curve between benzene exposure and the production of
urinary metabolites was characterized using multivariable linear regression. For each metabolite, a
separate regression model was developed as a function of benzene exposure. All models were adjusted for
age, sex, BMI category, current alcohol use, current smoking status, urinary cotinine, and urinary
creatinine. Because the metabolites were not normally distributed, the metabolite values were log-
transformed to normalize residuals. Several transformations of benzene exposure were evaluated to
investigate flexibility and non-linearity in the shape of the dose-response. The final benzene specification
was selected based on a priori knowledge regarding the nature of metabolic kinetics and the adjusted R2
(the proportion of variability in the dependent variable explained by the model). The assumptions of
linear regression were also checked during model development by visualizing diagnostic plots:
1) a plot of the residuals versus the fitted values to confirm the absence of any distinct patterns in the
residuals; 2) a normal quantile-quantile plot to confirm that the residuals are normally distributed; and 3)
a plot of the square root of the standardized residuals versus the fitted values to confirm homoscedasticity.
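The model structure and the adjusted R2 criterion described above can be written compactly as follows. This is a sketch in notation introduced here for illustration only: the symbols M, B, Z, f, n and p do not appear in the source, and f stands for whichever benzene specification is being evaluated.

```latex
% M_i: urinary metabolite concentration for worker i
% B_i: benzene exposure; f: candidate transformation of benzene
% Z_i: adjustment covariates; n: observations; p: predictors (excluding intercept)
\log(M_i) = \beta_0 + f(B_i) + \boldsymbol{\gamma}^{\top}\mathbf{Z}_i + \varepsilon_i,
\qquad
R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
```

Because the adjusted R2 penalizes additional parameters, it allows specifications of different flexibility (for example, a linear term versus a spline) to be compared on a common footing.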
After finalized models were specified, the shape of the dose-response between benzene exposure
and the production of urinary metabolites was visualized by plotting model predictions. Predicted values
of metabolite concentrations were estimated within the range of benzene concentrations observed in the
study. Predictions were generated for men and women at the median values for the other covariates in the
model. To account for uncertainty in the sampling, models were bootstrapped from the data over 100
iterations and the 5th, 50th and 95th percentile for predictions were calculated.
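The bootstrap procedure described above can be illustrated with a simplified Python sketch. As a loudly stated simplification, the sketch fits a straight line to a single predictor in place of the spline model with covariates used in this study; the function names, the seed, and the toy data are hypothetical. Only the resampling and percentile logic are meant to correspond to the text.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100
    lower, upper = math.floor(pos), math.ceil(pos)
    frac = pos - lower
    return sorted_vals[lower] * (1 - frac) + sorted_vals[upper] * frac


def bootstrap_predictions(xs, ys, grid, n_boot=100, seed=0):
    """Return (5th, 50th, 95th) percentiles of predictions at each grid value."""
    rng = random.Random(seed)
    n = len(xs)
    predictions = [[] for _ in grid]
    for _ in range(n_boot):
        # Resample observations with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        boot_x = [xs[i] for i in idx]
        boot_y = [ys[i] for i in idx]
        if len(set(boot_x)) < 2:
            continue  # slope is undefined for a degenerate resample
        intercept, slope = fit_line(boot_x, boot_y)
        for k, g in enumerate(grid):
            predictions[k].append(intercept + slope * g)
    bands = []
    for preds in predictions:
        preds.sort()
        bands.append(tuple(percentile(preds, q) for q in (5, 50, 95)))
    return bands


# Hypothetical data: exposure versus log metabolite concentration.
x = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
y = [1.1, 1.4, 2.2, 2.9, 4.1, 5.0]
print(bootstrap_predictions(x, y, grid=[1.0, 10.0]))
```

Resampling whole observations, rather than residuals, keeps each worker's exposure and outcome paired, which is why sparse regions of the exposure range produce wider percentile bands.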
The presence of statistical interactions between benzene exposure and SNPs associated with
benzene metabolism was also evaluated in the finalized models. For each metabolite, two-way
interactions were tested individually between benzene exposure and each of the four SNPs. Tests for
interaction were performed by comparing the “full” model containing an interaction term to the “nested”
model without an interaction term using the F-test. The p-value threshold for interaction was set to 0.10
and adjusted for multiple comparisons using a Bonferroni correction. Therefore, the final p-value
threshold for each model was 0.10 / 4 SNPs = 0.025. Models with interactions were visualized by plotting
predicted dose-response curves between benzene exposure and the production of urinary metabolites at
each level of the SNP. To visualize uncertainty, bootstrapping was performed as previously described.
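The interaction test and the Bonferroni-adjusted threshold described above can be sketched in Python. This is a minimal sketch: the function names and the residual sums of squares in the example are hypothetical, and only the F statistic and its degrees of freedom are computed. The p-value would be obtained from the F distribution with those degrees of freedom and then compared with the adjusted threshold.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def nested_f_statistic(rss_nested, rss_full, n_params_nested, n_params_full, n_obs):
    """F statistic comparing a nested model with a full model.

    rss_*: residual sums of squares; n_params_*: number of estimated
    coefficients (including the intercept); n_obs: number of observations.
    Returns (F, numerator df, denominator df).
    """
    df_num = n_params_full - n_params_nested
    df_den = n_obs - n_params_full
    if df_num <= 0 or df_den <= 0:
        raise ValueError("full model must add parameters and leave residual df")
    f_stat = ((rss_nested - rss_full) / df_num) / (rss_full / df_den)
    return f_stat, df_num, df_den


# Threshold used in the text: 0.10 divided across four SNPs.
print(bonferroni_threshold(0.10, 4))

# Hypothetical example: the interaction adds two coefficients.
print(nested_f_statistic(120.0, 100.0, 10, 12, 112))
```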
C. Results
Table IV presents the characteristics of the study sample. A total of 185 participants from the
Shanghai Health Study had data on benzene exposure and urinary metabolite concentrations. The majority
of participants were male (n = 111), and males were older and had higher BMI than females. Self-reported
current alcohol use and current smoking status were also higher in males (70.3% vs. 5.4% and
79.3% vs. 0%, respectively). Although all females in the sample reported no current smoking, serum
cotinine was assessed as an objective marker of nicotine exposure to detect passive smoke exposure and
possible misclassification of self-reported smoking status. As expected, females had lower levels of serum
cotinine relative to males but the variability of serum cotinine within females suggests that self-reported
smoking status alone is not sufficient to capture information on tobacco smoke exposure. Table V
presents the distributions of SNPs associated with benzene metabolism within the study sample. Because
very few participants had a homozygous variant genotype for CYP2E1 and MPO, these SNPs were
collapsed into dichotomous variables for analysis. In contrast, all possible genotypes for NQO1 were
well-represented in the study sample and this SNP was considered as a three-level variable.
Figure 11 presents the overall distributions of benzene exposure and the urinary metabolites
(catechol, hydroquinone, phenol and muconic acid). These histograms demonstrate that the exposure and
the metabolites all have lognormal distributions. Because variables were lognormally distributed, Table
VI and Table VII summarize the exposure and metabolites using geometric means. The exposure and
the metabolites were all significantly higher in females. This is likely due to the fact that males and
females had different job titles at the factories and in general females performed tasks that were
associated with higher benzene exposure. Benzene exposure and the concentrations of metabolites all
appeared to decrease with alcohol use and serum cotinine quartiles (this relationship was significant for
benzene, phenol and muconic acid). Alcohol use is expected to be associated with metabolite
TABLE IV.
CHARACTERISTICS OF THE STUDY SAMPLE
Variable          Value              Overall       Male          Female
                                     (n = 185)     (n = 111)     (n = 74)
Age               <30                21 (11.4)     4 (3.6)       17 (23)
                  30 to <40          30 (16.2)     15 (13.5)     15 (20.3)
                  40 to <50          103 (55.7)    64 (57.7)     39 (52.7)
                  >=50               31 (16.8)     28 (25.2)     3 (4.1)
BMI               Underweight        11 (5.9)      4 (3.6)       7 (9.5)
                  Normal             127 (68.6)    72 (64.9)     55 (74.3)
                  Overweight         47 (25.4)     35 (31.5)     12 (16.2)
Current Smoking   No                 97 (52.4)     23 (20.7)     74 (100)
                  Yes                88 (47.6)     88 (79.3)     0 (0)
Serum Cotinine    0 to 10 ng/mL      25 (13.5)     6 (5.4)       19 (25.7)
                  10 to 100 ng/mL    70 (37.8)     16 (14.4)     54 (73)
                  >100 ng/mL         90 (48.6)     89 (80.2)     1 (1.4)
Current Alcohol   No                 103 (55.7)    33 (29.7)     70 (94.6)
                  Yes                82 (44.3)     78 (70.3)     4 (5.4)
TABLE V.
DISTRIBUTION OF GENETIC POLYMORPHISMS
Variable   Value           Overall       Male          Female
                           (n = 185)     (n = 111)     (n = 74)
NQO1       wt/wt           52 (28.1)     34 (30.6)     18 (24.3)
           wt/vt           95 (51.4)     55 (49.5)     40 (54.1)
           vt/vt           38 (20.5)     22 (19.8)     16 (21.6)
GSTT1      Positive        85 (45.9)     54 (48.6)     31 (41.9)
           Negative        100 (54.1)    57 (51.4)     43 (58.1)
CYP2E1     C1C1            111 (60)      69 (62.2)     42 (56.8)
           C1C2 or C2C2    74 (40)       42 (37.8)     32 (43.2)
MPO        G/G             135 (73)      84 (75.7)     51 (68.9)
           A/G or A/A      50 (27)       27 (24.3)     23 (31.1)
concentrations because benzene metabolism occurs in the liver and alcohol consumption inhibits liver
function. Tobacco smoke exposure is expected to be associated with higher levels of benzene and
metabolites because tobacco smoke is a significant source of benzene. Interestingly, the opposite
relationship was observed in the study sample. This is likely due to the fact that tobacco smoke exposure
is highly correlated with sex and females have higher exposure to occupational benzene.
The dose-response curves between benzene exposure and the production of urinary metabolites
were modelled using multivariable linear regression. Figure 12 depicts untransformed scatterplots and
log-space scatterplots to demonstrate two important aspects of the empirical relationship between benzene
exposure and metabolite concentrations. First, these plots confirm the previous finding that females have
higher levels of benzene and metabolites relative to males. Second, the plots illustrate skewed metabolite
Figure 11. Distributions of exposure and outcome
distributions and potential non-linearity in the dose-response curves. To normalize residuals during
regression procedures, the metabolites were log-transformed prior to modeling. To account for potential
non-linearity in the dose-response curve, different transformations of benzene were evaluated: 1)
untransformed benzene; 2) log-transformed benzene; 3) benzene specified with a natural spline using two
knots; and 4) log-transformed benzene with a natural spline using two knots. Models were developed
separately for each of the four metabolites and all models were adjusted for age, sex, BMI category, self-
reported current alcohol use, self-reported current smoking status, serum cotinine and urinary creatinine.
For all metabolites, the finalized models specified benzene using a natural spline with two knots based on
TABLE VI.
GEOMETRIC MEANS OF EXPOSURE AND METABOLITES
BY SEX AND BMI CATEGORY
                         Sex                                BMI Category
Compound   Overall       Male        Female      p          Underweight   Normal      Overweight   p
           (n = 185)     (n = 111)   (n = 74)               (n = 11)      (n = 127)   (n = 47)
BZ         6.96          5.03        11.32       0.0001     17.86         6.42        6.95         0.0751
PH         27.84         19.45       47.67       <0.0001    55.32         26.97       25.82        0.2414
CA         9.05          6.63        14.45       0.0005     19.82         8.55        8.79         0.0970
HQ         7.62          6.27        10.22       0.0050     14.59         6.96        8.39         0.1232
MA         4.80          3.30        8.40        0.0001     7.79          4.61        4.79         0.5402
meeting the assumptions for linear regression, maximizing the adjusted R2 and allowing flexibility in the
dose-response curve. Table VIII presents the parameter estimates for the finalized models. In each
model, the first benzene knot was significant but the second knot was only significant in the models
describing phenol and catechol. Both knots were kept in all models to preserve and evaluate any non-
linearity. Figure 13 presents the dose-response curves predicted from the finalized models among normal
BMI, non-smoking, non-drinking women at the female-specific median values of the other covariates. In
order to account for the sensitivity of the model to individual observed values, models were bootstrapped
100 times and each iteration was plotted along with the 5th, 50th and 95th percentile for predictions. These
plots demonstrate non-linearity in the dose-response curve that is consistent with enzyme-mediated
metabolic processes. Specifically, the rate of metabolism (in other words, the production of metabolite
TABLE VII.
GEOMETRIC MEANS OF EXPOSURE AND OUTCOMES
BY SERUM COTININE AND ALCOHOL USE
           Serum Cotinine                                                 Alcohol Use
Compound   Quartile 1   Quartile 2   Quartile 3   Quartile 4   p          No          Yes         p
           (n = 47)     (n = 46)     (n = 46)     (n = 46)                (n = 103)   (n = 82)
BZ         10.12        10.33        6.21         3.59         0.0020     8.98        5.05        0.0060
PH         37.31        44.40        21.89        16.46        0.0003     35.44       20.56       0.0048
CA         11.43        11.32        7.22         7.17         0.2271     10.74       7.31        0.1774
HQ         8.41         9.20         7.75         5.62         0.1487     8.25        6.91        0.2595
MA         6.30         7.44         4.29         2.63         0.0025     6.20        3.48        0.0103
per unit increase in benzene exposure) increases with dose until saturation. Saturation of metabolite
production becomes evident at roughly 70 mg/m3 of benzene exposure. The plots also demonstrate that
uncertainty increases at the upper range of exposure due to sparse data and influential data points.
Next, the presence of statistical interaction between benzene exposure and genetic polymorphisms was
assessed in the modelled dose-response curves. Table III summarizes the four SNPs that were evaluated.
The finalized models presented in Table VIII were compared to extended models that contained a two-
way interaction term between the benzene spline and one of the four SNPs (see Methods). Significant
interactions were detected between benzene and MPO for the dose-response curves modeling
hydroquinone and muconic acid. To visualize the interaction, Figure 14 presents the predictions from the
extended models stratified by MPO genotype. For both hydroquinone and muconic acid, MPO genotypes
containing a variant allele (A) appeared to have a steeper dose-response curve with earlier saturation.
Finally, we assessed the dose-response at low levels of benzene exposure. Thus far, we have investigated
the dose-response across the entire range of observed exposure (0.07 to 148.8 mg/m3 benzene). However,
the exposure and outcomes are log-normally distributed and about half of the observations
are below 10 mg/m3 benzene. Therefore, truncating the data and modeling low-dose exposure may
provide additional insights into the shape of the dose-response that can be obscured when modeling the
full range of exposure. We used muconic acid as the basis for truncation because this metabolite has
previously been validated as a reliable biomarker of low-dose benzene exposure. Figure 15 demonstrates
how deciles of muconic acid correspond to benzene exposure in the study population. We truncated the
data at the 40th percentile of muconic acid (1.67 mg/mL) for a final sample size of 56 individuals. Figure
16 illustrates the distributions of the exposure and outcomes after truncation. All variables are
lognormally distributed, indicating that transformations may be required during modeling of the dose-response.
Each of the metabolites was log-transformed, and models were adjusted for age, sex, BMI, alcohol
consumption, serum cotinine, smoking status, and urinary creatinine. Several benzene specifications were
evaluated: a linear term, a log-transformed term, or a spline with two knots. A spline with two knots was
Figure 12. Scatterplots of metabolite concentrations versus benzene exposure

TABLE VIII.
PARAMETER ESTIMATES FROM LINEAR REGRESSION MODELS
Term                           CA                    HQ                    MA                    PH
                               Beta       p          Beta       p          Beta       p          Beta       p
Benzene, knot 1                4.5555     <0.001     4.8227     <0.001     6.5016     <0.001     4.7407     <0.001
Benzene, knot 2                2.2496     <0.001     0.5023     0.275      0.2683     0.673      1.0420     0.025
Age                            -0.0004    0.943      0.0177     0.003      0.0155     0.058      0.0043     0.463
Sex: Female (reference)
Sex: Male                      -0.4065    0.011      -0.1611    0.318      -0.1429    0.522      -0.0828    0.608
BMI: Underweight (reference)
BMI: Normal                    -0.1870    0.368      -0.3641    0.088      0.2519     0.392      -0.0650    0.760
BMI: Overweight                -0.1526    0.503      -0.2779    0.234      0.2615     0.418      -0.1174    0.616
Cotinine                       0.000034   0.675      0.000012   0.883      0.000008   0.943      0.000114   0.175
Smoking: No (reference)
Smoking: Yes                   0.2909     0.087      0.0793     0.647      -0.0930    0.698      -0.3989    0.022
Alcohol: No (reference)
Alcohol: Yes                   0.0623     0.605      0.1392     0.259      -0.0461    0.787      0.0343     0.781
Creatinine                     0.0006     <0.001     0.0006     <0.001     0.0007     <0.001     0.0006     <0.001

Figure 13. Dose-response curves predicted from linear models

Figure 14. Genetic interactions in dose-response curves predicted from linear models
Figure 15. Benzene exposure by muconic acid deciles

Figure 16. Distributions of exposure and outcome at low exposure
selected based on the R-squared and ability to interrogate flexibility in the dose-response. Figure 17
presents the bootstrapped dose-response curves predicted from these models among normal BMI, non-
smoking, non-drinking women at the female-specific median values of the other covariates. The model
describing catechol provided little evidence for a dose-response with benzene exposure. In contrast, the
models describing hydroquinone, muconic acid, and phenol suggest a small increase in metabolite
production at low doses of benzene exposure, especially below 5 mg/m3. These finalized models were
evaluated for interactions with polymorphisms in NQO1, GSTT1, CYP2E1 and MPO; however, no
interactions were detected. These low-dose models suggest that there are small increases in metabolite
production at lower levels of benzene exposure, but that the majority of dose-dependent metabolism
occurs at higher doses.

Figure 17. Dose-response curves at low dose benzene exposure
D. Discussion
The toxic effects of benzene exposure are dependent on the production of reactive benzoquinones
during metabolism. Therefore, characterizing the relationship between exposure and markers of
metabolism is a promising approach to understanding the kinetics of toxicity. To investigate further, this
study modelled the dose-response relationship between occupational benzene exposure and the
production of metabolites using data from the Shanghai Health Study. Personal benzene exposure was
assessed using passive samplers and post-shift metabolite concentrations were assessed in urinary
specimens for a total of 185 factory workers. Four separate linear regression models were constructed to
characterize the association between benzene exposure and four metabolites: catechol, hydroquinone,
phenol and muconic acid. In all models the metabolite outcomes were log-transformed. To account for
potential non-linearity in the dose-response, benzene was specified using a natural spline with two knots.
All models were adjusted for age, sex, BMI category, self-reported current alcohol use, self-reported current
smoking status, serum cotinine, and urinary creatinine. The modeled relationships exhibited a clear dose-
response with non-linearities that are consistent with enzyme kinetics. Specifically, the rate of metabolite
production increased as benzene exposure increased until saturation, which appeared to be around 70
mg/m3 for all metabolites. The models were further extended to assess whether functional polymorphisms
in genes related to benzene metabolism modified the dose-response relationship. SNPs were assessed in
four genes that code for enzymes involved in benzene metabolism: NQO1, MPO, CYP2E1 and GSTT1
(Table III). Interactions were detected between benzene exposure and the MPO G463A polymorphism in
the dose-response for two metabolites: hydroquinone and muconic acid. In both cases, individuals with
the variant allele were predicted to have higher urinary metabolite concentrations with increasing benzene
exposure, relative to individuals with the wild-type genotype.
We also assessed the shape of the dose-response at low levels of benzene exposure. Separate
models were constructed after truncating the data at the 40th percentile of muconic acid concentration. By
truncating the data and constructing separate models, we were able to independently evaluate the effects
of low dose exposure that may have been obscured in our full-range analysis. At low levels of exposure,
we observed no association between benzene exposure and production of catechol and phenol. In contrast,
benzene exposure was associated with increases in production of hydroquinone and muconic acid,
especially below 5 mg/m3. These findings reinforce that hydroquinone and muconic acid are reliable
markers of low-dose benzene exposure. Our findings further suggest that low-dose benzene exposure
favors the production of these two metabolites over catechol and phenol.
This study is based on the Shanghai Health Study cohort, which is one of only two cohorts to
investigate occupationally exposed workers and collect information on benzene exposure, urinary
metabolites and functional polymorphisms associated with benzene metabolism. The other cohort
comprised 250 benzene-exposed workers and 136 control workers from a total of five factories in
Tianjin, China[56]. Similar to the Shanghai Health Study, the Tianjin study collected data on personal
benzene exposure and urinary concentrations of benzene metabolites. To understand the kinetics of
benzene toxicity, linear regression models were used to characterize urinary metabolite concentrations as
a function of benzene exposure[52]. Similar to our analysis, the metabolite concentrations were
log-transformed to better satisfy the linear regression assumption of normally distributed residuals.
However, one critical difference is the specification of benzene exposure. To account for non-linearities
in the dose-response due to saturation of enzyme kinetics, both approaches specified benzene using a
natural spline. The Tianjin study took the additional step of log-transforming benzene exposure,
presumably to normalize the distribution. However, normalizing the distribution of the independent
variable is not a requirement of linear regression and can result in model misspecification. In the Tianjin
study, the combination of log-transformation and spline specification of the exposure further leads to
cumbersome and unintuitive model predictions. In our analysis, we compared models with four
specifications of benzene: 1) untransformed benzene; 2) log-transformed benzene; 3) untransformed
benzene with a natural spline; and 4) log-transformed benzene with a natural spline. Within our sample,
the model diagnostic plots demonstrate that untransformed benzene with a natural spline provided
comparable results in terms of normalized residuals and homoscedasticity without the complications of
additional transformations. Despite the differences in model specifications, both analyses reached similar
conclusions regarding the kinetics of metabolite production. Specifically, there was a strong
dose-response relationship between benzene exposure and urinary metabolite concentrations, and both
analyses observed saturation of metabolism at high levels of exposure. However, the results from the
Tianjin study are presented in log-space, which limits their interpretability.
The Tianjin study also assessed interactions between benzene exposure and functional
polymorphisms associated with benzene metabolism using the metabolite dose-response models
described above[59]. Both studies assessed the same SNPs in GSTT1, MPO, NQO1 and CYP2E1. In
the Tianjin study, interactions were detected for NQO1 (for all metabolite models) and CYP2E1 (for the
models describing muconic acid, hydroquinone, and phenol). Interestingly, our study only detected
interactions with MPO in the models describing hydroquinone and muconic acid production. These
inconsistencies may be due to the different modeling approaches implemented in the two studies. Both
studies have relatively small sample sizes and the differences in demographic characteristics and
distribution of SNP genotypes may also contribute to the different findings.
Within our study, individuals with a variant MPO allele were predicted to have higher urinary
concentrations of hydroquinone and muconic acid, relative to individuals with the wild-type genotype.
The increased accumulation of hydroquinone is consistent with the predicted behavior of this MPO
polymorphism. MPO is the enzyme that catalyzes the conversion of catechol and hydroquinone to 1,2-
and 1,4-benzoquinone, respectively[62]. The G463A polymorphism is presumed to translate to a variant
form of MPO that has decreased activity to catalyze this reaction[47]. As a result, this polymorphism is
predicted to result in decreased production of ROS-generating benzoquinones and increased accumulation
of more stable upstream intermediates such as hydroquinone. Interestingly, increased muconic acid
production was also observed with MPO G463A despite the fact that it is not directly in the metabolic
pathway catalyzed by MPO. It is possible that compensatory mechanisms within benzene metabolism
direct intermediates towards the production of muconic acid when other pathways are not as active. To
date, only one study has characterized the health effects of MPO G463A[59]. The study was an extension
of the Tianjin cohort that investigated changes in blood counts associated with benzene exposure.
Individuals with the wild-type MPO genotype had a greater decrease in white blood cell counts due to
benzene exposure compared to individuals who had a variant G463A allele. Our findings support these
observations and further suggest that the decrease in hematotoxicity among individuals with the variant
G463A allele is due to increased accumulation of more stable upstream metabolites presumably at the
expense of ROS-generating benzoquinone production.
Our study has important limitations that should be considered when interpreting the results. The
first limitation is the relatively small sample size. A total of 185 benzene-exposed workers were observed
in this study and their exposures ranged from 0.05 mg/m3 to 150 mg/m3. Saturation became evident after
75 mg/m3; however, data started to become sparse at these high levels as well. To account for uncertainty,
models were bootstrapped over 100 iterations to generate median predictions and 95% confidence
intervals. Figure 13 plots these results and illustrates increased uncertainty as exposure increases. In some
cases, particular data points are highly influential on the predicted trend. Therefore, caution must be
applied when making conclusions about the point at which saturation of metabolism is achieved. Another
limitation is the potential for co-exposures in this study population. Although benzene was the principal
solvent used in the manufacturing processes in the factories of the Shanghai Health Study, there is the
possibility of co-exposure to other volatile organic chemicals[61]. Potential co-exposures could influence
the metabolic processes assessed in this study.
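The percentile bootstrap described in the limitations above can be sketched as follows. This is an illustration only and not the code used in this study: a simple straight-line fit stands in for the spline models, and the function names (fit_line, percentile, bootstrap_prediction) and the default seed are hypothetical choices made for this sketch.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 1]) of a sorted list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def bootstrap_prediction(xs, ys, x_new, n_boot=100, seed=0):
    """Median and 95% percentile interval of the predicted y at x_new."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_boot):
        # Resample observations (rows) with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # slope is undefined for a degenerate resample
        a, b = fit_line(bx, by)
        preds.append(a + b * x_new)
    preds.sort()
    return percentile(preds, 0.5), percentile(preds, 0.025), percentile(preds, 0.975)
```

Because each resample refits the model, a few influential observations at sparse high exposures can widen the interval considerably, which is the behavior noted for the upper end of the exposure range.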
In summary, our study modelled the dose-response relationship between occupational benzene
exposure and the production of metabolites in a cross-sectional sample of factory workers. Additionally,
interactions were detected between benzene exposure and the MPO G463A polymorphism, which supports previous findings that the variant
allele is protective against the hematotoxic effects of benzene. These findings further elucidate the
kinetics of benzene toxicity which can help to inform benzene regulatory standards and protect workers.
IV. HEMATOLOGIC EFFECTS OF ENVIRONMENTAL BTEXS EXPOSURE
A. Background
Benzene, toluene, ethylbenzene, xylene, and styrene (commonly referred to as BTEXS) are a
prevalent mixture of hazardous volatile organic chemicals that share several exposure pathways. Sources
of exposure include refineries, petrochemical plants, dry cleaners, gasoline stations, and combustion and
evaporative emissions from gasoline and diesel vehicles[63–66]. Because of their use as fuel additives,
solvents and industrial intermediates, BTEXS can also be found in a variety of consumer products
including paints, adhesives, cleaning agents, deodorizers, and personal care products[67]. Finally, BTEXS
are also major constituents of tobacco smoke; therefore, smoking remains the most prevalent source of
exposure in the United States[68]. Given the diverse array of exposure pathways, BTEXS exposure has
the potential to adversely affect human health in the context of both occupational and environmental
exposure.
The hazardous effects of the individual elements of BTEXS are well-characterized. With regard
to carcinogenicity, benzene is a known carcinogen implicated in acute myelogenous leukemia and
ethylbenzene is a potential carcinogen[61, 69–71]. Toluene, ethylbenzene, xylene and styrene are also
associated with respiratory, neurological, hepatic and reproductive effects[72–75]. Furthermore, benzene
and toluene are associated with a variety of hematological effects including changes in blood cell counts,
damage to hematopoietic progenitor cells and the destruction of bone marrow[60, 76, 77]. The majority of
this evidence is derived from studies focused on occupational exposure and there are few studies
assessing the health effects of environmental exposure to BTEXS owing to complications in study design.
Occupational studies tend to focus on controlled processes where the exposure is well-characterized and
co-exposures are minimized. In contrast, because all BTEXS elements share common environmental
exposure pathways, environmental exposure tends to occur as a highly correlated mixture. Attempts to
characterize the effects of each BTEXS element in the context of a correlated mixture by incorporating
separate regression parameters tend to encounter problems of collinearity and inflated standard errors.
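The inflation of standard errors under collinearity can be illustrated with the variance inflation factor. The sketch below assumes the simplest case of only two exposures in the model, with r taken as the Pearson correlation between them; the correlation values shown are hypothetical and are not estimates from any study discussed here.

```python
def variance_inflation_factor(r):
    """VIF for one of two predictors whose Pearson correlation is r."""
    return 1.0 / (1.0 - r ** 2)


# Hypothetical correlations, chosen only to show how quickly precision degrades.
for r in (0.0, 0.6, 0.9):
    vif = variance_inflation_factor(r)
    print(f"r = {r:.1f}: variance x{vif:.2f}, standard error x{vif ** 0.5:.2f}")
```

With more than two correlated exposures, the factor for each exposure depends on how well it is predicted by all of the others jointly, so the inflation is typically larger than the pairwise value suggests.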
One recent example that illustrates the challenges in studying the health effects of environmental
BTEXS exposure is an investigation from the Gulf Long-term Follow-up (GuLF) Study[78]. The study
population comprised residents of the United States Gulf states (Alabama, Florida, Louisiana, Mississippi, and
eastern Texas) between May 2011 and May 2013 who were potentially exposed to BTEXS as a
consequence of the Deepwater Horizon oil spill[79]. Blood samples were collected to determine if blood
BTEXS concentrations were associated with changes in hematologic parameters. The authors noted
strong correlations between the individual BTEXS elements, especially among smokers. They also
observed an inverse association between blood benzene concentrations and hemoglobin among non-smokers
and positive associations between several BTEXS elements and several hematologic parameters
among smokers. This analysis used generalized linear models to characterize the effects of BTEXS;
however, only single-exposure models were used. Therefore, the effect of each BTEXS element in the
context of the overall mixture was not assessed. Given the highly correlated nature of the BTEXS
mixture, single-exposure models are unable to disentangle the effects attributable to each of the mixture
elements. Ideally, the BTEXS elements should be mutually adjusted for one another; however, the authors
noted that attempting to mutually adjust for highly collinear variables in a single regression model would
result in highly imprecise effect estimates.
The issue of highly collinear co-exposures is an emerging problem in environmental
epidemiology. Recently, the National Institute of Environmental Health Sciences has highlighted the
need to move beyond the traditional approaches of testing one chemical at a time and instead conduct
research designed to evaluate a mixture of co-exposures[80]. These approaches should be able to not only
quantify the health effects of the overall mixture but also identify key constituents that are driving the
association. Several methods have been developed to assess mixtures including principal component
analysis and LASSO regression as well as newer techniques such as quantile g-computation and
Bayesian kernel machine regression[81–83]. Few studies have used these techniques in the context of
BTEXS exposure. In this analysis, we implemented modern mixture methods to evaluate the association
between environmental exposure to BTEXS and hematologic parameters among a subset of participants
in the National Health and Nutrition Examination Survey (NHANES).
B. Methods
1. Study Population
The National Health and Nutrition Examination Survey is a nationally representative cross-
sectional survey designed to assess the health and nutritional status of the general US population.
Interviews were conducted in participants’ homes by trained interviewers to collect information on
demographic characteristics, health history and health-related behaviors. Health-related measurements
and collection of biospecimens were performed at mobile examination centers. In this analysis, we used
the 1999-2000 and 2001-2002 survey cycles and analyzed data from adults aged 20 to 59 years old.
2. Assessment of VOC exposure
Measures of blood benzene, toluene, ethylbenzene, xylene, and styrene concentrations were
obtained in a nationally representative subsample containing approximately one-fourth of participants
aged 20 to 59 years old. Detailed information regarding whole blood biospecimen collection, laboratory
methods to determine VOC concentrations and quality control processes can be found in the NHANES
laboratory procedure manuals. The limits of detection for the VOCs were different between the 1999-
2000 and 2001-2002 survey cycles: benzene (0.045 and 0.024 ng/mL for the 1999-2000 and 2001-2002
survey cycles, respectively), toluene (0.033, 0.025 ng/mL), ethylbenzene (0.014, 0.016 ng/mL), xylene
(0.029, 0.033 ng/mL) and styrene (0.009, 0.015 ng/mL). For analytical results below the limit of
detection, the value was imputed as the limit of detection divided by the square root of two. For this
analysis, the percentage of sample below the limit of detection was 18.6% for benzene, 3.7% for toluene,
24.2% for ethylbenzene, 3.5% for xylene, and 22.5% for styrene.
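The substitution rule for results below the limit of detection can be sketched as follows, using the cycle-specific limits listed above. The function name and the use of None to flag a non-detect are hypothetical conventions adopted for this illustration; they do not reflect how the NHANES files encode these results.

```python
import math

# Limits of detection (ng/mL) by survey cycle, as reported in the text above.
LOD = {
    "benzene":      {"1999-2000": 0.045, "2001-2002": 0.024},
    "toluene":      {"1999-2000": 0.033, "2001-2002": 0.025},
    "ethylbenzene": {"1999-2000": 0.014, "2001-2002": 0.016},
    "xylene":       {"1999-2000": 0.029, "2001-2002": 0.033},
    "styrene":      {"1999-2000": 0.009, "2001-2002": 0.015},
}


def impute_below_lod(value, voc, cycle):
    """Return the measured value if detected, otherwise LOD / sqrt(2).

    `value` is None when the result was reported as below the limit of
    detection (a hypothetical encoding chosen for this sketch).
    """
    lod = LOD[voc][cycle]
    if value is None or value < lod:
        return lod / math.sqrt(2)
    return value
```

For example, a benzene non-detect in the 2001-2002 cycle would be assigned 0.024 / sqrt(2), or approximately 0.017 ng/mL.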
3. Assessment of Blood Parameters
Measures of blood parameters from a complete blood count were obtained in all participants aged
one year and older. Detailed information regarding whole blood biospecimen collection, laboratory
methods to determine complete blood count parameters and quality control processes can be found in the
NHANES laboratory procedure manuals. For this analysis, the following parameters were used from the
complete blood count: white blood cell count (thousands of cells per uL of blood), red blood cell count
(millions of cells per uL of blood), hemoglobin (grams per dL of blood) and platelet count (thousands of
cells per uL of blood).
4. Assessment of Covariates
Several covariates that may be associated with VOC exposure, blood parameters or both were
also evaluated in this analysis. The participants’ demographic characteristics were ascertained from
questionnaires conducted during household interviews. These characteristics include sex (male or
female), race/ethnicity (non-Hispanic white, non-Hispanic black, Hispanic or other race), age (20 to 29,
30 to 39, 40 to 49 or 50 to 59) and annual family income (less than $20,000, $20,000 to $44,999, $45,000 to
$74,999 and greater than $74,999). Questionnaire data were also used to assess alcohol and tobacco
exposure. Current alcohol use (yes, no) was defined as having at least 12 drinks in the past year. Smoking
(current, former, never) was defined through a combination of the questions “Do you now smoke
cigarettes?” and “Have you smoked at least 100 cigarettes in your entire life?”. Cumulative pack-years of
smoking for current smokers was defined as the difference between current age and the age at which
regular smoking started, multiplied by the average number of packs smoked in the past month.
Cumulative pack-years of smoking for former smokers was defined as the difference between the age at
which regular smoking stopped and the age at which regular smoking started, multiplied by the average
number of packs smoked per month as a former smoker. Household smoking (yes, no) was defined
through the question “Does anyone smoke in the home?”. Weight and height were measured using a
digital scale and stadiometer, respectively. BMI was then derived by dividing weight by height squared.
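The derived covariates described above can be written out as a short sketch. Two assumptions are made here that are not stated in the text: weight and height are taken to be in kilograms and meters, and the smoking intensity is interpreted as the average number of packs smoked per day. The function names are hypothetical.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight divided by height squared (kg/m^2 assumed)."""
    return weight_kg / height_m ** 2


def pack_years_current(age_now, age_started, packs_per_day):
    """Years of regular smoking up to the current age, times packs per day (assumed unit)."""
    return (age_now - age_started) * packs_per_day


def pack_years_former(age_stopped, age_started, packs_per_day):
    """Years between starting and stopping regular smoking, times packs per day (assumed unit)."""
    return (age_stopped - age_started) * packs_per_day
```

Under these assumptions, a 40-year-old current smoker who started at age 20 and smokes half a pack per day would have accumulated 10 pack-years.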
5. Statistical Analysis
In this study, we utilized several mixture methods that do not currently accommodate the
complex survey design implemented in NHANES. We therefore did not use the survey weights in any
analysis. As a consequence, the findings from this study cannot be generalized to the general US
population but nevertheless provide valuable mechanistic insights on the health effects of VOC mixture
exposure. To understand the outcome and exposure variables, histograms were generated to visualize
their distributions and to determine if variables were approximately normal or lognormal. For an initial
assessment of confounding, means of the exposures and outcomes were calculated at each level of
covariates of interest to assess for empirical associations. For normally distributed variables, arithmetic
means were calculated and associations were tested using analysis of variance (ANOVA). For
lognormally distributed variables, geometric means were calculated and associations were tested using the
Kruskal-Wallis test. Prior to investigating the health effects of VOCs as a mixture, the effect of each VOC
was assessed individually in single-exposure multivariable linear regression models. All models were
adjusted for age, sex, BMI category, annual family income, current alcohol use, smoking status,
cumulative pack-years smoking and household smoking. To account for potential non-linearities in the
dose-response, each exposure was specified as nominal quartiles and effect sizes were estimated relative
to the first quartile. A test for trend was also performed by specifying ordinal quartiles. To assess the
effects of VOC as a mixture, two statistical approaches were implemented: quantile g-computation
(qgComp) and Bayesian kernel machine regression (BKMR)[84, 85]. qgComp uses g-computation to
estimate the joint effects of a mixture and generate a set of weights that describes the contribution of each
mixture element to the overall effect estimate. These weights can take both positive and negative values,
which is a distinct advantage over other quantile-based mixture methods that instead require an
assumption of directional homogeneity. For each outcome, weights were calculated and the overall
mixture effect was quantified after 1000 bootstrapped iterations. BKMR uses a flexible kernel
function to specify the unknown exposure-outcome function without any parametric assumptions. This
allows for a wide range of relationships, including non-linear associations and interactions. In this
analysis, VOCs were log-transformed and a Gaussian kernel was specified for each of the blood
parameter outcomes. After fitting the final model by running a Markov chain Monte Carlo sampler for
50,000 iterations, the posterior inclusion probability for each VOC was generated and estimates of the
overall exposure-outcome function were produced.
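The core of quantile g-computation in its simplest linear, additive form can be sketched as follows. This is an illustration under stated assumptions and not the implementation used in this analysis: it omits covariate adjustment and bootstrapping, assigns quantiles by rank rather than by cutpoints, and fits the model with a small hand-written least-squares solver. All function names are hypothetical.

```python
def quantile_scores(values, q=4):
    """Map each value to a quantile index 0..q-1 by rank (ties broken by order)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    scores = [0] * len(values)
    for rank, i in enumerate(order):
        scores[i] = rank * q // len(values)
    return scores


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(columns, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(p)]
    return solve(xtx, xty)


def qgcomp_linear(exposures, y, q=4):
    """Return (psi, weights) for a dict mapping exposure name to measured values.

    psi is the expected change in the outcome when every exposure increases
    by one quantile; it equals the sum of the coefficients on the scores.
    """
    names = list(exposures)
    scored = [quantile_scores(exposures[name], q) for name in names]
    betas = dict(zip(names, ols(scored, y)[1:]))
    psi = sum(betas.values())
    pos = sum(b for b in betas.values() if b > 0)
    neg = sum(-b for b in betas.values() if b < 0)
    # Positive weights sum to 1; negative weights sum to -1.
    weights = {name: (b / pos if b > 0 else (b / neg if b < 0 else 0.0))
               for name, b in betas.items()}
    return psi, weights
```

Because the weights are computed separately within the positive and negative directions, an exposure with a large weight may still contribute little if the summed coefficients in its direction are small.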

C. Results
Table IX shows the details of the study sample selection process. A total of 21,004 participants
were available for the 1999-2000 and 2001-2002 survey cycles. Participants were excluded if they had
missing survey weights for the VOC subsample, missing covariate information or missing outcome and
exposure data. Participants were also excluded if they were prescribed medications associated with the
outcomes or had pre-existing cancer, anemia, or history of blood transfusions. Table X presents the
unweighted characteristics of the final study sample. The majority of participants were white and male.
Overall, 50.5% of the sample reported never smoking, 76.3% reported no household smoking and
86.1% reported current alcohol use.
TABLE IX.
DETERMINATION OF SAMPLE SIZE
Stage n
Total, 2000 and 2002 survey cycles 21004

Non-missing weights 2300
Non-missing covariates 1579
Non-missing outcome 1577
Non-missing all exposures 731
After medication/condition exclusions 649
TABLE X.
CHARACTERISTICS OF THE STUDY SAMPLE
Variable Value n %
Overall 649 100.0
Sex Male 334 51.5

Female 315 48.5
Age Group 20 to 29 191 29.4

30 to 39 168 25.9
40 to 49 175 27.0
50 to 59 115 17.7
Ethnicity White 338 52.1

Black 99 15.3
Hispanic 192 29.6
Other 20 3.1
BMI Group Normal 255 39.3

Overweight 232 35.7
Obese 162 25.0
Income Less than $20K 148 22.8

$20K to $45K 199 30.7
$45K to <$75K 160 24.7
Greater than $75K 142 21.9
Smoking Status Never 328 50.5

Former 114 17.6
Current 207 31.9
Smoking Pack-Years 0 pack-years 336 51.8

>0 to 15 pack-years 218 33.6
>15 pack-years 95 14.6
Family Smoking No 495 76.3

Yes 154 23.7
Alcohol Use No 90 13.9

Yes 559 86.1
Figure 18 illustrates the distributions of BTEXS blood concentrations in the sample. All exposure
variables demonstrated lognormal distributions. The overall geometric means for benzene, toluene,
ethylbenzene, xylene and styrene were 0.051 (min of 0.017 and max of 1.02), 0.147 (0.017, 5.10), 0.034
(0.010, 0.949), 0.142 (0.024, 3.60), and 0.034 (0.006, 1.90), respectively. Figure 19 illustrates the
distributions of the blood parameter outcomes in the sample. All outcome variables demonstrated normal
distributions.
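For lognormally distributed concentrations, the geometric mean is the exponential of the mean of the log-transformed values. A minimal sketch is shown below; the function name is hypothetical and the example values are not study data.

```python
import math


def geometric_mean(values):
    """Exponentiated mean of the natural logs; all values must be positive."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical concentrations (ng/mL) spanning two orders of magnitude.
print(geometric_mean([0.01, 0.1, 1.0]))
```

Unlike the arithmetic mean, this summary is not dominated by the few very high concentrations in the right tail of the distribution.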
For an initial assessment of confounding, the means of the BTEXS blood concentrations and the
blood parameters were calculated at each level of relevant covariates to determine if these covariates are
Figure 18. Distributions of BTEXS blood concentrations

Figure 19. Distributions of blood parameters
associated with the exposures and the outcomes. Table XI presents the geometric means of BTEXS blood
concentrations. All measures of tobacco smoke exposure (smoking status, cumulative smoking pack-years
and household smoking) were associated with higher levels of all BTEXS blood concentrations. Alcohol
use was associated with higher levels of benzene, ethylbenzene, and xylene. Females had lower levels of
toluene and ethylbenzene. Older age was associated with higher levels of benzene, toluene, ethylbenzene,
and xylene. Higher income was associated with lower levels of benzene and toluene. Finally, higher BMI
was associated with lower levels of ethylbenzene and xylene. Table XII presents the arithmetic means of
the blood parameters assessed in this study. All blood parameters were associated with sex and BMI.
Smoking status and cumulative smoking pack-years were both associated with white blood cell count, red
TABLE XI.
GEOMETRIC MEANS OF EXPOSURES BY SELECTED COVARIATES
Variable Value Benzene (ng/mL) Toluene (ng/mL) Ethylbenzene (ng/mL) Xylene (ng/mL) Styrene (ng/mL)
Sex Male 0.053 0.165 0.037 0.151 0.036

Female 0.048 0.130 0.032 0.133 0.032
p-value 0.5049 0.017 0.0265 0.2384 0.0846
Age Group 20 to 29 0.046 0.132 0.031 0.131 0.033

30 to 39 0.044 0.131 0.033 0.130 0.033
40 to 49 0.057 0.158 0.037 0.152 0.037
50 to 59 0.060 0.185 0.040 0.167 0.035
p-value 0.031 0.0343 0.0065 0.0041 0.1836
Ethnicity White 0.053 0.152 0.036 0.148 0.036

Black 0.058 0.161 0.034 0.135 0.039
Hispanic 0.044 0.131 0.032 0.135 0.030
Other 0.055 0.157 0.035 0.159 0.036
p-value 0.4368 0.5436 0.4726 0.3868 0.0268
BMI Group Normal 0.055 0.153 0.038 0.154 0.036

Overweight 0.048 0.153 0.035 0.147 0.035
Obese 0.047 0.131 0.029 0.119 0.032
p-value 0.2411 0.2066 0.0041 0.0056 0.3453
Income Less than $20K 0.058 0.165 0.037 0.149 0.037

$20K to $45K 0.056 0.160 0.035 0.143 0.036
$45K to <$75K 0.047 0.137 0.035 0.140 0.033
Greater than $75K 0.040 0.124 0.030 0.137 0.031
p-value 0.0068 0.1049 0.2171 0.8829 0.2652
Smoking Status Never 0.034 0.100 0.027 0.116 0.027

Former 0.043 0.126 0.033 0.150 0.033
Current 0.105 0.293 0.054 0.190 0.053
p-value <.0001 <.0001 <.0001 <.0001 <.0001
Smoking Pack-Years 0 pack-years 0.034 0.101 0.027 0.117 0.027

>0 to 15 pack-years 0.060 0.169 0.039 0.158 0.039
>15 pack-years 0.142 0.397 0.065 0.224 0.061
p-value <.0001 <.0001 <.0001 <.0001 <.0001
Family Smoking No 0.039 0.118 0.030 0.128 0.030

Yes 0.112 0.295 0.054 0.197 0.052
p-value <.0001 <.0001 <.0001 <.0001 <.0001
Alcohol Use No 0.041 0.122 0.030 0.119 0.032

Yes 0.052 0.151 0.035 0.146 0.035
p-value 0.0206 0.0764 0.0229 0.0492 0.2365
p-value from Kruskal-Wallis test

TABLE XII.
ARITHMETIC MEANS OF OUTCOME BY SELECTED COVARIATES
Variable Value WBC Count (1000 cells/uL) RBC Count (million cells/uL) Hemoglobin (g/dL) Platelet (1000 cells/uL)
Sex Male 6.80 5.09 15.58 254.12
Female 7.58 4.36 13.30 284.96
p-value <.0001 <.0001 <.0001 <.0001
Age Group 20 to 29 7.72 4.73 14.42 277.60

30 to 39 7.37 4.72 14.41 267.91
40 to 49 6.84 4.78 14.54 270.86
50 to 59 6.51 4.70 14.54 253.96
p-value <.0001 0.5678 0.8231 0.0187
Ethnicity White 7.37 4.73 14.61 269.51

Black 6.24 4.64 13.77 256.42
Hispanic 7.25 4.80 14.63 271.98
Other 7.78 4.72 14.12 296.75
p-value <.0001 0.1258 <.0001 0.0462
BMI Group Normal 6.85 4.65 14.34 261.14

Overweight 7.24 4.81 14.70 266.50
Obese 7.60 4.75 14.35 285.30
p-value 0.0026 0.0029 0.0330 0.0006
SES Group Less than $20K 7.41 4.82 14.69 273.26

$20K to $45K 7.24 4.68 14.34 265.91
$45K to <$75K 7.21 4.72 14.41 273.01
Greater than $75K 6.81 4.73 14.50 264.76
p-value 0.1131 0.1218 0.2251 0.5031
Smoking Status Never 7.16 4.68 14.16 273.00

Former 6.67 4.72 14.46 261.69
Current 7.49 4.83 14.97 266.96
p-value 0.0055 0.0033 <.0001 0.2282
Pack-Years 0 pack-years 7.15 4.68 14.16 273.10

>0 to 15 pack-years 6.90 4.77 14.66 263.58
>15 pack-years 7.92 4.86 15.14 267.55
p-value 0.0007 0.0061 <.0001 0.2267
Family Smoking No 7.01 4.73 14.38 268.80

Yes 7.70 4.74 14.77 270.00
p-value 0.0006 0.8679 0.0096 0.8399
Alcohol Use No 7.67 4.63 13.90 280.13

Yes 7.10 4.75 14.56 267.31
p-value 0.0206 0.0551 0.0003 0.0785
p-value from ANOVA
blood cell count and hemoglobin. Household smoking was only associated with white blood cell count
and hemoglobin. Older age was associated with lower white blood cell count and lower platelet count.
Finally, race/ethnicity was associated with all blood parameters except for red blood cell count.
Figure 20 illustrates the correlation between individual elements in the BTEXS mixture, overall
and stratified by smoking status. The correlations between the elements in the overall study sample are
high, with Spearman rank correlations ranging between 0.601 and 0.873. Because tobacco smoke is a
major source of BTEXS exposure, the correlations are even higher among current smokers, with
Spearman rank correlations ranging between 0.762 and 0.923. Given the high correlation between
elements in the BTEXS mixture, this figure illustrates the need for mixture methods to appropriately
quantify overall health effects and to disentangle individual contributions. Figure 21 illustrates the
correlation between the different blood parameter outcomes assessed in this analysis. Not surprisingly,
red blood cell count and hemoglobin were highly positively correlated with each other. Platelet count and
white blood cell count were also weakly positively correlated with each other.
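The Spearman rank correlation reported above is the Pearson correlation computed on ranks, with tied values sharing their average rank. Ties are relevant here because results below the limit of detection were all imputed to the same value. The following is a minimal sketch with hypothetical function names, not the code used to produce Figure 20.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the last position holding the same value as position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks are used, the coefficient is unchanged by the log transformation applied to the concentrations elsewhere in the analysis.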
Before using mixture methods to understand the overall effects of the BTEXS mixture, effects of
the individual elements were quantified in adjusted single-exposure models. Based on the results from the
initial assessment of confounding, all models were adjusted for sex, age, ethnicity, BMI category, annual
household income, current alcohol use, smoking status, cumulative smoking pack-years and household
smoking. Exposures were specified using nominal quartiles to evaluate the shape of the dose-response
relative to the first quartile. Exposures were also specified using ordinal quartiles to test for a monotonic
trend in the dose-response relationship. Figure 22 to Figure 25 summarize the results for the single-
exposure models for each of the blood parameter outcomes. For white blood cell count (Figure 22) and
platelet count (Figure 25), none of the exposures provided evidence of a dose-response relationship. For
red blood cell count (Figure 23) and hemoglobin (Figure 24), the single-exposure models identified
positive dose-response relationships for benzene, toluene, ethylbenzene, and styrene. Given the effects of
BTEXS on hematologic parameters when each component is assessed individually, we also tested
whether estimates changed when all BTEXS components were specified in a single linear regression model.
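The two quartile specifications used in these models differ only in how the quartile index enters the regression. The sketch below is illustrative: the function names are hypothetical, and the cutpoints come from the Python standard library's default quantile method, which may differ slightly from the method used in this analysis.

```python
import statistics


def quartile_cutpoints(values):
    """The three cutpoints (25th, 50th and 75th percentiles) of the sample."""
    return statistics.quantiles(values, n=4)


def quartile_of(value, cuts):
    """Quartile index 0..3: the number of cutpoints the value exceeds."""
    return sum(value > c for c in cuts)


def nominal_coding(q):
    """Indicator variables for quartiles 2 to 4; the first quartile is the reference."""
    return [int(q == k) for k in (1, 2, 3)]


def ordinal_coding(q):
    """Single trend score; its coefficient is the change per one-quartile increase."""
    return q
```

The nominal specification estimates a separate contrast for each upper quartile and so can reveal a non-monotonic shape, whereas the ordinal specification spends a single degree of freedom on a test for monotonic trend.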
Figure 20. Correlation between BTEXS elements

Figure 21. Correlation between blood parameters
Figure 26 to Figure 29 summarize the results for the single linear regression model specifying all
BTEXS components. Exposures were specified using nominal quartiles to evaluate the shape of the dose-
response relative to the first quartile and specified using ordinal quartiles to test for a monotonic trend.
When exposures were specified as nominal quartiles, none were significant relative to the first quartile.
When exposures were specified as ordinal quartiles, none of the exposures were significant. This further
illustrates the challenges surrounding investigating correlated mixtures: although individual components
demonstrated strong associations with hematologic parameters, mutual adjustment for correlated
covariates inflates standard errors and obscures attempts to characterize independent effects. These results
provide some initial information about the health effects of BTEXS; however, given the correlation
Figure 22. Single-exposure linear regression results for white blood cells
Figure 23. Single-exposure linear regression results for red blood cells
Figure 24. Single-exposure linear regression results for hemoglobin
Figure 25. Single-exposure linear regression results for platelet count

Figure 26. Multiple-exposure linear regression results for white blood cells
Figure 27. Multiple-exposure linear regression results for red blood cells
Figure 28. Multiple-exposure linear regression results for hemoglobin
Figure 29. Multiple-exposure linear regression results for platelet count

structure, mixture methods are required to understand the overall mixture effect and identify contributions
from each BTEXS element.
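The inflation of standard errors under mutual adjustment can be illustrated with the variance inflation factor (VIF). The sketch below uses the textbook two-predictor case, in which the VIF equals 1 / (1 - r^2); it is a simplification, since the multiple-exposure models in this chapter contain five exposures. The function name is hypothetical, and r = 0.83 is the ethylbenzene-xylene correlation reported for nonsmokers later in this chapter.

```python
import math


def variance_inflation_factor(r):
    """VIF for one of two predictors whose pairwise correlation is r."""
    return 1.0 / (1.0 - r ** 2)


# r = 0.83: ethylbenzene-xylene correlation among nonsmokers (from the text).
vif = variance_inflation_factor(0.83)
# The standard error grows by the square root of the VIF.
se_ratio = math.sqrt(vif)
print(round(vif, 2), round(se_ratio, 2))
```

Under these assumptions, a correlation of 0.83 alone roughly triples the variance of each coefficient relative to a model with uncorrelated exposures, which is consistent with the loss of significance seen after mutual adjustment.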
The first mixture method implemented in this analysis was qgComp[85]. qgComp calculates
weights for each mixture element that are proportional to its respective contribution to the overall effect.
Each mixture element is assigned either a positive or negative weight depending on the direction of its
effect in the context of the overall mixture. The absolute values of the negative weights and of the
positive weights each sum to one. These weights are then used to calculate a parameter estimate for the overall mixture effect.
The direction and associated p-value of the parameter estimate can then be used to evaluate the mixture
effect. Figure 30 presents the qgComp results for each of the four outcomes. Similar to the results from
the single-exposure models, qgComp did not identify any significant overall mixture effects for the white
blood cell count and platelet count outcomes. In contrast, the overall mixture was significantly positively
associated with red blood cell count and hemoglobin. In both cases benzene and toluene were found to
positively contribute to these associations, with toluene consistently being the highest contributor. Styrene
was also positively weighted for both outcomes; however, the magnitude of the weight was negligible.
Ethylbenzene was the only element whose direction of weighting was different between red blood cell
count (positive) and hemoglobin (negative). These qgComp results suggest that exposure to the BTEXS
mixture is associated with increases in red blood cell count and hemoglobin and that these associations are
driven primarily by benzene and toluene.
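The weighting scheme described above can be sketched as follows, assuming a linear, additive model in which each exposure has been converted to quantile scores. Under that assumption, the overall mixture estimate is the sum of the exposure coefficients, and each weight is a coefficient's share of the total in its own direction. The function name and the coefficient values are hypothetical and are not estimates from this analysis.

```python
def qgcomp_weights(coefs):
    """Return (positive weights, negative weights, overall mixture estimate)."""
    pos_total = sum(b for b in coefs.values() if b > 0)
    neg_total = sum(b for b in coefs.values() if b < 0)
    # Each weight is the coefficient's share of the total in its direction,
    # so the positive weights sum to one and the negative weights sum to one.
    pos = {k: b / pos_total for k, b in coefs.items() if b > 0}
    neg = {k: b / neg_total for k, b in coefs.items() if b < 0}
    # Expected change in the outcome when every exposure rises by one quantile.
    psi = sum(coefs.values())
    return pos, neg, psi


# Hypothetical coefficients on quantized exposures, for illustration only.
coefs = {"benzene": 0.03, "toluene": 0.05, "ethylbenzene": -0.01,
         "xylene": -0.03, "styrene": 0.02}
pos, neg, psi = qgcomp_weights(coefs)
print(pos, neg, round(psi, 2))
```

Because weights are normalized within each direction, a large weight indicates a large share of the positive (or negative) partial effect, not necessarily a large absolute effect.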
The second mixture method used in this analysis was BKMR[84]. BKMR fits a flexible kernel
function for each element that can be used to interpret effects of the overall mixture. First, the method
generates posterior inclusion probabilities for each mixture element that indicate how much the data
favor the inclusion of that element when modelling an outcome. Second, the overall risk summary of the
mixture can be determined by evaluating changes in the outcome as all kernel functions increase
simultaneously. Third, the individual exposure-responses in the context of the mixture can be determined
by evaluating changes in the outcome as one kernel function increases and all others are held constant.
Figure 31, Figure 32, and Figure 33 present the BKMR results for white blood cell count, platelet count,
Figure 30. Association between BTEXS and blood parameters assessed by qgComp
Figure 31. Association between BTEXS and WBC count assessed by BKMR
Figure 32. Association between BTEXS and platelet count assessed by BKMR
Figure 33. Association between BTEXS and RBC count assessed by BKMR
and red blood cell count, respectively. In all cases, all of the BTEXS elements had low posterior inclusion
probabilities, suggesting that none of the elements were useful in modeling these outcomes. The overall
risk summaries further suggest that there is no clear dose-response relationship between the BTEXS
mixture and these outcomes. Figure 34 presents the BKMR results for hemoglobin. The overall risk
summary suggested a positive linear dose-response relationship between BTEXS mixture exposure and
hemoglobin, although all confidence intervals included the null. In the context of the BTEXS mixture,
toluene was the only element to have a relatively high posterior inclusion probability and its individual
exposure-response relationship was positive and non-linear. These BKMR results suggest that exposure to
the BTEXS mixture is associated with an increase in hemoglobin and that toluene is the major contributor
to this association.
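A posterior inclusion probability can be understood as the fraction of posterior draws in which an element is selected into the model. The sketch below illustrates that calculation only; it does not fit a BKMR model. The function name and the 0/1 indicator draws are hypothetical.

```python
def posterior_inclusion_probabilities(draws):
    """Average the 0/1 inclusion indicators over posterior draws, per element."""
    n = len(draws)
    return {name: sum(d[name] for d in draws) / n for name in draws[0]}


# Hypothetical inclusion indicators from four posterior draws (1 = included).
draws = [
    {"benzene": 0, "toluene": 1},
    {"benzene": 0, "toluene": 1},
    {"benzene": 1, "toluene": 1},
    {"benzene": 0, "toluene": 0},
]
pips = posterior_inclusion_probabilities(draws)
print(pips)
```

An element that is selected in most draws, as toluene is in this toy example, has a posterior inclusion probability near one and is interpreted as important for modelling the outcome.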
Finally, we performed the mixture analysis stratified by smoking status. Personal tobacco smoke
exposure is a major source of BTEXS; therefore, it is possible that the correlation structure of the BTEXS
mixture is qualitatively different based on smoking status. This could potentially lead to different effects
among current smokers vs. nonsmokers. Figure 20 previously depicted the BTEXS correlation structure
among current smokers and nonsmokers. Overall, pairwise correlations were stronger in current smokers.
Whereas the pairwise correlation between ethylbenzene and xylene was the only one above 0.8 in
nonsmokers (correlation coefficient = 0.83), eight out of ten pairwise correlations were above 0.8 in
current smokers. We therefore performed the qgComp and BKMR mixture analyses separately among
nonsmokers (n = 442) and current smokers (n = 207). Figures 35 and 36 present the qgComp results for
each of the four outcomes in nonsmokers and current smokers, respectively. Similar to the results from
the overall qgComp analysis, the overall mixture was significantly positively associated with hemoglobin
and marginally positively associated with red blood cell count in nonsmokers. In both cases, toluene was
the highest positive contributor to these associations. Results were also similar in the smoking stratum,
with the overall BTEXS mixture marginally positively associated with hemoglobin and red blood cell
count. The modest sample size (n = 207 compared to n = 649 in the overall analysis) may contribute to
unstable estimates and the inability to detect any significant associations. Nevertheless, the qualitatively
Figure 34. Association between BTEXS and hemoglobin assessed by BKMR

Figure 35. Association between BTEXS and blood parameters assessed by qgComp among
nonsmokers
Figure 36. Association between BTEXS and blood parameters assessed by qgComp among current
smokers
similar findings in the two strata suggest that BTEXS affects hematologic parameters similarly in
nonsmokers and current smokers.
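The comparison of correlation structures between strata rests on counting how many of the ten possible BTEXS pairs exceed a threshold of 0.8. A minimal sketch of that count for a single stratum is shown below. It assumes Pearson correlations, which may differ from the coefficient used for Figure 20, and the function names and concentrations are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def count_strong_pairs(data, threshold=0.8):
    """Return (number of pairs with r above threshold, total number of pairs)."""
    pairs = list(combinations(sorted(data), 2))
    strong = [p for p in pairs if pearson(data[p[0]], data[p[1]]) > threshold]
    return len(strong), len(pairs)


# Hypothetical concentrations for one stratum, for illustration only.
stratum = {
    "benzene": [1, 2, 3, 4, 5],
    "toluene": [2, 4, 6, 8, 10],
    "ethylbenzene": [5, 4, 3, 2, 1],
    "xylene": [1, 2, 3, 4, 6],
    "styrene": [1, -1, 0, -1, 1],
}
n_strong, n_pairs = count_strong_pairs(stratum)
print(n_strong, n_pairs)
```

Applying the same function separately to the nonsmoking and smoking strata would reproduce the kind of comparison reported above (one versus eight of ten pairs).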
Bayesian kernel machine regression analysis was also stratified by smoking status. Figure 37
presents the BKMR results in the nonsmoking stratum for white blood cell count, platelet count, red blood
cell count, and hemoglobin. Results were generally similar between the total population and the
nonsmoking stratum. Specifically, the overall risk summary suggested a positive linear dose-response
relationship between BTEXS mixture and hemoglobin, although all confidence intervals included the
null. In this instance, only toluene was identified as having a relatively high posterior inclusion
probability. Figure 38 presents the BKMR results in the smoking stratum for white blood cell count,
platelet count, red blood cell count, and hemoglobin. Here the results were different between the total
population and the smoking stratum. For both the white blood cell count and platelet count outcomes, all
five BTEXS components had modest posterior inclusion probabilities. In both cases, the overall risk
summary suggested a modest inverse U-shaped association with decreases in white blood cell counts and
platelet counts at higher exposure levels. This suggests a non-linear relationship between BTEXS and
both white blood cell count and platelet count that is specific to current smokers.
D. Discussion
Benzene, toluene, ethylbenzene, xylene, and styrene (BTEXS) are a collection of toxic volatile organic
chemicals that pose substantial risks to human health. These chemicals are found in industrial
intermediates, cleaning products, fossil fuel combustion and tobacco smoke[63–65, 67]. Environmental
exposure is prevalent and occurs as a correlated mixture. Few studies have assessed the health effects of
environmental exposure to BTEXS, and those that have were limited in their ability to address the
methodological issues inherent to correlated mixtures. In this study, we evaluated the association between
environmental exposure to BTEXS and hematologic parameters among a subset of participants in
NHANES. We specifically focused on implementing mixture methods to understand the overall health
effects of BTEXS exposure and to further identify the individual VOCs that are driving these effects.
Figure 37. Association between BTEXS and hematologic parameters assessed by BKMR among
nonsmokers
Figure 38. Association between BTEXS and hematologic parameters assessed by BKMR among
smokers
We divided our analysis into three components. First, we individually characterized the
association between each of the BTEXS elements and hematologic parameters using separate single-
exposure regression models. Second and third, we implemented qgComp and BKMR to characterize the
overall mixture effects as well as the effects of each of the BTEXS elements within the context of the
mixture. In the single-exposure analysis, increases in red blood cell count and hemoglobin were detected
for benzene, toluene, ethylbenzene, and styrene exposure. In the qgComp analysis, the overall BTEXS
mixture was associated with increases in red blood cell count and hemoglobin. For both of these
outcomes, benzene and toluene contributed to these associations and toluene consistently had the highest
positive weight. In the BKMR analysis, the overall BTEXS mixture was associated with an increase in
hemoglobin and toluene was the only element to have a relatively high posterior inclusion probability.
There are several similarities and differences in the findings derived from the different
approaches. All methods detected associations between some aspect of BTEXS and hemoglobin. Single-
exposure models and qgComp also detected associations between some aspect of BTEXS and red blood
cell count, which is not surprising because red blood cell count and hemoglobin are highly correlated
outcomes. Across all methods, toluene was also consistently identified as an important element in these
associations. Despite these similarities, there are some inconsistencies that highlight key differences in the
analytical approach of the different methods. It is clear that the single-exposure models were the most
liberal and detected associations with hematologic outcomes in four out of the five BTEXS elements
(benzene, toluene, ethylbenzene, and styrene). Without a framework to handle correlated exposures,
separately assessing associations for each of the correlated BTEXS elements unsurprisingly led to many
individual associations being detected. This finding highlights the need to implement mixture methods,
which can characterize the overall effect of BTEXS and the effects of each BTEXS element while
removing mutual confounding effects. After implementing two distinct mixture methods, there was
evidence that toluene is driving these associations. Another important issue was that qgComp and BKMR
identified different health effects as being associated with the overall BTEXS mixture. Specifically,
qgComp detected associations between overall BTEXS exposure and both red blood cell count and
hemoglobin, whereas BKMR only detected associations for hemoglobin. These inconsistencies may be
due to the different strategies used by the methods to quantify overall effects. qgComp uses
g-computation to calculate an overall exposure index that is defined by a set of weights. This exposure
index is then used in a regression model that is adjusted for covariates to quantify the overall effect of the
mixture. In contrast, BKMR defines a flexible kernel function for each mixture element and the overall
mixture effect is quantified by increasing all kernel functions simultaneously and fixing covariates at their
median values. The different steps each method takes to quantify overall effects may contribute to the
difference in findings.
We also performed mixture analyses stratified by smoking status. Personal tobacco smoke
exposure is a major source of BTEXS; therefore, it is possible that the effect of the BTEXS mixture on
hematologic parameters is different in nonsmokers vs. current smokers. The stratified results of the
qgComp analyses were generally similar to the results in the total population. Both strata provided
evidence of positive associations between the overall BTEXS mixture and both red blood cell count and
hemoglobin which were principally driven by toluene (however the association was only significant in the
nonsmoking stratum). In contrast, the stratified results of the BKMR analyses provided evidence of
different effects in the two strata. In the nonsmoking stratum, the results were similar to the qgComp
analysis and suggested a positive association between BTEXS and hemoglobin which was principally
driven by toluene. In the smoking stratum, we detected an inverse U-shaped association between BTEXS
and both white blood cell count and platelet count. In these cases, all BTEXS components had equal and
relatively modest posterior inclusion probabilities. The inconsistent findings in the smoking stratum of
the BKMR analysis may have several explanations. First, the inverse U-shaped association suggests that lower levels of BTEXS exposure
in smokers lead to a proliferative response whereas higher levels of exposure eventually lead to
depression of hematopoiesis. The higher correlation of the BTEXS components in current smokers vs.
nonsmokers suggests that synergistic effects within the mixture may be required for these nonlinear
effects. The increased flexibility in kernel functions may explain why this phenomenon was only detected
in BKMR. However, the population of current smokers in our analysis was relatively small, and unstable
estimates combined with kernel function overfitting may have led to spurious associations. More research
in a larger population of current smokers can help to clarify these findings.
Despite the differences between the methodologies we used, our consistent finding that the
overall BTEXS mixture was associated with changes in hemoglobin concentration is mechanistically
plausible. Metabolism of BTEXS occurs in the liver and bone marrow to form multiple reactive
metabolites that ultimately accumulate in the bone marrow[86]. These reactive metabolites are
cytotoxic to hematopoietic progenitor cells and mature blood cells[87]. Studies suggest that this
cytotoxicity has countervailing effects. Cytotoxicity can lead to depression of hematopoietic tissue that
can manifest as reduced blood cell counts[47, 76, 88–90]. Cytotoxicity can also lead to a
hyperinflammatory microenvironment that promotes erythropoiesis and leukocytosis that ultimately
manifest as increased blood cell counts[91, 92]. Our finding that environmental BTEXS exposure was
associated with an increase in hemoglobin concentration suggests that BTEXS exposure may promote
erythropoiesis.
Very few studies have investigated the role of environmental BTEXS exposure on hemoglobin
concentration. One example is a cross-sectional study that investigated the hematologic effects of BTEX
exposure (benzene, toluene, ethylbenzene and xylene) among residents near a petrochemical complex in
Nanjing, China[77]. A control group was also selected from a community that did not live near a
petrochemical complex. The exposed group had 1.2 to 6.7 times higher blood BTEX levels and single-
exposure models suggested that blood benzene concentrations were associated with lower levels of
hemoglobin concentration. Another example is the GuLF study that investigated hematologic effects
among a cohort of adults who were exposed to BTEXS during the Deepwater Horizon oil spill
cleanup[78]. Similar to the Nanjing study, single-exposure models suggested that blood benzene
concentration was associated with lower hemoglobin concentrations. In contrast to previous studies, one
important strength of our study is the implementation of mixture methods rather than relying exclusively
on single-exposure models. Single-exposure models assess each BTEXS element individually with no
framework to remove the confounding effects of highly correlated exposures. This difference may explain
why previous studies using single-exposure models primarily identified benzene as the major exposure
driving the association whereas our analysis identified toluene.
This study has important limitations to consider. First, it is unclear why BTEXS exposure was
associated with lower hemoglobin concentrations in previous studies, yet our analysis indicated it may be
associated with higher hemoglobin concentrations. One possibility is that the nature of the exposure is
qualitatively different between the studies. The Nanjing study and the GuLF study cohorts were
specifically constructed to capture environmental VOC exposure through proximity to petrochemical
operations and the Deepwater Horizon oil spill cleanup, respectively. In contrast, our analyses use a
subset of participants from the 1999-2002 NHANES survey cycles. The survey is designed to sample the
general US population and therefore VOC exposure is likely a combination of tobacco smoke, fossil fuel
combustion and cleaning products. The diffuse nature of these exposures may explain our observation
that BTEXS was associated with higher hemoglobin concentrations compared to other studies with more
explicit exposures that found the opposite results. Another limitation is that our analyses do not utilize
the NHANES survey weights. The survey implements a complex sampling design that oversamples
certain population subgroups and requires the use of provided survey weights to make inferences on the
general US population. The mixture methods used in these analyses currently do not have a framework
to appropriately handle survey weights; therefore, our results are not generalizable to the general US
population.
In summary, our study used qgComp and BKMR to understand the association between overall
BTEXS exposure and hematologic outcomes and to further understand what BTEXS elements are driving
these associations. We identified an association between BTEXS exposure and increased hemoglobin
concentration which was primarily driven by toluene. These findings are helpful for risk identification
efforts and provide mechanistic insight into the health effects of BTEXS exposure.
V. DISCUSSION
This work evaluated the risks occupational and environmental benzene exposure poses to human
health by combining an exposure assessment, a dose-response characterization of toxicokinetics, and an
epidemiological investigation using mixture methods. By focusing on three different aspects of benzene
research in different cohorts and exposure contexts, this work has been uniquely positioned to provide
diverse insights on occupational and environmental benzene exposure.
A. Summary and Discussion of Aims
Aim 1 of this work developed a methodology to quantitatively assess benzene exposure in the
Four Refinery cohort. Two data sources were utilized for this aim: work history records for the cohort
between 1979 and 2010 and benzene exposure measurements derived from an industrial hygiene
monitoring program between 1976 and 2007. To quantitatively estimate benzene exposure for each
worker, a model was developed that described the industrial hygiene data as a function of calendar year,
refinery location and job title. This model was then applied to individual work histories to reconstruct a
timeline of exposure and provide a quantitative estimate of cumulative exposure for each worker.
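The final step of this reconstruction, applying model predictions to a work history, can be sketched as a sum of predicted exposure intensity over each year worked. The example below is illustrative only: `predict_ppm` is a hypothetical stand-in for the fitted exposure model (it ignores refinery location for brevity), and the job titles, intensities and dates are invented.

```python
def predict_ppm(year, refinery, title):
    """Hypothetical stand-in for the fitted exposure model (ppm)."""
    base = {"operator": 0.4, "clerk": 0.1}[title]
    # Assume, for illustration, that exposure halved from 1990 onward.
    return base if year < 1990 else base / 2


def cumulative_exposure(work_history, model):
    """Sum predicted intensity over every year worked, in ppm-years."""
    total = 0.0
    for job in work_history:
        # Each job runs from its start year up to, but excluding, its end year.
        for year in range(job["start"], job["end"]):
            total += model(year, job["refinery"], job["title"])
    return total


# Hypothetical work history for one worker.
history = [
    {"refinery": "A", "title": "operator", "start": 1988, "end": 1992},
    {"refinery": "A", "title": "clerk", "start": 1992, "end": 1994},
]
total = cumulative_exposure(history, predict_ppm)
print(round(total, 2))
```

Summing by calendar year lets the predicted intensity change within a single job, which matters when exposure levels decline over time.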
This exposure reconstruction has the potential to add valuable new insights to the field of
occupational benzene exposure. Previous studies of occupational benzene exposure tend to focus on
relatively older cohorts. For example, the Pliofilm cohort dates back to the 1940s and has produced a
series of studies that serve as the foundation of the modern understanding of the health effects of benzene
exposure. The time frame of this cohort has two important consequences. First, the regulatory controls
that were in place during observation of the Pliofilm cohort were much weaker than the modern context.
This means that inferences drawn on the health effects of relatively high benzene exposure using these
earlier studies may not be appropriate to extrapolate to modern, much lower exposures. Second, the
Pliofilm cohort is severely limited by a paucity of exposure measurements over the study duration. This is
primarily due to the fact that industrial hygiene efforts were not as systematic and rigorous as modern
standards. As a result, sparse exposure data propagate uncertainty throughout the analysis and limit the utility of derived
findings. Aim 1 of this work uniquely addresses these limitations and establishes the Four Refinery
cohort, which is positioned to contribute valuable new insights to the current understanding of occupational
benzene exposure. First, by collecting data in a modern regulatory context, the Four Refinery cohort
observes refinery workers exposed to relatively low benzene levels. Following these workers over time
will provide a better understanding of the health effects of low benzene exposure and quantify the
protective effect of modern controls. Eventually, these efforts will help to determine if more controls are
required and detect the unique consequences of low-level exposure. Second, the industrial hygiene
monitoring program employed in the Four Refinery cohort provides much denser exposure coverage than
the Pliofilm cohort. Inferences drawn on the Four Refinery cohort will therefore be more stable and less
uncertain.
The work to quantitatively estimate benzene exposure in Aim 1 can be used in future studies to
advance our understanding of low-dose occupational benzene exposure. The Four Refinery cohort can be
linked to cancer registries or the National Death Index to follow the cohort for cancer incidence and
mortality, respectively. Combining these outcomes with the quantitative exposure estimates in
epidemiologic investigations can provide information on the association between occupational benzene
exposure and long-term health effects. Previous studies with similar designs have utilized cohorts with
relatively high exposures, therefore the Four Refinery cohort is uniquely positioned to provide new
understanding on low-dose occupational exposures.
Aim 2 of this work characterized the dose-response relationship between benzene exposure and
urinary metabolite production in the Shanghai Health Study, a cohort of factory workers in Shanghai,
China. Personal benzene exposure was assessed during the work shift and urinary metabolite
concentrations were assessed post-shift for a total of 185 factory workers. In total, four urinary
metabolites were assessed: catechol, hydroquinone, phenol, and muconic acid. These compounds were
selected as outcomes in dose-response analysis for two reasons. First, all four metabolites can be easily
assessed in post-shift urine samples and also have relatively short half-lives of 4 to 6 hours. This
means that post-shift urine samples can unambiguously assess metabolism of the corresponding workday
instead of more cumulative exposure. Second, these metabolites eventually give rise to toxic
benzoquinones. In turn, benzoquinones produce reactive oxygen species that accumulate in the bone
marrow and damage hematopoietic stem cells. Therefore, understanding the production of the urinary
metabolites for a given level of benzene exposure is the first step in understanding the possible burden of
toxic benzoquinones in the body and subsequent damage to hematologic tissue. For these two reasons,
catechol, hydroquinone, phenol, and muconic acid are ideal candidates to monitor the internal benzene
dose among occupationally exposed workers. In Aim 2, the dose-response relationship between air
benzene exposure and urinary metabolite concentrations was modelled using natural splines to capture
potential non-linearity. This work provides a thorough characterization of the relationship between
benzene exposure and urinary metabolites. Future benzene monitoring efforts can use this dose-response
information to establish a urinary monitoring program to track levels of metabolite production among
workers. Combining personal air monitoring with urinary metabolite monitoring can provide a more
complete picture of both external and internal benzene dose. Ideally, these types of programs will require
the definition of a threshold urinary metabolite concentration above which certain adverse hematologic
effects are expected. To help define this threshold, future investigations within the Shanghai Health Study
should focus on the relationship between urinary metabolite concentrations and hematologic effects such
as hematotoxicity and perturbations in hematologic parameters.
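The reasoning that short half-lives tie a post-shift sample to the same workday can be checked with a simple decay calculation. The sketch below assumes first-order elimination and a 16-hour gap between the end of one shift and the start of the next; both are assumptions made for illustration, since the text reports only the 4 to 6 hour half-life range.

```python
def fraction_remaining(hours, half_life):
    """Fraction of a metabolite left after `hours`, assuming first-order decay."""
    return 0.5 ** (hours / half_life)


# Hypothetical 16-hour gap between shifts, at both ends of the half-life range.
short = fraction_remaining(16, 4)
long_ = fraction_remaining(16, 6)
print(round(short, 4), round(long_, 4))
```

Under these assumptions, roughly 6% to 16% of the previous day's metabolite would remain at the start of the next shift, so a post-shift sample predominantly reflects the current day's exposure.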
The Shanghai Health Study cohort also contained information on genetic polymorphisms that are
associated with benzene metabolism. A secondary objective in Aim 2 was to assess for interactions
between these polymorphisms and the benzene-dependent production of urinary metabolites. We detected
interactions between benzene exposure and a variant MPO allele in the models describing hydroquinone
and muconic acid production. Individuals with the variant G463A MPO allele were predicted to have
higher urinary concentrations of hydroquinone and muconic acid. We hypothesize that the increase in
these upstream metabolites is at the expense of downstream ROS-generating benzoquinones and therefore
results in less toxicity. This hypothesis is consistent with other findings of the G463A MPO allele, which
demonstrated less hematotoxicity relative to the wild-type allele. Importantly, Aim 2 is one of only two
studies that combined a dose-response characterization of urinary metabolite production with information
on genetic polymorphisms associated with benzene metabolism.
Aim 3 of this work assessed the association between environmental exposure to a prevalent
mixture of volatile organic chemicals and hematologic parameters in a subsample of NHANES 1999-
2002. Blood concentrations of five volatile organic chemicals were assessed: benzene, toluene,
ethylbenzene, xylene, and styrene (BTEXS). Four hematologic parameters derived from a complete blood
count were assessed: white blood cell count, red blood cell count, hemoglobin, and platelet count.
Because the BTEXS mixture was highly correlated in our study sample, we implemented two mixture
methods to characterize their effects: qgComp and BKMR. These techniques mitigate the problems faced
by traditional regression approaches in studying a correlated mixture of exposures. Both qgComp and
BKMR characterize the overall effect of a mixture and can identify key mixture elements that are driving
the association. In the qgComp analysis, the overall BTEXS mixture was associated with increases in red
blood cell count and hemoglobin and for both of these outcomes, benzene and toluene were identified as
key mixture elements driving the association. In the BKMR analysis, the overall BTEXS mixture was
associated with increases in hemoglobin and toluene was identified as the key mixture element. The
consistency in findings between the two methods, despite their methodological differences, strengthens
our conclusions.
Environmental exposure to BTEXS is prevalent in the general US population through sources
such as tobacco smoke exposure, fossil fuel combustion, industrial manufacturing, and residual
contamination of consumer goods. In Aim 3, we characterized associations between environmental
BTEXS exposure and increases in red blood cell count and hemoglobin and further found that toluene was
primarily driving these associations. These findings are helpful to protect individuals from BTEXS
exposure. First, they identify red blood cell count and hemoglobin as readily accessible biomarkers of
BTEXS exposure. These outcomes can be targets for biomonitoring studies focusing on vulnerable
populations that are in proximity to BTEXS sources, such as industrial manufacturing and chemical
processing sites. Increases in red blood cell count and hemoglobin in these populations can be early
indicators of effects due to BTEXS exposure and prompt community interventions. Second, our work
identifies toluene as the primary BTEXS constituent that is driving these associations. Toluene can
therefore be a potential target to reduce emissions at point sources in order to improve air quality.
B. Conclusions
In conclusion, this work represents a body of research addressing key questions regarding
exposure to both occupational and environmental sources of benzene. Aim 1 performed an exposure
assessment in a cohort of refinery workers, Aim 2 characterized the dose-response of benzene
metabolism in a cohort of factory workers, and Aim 3 assessed the association between environmental
BTEXS exposure and hematologic parameters. The unique construction of this work to move from
exposure assessment, to metabolism, and finally to effects on human health provides an overarching and
integrated understanding of the risks posed by benzene exposure. These findings can be used to advance
our understanding of benzene exposure to protect human health in occupational and environmental
contexts.
CITED LITERATURE
1. Cogliano VJ, Baan R, Straif K, et al. Updating IARC’s carcinogenicity assessment of benzene. Am
J Ind Med 2011; 54: 165–7.
2. Fruscella W. Benzene. In: Kirk-Othmer Encyclopedia of Chemical Technology. Hoboken, NJ,
USA: John Wiley & Sons, Inc. Epub ahead of print 10 June 2002. DOI:
10.1002/0471238961.0205142606182119.a01.pub2.
3. Franck H-G, Stadelhofer JW. Production and uses of benzene derivatives. In: Industrial Aromatic
Chemistry. Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 132–235.
4. Glass D. Retrospective exposure assessment for benzene in the Australian petroleum industry. Ann
Occup Hyg 2000; 44: 301–320.
5. Verma DK, Johnson DM, McLean JD. Benzene and Total Hydrocarbon Exposures in the
Upstream Petroleum Oil and Gas Industry. AIHAJ - Am Ind Hyg Assoc 2000; 61: 255–263.
6. Verma DK, Johnson DM, Shaw ML, et al. Benzene and total hydrocarbons exposures in the
downstream petroleum industries. AIHAJ 2001; 62: 176–94.
7. Wixtrom RN, Brown SL. Individual and population exposures to gasoline. J Expo Anal Environ
Epidemiol; 2: 23–78.
8. Page NP, Mehlman M. Health effects of gasoline refueling vapors and measured exposures at
service stations. Toxicol Ind Health 1989; 5: 869–90.
9. Salviano dos Santos VP, Medeiros Salgado A, Guedes Torres A, et al. Benzene as a Chemical
Hazard in Processed Foods. Int J Food Sci 2015; 2015: 1–7.
10. Stenehjem JS, Kjærheim K, Bråtveit M, et al. Benzene exposure and risk of
lymphohaematopoietic cancers in 25 000 offshore oil industry workers. Br J Cancer 2015; 112:
1603–12.
11. Schnatter AR, Glass DC, Tang G, et al. Myelodysplastic syndrome and benzene exposure among
petroleum workers: an international pooled analysis. J Natl Cancer Inst 2012; 104: 1724–37.
12. Rothman N, Li G-L, Dosemeci M, et al. Hematotoxicity Among Chinese Workers Heavily
Exposed to Benzene Hematoxicity and Benzene. 1995.
13. Crump KS. Risk of benzene-induced leukemia: a sensitivity analysis of the pliofilm cohort with
additional follow-up and new exposure estimates. J Toxicol Environ Health 1994; 42: 219–42.
14. Williams PRD, Paustenbach DJ. Reconstruction of benzene exposure for the Pliofilm cohort
(1936-1976) using Monte Carlo techniques. J Toxicol Environ Health A 2003; 66: 677–781.
15. Rinsky RA, Hornung RW, Silver SR, et al. Benzene exposure and hematopoietic mortality: A
long-term epidemiologic risk assessment. Am J Ind Med 2002; 42: 474–80.
16. Infante PF, Rinsky RA, Wagoner JK, et al. Leukaemia in benzene workers. Lancet (London,
England) 1977; 2: 76–8.
17. Paustenbach DJ, Price PS, Ollison W, et al. Reevaluation of benzene exposure for the Pliofilm
(rubberworker) cohort (1936-1976). J Toxicol Environ Health 1992; 36: 177–231.
18. Lindstrom AB, Yeowell-O’Connell K, Waidyanatha S, et al. Investigation of benzene oxide in
bone marrow and other tissues of F344 rats following metabolism of benzene in vitro and in vivo.
Chem Biol Interact 1999; 122: 41–58.
19. Lindstrom AB, Yeowell-O’Connell K, Waidyanatha S, et al. Measurement of benzene oxide in the
blood of rats following administration of benzene. Carcinogenesis 1997; 18: 1637–41.
20. Zarth AT, Murphy SE, Hecht SS. Benzene oxide is a substrate for glutathione S-transferases.
Chem Biol Interact 2015; 242: 390–395.
21. Lin L-C, Chen W-J, Chiung Y-M, et al. Association between GST Genetic Polymorphism and
Dose-Related Production of Urinary Benzene Metabolite Markers, trans, trans-Muconic Acid and
S-Phenylmercapturic Acid. Cancer Epidemiol Biomarkers Prev 2008; 17: 1460–1469.
22. Dougherty D, Garte S, Barchowsky A, et al. NQO1, MPO, CYP2E1, GSTT1 and GSTM1
polymorphisms and biological effects of benzene exposure--a literature review. Toxicol Lett 2008;
182: 7–17.
23. Ross D. Functions and distribution of NQO1 in human bone marrow: Potential clues to benzene
toxicity. Chem Biol Interact 2005; 153–154: 137–146.
24. McDonald TA, Waidyanatha S, Rappaport SM. Production of benzoquinone adducts with
hemoglobin and bone-marrow proteins following administration of [ 13 C 6 ]benzene to rats.
Carcinogenesis 1993; 14: 1921–1925.
25. Yin SN, Li GL, Tain FD, et al. Leukaemia in benzene workers: a retrospective cohort study. Br J
Ind Med 1987; 44: 124–8.
26. Hayes RB, Yin SN, Dosemeci M, et al. Benzene and the dose-related incidence of hematologic
neoplasms in China. Chinese Academy of Preventive Medicine--National Cancer Institute
Benzene Study Group. J Natl Cancer Inst 1997; 89: 1065–71.
27. Hayes RB, Songnian Y, Dosemeci M, et al. Benzene and lymphohematopoietic malignancies in
humans. Am J Ind Med 2001; 40: 117–26.
28. Collins JJ, Anteau SE, Swaen GMH, et al. Lymphatic and hematopoietic cancers among benzene-
exposed workers. J Occup Environ Med 2015; 57: 159–63.
29. Bloemen LJ, Youk A, Bradley TD, et al. Lymphohaematopoietic cancer risk among chemical
workers exposed to benzene. Occup Environ Med 2004; 61: 270–4.
30. Rinsky RA, Young RJ, Smith AB. Leukemia in benzene workers. Am J Ind Med 1981; 2: 217–45.
31. Rinsky RA, Smith AB, Hornung R, et al. Benzene and leukemia. An epidemiologic risk
assessment. N Engl J Med 1987; 316: 1044–50.
32. Crump KS. On summarizing group exposures in risk assessment: is an arithmetic mean or a
geometric mean more appropriate? Risk Anal 1998; 18: 293–7.
33. Utterback DF, Rinsky RA. Benzene exposure assessment in rubber hydrochloride workers: a
critical evaluation of previous estimates. Am J Ind Med 1995; 27: 661–76.
34. Panko JM, Gaffney SH, Burns AM, et al. Occupational exposure to benzene at the ExxonMobil
refinery at Baton Rouge, Louisiana (1977-2005). J Occup Environ Hyg 2009; 6: 517–29.
35. Gaffney SH, Burns AM, Kreider ML, et al. Occupational exposure to benzene at the ExxonMobil
refinery in Beaumont, TX (1976-2007). Int J Hyg Environ Health 2010; 213: 285–301.
36. Kreider ML, Unice KM, Panko JM, et al. Benzene exposure in refinery workers: ExxonMobil
Joliet, Illinois, USA (1977-2006). Toxicol Ind Health 2010; 26: 671–90.
37. Gaffney SH, Panko JM, Unice KM, et al. Occupational exposure to benzene at the ExxonMobil
refinery in Baytown, TX (1978-2006). J Expo Sci Environ Epidemiol 2011; 21: 169–85.
38. Widner TE, Gaffney SH, Panko JM, et al. Airborne concentrations of benzene for dock workers at
the ExxonMobil refinery and chemical plant, Baton Rouge, Louisiana, USA (1977-2005). Scand J
Work Environ Health 2011; 37: 147–58.
39. Burns A, Shin JM, Unice KM, et al. Combined analysis of job and task benzene air exposures
among workers at four US refinery operations. Toxicol Ind Health 2017; 33: 193–210.
40. Baccarelli A, Pfeiffer R, Consonni D, et al. Handling of dioxin measurement data in the presence
of non-detectable values: overview of available methods and their application in the Seveso
chloracne study. Chemosphere 2005; 60: 898–906.
41. Huybrechts T, Thas O, Dewulf J, et al. How to estimate moments and quantiles of environmental
data sets with non-detected observations? A case study on volatile organic compounds in marine
water samples. J Chromatogr A 2002; 975: 123–33.
42. Hornung RW, Greife AL, Stayner LT, et al. Statistical model for prediction of retrospective
exposure to ethylene oxide in an occupational mortality study. Am J Ind Med 1994; 25: 825–36.
43. Wallace LA. Major sources of benzene exposure. Environ Health Perspect 1989; 82: 165–169.
44. Schnatter AR, Nicolich MJ, Bird MG. Determination of Leukemogenic Benzene Exposure
Concentrations: Refined Analyses of the Pliofilm Cohort. Risk Anal 1996; 16: 833–840.
45. Chaiklieng, Suggaravetsiri, Autrup. Risk Assessment on Benzene Exposure among Gasoline
Station Workers. Int J Environ Res Public Health 2019; 16: 2545.
46. Lorkiewicz P, Riggs DW, Keith RJ, et al. Comparison of Urinary Biomarkers of Exposure in
Humans Using Electronic Cigarettes, Combustible Cigarettes, and Smokeless Tobacco. Nicotine
Tob Res 2019; 21: 1228–1238.
47. Schnatter AR, Kerzic PJ, Zhou Y, et al. Peripheral blood effects in benzene-exposed
workers. Chem Biol Interact 2010; 184: 174–181.
48. Smith MT. Overview of benzene-induced aplastic anaemia. Eur J Haematol Suppl 1996; 60: 107–
10.
49. Snyder R. Xenobiotic Metabolism and the Mechanism(s) of Benzene Toxicity. Drug Metab Rev
2004; 36: 531–547.
50. Wan J, Shi J, Hui L, et al. Association of genetic polymorphisms in CYP2E1, MPO, NQO1,
GSTM1, and GSTT1 genes with benzene poisoning. Environ Health Perspect 2002; 110: 1213–
1218.
51. Nourozi MA, Neghab M, Bazzaz JT, et al. Association between polymorphism of GSTP1, GSTT1,
GSTM1 and CYP2E1 genes and susceptibility to benzene-induced hematotoxicity. Arch Toxicol
2018; 92: 1983–1990.
52. Kim S, Vermeulen R, Waidyanatha S, et al. Modeling human metabolism of benzene following
occupational and environmental exposures. Cancer Epidemiol Biomarkers Prev 2006; 15: 2246–
52.
53. Rappaport SM, Kim S, Thomas R, et al. Low-dose metabolism of benzene in humans: science and
obfuscation. Carcinogenesis 2013; 34: 2–9.
54. Rappaport SM, Waidyanatha S, Yeowell-O’Connell K, et al. Protein adducts as biomarkers of
human benzene metabolism. Chem Biol Interact 2005; 153–154: 103–9.
55. Cox LA, Schnatter AR, Boogaard PJ, et al. Non-parametric estimation of low-concentration
benzene metabolism. Chem Biol Interact 2017; 278: 242–255.
56. Rappaport SM, Yeowell-O’Connell K, Smith MT, et al. Non-linear production of benzene
oxide-albumin adducts with human exposure to benzene. J Chromatogr B Analyt Technol Biomed Life
Sci 2002; 778: 367–74.
57. Rappaport SM, Kim S, Lan Q, et al. Evidence that humans metabolize benzene via two pathways.
Environ Health Perspect 2009; 117: 946–52.
58. Kim S, Vermeulen R, Waidyanatha S, et al. Using urinary biomarkers to elucidate dose-related
patterns of human benzene metabolism. Carcinogenesis 2006; 27: 772–81.
59. Kim S, Lan Q, Waidyanatha S, et al. Genetic polymorphisms and benzene metabolism in humans
exposed to a wide range of air concentrations. Pharmacogenet Genomics 2007; 17: 789–801.
60. Kerzic PJ, Liu WS, Pan MT, et al. Analysis of hydroquinone and catechol in peripheral blood of
benzene-exposed workers. Chem Biol Interact 2010; 184: 182–8.
61. Gross SA, Paustenbach DJ. Shanghai Health Study (2001-2009): What was learned about benzene
health effects? Crit Rev Toxicol 2018; 48: 217–251.
62. Smith MT. Benzene, NQO1, and genetic susceptibility to cancer. Proc Natl Acad Sci 1999; 96:
7624–7626.
63. Liu A, Hong N, Zhu P, et al. Understanding benzene series (BTEX) pollutant load characteristics
in the urban environment. Sci Total Environ 2018; 619–620: 938–945.
64. Liu A, Hong N, Zhu P, et al. Characterizing benzene series (BTEX) pollutants build-up process on
urban roads: Implication for the importance of temperature. Environ Pollut 2018; 242: 596–604.
65. Caselli M, de Gennaro G, Marzocca A, et al. Assessment of the impact of the vehicular traffic on
BTEX concentration in ring roads in urban areas of Bari (Italy). Chemosphere 2010; 81: 306–311.
66. Wallace L. Environmental exposure to benzene: an update. Environ Health Perspect 1996; 104
Suppl: 1129–36.
67. Lim SK, Shin HS, Yoon KS, et al. Risk Assessment of Volatile Organic Compounds Benzene,
Toluene, Ethylbenzene, and Xylene (BTEX) in Consumer Products. J Toxicol Environ Heal Part
A 2014; 77: 1502–1521.
68. Symanski E, Stock TH, Tee PG, et al. Demographic, residential, and behavioral determinants of
elevated exposures to benzene, toluene, ethylbenzene, and xylenes among the U.S. population:
results from 1999-2000 NHANES. J Toxicol Environ Health A 2009; 72: 915–24.
69. Savitz DA, Andrews KW. Review of epidemiologic evidence on benzene and lymphatic and
hematopoietic cancers. Am J Ind Med 1997; 31: 287–295.
70. Paxton MB. Leukemia risk associated with benzene exposure in the Pliofilm cohort. Environ
Health Perspect 1996; 104 Suppl: 1431–6.
71. McHale CM, Zhang L, Smith MT. Current understanding of the mechanism of benzene-induced
leukemia in humans: implications for risk assessment. Carcinogenesis 2012; 33: 240–252.
72. Niaz K, Bahadar H, Maqbool F, et al. A review of environmental and occupational exposure to
xylene and its health concerns. EXCLI J 2015; 14: 1167–86.
73. Camara-Lemarroy CR, Rodríguez-Gutiérrez R, Monreal-Robles R, et al. Acute toluene
intoxication–clinical presentation, management and prognosis: a prospective observational study.
BMC Emerg Med 2015; 15: 19.
74. Haye b. Exposure to aromatic hydrocarbons in a coke oven by-product plant. Am Ind Hyg Assoc J;
25: 386–91.
75. Banton MI, Bus JS, Collins JJ, et al. Evaluation of potential health effects associated with
occupational and environmental exposure to styrene – an update. J Toxicol Environ Heal Part B
2019; 22: 1–130.
76. Qu Q, Shore R, Li G, et al. Hematological changes among Chinese workers with a broad range of
benzene exposures. Am J Ind Med 2002; 42: 275–85.
77. Chen Q, Sun H, Zhang J, et al. The hematologic effects of BTEX exposure among elderly
residents in Nanjing: a cross-sectional study. Environ Sci Pollut Res Int 2019; 26: 10552–10561.
78. Doherty BT, Kwok RK, Curry MD, et al. Associations between blood BTEXS concentrations and
hematologic parameters among adult residents of the U.S. Gulf States. Environ Res 2017; 156:
579–587.
79. Kwok RK, Engel LS, Miller AK, et al. The GuLF study: A prospective study of persons involved
in the Deepwater Horizon oil spill response and clean-up. Environ Health Perspect 2017; 125:
570–578.
80. Birnbaum LS. NIEHS’s new strategic plan. Environ Health Perspect 2012; 120: a298.
81. Gibson EA, Goldsmith J, Kioumourtzoglou MA. Complex Mixtures, Complex Analyses: an
Emphasis on Interpretable Results. Curr Environ Health Rep 2019; 6: 53–61.
82. Yorita Christensen KL, Carrico CK, Sanyal AJ, et al. Multiple classes of environmental chemicals
are associated with liver disease: NHANES 2003-2004. Int J Hyg Environ Health 2013; 216: 703–
9.
83. Stafoggia M, Breitner S, Hampel R, et al. Statistical Approaches to Address Multi-Pollutant
Mixtures and Multiple Exposures: the State of the Science. Curr Environ Health Rep
2017; 4: 481–490.
84. Bobb JF, Valeri L, Claus Henn B, et al. Bayesian kernel machine regression for estimating the
health effects of multi-pollutant mixtures. Biostatistics 2015; 16: 493–508.
85. Keil AP, Buckley JP, O’Brien KM, et al. A quantile-based g-computation approach to addressing
the effects of exposure mixtures. Environ Health Perspect; 128. Epub ahead of print 2020. DOI:
10.1289/EHP5838.
86. Snyder R. The bone marrow niche, stem cells, and leukemia: impact of drugs, chemicals, and the
environment. Ann N Y Acad Sci 2014; 1310: 1–6.
87. Hirabayashi Y, Inoue T. Benzene-induced bone-marrow toxicity: A hematopoietic
stem-cell-specific, aryl hydrocarbon receptor-mediated adverse effect. Chem Biol Interact 2010; 184: 252–
258.
88. Aksoy M, Dinçol K, Akgün T, et al. Haematological effects of chronic benzene poisoning in 217
workers. Br J Ind Med 1971; 28: 296–302.
89. Lan Q, Zhang L, Li G, et al. Hematotoxicity in workers exposed to low levels of benzene. Science
2004; 306: 1774–6.
90. Ward E, Hornung R, Morris J, et al. Risk of low red or white blood cell count related to estimated
benzene exposure in a rubberworker cohort (1940-1975). Am J Ind Med 1996; 29: 247–57.
91. Natelson EA. Benzene Exposure and Refractory Sideroblastic Erythropoiesis: Is There an
Association? Am J Med Sci 2007; 334: 356–360.
92. Tishevskaya NV, Bolotov AA, Lebedeva YE. Dynamics of Erythropoiesis in Erythroblastic
Islands in the Bone Marrow in Experimental Benzene-Induced Anemia. Bull Exp Biol Med 2016;
161: 384–387.
APPENDICES
APPENDIX A
Notice of Determination
Activity Does Not Represent Human Subjects Research
December 4, 2019
Madhawa Saranadasa
Epidemiology and Biostatistics
RE: Protocol # 2019-1342

“Develop Methods for Benzene Exposure Reconstruction in a Cohort of US Petroleum
Workers”
Dear Mx. Saranadasa:
The UIC Office for the Protection of Research Subjects received your application, and has
determined that this activity DOES NOT meet the definition of human subject research as
defined by 45 CFR 46.102(e).
Specifically, this activity will utilize a previously published dataset and a de-identified dataset
containing task type and duration information for ExxonMobil workers at four US refineries
(determined by ExxonMobil not to constitute the use or transfer of identifiable worker data) to
generate estimates of anonymous cumulative worker exposure to benzene over time, in order to
better compare and understand how model building may impact the distributions of estimates and
to determine a more accurate final model for estimating task-specific exposure over time.
You may conduct your activity without further submission to the IRB.
Please note:
• If this activity is used in conjunction with any other research involving human subjects,
prospective IRB approval or a Claim of Exemption is required.
• If this activity is altered in such a manner that may result in the activity representing
human subject research, a NEW Determination application must be submitted.
Sincerely,
Sandra Costello
Assistant Director, IRB # 7
Office for the Protection of Research Subjects
cc: Ronald C. Hershow, Epidemiology and Biostatistics, M/C 923
Leslie T. Stayner (faculty advisor), Epidemiology and Biostatistics, M/C 923
APPENDIX B
August 21, 2019
20190848-125785-2
Madhawa Saranadasa

“Assessing the dose-response relationship between occupational benzene exposure and
benzene metabolism in a cross-sectional study of factory workers in Shanghai, China.”
Sponsor/Funding: None
Dear Madhawa Saranadasa:
• Regarding the Data transfer: EM has supplied you with an EM laptop computer with access to
the EM network. All analysis will take place on this laptop, using de-identified datasets with no out-of-
network transfer to the university. Network activity will be monitored. EM legal services has
confirmed that EM has complete ownership of the data with no international partnerships that would
require regulatory considerations. Although ExxonMobil (EM), the owner of the data, has concluded
that no data use or transfer agreement will be required, please be reminded of the requirement to enter
into contracts and/or agreements – if any - with non-UIC sites/entities through the UIC Office of
Research Subjects.
The UIC Office for the Protection of Research Subjects received your Determination application, and
has determined that this activity DOES NOT meet the definition of human subject research as
defined by 45 CFR 46.102(e)/ 21 CFR 50.3(g) and 21 CFR 56.102(e).
This research proposal will utilize data that has been previously collected in the Shanghai Health Study.
The Shanghai Health Study was funded by the Shanghai Health Research Consortium and with
partnership between BP, Chevron, ConocoPhillips, ExxonMobil and Shell Chemical. The study’s
protocols and conduct were reviewed by independent science advisory boards (Shanghai Health Study
Science Review Panel and Ethics Review Panel). The study was also approved by Institutional Review
Boards at the University of Colorado and Fudan University, who assisted in study administration. The
Shanghai Health Study is a cross-sectional study of workers from five factories in and around Shanghai,
China that used benzene in their production processes. This study was designed to research the health
effects of occupational benzene exposure. A total of 1046 consenting workers from the five study
factories were included in the study. Trained interviewers administered an in-person questionnaire to
collect demographic data (birth date, tobacco use, alcohol use, height, weight, medication use, existing
diseases and exposure to pesticides). Blood and urine samples were also collected for each participant.
All data was collected at factory visits between 2003 and 2007 and there was no follow-up of subjects
after the factory visit. Although the data will include birth dates, the investigators will not be able to
directly or indirectly link the birth dates or other data to individuals.
APPENDIX B (CONTINUED)
Please note:
• If this activity is altered in such a manner that may result in the activity representing human
subject research, a NEW Determination application must be submitted.
cc: Ronald C. Hershow
Leslie T. Stayner
APPENDIX C
June 11, 2020
20200775-133164-1
Madhawa Saranadasa

“Assess the association between environmental BTEXS exposure and hematologic
parameters in the general US population”
Sponsor: None
Dear Madhawa Saranadasa:
The UIC Office for the Protection of Research Subjects received your Determination application
and has determined that this activity DOES NOT meet the definition of human subject
research as defined by 45 CFR 46.102(e)/ 21 CFR 50.3(g) and 21 CFR 56.102(e).
Specifically, this study will use data from the National Health and Nutrition Examination Survey
(NHANES). NHANES is a publicly accessible, deidentified, nationally representative cross-
sectional survey conducted to assess the health and nutritional status of the general US
population.
Please note:
• If this activity is altered in such a manner that may result in the activity representing
human subject research, a NEW Determination application must be submitted.
cc: Ronald C. Hershow
Leslie T. Stayner
VITA
MADHAWA SARANADASA
EDUCATION
2015 - Present PhD Candidate, University of Illinois at Chicago, School of Public Health,
Epidemiology
2010 - 2012 MS, Vanderbilt University, Developmental Biology
2004 - 2008 BA, University of Pennsylvania, Biology
RESEARCH EXPERIENCE
2019 - 2020 Principal Analyst, Illinois Precision Medicine Consortium, Epidemiology
2019 Statistical Consultant, AIDS Foundation of Chicago
2019 - 2020 Consultant, ExxonMobil Biomedical Sciences, Inc., Epidemiology
2018 Intern, ExxonMobil Biomedical Sciences, Inc., Epidemiology
2012 - 2015 Manager of Statistical Programming, Symbiance, Inc.
TEACHING EXPERIENCE
2019 Teaching Assistant, University of Illinois at Chicago, School of Public
Health, EPID 406: Epidemiologic Computing
PROFESSIONAL PRESENTATIONS
1. Saranadasa M, Bulka C, Argos M. No cross-sectional association between arsenic exposure and serum
prostate-specific antigen levels in U.S. men over 40 years old: 2003-2010 NHANES. Society for
Epidemiologic Research. Baltimore, MD. 2018
2. Saranadasa M, Argos M, Jasmine F, Parvez F, Islam T, Yunus M, Kibriya M, Ahsan H. Differential gene
expression and chronic arsenic exposure in a transcriptome-wide association study of adults in
Bangladesh. National Institute of Environmental Health Sciences Superfund Research Program.
Durham, NC. 2016
3. Saranadasa M, Schneider JD, Vianna PG, Choi E, Hipkens SB, Jia H, Osipovich A, Yuan W,
MacDonald RJ, Magnuson MA. Ngn3 is sufficient for acinar to endocrine reprogramming. Beta
Cell Biology Consortium. Chantilly, VA. 2012
ORIGINAL ARTICLES
1. Gasparian AV, Burkhart CA, Purmal AA, Brodsky L, Pal M, Saranadasa M, Bosykh DA,
Commane M, Guryanova OA, Pal S, Safina A, Sviridov S, Koman IE, Veith J, Komar AA, Gudkov
AV, Gurova KV. Curaxins: anticancer compounds that simultaneously suppress NF-κB and
activate p53 by targeting FACT. Science Translational Medicine. (2011)
2. Saranadasa M, Wang ES. Vascular endothelial growth factor inhibition: conflicting roles in tumor
growth. Cytokine. (2010)
