prevalent and incident TB, while 10,883 were included in our analysis of incident TB alone.
1,986 individuals were excluded from the analysis of incident TB because they did not receive a
tuberculin skin test (TST) at the time of the intake observation, while another 1,166 cases were
excluded from both analyses because they were missing other covariates (BCG, age, index
smear/culture status) included in our analysis. To ensure that our results are not impacted by the
exclusion of the largest group of 1,946 cases who are missing TST results, as well as the 810
First, to account for the impact of missing skin test results, we estimated a logistic regression
model to determine whether cases missing skin test responses were systematically different from
included ones along several dimensions, such as age, household exposure and development of
TB disease during the 1-year follow-up. Results of this analysis are presented in Table E1. We
also estimated a second logistic regression model to understand which factors were predictive of
missing BCG vaccination status (Table E2). We then created a synthetic dataset in which
missing baseline skin test and BCG vaccination responses were imputed using the values
predicted by logistic regression models fitted to the baseline TST and BCG responses for cases
with complete records. Finally, we conducted a sensitivity analysis in which we re-estimated the
models presented in the main text to determine whether our results changed as a consequence of
Table E1 shows the results of a logistic regression analysis of age, socioeconomic status (SES),
and household exposure factors predicting whether an individual did not receive a TST at
enrollment. Also included in this model are variables indicating whether the individual was
diagnosed with TB at any point during the yearlong follow-up, whether she was BCG vaccinated,
or received IPT. This model shows that children and young adults were more likely to have a
missing baseline TST than adults. In addition, individuals living in households with an
unimproved roof were more likely to be missing from the analysis, as were those exposed to an
both more likely to have received a TB diagnosis (OR=1.73, 95%CI=1.22, 2.44) as well as IPT
(OR=1.42, 95%CI=1.25, 1.62), than those individuals with a baseline skin test.
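As a reading aid for the odds ratios quoted above, the following minimal sketch shows how a logistic regression coefficient and its standard error are conventionally turned into an odds ratio with a Wald 95% confidence interval. The function name `odds_ratio_with_ci` is hypothetical, and the standard error is back-calculated from the reported interval as an assumption; the fitted coefficients themselves are not given in this supplement.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a log-odds coefficient and its standard error.

    Uses the usual Wald interval exp(beta +/- z * se); this is an assumed
    convention, not a description of the authors' software.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Illustration only: recover a coefficient and standard error that are
# consistent with the reported OR=1.73 (95% CI 1.22, 2.44) for TB diagnosis.
beta = math.log(1.73)
se = (math.log(2.44) - math.log(1.22)) / (2 * 1.96)  # assumption: symmetric on the log scale

or_, lower, upper = odds_ratio_with_ci(beta, se)
print(f"OR={or_:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

An interval that excludes 1 on the odds ratio scale corresponds to a coefficient that differs from 0 at the 5% level, which is how the significance statements in this section can be read.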
Table E2 shows factors predictive of a missing self-reported BCG vaccination response. This
shows that individuals with a missing BCG response are more likely to be male (OR=1.44,
those not missing a BCG self-report. Importantly, younger individuals are the least likely to have
95%CI=0.06,0.20), and are not significantly more likely to develop TB disease than those with
full observations (OR=0.89, 95%CI=0.49,1.60). This suggests that our finding that BCG
information.
To assign missing skin test results, we fit a multinomial logistic regression model to the baseline
skin test outcomes of included cases. To do this, we used the framework outlined by Zelner et al.
(author’s unpublished data) in which a generalized additive model (GAM) was used to estimate
response. We fit two models, one predicting the probability of LTBI (TST ≥ 10mm), with TST
reactivity (R; 0mm < TST < 10mm) and non-reactivity (NR, TST = 0mm) as reference responses.
The second logistic regression model predicted the probability of R, with NR and LTBI as
reference responses. We then normalized the predicted probabilities of each response (i.e., LTBI,
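$$P_{LTBI}(X) = \frac{e^{\beta_{LTBI} X}}{1 + e^{\beta_{LTBI} X} + e^{\beta_{R} X}}$$

$$P_{R}(X) = \frac{e^{\beta_{R} X}}{1 + e^{\beta_{LTBI} X} + e^{\beta_{R} X}}$$

$$P_{NR}(X) = \frac{1}{1 + e^{\beta_{LTBI} X} + e^{\beta_{R} X}}$$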
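The normalization above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function name `tst_category_probabilities` and its arguments are hypothetical, and the two inputs are assumed to be the linear predictors (log-odds scale) from the LTBI and R models, with NR as the reference category whose predictor is fixed at 0.

```python
import math


def tst_category_probabilities(eta_ltbi, eta_r):
    """Normalize two linear predictors into probabilities for LTBI, R and NR.

    eta_ltbi, eta_r: linear predictors for the LTBI and R models (assumed inputs).
    NR is the reference category, so its linear predictor is 0 and exp(0) = 1.
    """
    # Subtracting the largest predictor avoids overflow in exp() and leaves
    # the normalized probabilities unchanged.
    shift = max(eta_ltbi, eta_r, 0.0)
    e_ltbi = math.exp(eta_ltbi - shift)
    e_r = math.exp(eta_r - shift)
    e_nr = math.exp(0.0 - shift)
    total = e_ltbi + e_r + e_nr
    return {"LTBI": e_ltbi / total, "R": e_r / total, "NR": e_nr / total}


# Example: LTBI is twice as likely as NR, and R is as likely as NR.
probs = tst_category_probabilities(math.log(2.0), 0.0)
print(probs)
```

By construction the three probabilities sum to 1 for any pair of linear predictors, which is what allows them to be used as the category probabilities of a single multinomial draw.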
Because omitted individuals were more likely to develop TB disease than included individuals
(Table E1), our sensitivity analysis was constructed to assume that these individuals had similar
baseline TST responses to included cases, ensuring that the results of the sensitivity analysis
would provide a more conservative depiction of the impact of baseline TST response on risk than
To assign baseline responses to individuals without them, we used the unmodified probabilities
of membership in each skin test response category as described above. For each individual with a
missing skin test response, we drew a multinomial random variate with probability proportional
to these baseline membership probabilities. Using these random responses, we then created a
synthetic ‘complete’ dataset including these individuals with imputed skin test responses as well
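The imputation draw described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions: the helper `impute_tst`, the category labels, the example probabilities and the seed are all hypothetical, and each individual's category probabilities are assumed to have been computed already from the fitted models.

```python
import random


def impute_tst(prob_by_category, rng):
    """Draw one TST category with probability proportional to the given weights."""
    categories = list(prob_by_category)
    weights = [prob_by_category[c] for c in categories]
    # random.choices samples proportionally to the weights, which amounts to
    # a single multinomial draw of size 1.
    return rng.choices(categories, weights=weights, k=1)[0]


rng = random.Random(2024)  # fixed seed so the synthetic dataset is reproducible

# Hypothetical membership probabilities for three individuals with missing TST.
missing = [
    {"LTBI": 0.5, "R": 0.2, "NR": 0.3},
    {"LTBI": 0.1, "R": 0.1, "NR": 0.8},
    {"LTBI": 0.7, "R": 0.2, "NR": 0.1},
]
imputed = [impute_tst(p, rng) for p in missing]
print(imputed)
```

Because each imputed value is a single random draw, re-running the procedure with a different seed yields a different synthetic dataset; fixing the seed makes one particular realization reproducible.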
In our sample, while 11,482 of the 12,716 (> 90%) individuals reporting BCG vaccination also
had a BCG scar, 75% of individuals without a BCG scar (1,234/1,637) still reported having been
vaccinated. To assign BCG vaccination statuses to those missing this information, we fit a
logistic regression model predicting the probability of a positive response for BCG vaccination
as a function of the presence of a BCG scar and individual age. This model reflects the strong
association between a BCG scar and self-reported vaccination (OR=33.1, 95%CI=26.6,41.08), and a
slowly declining prevalence of vaccination with age (OR=0.98, 95%CI=0.97,0.98). We tested an
additional model with an interaction between BCG scar and age, but found that this was non-
significant. To assign BCG vaccination statuses in the synthetic dataset to the 810 individuals
with missing values, we drew a Bernoulli random variate with success probability equal to that
predicted by the logistic regression model given their BCG scar status and age.
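The BCG imputation step can be sketched in the same way. The two odds ratios below are the ones reported above; everything else is an assumption made for illustration: the intercept is not reported in this section, so `INTERCEPT` is a hypothetical placeholder, age is assumed to enter linearly in years, and the function names are invented.

```python
import math
import random

OR_SCAR = 33.1          # reported odds ratio for presence of a BCG scar
OR_AGE_PER_YEAR = 0.98  # reported odds ratio for age (assumed to be per year)
INTERCEPT = 0.0         # hypothetical placeholder: baseline log-odds not reported here


def p_bcg(has_scar, age_years, intercept=INTERCEPT):
    """Predicted probability of a positive BCG self-report from scar status and age."""
    eta = (
        intercept
        + math.log(OR_SCAR) * (1 if has_scar else 0)
        + math.log(OR_AGE_PER_YEAR) * age_years
    )
    return 1.0 / (1.0 + math.exp(-eta))


def impute_bcg(has_scar, age_years, rng, intercept=INTERCEPT):
    """Bernoulli draw with success probability equal to the model prediction."""
    return rng.random() < p_bcg(has_scar, age_years, intercept)


rng = random.Random(2024)
print(p_bcg(True, 10), p_bcg(False, 10))
print(impute_bcg(True, 10, rng))
```

With the real fitted intercept substituted for the placeholder, the same two functions would reproduce the predicted probabilities used for the Bernoulli draws.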
Using the synthetic dataset consisting of both individuals with complete and imputed baseline
skin test and BCG vaccination responses, we re-ran the analyses of co-prevalent and incident TB
presented in the main text. When estimating the probability of both co-prevalent and incident TB
cases (Table E3), we still find a protective effect of BCG for individuals < 10 years of age
(RR=0.37, 95%CI=0.20,0.68). Results from the model predicting only incident TB are presented
in Table E4 and Figure E4. Point estimates from this model are similar to those presented in Table
4 in the main text, as are the age-specific protective effects of BCG and IPT. This suggests that
the exclusion of these cases is unlikely to impact our qualitative results or policy
recommendations.
Table E1. Predictors of missing baseline tuberculin skin test result. Statistically significant
(p < 0.05) predictors are highlighted in bold.
Variable     RR      95% CI
Intercept    0.01    (0.00, 0.04)
Table E2. Predictors of missing BCG vaccination status. Statistically significant (p < 0.05)
predictors are highlighted in bold.
Variable     RR      95% CI
Intercept    0.02    (0.01, 0.05)
Table E3. Risk factors for incident and co-prevalent TB disease at baseline and during one-
year follow-up. Statistically significant coefficients (p < 0.05) are highlighted in bold.
Variable     RR      95% CI
Intercept    0.01    (0.00, 0.03)
Table E4. Risk factors for incident TB disease during one-year follow-up. Statistically
significant coefficients (p < 0.05) are highlighted in bold.
[Figure E1 image: four panels plotting RR against Age (years). Panel A: BCG; Panel B: IPT; Panel C: R; Panel D: LTBI.]
Figure E1. Interactive effect of age and baseline TST-reactivity (R) or LTBI on probability
of TB disease. The solid lines show point estimates for the relative risk (RR) of TB disease at
each age. Panels A & B show the protective effects of BCG vaccination and isoniazid preventive
therapy as a function of age. Panels C & D show the RR of TB disease associated with TST
reactivity (R) or latent TB (LTBI) as a function of age. The dashed lines show age-specific 95%
confidence intervals for these quantities; the horizontal dotted line is a guide for assessing
statistical significance.
[Figure E2 image: one panel, BCG + IPT, plotting RR against Age (years).]
Figure E2. Relative risk (RR) of TB disease for individuals with IPT and BCG. RRs are
presented relative to individuals without either BCG vaccination or IPT.