
SUPPORTING INFORMATION

Deep Learning Analysis of Vibrational Spectra of Bacterial Lysate for Rapid Antimicrobial Susceptibility Testing

William John Thrift†, Sasha Ronaghi‡, Muntaha Samad§, Hong Wei†, Dean Gia Nguyen¶,
Antony Superio Cabuslayǁ, Chloe E. Groome†, Peter Joseph Santiago†, Pierre Baldi§, Allon I.
Hochbaum†ǁ¶ǂ, Regina Ragan†¶ *

†Department of Materials Science and Engineering, University of California, Irvine
‡Sage Hill School, Newport Coast, CA 92657
§Department of Computer Science, University of California, Irvine
ǁDepartment of Chemistry, University of California, Irvine
¶Department of Chemical and Biomolecular Engineering, University of California, Irvine
ǂDepartment of Molecular Biology and Biochemistry, University of California, Irvine
*Corresponding Author: rragan@uci.edu

Table of Contents

● Optical density at 600 nm (OD600) to identify the minimum inhibitory concentration of
various antibiotics for Pseudomonas aeruginosa and Escherichia coli.
● OD600 of a P. aeruginosa cell culture as a function of time without antibiotic treatment
and after exposure to carbenicillin and rifampicin at concentrations below minimum
inhibitory concentration.
● t-distributed stochastic neighbor embedding (t-SNE) visualization of dose and temporal response of
Pseudomonas aeruginosa and Escherichia coli exposed to the same antibiotic.
● t-SNE visualization of temporal response of Pseudomonas aeruginosa exposed to different
antibiotics.
● 2-Class DNN model mean ten-fold cross validation accuracies, sensitivity and specificity.
● 5-Class DNN model sensitivity and specificity.
● Support vector machine model classification accuracy.
● Visualization of the latent space of combined metabolite mixture and AST dataset with
Table of legend for data.
● Isolation forest predictions of outliers and inliers of SERS spectra in AST dataset from
combined VAE latent space.
● Supporting video of spectra generated in a trajectory across the latent space depicted in
Figure 3.

Supporting Figure S1 shows the optical density measured at 600 nm (OD600), plotted to
identify the minimum inhibitory concentration (MIC50), defined as the antibiotic concentration
that inhibits 50% of the growth observed under control conditions, of different antibiotics for
Pseudomonas aeruginosa and Escherichia coli. Cultures of both bacteria were inoculated at 0.02
OD600 with antibiotic concentrations from 0 to 1000 μg/mL, according to the growth and washing
protocols in the Methods section. Figures S1a-c show the OD600 of a P. aeruginosa culture 24 h
after exposure to carbenicillin, rifampicin, and gentamicin, respectively. Figure S1d shows the
OD600 of E. coli 24 h after treatment with gentamicin. The MIC50 of gentamicin is approximately
1 μg/mL for both P. aeruginosa and E. coli. For P. aeruginosa, the MIC50 of carbenicillin is
approximately 100 μg/mL, and rifampicin causes no significant growth inhibition at any of the
concentrations tested.

Supporting Figure S2a-c shows the OD600 at the time points used for SERS measurements of P.
aeruginosa lysate from cultures that were untreated, treated with 50 μg/mL carbenicillin, and
treated with 400 μg/mL rifampicin, respectively. These concentrations are below the MIC50 of
these antibiotics in this organism. The cultures were inoculated at 0.5 OD600 with or without
antibiotic treatment. The lack of significant variation in cell density between the sub-MIC
treated cultures and the control at the tested time points indicates that the observed changes in
the SERS spectra of these samples are due to metabolic changes within the cells rather than
differences in growth or growth inhibition.

Supporting Figure S1: Optical density at 600 nm (OD600) to identify the minimum inhibitory concentration of a)
carbenicillin, b) rifampicin, and c) gentamicin for Pseudomonas aeruginosa, and d) gentamicin for Escherichia coli.

Supporting Figure S2: OD600 of a P. aeruginosa cell culture at the indicated growth times after adjustment to an
OD600 of 0.5 at time t = 0 h, a) without antibiotic treatment, b) with 50 μg/mL carbenicillin introduced at 0 h, and
c) with 400 μg/mL rifampicin introduced at 0 h.
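
The MIC50 read off the curves in Figure S1 can also be estimated numerically. The following is a minimal sketch, assuming endpoint OD600 readings at ascending doses with the untreated control first; the function name `mic50`, the log-scale interpolation between neighbouring doses, and the example readings are illustrative assumptions and are not taken from the measurements reported here.

```python
import math

def mic50(concentrations, od600):
    """Estimate the dose at which OD600 falls to 50% of the untreated control.

    concentrations: ascending doses in ug/mL, starting with the 0 ug/mL control.
    od600: endpoint OD600 readings, one per dose.
    Returns None if growth never drops below half of the control.
    """
    half = 0.5 * od600[0]
    for i in range(1, len(concentrations)):
        lo_c, hi_c = concentrations[i - 1], concentrations[i]
        lo_od, hi_od = od600[i - 1], od600[i]
        if lo_od >= half > hi_od:
            # Fraction of the way from the lower dose to the higher dose.
            frac = (lo_od - half) / (lo_od - hi_od)
            if lo_c <= 0:
                # log10(0) is undefined, so fall back to linear interpolation.
                return lo_c + frac * (hi_c - lo_c)
            log_lo, log_hi = math.log10(lo_c), math.log10(hi_c)
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    return None

# Hypothetical readings for illustration only (not data from Figure S1).
doses = [0, 0.1, 1, 10, 100, 1000]
readings = [1.00, 0.95, 0.80, 0.60, 0.40, 0.05]
estimate = mic50(doses, readings)
print(f"Estimated MIC50: {estimate:.1f} ug/mL")
```

A curve that never crosses half of the control value, as described for rifampicin, would return `None` in this sketch.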

Supporting Figures S3 and S4 depict t-distributed stochastic neighbor embedding (t-SNE)
visualizations of the spectra used to build the VAE latent spaces depicted in Figures 2 and 4,
respectively. Because t-SNE is an unsupervised method, these visualizations demonstrate that the
SERS spectra contain differences that are discernible in the absence of user-defined labels.

Supporting Figure S3: a, c) Dose response t-SNE of P. aeruginosa (a) and E. coli (c). Lysates dosed with 0, 0.1, 0.5,
1, and 10 μg/mL carbenicillin are depicted in red, purple, yellow, blue, and green, respectively. b, d) Temporal
response t-SNE of P. aeruginosa (b) and E. coli (d). Lysates processed after 0, 5, 10, 20, and 40 minutes of 10 μg/mL
carbenicillin dosage are depicted in red, purple, yellow, blue, and green, respectively.

Supporting Figure S4: t-SNE visualization of clustering of the SERS spectra acquired for Figure 4, the AST
dataset.

Tables S1 and S2 list the mean accuracy, sensitivity, and specificity of the 2-class and
5-class DNN models, respectively, for temporal and dosage-dependent response data when using
10-fold cross validation.

Temporal response           E. coli                               P. aeruginosa
Class split (min)           Accuracy    Sensitivity  Specificity  Accuracy    Sensitivity  Specificity
(0) | (5, 10, 20, 40)       99% ± 0.1%  100%         99%          99% ± 0.2%  100%         99%
(0, 5) | (10, 20, 40)       99% ± 0.1%  99%          100%         99% ± 0.1%  99%          100%
(0, 5, 10) | (20, 40)       99% ± 0.2%  99%          100%         99% ± 0.2%  99%          99%
(0, 5, 10, 20) | (40)       99% ± 0.1%  99%          100%         99% ± 0.1%  99%          99%

Dosage response             E. coli                               P. aeruginosa
Class split (μg/mL)         Accuracy    Sensitivity  Specificity  Accuracy    Sensitivity  Specificity
(0) | (0.1, 0.5, 1, 10)     99% ± 1%    100%         99%          98% ± 1%    99%          95%
(0, 0.1) | (0.5, 1, 10)     97% ± 1%    98%          95%          98% ± 1%    98%          98%
(0, 0.1, 0.5) | (1, 10)     95% ± 1%    93%          95%          99% ± 1%    98%          99%
(0, 0.1, 0.5, 1) | (10)     99% ± 1%    90%          99%          98% ± 1%    94%          99%

Table S1: 2-Class DNN model performance metrics.

Dosage, 5-class     E. coli (accuracy: 95% ± 1%)    P. aeruginosa (accuracy: 98% ± 1%)
Class               Specificity     Sensitivity     Specificity     Sensitivity
0 μg/mL             99.94%          99.75%          99.29%          96.16%
0.1 μg/mL           98.50%          94.75%          99.29%          99.04%
0.5 μg/mL           97.94%          89.00%          100%            99.75%
1.0 μg/mL           98.50%          93.75%          99.85%          98.75%
10 μg/mL            98.75%          97.25%          98.88%          96.00%

Temporal, 5-class   E. coli (accuracy: 99% ± 0.3%)  P. aeruginosa (accuracy: 99% ± 0.2%)
Class               Specificity     Sensitivity     Specificity     Sensitivity
0 min               99.81%          99.50%          100%            99.75%
5 min               99.88%          99.75%          100%            100.00%
10 min              99.91%          98.88%          99.91%          99.25%
20 min              99.94%          100.00%         99.97%          99.88%
40 min              99.97%          99.88%          99.78%          99.75%

Table S2: 5-Class DNN model performance metrics.
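
The sensitivity and specificity values in Tables S1 and S2 can be illustrated with a short sketch. For a 2-class split, one group of labels is treated as positive and the other as negative; for a 5-class model, each class can be scored against all others (one-vs-rest). Which group counts as positive in Table S1, and whether the 5-class metrics were computed one-vs-rest, are assumptions here. The function name and the example labels are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred, positive_classes):
    """Score predictions with every label in positive_classes counted as positive.

    Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    """
    positive = set(positive_classes)
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth in positive:
            if pred in positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred in positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Hypothetical temporal labels in minutes (not data from this work).
y_true = [0, 0, 5, 5, 10, 10, 20, 20, 40, 40]
y_pred = [0, 0, 5, 10, 10, 10, 20, 20, 40, 20]

# 5-class case: score the 20 min class against all other classes.
sens_20, spec_20 = sensitivity_specificity(y_true, y_pred, [20])

# 2-class case: the split (0 min) | (5, 10, 20, 40 min), treated group as positive.
sens_split, spec_split = sensitivity_specificity(y_true, y_pred, [5, 10, 20, 40])
```

In the 2-class case, confusion among the treated time points does not lower either metric, since only crossings of the split count as errors.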

Supporting Figure S5 shows the classification performance of a support vector machine (SVM)
model (ref 47) on both standard preprocessed spectra and VAE-encoded spectra as a function of
the number of training examples (one training example is defined as one spectrum per class). For
the AST dataset shown in Figure 4 of the main text, random spectra from each class (six different
antibiotic treatment conditions) are removed from the AST dataset without replacement and used
to train the models. This process is repeated 100 times, and the mean and standard deviation of
the model accuracy are reported. The classification accuracy is much higher for the VAE-encoded
spectra. VAE SVM models improve continuously as the number of samples increases, while the
standard SVM models generally improve less with additional labeled samples. The performance
increase due to VAE encoding is even more pronounced on the dose and temporal response
datasets. For these, the SVM models generally improve continuously with additional examples of
VAE-encoded data, whereas the standard SVM analysis (without VAE encoding) shows very little
improvement with the number of training examples.

Supporting Figure S5: SVM model classification accuracy on preprocessed spectra (blue dots) and on VAE encoded
spectra (orange dots) versus the number of training examples. (a) The AST dataset (P. aeruginosa untreated and
treated with carbenicillin or rifampicin after 0.5 h/ 2 h, 6 classes), (b) P. aeruginosa and (d) E. coli dose
response (0, 0.1, 0.5, 1, and 10 μg/mL gentamicin, 5 classes), and (c) P. aeruginosa and (e) E. coli temporal
response (0 min, 20 min, 40 min, 3 classes) are shown.
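
The repeated random sampling protocol behind Figure S5 can be sketched as follows. This is a minimal illustration of the sampling loop only: a nearest-centroid classifier is swapped in for the SVM so that the example needs nothing beyond the Python standard library, evaluation on the spectra left over after sampling is an assumption, and the toy two-dimensional data are synthetic. All function names are hypothetical.

```python
import random
import statistics

def fit_centroids(train):
    """train: dict mapping label -> list of feature vectors. Returns label -> centroid."""
    centroids = {}
    for label, vectors in train.items():
        dim = len(vectors[0])
        centroids[label] = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    return centroids

def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def accuracy_for_training_size(data, n_per_class, repeats=100, seed=0):
    """Draw n_per_class training vectors per class without replacement,
    evaluate on the remaining vectors, and repeat.
    Returns the mean and standard deviation of the accuracy over all repeats."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(repeats):
        train, held_out = {}, []
        for label, vectors in data.items():
            order = list(range(len(vectors)))
            rng.shuffle(order)
            train[label] = [vectors[i] for i in order[:n_per_class]]
            held_out.extend((vectors[i], label) for i in order[n_per_class:])
        centroids = fit_centroids(train)
        correct = sum(predict(centroids, x) == y for x, y in held_out)
        accuracies.append(correct / len(held_out))
    return statistics.mean(accuracies), statistics.pstdev(accuracies)

# Synthetic, well-separated toy classes standing in for encoded spectra.
toy_rng = random.Random(1)
toy = {
    "untreated": [[toy_rng.gauss(0, 0.1), toy_rng.gauss(0, 0.1)] for _ in range(20)],
    "treated": [[toy_rng.gauss(5, 0.1), toy_rng.gauss(5, 0.1)] for _ in range(20)],
}
mean_acc, std_acc = accuracy_for_training_size(toy, n_per_class=3)
```

Calling `accuracy_for_training_size` for increasing values of `n_per_class` gives one point of a learning curve per call, with the standard deviation available as an error bar.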

Supporting Figure S6 depicts a) a visualization of a 2-dimensional VAE latent space and b) a
t-SNE visualization of a 32-dimensional VAE latent space trained on the metabolite combination
dataset. Table S3 provides the legend for the dataset. The ‘combined’ dataset benefits greatly
from a higher-dimensional (32) latent space, as seen in the t-SNE visualization of clustering in
Figure S6. Increasing the latent space from 2 to 32 dimensions not only greatly improves the
clustering of the different metabolite conditions but also reduces the test loss of the VAE model
by nearly 50%.

a       abcf    acd     adef    bcdef   be      cef
ab      abd     acde    adf     bcdf    bef     cf
abc     abde    acdef   ae      bce     bf      d
abcd    abdef   acdf    aef     bcef    c       de
abcde   abdf    ace     af      bcf     cd      def
abcdef  abe     acef    b       bd      cde     df
abcdf   abef    acf     bc      bde     cdef    e
abce    abf     ad      bcd     bdef    cdf     ef
abcef   ac      ade     bcde    bdf     ce      f

Table S3: Color labels for combined metabolite mixture data shown in Figure S6. Labels are defined as follows:
2-methylnaphthalene (A), o-cresol (B), 2-aminoacetophenone (C), pyrrole (D), 2-pentylfuran (E), and indole (F).

Figure S6: a) VAE visualization and b) t-SNE visualization of the combined metabolite mixture and AST dataset
latent space. All 63 possible mixture combinations of metabolites 2-methylnaphthalene (A), o-cresol (B),
2-aminoacetophenone (C), pyrrole (D), 2-pentylfuran (E), and indole (F) are plotted, with the legend of color and
corresponding mixture in Table S3.
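
The count of 63 mixtures follows from taking every non-empty subset of the six metabolites, 2^6 − 1 = 63. A minimal sketch that enumerates the same label strings used in the legend, assuming the lowercase letters a to f stand for metabolites (A) to (F):

```python
from itertools import combinations

metabolites = "abcdef"  # a..f stand for metabolites (A)..(F) in the legend

# Every non-empty subset, written as a sorted string of letters such as "bdef".
mixtures = [
    "".join(combo)
    for size in range(1, len(metabolites) + 1)
    for combo in combinations(metabolites, size)
]
print(len(mixtures))
```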

In addition to providing superior clustering, the combination VAE latent space also enhances
the difference between outliers and inliers. Supporting Figure S7 depicts isolation forest
outlier detection applied to the AST spectra that have been encoded with the combination VAE
model. Outliers are easily identified by the isolation forest and removed before processing in
the predictive models.

Supporting Figure S7: Isolation forest predictions of AST spectra that have been encoded with the combination
VAE model. Outliers are shown as red and inliers are shown as blue.

The generative aspect of the VAE model is demonstrated in Supporting Video 1. The
animation is a series of 100 SERS spectra sampled along a line drawn between the center of the
encoded/decoded 0.5 h carbenicillin-treated lysate data and that of the untreated 2 h control data
in the VAE latent space. The shifts in the spectral data between these two antibiotic treatment
classes can thus be visualized and interpreted, in contrast to what non-generative machine
learning techniques provide.
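
The latent-space trajectory used for the video can be sketched as a linear interpolation between two class centers. This is a minimal illustration: the encoded points below are made up, the function names are hypothetical, and the decoding step is only indicated in a comment because the trained VAE decoder is not part of this example.

```python
def centroid(points):
    """Mean position of a list of latent vectors."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]

def latent_trajectory(start, end, steps=100):
    """Return `steps` evenly spaced latent vectors on the segment from start to end,
    with both endpoints included."""
    return [
        [s + (e - s) * k / (steps - 1) for s, e in zip(start, end)]
        for k in range(steps)
    ]

# Made-up encoded points for the two classes (illustration only).
carbenicillin_half_hour = [[-1.0, 0.0], [1.0, 0.0]]
untreated_two_hours = [[1.0, -2.0], [3.0, 0.0]]

path = latent_trajectory(centroid(carbenicillin_half_hour), centroid(untreated_two_hours))

# Each vector in `path` would then be passed through the trained VAE decoder
# to generate one spectrum, giving one frame of the animation.
```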
