Supplemental Methods
Custom software was used to annotate the beginning and end of the QRS complex of each
beat. Premature ventricular contractions and beats with excessive noise were excluded. The
median QRS complex from the annotated beats was obtained for each lead for each patient.
QRS complex processing
We first extracted the median QRS complex from each lead and stretched it to a standard
length of 200 data points using linear interpolation. The stretched QRS complexes from each of
the 12 leads were concatenated into a single vector of 2400 data points. QRSd was then added
to the end of the vector. Therefore, each patient’s QRS complex information was represented by
a feature vector with 2401 elements. The feature vector was standardized by subtracting each
element’s mean and dividing by its standard deviation in the dataset.
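The processing steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the custom software used in the study: the function names are hypothetical, the lead order is whatever order the caller supplies, and the population standard deviation is assumed because the text does not say which estimator was used.

```python
import numpy as np

N_POINTS = 200  # standard length per lead, as stated in the text
N_LEADS = 12


def stretch_qrs(qrs, n_points=N_POINTS):
    """Linearly interpolate one median QRS complex to n_points samples."""
    qrs = np.asarray(qrs, dtype=float)
    old_axis = np.linspace(0.0, 1.0, num=len(qrs))
    new_axis = np.linspace(0.0, 1.0, num=n_points)
    return np.interp(new_axis, old_axis, qrs)


def build_feature_vector(median_qrs_by_lead, qrsd_ms):
    """Concatenate the stretched leads and append QRSd (12 * 200 + 1 = 2401)."""
    stretched = [stretch_qrs(lead) for lead in median_qrs_by_lead]
    return np.concatenate(stretched + [np.array([float(qrsd_ms)])])


def standardize(features):
    """Z-score each element using its mean and standard deviation in the dataset."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)  # assumption: population SD (ddof=0)
    std[std == 0] = 1.0         # added safeguard for constant columns
    return (features - mean) / std, mean, std


# Synthetic example: five patients whose QRS complexes differ in length.
rng = np.random.default_rng(0)
patients = []
for _ in range(5):
    n_samples = int(rng.integers(80, 160))
    leads = [rng.normal(size=n_samples) for _ in range(N_LEADS)]
    patients.append(build_feature_vector(leads, qrsd_ms=n_samples))
X = np.vstack(patients)
Z, mean, std = standardize(X)
```

Returning `mean` and `std` alongside the standardized matrix makes it possible to reuse the same statistics when new patients are processed later.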
Dimensionality reduction
Dimensionality reduction transforms data represented by a large number of variables (“dimensions”) into fewer dimensions. PCA is a
well-established linear dimensionality reduction approach that transforms a set of variables into new
variables, termed “principal components,” that are linearly uncorrelated [1]. The first component
accounts for the largest amount of variance within the data, while subsequent components
account for sequentially less variance within the data. Performing PCA on high-dimensional
data and retaining only the first few principal components allows a large amount of data
our dataset of 2401 dimensions. To determine the number of components to retain, we plotted
the percent variance explained by each principal component, and retained the number of
components prior to the greatest drop-off in explained variance (Supplemental Figure 1). We
call the standardized feature vector for the $n$th patient $x_n$. The $n$th patient’s score for dimension
one, $s_{n,1}$, was obtained by mapping $x_n$ with a same-dimensional vector of coefficients $w_1$, such
that
$$s_{n,1} = x_n \cdot w_1$$
Similarly, the score for dimension $i$ was obtained as $s_{n,i} = x_n \cdot w_i$, where $w_i$ is the mapping
vector for dimension $i$.
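The score computation and the choice of how many components to retain can be illustrated as follows. This is a sketch that assumes PCA is computed by singular value decomposition of the standardized matrix; the function names and the `max_components` parameter are hypothetical and do not come from the study's software.

```python
import numpy as np


def fit_pca(Z):
    """PCA of a standardized (patients x features) matrix via SVD.

    Returns the mapping vectors as rows (row 0 is w_1, row 1 is w_2, ...)
    and the percent variance explained by each component.
    """
    Z = np.asarray(Z, dtype=float)
    Zc = Z - Z.mean(axis=0)  # leaves already-standardized data unchanged
    _, singular_values, Vt = np.linalg.svd(Zc, full_matrices=False)
    variance = singular_values ** 2
    explained = 100.0 * variance / variance.sum()
    return Vt, explained


def components_before_largest_drop(explained, max_components=10):
    """Number of components that precede the largest drop in explained variance."""
    drops = -np.diff(np.asarray(explained, dtype=float)[:max_components])
    return int(np.argmax(drops)) + 1


def scores(Z, W, n_components):
    """Scores s[n, i] = x_n . w_i for the first n_components mapping vectors."""
    return np.asarray(Z, dtype=float) @ W[:n_components].T
```

With explained variances such as 20%, 15%, 8%, 6%, 5% (illustrative numbers only), the largest drop lies between components two and three, so `components_before_largest_drop` returns 2.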
K-means clustering
Once we embedded the QRS complex information into a two-dimensional space using PCA, we
used k-means clustering to aggregate patients into groups based on their QRS PCA
representation. K-means clustering is a popular clustering algorithm that partitions data points
into k number of clusters such that each data point is a member of the cluster with the closest
mean value [2]. Because a binary cutoff is typically identified for ECG metrics to stratify risk, we
used k=2 to identify two groups. K=3 and k=4 were also assessed to determine if any additional
patterns emerged. We used 1000 replicates to obtain consistent clusters, and we used
Stability of the mapping vectors $w_i$ was assessed by repeating PCA on subsets of the data
generated by randomly sampling 10%, 20%, 30%, …, 90% of the available observations and
plotting $w_i$.
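The subsampling check described above can be sketched as follows. This is an assumption-laden illustration: the helper names are hypothetical, subsets are drawn without replacement, the data are not re-standardized within each subset, and each recomputed vector is sign-aligned with the full-data vector because a principal axis is defined only up to sign.

```python
import numpy as np


def first_mapping_vector(Z):
    """Return w_1, the first principal axis of the rows of Z."""
    Zc = Z - Z.mean(axis=0)
    _, _, Vt = np.linalg.svd(Zc, full_matrices=False)
    return Vt[0]


def subsample_mapping_vectors(Z, fractions, seed=0):
    """Recompute w_1 on random subsets of patients for each sampling fraction."""
    rng = np.random.default_rng(seed)
    Z = np.asarray(Z, dtype=float)
    reference = first_mapping_vector(Z)
    results = {}
    for fraction in fractions:
        n_keep = max(2, int(round(fraction * len(Z))))
        rows = rng.choice(len(Z), size=n_keep, replace=False)
        w = first_mapping_vector(Z[rows])
        if np.dot(w, reference) < 0:  # flip so all vectors share one orientation
            w = -w
        results[fraction] = w
    return reference, results
```

Plotting each entry of `results` against the element index, next to `reference`, gives a visual check of whether the pattern in the mapping vector persists when fewer patients are used.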
To evaluate the impact of coordinated analysis of all 12 leads, we repeated the PCA and
unsupervised clustering process using subsets of the ECG leads and examined the primary
outcomes between the unsupervised clustering groups. We assessed the following lead
subsets: each of the individual leads, the set of precordial leads (V1-V6), the set of lateral leads
(I, aVL, V5, V6), and the linearly independent leads (I, II, V1-V6).
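The clustering step and the lead-subset selection in this section can be sketched as follows. Several details are assumptions rather than facts from the source: the lead order inside the 2401-element vector (`LEADS`), whether QRSd is kept when leads are dropped (`include_qrsd`), and the reading of "replicates" as random restarts of Lloyd's algorithm from which the solution with the lowest within-cluster sum of squares is kept.

```python
import numpy as np

# Hypothetical lead order; the source does not state the concatenation order.
LEADS = ["I", "II", "III", "aVR", "aVL", "aVF",
         "V1", "V2", "V3", "V4", "V5", "V6"]


def lead_columns(selected, n_points=200, include_qrsd=True):
    """Column indices of the chosen leads within the concatenated feature vector."""
    cols = []
    for lead in selected:
        start = LEADS.index(lead) * n_points
        cols.extend(range(start, start + n_points))
    if include_qrsd:
        cols.append(len(LEADS) * n_points)  # QRSd is the final element
    return np.array(cols)


def _nearest(points, centroids):
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(points, k=2, n_replicates=1000, max_iter=100, seed=0):
    """Lloyd's algorithm from several random starts; keep the lowest-inertia run."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    best = None
    for _ in range(n_replicates):
        centroids = points[rng.choice(len(points), size=k, replace=False)]
        for _ in range(max_iter):
            labels = _nearest(points, centroids)
            updated = np.array([
                points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                for j in range(k)
            ])
            if np.allclose(updated, centroids):
                break
            centroids = updated
        labels = _nearest(points, centroids)
        inertia = float(((points - centroids[labels]) ** 2).sum())
        if best is None or inertia < best[0]:
            best = (inertia, labels, centroids)
    return best[1], best[2]
```

For a lead-subset analysis, one would select `Z[:, lead_columns(["V1", "V2", "V3", "V4", "V5", "V6"])]`, repeat the PCA on those columns, and pass the resulting two-dimensional scores to `kmeans`.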
Validation analysis
In the validation analysis, the objective was to directly apply the QRS PCA transformation that
was derived on the Primary Cohort to new data. The same mathematical operations that were
used to obtain the QRS PCA representation from the 12-lead QRS patterns from patients in the
Primary Cohort were used to generate QRS PCA representations from 12-lead QRS patterns of
patients in the validation cohorts. Patients in the validation cohorts were assigned to one of two
groups. Groups were assigned based on which of the two cluster centroids defined by k-means
clustering on the Primary Cohort had the shortest distance to the two-dimensional QRS PCA
Cluster stability
To assess cluster stability, the PCA and clustering process was independently repeated in the
validation cohorts. The patient characteristics and clinical outcomes of the groups identified by
repeating the clustering process in the validation cohorts were compared to those identified in
the Primary Cohort. In addition, the overlap of the groups identified by repeating the PCA and
clustering process in the validation cohorts with the groups identified by applying the PCA
mapping vector and clusters derived from the Primary Cohort was quantified.
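Applying the Primary Cohort transformation to new patients, and quantifying the overlap between two group assignments, can be sketched as follows. The function names are hypothetical. `mean`, `std`, `W`, and `centroids` are assumed to be the standardization statistics, the mapping vectors (one per row), and the k-means centroids saved from the Primary Cohort. Because cluster numbering is arbitrary when clustering is repeated independently, the overlap calculation here allows for swapped labels, which is an assumption about how overlap was defined.

```python
import numpy as np


def project(X_new, mean, std, W):
    """Standardize with Primary Cohort statistics, then map onto its w_i."""
    return ((np.asarray(X_new, dtype=float) - mean) / std) @ W.T


def assign_to_nearest_centroid(scores, centroids):
    """Index of the closest Primary Cohort centroid for each patient."""
    dists = np.linalg.norm(scores[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def percent_overlap(labels_a, labels_b):
    """Percent agreement between two 2-group labelings, up to label swapping."""
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)
    agreement = float(np.mean(labels_a == labels_b))
    return 100.0 * max(agreement, 1.0 - agreement)
```

No quantity is re-estimated on the validation data in `project` or `assign_to_nearest_centroid`; only `percent_overlap` involves the independently repeated clustering.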
PCA is a linear method of dimensionality reduction, and thus may not capture nonlinear
reduction process was also attempted with non-linear methods followed by k-means clustering.
Primary outcomes were compared between the identified groups. The following non-linear
methods were assessed: t-distributed stochastic neighbor embedding (t-SNE), isomap, locally
linear embedding (LLE), and multidimensional scaling (MDS). This sub-analysis was conducted
The primary analysis used PCA on feature vectors of 12-lead QRS waveforms and QRSd.
Alternative feature vectors were also assessed, including (1) 12-lead QRS waveforms without
QRSd, (2) 12-lead waveforms and QRS area, and (3) 12-lead waveforms and both QRSd and
QRS area. Additionally, the effect of reducing 12-lead QRS waveforms to 2 dimensions, then (4)
adding QRSd as a third dimension, (5) adding QRS area as a third dimension, or (6) adding
QRSd as a third dimension and QRS area as a fourth dimension, prior to k-means clustering
was assessed. The primary outcomes of the two groups identified after k-means clustering with
the 6 aforementioned ECG inputs were assessed. The percent overlap in cluster assignment
compared to the primary PCA and clustering method was also computed. This sub-analysis was
To assess the impact of using a supervised learning approach, machine learning classifiers
were trained to predict CRT response using ECG variables. CRT response was defined by
First, we assessed the impact of adding QRS area and QRS PCA representation to an
ECG-only model. The baseline classifier was a logistic regression of QRSd and presence of
LBBB. The classifier was also compared to QRS area alone, QRS PCA dimensions alone,
and logistic regression and random forest using input of QRSd, presence of LBBB, QRS area,
baseline classifier model using 9 common clinical variables [3] (QRSd, QRS morphology, gender,
ischemic etiology of cardiomyopathy, NYHA status, LVEF, LVEDD, history of atrial fibrillation,
and epicardial LV lead). This baseline classifier was compared to a random forest model with
The mean area under the curve (AUC) of each classifier was evaluated through 100 iterations
of 5-fold cross-validation on the Primary Cohort. Statistical significance relative to the baseline classifier
during cross-validation was evaluated using a modified t-test for repeated iterations of
cross-validation.
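The evaluation described above can be illustrated with two small helpers. The source does not name the modified t-test, so the second function is an assumption: it implements the corrected resampled t statistic (the Nadeau and Bengio variance correction commonly applied to repeated k-fold cross-validation), which may differ from the test actually used. Function names are hypothetical.

```python
import math
import statistics


def auc(labels, scores):
    """AUC as the probability that a responder is scored above a non-responder."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


def corrected_repeated_cv_t(differences, n_train, n_test):
    """Corrected resampled t statistic for paired AUC differences.

    differences: one AUC difference (model minus baseline) per fold per
    repetition, e.g. 100 x 5 = 500 values.
    n_train, n_test: number of patients in the training and test folds.
    Returns the statistic and its degrees of freedom.
    """
    n = len(differences)
    mean_diff = statistics.fmean(differences)
    variance = statistics.variance(differences)  # sample variance
    t_stat = mean_diff / math.sqrt((1.0 / n + n_test / n_train) * variance)
    return t_stat, n - 1
```

The extra `n_test / n_train` term inflates the variance to account for the overlap between training sets across folds, which an ordinary paired t-test ignores.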
After cross-validation, the best-performing classifier was trained on the entirety of the
Primary Cohort, and tested on the validation cohorts. The AUC of predictions on the Echo
Validation Cohort was computed. Models incorporating clinical variables could not be
assessed on the Echo Validation Cohort because the necessary clinical data were unavailable. In the
Survival Validation Cohort, the c-index of the classifier output was computed. C-indexes for
discriminating long-term survival were computed and compared with the nonparametric method
by Kang et al. [4].
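For readers unfamiliar with the c-index, the following sketch computes Harrell's concordance index for right-censored data. It illustrates the quantity only; it does not implement the Kang et al. procedure for comparing two correlated c-indexes, and the function name is hypothetical. The code assumes higher scores mean higher predicted risk, so a classifier output expressing probability of CRT response would need its sign reversed first.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's c-index for right-censored survival data.

    times: follow-up time per patient.
    events: 1 if the endpoint occurred, 0 if the patient was censored.
    risk_scores: higher value means higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a comparable pair needs an observed event for the earlier time
        for j in range(n):
            if times[i] < times[j]:  # patient j was still event-free at times[i]
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect ranking of event times by predicted risk.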
LV lead location was determined radiographically using lateral and posteroanterior chest x-ray
after CRT implant. LV lead location was categorized along the longitudinal axis (basal,
midventricle, or apical) as well as the short axis (posterior, lateral, or anterior). First, primary
outcomes were compared between locations on the long axis as well as between locations on
the short axis. Next, the primary outcomes, (1) survival free from death, LVAD, or heart transplant
and (2) degree of LVEF change, were compared between lead locations within QRS PCA
groups to assess for any significant interactions between QRS PCA groups and lead location.
Supplemental References
1. Jolliffe IT, Cadima J. Principal component analysis: a review and recent developments.
Heidelberg, 2006). (eds. Kogan, J., Nicholas, C. & Teboulle, M.) 25–71 doi:10.1007/3-540-
28349-8_2.
3. Feeny AK, Rickard J, Patel D, Toro S, Trulock KM, Park CJ, LaBarbera MA, Varma N,
4. Kang L, Chen W, Petrick NA, Gallas BD. Comparing two correlated C indices with right-
censored survival outcome: a one-shot nonparametric approach. Stat Med. 2015; 34, 685–
703.
Supplemental Table 1. Baseline characteristics in the Primary Cohort
Body mass index (kg/m2) 28.9 ± 6.1 28.8 ± 6.0 29.0 ± 6.3 0.62
History of atrial fibrillation 271 (50.3%) 132 (43.3%) 139 (59.4%) <0.001
Chronic obstructive pulmonary disease 82 (15.2%) 47 (15.4%) 35 (15.0%) 0.98
Cerebrovascular accident or transient ischemic attack 61 (11.3%) 34 (11.1%) 27 (11.5%) 0.99
Electrocardiography
QRS area (μVs) 111.5 ± 47.2 138.9 ± 41.3 75.8 ± 25.8 <0.001
Laboratory
Serum creatinine (mg/dL) 1.1 [0.9-1.4] 1.0 [0.8-1.3] 1.2 [0.9-1.6] <0.001
Pharmacotherapy
Angiotensin converting
angiotensin-receptor blocker
Outcomes
Echocardiogram follow-up (months) 9.1 [5.7-16.1]
Echocardiogram follow-up less than 60 days 22 (4%)
distributed variables are reported as median [25th-75th percentile]. Categorical variables are
reported as n (%). PCA = principal components analysis. NYHA = New York Heart Association.
LVEF = left ventricular ejection fraction. LVEDD = left ventricular end-diastolic diameter. LVESD
= left ventricular end-systolic diameter. MR = mitral regurgitation. LBBB = left bundle branch
block. RBBB = right bundle branch block. IVCD = intraventricular conduction delay.
Supplemental Table 2. Primary outcomes in subgroups of the Primary Cohort: (1) Cox
proportional hazards model for death, heart transplantation, or LVAD, and (2) LVEF
change
Composite endpoint hazard ratio [95% CI] | p | LVEF % change (mean ± SD) | p
LBBB (n = 352)
Non-LBBB (n = 40) vs.
LBBB (n = 275) 0.41 [0.27-0.62] <0.001 3.7 ± 9.7 vs. 12.3 ± 12.0 <0.001
QRSd units in ms. QRS area units in μVs. LVAD = left ventricular assist device. LVEF = left ventricular ejection fraction.
Chronic obstructive
Cerebrovascular accident or
Electrocardiography
Laboratory
Serum creatinine (mg/dL) 1.2 [0.9-1.6] 1.1 [0.9-1.4] 1.3 [1.0-1.8] 0.001
Pharmacotherapy
Angiotensin converting
enzyme inhibitor or
Outcomes
distributed variables are reported as median [25th-75th percentile]. Categorical variables are
reported as n (%). PCA = principal components analysis. NYHA = New York Heart Association.
LVEF = left ventricular ejection fraction. LVEDD = left ventricular end-diastolic diameter. LVESD
= left ventricular end-systolic diameter. MR = mitral regurgitation. LBBB = left bundle branch
block. RBBB = right bundle branch block. IVCD = intraventricular conduction delay.
Supplemental Table 4. Event-free survival in the Survival Validation Cohort: Cox
proportional hazards model for death, heart transplantation, or left ventricular assist
device
Composite endpoint hazard ratio [95% CI] | p
Entire cohort (n = 301)
QRS PCA Group 2 (n = 152) vs.
QRS PCA Group 1 (n = 149) 0.49 [0.37-0.65] <0.001
Non-LBBB (n = 137) vs.
LBBB (n = 164) 0.58 [0.44-0.77] <0.001
QRS area ≤95 (n = 143) vs.
QRS area >95 (n = 158) 0.43 [0.32-0.56] <0.001
LBBB (n = 164)
QRS PCA Group 2 (n = 44) vs.
QRS PCA Group 1 (n = 120) 0.43 [0.29-0.65] <0.001
QRSd <150 (n = 63) vs.
QRSd ≥150 (n = 101) 0.67 [0.44-1.01] 0.058
QRS area ≤95 (n = 39) vs.
QRS area >95 (n = 125) 0.35 [0.23-0.53] <0.001
LBBB and QRSd ≥150 (n = 101)
QRS PCA Group 2 (n = 35) vs.
QRS PCA Group 1 (n = 66) 0.54 [0.31-0.94] 0.029
QRS area ≤95 (n = 29) vs.
QRS area >95 (n = 72) 0.41 [0.21-0.80] 0.009
Non-LBBB (n = 137)
QRS PCA Group 2 (n = 108) vs.
QRS PCA Group 1 (n = 29) 0.77 [0.47-1.23] 0.27
QRSd <150 (n = 53) vs.
QRSd ≥150 (n = 84) 0.88 [0.60-1.28] 0.51
QRS area ≤95 (n = 104) vs.
QRS area >95 (n = 33) 0.63 [0.40-1.00] 0.048
QRSd <150 (n = 116)
QRS PCA Group 2 (n = 67) vs.
QRS PCA Group 1 (n = 49) 0.46 [0.29-0.72] <0.001
Non-LBBB (n = 53) vs.
LBBB (n = 63) 0.71 [0.46-1.09] 0.12
QRS area ≤95 (n = 86) vs.
QRS area >95 (n = 30) 0.36 [0.21-0.62] <0.001
QRS PCA Group 1 (n = 149)
Non-LBBB (n = 29) vs.
LBBB (n = 120) 0.56 [0.34-0.92] 0.022
QRS area ≤95 (n = 25) vs.
QRS area >95 (n = 124) 0.53 [0.31-0.90] 0.020
QRS PCA Group 2 (n = 152)
Non-LBBB (n = 108) vs.
LBBB (n = 44) 0.99 [0.68-1.46] 0.98
QRS area ≤95 (n = 118) vs.
QRS area >95 (n = 34) 0.51 [0.32-0.81] 0.004
QRS area >95 (n = 158)
Non-LBBB (n = 33) vs.
LBBB (n = 125) 0.64 [0.39-1.03] 0.065
QRS PCA Group 2 (n = 34) vs.
QRS PCA Group 1 (n = 124) 0.75 [0.46-1.23] 0.26
QRS area ≤95 (n = 143)
Non-LBBB (n = 104) vs.
LBBB (n = 39) 1.09 [0.73-1.62] 0.68
QRS PCA Group 2 (n = 118) vs.
QRS PCA Group 1 (n = 25) 0.70 [0.42-1.18] 0.18
QRSd units in ms. QRS area units in μVs. LVAD = left ventricular assist device. CI = confidence
interval. PCA = principal components analysis. LBBB = left bundle branch block.
Supplemental Table 5. Baseline characteristics in the Echo Validation Cohort
Entire cohort (n = 106) | QRS PCA Group 1 (n = 67) | QRS PCA Group 2 (n = 39) | p
Echocardiography
Electrocardiography
Outcomes
Echocardiogram follow-up
Echocardiogram follow-up
distributed variables are reported as median [25th-75th percentile]. Categorical variables are
reported as n (%). PCA = principal components analysis. LVEF = left ventricular ejection
fraction. LVEDD = left ventricular end-diastolic diameter. LVESD = left ventricular end-systolic
diameter. LBBB = left bundle branch block. RBBB = right bundle branch block. IVCD = intraventricular conduction delay.
Validation Cohort
LVEF % change (mean ± SD) | p
Entire cohort (n = 106)
QRS PCA Group 2 (n = 39) vs.
QRS PCA Group 1 (n = 67) 5.5 ± 7.9 vs. 9.8 ± 10.0 0.025
Non-LBBB (n = 24) vs.
LBBB (n = 82) 2.3 ± 7.3 vs. 10.0 ± 9.3 <0.001
QRS area ≤95 (n = 37) vs.
QRS area >95 (n = 69) 5.0 ± 9.4 vs. 10.0 ± 9.0 0.009
LBBB (n = 82)
QRS PCA Group 2 (n = 20) vs.
QRS PCA Group 1 (n = 62) 7.9 ± 7.5 vs. 10.6 ± 9.8 0.26
QRSd <150 (n = 22) vs.
QRSd ≥150 (n = 60) 7.9 ± 8.0 vs. 10.7 ± 9.7 0.20
QRS area ≤95 (n = 17) vs.
QRS area >95 (n = 65) 7.5 ± 10.9 vs. 10.6 ± 8.9 0.22
LBBB and QRSd ≥150 (n = 60)
QRS PCA Group 2 (n = 15) vs.
QRS PCA Group 1 (n = 45) 8.3 ± 7.3 vs. 11.5 ± 10.3 0.27
QRS area ≤95 (n = 9) vs.
QRS area >95 (n = 51) 8.6 ± 12.5 vs. 11.1 ± 9.2 0.47
Non-LBBB (n = 24)
QRS PCA Group 2 (n = 19) vs.
QRS PCA Group 1 (n = 5) 3.1 ± 7.6 vs. -0.6 ± 5.7 0.33
QRSd <150 (n = 14) vs.
QRSd ≥150 (n = 10) 1.2 ± 6.2 vs. 3.8 ± 8.7 0.41
QRS area ≤95 (n = 20) vs.
QRS area >95 (n = 4) 2.9 ± 7.7 vs. -0.8 ± 4.9 0.37
QRSd <150 (n = 36)
QRS PCA Group 2 (n = 15) vs.
QRS PCA Group 1 (n = 21) 3.4 ± 7.3 vs. 6.6 ± 8.3 0.24
Non-LBBB (n = 14) vs.
LBBB (n = 22) 1.2 ± 6.2 vs. 7.9 ± 8.0 0.012
QRS area ≤95 (n = 20) vs.
QRS area >95 (n = 16) 3.6 ± 7.8 vs. 7.4 ± 7.9 0.16
QRS PCA Group 1 (n = 67)
Non-LBBB (n = 5) vs.
LBBB (n = 62) -0.6 ± 5.7 vs. 10.6 ± 9.8 0.014
QRS area ≤95 (n = 11) vs.
QRS area >95 (n = 56) 7.7 ± 13.4 vs. 10.2 ± 9.3 0.46
QRS PCA Group 2 (n = 39)
Non-LBBB (n = 19) vs.
LBBB (n = 20) 3.1 ± 7.6 vs. 7.9 ± 7.5 0.053
QRS area ≤95 (n = 26) vs.
QRS area >95 (n = 13) 3.8 ± 7.2 vs. 8.9 ± 8.3 0.056
QRS area >95 (n = 69)
Non-LBBB (n = 4) vs.
LBBB (n = 65) -0.8 ± 4.9 vs. 10.6 ± 8.8 0.014
QRS PCA Group 2 (n = 13) vs.
QRS PCA Group 1 (n = 56) 8.9 ± 8.3 vs. 10.2 ± 9.3 0.65
QRS area ≤95 (n = 37)
Non-LBBB (n = 20) vs.
LBBB (n = 17) 2.9 ± 7.7 vs. 7.5 ± 10.9 0.14
QRS PCA Group 2 (n = 26) vs.
QRS PCA Group 1 (n = 11) 3.8 ± 7.2 vs. 7.7 ± 13.4 0.26
QRSd units in ms. QRS area units in μVs. LVEF = left ventricular ejection fraction. SD =
standard deviation. PCA = principal components analysis. LBBB = left bundle branch block.
Supplemental Table 7: Primary outcomes between unsupervised clustering groups
hazard ratio
[95% CI]
III 0.87 [0.69-1.09] 0.22 7.3 ± 11.6 vs. 9.7 ± 11.8 0.021
aVR 0.50 [0.40-0.64] <0.001 6.4 ± 11.4 vs. 10.9 ± 11.7 <0.001
aVL 0.71 [0.57-0.89] 0.003 6.9 ± 11.5 vs. 10.1 ± 11.8 0.002
aVF 1.00 [0.80-1.25] 0.99 8.2 ± 11.9 vs. 8.9 ± 11.5 0.51
Precordial (V1-V6) 0.49 [0.39-0.61] <0.001 3.4 ± 9.2 vs. 11.3 ± 12.0 <0.001
Lateral (I, aVL, V5, V6) 0.6 [0.47-0.76] <0.001 6.5 ± 11.1 vs. 11.6 ± 12.0 <0.001
Independent (I, II, V1-V6) 0.44 [0.35-0.55] <0.001 4.1 ± 9.6 vs. 11.3 ± 12.1 <0.001
12 leads 0.44 [0.35-0.55] <0.001 4.6 ± 10.0 vs. 11.4 ± 12.1 <0.001
Hazard ratios and LVEF change comparisons were made with Group 2 vs Group 1. PCA =
principal components analysis. CI = confidence interval. LVEF = left ventricular ejection fraction.
SD = standard deviation.
Supplemental Table 8: Primary outcomes in groups identified by independently repeating
Validation method | Percent overlap in cluster assignment | Composite endpoint hazard ratio [95% CI] | p | LVEF % change (mean ± SD) | p
validation cohort
Independently repeating
cohorts
Supplemental Table 9: LVEF Responder Rates
Responder Super-responder
LBBB (n = 598)
Non-LBBB (n = 348)
LVEF post-implant (%) 31.7 ± 12.9 34.5 ± 13.1 28.1 ± 11.7 <0.001
LVEDD post-implant (cm) 5.8 ± 1.1 5.5 ± 1.1 6.1 ± 1.0 <0.001
LVESD post-implant (cm) 4.6 ± 1.2 4.3 ± 1.3 5.0 ± 1.1 <0.001
MR grade post-implant (1-9) 3.0 ± 2.2 2.7 ± 2.2 3.4 ± 2.2 0.001
Change in LVEF (absolute %) 8.5 ± 11.7 11.4 ± 12.1 4.6 ± 10.0 <0.001
Change in LVEDD (cm) -0.3 ± 0.9 -0.6 ± 0.9 0.0 ± 0.8 <0.001
Change in LVESD (cm) -0.5 ± 1.1 -0.8 ± 1.2 -0.1 ± 1.0 <0.001
Change in NYHA class -0.8 ± 0.8 -0.9 ± 0.8 -0.7 ± 0.8 0.003
PCA = principal components analysis. LVEF = left ventricular ejection fraction. LVEDD = left
ventricular end-diastolic diameter. LVESD = left ventricular end-systolic diameter. NYHA = New York Heart Association.
clustering input
Clustering input | Overlap in cluster assignment | Composite endpoint hazard ratio [95% CI] | p | LVEF % change (mean ± SD) | p
Hazard coefficient for death, heart transplant, or LVAD [95% CI] | p | LVEF % change regression coefficient ± standard error | p
LVAD = left ventricular assist device. CI = confidence interval. LVEF = left ventricular ejection
fraction. LBBB = left bundle branch block. PCA = principal components analysis.
Supplemental Table 14. Results of supervised machine learning classification
Cross-validation results: Effect of adding QRS area, QRS PCA to QRSd and LBBB
Model | Mean area under curve | P (versus LR 1) | P (versus QRS PCA score)
LR 1: Logistic regression
of LBBB
representation
representation
Cross-validation results: Effect of adding QRS area, QRS PCA to clinical variables
Model | Mean area under curve | P (versus LR 2)
LR 2: Logistic regression with 9 clinical variables 0.72 ± 0.05 n/a
PCA representation
PCA representation
Model | Area under curve [95% CI] | P (versus LR 1) | P (versus QRS PCA score)
LR 1: Logistic regression (QRSd, LBBB) 0.69 [0.59 – 0.79] n/a 0.65
QRS PCA groups to stratify QRS morphology instead of QRS duration: (1) Cox
proportional hazards model for death, heart transplantation, or LVAD, and (2) LVEF
change
Composite
[95% CI]
QRSd units in ms. QRS area units in μVs. PCA = principal components analysis. LVAD = left
ventricular assist device. LVEF = left ventricular ejection fraction. CI = confidence interval. SD =
standard deviation. LBBB = left bundle branch block.
Supplemental Figure 1: Percent variance explained by principal components
The percentage of variance explained by each of the first ten principal components. The “elbow”
of the plot occurs at principal component 3, where the largest drop in explained variance
occurred from principal component two to three (15% to 8%). Subsequently, only the first two
In our stability analysis, the mapping vector $w_1$ was obtained by repeating our methods using
only 10%, 20%, 30%, …, 90% of the available patients. $w_1$ showed a robust pattern across
Clustering using 3 and 4 groups affirmed that survival and reverse remodeling outcomes
The mapping vector $w_1$ used to obtain the QRS PCA score is interestingly similar to the Kors
the precordial leads. Lead II was the only lead with discrepant coefficients. This suggests that
the QRS area on the VCG Z-axis is a good representation of the dominant component of
variance in patients with conduction delay. Note: the Kors transformation vector does not use
lead III or the augmented limb leads. The Kors transformation coefficients were uniformly