S1 Additional main comparisons of numerical studies
Figs. S1 and S2 compare LinDA, the proposed method, under different zero-handling
approaches in settings S6C0 and S0C0, respectively. Fig. S3 shows the results of LinDA,
CLR-OLS, and MaAsLin2 with different normalization approaches under setting S0C0.
Figs. S4–S10, S12, and S13–S14 show the results for settings S0C1, S0C2, S1C0, S2C0, S4C0,
S5C0, S6C0, S7C0, S8.1C0, and S8.2C0, respectively. Fig. S11 compares ANCOM-BC with its
zero treatment disabled and enabled under setting S6C0. Fig. S15 shows the results for
setting S0C0 with stronger compositional effects.
[Plot for Fig. S1. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: Adaptive, Pseudo-count, Imputation.]
Fig. S1: Performance of LinDA with different zero-handling approaches (S6C0: 10-fold
difference in library size, a binary covariate). Empirical false discovery rate (A) and true
positive rates (B) were averaged over 100 simulation runs. The dashed horizontal line (A)
indicates the target FDR level of 0.05. Note that the red and blue lines overlap because
the covariate and sequencing depth are significantly correlated.
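The two metrics reported in this and the following figures can be illustrated with a minimal sketch. It assumes the usual definitions, which the captions do not spell out: within one run, the false discovery proportion is the number of false discoveries divided by the number of discoveries (taken as 0 when nothing is discovered), and the true positive proportion is the number of true discoveries divided by the number of truly differential taxa. The empirical FDR and TPR are then the averages over runs. The function names and the toy runs are hypothetical and are not taken from the simulation code.

```python
from typing import Iterable, Set, Tuple


def fdp_tpp(discovered: Set[int], truth: Set[int]) -> Tuple[float, float]:
    """Return (false discovery proportion, true positive proportion) for one run."""
    true_pos = len(discovered & truth)
    false_pos = len(discovered - truth)
    fdp = false_pos / max(len(discovered), 1)  # 0 when nothing is discovered
    tpp = true_pos / max(len(truth), 1)
    return fdp, tpp


def empirical_fdr_tpr(runs: Iterable[Tuple[Set[int], Set[int]]]) -> Tuple[float, float]:
    """Average the per-run proportions over all simulation runs."""
    pairs = [fdp_tpp(discovered, truth) for discovered, truth in runs]
    n_runs = len(pairs)
    fdr = sum(p[0] for p in pairs) / n_runs
    tpr = sum(p[1] for p in pairs) / n_runs
    return fdr, tpr


# Toy example: three runs, taxa 1-4 are truly differential in each run.
runs = [
    ({1, 2, 3, 9}, {1, 2, 3, 4}),  # one false discovery, three true positives
    ({1, 2}, {1, 2, 3, 4}),        # no false discovery, two true positives
    (set(), {1, 2, 3, 4}),         # nothing discovered
]
fdr, tpr = empirical_fdr_tpr(runs)
print(f"empirical FDR = {fdr:.3f}, empirical TPR = {tpr:.3f}")
```

Averaging per-run proportions, rather than pooling counts across runs, matches the definition of FDR as an expected proportion.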
[Plot for Fig. S2. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: Adaptive, Pseudo-count, Imputation.]
Fig. S2: Performance of LinDA with different zero-handling approaches (S0C0: log normal
abundance distribution, a binary covariate). Empirical false discovery rate (A) and true
positive rates (B) were averaged over 100 simulation runs. The dashed horizontal line (A)
indicates the target FDR level of 0.05.
[Plot for Fig. S3. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, CLR-OLS, MaAsLin2-TSS, MaAsLin2-TMM, MaAsLin2-CSS, MaAsLin2-CLR.]
Fig. S3: Performance comparison between LinDA and MaAsLin2 (S0C0: log normal
abundance distribution, a binary covariate). Empirical false discovery rate (A) and true
positive rates (B) were averaged over 100 simulation runs. The dashed horizontal line (A)
indicates the target FDR level of 0.05.
[Plot for Fig. S4. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, MaAsLin2, Spearman.]
Fig. S4: Performance comparison (S0C1: log normal abundance distribution, a continuous
covariate). Empirical false discovery rate (A) and true positive rates (B) were averaged
over 100 simulation runs. Error bars (A) represent the 95% CIs of the method LinDA and
the dashed horizontal line indicates the target FDR level of 0.05.
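The caption does not state how the 95% CIs were obtained. As a hedged illustration only, the sketch below builds a normal-approximation interval for the mean of per-run false discovery proportions (mean plus or minus 1.96 standard errors). The function name `mean_ci95` and the toy values are hypothetical; the authors may have used a different construction.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_ci95(values: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean, lower, upper) using a normal approximation.

    Assumes the runs are independent and that there are at least two of them.
    """
    n = len(values)
    mean = statistics.fmean(values)
    std_err = statistics.stdev(values) / math.sqrt(n)  # sample SD / sqrt(n)
    half_width = 1.96 * std_err
    return mean, mean - half_width, mean + half_width


# Toy per-run false discovery proportions (hypothetical values).
per_run_fdp = [0.0, 0.1, 0.05, 0.05]
mean, lower, upper = mean_ci95(per_run_fdp)
print(f"mean = {mean:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

With 100 runs the normal approximation is usually adequate; for a handful of runs a t-based interval would be wider.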
[Plot for Fig. S5. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, MaAsLin2, Wilcoxon.]
Fig. S5: Performance comparison (S0C2: log normal abundance distribution, a binary
variable of interest and two confounders). Empirical false discovery rate (A) and true
positive rates (B) were averaged over 100 simulation runs. Error bars (A) represent the
95% CIs of the method LinDA and the dashed horizontal line indicates the target FDR
level of 0.05.
[Plot for Fig. S6. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, metagenomeSeq2, MaAsLin2, Wilcoxon.]
Fig. S6: Performance comparison (S1C0: zero inflated absolute abundances, a binary
covariate). Empirical false discovery rate (A) and true positive rates (B) were averaged over
100 simulation runs. Error bars (A) represent the 95% CIs of the method LinDA and the
dashed horizontal line indicates the target FDR level of 0.05.
[Plot for Fig. S7. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, metagenomeSeq2, MaAsLin2, Wilcoxon.]
Fig. S7: Performance comparison (S2C0: correlated absolute abundances, a binary
covariate). Empirical false discovery rate (A) and true positive rates (B) were averaged over
100 simulation runs. Error bars (A) represent the 95% CIs of the method LinDA and the
dashed horizontal line indicates the target FDR level of 0.05.
[Plot for Fig. S8. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, metagenomeSeq2, MaAsLin2, Wilcoxon.]
Fig. S8: Performance comparison (S4C0: smaller m, a binary covariate). Empirical false
discovery rate (A) and true positive rates (B) were averaged over 1000 simulation runs.
Error bars (A) represent the 95% CIs of the method LinDA and the dashed horizontal line
indicates the target FDR level of 0.05.
[Plot for Fig. S9. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 20 and n = 30; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, metagenomeSeq2, MaAsLin2, Wilcoxon.]
Fig. S9: Performance comparison (S5C0: smaller n, a binary covariate). Empirical false
discovery rate (A) and true positive rates (B) were averaged over 100 simulation runs.
Error bars (A) represent the 95% CIs of the method LinDA and the dashed horizontal line
indicates the target FDR level of 0.05.
[Plot for Fig. S10. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, metagenomeSeq2, MaAsLin2, Wilcoxon.]
Fig. S10: Performance comparison (S6C0: 10-fold difference in library size, a binary
covariate). Empirical false discovery rate (A) and true positive rates (B) were averaged over
100 simulation runs. Error bars (A) represent the 95% CIs of the method LinDA and the
dashed horizontal line indicates the target FDR level of 0.05.
[Plot whose caption is missing from the source; by the figure order given in Section S1, this is Fig. S11 (ANCOM-BC with zero treatment disabled and enabled, setting S6C0). Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: ANCOM-BC-1, ANCOM-BC-2.]
[Plot whose caption is missing from the source; by the figure order given in Section S1, this is Fig. S12 (setting S7C0). Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, metagenomeSeq2, MaAsLin2, Wilcoxon.]
[Plot whose caption is missing from the source; by the figure order given in Section S1, this is Fig. S13 (setting S8.1C0). Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA-LMM, LinDA-OLS, CLR-LMM, CLR-OLS, MaAsLin2.]
[Plot for Fig. S14. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA-LMM, LinDA-OLS, CLR-LMM, CLR-OLS, MaAsLin2.]
Fig. S14: Performance comparison (S8.2C0: replicate sampling, a binary covariate).
Empirical false discovery rate (A) and true positive rates (B) were averaged over 100 simulation
runs. The dashed horizontal line (A) indicates the target FDR level of 0.05.
[Plot for Fig. S15. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, metagenomeSeq2, MaAsLin2, Wilcoxon.]
Fig. S15: Performance comparison (S0C0 with strong compositional effects). Empirical
false discovery rate (A) and true positive rates (B) were averaged over 100 simulation runs.
Error bars (A) represent the 95% CIs of the method LinDA and the dashed horizontal line
indicates the target FDR level of 0.05.
[Plot for Fig. S16. Panel A: Taxa (Otu00161, Otu00156, Otu00131, Otu00106, Otu00047, Otu00044, Otu00042, Otu00036, Otu00013) against Log2FoldChange, with Debiased and Non-debiased points. Panel B: -Log10Padj against Log2FoldChange; labeled taxa: Otu00047, Otu00042.]
Fig. S16: Effect size plot (A) of differential taxa at FDR level of 0.1 and volcano plot (B) for
the CDI dataset. The “Debiased” points represent the bias-corrected regression coefficients,
and “Non-debiased” points represent the original (biased) regression coefficients. The error
bars represent the 95% CIs of the “Debiased” points. The taxa in black are detected by
LinDA, taxa in red are detected solely by LinDA, and the taxa in blue are missed by LinDA
but detected by one or more of the other methods (A).
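Calling taxa "at FDR level of 0.1" amounts to thresholding multiplicity-adjusted p-values, and the volcano plot shows the same adjusted p-values on a -log10 scale. The caption does not name the adjustment procedure; the sketch below assumes the Benjamini-Hochberg step-up adjustment, and `bh_adjust` and the toy p-values are hypothetical.

```python
import math
from typing import List, Sequence


def bh_adjust(pvals: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


pvals = [0.01, 0.04, 0.03, 0.5]  # toy raw p-values for four taxa
padj = bh_adjust(pvals)
detected = [i for i, p in enumerate(padj) if p <= 0.1]  # FDR level 0.1
neg_log10_padj = [-math.log10(p) for p in padj]         # volcano-plot y-axis
print(padj, detected)
```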
[Plot for Fig. S17. Panel A: taxa (470392, 546227, 179655, 72853, 179381, 470172, 294672, 245916, 193312, 584417, 208565, 469991, 358798, 319455, 426436, 299777, 204932, 329241, 178915, 208543, 290251, 204072, 469888, 203708, 308873, 16076) against Log2FoldChange, with Debiased and Non-debiased points. Panel B: -Log10Padj against Log2FoldChange; labeled taxa: 183824, 294672, 192252, 182994, 194924, 204072, 470392.]
Fig. S17: Effect size plot (A) of differential taxa at FDR level of 0.1 and volcano plot (B) for
the IBD dataset. The “Debiased” points represent the bias-corrected regression coefficients,
and “Non-debiased” points represent the original (biased) regression coefficients. The error
bars represent the 95% CIs of the “Debiased” points. The taxa in black are detected by
LinDA, taxa in red are detected solely by LinDA, and taxa in blue are missed by LinDA
but detected by two or more of the other methods (A).
[Plot for Fig. S18. Panel A: taxa (Otu429, Otu427, Otu417, Otu411, Otu409, Otu389, Otu341, Otu336, Otu320, Otu296, Otu294, Otu287, Otu264, Otu261, Otu249, Otu236, Otu235, Otu232, Otu207, Otu204, Otu176, Otu172, Otu151, Otu147, Otu98, Otu95, Otu92, Otu70, Otu58, Otu50, Otu47, Otu23, Otu22, Otu20, Otu19, Otu4, Otu2) against Log2FoldChange, with Debiased and Non-debiased points. Panel B: -Log10Padj against Log2FoldChange; labeled taxa: Otu4, Otu20, Otu235, Otu236, Otu685, Otu336, Otu454, Otu429, Otu627, Otu264, Otu494, Otu722, Otu58, Otu98, Otu287, Otu735, Otu294, Otu411, Otu480, Otu555, Otu341, Otu204.]
Fig. S18: Effect size plot (A) of differential taxa at FDR level of 0.1 and volcano plot (B) for
the RA dataset. The “Debiased” points represent the bias-corrected regression coefficients,
and “Non-debiased” points represent the original (biased) regression coefficients. The error
bars represent the 95% CIs of the “Debiased” points. The taxa in black are detected by
LinDA, taxa in red are detected solely by LinDA, and taxa in blue are missed by LinDA
but detected by two or more of the other methods (A).
[Plot for Fig. S19, panel title "Smoke: n v.s. y". Panel A: Taxa (573384, 570119, 529659, 518865, 484437, 470738, 469920, 428237, 239506, 237323, 191687, 186277, 185969, 149109, 94166, 92743, 86047, 74391, 70671, 15555, 3931) against Log2FoldChange, with Debiased and Non-debiased points. Panel B: volcano plot against Log2FoldChange; labeled taxa: 470738, 3931, 428237, 191687, 15555, 186277; legend: padj>0.1 & lfc<=1, padj>0.1 & lfc>1, padj<=0.1 & lfc<=1, padj<=0.1 & lfc>1.]
Fig. S19: Effect size plot (A) of differential taxa detected by LinDA at FDR level of 0.1
and volcano plot (B) for the SMOKE dataset. The “Debiased” points represent the
bias-corrected regression coefficients, and “Non-debiased” points represent the original (biased)
regression coefficients. The error bars represent the 95% CIs of the “Debiased” points. The
taxa in black are detected by LinDA, taxa in red are detected by LinDA but missed by
MaAsLin2, and no taxa are detected by MaAsLin2 but missed by LinDA (A).
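The volcano plot legend groups taxa by two cutoffs: adjusted p-value at 0.1 and log2 fold change at 1. A minimal sketch of that grouping follows. It assumes that "lfc" in the legend means the absolute log2 fold change, since the plot also shows negative values; the function name `volcano_category` and its default cutoffs are hypothetical.

```python
def volcano_category(padj: float, lfc: float,
                     alpha: float = 0.1, lfc_cut: float = 1.0) -> str:
    """Return the legend label for one taxon.

    Assumption: the legend's 'lfc' is compared in absolute value.
    """
    p_part = f"padj<={alpha:g}" if padj <= alpha else f"padj>{alpha:g}"
    l_part = f"lfc>{lfc_cut:g}" if abs(lfc) > lfc_cut else f"lfc<={lfc_cut:g}"
    return f"{p_part} & {l_part}"


# Toy inputs (hypothetical values, not read off the figure).
for padj, lfc in [(0.03, -2.4), (0.5, 0.2), (0.08, 0.7), (0.3, 1.5)]:
    print(padj, lfc, "->", volcano_category(padj, lfc))
```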
[Plot for Fig. S20. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, metagenomeSeq2, MaAsLin2, Wilcoxon, DESeq2, edgeR, metagenomeSeq.]
Fig. S20: Full performance comparison (S0C0: log normal abundance distribution, a binary
covariate). Empirical false discovery rate (A) and true positive rates (B) were averaged
over 100 simulation runs. The dashed horizontal line (A) indicates the target FDR level of
0.05.
[Plot for Fig. S21. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, MaAsLin2, Spearman, DESeq2, edgeR, metagenomeSeq.]
Fig. S21: Full performance comparison (S0C1: log normal abundance distribution, a
continuous covariate). Empirical false discovery rate (A) and true positive rates (B) were
averaged over 100 simulation runs. The dashed horizontal line (A) indicates the target
FDR level of 0.05.
[Plot for Fig. S22. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, MaAsLin2, Wilcoxon, DESeq2, edgeR, metagenomeSeq.]
Fig. S22: Full performance comparison (S0C2: log normal abundance distribution, a binary
variable of interest and two confounders). Empirical false discovery rate (A) and true
positive rates (B) were averaged over 100 simulation runs. The dashed horizontal line (A)
indicates the target FDR level of 0.05.
[Plot for Fig. S23. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, metagenomeSeq2, MaAsLin2, Wilcoxon, DESeq2, edgeR, metagenomeSeq.]
Fig. S23: Full performance comparison (S1C0: zero inflated absolute abundances, a binary
covariate). Empirical false discovery rate (A) and true positive rates (B) were averaged
over 100 simulation runs. The dashed horizontal line (A) indicates the target FDR level of
0.05.
[Plot for Fig. S24. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, metagenomeSeq2, MaAsLin2, Wilcoxon, DESeq2, edgeR, metagenomeSeq.]
Fig. S24: Full performance comparison (S2C0: correlated absolute abundances, a binary
covariate). Empirical false discovery rate (A) and true positive rates (B) were averaged
over 100 simulation runs. The dashed horizontal line (A) indicates the target FDR level of
0.05.
[Plot for Fig. S25. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, metagenomeSeq2, MaAsLin2, Wilcoxon, DESeq2, edgeR, metagenomeSeq.]
Fig. S25: Full performance comparison (S3C0: gamma abundance distribution, a binary
covariate). Empirical false discovery rate (A) and true positive rates (B) were averaged
over 100 simulation runs. The dashed horizontal line (A) indicates the target FDR level of
0.05.
[Plot for Fig. S26. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, metagenomeSeq2, MaAsLin2, Wilcoxon, DESeq2, edgeR, metagenomeSeq.]
Fig. S26: Full performance comparison (S4C0: smaller m, a binary covariate). Empirical
false discovery rate (A) and true positive rates (B) were averaged over 1000 simulation
runs. The dashed horizontal line (A) indicates the target FDR level of 0.05.
[Plot for Fig. S27. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 20 and n = 30; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, metagenomeSeq2, MaAsLin2, Wilcoxon, DESeq2, edgeR, metagenomeSeq.]
Fig. S27: Full performance comparison (S5C0: smaller n, a binary covariate). Empirical
false discovery rate (A) and true positive rates (B) were averaged over 100 simulation runs.
The dashed horizontal line (A) indicates the target FDR level of 0.05.
[Plot for Fig. S28. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, metagenomeSeq2, MaAsLin2, Wilcoxon, DESeq2, edgeR, metagenomeSeq.]
Fig. S28: Full performance comparison (S6C0: 10-fold difference in library size, a binary
covariate). Empirical false discovery rate (A) and true positive rates (B) were averaged
over 100 simulation runs. The dashed horizontal line (A) indicates the target FDR level of
0.05.
[Plot for Fig. S29. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, metagenomeSeq2, MaAsLin2, Wilcoxon, DESeq2, edgeR, metagenomeSeq.]
Fig. S29: Full performance comparison (S7C0: negative binomial abundance distribution,
a binary covariate). Empirical false discovery rate (A) and true positive rates (B) were
averaged over 100 simulation runs. The dashed horizontal line (A) indicates the target
FDR level of 0.05.
[Plot for Fig. S30. Panels: A (Empirical False Discovery Rate) and B (True Positive Rate), each against Signal Strength (2, 4, 6); columns: n = 50 and n = 200; rows: Sparse Signal and Dense Signal. Methods: LinDA, ANCOM-BC, ALDEx2, metagenomeSeq2, MaAsLin2, Wilcoxon, DESeq2, edgeR, metagenomeSeq.]
Fig. S30: Full performance comparison (S0C0 with strong compositional effects). Empirical
false discovery rate (A) and true positive rates (B) were averaged over 100 simulation runs.
The dashed horizontal line (A) indicates the target FDR level of 0.05.