
Molecule TOPKAT_Aerobic_Biodegradability

C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | Dicofol | Triphenyltin_hydroxide | Phenol,_5-chloro-2-(2,4-dichlorophenoxy)-
Actual Endpoint | Non-Degradable | Non-Degradable | Non-Degradable
Predicted Endpoint | Non-Degradable | Non-Degradable | Non-Degradable
Distance | 0.611 | 0.671 | 0.693
Reference | Environmental Toxicology & Chemistry 18(9), 1763-1768, 1999. | Environmental Toxicology & Chemistry 18(9), 1763-1768, 1999. | Environmental Toxicology & Chemistry 18(9), 1763-1768, 1999.

Model Prediction
Prediction: Non-Degradable
Probability: 0.0123
Enrichment: 0.0283
Bayesian Score: -23.5
Mahalanobis Distance: 16.5
Mahalanobis Distance p-value: 2.84e-017

Prediction: Positive if the Bayesian score is above the estimated best cutoff value from minimizing the false positive and false negative rate.
Probability: The estimated probability that the sample is in the positive category. This assumes that the Bayesian score follows a normal distribution and is different from the prediction using a cutoff.
Enrichment: An estimate of enrichment, that is, the increased likelihood (versus random) of this sample being in the category.
Bayesian Score: The standard Laplacian-modified Bayesian score.
Mahalanobis Distance: The Mahalanobis distance (MD) is the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution

Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Degradable in training set
SCFP_12 | 136597326 | 0.36 | 179 out of 307
SCFP_12 | 0 | 0.223 | 328 out of 646

Top Features for negative contribution
Fingerprint | Bit/Smiles | Score | Degradable in training set
SCFP_12 | -52074512 | -1.87 | 5 out of 93
SCFP_12 | -601571304 | -1.84 | 5 out of 91
SCFP_12 | 26 | -1.62 | 0 out of 10
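
The per-feature scores listed for this model are consistent with the commonly published form of the Laplacian-modified Bayesian estimator, in which a feature seen in n training samples, k of them positive, contributes log((k + 1) / (n * p + 1)), where p is the overall fraction of positives in the training set. The report does not state p; the value 0.406 used below is an assumption, back-solved from the listed rows, and the function name is hypothetical. A minimal sketch:

```python
import math

# Assumed base rate of the positive class (Degradable). It is not stated in
# the report; it was inferred here from the listed feature rows.
BASE_RATE = 0.406

def laplacian_feature_score(k_positive, n_total, base_rate=BASE_RATE):
    """Log of the Laplacian-corrected enrichment of one fingerprint feature."""
    return math.log((k_positive + 1) / (n_total * base_rate + 1))

# (positives, occurrences) pairs copied from the feature tables of this model
rows = [(179, 307), (328, 646), (5, 93), (5, 91), (0, 10)]
for k, n in rows:
    print(k, n, round(laplacian_feature_score(k, n), 3))
```

Under this assumption the five rows come out close to the reported 0.36, 0.223, -1.87, -1.84 and -1.62. In this formulation the overall Bayesian score of a molecule would be the sum of such terms over all of its features, most of which the report does not list.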


Molecule TOPKAT_Ames_Mutagenicity

C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | 24225-71-6 | 1250-95-9 | Cyclohexanecarbamic acid; 1-phenyl-1-(3;4-xylyl)-2-propynyl ester
Actual Endpoint | Mutagen | Mutagen | Mutagen
Predicted Endpoint | Mutagen | Non-Mutagen | Mutagen
Distance | 0.544 | 0.560 | 0.567
Reference | Kazius et al., J. Med. Chem. (2005) 48, 312-320 | Kazius et al., J. Med. Chem. (2005) 48, 312-320 | EMIC

Model Prediction
Prediction: Non-Mutagen
Probability: 0.458
Enrichment: 0.82
Bayesian Score: -8.45
Mahalanobis Distance: 8.49
Mahalanobis Distance p-value: 0.954

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution

Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Mutagen in training set
SCFP_12 | -1798344807 | 0.437 | 81 out of 91
SCFP_12 | 1684592399 | 0.344 | 27 out of 33
SCFP_12 | 667776369 | 0.335 | 336 out of 420

Top Features for negative contribution
Fingerprint | Bit/Smiles | Score | Mutagen in training set
SCFP_12 | -300280774 | -1.51 | 3 out of 30
SCFP_12 | -1903175541 | -1.51 | 3 out of 30
SCFP_12 | -1211133908 | -1.51 | 2 out of 22
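
The Mahalanobis distance and its p-value, as defined in the glossary, can be illustrated on toy data. The sketch below is restricted to two descriptors so that it needs only the standard library; in two dimensions the normal-theory tail probability of the squared distance has the closed form exp(-MD^2 / 2). The training points and function names are hypothetical, and the report does not say how many dimensions TOPKAT uses, so its own values cannot be reproduced this way.

```python
import math

def mahalanobis_2d(train, query):
    """Mahalanobis distance from `query` to the centroid of 2-D `train` points."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum((x - mx) ** 2 for x, _ in train) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in train) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in train) / (n - 1)
    det = sxx * syy - sxy ** 2  # must be non-zero for the inverse to exist
    dx, dy = query[0] - mx, query[1] - my
    # d^T * inverse(cov) * d, written out with the explicit 2x2 inverse
    md_squared = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(md_squared)

def md_p_value_2d(md):
    """Fraction of a 2-D normal population lying at distance >= md."""
    return math.exp(-md ** 2 / 2)

# Hypothetical training descriptors and query point
train = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
md = mahalanobis_2d(train, (3.0, 1.0))
print(round(md, 3), round(md_p_value_2d(md), 3))
```

A query at the centroid gives a distance of 0 and a p-value of 1; moving away from the centroid raises the distance and lowers the p-value, which is the sense in which both numbers are used as trust indicators in this report.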


Molecule TOPKAT_Developmental_Toxicity_Potential

C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | Benzbromarone | Triclabendazole | Hexachlorophene
Actual Endpoint | Toxic | Toxic | Toxic
Predicted Endpoint | Toxic | Toxic | Toxic
Distance | 0.612 | 0.633 | 0.645
Reference | Shinryo to Shinaku 16:1521-1545; 1979 | Toxicology 43(3):283-287; 1987 | Teratology 12:83-88; 1975

Model Prediction
Prediction: Non-Toxic
Probability: 0.452
Enrichment: 0.859
Bayesian Score: -2.72
Mahalanobis Distance: 10.3
Mahalanobis Distance p-value: 0.009

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution

Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Toxic in training set
SCFP_6 | 2096873229 | 0.271 | 1 out of 1
SCFP_6 | -1211866396 | 0.21 | 8 out of 12
SCFP_6 | -1794884847 | 0.202 | 6 out of 9

Top Features for negative contribution
Fingerprint | Bit/Smiles | Score | Toxic in training set
SCFP_6 | 1684592399 | -0.718 | 0 out of 2
SCFP_6 | -1794974220 | -0.55 | 2 out of 8
SCFP_6 | 2010506287 | -0.422 | 0 out of 1


Molecule TOPKAT_Mouse_Female_FDA_None_vs_Carcinogen

C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | Dronabinol | Quazepam | Hexachlorophene
Actual Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Predicted Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Distance | 0.614 | 0.614 | 0.661
Reference | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997

Model Prediction
Prediction: Non-Carcinogen
Probability: 0.205
Enrichment: 0.639
Bayesian Score: -6.28
Mahalanobis Distance: 14.2
Mahalanobis Distance p-value: 7.17e-006

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. OPS PC21 out of range. Value: 5.6716. Training min, max, SD, explained variance: -3.5557, 4.2395, 1.228, 0.0153.
2. Unknown ECFP_2 feature: -1305021906: [*]['?']
3. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
4. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]
5. Unknown ECFP_2 feature: -1068136657: [*][c](:[*]):[c](F):[c]([*]):[*]

Feature Contribution

Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
ECFP_6 | -1046436026 | 0.104 | 8 out of 23
ECFP_6 | -938530932 | 0.0661 | 8 out of 24

Top Features for negative contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
ECFP_6 | -219423964 | -0.935 | 0 out of 5
ECFP_6 | 1335691903 | -0.669 | 3 out of 22
ECFP_6 | -1952889961 | -0.657 | 0 out of 3
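
The applicability warning for OPS PC21 in this model amounts to a range check of the query value against the training minimum and maximum. A minimal sketch, with a hypothetical helper name, that also expresses the overshoot in units of the reported training SD (that ratio is an added illustration, not something the report computes):

```python
def range_check(value, train_min, train_max, train_sd):
    """Return (in_range, overshoot_in_sd) for one OPS component."""
    if value > train_max:
        overshoot = (value - train_max) / train_sd
    elif value < train_min:
        overshoot = (train_min - value) / train_sd
    else:
        overshoot = 0.0
    return overshoot == 0.0, overshoot

# Values copied from the OPS PC21 warning: value, training min, max, SD
in_range, overshoot = range_check(5.6716, -3.5557, 4.2395, 1.228)
print(in_range, round(overshoot, 2))
```

For PC21 the query value lies a little more than one training SD above the training maximum.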


Molecule TOPKAT_Mouse_Female_NTP

C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | Dicofol | 1-trans-delta(sup 9)-tetrahydrocannabinol | 4;4'-Thiobis-(6-tert-butyl-m-cresol)
Actual Endpoint | Non-Carcinogen | Carcinogen | Non-Carcinogen
Predicted Endpoint | Non-Carcinogen | Carcinogen | Non-Carcinogen
Distance | 0.599 | 0.604 | 0.666
Reference | NTP/TR-090 | NTP446 | NTP/TR-435

Model Prediction
Prediction: Carcinogen
Probability: 0.701
Enrichment: 1.78
Bayesian Score: 3.03
Mahalanobis Distance: 12.5
Mahalanobis Distance p-value: 5e-008

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown ECFP_2 feature: -1305021906: [*]['?']
3. Unknown ECFP_2 feature: -428002189: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
4. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
5. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]
6. Unknown ECFP_2 feature: -1068136657: [*][c](:[*]):[c](F):[c]([*]):[*]
7. Unknown ECFP_2 feature: 220735655: [*]:[c](:[*])F

Feature Contribution

Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
ECFP_8 | -1734834311 | 0.544 | 2 out of 2
ECFP_8 | 1806797765 | 0.378 | 1 out of 1
ECFP_8 | -1292275645 | 0.378 | 1 out of 1

Top Features for negative contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
ECFP_8 | -1085223908 | -0.268 | 6 out of 22
ECFP_8 | 734603939 | -0.104 | 57 out of 171
ECFP_8 | 767488533 | -0.0545 | 1 out of 3
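
The unknown-feature list for this model follows from the stated rule: a fingerprint feature of the query counts as unknown when it is absent from the training set or occurs there too infrequently. The sketch below assumes a hypothetical occurrence table, hypothetical feature ids, and a hypothetical minimum count of 2; the report gives none of these.

```python
def unknown_features(query_features, training_counts, min_count=2):
    """Query features that the training set has not seen often enough."""
    return sorted(
        f for f in set(query_features)
        if training_counts.get(f, 0) < min_count
    )

# Invented ids and counts, for illustration only
training_counts = {101: 40, 202: 1}
query = [101, 202, 303]
print(unknown_features(query, training_counts))
```

Feature 202 is flagged because it occurs only once in the hypothetical training set, and 303 because it does not occur at all.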

Molecule TOPKAT_Mouse_Male_FDA_None_vs_Carcinogen

C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | Dronabinol | Quazepam | Hexachlorophene
Actual Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Predicted Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Distance | 0.608 | 0.619 | 0.646
Reference | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997

Model Prediction
Prediction: Carcinogen
Probability: 0.271
Enrichment: 0.921
Bayesian Score: -1.36
Mahalanobis Distance: 22.1
Mahalanobis Distance p-value: 5.46e-024

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution

Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
FCFP_6 | 71953198 | 0.612 | 12 out of 23
FCFP_6 | 1115242110 | 0.38 | 2 out of 4
FCFP_6 | -1151884458 | 0.348 | 6 out of 15

Top Features for negative contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
FCFP_6 | -497728148 | -0.96 | 2 out of 26
FCFP_6 | -1194184847 | -0.423 | 0 out of 2
FCFP_6 | 1152389402 | -0.423 | 0 out of 2
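
In this model the prediction is Carcinogen although the reported probability is only 0.271. As the glossary notes, the prediction comes from comparing the Bayesian score with a cutoff, while the probability rests on a separate normal-distribution assumption, so the two can disagree. The sketch below tabulates the prediction and probability reported for the models up to this point and flags such cases; treating 0.5 as the dividing probability is an assumption made only for this check.

```python
# (model, prediction is the positive category, reported probability),
# copied from the Model Prediction blocks of the first six models
results = [
    ("Aerobic_Biodegradability", False, 0.0123),
    ("Ames_Mutagenicity", False, 0.458),
    ("Developmental_Toxicity_Potential", False, 0.452),
    ("Mouse_Female_FDA_None_vs_Carcinogen", False, 0.205),
    ("Mouse_Female_NTP", True, 0.701),
    ("Mouse_Male_FDA_None_vs_Carcinogen", True, 0.271),
]

def disagreements(rows, threshold=0.5):
    """Models whose cutoff-based call differs from a probability-based call."""
    return [name for name, positive, prob in rows
            if positive != (prob >= threshold)]

print(disagreements(results))
```

Flagged models are the ones where the cutoff-based call and the probability point in different directions, which is a reason to read both fields together with the Mahalanobis distance before relying on the result.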


Molecule TOPKAT_Mouse_Male_FDA_Single_vs_Multiple

C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | Nafenopin | Sertraline | Loratidine
Actual Endpoint | Single-Carcinogen | Single-Carcinogen | Single-Carcinogen
Predicted Endpoint | Single-Carcinogen | Single-Carcinogen | Single-Carcinogen
Distance | 0.708 | 0.724 | 0.741
Reference | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997

Model Prediction
Prediction: Single-Carcinogen
Probability: 0.148
Enrichment: 0.492
Bayesian Score: -11.1
Mahalanobis Distance: 22.5
Mahalanobis Distance p-value: 8.55e-012

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution

Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Multiple-Carcinogen in training set
FCFP_12 | 907007053 | 0.235 | 5 out of 11
FCFP_12 | -497728148 | 0.174 | 1 out of 2
FCFP_12 | 136597326 | 0.0722 | 18 out of 49

Top Features for negative contribution
Fingerprint | Bit/Smiles | Score | Multiple-Carcinogen in training set
FCFP_12 | -1151884458 | -1.11 | 0 out of 6
FCFP_12 | 1069584379 | -1.11 | 0 out of 6
FCFP_12 | 71476542 | -0.789 | 1 out of 10


Molecule TOPKAT_Mouse_Male_NTP

C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | 1-trans-delta(sup 9)-tetrahydrocannabinol | DICOFOL | 4;4'-THIOBIS(6-t-BUTYL-m-CRESOL)
Actual Endpoint | Carcinogen | Carcinogen | Non-Carcinogen
Predicted Endpoint | Carcinogen | Carcinogen | Non-Carcinogen
Distance | 0.598 | 0.600 | 0.672
Reference | NTP446 | NTP/TR-90 | NTP/TR-435

Model Prediction
Prediction: Carcinogen
Probability: 0.613
Enrichment: 1.56
Bayesian Score: 0.731
Mahalanobis Distance: 12.8
Mahalanobis Distance p-value: 3.28e-008

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution

Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
SCFP_12 | 1808857543 | 0.638 | 3 out of 3
SCFP_12 | -1798344807 | 0.377 | 1 out of 1
SCFP_12 | 1684592399 | 0.377 | 1 out of 1

Top Features for negative contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
SCFP_12 | -1794974220 | -0.555 | 0 out of 2
SCFP_12 | 1498989769 | -0.316 | 0 out of 1
SCFP_12 | -300280774 | -0.316 | 0 out of 1


Molecule TOPKAT_Ocular_Irritancy_Mild_vs_Moderate_Severe

C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | ANTHRAQUINONE;1-(2;4;6-TRIMETHYLPHENYLAMINO)- | o-Acetotoluidide; 6'-chloro-2-(p-chlorobenzyl(2-(pyrrolidinyl)ethyl)amino)-; | BENZILIC ACID; 4;4'-DICHLORO-; ISOPROPYL ESTER
Actual Endpoint | Moderate_Severe | Moderate_Severe | Moderate_Severe
Predicted Endpoint | Mild | Moderate_Severe | Moderate_Severe
Distance | 0.638 | 0.684 | 0.691
Reference | 28ZPAK-;242;72 | Arzneimittel-Forschung 9;167;59 | CIGET* -;-;77

Model Prediction
Prediction: Mild
Probability: 0.765
Enrichment: 1.11
Bayesian Score: -2.12
Mahalanobis Distance: 7.34
Mahalanobis Distance p-value: 0.992

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown FCFP_2 feature: -1151884458: ['?'][c](:[*]):[c](N):n:[*]

Feature Contribution

Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Moderate_Severe in training set
FCFP_10 | -497728148 | 0.356 | 24 out of 25
FCFP_10 | 136120670 | 0.206 | 53 out of 65
FCFP_10 | 3 | 0.165 | 383 out of 491

Top Features for negative contribution
Fingerprint | Bit/Smiles | Score | Moderate_Severe in training set
FCFP_10 | 900733322 | -0.874 | 3 out of 13
FCFP_10 | -1861645784 | -0.598 | 9 out of 26
FCFP_10 | -620155118 | -0.598 | 9 out of 26


Molecule TOPKAT_Ocular_Irritancy_None_vs_Irritant

C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | ANTHRAQUINONE;1-(2;4;6-TRIMETHYLPHENYLAMINO)- | o-Acetotoluidide; 6'-chloro-2-(p-chlorobenzyl(2-(pyrrolidinyl)ethyl)amino)-; | BENZILIC ACID; 4;4'-DICHLORO-; ISOPROPYL ESTER
Actual Endpoint | Irritant | Irritant | Irritant
Predicted Endpoint | Irritant | Irritant | Irritant
Distance | 0.630 | 0.665 | 0.670
Reference | 28ZPAK-;242;72 | Arzneimittel-Forschung 9;167;59 | CIGET* -;-;77

Model Prediction
Prediction: Irritant
Probability: 1
Enrichment: 1.18
Bayesian Score: 1.79
Mahalanobis Distance: 5.98
Mahalanobis Distance p-value: 1

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown FCFP_2 feature: -1151884458: ['?'][c](:[*]):[c](N):n:[*]

Feature Contribution

Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Irritant in training set
FCFP_12 | 1747237384 | 0.208 | 44 out of 44
FCFP_12 | 17 | 0.189 | 48 out of 49
FCFP_12 | 71476542 | 0.175 | 81 out of 84

Top Features for negative contribution
Fingerprint | Bit/Smiles | Score | Irritant in training set
FCFP_12 | 690511177 | -0.268 | 1 out of 2
FCFP_12 | 136597326 | 0 | 612 out of 753
FCFP_12 | 203677720 | 0 | 319 out of 382


Molecule TOPKAT_Rat_Female_FDA_None_vs_Carcinogen

C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | Dronabinol | Hexachlorophene | Tretinoin
Actual Endpoint | Non-Carcinogen | Non-Carcinogen | Carcinogen
Predicted Endpoint | Non-Carcinogen | Non-Carcinogen | Carcinogen
Distance | 0.625 | 0.672 | 0.692
Reference | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997

Model Prediction
Prediction: Non-Carcinogen
Probability: 0.199
Enrichment: 0.617
Bayesian Score: -7.37
Mahalanobis Distance: 12.2
Mahalanobis Distance p-value: 0.00291

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown ECFP_2 feature: -1305021906: [*]['?']
3. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
4. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]

Feature Contribution

Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
ECFP_12 | -428002189 | 0.208 | 1 out of 2
ECFP_12 | -1952889961 | 0.158 | 2 out of 5
ECFP_12 | 226796801 | 0.134 | 3 out of 8

Top Features for negative contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
ECFP_12 | -175507738 | -1.56 | 0 out of 12
ECFP_12 | 1335691903 | -1.11 | 2 out of 26
ECFP_12 | 767488533 | -0.941 | 0 out of 5


Molecule TOPKAT_Rat_Female_NTP

C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Structural Similar Compounds
Name | 1-trans-delta(sup 9)-tetrahydrocannabinol | 1-TRANS-DELTA(9)-TETRAHYDROCANNABINOL | HEXACHLOROPHENE
Actual Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Predicted Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Distance | 0.596 | 0.596 | 0.650
Reference | NTP446 | TR-446 | TR-40

Model Prediction
Prediction: Carcinogen
Probability: 0.544
Enrichment: 1.2
Bayesian Score: 0.931
Mahalanobis Distance: 17.5
Mahalanobis Distance p-value: 1.59e-020

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. OPS PC11 out of range. Value: 3.6073. Training min, max, SD, explained variance: -2.3897, 3.1905, 1.314, 0.0302.
2. Unknown FCFP_2 feature: -1029533685: [*]:[c](:[*])C(F)(F)F

Feature Contribution

Top features for positive contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
FCFP_12 | -617729047 | 0.575 | 3 out of 3
FCFP_12 | 71953198 | 0.423 | 26 out of 40
FCFP_12 | -1861645784 | 0.387 | 6 out of 9

Top Features for negative contribution
Fingerprint | Bit/Smiles | Score | Carcinogen in training set
FCFP_12 | 907007053 | -0.487 | 5 out of 21
FCFP_12 | 690511177 | -0.349 | 0 out of 1
FCFP_12 | 1115242110 | -0.349 | 0 out of 1


Molecule TOPKAT_Rat_Male_FDA_None_vs_Carcinogen
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Non-Carcinogen
Probability: 0.313
Enrichment: 0.935
Bayesian Score: -1.61
Mahalanobis Distance: 15.3
Mahalanobis Distance p-value: 1.52e-007

Prediction: Positive if the Bayesian score is above the estimated best cutoff value from minimizing the false positive and false negative rate.
Probability: The estimated probability that the sample is in the positive category. This assumes that the Bayesian score follows a normal distribution and is different from the prediction using a cutoff.
Enrichment: An estimate of enrichment, that is, the increased likelihood (versus random) of this sample being in the category.
Bayesian Score: The standard Laplacian-modified Bayesian score.
Mahalanobis Distance: The Mahalanobis distance (MD) is the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Structural Similar Compounds
Name | Dronabinol | Clotrimazole | Tretinoin
Actual Endpoint | Non-Carcinogen | Non-Carcinogen | Carcinogen
Predicted Endpoint | Non-Carcinogen | Non-Carcinogen | Carcinogen
Distance | 0.613 | 0.671 | 0.689
Reference | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score Carcinogen in training set
SCFP_6 2010506287 0.415 1 out of 1

SCFP_6 1334878018 0.333 8 out of 17

SCFP_6 384920865 0.322 11 out of 24

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score Carcinogen in
training set
SCFP_6 -1211866396 -1.1 2 out of 25

SCFP_6 2096873229 -0.496 0 out of 2

SCFP_6 -1211133908 -0.496 0 out of 2
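
The feature scores listed for this model appear consistent with the usual Laplacian-corrected Bayesian weight, ln((A + 1) / (N * P + 1)), where A is the number of carcinogens containing the feature, N is the number of training compounds containing it, and P is the baseline carcinogen fraction. The report does not state P. In the sketch below it is back-calculated from the "1 out of 1" row, which is an assumption, and the function names are illustrative rather than part of TOPKAT.

```python
import math

def laplacian_weight(active_with_feature, total_with_feature, base_rate):
    """Laplacian-corrected log weight for one fingerprint feature."""
    return math.log((active_with_feature + 1) / (total_with_feature * base_rate + 1))

def infer_base_rate(score, active_with_feature, total_with_feature):
    """Solve the weight formula for the base rate, given one reported row."""
    return ((active_with_feature + 1) / math.exp(score) - 1) / total_with_feature

# Assumption: base rate recovered from the row "SCFP_6 2010506287 0.415 1 out of 1".
base_rate = infer_base_rate(0.415, 1, 1)

# (reported score, carcinogens with feature, compounds with feature)
rows = [(0.333, 8, 17), (0.322, 11, 24), (-1.1, 2, 25), (-0.496, 0, 2)]
for reported, a, n in rows:
    # Print the reported score next to the recomputed one for comparison.
    print(reported, round(laplacian_weight(a, n, base_rate), 3))
```

If the recomputed values track the reported ones, the "N out of M" column and the score column carry the same information up to the base rate.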


Molecule TOPKAT_Rat_Male_NTP
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Non-Carcinogen
Probability: 0.462
Enrichment: 0.908
Bayesian Score: -3.42
Mahalanobis Distance: 9.01
Mahalanobis Distance p-value: 0.023

Prediction: Positive if the Bayesian score is above the estimated best cutoff value from minimizing the false positive and false negative rate.
Probability: The estimated probability that the sample is in the positive category. This assumes that the Bayesian score follows a normal distribution and is different from the prediction using a cutoff.
Enrichment: An estimate of enrichment, that is, the increased likelihood (versus random) of this sample being in the category.
Bayesian Score: The standard Laplacian-modified Bayesian score.
Mahalanobis Distance: The Mahalanobis distance (MD) is the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Structural Similar Compounds
Name | 1-trans-delta(sup 9)-tetrahydrocannabinol | 1-trans-.delta.-9-Tetrahydrocannabinol | 4;4'-Thiobis-(6-tert-butyl-m-cresol)
Actual Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Predicted Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Distance | 0.604 | 0.604 | 0.665
Reference | NTP446 | NTP/TR-446 | NTP/TR-435

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. OPS PC8 out of range. Value: 4.8241. Training min, max, SD, explained variance: -3.1888, 3.6693, 1.511, 0.0414.
2. Unknown ECFP_2 feature: -1305021906: [*]['?']
3. Unknown ECFP_2 feature: -428002189: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
4. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
5. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]
6. Unknown ECFP_2 feature: -1068136657: [*][c](:[*]):[c](F):[c]([*]):[*]
7. Unknown ECFP_2 feature: 220735655: [*]:[c](:[*])F

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score Carcinogen in training set
ECFP_12 -219423964 0.575 7 out of 7

ECFP_12 -181568884 0.511 9 out of 10

ECFP_12 -1734834311 0.405 2 out of 2

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score Carcinogen in
training set
ECFP_12 -175507738 -0.693 0 out of 2

ECFP_12 -1292275645 -0.406 0 out of 1


ECFP_12 226796801 -0.406 0 out of 1
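
The Model Applicability list for the TOPKAT_Rat_Male_NTP model flags OPS PC8 as out of range. A minimal sketch of that kind of check, using the values quoted in the list, is given here. Expressing the overshoot in training standard deviations is an added illustration, not something the report computes, and `range_check` is a hypothetical helper name.

```python
def range_check(value, train_min, train_max, train_sd):
    """Return (in_range, overshoot expressed in training standard deviations)."""
    if value > train_max:
        return False, (value - train_max) / train_sd
    if value < train_min:
        return False, (train_min - value) / train_sd
    return True, 0.0

# OPS PC8: value 4.8241; training min -3.1888, max 3.6693, SD 1.511.
in_range, overshoot_sd = range_check(4.8241, -3.1888, 3.6693, 1.511)
print(in_range, round(overshoot_sd, 2))
```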
Molecule TOPKAT_Skin_Irritancy_None_vs_Irritant
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Non-Irritant
Probability: 0.968
Enrichment: 1.05
Bayesian Score: -1.16
Mahalanobis Distance: 7.21
Mahalanobis Distance p-value: 0.991

Prediction: Positive if the Bayesian score is above the estimated best cutoff value from minimizing the false positive and false negative rate.
Probability: The estimated probability that the sample is in the positive category. This assumes that the Bayesian score follows a normal distribution and is different from the prediction using a cutoff.
Enrichment: An estimate of enrichment, that is, the increased likelihood (versus random) of this sample being in the category.
Bayesian Score: The standard Laplacian-modified Bayesian score.
Mahalanobis Distance: The Mahalanobis distance (MD) is the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Structural Similar Compounds
Name | Anthraquinone, 1-(2,4,6-trimethylphenylamino)- | Phenol, 4,4'-isopropylidenebis(2,6-dichloro- | Aniline, 2,4-bis(o-methylphenoxy)-
Actual Endpoint | Irritant | Irritant | Irritant
Predicted Endpoint | Non-Irritant | Irritant | Non-Irritant
Distance | 0.666 | 0.683 | 0.717
Reference | 28ZPAK "Sbornik Vysledku Toxixologickeho Vysetreni Latek A Pripravku," Marhold, J.V., Institut Pro Vychovu Vedoucicn Pracovniku Chemickeho Prumyclu Praha, Czechoslovakia, 1972 Volume(issue)/page/year: -,242,1 | 85JCAE "Prehled Prumyslove Toxikologie; Organicke Latky," Marhold, J., Prague, Czechoslovakia, Avicenum, 1986 Volume(issue)/page/year: -,536,1986 | 85JCAE "Prehled Prumyslove Toxikologie; Organicke Latky," Marhold, J., Prague, Czechoslovakia, Avicenum, 1986 Volume(issue)/page/year: -,725,1986

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score Irritant in training set
FCFP_12 192331578 0.0756 6 out of 6

FCFP_12 -1185376954 0.0756 6 out of 6

FCFP_12 -1029533685 0.0756 6 out of 6

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score Irritant in training
set
FCFP_12 1069584379 -0.439 38 out of 65

FCFP_12 900733322 -0.153 3 out of 4


FCFP_12 367998008 -0.129 61 out of 76
Molecule TOPKAT_Skin_Sensitization_None_vs_Sensitizer
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Sensitizer
Probability: 0.851
Enrichment: 1.24
Bayesian Score: 1.41
Mahalanobis Distance: 7.96
Mahalanobis Distance p-value: 0.0403

Prediction: Positive if the Bayesian score is above the estimated best cutoff value from minimizing the false positive and false negative rate.
Probability: The estimated probability that the sample is in the positive category. This assumes that the Bayesian score follows a normal distribution and is different from the prediction using a cutoff.
Enrichment: An estimate of enrichment, that is, the increased likelihood (versus random) of this sample being in the category.
Bayesian Score: The standard Laplacian-modified Bayesian score.
Mahalanobis Distance: The Mahalanobis distance (MD) is the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Structural Similar Compounds
Name | Dehydroabietic Acid | Palustric acid | Abietic acid
Actual Endpoint | Sensitizer | Sensitizer | Sensitizer
Predicted Endpoint | Sensitizer | Sensitizer | Sensitizer
Distance | 0.693 | 0.714 | 0.723
Reference | Contact Dermatitis (1989) 20:41 | Contact Dermatitis (1990) 23:90 | Contact Dermatitis (1989) 20:41

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown FCFP_2 feature: -1861645784: ['?'][c](:[*]):[c](:[cH]:[*])[c](:[*]):[*]
3. Unknown FCFP_2 feature: 690511177: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score Sensitizer in training set
FCFP_12 -497728148 0.304 15 out of 15

FCFP_12 71953198 0.258 18 out of 19

FCFP_12 907007053 0.254 17 out of 18

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score Sensitizer in
training set
FCFP_12 136120670 -0.245 15 out of 27

FCFP_12 554276193 -0.199 1 out of 2

FCFP_12 3 -0.0947 89 out of 136
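
The feature-contribution rows in this report share one shape: fingerprint type, bit identifier, score, and (for classification models) "N out of M" training counts. A minimal parsing sketch follows; the regular expression and the `parse_feature_row` helper are illustrative assumptions, not part of the report's tooling.

```python
import re

# Fingerprint type, bit id, score, then an optional "N out of M" count.
ROW = re.compile(
    r"^(?P<fingerprint>[A-Z]+_\d+)\s+(?P<bit>-?\d+)\s+(?P<score>-?\d+(?:\.\d+)?)"
    r"(?:\s+(?P<hits>\d+) out of (?P<total>\d+))?$"
)

def parse_feature_row(line):
    """Parse one feature-contribution row; return None if it does not match."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    row = {
        "fingerprint": m.group("fingerprint"),
        "bit": int(m.group("bit")),
        "score": float(m.group("score")),
    }
    if m.group("hits") is not None:
        row["hits"] = int(m.group("hits"))
        row["total"] = int(m.group("total"))
    return row

print(parse_feature_row("FCFP_12 136120670 -0.245 15 out of 27"))
print(parse_feature_row("FCFP_12 3 -0.0947 89 out of 136"))
```

Rows from the regression models, which carry no counts, parse to a dictionary without the `hits` and `total` keys.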


Molecule TOPKAT_Skin_Sensitization_Weak_vs_Strong
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Strong-Sensitizer
Probability: 0.984
Enrichment: 1.27
Bayesian Score: 2.78
Mahalanobis Distance: 6.93
Mahalanobis Distance p-value: 0.109

Prediction: Positive if the Bayesian score is above the estimated best cutoff value from minimizing the false positive and false negative rate.
Probability: The estimated probability that the sample is in the positive category. This assumes that the Bayesian score follows a normal distribution and is different from the prediction using a cutoff.
Enrichment: An estimate of enrichment, that is, the increased likelihood (versus random) of this sample being in the category.
Bayesian Score: The standard Laplacian-modified Bayesian score.
Mahalanobis Distance: The Mahalanobis distance (MD) is the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Structural Similar Compounds
Name | Dehydroabietic Acid | Palustric acid | Methyl abietate
Actual Endpoint | Strong-Sensitizer | Weak-Sensitizer | Strong-Sensitizer
Predicted Endpoint | Strong-Sensitizer | Weak-Sensitizer | Weak-Sensitizer
Distance | 0.725 | 0.747 | 0.754
Reference | Contact Dermatitis (1989) 20:41 | Contact Dermatitis (1990) 23:90 | Contact Dermatitis (1989) 20:41

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown FCFP_2 feature: -1861645784: ['?'][c](:[*]):[c](:[cH]:[*])[c](:[*]):[*]
3. Unknown FCFP_2 feature: 690511177: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score Strong-Sensitizer in training set
FCFP_12 16 0.232 165 out of 165

FCFP_12 1618154665 0.232 164 out of 164

FCFP_12 203677720 0.232 139 out of 139

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score Strong-Sensitizer
in training set
FCFP_12 136597326 -0.239 74 out of 119

FCFP_12 3 -0.131 61 out of 88

FCFP_12 0 0 186 out of 244


Molecule TOPKAT_Weight_of_Evidence_Rodent_Carcinogenicity
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Non-Carcinogen
Probability: 0.441
Enrichment: 0.857
Bayesian Score: -2.56
Mahalanobis Distance: 4.8
Mahalanobis Distance p-value: 0.999

Prediction: Positive if the Bayesian score is above the estimated best cutoff value from minimizing the false positive and false negative rate.
Probability: The estimated probability that the sample is in the positive category. This assumes that the Bayesian score follows a normal distribution and is different from the prediction using a cutoff.
Enrichment: An estimate of enrichment, that is, the increased likelihood (versus random) of this sample being in the category.
Bayesian Score: The standard Laplacian-modified Bayesian score.
Mahalanobis Distance: The Mahalanobis distance (MD) is the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Structural Similar Compounds
Name | Dicofol | Dronabinol | Hexachlorophene
Actual Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Predicted Endpoint | Non-Carcinogen | Non-Carcinogen | Non-Carcinogen
Distance | 0.603 | 0.619 | 0.672
Reference | NCI/NTP TR-90 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997 | US FDA (Centre for Drug Eval.& Res./Off. Testing & Res.) Sept. 1997

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score Carcinogen in training set
SCFP_8 2096873229 0.428 2 out of 2

SCFP_8 384920865 0.332 37 out of 55

SCFP_8 668219294 0.303 1 out of 1

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score Carcinogen in
training set
SCFP_8 -1211866396 -0.685 6 out of 27

SCFP_8 -1211133908 -0.67 0 out of 2

SCFP_8 -1850396224 -0.67 0 out of 2


Molecule TOPKAT_Carcinogenic_Potency_TD50_Mouse
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 2.98
Unit: mg/kg_body_weight/day
Mahalanobis Distance: 13.1
Mahalanobis Distance p-value: 6.91e-008

Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Structural Similar Compounds
Name | Dicofol | Chlorobenzilate | Clobuzarit s
Actual Endpoint (-log C) | 4.05158 | 3.53947 | 3.29645
Predicted Endpoint (-log C) | 3.80707 | 3.34564 | 3.52771
Distance | 0.696 | 0.720 | 0.727
Reference | CPDB | CPDB | CPDB

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. OPS PC16 out of range. Value: 4.7861. Training min, max, SD, explained variance: -3.1026, 4.016, 1.245, 0.0193.
2. Unknown ECFP_2 feature: -1305021906: [*]['?']
3. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
4. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
ECFP_6 655739385 0.229

ECFP_6 1572579716 0.225

ECFP_6 1559650422 0.203

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
ECFP_6 1996767644 -0.251

ECFP_6 642810091 -0.247

ECFP_6 -182236392 -0.232
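
The report describes the Mahalanobis distance as a generalization of the Euclidean distance that accounts for correlations among the X properties, measured to the center of the training data. The sketch below illustrates that idea for two descriptors only. The training points are hypothetical toy values, not TOPKAT descriptors, and `mahalanobis_2d` is an illustrative helper.

```python
import math

def mahalanobis_2d(point, training):
    """Mahalanobis distance from `point` to the mean of 2-D `training` data."""
    n = len(training)
    mx = sum(x for x, _ in training) / n
    my = sum(y for _, y in training) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x, _ in training) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in training) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in training) / (n - 1)
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mx, point[1] - my
    # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, with the 2x2 inverse written out.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

# Toy training set (hypothetical values), strongly correlated along x = y.
training = [(0.0, 0.0), (1.0, 1.2), (2.0, 1.8), (3.0, 3.1), (4.0, 3.9)]
print(mahalanobis_2d((4.0, 0.0), training))  # off the correlation axis
print(mahalanobis_2d((4.0, 4.0), training))  # along the correlation axis
```

Both query points lie at the same Euclidean distance from the training mean, yet the one that breaks the correlation pattern receives a much larger MD.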


Molecule TOPKAT_Carcinogenic_Potency_TD50_Rat
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 1.34
Unit: mg/kg_body_weight/day
Mahalanobis Distance: 16.9
Mahalanobis Distance p-value: 2.71e-017

Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Structural Similar Compounds
Name | Nafenopin s | Indomethacin | 1-(4-Chlorophenyl)-1-phenyl-2-propynyl carbamate
Actual Endpoint (-log C) | 4.45051 | 5.49293 | 4.51245
Predicted Endpoint (-log C) | 3.8403 | 4.9569 | 3.49372
Distance | 0.676 | 0.700 | 0.710
Reference | CPDB | CPDB | CPDB

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. OPS PC19 out of range. Value: -3.5336. Training min, max, SD, explained variance: -2.9709, 5.6065, 1.282, 0.0158.
2. Unknown FCFP_2 feature: -1029533685: [*]:[c](:[*])C(F)(F)F

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
FCFP_6 -1861645784 0.359

FCFP_6 690511177 0.293

FCFP_6 32 0.154

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
FCFP_6 16 -0.354

FCFP_6 17 -0.149

FCFP_6 0 -0.115
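
The Structural Similar Compounds table for the TOPKAT_Carcinogenic_Potency_TD50_Rat model lists actual and predicted endpoints (-log C) for the three nearest training compounds. One simple way to read them is the absolute error per neighbor, sketched here. This summary is an added illustration and is not a statistic the report itself provides.

```python
# (actual, predicted) endpoints in -log C, copied from the similar-compounds table.
neighbors = {
    "Nafenopin s": (4.45051, 3.8403),
    "Indomethacin": (5.49293, 4.9569),
    "1-(4-Chlorophenyl)-1-phenyl-2-propynyl carbamate": (4.51245, 3.49372),
}

# Absolute prediction error for each neighbor, then the mean over the three.
abs_errors = {
    name: abs(actual - predicted)
    for name, (actual, predicted) in neighbors.items()
}
mean_abs_error = sum(abs_errors.values()) / len(abs_errors)

for name, err in abs_errors.items():
    print(f"{name}: {err:.3f}")
print(f"mean absolute error: {mean_abs_error:.3f}")
```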
Molecule TOPKAT_Chronic_LOAEL
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 0.00206
Unit: g/kg_body_weight
Mahalanobis Distance: 25.3
Mahalanobis Distance p-value: 5.05e-016

Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Structural Similar Compounds
Name | CLOTRIMAZOLE | MIDAZOLAM.HCL | TRIAZOLAM
Actual Endpoint (-log C) | 4.53762 | 4.55867 | 3.83659
Predicted Endpoint (-log C) | 4.55826 | 3.94765 | 3.85527
Distance | 0.665 | 0.697 | 0.703
Reference | NDA-18813 | NDA-18654 | UPJ-33030

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown ECFP_6 feature: -1305021906: [*]['?']
3. Unknown ECFP_6 feature: -1046436026: [*]F
4. Unknown ECFP_6 feature: -428002189: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
5. Unknown ECFP_6 feature: -1734834311: ['?'][c](:[*]):[c](N):n:[*]
6. Unknown ECFP_6 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
7. Unknown ECFP_6 feature: 432684389: ['?'][c](:[*]):[*]
8. Unknown ECFP_6 feature: -938530932: [*]:[c](:[*])N
9. Unknown ECFP_6 feature: -175507738: [*]C([*])([*])[c](:[cH]:[*]):[cH]:[*]
10. Unknown ECFP_6 feature: -1068136657: [*][c](:[*]):[c](F):[c]([*]):[*]
11. Unknown ECFP_6 feature: 99947387: [*]:[c](:[*])Cl
12. Unknown ECFP_6 feature: 220735655: [*]:[c](:[*])F
13. Unknown ECFP_6 feature: -1952889961: [*]:[c](:[*])C(F)(F)F
14. Unknown ECFP_6 feature: 226796801: [*]C([*])([*])F
15. Unknown ECFP_6 feature: -181568884: [*]:[cH]:[c](:[cH]:[*])[c](:[*]):[*]
16. Unknown ECFP_6 feature: 767488533: [*]:[c](:[*])CC

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
ECFP_6 1559650422 0.129

FCFP_6 32 0.101

FCFP_6 3 0.0924

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
FCFP_6 -453677277 -0.0906
FCFP_6 136597326 -0.0815

FCFP_6 203677720 -0.0713


Molecule TOPKAT_Daphnia_EC50
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 0.47
Unit: mg/l
Mahalanobis Distance: 26.9
Mahalanobis Distance p-value: 2.83e-023

Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Structural Similar Compounds
Name | Fenarimol | Chlorophacinone | Bifenthrin
Actual Endpoint (-log C) | 4.687 | 5.944 | 8.422
Predicted Endpoint (-log C) | 4.53833 | 5.89478 | 6.66616
Distance | 0.629 | 0.661 | 0.669
Reference | Toropov and Benfenati, 2006, Bioorganic & Medicinal Chemistry, 14(8), 2779-2788 | Toropov and Benfenati, 2006, Bioorganic & Medicinal Chemistry, 14(8), 2779-2788 | Toropov and Benfenati, 2006, Bioorganic & Medicinal Chemistry, 14(8), 2779-2788

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. OPS PC39 out of range. Value: -4.339. Training min, max, SD, explained variance: -3.9736, 6.4329, 1.308, 0.0058.
2. Unknown ECFP_6 feature: -1305021906: [*]['?']
3. Unknown ECFP_6 feature: -427397688: ['?'][c](:[*]):[c](:[cH]:[*])[c](:[*]):[*]
4. Unknown ECFP_6 feature: -428002189: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
5. Unknown ECFP_6 feature: -1734834311: ['?'][c](:[*]):[c](N):n:[*]
6. Unknown ECFP_6 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
7. Unknown ECFP_6 feature: 432684389: ['?'][c](:[*]):[*]
8. Unknown ECFP_6 feature: -175507738: [*]C([*])([*])[c](:[cH]:[*]):[cH]:[*]
9. Unknown ECFP_6 feature: -1068136657: [*][c](:[*]):[c](F):[c]([*]):[*]
10. Unknown ECFP_6 feature: 220735655: [*]:[c](:[*])F
11. Unknown ECFP_6 feature: -1952889961: [*]:[c](:[*])C(F)(F)F
12. Unknown ECFP_6 feature: 226796801: [*]C([*])([*])F
13. Unknown ECFP_6 feature: -181568884: [*]:[cH]:[c](:[cH]:[*])[c](:[*]):[*]
14. Unknown ECFP_6 feature: 767488533: [*]:[c](:[*])CC

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
ECFP_6 642810091 0.148

ECFP_6 1572579716 0.114

FCFP_6 1069584379 0.0966

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
FCFP_6 0 -0.202
FCFP_6 136597326 -0.193

FCFP_6 17 -0.189
Molecule TOPKAT_Fathead_Minnow_LC50
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 5.89e-005
Unit: g/l
Mahalanobis Distance: 19.4
Mahalanobis Distance p-value: 2.93e-038

Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Structural Similar Compounds
Name | Dicofol | 4;4'-Isopropylidene-bis-(2;6-dichlorophenol) | 2;2'-Methylene-bis-(3;4;6-trichlorophenol)
Actual Endpoint (-log C) | 5.788 | 5.44 | 7.287
Predicted Endpoint (-log C) | 6.23295 | 6.63381 | 7.45687
Distance | 0.656 | 0.723 | 0.752
Reference | ATOCFM Volume 4 | ATOCFM Volume 4 | ATOCFM Volume 1

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
FCFP_2 1069584379 0.105

FCFP_2 71953198 0.0871

FCFP_2 16 0.0139

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
FCFP_2 0 -0.275

FCFP_2 136120670 -0.214

FCFP_2 3 -0.198
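
The neighbor endpoints for the TOPKAT_Fathead_Minnow_LC50 model are reported as -log C, while the model prediction is given in g/l. The report does not define C. Assuming it is a molar concentration (mol/l), the prediction can be put on the same scale using the molecular weight, as in this sketch; `neg_log_molar` is a hypothetical helper name.

```python
import math

def neg_log_molar(mass_conc_g_per_l, mol_weight_g_per_mol):
    """Convert a mass concentration (g/l) to -log10 of the molar concentration."""
    return -math.log10(mass_conc_g_per_l / mol_weight_g_per_mol)

# Prediction (5.89e-005 g/l) and Molecular Weight (393.78517) from this report page.
# Assumption: "C" in "-log C" is expressed in mol/l.
predicted_neg_log_c = neg_log_molar(5.89e-5, 393.78517)
print(round(predicted_neg_log_c, 3))
```

Under that assumption the converted value falls between the lowest and highest predicted endpoints listed for the three similar compounds (6.23295 and 7.45687).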
Molecule TOPKAT_Rat_Inhalational_LC50
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 9.62e+004
Unit: mg/m3/h
Mahalanobis Distance: 13.4
Mahalanobis Distance p-value: 8.91e-010

Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Structural Similar Compounds
Name | 1H-Benzimidazole; 5-chloro-6-(2;3-dichlorophenoxy)-2-(methylthio)- | Benzhydrol; 4;4'-dichloro-alpha-(trichloromethyl)- | 1H-1;2;4-Triazole; 1-((bis(4-fluorophenyl)methylsilyl)methyl)-
Actual Endpoint (-log C) | 2.2548 | 1.2677 | 1.5166
Predicted Endpoint (-log C) | 1.69815 | 2.08555 | 1.21248
Distance | 0.567 | 0.706 | 0.711
Reference | MDACAP Medicamentos de Actualidad. (J.R. Prous; S.A.; Apartado de Correos 540; 08080 Barcelona; Spain) V.1- 1965- Volume(issue)/page/year: 21;227;1985 | PEMNDP Pesticide Manual. (The British Crop Protection Council; 20 Bridport Rd.; Thornton Heath CR4 7QG; UK) V.1- 1968- Volume(issue)/page/year: 9;267;1991 | NTIS** National Technical Information Service. (Springfield; VA 22161) Formerly U.S. Clearinghouse for Scientific & Technical Information. Volume(issue)/page/year: OTS0543806

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. OPS PC6 out of range. Value: -2.7155. Training min, max, SD, explained variance: -2.4569, 8.1177, 1.698, 0.0400.
2. OPS PC21 out of range. Value: -3.7574. Training min, max, SD, explained variance: -3.0247, 4.4972, 1.058, 0.0155.
3. Unknown ECFP_2 feature: -1305021906: [*]['?']
4. Unknown ECFP_2 feature: -428002189: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
5. Unknown ECFP_2 feature: -1734834311: ['?'][c](:[*]):[c](N):n:[*]
6. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
7. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
ECFP_2 642810091 0.214

ECFP_2 1572579716 0.159

ECFP_2 -817402818 0.129

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
ECFP_2 863188371 -0.338
ECFP_2 734603939 -0.302

ECFP_2 -1046436026 -0.26


Molecule TOPKAT_Rat_Maximum_Tolerated_Dose_Feed
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 0.0645
Unit: g/kg_body_weight
Mahalanobis Distance: 10.4
Mahalanobis Distance p-value: 8.99e-006

Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Structural Similar Compounds
Name | DICOFOL | HEXACHLOROPHENE | CHLORBENZILATE
Actual Endpoint (-log C) | 3.9415 | 4.78017 | 3.38252
Predicted Endpoint (-log C) | 3.81186 | 3.20776 | 3.27894
Distance | 0.570 | 0.635 | 0.656
Reference | NCI/NTP TR-90 | NCI/NTP TR-40 | NCI/NTP TR-75

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
FCFP_2 3 0.0737

FCFP_2 136120670 0.064

FCFP_2 71953198 0.058

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
FCFP_2 71476542 -0.134

FCFP_2 203677720 -0.0829

FCFP_2 16 -0.0512
Molecule TOPKAT_Rat_Maximum_Tolerated_Dose_Gavage
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 0.0272
Unit: g/kg_body_weight
Mahalanobis Distance: 8.6
Mahalanobis Distance p-value: 0.000393

Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Structural Similar Compounds
Name | MIX OF 1,2,3,6,7,8- AND 1,2,3,7,8,9-HCDD | 2,3,7,8-TCDD | CHLORPHENIRAMINE MALEATE
Actual Endpoint (-log C) | 7.89304 | 9.66271 | 3.96188
Predicted Endpoint (-log C) | 7.3873 | 7.00828 | 3.83117
Distance | 0.860 | 0.886 | 0.901
Reference | NCI/NTP TR-198 | NCI/NTP TR-201 | NCI/NTP TR-317

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. Num_AromaticRings out of range. Value: 3. Training min, max, mean, SD: 0, 2, 0.5625, 0.693.
2. OPS PC7 out of range. Value: -2.8996. Training min, max, SD, explained variance: -2.8003, 2.9332, 1.16, 0.0416.
3. Unknown FCFP_2 feature: -1861645784: ['?'][c](:[*]):[c](:[cH]:[*])[c](:[*]):[*]
4. Unknown FCFP_2 feature: 690511177: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
5. Unknown FCFP_2 feature: -1029533685: [*]:[c](:[*])C(F)(F)F

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
FCFP_2 32 0.526
FCFP_2 367998008 0.413

FCFP_2 71953198 0.113

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
FCFP_2 136597326 -0.489

FCFP_2 203677720 -0.406

FCFP_2 0 -0.29
Molecule TOPKAT_Rat_Oral_LD50
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 0.184
Unit: g/kg_body_weight
Mahalanobis Distance: 16.4
Mahalanobis Distance p-value: 0.15

Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Structural Similar Compounds
Name | BENZBROMARONE | PIMOZIDE | MICONAZOLE. HNO3 (HNO3 STRIPPED)
Actual Endpoint (-log C) | 3.233 | 2.623 | 2.655
Predicted Endpoint (-log C) | 2.84957 | 2.83946 | 2.71916
Distance | 0.547 | 0.652 | 0.656
Reference | IYKEDH 10;232;79 | NIIRDN 6;639;82 | NIIRDN 6;809;82

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown ECFP_2 feature: -1305021906: [*]['?']
3. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
4. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]
5. Unknown ECFP_2 feature: -1068136657: [*][c](:[*]):[c](F):[c]([*]):[*]
6. Unknown FCFP_6 feature: 16: [*][c](:[*]):[*]
7. Unknown FCFP_6 feature: -1861645784: ['?'][c](:[*]):[c](:[cH]:[*])[c](:[*]):[*]
8. Unknown FCFP_6 feature: 1618154665: [*][c](:[*]):[cH]:[c]([*]):[*]
9. Unknown FCFP_6 feature: 690511177: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
10. Unknown FCFP_6 feature: 1747237384: [*][c](:[*]):n:[c]([*]):[*]
11. Unknown FCFP_6 feature: -1151884458: ['?'][c](:[*]):[c](N):n:[*]
12. Unknown FCFP_6 feature: 1069584379: [*]:[c](:[*])N
13. Unknown FCFP_6 feature: 71476542: [*]:[c](:[*])Cl

Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
FCFP_6 71953198 0.392

ECFP_6 -1046436026 0.349

ECFP_6 642810091 0.281

Top Features for negative contribution


Fingerprint Bit/Smiles Feature Structure Score
ECFP_6 226796801 -0.32

ECFP_6 -817402818 -0.263


ECFP_6 -175507738 -0.262
