C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Non-Mutagen
Probability: 0.458
Enrichment: 0.82
Bayesian Score: -8.45
Mahalanobis Distance: 8.49
Mahalanobis Distance p-value: 0.954

Prediction: Positive if the Bayesian score is above the estimated best cutoff value from minimizing the false positive and false negative rate.
Probability: The estimated probability that the sample is in the positive category. This assumes that the Bayesian score follows a normal distribution and is different from the prediction using a cutoff.
Enrichment: An estimate of enrichment, that is, the increased likelihood (versus random) of this sample being in the category.
Bayesian Score: The standard Laplacian-modified Bayesian score.
Mahalanobis Distance: The Mahalanobis distance (MD) is the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.
Mahalanobis Distance p-value: The p-value gives the fraction of training data with an MD greater than or equal to the one for the given sample, assuming normally distributed data. The smaller the p-value, the less trustworthy the prediction. For highly non-normal X properties (e.g., fingerprints), the MD p-value is wildly inaccurate.

Actual Endpoint     Mutagen      Mutagen      Mutagen
Predicted Endpoint  Mutagen      Non-Mutagen  Mutagen
Distance            0.544        0.560        0.567
Reference           (1) Kazius et al., J. Med. Chem. (2005) 48, 312-320
                    (2) Kazius et al., J. Med. Chem. (2005) 48, 312-320
                    (3) EMIC

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution
Top features for positive contribution
Fingerprint  Bit/Smiles   Feature Structure  Score  Mutagen in training set
SCFP_12      -1798344807                     0.437  81 out of 91
SCFP_12      1684592399                      0.344  27 out of 33
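The Score column in the Feature Contribution table is consistent with the Laplacian-corrected estimator commonly published for this kind of Bayesian model, log((A + 1) / (N * P + 1)), where A of the N training molecules containing the feature are mutagens and P is the overall mutagen fraction of the training set. The report states neither the formula nor P, so both are assumptions here. The sketch below back-solves P from each reported row and checks that the two rows imply nearly the same value; the function names are illustrative only.

```python
import math

def laplacian_weight(n_active_with_feature, n_with_feature, base_rate):
    """Laplacian-corrected log weight of one fingerprint feature (assumed formula)."""
    return math.log((n_active_with_feature + 1.0) / (n_with_feature * base_rate + 1.0))

def implied_base_rate(n_active_with_feature, n_with_feature, weight):
    """Solve the assumed weight formula for the base rate that reproduces a reported score."""
    return ((n_active_with_feature + 1.0) / math.exp(weight) - 1.0) / n_with_feature

# Rows taken from the Feature Contribution table: (mutagens, molecules with feature, score).
rows = [(81, 91, 0.437), (27, 33, 0.344)]

# Base rate implied by each row; the report itself does not state this number.
rates = [implied_base_rate(a, n, w) for a, n, w in rows]
base_rate = sum(rates) / len(rates)

# Recompute each score from the averaged base rate for comparison with the table.
recomputed = [laplacian_weight(a, n, base_rate) for a, n, _ in rows]

for (a, n, reported), rate, score in zip(rows, rates, recomputed):
    print(f"{a}/{n}: implied base rate {rate:.3f}, recomputed {score:.3f}, reported {reported}")
```

If the assumed formula is the one the model uses, the Bayesian score of a molecule is the sum of such weights over all of its fingerprint features, so a few strongly positive features can still be outweighed by many negative ones.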
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Mild
Probability: 0.765
Enrichment: 1.11
Bayesian Score: -2.12
Mahalanobis Distance: 7.34
Mahalanobis Distance p-value: 0.992

Actual Endpoint     Moderate_Severe  Moderate_Severe  Moderate_Severe
Predicted Endpoint  Mild             Moderate_Severe  Moderate_Severe
Distance            0.638            0.684            0.691
Reference           (1) 28ZPAK -;242;72
                    (2) Arzneimittel-Forschung 9;167;59
                    (3) CIGET* -;-;77

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown FCFP_2 feature: -1151884458: ['?'][c](:[*]):[c](N):n:[*]

Feature Contribution
Top features for positive contribution
Fingerprint  Bit/Smiles   Feature Structure  Score  Moderate_Severe in training set
FCFP_10      -497728148                      0.356  24 out of 25
FCFP_10      136120670                       0.206  53 out of 65
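The Prediction field is assigned by comparing the Bayesian score with a cutoff chosen to minimize false positives and false negatives, which is why this result can be Mild even though the reported probability is 0.765. The report does not give the cutoff or say how the two error rates are combined. The sketch below assumes the sum of the two rates is minimized and uses hypothetical training scores and labels purely for illustration.

```python
def best_cutoff(scores, labels):
    """Return (cutoff, error) minimizing false positive rate + false negative rate."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ordered = sorted(set(scores))
    # Candidate cutoffs are the midpoints between consecutive distinct scores.
    candidates = [(low + high) / 2.0 for low, high in zip(ordered, ordered[1:])]
    best = None
    for cutoff in candidates:
        false_pos = sum(1 for s, y in zip(scores, labels) if s > cutoff and not y)
        false_neg = sum(1 for s, y in zip(scores, labels) if s <= cutoff and y)
        error = false_pos / negatives + false_neg / positives
        if best is None or error < best[1]:
            best = (cutoff, error)
    return best

def classify(score, cutoff):
    """Positive category only if the score lies above the cutoff."""
    return score > cutoff

# Hypothetical training data (1 = positive category); not taken from the report.
scores = [-6.0, -3.5, -2.1, -0.5, 0.4, 1.2, 2.0, 3.3]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

cutoff, error = best_cutoff(scores, labels)
print(f"cutoff {cutoff:.2f}, combined error rate {error:.2f}")
print("score -2.12 classified positive:", classify(-2.12, cutoff))
```

Because the cutoff decision and the probability estimate come from different calculations, they can disagree for the same sample.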
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Irritant
Probability: 1
Enrichment: 1.18
Bayesian Score: 1.79
Mahalanobis Distance: 5.98
Mahalanobis Distance p-value: 1

Actual Endpoint     Irritant  Irritant  Irritant
Predicted Endpoint  Irritant  Irritant  Irritant
Distance            0.630     0.665     0.670
Reference           (1) 28ZPAK -;242;72
                    (2) Arzneimittel-Forschung 9;167;59
                    (3) CIGET* -;-;77

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.
2. Unknown FCFP_2 feature: -1151884458: ['?'][c](:[*]):[c](N):n:[*]

Feature Contribution
Top features for positive contribution
Fingerprint  Bit/Smiles   Feature Structure  Score  Irritant in training set
FCFP_12      1747237384                      0.208  44 out of 44
FCFP_12      17                              0.189  48 out of 49
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Carcinogen
Probability: 0.544
Enrichment: 1.2
Bayesian Score: 0.931
Mahalanobis Distance: 17.5
Mahalanobis Distance p-value: 1.59e-020

Actual Endpoint     Non-Carcinogen  Non-Carcinogen  Non-Carcinogen
Predicted Endpoint  Non-Carcinogen  Non-Carcinogen  Non-Carcinogen
Distance            0.596           0.596           0.650
Reference           NTP446          TR-446          TR-40

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. OPS PC11 out of range. Value: 3.6073. Training min, max, SD, explained variance: -2.3897, 3.1905, 1.314, 0.0302.
2. Unknown FCFP_2 feature: -1029533685: [*]:[c](:[*])C(F)(F)F

Feature Contribution
Top features for positive contribution
Fingerprint  Bit/Smiles   Feature Structure  Score  Carcinogen in training set
FCFP_12      -617729047                      0.575  3 out of 3
FCFP_12      71953198                        0.423  26 out of 40
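The first Model Applicability flag reports OPS PC11 at 3.6073 against a training range of -2.3897 to 3.1905. A minimal sketch of that range check follows. Expressing the overshoot in units of the training SD is an addition made here for readability and is not something the report computes; the function name is hypothetical.

```python
def range_check(name, value, train_min, train_max, train_sd):
    """Report whether a component lies inside the training range."""
    if value < train_min:
        excess = train_min - value
    elif value > train_max:
        excess = value - train_max
    else:
        return f"{name} within range"
    # The overshoot in SD units is an illustrative extra, not part of the report.
    return f"{name} out of range by {excess:.4f} ({excess / train_sd:.2f} SD)"

# Values copied from the Model Applicability entry for OPS PC11.
result = range_check("OPS PC11", 3.6073, -2.3897, 3.1905, 1.314)
print(result)
```

Here the component sits only a fraction of one training SD beyond the maximum, whereas the Mahalanobis distance p-value for the same prediction is far more extreme, so the two diagnostics are best read together.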
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: Non-Irritant
Probability: 0.968
Enrichment: 1.05
Bayesian Score: -1.16
Mahalanobis Distance: 7.21
Mahalanobis Distance p-value: 0.991

Actual Endpoint     Irritant      Irritant  Irritant
Predicted Endpoint  Non-Irritant  Irritant  Non-Irritant
Distance            0.666         0.683     0.717
Reference           (1) 28ZPAK "Sbornik Vysledku Toxixologickeho Vysetreni Latek A Pripravku," Marhold, J.V., Institut Pro Vychovu Vedoucicn Pracovniku Chemickeho Prumyclu Praha, Czechoslovakia, 1972 Volume(issue)/page/year: -,242,1
                    (2) 85JCAE "Prehled Prumyslove Toxikologie; Organicke Latky," Marhold, J., Prague, Czechoslovakia, Avicenum, 1986 Volume(issue)/page/year: -,536,1986
                    (3) 85JCAE "Prehled Prumyslove Toxikologie; Organicke Latky," Marhold, J., Prague, Czechoslovakia, Avicenum, 1986 Volume(issue)/page/year: -,725,1986

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. All properties and OPS components are within expected ranges.

Feature Contribution
Top features for positive contribution
Fingerprint  Bit/Smiles   Feature Structure  Score   Irritant in training set
FCFP_12      192331578                       0.0756  6 out of 6
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 1.34
Unit: mg/kg_body_weight/day
Mahalanobis Distance: 16.9
Mahalanobis Distance p-value: 2.71e-017

Mahalanobis Distance: The Mahalanobis distance (MD) is a generalization of the Euclidean distance that accounts for correlations among the X properties. It is calculated as the distance to the center of the training data. The larger the MD, the less trustworthy the prediction.

Actual Endpoint (-log C)     4.45051  5.49293  4.51245
Predicted Endpoint (-log C)  3.8403   4.9569   3.49372
Distance                     0.676    0.700    0.710
Reference                    CPDB     CPDB     CPDB

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. OPS PC19 out of range. Value: -3.5336. Training min, max, SD, explained variance: -2.9709, 5.6065, 1.282, 0.0158.
2. Unknown FCFP_2 feature: -1029533685: [*]:[c](:[*])C(F)(F)F

Feature Contribution
Top features for positive contribution
Fingerprint  Bit/Smiles   Feature Structure  Score
FCFP_6       -1861645784                     0.359
FCFP_6       690511177                       0.293
FCFP_6       32                              0.154
FCFP_6       17                              -0.149
FCFP_6       0                               -0.115
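The Mahalanobis distance described for this model rescales the distance to the training center by the covariance of the training properties, so a sample that departs from the usual correlation pattern scores as more distant than its Euclidean distance suggests. The sketch below uses NumPy with hypothetical two-property training data. The final fraction is an empirical stand-in for the reported p-value, which instead assumes normally distributed data.

```python
import numpy as np

def mahalanobis_distances(train, query):
    """Distance from each query row to the training center, scaled by the training covariance."""
    center = train.mean(axis=0)
    cov = np.cov(train, rowvar=False)
    inv_cov = np.linalg.pinv(cov)  # pseudo-inverse tolerates a near-singular covariance
    diff = np.atleast_2d(query) - center
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, inv_cov, diff))

rng = np.random.default_rng(0)
# Hypothetical training descriptors: 200 samples of 2 positively correlated properties.
train = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=200)

train_md = mahalanobis_distances(train, train)
# Both query points are equally far from the center in Euclidean terms.
against_trend = mahalanobis_distances(train, np.array([2.0, -2.0]))[0]
along_trend = mahalanobis_distances(train, np.array([2.0, 2.0]))[0]

# Empirical share of training samples at least as far out as the query.
fraction = float(np.mean(train_md >= against_trend))
print(f"MD against trend {against_trend:.2f}, along trend {along_trend:.2f}")
print(f"fraction of training data with MD >= query: {fraction:.3f}")
```

The query that runs against the correlation receives the larger distance, which is the behavior the definition refers to when it mentions correlations among the X properties.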
Molecule TOPKAT_Chronic_LOAEL
Structural Similar Compounds
Name CLOTRIMAZOLE MIDAZOLAM.HCL TRIAZOLAM
Structure
Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
ECFP_6 1559650422 0.129
FCFP_6 32 0.101
FCFP_6 3 0.0924
Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
ECFP_6 642810091 0.148
FCFP_6 17 -0.189
Molecule TOPKAT_Fathead_Minnow_LC50
Structural Similar Compounds
Name                (1) Dicofol
                    (2) 4;4'-Isopropylidene-bis-(2;6-dichlorophenol)
                    (3) 2;2'-Methylene-bis-(3;4;6-trichlorophenol)
Structure
FCFP_2 16 0.0139
FCFP_2 3 -0.198
Molecule TOPKAT_Rat_Inhalational_LC50
C20H14ClF4N2[?]
Molecular Weight: 393.78517
ALogP: 6.54
Rotatable Bonds: 4
Acceptors: 2
Donors: 1

Model Prediction
Prediction: 9.62e+004
Unit: mg/m3/h
Mahalanobis Distance: 13.4
Mahalanobis Distance p-value: 8.91e-010

Structural Similar Compounds
Name                (1) 1H-Benzimidazole; 5-chloro-6-(2;3-dichlorophenoxy)-2-(methylthio)-
                    (2) Benzhydrol; 4;4'-dichloro-alpha-(trichloromethyl)-
                    (3) 1H-1;2;4-Triazole; 1-((bis(4-fluorophenyl)methylsilyl)methyl)-
Structure
Actual Endpoint (-log C)     2.2548   1.2677   1.5166
Predicted Endpoint (-log C)  1.69815  2.08555  1.21248
Distance                     0.567    0.706    0.711
Reference           (1) MDACAP Medicamentos de Actualidad. (J.R. Prous; S.A.; Apartado de Correos 540; 08080 Barcelona; Spain) V.1- 1965- Volume(issue)/page/year: 21;227;1985
                    (2) PEMNDP Pesticide Manual. (The British Crop Protection Council; 20 Bridport Rd.; Thornton Heath CR4 7QG; UK) V.1- 1968- Volume(issue)/page/year: 9;267;1991
                    (3) NTIS** National Technical Information Service. (Springfield; VA 22161) Formerly U.S. Clearinghouse for Scientific & Technical Information. Volume(issue)/page/year: OTS0543806

Model Applicability
Unknown features are fingerprint features in the query molecule, but not found or appearing too infrequently in the training set.
1. OPS PC6 out of range. Value: -2.7155. Training min, max, SD, explained variance: -2.4569, 8.1177, 1.698, 0.0400.
2. OPS PC21 out of range. Value: -3.7574. Training min, max, SD, explained variance: -3.0247, 4.4972, 1.058, 0.0155.
3. Unknown ECFP_2 feature: -1305021906: [*]['?']
4. Unknown ECFP_2 feature: -428002189: [*]:[cH]:[c](:n:[*])[c](:[*]):[*]
5. Unknown ECFP_2 feature: -1734834311: ['?'][c](:[*]):[c](N):n:[*]
6. Unknown ECFP_2 feature: -1659020767: [*][c](:[*]):[c](['?']):[c]([*]):[*]
7. Unknown ECFP_2 feature: 432684389: ['?'][c](:[*]):[*]
Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
ECFP_2 642810091 0.214
FCFP_2 16 -0.0512
Molecule TOPKAT_Rat_Maximum_Tolerated_Dose_Gavage
Structural Similar Compounds
Name                (1) MIX OF 1,2,3,6,7,8- AND 1,2,3,7,8,9-HCDD
                    (2) 2,3,7,8-TCDD
                    (3) CHLORPHENIRAMINE MALEATE
Structure
FCFP_2 0 -0.29
Molecule TOPKAT_Rat_Oral_LD50
Structural Similar Compounds
Name                (1) BENZBROMARONE
                    (2) PIMOZIDE
                    (3) MICONAZOLE. HNO3 (HNO3 STRIPPED)
Structure
Feature Contribution
Top features for positive contribution
Fingerprint Bit/Smiles Feature Structure Score
FCFP_6 71953198 0.392