Dr.G.P.L.Jayasree
History of QSAR
1937: L.P. Hammett studied the chemical reactivity of substituted benzenes (Hammett equation, Linear Free Energy Relationship, LFER).
1964: C. Hansch and T. Fujita, the biologist's Hammett equation.
1980s: development of 2D QSAR (descriptors).
1980s-1990s: development of 3D QSAR.
QSAR
Quantitative Structure Activity Relationship (QSAR) is a set of methods that try to find a mathematical relationship between a set of molecular descriptors and the activity of the molecules. QSAR's most general mathematical form is:
Activity = f(structural properties)
Hansch approach
Corwin Hansch, 1964
QSAR Postulates
The molecular structure is responsible for all the activities.
Similar compounds have similar biological and physicochemical properties (Meyer 1899).
Hansch postulate (1963): biological system + compound = f1(Hydrophilicity) + f2(Electronics) + f3(Steric) + f4(Molecular property).
QSAR is applicable only to similar compounds.
QSAR
(Quantitative Structure Activity Relationships)
Activity = function (structural properties)
Example: relate biological activity to electronics and hydrophobicity:
$\log(1/C) = k_1 \log P - k_2 (\log P)^2 + k_3 \sigma + k_4$
C = concentration of a compound that gives a response
P = partition coefficient between water and 1-octanol
k_1, k_2, k_3, k_4 = constants
σ = Hammett substituent parameter
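The Hansch-type relation above can be evaluated directly. The following is a minimal sketch, assuming the usual parabolic form with a negative squared term in logP; the constants K1 to K4 are hypothetical placeholders chosen only for illustration and are not fitted to any data set.

```python
def hansch_log_inv_c(log_p, sigma, k1, k2, k3, k4):
    """Evaluate log(1/C) = k1*logP - k2*(logP)^2 + k3*sigma + k4."""
    return k1 * log_p - k2 * log_p ** 2 + k3 * sigma + k4


def optimal_log_p(k1, k2):
    """logP at which the parabola peaks: d/dlogP = k1 - 2*k2*logP = 0."""
    return k1 / (2.0 * k2)


# Hypothetical constants (illustration only, not from the source).
K1, K2, K3, K4 = 1.0, 0.25, 0.5, 2.0

# Scan a few logP values at sigma = 0 to see the rise and fall of activity.
for log_p in (0.0, 1.0, 2.0, 3.0, 4.0):
    activity = hansch_log_inv_c(log_p, 0.0, K1, K2, K3, K4)
    print(f"logP = {log_p:.1f}  log(1/C) = {activity:.2f}")

print("logP at maximum activity:", optimal_log_p(K1, K2))
```

With a positive k2, the predicted activity rises with logP, peaks at k1/(2*k2), and then falls again, which is why an intermediate hydrophobicity is usually sought.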
QSAR Process
Synthesize and test the biological activity of a diverse set of ligands.
Ask which properties are correlated with activity.
Choose or calculate descriptors.
Molecular QSAR-Descriptors
1D: Whole-molecule properties (e.g. molecular weight, melting point, logP)
2D: Substituent constants (e.g. σ, π, molar refractivity), fragment fingerprints, topological indices
3D: Surface or field properties (e.g. electrostatic potential, steric fields, hydrophobicity, solvent accessible surface area)
What is a Descriptor?
Molecular Descriptors:
Calculated: the result of a mathematical procedure that transforms chemical information into a number. Examples: surface areas (polar, non-polar), dipole moment, volume.
Experimental: the result of a standardized experiment that measures a molecular attribute. Examples: melting point, partition coefficients, refractive index.
Types of Descriptors
Counts of features: for example HBAs, HBDs, aromatic ring systems, substructures/fragments (e.g. carbonyl groups, basic nitrogens, carboxyl groups), etc.
Physicochemical properties: LogP, solubility, MW, MP, BP, heat of sublimation, molar refractivity, Hammett parameters, etc.
Topological indices: Wiener index, branching indices, kappa shape indices, electrotopological state indices, atom-pairs, topological torsions, etc.
BCUTs (3-D, 2-D): electrostatic, charge, and polarizability (hydrophobic).
QSAR
QSAR models are derived from a series of (similar) molecules with known activity (training set)
If a statistically relevant QSAR model has been found, it can be applied to new molecules in this series (test set) in order to predict their activity before biological testing (or even before synthesis!)
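The training-set and test-set workflow can be sketched with a one-descriptor linear model. This is only an illustration: the descriptor and activity values below are hypothetical toy numbers, and a real QSAR study would normally use several descriptors and a proper statistics package.

```python
def fit_line(xs, ys):
    """Ordinary least squares for activity = slope * descriptor + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line on the given data."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical training set: one descriptor value and one measured activity
# per compound (toy numbers, not taken from the source).
train_descriptor = [0.0, 1.0, 2.0, 3.0]
train_activity = [1.1, 2.9, 5.2, 6.8]

slope, intercept = fit_line(train_descriptor, train_activity)
fit_quality = r_squared(train_descriptor, train_activity, slope, intercept)

# Hypothetical new compound from the same series, not yet tested.
new_descriptor = 4.0
predicted_activity = slope * new_descriptor + intercept

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, r^2 = {fit_quality:.4f}")
print(f"predicted activity = {predicted_activity:.3f}")
```

The model is fitted only on compounds with known activity; the prediction is then made for a compound that played no part in the fit.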
QSAR
Example: Analgesic activity of Capsaicin analogs (taken from Walpole et al., Sandoz)
[Figure: structure of capsaicin and general structure of the capsaicin analogs with substituent X]
QSAR
Activity data of test series

Cmpd Number  Cmpd Name  X         EC50 (µM)
1            6a         H         11.80 ± 1.90
2            6b         Cl        1.24 ± 0.11
3            6d         NO2       4.58 ± 0.29
4            6e         CN        26.50 ± 5.87
5            6f         C6H5      0.24 ± 0.30
6            6g         N(CH3)2   4.39 ± 0.67
7            6h         I         0.35 ± 0.05
8            6i         NHCHO     ?? ± ??
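QSAR models are usually fitted to log(1/C) rather than to raw concentrations. The sketch below converts the measured activities to that scale. It rests on two assumptions that the source does not state explicitly: that the listed numbers are mean ± error pairs (only the means are used here), and that the EC50 values are in micromolar.

```python
import math

# Mean EC50 values read from the activity data; units assumed to be micromolar.
# Compound 6i has no measured value and is therefore left out.
EC50_MICROMOLAR = {
    "6a": 11.80,
    "6b": 1.24,
    "6d": 4.58,
    "6e": 26.50,
    "6f": 0.24,
    "6g": 4.39,
    "6h": 0.35,
}


def log_inverse_concentration(ec50_micromolar):
    """Return log10(1/C) with C converted from micromolar to molar."""
    return -math.log10(ec50_micromolar * 1e-6)


activity = {name: log_inverse_concentration(v) for name, v in EC50_MICROMOLAR.items()}

# Rank compounds from most to least potent (higher log(1/C) = more potent).
ranked = sorted(activity, key=activity.get, reverse=True)

for name in ranked:
    print(f"{name}: log(1/EC50) = {activity[name]:.2f}")
```

On this scale a more potent compound (lower EC50) has a larger value, which matches the log(1/C) convention of the Hansch equation.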
QSAR
Descriptors: the hydrophobic constant π and the molar refractivity MR (correlated with the size and polarizability of the substituents)

Cmpd Number:  1   2   3   4   5   6   7   8
Cmpd Name:    6a  6b  6d  6e  6f  6g  6h  6i
π, MR (only surviving entry): -0.98, 10.31
Partition coefficient (octanol/water)
$P = C_{\mathrm{oct}} / C_{\mathrm{water}}$
[Figure: a solute distributed between a 1-octanol phase and a water phase]
Since the differences are usually on a very large scale, log10(P) is used.
Partition coefficient
The partition coefficient (P) or distribution coefficient (D) is the ratio of the concentrations of a compound in the two phases of a mixture of two immiscible solvents at equilibrium. Hence these coefficients are a measure of the differential solubility of the compound between the two solvents.
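As a small worked example of the definition above, the ratio and its base-10 logarithm can be computed as follows. The concentrations are hypothetical values in arbitrary but identical units for both phases.

```python
import math


def partition_coefficient(c_octanol, c_water):
    """P = concentration in 1-octanol / concentration in water (at equilibrium)."""
    if c_octanol <= 0 or c_water <= 0:
        raise ValueError("concentrations must be positive")
    return c_octanol / c_water


def log_p(c_octanol, c_water):
    """log10 of the partition coefficient."""
    return math.log10(partition_coefficient(c_octanol, c_water))


# Hypothetical equilibrium concentrations (same units in both phases).
hydrophobic_example = log_p(100.0, 1.0)    # mostly in the octanol phase
hydrophilic_example = log_p(1.0, 1000.0)   # mostly in the water phase

print(hydrophobic_example, hydrophilic_example)
```

A positive logP indicates a compound that prefers the octanol phase (hydrophobic); a negative logP indicates one that prefers water (hydrophilic).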
In medical practice, partition coefficients are useful, for example, in estimating the distribution of drugs within the body. Hydrophobic drugs with high partition coefficients are preferentially distributed to hydrophobic compartments such as the lipid bilayers of cells, while hydrophilic drugs (low partition coefficients) are preferentially found in hydrophilic compartments such as blood serum.
The octanol-water partition coefficient, logP, is used in QSAR studies and rational drug design as a measure of molecular hydrophobicity. Hydrophobicity affects drug absorption, bioavailability, hydrophobic drug-receptor interactions, the metabolism of molecules, and their toxicity.
QSAR
DESCRIPTORS
Electronic Descriptors
Charge: sum of partial charges.
Apol: sum of atomic polarizabilities.
Dipole: dipole moment.
Quantum chemical indices:
HOMO: highest occupied molecular orbital.
LUMO: lowest unoccupied molecular orbital.
Sr: superdelocalizability.
Quantum chemical descriptors such as net atomic charges, highest occupied molecular orbital/lowest unoccupied molecular orbital (HOMO-LUMO) energies, and superdelocalizabilities have been shown to correlate well with various biological activities.
Dipole moment (Dipole)
The dipole moment descriptor is a 3D electronic descriptor that indicates the strength and orientation behavior of a molecule in an electrostatic field. Dipole properties have been correlated with long-range ligand-receptor recognition and subsequent binding.
Lowest unoccupied molecular orbital energy (LUMO)
The LUMO (lowest unoccupied molecular orbital) is the lowest energy level in the molecule that contains no electrons. It is important in governing molecular reactivity and properties. When a molecule acts as a Lewis acid (an electron-pair acceptor) in bond formation, incoming electron pairs are received in its LUMO. Molecules with low-lying LUMOs are more able to accept electrons than those with high LUMOs; thus the LUMO descriptor should measure the electrophilicity of a molecule.
Topological descriptors
Topological indices are 2D descriptors based on graph theory concepts (Kier and Hall 1976, 1986; Katritzky and Gordeeva 1993). These indices have been widely used in QSPR and QSAR studies. They help to differentiate molecules mostly according to their size, degree of branching, flexibility, and overall shape.
Wiener index (W)
The Wiener index is the sum of the numbers of chemical bonds separating all pairs of heavy atoms in the molecule. In graph-theoretical terms, it is the sum of the lengths of the shortest paths between all pairs of vertices representing heavy atoms.
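The Wiener index can be computed from the hydrogen-suppressed molecular graph with a breadth-first search from every heavy atom. The sketch below assumes a connected graph given as an adjacency list; the atom numbering and the two example molecules (the carbon skeletons of n-butane and isobutane) are chosen here for illustration.

```python
from collections import deque


def wiener_index(adjacency):
    """Sum of shortest-path lengths (in bonds) over all pairs of heavy atoms.

    `adjacency` maps each atom index to a list of bonded atom indices.
    The graph is assumed to be connected.
    """
    total = 0
    for source in adjacency:
        # Breadth-first search gives shortest path lengths in an unweighted graph.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            atom = queue.popleft()
            for neighbor in adjacency[atom]:
                if neighbor not in distance:
                    distance[neighbor] = distance[atom] + 1
                    queue.append(neighbor)
        total += sum(distance.values())
    # Every pair was counted twice (once from each end).
    return total // 2


# Carbon skeletons only (hydrogens suppressed).
N_BUTANE = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}          # C-C-C-C chain
ISOBUTANE = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}         # central C with three CH3

print("n-butane:", wiener_index(N_BUTANE))
print("isobutane:", wiener_index(ISOBUTANE))
```

The branched isomer gives the smaller value, which illustrates how the index separates molecules of the same size by their degree of branching.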
Applications
Pharmacology
A drug's distribution coefficient strongly affects how easily the drug can reach its intended target in the body, how strong an effect it will have once it reaches its target, and how long it will remain in the body in an active form.
Pharmacokinetics
In pharmacokinetics (what the body does to a drug), the distribution coefficient has a strong influence on the ADME properties (Absorption, Distribution, Metabolism, and Elimination) of the drug. Hence the hydrophobicity of a compound (as measured by its distribution coefficient) is a major determinant of how drug-like it is. More specifically, for a drug to be orally absorbed, it normally must first pass through lipid bilayers in the intestinal epithelium (a process known as transcellular transport). For efficient transport, the drug must be hydrophobic enough to partition into the lipid bilayer, but not so hydrophobic that, once it is in the bilayer, it will not partition out again. Hydrophobicity also plays a major role in determining where drugs are distributed within the body after absorption and, as a consequence, how rapidly they are metabolized and excreted.
Pharmacodynamics
In pharmacodynamics (what a drug does to the body), the hydrophobic effect is the major driving force for the binding of drugs to their receptor targets. Hydrophobic drugs tend to be more toxic because they are generally retained longer, have a wider distribution within the body, are somewhat less selective in their binding to proteins, and are often extensively metabolized. Hence it is advisable to make the drug as hydrophilic as possible while it still retains adequate binding affinity to the therapeutic protein target. The ideal distribution coefficient for a drug is usually intermediate (neither too hydrophobic nor too hydrophilic).
Fragment-based prediction (group contribution; ClogP, etc.)
The log P of a compound can be estimated by summing the contributions of its non-overlapping molecular fragments (each defined as one or more atoms covalently bound to each other within the molecule).
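The group-contribution idea can be sketched as a weighted sum over fragment counts. The fragment names and contribution values below are hypothetical placeholders invented for this illustration; they are not ClogP parameters, and real fragment methods also apply correction factors that are omitted here.

```python
# Hypothetical fragment contributions to log P (illustration only, not real
# ClogP parameters).
FRAGMENT_CONTRIBUTION = {
    "CH3": 0.5,
    "CH2": 0.5,
    "OH": -1.0,
}


def estimate_log_p(fragment_counts, contributions=FRAGMENT_CONTRIBUTION):
    """Sum contribution * count over the non-overlapping fragments of a molecule.

    `fragment_counts` maps a fragment name to how often it occurs.
    An unknown fragment raises KeyError rather than being silently ignored.
    """
    return sum(contributions[name] * count for name, count in fragment_counts.items())


# A butanol-like molecule split into non-overlapping fragments:
# one CH3, three CH2, one OH.
butanol_like = {"CH3": 1, "CH2": 3, "OH": 1}
estimated = estimate_log_p(butanol_like)

print("estimated log P:", estimated)
```

Raising an error for an unknown fragment is a deliberate choice in this sketch: a fragment without a tabulated contribution means the estimate is not defined for that molecule.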
Data mining prediction
A typical data-mining-based prediction uses support vector machines, decision trees, or neural networks. This approach usually works well for calculating log P values when it is trained on compounds that have similar chemical structures and known log P values.