
QSAR

Dr.G.P.L.Jayasree

QSAR & QSPR


Quantitative Structure Activity Relationship (QSAR): relates molecular structure to pharmacological activity.
Quantitative Structure Property Relationship (QSPR): relates structure to physical properties (boiling point, dipole moment, etc.).

History of QSAR
1937: L.P. Hammett studied the chemical reactivity of substituted benzenes: Hammett equation, Linear Free Energy Relationship (LFER).
1964: C. Hansch and T. Fujita: the biologist's Hammett equation.
1980s: development of 2D QSAR (descriptors).
1980s-1990s: development of 3D QSAR.

QSAR
Quantitative Structure Activity Relationship is a set of methods that try to find a mathematical relationship between a set of descriptors of molecules and their activity. QSAR's most general mathematical form is:

Activity = f(molecular descriptors)

The descriptors can be experimentally or computationally derived.

Hansch approach
Corwin Hansch, 1964

Biological response = f1(L) + f2(E) + f3(S) + f4(M)

L = Lipophilic properties
E = Electronic properties
S = Steric properties
M = Other molecular properties

QSAR Postulates
The molecular structure is responsible for all the activities.
Similar compounds have similar biological and chemico-physical properties (Meyer 1899).
Hansch postulate (1963): biological system + compound = f1(Hydrophilicity) + f2(Electronics) + f3(Steric) + f4(Molecular property).
QSAR is applicable only to similar compounds.

QSAR
(Quantitative Structure Activity Relationships)
Activity = function (structural properties)
Example*: relate biological activity to electronics and hydrophobicity

log(1/C) = k1 logP - k2 (logP)^2 + k3 σ + k4

C = concentration of a compound that gives a response
P = partition coefficient between water and 1-octanol
k1, k2, k3, k4 = constants
σ = Hammett substituent parameter

*C. Hansch. (1969) Accounts of Chemical Res. 2:232-239
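The parabolic dependence on logP in the example above means that activity passes through a maximum at an optimal logP. The following is a minimal sketch, assuming the sign convention log(1/C) = k1 logP - k2 (logP)^2 + k3 σ + k4 with k2 > 0; the function names and the constants are hypothetical and chosen only to illustrate the shape of the model.

```python
def hansch_log_inv_c(log_p, sigma, k1, k2, k3, k4):
    """Evaluate log(1/C) = k1*logP - k2*(logP)^2 + k3*sigma + k4."""
    return k1 * log_p - k2 * log_p ** 2 + k3 * sigma + k4


def optimal_log_p(k1, k2):
    """Return the logP at which the parabola peaks (requires k2 > 0).

    Setting d/d(logP) of the model to zero gives k1 - 2*k2*logP = 0.
    """
    if k2 <= 0:
        raise ValueError("k2 must be positive for a maximum to exist")
    return k1 / (2.0 * k2)


# Hypothetical constants, NOT fitted to any real data set.
K1, K2, K3, K4 = 1.0, 0.25, 0.5, 2.0

if __name__ == "__main__":
    best = optimal_log_p(K1, K2)
    print("optimal logP:", best)
    for log_p in (0.0, best, 4.0):
        value = hansch_log_inv_c(log_p, 0.0, K1, K2, K3, K4)
        print(f"logP = {log_p:.1f} -> log(1/C) = {value:.2f}")
```

With these placeholder constants the predicted activity rises up to the optimal logP and falls again beyond it, which is the behaviour the squared term is meant to capture.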

QSAR Process
Synthesize and test the biological activity of a diverse set of ligands.
Which properties are correlated with activity?
Choose/calculate descriptors.

Molecular QSAR-Descriptors
1D: Whole-molecule properties (e.g. molecular weight, melting point, logP, etc.)

2D: Substituent constants (e.g. π, σ, molar refractivity), fragment fingerprints, topological indices

3D: Surface or field properties (e.g. electrostatic potential, steric fields, hydrophobicity, solvent accessible surface area, etc.)

What is a Descriptor?
Molecular Descriptors:
Calculated:
The result of a mathematical procedure that transforms chemical information into a number, e.g. surface areas (polar, non-polar), dipole moment, volume.

Experimental:
The result of a standardized experiment that measures a molecular attribute, e.g. melting point, partition coefficients, refractive index.

Types of Descriptors
Counts of features: for example HBAs, HBDs, aromatic ring systems, substructures/fragments (e.g. carbonyl groups, basic nitrogens, carboxyl groups), etc.
Physicochemical properties: LogP, solubility, MW, MP, BP, heat of sublimation, molar refractivity, Hammett parameters, etc.
Topological indices: Wiener index, branching indices, kappa shape indices, electrotopological state indices, atom-pairs, topological torsions, etc.
BCUTs (3-D, 2-D): electrostatic, charge, and polarizability (hydrophobic).

QSAR
QSAR models are derived from a series of (similar) molecules with known activity (training set)

If a statistically relevant QSAR model has been found, it can be applied to new molecules in this series (test set) in order to predict their activity before biological testing (or even before synthesis!)

QSAR
Example: Analgesic activity of Capsaicin analogs (taken from Walpole et al., Sandoz)
[Figure: structures of capsaicin and the capsaicin analogs]

QSAR
Activity data of test series
Cmpd Number   Cmpd Name   X          EC50 (µM)
1             6a          H          11.80 ± 1.90
2             6b          Cl          1.24 ± 0.11
3             6d          NO2         4.58 ± 0.29
4             6e          CN         26.50 ± 5.87
5             6f          C6H5        0.24 ± 0.30
6             6g          N(CH3)2     4.39 ± 0.67
7             6h          I           0.35 ± 0.05
8             6i          NHCHO       ?? ± ??

QSAR
Descriptors: the hydrophobic constant π and the molar refractivity MR (correlated with the size and polarizability of the substituents)

Cmpd Number   Cmpd Name   X          Log EC50    π        MR
1             6a          H           1.07       0.00     1.03
2             6b          Cl          0.09       0.71     6.03
3             6d          NO2         0.66      -0.28     7.36
4             6e          CN          1.42      -0.57     6.33
5             6f          C6H5       -0.62       1.96    25.36
6             6g          N(CH3)2     0.64       0.18    15.55
7             6h          I          -0.46       1.12    13.94
8             6i          NHCHO       ??        -0.98    10.31
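One way to turn the table above into a QSAR model is a two-descriptor linear regression, log EC50 = a π + b MR + c, fitted on compounds 1 to 7 and then applied to compound 8. The sketch below is only an illustration under that assumption; it is not necessarily the model fitted in the original study, and the helper names are hypothetical. It uses plain Python and solves the normal equations directly.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(rows, y):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear(xtx, xty)


# Compounds 1-7 from the table: substituent pi, MR and measured log EC50.
PI = [0.00, 0.71, -0.28, -0.57, 1.96, 0.18, 1.12]
MR = [1.03, 6.03, 7.36, 6.33, 25.36, 15.55, 13.94]
LOG_EC50 = [1.07, 0.09, 0.66, 1.42, -0.62, 0.64, -0.46]

# Each row is (pi, MR, 1.0); the trailing 1.0 provides the intercept term.
ROWS = [[p, m, 1.0] for p, m in zip(PI, MR)]
COEF_PI, COEF_MR, INTERCEPT = fit_least_squares(ROWS, LOG_EC50)


def predict_log_ec50(pi, mr):
    """Predict log EC50 from the two substituent descriptors."""
    return COEF_PI * pi + COEF_MR * mr + INTERCEPT


if __name__ == "__main__":
    print(f"log EC50 = {COEF_PI:.3f}*pi + {COEF_MR:.3f}*MR + {INTERCEPT:.3f}")
    # Compound 8 (X = NHCHO): pi = -0.98, MR = 10.31
    print("predicted log EC50, compound 8:", round(predict_log_ec50(-0.98, 10.31), 2))
```

With only seven training compounds and three fitted parameters, such a model is easily overfitted, so the prediction for compound 8 should be read as a rough estimate.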

Lipinski's Rule of Five

LogP ≤ 5
MW ≤ 500
Number of H-bond donors ≤ 5
Number of H-bond acceptors ≤ 10

Selected parameters for testing


Molecular weight: there is a known relationship between poor permeability and high molecular weight.
Lipophilicity (ratio of octanol solubility to water solubility): measured through LogP.
Number of hydrogen bond donors and acceptors (HBD and HBA): high numbers may impair permeability across the membrane bilayer.

The rule of five - formulation


Poor absorption or permeation is more likely when:
There are more than 5 H-bond donors.
The molecular weight is over 500.
The LogP is over 5.
There are more than 10 H-bond acceptors.
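The four conditions above translate directly into a small check. The sketch below assumes the descriptor values have already been calculated elsewhere; the function name and the example values are hypothetical.

```python
def rule_of_five_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Return the list of rule-of-five conditions that a compound violates."""
    violations = []
    if h_donors > 5:
        violations.append("more than 5 H-bond donors")
    if mol_weight > 500:
        violations.append("molecular weight over 500")
    if log_p > 5:
        violations.append("LogP over 5")
    if h_acceptors > 10:
        violations.append("more than 10 H-bond acceptors")
    return violations


if __name__ == "__main__":
    # Hypothetical descriptor values for two made-up compounds.
    print(rule_of_five_violations(mol_weight=300.0, log_p=2.0, h_donors=1, h_acceptors=4))
    print(rule_of_five_violations(mol_weight=650.0, log_p=6.2, h_donors=6, h_acceptors=12))
```

Returning the list of violated conditions, rather than a single yes/no answer, leaves it to the caller to decide how many violations are acceptable.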

Partition coefficient Definition


The ratio of the equilibrium concentrations of a dissolved substance in a two-phase system containing two largely immiscible solvents (water and n-octanol)

P = C_oct / C_water

Partition coefficient
[Figure: 1-octanol and water, the two phases of the octanol/water system]

Because P values span many orders of magnitude, log10(P) is used.

Partition coefficient
The partition coefficient (P) or distribution coefficient (D) is the ratio of the concentrations of a compound in the two phases of a mixture of two immiscible solvents at equilibrium. Hence these coefficients are a measure of the differential solubility of the compound between these two solvents.

In medical practice, partition coefficients are useful, for example, in estimating the distribution of drugs within the body. Hydrophobic drugs with high partition coefficients are preferentially distributed to hydrophobic compartments such as the lipid bilayers of cells, while hydrophilic drugs (low partition coefficients) are preferentially found in hydrophilic compartments such as blood serum.

The octanol-water partition coefficient, expressed as logP, is used in QSAR studies and rational drug design as a measure of molecular hydrophobicity. Hydrophobicity affects drug absorption, bioavailability, hydrophobic drug-receptor interactions, the metabolism of molecules, and their toxicity.

Partition coefficient and log P


The partition coefficient is a ratio of concentrations of un-ionized compound between the two solutions. To measure the partition coefficient of ionizable solutes, the pH of the aqueous phase is adjusted such that the predominant form of the compound is un-ionized. The logarithm of the ratio of the concentrations of the un-ionized solute in the solvents is called log P.
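As a worked illustration of the definition, log P can be computed from the two equilibrium concentrations of the un-ionized solute. This is a minimal sketch; the function names and the concentrations are hypothetical, and both concentrations are assumed to be in the same units.

```python
import math


def partition_coefficient(c_octanol, c_water):
    """P = concentration in the octanol phase / concentration in the water phase."""
    if c_octanol <= 0 or c_water <= 0:
        raise ValueError("concentrations must be positive")
    return c_octanol / c_water


def log_p(c_octanol, c_water):
    """Base-10 logarithm of the partition coefficient."""
    return math.log10(partition_coefficient(c_octanol, c_water))


if __name__ == "__main__":
    # Hypothetical equilibrium concentrations in the same units.
    # A 100-fold preference for octanol corresponds to log P of about 2.
    print(round(log_p(c_octanol=0.50, c_water=0.005), 2))
    # Equal concentrations in both phases correspond to log P = 0.
    print(log_p(c_octanol=0.01, c_water=0.01))
```

A positive log P therefore indicates a compound that prefers the octanol phase, and a negative log P one that prefers water.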

QSAR
DESCRIPTORS

Electronic Descriptors
Charge - Sum of partial charges.
Apol - Sum of atomic polarizabilities.
Dipole - Dipole moment.
Quantum chemical indices:
HOMO - Highest occupied molecular orbital.
LUMO - Lowest unoccupied molecular orbital.
Sr - Superdelocalizability.

Quantum chemical descriptors such as net atomic charges, highest occupied molecular orbital/lowest unoccupied molecular orbital (HOMO-LUMO) energies, and superdelocalizabilities have been shown to correlate well with various biological activities.

The Hammett constant (σ)

The distribution of electrons within a molecule depends on the nature of the electron withdrawing and donating groups found in that structure. Hammett used this concept to calculate Hammett constants, σ(X).
The value of σ(X) depends on whether the substituent is an overall electron donor or acceptor.
A negative value of σ(X) indicates that the substituent acts as an electron donating group.
A positive value of σ(X) indicates that the substituent acts as an electron withdrawing group.
σ(X) = log(K_BX / K_B), where K_BX and K_B are the ionization constants of the substituted and unsubstituted benzoic acid.
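The sign convention can be checked with a short calculation. The sketch below assumes that K_BX and K_B are the acid ionization constants of the substituted and unsubstituted benzoic acid; the function names and the numerical values are hypothetical placeholders, not measured constants.

```python
import math


def hammett_sigma(k_substituted, k_unsubstituted):
    """sigma(X) = log10(K_BX / K_B)."""
    if k_substituted <= 0 or k_unsubstituted <= 0:
        raise ValueError("ionization constants must be positive")
    return math.log10(k_substituted / k_unsubstituted)


def hammett_sigma_from_pka(pka_substituted, pka_unsubstituted):
    """Equivalent form using pKa = -log10(K): sigma = pKa(B) - pKa(BX)."""
    return pka_unsubstituted - pka_substituted


if __name__ == "__main__":
    k_ref = 1.0e-4  # hypothetical K_B of the unsubstituted acid
    # A substituent that makes the acid stronger (larger K) gives sigma > 0,
    # i.e. it behaves as an electron withdrawing group.
    print(round(hammett_sigma(5.0e-4, k_ref), 2))
    # A substituent that makes the acid weaker (smaller K) gives sigma < 0,
    # i.e. it behaves as an electron donating group.
    print(round(hammett_sigma(5.0e-5, k_ref), 2))
```

The pKa form is often more convenient in practice because acidity data are usually tabulated as pKa values.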

Dipole moment (Dipole)
The dipole moment descriptor is a 3D electronic descriptor that indicates the strength and orientation behavior of a molecule in an electrostatic field. Dipole properties have been correlated to long-range ligand-receptor recognition and subsequent binding.

Highest occupied molecular orbital energy (HOMO)


HOMO (highest occupied molecular orbital) is the highest energy level in the molecule that contains electrons. It is crucially important in governing molecular reactivity and properties. When a molecule acts as a Lewis base (an electron-pair donor) in bond formation, the electrons are supplied from the molecule's HOMO. How readily this occurs is reflected in the energy of the HOMO. Molecules with high HOMOs are more able to donate their electrons and are hence relatively reactive compared to molecules with low-lying HOMOs; thus the HOMO descriptor should measure the nucleophilicity of a molecule.

Lowest unoccupied molecular orbital energy (LUMO)
LUMO (lowest unoccupied molecular orbital) is the lowest energy level in the molecule that contains no electrons. It is important in governing molecular reactivity and properties. When a molecule acts as a Lewis acid (an electron-pair acceptor) in bond formation, incoming electron pairs are received in its LUMO. Molecules with low-lying LUMOs are more able to accept electrons than those with high LUMOs; thus the LUMO descriptor should measure the electrophilicity of a molecule.

Superdelocalizability (Sr)
Superdelocalizability is an index of reactivity in aromatic hydrocarbons.

Topological descriptors
Topological indices are 2D descriptors based on graph theory concepts (Kier and Hall 1976, 1986; Katritzky and Gordeeva 1993). These indices have been widely used in QSPR and QSAR studies. They help to differentiate molecules mostly according to their size, degree of branching, flexibility, and overall shape.

Wiener index (W)
The Wiener index is the sum, over all pairs of heavy atoms in the molecule, of the number of chemical bonds on the shortest path between them. In graph-theoretical terms: the sum of the lengths of the minimal paths between all pairs of vertices representing heavy atoms.
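The graph-theoretical definition of the Wiener index can be sketched with a breadth-first search from every heavy atom. In this illustration the molecule is given as a hypothetical adjacency dictionary of its hydrogen-suppressed graph, which is assumed to be connected; atom numbering is arbitrary.

```python
from collections import deque


def wiener_index(adjacency):
    """Sum of shortest-path lengths (in bonds) over all pairs of heavy atoms."""
    total = 0
    for source in adjacency:
        # Breadth-first search gives the bond distance from source to every atom.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            atom = queue.popleft()
            for neighbour in adjacency[atom]:
                if neighbour not in distance:
                    distance[neighbour] = distance[atom] + 1
                    queue.append(neighbour)
        total += sum(distance.values())
    # Every pair was counted twice, once from each end.
    return total // 2


# Carbon skeletons only (hydrogens suppressed).
N_BUTANE = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
ISOBUTANE = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

if __name__ == "__main__":
    print(wiener_index(N_BUTANE))   # 10
    print(wiener_index(ISOBUTANE))  # 9
```

The two C4 isomers show how the index responds to branching: the more compact isobutane skeleton has the smaller value.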

Molinspiration Property Calculator


Interactive log P calculator: http://www.molinspiration.com/cgi-bin/properties

Exceptions to the rule of five

Compound classes that are substrates for biological transporters:
Antibiotics
Fungicides, protozoacides, antiseptics
Vitamins
Cardiac glycosides

Applications
Pharmacology
A drug's distribution coefficient strongly affects how easily the drug can reach its intended target in the body, how strong an effect it will have once it reaches its target, and how long it will remain in the body in an active form.

Pharmacokinetics
In pharmacokinetics (what the body does to a drug), the distribution coefficient has a strong influence on the ADME properties (Absorption, Distribution, Metabolism, and Elimination) of the drug. Hence the hydrophobicity of a compound (as measured by its distribution coefficient) is a major determinant of how drug-like it is. More specifically, in order for a drug to be orally absorbed, it normally must first pass through lipid bilayers in the intestinal epithelium (a process known as transcellular transport). For efficient transport, the drug must be hydrophobic enough to partition into the lipid bilayer, but not so hydrophobic that, once it is in the bilayer, it will not partition out again. Hydrophobicity also plays a major role in determining where drugs are distributed within the body after absorption and, as a consequence, how rapidly they are metabolized and excreted.

Pharmacodynamics
In pharmacodynamics (what a drug does to the body), the hydrophobic effect is the major driving force for the binding of drugs to their receptor targets. Hydrophobic drugs tend to be more toxic because they are in general retained longer, have a wider distribution within the body, are somewhat less selective in their binding to proteins, and are often extensively metabolized. Hence it is advisable to make the drug as hydrophilic as possible while it still retains adequate binding affinity to the therapeutic protein target. The ideal distribution coefficient for a drug is usually intermediate (neither too hydrophobic nor too hydrophilic).

Algorithms for log P


Atomic based prediction (atomic contribution; AlogP, MlogP, etc.)
The most common elements contained in drugs (hydrogen, carbon, oxygen, sulfur, nitrogen, and halogens) are divided into several different atom types depending on the environment of the atom within the molecule.

Fragment based prediction (group contribution; ClogP, etc.)
The log P of a compound can be determined by the sum of its non-overlapping molecular fragments (defined as one or more atoms covalently bound to each other within the molecule).

Data mining prediction
A typical data mining based prediction uses support vector machines, decision trees, or neural networks. This method is usually very successful for calculating log P values when used with compounds that have similar chemical structures and known log P values.
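The additive idea shared by the atomic and fragment based methods above can be sketched as a sum of per-fragment contributions. The contribution values below are placeholders invented for this illustration; they are not the published AlogP or ClogP parameters, and real implementations also apply correction factors that are omitted here.

```python
# Hypothetical contribution table: placeholder numbers, NOT published values.
HYPOTHETICAL_CONTRIBUTIONS = {
    "CH3": 0.5,
    "CH2": 0.5,
    "OH": -1.0,
    "C6H5": 2.0,
}


def estimate_log_p(fragment_counts, contributions=HYPOTHETICAL_CONTRIBUTIONS):
    """Sum contribution * count over the non-overlapping fragments of a molecule."""
    total = 0.0
    for fragment, count in fragment_counts.items():
        if fragment not in contributions:
            raise KeyError(f"no contribution available for fragment {fragment!r}")
        total += contributions[fragment] * count
    return total


if __name__ == "__main__":
    # A molecule described as one CH3, three CH2 and one OH fragment.
    print(estimate_log_p({"CH3": 1, "CH2": 3, "OH": 1}))
    # Replacing the OH fragment with a phenyl fragment raises the estimate.
    print(estimate_log_p({"CH3": 1, "CH2": 3, "C6H5": 1}))
```

Raising an error for an unknown fragment, instead of silently skipping it, mirrors a practical limitation of contribution methods: they cannot score structures outside their parameter table.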
