
ROLE OF BIOINFORMATICS IN DRUG DESIGNING AND DEVELOPMENT

Division of Biochemistry,
Indian Veterinary Research Institute, Izatnagar, India-243122

INTRODUCTION

The in silico identification of novel drug targets is now feasible by systematically searching for paralogs (related proteins within an organism) of known drug targets (e.g. it may be possible to modify an existing drug to bind to the paralog).

The entire genomes of pathogenic and nonpathogenic strains of a microbe can be compared to identify genes/proteins associated with pathogenicity. Curr. Opin. Microbiol. 1:572-579 1998

Using gene expression microarrays and gene chip technologies, a single device can be used to evaluate and compare the expression of up to 20000 genes of healthy and diseased individuals at once. Trends Biotechnol 19:412-415 2001

INFORMATICS

The ability to transform raw data into meaningful information by applying computerized techniques for managing, analyzing, and interpreting data.

The identification of new biological targets has benefited from the genomics approach, e.g. the sequencing of the human genome. Nature 409:860-921 2001; Science 291:1304-1351 2001

Blueprint of all proteins

Bioinformatics methods are used to transform the raw sequence into meaningful information (e.g. genes and their encoded proteins) and to compare whole genomes.

IMPORTANT POINTS IN DRUG DESIGN BASED ON BIOINFORMATICS TOOLS

Detect the Molecular Bases for Disease


Detection of drug binding site

Tailor drug to bind at that site

Protein modeling techniques

Traditional Method (brute force testing)


Screen likely compounds built

Rational drug design techniques


Modeling large number of compounds (automated)


Application of artificial intelligence

Limitation of known structures

TECHNOLOGY

GENOMICS, PROTEOMICS & BIOPHARM.
Potentially producing many more targets and personalized targets

HIGH THROUGHPUT SCREENING
Screening up to 100,000 compounds a day for activity against a target protein

VIRTUAL SCREENING
Using a computer to predict activity

COMBINATORIAL CHEMISTRY
Rapidly producing vast numbers of compounds

MOLECULAR MODELING
Computer graphics & models help improve activity

IN VITRO & IN SILICO ADME MODELS
Tissue and computer models begin to replace animal testing

Stages shown alongside: Identify disease, Isolate protein, Find drug, Preclinical testing

DRUG DISCOVERY PROCESS WITH BIOINFORMATICS

Target Identification

Target validation and the identification of ligand binding regions

Lead optimization through docking

Clinical trial

DRUG TARGET IDENTIFICATION

The identification of new, clinically relevant molecular targets is of utmost importance to the discovery of innovative drugs. It has been estimated that up to 10 genes contribute to multifactorial diseases. Science 287:1960-1964 (2000) Typically, these disease genes are linked to another 5 to 10 gene products in physiological circuits which are also suitable for pharmaceutical intervention. If these numbers are multiplied by the number of diseases that pose a major medical problem in the industrial world, then there are ~5000 to 10000 potential drug targets.
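
The multiplication behind this estimate can be sketched in a few lines of Python. The slide does not state how many major diseases are counted; the value of 100 used here is an assumption, chosen only because it reproduces the quoted range.

```python
# Back-of-the-envelope estimate of the number of potential drug targets.
GENES_PER_DISEASE = 10      # "up to 10 genes" per multifactorial disease (from the text)
LINKED_PRODUCTS = (5, 10)   # each disease gene is linked to 5-10 further gene products (from the text)
MAJOR_DISEASES = 100        # ASSUMPTION: this number is not given in the text


def target_range(diseases=MAJOR_DISEASES):
    """Return the (low, high) estimate of potential drug targets."""
    low = diseases * GENES_PER_DISEASE * LINKED_PRODUCTS[0]
    high = diseases * GENES_PER_DISEASE * LINKED_PRODUCTS[1]
    return low, high


if __name__ == "__main__":
    print(target_range())
```

Changing the assumed number of diseases scales both ends of the range linearly, so the estimate is only as good as that input.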

DRUG TARGET IDENTIFICATION DATABASE


In the age of genomics, the discovery of novel drug targets needs to incorporate and integrate different sources of data, including gene expression data, gene sequence data and gene polymorphism data. Many public biological databases warehouse and provide a great amount of functional information for drug discovery. Combining these databases into a systematic analysis architecture helps to infer the underlying interactions of genes and to gain insights into the pathway structures with which drug targets interact.

LIST OF SOME RELEVANT DATABASES FOR DRUG TARGET IDENTIFICATION

Database   Access                                    Contents
BIND       http://bind.ca                            The biomolecular interaction network database
KEGG       http://www.genome.ad.jp/kegg/             Kyoto encyclopedia of genes and genomes
OMIM       http://www3.ncbi.nlm.nih.gov/Omim/        Online Mendelian inheritance in man
PIM        http://proteome.wayne.edu/PIMdb.html      Protein interactions maps database
KinG       http://hodgkin.mbu.iisc.ernet.in/~king    Protein kinases database
GPCRDB     http://www.gpcr.org/7tm/                  G protein-coupled receptors database
GEO        http://www.ncbi.nlm.nih.gov/geo/          Gene expression omnibus

THE NETWORK-BASED STRATEGY FOR DRUG TARGET IDENTIFICATION

With the development of bioinformatics, a number of computational techniques have been used to search for novel drug targets from the information contained in genomic data. The network-based strategy for drug target identification attempts to reconstruct the endogenous metabolic, regulatory and signaling networks with which potential drug targets interact. With the development of microarray technology, large volumes of gene expression and protein expression data have been produced, and a considerable number of models have been proposed to infer gene networks or protein networks from these data. Microarray data, such as drug response expression data, time-course expression data and steady-state expression data from gene knockouts, could be used for this purpose.
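
As a minimal illustration of inferring a gene network from expression data, the sketch below links two genes whenever their expression profiles are strongly correlated. Correlation thresholding is only one of the simplest of the many proposed models and is not claimed to be the method of any tool named in these slides; the gene names, expression values and the 0.9 threshold are all invented for illustration.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / math.sqrt(var_x * var_y)


def coexpression_edges(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, round(r, 3)))
    return edges


# Hypothetical time-course expression values (arbitrary units).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],  # rises with geneA
    "geneC": [5.0, 4.1, 2.9, 2.0, 1.1],  # falls as geneA rises
    "geneD": [3.0, 1.0, 4.0, 1.0, 3.5],  # no clear relationship
}

if __name__ == "__main__":
    for edge in coexpression_edges(profiles):
        print(edge)
```

A correlation edge says only that two genes vary together; it does not give the direction of regulation, which is why knockout and drug-response data are valuable additional inputs.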

LIST OF SOME RELEVANT COMPUTATIONAL TOOLS FOR GENE NETWORK IDENTIFICATION

Tools          Access                                                 Contents
GNA            http://www-helix.inrialpes.fr/gna                      Tool for the modeling and simulation of genetic regulatory networks
BioMiner       http://www.zbi.uni-saarland.de                         System for analyzing and visualizing biochemical pathways and networks
GenePath       http://genepath.org                                    Tool for automated construction of genetic networks from mutant data
PathFinder     http://bibiserv.techfak.uni-bielefeld.de/pathfinder/   Tool for biochemical pathways reconstruction and dynamic visualization
ToPNet         http://www.biosolveit.de/ToPNet/                       Tool for joint analysis of biological networks and expression data
VisANT         http://visant.bu.edu                                   Integrative platform for network/pathway analysis
Pathway Miner  http://www.biorag.org/pathway.html                     Extracting gene association networks from molecular pathways

TARGET VALIDATION
Target validation involves demonstrating the relevance of the target protein in a disease process or pathogenicity, and ideally requires both gain-of-function and loss-of-function studies. This is accomplished primarily with knock-out or knock-in animal models, small molecule inhibitors/agonists/antagonists, antisense nucleic acid constructs, ribozymes, and neutralizing antibodies. In silico characterization can be carried out using approaches such as genetic-network mapping, protein-pathway mapping, protein-protein interactions, disease-locus mapping, and subcellular localization predictions.

Bioinformatics is being increasingly used to support target validation by providing functionally predictive information mined from databases and experimental datasets using a variety of computational tools.

Sequence-based approaches: The most commonly used approach to assign function to proteins is by sequence similarity. The Eukaryotic Linear Motif (ELM) server (http://elm.eu.org/) is a resource for investigating short linear peptide motifs which are used for cell compartment targeting, protein-protein interaction, and regulation by phosphorylation, acetylation, glycosylation and a range of other post-translational modifications.

Structure-based approaches: Although homology modelling (e.g. http://swissmodel.expasy.org/) produces the most accurate models, it requires homologous proteins with a known structure and a high percentage of sequence identity with the target protein.
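
Short linear motifs are usually written as regular-expression-like patterns, so a motif scan can be sketched with Python's standard re module. The example uses the well-known N-glycosylation consensus N-{P}-[ST]-{P} as a familiar pattern; it does not call the ELM server or reproduce its motif library, and the peptide sequence is invented for illustration.

```python
import re

# N-glycosylation consensus N-{P}-[ST]-{P} written as a Python regular expression.
# Wrapping it in a lookahead lets overlapping occurrences be reported as well.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def find_motif(sequence, pattern=N_GLYCOSYLATION):
    """Return (1-based start position, matched residues) for every motif occurrence."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical peptide sequence.
sequence = "MKNASLLNPTGANGSAQ"

if __name__ == "__main__":
    for position, residues in find_motif(sequence):
        print(position, residues)
```

A pattern match only marks a candidate site. Whether the site is functional depends on context such as cellular compartment and structural accessibility, which the regular expression cannot see.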

LEAD COMPOUND IDENTIFICATION


This stage is the identification of a small-molecule hit as a starting point for the hit-to-lead process. The identification of small-molecule modulators of protein function and the process of transforming these into high-content lead series are key activities in modern drug discovery (Robert AG 2006). Hits can be identified by one or more of several technology-based approaches, such as high-throughput biochemical and cellular assays, assays of natural products, and structure-based design.

HIGH-THROUGHPUT SCREENING
High-throughput screening is used to test large numbers of compounds for their ability to affect the activity of target proteins. Natural product and synthetic compound libraries with millions of compounds are screened using a test assay. Curr Opin Chem Biol 4:445-451 2000 There are concerns with the numbers approach to screening for a lead molecule. In theory, generating the entire chemical space for drug molecules and testing them would be an elegant approach to drug discovery; in practice, that space is far too large to synthesize and test exhaustively. One solution may be to accumulate as much knowledge as possible on biological targets (e.g. structure, function, interactions, ligands) and choose targeted approaches to chemical synthesis.

VIRTUAL SCREENING
Virtual screening is a computational technique used in drug discovery research. It involves the rapid in silico assessment of large libraries of chemical structures in order to identify those structures which are most likely to bind to a drug target, typically a protein receptor or enzyme. The aim of virtual screening is to identify molecules of novel chemical structure that bind to the macromolecular target of interest. There are two broad categories of screening techniques: ligand-based and structure-based.

STRUCTURE BASED SCREENING


Three-dimensional structures of compounds from virtual or physically existing libraries are docked into binding sites of target proteins with known or predicted structure. Scoring functions evaluate the steric and electrostatic complementarity between compounds and the target protein. The highest ranked compounds are then suggested for biological testing. Once hits (compounds that elicit a positive response in an assay) have been identified via the screening approach, these are validated by re-testing them and checking the purity and structure of the compounds.
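
A toy scoring function makes the idea of steric and electrostatic complementarity concrete. The sketch below sums a Lennard-Jones 12-6 term and a Coulomb term over all ligand-protein atom pairs and ranks compounds by the total. It is an assumption-laden illustration, not the scoring function of any docking program: the epsilon, sigma and dielectric values, the atom coordinates, the charges and the compound names are all invented.

```python
import math

COULOMB_CONSTANT = 332.06  # kcal*Angstrom/(mol*e^2); converts charges and distance to kcal/mol


def pair_energy(distance, q1, q2, epsilon=0.1, sigma=3.4, dielectric=4.0):
    """Steric (Lennard-Jones 12-6) plus electrostatic (Coulomb) energy of one atom pair."""
    ratio6 = (sigma / distance) ** 6
    steric = 4.0 * epsilon * (ratio6 ** 2 - ratio6)
    electrostatic = COULOMB_CONSTANT * q1 * q2 / (dielectric * distance)
    return steric + electrostatic


def score_pose(ligand_atoms, protein_atoms):
    """Sum pair energies over all ligand-protein atom pairs; lower is better."""
    total = 0.0
    for lx, ly, lz, lq in ligand_atoms:
        for px, py, pz, pq in protein_atoms:
            d = math.dist((lx, ly, lz), (px, py, pz))
            total += pair_energy(d, lq, pq)
    return total


def rank_compounds(poses, protein_atoms):
    """Return compound names ordered from best (lowest) to worst score."""
    return sorted(poses, key=lambda name: score_pose(poses[name], protein_atoms))


# Hypothetical binding-site atoms and one-atom "ligands": (x, y, z, partial charge).
protein_atoms = [(0.0, 0.0, 0.0, -0.5), (4.0, 0.0, 0.0, 0.3)]
poses = {
    "cmpd_1": [(0.0, 3.8, 0.0, 0.5)],   # opposite charge at contact distance
    "cmpd_2": [(0.0, 3.8, 0.0, -0.5)],  # same position, like charge
    "cmpd_3": [(0.0, 2.0, 0.0, 0.5)],   # too close: steric clash
}

if __name__ == "__main__":
    print(rank_compounds(poses, protein_atoms))
```

Real scoring functions add terms for hydrogen bonding, desolvation and ligand flexibility, which is one reason top-ranked compounds still need experimental validation.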

STRUCTURE-BASED DRUG DESIGN


Flowchart elements (structure-based drug design cycle):
Compound databases, microbial broths, plant extracts, combinatorial libraries
3-D ligand databases
Docking, linking or binding
Target enzyme or receptor
3-D structure by crystallography, NMR, electron microscopy or homology modeling
Random screening, synthesis
Receptor-ligand complex
Testing
Redesign to improve affinity, specificity, etc.
Lead molecule
3-D QSAR

LIGAND-BASED SCREENING
Given a set of structurally diverse ligands that bind to a receptor, a model of the receptor (a pharmacophore model) can be built by exploiting the collective information contained in this set of ligands. A candidate ligand can then be compared to the pharmacophore model to determine whether it is compatible with it and therefore likely to bind. Another approach to ligand-based virtual screening is to use 2D chemical similarity analysis to scan a database of molecules against one or more active ligand structures. A popular approach to ligand-based virtual screening is based on searching for molecules with a shape similar to that of known actives, as such molecules will fit the target's binding site and hence will be likely to bind the target.
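
2D similarity searching is commonly done by comparing molecular fingerprints with the Tanimoto coefficient (shared bits divided by total distinct bits). The sketch below assumes the fingerprints have already been computed by some cheminformatics toolkit; the bit sets, molecule names and the 0.5 cut-off are invented for illustration.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def screen(database, active, threshold=0.5):
    """Return (name, similarity) for entries at least `threshold` similar to the active, best first."""
    scored = [(name, tanimoto(bits, active)) for name, bits in database.items()]
    hits = [(name, sim) for name, sim in scored if sim >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical fingerprints: each set lists the bit positions that are switched on.
active = {1, 4, 7, 9, 12, 15}
database = {
    "mol_1": {1, 4, 7, 9, 12, 20},  # 5 shared bits out of 7 distinct
    "mol_2": {2, 3, 5, 8},          # nothing in common
    "mol_3": {1, 4, 7, 9, 12, 15},  # identical fingerprint
    "mol_4": {1, 4, 30, 31, 32},    # 2 shared bits out of 9 distinct
}

if __name__ == "__main__":
    for name, similarity in screen(database, active):
        print(name, round(similarity, 3))
```

The threshold trades recall against novelty: a high cut-off returns close analogues of the known active, while a lower one admits more structurally diverse candidates along with more false positives.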

LEAD OPTIMIZATION
Molecules are chemically modified and subsequently characterized in order to obtain compounds with suitable properties to become a drug. Leads are characterized with respect to pharmacodynamic properties such as efficacy and potency in vitro and in vivo, physicochemical properties, pharmacokinetic properties, and toxicological aspects. Lead structures are optimized for target affinity and selectivity. Docking techniques are currently applied for this purpose.

CONT..
Only if the hits fulfill certain criteria are they regarded as leads. The criteria can originate from:
1) Pharmacodynamic properties - efficacy, potency, selectivity
2) Physicochemical properties - water solubility, chemical stability, Lipinski's rule of five
3) Pharmacokinetic properties - metabolic stability and toxicological aspects
4) Chemical optimization potential - ease of chemical synthesis and derivatization
5) Patentability
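
Of these criteria, Lipinski's rule of five is the easiest to express in code. The sketch below checks the four standard limits (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors) and, following common practice, tolerates one violation. The descriptor values are assumed to come from elsewhere, and the two example compounds are invented.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    molecular_weight: float  # daltons
    log_p: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d):
    """Return a description of every rule-of-five limit the compound exceeds."""
    violations = []
    if d.molecular_weight > 500:
        violations.append("molecular weight > 500")
    if d.log_p > 5:
        violations.append("logP > 5")
    if d.h_bond_donors > 5:
        violations.append("H-bond donors > 5")
    if d.h_bond_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    return violations


def passes_rule_of_five(d, max_violations=1):
    """True if the compound breaks no more than `max_violations` limits."""
    return len(lipinski_violations(d)) <= max_violations


# Hypothetical descriptor values for two compounds.
small_hit = Descriptors(molecular_weight=180.2, log_p=1.2, h_bond_donors=1, h_bond_acceptors=4)
bulky_hit = Descriptors(molecular_weight=720.0, log_p=6.3, h_bond_donors=6, h_bond_acceptors=12)

if __name__ == "__main__":
    print(passes_rule_of_five(small_hit), lipinski_violations(small_hit))
    print(passes_rule_of_five(bulky_hit), lipinski_violations(bulky_hit))
```

The rule is a filter for likely oral absorption only; it says nothing about potency, selectivity or the other criteria in the list above.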

DOCKING METHODS

Docking of ligands to proteins is a formidable problem since it entails optimization of the 6 positional degrees of freedom (three translational and three rotational) of the ligand relative to the receptor.
Rigid vs flexible
Speed vs reliability
Manual docking
Interactive
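
The six positional degrees of freedom can be pictured as a six-number pose: three translations and three rotation angles applied to the ligand as a rigid body. The sketch below is a minimal illustration of that parameterization only; the choice of rotation order (about x, then y, then z) is an assumption, and the two-atom ligand is invented.

```python
import math


def rotation_matrix(rx, ry, rz):
    """Rotation about x, then y, then z (angles in radians), i.e. R = Rz * Ry * Rx."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def apply_pose(coords, pose):
    """Place ligand atoms according to a 6-parameter pose (tx, ty, tz, rx, ry, rz)."""
    tx, ty, tz, rx, ry, rz = pose
    r = rotation_matrix(rx, ry, rz)
    return [
        (
            r[0][0] * x + r[0][1] * y + r[0][2] * z + tx,
            r[1][0] * x + r[1][1] * y + r[1][2] * z + ty,
            r[2][0] * x + r[2][1] * y + r[2][2] * z + tz,
        )
        for x, y, z in coords
    ]


# Hypothetical two-atom ligand, rotated 90 degrees about z and shifted by (1, 2, 3).
ligand = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
pose = (1.0, 2.0, 3.0, 0.0, 0.0, math.pi / 2)
placed = apply_pose(ligand, pose)

if __name__ == "__main__":
    for atom in placed:
        print(tuple(round(c, 6) for c in atom))
```

A rigid-docking search explores this six-dimensional space. Flexible docking adds further variables, typically one torsion angle per rotatable bond, which is where the trade-off between speed and reliability arises.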

DOCKING TERMINOLOGY
Receptor or host or lock: The "receiving" molecule, most commonly a protein or other biopolymer.
Ligand or guest or key: The complementary partner molecule which binds to the receptor.
Binding mode: The orientation of the ligand relative to the receptor as well as the conformation of the ligand and receptor when bound to each other.
Pose: A candidate binding mode.
Scoring: The process of evaluating a particular pose by counting the number of favorable intermolecular interactions such as hydrogen bonds and hydrophobic contacts.
Ranking: The process of classifying which ligands are most likely to interact favorably with a particular receptor based on the predicted free energy of binding.

ACTIVE SITE IDENTIFICATION


Active site identification is the first step in this program. It analyzes the protein to find the binding pocket and the interaction sites within the binding pocket, and then prepares the necessary data for Ligand fragment link. The basic inputs for this step are the 3D structure of the protein and a pre-docked ligand in PDB format, as well as their atomic properties. The space inside the ligand binding region is studied with virtual probe atoms of the four types above, so that the chemical environment of all spots in the ligand binding region is known.

AUTOMATED DOCKING METHODS


The basic idea is to fill the active site of the target protein with a set of spheres, and then to match the centres of these spheres as well as possible with the atoms of small molecules with known 3-D structures taken from a database.
Examples: DOCK, CAVEAT, AUTODOCK, LEGEND, ADAM, LINKOR, LUDI.

THERMODYNAMICS OF RECEPTOR-LIGAND BINDING

Proteins that interact with drugs are typically enzymes or receptors.
Drugs may be classified as:
substrates/inhibitors (for enzymes)
agonists/antagonists (for receptors)
Ligands for receptors normally bind via non-covalent, reversible binding.
Enzyme inhibitors have a wide range of modes: non-covalent reversible, covalent reversible/irreversible, or suicide inhibition.
Enzymes prefer to bind transition states (reaction intermediates) and may not optimally bind substrates, as part of the binding energy is used for catalysis.

CONT
In contrast, inhibitors are designed to bind with higher affinity: their affinities often exceed the corresponding substrate affinities by several orders of magnitude! Agonists are analogous to enzyme substrates: part of the binding energy may be used for signal transduction, inducing a conformational or aggregation shift. To understand what forces are responsible for ligands binding to receptors/enzymes, it is worthwhile to consider what forces drive protein folding, since they share many common features.
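
Differences in affinity of "several orders of magnitude" can be put on an energy scale with the standard relation between the dissociation constant and the binding free energy, ΔG° = RT ln Kd (Kd relative to a 1 M standard state). The relation is textbook thermodynamics rather than something stated on the slide, and the Kd values in the sketch below are hypothetical.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal/(mol*K)


def binding_free_energy(kd_molar, temperature=298.15):
    """Standard binding free energy in kcal/mol from a dissociation constant in mol/L."""
    return GAS_CONSTANT * temperature * math.log(kd_molar)


def affinity_gain(kd_substrate, kd_inhibitor, temperature=298.15):
    """Extra binding free energy of the inhibitor over the substrate (negative = tighter)."""
    return binding_free_energy(kd_inhibitor, temperature) - binding_free_energy(kd_substrate, temperature)


if __name__ == "__main__":
    # Hypothetical values: a substrate with Kd = 100 uM and an inhibitor with Kd = 100 nM.
    print(round(binding_free_energy(1e-4), 2))
    print(round(binding_free_energy(1e-7), 2))
    print(round(affinity_gain(1e-4, 1e-7), 2))
```

At room temperature each tenfold improvement in Kd corresponds to roughly 1.4 kcal/mol, about the strength of a single good hydrogen bond or hydrophobic contact, which is why small structural changes to a ligand can shift affinity by orders of magnitude.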

CONT

The observed structure of a protein is generally a consequence of the hydrophobic effect!
Secondary amides form much stronger H-bonds to water than to other secondary amides.
Hydrophobic collapse
Proteins generally bury hydrophobic residues inside the core, exposing hydrophilic residues to the exterior.
Salt-bridges inside

Ligand binding clefts in proteins often expose hydrophobic residues to solvent and may contain partially desolvated hydrophilic groups that are not paired.

SCORING METHOD

CLINICAL TRIALS
The NIH organizes clinical trials into 5 different types:
Treatment trials: test experimental treatments or a new combination of drugs.
Prevention trials: look for ways to prevent a disease or prevent it from returning.
Diagnostic trials: find better tests or procedures for diagnosing a disease.
Screening trials: test methods of detecting diseases.
Quality of Life trials: explore ways to improve comfort and quality of life for individuals with a chronic illness.

CONT..
Pharmaceutical clinical trials are commonly classified into 4 phases (as of 2006, there are now 5):
Phase 0 - a recent designation for exploratory, first-in-human trials, designed to expedite the development of promising therapeutic agents by establishing early on whether the agent behaves in human subjects as was anticipated from preclinical studies. New Scientist, March 2006, Catastrophic immune response may have caused drug trial horror
Phase I - a small group of healthy volunteers (20-80) is selected to assess the safety, tolerability, pharmacokinetics, and pharmacodynamics of a therapy. These trials normally include dose-ranging studies so that doses for clinical use can be set/adjusted.

CONT..
Phase I - there are 3 common kinds of phase I trials:
Single Ascending Dose (SAD) studies - a small group of patients is given a single dose of the drug and then monitored over a period of time. If they do not exhibit any adverse side effects, the dose is escalated and a new group of patients is given the higher dose.
Multiple Ascending Dose (MAD) studies - a group of patients receives multiple low doses of the drug, while blood (and other fluids) are collected at various time points and analyzed to understand how the drug is processed within the body. The dose is subsequently escalated for further groups.
Food effect studies - designed to investigate any differences in absorption caused by eating before the dose is given.

CONT..

Phase II - performed on larger groups (20-300) and designed to assess the activity of the therapy and to continue Phase I safety assessments.
Phase III - randomized controlled trials on large patient groups (hundreds to thousands) aimed at being the definitive assessment of the efficacy of the new therapy, in comparison with standard therapy. Side effects are also monitored. It is typically expected that there be at least two successful phase III clinical trials to obtain approval from the FDA.
Once a drug has proven acceptable, the trial results are combined into a large document which includes a comprehensive description of manufacturing procedures, formulation details, shelf life, etc. This document is submitted to the FDA for review.

CONT..

Phase IV - post-launch safety monitoring and ongoing technical support of a drug. These studies may be mandated or initiated by the pharmaceutical company, and are designed to detect rare or long-term adverse effects over a much larger patient population and longer timescale than was possible during clinical trials.

SIGNIFICANCE
As structures of more and more protein targets become available through crystallography, NMR and bioinformatics methods, there is an increasing demand for computational tools that can identify and analyze active sites and suggest potential drug molecules that can bind to these sites specifically.
The time and cost required for designing a new drug are immense and at an unacceptable level. According to some estimates, it costs about $880 million and takes 14 years of research to develop a new drug before it is introduced to the market. The intervention of computers at some plausible steps is imperative to bring down the cost and time required in the drug discovery process.
