
The application of systems biology to drug discovery

Carolyn R Cho 1, Mark Labow 2, Mischa Reinhardt 3, Jan van Oostrum 4 and Manuel C Peitsch 4

Recent advances in the ‘omics’ technologies, scientific computing and mathematical modeling of biological processes have started to fundamentally impact the way we approach drug discovery. Recent years have witnessed the development of genome-scale functional screens, large collections of reagents, protein microarrays, databases and algorithms for data and text mining. Taken together, they enable unprecedented descriptions of complex biological systems, which are testable by mathematical modeling and simulation. While the methods and tools are advancing, it is their iterative and combinatorial application that defines the systems biology approach.

Addresses

1 Department of Systems Biology, Genome and Proteome Sciences, Novartis Institutes of BioMedical Research, Cambridge MA 02139, USA

2 Department of Platform Chemistry and Biology, Genome and Proteome Sciences, Novartis Institutes of BioMedical Research, Cambridge MA 02139, USA

3 Department of Scientific Computing and Statistics, Genome and Proteome Sciences, Novartis Institutes of BioMedical Research, Novartis AG, CH-4002 Basel, Switzerland

4 Department of Systems Biology, Genome and Proteome Sciences, Novartis Institutes of BioMedical Research, Novartis AG, CH-4002 Basel, Switzerland

Corresponding author: Peitsch, Manuel C (manuel.peitsch@novartis.com)

Current Opinion in Chemical Biology 2006, 10:294–302

This review comes from a themed issue on Next-generation therapeutics. Edited by Clifton E Barry III and Alex Matter

Available online 5th July 2006

1367-5931/$ – see front matter © 2006 Elsevier Ltd. All rights reserved.

Introduction

Drug discovery is a complex undertaking facing many challenges [1], not the least of which is a high attrition rate, as many promising candidates prove ineffective or toxic in the clinic owing to a poor understanding of the diseases, and thus the biological systems, they target. Therefore, it is broadly agreed that to increase the productivity of drug discovery one needs a far deeper understanding of the molecular mechanisms of diseases, taking into account the full biological context of the drug target and moving beyond individual genes and proteins [2–5]. Systems biology, and especially the elucidation and dynamic analysis of cellular signaling pathways, provides a new grammar [2], or framework, for drug discovery.

Systems biology is the ‘systematic’ interrogation of the biological processes within the complex, physiological milieu in which they function. Insight into the combined behavior of these many, diverse, interacting components is achieved through the integration of experimental, mathematical and computational sciences in an iterative approach (Figure 1). Through this contextual understanding of the molecular mechanisms of disease, a systems approach has the potential to further facilitate the identification and validation of the therapeutic modulation of regulatory and metabolic networks and hence help identify targets and biomarkers, as well as ‘off-target’ and side effects of drug candidates [3–5].

Here, we focus on selected recent advances in the disciplines of systems biology (Box 1) that are relevant to drug discovery.

Experimental methods

Experimental approaches in systems biology are generally aimed at identifying the components of a system and their interactions, and monitoring the effect of perturbations on these components. Recent advances in proteomics, genomics and metabolomics [6,7] and their integration [8] are radically transforming the drug discovery process. For instance, the identification of protein network components and the characterization of their post-translational modifications have recently reached new levels of scale and complexity, as exemplified by the analysis of the insulin receptor substrate 1 serine/threonine phosphorylation sites [9] and the interactome analysis of the human TNFα/NF-κB network members [10] and the ErbB/EGF receptors [11,12]. This has been enabled not only by the rapidly maturing MS-based proteomics methods [9–11] but also by protein arrays [12], which have progressed considerably over the past few years [13–16]. Although protein forward arrays have been successfully applied in a number of settings [12–15], including the profiling of receptor tyrosine kinase activation [14], reverse protein arrays are arguably the most broadly applicable technology. They are based on the principle that complex protein mixtures (such as cell lysates) are spotted in an array format and probed with selected antibodies in a multiplexed manner. Reverse arrays have been used to analyze cell lines for potential biomarkers [17], profile molecular pathways in cells excised from cancer tissue by laser capture micro-dissection [15,18,19], and detect auto-antibodies in serum

Figure 1

The application of systems biology to drug discovery Cho et al. 295


Process diagram of systems biology.

[16,20], which opens new opportunities for their application to screening in a clinical setting [20]. Furthermore, reverse arrays are particularly well suited to monitor dynamic network responses (Figure 2) to compound-induced perturbations. This compound-based systems response profiling [5], or structure pathway activity relationship (Figure 3), has the potential to provide early indications about possible off-target and toxic effects of drug candidates. The availability of highly specific antibodies is, however, a prerequisite for reverse arrays. Whereas a reasonable fraction of commercially available antibodies appears to be suitable following validation, the generalization of reverse arrays will depend on the generation of broad collections of high-quality antibodies [15]. For instance, the Human Protein Atlas Initiative (http://www.proteinatlas.org), part of the Human Antibody Initiative of the Human Proteome Organization (HUPO), has started a systematic approach to the generation and validation of antibodies [21,22]. It is, however, noteworthy that molecular recognition is not limited to immunoglobulin domains, and proteomics may soon benefit from newly engineered protein-binding domains such as designed ankyrin-repeat proteins (DARPins) [23]. While protein arrays provide data for the average content of a sample (e.g., cell lysates), recent developments in flow cytometry-based single-cell proteomics can further complement these technologies in that a limited number of signaling events and surface markers can be measured simultaneously in the same cell and hence enable discrimination between various cell populations in rare samples and biopsies [24].

Beyond monitoring the effect of perturbations on pre-defined network components, systems biology relies on the systematic analysis of gene function in signaling pathways and cellular processes. This has recently been

296 Next-generation therapeutics

Box 1 Systems biology is interdisciplinary.

Experimental sciences: Direct systemic interrogation. Large-scale genomics, proteomics and metabolite measurements are used to monitor gene translation, protein expression, signaling events and metabolite fluxes induced by the systematic perturbation of a biological system by biological, genetic or chemical factors. Furthermore, large-scale experimental methods are used to identify the nodes of signaling networks, through the comprehensive identification of interaction partners and protein modifications (e.g., phosphorylations). The comparison of samples from a disease state with those of matched normal donors and the creation of animal models of disease (through the knock-out of selected genes and/or the introduction of specific mutations) represent a well-established approach to study a system through the analysis of naturally occurring perturbations.

Data analysis and pathway informatics: Analysis and understanding of data in the context of larger systems. Statistical and computational methods are used to analyze and interpret large experimental data sets with the aim to identify molecular species significantly affected by the experimental conditions (perturbations). Databases and software tools are used to store, manage and visualize expression data in the context of cellular network and pathway information. This information is assembled from literature by hand or using mining techniques.

Literature mining: Discovery, extraction and synthesis of current knowledge. Entity recognition (identifying substances), information extraction (identifying relationships between biological entities) and natural language processing (combination of syntax and semantics analysis to extract relationships from complex sentences) are used to extract known facts such as protein–protein interactions, protein phosphorylation, regulatory relationships between molecular entities and genotype–phenotype relationships from the scientific literature. Text mining enables the inference of relationships between extracted entities and facts that are not formally enunciated in a sentence.

Mathematical modeling: Extrapolation and prediction to test understanding. Mathematical modeling and simulation are used to identify assumptions (iterating with literature mining) and gaps in understanding (iterating with additional analysis or data), and to generate new and experimentally verifiable hypotheses, thereby closing the iteration loop.

made possible by the development of cell-based genome-scale approaches, where over 20 000 individual genes can be interrogated in highly multiplexed experiments. These technologies aim to quantify the effects of either the overexpression of individual proteins using full-length cDNAs [25,26] (Figure 4), or the inhibition of gene expression by RNAi molecules [27–29] (Figure 4). They are enabled by the creation of genome-scale collections of reagents [25,30,31] and their optimization through computational approaches [32]. These genome-scale screens produce systems-level activity data for each gene, providing rich databases that can be mined to predict the physiological and biochemical function of individual proteins as well as the regulatory networks and pathways underpinning distinct phenotypes. In one early set of experiments performed in mammalian cells, data from several cDNA overexpression screens were used to predict the biochemical activity of a gene of previously unknown function [25]. In a later example, this approach was applied to identify genes and pathways capable of inducing nuclear TORC accumulation [26]. The genome-scale RNAi-based screen was developed in Drosophila melanogaster cell cultures and used to examine the role of 21 000 genes in cell growth [33]. This approach has since been applied to a number of signal transduction and cell-based processes in D. melanogaster and mammalian cell systems [34–36] as well as in vivo using the nematode Caenorhabditis elegans [34,37]. Furthermore, similar approaches have recently been used to globally assess the role of non-coding RNAs on pathway function in mammalian cells [38]. Although high-throughput functional systems-level analysis provides quantitative data about gene activity and functional interactions, it is expected that new technologies and areas of investigation, such as microRNA biology [39,40], will emerge and further complement our understanding of biological systems.

Data mining and pathway informatics

The evolution of these genomic and proteomic methods has necessitated the development of new algorithms to analyze the resulting data in the context of drug discovery [41]. In particular, integrating data from different experiments is a challenge that is being successfully addressed. For example, a Bayesian inference of sub-networks from a set of 300 microarray experiments has been used to uncover a number of pathways [42], and methods have been developed to overlay gene expression data with genome-wide transcription factor location data obtained by ChIP-on-Chip experiments [43], leading to the identification of previously unknown regulatory networks using data obtained from rapamycin-treated S. cerevisiae [43]. Besides methods that infer pathways and regulatory networks, there is a growing number (>150) of databases [3], of which KEGG [44] (http://www.genome.jp/kegg/kegg2.html) is probably the best known, that collect such information from scientific publications using literature mining and manual curation. Such databases can be used to overlay functional and pathway information onto rank-ordered gene lists derived from differential expression experiments [45]. This approach, called gene set enrichment analysis (GSEA), removes the undue bias of selecting individual up-regulated genes by focusing on entire sets of genes [45]. GSEA has led to the discovery of several disease-relevant pathways, including in cancer [46], Huntington’s disease [47] and myoblast differentiation [48], with potential implications for drug and biomarker discovery. By further combining signature-based predictions across several pathways, one can identify coordinated patterns of pathway deregulation. Such patterns were shown to distinguish specific cancers and tumor subtypes [49] and to reflect the biology and outcome of specific cancers. Furthermore, in cell lines, these patterns predict sensitivity to therapeutic agents [49] and may help reposition or extend the application of existing drugs.
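The set-based idea behind GSEA can be illustrated with a minimal sketch. The version below is a simplified, unweighted running-sum enrichment score without any permutation test for significance; the published method [45] is more elaborate, and this should not be read as its implementation. The function name `enrichment_score` and the toy gene names are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for a ranked gene list.

    Walk down the ranked list, stepping up at genes in the set and down
    otherwise; return the running-sum value with the largest magnitude.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hits = len(members)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    step_up = 1.0 / n_hits      # increment at a gene-set member
    step_down = 1.0 / n_miss    # decrement at a non-member
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        running += step_up if gene in members else -step_down
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranked list (most up-regulated first); gene names are made up.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]
top_set = ["g1", "g2", "g3"]        # concentrated at the top of the list
spread_set = ["g1", "g5", "g8"]     # scattered through the list

es_top = enrichment_score(ranked, top_set)
es_spread = enrichment_score(ranked, spread_set)
```

A set whose members cluster at the top of the ranking yields a larger score than a set of the same size scattered through the list, which is why the analysis does not depend on any single gene passing an individual threshold.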

Literature mining

The scientific literature (which includes patents) is where the key knowledge and facts relevant to systems biology

Figure 2


Sample applications of reverse protein arrays. (a) Monitoring phosphorylation events: Jurkat cells were treated with OKT3 and α-CD28 antibodies for the indicated period (x-axis), lysed and analyzed on reverse protein arrays using an α-ERK (blue circles) and an α-phospho-ERK antibody (red bars). This demonstrates that transient and rapid phosphorylation events can be measured using reverse protein arrays. (b) Monitoring other signaling events: untreated and LiCl-treated Jurkat cells were incubated for 16 h with the indicated concentrations of LiCl, which mimics Wnt pathway/β-catenin signaling by inhibiting GSK3β. Following lysis of the cells, the β-catenin levels were assessed using a specific antibody. The inhibition of GSK3β-mediated phosphorylation of β-catenin blocks its degradation, leading to the observed accumulation of β-catenin. (c,d) Monitoring the downstream effect of cell signaling inhibitors. Starved A431 cells were stimulated with insulin and co-treated with increasing concentrations of an inhibitor of the IGF1-receptor tyrosine kinase. After 30 min of treatment, the cells were lysed and the phosphorylation levels of Akt and GSK3β were monitored with antibodies specific for (c) Ser473-phospho-Akt and (d) Ser9-phospho-GSK3β. By plotting the percent inhibition versus inhibitor concentration, one can derive IC50-like data from such experiments. RFI is the relative fluorescence intensity.

are stored and reported [50]. This resource is, however, growing and diversifying at a staggering pace. As a consequence, computational tools designed to efficiently extract entities and their relationships (biological facts) will play a pivotal role in systems biology [51,52]. Indeed, model building starts with the identification of the components of a system and how they interact (Figure 1), facts that are then formalized in a diagram from which a mathematical model will be derived [53]. The best approach to identify protein and gene entities in text is to use a carefully curated list of synonyms [54] and recently developed methods for synonym extraction [55] and terminology disambiguation [56–58]. The subsequent extraction of the interactions between these entities relies on entity co-occurrence analysis [59–61] and natural language processing [62–67]. These methods have been successfully applied to the reconstruction of networks [67,68] relevant to drug discovery and to the analysis of biological data [68,69] and bioinformatics databases [70,71] in the context of literature-derived information and networks. Furthermore, combining text mining and statistical approaches enables scientists to extract significant information about research trends and emerging fields from patents [72], and to infer (potential) new pathway/target-disease relationships from the scientific literature [67,73]. Recent advances in high-performance GRID-based text mining [74] of full-text scientific articles [75] open new doors for the discovery of scientific


Figure 3


Possible application of reverse protein arrays in systems biology: structure pathway activity relationship (SPAR). (a) Selected cell lines are treated with appropriate combinations of activating stimuli and with either (b) si/shRNA or (c) test compounds. Treated cells are sampled in a time-dependent manner and lysed before being spotted on reverse protein arrays. The arrays are (d) incubated with pre-defined antibodies and (e) measurements are taken. The systems response profiles (SRPs) are deduced from (f) the fluorescence intensities and (g) stored in a database along with pathway information. The treatment with siRNAs allows identification of SRPs caused by well-targeted network perturbations, which can serve as the reference set against which SRPs caused by drug candidates can be compared. (h) Thereby off-target effects can be deduced.

facts and interpretations so fundamental to systems biology. Finally, integrating text with biological, chemical and clinical data through text mining and entity-associated rules will enable systems biologists to quickly navigate between scientific data domains that are currently kept in disconnected databases [76].
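As an illustration of the synonym-list and co-occurrence steps described in this section, the following is a minimal sketch only. It assumes a tiny, hypothetical synonym dictionary (`SYNONYMS`) and invented example sentences, and it uses plain sentence-level co-occurrence; the cited systems add terminology disambiguation and natural language processing on top of this.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical curated synonym list: canonical name -> known synonyms.
SYNONYMS = {
    "GSK3B": ["GSK3B", "GSK3 beta", "glycogen synthase kinase 3 beta"],
    "CTNNB1": ["CTNNB1", "beta-catenin"],
    "AKT1": ["AKT1", "Akt", "PKB"],
}

def find_entities(sentence, synonyms=SYNONYMS):
    """Return the set of canonical entities mentioned in one sentence."""
    found = set()
    for canonical, names in synonyms.items():
        for name in names:
            # Whole-word, case-insensitive match of each synonym.
            pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
            if re.search(pattern, sentence, re.IGNORECASE):
                found.add(canonical)
                break
    return found

def cooccurrence_counts(sentences):
    """Count how often each pair of entities shares a sentence."""
    counts = Counter()
    for sentence in sentences:
        entities = sorted(find_entities(sentence))
        counts.update(combinations(entities, 2))
    return counts

# Invented example sentences, for illustration only.
sentences = [
    "Inhibition of GSK3 beta leads to accumulation of beta-catenin.",
    "Akt phosphorylates glycogen synthase kinase 3 beta on Ser9.",
    "Beta-catenin is degraded after phosphorylation by GSK3B.",
    "PKB signaling was unchanged.",
]
pairs = cooccurrence_counts(sentences)
```

Pairs with high counts become candidate edges of a literature-derived network. Co-occurrence alone does not say what kind of relationship links two entities, which is where natural language processing comes in.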

Mathematical modeling

The advances in experimental approaches and in data and literature mining have also accelerated progress in the development and application of modeling approaches [53,77]. The most widely applied modeling method is the deterministic biochemical reaction description. The formalism, analysis and application, which have been reviewed extensively [78–80], have matured to the extent that an annotation standard has begun to emerge [50,81]. Emerging graphical ontology standards [82,83] will greatly aid in harmonizing the academic and commercial software tools that are available. In drug discovery, this modeling method has been successfully applied in pharmacokinetics/pharmacodynamics and dose-response modeling [84,85], and wider application is anticipated (http://www.fda.gov/oc/initiatives/criticalpath/stanski/stanski.html).
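To make the deterministic biochemical reaction description concrete, here is a minimal sketch for a single reversible binding reaction, L + R <-> LR, written as mass-action rate equations and integrated with a hand-written fourth-order Runge-Kutta step. The reaction, the rate constants and the function names are illustrative assumptions, not taken from the models cited above.

```python
def binding_rates(state, k_on, k_off):
    """Mass-action rate equations for L + R <-> LR."""
    ligand, receptor, complex_ = state
    flux = k_on * ligand * receptor - k_off * complex_
    return (-flux, -flux, flux)

def rk4_step(f, state, dt, *args):
    """One fourth-order Runge-Kutta step for dstate/dt = f(state, *args)."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))
    k1 = f(state, *args)
    k2 = f(shifted(state, k1, dt / 2), *args)
    k3 = f(shifted(state, k2, dt / 2), *args)
    k4 = f(shifted(state, k3, dt), *args)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def simulate(state, k_on, k_off, dt=0.01, t_end=50.0):
    """Integrate to t_end and return the final concentrations."""
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(binding_rates, state, dt, k_on, k_off)
    return state

# Illustrative values only (arbitrary units), not taken from the review.
K_ON, K_OFF = 1.0, 0.5
ligand, receptor, complex_ = simulate((1.0, 1.0, 0.0), K_ON, K_OFF)
```

With these assumed constants the system relaxes to the point where the forward and reverse fluxes balance. Each additional molecular species adds one equation and each reaction adds at least one rate parameter.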

One drawback of the deterministic reaction approach is its lack of scalability. Genomic and proteomic approaches are aimed at identifying signaling networks of tens of molecules or more. The size, range of reaction parameters over many orders of magnitude, and extent of unknowns make these models intractable to computational methods.

Figure 4


Large-scale genetic screens. Large curated collections of gene-specific cDNAs and RNAi enable highly multiplexed and systematic gain and loss of function screens. The diagram outlines a hypothetical signal transduction pathway, initiated by a ligand-dependent membrane-bound receptor, which after activation by a ligand leads to the activation of a specific signal transduction pathway. This pathway culminates in a downstream event such as transcriptional induction, which can be monitored directly or, as most often used in high-throughput genetic screens, via an enzymatic or fluorescent reporter gene. (a) Gain of function screens are performed by selectively over-expressing a critical pathway component (grey circles) and are expected to drive the equilibrium of a pathway forward, thus activating the reporter. Conversely, (b) loss of function screens are based on the elimination of a critical component (X) with an RNAi reagent that will prevent the activation of the pathway by a stimulus and thus the reporter gene. The phenotypic response, such as the nuclear transport of regulatory molecules, can be quantified using (c) a microscopy-based method. In this experiment, the effects of over 7000 individual genes on the nuclear import of the TORC1 CREB co-activator were quantified. This allowed the identification of a variety of genes and pathways capable of inducing TORC1 translocation and CREB activation [26].

New methods such as combinatorial reaction generation [86] and linear programming [87,88] address this need by providing automated methods for handling the complexity of large chemical reaction networks.

Other methods are addressing the more fundamental issue of the limitation of the deterministic approximation itself. Stochastic representations account for the effects of small populations [89]. Rule-based computational approaches move completely away from a deterministic description [90–92]. These methods provide a molecule-centric description and can therefore be a natural bridge between data-driven inference and predictive models [93], as are functional inference methods [94]. These methods are highly scalable and easy to simulate; however, it is an open question as to how well they will be able to address questions involving complex non-linear dynamics.
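The effect of small populations can be illustrated with a stochastic simulation. The sketch below uses the Gillespie direct method, which is one common stochastic representation and is chosen here as an assumption; the review does not specify the method used in [89]. The binding reaction, molecule counts and stochastic rate constants (`c_on`, `c_off`) are hypothetical.

```python
import random

def gillespie_binding(n_ligand, n_receptor, n_complex, c_on, c_off, t_end, rng):
    """Gillespie direct method for L + R <-> LR with molecule counts."""
    t = 0.0
    trajectory = [(t, n_ligand, n_receptor, n_complex)]
    while True:
        a_bind = c_on * n_ligand * n_receptor     # propensity of L + R -> LR
        a_unbind = c_off * n_complex              # propensity of LR -> L + R
        a_total = a_bind + a_unbind
        if a_total == 0.0:
            break                                 # no reaction can fire
        t += rng.expovariate(a_total)             # waiting time to next event
        if t > t_end:
            break
        if rng.random() * a_total < a_bind:       # choose which reaction fires
            n_ligand, n_receptor, n_complex = n_ligand - 1, n_receptor - 1, n_complex + 1
        else:
            n_ligand, n_receptor, n_complex = n_ligand + 1, n_receptor + 1, n_complex - 1
        trajectory.append((t, n_ligand, n_receptor, n_complex))
    return trajectory

# Illustrative values only: ten molecules of each species, fixed seed.
rng = random.Random(0)
trajectory = gillespie_binding(10, 10, 0, c_on=0.1, c_off=0.5, t_end=20.0, rng=rng)
```

Each run produces a different integer-valued trajectory that fluctuates around the deterministic solution, and the relative size of these fluctuations grows as molecule numbers shrink.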

Finally, there are many systems biology models that do not resemble reaction networks (for a range of examples visit http://www.cellml.org). In drug development, the most successful of these models are cardiac electrophysiology (EP) models that have been applied to safety assessment [95] and, most recently, to provide mechanistic rationale supporting the January 2006 US FDA approval of ranolazine for chronic angina [96] (http://www.fda.gov/bbs/topics/news/2006/NEW01306.html).

Conclusions

Although the methods and tools are advancing within each discipline, it is their iterative, combinatorial application that defines the systems biology approach. We believe that the discovery and understanding of complex disease mechanisms and therapeutic modalities will increasingly require this approach. This will have a profound impact on the systematic creation of large collections of reagents (such as antibodies, RNAi and cDNA), detection methods and laboratory technology, computer science and organizational design. Indeed, a more widespread collaboration between mathematicians, computer scientists, physicians and experimental scientists will probably improve drug discovery in the next decade. This will certainly impact research organizations and the skills needed in research teams, and as a consequence calls for a broader scientific education of drug discovery scientists, bridging the classical disciplines of biology, mathematics, chemistry and medicine in an unprecedented manner. In these ways, systems biology promises to impact drug discovery significantly and improve the success rate of discovering crucial medicines.

Acknowledgements

We thank Mark S Boguski and Alan Buckler for their input and support during the writing of this review.

References and recommended reading

Papers of particular interest, published within the annual period of review, have been highlighted as:

• of special interest

•• of outstanding interest

1. Schmid EF, Smith DA: Keynote review: is declining innovation in the pharmaceutical industry a myth? Drug Discov Today 2005, 10:1031-1039.

This analysis, based on FDA data from 1945 to 2004, shows that the increasing R&D costs are not due to a decline in innovation of the pharmaceutical industry.

2. Fishman MC, Porter JA: A new grammar for drug discovery. Nature 2005, 437:491-493.

3. Butcher EC, Berg EL, Kunkel EJ: Systems biology in drug discovery. Nat Biotechnol 2004, 22:1253-1259.

4. Apic G, Ignjatovic T, Boyer S, Russell RB: Illuminating drug discovery with biological pathways. FEBS Lett 2005, 579:1872-1877.

5. van der Greef J, McBurney RN: Rescuing drug discovery: in vivo systems pathology and systems pharmacology. Nat Rev Drug Discov 2005, 4:961-967.

6. Lindon JC, Holmes E, Nicholson JK: Metabonomics: systems biology in pharmaceutical research and development. Curr Opin Mol Ther 2004, 6:265-272.

7. Morris M, Watkins SM: Focused metabolomic profiling in the drug development process: advances from lipid profiling. Curr Opin Chem Biol 2005, 9:407-412.

8. Saghatelian A, Cravatt BJ: Global strategies to integrate the proteome and metabolome. Curr Opin Chem Biol 2005, 9:62-68.

9. Luo M, Reyna S, Wang L, Yi Z, Carroll C, Dong LQ, Langlais P, Weintraub ST, Mandarino LJ: Identification of insulin receptor substrate 1 serine/threonine phosphorylation sites using mass spectrometry analysis: regulatory role of serine 1223. Endocrinology 2005, 146:4410-4416.

10. Bouwmeester T, Bauch A, Ruffner H, Angrand PO, Bergamini G, Croughton K, Cruciat C, Eberhard V, Gagneur J, Ghidelli S et al.: A physical and functional map of the human TNF-α/NF-κB signal transduction pathway. Nat Cell Biol 2004, 6:97-105.

11. Schulze WX, Deng L, Mann M: Phosphotyrosine interactome of the ErbB-receptor kinase family. Mol Systems Biol 2005, 1:42-54.

12. Jones RB, Gordus A, Krall JA, MacBeath G: A quantitative protein interaction network for the ErbB receptors using protein microarrays. Nature 2006, 439:168-174.

13. Hultschig C, Kreutzberger J, Seitz H, Konthur Z, Büssow K, Lehrach H: Recent advances of protein microarrays. Curr Opin Chem Biol 2006, 10:4-10.

14. Nielsen UB, Cardone MH, Sinskey AJ, MacBeath G, Sorger PK: Profiling receptor tyrosine kinase activation by using Ab microarrays. Proc Natl Acad Sci USA 2003, 100:9330-9335.

15. Gulmann C, Sheehan KM, Kay EW, Liotta LA, Petricoin EF: Array-based proteomics: mapping of protein circuitries for diagnostics, prognostics, and therapy guidance in cancer. J Pathol 2006, 208:595-606.

The authors review forward-phase and reverse-phase protein arrays in a number of applications.

16. Balboni I, Chan SM, Kattah M, Tenenbaum JD, Butte AJ, Utz PJ: Multiplexed protein array platforms for analysis of autoimmune diseases. Annu Rev Immunol 2006, 24:391-418.

17. Nishizuka S, Charboneau L, Young L, Major S, Reinhold WC, Waltham M, Kouros-Mehr H, Bussey KJ, Lee JK, Espina V et al.: Proteomic profiling of the NCI-60 cancer cell lines using new high-density reverse-phase lysate microarrays. Proc Natl Acad Sci USA 2003, 100:14229-14234.

18. Wulfkuhle JD, Aquino JA, Calvert VS, Fishman DA, Coukos G, Liotta LA, Petricoin EF: Signal pathway profiling of ovarian cancer from human tissue specimens using reverse-phase protein microarrays. Proteomics 2003, 3:2085-2090.

19. Sheehan KM et al.: Use of reverse phase protein microarrays and reference standard development for molecular network analysis of metastatic ovarian carcinoma. Mol Cell Proteomics 2005, 4:346-355.

20. Janzi M, Oedling J, Pan-Hammarström Q, Sundberg M, Lundeberg J, Uhlén M, Hammarström L, Nilsson P: Serum microarrays for large scale screening of protein levels. Mol Cell Proteomics 2005, 4:1942-1947.

The authors analyzed over 2000 serum samples using reverse arrays and compared the data with clinical routine measurements. The results suggest that reverse protein arrays can be used to screen clinical serum samples.

21. Amini B, Andersen E, Andersson AC, Angelidou P, Asplund A, Asplund C, Berglund L, Bergström K, Brumer H, Cerjan D et al.: A human protein atlas for normal and cancer tissues based on antibody proteomics. Mol Cell Proteomics 2005, 4:1920-1932.

22. Nilsson P, Paavilainen L, Larsson K, Odling J, Sundberg M, Andersson AC, Kampf C, Persson A, Al-Khalili Szigyarto C, Ottosson J et al.: Towards a human proteome atlas: high-throughput generation of mono-specific antibodies for tissue profiling. Proteomics 2005, 5:4327-4337.

23. Binz HK, Amstutz P, Plückthun A: Engineering novel binding proteins from nonimmunoglobulin domains. Nat Biotechnol 2005, 23:1257-1268.

24. Irish JM, Kotecha N, Nolan GP: Mapping normal and cancer cell signaling networks: towards single-cell proteomics. Nat Rev Cancer 2006, 6:146-155.

25. Iourgenko V, Zhang W, Mickanin C, Daly I, Jiang C, Hexham JM, Orth AP, Miraglia L, Meltzer J, Garza D et al.: Identification of a family of cAMP response element-binding protein co-activators by genome-scale functional analysis in mammalian cells. Proc Natl Acad Sci USA 2003, 100:12147-12152.

26. Bittinger MA, McWhinnie E, Meltzer J, Iourgenko V, Latario B, Liu X, Chen CH, Song C, Garza D, Labow M: Activation of cAMP response element-mediated gene expression by regulated nuclear transport of TORC proteins. Curr Biol 2004, 14:2156-2161.

27. Ito M, Kawano K, Miyagishi M, Taira K: Genome-wide application of RNAi to the discovery of potential drug targets. FEBS Lett 2005, 579:5988-5995.

28. Wheeler DB, Carpenter AE, Sabatini DM: Cell microarrays and RNA interference chip away at gene function. Nat Genet 2005, 37(Suppl):S25-S30.

29. Bailey SN, Ali SM, Carpenter AE, Higgins CO, Sabatini DM: Microarrays of lentiviruses for gene function screens in immortalized and primary cells. Nat Methods 2006, 3:117-122.


30. Paddison P J, Silva JM, C onklin DS, Schlabach M, Li M, Aruleba S, Balija V, O’Shaug hnessy A, Gnoj L, Scobie K et al. : A r esou rce f or large-scale RNA-interference -based screens i n mamm als. Nature 2004, 428 :427-431.

31. Moffat J, Gru eneberg DA, Yang X, Kim SY, Kloep fer AM, Hinkle G,

Piqani B, Eisenhaure TM, Luo B, Grenier JK et al. : A l entiviral RNAi library for human and mouse g enes applied to a n a rrayed viral high-content screen . Cel l 2006, 124:1283- 1 298. Describes a library of 22 000 RNAi reagents that can be used in loss of function screens in human and murine syst ems.

32. Huesken D, Lange J, Mickanin C, Weiler J, Asselbergs F, Warner J, Meloon B, Engel S, Rosenberg A, Cohen D et al.: Design of a genome-wide siRNA library using an artificial neural network. Nat Biotechnol 2005, 23:995-1001.

33. Boutros M, Kiger AA, Armknecht S, Kerr K, Hild M, Koch B, Haas SA, Paro R, Perrimon N: Heidelberg fly array consortium: genome-wide RNAi analysis of growth and viability in Drosophila cells. Science 2004, 303:832-835.

34. Carpenter AE, Sabatini DM: Systematic genome-wide screens of gene function. Nat Rev Genet 2004, 5:11-22.

35. Dasgupta R, Perrimon N: Using RNAi to catch Drosophila genes in a web of interactions: insights into cancer research. Oncogene 2004, 23:8359-8365.

36. Berns K, Hijmans EM, Mullenders J, Brummelkamp TR, Velds A, Heimerikx M, Kerkhoven RM, Madiredjo M, Nijkamp W, Weigelt B et al.: A large-scale RNAi screen in human cells identifies new components of the p53 pathway. Nature 2004, 428:431-437.

37. Sieburth D, Ch’ng Q, Dybbs M, Tavazoie M, Kennedy S, Wang D, Dupuy D, Rual JF, Hill DE, Vidal M et al.: Systematic analysis of genes required for synapse structure and function. Nature 2005, 436:510-517.

38. Willingham AT, Orth AP, Batalov S, Peters EC, Wen BG, Aza-Blanc P, Hogenesch JB, Schultz PG: A strategy for probing the function of noncoding RNAs finds a repressor of NFAT. Science 2005, 309:1570-1573.

39. Bentwich I: Prediction and validation of microRNAs and their targets. FEBS Lett 2005, 579:5904-5910.

40. Bentwich I, Avniel A, Karov Y, Aharonov R, Gilad S, Barad O, Barzilai A, Einat P, Einav U, Meiri E et al.: Identification of hundreds of conserved and nonconserved human microRNAs. Nat Genet 2005, 37:766-770.

41. Allison DB, Cui XQ, Page GP, Sabripour M: Microarray data analysis: from disarray to consolidation and consensus. Nat Rev Genet 2006, 7:55-65.

42. Pe’er D, Regev A, Elidan G, Friedman N: Inferring subnetworks from perturbed expression profiles. Bioinformatics 2001, 17:S215-S224.

43. Bar-Joseph Z, Gerber GK, Lee TI, Rinaldi NJ, Yoo JY, Robert F, Gordon DB, Fraenkel E, Jaakkola TS, Young RA, Gifford DK: Computational discovery of gene modules and regulatory networks. Nat Biotechnol 2003, 21:1337-1342.

44. Kanehisa M, Goto S, Hattori M, Aoki-Kinoshita KF, Itoh M, Kawashima S, Katayama T, Araki M, Hirakawa M: From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res 2006, 34:D354-D357.

45. Mootha VK, Lindgren CM, Eriksson KF, Subramanian A, Sihag S, Lehar J, Puigserver P, Carlsson E, Ridderstrale M, Laurila E et al.: PGC-1alpha-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat Genet 2003, 34:267-273.

Describes gene set enrichment analysis and how this method detects modest but coordinated changes in the expression of groups of functionally related genes. The authors demonstrate how this method can be applied to the analysis of expression data obtained from biopsies.

46. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP: Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA 2005, 102:15545-15550.

47. Hodges A, Strand AD, Aragaki AK, Kuhn A, Sengstag T, Hughes G, Elliston LA, Hartog C, Goldstein DR, Thu D et al.: Regional and cellular gene expression changes in human Huntington’s disease brain. Hum Mol Genet 2006, 15:965-977.

48. Szustakowski JD, Lee JH, Marrese CA, Kosinski PA, Nimala NR, Kemp DM: Identification of novel pathway regulation during myogenic differentiation. Genomics 2006, 87:129-138.

49. Bild AH, Yao G, Chang JT, Wang QL, Potti A, Chasse D, Joshi MB, Harpole D, Lancaster JM, Berchuck A et al.: Oncogenic pathway signatures in human cancers as a guide to targeted therapies. Nature 2006, 439:353-357.

Demonstrates how the combination of patterns of pathway deregulations distinguishes between cancer types, reflects their biology and outcome, and how they are predictive for treatment sensitivity in cell cultures.

50. Oda K, Matsuoka Y, Funahashi A, Kitano H: A comprehensive pathway map of epidermal growth factor receptor signaling. Mol Systems Biol 2005, 1:8-24.

This review presents a high-quality literature-based network reconstruction of EGF signaling.

51. Shatkay H, Edwards S, Wilbur WJ, Boguski M: Genes, themes and microarrays: using information retrieval for large-scale gene analysis. Proc Int Conf Intell Syst Mol Biol 2000, 8:317-328.

52. Jensen LJ, Saric J, Bork P: Literature mining for the biologist: from information retrieval to biological discovery. Nat Rev Genet 2006, 7:119-129.

Using an example sentence, this review presents a good overview of literature mining for biologists, describing the different techniques, what they can be used for and what their limitations are.

53. Rajasethupathy P, Vayttaden SJ, Bhalla US: Systems modeling: a pathway to drug discovery. Curr Opin Chem Biol 2005, 9:400-406.

54. Hanisch D, Fundel K, Mevissen HT, Zimmer R, Fluck J: Prominer: rule-based protein and gene entity recognition. BMC Bioinformatics 2005, 6:S14.

55. Cohen AM, Hersh WR, Dubay C, Spackman K: Using co-occurrence network structure to extract synonymous gene and protein names from Medline abstracts. BMC Bioinformatics 2005, 6:103.

56. Chen L, Liu H, Friedman C: Gene name ambiguity of eukaryotic nomenclatures. Bioinformatics 2005, 21:248-256.

57. Gaudan S, Kirsch H, Rebholz-Schuhmann D: Resolving abbreviations to their senses in Medline. Bioinformatics 2005, 21:3658-3664.

58. Schijvenaars BJ, Mons B, Weeber M, Schuemie MJ, van Mulligen EM, Wain HM, Kors JA: Thesaurus-based disambiguation of gene symbols. BMC Bioinformatics 2005, 6:149.

59. Cooper JW, Kershenbaum A: Discovery of protein–protein interactions using a combination of linguistic, statistical and graphical information. BMC Bioinformatics 2005, 6:143.

60. Ramani AK, Bunescu RC, Mooney RJ, Marcotte EM: Consolidating the set of known human protein–protein interactions in preparation for large-scale mapping of the human interactome. Genome Biol 2005, 6:R40.

61. Alako BT, Veldhoven A, van Baal S, Jelier R, Verhoeven S, Rullmann T, Polman J, Jenster G: Copub mapper: mining Medline based on search term co-publication. BMC Bioinformatics 2005, 6:51.

62. Temkin JM, Gilder MR: Extraction of protein interaction information from unstructured text using a context-free grammar. Bioinformatics 2003, 19:2046-2053.

63. Daraselia N, Yuryev A, Egorov S, Novichkova S, Nikitin A, Mazo I: Extracting human protein interactions from Medline using a full-sentence parser. Bioinformatics 2004, 20:604-611.

64. Rzhetsky A, Iossifov I, Koike T, Krauthammer M, Kra P, Morris M, Yu H, Duboue PA, Weng W, Wilbur WJ et al.: Geneways: a system for extracting, analyzing, visualizing, and integrating molecular pathway data. J Biomed Inform 2004, 37:43-53.

65. Narayanaswamy M, Ravikumar KE, Vijay-Shanker K: Beyond the clause: extraction of phosphorylation information from Medline abstracts. Bioinformatics 2005, 21:i319-i327.

66. Saric J, Jensen LJ, Ouzounova R, Rojas I, Bork P: Extraction of regulatory gene/protein networks from Medline. Bioinformatics 2005, 22:645-650.

67. Chen H, Sharp BM: Content-rich biological network constructed by mining Pubmed abstracts. BMC Bioinformatics 2004, 5:147.

68. Krauthammer M, Kaufmann CA, Gilliam TC, Rzhetsky A: Molecular triangulation: bridging linkage and molecular-network information for identifying candidate genes in Alzheimer’s disease. Proc Natl Acad Sci USA 2004, 101:15148-15153.

The authors integrated literature-based molecular networks and genetic linkage maps to find candidate disease genes.

69. Tiffin N, Kelso JF, Powell AR, Pan H, Bajic VB, Hide WA: Integration of text- and data-mining using ontologies successfully selects disease gene candidates. Nucleic Acids Res 2005, 33:1544-1552.

The authors combined tissue-expression data with disease–tissue relationships extracted from the literature to predict candidate disease genes.

70. Hu Y, Hines LM, Weng H, Zuo D, Rivera M, Richardson A, LaBaer J: Analysis of genomic and proteomic data using advanced literature mining. J Proteome Res 2003, 2:405-412.

71. Korbel JO, Doerks T, Jensen LJ, Perez-Iratxeta C, Kaczanowski S, Hooper SD, Andrade MA, Bork P: Systematic association of genes to phenotypes by genome and literature mining. PLoS Biol 2005, 3:e134.

The authors combined comparative prokaryote genome analysis with literature mining and uncovered genes that might play a role in infectious diseases.

72. Grandjean N, Charpiot B, Pena CA, Peitsch MC: Competitive intelligence and patent analysis in drug discovery. Mining the competitive knowledge bases and patents. Drug Discov Today: Technol 2005, 3:211-215.

73. Natarajan J, Berrar D, Dubitzky W, Hack CJ, Zhang Y, DeSesa C, Van Brocklyn JR, Bremer EG: Text mining of full-text journal articles combined with gene expression analysis reveals a relationship between sphingosine-1-phosphate and invasivity of a glioblastoma cell line. BMC Bioinformatics 2006, in press.

74. Natarajan J, Mulay N, DeSesa C, Hack CJ, Dubitzky W, Bremer EG: A grid infrastructure for text mining of full text articles and creation of a knowledge base of gene relations. Lecture Notes in Bioinformatics 2005, 3745:101-108.

75. Natarajan J, Haines C, Berglund B, DeSesa C, Hack CJ, Dubitzky W, Bremer EG: Getitfull – a tool for downloading and pre-processing full-text journal articles. Lecture Notes in Computer Science 2006, 3869:139-145.

76. Romacker M, Grandjean N, Parisot P, Kreim O, Cronenberger D, Vachon T, Peitsch MC: The Ultralink: an expert system for contextual hyperlinking in knowledge management. In Computer Applications in Pharmaceutical Research and Development. Edited by Ekins S. Wiley & Sons; 2006:729-753.

77. Janes KA, Lauffenburger DA: A biological approach to computational models of proteomic networks. Curr Opin Chem Biol 2006, 10:73-80.

78. Kholodenko BN: Cell-signalling dynamics in time and space. Nat Rev Mol Cell Biol 2006, 7:165-176.

Detailed review of principles, archetypical dynamical behaviors (feedback, bistable switch and oscillators), and some classical examples.

79. Eungdamrong NJ, Iyengar R: Computational approaches for modeling regulatory cellular networks. Trends Cell Biol 2004, 14:661-669.

This article uses a MAPK example to orient the reader through a review of methods, including a review of open access tools available at the time. For an updated list see http://www.sbml.org.

80. Arnold S, Siemann-Herzberg M, Schmid J, Reuss M: Model-based inference of gene expression dynamics from sequence information. Adv Biochem Eng Biotechnol 2005, 100:89-179.

Extended review for the expert who wishes to follow the model from building and validation through detailed analysis.

81. Le Novere N, Finney A, Hucka M, Bhalla US, Campagne F, Collado-Vides J, Crampin EJ, Halstead M, Klipp E, Mendes P et al.: Minimum information requested in the annotation of biochemical models (MIRIAM). Nat Biotechnol 2005, 23:1509-1515.

82. Karp PD, Mavrovouniotis ML: Representing, analyzing, and synthesizing biochemical pathways. IEEE Expert 1994, 9:11-21.

83. Kitano H, Funahashi A, Matsuoka Y, Oda K: Using process diagrams for the graphical representation of biological networks. Nat Biotechnol 2005, 23:961-966.

84. Chien JY, Friedrich S, Heathman MA, de Alwis DP, Sinha V: Pharmacokinetics/pharmacodynamics and the stages of drug development: role of modeling and simulation. AAPS J 2005, 7:E544-E559.

85. Andersen ME, Thomas RS, Gaido KW, Conolly RB: Dose-response modeling in reproductive toxicology in the systems biology era. Reprod Toxicol 2005, 19:327-337.

86. Blinov ML, Faeder JR, Goldstein B, Hlavacek WS: Bionetgen: software for rule-based modeling of signal transduction based on the interactions of molecular domains. Bioinformatics 2004, 20:3289-3291.

87. Lin X, Floudas CA, Wang Y, Broach JR: Theoretical and computational studies of the glucose signaling pathways in yeast using global gene expression data. Biotechnol Bioeng 2003, 84:864-886.

88. Araujo RP, Liotta LA: A control theoretic paradigm for cell signaling networks: a simple complexity for a sensitive robustness. Curr Opin Chem Biol 2006, 10:81-87.

89. Ma’ayan A, Blitzer RD, Iyengar R: Toward predictive models of mammalian cells. Annu Rev Biophys Biomol Struct 2005, 34:319-349.

90. Fisher J, Piterman N, Hubbard EJ, Stern MJ, Harel D: Computational insights into Caenorhabditis elegans vulval development. Proc Natl Acad Sci USA 2005, 102:1951-1956.

91. Errampalli DD, Priami C, Quaglia P: A formal language for computational systems biology. OMICS 2004, 8:370-380.

92. Fages F, Soliman S, Chabrier-Rivier N: Modelling and querying interaction networks in the biochemical abstract machine BIOCHAM. J Bio Phys Chem 2004, 4:64-73.

93. Mendoza L, Xenarios I: A method for the generation of standardized qualitative dynamical systems of regulatory networks. Theor Biol Med Model 2006, 3:13.

94. Rice JJ, Stolovitzky G: Making the most of it: pathway reconstruction and integrative simulation using the data at hand. Biosilico 2004, 2:70-77.

95. Bottino D, Penland RC, Stamps A, Traebert M, Dumotier B, Georgiva A, Helmlinger G: Preclinical cardiac safety assessment of pharmaceutical compounds using an integrated systems-based computer model of the heart. Prog Biophys Mol Biol 2006, 90:414-443.

96. Fredj S, Sampson KJ, Liu H, Kass RS: Molecular basis of ranolazine block of LQT-3 mutant sodium channels: evidence for site of action. Br J Pharmacol 2006, 148:16-24.