Nucleic Acids Research, 2006, Vol. 34, Database issue D642-D648
doi:10.1093/nar/gkj097

The International Gene Trap Consortium Website: a portal to all publicly available gene trap cell lines in mouse

Alex S. Nord, Patricia J. Chang, Bruce R. Conklin, Antony V. Cox, Courtney A. Harper, Geoffrey G. Hicks, Conrad C. Huang, Susan J. Johns, Michiko Kawamoto, Songyan Liu, Elaine C. Meng, John H. Morris, Janet Rossant, Patricia Ruiz, William C. Skarnes, Philippe Soriano, William L. Stanford, Doug Stryke, Harald von Melchner, Wolfgang Wurst, Ken-ichi Yamamura, Stephen G. Young, Patricia C. Babbitt and Thomas E. Ferrin

1University of California, San Francisco, 600 16th Street, San Francisco, CA 94143-2240, USA, 2Gladstone Institute of Cardiovascular Disease, University of California San Francisco Department of Medicine and Pharmacology, 1650 Owens Street, San Francisco, CA 94158, USA, 3Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK, 4Manitoba Institute of Cell Biology, University of Manitoba, 675 McDermot Avenue, Winnipeg, Manitoba, Canada R3E 0V9, 5The Hospital for Sick Children, Toronto, Ontario, Canada M5G 1X8, 6Center for Cardiovascular Research, Charité Universitätsmedizin and Department of Vertebrate Genomics, Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany, 7Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, WA 98109-1024, USA, 8University of Toronto, 4 Taddle Creek Road, Toronto, Ontario, Canada M5S 3G9, 9Department of Molecular Hematology, University of Frankfurt Medical School, 60590 Frankfurt am Main, Germany, 10GSF Research Center for Environment and Health, Institute for Developmental Genetics, Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany, 11Institute of Molecular Embryology and Genetics, Kumamoto University, 2-2-1 Honjo, Kumamoto 860-0811, Japan and 12University of California, Los Angeles, 650 Charles E. Young Dr So., Los Angeles, CA 90095, USA.
Received August 16, 2005; Revised October 7, 2005; Accepted October 16, 2005

To whom correspondence should be addressed.

© The Author 2006. Published by Oxford University Press. All rights reserved. The online version of this article has been published under an open access model.

ABSTRACT

Gene trapping is a method of generating murine embryonic stem (ES) cell lines containing insertional mutations in known and novel genes. A number of international groups have used this approach to create sizeable public cell line repositories available to the scientific community for the generation of mutant mouse strains. The major gene trapping groups worldwide have recently joined together to centralize access to all publicly available gene trap lines by developing a user-oriented Website for the International Gene Trap Consortium (IGTC). This collaboration provides an impressive public informatics resource comprising ~45 000 well-characterized ES cell lines which currently represent ~40% of known mouse genes, all freely available for the creation of knockout mice on a non-collaborative basis. To standardize annotation and provide high confidence data for gene trap lines, a rigorous identification and annotation pipeline has been developed combining genomic localization and transcript alignment of gene trap sequence tags to identify trapped loci. This information is stored in a new bioinformatics database accessible through the IGTC Website interface. The IGTC Website (www.genetrap.org) allows users to browse and search the database for trapped genes, BLAST sequences against gene trap sequence tags, and view trapped genes within biological pathways. In addition, IGTC data have been integrated into major genome browsers and bioinformatics sites to provide users with outside portals for viewing this data. The development of the IGTC Website marks a major advance by providing the research community with the data and tools necessary to effectively use
public gene trap resources for the large-scale characterization of mammalian gene function.

Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact journals.permissions@oxfordjournals.org

INTRODUCTION

The large and continually growing number of genome sequencing projects provides an opportunity to greatly advance our understanding of genetics and disease. One of the keys to realizing this goal is the development of genomic resources to elucidate functional characteristics of these genes, especially in mammalian genomes. The mouse is an especially useful mammalian model system, providing an excellent subject for studies of gene function because of its short generational span, ease of handling, and the close structural and functional similarity of its genome to that of humans (1,2). Furthermore, using this organism, scientists have access to a wide range of procedures for genetic manipulation, including the use of embryonic stem (ES) cells to create mice with defined single-gene mutations using gene targeting and gene trapping techniques (3). Gene trapping is a high-throughput method of creating mutagenized ES cells for use in generating knockout and other mutant mouse strains for research in functional genomics (4). Second generation gene trap vectors have recently enhanced the value of the method by offering the potential for creating conditional and other desired alleles using site-specific recombination (5-7).
Major scientific initiatives are currently underway in North America and Europe to knock out every mouse gene in ES cells in order to characterize gene function and provide insight into systems associated with human disease (8,9).

A number of gene trap projects have already made notable progress toward this goal by generating resources of gene trap mouse ES cell lines harboring well-characterized insertional mutations (6,7,10-14), although until now the individual gene trap projects have been isolated, providing only details about their own cell lines. The International Gene Trap Consortium (IGTC) is a collaboration representing the major public gene trap resources worldwide, whose mission is to offer the scientific community access to all publicly available gene trap cell lines on a non-collaborative basis for nominal handling fees (15). The centralization of gene trap resources provides many advantages to the research community, allowing more effective utilization of the experimental opportunities offered by gene trap cell lines through standardized protocols for the identification and annotation of sequences from trapped loci, and the increased availability of experimental protocols. As reported here, the release of the IGTC Website (www.genetrap.org) marks a major advance, generating a standardized informatics pipeline and providing in one place both easy access to all publicly available gene trap cell lines and sophisticated tools for analysis of resource data. Gene trap centers currently involved in this effort are listed in Table 1.

IGTC: RESOURCE OVERVIEW

The IGTC Website centralizes access to all publicly available gene trap cell line data for the first time.
This repository was created to address the needs of the international gene trap community by providing researchers with the data and informatics tools necessary to find gene trap cell lines with mutations in genes and loci of interest. IGTC member projects produce gene trap cell lines and directly submit gene trap sequence tags to the Genome Survey Sequences Database (dbGSS) division of GenBank at NCBI (16). Data from all publicly available cell lines are downloaded from dbGSS and subjected to the IGTC identification and annotation pipeline, which then automatically populates the MySQL database used to generate the annotation information presented on the Website.

The IGTC Website has been designed to provide easy user access to the extensive array of assembled gene trap informatics data. Interface options include homology searches using BLAST, search and browse capabilities, and viewing trapped genes within biological pathways. The project also includes the integration of gene trap data at major genome browsers and other informatics data sites in order to offer a variety of outside portals to this data. Cell line requests from the IGTC site are forwarded to the originating gene trap resource, where the cell line is removed from cryogenic storage and sent to the user for experimental analysis. The IGTC site also provides useful documentation, on-line tutorials and scientific overviews on gene trapping and the use of gene trap cell lines.

GENE TRAP IDENTIFICATION AND ANNOTATION PIPELINE

Gene trap mutations are characterized through a process of sequencing, identification and annotation. This process involves obtaining cDNA or genomic sequence upstream or downstream of the insertion site and identifying and annotating the locus at which the insertion occurs.
A full identification and annotation protocol has been developed for the IGTC that integrates genomic and transcript-based identification approaches and adds information from other major informatics resources, such as genome browsers and specialized informatics sites, to annotate the identified loci (Figure 1).

Table 1. IGTC members

IGTC member                                                               Cell lines     Website
BayGenomics (USA)                                                         9848           baygenomics.ucsf.edu
Centre for Modeling Human Disease (Toronto, Canada)                       4137           www.cmhd.ca/genetrap/
Embryonic Stem Cell Database (University of Manitoba, Canada)             3559           www.escells.ca/
Exchangeable Gene Trap Clones (Kumamoto University, Japan)                0              egtc.jp/show/index
German Gene Trap Consortium (Germany)                                     13031          www.genetrap.de
Sanger Institute Gene Trap Resource (Cambridge, UK)                       1354           www.sanger.ac.uk/PostGenomics/genetrap/
Soriano Lab Gene Trap Database (FHCRC, Seattle, USA)                      1627           www.fhcrc.org/science/labs/soriano/trap.html
Telethon Institute of Genetics and Medicine-TIGEM (Naples, Italy)         (illegible)    www.tigem.it/ (path illegible)
TOTAL                                                                     (illegible)

Figure 1. IGTC identification and annotation pipeline. IGTC members submit gene trap cell line data to dbGSS. The first step of the IGTC pipeline is the download of all publicly available gene trap cell line data from dbGSS. Gene trap sequences are then processed through the dual identification protocol based on genome localization and transcript alignment using MapTag and AutoIdent, respectively. Returned homologous genomic regions and transcripts are aligned to genomic sequence to generate confirmed overlapping sequence regions, which are used as the primary identification data. Confirmed genomic coordinates are queried against major informatics databases to obtain annotation data for the genomic locus identified, and the returned data are entered into the IGTC database.
Both genomic and transcript-based annotation methods were incorporated to increase confidence in the identification and genomic localization of trapped loci, providing a significant improvement over identification strategies previously in use. This pipeline was designed to be both robust and flexible, allowing the incorporation of multiple methods of identification and mapping and the future integration of new genomic data resources that may be developed.

Gene trap cell line sequences are submitted by each individual gene trap resource to dbGSS along with relevant ancillary information about the cell line, including its original source and the vector used in the trapping experiment. Within dbGSS, IGTC sequences are grouped using LinkOut (17) (http://www.dlib.org/dlib/march02/03inbrief.html#KWAN), which links gene trap dbGSS entries to the IGTC Website. Requiring gene trap cell lines to be entered into dbGSS ensures that all IGTC lines are in the public domain, and that all data are consistent, transparent and freely accessible. Researchers can access additional information about methodology and protocols from the original trapping experiment via links to the individual gene trap project sites (Table 1).

The two complementary programs, MapTag (15) and AutoIdent (12), are used to locate the trapped gene on the mouse genome by analyzing the similarity between the unidentified gene trap sequence and genomic and transcript sequences, respectively. The minimum region of genomic overlap serves as the primary identification data, from which all subsequent annotation is derived. Finally, the identified map coordinates are annotated with gene features obtained from major genomic and informatics databases.

Genomic localization: MapTag

MapTag was developed as an automated method of identifying homologous genomic regions for gene trap sequences using the Ensembl database (18) and SSAHA algorithm (19).
MapTag identifies matches in genomic sequence and assembles individual related stretches of genomic similarity using a splice model that takes into account exon boundaries, allowing the processing of nucleotide sequences of cDNA or genomic origin. The program filters alignment results, applying simple heuristics using the overall match length, percent identity and exon coverage, and filtering to remove pseudogenes, to determine the best match. As gene trap sequence tags are typically short and imperfect, they can be difficult to assign to a unique locus. The protocol used by MapTag greatly improves the ability to differentiate between matches that show comparable levels of similarity by selecting the result that corresponds to the insertion site. The program returns the identified genomic region and an estimate of the match confidence.

Transcript identification: AutoIdent

The pipeline also uses transcript identification to provide an independent and orthogonal identification method. AutoIdent, an automated protocol developed by BayGenomics, uses the BLAST (20) algorithm to identify the most similar sequences in the GenBank non-redundant nucleotide database at NCBI. When the program identifies many high-scoring matches to very similar (synonymous) genes, AutoIdent adds steps to filter results and condense synonymous transcripts to obtain a single result. The program applies stringent criteria for acceptance of a high-quality gene identification but allows more relaxed criteria for matches to multiple sequences or to a homologous sequence in another species. At the end of the process, AutoIdent returns the best match transcript along with alignment data.

Identification reconciliation and confirmation

Genomic and transcript identification data are stored in the IGTC database.
The pipeline then compares results from the two identification protocols, using overlap between the genomic coordinates from MapTag and the genomic localization of transcripts returned by AutoIdent to confirm the identification. Gene trap sequences are assigned as localized when the genomic and transcript map coordinates overlap, or when only one protocol returns map coordinates. If the identified coordinates conflict or neither protocol returns map data, the gene trap sequence is classified as unlocalized and does not proceed through the rest of the annotation pipeline. This reconciliation step ensures that each identified cell line is mapped to a genomic insertion locus with high confidence and in a manner that is fully documented.

Annotation of gene trap cell lines

Gene trap cell line annotation is based on the confirmed genomic coordinates as the primary identification data. Map coordinates are used to query the Ensembl and Entrez (21) databases to obtain gene features for the identified locus. These accessions are used as primary keys to further query Ensembl, Entrez and the Mouse Genome Informatics (MGI) resource (22) for secondary annotation data associated with the trapped genes, including major gene accession systems, Gene Ontology classifications (GO), protein domain, structure and function, PubMed and phenotype data, homology and orthology data, and microarray probesets. In addition to supporting an extensive array of informatics data, the IGTC site also includes tissue-specific expression data from the SymAtlas project (23) and biological pathway and GO hierarchy diagrams with trapped genes marked, produced using the GenMAPP program (24).

The IGTC database

The IGTC uses the open source MySQL database platform (www.mysql.com). The database is populated in an automated process using results from the MapTag and AutoIdent identification protocols and is structured to optimize information access via Web queries.
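To make the automated population step more concrete, the following minimal Python sketch shows how the two identification results might be reconciled into a single localized or unlocalized call before a record is stored. It follows the rule described under "Identification reconciliation and confirmation" only; the names Region, overlap and reconcile and the coordinates used in the usage note are hypothetical and are not taken from the IGTC pipeline code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Region:
    """A genomic interval (hypothetical structure, inclusive coordinates)."""
    chrom: str
    start: int
    end: int


def overlap(a: Region, b: Region) -> Optional[Region]:
    """Return the shared interval of two regions, or None if they do not overlap."""
    if a.chrom != b.chrom:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return Region(a.chrom, start, end) if start <= end else None


def reconcile(genomic: Optional[Region],
              transcript: Optional[Region]) -> Tuple[str, Optional[Region]]:
    """Combine a genomic hit and a transcript hit into one call.

    - neither protocol returned coordinates -> unlocalized
    - only one protocol returned coordinates -> localized at that region
    - both returned coordinates and they overlap -> localized at the overlap
    - both returned coordinates but they conflict -> unlocalized
    """
    if genomic is None and transcript is None:
        return ("unlocalized", None)
    if genomic is None:
        return ("localized", transcript)
    if transcript is None:
        return ("localized", genomic)
    shared = overlap(genomic, transcript)
    if shared is None:
        return ("unlocalized", None)
    return ("localized", shared)
```

With made-up coordinates, reconcile(Region("chr14", 1000, 5000), Region("chr14", 4000, 9000)) yields a localized call at the shared interval 4000-5000, while two hits on different chromosomes yield an unlocalized call. Returning the shared interval mirrors the statement that the minimum region of genomic overlap serves as the primary identification data; a real pipeline would also need to carry match confidence and alignment details.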
New gene trap cell line entries in dbGSS are downloaded weekly and run through the IGTC pipeline. Identification results from MapTag are updated with each Ensembl build, and AutoIdent is programmed to regularly BLAST sequences in the database to update accession numbers and other changes in the information provided by GenBank. Annotation data generation is synchronized with the Entrez, MGI and Ensembl databases, and the IGTC database is updated by downloading information from these sites as necessary. The information in the IGTC database is available upon request in a tab-delimited or database compatible format.

ACCESSING GENE TRAP DATA

The IGTC site provides a user-friendly approach to gene trap data, allowing researchers to access the gene trap database from a sequence, accession number or ID, expression or pathway perspective using a variety of interfaces for searching and viewing gene trap data. The site is organized around the cell line annotation page, where the user can view all annotation data for a selected cell line (Figure 2). Primary identification and annotation data appear at the top of the page with a link to detailed identification results. This is followed by expandable lists of secondary annotation data, which provide information useful in the selection and analysis of cell lines of potential interest. Below this is a section containing details about the cell line and gene trap vector, including primer sequences and a description of the vector properties. To aid in the comparison of insertion sites of different cell lines that trap the same gene, the bottom of the cell line annotation page contains a diagram showing the gene trap sequence aligned to the transcripts returned by AutoIdent. The Website provides a similar annotation page for all trapped genes, organized by gene ID.
Researchers can also access gene trap data via links from other major informatics resources, including the genome browsers and primary accession pages at NCBI (25), Ensembl (26) and UCSC (27).

The IGTC Website offers users diverse ways to find gene trap cell lines, ranging from searches using protein data, microarray probesets or nucleotide sequences, to screening for traps placed in the context of biological pathways or in genes that demonstrate a particular expression profile. Figure 3 lists the ways in which gene trap data can be accessed, categorized by access type, details about available data type and data access point. Users can search the IGTC database by accession number, ID or keyword and chromosomal location. Results from database searches are displayed as a list of cell line IDs or gene symbols, which link to the individual cell line or gene annotation page. These lists can be exported as tab-delimited files for use in spreadsheet programs or custom databases. Users can also browse the database by MGI Marker Symbol, Gene Name or chromosome location. BLAST analysis can be performed using nucleotide sequence for a gene or locus of interest to search against gene trap sequences or genomic sequence of trapped genes. Trapped genes can be viewed with a pathway perspective using biological pathway diagrams and functional GO groupings, which are colored by the number of cell lines available for each gene. Users can also search for traps in genes with a designated expression profile by selecting a tissue of interest and choosing the expression level of the gene relative to the median tissue expression.

Finally, users can browse gene trap cell lines displayed at major external informatics sites, such as genome browsers and gene pages. IGTC cell lines have been mapped to genomic sequence using the NCBI map viewer, UCSC genome browser and Ensembl genome browser.
IGTC gene traps are maintained at these sites either as standard map tracks or as user-configured tracks. (Users should follow site-specific directions to view gene trap data using these browsers.) In addition, gene trap cell lines are listed on some of the major gene pages and in mouse strain resource databases, and the IGTC maintains a full list of partnering sites. By integrating gene trap data into the larger context provided by these bioinformatics resources, the IGTC can reach more potential users who are interested in genomic resources and functional genetics.

Figure 2. IGTC Website cell line annotation page. This figure provides an example of the main display for gene trap cell lines. Users can view primary identification and annotation data, with a link to detailed reports from the IGTC pipeline. An extensive amount of annotation data is also presented to give researchers more information about the trapped locus. Finally, the page shows details about the cell line and vector and an image showing gene trap cell line sequences aligned to the trapped gene. From this page, users are directed to the IGTC member site for cell line requests.

SUMMARY

The establishment of the IGTC database and Website marks a major advance in making large-scale mouse knockout resources available to the scientific community. Through the collaboration of gene trap projects worldwide, a standardized
identification and annotation pipeline has been developed to analyze gene trap data and offer the data access tools necessary for maximum resource utility. For the first time, researchers investigating gene function in the mouse can query all publicly available gene trap ES cell lines at a single site. The IGTC Website contains approximately 45 000 cell lines harboring mutations in nearly 40% of mouse genes (15), including many genes with gene trap cell lines representing multiple mutant alleles. Researchers can search and browse the resource based on accession numbers or IDs, keywords, sequence data, tissue expression and biological pathways, and can also access the IGTC site from other major informatics sites. Furthermore, they can easily request IGTC cell lines for characterization of gene function and disease models. For example, BayGenomics cell lines are available from the Mutant Mouse Regional Resource Center (MMRRC) at the University of California Davis (www.mmrrc.org). The MMRRC-UC Davis also offers to microinject ES cells to derive knockout mice for investigators. Soon, the Soriano and TIGEM collections will also be available from the MMRRC. The IGTC Website will grow and continue to develop new tools and features as the diversity of trapped genes and experimental options for using gene trap cell lines expand.

Figure 3. Gene trap data access. Several methods for users to find gene trap cell lines of interest are listed.

ACKNOWLEDGEMENTS

This work has been supported by NIH grants U01 HL66600 and P41 RR01081, Canadian Institutes of Health grant GOP-36055 and by The Wellcome Trust Sanger Institute.
The authors thank Carol Bult at Mouse Genome Informatics, Deanna Church at the National Center for Biotechnology Information and Bob Kuhn at University of California at Santa Cruz for their helpful suggestions. Funding to pay the Open Access publication charges for this article was provided by the University of California, San Francisco.

Conflict of interest statement. None declared.

REFERENCES

1. Waterston,R.H. et al. (2002) Initial sequencing and comparative analysis of the mouse genome. Nature, 420, 520-562.
2. Sung,Y.H., Lee,H.-W. and Song,J. (2004) Functional genomics approach using mice. J. Biochem. Mol. Biol., 37, 122-132.
3. Nolan,P.M. (2000) Generation of mouse mutants as a tool for functional genomics.
4. Stanford,W.L., Cohn,J.B. and Cordes,S.P. (2001) Gene-trap mutagenesis: past, present and beyond. Nature Rev. Genet., 2, 756-768.
5. Branda,C.S. and Dymecki,S.M. (2004) Talking about a revolution: the impact of site-specific recombinases on genetic analyses in mice. Dev. Cell, 6, 7-28.
6. Schnütgen,F., De-Zolt,S., Van Sloun,P., Hollatz,M., Floss,T., Hansen,J., Altschmied,J., Seisenberger,C., Ghyselinck,N.B., Ruiz,P. et al. (2005) Genomewide production of multipurpose alleles for the functional analysis of the mouse genome. Proc. Natl Acad. Sci. USA, 102, 7221-7226.
7. Cobellis,G., Nicolaus,G., Iovino,M., Romito,A., Marra,E., Barbarisi,M., Sardiello,M., Di Giorgio,F.P., Iovino,N., Zollo,M. et al. (2005) Tagging genes with cassette-exchange sites. Nucleic Acids Res., 33, e44.
8. Austin,C.P., Battey,J.F., Bradley,A., Bucan,M., Capecchi,M., Collins,F.S., Dove,W.F., Duyk,G., Dymecki,S., Eppig,J.T. et al. (2004) The knockout mouse project. Nature Genet., 36, 921-924.
9. Auwerx,J., Avner,P., Baldock,R., Ballabio,A., Balling,R., Barbacid,M., Berns,A., Bradley,A., Brown,S., Carmeliet,P. et al. (2004) The European dimension for the mouse genome mutagenesis program. Nature Genet., 36, 925-927.
10. Hicks,G.G., Shi,E.G., Li,X.M., Li,C.H., Pawlak,M. and Ruley,H.E. (1997) Functional genomics in mice by tagged sequence mutagenesis. Nature Genet., 16, 338-344.
11. Wiles,M.V., Vauti,F., Otte,J., Füchtbauer,E.M., Ruiz,P., Füchtbauer,A.,
Arnold,H.H., Lehrach,H., Metz,T., von Melchner,H. et al. (2000) Establishment of a gene-trap sequence tag library to generate mutant mice from embryonic stem cells. Nature Genet., 24, 13-14.
12. Stryke,D., Kawamoto,M., Huang,C.C., Johns,S.J., King,L.A., Harper,C.A., Meng,E.C., Lee,R.E., Babbitt,P.C., Ferrin,T.E. et al. (2003) BayGenomics: a resource of insertional mutations in mouse embryonic stem cells. Nucleic Acids Res., 31, 278-281.
13. Hansen,J. et al. (2003) A large-scale, gene-driven mutagenesis approach for the functional analysis of the mouse genome. Proc. Natl Acad. Sci. USA, 100, 9918-9922.
14. To,C., Epp,T., Reid,T., Lan,Q., Yu,M., Li,C.Y., Ohishi,M., Hant,P., Tsao,N., Casallo,G. et al. (2004) The Centre for Modeling Human Disease Gene Trap resource. Nucleic Acids Res., 32, D557-D559.
15. Skarnes,W.C. et al. (2004) A public gene trap resource for mouse functional genomics. Nature Genet., 36, 543-544.
16. Benson,D.A., Karsch-Mizrachi,I., Lipman,D.J., Ostell,J. and Wheeler,D.L. (2005) GenBank. Nucleic Acids Res., 33, D34-D38.
17. Kwan,Y.K. (2002) LinkOut: explore beyond PubMed and Entrez. D-Lib Mag.
18. Hubbard,T., Andrews,D., Caccamo,M., Cameron,G., Chen,Y., Clamp,M., Clarke,L., Coates,G., Cox,T., Cunningham,F. et al. (2005) Ensembl 2005. Nucleic Acids Res., 33, D447-D453.
19. Ning,Z., Cox,A.J. and Mullikin,J.C. (2001) SSAHA: a fast search method for large DNA databases. Genome Res., 11, 1725-1729.
20. Altschul,S.F., Gish,W., Miller,W., Myers,E.W. and Lipman,D.J. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403-410.
21. Maglott,D., Ostell,J., Pruitt,K.D. and Tatusova,T. (2005) Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res., 33, D54-D58.
22. Eppig,J.T., Bult,C.J., Kadin,J.A., Richardson,J.E., Blake,J.A., Anagnostopoulos,A., Baldarelli,R.M., Baya,M., Beal,J.S., Bello,S.M. et al. (2005) The Mouse Genome Database (MGD): from genes to mice - a community resource for mouse biology. Nucleic Acids Res., 33, D471-D475.
23. Su,A.I., Wiltshire,T., Batalov,S., Lapp,H., Ching,K.A., Block,D., Zhang,J., Soden,R.,
Hayakawa,M., Kreiman,G. et al. (2004) A gene atlas of the mouse and human protein-encoding transcriptomes. Proc. Natl Acad. Sci. USA, 101, 6062-6067.
24. Dahlquist,K.D., Salomonis,N., Vranizan,K., Lawlor,S.C. and Conklin,B.R. (2002) GenMAPP, a new tool for viewing and analyzing microarray data on biological pathways. Nature Genet., 31, 19-20.
25. Wheeler,D.L., Barrett,T., Benson,D.A., Bryant,S.H., Canese,K., Church,D.M., DiCuccio,M., Edgar,R., Federhen,S., Helmberg,W. et al. (2005) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res., 33, D39-D45.
26. Stalker,J., Gibbins,B., Meidl,P., Smith,J., Spooner,W., Hotz,H.R. and Cox,A.V. (2004) The Ensembl Web site: mechanics of a genome browser. Genome Res., 14, 951-955.
27. Karolchik,D., Baertsch,R., Diekhans,M., Furey,T.S., Hinrichs,A., Lu,Y.T., Roskin,K.M., Schwartz,M., Sugnet,C.W., Thomas,D.J. et al. (2003) The UCSC Genome Browser Database. Nucleic Acids Res., 31, 51-54.

J. Biochem. 140, 793-798 (2006)
doi:10.1093/jb/mvj208

Negative Selection with the Diphtheria Toxin A Fragment Gene Improves Frequency of Cre-Mediated Cassette Exchange in ES Cells

Kimi Araki, Masatake Araki and Ken-ichi Yamamura

1Department of Developmental Genetics, Institute of Molecular Embryology and Genetics, and 2Institute of Resource Development and Analysis, Kumamoto University, Honjo 2-2-1, Kumamoto 860-0811

Received August 12, 2006; accepted October 11, 2006

The Cre-lox system is an important tool for genetic manipulation in embryonic stem cells. We previously reported that the cassette exchange strategy using the mutant lox66/71 and lox2272 combination showed high recombination efficiency and stability. However, the efficiency was strongly affected by the position of chromosomal target lox sites. To enrich successful cassette exchange events, even in clones showing lower recombination efficiency, we have improved the exchange vector. The Diphtheria toxin A fragment gene was placed in the un-exchanged region for negative selection and the puromycin N-acetyltransferase gene, instead of the neomycin phosphotransferase gene, was used for positive selection.
By reducing random integration, the frequency of successful cassette exchange increased up to 2-4 fold. Furthermore, by adding the third lox site to induce intramolecular recombination, the recombination efficiency of cassette exchange itself was improved, and the frequency increased a maximum of 5-fold, with the percentage of exchanged clones reaching 50-70%. This strategy should be useful for other recombinase-mediated cassette exchanges.

Key words: cassette exchange, Cre recombinase, Diphtheria toxin A fragment (DT-A) gene, embryonic stem (ES) cells, site-directed recombination.

Abbreviations: bsr, blasticidin S resistant; DT-A, diphtheria toxin A fragment; ES, embryonic stem; neo, neomycin phosphotransferase; NLSlacZ, lacZ gene fused with the nuclear localizing signal; pac, puromycin N-acetyltransferase; Pgk, phosphoglycerate kinase-1; RMCE, Recombinase Mediated Cassette Exchange; tk, thymidine kinase; X-gal, 5-bromo-4-chloro-3-indolyl β-D-galactopyranoside.

The Cre-mediated site-specific targeting system is a powerful tool for genome engineering in mammals, especially in mouse embryonic stem (ES) cells, because it allows precise and repeated knock-ins of any DNA to target lox sites introduced by gene targeting or gene trapping (3). However, intermolecular recombination between wild-type loxP sites, i.e., integrative recombination, is inefficient due to re-excision through intramolecular recombination (4).

In order to perform Cre-mediated insertion or replacement, two kinds of mutant lox sites have been developed. One is a pair of lox sites with a 5 bp mutation in the left or right end of the lox sequence, such as lox66/71 (5, 6). Recombination between a chromosomally located lox71 site and a lox66 site on a targeting plasmid results in site-specific integration of the plasmid, producing a lox site mutated at both ends (a double mutant lox site) and a wild-type loxP site.
Since the binding affinity of the double mutant lox site for Cre recombinase is reduced, the integrated plasmid is stably retained. The other mutant site is a heterospecific lox site that has mutation(s) in the central 8 bp spacer region (7-9). The recombination using heterospecific lox sites is termed Recombinase Mediated Cassette Exchange (RMCE) (10), in which recombination does not occur between two lox sites differing in the spacer region, whereas lox sites with identical spacer regions can be recombined efficiently. Until now, lox511 (11), lox2272 (12) and lox5171 (13) have been successfully used in ES cells.

To whom correspondence should be addressed. E-mail: arakimi@gpo.kumamoto-u.ac.jp

Recently, we have shown that the combination of lox66/71 and the heterospecific lox2272 site gave high recombination efficiency and stability even with Cre recombinase (14), and developed an exchangeable gene trap vector carrying lox71, loxP and lox2272 (3). Using the trap vector, we can initially carry out random insertional mutagenesis in ES cells, and then replace the reporter gene in the trap vector with any gene of interest to be expressed under the control of the trapped promoter through RMCE.

In RMCE, a targeting plasmid carrying an integrated floxed cassette is co-electroporated with a Cre-expression vector into cells in their circular forms to reduce random integration. However, random integration of targeting plasmid also occurs at a considerable frequency, probably due to nicks in plasmid DNA strands. In our previous study, the percentage of random integrants was over 50% even in the clone that showed the highest RMCE frequency, indicating that random integration is more efficient than RMCE. The most effective method to eliminate random integrants is negative selection using the thymidine kinase (tk) gene of the Herpes simplex virus (15).
In this method, the tk gene is placed on a chromosomal target construct between two heterospecific lox sites, and recombined clones, in which the tk gene should have been removed by replacement with other DNA on the targeting vector, are selected with ganciclovir. Several groups reported that almost all clones obtained after electroporation had targeted replacement (12, 16, 17). This tk negative selection is very useful because it is not necessary to have any positive marker on targeting vectors. On the other hand, the necessity of placing the tk gene in a chromosomal target construct is quite inconvenient for gene trapping, since it is preferable that trap vectors consist of minimum elements to avoid unexpected effect(s) caused by introduction of vector elements. Actually, male sterility caused by ectopic expression of the tk gene in testis has been reported (18).

© 2006 The Japanese Biochemical Society

In this study, we aimed to increase recombination frequency by improving exchange vectors, not chromosomal target constructs. We used the Diphtheria toxin A fragment (DT-A) gene (19) for negative selection to reduce random integrants, changed the neomycin phosphotransferase (neo) gene into the puromycin N-acetyltransferase (pac) gene, and added a lox66 site to induce intramolecular recombination within the exchange vector. With the improved exchange vector, the frequency of exchanged clones was increased 2- to 5-fold.

MATERIALS AND METHODS

Plasmids: Plasmids p66-2272 and pCAGGS-Cre were described previously (14, 20). The plasmid containing the MC1 promoter-DT-A gene was kindly provided by Dr. S. Aizawa (19). However, the original MC1-DT-A cassette had no polyadenylation (pA) signal, and the MC1 promoter of the cassette carried only one copy of the enhancer sequence.
The MC1-DT-A-pA cassette used in this study was constructed by replacing the original promoter with the MC1 promoter from the MC1-neo-pA cassette (Stratagene, La Jolla, USA) and by adding the pA signal of the rabbit β-globin gene. The pMC1-DTA plasmid was constructed by inserting the MC1-DT-A-pA cassette into p66-2272. pPGK-DTA and pPac-DTA were constructed by replacing the MC1-neo-pA cassette with the Pgk-neo-pA and Pgk-pac-pA cassettes, respectively. pPac-DTA-66 was constructed by inserting a lox66 fragment into pPac-DTA. The sequences of all lox sites in these plasmids were confirmed by DNA sequencing.

Cell Culture and Electroporation: ES cell culture and establishment of cell lines carrying a single copy of target lox sites were described previously (14). For RMCE, the cells (3-6 x 10° cells/0.8 ml in PBS) were electroporated at 400 V and 125 µF, and plated into two 10 cm plates. G418 selection at 200 µg/ml was started 24 h after electroporation and continued for 7 days. For puromycin selection, cells were fed medium containing 2 µg/ml of puromycin for 24 h, twice, on days 1 and 4 after electroporation. On day 8, colonies were stained with 5-bromo-4-chloro-3-indolyl β-D-galactopyranoside (X-gal) or picked and expanded for DNA analysis. A series of electroporations of the 5 exchange vectors into a cell line was performed on the same day, and each series of electroporation was repeated at least four times on independent days.

K. Araki et al.

PCR Analysis: Genomic DNA (0.05-0.1 µg) was subjected to 32 cycles of amplification (each cycle consisted of 1 min at 94°C, 1.5 min at 58°C and 1.5 min at 72°C) using AmpliTaq polymerase (Perkin-Elmer). Primer sequences are as follows: AG2, 5'-CTGCTAACCATGTTCATGCC-3'; LZUS3, 5'-GCCCATCGTAACCGTGCAT-3'; bsr-2, 5'-GCAGAAATCGGAGGAAGAAG-3'; bsr-3, 5'-CAACTCCOTACACATACCAC-3'; FRT-AS, 5'-GCTTCAAAAGCGCTCTGAAG-3'; DTA1, 5'-TACCACGGGACTAAACCTGG-3'; DTA2, 5'-CGCTTAACGCTTTCGCCTGT-3'.
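As a rough illustration of how the primers listed above can be sanity-checked, the following minimal Python sketch computes length, GC content and an approximate melting temperature for the primers whose sequences read unambiguously in the text. The Wallace rule (2 °C per A/T, 4 °C per G/C) is a generic rule of thumb adopted here as an assumption; the text does not state how the annealing temperature was chosen, and the function names are hypothetical.

```python
# Primer sequences copied from the Methods text (written 5' to 3'); only the
# primers whose sequence is unambiguous in the text are included.
PRIMERS = {
    "AG2": "CTGCTAACCATGTTCATGCC",
    "bsr-2": "GCAGAAATCGGAGGAAGAAG",
    "DTA1": "TACCACGGGACTAAACCTGG",
    "DTA2": "CGCTTAACGCTTTCGCCTGT",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate Tm in deg C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), f"GC={gc_fraction(seq):.2f}", f"Tm~{wallace_tm(seq)}C")
```

The Wallace rule is only a coarse estimate for short oligonucleotides, so the values serve as a plausibility check rather than a design criterion.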
Statistical Analyses: The recombination efficiencies and relative numbers of blue or white colonies were evaluated by non-repeated measures ANOVA. Where a significant difference (p < 0.05) was identified, the differences were analyzed further with SNK tests for multiple comparisons.

RESULTS

Experimental Design: The strategy used previously (14) to assess RMCE frequencies is outlined in Fig. 1A. The chromosomal target is the CAG promoter-lox71-Blasticidin S resistant (bsr) gene-pA-lox511-FRT-lox2272. Exchange vectors contain lox66-the promoter-less lacZ gene fused with the nuclear localizing signal (NLSlacZ)-a selection marker gene-lox2272. ES cell lines carrying the chromosomal target are co-electroporated with the exchange vector and the Cre-expression vector in their circular forms, and then selected with the appropriate drug according to the selection marker gene. The cre gene on the expression vector is transiently expressed, and Cre protein mediates site-specific recombination between the chromosomal lox71 and lox66 on the targeting plasmid, and between the chromosomal lox2272 and lox2272 on the targeting plasmid, resulting in cassette exchange of the bsr gene with the NLSlacZ-selection marker cassette. Since there is no negative selection marker on the chromosomal target construct, both random integrants and site-specific recombinants become drug resistant, but only the colonies where RMCE has occurred are stained blue with X-gal, since the NLSlacZ gene is inserted downstream of the CAG promoter. The percentage of blue colonies represents the frequency of RMCE. When only the exchange vector was electroporated, no blue colonies appeared (data not shown), indicating that gene trapping events hardly occur with electroporation using the exchange vectors in circular forms. We used four ES cell lines, 71-5F2-7, 71-5F2-10, 71-5F2-23 and 71-5F2-26, carrying a single copy of the chromosomal target construct.
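The scoring rule described above, in which the percentage of blue colonies among all drug-resistant colonies is taken as the RMCE frequency, can be sketched as follows. The colony counts are hypothetical values chosen only for illustration and are not data from this study; the function name is likewise an assumption of this sketch.

```python
from statistics import mean, stdev


def rmce_frequency(blue: int, white: int) -> float:
    """Percentage of X-gal-positive (blue) colonies among all scored colonies."""
    total = blue + white
    if total == 0:
        raise ValueError("no colonies were scored")
    return 100.0 * blue / total


# Hypothetical (blue, white) colony counts from four independent electroporations.
replicates = [(30, 70), (45, 55), (20, 60), (36, 84)]
frequencies = [rmce_frequency(blue, white) for blue, white in replicates]

if __name__ == "__main__":
    print(f"RMCE frequency: {mean(frequencies):.1f} +/- {stdev(frequencies):.1f} %")
```

Reporting the mean and standard deviation over independent electroporations mirrors how the frequencies are summarized in the text (means ± SD of at least four experiments).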
The cell lines 71-5F2-10 and 71-5F2-23 were used in our previous study (14), and showed lower and higher recombination frequency, respectively. We added two cell lines, 71-5F2-7 and 71-5F2-26, which showed lower recombination frequency, to examine enrichment effects of exchange vectors on several different "inactive" positions. Since the original 71-5F2-7, 71-5F2-10 and 71-5F2-26 clones were contaminated with wild-type ES cells at a considerable percentage (data not shown), all four lines were re-cloned through limiting dilution and Blasticidin S selection. Table 1 shows the RMCE frequencies and the number of blue colonies in each re-cloned line electroporated with pCAGGS-Cre and p66-2272, which contains only the MC1-neo-pA cassette as a positive selection marker (Fig. 1B, top) and gave the best frequency in the previous study. Since the conditions of electroporation were optimized for Cre-mediated recombination, the frequency in 71-5F2-23 increased to 35.3% from the frequency obtained in the previous study, 26.3%. In 71-5F2-23, the number of blue colonies was 3-4 times higher than in the other three clones, indicating high recombination efficiency, probably due to an open chromosomal configuration around the target lox sites. The goal of this study is to improve targeting vectors to enrich blue colonies in all cell lines, including the "lower" recombination frequency clones.

J. Biochem.
Efficient Cre-Mediated Cassette Exchange in ES Cells

Fig. 1. (A) Experimental strategy for comparison of frequency of RMCE. ES cell lines carrying a single copy of the CAG-lox71-bsr-pA-lox511-FRT-lox2272 fragment were established. The cell lines were co-electroporated with the Cre expression vector and the targeting plasmids carrying the promoter-less NLSlacZ gene. Through RMCE, the NLSlacZ gene is joined to the CAG promoter, resulting in positive staining with X-gal. Since the targeting plasmids contain a selection marker gene, random integrants can also appear. However, colonies with random integration are not stained because there is no promoter for the NLSlacZ gene. The percentage of blue colonies represents the frequency of RMCE. Positions of PCR primers used in Table 2 are shown as small arrows with the name of the primer. (B) Targeting vectors used in this study. (C) Predicted recombination intermediate of pPac-DTA-66. Since pPac-DTA-66 carries two lox66 sites, intramolecular recombination should occur first to divide it into two circular molecules. The molecule including lox66 and lox2272 becomes a substrate of RMCE.

Table 1. Recombination frequency with p66-2272 targeting vector in clones used in this study.

Clone        Number of blue colonies ± SD    Recombination frequency (%) ± SD
71-5F2-7     121 ± 29.6                      20.2 ± 5.89
71-5F2-10    92.0 ± 34.6                     16.5 ± 2.67
71-5F2-23    311 ± 166                       35.3 ± 7.89
71-5F2-26    67.0 ± 17.8                     11.6 ± 2.44

Exchange Vectors: Exchange vectors used in this study are shown in Fig. 1B. In pMC1-DTA, we added the MC1 promoter-DT-A gene-pA cassette to p66-2272 in the region that is not integrated by RMCE (outside of the two lox sites) for negative selection to reduce random integrants. The choice of selection marker gene is also important for colony formation efficiency. In pPGK-DTA, the mouse phosphoglycerate kinase-1 (Pgk) promoter-neo-pA cassette, which shows higher colony efficiency than MC1-neo, is used. In pPac-DTA, the Pgk promoter-pac-pA cassette is used; puromycin-sensitive cells are known to be killed quite quickly, so that drug selection is complete within 24-48 h. In addition, we constructed a pPac-DTA-66 plasmid, in which an additional lox66 site was placed between the DT-A cassette and the plasmid vector sequence. After electroporation with the Cre expression plasmid, intramolecular recombination between the lox66 sites should take place first, resulting in two circular molecules as shown in Fig. 1C.
By minimizing the size of the targeting DNA molecule, we expected to reduce random integration.

Recombination Frequencies: Percentages of blue colonies, i.e., RMCE frequencies, in the 4 lines are shown in Fig. 2A. The pPac-DTA-66 plasmid (vector no. 5) gave the highest RMCE frequency in all lines, with statistically significant differences. Even in the "lower" recombination-frequency clone 71-5F2-26, the frequency exceeded 50%, and in the "high" recombination-frequency clone 71-5F2-23, it reached 75%. In order to evaluate the enrichment effect, relative numbers of blue or white colonies against the number of blue colonies obtained with p66-2272 were calculated and are shown in Fig. 2B. When the DT-A cassette was added to p66-2272 (pMC1-DTA), the number of white colonies was significantly reduced to almost half (vector no. 2), indicating that negative selection with DT-A was effective. However, since the number of blue colonies was also slightly reduced, the percentages of blue colonies did not change significantly, but were higher than with p66-2272. With the use of PGK-neo-pA, the numbers of both blue and white colonies increased, but no significant change in frequency compared to pMC1-DTA was observed (vector no. 3). With the use of PGK-pac-pA, the RMCE frequency increased to 30-50% (vector no. 4). Interestingly, with the pPac-DTA-66 plasmid, the numbers of blue colonies (targeted integration) were significantly increased in all lines, whereas the numbers of white colonies (random integration) were unchanged (vector no. 5). This indicates that a smaller DNA molecule can access chromosomal target lox sites more easily.

Fig. 2. (A) Frequency of RMCE. Twenty micrograms of each replacement plasmid and the Cre-expressing vector were co-electroporated, and after drug selection for 7 days, colonies were stained with X-gal, and the percentage of positive colonies was scored as the frequency of RMCE. Numbers under the graph indicate the targeting vector used in each electroporation: 1, p66-2272; 2, pMC1-DTA; 3, pPGK-DTA; 4, pPac-DTA; 5, pPac-DTA-66. The means ± SD of at least four independent electroporations are represented. All cell lines show significant differences at P < 0.01 by ANOVA analysis. Statistical significances among vectors were further analyzed by the SNK test: a, P < 0.01 compared to all other vectors; b, P < 0.01 compared to vector 1, 2 and 5 and P < 0.05 compared to vector 3; c, P < 0.01 compared to vector 1, 3 and 5 and P < 0.05 compared to vector 2; d, P < 0.01 compared to the indicated vector; e, P < 0.05 compared to the indicated vector. (B) Relative blue (solid bars) or white (gray bars) colony numbers. The number of blue colonies obtained with p66-2272 was arbitrarily set at 1.

Table 2. Genomic DNA analysis of isolated subclones by PCR.

                                                        PCR primers
Cell line    Total no. of       X-gal staining (%)   AG2/LZUS3   bsr   AG2/FRT-AS   DTA1/DTA2
             clones analyzed
71-5F2-7     18                 Blue 8 (44)          8           0     ND           0
                                White 10 (56)        0           5     5            2
71-5F2-10    23                 Blue 14 (61)         14          0     ND           0
                                White 9 (39)         0           6     3            0

Subclones obtained after electroporation with pPac-DTA-66 and pCAGGS-Cre were picked and expanded for genomic DNA preparation and PCR analysis. Part of each clone was stained with X-gal, and the number of blue or white clones is represented. PCR analyses were performed with the indicated primers, and among the blue or white clones, the number of clones that showed a band of the expected size is represented. ND, not done.

In order to analyze integration patterns, 71-5F2-7 and 71-5F2-10 colonies were picked from cells electroporated with pPac-DTA-66 and pCAGGS-Cre, genomic DNAs were prepared, and PCR analysis was performed. As shown in Fig. 1A, the 5'-junction of the recombination can be amplified using the primers AG2 and LZUS3, and all blue clones gave a band of the expected size (Table 2, data of electrophoresis not shown). Then, the presence of the bsr gene, which should be removed through RMCE, was examined with the primers bsr-1 and bsr-2. All blue clones were negative in the bsr-detecting PCR; unexpectedly, however, only 11 of the 19 white clones retained the bsr gene. Since it is reported that lox511 can be recombined with loxP or lox71 (9), PCR analysis with AG2 and FRT-AS primers was performed to detect recombination between lox71 and lox511. The 8 clones that did not carry the bsr gene exhibited a 201-bp band. We cloned the product into the T-vector and confirmed that the sequence corresponded to the recombination product of lox71 and lox511. The spacer sequence of the lox site in the recombined product was the wild-type sequence (data not shown). The DT-A gene was detected in only 2 clones by PCR, indicating successful negative selection. Since the amplified region was inside the ORF, the promoter or pA signal might be deleted in these 2 clones.

DISCUSSION

We have shown here that the combined usage of negative selection with the DT-A gene, positive selection with the pac gene and the third lox site for intramolecular recombination efficiently enriched RMCE events. The enrichment effect was more apparent in the "lower" recombination-frequency clones, and we could obtain a frequency of 50% and more, which is high enough for the practical use of RMCE.

We could enrich RMCE events through two strategies: one is the use of the DT-A negative selection marker gene and the other is the change of positive selection marker gene.
The usefulness of the DT-A gene, which reduces random integration, has already been reported in gene targeting (21); however, an enrichment effect on gene targeting frequency by changing the selection marker has not been clearly observed. In comparing two plasmids having the same promoter, pPac-DTA (Pgk-pac) and pPGK-DTA (Pgk-neo), the number of white colonies was reduced to almost half, probably due to the difference in colony formation efficiency between the pac and neo genes. On the other hand, in the comparison between pMC1-DTA (MC1-neo) and pPac-DTA (Pgk-pac), there is no statistical difference in the number of white colonies or blue colonies, except for blue colonies in 71-5F2-26; nevertheless, the RMCE frequency showed a statistical difference in all the lines used in this experiment. Why did the use of the Pgk-pac cassette result in higher RMCE frequency? We speculate that the difference in time course between G418 and puromycin selection might be the cause of the enrichment effect. Puromycin kills almost all sensitive cells within 24 h, whereas G418 takes 2-8 days. The quick elimination of non-transfected cells might result in a reduction of spontaneously drug-resistant colonies.

In addition to the reduction of random integration, we could increase recombination efficiency itself by adding the third lox site to induce intramolecular recombination, which results in a 33% size reduction of the targeting DNA molecule by removal of the plasmid vector sequence. Interestingly, the numbers of blue colonies with the pPac-DTA-66 plasmid were 1.5-1.7 times higher than those with the pPac-DTA plasmid; thus, the RMCE efficiencies increase approximately in inverse proportion to the size of the targeting DNA molecule. This indicates that the probability of encounter between the chromosomal lox site and the targeting plasmid depends on the physical mobility of the DNA molecule.
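The inverse-proportionality argument above can be checked with simple arithmetic. The sketch below assumes, as a simplification made here for illustration only, that RMCE efficiency scales exactly as 1/size; the 33% size reduction and the 1.5-1.7-fold range are taken from the text, while the function name is hypothetical.

```python
def expected_fold_increase(size_reduction: float) -> float:
    """Fold increase predicted if efficiency is proportional to 1 / molecule size."""
    if not 0.0 <= size_reduction < 1.0:
        raise ValueError("size_reduction must be in [0, 1)")
    return 1.0 / (1.0 - size_reduction)


predicted = expected_fold_increase(0.33)  # 1 / 0.67, roughly 1.49
observed_range = (1.5, 1.7)               # fold increase reported in the text

if __name__ == "__main__":
    low, high = observed_range
    print(f"predicted {predicted:.2f}-fold; observed {low}-{high}-fold")
```

Under this assumption the predicted value (about 1.49-fold) lies at the lower edge of the reported range, which is consistent with the relationship being described as approximate.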
In the higher-recombination efficiency clone 71-5F2-23, the chromosomal configuration around the lox sites is considered to be open; therefore, it always shows high efficiency even when the molecular weight of the targeting vector is relatively large. On the other hand, in the lower-recombination efficiency clone 71-5F2-26, the chromosomal configuration around the lox sites is considered to be closed; therefore, reduction of the molecular weight of the targeting DNA is quite effective in increasing RMCE frequency by improving accessibility to the chromosomal target lox sites. Our results predict that RMCE using a plasmid of large molecular weight, e.g. a bacterial artificial chromosome, may show low frequency.

The advantage of RMCE using our strategy is the minimum requirement for the chromosomal target structure, i.e., only two heterospecific lox sites, and wide and easy application to gene targeting or gene trapping vectors. Thus, our enrichment strategy for RMCE will be a powerful tool in genetic manipulation in ES cells.

We wish to thank Ms. K. Haruna, M. Muta, R. Minato and Y. Tsuruta for technical assistance. This work was supported in part by a Grant-in-Aid on Priority Areas from the Ministry of Education, Science, Culture and Sports of Japan, and by a grant from the Osaka Foundation for Promotion of Clinical Immunology.

REFERENCES

1. Nagy, A. (2000) Cre recombinase: the universal reagent for genome tailoring. Genesis 26, 99-109
2. Branda, C.S. and Dymecki, S.M. (2004) Talking about a revolution: The impact of site-specific recombinases on genetic analyses in mice. Dev. Cell 6, 7-28
3. Taniwaki, T., Haruna, K., Nakamura, H., Sekimoto, T., Oike, Y., Imaizumi, T., Saito, F., Muta, M., Soejima, Y., Unoh, A., Nakagata, N., Araki, M., Yamamura, K., and Araki, K. (2005) Characterization of an exchangeable gene trap using pU-17 carrying a stop codon-βgeo cassette. Dev. Growth Differ. 47, 165-72
4. Fukushige, S. and Sauer, B. (1992) Genomic targeting with a positive-selection lox integration vector allows highly reproducible gene expression in mammalian cells. Proc. Natl Acad. Sci. USA 89, 7905-9
5. Albert, H., Dale, E.C., Lee, E., and Ow, D.W. (1995) Site-specific integration of DNA into wild-type and mutant lox sites placed in the plant genome. Plant J. 7, 649-59
6. Araki, K., Araki, M., and Yamamura, K. (1997) Targeted integration of DNA using mutant lox sites in embryonic stem cells. Nucleic Acids Res. 25, 868-72
7. Hoess, R.H., Wierzbicki, A., and Abremski, K. (1986) The role of the loxP spacer region in P1 site-specific recombination. Nucleic Acids Res. 14, 2287-300
8. Lee, G. and Saito, I. (1998) Role of nucleotide sequences of loxP spacer region in Cre-mediated recombination. Gene 216, 55-65
9. Langer, S.J., Ghafoori, A.P., Byrd, M., and Leinwand, L. (2002) A genetic screen identifies novel non-compatible loxP sites. Nucleic Acids Res. 30, 3067-77
10. Bouhassira, E.E., Westerman, K., and Leboulch, P. (1997) Transcriptional behavior of LCR enhancer elements integrated at the same chromosomal locus by recombinase-mediated cassette exchange. Blood 90, 3332-44
11. Bethke, B. and Sauer, B. (1997) Segmental genomic replacement by Cre-mediated recombination: genotoxic stress activation of the p53 promoter in single-copy transformants. Nucleic Acids Res. 25, 2828-34
12. Kolb, A.F. (2001) Selection-marker-free modification of the murine beta-casein gene using a lox2272 [correction of lox2722] site. Anal. Biochem. 290, 260-71
13. Osipovich, A.B., Singh, A., and Ruley, H.E. (2005) Post-entrapment genome engineering: first exon size does not affect the expression of fusion transcripts generated by gene entrapment. Genome Res. 15, 428-35
14. Araki, K., Araki, M., and Yamamura, K. (2002) Site-directed integration of the cre gene mediated by Cre recombinase using a combination of mutant lox sites. Nucleic Acids Res. 30, e103
15. Seibler, J., Schubeler, D., Fiering, S., Groudine, M.,
and Bode, J. (1998) DNA cassette exchange in ES cells mediated by Flp recombinase: an efficient strategy for repeated modification of tagged loci by marker-free constructs. Biochemistry 37, 6229-34
16. Soukharev, S., Miller, J.L., and Sauer, B. (1999) Segmental genomic replacement in embryonic stem cells by double lox targeting. Nucleic Acids Res. 27, e21
17. Shmerling, D., Danzer, C.P., Mao, X., Boisclair, J., Haffner, M., Lemaistre, M., Schuler, V., Kaeslin, E., Korn, R., Burki, K., Ledermann, B., Kinzel, B., and Muller, M. (2005) Strong and ubiquitous expression of transgenes targeted into the beta-actin locus by Cre/lox cassette replacement. Genesis 42, 229-35
18. al-Shawi, R., Burke, J., Wallace, H., Jones, C., Harrison, S., Buxton, D., Maley, S., Chandley, A., and Bishop, J.O. (1991) The herpes simplex virus type 1 thymidine kinase is expressed in the testes of transgenic mice under the control of a cryptic promoter. Mol. Cell. Biol. 11, 4207-16
19. Yagi, T., Nada, S., Watanabe, N., Tamemoto, H., Kohmura, N., Ikawa, Y., and Aizawa, S. (1993) A novel negative selection for homologous recombinants using diphtheria toxin A fragment gene. Anal. Biochem. 214, 77-86
20. Araki, K., Imaizumi, T., Okuyama, K., Oike, Y., and Yamamura, K. (1997) Efficiency of recombination by Cre transient expression in embryonic stem cells: comparison of various promoters. J. Biochem. 122, 977-82
21. Yanagawa, Y., Kobayashi, T., Ohnishi, M., Tamura, S., Tsuzuki, T., Sanbo, M., Yagi, T., Tashiro, F., and Miyazaki, J. (1999) Enrichment and efficient screening of ES cells containing a targeted mutation: the use of DT-A gene with the polyadenylation signal as a negative selection maker. Transgenic Res. 8, 215-21

Review Article

Gene trap mutagenesis in mice: New perspectives and tools in cancer research

Ken-ichi Yamamura and Kimi Araki

Division of Developmental Genetics, Institute of Molecular Embryology and Genetics, Kumamoto University, 2-2-1 Honjo, Kumamoto 860-0811, Japan

(Received July 30, 2007/Accepted August 7, 2007/Online publication September 17, 2007)

The complete DNA sequence of the human genome was published in 2004 and we entered the postgenomic era. However, many studies showed that gene function is much more complex than we expected, and that mutation of disease genes does not give any clue to the molecular mechanisms of disease development. Since the first report on gene knockout mice in 1988, knockout mice have been shown to be a powerful tool for functional genomics and for the dissection of developmental processes in human diseases. In accordance with this successful application of knockout mice, three major mouse knockout programs are now underway worldwide, to mutate all protein-encoding genes in mouse embryonic stem cells using a combination of gene trapping and gene targeting. We developed the exchangeable gene trap method suitable for large scale mutagenesis in mice. In this method we can produce null mutation and post-insertional modification, enabling replacement of the marker gene with a gene of interest and conditional knockout. We herein discuss the effect of this gene-driven type approach for cancer research, especially for finding the genes that are related to cancer, but are paid little attention in hypothesis-driven cancer research. (Cancer Sci 2008; 99: 1-6)

It is a well known fact that a gene expressed during development, for example the α-fetoprotein gene, is often reactivated in cancer cells. In addition, there are some similarities between carcinogenesis and embryonic development. One important function of development involves the production and organization of all of the diverse types of cells in the body. This generation of cellular diversity is accomplished by cell proliferation, followed by cell differentiation.
The processes that organize the different cells into tissues and organs are called morphogenesis and growth. Although cancer tissues seem to be disorganized, we thought that they require cell differentiation and harmonization of these differentiated cells. For example, they need new blood vessels to obtain nutrition and energy for themselves. From this point of view, we hypothesize that many of the genes involved in embryogenesis may have functions in cancer cells, not just as markers, and that we can find genes that have unexpected functions in cancer cells. Thus, we decided to use the gene trap approach to find such genes.

Enhancer trap as a tool for gene identification

A strategy to monitor transcriptionally active regions of the genome was first described in bacteria. It involved the introduction of reporter constructs that require the acquisition of cis-acting DNA sequences in the genome to activate reporter gene expression. In this way, genes are identified based on expression information and subsequently cloned from DNA sequences flanking the site of insertion. This approach was then applied to higher organisms using modified vectors suitable for eukaryotic transcription units. Initially, the enhancer trap vector was developed based on the observation that cellular enhancers are capable of acting on heterologous promoters using cultured cell lines. In Drosophila, the enhancer trap approach was used successfully in a large scale screening for genes expressed at particular developmental stages or in particular lineages. The lacZ reporter was used to provide a sensitive way to detect expression in whole embryos. Similarly, enhancer trap vectors were found to exhibit unique temporal and spatial patterns of lacZ expression in transgenic mice. Thus, this method was initially used for finding unknown genes in the genome, but not for the production of a mutation.

doi: 10.1111/j.1349-7006.2007.00611.x
However, an enhancer is sometimes located far from the coding region in the eukaryotic genome. Thus, it is difficult to search for the gene, because isolation of probes and screening of the cDNA library are required to identify the gene.

Exon trap and promoter trap as tools for mutagenesis

To overcome problems in the enhancer trap approach, Gossler et al. developed a new strategy in which mouse embryonic stem (ES) cells and an exon trap vector were used to screen many integration events and rapidly clone the associated genes. Although the term "gene trap" is used in the published reports, the more accurate term "exon trap" is used in this manuscript. Exon trap vectors are designed to generate spliced fusion transcripts between the reporter and the endogenous gene present at the site of integration. The essential feature of the exon trap design is the placement of a splice acceptor in front of a promoterless lacZ gene. Integrations of the vector into the intron of a gene in the correct orientation were predicted to create lacZ fusion transcripts; and if the reading frames of the endogenous gene and lacZ are the same, an active β-galactosidase fusion protein should be produced. An alternative method, the promoter trap, was also developed to isolate the transcriptional control element, the promoter, using cultured cells. Because this vector contains only the coding sequences of the reporter gene, it is expected to require insertions into exons or the 5' untranslated region to activate reporter gene expression. These vectors can create lacZ fusion products with endogenous genes and, as a result, may interfere with the normal coding capacity of the endogenous gene and thereby create a mutation. Moreover, cloning a portion of the endogenous gene directly from the lacZ fusion transcript eliminates the time-consuming task of searching for exons in flanking genomic DNA. Eventually, the exon trap and promoter trap approaches were applied to disrupt genes expressed in ES cells.
The use of ES cells brings several advantages. First, the exon or promoter trap technology in ES cells made it possible to carry out a large scale disruption of genes in mice. Although the ability to carry out large scale screens has proven essential in unraveling the genetic program underlying embryogenesis in organisms such as Drosophila melanogaster, these types of screens in mammals were made difficult by the large genomic size. Furthermore, the cost and space required to house large numbers of animals and the relatively long breeding period have limited the undertaking of large scale screens. These disadvantages can be overcome by the use of ES cells. Second, it is possible to prescreen ES cells for desired insertion events and for desired expression patterns. ES cells can be induced to differentiate into many types of cells. Thus, in vitro preselection of gene-trapped ES clones is possible by examining the lacZ expression during differentiation. Third, monitoring the lacZ fusion gene activity in embryos should enable one to readily visualize the pattern of endogenous gene expression during development. Thus, this method was used for a large scale screening for insertional mutation in developmentally regulated genes in mice.

To whom correspondence should be addressed. E-mail: yamamura@gpo.kumamoto-u.ac.jp
Cancer Sci, January 2008, vol. 99, no. 1, 1-6

Development of new trap methods

We first developed a poly A trap vector. As expected, the exon or promoter trap can be used for those genes expressed in ES cells. Thus, these methods cannot be used for genes that are not expressed in ES cells. The constructs used in our poly A trap experiments consist of promoter-less lacZ and a neomycin phosphotransferase (neo) expression unit without its own poly A addition signal sequence. We removed the poly A addition signal sequence because the neo gene was expected to be expressed only when the trap vector could use the poly A addition signal of the endogenous gene.
Although we isolated six ES clones and sequenced RACE (rapid amplification of cDNA ends) products, there were no sequences that completely matched the consensus sequence of the mammalian poly A addition signal. Other groups also constructed poly A trap vectors, which were used to capture a broader spectrum of genes including those not expressed in ES cells. However, it turned out that poly A trapping inevitably selects for vector integration into the last introns of the trapped genes, resulting in the deletion of only a limited C-terminal portion of the protein encoded by the last exon of the trapped gene. Shigeoka et al. demonstrated that this remarkable skewing is caused by the degradation of a selectable-marker mRNA used for poly A trapping via an mRNA-surveillance mechanism, nonsense-mediated mRNA decay (NMD). The NMD pathway is universally conserved among eukaryotes and is responsible for the degradation of mRNAs with potentially harmful nonsense mutations. They also showed that an internal ribosome entry site (IRES) sequence inserted downstream of the authentic termination codon (TC) of the selectable-marker mRNA prevents the molecule from undergoing NMD, and makes it possible to trap transcriptionally silent genes without a bias in the vector-integration site. Thus, this novel anti-NMD technology, termed UPATrap, could be used as one of the powerful and straightforward strategies for the unbiased inactivation of all mouse genes in the genome of ES cells.

On the other hand, there was a growing demand for production of a multipurpose allele.
As a gene has many functions, it is not enough to make one strain of null mutant mouse to examine gene function or to produce a mouse model for human disease. Actually, the gene trap vectors that have been used to generate the currently available resources induce only one type of mutation, such as the null mutation: mouse mutants generated from these libraries can show only the earliest and non-redundant developmental function of the trapped gene. Therefore, for most of the mutant strains, the significance of the trapped gene for human disease remains uncertain, because most human disorders result from late-onset gene dysfunction. In addition, between 20% and 30% of the genes targeted in ES cells are required for development and cause embryonic lethal phenotypes when transferred to the germ line, precluding functional analysis in adults.

Fig. 1. Cre-loxP recombination system. (a) The integration reaction is inefficient with wild-type loxP sites due to the re-excision of the recombined product. (b) Integration reaction through left element/right element (LE/RE) mutant lox. (c) Integration reaction through double mutant lox. (d) Nucleotide sequences of various lox sites.

To solve these problems, we have developed an exchangeable gene trap vector based on a strategy for directional site-specific recombination by the Cre-loxP system. The Cre-loxP recombination system of bacteriophage P1 is currently the most powerful tool for genetic manipulation both in vitro and in vivo. Cre recombinase catalyzes reciprocal site-specific recombination between two loxP sites (Fig. 1). Consequently, Cre mediates both excisive and integrative recombination (Fig. 1a). In integrative recombination, a circular DNA carrying a loxP site is inserted into a loxP site on a chromosome.
However, this integration reaction is quite inefficient, because the integrated DNA, which has loxP sites at both ends, is easily removed again through excisive recombination if the Cre recombinase is still present (Fig. 1a). Therefore, a special selection system in which only targeted integrants can survive is indispensable for targeted integration into loxP sites. Albert et al.(36) devised a new strategy. They identified three sets of mutant lox sites that favor integrative recombination over the excisive reaction. The loxP site is composed of an asymmetric 8 bp spacer flanked by 13 bp inverted repeats (Fig. 1d). They introduced nucleotide changes into the 13 bp left element (LE mutant lox site, lox71) or the 13 bp right element (RE mutant lox site, lox66) (Fig. 1d). Recombination between the LE mutant lox site and the RE mutant lox site produces the wild-type loxP site on the right and an LE + RE mutant site on the left that is poorly recognized by Cre, resulting in stable integration (Fig. 1b). Using a pair of mutant lox sites, lox71 and lox66, we achieved site-directed DNA integration in mouse ES cells. We showed that the frequency of site-specific integration via the mutant lox sites reached a maximum of 16%. In contrast, the wild-type loxP sites yielded very low frequencies (0.5%) of site-specific integration events.

By applying a pair of mutant lox sites to gene trapping, we have developed a new gene trap vector, termed pU-Hachi. pU-Hachi consists of SA-lox71-IRES-βgeo-poly A signal (pA)-loxP-pA-pUC. We used an IRES-selection-reporter marker because it can trap genes whose expression levels in ES cells are very low. Since the long IRES-βgeo unit interposed between the lox71 site and the loxP site can be easily removed by Cre recombinase, we can carry out plasmid rescue to recover the 5′-flanking genomic region as well as the 3′-flanking region. Thus, we can isolate and identify the flanking DNA sequences easily.
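The arm-swapping logic behind the LE/RE mutant strategy can be illustrated with a small model. The sketch below is illustrative only: it records just whether each 13 bp arm of a lox site is wild-type or mutant, and it assumes, as a simplification rather than a measured rule, that Cre acts efficiently on any site unless both arms are mutant. The class and function names (`LoxSite`, `recombine`, `integration_is_stable`) are hypothetical and do not come from the original work.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LoxSite:
    """Abstract lox site: two 13 bp arms flanking an 8 bp spacer."""
    left_mutant: bool   # True if the left element (LE) carries mutations
    right_mutant: bool  # True if the right element (RE) carries mutations

    @property
    def label(self) -> str:
        if self.left_mutant and self.right_mutant:
            return "LE+RE double mutant"
        if self.left_mutant:
            return "LE mutant"
        if self.right_mutant:
            return "RE mutant"
        return "wild-type loxP"

    @property
    def efficient_substrate(self) -> bool:
        # Simplifying assumption: only the double mutant is poorly recognized by Cre.
        return not (self.left_mutant and self.right_mutant)


def recombine(a: LoxSite, b: LoxSite) -> Tuple[LoxSite, LoxSite]:
    """Exchange within the spacer: each product keeps the left arm of one
    parent site and the right arm of the other."""
    return (LoxSite(a.left_mutant, b.right_mutant),
            LoxSite(b.left_mutant, a.right_mutant))


def integration_is_stable(a: LoxSite, b: LoxSite) -> bool:
    """In this model the insert is readily re-excised only if both product
    sites are still efficient Cre substrates."""
    p1, p2 = recombine(a, b)
    return not (p1.efficient_substrate and p2.efficient_substrate)


LOXP = LoxSite(left_mutant=False, right_mutant=False)
LOX71 = LoxSite(left_mutant=True, right_mutant=False)   # LE mutant
LOX66 = LoxSite(left_mutant=False, right_mutant=True)   # RE mutant

if __name__ == "__main__":
    for name, (x, y) in {"loxP x loxP": (LOXP, LOXP),
                         "lox71 x lox66": (LOX71, LOX66)}.items():
        products = " + ".join(p.label for p in recombine(x, y))
        print(f"{name}: {products}; stable integration: {integration_is_stable(x, y)}")
```

In this toy model, the wild-type pair regenerates two wild-type sites and the insert can be excised again, whereas the lox71/lox66 pair yields one double-mutant site, which is the qualitative behavior described in the text. The model says nothing about actual frequencies such as the 16% and 0.5% figures.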
As the lox71 site serves as a target for Cre-mediated insertion, we can insert a cDNA sequence joined to an IRES to express it under the control of a trapped promoter. The poly (A) signal downstream of the loxP plays a role in concentrating the targeted integration event through the poly (A) trap strategy. Since the selection marker gene in a targeting vector does not have a poly (A) signal, only upon targeted integration does the selection marker gene come to use the poly (A) signal on the trap vector, thereby making the cells drug-resistant.

[doi: 10.1111/j.1349-7006.2007.00611.x]

Although the pU8 (pU-Hachi) vector works quite well, a truncated protein may be produced when the vector is integrated into a downstream region of the endogenous gene. This truncated protein may exert unexpected effects on phenotypes. In addition, recombination may occur between lox71 and loxP, because both carry the same spacer sequences, resulting in the deletion of the gene of interest.

To overcome these problems, we improved the design and created the next trap vector, pU-17. First, we tried to produce a promoter trap vector, so that we can make a null allele in the first step. For this purpose, we inserted three stop codons in the splice acceptor, which is located upstream of the ATG of the reporter gene, β-geo, and did not use the IRES sequence. If this vector were inserted into a downstream region of an endogenous gene, translation would stop at the stop codon. As the β-geo gene does not have an IRES sequence, it would not be translated independently. Thus, such ES clones will not become drug-resistant and will not be isolated. Second, we adopted a double lox strategy to reduce the possibility of recombination as mentioned above. In the double lox strategy, or recombinase-mediated cassette exchange, a different type of mutant lox site is used. We used heterospecific lox sites that have mutation(s) in the 8 bp spacer region.
The principle is that recombination does not occur between two lox sites differing in the spacer region, whereas lox sites having the identical spacer region recombine efficiently. Several groups have used lox511 (Fig. 1c), which contains a single base substitution, and demonstrated successful gene replacement in which a genomic DNA segment flanked by lox511 and loxP was replaced with another cassette flanked by lox511 and loxP located on a transfected plasmid vector. Lee and Saito(43) developed heterospecific mutant lox sites with two base substitutions, such as lox2272 (Fig. 1d), and showed, using an in vitro system, that they never recombined with the wild-type loxP site, while lox511 can recombine with loxP at low frequency. Successful selection marker-free replacement using lox2272 and loxP was demonstrated in ES cells by Kolb(44) (Fig. 1c). So, we examined the best combination of mutant lox sites to give a high recombination efficiency and stability. We found that the frequency of site-specific integration was highest and reached a maximum of 35% when we used the combination of the LE/RE mutant and lox2272. Based on these findings, we have constructed pU-17, suitable for production of the null trap allele in the first step, production of the replaced allele in the second step, and the conditionally inactivated allele in the third step by mating with recombinase-expressing transgenic mice (Fig. 2). This is termed the exchangeable gene trap method.

The new vector, pU-17, carries the intron-lox71-splicing acceptor (SA)-βgeo-loxP-pA-lox2272-pSP73-lox511 (Fig. 2). The SA contains three stop codons in-frame with the ATG of β-geo, so that the vector can function as a promoter trap. We found that the trap vector was highly selective for integrations near the exon containing the start codon (Fig. 3), leading to the null mutation of the trapped endogenous gene as expected.
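The spacer rule described above (identical spacers recombine efficiently, lox511 is leaky with loxP, and lox2272 is not expected to recombine with loxP) can be sketched as a simple lookup. This is a minimal illustration under stated assumptions: the 8 bp spacer sequences are those commonly quoted in the literature and should be checked against Fig. 1d before being relied on, and the "one mismatch means low frequency, two or more means none" classification is a simplification generalized from the lox511 and lox2272 observations, not a rule stated by the authors. The function names are hypothetical.

```python
from itertools import combinations

# Assumed 8 bp spacer sequences (commonly quoted values; verify against Fig. 1d).
SPACERS = {
    "loxP": "ATGTATGC",
    "lox71": "ATGTATGC",    # LE mutant: arm changes only, wild-type spacer
    "lox66": "ATGTATGC",    # RE mutant: arm changes only, wild-type spacer
    "lox511": "ATGTATAC",   # single base substitution in the spacer
    "lox2272": "AAGTATCC",  # two base substitutions in the spacer
}


def spacer_mismatches(site_a: str, site_b: str) -> int:
    """Number of positions at which the two spacers differ."""
    return sum(x != y for x, y in zip(SPACERS[site_a], SPACERS[site_b]))


def expected_recombination(site_a: str, site_b: str) -> str:
    """Coarse classification based only on spacer identity (a simplification)."""
    n = spacer_mismatches(site_a, site_b)
    if n == 0:
        return "efficient"
    if n == 1:
        return "low frequency"
    return "not expected"


if __name__ == "__main__":
    # lox sites carried by the pU-17 vector as listed in the text.
    pu17_sites = ["lox71", "loxP", "lox2272", "lox511"]
    for a, b in combinations(pu17_sites, 2):
        print(f"{a} x {b}: {expected_recombination(a, b)}")
```

Under these assumptions, the only efficiently recombining pair within the vector is lox71 with loxP, which share the wild-type spacer; this is consistent with the text's remark that both carry the same spacer sequences.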
Thus, using this pU-17 trap vector, we can initially carry out random mutagenesis. Among 432 trap clones, the trapped DNAs comprise 53% (229 clones) known coding region, 13.4% (58 clones) new or unknown genes, 12% (52 clones) known coding region but in reverse orientation, 8.8% (38 clones) no known coding region, and 12.7% (55 clones) undetermined (TransGenic Inc., Kumamoto, Japan, unpublished data, 2006) (Fig. 4). It is of interest that about 12% of clones show the insertion of a trap vector in a reverse orientation, suggesting that non-coding RNA genes may exist in these regions. Furthermore, we successfully integrated the cre gene into the mutant lox sites by Cre-mediated recombination. The inserted cre gene was stably transmitted through the germline. This cre knock-in system, using the LE/RE-lox2272 combination, should be useful for the production of various cre mice for gene targeting or gene trapping.

Fig. 2. Exchangeable gene trap method. Production of the null trap allele in the first step, the replaced allele in the second, and the conditionally inactivated allele in the third step is shown.

Exchangeable gene trap as a tool for cancer research

Using the exchangeable gene trap method, we established more than 300 ES trap mouse lines, and these lines are cryopreserved. The database of these lines is publicly available at the Database for the Exchangeable Gene Trap Clones (EGTC) website (http://egtc.jp/). Among these, we analyzed six lines in detail, termed Ayu3-112, Ayu8-021, Ayu8-025, Ayu8-104, Ayu8-108, and Ayu21-127, to examine whether these genes had any relation with cancer. In these lines, the integration of the trap vector resulted in the disruption of the Crebbp (cyclic AMP response element-binding protein (CREB)-binding protein) gene, the Skt (Sickle tail) gene, the Abhd2 (α/β hydrolase domain containing 2) gene, the c-crk gene, the Impβ (importin β/karyopherin β1) gene, and the Lgr4 (leucine-rich repeat domain containing G protein-coupled receptor 4) gene, respectively. These strains carry the corresponding gene-trap (Gt) alleles of Crebbp, Skt, Abhd2, Crk, Impβ, and Lgr4, respectively. In the present review we will focus on three strains: Skt^Gt, Abhd2^Gt, and Lgr4^Gt. These genes were initially thought to be unrelated to cancer, and therefore no one had chosen these genes as targets for cancer research. This is because most cancer studies have been carried out using a hypothesis-driven approach. In this approach, researchers need information that suggests a relationship exists between the gene and cancer before they commence work. In contrast, the gene trap approach is a Baconian science, like natural history. This is also called an "ignorance-driven" approach. The seventeenth-century philosopher Francis Bacon suggested a system for understanding the world that began with the accumulation of facts, based on observation. This kind of approach does not include bold theories or sudden leaps of understanding. It does not attract the same level of recognition as hypothesis-driven types of research. Thus, many people are doubtful about this type of approach in science. However, through the gene trap approach, we can find genes that are paid little attention in cancer research, but are actually related to cancer. Next we will briefly introduce these three lines and explain how they are related to cancer.

[© 2007 Japanese Cancer Association]

Fig. 3. Insertion sites in known genes using the pU-17 vector. Black bars indicate coding regions and blank boxes indicate 5′ or 3′ untranslated regions. Arrows indicate the nucleotide numbers of insertion sites. Nucleotide numbers of the translation initiation site, translation termination site, and transcription termination site are shown below the genes.

Fig. 4. Characterization of trapped DNA. In 12% of clones, the trap vector is inserted into the known coding region but in reverse orientation. In 8.8% of clones there is no known coding region around the insertion site.

We established the mutant mouse line carrying a gene-trap allele of Skt on a B6;CB background (Skt^Gt) through gene-trap mutagenesis in embryonic stem cells. The novel gene identified, called Sickle tail (Skt), is composed of 19 exons and encodes a protein of 1352 amino acids with a proline-rich region and a coiled-coil domain. Expression of the reporter gene (β-geo) was detected in the notochord during embryogenesis and in its derivative, the nucleus pulposus, in adult mice. Compression of some of the nuclei pulposi in the center of intervertebral discs appeared at embryonic day (E) 17.5, resulting in a kinky-tail phenotype in Skt^Gt adult mice. Chordoma is a malignant tumor, but is characterized by slow growth, with local destruction of the bone and extension into the adjacent soft tissue. Very rarely, distant metastases are encountered. Chordomas are thought to arise from primitive notochordal remnants along the axial skeleton. The distribution of tumors matches the distribution of notochordal remnants. Skt could be a marker for differential diagnosis of chordoma and chondrosarcoma. This differential diagnosis may be important in deciding the extent of surgical dissection.

We also isolated a clone, Ayu8-025, in which the trap vector was inserted into the fifth intron of α/β hydrolase domain containing 2 (Abhd2). The α/β hydrolase fold family consists of hydrolytic enzymes of widely differing phylogenetic origin. Each of these family members has the same protein fold, termed the α/β hydrolase fold.
The core of each enzyme is similar: an α/β-sheet of five to eight β-strands connected by α-helices to form an α/β/α sandwich. They all have a nucleophile-histidine-acid catalytic triad, which is the best-conserved structural feature in the fold. In mice, three cDNAs encoding proteins containing an α/β hydrolase fold were cloned from lung cDNA. These are now termed abhydrolase domain containing (Abhd) 1, 2 and 3. Reverse transcription-polymerase chain reaction (RT-PCR) analyses showed that these three genes are widely, but differentially, expressed in many tissues. The expression of Abhd1 and Abhd3 was highest in the liver and lowest in the spleen, whereas the expression of Abhd2 was high in the testis and the spleen. All three Abhd proteins are predicted to have a single amino-terminal transmembrane domain. Although proteins in this family generally act as enzymes such as acetylcholinesterases or lipases, the specific functions of the three Abhd proteins are unknown. To obtain some clue to Abhd2 function, we first analyzed the expression pattern during embryonic development using X-gal staining. Interestingly, Abhd2 was expressed in the vitelline vessels of yolk sacs at embryonic day (E) 12.5, and expression of Abhd2 switched from endothelial cells to vascular smooth muscle cells (SMCs) during further embryonic development. In adults, Abhd2 was expressed in many tissues including vascular and non-vascular SMCs such as intestinal SMCs. Although homozygous mutant mice were apparently normal, enhanced SMC migration in explant SMC culture and marked intimal hyperplasia after cuff placement were observed in homozygous mice. Our results show that Abhd2 is involved in SMC migration and neointimal thickening in vascular SMCs. When we started to analyze this gene in 2002, there was no report suggesting a relationship between this gene and cancer. Using microarray analysis, Johansson et al. reported that the expression of Abhd2 was enhanced in a mouse brain tumor model induced by intracerebral injection of a recombinant Moloney murine leukemia virus encoding the platelet-derived growth factor B-chain (MMLV/PDGFB) into newborn mice. In addition, again using microarray analysis, Chen et al. reported that ABHD2 expression was increased in MCF-7 cells transfected with human Sp5, a member of the Sp transcription factor family, suggesting that ABHD2 is a downstream target of Sp5 and that ABHD2 is associated with tumorigenesis. At the moment, the role of ABHD2 in tumorigenesis is not yet known.

Another gene trap line is Lgr4^Gt, in which the trap vector was integrated into the 2nd intron of the Lgr4 gene. Northern blot analysis and quantitative RT-PCR showed that the Lgr4^Gt mice had 10% of the normal mRNA expression of the Lgr4 gene, suggesting that the Lgr4^Gt mouse is a hypomorphic mutant. Leucine-rich repeat domain containing G-protein coupled receptor (LGR) family members are characterized by an extracellular domain with multiple leucine-rich repeats (LRRs). The presence of a large extracellular domain is a remarkable feature that separates the LGR family members from the other G-protein coupled receptors (GPCRs). Studies of LGRs from different species suggest that LGRs can be classified into three subtypes (A, B and C). Type A LGRs include the follicle-stimulating hormone receptor (FSHR), the luteinizing hormone receptor (LHR), and the thyroid-stimulating hormone receptor (TSHR), in which the ligands are glycoprotein hormones. Type B LGRs comprise three members: LGR4, also known as GPR48, LGR5, and LGR6. Type B LGRs remain orphan GPCRs and their physiological functions have not yet been determined. Type C LGRs, including LGR7 and LGR8, were recently described as relaxin receptors.
Following the identification of LGR7 and LGR8 as relaxin receptors, the closely related relaxin-3 and INSL3 have been shown to function as selective agonists for LGR7 and LGR8, respectively. The homozygous male was infertile, showing morphological abnormalities in both the testes and the epididymides. In the testes, luminal swelling, loss of germinal epithelium in the seminiferous tubules, and rete testis dilation were observed. Rete testis dilation was due to a water reabsorption failure caused by decreased expression of ESR1 and SLC9A3 in the efferent ducts. The epididymis contained short and dilated tubules and completely lacked its initial segment. Interestingly, we observed multilamination and distortion of the basement membranes (BM) with an accumulation of laminin in the caput epididymidis. These results indicate that Lgr4 has pivotal roles in the control of duct elongation through basement membrane remodeling, and in the regional differentiation of the caput epididymidis. On the other hand, Gao et al. reported, using microarray analysis, that LGR4 expression is upregulated in p27-deficient HCT116 cells. Forced expression of LGR4 increased both the in vitro invasive activity and the lung metastasis potency of HCT116 cells. In contrast, the depletion of endogenous LGR4 by RNA interference reduced the invasive potential of HeLa and Lewis lung carcinoma cells not only in vitro but also in vivo. Moreover, GPR48 expression was significantly associated with lymph node metastasis and inversely correlated with p27 expression in human colon carcinomas. As a reduced expression level of the cyclin-dependent kinase inhibitor p27Kip1 is associated with increased tumor malignancy and poor prognosis in individuals with various types of cancer, LGR4 may play an important role in invasiveness and metastasis of carcinoma and might therefore represent a potential prognostic marker or therapeutic target. This is consistent with our hypothesis that Lgr4 is involved in the remodeling of the BM.

Conclusion

In summary, we show here that the exchangeable gene trap is a powerful tool for finding genes that are related to cancer but are paid little attention in cancer research. By combining microarray analysis and gene trap mutagenesis, we will be able to analyze the function of such genes in more systematic ways. In addition, we show that we can replace the reporter gene with a gene of interest using the Cre-loxP system. We can apply this system to vectors for homologous recombination, and thus we can make a multipurpose allele at any gene locus. These methods can be applied for detailed analysis of gene function in cancer research.

Acknowledgments

This work was supported in part by a KAKENHI (Grant-in-Aid for Scientific Research) in Priority Areas "Integrative Research Toward the Conquest of Cancer" from the Ministry of Education, Culture, Sports, Science and Technology of Japan, a grant from the Osaka Foundation of Promotion of Clinical Immunology, and a grant from the Pancreas Research Foundation of Japan. We thank TransGenic Incorporation for providing unpublished data on characterization of trapped DNA, and the members of our group for their support and continuous discussion.

References

1 Casadaban MJ, Cohen SN. Lactose genes fused to exogenous promoters in one step using a Mu-lac bacteriophage: in vivo probe for transcriptional control sequences. Proc Natl Acad Sci USA 1979; 76: 4530-3.
2 Fried M, Griffiths M, Davies B, Bjursell G, La Mantia G, Lania L. Isolation of cellular DNA sequences that allow expression of adjacent genes. Proc Natl Acad Sci USA 1983; 80: 2117-21.
3 O'Kane CJ, Gehring WJ. Detection in situ of genomic regulatory elements in Drosophila. Proc Natl Acad Sci USA 1987; 84: 9123-7.
4 Allen ND, Cran DG, Barton SC, Hettle S, Reik W, Surani MA. Transgenes as probes for active chromosomal domains in mouse development. Nature 1988; 333: 852-5.
5 Kothary R, Clapoff S, Brown A, Campbell R, Peterson A, Rossant J. A transgene containing lacZ inserted into the dystonia locus is expressed in neural tube. Nature 1988; 335: 435-7.
6 Gossler A, Joyner AL, Rossant J, Skarnes WC. Mouse embryonic stem cells and reporter constructs to detect developmentally regulated genes. Science 1989; 244: 463-5.
7 Brenner DG, Lin-Chao S, Cohen SN. Analysis of mammalian cell genetic regulation in situ by using retrovirus-derived "portable exons" carrying the Escherichia coli lacZ gene. Proc Natl Acad Sci USA 1989; 86: 5517-21.
8 von Melchner H, Ruley HE. Identification of cellular promoters by using a retrovirus promoter trap. J Virol 1989; 63: 3227-33.
9 von Melchner H, Reddy S, Ruley HE. Isolation of cellular promoters by using a retrovirus promoter trap. Proc Natl Acad Sci USA 1990; 87: 3733-7.
10 Macleod D, Lovell-Badge R, Jones S, Jackson I. A promoter trap in embryonic stem (ES) cells selects for integration of DNA into CpG islands. Nucleic Acids Res 1991; 19: 17-23.
11 Friedrich G, Soriano P. Promoter traps in embryonic stem cells: a genetic screen to identify and mutate developmental genes in mice. Genes Dev 1991; 5: 1513-23.
12 Wurst W, Rossant J, Prideaux V et al. A large-scale gene-trap screen for insertional mutations in developmentally regulated genes in mice. Genetics 1995; 139: 889-99.
13 Chowdhury K, Bonaldo P, Torres M, Stoykova A, Gruss P. Evidence for the stochastic integration of gene trap vectors into the mouse germline. Nucleic Acids Res 1997; 25: 1531-6.
14 Hicks GG, Shi EG, Li XM, Li CH, Pawlak M, Ruley HE. Functional genomics in mice by tagged sequence mutagenesis. Nat Genet 1997; 16: 338-44.
15 Zambrowicz BP, Friedrich GA, Buxton EC, Lilleberg SL, Person C, Sands AT. Disruption and sequence identification of 2000 genes in mouse embryonic stem cells. Nature 1998; 392: 608-11.
16 Forrester LM, Nagy A, Sam M et al. An induction gene trap screen in embryonic stem cells: identification of genes that respond to retinoic acid in vitro. Proc Natl Acad Sci USA 1996; 93: 1677-82.
17 Stanford WL, Caruana G, Vallis KA et al. Expression trapping: identification of novel genes expressed in hematopoietic and endothelial lineages by gene trapping in ES cells. Blood 1998; 92: 4622-31.
18 Tate P, Lee M, Tweedie S, Skarnes WC, Bickmore WA.
Capturing novel mouse genes encoding chromosomal and other nuclear proteins. J Cell Sci 1998; 111: 2575-85.
19 Kluppel M, Vallis KA, Wrana JL. A high-throughput induction gene trap approach defines C4ST as a target of BMP signaling. Mech Dev 2002; 118: 77-89.
20 Korn R, Schoor M, Neuhaus H et al. Enhancer trap integrations in mouse embryonic stem cells give rise to staining patterns in chimaeric embryos with a high frequency and detect endogenous genes. Mech Dev 1992; 39: 95-109.
21 Soininen R, Schoor M, Henseling U et al. The mouse Enhancer trap locus 1 (Etl-1): a novel mammalian gene related to Drosophila and yeast transcriptional regulator genes. Mech Dev 1992; 39: 111-23.
22 Niwa H, Araki K, Kimura S, Taniguchi S, Wakasugi S, Yamamura K. An efficient gene-trap method using poly A trap vectors and characterization of gene-trap events. J Biochem (Tokyo) 1993; 113: 343-9.
23 Salminen M, Meyer BI, Gruss P. Efficient poly A trap approach allows the capture of genes specifically active in differentiated embryonic stem cells and in mouse embryos. Dev Dyn 1998; 212: 326-33.
24 Ishida Y, Leder P. RET: a poly A-trap retrovirus vector for reversible disruption and expression monitoring of genes in living cells. Nucleic Acids Res 1999; 27: e35.
25 Shigeoka T, Kawaichi M, Ishida Y. Suppression of nonsense-mediated mRNA decay permits unbiased gene trapping in mouse embryonic stem cells. Nucleic Acids Res 2005; 33: e20.
26 Hentze MW, Kulozik AE. A perfect message: RNA surveillance and nonsense-mediated decay. Cell 1999; 96: 307-10.
27 Araki K, Imaizumi T, Okuyama K, Oike Y, Yamamura K. Efficiency of recombination by Cre transient expression in embryonic stem cells: comparison of various promoters. J Biochem (Tokyo) 1997; 122: 977-82.
28 Testa G, Schaft J, van der Hoeven F et al. A reliable lacZ expression reporter cassette for multipurpose, knockout-first alleles. Genesis 2004; 38: 151-8.
29 Mitchell KJ, Pinson KI, Kelly OG et al. Functional analysis of secreted and transmembrane proteins critical to mouse development. Nat Genet 2001; 28: 241-9.
30 Sauer B, Henderson N. Site-specific DNA recombination in mammalian cells by the Cre recombinase of bacteriophage P1. Proc Natl Acad Sci USA 1988; 85: 5166-70.
31 Sauer B, Henderson N. Targeted insertion of exogenous DNA into the eukaryotic genome by the Cre recombinase. New Biol 1990; 2: 441-9.
32 Lakso M, Sauer B, Mosinger B Jr et al. Targeted oncogene activation by site-specific recombination in transgenic mice. Proc Natl Acad Sci USA 1992; 89: 6232-6.
33 Orban PC, Chui D, Marth JD. Tissue- and site-specific DNA recombination in transgenic mice. Proc Natl Acad Sci USA 1992; 89: 6861-5.
34 Araki K, Araki M, Miyazaki J, Vassalli P. Site-specific recombination of a transgene in fertilized eggs by transient expression of Cre recombinase. Proc Natl Acad Sci USA 1995; 92: 160-4.
35 Kuhn R, Schwenk F, Aguet M, Rajewsky K. Inducible gene targeting in mice. Science 1995; 269: 1427-9.
36 Albert H, Dale EC, Lee E, Ow DW. Site-specific integration of DNA into wild-type and mutant lox sites placed in the plant genome. Plant J 1995; 7: 649-59.
37 Araki K, Imaizumi T, Sekimoto T et al. Exchangeable gene trap using the Cre/mutated lox system. Cell Mol Biol (Noisy-le-grand) 1999; 45: 737-50.
38 Soukharev S, Miller JL, Sauer B. Segmental genomic replacement in embryonic stem cells by double lox targeting. Nucleic Acids Res 1999; 27: e21.
39 Bouhassira EE, Westerman K, Leboulch P. Transcriptional behavior of LCR enhancer elements integrated at the same chromosomal locus by recombinase-mediated cassette exchange. Blood 1997; 90: 3332-44.
40 Feng YQ, Seibler J, Alami R et al. Site-specific chromosomal integration in mammalian cells: highly efficient CRE recombinase-mediated cassette exchange. J Mol Biol 1999; 292: 779-85.
41 Hoess RH, Wierzbicki A, Abremski K. The role of the loxP spacer region in P1 site-specific recombination. Nucleic Acids Res 1986; 14: 2287-300.
42 Bethke B, Sauer B. Segmental genomic replacement by Cre-mediated recombination: genotoxic stress activation of the p53 promoter in single-copy transformants. Nucleic Acids Res 1997; 25: 2828-34.
43 Lee G, Saito I. Role of nucleotide sequences of loxP spacer region in Cre-mediated recombination. Gene 1998; 216: 55-65.
44 Kolb AF. Selection-marker-free modification of the murine beta-casein gene using a lox2272 (correction of lox2722) site. Anal Biochem 2001; 290: 260-71.
45 Araki K, Araki M, Yamamura K. Site-directed integration of the cre gene mediated by Cre recombinase using a combination of mutant lox sites. Nucleic Acids Res 2002; 30: e103.
46 Taniwaki T, Haruna K, Nakamura H et al. Characterization of an exchangeable gene trap using pU-17 carrying a stop codon-beta geo cassette. Dev Growth Differ 2005; 47: 163-72.
47 Oike Y, Hata A, Mamiya T et al. Truncated CBP protein leads to classical Rubinstein-Taybi syndrome phenotypes in mice: implications for a dominant-negative mechanism. Hum Mol Genet 1999; 8: 387-96.
48 Oike Y, Takakura N, Hata A et al. Mice homozygous for a truncated form of CREB-binding protein exhibit defects in hematopoiesis and vasculo-angiogenesis. Blood 1999; 93: 2771-9.
49 Semba K, Araki K, Li Z et al. A novel murine gene, Sickle tail, linked to the Danforth's short tail locus, is required for normal development of the intervertebral disc. Genetics 2006; 172: 445-56.
50 Miyata K, Oike Y, Hoshii T et al. Increase of smooth muscle cell migration and of intimal hyperplasia in mice lacking the alpha/beta hydrolase domain containing 2 gene. Biochem Biophys Res Commun 2005; 329: 296-304.
51 Imaizumi T, Araki K, Miura K et al. Mutant mice lacking Crk-II caused by the gene trap insertional mutagenesis: Crk-II is not essential for embryonic development. Biochem Biophys Res Commun 1999; 266: 569-74.
52 Miura K, Yoshinobu K, Imaizumi T et al. Impaired expression of importin/karyopherin beta1 leads to post-implantation lethality. Biochem Biophys Res Commun 2006; 341: 132-8.
53 Hoshii T, Takeo T, Nakagata N, Takeya M, Araki K, Yamamura K. LGR4 regulates the postnatal development and integrity of male reproductive tracts in mice. Biol Reprod 2007; 76: 303-13.
54 Sulston J, Ferry G. The Common Thread: a story of science, politics, ethics, and the human genome. Washington, DC, USA: The Joseph Henry Press, 2003.
55 Ollis DL, Cheah E, Cygler M et al. The alpha/beta hydrolase fold. Protein Eng 1992; 5: 197-211.
56 Edgar AJ, Polak JM. Cloning and tissue distribution of three murine alpha/beta hydrolase fold protein
