The document provides an overview of the functionalities and tools available in Blast2GO, including annotation generation, modulation, and visualization of Gene Ontology (GO) terms. It describes the basic workflow of using Blast2GO to functionally annotate sequences, including performing BLAST searches, mapping, GO annotation, and visualizing results. The document also outlines some practical exercises for annotating and visualizing GO terms for sequences using Blast2GO and Babelomics.
An introduction to the functionalities of Blast2GO and some practical exercises

Objectives
Getting to know the Blast2GO interface and its basic functions
Annotation generation
Annotation modulation
Additional tools
GO graph visualisation
File formats
Exercises
Use Blast2GO to annotate and visualize GO functional information
Use Babelomics to visualize lists of GO terms

Remember: workflow of analysis (non-model organisms)
[Figure: Experiment; Data analysis; list of genes with unknown function (MNAT1, CTNNBL1, ENOX2, GTPBP1, RALY, TAGLN2, RAB3A, PPP2R5A, MAPRE1, ...); Functional annotation (protein folding, response to stimulus, zinc ion binding, ...); Functional profiling (FatiGO, FatiScan); Functional interpretation; visualisation]
[Figure: Function transfer. An uncharacterised sequence receives GO terms from sequence alignment through an annotation rule.]
www.blast2go.org
LAUNCH B2G

Java and Java Web Start
SUN's Java Runtime Environment (1.6)
Java Web Start, a technology to always stay up to date
Activate the Java Console for debugging
Create a desktop shortcut
Define the memory B2G can use

Input data (in FASTA format)
>my_favourite_species_seq1 | still unknown
gtgatggaaaagaaaagttttgttatcgtcgacgcatatgggtttctttttcgcgcgtattatgcgctgcctggattaagcacctcatacaattttcctgtaggag
gtgtatatggttttataaacatacttttgaaacatctctctttccacgatgcagattatttagttgtggtatttgattcggggtcgaaaaattttcgtcacactatgtatt
ccgaatacaaaactaatcgccctaaagcaccagaggatctgtcactacaatgtgctccgctacgtgaggctgttgaagcgtttaatattgtaagtgaaga
agtgcttaactacgaagcagacgacgtaatagctacactctgtacaaaatatgcatctagtaatgttggagtgagaatactgtcagcagataaggatttac
tacaactcctaaatgataatgttcaagtttacgaccctataaaaagcagatacctcaccaatgaatacgttttagaaaaatttggtgtttcatcagataagttg
catattgatacggttgcatcgagttataatgagaaaattattctcagctaagctgtacaccgtttattacacactcgaaaggccgttag
>my_favourite_species_seq2 | no clue
ttgttagctaaaaaggaagactttcacacctttggtaatggtgttggctctgctggaacaggtggagttgtagtttctgcatccatgttgtctgcggatttttcaaa
tcttagagaagagatagcagcggttagtacggctggtgcagattggttacacattgatgtgatggatgggtgcttcgtccccagtttgactatgggtcctgtg
gtgatttccggcattaggaaatgtacaaatatgtttcttgatgtgcatttgatgattaatcgcccaggcgatcatctgaagagtgtggtagatgctggagctgat
aagatagagcacattcgcaagatgatagaggaaagctcatcaaccgcgaaaatcgctgttgatggtggtgtttcaacggataatgcccgggctgttatcg
aggcaggtgcgaatatactcgttgttggaacggcgctgtttgctgctgacgatatgagtaaagttgtaagaactttaaaatcattttaa
>my_favourite_species_seq3 | just sequenced
gtgggactgctcatccctgtaggcagggtggctattttttgtgtaaaggcagtctttcatagtcttgtaccgccatactatctatggataactacaaagcagttttt
tgaggtgtggtttttctctcttcctatagtagcagttacatctttgtttacgggaggcgcgttagcccttcaggataccctcgtgggaagcgctaaagtatcagg
gtaatggagtttttactcctgcaagatgtaatagagggtctggtaaaagctgtatcgtttgggctggtaatttcgctagttgggtgttacaacgggtatcactgtg
agataggcgcaaggggtgtaggaacagcgacaacaaaaacttcggtagcagcttctatgctcataattttgttaaactatataattactgttttttacgcgta
>my_favourite_species_seq4 | we will see soon...
atgtacgctgtatctctttcaaatttgcatgtctctttcaacaacaaggaggttttgaaaggtgttgacttggacatagcatggggggattccctggttatactgg
gagaatctggtagtggaaagtctgtactaacaaaggttgtattgggtctaatagtgccccaagagggaagtgttactgtagatggcaccaatattcttgaga
ataggcagggcatcaagaattttagtgttttgtttcaaaactgtgcgttatttgacagtcttacgatttgggaaaatgtagtattcaatttccgtaggaggcttcgtt
tagataaggataatgccaaggctttggctttacggggattggagcttgtgggattggacgccagtgtaatgaacgtgtatcctgtggagctatcaggcggg
atgaaaaagcgcgtagctttggcaagagctattataggtagtcccaaaattctaattttggatgagccaacttcgggattggatcctataatgtcttcagtggt

A first check
Click on the green arrow to check that you can connect to the DB. A GO graph should appear.
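
Before loading a FASTA file like the input data shown earlier, it can help to check that every record has a header line and that the sequences contain only nucleotide letters. The following is a minimal sketch in Python, not part of Blast2GO; the function names `parse_fasta` and `non_nucleotide_chars` are hypothetical helpers introduced only for this illustration.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            # Sequence lines may be wrapped; join them and drop stray spaces.
            chunks.append(line.replace(" ", ""))
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def non_nucleotide_chars(sequence, alphabet="acgtn"):
    """Return the sorted characters of a sequence that are not in the alphabet."""
    return sorted(set(sequence.lower()) - set(alphabet))


example = ">my_favourite_species_seq1 | still unknown\ngtgatggaaa\nagaaaagttt\n"
for name, seq in parse_fasta(example):
    print(name, len(seq), non_nucleotide_chars(seq))
```

The alphabet `acgtn` is an assumption for nucleotide input; protein input would need a different alphabet.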

Database configuration
Open port 3306 (mysql) for outgoing connections at your institute
Configure/check personal firewalls
Actual settings can be found at www.blast2go.org
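
Whether outgoing connections on the MySQL port are blocked can be tested independently of Blast2GO. The sketch below only tries to open a TCP connection and says nothing about database credentials; the function name `can_reach` and the host name in the usage comment are placeholders, not real Blast2GO settings.

```python
import socket


def can_reach(host, port=3306, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Hypothetical usage; replace the host with the db-host from your settings:
# print(can_reach("db.example.org", 3306))
```

A result of `False` from inside the institute network together with `True` from elsewhere would point to a firewall rule rather than to the database server.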

Blast2GO Application
Table with all the sequence information
(1) Blast
(2) Mapping
(3) Annotation
Graph visualisation
Application messages
Blast results
Application statistics
Any operation will only affect the selected sequences!

Finding the homologues (BLAST)
Where to run Blast
Choose database
Number of blast hits
e-Value cut-off
Blast algorithm
Blast mode
HSP length cut-off
Parameter descriptions
Save your results apart

Choose another database at NCBI
Set in the blast2go.properties file

BLAST results
Sequences turn RED

Blast Distribution Charts
Evaluate the similarity of your sequences within public DBs

Single Sequence Menu

Mapping
Sequences turn GREEN

Resources for mapping
Full Gene Ontology DB
NCBI flat files
• gene2accession (4 079 414 entries)
• gene_info (1 635 614 entries)
PIR - Non-Redundant Reference Protein Database
• including PSD, UniProt, Swiss-Prot, TrEMBL, RefSeq, GenPept and PDB

Mapping Resources
[Figure labels: Blast hit IDs, gi-numbers; GO terms; EC; sim%]

Annotation Menu
Blast Annotation
Validation and Annex
Other Annotation modes

Annotation
The Hsp-Hit Coverage Cutoff sets a minimum percentage of the HIT sequence that must be spanned by the QUERY sequence. This helps to avoid the problem of cis-annotation.

Annotation Result
Sequences turn BLUE

Annotation Charts
Commonly, level 5 is the most abundant specificity level in the Gene Ontology

Additional Annotation: ANNEX
Recovers implicit Biological Process and Cellular Component GO terms from Molecular Function annotations
Ref: Myhre, Tveit, Mollestad, Lægreid: Additional Gene Ontology structure for improved biological reasoning. Bioinformatics, 2006

Additional Annotation: InterProScan
Run InterPro searches at EBI from Blast2GO
Results are stored on your computer as XML files. You can upload them later.
Once you have completed your InterPro annotation, results can be transformed to GO terms and merged with the Blast annotation

InterProScan Results
Column with InterProScan results
Batch InterPro motif searches allow GOs from domains to be merged into the annotation

Additional Annotation: GOSlim
GOSlim is a reduction of the Gene Ontology to a smaller vocabulary. It helps to summarize information.
Different GOSlims are available in Blast2GO
After GOSlim transformation sequences turn YELLOW

Enzyme annotation and Kegg Maps
GO, Enzyme Codes, KEGG maps
• Ordered Kegg map visualization
• Highlighting of involved sequences

Additional Annotation: Manual Curation
You can manually modify the annotation of particular sequences
If you click in this box, curated sequences turn purple

Export Results
Saves the complete B2G project (heavy)
Export annotation results in different formats
Select file type

Export formats

GeneSpring Format
C04018A02	glyoxalase i	GO:0004462	F:lactoylglutathione lyase activity
C04018C02	metallothionein-like protein	GO:0046872	F:metal ion binding
C04018G02	protein phosphatase	GO:0008287	C:protein serine/threonine phosphatase complex

By Seq
C04013E10	response to water deprivation; regulation of transcription; multicellular organismal development; response to abscisic acid stimulus; nucleus; transcription factor activity;
C04013A12	translation; ribosome; plastid; structural constituent of ribosome;
C04013C12	galactose metabolic process; plastid; aldose 1-epimerase activity; carbohydrate binding;

GoStat
C04018C10	4707,9409,6979,10200,5524,169
C04018A12	16998,272,44248
C04018C12	4869,12505,8233

.annot (also for import!)
C04018C10	GO:0004707	mitogen-activated protein kinase 3
C04018C10	EC:2.7.11.24
C04018A12	GO:0016998	class iv chitinase
C04018A12	GO:0000272

More export formats

Export BestHit Data
Sequence name	Sequence desc.	Sequence length	Hit desc.	Hit ACC	E-Value	Similarity	Score	Alignment length	Positives
C04018C10	mitogen-activated protein kinase 3	717	gi|122894104|gb|ABM67698.1|mitogen-activated protein kinase [Citrus sinensis]	ABM67698	1.35E-129	99	445.28	222	221
C04018E10	---NA---	706	gi|157353637|emb|CAO62459.1|unnamed protein product [Vitis vinifera]	CAO62459	2.69E-036	83	155.22	119	99
C04018G10	protein	620	gi|114153154|gb|ABI52743.1|10 kDa putative secreted protein [Argas monolakensis]	ABI52743	7.47E-015	63	83.57	90	57
C04018A12	class iv chitinase	715	gi|3608477|gb|AAC35981.1|chitinase CHI1 [Citrus sinensis]	AAC35981	1.45E-061	78	239.2	171	134
C04018C12	cysteine proteinase inhibitor	663	gi|8099682|gb|AAF72202.1|AF265551_1 cysteine protease inhibitor [Manihot esculenta]	AAF72202	9.33E-025	83	116.7	99	83
C04018E12	protein phosphatase 2c	663	gi|46277128|gb|AAS86762.1|protein phosphatase 2C [Lycopersicon esculentum]	AAS86762	2.76E-077	91	291.2	180	164
C04018G12	alpha beta fold family protein	578	gi|147865769|emb|CAN83251.1|hypothetical protein [Vitis vinifera] >gi|157339464|emb|CAO44005.1| unnamed protein product [Vitis vinifera]	CAN83251	1.67E-084	94	314.69	179	169
C04018A02	glyoxalase i	600	gi|2213425|emb|CAB09799.1|hypothetical protein [Citrus x paradisi]	CAB09799	2.16E-064	81	248.05	114	93
C04018C02	metallothionein-like protein	625	gi|3308980|dbj|BAA31561.1|metallothionein-like protein [Citrus unshiu]	BAA31561	2.23E-014	100	82.03	40	40

Export Sequence Table
Seq. Name	Seq. Description	Seq. Length	#Hits	min. eValue	mean Similarity	#GOs	GOs	Enzyme Codes	InterProScan
C04018C12	cysteine proteinase inhibitor	663	20	25	80.00%	3	F:GO:0004869; C:GO:0012505; F:GO:0008233		IPR000010; IPR018073; noIPR
C04018E12	protein phosphatase 2c	663	20	77	85.00%	2	N:GO:0015071; F:GO:0003824		IPR001932; IPR014045; IPR015655; noIPR
C04018G12	alpha beta fold family protein	578	20	84	79.00%	4	F:GO:0016787; C:GO:0005739; C:GO:0009507; P:GO:0006725		noIPR
C04018A02	glyoxalase i	600	20	64	74.00%	2	P:GO:0005975; F:GO:0004462	EC:4.4.1.5	IPR004360; noIPR
C04018C02	metallothionein-like protein	625	18	14	74.00%	1	F:GO:0046872		IPR000347
C04018E02	haemolysin-iii related family expressed	612	20	32	72.00%	1	C:GO:0016020		noIPR
C04018G02	protein phosphatase expressed	645	20	97	81.00%	5	C:GO:0008287; N:GO:0015071; P:GO:0006470; C:GO:0009536; C:GO:0005739		no IPS match
C04018C04	phosphoglycerate bisphosphoglycerate mutase family protein	780	20	63	66.00%	2	P:GO:0008152; F:GO:0003824		IPR001345; IPR013078; noIPR
C04018E04	polyubiquitin	707	20	115	99.00%	2	P:GO:0006464; C:GO:0005622		IPR000626; IPR019954; IPR019955; IPR019956; noIPR
C04018G04	meiotic recombination 11	575	20	45	89.00%	21	C:GO:0019013; P:GO:0007126; F:GO:0004519; F:GO:0005509; F:GO:0004871; C:GO:0005739; F:GO:0030145; P:GO:0006302; P:GO:0045449; F:GO:0008289; P:GO:0042157; F:GO:0003677; P:GO:0006869; C:GO:0030089; P:GO:0007165; F:GO:0004527; P:GO:0015979; C:GO:0005576; F:GO:0005198; C:GO:0005634; P:GO:0006118		IPR003701; IPR004843; noIPR
C04018A06	late embryogenesis-abundant protein	648	20	43	68.00%	2	P:GO:0009737; P:GO:0009409		no IPS match

Sequence Selection
Sequence selection tools to obtain a selection based on annotation status

Sequence Selection
By name/description
By function
By sequence name
Invert and delete

Other Tools
Permits to reduce your project size
Manipulation of sequence description
Merging .annot and .dat projects
Get more out of your memory
Check when there are connection problems
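
The .annot export listed among the export formats has one annotation per line: sequence name, GO or EC identifier, and an optional description. The sketch below groups such lines per sequence in Python. It assumes the columns are tab-separated, and `parse_annot` and `go_ids_per_sequence` are hypothetical helper names, not Blast2GO functions.

```python
def parse_annot(text):
    """Group .annot lines by sequence name.

    Returns a dict: sequence name -> list of (identifier, description).
    Assumes tab-separated columns; the description column may be absent.
    """
    annotations = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        seq_id, term = fields[0], fields[1]
        description = fields[2] if len(fields) > 2 else ""
        annotations.setdefault(seq_id, []).append((term, description))
    return annotations


def go_ids_per_sequence(annotations):
    """Keep only the GO identifiers (dropping EC codes) for each sequence."""
    return {
        seq_id: [term for term, _ in terms if term.startswith("GO:")]
        for seq_id, terms in annotations.items()
    }


annot_text = (
    "C04018C10\tGO:0004707\tmitogen-activated protein kinase 3\n"
    "C04018C10\tEC:2.7.11.24\n"
    "C04018A12\tGO:0016998\tclass iv chitinase\n"
    "C04018A12\tGO:0000272\n"
)
print(go_ids_per_sequence(parse_annot(annot_text)))
```

Grouping by sequence first makes it straightforward to write the per-sequence layouts, since those differ only in how the collected identifiers are joined.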

Visualisation
• GO graph visualization as a tool to explore and discover
• Interactive and zoomable graphs
• Coloured graphs highlighting areas of interest

Combined Graph
Each term has a number of sequences associated
Nodes can be coloured to indicate relevance
Each term is displayed within its biological context
Node shape differentiates between direct and indirect annotation

Let's paint the DAG of just 1000 sequences
Too many nodes!!!

Combined Graph
Need a way to find relevant information

Combined Graph
Different GO branches
Reduces nodes by number of annotated sequences
Criterion for highlighting and filtering nodes
Node data to be displayed

Filtered Graph
Direct annotations
Filtered transition nodes
# Filtered nodes

Molecular functions of 1000 sequences
[Figure panels: GO-Slim; unfiltered DAG; Filtered and Thinned]
• Dynamic reduction
• Remove tip-terms below a sequence-count cutoff
• Remove intermediate terms below a node-score cutoff
Static reduction to a subset of terms
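
The tip-term removal step of the dynamic reduction can be illustrated with a small sketch. This is not Blast2GO's implementation: it assumes the graph is given as a mapping from each term to its parents, uses a cutoff `min_seqs` chosen by the user, and ignores the node-score filter for intermediate terms. The name `prune_tips` is hypothetical.

```python
def prune_tips(parents, seq_counts, min_seqs):
    """Repeatedly drop tip terms annotated with fewer than min_seqs sequences.

    parents: dict mapping each term to the set of its parent terms.
    seq_counts: dict mapping each term to its number of sequences.
    Returns the set of terms that remain.
    """
    kept = set(parents)
    while True:
        # Terms that still have at least one surviving child are not tips.
        has_child = {p for term in kept for p in parents[term] if p in kept}
        weak_tips = [
            term for term in kept
            if term not in has_child and seq_counts.get(term, 0) < min_seqs
        ]
        if not weak_tips:
            return kept
        # Removing tips can expose new tips, so loop until nothing changes.
        kept.difference_update(weak_tips)


toy_parents = {"root": set(), "a": {"root"}, "b": {"root"}, "a1": {"a"}, "a2": {"a"}}
toy_counts = {"root": 100, "a": 12, "b": 3, "a1": 2, "a2": 11}
print(sorted(prune_tips(toy_parents, toy_counts, min_seqs=10)))
```

In the toy graph the sparsely annotated leaves `b` and `a1` disappear, while `a2` and its ancestors stay because `a2` meets the cutoff.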

Filtering and Pruning

Show node content

Graph Charts
• Sequence distribution/GO as multilevel pie (#score or #seq cutoff)
• Sequence distribution/GO as level pie (level selection)
[Figure: LEVEL 5, minimum 15 sequences]
GOSlim pie charts are handy to summarize functional content

Saving Options
Save as picture and as text

Colouring the DAG yourself
The byDesc option in Graph-Colouring allows you to colour the DAG nodes according to an additional value
GO:0005792	GO:0005792	1.0
GO:0006412	GO:0006412	0.28
GO:0003735	GO:0003735	0.81
GO:0016705	GO:0016705	0.76
GO:0005840	GO:0005840	0.11
GO:0005506	GO:0005506	0.93
GO:0006631	GO:0006631	0.38
GO:0020037	GO:0020037	0.34
The "special" .annot file
3 columns: GO name, GO ID, value
A scale between 0 and 1 is used to colour the graph; values from 1 to 9 are different colours to represent groups/categories

Graph Coloring
Annotation file
Sequence-Count
Node-Score
Categories/Descriptions
Enrichment Analysis
Continuous parameter

Visualisation Summary
DAGs are interesting for browsing functional annotation but can be too large
With filtering and pruning options you can create more navigable DAGs
Pies are good for compacting information; try out different levels
GOSlim compacts to more equivalent terms than filtering the GO
You can create your own colouring scheme
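
A colouring file of the kind described under "Colouring the DAG yourself" (three columns: GO name, GO ID, value) can be generated from a table of scores. The sketch below is an illustration only: it assumes tab-separated columns in the order listed on the slide, accepts any value from 0 to 9, and reuses the GO ID as the name when no name is given, as in the slide's sample rows. The function name `colour_rows` is hypothetical.

```python
import re

GO_ID = re.compile(r"^GO:\d{7}$")


def colour_rows(values, names=None):
    """Build 'GO name<TAB>GO ID<TAB>value' lines for a colouring file.

    values: dict mapping GO IDs to a number between 0 and 9.
    names: optional dict mapping GO IDs to term names.
    """
    names = names or {}
    rows = []
    for go_id, value in values.items():
        if not GO_ID.match(go_id):
            raise ValueError(f"not a GO identifier: {go_id!r}")
        if not 0 <= value <= 9:
            raise ValueError(f"value out of range for {go_id}: {value}")
        rows.append(f"{names.get(go_id, go_id)}\t{go_id}\t{value}")
    return rows


scores = {"GO:0005792": 1.0, "GO:0006412": 0.28, "GO:0003735": 0.81}
print("\n".join(colour_rows(scores)))
```

Validating the identifiers and the value range before writing the file catches malformed rows early.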

Several Examples/Exercises
Blast, map and annotate a few sequences in Blast2GO by loading the 10 test sequences (within the File menu). Generate some single-sequence GO graphs to review the annotation (right mouse click on the sequence table). (http://www.blast2go.org > Start > 1024MB)
1100 Citrus unigenes (nt) annotated with Blast2GO. Analyse the annotation results.
Import the gene annotations of a "potato" dataset into Blast2GO and try to export a handy graph as PDF.
Visualize some gene annotations (breast cancer genes) with the Babelomics GO-Graph-Viewer (using the ID-Converter Tools) (http://bioinfo.cipf.es/babelomicstutorial/gographviz)

http://bioinfo.cipf.es/blast2gocourse

... and don't forget to check the database settings under "tools": db-host mem20 and db-name b2g_may10 !!!