
Hands on Blast2GO

An introduction to the
functionalities of Blast2GO
and some practical exercises
Objectives

Getting to know the Blast2GO interface and its basic functions

Annotation generation

Annotation modulation

Additional tools

GO graph visualisation

File formats

Exercises

Use Blast2GO to annotate and visualize GO functional information

Use Babelomics to visualize lists of GO terms


Remember: Work-flow of analysis
[Figure: workflow from a non-model experiment to functional interpretation. Labels: Experiment (non-model); Data Analysis; List of genes with unknown function (MNAT1, CTNNBL1, ENOX2, GTPBP1, RALY, TAGLN2, RAB3A, PPP2R5A, MAPRE1, ...); Functional Annotation (protein folding, response to stimulus, zinc ion binding, ...); Functional Profiling (FatiGO, FatiScan); Functional interpretation; visualisation]
Function transfer
[Figure labels: uncharacterised sequence; Sequence alignment; Annotation Rule; GO; Functional Annotation]
www.blast2go.org
LAUNCH B2G

Java and Java Web Start

SUN's Java Runtime Environment (1.6)

Java Web Start, a technology to stay always up to date

Activate the Java Console for debugging

Create a desktop shortcut

Define the memory B2G can use


Input data (in fasta format)
>my_favourite_species_seq1 | still unknown
gtgatggaaaagaaaagttttgttatcgtcgacgcatatgggtttctttttcgcgcgtattatgcgctgcctggattaagcacctcatacaattttcctgtaggag
gtgtatatggttttataaacatacttttgaaacatctctctttccacgatgcagattatttagttgtggtatttgattcggggtcgaaaaattttcgtcacactatgtatt
ccgaatacaaaactaatcgccctaaagcaccagaggatctgtcactacaatgtgctccgctacgtgaggctgttgaagcgtttaatattgtaagtgaaga
agtgcttaactacgaagcagacgacgtaatagctacactctgtacaaaatatgcatctagtaatgttggagtgagaatactgtcagcagataaggatttac
tacaactcctaaatgataatgttcaagtttacgaccctataaaaagcagatacctcaccaatgaatacgttttagaaaaatttggtgtttcatcagataagttg
catattgatacggttgcatcgagttataatgagaaaattattctcagctaagctgtacaccgtttattacacactcgaaaggccgttag
>my_favourite_species_seq2 | no clue
ttgttagctaaaaaggaagactttcacacctttggtaatggtgttggctctgctggaacaggtggagttgtagtttctgcatccatgttgtctgcggatttttcaaa
tcttagagaagagatagcagcggttagtacggctggtgcagattggttacacattgatgtgatggatgggtgcttcgtccccagtttgactatgggtcctgtg
gtgatttccggcattaggaaatgtacaaatatgtttcttgatgtgcatttgatgattaatcgcccaggcgatcatctgaagagtgtggtagatgctggagctgat
aagatagagcacattcgcaagatgatagaggaaagctcatcaaccgcgaaaatcgctgttgatggtggtgtttcaacggataatgcccgggctgttatcg
aggcaggtgcgaatatactcgttgttggaacggcgctgtttgctgctgacgatatgagtaaagttgtaagaactttaaaatcattttaa
>my_favourite_species_seq3 | just sequenced
gtgggactgctcatccctgtaggcagggtggctattttttgtgtaaaggcagtctttcatagtcttgtaccgccatactatctatggataactacaaagcagttttt
tgaggtgtggtttttctctcttcctatagtagcagttacatctttgtttacgggaggcgcgttagcccttcaggataccctcgtgggaagcgctaaagtatcagg
gtaatggagtttttactcctgcaagatgtaatagagggtctggtaaaagctgtatcgtttgggctggtaatttcgctagttgggtgttacaacgggtatcactgtg
agataggcgcaaggggtgtaggaacagcgacaacaaaaacttcggtagcagcttctatgctcataattttgttaaactatataattactgttttttacgcgta
>my_favourite_species_seq4 | we will see soon...
atgtacgctgtatctctttcaaatttgcatgtctctttcaacaacaaggaggttttgaaaggtgttgacttggacatagcatggggggattccctggttatactgg
gagaatctggtagtggaaagtctgtactaacaaaggttgtattgggtctaatagtgccccaagagggaagtgttactgtagatggcaccaatattcttgaga
ataggcagggcatcaagaattttagtgttttgtttcaaaactgtgcgttatttgacagtcttacgatttgggaaaatgtagtattcaatttccgtaggaggcttcgtt
tagataaggataatgccaaggctttggctttacggggattggagcttgtgggattggacgccagtgtaatgaacgtgtatcctgtggagctatcaggcggg
atgaaaaagcgcgtagctttggcaagagctattataggtagtcccaaaattctaattttggatgagccaacttcgggattggatcctataatgtcttcagtggt
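Before loading a file into Blast2GO it can help to check that it parses as FASTA. The following is a minimal sketch, assuming the header style shown above (an identifier, then an optional free-text description after a `|`); the function names `parse_fasta` and `is_nucleotide` are illustrative and not part of Blast2GO.

```python
from typing import Dict, Iterable, Tuple


def parse_fasta(lines: Iterable[str]) -> Dict[str, Tuple[str, str]]:
    """Return {sequence_id: (description, sequence)} from FASTA lines."""
    records: Dict[str, Tuple[str, str]] = {}
    seq_id = None
    description = ""
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if seq_id is not None:
                records[seq_id] = (description, "".join(chunks))
            # Assumption: "id | free text"; split at the first pipe only.
            seq_id, _, description = line[1:].partition("|")
            seq_id, description = seq_id.strip(), description.strip()
            chunks = []
        elif seq_id is not None:
            chunks.append(line)
    if seq_id is not None:
        records[seq_id] = (description, "".join(chunks))
    return records


def is_nucleotide(sequence: str) -> bool:
    """True if the sequence only uses the letters a, c, g, t, u or n."""
    return bool(sequence) and set(sequence.lower()) <= set("acgtun")


example = """>my_favourite_species_seq1 | still unknown
gtgatggaaaag
aaaagttttgtt
>my_favourite_species_seq2 | no clue
ttgttagctaaa
""".splitlines()

records = parse_fasta(example)
for name, (desc, seq) in records.items():
    print(name, desc, len(seq), is_nucleotide(seq))
```

Headers that contain several pipes (for example NCBI-style `gi|...|gb|...` identifiers) would need a different splitting rule than the one assumed here.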
A first check
Click on the green arrow to check that you can connect to the DB. A GO graph should appear.

Database configuration

Open port 3306 (mysql) for outgoing connections at your institute

Configure/check personal firewalls

Actual settings can be found at www.blast2go.org
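Whether outgoing connections on port 3306 are allowed can be tested outside Blast2GO with a plain TCP connection attempt. This is only a sketch: the helper name `can_connect` is made up for illustration, and the database host has to be taken from the settings published on the Blast2GO site (the host name in the comment is a placeholder, not a real server).

```python
import socket


def can_connect(host: str, port: int = 3306, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


# Usage (placeholder host name, replace with the real database host):
#   print(can_connect("db.example.org", 3306))
```

A `False` result only says that the TCP connection failed; it does not distinguish between an institute firewall, a personal firewall and a wrong host name.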


Blast2GO Application
Table with all the sequence information
(1) Blast
(2) Mapping
(3) Annotation
Graph visualisation
Application messages
Blast results
Application statistics
Any operation will only affect the selected sequences!
Finding the homologues (BLAST)
Where to run Blast
Choose Database
Number of blast hits
e-Value cut-off
Blast algorithm
Blast mode
HSP length cut-off
Parameter descriptions
Save your results separately
Choose another database at NCBI
Set in the blast2go.properties file
BLAST results
RED
Blast Distribution Charts
Evaluate the similarity of your sequences within public DBs
Single Sequence Menu
Mapping
GREEN
Full Gene Ontology DB
NCBI Flat Files
• gene2accession (4 079 414 entries)
• gene_info (1 635 614 entries)
PIR - Non-Redundant Reference Protein Database
• including PSD, UniProt, Swiss-Prot, TrEMBL, RefSeq, GenPept and PDB
Resources for mapping
[Figure labels: Blast Hits; Ids, gi-numbers; sim%; Mapping Resources; GO-terms; EC]
Annotation Menu
Blast Annotation
Validation and Annex

Other Annotation modes
Annotation
The HSP-Hit Coverage Cutoff sets the minimum percentage of the hit sequence that must be spanned by the query sequence.
This helps to avoid the problem of cis-annotation.
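The idea behind the coverage cutoff can be sketched as a simple percentage: the part of the hit covered by the aligned region, divided by the full hit length. The function names and the 1-based inclusive coordinates below are assumptions for illustration; how Blast2GO computes the value internally is not stated on the slide.

```python
def hsp_hit_coverage(hit_start: int, hit_end: int, hit_length: int) -> float:
    """Percentage of the hit sequence covered by one HSP.

    Coordinates are assumed to be 1-based and inclusive, as in BLAST reports.
    """
    if hit_length <= 0:
        raise ValueError("hit_length must be positive")
    aligned = abs(hit_end - hit_start) + 1
    return 100.0 * aligned / hit_length


def passes_cutoff(hit_start: int, hit_end: int, hit_length: int,
                  cutoff_percent: float) -> bool:
    """True if the HSP covers at least cutoff_percent of the hit."""
    return hsp_hit_coverage(hit_start, hit_end, hit_length) >= cutoff_percent


# An HSP spanning positions 51..150 of a 400-residue hit covers 25 % of it.
print(hsp_hit_coverage(51, 150, 400))
print(passes_cutoff(51, 150, 400, cutoff_percent=33.0))
```

With a cutoff of 33 %, this hit would not be used for annotation, because the query matches only a quarter of it and the hit's function may reside in the unmatched part.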
Annotation Result
BLUE
Annotation Charts
Commonly, level 5 is the most abundant specificity level in the Gene Ontology
Additional Annotation: ANNEX
Recovers implicit Biological Process and Cellular Component GO terms from Molecular Function annotations
Ref: Myhre, Tveit, Mollestad, Lægreid: Additional Gene Ontology structure for improved biological reasoning. Bioinformatics, 2006
Additional Annotation: InterProScan
Results are stored on your computer as XML files. You can upload them later
Once you have completed your InterPro annotation, results can be transformed to GO terms and merged with the Blast annotation
Run InterPro searches at EBI from Blast2GO
InterProScan Results
Column with InterProScan results
Batch InterPro motif searches allow GOs from domains to be merged into the annotation
Additional Annotation: GOSlim
GOSlim is a reduction of the Gene Ontology to a more reduced vocabulary. Helps to summarize information
Different GOSlims available at Blast2GO
After GOSlim transformation sequences get YELLOW
Enzyme annotation and Kegg Maps
GO -> Enzyme Codes -> KEGG maps
• Ordered Kegg Map visualization
• Highlighting of involved sequences
Additional Annotation: Manual Curation
You can manually modify the annotation of particular sequences
If you click in this box, curated sequences get purple
Export Results
Saves the complete B2G project (heavy)
Export annotation results in different formats
Select file type
Export formats
GeneSpring Format
C04018A02	glyoxalase i	GO:0004462	F:lactoylglutathione lyase activity
C04018C02	metallothionein-like protein	GO:0046872	F:metal ion binding
C04018G02	protein phosphatase	GO:0008287	C:protein serine/threonine phosphatase complex
By Seq
C04013E10	response to water deprivation; regulation of transcription; multicellular organismal development; response to abscisic acid stimulus; nucleus; transcription factor activity;
C04013A12	translation; ribosome; plastid; structural constituent of ribosome;
C04013C12	galactose metabolic process; plastid; aldose 1-epimerase activity; carbohydrate binding;
GoStat
C04018C10	4707,9409,6979,10200,5524,169
C04018A12	16798,272,44248
C04018C12	4869,12505,8233
.annot
C04018C10	GO:0004707	mitogen-activated protein kinase 3
C04018C10	EC:2.7.11.24
C04018A12	GO:0016798	class iv chitinase
C04018A12	GO:0000272
Also for import!
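The relation between the .annot and GoStat layouts shown on this slide can be illustrated with a small conversion. This is a sketch under assumptions: fields are taken to be tab-separated, EC lines are skipped, and GO IDs are written as plain numbers without leading zeros, as in the GoStat rows of the slide. The function name `annot_to_gostat` is hypothetical; Blast2GO provides these exports through its own menu.

```python
from typing import Dict, Iterable, List


def annot_to_gostat(annot_lines: Iterable[str]) -> List[str]:
    """Collapse .annot rows (seq, GO ID, optional description) to one row per sequence."""
    terms_by_seq: Dict[str, List[str]] = {}
    for raw in annot_lines:
        fields = raw.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue
        seq_id, term = fields[0], fields[1]
        if not term.startswith("GO:"):
            continue  # skip enzyme codes such as EC:2.7.11.24
        number = str(int(term[3:]))  # "GO:0004707" -> "4707"
        ids = terms_by_seq.setdefault(seq_id, [])
        if number not in ids:
            ids.append(number)
    return [f"{seq}\t{','.join(ids)}" for seq, ids in terms_by_seq.items()]


example = [
    "C04018C10\tGO:0004707\tmitogen-activated protein kinase 3",
    "C04018C10\tEC:2.7.11.24",
    "C04018A12\tGO:0016798\tclass iv chitinase",
    "C04018A12\tGO:0000272",
]
for row in annot_to_gostat(example):
    print(row)
```

The output rows contain fewer GO numbers than the GoStat rows on the slide because only the four .annot lines shown there are used as input.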
More export formats
Export BestHit Data
Sequence name	Sequence desc.	Sequence length	Hit desc.	Hit ACC	E-Value	Similarity	Score	Alignment length	Positives
C04018C10	mitogen-activated protein kinase 3	717	gi|122894104|gb|ABM67698.1|mitogen-activated protein kinase [Citrus sinensis]	ABM67698	1.35E-123	99	445.28	222	221
C04018E10	---NA---	706	gi|157356307|emb|CAO62459.1|unnamed protein product [Vitis vinifera]	CAO62459	2.69E-036	83	155.22	119	99
C04018G10	protein	620	gi|114153154|gb|ABI52743.1|10 kDa putative secreted protein [Argas monolakensis]	ABI52743	7.47E-015	63	83.57	90	57
C04018A12	class iv chitinase	715	gi|3608477|gb|AAC35981.1|chitinase CHI1 [Citrus sinensis]	AAC35981	1.45E-061	78	239.2	171	134
C04018C12	cysteine proteinase inhibitor	663	gi|8099682|gb|AAF72202.1|AF265551_1 cysteine protease inhibitor [Manihot esculenta]	AAF72202	9.33E-025	83	116.7	99	83
C04018E12	protein phosphatase 2c	663	gi|46277128|gb|AAS86762.1|protein phosphatase 2C [Lycopersicon esculentum]	AAS86762	2.76E-077	91	291.2	180	164
C04018G12	alpha beta fold family protein	578	gi|147865769|emb|CAN83251.1|hypothetical protein [Vitis vinifera] >gi|157339464|emb|CAO44005.1| unnamed protein product [Vitis vinifera]	CAN83251	1.67E-084	94	314.69	179	169
C04018A02	glyoxalase i	600	gi|2213425|emb|CAB09799.1|hypothetical protein [Citrus x paradisi]	CAB09799	2.16E-064	81	248.05	114	93
C04018C02	metallothionein-like protein	625	gi|3308980|dbj|BAA31561.1|metallothionein-like protein [Citrus unshiu]	BAA31561	2.23E-014	100	82.03	40	40
Export Sequence Table
Seq. Name	Seq. Description	Seq. Length	#Hits	min. eValue	mean Similarity	#GOs	GOs	Enzyme Codes	InterProScan
C04018C12	cysteine proteinase inhibitor	663	20	25	80.00%	3	F:GO:0004869; C:GO:0012505; F:GO:0008233		IPR000010; IPR018073; noIPR
C04018E12	protein phosphatase 2c	663	20	77	85.00%	2	N:GO:0015071; F:GO:0003824		IPR001932; IPR014045; IPR015655; noIPR
C04018G12	alpha beta fold family protein	578	20	84	79.00%	4	F:GO:0016787; C:GO:0005739; C:GO:0009507; P:GO:0006725		noIPR
C04018A02	glyoxalase i	600	20	64	74.00%	2	P:GO:0005975; F:GO:0004462	EC:4.4.1.5	IPR004360; noIPR
C04018C02	metallothionein-like protein	625	18	14	74.00%	1	F:GO:0046872		IPR000347
C04018E02	haemolysin-iii related family expressed	612	20	32	72.00%	1	C:GO:0016020		noIPR
C04018G02	protein phosphatase expressed	645	20	97	81.00%	5	C:GO:0008287; N:GO:0015071; P:GO:0006470; C:GO:0009536; C:GO:0005739		no IPS match
C04018C04	phosphoglycerate bisphosphoglycerate mutase family protein	780	20	63	66.00%	2	P:GO:0008152; F:GO:0003824		IPR001345; IPR013078; noIPR
C04018E04	polyubiquitin	707	20	115	99.00%	2	P:GO:0006464; C:GO:0005622		IPR000626; IPR019954; IPR019955; IPR019956; noIPR
C04018G04	meiotic recombination 11	575	20	45	89.00%	21	C:GO:0019013; P:GO:0007126; F:GO:0004519; F:GO:0005509; F:GO:0004871; C:GO:0005739; F:GO:0030145; P:GO:0006302; P:GO:0045449; F:GO:0008289; P:GO:0042157; F:GO:0003677; P:GO:0006869; C:GO:0030089; P:GO:0007165; F:GO:0004527; P:GO:0015979; C:GO:0005576; F:GO:0005198; C:GO:0005634; P:GO:0006118		IPR003701; IPR004843; noIPR
C04018A06	late embryogenesis-abundant protein	648	20	43	68.00%	2	P:GO:0009737; P:GO:0009409		no IPS match
Sequence Selection
Sequence Selection Tools to obtain a selection based on annotation status
Sequence Selection
By Name/Description
by function
by sequence name
Invert and delete
Other Tools
Permits reducing your project size
Manipulation of Sequence Description
Merging .annot and .dat projects
Get more out of your memory
Check when there are connection problems

Visualisation
• GO Graph Visualization as tool to explore and discover
• Interactive and zoomable graphs
• Colored graphs highlighting areas of interest
Combined Graph
Each term has a number of sequences associated
Nodes can be coloured to indicate relevance
Each term is displayed around its biological context
Node shape to differentiate between direct and indirect annotation
Let's paint the DAG of just 1000 sequences
Too many nodes!!!
Combined Graph
Need a way to find relevant information
Combined Graph
Different GO branches
Reduces nodes by number of annotated sequences
Criterion for highlighting and filtering nodes
Node data to be displayed
Filtered Graph
Direct annotations
Filtered transition nodes
# Filtered Nodes

Molecular functions of 1000 sequences

GO-Slim
Unfiltered DAG
Filtered and Thinned
• Dynamic reduction
• Remove tip-terms < 10 sequences
• Remove intermediate terms with node-score < 12

Static reduction to a subset of terms
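The tip-term rule above can be sketched as repeated removal of leaf nodes that have too few sequences. This is an illustration under assumptions, not Blast2GO's implementation: the function name `prune_tip_terms` is hypothetical, the graph is given as a parent-to-children mapping, the counts are taken to include sequences annotated to descendant terms, and the node-score filter for intermediate terms is not covered.

```python
from typing import Dict, Iterable, Set


def prune_tip_terms(children: Dict[str, Iterable[str]],
                    seq_counts: Dict[str, int],
                    min_sequences: int = 10) -> Dict[str, Set[str]]:
    """Repeatedly drop leaf terms annotated with fewer than min_sequences sequences."""
    kept = {term: set(kids) for term, kids in children.items()}
    # Make sure every child also appears as a node of its own.
    for kids in list(kept.values()):
        for kid in kids:
            kept.setdefault(kid, set())
    changed = True
    while changed:
        changed = False
        for term in list(kept):
            is_tip = not kept[term]
            if is_tip and seq_counts.get(term, 0) < min_sequences:
                del kept[term]
                for kids in kept.values():
                    kids.discard(term)
                changed = True
    return kept


# Toy graph with made-up counts (not taken from the slides).
children = {
    "molecular_function": ["binding", "catalytic activity"],
    "binding": ["ion binding", "lipid binding"],
    "catalytic activity": ["hydrolase activity"],
}
seq_counts = {
    "molecular_function": 1000, "binding": 400, "catalytic activity": 8,
    "ion binding": 120, "lipid binding": 4, "hydrolase activity": 6,
}
pruned = prune_tip_terms(children, seq_counts, min_sequences=10)
print(sorted(pruned))
```

In the toy graph, "catalytic activity" only becomes a tip after "hydrolase activity" is removed, which is why the loop runs until nothing changes.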

Filtering and Pruning
Show node content
Graph Charts
• Sequence Distribution/GO as Multilevel-Pie (#score or #seq cutoff)
• Sequence Distribution/GO as Level-Pie (level selection)
LEVEL 5
Minimum 15 sequences
GOSLIM pie charts are handy to summarize functional content
Saving Options
Save as picture and as text
Colouring the DAG yourself
The byDesc option in the Graph-Colouring allows you to colour the DAG nodes according to an additional value
GO:0005792 GO:0005792 1.0
GO:0006412 GO:0006412 0.28
GO:0003735 GO:0003735 0.81
GO:0016705 GO:0016705 0.76
GO:0005840 GO:0005840 0.11
GO:0005506 GO:0005506 0.93
GO:0006631 GO:0006631 0.38
GO:0020037 GO:0020037 0.34
The "special" .annot file
3 columns
GO name
GO ID
value
A scale between 0 and 1 is used to colour the graph.
Values from 1 to 9 are different colors to represent groups/categories
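A colouring file of this kind can be generated from any per-term value by rescaling to the 0 to 1 range. The sketch below is an assumption-laden illustration: it uses min-max scaling, tab-separated columns, and repeats the GO ID in the first two columns as in the example rows of the slide. The function name `colouring_lines` and the input values are made up.

```python
from typing import Dict, List


def colouring_lines(values_by_go: Dict[str, float]) -> List[str]:
    """Min-max scale values to [0, 1] and format one line per GO term."""
    if not values_by_go:
        return []
    low = min(values_by_go.values())
    high = max(values_by_go.values())
    span = high - low
    lines = []
    for go_id, value in values_by_go.items():
        # If all values are equal, give every term the full colour value.
        scaled = 1.0 if span == 0 else (value - low) / span
        lines.append(f"{go_id}\t{go_id}\t{scaled:.2f}")
    return lines


# Hypothetical input values, e.g. some score per GO term.
scores = {"GO:0005792": 5.0, "GO:0006412": 1.0, "GO:0003735": 3.0}
for line in colouring_lines(scores):
    print(line)
```

For category colouring with the integer values 1 to 9 mentioned on the slide, no rescaling would be needed; the group number would be written directly in the third column.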
Graph Coloring
[Figure labels: Annotation file; Sequence-Count; Node-Score; Categories/Descriptions; Enrichment Analysis; Continuous parameter]
Visualisation Summary

DAGs are interesting for browsing functional annotation but can be too large

With filtering and pruning options you can create more navigable DAGs

Pies are good for compacting information; try out different levels

GOSlim compacts to more equivalent terms than filtering the GO

You can create your own colouring scheme

Several Examples/Exercises

Blast, map and annotate a few sequences in Blast2GO by loading the 10 test sequences (within the file menu). Generate some single-sequence GO graphs to review the annotation (right mouse click on the sequence table).
(http://www.blast2go.org > Start > 1024MB)

1100 Citrus unigenes (nt) annotated with Blast2GO. Analyse the annotation results.

Import the gene annotations of a "potato" dataset into Blast2GO and try to export a handy graph as PDF.

Visualize some gene annotations (breast cancer genes) with the Babelomics GO-Graph-Viewer (using the ID-Converter Tools)
(http://bioinfo.cipf.es/babelomicstutorial/gographviz)
http://bioinfo.cipf.es/blast2gocourse
... and don't forget to check the database settings under "tools": db-host mem20 and db-name b2g_may10 !!!
