Somchai Saengamnatdej
April 25, 2010
1. Retrieving a genome/chromosome
SRS (http://srs.ebi.ac.uk)
● In the Quick Text Search window, select 'Protein' in the find box, and type the name of your protein in the
matching box. Then, click 'Search'.
● When the list shows up, go to the entry with the 'accession number' you want.
● Tick in the box at the start of the entry.
● In the 'Display Options' window, select 'UniprotView' in the 'view results using:' box.
● Then, click on "Apply Display Options" button.
● When the window with a list appears, double-click on the UniProtKB entry to open it.
● When the full entry shows up, scroll through the entry. (General information, description &
origin of the protein, published/unpublished references, comments on the function of the gene,
database cross references, keyword, sequence features, & sequence.)
● Click on the hyper-linked text to go to the database entries.
● Go back to the query list page.
● Now, again tick into the box at the start of the entry.
● In the 'Result Options' window, select 'FastA' in the 'Launch analysis tool' box.
● Click 'Save'.
● When the new window shows up, select 'FastaSeqs' in the 'save with' box.
● In the 'Output To' window, select 'Browser Window (HTML)'.
Accessing Molecular Data & Web Tools. 2
● Click 'Save'.
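Once the FASTA output has been saved, the sequence can be read back before pasting it into other tools. The following is a minimal sketch, assuming the standard FASTA layout (a header line starting with '>' followed by one or more sequence lines). The function name parse_fasta and the example record are hypothetical and only for illustration.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence lines are joined without gaps
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical record, for illustration only (not a real database entry).
example = """>example_protein hypothetical entry
MKTAYIAKQR
QISFVKSHFS
"""

for header, sequence in parse_fasta(example):
    print(header, len(sequence))
```

The joined sequence string (without the header line) is what most of the web forms expect in their input box, although many also accept the full FASTA record.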
3. Annotation of a gene.
PROSITE (http://www.expasy.ch/prosite/)
● Paste the protein sequence retrieved from a database in the box provided.
● Click on 'Scan'.
● In the results viewer, there is a list of PROSITE hits. Click on the individual hits to go to the
specific entries and read their descriptions.
● There is a high level of false positives because PROSITE motif patterns are generally small and
rarely cover complete domains.
● The more reliable methods (Pfam, SMART) use HMMs, searching against a library of
HMMs that describe hundreds of conserved domains.
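Why short patterns produce many hits can be seen by turning a PROSITE pattern into a regular expression and scanning a sequence with it. This is a minimal sketch, not the PROSITE scanner itself: it handles only the common pattern elements (x, [..], {..}, repeat counts, and the < and > terminus anchors), and the function names and the test sequence are hypothetical. The pattern used is the N-glycosylation site as it is commonly written, N-{P}-[ST]-{P}.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE pattern such as 'N-{P}-[ST]-{P}' into a Python regex.

    Simplified: anchors inside square brackets are not handled.
    """
    pattern = pattern.rstrip(".")
    parts = []
    for element in pattern.split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # {P}    -> [^P]
        element = element.replace("(", "{").replace(")", "}")    # x(2,4) -> x{2,4}
        element = element.replace("x", ".")                       # x      -> any residue
        element = element.replace("<", "^").replace(">", "$")     # terminus anchors
        parts.append(element)
    return "".join(parts)


def scan(sequence, pattern):
    """Return (1-based position, matched text) for every hit, overlaps included."""
    # The lookahead lets overlapping matches be reported as well.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


# Hypothetical 14-residue sequence, for illustration only.
sequence = "MNGSAANPTLNQTW"
for position, hit in scan(sequence, "N-{P}-[ST]-{P}"):
    print(position, hit)
```

A four-residue pattern like this one matches by chance in many unrelated proteins, which is why a hit needs to be checked against the entry description before it is trusted.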
Pfam (http://pfam.sanger.ac.uk/)
● Select 'SEQUENCE SEARCH'
● Paste your protein sequence in the box.
● Click 'Go'
● A 'progress' window appears.
● Then, the search results window shows up.
● There is a list of 'significant' & 'insignificant' matches and an interactive graphical output.
● Click on the link in the 'Family' column to go to the entry.
● On the Pfam entry page, click on the tabs at the top (Domain organization & Species
distribution).
SMART (http://www.embl-heidelberg.de/)
● Paste the protein sequence into the box.
● Select all the search options available.
● Click on 'Sequence SMART' to run.
● Output
● Schematic output.
● Description of the programs that are used to produce the schematic output.
● Interaction network.
● Other output including BLAST results.
InterPro (http://www.ebi.ac.uk/interpro/)
● A database of protein families, domains and functional sites.
● Identifiable features found in known proteins can be applied to unknown protein sequences.
● The icons at the bottom of the page are about the databases involved.
● Enter the InterProScan Sequence Search page by clicking on 'InterProScan' (in the left
column).
● The submission form appears.
● Paste the sequence of your protein in the box.
● Under 'Results', select 'interactive'.
● Check all the boxes under 'APPLICATIONS TO RUN'.
BLOCKS (http://www.blocks.fhcrc.org/)
TIGRfam (http://www.tigr.org/TIGRFAMs/)
PRINTS (http://www.bioinf.man.ac.uk/dbbrowser/PRINTS/)
ProDom (http://prodom.prabi.fr/prodom/current/html/home.php)
Transmembrane predictions
TMHMM (http://www.cbs.dtu.dk/services/TMHMM/)
● Open the TMHMM v2.0 server page from the URL above.
● Paste the protein sequence in the box.
● Select output format as 'Extensive, with graphics'
● Click on 'Submit'
● The results are in tabular output and graphics.
● How many transmembrane domains are there in the protein? Try 'TMPRED' at the URL below to
compare the result.
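To get a rough feel for what a transmembrane prediction looks for, a hydropathy profile can be computed by hand. This is a minimal sketch and is not the TMHMM method (TMHMM is based on a hidden Markov model); it uses a plain sliding-window average of Kyte-Doolittle hydropathy values instead. The window of 19 residues and the threshold of 1.6 are conventional choices taken here as assumptions, and the function names and test sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Return the mean hydropathy of every full window along the sequence.

    Non-standard residue codes (e.g. X or B) raise KeyError.
    """
    scores = [KD[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_segments(sequence, window=19, threshold=1.6):
    """Return 1-based (start, end) ranges covered by windows above the threshold."""
    segments = []
    for i, mean in enumerate(hydropathy_profile(sequence, window)):
        if mean <= threshold:
            continue
        start, end = i + 1, i + window
        if segments and start <= segments[-1][1] + 1:
            # Overlapping or adjacent windows are merged into one segment.
            segments[-1] = (segments[-1][0], end)
        else:
            segments.append((start, end))
    return segments


# Hypothetical sequence: a 20-leucine stretch flanked by polar residues.
sequence = "DEKRSNQDEK" + "L" * 20 + "DEKRSNQDEK"
print(candidate_segments(sequence))
```

Because whole windows are reported, the ranges are wider than the hydrophobic stretch itself. The count of separate segments is the number to compare with the TMHMM and TMPRED results.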
TMPRED (http://www.ch.embnet.org/software/TMPRED_form.html)
PHOBIUS (http://phobius.cgb.ki.se/)
SignalP (http://www.cbs.dtu.dk/services/SignalP/)
● Go to the SignalP 3.0 Server page.
● Paste the protein sequence into the box.
● Select your search options and output format.
● Click on the 'Submit' button.
● The prediction results are graphical, tabular, and SignalP-HMM outputs.
● Try 'PSORT' at the following URL to compare the results.
PSORT (http://psort.nibb.ac.jp/)
RNA annotation
Entrez (http://www.ncbi.nlm.nih.gov/Entrez/)
Blast searches (http://www.ncbi.nlm.nih.gov/BLAST)
Fasta searches (http://www.ebi.ac.uk/fasta33/)
References
See my previous documents.