
Name: Bakhtawar
Roll no: 30
Subject: Bioinformatics
Submitted to: Prof. Ali Shah
CONTENTS:
 BLAST
 EXPLANATION
 HOW TO SEARCH
BLAST: Basic Local Alignment Search Tool

The Basic Local Alignment Search Tool (BLAST) finds regions of similarity between
sequences. The program compares nucleotide or protein sequences and calculates the
statistical significance of matches. BLAST can be used to infer functional and
evolutionary relationships between sequences as well as help identify members of gene
families.
There are several types of BLAST searches. NCBI's WebBLAST offers four main
search types:

 BLASTn (Nucleotide BLAST): compares one or more nucleotide
query sequences to a subject nucleotide sequence or a database of
nucleotide sequences. This is useful when trying to determine the
evolutionary relationships among different organisms
(see Comparing two or more sequences below).
 BLASTx (translated nucleotide sequence searched against protein
sequences): compares a nucleotide query sequence that is translated
in six reading frames (resulting in six protein sequences) against a
database of protein sequences. Because blastx translates the query
sequence in all six reading frames and provides combined
significance statistics for hits to different frames, it is particularly
useful when the reading frame of the query sequence is unknown or it
contains errors that may lead to frame shifts or other coding errors.
Thus blastx is often the first analysis performed with a newly
determined nucleotide sequence.
 tBLASTn (protein sequence searched against translated nucleotide
sequences): compares a protein query sequence against the six-frame
translations of a database of nucleotide sequences. Tblastn is
useful for finding homologous protein coding regions in unannotated
nucleotide sequences such as expressed sequence tags (ESTs) and
draft genome records (HTG), located in the BLAST databases est
and htgs, respectively. ESTs are short, single-read cDNA sequences.
They comprise the largest pool of sequence data for many organisms
and contain portions of transcripts from many uncharacterized genes.
Since ESTs have no annotated coding sequences, there are no
corresponding protein translations in the BLAST protein databases.
Hence a tblastn search is the only way to search for these potential
coding regions at the protein level. The HTG sequences, draft
sequences from various genome projects or large genomic clones,
are another large source of unannotated coding regions.
 BLASTp (Protein BLAST): compares one or more protein query
sequences to a subject protein sequence or a database of protein
sequences. This is useful when trying to identify a protein (see From
sequence to protein and gene below).
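
The six-frame translation that blastx applies to a query (and tblastn applies to the database) can be illustrated with a short Python sketch. It assumes the standard genetic code and plain A/C/G/T input; the function names are made up for this example and are not part of BLAST itself.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons (e.g. containing N) become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> dict[str, str]:
    """Translate three frames on the forward strand and three on the reverse."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translations("ATGGCC")
```

Each of the six protein strings would then be compared against the protein database, which is why an unknown reading frame is not a problem for blastx.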

There are also standalone and API BLAST options, as well as pre-populated specialized
searches, available on the BLAST homepage (https://blast.ncbi.nlm.nih.gov/Blast.cgi).
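
As a rough idea of what the API option looks like, the sketch below only builds the requests and does not send anything. The parameter names (CMD, PROGRAM, DATABASE, QUERY, RID, FORMAT_TYPE) follow my understanding of NCBI's URL API and should be checked against the official documentation; the helper names and the request ID used in the example are placeholders.

```python
from urllib.parse import urlencode

BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_submit_params(query: str, program: str = "blastp", database: str = "nr") -> str:
    """Encode the form fields for submitting a search (assumed URL API fields)."""
    return urlencode(
        {"CMD": "Put", "PROGRAM": program, "DATABASE": database, "QUERY": query}
    )


def build_fetch_url(rid: str, format_type: str = "XML") -> str:
    """Build the URL for fetching results by request ID (RID)."""
    params = urlencode({"CMD": "Get", "RID": rid, "FORMAT_TYPE": format_type})
    return f"{BLAST_URL}?{params}"


# "ABC123" is a made-up request ID; a real one is returned by the submit step.
submit_body = build_submit_params("MSKRKAPQET")
fetch_url = build_fetch_url("ABC123")
```

In practice the submit step returns a request ID, and the results are fetched with that ID once the search has finished.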

From sequence to protein and gene


Object: Starting with a sequence, identify the protein or gene and the source.
Example: From the following sequence (available at http://tinyurl.com/blastp-sequence, or copy
the sequence below), identify the most probable protein and organism:
MSKRKAPQET LNGGITDMLT ELANFEKNVS QAIHKYNAYR KAASVIAKYP HKIKSGAEAK
KLPGVGTKIA EKIDEFLATG KLRKLEKIRQ DDTSSSINFL TRVSGIGPSA ARKFVDEGIK
TLEDLRKNED KLNHHQRIGL KYFGDFEKRI PREEMLQMQD IVLNEVKKVD SEYIATVCGS
FRRGAESSGD MDVLLTHPSF TSESTKQPKL LHQVVEQLQK VHFITDTLSK GETKFMGVCQ
LPSKNDEKEY PHRRIDIRLI PKDQYYCGVL YFTGSDIFNK NMRAHALEKG FTINEYTIRP
LGVTGVAGEP LPVDSEKDIF DYIQWKYREP KDRSE
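
The sequence above is printed in blocks of ten residues. Before pasting it into a tool or saving it to a file, it can help to strip the spaces and write it in FASTA format. The following is a minimal Python sketch; the function name and the record name "query" are chosen only for illustration.

```python
def to_fasta(name: str, raw_sequence: str, width: int = 60) -> str:
    """Remove whitespace from a sequence and wrap it as a FASTA record."""
    residues = "".join(raw_sequence.split()).upper()
    lines = [residues[i:i + width] for i in range(0, len(residues), width)]
    return "\n".join([f">{name}"] + lines)


# First two blocks of the example sequence, used here only as a short demo.
record = to_fasta("query", "MSKRKAPQET LNGGITDMLT")
print(record)
```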

Querying a sequence
Protein and gene sequence comparisons are done with BLAST (Basic Local Alignment
Search Tool).
To access BLAST, go to Resources > Sequence Analysis > BLAST:
This is an unknown protein sequence that we are seeking to identify by comparing it to
known protein sequences, and so Protein BLAST should be selected from the BLAST
menu:

Key for default display:

 Max[imum] Score: the highest alignment score calculated from the
sum of the rewards for matched nucleotides or amino acids and penalties for
mismatches and gaps.
 Total Score: the sum of alignment scores of all segments from the
same subject sequence.
 Query Cover[age]: the percent of the query length that is included in
the aligned segments.
 E[xpect] Value: the number of alignments expected by chance with
the calculated score or better. The expect value is the default sorting
metric; for significant alignments the E value should be very close to
zero.
 Ident[ity]: the highest percent identity for a set of aligned segments to
the same subject sequence.
 Acc[ession] Len[gth]: the number of nucleotides or amino acids in the
result sequence identified by the accession number
 Accession [number]: a unique identifier assigned to records in the
NCBI databases
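
To make the score, identity, query coverage and E value columns more concrete, here is a simplified Python sketch. It is not BLAST's actual implementation: the reward and penalty values are assumptions, every gap position is charged the same amount (BLAST uses separate gap-open and gap-extend costs, and substitution matrices for proteins), and the E value uses the Karlin-Altschul formula E = K * m * n * exp(-lambda * S), where K and lambda depend on the scoring system and must be supplied by the caller.

```python
import math


def alignment_stats(aligned_query: str, aligned_subject: str, query_length: int,
                    match: int = 2, mismatch: int = -3, gap: int = -5) -> dict:
    """Score one gapped alignment ('-' marks a gap) and report identity and coverage."""
    if not aligned_query or len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must be non-empty and of equal length")
    score = 0
    identical = 0
    for q, s in zip(aligned_query.upper(), aligned_subject.upper()):
        if q == "-" or s == "-":
            score += gap          # simplified: flat cost per gap position
        elif q == s:
            score += match
            identical += 1
        else:
            score += mismatch
    aligned_query_residues = sum(1 for q in aligned_query if q != "-")
    return {
        "score": score,
        "identity_pct": 100.0 * identical / len(aligned_query),
        "query_cover_pct": 100.0 * aligned_query_residues / query_length,
    }


def expect_value(raw_score: float, query_len: int, db_len: int,
                 lam: float, k: float) -> float:
    """Karlin-Altschul expect value: E = K * m * n * exp(-lambda * S)."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


stats = alignment_stats("ACGTTGCA", "ACGTAGCA", query_length=10)
```

With these assumed values, the example alignment has 7 matches and 1 mismatch over 8 columns of a 10-base query. The formula also shows why good hits have E values close to zero: the E value falls exponentially as the score rises.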

Clicking on a protein name displays the pairwise sequence alignment and links to
additional information about the protein and its associated gene (if available).
Saving your results

To save your search queries and settings, click on the Save Search link, then log in to My NCBI
using the Sign in or Register link at the upper right. Once you do this, your search strategies
should appear in the Saved Search Strategies tab.

Comparing two or more sequences


Object: Starting with two or more sequences, compare them and find the differences.
Example: In the NCBI database Nucleotide, enter the following search:
human[organism] AND mitochondrion[title]
This will search for nucleic acid sequences from humans with the word "mitochondrion"
in the title. Mitochondrial DNA is often used in evolutionary comparisons because it is
inherited only through the maternal lineage and, without recombination, changes only by mutation.
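
The same search can also be expressed as a URL for NCBI's E-utilities instead of typing it into the web page. The Python sketch below only builds the URL and does not send a request; the function name and the retmax default are choices made for this example.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_nucleotide_search_url(term: str, retmax: int = 20) -> str:
    """Build an ESearch URL for the Nucleotide database (db name 'nuccore')."""
    params = {"db": "nuccore", "term": term, "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_nucleotide_search_url("human[organism] AND mitochondrion[title]")
print(url)
```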
Limit the results to NCBI Reference Sequences by selecting the RefSeq limit
under Source databases in the left-hand Filter menu. These are high-quality
sequences that have been curated and annotated by NCBI staff.
There are three Reference Sequences for the mitochondrial genome in humans: one for
modern humans (Homo sapiens), one for Neanderthals (Homo sapiens
neanderthalensis), and one for Denisovans (Homo sp. Altai).
In the right-hand discovery menu under Analyze these sequences click Run BLAST.
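
Once BLAST has aligned the sequences, finding the differences comes down to walking along the alignment and noting the positions where the two rows disagree. A minimal Python sketch, assuming two already aligned rows of equal length with '-' for gaps (the function name and the short sequences are made up for illustration):

```python
def find_differences(aligned_a: str, aligned_b: str) -> list[tuple[int, str, str]]:
    """Return (position, residue_a, residue_b) for every column that differs.

    Positions are 1-based alignment columns; '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(aligned_a.upper(), aligned_b.upper()), start=1)
        if a != b
    ]


differences = find_differences("ACGTAC", "ACCTA-")
```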

References:

https://guides.lib.berkeley.edu/ncbi/blast

https://blast.ncbi.nlm.nih.gov/Blast.cgi
