
Bioinformatics using

BLAST Technology

Tarun R & Vijith V Daniel


Introduction
• Bioinformatics – application of computer technologies to
biological sciences.

• Leverages algorithm development, data management &
data mining for biological studies.

• Prime area – genomics – using computing power for
complex DNA analysis & prediction.

• Currently booming – strong demand for multi-skilled personnel.


Why Bioinformatics
• Assigning function to proteins.

• Mining the Data.

• Comparing protein-protein interactions in different
protein families.

• Transforming gene data to protein structure and
correlating gene and protein function.
Why Bioinformatics
• Advances in computing – speedy, efficient analysis and
management of huge amounts of data.

• Computers have changed the format of data in biological
research. Bioinformatics is largely concerned with databases:
storing, comparing and retrieving data. Researchers now go
beyond capturing, managing and presenting data, drawing
inspiration from a wide variety of quantitative fields,
including biotechnology, statistics, physics, computer
science and engineering. DNA is made up of adenine, thymine,
cytosine and guanine; RNA of adenine, uracil, cytosine and
guanine; and proteins of the 20 amino acids. These
macromolecules are linear chains that can be written as
sequences of symbols. The sequences can then be compared to
find similarities that relate to form and function.
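As a rough illustration of treating these macromolecules as sequences of symbols, the Python sketch below defines the three alphabets and a naive position-by-position comparison. The helper names `classify` and `percent_identity` are hypothetical and introduced only for this example; real tools such as BLAST use far more elaborate scoring than a simple identity count.

```python
# Alphabets for the three kinds of macromolecule (one-letter codes).
DNA = set("ACGT")
RNA = set("ACGU")
PROTEIN = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids


def classify(seq):
    """Return the names of every alphabet that could spell `seq`."""
    letters = set(seq.upper())
    candidates = (("DNA", DNA), ("RNA", RNA), ("protein", PROTEIN))
    return [name for name, alphabet in candidates if letters <= alphabet]


def percent_identity(a, b):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)
```

A short string such as ACGT is ambiguous, since its letters are valid in both the DNA and the protein alphabet, which is why `classify` returns a list rather than a single name.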
Research in Bioinformatics
• Research in bioinformatics and computational biology can
encompass anything from abstracting the properties of a
biological system into a mathematical or physical model, to
implementing new algorithms for data analysis, to
developing databases and web tools to access them. The
internet has totally changed the way scientists search for and
exchange information. Data that once had to be
communicated on paper is now digitized and distributed from
centralized databases. Journals are now published in the public
domain so that all scientists can access them. Nearly every
research group has a web site offering everything from
reprints to software downloads to data to automated data
processing services.
The various areas of research work in bioinformatics and their place
• The oldest of the technologies that interact with
biological databases is the Common Gateway Interface (CGI).
The others are XML and PHP. CGI programs are small
executables that the server runs in response to a request
from a browser. The CGI program processes the request in
code and returns HTML to the browser. An HTTP request is a
short transaction between a client and a server. When a
user types a request into the browser, the browser in turn
sends it to a naming server. The naming server maintains a
database of names, each of which is associated with an
IP address (a unique identification number for each
computer in a network). The naming server returns to the
browser the IP address at which the requested host can be
found on the network.
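The request flow described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the NCBI implementation: the host name, the IP address, the form field `seq` and the functions `resolve` and `build_response` are all hypothetical. The only CGI conventions relied on are that the server passes the query string in the `QUERY_STRING` environment variable and expects a header block, a blank line, and then the HTML document on standard output.

```python
#!/usr/bin/env python3
import html
import os
from urllib.parse import parse_qs

# Toy stand-in for a naming server: host name -> IP address.
# The entry is made up (192.0.2.0/24 is reserved for documentation).
NAME_TABLE = {"blast.example.org": "192.0.2.10"}


def resolve(host):
    """Look up the IP address registered for `host`, or None if unknown."""
    return NAME_TABLE.get(host)


def build_response(query_string):
    """Turn the request's query string into a complete CGI response."""
    params = parse_qs(query_string)
    seq = params.get("seq", [""])[0]  # "seq" is a hypothetical form field
    body = (
        "<html><body>"
        f"<p>Received {len(seq)} residues: {html.escape(seq)}</p>"
        "</body></html>"
    )
    # A CGI response is a header block, a blank line, then the document.
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # The web server passes the part of the URL after "?" in QUERY_STRING.
    print(build_response(os.environ.get("QUERY_STRING", "")), end="")
```

Keeping the logic in `build_response` rather than in the script body makes it possible to exercise the handler without a web server.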
BLAST (Basic Local Alignment
Search Tool)
BLAST – a skeleton
• A single public database of genome data is available through the
WWW.

• BLAST can find sequence motifs, domains and other repeats within
a sequence.

• Novices & experts can perform searches, decipher output and
analyze the results.

• Can predict biochemical activities and functions.


The BLAST Processing using CGI:
• NCBI offers several types of BLAST, each adapted to a particular kind of search, so
researchers can pick the variant that fits their query and compare the results. The
types of BLAST are as follows:
• Nucleotide BLAST
– Standard nucleotide-nucleotide BLAST [blastn]
– MEGABLAST
– Search for short nearly exact matches
• Protein BLAST
– Standard protein-protein BLAST [blastp]
– PSI- and PHI-BLAST
– Search for short nearly exact matches
• Translated BLAST Searches
– Nucleotide query - Protein db [blastx]
– Protein query - Translated db [tblastn]
– Nucleotide query - Translated db [tblastx]
• Specialized BLAST pages
– VecScreen - BLAST-based detection of vector contamination
– IgBLAST - Analysis of immunoglobulin sequences in GenBank
– OLD Finished and Unfinished Microbial Genomes
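The choice among the five core programs listed above depends only on the type of the query and the type of the database. The Python sketch below encodes that rule as a lookup table; the function `choose_program` and its `translate_both` flag are hypothetical names used for illustration, not part of any NCBI interface.

```python
# (query type, database type) -> program, following the list above.
PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",   # the query is translated
    ("protein", "nucleotide"): "tblastn",  # the database is translated
}


def choose_program(query_type, db_type, translate_both=False):
    """Pick a BLAST program name for the given query and database types."""
    if translate_both:
        # tblastx translates both a nucleotide query and a nucleotide db.
        if (query_type, db_type) != ("nucleotide", "nucleotide"):
            raise ValueError("tblastx needs a nucleotide query and database")
        return "tblastx"
    try:
        return PROGRAMS[(query_type, db_type)]
    except KeyError:
        raise ValueError(f"no program for {query_type} vs {db_type}") from None
```

For example, a nucleotide query searched against a protein database maps to blastx, while the same query against a nucleotide database maps to blastn unless both sides are to be translated.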
The BLAST Method
• Step 1. Choose the program to use and the database to search
– blastp is best for amino acid sequence searches.
– nr database offers comprehensive search.

• Step 2. Input the data
– Sequence submitted in the FASTA format.

• Step 3. Set the program options or choose the defaults
– Perform ungapped alignment.
This ensures that any similarities, even those that define a domain within the coding
region, will be identified if the extent of local similarity is high enough.

• Step 4. Set the output formatting options
– Options such as graphical overview, alignment view, descriptions, etc.
• Step 5. Perform the search
– Finally, the BLAST results and links are generated behind the scenes on the NCBI server
and written to what appears to you as a normal web page.
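Step 2 relies on the FASTA format: a header line starting with ">" followed by one or more lines of sequence. The Python sketch below is a minimal reader for that layout, assuming plain text input; `parse_fasta` is a hypothetical helper, and established libraries offer more robust parsers.

```python
def parse_fasta(text):
    """Split FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines are wrapped; join them back together.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

Because wrapped sequence lines are joined, the length of each returned sequence can be checked directly against the length reported for the query.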
• In practice, the BLAST server is reached through a URL on the NCBI website (1).
• The first part of the URL gives the path to the CGI program.
• The second part of the URL is state information, which tells the CGI program
which part of its functionality is needed to process the request. Here, the
state information brings up an empty search form in which you can enter
the sequence. Once the submit button is clicked, a new page appears.
• This page lists the request ID and the time needed to process the request. In
the background, CGI has passed the request to the actual BLAST program,
which runs on the server.
• When BLAST finishes processing the request, the results are listed and
labeled with the request ID. Several options are available for displaying
the results. The following is the final result for a particular
sequence.
• Query sequence: (NF00300418, length=180, Search NREF, e-value < 0.0001, filter=T)
>NF00300418 ADP-ribosylation factor 1 [Schizosaccharomyces pombe]
MGLSISKLFQSLFGKREMRILMVGLDAAGKTTILYKLKLGEIVTTIPTIGFNVETVEYRN
ISFTVWDVGGQDKIRPLWRHYFQNTQGIIFVVDSNDRERISEAHEELQRMLNEDELRDAL
LLVFANKQDLPNAMNAAEITDKLGLHSLRHRQWYIQATCATSGDGLYEGLEWLSTNLKNQ
• >NF00530692 ADP-ribosylation factor 1 [Mus musculus]
Length = 181
Score = 328 bits (841), Expect = 2e-89
Identities = 159/180 (88%), Positives = 169/180 (93%)

Query: 102 IIVVGIPHEEIVKIAEDEGVDIIIMGSHGKTNLKEIL-----LGSVTENVIKKSNKPVLV 156
Sbjct: 224 IVIKGVQRPEDAERAVEHGADGIVVSNHGGRQLDGVPAPIDALPEVVAAV--KGDIEVLV 281

Query: 4 MYKKILYPTDFSETAEIALKHVKAFKTLKAEEVILLHVIDEREIKKRDIFSLLLGVAGLN 63
Sbjct: 1 AYKNILVAVDGSPESQLAVEKAVELARERNAELSLIHVVSD----------YVLSEPYNG 50

Query: 64 KSVEEFENELKNKLTEEAKNKMENIKKELEDVGFKVKDIIVVGIPHEEIVKIAEDEGVDI 123
Sbjct: 51 LADEPEEDMEQELNVEEAVKLLLELSANAGIP--DKTIVVNGEPAETILEIAKENGADL 108

Query: 124 IIMGSHGKTNLKEILLGSVTENVIKKSNKPVLVVK 158
Sbjct: 109 VVVGSHGRGGLRGMLLGSVSIAVVRKAPCPVLVIR 143
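The summary lines of a hit (Score, Expect, Identities, Positives) follow a regular layout, so they can be picked apart with regular expressions. The Python sketch below assumes the text layout shown in the result above; `parse_hit_stats` is a hypothetical helper, and other BLAST output formats would need different patterns.

```python
import re

# Patterns for the statistics lines of a text-format BLAST hit.
_SCORE = re.compile(r"Score\s*=\s*([\d.]+)\s*bits\s*\((\d+)\)")
_EXPECT = re.compile(r"Expect\s*=\s*(\d[\d.]*(?:[eE][+-]?\d+)?)")
_IDENT = re.compile(r"Identities\s*=\s*(\d+)/(\d+)")
_POS = re.compile(r"Positives\s*=\s*(\d+)/(\d+)")


def parse_hit_stats(text):
    """Extract score, E-value, identities and positives from a hit summary."""
    score = _SCORE.search(text)
    expect = _EXPECT.search(text)
    ident = _IDENT.search(text)
    pos = _POS.search(text)
    if not (score and expect and ident and pos):
        raise ValueError("text does not look like a BLAST hit summary")
    return {
        "bits": float(score.group(1)),
        "raw_score": int(score.group(2)),
        "expect": float(expect.group(1)),
        # (matching positions, alignment length)
        "identities": (int(ident.group(1)), int(ident.group(2))),
        "positives": (int(pos.group(1)), int(pos.group(2))),
    }
```

For the hit shown above, the identities of 159/180 work out to roughly 88.3%, which the report prints as 88%.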


Conclusion
• Bioinformatics simplifies research by harnessing
computing technology.

• WWW is a great advantage for publishing.

• Sequence comparison – the greatest tool.

• BLAST – the pioneer sequence comparison
research program.
Thank you.

Questions.
