64-FASTA JMPR-11-363 (Fast Alignment) PDF
Review
Fast alignment (FASTA) is used to compare one protein sequence with another. The FASTA
program provides similarity searching against nucleotide and protein databases. FASTA is
usually used as part of a package of programs, and it constructs local and global sequence alignments.
In this article, simple applications of FASTA are discussed.
INTRODUCTION
Fast alignment (FASTA) is a program used to analyze the similarity between protein sequences as well as nucleic acid sequences. It also searches protein and DNA databases and finds similarities between them. The program is used to find regions of similarity, local as well as global. FASTA is both fast and selective because it initially considers only amino acid identities. The FASTA program may help in understanding evolutionary relationships between sequences and in identifying members of gene families (Harison, 2005). The FASTA program can search the NBRF (National Biomedical Research Foundation) protein sequence library (2.5 million residues) in less than 20 min on an IBM-PC (International Business Machines Corporation Personal Computer) microcomputer and unambiguously detect proteins that shared a common ancestor billions of years in the past (Mackey et al., 2002; Pearson, 1990).
Computers are commonly used for analyses of DNA and protein sequence data. A common application of computers in molecular biology is to characterize newly determined sequences by searching DNA and protein sequence databases. FASTA is widely used for such searches because it is fast, sensitive, and readily available (Table 1). FASTA is used for local as well as global alignment purposes (Henikoff et al., 1992; Brenner et al., 1998). The present study focuses on the steps required to run the programs, rather than on the interpretation of the results of a FASTA search.
Sequence similarity search using the FASTA program
The databases that can be searched with FASTA include protein, nucleotide, proteomes, genomes, whole genome shotgun, ASD (Alternative Splicing Database) protein, ASD nucleotide, LGIC (Ligand-Gated Ion Channel) protein and LGIC nucleotide.
FASTA program
FASTA-protein similarity search
This tool provides sequence similarity searching for protein databases using the FASTA program (Pearson et al., 1998). The steps to follow are:
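To illustrate the kind of comparison a FASTA protein search performs, the following minimal Python sketch mimics only the first stage of such a search, in which, as noted in the Introduction, only amino acid identities are considered. It is a simplified illustration under stated assumptions, not the FASTA program itself: the function names (`ktup_index`, `best_diagonal`, `rank_library`), the word length `ktup=2`, and the toy sequences are hypothetical, and the rescoring with substitution matrices, joining of regions, and statistical evaluation performed by the real program are omitted.

```python
from collections import defaultdict


def ktup_index(query, ktup=2):
    """Map every word of length ktup in the query to its start positions."""
    index = defaultdict(list)
    for i in range(len(query) - ktup + 1):
        index[query[i:i + ktup]].append(i)
    return index


def best_diagonal(query, subject, ktup=2):
    """Return (diagonal, hits) for the diagonal with the most identical words.

    A diagonal is the offset j - i between a word starting at position j in
    the subject and the same word starting at position i in the query.
    Many hits on one diagonal suggest an ungapped region of identity.
    """
    index = ktup_index(query, ktup)
    hits = defaultdict(int)
    for j in range(len(subject) - ktup + 1):
        for i in index.get(subject[j:j + ktup], ()):
            hits[j - i] += 1
    if not hits:
        return None, 0
    diagonal = max(hits, key=hits.get)
    return diagonal, hits[diagonal]


def rank_library(query, library, ktup=2):
    """Rank library sequences (name -> sequence) by their best diagonal count."""
    scored = [(best_diagonal(query, seq, ktup)[1], name)
              for name, seq in library.items()]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    query = "MKTAYIAKQR"
    library = {"near": "GGMKTAYIAKQRGG", "far": "PLWSVNDEHC"}
    for hits, name in rank_library(query, library):
        print(name, hits)
```

Because only exact word matches are looked up in a precomputed table, each library sequence is scanned once, which is one reason an identity-first stage can be fast; sensitivity to distant relationships has to come from later stages that this sketch does not model.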
BIOFFORC (Biological file format conversion) tool development for biological file format
Different sequence formats are used in bioinformatics, and specific sets of bioinformatics tools are used to process them. Sometimes, sequence format conversion is needed. Many sequence conversion tools are available in the public domain. For this purpose, a file format converter with a graphical user interface has been developed in PERL (Practical Extraction and Report Language). This file format converter is called BIOFFORC (Chinnaiah et al., 2008).
Using the FASTA program to search protein and DNA sequence databases
Computers are commonly used for analysis of DNA and protein sequence data. Newly determined sequences are characterized by searching DNA and protein sequence databases. The FASTA program is commonly used for such searches because of its speed and sensitivity. Steps to run the FASTA programs have been developed.
CONCLUSION
FASTA is commonly used for comparing protein and DNA sequences. It gives fast and reliable results. FASTA provides an estimate of the statistical significance of each alignment found.
REFERENCES
Altschul SF, Gish W (1996). Local alignment statistics. Methods Enzymol., 266: 460-480.
Brenner SE, Chothia C, Hubbard TJ (1998). Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships. Proc. Natl. Acad. Sci. U.S.A., 95: 6073-6078.
Chinnaiah S, Maruthamuthu R, Ekambaram R (2008). BIOFFORC: Tool development for biological file format. Bioinform., 3(2): 98-99.
Harison S (2005). FASTA. Fundamentals of Bioinformatics. I.K. International Publishing House Pvt. Ltd., New Delhi, India, p. 76.
Henikoff S, Henikoff JG (1992). Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. U.S.A., 89: 10915-10919.
Mackey AJ, Haystead TA, Pearson WR (2002). Getting more from less: algorithms for rapid protein identification with multiple short peptide sequences. Mol. Cell. Proteomics, 1(2): 139-147.
Pearson WR (1990). Rapid and sensitive sequence comparison with FASTP and FASTA. Methods Enzymol., 183: 63-98.
Pearson WR, Lipman DJ (1998). Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. U.S.A., 85(8): 2444-2448.
Akram et al. 6933
Sagliano A, Volpicella M, Gallerani R, Ceci LR (1998). FastA based compilation of higher plant mitochondrial tRNA genes. Nucl. Acids Res., 26(1): 154-155.
William R (1994). Using the FASTA Program to Search Protein and DNA Sequence Databases. Meth. Mol. Biol., 24(9): 307-331.