BLAST® Help
Introduction
In addition to providing BLAST sequence alignment services on the web, NCBI also makes
these sequence alignment utilities available for download through FTP. This allows BLAST
searches to be performed on local platforms against databases downloaded from NCBI or
created locally. These utilities run in DOS-like command windows and accept input
through text-based command-line switches. There is no graphical user interface.
The following tutorial discusses the steps needed to install the NCBI C++-based BLAST
programs (blast+) and a sample NCBI database on PCs running the Windows 7 operating
system. The installation of the deprecated C-based BLAST package (legacy blast) is
discussed briefly at the end.
Downloading
The blast+ software package is available as two self-extracting archives. One archive,
ncbi-blast-#.#.#+-win32.exe, is compatible with PCs running 32-bit Windows operating systems.
The other, ncbi-blast-#.#.#+-win64.exe, is for PCs running 64-bit Windows operating
systems. In both cases, "#.#.#" denotes the current version number of the package. Archives
with the same base name are equivalent.
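The naming pattern above can be sketched in a few lines of Python. This is only an illustration: the function name blast_archive_name is hypothetical, and the 64-bit check based on platform.machine() is an assumption that should be confirmed against the actual system properties.

```python
import platform


def blast_archive_name(version: str, is_64bit: bool) -> str:
    """Build an installer name following the ncbi-blast-#.#.#+-winNN.exe pattern."""
    suffix = "win64" if is_64bit else "win32"
    return f"ncbi-blast-{version}+-{suffix}.exe"


if __name__ == "__main__":
    # Assumption: a machine string ending in "64" (for example "AMD64")
    # indicates a 64-bit operating system.
    is_64bit = platform.machine().endswith("64")
    print(blast_archive_name("2.2.29", is_64bit))
```

For version 2.2.29 on a 64-bit PC this yields the archive name used in the rest of this tutorial.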
Please note that the archive with the ".tar.gz" file extension does not have the installer
function. The discussion below focuses on archives with a ".exe" extension.
Steps
Steps to download the package are described below.
Examples
The steps for the "ncbi-blast-2.2.29+-win64.exe" archive are shown in Figures 1a and 1b.
Figure 1a demonstrates the first two steps, and Figure 1b demonstrates the last step.
Installation
The blast+ archive downloaded above contains a built-in installer. Double-click the archive
and accept the license agreement; the installer then prompts for an installation directory. In this
test case, "C:\users\tao\desktop\blast-2.2.29+" is set as the installation directory.
After the "Install" button is clicked, the installer creates this directory with a "doc" subdirectory
containing a comprehensive user manual in PDF format, an "uninstaller" for future removal
of the installation, and a "bin" subdirectory where the BLAST programs and accessory
utilities are kept. Table 1 summarizes the programs and utilities contained in the blast+ package.
Table 1
Programs and utilities contained in the blast+ package

Program            Function
blastdb_aliastool  Creates a database alias (to tie volumes together, for example)
blastx             Searches a nucleotide query, dynamically translated in all six frames, against a protein database
blast_formatter    Formats a blast result using its assigned request ID (RID) or its saved archive
deltablast         Searches a protein query against a protein database, using a more sensitive algorithm
dustmasker         Masks the low-complexity regions in the input nucleotide sequences
legacy_blast.pl    Converts a legacy blast search command line into its blast+ counterpart and executes it
makeprofiledb      Creates a conserved domain database from a list of input position-specific scoring matrices (scoremats) generated by psiblast
psiblast           Finds members of a protein family, identifies proteins distantly related to the query, or builds a position-specific scoring matrix for the query
rpsblast           Searches a protein against a conserved domain database to identify functional domains present in the query
rpstblastn         Searches a nucleotide query, dynamically translated in all six frames first, against a conserved domain database
tblastn            Searches a protein query against a nucleotide database dynamically translated in all six frames
tblastx            Searches a nucleotide query, dynamically translated in all six frames, against a nucleotide database similarly translated
Procedures similar to those in Figure 1 can be used to download the BLAST databases. The steps for
downloading preformatted BLAST databases from NCBI are:
A utility included in the blast+ package, update_blastdb.pl, can streamline the
downloading of preformatted BLAST databases from NCBI. It requires Perl to be installed
and must be run from the command prompt in the
"C:\users\tao\desktop\blast-2.2.29+\db\" directory. The base command is:
where "base_database_name" is the name of the target database, without the "##.tar.gz"
extension.
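As an illustration of how a base database name relates to a downloaded volume file, the following Python sketch strips the volume number and ".tar.gz" extension and assembles a command line for update_blastdb.pl. The helper names are hypothetical, and the assembled command (the script invoked through perl with the database name as its only argument) is an assumed minimal form, not necessarily the exact command intended by this tutorial.

```python
import re
from typing import List


def base_database_name(archive_name: str) -> str:
    """Strip an optional volume number and the .tar.gz extension."""
    # "refseq_rna.00.tar.gz" becomes "refseq_rna"; a bare name is returned unchanged.
    return re.sub(r"(\.\d+)?\.tar\.gz$", "", archive_name)


def update_command(database: str) -> List[str]:
    """Assumed minimal invocation: perl, the script, then the database name."""
    return ["perl", "update_blastdb.pl", database]


if __name__ == "__main__":
    name = base_database_name("refseq_rna.00.tar.gz")
    print(" ".join(update_command(name)))
```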
Configuration
For smooth execution of blast+, the PC must be configured to recognize the BLAST programs
installed under the "C:\users\tao\desktop\blast-2.2.29+\bin\" directory. To do this, create a user
environment variable named Path with "C:\users\tao\desktop\blast-2.2.29+\bin\" as its value.
The blast+ installer automatically creates a BLASTDB
environment variable, which points to the "C:\users\tao\desktop\blast-2.2.29+\" directory.
For this setup, where a separate "db" directory is created for BLAST databases, the value of the
BLASTDB environment variable needs to be changed to point to the
"C:\users\tao\desktop\blast-2.2.29+\db\" directory where the refseq_rna.00 database files are kept.
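The configuration described above can be verified with a short script. The following Python sketch is one possible check, not part of the blast+ package: the function name check_blast_environment and its return keys are hypothetical, and the directories are the example locations used in this tutorial.

```python
import ntpath
import os
from typing import Dict, Mapping


def _normalize(path: str) -> str:
    # Windows paths are case-insensitive; also drop any trailing backslash.
    return ntpath.normcase(ntpath.normpath(path))


def check_blast_environment(env: Mapping[str, str], bin_dir: str,
                            db_dir: str) -> Dict[str, bool]:
    """Report whether Path contains bin_dir and BLASTDB points to db_dir."""
    # Variable names are case-insensitive on Windows, so compare in upper case.
    variables = {name.upper(): value for name, value in env.items()}
    entries = [e for e in variables.get("PATH", "").split(";") if e]
    blastdb = variables.get("BLASTDB", "")
    return {
        "bin_on_path": _normalize(bin_dir) in {_normalize(e) for e in entries},
        "blastdb_points_to_db": bool(blastdb)
        and _normalize(blastdb) == _normalize(db_dir),
    }


if __name__ == "__main__":
    install_dir = r"C:\users\tao\desktop\blast-2.2.29+"
    print(check_blast_environment(os.environ,
                                  install_dir + r"\bin",
                                  install_dir + r"\db"))
```

A False value for "blastdb_points_to_db" right after installation is expected, because the installer sets BLASTDB to the installation directory rather than to the "db" subdirectory.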
Environment Variables
BLAST programs are executed from a command prompt window (CMD). This window can be opened by clicking
on "Start ⇨ All Programs ⇨ Accessories ⇨ Command Prompt" or by clicking "Start
⇨ Run …," followed by typing "cmd" (without the quotes) in the input box and pressing Enter.
These processes are shown in Figures 4a and 4b.
Example Execution
In the command prompt, the working directory can be changed to
"C:\users\tao\desktop\blast-2.2.29+" by typing "cd \" followed by "cd users\tao\desktop\blast-2.2.29+". If the
initial prompt is on a drive other than "C:", type "C:" first to switch to the
"C:" drive. Figure 5 contains example commands and their console output from a work
session that tests a blast-2.2.29+ installation.
A realistic test of this installation is an actual search, which requires an input query.
The next blastdbcmd command line extracts a sequence from the installed database for
use as such a query.
test_query.txt
The exact meaning of the command line is (from left to right) to:
a execute blastdbcmd
b use refseq_rna.00 as the target database
c get the database sequence with nm_000122 as its accession
d dump the sequence in FASTA format, and
e send the output to a file named test_query.txt
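Steps a through e can be mapped onto blastdbcmd options as in the Python sketch below. The function name is hypothetical, and the option list is a reconstruction from the step descriptions rather than a copy of the original command line. In particular, "-outfmt %f" is spelled out here for step d even though FASTA is already the default output format of blastdbcmd.

```python
import shutil
import subprocess
from typing import List


def blastdbcmd_command(db: str, accession: str, out_file: str) -> List[str]:
    """Assemble a blastdbcmd call that follows steps a to e."""
    return [
        "blastdbcmd",         # a: execute blastdbcmd
        "-db", db,            # b: target database
        "-entry", accession,  # c: accession of the sequence to retrieve
        "-outfmt", "%f",      # d: FASTA output (the default, stated explicitly)
        "-out", out_file,     # e: file that receives the output
    ]


if __name__ == "__main__":
    command = blastdbcmd_command("refseq_rna.00", "nm_000122", "test_query.txt")
    print(" ".join(command))
    # Run only when blastdbcmd can be found through the Path variable.
    if shutil.which("blastdbcmd"):
        subprocess.run(command, check=True)
```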
The sequence in this file is used subsequently as the query in a test blastn search in the
following command line:
Parameters not specified explicitly assume their default values. To customize the
search further, add other search parameters with the desired input values. Typing
"program -help", where "program" is the name of a BLAST program, and pressing Enter prints the complete list of that program's
parameters and their accepted options to the console for quick reference. Further details are
in the included user manual. The final "dir" command lists the directory content again to show
that new output files have indeed been generated, as marked by red arrows.
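The relationship between default and explicitly set parameters can be illustrated with a small Python sketch that assembles a blastn command line. The function name blastn_command, the output file name "test_output.txt", and the choice of "-evalue" as the customized parameter are assumptions made for illustration; the exact command used in this tutorial's test session may differ.

```python
from typing import Dict, List, Optional


def blastn_command(query: str, db: str, out: str,
                   overrides: Optional[Dict[str, str]] = None) -> List[str]:
    """Assemble a blastn call; parameters not listed keep their defaults."""
    command = ["blastn", "-query", query, "-db", db, "-out", out]
    # Each override becomes "-name value" appended to the command line.
    for name, value in (overrides or {}).items():
        command += [f"-{name}", str(value)]
    return command


if __name__ == "__main__":
    # Search with default values only.
    print(" ".join(blastn_command("test_query.txt", "refseq_rna.00",
                                  "test_output.txt")))
    # The same search with a stricter expect value threshold.
    print(" ".join(blastn_command("test_query.txt", "refseq_rna.00",
                                  "test_output.txt", {"evalue": "1e-10"})))
```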
The last command, ‘set | find “BLASTDB”’, demonstrates a way to examine
environment variable settings in the command prompt. It calls “set” to list all the
environment variables and pipes the list to “find” to search for BLASTDB. A returned value,
marked by the last set of arrows, indicates that this variable is set.
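For readers who prefer a script to the command prompt, the same check can be approximated in Python. This sketch mimics the pipeline by formatting each variable as a "NAME=value" line and keeping the lines that contain the search text; the function name find_variables is hypothetical. Like "find" without the /I switch, the match is case-sensitive.

```python
import os
from typing import List, Mapping


def find_variables(env: Mapping[str, str], text: str) -> List[str]:
    """Return the NAME=value lines that contain the given text."""
    lines = [f"{name}={value}" for name, value in env.items()]
    return [line for line in lines if text in line]


if __name__ == "__main__":
    matches = find_variables(os.environ, "BLASTDB")
    # An empty result means that the variable is not set in this session.
    print("\n".join(matches) if matches else "BLASTDB is not set")
```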
ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/LATEST/
e The programs under the bin subdirectory have names and functions different
from those provided by blast+.
f Configuring the legacy blast installation is similar to blast+. However, an
additional DATA environment variable with the path to the "data" subdirectory
as its value should be specified.
g If the refseq_rna database mentioned in section 3 is installed, the following
command lines can be used to test the installation:
fastacmd -d refseq_rna.00 -I
fastacmd -d refseq_rna.00 -s nm_000122 -o test_query.txt
blastall -p blastn -i test_query.txt -d refseq_rna.00 -o legacy_output.txt
Technical Assistance
Questions, feedback, and technical assistance requests should be sent to blast-help at:
blast-help@ncbi.nlm.nih.gov
Questions on other NCBI resources should be addressed to NCBI Service Desk at:
info@ncbi.nlm.nih.gov
Figure 1a.
Download a blast+ package from NCBI through a web browser: Log on to
ftp://ftp.ncbi.nlm.nih.gov/blast/executables/LATEST/ and select "Save link as ..." after right-clicking on "ncbi-blast-2.2.29+-win64.exe".
Figure 1b.
Download a blast+ package from NCBI through a web browser: Change the location in the subsequent prompt to your own
directory under "C:" before saving the archive to a desired location.
Figure 2.
Extract the downloaded refseq_rna.00.tar.gz archive using WinZip: Right click on the database archive, then select “WinZip”
and “Extract to here …”
Figure 3a.
Configure standalone blast+ using Windows' environment variables: In the initial System popup, click the “Advanced system
settings” link to open the “System Properties” popup. Click the “Environment Variables …” button to access existing
environmental variables or set up new ones (shown in 3b).
Figure 3b.
Configure standalone BLAST using Windows' environment variables: Clicking the "Environment Variables …" button in 3a
opens this popup, which provides access to existing environment variables and allows the creation of new ones, using the
“Edit” and "New" buttons, respectively. The two user variables relevant to BLAST are BLASTDB and path (highlighted).
Figure 3c.
Configure standalone BLAST using Windows' environment variables: Clicking the "New" button in Figure 3b brings up this
popup, where the new variable's name and value can be specified. In this example, a user variable called “path” is being
specified with a value of "C:\users\tao\desktop\blast-2.2.29+\bin\".
Figure 4a.
Open a command prompt in Windows 7: Click the "Start" button followed by the “All Programs” link to see the list of available
programs. Open the Accessories folder by clicking it to see the Command Prompt (highlighted). Click it to launch.
Figure 4b.
Open a command prompt in Windows 7: Alternatively, click the “Start” button, then the "Run …" link in the right-hand column. In
the popup, type “cmd” in the input box to open the Command Prompt.
Figure 5.
The output of a work session testing the blast+ installation: The input commands are in red boxes. Output files produced by
blastdbcmd and blastn command executions are marked by red arrows. The last command is for checking BLASTDB
environmental variable setting, whose output is marked by the last set of arrows.