You are on page 1of 12

Standalone BLAST Setup for Windows PC

BLAST® Help

Tao Tao, Ph.D.


NCBI
tao@ncbi.nlm.nih.gov
Corresponding author.

BLAST is a Registered Trademark of the National Library of Medicine

Created: May 31, 2010.


Last Update: April 10, 2014.

Introduction
BLAST® Help

In addition to providing BLAST sequence alignment services on the web, NCBI also makes
these sequence alignment utilities available for download through FTP. This allows BLAST
searches to be performed on local platforms against databases downloaded from NCBI or
created locally. These utilities run through DOS-like command windows and accept input
through text-based command line switches. There is no graphic user interface.

The following tutorial discusses the steps needed to install NCBI C++ based BLAST
programs (blast+) and a sample NCBI database on PCs running Windows 7 Operating
Systems. The installation of the deprecated C-based BLAST package (legacy blast) is
discussed briefly at the end.

Downloading
BLAST® Help

The blast+ software package is available as two self-extracting archives. One archive, ncbi-
blast-#.#.#+-win32.exe, is compatible with PCs running 32-bit Windows operating systems.
The other, ncbi-blast-#.#.#+-win64.exe, is for PCs running 64-bit Windows operating
systems. In both cases, "#.#.#" denotes the current version number of the package. Archives
with the same base name are equivalent.

Please note that the archive with the ".tar.gz" file extension does not have the installer
function. The discussion below focuses on archives with a ".exe" extension.

Steps
Steps to download the package are described below.
BLAST® Help

• Point a browser to this FTP directory:


ftp://ftp.ncbi.nlm.nih.gov/blast/executables/LATEST/
• Right click on a desired archive and select "Save link as…" from the popup menu
• In the prompt, switch to a desired directory (folder) and click the "Save" button to
save the archive to the selected location on the local disk

Examples
These steps for the "ncbi-blast-2.2.29-win64.exe" archive are given in Figure 1a and 1b,
where the first two steps are demonstrated by 1a and the last step is demonstrated by 1b.
Page 2

Installation
The blast+ archive downloaded above contains a built-in installer. Accepting the license
agreement after double-clicking, the installer will prompt for an installation directory. In this
test case "C:\users\tao\desktop\blast-2.2.29+" will be set as the installation directory.
Clicking the "Install" button, the installer will create this directory with a "doc" subdirectory
BLAST® Help

containing a comprehensive user manual in pdf format, an "uninstaller" for future removal
of the installation, and a "bin" subdirectory where the BLAST programs and accessory
utilities are kept. Table 1 sums up programs and utilities contained in the blast+ package.

Table 1
Programs and utilities contained in the blast+ package
Program Function

blastdbcheck Checks the integrity of a BLAST database

blastdbcmd Retrieves sequences or other information from a BLAST database

blastdb_aliastool Creates database alias (to tie volumes together for example)
BLAST® Help

Blastn Searches a nucleotide query against a nucleotide database

blastp Searches a protein query against a protein database

blastx Searches a nucleotide query, dynamically translated in all six frames, against a protein database

blast_formatter Formats a blast result using its assigned request ID (RID) or its saved archive

convert2blastmask Converts lowercase masking into makeblastdb readable data

deltablast Searches a protein query against a protein database, using a more sensitive algorithm

dustmasker Masks the low complexity regions in the input nucleotide sequences

legacy_blast.pl Converts a legacy blast search command line into blast+ counterpart and execute it
BLAST® Help

makeblastdb Formats input FASTA file(s) into a BLAST database

makembindex Indexes an existing nucleotide database for use with megablast

makeprofiledb Creates a conserved domain database from a list of input position specific scoring matrix (scoremats) generated by psiblast

psiblast Finds members of a protein family, identifies proteins distantly related to the query, or builds position specific scoring matrix
for the query

rpsblast Searches a protein against a conserved domain database to identify functional domains present in the query

rpstblastn Searches a nucleotide query, by dynamically translated it in all six-frames first, against a conserved domain database

segmasker Masks the low complexity regions in input protein sequences

tblastn Searches a protein query against a nucleotide database dynamically translated in all six frames
BLAST® Help

tblastx Searches a nucleotide query, dynamically translated in all six frames, against a nucleotide database similarly translated

update_blastdb.pl Downloads preformatted blast databases from NCBI

windowmasker Masks repeats found in input nucleotide sequences

Test BLAST database


In addition to BLAST programs and accessory utilities, target database are also a key
component of a standalone BLAST setup. The common set of pre-formatted NCBI BLAST
databases is available as compressed archives from NCBI FTP site. Databases can also be
prepared de novo from custom FASTA sequences locally using the makeblastdb utility. For

Standalone BLAST Setup for Windows PC


Page 3

more effective database management of database files, a "db" subdirectory should be


created. In this test case it will be created under the BLAST directory with "C:\users\tao
\desktop\blast-2.2.29+\db" as its path.

Similar procedures in Figure 1 can be used to download the BLAST databases. Steps for
downloading preformatted BLAST databases from NCBI are:
BLAST® Help

• Point a browser to ftp://ftp.ncbi.nlm.nih.gov/blast/db/


• Right-click on a desired file (refseq_rna.00.tar.gz in this example case)
• Select "Save link as …" from the popup menu
• When prompted, use the "Save in" to change the directory to "C:\users\tao\desktop
\blast-2.2.29+\db"
This downloaded database is blast-ready, after inflation and extraction with a decompression
utility, such as WinZip or 7zip. Note that these steps described above download and install
only the first volume of the refseq_rna database. For the complete set, download all the
refseq_rna.##.tar.gz files. The database alias file (refseq_rna.nal) in the first volume will tie
all volumes back into the complete database. Figure 2 below shows an example inflation/
BLAST® Help

extraction procedure using WinZip.

A utility included in the blast+ package, update_blastdb.pl , can be used to streamline the
downloading of preformatted BLAST databases from NCBI. It requires the installation of
the Perl package and execution from the command prompt under the "C:\users\tao\desktop
\blast-2.2.29+\db\" directory. The base command is:

perl update_blastdb.pl --passive base_database_name

where "base_database_name" is the name of the target database, without the "##.tar.gz"
extension.

Configuration
BLAST® Help

For smooth execution of blast+, the PC must be configured to recognize BLAST programs
installed under the "C:\users\tao\desktop\blast-2.2.29+\bin\" directory. To do this, a user
environment variable named Path needs to be created with "C:\users\tao\desktop
\blast-2.2.29+\bin\" as its value. The blast+ installer automatically creates a BLASTDB
environment variable, which points to the "C:\users\tao\desktop\blast-2.2.29+\" directory.
For this setup with a separate "db" directory created for BLAST databases, the value of
BLASTDB environment variable need to be modified to point to the "C:\users\tao\desktop
\blast-2.2.29+\db\" directory where the refseq_rna.00 database files are kept.

Environment Variables
BLAST® Help

Steps to create or modify environment variables are summarized below:


• Click “Start” button then the "Control Panel" link to open the Control Panel
• Click the "System" icon to open the system prompt
• Click the "Advanced system settings" link in the left column to open the “System
properties” prompt
• Click the "Environment Variables" button to see the available list
• Click the "New" button under the "User variable for ..." panel
• Type the environment variable name and enter the absolute path
• Click "OK" to close the prompts

Standalone BLAST Setup for Windows PC


Page 4

Example Screen Shots


Screen shots of these steps are shown in Figures 3a, 3b, and 3c.

Execution and validation


Standalone blast+ programs do NOT have a graphical user interface (GUI) and must be
BLAST® Help

executed from a command prompt window (CMD). This window can be opened by clicking
on "Start ⇨ All Programs ⇨ Accessories ⇨ Command Prompt" or by clicking "Start
⇨ Run …," followed by typing "cmd" (minus quotes) in the input box and pressing enter.
These processes are shown in Figures 4a and 4b.

Example Execution
In the command prompt, the working directory can be changed to "C:\users\tao\desktop
\blast-2.2.29+" by typing "cd \" followed by "cd users\tao\desktop\blast-2.2.29+". If the
initial prompt is a drive other than "C:\", type "C:" instead of "cd \" to change set the drive to
“C:” first. Figure 5 contains example commands and their console output from a work
session that tests a blast-2.2.29+ installation.
BLAST® Help

Explanation of the test commands


The first command changes the working directory from initial “C:\” drive to the
blast-2.2.29+ directory. The "dir" lists the files and subdirectories under this directory. The
error-free console outputs from "blastn -version" and "blastdbcmd -db refseq_rna.00 -info"
command lines validate the installation.

A realistic test of this installation should be actual searches, which requires an input query.
The next blastdbcmd command line dumps out a sequence from the installed database for
use as such a query.

blastdbcmd –db refseq_rna.00 –entry nm_000122 –outfmt "%f" –out


BLAST® Help

test_query.txt

The exact meaning of the command line is (from left to right) to:
a execute blastdbcmd
b use refseq_rna.00 as the target database
c get the database sequence with nm_000122 as its accession
d dump the sequence in FASTA format, and
e send the output to a file named test_query.txt
The sequence in this file is used subsequently as the query in a test blastn search in the
following command line:
BLAST® Help

blastn –query text_query.txt –db refseq_rna.00 –out output.txt

This command instructs the system to:


• execute blastn program to search a nucleotide query against a nucleotide database
• use the sequence(s) in test_query.txt as the query
• search against the database refseq_rna.00 database, and
• save the result in a file named output.txt

Standalone BLAST Setup for Windows PC


Page 5

Parameters not specified explicitly will assume default values. To further customize the
search, other search parameters with customized input values should be added. Typing
"program -help" followed by enter key stroke will print out the complete list of program
parameters and their accepted options to the console for quick reference. Further details are
in the included user manual. The final "dir" examines the directory content again to show
that new output files are indeed generated as marked by red arrows.
BLAST® Help

The last command ‘set | find “BLASTDB”’ demonstrates a way to examine the
environmental variables setting in the command prompt. It calls “set” to get all the
environment variables, and passes it to “find” to search for BLASTDB. A returned value
marked by the last set of arrows indicates that this variable is set.

Setup steps for legacy blast


The original standalone BLAST package based on NCBI C-toolkit (legacy blast) is
deprecated. The installation of legacy blast package for Windows differs from that for blast+
described above. The key differences are summarized below.
a The legacy blast packages are located under a different ftp directory:
BLAST® Help

ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/LATEST/

b The packages are named with this convention: blast-#.#.#-CHIP-win#.exe,


where #.#.# is the version, CHIP is the chipset, and win# is the operating
system (32 or 64 bits)
c The packages do not contain an installer function. It is recommended that the
downloaded package be placed in a folder named blast-#.#.# (#.#.# to indicate
version) first before extraction
d Double clicking the package will execute the self-exacting function to install
the package by re-creating bin, doc and data subdirectories:
BLAST® Help

bl2seq.exe blastall.exe blastclust.exe blastpgp.exe


copymat.exe fastacmd.exe formatrpsdb.exe implala.exe makemat.exe
megablast.exe rpsblast.exe seedtop.ext

e The programs under the bin subdirectory have names and functions different
from that provided by blast+.
f Configuring the legacy blast installation is similar to blast+. However, an
additional DATA environment variable with the path to the "data" subdirectory
as its value should be specified.
g If the refseq_rna database mentioned in section 3 is installed, the following
command lines can be used to test the installation:
BLAST® Help

fastacmd -d refseq_rna.00 –I
fastacmd –d refseq_rna.00 –s nm_000122 –o test_query.txt
blastall –p blastn –i test_query.txt –d refseq_rna.00 –o legacy_output.txt

Technical Assistance
Questions, feedback, and technical assistance requests should be sent to blast-help at:

blast-help@ncbi.nlm.nih.gov

Standalone BLAST Setup for Windows PC


Page 6

Questions on other NCBI resources should be addressed to NCBI Service Desk at:

info@ncbi.nlm.nih.gov
BLAST® Help
BLAST® Help
BLAST® Help

Figure 1a.
Download a blast+ package from NCBI through a web browser: Log on to ftp://ftp.ncbi.nlm.nih.gov/blast/executables/
LATEST/ and select "Save link as ..." after right-clicking on "ncbi-blast-2.2.29+-win64.exe".
BLAST® Help

Standalone BLAST Setup for Windows PC


Page 7
BLAST® Help
BLAST® Help
BLAST® Help

Figure 1b.
Download a blast+ package from NCBI through a web browser: Change the location in the subsequent prompt to your own
directory under "C:" before saving the archive to a desired location.
BLAST® Help

Standalone BLAST Setup for Windows PC


Page 8
BLAST® Help
BLAST® Help

Figure 2.
Extract the downloaded refseq_rna.00.tar.gz archive using WinZip: Right click on the database archive, then select “WinZip”
and “Extract to here …”
BLAST® Help
BLAST® Help

Standalone BLAST Setup for Windows PC


Page 9
BLAST® Help
BLAST® Help
BLAST® Help

Figure 3a.
Configure standalone blast+ using Windows' environment variables: In the initial System popup, click the “Advanced system
settings” link to open the “System Properties” popup. Click the “Environment Variables …” button to access existing
environmental variables or set up new ones (shown in 3b).
BLAST® Help

Standalone BLAST Setup for Windows PC


Page 10
BLAST® Help
BLAST® Help

Figure 3b.
Configure standalone BLAST using Windows' environment variables: Clicking "Environment Variables …" button on 3a
BLAST® Help

opens this popup, which provides access to existing environment variables and allows the creation of new ones, using the
“Edit” and "New" buttons, respectively. The two user variables relevant to BLAST are BLASTDB and path (highlighted).
BLAST® Help

Figure 3c.
Configure standalone BLAST using Windows' environment variables: Clicking the "New" button in Figure 3c brings out this
popup, where the new variable's name and path can be specified. In this example, a user variable called “path” is being
specified with a value of "C:\users\tao\desktop\blast-2.2.29+\bin\".

Standalone BLAST Setup for Windows PC


Page 11
BLAST® Help
BLAST® Help
BLAST® Help

Figure 4a.
Open a command prompt in Windows 7: Click the "Start" button followed by “All Programs” link to see list of available
programs. Open the Accessories fold by clicking to see the Command Prompt (highlighted). Click it to launch.
BLAST® Help

Figure 4b.
Open a command prompt in Windows 7: Alternatively, click “Start” button, then the "Run …" link in the right-hand column. In
the popup, type “cmd” in the input box to open the Command Prompt.

Standalone BLAST Setup for Windows PC


Page 12
BLAST® Help
BLAST® Help
BLAST® Help

Figure 5.
The output of a work session testing the blast+ installation: The input commands are in red boxes. Output files produced by
blastdbcmd and blastn command executions are marked by red arrows. The last command is for checking BLASTDB
environmental variable setting, whose output is marked by the last set of arrows.
BLAST® Help

Standalone BLAST Setup for Windows PC

You might also like