
How to use the PDB

Loren Williams
Georgia Tech

1) What is the Protein Data Bank (PDB)?


Biologists and biochemists use sequence databases, structure databases,
literature databases, etc. The database we will learn here is called the Protein
Data Bank (PDB). The PDB holds the publicly available, experimentally determined
3D structures of proteins, DNAs and RNAs. To find the PDB on the web, type ‘PDB’
into Google and go to the first link returned, which is:

http://www.rcsb.org/pdb/home/home.do

You need to download the protein structures (i.e., the PDB files) that you are
going to study to your own computer. Each structure is in a PDB file named after
its four-character PDB ID (for example 1H97.pdb), so the name by itself tells you
little about the contents. A PDB file is a simple text file that lists the xyz
coordinates of all the atoms in the structure (a single protein typically
contains thousands of atoms).

Example of two lines of a PDB file:

ATOM      1  N   ALA A   1       5.089   4.202  28.188  1.00 42.31           N
ATOM      2  CA  ALA A   1       4.695   2.911  28.829  1.00 41.76           C

For atom 1, x, y, z = 5.089, 4.202, 28.188 (in angstroms)

For atom 2, x, y, z = 4.695, 2.911, 28.829 (in angstroms)
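The same coordinates can be pulled out of an ATOM line with a few lines of Python. This is only a minimal sketch: it assumes the lines follow the standard fixed-column PDB layout (x, y and z in columns 31-38, 39-46 and 47-54), and the function name parse_atom_line is made up for this example.

```python
def parse_atom_line(line):
    """Return (serial, atom_name, x, y, z) from one fixed-column ATOM line."""
    # PDB files are fixed-width, so we slice by column instead of splitting
    # on spaces. Python slices are 0-based; the PDB columns are 1-based.
    serial = int(line[6:11])      # columns 7-11: atom serial number
    name = line[12:16].strip()    # columns 13-16: atom name
    x = float(line[30:38])        # columns 31-38: x coordinate
    y = float(line[38:46])        # columns 39-46: y coordinate
    z = float(line[46:54])        # columns 47-54: z coordinate
    return serial, name, x, y, z


# The two example lines from above, in their fixed-column form.
lines = [
    "ATOM      1  N   ALA A   1       5.089   4.202  28.188  1.00 42.31           N",
    "ATOM      2  CA  ALA A   1       4.695   2.911  28.829  1.00 41.76           C",
]

for line in lines:
    print(parse_atom_line(line))
```

Slicing by column is used here rather than splitting on whitespace because neighboring fields in a PDB file can run together when the numbers are large.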

2) Find and get a structure (PDB file) from the PDB.

Go to the PDB web page:

http://www.rcsb.org/pdb/home/home.do

Search for the protein that you want to study, for example, hemoglobin.

Unfortunately, this returns far too many hits (544 on 9/17/11).

3) Advanced Searching of the PDB.

The PDB web page has very sophisticated search capabilities, which are not
much use if you don’t know what you are searching for. In general, you should
focus on the accurately determined structures. The structures in the PDB are
experimentally determined, and the experimental error is high in some of
them. Whatever else you are searching for, apply an accuracy filter as well.
Here we will restrict our search to the small subset of hemoglobin structures that
are very accurately determined.
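Once a file is on your computer, the same kind of accuracy check can be sketched in Python. The sketch below assumes the file is an X-ray structure whose header carries the usual "REMARK   2 RESOLUTION." line, and it uses the 0.5 to 1.2 angstrom window from the search steps that follow. The function names and the sample header text (including its 1.10 value) are made up for illustration and do not describe any particular PDB entry.

```python
import re


def resolution_from_header(pdb_text):
    """Return the resolution in angstroms from the REMARK 2 line, or None."""
    for line in pdb_text.splitlines():
        if line.startswith("REMARK   2 RESOLUTION."):
            match = re.search(r"(\d+\.\d+)\s+ANGSTROMS", line)
            if match:
                return float(match.group(1))
    # No numeric resolution found (for example, a non-X-ray structure).
    return None


def passes_accuracy_filter(resolution, low=0.5, high=1.2):
    """True if the resolution lies inside the chosen window (smaller is better)."""
    return resolution is not None and low <= resolution <= high


# Illustrative header text only; the value 1.10 is invented for this example.
SAMPLE_HEADER = "REMARK   2\nREMARK   2 RESOLUTION.    1.10 ANGSTROMS.\n"

res = resolution_from_header(SAMPLE_HEADER)
print(res, passes_accuracy_filter(res))
```

A smaller resolution number means finer detail, which is why the filter keeps only structures at or below the upper limit.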

(a) Click the Advanced Search button.

(b) Under “Choose a Query Type”, choose “Macromolecule Name”. Type
“Hemoglobin” for the molecule name.

(c) Then add a second search criterion by clicking on the plus (+) button.

(d) Set the second search criterion to “X-ray Resolution”. This filter allows us
to screen out less accurate structures. Scroll down the “Choose a Query Type”
menu to find X-ray Resolution, under “Methods”.

(e) Set the resolution limits to 0.5 and 1.2 (in angstroms).

Executing this search will give you 4 hits, which are the 4 most accurate
hemoglobin structures in the PDB.

[If you want to relax the accuracy criterion and return more hits, increase the
maximum resolution from 1.2 to 1.4 or higher.]

(f) One of the very accurate hits is PDB entry 1H97, which is a dimer (human
hemoglobin is a tetramer). Download that PDB file.

To download the coordinates to your computer, click on the
download-to-computer arrow.

(g) Find 1H97.pdb on your computer, probably in your downloads folder. You can
open it in a text editor or in a display program such as PyMOL or Jmol.
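The download and a first look at the file can also be scripted. The following is a rough sketch, not the method used in this handout: it assumes the RCSB file server offers entries at https://files.rcsb.org/download/<ID>.pdb, and the function names are invented for this example. Nothing is downloaded until you call download_pdb yourself.

```python
from urllib.request import urlretrieve


def pdb_download_url(pdb_id):
    """Build the download URL for a PDB ID (URL pattern is an assumption)."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


def download_pdb(pdb_id, dest=None):
    """Save the PDB file to disk and return the local file name."""
    dest = dest or f"{pdb_id.upper()}.pdb"
    urlretrieve(pdb_download_url(pdb_id), dest)
    return dest


def count_atom_records(lines):
    """Count coordinate lines (ATOM and HETATM) in an iterable of lines."""
    return sum(1 for line in lines if line.startswith(("ATOM  ", "HETATM")))


# Example use (needs an internet connection, so it is left commented out):
# path = download_pdb("1H97")
# with open(path) as handle:
#     print(count_atom_records(handle))
```

Counting the ATOM and HETATM lines is a quick way to confirm that the file you saved really contains coordinates before you open it in a display program.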
