[Figure: annotated PDB coordinate records. Labeled fields: branch indicator, residue type, chain identifier, residue number, Z coordinate values, occupancy, temperature factor (B-factor). Record types shown: ATOM, HETATM, TER, HELIX, SHEET, SSBOND.]
Example: 1gcn
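The fields labeled on this slide sit in fixed columns of each ATOM or HETATM line, so they can be read by slicing the line rather than splitting on spaces. The following is a minimal sketch, assuming the standard column layout of the PDB format; the function name `parse_atom_record` and the sample values are hypothetical and are not copied from 1gcn.

```python
def parse_atom_record(line):
    """Split one ATOM/HETATM line into named fields using fixed columns."""
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        return None  # TER, HELIX, SHEET, SSBOND, ... use other layouts
    line = line.ljust(80)  # pad so that short lines can still be sliced
    return {
        "record": record,
        "serial": int(line[6:11]),         # columns 7-11
        "atom_name": line[12:16].strip(),  # columns 13-16
        "res_name": line[17:20].strip(),   # columns 18-20 (residue type)
        "chain_id": line[21],              # column 22
        "res_seq": int(line[22:26]),       # columns 23-26 (residue number)
        "x": float(line[30:38]),           # columns 31-38
        "y": float(line[38:46]),           # columns 39-46
        "z": float(line[46:54]),           # columns 47-54
        "occupancy": float(line[54:60]),   # columns 55-60
        "b_factor": float(line[60:66]),    # columns 61-66
    }


# Illustrative line assembled field by field (values are made up).
sample = (
    "ATOM  "      # columns 1-6, record name
    "    1"       # columns 7-11, atom serial number
    " "           # column 12, unused
    " N  "        # columns 13-16, atom name
    " "           # column 17, alternate location
    "ALA"         # columns 18-20, residue type
    " "           # column 21, unused
    "A"           # column 22, chain identifier
    "   1"        # columns 23-26, residue number
    "    "        # columns 27-30, insertion code and padding
    "  11.104"    # X
    "   6.134"    # Y
    "  -6.504"    # Z
    "  1.00"      # occupancy
    " 25.00"      # temperature factor (B-factor)
)
atom = parse_atom_record(sample)
print(atom["atom_name"], atom["res_name"], atom["chain_id"], atom["res_seq"])
```

Slicing by column matters because neighbouring fields can touch with no space between them, for example when a coordinate is large and negative.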
• Hydrogen atom records follow the records of all other atoms of a particular residue.
• A hydrogen atom name starts with H. The next part of the name is based on the name of the
connected nonhydrogen atom.
• For example, in amino acid residues, H is followed by the remoteness indicator (if any) of the
connected atom, followed by the branch indicator (if any) of the connected atom.
• If more than one hydrogen is connected to the same atom, an additional digit is appended so
that each hydrogen atom will have a unique name.
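The naming rule above can be sketched in a few lines. This is an illustration only: the function name `hydrogen_names` is hypothetical, it assumes the heavy atom has a one-letter element symbol (C, N, O, S, as in standard amino acids), and it assumes the appended digits start at 1. Different versions of the PDB format number the hydrogens differently, so check real files before relying on the digits.

```python
def hydrogen_names(heavy_atom_name, n_hydrogens):
    """Derive hydrogen names from the heavy atom they are bonded to."""
    # Drop the one-letter element symbol; what remains is the remoteness
    # indicator plus the branch indicator, e.g. "CG1" -> "G1", "N" -> "".
    suffix = heavy_atom_name.strip()[1:]
    base = "H" + suffix
    if n_hydrogens == 1:
        return [base]
    # Assumption: digits are appended starting from 1 to keep names unique.
    return [base + str(i) for i in range(1, n_hydrogens + 1)]


print(hydrogen_names("CA", 1))   # ['HA']
print(hydrogen_names("CG1", 3))  # ['HG11', 'HG12', 'HG13']
print(hydrogen_names("N", 1))    # ['H']
```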
Common Errors in PDB Format Files
• Spurious Long Bonds
• Missing TER cards: either a TER card or a change in the chain ID is needed to mark the
end of a chain
• Improper use of ATOM records instead of HETATM records
• Misaligned Atom Names
– PDB records are column-based, so an atom name placed in the wrong columns can be
misread by programs
• Duplicate Atom Names
– Failure to uniquely name all atoms within a given residue
• Residues Out of Sequence
– For example, the second residue in the file is erroneously numbered
• Common Typos
– Sometimes the lowercase letter l is accidentally substituted for the number 1
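Two of the errors listed above, duplicate atom names and residues out of sequence, are easy to screen for automatically. The sketch below is a rough check under stated assumptions: the function names and the `atom_line` helper are hypothetical, the file is assumed to hold a single model, and residue numbers are assumed to increase along each chain (real entries can legitimately break this).

```python
from collections import Counter


def find_duplicate_atom_names(lines):
    """Return keys of atoms whose name repeats within one residue."""
    seen = Counter()
    for line in lines:
        if line[0:6].strip() in ("ATOM", "HETATM"):
            padded = line.ljust(27)
            # chain, residue number + insertion code, residue type,
            # atom name (raw columns), alternate location
            key = (padded[21], padded[22:27], padded[17:20],
                   padded[12:16], padded[16])
            seen[key] += 1
    return [key for key, count in seen.items() if count > 1]


def find_out_of_sequence(lines):
    """Return (chain, previous_number, number) where numbering goes backwards."""
    last = {}
    problems = []
    for line in lines:
        if line[0:6].strip() != "ATOM":
            continue
        chain, number = line[21], int(line[22:26])
        if chain in last and number < last[chain]:
            problems.append((chain, last[chain], number))
        last[chain] = number
    return problems


def atom_line(serial, name, res_name, chain, res_seq):
    # Hypothetical helper for building test input; coordinates are dummies.
    return "ATOM  {:>5} {:<4} {:>3} {}{:>4}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}".format(
        serial, name, res_name, chain, res_seq, 0.0, 0.0, 0.0, 1.0, 0.0)


lines = [
    atom_line(1, " N  ", "GLY", "A", 1),
    atom_line(2, " CA ", "GLY", "A", 1),
    atom_line(3, " CA ", "GLY", "A", 1),  # duplicate name in residue 1
    atom_line(4, " N  ", "ALA", "A", 3),
    atom_line(5, " N  ", "SER", "A", 2),  # numbering goes backwards
]
duplicates = find_duplicate_atom_names(lines)
backwards = find_out_of_sequence(lines)
print(duplicates)
print(backwards)
```

The atom name is compared with its column alignment intact, so a misaligned name is treated as a different name rather than silently matched.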
Missing Coordinates and Biological Assemblies
• Due to the limitations of structure determination methods,
most entries do not include coordinates for every single
atom in the identified molecule.
• In some cases, the experimental method may not observe
certain atoms. For example, flexible regions and hydrogen
atoms are usually not resolved in X-ray crystallographic
experiments and are therefore not included in the PDB
coordinate files.
• A few of the common situations you might encounter are:
– Asymmetric and Biological Assemblies (PDB ID: 1hho)
– Alpha-Carbon Coordinate Files (PDB ID: 1f6g)
– Missing Loops and Tails (PDB ID: 1az5)
– Fragments and Domains (PDB ID: 2a7u)
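Two of the situations above, alpha-carbon-only files and missing loops, leave visible traces in the ATOM records. The following sketch (hypothetical function name `summarize_coordinates`) reports whether every ATOM record is a CA atom and where residue numbering jumps within a chain. A jump in numbering is only a hint, since numbering schemes vary; the REMARK 465 records of an entry list the residues that the depositors marked as missing.

```python
def summarize_coordinates(lines):
    """Report CA-only content and gaps in residue numbering per chain."""
    atom_names = set()
    residues = {}  # chain identifier -> set of residue numbers seen
    for line in lines:
        if line[0:6].strip() != "ATOM":
            continue
        atom_names.add(line[12:16].strip())
        residues.setdefault(line[21], set()).add(int(line[22:26]))
    gaps = []
    for chain, numbers in residues.items():
        ordered = sorted(numbers)
        for previous, current in zip(ordered, ordered[1:]):
            if current - previous > 1:
                gaps.append((chain, previous, current))
    return {"ca_only": atom_names == {"CA"}, "gaps": gaps}


# Made-up input: CA atoms for residues 1, 2, 3, 7, 8 of chain A.
template = "ATOM  {:>5}  CA  ALA A{:>4}    {:8.3f}{:8.3f}{:8.3f}"
lines = [template.format(serial, number, 0.0, 0.0, 0.0)
         for serial, number in enumerate([1, 2, 3, 7, 8], start=1)]
summary = summarize_coordinates(lines)
print(summary)
```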
Exercise: Understanding PDB Files
• Go to www.rcsb.org (or search for "PDB" in Google)
• Search and download 1gcn
• Search and download 3hhb
• Search and download 1vm3
• Do not double-click the file to open it; instead, right-click the file and
choose the ‘Open with’ option.
• Choose the program ‘WordPad’ to open the file as plain text.
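The same exercise can be done from a script instead of WordPad. The sketch below is one possible approach, not part of the original exercise: the download URL pattern is an assumption about the RCSB file server and should be checked against the site, and all function names are hypothetical. Counting the record names in columns 1-6 gives a quick overview of what a file contains.

```python
from collections import Counter
from urllib.request import urlopen


def pdb_url(pdb_id):
    # Assumed download location; verify on www.rcsb.org if it does not work.
    return "https://files.rcsb.org/download/{}.pdb".format(pdb_id.upper())


def fetch_pdb_text(pdb_id):
    """Download one entry as text (needs network access)."""
    with urlopen(pdb_url(pdb_id)) as response:
        return response.read().decode("ascii", errors="replace")


def count_record_types(text):
    """Count lines by the record name stored in columns 1-6."""
    return Counter(line[0:6].strip()
                   for line in text.splitlines() if line.strip())


# text = fetch_pdb_text("1gcn")  # uncomment when network access is available
text = "\n".join([
    "HEADER    EXAMPLE",           # made-up stand-in for a real file
    "ATOM      1  N   ALA A   1",
    "ATOM      2  CA  ALA A   1",
    "TER",
    "END",
])
counts = count_record_types(text)
print(counts)
```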
Alternate version of the exercise
• Go to www.rcsb.org (or search for "PDB" in Google)
• Search 1gcn
• Search 3hhb
• Search 1vm3
• Click ‘Display Files’
• Explore ‘PDB Format’
Need more info?
• Check the following link:
• Introduction to PDB Data
• http://pdb101.rcsb.org/learn/guide-to-understanding-pdb-data/introduction
Ramachandran Plot
• A way of plotting protein torsion angles was introduced by Ramachandran
and co-authors and was subsequently named the Ramachandran plot.
• The Ramachandran plot provides an easy way to view the distribution of
torsion angles in a protein structure.
• The two torsion angles describe the rotations of the polypeptide backbone
around the bonds between N-Cα (called Phi, φ) and Cα-C (called Psi, ψ).
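Each of these angles is a dihedral angle defined by four consecutive backbone atoms: φ by C(i-1), N(i), Cα(i), C(i) and ψ by N(i), Cα(i), C(i), N(i+1). The following is a minimal sketch of the calculation in plain Python, using one common vector formulation; the function names are hypothetical and the test points are made-up coordinates, not atoms from a real structure.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _scale(a, factor):
    return (a[0] * factor, a[1] * factor, a[2] * factor)


def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees about the p1-p2 bond, in (-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b1 = _scale(b1, 1.0 / math.sqrt(_dot(b1, b1)))  # unit vector along bond
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, _scale(b1, _dot(b0, b1)))
    w = _sub(b2, _scale(b1, _dot(b2, b1)))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    return dihedral(n, ca, c, n_next)


# Made-up points: the last atom is rotated a quarter turn about the z axis.
angle = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                 (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
print(round(angle, 1))
```

Using `atan2` instead of `acos` keeps the sign of the angle, which is what separates, for example, right-handed from left-handed helical regions on the plot.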
• Torsion angles are among the most important
local structural parameters that control protein
folding. Essentially, if we had a way to
predict the Ramachandran angles for a particular
protein, we would be able to predict its fold.
• The torsion angles phi and psi provide the
flexibility required for the polypeptide backbone
to adopt a certain fold, since the third possible
torsion angle within the protein backbone (called
omega, ω) describes the essentially planar peptide
bond and is fixed at about 180 degrees.
• The horizontal axis shows φ values, while the vertical axis shows ψ
values.
• Notice that the counting starts in the lower left-hand corner at -180
and extends to +180 for both the vertical and horizontal axes.
• Each dot on the plot shows the angles for one amino acid residue.
• This allows clear distinction of the characteristic regions of α-
helices and β-sheets.
• The regions on the plot with the highest density of dots are the
so-called “allowed” regions, also called low-energy regions.
• Some values of φ and ψ are forbidden since the involved atoms
would come too close to each other, resulting in a steric clash.
• For a high-quality, high-resolution experimental structure,
the generously allowed and disallowed regions are usually
empty or almost empty: very few amino acid residues in
proteins have their torsion angles within these regions.
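The "density of dots" described above can be made concrete by counting (φ, ψ) pairs on a grid that runs from -180 to +180 on both axes. This is a sketch only: the function name `ramachandran_grid`, the 10-degree bin size, and the sample angles are illustrative assumptions, and the bin size is assumed to divide 360 evenly.

```python
def ramachandran_grid(angles, bin_size=10):
    """Count (phi, psi) pairs per grid cell; row 0 is psi = -180."""
    n_bins = 360 // bin_size
    grid = [[0] * n_bins for _ in range(n_bins)]
    for phi, psi in angles:
        # Shift from [-180, 180] to [0, 360]; fold +180 into the last bin.
        col = min(int((phi + 180) // bin_size), n_bins - 1)
        row = min(int((psi + 180) // bin_size), n_bins - 1)
        grid[row][col] += 1
    return grid


# Made-up angles: two helix-like residues and one strand-like residue.
angles = [(-60.0, -45.0), (-57.0, -47.0), (-120.0, 130.0)]
grid = ramachandran_grid(angles)
print(grid[13][12], grid[31][6])
```

Cells with high counts correspond to the densely populated regions of the plot; empty cells across a large, well-refined data set correspond to conformations that are rarely or never observed.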
• There are sometimes exceptions to this rule: such values can be
found, and they will most probably result in some strain in the polypeptide
chain.
• In such cases additional interactions will be present to stabilize such
structures. They may have functional significance and may be conserved
within a protein family.
• Another exception to the principle is the torsion angle distribution for one
particular residue, glycine.
• Glycine does not have a side chain, which allows high flexibility in the
polypeptide chain, making otherwise forbidden rotation angles accessible.
• That is why glycine is often found in loop regions, where the polypeptide
chain needs to make a sharp turn.
• This is also the reason for the high conservation of glycine residues in
protein families, since the presence of turns at certain positions is
characteristic of a particular fold.
• Another residue with special properties is proline, which, in contrast to
glycine, restricts the torsion angles to values very close to those of an
extended β-strand.
• Proline is often found at the ends of helices and functions as a “helix
disruptor”.
Structure Quality Assessment
• When a protein X-ray structure was not properly
refined, and especially for bad or wrong homology models,
we may find torsion angles in disallowed regions of the
Ramachandran plot. Deviations of this type usually
indicate problems with the structure.
• For this reason, the Ramachandran plot is commonly used to
assess the quality of experimental structures and
homology models.
• Torsion angles outside the low-energy regions, whenever
observed, should be carefully examined.
• They may indicate problems in the structure, but they may
also be genuine and may provide some interesting insights into
the function of the protein.
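A first-pass screen for residues that deserve this closer look can be sketched as follows. The rectangular boxes are crude illustrative assumptions standing in for the low-energy regions; they are not the contours used by PROCHECK or RAMPAGE, and the names `ASSUMED_REGIONS` and `flag_outliers` are hypothetical. Glycine and proline are skipped because, as noted earlier, their torsion angle distributions differ from those of other residues.

```python
# Assumed boxes in degrees: (phi_min, phi_max, psi_min, psi_max).
# These are rough stand-ins, not validated region boundaries.
ASSUMED_REGIONS = {
    "alpha": (-100, -30, -80, -5),
    "beta": (-180, -45, 90, 180),
}


def flag_outliers(residues, regions=ASSUMED_REGIONS, skip=("GLY", "PRO")):
    """Return (name, phi, psi) entries that fall outside every box."""
    outliers = []
    for name, phi, psi in residues:
        if name in skip:
            continue  # these residues need their own region definitions
        inside = any(
            phi_min <= phi <= phi_max and psi_min <= psi <= psi_max
            for phi_min, phi_max, psi_min, psi_max in regions.values()
        )
        if not inside:
            outliers.append((name, phi, psi))
    return outliers


# Made-up residues for illustration.
residues = [
    ("ALA", -60, -45),   # helix-like
    ("VAL", -120, 130),  # strand-like
    ("GLY", 80, 0),      # skipped
    ("SER", 60, -150),   # outside both assumed boxes
]
flagged = flag_outliers(residues)
print(flagged)
```

A flagged residue is a prompt for inspection, not a verdict: it may point to a modelling problem or to a genuinely strained, functionally relevant conformation.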
Figure: Red indicates the low-energy regions, yellow the allowed regions,
pale yellow the so-called generously allowed regions, and white the
disallowed regions. A: good structure, B: bad structure.
Exercise – Ramachandran Plot
RAMPAGE can be accessed from
http://mordred.bioc.cam.ac.uk/~rapper/rampage.php
Procheck can be accessed through PDBSum
http://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/pdbsum/GetPage.pl?pdbcode=index.html
Upload a PDB file by clicking the ‘Browse’ button.
Provide an email address to receive the results by email.