RACHEL™ Manual
SYBYL®-X 2.1
Mid 2013
1699 South Hanley Rd.
St. Louis, MO 63144-2917
Phone: +1.314.647.1099
Fax: +1.314.647.9241
http://www.certara.com
LEGAL NOTICE
SYBYL and related Tripos modules © 1991-2013 Certara, L.P. All Rights Reserved.
Benchware and related Tripos modules © 2005-2013 Certara, L.P. All Rights Reserved.
Almond © 2003-2013 Molecular Discovery Ltd. All Rights Reserved.
AMPAC © 1997-2013 Semichem. All Rights Reserved.
AMM-2001 module in AMPAC version 8.16.5 © 2001 Regents of the University of Minnesota. All Rights Reserved.
Concord, Confort, CombiLibMaker, DiverseSolutions, ProtoPlex and StereoPlex © 1987-2001 University of Texas at
Austin. All Rights Reserved.
FlexX © 1993-2011 BioSolveIT. All Rights Reserved.
FUGUE, JOY, HOMSTRAD, ORCHESTRAR © 2012 Cambridge University Technical Services, Cambridge,
England. All Rights Reserved.
RACHEL © 2002-2012 Drug Design Methodologies.
Surflex, Surflex-Dock, and Surflex-Sim © 1998-2012 BioPharmics LLC. All Rights Reserved.
VolSurf and Almond © 2001-2012 Molecular Discovery Ltd. All Rights Reserved.
Portions copyright 1992-2012 FairCom Corporation. All Rights Reserved.
This material contains confidential and proprietary information of Certara, L.P. and third parties furnished under the
Tripos Software License Agreement. This material may be copied only as necessary for a Licensee’s internal use
consistent with the Agreement. The allowed use includes printing of hardcopy versions hereof as minimally necessary
for Licensee’s internal use. Neither Certara, L.P., nor any person acting on its behalf, makes any warranty or
representation, expressed or implied, with respect to the accuracy, completeness, or usefulness of the material
contained in this manual or in the corresponding electronic documentation, nor in the programs or data described
herein. Certara, L.P. assumes no responsibility nor liability with respect to the use of this manual, any materials
contained herein, or programs described herein, or for any damages resulting from the use of any of the above. Except
for printing of hardcopy versions as stated, no part of this manual may be reproduced in any form or by any means
without permission in writing from Tripos (DE), Inc., 1699 South Hanley Road, Suite 200, St. Louis, Missouri
63144-2917, USA (314-647-1099).
Selected software programs for methodologies contained or documented herein are covered by one or more of the
following patents: AllChem: US 7,860,657; Comparative Molecular Field Analysis (CoMFA): US 5,025,388; US
5,307,287; US 5,751,605; AT E150883; BE 0592421; CH 0592421; DE 691 25 300 T2; FR 0592421; GB 0592421;
IT 0592421; NL 0592421; SE 0592421. HQSAR: US 6,208,942. Embedded NLM: US 6,675,103. Topomers: US
6,185,506; US 6,240,374; US 7,184,893; US 7,212,951. TopCoMFA: US 7,329,222. DBTop: US 7,330,793. OptiSim:
US 6,535,819. Surflex software programs for chemical analysis by morphological similarity: US 6,470,305 B1.
SYBYL, UNITY, CoMFA, CombiFlexX, Concord, DiverseSolutions, GALAHAD, LeapFrog, OptDesign, StereoPlex,
and Alchemy are registered trademarks of Certara, L.P.
AUSPYX, Benchware, CScore, DISCOtech, Distill, GASP, HQSAR, Legion, MOLCAD, Molecular Spreadsheet,
Muse, OptiDock, OptiSim, Pantheon, ProTable, ProtoPlex, Selector, SiteID, Topomer CoMFA, Topomer Search,
Tuplets, and Tripos Bookshelf are trademarks of Certara, L.P.
RACHEL is a trademark of Drug Design Methodologies.
Surflex, Surflex-Dock, and Surflex-Sim are trademarks of BioPharmics LLC.
“FairCom” and “c-tree Plus” are trademarks of FairCom Corporation and are registered in the United States and other
countries.
All other trademarks are the sole property of their respective owners.
RACHEL Table of Contents
1. Introduction to RACHEL
1.1 What is New with RACHEL
1.2 License Requirements for RACHEL
2. RACHEL Tutorials
2.1 Create a RACHEL Project
2.2 RACHEL Scoring Functions
2.3 Run a RACHEL Combinatorial Search
2.4 Using Chemical Templates and Descriptors
2.5 Scaffold Replacement Using CHARLIE
2.6 Bridge Generation Using CHARLIE
2.7 Create a RACHEL Component Database
3. RACHEL Graphical Interface
3.1 RACHEL Main Dialog
3.2 RACHEL Search Parameters
3.3 RACHEL Chemical Descriptors
3.4 RACHEL Scoring Function Parameters
3.5 RACHEL Component Database
3.6 RACHEL Search Status
3.7 RACHEL Utilities
4. RACHEL Theory
4.1 Extracting Building Blocks from Corporate Databases
4.2 Intelligent Component Selection System
4.3 Development of a Component Specification Language
4.4 Novel Techniques to Estimate Ligand-Receptor Binding
1. Introduction to RACHEL
RACHEL (Real-time Automated Combinatorial Heuristic Enhancement of Lead
compounds) is an all-purpose chemical design application designed to
combinatorially derivatize a lead compound to improve ligand-receptor binding
and accelerate drug discovery.
CHARLIE (Combinatorial Heuristic ARrangement of LInker Elements) is
proficient in building bridges between structures and is useful when
substructures that tightly bind different regions of the active site are present.
CHARLIE is designed to link these complementary fragments together to
generate a complete ligand.
RACHEL encompasses numerous features designed to work cohesively in this
endeavor. They include:
• RACHEL - combinatorial derivatization of user-specified sites to
enhance lead refinement.
• CHARLIE - splicing module designed to link several lead compound
fragments.
• Component database generation, registration, and management.
• Automated generation of scoring functions based on user-supplied ligand
and receptor data.
• Advanced conformational search engine capable of sampling nearly
1 × 10^6 conformers per second.
• User-defined Markush-like chemical descriptors to direct the generation
of structures.
• Heuristic component selection and diversity metrics to accelerate the
generation of solutions while ensuring maximum component diversity.
RACHEL and CHARLIE were developed by Chris M.W. Ho, M.D., Ph.D. of
Drug Design Methodologies, LLC.
1.1 What is New with RACHEL

SYBYL-X 2.1 runs RACHEL 3.10 from Drug Design Methodologies on Linux
and Windows platforms.
With this version of RACHEL the generated ligands do not penetrate the space
occupied by the protein. [SYBYL-X 1.2]

1.2 License Requirements for RACHEL

RACHEL requires 3D ligand structures. These may be obtained by using
Concord.
SYBYL-X Suite Licensing
SYBYL-X introduced a simplified licensing scheme in which the “SYBYL”
license provides access to Concord.
RACHEL requires a separate “RACHEL” license.
Module-Based Licensing
SYBYL continues to run with a license file issued before the SYBYL-X release.
In that context:
• RACHEL requires a “RACHEL” license.
• Concord requires a “ConcordStandalone” license.

2. RACHEL Tutorials
Prerequisite to all RACHEL and CHARLIE Tutorials
1. It is always a good idea to clear the screen and reset the display before starting.
! > Delete Everything
! Click to reset all rotations and translations.
2. Make a local copy of the RACHEL demo files and give yourself write
permission for all the copied files.
! Type cmd cp -r $TA_DEMO/rachel . (Include the space and the
period.)
! Type cmd chmod -R a+w rachel
Conventions:
• The rachel directory mentioned in the tutorials refers to the location
where the RACHEL demonstration files have been copied.
• The instructions in the tutorials assume that rachel is a sub-directory of
your current location, which you can set via Options > Set > Default
Directory.
• Differences in the rounding of floating point numbers on different
platforms will produce slightly different results. All numbers reported in
this tutorial were captured on Windows.
Suite of RACHEL and CHARLIE Tutorials

• Create a RACHEL Project on page 8
• RACHEL Scoring Functions on page 14
• Run a RACHEL Combinatorial Search on page 23
• Using Chemical Templates and Descriptors on page 28
• Scaffold Replacement Using CHARLIE on page 47
• Bridge Generation Using CHARLIE on page 61
• Create a RACHEL Component Database on page 69

2.1 Create a RACHEL Project

RACHEL is designed to optimize user-defined portions of a weak-binding lead
compound with the active site of its target receptor. RACHEL requires an
anchor bond to which components will be attached. Potential complementary
fragments are then selected from the RACHEL database, attached to the anchor
bond, and conformationally searched within the receptor cavity to optimize
steric and electrostatic forces.
In this exercise, you will perform virtual combinatorial chemistry to explore
chemical alternatives for an arginine residue on a known peptidic inhibitor of
alpha-thrombin (PDB code 1dwe). [Banner, D.W., and P. Hadvary, J. Biol.
Chem. 1991, 266: 20085].
2.1.1 Define a New RACHEL Project

1. Make sure that you have all the necessary files. These are the same as those
used in the other RACHEL tutorials.
! If you have not yet copied the RACHEL demo files, see Prerequisite
to all RACHEL and CHARLIE Tutorials on page 7.
2. Start defining a new project.
! Applications > RACHEL
! At the top of the RACHEL dialog, press Create New Project.
The RACHEL - Setup New Project dialog appears (dialog description on
page 77).
3. Enter the name of the new project.

! Press the Project [...] button.
The project directory dialog box will then appear.
! Navigate to the rachel directory.
! Press New.
! Append the name of the new project, tutorial, to the directory
name in the adjacent field and press OK.
The project name is listed on the Project line in the RACHEL - Setup New
Project dialog.

4. Select the ligand.
! Press the Ligand [...] button.
! Navigate to the rachel/CMPDS directory.
! Select key.mol2 and press OK.
5. Select the receptor, the trimmed active site of alpha-thrombin.
! Press the Receptor [...] button.
! Select lock.mol2 and press OK.
Upon completion, the RACHEL - Setup New Project dialog will resemble the
following:
2.1.2 Designate the RACHEL Anchor Bond

RACHEL allows you to optimize combinatorially up to five different sites on
the lead compound simultaneously. For each site, an anchor bond must be
specified. It is this anchor bond to which chemical components will be linked
and conformationally searched. The target atom is the direction of growth of the
derivative compounds.
6. Start the setup process for RACHEL.
! In the RACHEL - Setup New Project dialog, press Setup RACHEL.
The SYBYL window displays the ligand-receptor complex.

• In M1, the ligand colored by atom type and with labeled atom IDs.
• In M2, the active site of alpha-thrombin colored purple.

7. Look at the ligand.
! Temporarily undisplay (Mol Vis off) the receptor structure (lock) in M2.
8. Designate the anchor bond in the ligand. Note: The order in which the atoms are
selected is important.
The following dialog appears, prompting you to select the first atom in the
anchor bond.
An anchor bond defines an optimization site. It is to this anchor bond that
chemical components are linked and their conformations explored. The order in
which the anchor atoms are chosen dictates which region will be optimized and
which region will remain static.
In this tutorial, you will determine what other chemical components might bind
in place of the arginine sidechain of the tripeptide ligand.
! In the SYBYL window, rotate and scale the molecules until you can
clearly see atoms N19 and C20 of the ligand.
! Click ligand atom N19 or type 19 in the dialog prompting you for the
first anchor atom.
A sphere of green dots marks the selected atom. A second sphere will mark
the next atom once you select it.
! Click ligand atom C20 or type 20 in the dialog prompting you for the
second anchor atom.
You have designated the bond from N19 to C20 as the anchor bond of the
optimization site. The amide bond and terminal methyl groups are colored green
to indicate that this region will be replaced by the combinatorial addition of
chemical components.
Usage Note: If you accidentally selected the wrong bond, press Cancel in the
next dialog (atom selection for the target area) and restart the anchor bond
selection.
2.1.3 Designate the RACHEL Target Atom

Once you have selected the anchor bond, you must designate where you would
like to direct the growth of the derivative components. In this case, that portion
of the ligand is fully surrounded by the receptor, forming a pocket. In your own
work this might not be so apparent, especially if the receptor does not fully
enclose the active site.
Thus, you must specify a target atom that will be used to:
• direct the growth of the derivative structure;
• focus the conformational search to increase search efficiency;
• designate the approximate length to which derivative growth will occur.
9. To make the selection of the target atom in the receptor easier, label the residues
by substructure name.
! Re-display (Mol Vis on) the receptor structure (lock) in M2.
! Set the Atom Labels for M2 to Substructure.
10. Select the target atom in ASP189.
! Click the terminal carboxylate carbon of the ASP189 sidechain (atom
ID 312) or type m2(312) in the dialog prompting you for the target
atom.
The selected receptor atom is labeled SITE_1, designating it as the target for
ligand growth.
Notes:
• You can choose a receptor or a ligand atom to serve as the target atom.
Often, a receptor atom is either too far away or located in the wrong
position. In this tutorial, you could have designated one of the terminal
atoms of the arginine sidechain (in green) in the ligand to serve as the
target.
• You can also specify up to five ligand sites to be optimized
simultaneously. However, the search engine is limited to 10 rotatable bonds
spread out over the total number of sites defined. Thus, if two sites do
not influence one another (the structures generated for one site do not
enter into contact with those generated for the other), it is better to
conduct two separate searches (each with one site defined) as the search
engine will be able to conduct a more detailed conformational search.

2.1.4 Terminate the RACHEL Setup and Review the Project

11. The RACHEL setup is complete: only one site will be defined for this tutorial.
! Press End in the dialog prompting you to select a site 2 anchor atom.
12. Review the RACHEL project as defined so far.
The RACHEL dialog now includes information about the project.
For each defined site the following information is listed:

• Site: 1 = The site’s ID number.
• Anchor = 19-20: The ligand atoms defining the anchor bond. The first
atom will remain fixed, the second atom determines the region that will
be optimized.
• -3.53, 8.20, 2.41 = The coordinates of the target atom.
The project directory contains all the files necessary to perform the virtual
combinatorial chemistry experiment.
• key.mol2 and lock.mol2 = the ligand and receptor files were copied to
the project directory.
• Rachel_setup = the information that is displayed in the dialog plus the
active site definition
• Rachel_builddef = Stores RACHEL chemical descriptors
• Rachel_scoredef = Stores RACHEL scoring function
• Rachel_searchdef = Stores conformational search engine parameters
Note: Do not move the project directory once you have created it. If you need
the project in a different location, delete this directory and the files within it,
then regenerate the project using the RACHEL setup process detailed above.
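The file list above lends itself to a quick sanity check of a project directory. The following is only an illustrative Python sketch, not part of RACHEL or SYBYL: the function name missing_project_files is hypothetical, and the file names are simply those listed in this section.

```python
from pathlib import Path

# File names as listed in this tutorial for a project directory.
EXPECTED_FILES = (
    "key.mol2",
    "lock.mol2",
    "Rachel_setup",
    "Rachel_builddef",
    "Rachel_scoredef",
    "Rachel_searchdef",
)


def missing_project_files(project_dir):
    """Return the expected project files that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling missing_project_files("rachel/tutorial") from the directory that contains rachel would return an empty list if every file named above is present. Your ligand and receptor files may carry different names in your own projects.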

2.2 RACHEL Scoring Functions

RACHEL utilizes local scoring functions rather than global scoring functions.
This means that RACHEL can study a set of existing ligand-receptor complexes
(if available) and deduce a scoring function based upon the specific chemical
interactions that influence their binding using partial least squares analysis
(PLS). Thus, RACHEL can exploit your proprietary structural data to generate a
more powerful scoring function.
If you do not have ligand-receptor structures to derive either a scoring function
or a target function, RACHEL implements a generalized scoring function
derived from the VALIDATE training set developed by Richard Head et al.
[J. Am. Chem. Soc. 1996, 118, 3959-3969].
2.2.1 The Training Set for the Scoring Function

The directory rachel/TSET contains the 3D structures of 9 HIV-1 protease
active sites and co-crystallized inhibitors. These have a wide range of affinities
and will be used as the training set.
Note: RACHEL uses SYBYL’s atom definitions to calculate van der Waals
complementarity and strain. For that reason, we recommend that you use the
Tripos force field if you want to minimize the structures when you use your
own data to train the RACHEL scoring function.
14. List the contents of the TSET directory

! Type cmd ls rachel/TSET
1a30_key1.mol2
1a30_lock.mol2
1aaq_key1.mol2
1aaq_lock.mol2
1dmp_key1.mol2
1dmp_lock.mol2
1hsg_key1.mol2
1hsg_lock.mol2
1hvi_key1.mol2
1hvi_lock.mol2
1hvr_key1.mol2
1hvr_lock.mol2
4hvp_key1.mol2
4hvp_lock.mol2
4phv_key1.mol2
4phv_lock.mol2
5hvp_key1.mol2
5hvp_lock.mol2
PLS_hiv.txt
The ligand (key) and receptor (lock) structures for each complex are stored in
separate files. Thus, 1aaq_key1.mol2 and 1aaq_lock.mol2 are a matched
pair.
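When you assemble your own training set, it can help to confirm that every ligand file has a matching receptor file. The sketch below is a hypothetical helper written in Python, not a RACHEL utility. It assumes the naming pattern seen in rachel/TSET, that is <id>_key1.mol2 for the ligand and <id>_lock.mol2 for the receptor.

```python
def matched_pairs(file_names):
    """Group training-set files into (ligand, receptor) pairs by shared prefix.

    Assumes <id>_key1.mol2 names the ligand and <id>_lock.mol2 the receptor.
    Ligand files without a matching receptor file are left out of the result.
    """
    names = set(file_names)
    suffix = "_key1.mol2"
    pairs = {}
    for name in sorted(names):
        if name.endswith(suffix):
            prefix = name[: -len(suffix)]
            lock = prefix + "_lock.mol2"
            if lock in names:
                pairs[prefix] = (name, lock)
    return pairs
```

For example, passing the output of os.listdir("rachel/TSET") should yield one entry per complex, keyed by its PDB code.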
The text file PLS_hiv.txt contains the information used by RACHEL to identify
the molecules in the training set.
! Type cmd cat rachel/TSET/PLS_hiv.txt
1hvi_key1.mol2 1hvi_lock.mol2 10.50
1dmp_key1.mol2 1dmp_lock.mol2 9.55
1hvr_key1.mol2 1hvr_lock.mol2 9.51
1hsg_key1.mol2 1hsg_lock.mol2 9.42
4phv_key1.mol2 4phv_lock.mol2 9.15
1aaq_key1.mol2 1aaq_lock.mol2 7.93
5hvp_key1.mol2 5hvp_lock.mol2 7.25
4hvp_key1.mol2 4hvp_lock.mol2 6.11
1a30_key1.mol2 1a30_lock.mol2 4.30
PLS_hiv.txt contains one line per matched pair, consisting of the ligand and
receptor file names followed by the binding affinity of the ligand for the
receptor (units = -log Ki).
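The three-column layout described above is simple to read programmatically, for instance to check an activity file before handing it to RACHEL. This is a minimal Python sketch under the assumption that fields are separated by whitespace. The function name parse_activity_file is hypothetical, and whether RACHEL itself tolerates blank lines is not stated here.

```python
def parse_activity_file(text):
    """Parse lines of the form '<ligand file> <receptor file> <-log Ki>'."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
        ligand, receptor, affinity = fields
        records.append((ligand, receptor, float(affinity)))
    return records


# Two lines copied from PLS_hiv.txt as shown above.
sample = """\
1hvi_key1.mol2 1hvi_lock.mol2 10.50
1a30_key1.mol2 1a30_lock.mol2 4.30
"""
records = parse_activity_file(sample)
```

Each record is a (ligand file, receptor file, affinity) tuple, with the affinity converted to a floating point number.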
2.2.2 Generate a Scoring Function

You will generate a scoring function using a series of co-crystallized HIV-1
protease inhibitors extracted from the Protein Data Bank.
15. Access the scoring function.

! On the RACHEL dialog’s Scoring Function line, press Train.
The Choose Function dialog appears (dialog description on page 91).
! Press Generate Scoring function with multiple complexes.
16. Specify a directory containing the training set of ligand-receptor complexes.
! Navigate to your rachel/TSET directory and press OK.
! In the Specify Activity File dialog, select the PLS_hiv.txt file described
above and press OK to continue.

RACHEL begins processing the ligand-receptor complexes as described in
Extracting Building Blocks from Corporate Databases on page 102. Descriptors
are extracted and processed using partial least squares analysis (PLS) to
generate a predictive model.
Progress is reported in a separate window.
Processing training set:
1) 1hvi_key1.mol2
1hvi_lock.mol2
2) 1dmp_key1.mol2
1dmp_lock.mol2
3) 1hvr_key1.mol2
1hvr_lock.mol2
...
If there are enough ligand-receptor complexes of varying binding affinities,
RACHEL will generate a scoring function using PLS analysis of the complexes.
Once the analysis has completed, you will be shown the statistical results in the
form of a table listing the number of PLS components vs. the predictive power
(q²) of the resulting model. You will then be prompted for the number of PLS
components to include in the predictive model.
Comp Q2cum Q2 SSYcum SSY SSXcum SSX
----- ----- ----- ------ ----- ------ -----
1 0.805 0.805 0.878 0.878 0.462 0.462
2 0.785 -0.103 0.893 0.016 0.721 0.258
3 0.580 -0.954 0.901 0.007 0.841 0.120
4 0.351 -0.545 0.902 0.002 0.968 0.127
5 0.258 -0.144 0.909 0.006 0.994 0.026
6 0.289 0.042 0.958 0.049 0.997 0.003
7 0.957 0.939 0.999 0.041 1.000 0.003
8 1.000 1.000 1.000 0.001 1.000 0.000
Q2cum = Predictive power of model.
SSYcum = Explained variance of data in Y.
SSXcum = Explained variance of data in X.
(S)coring function or (T)arget function? (S):
At this point, you have a choice to select either a Scoring function or a Target
function. Judging from the data, the predictive power (Q2cum = 0.805) is
adequate using 1 principal component in the PLS derived model. If the data is
not sufficient to generate a scoring function with adequate predictive power
(Q2cum > 0.50), RACHEL will automatically default to a Target function.
! At the request for “(S)coring function or (T)arget function?” enter S.
! At the request for “How many components to use?” enter 1.
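The choice of one component can be illustrated with a simple parsimony rule: take the fewest components whose Q2cum exceeds the 0.50 threshold mentioned above. The Python sketch below is only an illustration of that reasoning. The rule and the function name smallest_adequate are assumptions made for this example, not a description of how RACHEL decides.

```python
# (number of components, Q2cum) pairs copied from the table above.
Q2CUM = [(1, 0.805), (2, 0.785), (3, 0.580), (4, 0.351),
         (5, 0.258), (6, 0.289), (7, 0.957), (8, 1.000)]


def smallest_adequate(q2cum_rows, threshold=0.50):
    """Return the fewest components whose Q2cum exceeds threshold, or None."""
    for n_components, q2cum in sorted(q2cum_rows):
        if q2cum > threshold:
            return n_components
    return None


choice = smallest_adequate(Q2CUM)
```

With the values in this tutorial the rule selects a single component, which matches the entry made in the step above. Favoring few components is a common precaution with a training set of only nine complexes.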

The normalized regression coefficients for each scoring function descriptor are
listed.
NPO_NPO E_INTER STERIC STRAIN MW NUM_RBD
0.355 0.129 0.007 -0.458 0.246 -0.166
LOGP_EST NPO_FRAC
0.495 0.515
Press =ENTER= to continue
Study the regression coefficients listed above. They are applied to the various
chemical descriptors of ligand-receptor binding in order to calculate (estimate)
binding affinity.
• NPO_NPO = non-polar non-polar ligand receptor interactions.
• E_INTER = electrostatic interaction energy.
• STERIC = steric complementarity.
• STRAIN = steric strain.
• MW = molecular weight.
• NUM_RBD = number of rotatable bonds.
• LOGP_EST = estimate of Log P (ligand).
• NPO_FRAC = non-polar fraction of ligand atoms.
Artifacts in the analysis or in the crystal structure data itself can generate
scoring functions that may produce poor structures. In this case, notice that the
coefficient for E_INTER is positive. Using this scoring function, a ligand would
be penalized for complementary electrostatic interactions with the receptor.
This is because negative electrostatic interaction values mean favorable electro-
static interactions and tighter binding. In essence, ligands with more favorable
electrostatic interactions with the receptor will be scored lower.
(negative electrostatic energy value) × (positive coefficient) = negative score!
There could be several reasons for this.
• The incorrect sign might be a PLS artifact as the statistical technique
attempts to best fit the data.
• The training set is very small, which may significantly affect the PLS
model: a few select compounds may skew the descriptor coefficients even
though the overall predictive power is adequate.
• Most importantly, crystal structures were used to generate the scoring
function. The very act of crystallization is a very restrictive filter in its
own right. In essence, electrostatic complementarity must be present for
ligand and receptor to interact. However, variations in the electrostatic
energy may not be significant enough to generate measurable differences
in signal.
The solution is either to flip the sign on the E_INTER (electrostatic interaction)
term by editing the scoring function or to implement a target function.
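The effect of the sign can be checked with a line of arithmetic. The Python sketch below treats the E_INTER term as a plain product of coefficient and descriptor value. That is a simplification for illustration only: the coefficients reported above are normalized, and the descriptor value used here is hypothetical.

```python
E_INTER_COEFF = 0.129   # coefficient reported for E_INTER above
e_inter_value = -0.9    # hypothetical favorable (negative) interaction value


def contribution(coefficient, value):
    """Contribution of one descriptor to a linear score: coefficient * value."""
    return coefficient * value


as_trained = contribution(E_INTER_COEFF, e_inter_value)     # lowers the score
sign_flipped = contribution(-E_INTER_COEFF, e_inter_value)  # raises the score
```

With the trained coefficient, a favorable electrostatic interaction lowers the score. With the sign flipped, the same interaction raises it by the same amount.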
The data presented is fairly typical when crystal data is generated for a
particular drug discovery project. In the graph generated for the training set,
notice the cluster of ligands that bind the receptor with high affinity (nM or
better). There are far fewer compounds that exhibit lower binding affinities (µM
or worse). This is often the case as time is rarely spent elucidating coordinates
for poorer binding structures. Thus, the available data may be skewed. This is
unfortunate because the poorer binding ligands may give better insight into
improving the activity of the lead compounds.
The homogeneity of the data often leads to difficulties in generating scoring
functions. The variability in the measured chemical descriptors is not significant
enough to establish trends in the data. When this is the case, RACHEL employs
target functions to determine more accurately the factors that improve ligand
binding.
! Press the Enter key in the text window to close it.
A molecular spreadsheet is created that contains one row for each compound in
the training set. The columns contain the following information:
• CMPD = names of the files containing the ligands in the training set (in
the rachel/TSET directory).
• PREDICTED = binding affinity value predicted by the model.
• OBSERVED = binding affinity value from the file PLS_hiv.txt.
• DIFFERENCE = OBSERVED - PREDICTED
Note: The numbers reported in this tutorial were captured on Windows.

The graph plots the PREDICTED versus OBSERVED values for the 9 rows. It
shows a good correlation between the predicted and observed binding affinities.
At first glance, the numbers indicate that this model is a good scoring function.
However, there may be shortcomings when scoring functions that were derived
from crystal structures are applied to real-world data.
17. Close the spreadsheet. The table file Rachel_obs_vs_pred.tbl is in the
project directory.
! MDE: File > Close
2.2.3 Predict the Binding of an Unknown Ligand

Using the scoring function you have generated, you can also predict the binding
affinity of ligands whose receptor affinity is unknown. To demonstrate this, you
will utilize your scoring function to predict the binding affinities for various
compounds in the training set.
! On the RACHEL dialog’s Scoring Function line, press Predict.
RACHEL prompts you to select the ligand structure for which you wish to
predict the binding affinity. Often you will be iteratively modifying the
structure of a ligand and wish to determine whether changes are improving
receptor affinity or diminishing it. If you do not have the ligand in question in a
molecule area, simply press End in order to load it from a Mol2 file.
! Press End in the dialog prompting you for a ligand to load it from a
Mol2 file.
! Navigate to the rachel/PREDICT directory, select 1hiv_key1.mol2
and press OK.

RACHEL prompts you to select the receptor structure involved in the
binding. If the receptor in question is in a molecule area, simply click it and
press OK.
In this example, you must again load it from a file.
! Press End in the dialog prompting you for a receptor to load it from a
Mol2 file.
! Navigate to the rachel/PREDICT directory, select 1hiv_lock.mol2
and press OK.
In the separate text window:
Processing ligand-receptor complex:
1 /home/nicole/rachel/tutorial//.PLS/key.mol2
/home/nicole/rachel/tutorial//.PLS/lock.mol2
0.8900 Nonpolar-nonpolar energy.
-0.9316 Electro inter.
0.9835 Steric interaction.
0.0826 Steric strain.
8.1039 Molecular wt.
26.0000 Number of rbonds.
6.6440 LogP estimate.
0.8760 Nonpolar fraction.
Score = 9.554105.
This information is also stored in the PREDICT file in the project directory.
RACHEL predicts this ligand-receptor complex to have a binding affinity of
-logKi = 9.55 (about 10^-9 M), which is excellent. This compound has a measured
affinity of -logKi = 9.15. The scoring function model was able to predict the
binding of this compound quite accurately.
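To relate the score to a concentration, recall that the score is expressed as -log Ki, so Ki = 10^(-score). The short Python sketch below performs the conversion for the predicted and measured values quoted above. The function name is hypothetical, and Ki is assumed to be in mol/L, as the text implies.

```python
def ki_from_minus_log_ki(p_ki):
    """Convert -log10(Ki) to Ki in mol/L."""
    return 10.0 ** (-p_ki)


predicted_ki = ki_from_minus_log_ki(9.554105)  # score reported in the text window
measured_ki = ki_from_minus_log_ki(9.15)       # measured affinity quoted above
fold_error = measured_ki / predicted_ki        # ratio of measured to predicted Ki
```

Both values fall in the sub-nanomolar range, and the prediction differs from the measurement by about 0.4 log units.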
2.2.4 Generate a Target Function

To generate an accurate predictive model (q² > 0.50), there must be a sufficient
number (> 10) of 3D structures that contain a wide range (> 3 log units) of
binding affinities. If too few structures are available, or if the range of binding
affinities is too narrow, the resulting scoring function has little predictive value.
Under those circumstances you can use RACHEL to generate a target function
instead.
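The two rules of thumb above (more than 10 structures, more than 3 log units of range) are easy to check before training. The following Python sketch is illustrative only. The function name training_set_summary is hypothetical, and the thresholds are taken from the paragraph above rather than from RACHEL itself.

```python
def training_set_summary(affinities, min_count=10, min_range=3.0):
    """Check a training set against the count and affinity-range guidelines."""
    spread = max(affinities) - min(affinities)
    return {
        "count": len(affinities),
        "range": spread,
        "enough_structures": len(affinities) > min_count,
        "wide_enough": spread > min_range,
    }


# -log Ki values from PLS_hiv.txt in the scoring function tutorial.
hiv_set = [10.50, 9.55, 9.51, 9.42, 9.15, 7.93, 7.25, 6.11, 4.30]
summary = training_set_summary(hiv_set)
```

For the HIV-1 protease set used earlier, the affinity range is wide, but the set contains only nine structures, which falls short of the count guideline. This is consistent with the caution advised when interpreting that scoring function.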
The target function is similar to the scoring function in that the same chemical
descriptors are employed. However, instead of scoring the ligand-receptor
interaction like a force field, the target function stores the ideal values for each
chemical descriptor. Ligand generation is then guided by these target values to
ensure that the derivative compounds mirror the chemical qualities of the
highest affinity compounds while complementing the active site of interest.
The advantage of target functions is that they are not prone to any statistical
artifacts. They allow RACHEL to rapidly generate structures that mirror the
chemical characteristics of the most active ligands. They can also be elucidated
from as few as one ligand-receptor complex. In addition, they are easier to
tweak in order to direct the genesis of new classes of compounds. Using target
functions, structures will be generated that exploit receptor binding
characteristics similar to the training set compounds. Thus, it is recommended
that you use only the best compounds in the training set.
The disadvantage of a target function is that the complex chemical influences
that one descriptor may have on the others are lost because descriptor values are
simply averaged. In addition, extrapolation of the function to other receptor
systems is not possible because all descriptor values are elucidated from the
binding characteristics of an ideal set to a single receptor system.
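The averaging mentioned above can be pictured as follows. This Python sketch computes one mean per descriptor over a set of complexes, which also shows why any interplay between descriptors disappears: each target is computed independently of the others. The function name and the descriptor values are hypothetical, and the exact procedure RACHEL uses is not documented in this section.

```python
def target_values(descriptor_rows):
    """Average each descriptor over a set of ligand-receptor complexes.

    descriptor_rows is a list of dicts mapping descriptor name to value,
    one dict per complex. All dicts are assumed to share the same keys.
    """
    if not descriptor_rows:
        raise ValueError("need at least one complex")
    count = len(descriptor_rows)
    names = list(descriptor_rows[0])
    return {name: sum(row[name] for row in descriptor_rows) / count
            for name in names}


# Hypothetical descriptor values for two high-affinity complexes.
rows = [
    {"STRAIN": 0.10, "NUM_RBD": 20.0},
    {"STRAIN": 0.30, "NUM_RBD": 30.0},
]
targets = target_values(rows)
```

With a single complex, the targets are simply that complex's own descriptor values, which is why a target function can be derived from as few as one ligand-receptor complex.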
You will generate a target function using only the ligand and receptor that you
used to set up this tutorial.
18. Access the scoring function.

! On the RACHEL dialog’s Scoring Function line, press Train.
The Choose Function dialog appears (dialog description on page 91).
! Press Generate Target function with a single complex.
19. Select the ligand.
! Press End in the dialog prompting you for a ligand to load it from a
Mol2 file.
! Navigate to your rachel/CMPDS directory, select key.mol2 and press OK.
20. Select the receptor.
! Press End in the dialog prompting you for a receptor.
! In the rachel/CMPDS directory select lock.mol2 and press OK.
21. A separate text window displays the following:
Too few compounds in training set.
Sample is not statistically valid.
Will implement target function.
Press =Enter= to continue.
This is because you cannot perform a PLS analysis with just one ligand-receptor
complex. RACHEL automatically defaults to a target function when too few
complexes exist or if the predictive power (q²) of the resulting scoring function
is below 0.3.
A message in the console indicates that the target function has been saved in
rachel/tutorial/Rachel_scoredef.
22. Review the scoring function parameters
! On the RACHEL dialog’s Scoring Function line, press Edit.
The last successful training activity determines the values entered in the
RACHEL - Adjust Target Function dialog (dialog description on page 90):
Since you have just completed the generation of a target function, that is what is
presented here. The eight descriptors used by RACHEL to describe ligand-
receptor interactions are listed along with their target values and weighting
factors.
In your own work you may modify any of these parameters in order to stress
one chemical property over another. Initially, you should adjust the weighting
factors of the chemical descriptors in scoring a particular compound. For
example, to stress electrostatic interactions, simply set the corresponding scalar
to 2.0. To diminish the impact of a particular parameter, set the corresponding
scalar to a value below 1.0. The practical range of weighting factors is 0.0–5.0.
You will use this target function to perform combinatorial optimization of a test
compound in the next section of this tutorial (Run a RACHEL Combinatorial
Search on page 23).
! Press Cancel to close the RACHEL - Adjust Target Function dialog.

2.3 Run a RACHEL Combinatorial Search

2.3.1 Define or Reload the RACHEL Project
To run a combinatorial chemistry search with RACHEL you need:
• A RACHEL project directory, a ligand, and a receptor.
• A RACHEL database of components.
• A scoring function.
More often than not, you will be running searches on a project created during a
previous session.
1. If necessary, open the project created in Create a RACHEL Project on page 8.

! In the RACHEL dialog press Open Existing Project.
! Navigate to your rachel/tutorial project directory and press OK.
The information about the RACHEL project is loaded at the top of the RACHEL
dialog.
2. Included with RACHEL is a component database derived from the publicly
available 3D structural database from the National Cancer Institute. This
database contains a wealth of chemical diversity and provides an excellent basic
set of components for combinatorial derivatization.
! In the Database section of the RACHEL dialog press Select.
! Navigate to your rachel/DBASE directory, select nci3d and press
OK.
The RACHEL dialog will resemble the following:

2.3.2 Run the RACHEL Search

3. Start the search.
! Press Run Search.
You must define the storage location for all successfully generated structures
(hits) within the project directory. This allows you to store multiple runs, each
utilizing different parameter setups, in an organized fashion.
! In the file browser that is presented press New.
! Append hits.mdb to the directory name in the adjacent field.
Note: Because all the hits are saved in individual .mol2 files, adding the
extension .mdb to the directory name makes it easier later to review the hits in
a Mol2 database or in a molecular spreadsheet.
! Press OK to start the search.
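Because each hit is written as an individual .mol2 file, the contents of a hits directory can also be inspected outside SYBYL. The following Python sketch lists those files; the function name is hypothetical, and nothing is assumed about how RACHEL names the files beyond the .mol2 extension.

```python
from pathlib import Path

def list_hit_files(hits_dir):
    """Return the .mol2 files found directly inside a hits directory, sorted by name."""
    return sorted(
        p for p in Path(hits_dir).iterdir()
        if p.is_file() and p.suffix.lower() == ".mol2"
    )
```

For this tutorial the call would presumably be list_hit_files("rachel/tutorial/hits.mdb"), assuming the hits directory was created inside the project directory as described above.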

2.3.3 View the Structures Generated During the Search

While RACHEL is searching, you can view the structures that have been
generated so far in order to ascertain whether they are chemically feasible and
desirable.
4. Once a few iterations of structure generation have completed, perform the
following operations:
! Press Status at the bottom of the RACHEL dialog.
! Select the hits.mdb directory and press OK.
The RACHEL - Status dialog appears (dialog description on page 97).

• In the top half of this window, you will find the summary of the entire
project - detailing all the files and parameters used in the current search.
• In the bottom half, the hits that have been generated are listed along with
their relative scores. Your display will vary from what is shown
depending upon the random seed chosen by RACHEL and on the
duration of the search.
The top-scoring ligand is highlighted in the RACHEL - Status dialog. Its
corresponding structure and highest scoring conformation are displayed within the
receptor in the SYBYL window.
In this tutorial you are using a target function. The maximum score for the
target function is 10.0 (see Understand the RACHEL Score Values on page 26).
Thus, as compounds that are generated improve iteratively, their scores will
approach this value.
5. You can now view the other structures that RACHEL has produced so far.
! Press Viewer at the bottom of the RACHEL - Status dialog.
The Viewer dialog is displayed (dialog description on page 98).
The Viewer acts as a remote control, enabling you to quickly cycle through the
hits while permitting full interaction with the structures.
! Press the left and right arrows to move up and down the list of hits;
the double arrows move one “page” at a time.
! Toggle Original Ligand on to display the original ligand for
reference.

Study the hits that RACHEL has generated. You should see a variety of derivative
structures to replace the original ligand region that was specified in the
setup. With the chemical diversity present in the NCI 3D database, many
different replacement derivatives are possible. Because you did not define any
chemical descriptors (the topic of the next section in this tutorial), any chemical
structure is allowed, and some of the hits may be chemically improbable.
6. While you are viewing the structures, RACHEL continues to search for new
hits. At any time, you may load new hits from the project run.
! Press Refresh at the bottom of the RACHEL - Status dialog to reload
the most current hits from the search.
! When you are finished viewing the hits, press [X] in the Viewer dialog
to close it.
2.3.4 Understand the RACHEL Score Values

For any compound generated by RACHEL, the higher the positive score, the
more likely the compound is to bind to the receptor.
Scoring Function Values
The scoring function scores reflect the values (and units) used in the training set
of compounds used for the RACHEL analysis. If you are using the default
scoring function, the scores indicate the ligand binding affinity (-log K). If the
default scoring function is generating high values (>11), it may not be suitable
for the unique characteristics of your receptor-binding site. Try training a
scoring function using your own data, or switch to a target function.
Target Function Values
Target functions have a maximum score of 10.0. Any deviation in any of the
measured scoring function descriptors from the values derived from the training set
compounds subtracts from the maximum score. Thus, the resulting scores
simply relate the characteristics of the ligands to the ideal training set of
compounds. Although the author postulates that a higher score should indicate a
more desired compound, one cannot derive any direct measure of binding from
these numbers.
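As an illustration of how deviations subtract from the maximum, the Python sketch below assumes the simplest possible form: each descriptor contributes its weighting factor times the absolute deviation from its target value. The manual does not give RACHEL's actual formula, so the linear penalty, the function name, and the descriptor names used as dictionary keys are all assumptions.

```python
def target_function_score(values, targets, weights, max_score=10.0):
    """Assumed form: max_score minus the weighted absolute deviations.

    The manual only states that deviations subtract from the maximum score;
    the linear, absolute-value penalty used here is an illustrative guess.
    """
    penalty = 0.0
    for name, target in targets.items():
        # A missing weighting factor is treated as 1.0 (assumption).
        penalty += weights.get(name, 1.0) * abs(values[name] - target)
    return max_score - penalty
```

Under this assumed form a ligand whose descriptors all equal the target values scores exactly 10.0, and raising a weighting factor makes the same deviation cost more.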
Negative Score Values
Early in the search, you may see structures with negative scores. The reason is
that RACHEL implements a distance penalty function in order to generate
compounds that fill the active site region. Otherwise, numerous ligands that
barely fill the target region may be generated. RACHEL penalizes developing
ligands until they reach 66% of the distance between the anchor bond and the
target. After this distance has been reached, scoring is performed as usual.
Compounds with negative scores are deemed pseudo hits. RACHEL does not
discard these pseudo hits because they provide important data about steric and
electrostatic complementarity (or disparity). RACHEL uses this information to
determine heuristically which chemical groups to utilize in succeeding
generations. The number of pseudo hits is displayed in parentheses after the
number of true hits in the RACHEL monitor window.
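The bookkeeping described in this section can be sketched as follows. The function names are hypothetical, the size of the penalty itself is not modeled because the manual does not specify it, and the "N (M)" string only approximates what the monitor window is described as showing.

```python
def distance_penalty_applies(reached, target_distance, fraction=0.66):
    """True while the ligand has covered less than `fraction` of the
    anchor-to-target distance (the manual quotes 66% in this section)."""
    return reached < fraction * target_distance

def count_hits(scores):
    """Split scores into (true hits, pseudo hits); negative scores are pseudo hits."""
    pseudo = sum(1 for s in scores if s < 0)
    return len(scores) - pseudo, pseudo

def monitor_summary(scores):
    """Format the counts as 'true (pseudo)'; the exact layout is an assumption."""
    true_hits, pseudo_hits = count_hits(scores)
    return f"{true_hits} ({pseudo_hits})"
```

For example, a ligand that has grown 5.0 Å toward a target 10.0 Å away would still be penalized under this sketch, while one that has grown 7.0 Å would be scored as usual.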
2.3.5 Terminate the Search Process

! To terminate the search, press STOP Search at the bottom of the
RACHEL - Status dialog.
! Press Yes to confirm that you want to terminate the search.
RACHEL will complete the current searching iteration.
! Close the RACHEL - Status dialog and return to the RACHEL dialog.
Troubleshooting an Error in the Search Process
On rare occasions, RACHEL may encounter difficulties in initiating a search.
Most likely, the cause is an improperly written or read file in the project
directory. Check disk space and read/write permissions in the project directory,
especially if files are being shared. Licensing issues will also cause premature
termination. RACHEL will normally describe the problem in the SUMMARY
file in the directory containing the hits. This file is displayed in the Summary
window in the top half of the RACHEL Status dialog.
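The checks suggested above (disk space, read/write permissions, and the SUMMARY file) can also be made from a script. The Python sketch below uses only the standard library; the function names and the free-space threshold are illustrative assumptions, and the file name SUMMARY is taken from the text above without knowing its exact spelling on disk.

```python
import os
import shutil

def check_project_dir(project_dir, min_free_bytes=100 * 1024 * 1024):
    """Return a list of human-readable problems found for a project directory.

    min_free_bytes is an arbitrary illustrative threshold, not a RACHEL requirement.
    """
    if not os.path.isdir(project_dir):
        return [f"{project_dir} is not a directory"]
    problems = []
    if not os.access(project_dir, os.R_OK | os.X_OK):
        problems.append("directory is not readable")
    if not os.access(project_dir, os.W_OK):
        problems.append("directory is not writable")
    free = shutil.disk_usage(project_dir).free
    if free < min_free_bytes:
        problems.append(f"only {free} bytes free")
    return problems

def read_summary(hits_dir):
    """Return the text of the SUMMARY file in a hits directory, or None if absent."""
    path = os.path.join(hits_dir, "SUMMARY")
    if not os.path.isfile(path):
        return None
    with open(path, "r", errors="replace") as handle:
        return handle.read()
```

An empty list from check_project_dir means none of the checked conditions failed; it does not rule out licensing problems, which the SUMMARY file is more likely to report.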

2.4 Using Chemical Templates and Descriptors

2.4.1 Background Information
The many hits that are generated by RACHEL may or may not be useful. Due to
the combinatorial nature of this process, a vast number of potential candidate
structures are possible; however, very few of them may turn out to be useful.
RACHEL will help you focus this combinatorial potential to generate only
those structures that complement the active site of interest, that are chemically
desirable and synthetically feasible, and that afford patent protection. RACHEL
accomplishes this task by using a patented system of chemical building
descriptors.
When a combinatorial chemistry experiment is set up, RACHEL automatically
generates a default chemical descriptor file (Rachel_builddef) in the project
directory. This file contains a minimum number of constraints to eliminate
obvious chemical errors. Additional constraints can (and probably should) be
added because any user input governing the chemical nature of the substituent
components will both focus and accelerate the drug discovery process. The
chemical descriptor file consists of numerous possible constraints and
associated values.
Chemical Templates
The template is RACHEL’s fundamental chemical descriptor. The template is a
Markush-like expression that allows you to place chemical constraints upon
specific regions of the derivative structure while allowing other regions to be
freely modified.
The template string contains combinations of defined (numbered) components
and wildcard (*) regions. Each defined component is further described using
chemical specifications to focus and limit the potential list of substituent
groups. The wildcard regions tell RACHEL to substitute components freely by
selecting optimal components that are complementary to the receptor.
Chemical Descriptors
Some of RACHEL’s chemical building descriptors may be applied only to
components, others only to sites, and a few to either components or sites.
• Component level descriptors restrict the selection of substituent candi-
dates for each defined component.
• Site level descriptors govern an entire optimization site, that is, the
combination of components that complement a user defined region.

Descriptor  Purpose               Applicability
ATOMS       Number of atoms       Component
ATTACH      Specific linkages     Component
ATYPES      Atom types            Site or Component
BONDS       Bonded atoms          Site or Component
LINKS       Component linkages    Site
MW          Molecular weight      Site
PHARM       Pharmacophore         Site
RATOMS      Number of ring atoms  Component
RBONDS      Rotatable bonds       Site
For a complete description of RACHEL’s chemical descriptors and how to use
them see RACHEL Chemical Descriptors on page 80.
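The applicability column of the descriptor table can be captured as a small lookup, which may be handy when keeping notes on which constraints belong at which level. This Python sketch simply transcribes the table; the names APPLICABILITY and can_apply are hypothetical and are not part of RACHEL.

```python
# Transcribed from the descriptor table in this section.
APPLICABILITY = {
    "ATOMS": {"component"},
    "ATTACH": {"component"},
    "ATYPES": {"site", "component"},
    "BONDS": {"site", "component"},
    "LINKS": {"site"},
    "MW": {"site"},
    "PHARM": {"site"},
    "RATOMS": {"component"},
    "RBONDS": {"site"},
}

def can_apply(descriptor, level):
    """Return True if `descriptor` may be assigned at `level` ("site" or "component")."""
    return level.lower() in APPLICABILITY[descriptor.upper()]
```

For instance, can_apply("MW", "component") is False, which matches the table: molecular weight is a site-level constraint.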
2.4.2 Reopen a RACHEL Project

1. If the RACHEL project used in previous sections of this tutorial is still open,
skip to Scientific Scenario below.
2. If necessary, open the project created in Create a RACHEL Project on page 8.
! In the RACHEL dialog press Open Existing Project.
! Navigate to your rachel/tutorial/ project directory and press OK.
The information about the RACHEL project is loaded at the top of the RACHEL
dialog.
Because you have also run a RACHEL combinatorial search for this project
(Run a RACHEL Combinatorial Search on page 23), the name of the database
used for the search (rachel/DBASE/nci3d) is also posted at the bottom of the
dialog.

2.4.3 Scientific Scenario

The goal of this tutorial is to generate ligand derivatives in which the
arginine residue of the tripeptide ligand is replaced. The assumption is that
intervening substituent groups are acceptable as long as they complement the
receptor.
3. Examine the ligand-receptor system.
! In the RACHEL dialog press the Chemical Descriptors: Edit button.
The RACHEL - Modify Chemical Descriptors dialog appears (dialog description
on page 80).
! At the top of the dialog toggle both Original Ligand and Receptor
on.
4. Label the residues in the receptor.
! Use (press Refresh if necessary) to set the atom labels (Atm
Lbl) for M4 to Substructure.
5. Look at the complex to determine the various ligand-receptor interactions that
can be exploited in order to generate appropriate derivatives.
From a steric perspective, the arginine sidechain fits tightly into a very defined
pocket. This pocket is quite flat, forming a narrow cavity into which the
arginine guanido terminus is wedged. One can clearly see that enough room
exists for a cyclic system to substitute for the guanido terminus; however, it
must be planar.
From an electrostatic perspective, it is clear why arginine is an ideal substrate
for binding. Numerous hydrogen bonds are formed with the receptor. Interactions
occur with the sidechain of Asp 189, the carbonyl group of Gly 219, and
possibly with the hydroxy group of Tyr 228. This binding pocket is highly polar
with an abundance of negatively charged functional groups. Thus, the ideal
component to complement this region should contain numerous hydrogen bond
donors.
The carboxyl terminus of the arginine residue also resides in a region where
growth can occur. In addition, a hydrogen bond is made with the amide nitrogen
of Gly 193. Any derivative components that are placed in this region should
maintain this hydrogen bond as well.
Given this knowledge of the ligand-receptor interactions within the active site,
the first task is to formulate a RACHEL template that will describe the
chemistry of the ligand derivatives. The template consists of a user-determined
arrangement of defined components and wildcard designations. RACHEL
chemical descriptors are then assigned to the defined components. These
descriptors act as filters to enrich the database for the functionality desired at
the various positions in the template. The diagram below depicts the RACHEL
template that you will generate and the various chemical descriptors that you
will assign to the defined components.
Figure 1 Chemical Template Schematic
• Defined Component #1:
  • Must contain ring structures - five to six atoms in size.
  • Two or more hydrogen bond donor groups must be present.
  • Component must hydrogen bond to Asp 189 (pharmacophore).
• Wildcard:
  • RACHEL will substitute freely at this position given the steric and
    electrostatic environment.
  • RACHEL will add components to this position with varying connectivity
    as necessary.
• Defined Component #2:
  • Small component desired to mimic alpha carbon of residue.
  • Number of atoms in this component < 6.
  • C.3 (sp3 carbon) must attach to amide nitrogen of previous residue.
  • Component must contain at least one hydrogen bond acceptor (O.3
    or O.2).
  • Component must serve as hydrogen bond acceptor for N.am of Gly
    193 (pharmacophore).
You will now implement these tools to constrain RACHEL’s process of gener-
ating derivative chemical structures to replace the arginine sidechain of the
ligand tripeptide. These descriptors, in conjunction with the knowledge of the
chemical interactions within the active site, govern the structure-based drug
design and refinement.
2.4.4 Create the RACHEL Template

6. The first task in preparing a RACHEL combinatorial chemistry search is the
creation of a suitable template. The RACHEL template is a Markush-like
expression that contains a combination of wildcard and defined components to
direct the generation of new derivative structures (see Chemical Templates on
page 28). The Edit Templates section of the RACHEL - Modify Chemical
Descriptors dialog controls this process (dialog description on page 80).
! At the top of the dialog toggle both Original Ligand and Receptor
off.
In the SYBYL window you will see:
This is how RACHEL represents the template.

• The scaffold of the original ligand is shown in blue.
• The red bond is the anchor bond to which derivative components will be
attached.

• The “SITE1” label denotes the target that was selected to direct the
growth of derivative fragments.
• The “W” atom represents a wildcard component. When a search is set
up, RACHEL places a wildcard at each anchor bond by default so that
growth is allowed to proceed unhindered should you decide to forego
any descriptors or constraints.
! In the RACHEL - Modify Chemical Descriptors dialog, make sure that
the Component Type is set to Defined.
! Press Attach Component.
! When the Select Atom dialog pops up, click the W wildcard component
to attach a defined component.
In the SYBYL window, defined component C1 is attached to the wildcard
component W.
7. Insert a component between the anchor bond and the wildcard.
! Press Insert Component.
! Click the ligand atom (blue) attached to the wildcard component.
! Click the wildcard W component.
In the SYBYL window, defined component C2 has been inserted between the
anchor and the wildcard component.

8. To complete the template, you must attach one more defined component that
branches off the main chain at component C2.
! Press Attach Component.
! Select component C2 as the attachment point for the next defined
component.
The template is complete.
9. Compare the template to the original ligand.

! At the top of the RACHEL - Modify Chemical Descriptors dialog, toggle
the Original Ligand on and off.
! Make sure the Original Ligand is off before you go on to the next
step.
2.4.5 View the Site Descriptors

10. The next step in preparing a RACHEL combinatorial chemistry search is to
assign descriptors to constrain the selection of the defined components
used to generate the new structures. The Edit Chemical Descriptors
section of the dialog controls this process.
RACHEL assigns a series of default site level descriptors to limit the generation
of chemically inappropriate structures. These can be seen easily.
! In the RACHEL - Modify Chemical Descriptors dialog, activate Site
Descriptor.
In the SYBYL graphics window, the SITE1 region is highlighted as shown
below. Because only one site has been defined, it is chosen automatically. In
your own work, if you have defined more than one site you will be prompted to
click the SITE label of choice.

The dialog lists the descriptors that have been assigned to the selected site.
These descriptors all act as constraints to eliminate chemically inappropriate
atom types or undesired linkages (see the glossary of chemical descriptors on
page 28).
Notice that the ID number of the currently displayed site (Site: 1) is shown in
the lower left corner of the dialog.
! In the RACHEL - Modify Chemical Descriptors dialog, click the Display
pull-down to see how you can filter the list of displayed descriptors.
Often you will invoke numerous descriptors for either a site or a component.
This menu enables you to filter and view descriptors of the same type. If you
select ATYPES, only the single ATYPES descriptor will be listed.
! Select ALL Descriptors to see the entire list of defined descriptors in
the dialog.

2.4.6 Add Chemical Descriptors for Defined Components

You will now specify the chemical descriptors for the three components you
defined in the template.
Component C1
Chemical descriptors for component C1 will indicate that the derivative
compounds must include a 5- or 6-membered ring in that position and must
have two or more potential hydrogen-bond donors.
11. Select component C1.
! In the RACHEL - Modify Chemical Descriptors dialog, activate
Component Descriptor.
! Click component C1 in the SYBYL window or type C1 in the dialog
prompting you to select a component.
Component C1 is highlighted in green. At the bottom of the RACHEL - Modify
Chemical Descriptors dialog the Site and Cmpnt information boxes are
updated to reflect your selection. Since there are no descriptors currently
defined for this component, none are listed.
12. Add a component descriptor of type RATOMS to specify the minimum and
maximum number of ring atoms for this component.
! In the Edit Chemical Descriptors section of the RACHEL - Modify
Chemical Descriptors dialog, press Add Descriptor.
The Select Component Descriptor Type dialog appears (dialog description on
page 82).
! Set the Descriptor Types pull-down to RATOMS - # of ring atoms
and press OK.
The Number of Ring Atoms dialog appears (dialog description on page 88).
! Set the Low and High values to 5 and 6, respectively, then press
OK.
The RATOMS descriptor has been assigned to Component C1 and appears in
the list.
13. Use the BONDS descriptor to indicate that candidates for the specified
component must contain a potential hydrogen bond donor.
! Press Add Descriptor.
! Select BONDS - bonded atoms and press OK.
The Bonded Atom Constraints dialog appears (dialog description on page 85).
! For the first pair of atom types shown, select O.3 on the left and H on
the right.
! Set the remaining pairs of atom types as follows:
- O.co2 and H
- N.3 and H
- N.2 and H
- N.am and H
- N.pl3 and H
This descriptor will allow RACHEL to isolate components that contain potential
hydrogen bond donors.
! Set the Operator to >.
! Click the right arrow next to the top slider to set the Value to 1.
! Press OK.
The descriptors for component C1 are:
BONDS O.3 H O.CO2 H N.3 H N.2 H N.am H N.pl3 H > 1
RATOMS 5 - 6
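To make the effect of these two descriptors concrete, the Python sketch below applies them to a toy component record. It is one interpretation of the descriptors, not RACHEL's code: the dictionary layout (a "ring_atoms" count and a list of bonded atom-type pairs) and the function names are hypothetical, and how RACHEL itself counts ring atoms is not specified here.

```python
# Atom-type pairs selected in the Bonded Atom Constraints dialog (step 13).
DONOR_PAIRS = {("O.3", "H"), ("O.co2", "H"), ("N.3", "H"),
               ("N.2", "H"), ("N.am", "H"), ("N.pl3", "H")}

def count_matching_bonds(bonds, pairs=DONOR_PAIRS):
    """Count bonds whose two atom types match a listed pair, in either order."""
    wanted = {frozenset(p) for p in pairs}
    return sum(1 for a, b in bonds if frozenset((a, b)) in wanted)

def passes_c1(component):
    """Apply RATOMS 5 - 6 and BONDS ... > 1 to a toy component record."""
    ring_ok = 5 <= component["ring_atoms"] <= 6
    donors_ok = count_matching_bonds(component["bonds"]) > 1
    return ring_ok and donors_ok
```

Because the operator is > with a value of 1, a component needs at least two matching donor bonds to pass, which agrees with the requirement of two or more potential hydrogen-bond donors.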
Component C2
Chemical descriptors for component C2 will indicate that the derivative
compounds must include a short chain (1-5 atoms) in that position to mimic the
size of the original arginine. In addition, the unchanged part of the ligand must
always be bonded to the new scaffold through an sp3 carbon.
14. Select component C2.
! In the RACHEL - Modify Chemical Descriptors dialog, press
Component Descriptor so you can select another component.
Reflecting your selection, component C2 is highlighted in green in the SYBYL
window, and the Cmpnt information box is updated at the bottom of the dialog.
Since there are no descriptors currently defined for this component, none are
listed.
15. Add a component descriptor of type ATOMS to specify the minimum and
maximum number of atoms for this component.
! Select ATOMS - Number of atoms and press OK.
The Number of Atoms dialog appears (dialog description on page 83).
! Set the minimum and maximum values to 1 and 5, respectively, then press
OK.
The ATOMS descriptor has been assigned to Component C2 and appears in the
list.
16. Add a component descriptor of type ATTACH to specify that the new ligand
scaffold must connect to the fixed part of the ligand through an sp3 carbon.
! Select ATTACH - Specific linkages and press OK.
The Component Attachment to Neighbor Constraints dialog appears (dialog
description on page 83).
! Select C.3 in the list of atom types.
! Press the Select button.
! Click the ligand atom (blue) attached to Component C2.
! Press OK to accept ANCHOR as the attachment descriptor for
Component #1.
The descriptors for component C2 are:
ATOMS 1 - 5
ATTACH C.3 -> ANCHOR
Component C3
Chemical descriptors for component C3 will indicate that all the database
components that RACHEL substitutes in this position must have at least one
oxygen that is a potential hydrogen bond acceptor.
17. Select component C3.
! In the RACHEL - Modify Chemical Descriptors dialog, press
Component Descriptor so you can select another component.
Reflecting your selection, component C3 is highlighted in green in the SYBYL
window, and the Cmpnt information box is updated at the bottom of the dialog.
Since there are no descriptors currently defined for this component, none are
listed.
18. Add a component descriptor of type ATYPES to specify that this component
must contain at least one oxygen atom of the selected types.
! Select ATYPES - Atom type choices and press OK.
The Atom Type Constraints dialog appears (dialog description on page 84).
! In the list of atom types select O.3, O.2, and O.co2.
! Set the Operator to >.
! The Value is set to 0.
! Press OK.
The descriptor for component C3 is:
ATYPES O.3 O.2 O.CO2 > 0
2.4.7 Define Additional Site-level Descriptors

Site-level descriptors are chemical constraints imposed on all the components
that belong to the derivatization site.
19. Select the defined site.
! In the RACHEL - Modify Chemical Descriptors dialog, press Site
Descriptor.
Because there is only one site, it is highlighted in the SYBYL window. All the
descriptors that RACHEL defined automatically for this site are listed in
the dialog.

20. Use the RBONDS descriptor to indicate that the scaffold of three components
for Site 1 must have between 4 and 6 rotatable bonds.
! Select RBONDS - Rotatable bonds and press OK.
The Number of Rotatable Bonds dialog appears (dialog description on page 89).
! Set the minimum and maximum values to 4 and 6, respectively, then press
OK.
The RBONDS descriptor has been assigned to Site 1 and appears at the bottom
of the list.
21. Use the MW descriptor to indicate that the scaffold must have a molecular
weight between 50 and 300.
! Select MW - Molecular Wt and press OK.
The Molecular Weight dialog appears (dialog description on page 87).
! Set the minimum and maximum values to 50 and 300, respectively, then press
OK.
The MW descriptor has been assigned to Site 1 and appears in the list.
22. Use the ATYPES descriptor to eliminate halogens. The NCI-3D database
contains a few compounds with sp1 carbons. These too will be eliminated.
! Select ATYPES - Atom types and press OK.
The Atom Type Constraints dialog appears (dialog description on page 84).
! In the list of atom types select C.1, F, Cl, and Br.
! Set the Operator to =.
! The Value is set to 0.
! Press OK.
The ATYPES descriptor has been assigned to Site 1 and appears in the list.

23. Use the ATYPES descriptor to prevent RACHEL from adding too many
heteroatoms to the entire ligand scaffold for this site.
! Select ATYPES - Atom types and press OK.
! In the list of atom types select N.4, N.3, N.2, N.1, N.ar, N.am, N.pl3,
O.3, O.2, and O.co2.
! Set the Operator to <.
! Use the arrows next to the integer slider to set the Value to 6.
! Press OK.
Another ATYPES descriptor has been assigned to Site 1 and appears in the list.
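Taken together, steps 20 through 23 define four site-level filters. The Python sketch below applies them to a toy site record to show how the operators and values combine. The record layout and function names are hypothetical, the ranges are assumed to be inclusive, and atom types are compared case-insensitively because the dialog shows O.co2 while the descriptor list prints O.CO2.

```python
import operator

OPERATORS = {">": operator.gt, "<": operator.lt, "=": operator.eq}

# Atom types selected in step 23.
HETERO = ["N.4", "N.3", "N.2", "N.1", "N.ar", "N.am", "N.pl3",
          "O.3", "O.2", "O.co2"]

def atypes_ok(atom_types, selected, op, value):
    """Count atoms whose type is in `selected` and compare the count with `value`."""
    selected_lower = {t.lower() for t in selected}
    count = sum(1 for t in atom_types if t.lower() in selected_lower)
    return OPERATORS[op](count, value)

def passes_site(site):
    """Apply the four site-level descriptors from steps 20-23 to a toy site record."""
    return (4 <= site["rotatable_bonds"] <= 6          # RBONDS (inclusive, assumed)
            and 50 <= site["mol_weight"] <= 300        # MW (inclusive, assumed)
            and atypes_ok(site["atom_types"], ["C.1", "F", "Cl", "Br"], "=", 0)
            and atypes_ok(site["atom_types"], HETERO, "<", 6))
```

Note that the last filter uses < with a value of 6, so a scaffold with up to five of the listed heteroatoms passes and one with six does not.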
2.4.8 Add Pharmacophore Descriptors to the Site

The pharmacophore (PHARM) descriptor is perhaps the most useful. As shown
in the chemical template schematic (see Figure 1 on page 31), the goal is to form
hydrogen bonds with the carboxylate group of ASP 189 and the amide group of
GLY 193.
24. Display the receptor.
! At the top of the dialog toggle the Receptor on.
Note how the star-labeled SITE1 is next to the carboxylic carbon of the ASP
189 sidechain, which was the target atom used to define the site when you
created the project (see Designate the RACHEL Target Atom on page 11).

25. Use a PHARM descriptor to specify that any derivative of this site must place a
hydrogen-bond donor within 3.5 Å of the carboxylate carbon in Asp189.
! Select PHARM - Pharmacophore and press OK.
The Specify Pharmacophore Descriptor dialog appears (dialog description on
page 87).
! Press Pick Atom.
! Click the carboxylic acid carbon of the Asp189 sidechain (the atom
closest to SITE1).
The selected atom is highlighted by a sphere of colored dots, and its XYZ
coordinates are shown in the dialog.
! In the list of atom types select H.
! Use the slider and associated arrows to specify an Error margin of
3.50 Å.
! Press OK.
The PHARM descriptor is labeled in the SYBYL window.
26. Hide the receptor and display the original ligand.
! At the top of the dialog toggle the Receptor off and the Original Ligand on.
27. Add a PHARM descriptor for the hydrogen bond acceptor at the carbonyl end of
the arginine residue.
! Select PHARM - Pharmacophore and press OK.

! Press Pick Atom.
! Click the carbonyl oxygen of the arginine residue in the original
ligand.
The selected atom is highlighted by a sphere of colored dots, and its XYZ
coordinates are shown in the dialog.
! In the list of atom types select O.3, O.2, and O.co2.
! Use the slider and associated arrows to specify an Error margin of
.50 Å.
! Press OK.
The second PHARM descriptor is labeled in the SYBYL window.
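Geometrically, a PHARM descriptor as used in this tutorial amounts to a distance test: an atom of one of the selected types must lie within the error margin of the picked coordinates. The Python sketch below illustrates that test only; the function name and the tuple layout for atoms are hypothetical, and RACHEL's own evaluation may differ in detail.

```python
import math

def pharm_satisfied(atoms, point, atom_types, margin):
    """True if any atom of an allowed type lies within `margin` Å of `point`.

    atoms: iterable of (atom_type, (x, y, z)) tuples.
    point: (x, y, z) coordinates of the picked atom.
    """
    allowed = {t.lower() for t in atom_types}
    return any(
        t.lower() in allowed and math.dist(xyz, point) <= margin
        for t, xyz in atoms
    )
```

With the first descriptor of this section, the call would use atom_types=["H"] and margin=3.5, with point set to the coordinates shown in the dialog for the Asp189 carboxylate carbon.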
The bottom of the descriptors list is now:
28. All descriptors have been defined. Save them.
! Press Save to close the RACHEL - Modify Chemical Descriptors dialog.
The definitions for all the descriptors associated with this project are saved in
the file Rachel_builddef in the project directory.
2.4.9 Prepare the RACHEL Combinatorial Search

29. Modify the Search Parameters
Because you have specified a large number of chemical descriptors, the search
may be highly constrained. To ensure a significant number of hits you will
reduce the diversity index. A high diversity index (0.85–0.99) will result in
fewer, but more diverse hits. The default value should be used in searches that
have few chemical constraints. Otherwise, the resulting hits may all appear very
similar.
! In the RACHEL dialog, press Search Parameters: Edit.

The Adjust Search Parameters dialog appears (dialog description on page 78).
! Reduce the Diversity Index from 0.85 (default) to 0.50.
! Press OK.
The search parameters associated with this project are saved in the file
Rachel_searchdef in the project directory.
30. Modify the target function used for scoring.
The limitation of a target function is that it is not extrapolative because scoring
is done by simply comparing the calculated chemical properties to an ideal
value. Because you are targeting hydrogen bonds, the electrostatic interaction
value of derivative compounds may surpass that of the original ligand.
However, the strength of a target function is that it is far easier to adjust than a
true scoring function. In this case, you can simply lower the electrostatic
interaction target value (which will favor hydrogen bond formation) and increase the
associated scalar. The end result is that the importance of electrostatic
interactions is increased.
! In the RACHEL dialog, press Scoring Function: Edit.
The Adjust Target Function dialog appears (dialog description on page 90).
! Change the Electrostatic interactions value from -0.948 to -2.000.
! Change the corresponding weighting factor from 1.000 to 3.000.
! Press OK.
The scoring function parameters associated with this project are saved in the file
Rachel_scoredef in the project directory.
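The effect of this edit can be illustrated numerically. The Python sketch below assumes, purely for illustration, that the electrostatic term subtracts its weighting factor times the absolute deviation from the target value; the manual does not state the actual formula, and the function names are hypothetical.

```python
def electrostatic_penalty(value, target, weight):
    """Assumed penalty term: weight times the absolute deviation from the target."""
    return weight * abs(value - target)

def compare_settings(value):
    """Return (penalty before, penalty after) for the edit made in step 30."""
    before = electrostatic_penalty(value, target=-0.948, weight=1.0)
    after = electrostatic_penalty(value, target=-2.000, weight=3.0)
    return before, after
```

Under this assumption, a derivative with an electrostatic value of -2.0 is penalized by about 1.05 before the edit and not at all afterwards, while one that merely matches the original value of -0.948 goes from no penalty to about 3.16. Stronger electrostatic interactions are therefore favored after the change.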
2.4.10 Run the RACHEL Combinatorial Chemistry Search

31. Start the RACHEL search.
! At the bottom of the RACHEL dialog, press Run Search.
You must define the storage location for all successfully generated structures.
These will be stored in the .mol2 format.
! Append hit2.mdb to the directory name in the adjacent field.

2.4.11 View the RACHEL Results

While RACHEL is searching, you can view the structures that have been
generated so far in order to ascertain whether they are chemically feasible and
desirable.
32. Let the calculation run for a few iterations, then view the results.
! At the bottom of the RACHEL dialog, press Status.
! Select the hit2.mdb directory and press OK.
! Scroll down to the end of the Summary section.
Building Filters - Database allowable components
------------------------------------------------
Original database: Tot entries 66185
Site 1 Global cmpnts: Tot entries 51396
Site 1 Component 3: Tot entries 41118
This indicates that you have successfully used the chemical descriptors to enrich
the component lists for each of the defined structures in the template. Notice
that component C3 contains a large number of substituent candidates. This
indicates that a substantial portion of the NCI-3D database contains ring
structures.
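To see how much each filter narrows the database, the counts in the Summary section can be turned into fractions of the original database. The Python sketch below parses lines of the form shown above; the regular expression and function names are assumptions based only on the three lines printed in this tutorial, and the real SUMMARY layout may differ in spacing.

```python
import re

# Matches lines such as "Site 1 Component 3: Tot entries 41118".
LINE = re.compile(r"^(?P<label>.+?):\s+Tot entries\s+(?P<count>\d+)\s*$")

def parse_filter_summary(text):
    """Return [(label, count)] for every 'label: Tot entries N' line in the text."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            rows.append((match.group("label").strip(), int(match.group("count"))))
    return rows

def retained_fractions(rows):
    """Express each count as a fraction of the first row (the original database)."""
    total = rows[0][1]
    return {label: count / total for label, count in rows[1:]}
```

Applied to the three lines shown above, the site-level filters retain roughly 78% of the original entries and the component filter roughly 62%.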
33. You can view the structures that RACHEL has produced so far.
The Viewer dialog acts as a remote control, enabling you to quickly cycle
through the hits while permitting full interaction with the structures.
! Press the left and right arrows in the Viewer dialog to move up and
down the list of hits; the double arrows move one “page” at a time.
! Toggle the Original Ligand on to display the original ligand for
reference.

You should see a great deal of variation in the sidechain components utilized.
However, all of the derivatives that have positive scores meet all the chemical
criteria.
You could also potentially see a large variation in scores. Depending upon how
long the search is allowed to run, structures with negative scores may be
present. A distance-based penalty score is implemented until the structure
attains a distance of 0.67 times the target distance. The target distance is
determined by the location of the target atom. In addition, structures are penalized if
they do not meet all of the pharmacophore descriptors.
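The distance rule can be pictured with a small Python sketch. This is an interpretation, not RACHEL's implementation: it assumes that the penalty applies while the distance reached by the structure is below 0.67 times the target distance, and it does not model the size of the penalty, which the text does not specify. The function and argument names are hypothetical.

```python
def distance_penalty_applies(distance_reached: float,
                             target_distance: float,
                             fraction: float = 0.67) -> bool:
    """Return True while the structure is still short of fraction * target distance.

    Assumption: "attains a distance" means the distance reached by the
    growing structure, in the same units as the target distance.
    """
    if target_distance <= 0:
        raise ValueError("target_distance must be positive")
    return distance_reached < fraction * target_distance


# A structure that spans 5.0 of a 10.0 target distance is still penalized;
# one that spans 7.0 is past the 0.67 threshold.
print(distance_penalty_applies(5.0, 10.0))
print(distance_penalty_applies(7.0, 10.0))
```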
You may see structures that contain improper chemistry. These are due to
structural errors in the NCI-3D database itself or in its translation to 3D coordinates.
Appropriate chemical descriptors can be used to eliminate unwanted structures.
It is very important to know the composition of the compounds in your
database. As you become more familiar with these constraints, you will
probably develop and tailor your own default list of specifications to use.
34. When you are finished viewing the hits, terminate the combinatorial chemistry
search.
! Press [X] in the Viewer dialog to close it.
! In the Status dialog press STOP Search then press Yes to confirm
this.
RACHEL will complete the current iteration.
! Close the RACHEL - Status and return to the RACHEL dialog.
2.4.12 Final Notes

As you can see, the building specifications are an extremely powerful means of
generating derivative structures to order. Given this brief introduction to their
utility, feel free to experiment with various combinations of specifications to
chemically explore the active site.
This concludes this RACHEL tutorial.
! Press Exit in the RACHEL dialog.

2.5 Scaffold Replacement Using CHARLIE

Many applications in drug design call for the generation of splicing fragments
to span a region of the active site. CHARLIE is designed to aid in this endeavor.
CHARLIE’s two main applications are:

1. Scaffold replacement: two or more portions of a single (usually flexible)
ligand need to be joined using a more rigid or novel chemical scaffold.
2. Bridge generation: two or more separate ligand fragments need to be
spliced.
The scientific context of this tutorial is scaffold replacement. In this scenario, a
lead compound contains two distinct regions that tightly complement the
receptor, but are separated by a weakly binding or highly flexible linker.
Replacing the linker region with a more rigid or more complementary fragment
would be highly desirable.
The receptor and ligand used in this tutorial are alpha-thrombin and a tripeptide
inhibitor (PDB code 1dwe). For the sake of this exercise, you will assume that
the arginine guanido group and the phenyl ring of the phenylalanine are critical
pharmacophoric elements for recognition and binding. Thus, you will use
CHARLIE to generate novel chemical scaffolds that span these two groups.
A Matter of Time: This tutorial requires about 15 minutes of personal time.
2.5.1 Define a New Project

1. It is always a good idea to clear the screen and reset the display before starting.
! > Delete Everything
! Click to reset all rotations and translations.

3. Start RACHEL.
4. Define a new project.
! In the RACHEL dialog, press Create New Project.

page 77).
! Navigate to your rachel directory.
! Press New near the bottom of the dialog.
! Append the name of the new project, charlie, to the directory name
in the adjacent field.
! Press OK.
Project dialog.


Upon completion, the RACHEL - Setup New Project dialog will resemble the
following:
2.5.2 Set up CHARLIE for Scaffold Replacement

7. Start the setup process for CHARLIE.
! In the RACHEL - Setup New Project dialog, press Setup CHARLIE.

• In M2, the active site of alpha-thrombin colored in purple.
! Use to undisplay the receptor structure (lock) in M2 and make
sure that Atm Lbl is set to Id for M1.
The ligand is a tripeptide inhibitor of Thrombin. For the sake of this exercise,
you will assume that the arginine guanido group and the phenyl ring of the
phenylalanine are critical pharmacophoric elements for recognition and binding.
Thus, you will use CHARLIE to generate novel chemical bridges that span
these two groups.
9. Designate the anchor bond. This indicates where CHARLIE will begin building
the scaffold.
Note: The order in which the atoms are selected is important. Always select the
atoms sequentially from the anchor bond towards the target bond (as illustrated
by the arrows in the Select Atom dialog).
! In the SYBYL window, rotate and scale the molecules until you can
clearly see atoms N26 and C25 of the ligand.

! Click on ligand atom N26.
A sphere of green dots acknowledges this selection.
! Click on ligand atom C25.
You have designated the bond from N26 to C25 as the anchor bond of the
scaffold. The bond and the rest of the molecule are colored green. Note: If you
accidentally selected the wrong bond, press Cancel in the next dialog (atom
selection for the target area) and restart the anchor bond selection.
10. Designate the splice target bond. This indicates where CHARLIE must
terminate the scaffold and link it with the remainder of the ligand.
! Click on the first atom of the target splice bond: 5.
A sphere of yellow dots acknowledges this selection.
! Click on the second target splice bond atom: 6.
As in the figure above, the portions of the ligand that you wish to retain are
colored by atom type once again. The region that will be replaced by the new
scaffold is colored green.
11. The RACHEL setup is complete: only one site will be defined for this tutorial.
! Press End in the dialog prompting you to select an anchor atom in
the ligand.
A message in the console acknowledges that the project directory has been set
up successfully.
Note: Do not move the project directory once you have created it. If this is
desired, you must erase this directory and the files within it and regenerate the
project using the RACHEL setup process detailed above.

12. Review the project as defined so far.
For each defined site the following information is listed in the RACHEL dialog:
• Site: 1 = The site’s ID number.
• Anchor = 26-25: The ligand atoms defining the anchor bond. The first
atom will remain fixed, the second atom determines the region that will
be optimized.
• –> 5-6 = The ligand atoms defining the target bond. CHARLIE will
attempt to engineer a derivative linker to join the anchor bond to the
target bond.
The project directory (rachel/charlie within your current working directory)
contains all the files necessary to perform the virtual combinatorial chemistry
experiment.
• key.mol2 and lock.mol2 = the ligand and receptor files were copied to
the project directory.
• Rachel_setup = all the information that is displayed in the dialog
• Rachel_builddef = a minimum number of constraints to eliminate
obvious chemical errors.
• Rachel_scoredef = scoring function parameters
• Rachel_searchdef = conformational search parameters
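If you want to confirm that the setup wrote every file, a quick check outside SYBYL is enough. The following Python sketch is only a convenience and is not shipped with RACHEL; the file names are those listed above, and the helper name is hypothetical.

```python
from pathlib import Path

# File names as listed in the tutorial text above.
EXPECTED_FILES = (
    "key.mol2",
    "lock.mol2",
    "Rachel_setup",
    "Rachel_builddef",
    "Rachel_scoredef",
    "Rachel_searchdef",
)


def missing_project_files(project_dir):
    """Return the expected file names that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # "rachel/charlie" is the project directory used in this tutorial,
    # relative to the current working directory.
    missing = missing_project_files("rachel/charlie")
    print("all files present" if not missing else f"missing: {missing}")
```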
CHARLIE can build linkers to replace up to five different ligand scaffolds
simultaneously. The only requirement is that each anchor and target bond pair must be
unique. No two linkers may share the same anchor bond or terminate on the
same target bond. In this exercise, you will replace only one region.
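The uniqueness requirement can be expressed as a simple check. The Python sketch below is illustrative only; RACHEL performs its own validation, and the data layout here (pairs of atom IDs) is an assumption. It also assumes that a bond counts as the same bond regardless of the order in which its two atoms are listed.

```python
from typing import Dict, FrozenSet, List, Tuple

Bond = Tuple[int, int]    # (first atom ID, second atom ID)
Site = Tuple[Bond, Bond]  # (anchor bond, target bond)

MAX_SITES = 5  # limit stated in the text above


def validate_sites(sites: List[Site]) -> List[str]:
    """Return a list of problems; an empty list means the sites are acceptable."""
    problems: List[str] = []
    if len(sites) > MAX_SITES:
        problems.append(f"{len(sites)} sites defined; at most {MAX_SITES} are allowed")
    seen_anchor: Dict[FrozenSet[int], int] = {}
    seen_target: Dict[FrozenSet[int], int] = {}
    for number, (anchor, target) in enumerate(sites, start=1):
        for label, bond, seen in (("anchor", anchor, seen_anchor),
                                  ("target", target, seen_target)):
            key = frozenset(bond)  # assumption: atom order does not matter here
            if key in seen:
                problems.append(
                    f"site {number} reuses the {label} bond of site {seen[key]}"
                )
            else:
                seen[key] = number
    return problems


# The single site defined in this tutorial: anchor 26-25, target 5-6.
print(validate_sites([((26, 25), (5, 6))]))
```

For the single site used in this exercise the list of problems is empty.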
2.5.3 Adjust the Conformational Search Parameters

13. Although the default values are adequate, running CHARLIE may require
tweaking some of the search parameters. To be successful, CHARLIE must
overlap the terminal bond of the growing derivative chain with the target bond,
a specific set of coordinates in 3D space. You can envision CHARLIE trying to
hit the target bond with a molecular lasso. A conformational search with a much
finer degree increment may be necessary, especially if CHARLIE has difficulty
generating acceptable hits.
! On the RACHEL dialog’s Search Parameters line press Edit.
14. You can improve the chances of finding hits by increasing the number of
conformations per structure to one million:
! Increase Max # conformations per structure to 1000000 (1 million)

15. A high diversity index (0.85 – 0.99) will result in fewer, but more diverse hits.
A higher value should be used in searches that have few chemical constraints.
Otherwise, a multitude of hits may result that all appear very similar. A lower
diversity index (0.25 – 0.5) will allow a greater number of hits; however, they
will be more chemically similar. When running CHARLIE, it is a good idea to
begin with a low diversity index value. This value can be raised in succeeding
runs if necessary.
! Decrease the Diversity index to 0.50
16. The maximum splice atom tolerance is the RMS error allowed when CHARLIE joins
two ligand fragments by generating a linker structure. The tolerance is measured
at the overlap between the linker bond and the target bond. A certain amount of error is necessary
to compensate for the fixed bond lengths and angles that are employed by the
search engine. The smaller the tolerance, the better the fit between the linker
and the static portions of the ligand, and the better the structure. However,
CHARLIE may have more difficulty producing a true hit.
As a general rule, a longer bridge will require a larger splice atom tolerance. If
you get a plethora of hits, you should decrease the splice atom tolerance.
Conversely, if you obtain no hits after 50 or more iterations, consider increasing
this value. In practice, you will alter this parameter value depending upon the
number and quality of hits you obtain.
! Increase Maximum splice atom tolerance to 0.75
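To make the tolerance concrete, the following Python sketch computes an RMS deviation between the two atoms of a linker bond and the two atoms of a target bond and compares it with the tolerance. It is a sketch under stated assumptions, not CHARLIE's code: the exact atoms and formula CHARLIE uses are not documented here, the coordinates are invented for illustration, and the tolerance is taken to be in the same length units as the coordinates.

```python
import math


def bond_overlap_rms(linker_bond, target_bond):
    """RMS distance between corresponding atoms of two bonds.

    Each bond is a pair of (x, y, z) coordinates listed in matching order.
    """
    squared = [
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in zip(linker_bond, target_bond)
    ]
    return math.sqrt(sum(squared) / len(squared))


def within_tolerance(linker_bond, target_bond, tolerance=0.75):
    """True if the overlap error does not exceed the splice atom tolerance."""
    return bond_overlap_rms(linker_bond, target_bond) <= tolerance


# Invented coordinates: each linker atom lies 0.5 units from its target atom.
linker = ((0.0, 0.0, 0.0), (1.5, 0.0, 0.0))
target = ((0.3, 0.4, 0.0), (1.5, 0.5, 0.0))
print(round(bond_overlap_rms(linker, target), 3))
print(within_tolerance(linker, target, tolerance=0.75))
print(within_tolerance(linker, target, tolerance=0.40))
```

In this example the overlap is accepted at a tolerance of 0.75 but rejected at 0.40, which mirrors the trade-off described above between fit quality and the number of hits.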
17. Save the search parameters.
! Press OK to save the modified parameters.
2.5.4 Edit the Chemical Template

18. Because you are trying to build a more rigid scaffold to replace the chain in the
lead compound, you should implement a ring atom specification. You will edit
the Chemical Descriptors to specify this.
! On the RACHEL dialog’s Chemical Descriptors line press Edit.

on page 80).
Notice that the template consists of a wildcard (red) component plus a linker
(green). The linker tells CHARLIE that this is the terminus of the scaffold, and
that the terminal bond must be overlapped with the target bond to link with the
remainder of the ligand.

19. For a better perspective, display the original ligand to show how the linker
region will join the desired portions of the original ligand.
! Toggle on and off the Display Original Ligand check box.
Keep in mind that the template is simply a schematic. Do not be alarmed that
the anchor bond appears altered in comparison to the original ligand. CHARLIE
will maintain all bond angles and lengths while adding the derivative
components.
• Defined Component #1
  • Must join with anchor bond using an sp3 carbon (C.3). This ensures
    appropriate chemistry.
• Wildcard
• Defined Component #2
  • Must contain between 8 and 10 ring atoms.
• Wildcard
• Defined Component #3
  • Must join with target bond using an sp3 carbon (C.3).
20. You will now define the template and associated component descriptors. The
above schematic illustrates the strategy. The main group is component #2. This
component must contain a bicyclic ring. Thus, you will specify an 8-10 ring
atom descriptor, which should allow for bicyclic 5- and 6-membered rings in
various combinations. On either side of component #2, the wildcards will allow
CHARLIE to substitute components, as necessary. Components #1 and #3
simply ensure that sp3 carbon (C.3) atoms join with the guanido and phenyl
groups. This prevents the generation of inappropriate chemical bonds between
the bridge and the ligand groups of interest.
! Toggle off the Display Original Ligand check box to allow access to
the template.
21. Insert the first component between the nitrogen of the anchor bond and the
wildcard component.

! When the Select Atom dialog pops up, click the blue atom connected
to the wildcard to designate the insertion point for this component.
! Click on the W wildcard component.
In the SYBYL window, defined component C1 is inserted between the
guanidino group and the wildcard component W.
22. Insert the second component (C2) between the wildcard and the linker terminus.
! Click on W then on LINKER.
23. Insert the third component (C3) between component C2 and the linker terminus.
! Click on C2 then on LINKER.
24. Insert a wildcard between components C2 and C3.
! Set the Component Type to Wildcard.
! Click on component C2 then on C3.

The template appears as in the figure below.
Do not be concerned that the template seems large. This is a schematic
representation. The actual bridging components will rotate and stagger to fit the
space. Also keep in mind that CHARLIE will fill the wildcard components only
if necessary. They were inserted solely to give CHARLIE more freedom in
choosing components.
! Toggle on the Display Original Ligand check box to see how many
rotatable bonds were present in the original molecule.
! Toggle it off to bring back the display of the template before you go
on.
2.5.5 Assign the Chemical Descriptors

25. Define a descriptor of type ATTACH for component C1 to specify that the new
scaffold must connect to the anchor through an sp3 carbon.
! In the Edit Chemical Descriptors section of the dialog, press
Component Descriptor.
! Click on Component C1.

page 82).
! Set the Descriptor Types pull-down to ATTACH - Specific linkage
and press OK.
The Component Attachment to Neighbor Constraints dialog appears (dialog

! Select C.3 in the list of atom types.
! Press Select and click on the neighboring anchor bond atom (blue)
to select it as the target for the attachment descriptor.

! Press OK to accept ANCHOR as the attachment descriptor for
Component #1.
This definition is echoed in the RACHEL - Modify Chemical Descriptors dialog:

ATTACH C.3 -> ANCHOR
26. Define a descriptor of type ATTACH for component C3 to specify that the new
scaffold must connect to the linker through an sp3 carbon.
! Repeat the instructions in Step 25 above to assign an ATTACH
descriptor to Component C3. This time select the LINKER atom as
the neighbor.
This definition is echoed in the RACHEL - Modify Chemical Descriptors dialog:

ATTACH C.3 -> LINKER
27. Define a component descriptor of type RATOMS for component C2 to specify
the minimum and maximum number of ring atoms for this component. An 8-10
ring atom descriptor should allow for bicyclic 5- and 6-membered rings in
various combinations.
! In the Edit Chemical Descriptors section of the dialog, press
Component Descriptor.
! Click on Component C2 then press Add Descriptor.

and press OK.
OK.

the list.
28. Review the list of descriptors in the RACHEL - Modify Chemical Descriptors
dialog.
! On the Select line, press All.
! Scroll to the bottom of the descriptor list where you will see:
Site 1 Cmpnt 1: ATTACH C.3 -> ANCHOR
Site 1 Cmpnt 2: RATOMS 8-10
Site 1 Cmpnt 3: ATTACH C.3 -> LINKER
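The 8-10 range in the RATOMS descriptor follows from simple ring arithmetic. The Python sketch below is an illustration, not RACHEL code; it assumes the two rings are fused along one bond, so that they share two atoms. Spiro or bridged systems share a different number of atoms and are not covered. The function names are hypothetical.

```python
from itertools import combinations_with_replacement

SHARED_ATOMS = 2  # assumption: two rings fused along one bond share two atoms


def fused_ring_atoms(size_a: int, size_b: int) -> int:
    """Number of distinct ring atoms in a fused bicyclic system."""
    return size_a + size_b - SHARED_ATOMS


def satisfies_ratoms(n_ring_atoms: int, minimum: int = 8, maximum: int = 10) -> bool:
    """True if the ring atom count lies within the RATOMS range."""
    return minimum <= n_ring_atoms <= maximum


for a, b in combinations_with_replacement((5, 6), 2):
    total = fused_ring_atoms(a, b)
    print(f"{a}-{b} fused system: {total} ring atoms, passes: {satisfies_ratoms(total)}")

# A single six-membered ring has only 6 ring atoms and is excluded.
print(f"single 6-membered ring passes: {satisfies_ratoms(6)}")
```

The 5-5, 5-6, and 6-6 fused systems give 8, 9, and 10 ring atoms respectively, so all three fall inside the range, while a monocyclic five- or six-membered ring does not.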

29. Write the modified descriptors to the project directory.
! Press Save at the bottom of the dialog.
The descriptors are saved in the file rachel/charlie/Rachel_builddef.
2.5.6 Prepare the Target Function

30. Because you are using the same ligand and receptor in all RACHEL tutorials,
you may find it simpler to import the scoring function from another successful
project. Alternatively, you can quickly create a new target function and edit its
parameters interactively.
To Import an Existing Target Function: If you have already run the RACHEL
tutorial (see RACHEL Tutorials on page 7), you already have a target function
stored in that project’s directory (rachel/tutorial/Rachel_scoredef) and
you may simply import it.
! On the RACHEL dialog’s Scoring Function line press Import.
! Navigate to your rachel/tutorial project and press OK.
A message dialog displays the source and target directories for this operation.
! Press Yes to confirm that you want to overwrite the current scoring
function.
31. To Create a new Target Function:
! Follow the instructions in Generate a Target Function on page 20.
32. Modify the target function for building scaffolds. Ideally, the derivative
scaffolds should link the anchor and target regions with the most direct bridge,
producing steric and electrostatic complementarity. You can direct CHARLIE
to accomplish this by altering key target function parameters.
! On the RACHEL dialog’s Scoring Function line press Edit.
The Adjust Target Function dialog appears (dialog description on page 90).
To emphasize the generation of direct, succinct chemical bridges, you will
decrease the target values for the molecular weight and the # of rotatable bonds
and increase their scalar multipliers to accentuate their importance. You will
also reduce the emphasis of electrostatics back to baseline.

! Decrease the Molecular Wt target value to 2.0 and increase its
weighting factor to 2.0
! Decrease the # Rotatable Bonds target value to 4.0 and increase
its weighting factor to 3.0
! Reset the Electrostatic interactions weighting factor to 1.0
! Press OK.
The scoring function parameters associated with this project are saved in the file
Rachel_scoredef in the project directory.
2.5.7 Run CHARLIE to Search for Scaffolds

33. Before you can run this CHARLIE search, you must select a database of
components. You will use the NCI 3D database distributed with the software.
! Press Select in the Database section of the RACHEL dialog.
! Navigate to the rachel/DBASE directory, select nci3d and press OK.
34. Start the CHARLIE search and specify where the results will be stored.
! Press Run Search.
You must define the storage location for all successfully generated structures
(hits) within the project directory. These will be stored in the .mol2 format.
! Append scaffold.mdb to the directory name in the adjacent field.
2.5.8 Review the Scaffolds Produced by CHARLIE

35. View the results.
! In the RACHEL dialog, press Status.
! Select the scaffold.mdb directory and press OK.

There are several things to note in the dialog:

• At the end of the Summary section, you can see that the descriptors
filtered out numerous database structures, leaving a subset of database
components for the defined components #1, #2, and #3.
• The top scoring ligand is highlighted in the Results section of the
dialog. Its corresponding structure and highest scoring conformation are
displayed within the receptor in the SYBYL window.
• At the bottom of the Results list, you may notice structures with
negative scores. These are pseudo hits as described in Negative Score
Values on page 26. You will probably see that the terminal component in
many of these structures actually crashes through the receptor wall. This
is normal. The terminal component is only used to maintain the integrity
of the linker bond. This component is removed and the scaffold is
spliced directly to the second ligand fragment when satisfactory overlap
has been achieved. As better scaffolds are generated the negative score
values will approach 0 (press Refresh below the list).
Note: This search may take a while to generate the first successful scaffold.
Approximately 30 - 40 generations may be necessary. If no successful scaffolds
(scores > 0.00) have been produced in 50 iterations, try terminating the search
and re-starting. Each time you start RACHEL (CHARLIE), a new random seed
value is generated. This gives RACHEL or CHARLIE a different set of starting
components, selected from the database, to begin derivatization or scaffold
building.
36. You can now view the other structures that CHARLIE has produced so far.
! Press the left and right arrows to move up and down the list of hits.
Use the double arrows to move one “page” at a time.
! Toggle Original Ligand on to display it (in green) for reference.

37. While you are viewing the structures, CHARLIE continues to search for new

38. When you are finished viewing a few successful scaffolds, terminate the search.
this.
39. This concludes this tutorial. You may either:

! Proceed to the other CHARLIE tutorial: Bridge Generation Using
CHARLIE on page 61
! or Exit RACHEL and clear the SYBYL screen.

2.6 Bridge Generation Using CHARLIE

Many applications in drug design call for the generation of splicing fragments
to span a region of the active site. CHARLIE is designed to aid in this endeavor.
CHARLIE’s two main applications are:

1. Scaffold replacement: two or more portions of a single (usually flexible)
ligand need to be joined using a more rigid or novel chemical scaffold.
2. Bridge generation: two or more separate ligand fragments need to be
spliced.
The scientific context for this tutorial is bridge generation. In this scenario,
several fragments from two or more well characterized lead compounds bind to
separate regions of the active site. The task is to generate appropriate linker
structures to join the separate fragments into a single compound. In doing so,
you must also consider the receptor cavity and optimize both steric and electro-
static complementarity.
The ligand used in this tutorial consists of two separate fragments taken from
the tripeptide inhibitor of Thrombin (PDB code 1dwe). You will use this as the
test ligand to demonstrate the formation of chemical bridges between fragments.
A Matter of Time: This tutorial requires about 10 minutes of personal time.
2.6.1 Define a New Project

1. Make sure that you have all the necessary files.
2. Start RACHEL.
3. Define a new project.
! In the RACHEL dialog, press Create New Project.


page 77).
! Navigate to your rachel directory.
! Press New near the bottom of the dialog.
! Append the name of the new project, charlie2, to the directory
name in the adjacent field.
! Press OK.
Project dialog.

! Select key_bridge.mol2 and press OK.
2.6.2 Set up CHARLIE for Bridge Generation

6. Start the setup process for CHARLIE.
! In the RACHEL - Setup New Project dialog, press Setup CHARLIE.

• In M2, the active site of alpha-thrombin colored in purple.
! Use to undisplay the receptor structure (lock) in M2.

The ligand used in this tutorial consists of two separate fragments taken from
the tripeptide inhibitor of Thrombin. You will use this as the test ligand to
demonstrate the formation of chemical bridges between fragments. You will
bridge from the anchor bond [C10 –> C2] to the target bond [C1 –> C3].
8. Select the two atoms forming the anchor bond for the bridge. The order of atom
selection is important.
! Click on atom 10 (it is then highlighted with a green sphere).
! Click on atom 2 (its color changes to green).
9. Select the target bond for the bridge.
! Click on atom 1 (it is highlighted with a yellow sphere).
! Click on atom 3 (its color changes to green).
! Click End when prompted for Site 2 anchor bond. This will signify
that you are finished with the setup.
A line in the RACHEL dialog reports:

Site: 1 Anchor: 10-2 -> 1-3
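If you record this line in your notes or scripts, it can be split into its parts mechanically. The Python sketch below is a convenience only and is not part of RACHEL; it assumes the line has exactly the layout shown above, and the class and function names are hypothetical.

```python
import re
from typing import NamedTuple, Tuple


class SiteLine(NamedTuple):
    site: int
    anchor: Tuple[int, int]  # atom IDs of the anchor bond, in selection order
    target: Tuple[int, int]  # atom IDs of the target bond, in selection order


# Accepts both "->" and the typeset "–>" arrow used elsewhere in this manual.
_SITE_PATTERN = re.compile(
    r"Site:\s*(\d+)\s+Anchor:\s*(\d+)-(\d+)\s*[-–]>\s*(\d+)-(\d+)"
)


def parse_site_line(line: str) -> SiteLine:
    """Parse a line such as 'Site: 1 Anchor: 10-2 -> 1-3'."""
    match = _SITE_PATTERN.search(line)
    if match is None:
        raise ValueError(f"not a site line: {line!r}")
    site, a1, a2, t1, t2 = (int(group) for group in match.groups())
    return SiteLine(site, (a1, a2), (t1, t2))


print(parse_site_line("Site: 1 Anchor: 10-2 -> 1-3"))
```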
2.6.3 Modify the Conformational Search Parameters

10. Although the default search parameters are adequate, some tweaking may
improve the chances of success. For details see: Adjust the Conformational
Search Parameters on page 51.
You may either import the parameters from another successful run or edit them
interactively.
To Import: If you have already run the other CHARLIE tutorial (see Scaffold
Replacement Using CHARLIE in Section 2.5), you may simply import the search
parameters used in that run:
! On the RACHEL dialog’s Search Parameters line press Import.
! Select your rachel/charlie project directory and press OK.
! Press Yes to confirm.
! Proceed to Define the Chemical Template and Assign Descriptors
below.

11. To Edit:
! On the RACHEL dialog’s Search Parameters line press Edit.
! Modify the RACHEL - Adjust Search Parameters dialog as follows:

- Diversity Index: 0.50
- Max # conformations per structure: 1000000 (1 million)
- Maximum Splice Atom Tolerance: 0.75
- Press OK to save the modified parameters.
2.6.4 Define the Chemical Template and Assign Descriptors

12. You will implement the template and descriptors as in the figure below to
bridge the gap between ligand fragments. You will specify a ring structure to
impart some rigidity to the bridging fragment. You will also include wildcards
on either side of the ring to allow CHARLIE to substitute components as
necessary to join the two segments.

• Must contain between 6 and 10 ring atoms.
• Wildcard
! On the RACHEL dialog’s Chemical Descriptors line press Edit.

on page 80).


! Click on W then on LINKER.
In the SYBYL window, defined component C1 is inserted between the wildcard
and the linker.
13. Insert a wildcard between component C1 and the linker.
! Set the Component Type to Wildcard.
! Click on component C1 then on LINKER.
The template appears as in the figure below.
14. Define a descriptor of type RATOMS for component C1 and specify the
minimum and maximum number of ring atoms for this component.
! In the Edit Chemical Descriptors section of the RACHEL - Modify
Chemical Descriptors dialog, press Component Descriptor.

page 82).

! Click on Component C1.

and press OK.
The Number of Ring Atoms dialog appears (dialog description on page 88).
OK.

the list.
! Press Save to write the modified descriptors to the project directory.
2.6.5 Import the Target Function from Another CHARLIE Project

15. Because you are using the same ligand, receptor and general conditions as in the
other CHARLIE tutorial (see Scaffold Replacement Using CHARLIE in Section 2.5),
you will import the scoring function from that project.
! On the RACHEL dialog’s Scoring Function line press Import.
! Navigate to the rachel/charlie project and press OK.
A message dialog displays the source and target directories for this operation.
! Press Yes to confirm that you want to overwrite the current scoring
function.
For more information about the specific parameters, see Prepare the Target
Function on page 57.
2.6.6 Run CHARLIE to Search for Bridges

16. Before you can run this CHARLIE search, you must select a database of
components. You will use the NCI 3D database distributed with the software.
! Press Select in the Database section of the RACHEL dialog.
! Navigate to the rachel/DBASE directory, select nci3d and press OK.
17. Start the CHARLIE search and specify where the results will be stored.
! Press Run Search.

You must define the storage location for all successfully generated structures
(hits) within the project directory. These will be stored in the .mol2 format.
! Append bridge.mdb to the directory name in the adjacent field.
2.6.7 Review the Bridge Fragments Produced by CHARLIE

18. View the results.
! At the bottom of the RACHEL dialog, press Status.
! Select the bridge.mdb directory and press OK.
There are several things to note in the dialog:

• At the end of the Summary section, you can see that the descriptors
filtered out numerous database structures, leaving a subset of database
components for defined component #1.
• The top scoring ligand is highlighted in the Results section of the
dialog. Its corresponding structure and highest scoring conformation are
displayed within the receptor in the SYBYL window.
• At the bottom of the Results list, you may notice structures with
negative scores. These are pseudo hits as described in Negative Score
Values on page 26. You will probably see that the terminal component in
many of these structures actually crashes through the receptor wall. This
is normal. The terminal component is only used to maintain the integrity
of the linker bond. This component is removed and the bridge is spliced
directly to the second ligand fragment when satisfactory overlap has
been achieved. As better bridges are generated the negative score values
will approach 0 (press Refresh below the list).

Note: This search may take a while to generate the first successful bridge.
Approximately 30 - 40 generations may be necessary. If no successful bridges
(scores > 0.00) have been produced in 50 iterations, try terminating the search
and re-starting. Each time you start RACHEL (CHARLIE), a new random seed
value is generated. This gives RACHEL or CHARLIE a different set of starting
components, selected from the database, to begin derivatization or bridge
building.
19. You can now view the other structures that CHARLIE has produced so far.
! Press the left and right arrows to move up and down the list of hits.
Use the double arrows to move one “page” at a time.
! Toggle Original Ligand on to display it (in green) for reference.
20. While you are viewing the structures, CHARLIE continues to search for new
21. When you are finished viewing a few successful bridges, terminate the search.
this.
22. This concludes this tutorial.

! Exit RACHEL and clear the SYBYL screen.

2.7 Create a RACHEL Component Database

In order to optimize a lead compound, RACHEL utilizes components (chemical
building blocks) derived from the 3D structures in a corporate database. By
breaking the corporate database down into its essential elements, RACHEL:
• maximizes the diversity of generated derivative structures;
• utilizes proprietary chemical components and synthetic know-how.
Typically, you will generate the RACHEL database from a corporate database
or from a publicly or commercially available compound database.
The RACHEL demo files include a set of 25 ligands¹ extracted from the Protein
Data Bank. The ligands are stored in a Multi-Mol2 file, rachel/DBASE/
Demo_dbase_multi.mol2.
1. Make sure that you have all the necessary files.
2. Start RACHEL
The RACHEL dialog appears (dialog description on page 74).
3. Create a RACHEL database in which to store the components that will be
extracted.
! At the bottom of the RACHEL dialog, press Select to create a new
RACHEL database.
The RACHEL Component Database dialog appears (dialog description on
page 93).
! Navigate to your rachel/DBASE directory.
! Press New to activate the adjacent field.
! Append the name of the new component database, test_dbase, to
the directory name in the field.
! Press OK.
! Press Yes in the small dialog that pops up to confirm the creation of
a new RACHEL database.
¹ The compound selection is based on research by R. D. Head, M. L. Smythe, T. I. Oprea, C. L.
Waller, S. M. Green, G. R. Marshall, J. Am. Chem. Soc., (1996) 118, 3959-3969.
4. Add structural components to the RACHEL database.
! At the bottom of the RACHEL dialog, press Add Structures to create
a new database.
The RACHEL Add Structures to Component Database dialog appears (dialog

! Navigate to your rachel/DBASE directory.
! Select the file Demo_dbase_multi.mol2 and press OK.
RACHEL extracts the components from the Multi-Mol2 file and adds them to
the new test_dbase database. A small window monitors the process by
showing the number of structures in the Multi-Mol2 file processed and the
number of components extracted. RACHEL extracts only unique components.
Thus, the number of extracted components will rise rapidly. However, as
common components are stored, the number of novel components added will
gradually diminish. Individual components with more than 256 atoms or bonds
are rejected.
Because there are only a few structures in this demo multi-mol2 input file, the
extraction process may appear instantaneous. A structural database of 500,000
compounds may actually require 30–60 minutes of real time to process
(depending upon processor, disk, and network conditions), resulting in
25,000–50,000 unique components depending upon the inherent chemistry.

RACHEL has a limit of 100,000 unique components that can be registered and
stored in any single component database.
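The registration rules described above (only unique components are stored, components with more than 256 atoms or bonds are rejected, and a database holds at most 100,000 components) can be illustrated with a minimal Python sketch. The `Component` class, its `label` string, and the `register` function are hypothetical names chosen for illustration; RACHEL's actual labeling scheme and file formats are not described in this manual.

```python
from dataclasses import dataclass

MAX_ATOMS_OR_BONDS = 256   # larger components are rejected
MAX_COMPONENTS = 100_000   # limit per component database

@dataclass(frozen=True)
class Component:
    label: str     # hypothetical unique label for the component
    n_atoms: int
    n_bonds: int

def register(registry: dict, comp: Component) -> bool:
    """Store comp if it is small enough, new, and the database is not full."""
    if comp.n_atoms > MAX_ATOMS_OR_BONDS or comp.n_bonds > MAX_ATOMS_OR_BONDS:
        return False
    if comp.label in registry:
        return False
    if len(registry) >= MAX_COMPONENTS:
        return False
    registry[comp.label] = comp
    return True

registry = {}
extracted = [
    Component("methyl", 4, 3),
    Component("hydroxyl", 2, 1),
    Component("methyl", 4, 3),     # duplicate: not stored again
    Component("huge", 300, 310),   # too large: rejected
]
stored = [register(registry, c) for c in extracted]
```

In this toy run only the first two components are stored, which mirrors why the count of novel components levels off as common groups accumulate.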
When the extraction has completed, you will see a status window describing the
number of unique components that were stored.
Extraction of databases components completed.
Database contains 43 unique components.
Press =Enter= to finish.
! Press the Enter key in the text window to clear it.
RACHEL has added the following files to your rachel/DBASE directory:

• test_dbase = The component database. For each component, the
number of atoms, number of bonds, number of links to other components,
coordinates, and bonding information are encrypted, compressed,
and stored in this database.
• test_dbase.reg = Registration file ensuring that each component is
unique.
• test_dbase.status = The log of the database generation process. This
information is identical to the content of the summary window at the
end of the process.
• test_dbase.xref = Cross reference file detailing the source of each
component.
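As a quick check outside of SYBYL, the four files listed above can be verified with a short Python sketch. The helper names `expected_files` and `missing_files` are hypothetical; only the file suffixes come from the list above.

```python
from pathlib import Path

# Suffixes from the list above; "" is the component database itself.
SUFFIXES = ["", ".reg", ".status", ".xref"]

def expected_files(dbase_dir, name):
    """Return the paths RACHEL is described as creating for a database."""
    return [Path(dbase_dir) / f"{name}{suffix}" for suffix in SUFFIXES]

def missing_files(dbase_dir, name):
    """Return the expected paths that do not exist on disk."""
    return [p for p in expected_files(dbase_dir, name) if not p.exists()]

names = [p.name for p in expected_files("rachel/DBASE", "test_dbase")]
```

Calling `missing_files("rachel/DBASE", "test_dbase")` after the tutorial step should return an empty list if the extraction completed.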
5. Review the summary of the extraction of database components.

! Press the View button in the Database section of the RACHEL dialog.
The RACHEL - Database Viewer appears (dialog description on page 95).
The information printed earlier in the RACHEL summary window is printed in
the RACHEL textport.
6. View a few database components.

! Press Search to read the database components.
! Press OK in the Success! dialog.
! Use the Next and Prev buttons to display a few of the database
components.
7. This concludes this tutorial.
! Press Quit to close the RACHEL - Database Viewer.
! Exit RACHEL and clear the SYBYL screen.

3. RACHEL Graphical Interface
• RACHEL Main Dialog on page 74
• RACHEL - Setup New Project on page 77
• RACHEL Search Parameters on page 78
• Edit RACHEL Search Parameters on page 78
• Import RACHEL Search Parameters on page 79
• RACHEL Chemical Descriptors on page 80
• Add Descriptors on page 82
• ATOMS Descriptor on page 83
• ATTACH Descriptor on page 83
• ATYPES Descriptor on page 84
• BONDS Descriptor on page 85
• LINKS Descriptor on page 86
• MW Descriptor on page 87
• PHARM Descriptor on page 87
• RATOMS Descriptor on page 88
• RBONDS Descriptor on page 89
• RACHEL Scoring Function Parameters on page 90
• RACHEL Component Database on page 93
• Open or Create a RACHEL Database on page 93
• Add Structures to a RACHEL Database on page 94
• View a RACHEL Database on page 95
• RACHEL Search Status on page 97
• View Structures on page 98
• RACHEL Utilities on page 99
• Generate Pseudo Receptor on page 99
• Extended Radius Cavity on page 100
• Ligand Viewer on page 100

3.1 RACHEL Main Dialog

To define and run a RACHEL project.
Applications > RACHEL
Project Definition
Open Existing Project  Access a file browser to select the directory containing
an existing project.
Create New Project  Access the RACHEL - Setup New Project dialog where
you can select the ligand and receptor files, and designate
the location (anchor bond) and direction of growth
(target atom) for compound derivatives.
Project Displays the full path to the currently open project.
Ligand Displays the name of the ligand associated with the
project.
Receptor Displays the name of the receptor associated with the
project.

Site Information
Site  ID number of the defined site. Up to five sites may be
defined on the lead compound.
Anchor  ID numbers of the two atoms defining the anchor bond
for the site. It is to this anchor bond that chemical components
are linked and their conformations explored.
The order in which the anchor atoms are chosen dictates
which region will be optimized and which region
will remain static.
Coordinates X, Y, Z coordinates of the target atom, that is, an atom
used to indicate the direction of component growth.
Parameters
Search Parameters  Edit: Access the RACHEL - Adjust Search Parameters
dialog.
Import: Access a file browser in which you can select
another project. The search parameters associated with
the selected project will be used for the current project.
Chemical Descriptors  Edit: Access the RACHEL - Modify Chemical
Descriptors dialog.
Import: Access a file browser in which you can select
another project. The building descriptors associated
with the selected project will be used for the current
project.

Scoring Function  Edit: Access the RACHEL - Adjust Scoring (Target)
Function dialog.
Train: Access the Choose Function dialog. Note:
Structures used to train the RACHEL scoring function
should be minimized with the Tripos force field
because RACHEL uses the SYBYL atom definitions to
calculate fields such as van der Waals complementarity
and strain.
Import: Access a file browser in which you can select
another project. The scoring function associated with
the selected project will be used for the current project.
Predict: Use the current scoring function (established
with the training set or imported from another project)
to predict the binding affinity of ligands. You will be
prompted for the location of the ligand(s) and of the
receptor.
Database
Database  This information box echoes the name of the selected
RACHEL database.
Select Access the RACHEL Component Database dialog
(page 93) to select an existing or create a new
RACHEL database.
Add Structures  Access the RACHEL Add Structures to Component Database
dialog (page 94) to extract components from a set
of molecules and store them in a RACHEL database.
View Print to the console the information that was printed to
the RACHEL textport during database extraction and
access the RACHEL - Database Viewer.
Action Buttons
Run Search Start the RACHEL combinatorial search. You will need
to specify a directory to store the hits (.mol2 files) and
associated information files.
Note: Because all the hits are saved in individual .mol2
files, adding the extension .mdb to the directory name
makes it easier later to review the hits in a Mol2 database
or in a molecular spreadsheet.
Status  Select the directory containing the compounds generated
by the search and access the RACHEL - Status dialog.

Utilities  Access the RACHEL - Utilities dialog to create a
pseudo-receptor cage by using the coordinates of
aligned ligands.
Exit Exit the RACHEL program.
3.1.1 RACHEL - Setup New Project

To select the ligand and receptor files, and designate the location (anchor bond)
and direction of growth (target atom) for compound derivatives.
Access: Press Create New Project in the RACHEL dialog.
Project Use the project directory browser to:

• select an existing project that you can then modify.
• define a directory for a new project (first press New
at the bottom of the directory browser).
Ligand Use the file browser to select the Mol2 file containing
the ligand.
Receptor Use the file browser to select the Mol2 file containing
the receptor.
Setup RACHEL Define one or more sites for virtual combinatorial
chemistry. Each site is described by an anchor bond,
the location of substitution in the lead compound, and
by a target atom defining the direction of growth for
compound derivatives.
Setup CHARLIE Define two or more sites that must be linked with a dif-
ferent chemical scaffold.
Note: The ligand and receptor molecules must share the same coordinate space.

3.2 RACHEL Search Parameters

The search parameters tweak the performance of the conformational search
engine.
3.2.1 Edit RACHEL Search Parameters

Access: In the RACHEL dialog press Search Parameters: Edit.
Maximum number of hits retained  During each iteration of structure generation
RACHEL generates N derivatives de novo and N derivatives using
intelligent component selection (see page 104). The 2N
structures are then scored, and the top N structures are
retained for continued refinement in subsequent iterations.
RACHEL can retain a maximum of 200 hits.
Stop after ? non-productive iterations  RACHEL terminates if it fails to generate a
better-scoring novel ligand derivative after the specified number
of iterations. The larger the number, the longer
RACHEL will try new combinations of components to
produce a novel, better-scoring derivative.
Diversity Index  The higher this number, the greater the diversity of the
generated ligands. However, fewer hits may be
obtained as RACHEL removes similar classes of structures.
Conversely, a lower diversity index will produce
a larger number of retained structures. However, they
may be more similar in chemical structure or motif.
Atom vdW Scaling Factor During Search  Scaling down all van der Waals radii
compensates for fixed bond angles and generates additional conformers.
1-4 contacts and hydrogen bond interactions are also
affected. The smaller the scaling factor, the greater the
potential number of conformers obtained. However,
chances are also greater that higher energy structures
may result.

Maximum # Conformations per Structure  The maximum number of conformers that
RACHEL will allow per structure. Using “Adaptive Radial Sampling”
(the technique used in the RECEPTOR search
engine from Washington University¹), RACHEL is able
to utilize the allotted conformer space in the most efficient
manner. The search time is directly correlated to
this parameter.
Maximum # Conformers Retained per Structure  The maximum number of conformers
to retain for each derivative structure.
Maximum Splice Atom Tolerance  The RMS error (Å) allowed when CHARLIE joins two
ligand fragments by generating a linker structure. The
tolerance is measured at the linker bond - target bond
overlap. A certain amount of error is necessary to compensate
for the use of fixed bond angles by the search
engine. The smaller the tolerance, the better the fit
between the linker and the static portions of the ligand,
and the better the structure. However, as a result,
CHARLIE may have more difficulty producing a true
hit.
1. D.D. Beusen, E.F.B. Shands, S.F. Karasek, G.R. Marshall, and R.A. Dammkoehler,
“Systematic Search in Conformational Analysis,” THEOCHEM: J. Mol. Struct. 370, 157-
171 (1996).
The default values for the parameters accessible through this dialog are stored
in the text file $RACHEL_HOME/Rachel_searchdef. If you modify the
search parameters before a RACHEL run, their new values are stored in the
Rachel_searchdef file within the project directory.
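The interplay of the "Maximum number of hits retained" and "Stop after ? non-productive iterations" parameters can be sketched in Python. This is a toy loop under stated assumptions, not RACHEL's implementation: the function and argument names are hypothetical, a higher score is assumed to be better, and the structure generators are stand-ins that return plain numbers.

```python
import random

MAX_HITS = 200  # upper limit on retained hits stated above

def search(propose_de_novo, propose_guided, score, n_hits, max_stale):
    """Each iteration scores 2N new structures and keeps the top N.

    Stops after max_stale consecutive iterations without a new best score.
    Assumes that a higher score is better.
    """
    n = min(n_hits, MAX_HITS)
    pool, best, stale = [], float("-inf"), 0
    while stale < max_stale:
        candidates = [propose_de_novo() for _ in range(n)]
        # Guided proposals need a pool; fall back to de novo on the first pass.
        candidates += [propose_guided(pool) if pool else propose_de_novo()
                       for _ in range(n)]
        pool = sorted(candidates, key=score, reverse=True)[:n]
        top = score(pool[0])
        if top > best:
            best, stale = top, 0
        else:
            stale += 1
    return pool, best

# Toy stand-ins: a "structure" is just a number and its score is itself.
rng = random.Random(0)
pool, best = search(
    propose_de_novo=lambda: rng.random(),
    propose_guided=lambda p: min(1.0, rng.choice(p) + 0.01),
    score=lambda s: s,
    n_hits=5,
    max_stale=3,
)
```

A larger `max_stale` lets the loop keep trying after a plateau, at the cost of a longer run, which matches the trade-off described for the non-productive iteration limit.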
3.2.2 Import RACHEL Search Parameters

With experience, you may become comfortable with a set of parameters that
produces desired search results. You may also wish to use the same values for
another project.
Access: In the RACHEL dialog press Search Parameters: Import then use a
file browser to select the project directory from which you want to import the
search parameters.

3.3 RACHEL Chemical Descriptors

To constrain the chemical nature of substituent components.
Access: In the RACHEL dialog press Chemical Descriptors: Edit.
Project The name of the open RACHEL project.

Display  Toggle the display in the SYBYL window of the Original
Ligand and Receptor.
Edit Templates
Component Type  Select the type of component to be acted upon by the
buttons immediately below:
• Defined: A user defined component, labeled with
the letter C.
• Wildcard: A component automatically placed by
RACHEL at each anchor bond so that growth can
proceed even without any descriptors or constraints.
Wildcard components are labeled with the letter W.
Attach Component  Press the button then click in the SYBYL window on
the attachment point (atom or component) for the new
component.
Insert Component  Press the button then click in the SYBYL window on
the two insertion points (atoms or components).
Delete Component  Press the button then click in the SYBYL window on
the component to be deleted.
Edit Chemical Descriptors
Select Select the center of interest for chemical descriptors:

• Site Descriptor—If there is more than one site
you must select one in the SYBYL window.
• Component Descriptor—You must select one
component in the SYBYL window.
• All—All defined sites and components are selected.
Display Lists descriptors depending on the selection directly
above.
• All Descriptors
• For sites: RBONDS, MW, ATYPES, BONDS,
LINKS, PHARM.
• For components: ATOMS, RATOMS, ATYPES,
BONDS, ATTACH.
Add Descriptors Access the Select Component Descriptor Type dialog.
Edit Descriptor Edit the descriptor selected in the list below.
Delete Descriptor  Delete the descriptor selected in the list below.
Site and Component Count
Site  The ID number of the currently selected site or # if no
site is selected.
Cmpnt  The ID number of the currently selected component or
# if no component is selected.
Save  You must press this button to terminate the definition
of chemical descriptors and return to the RACHEL dialog.
The values for all the chemical descriptors are
stored in the text file Rachel_builddef in the project
directory. This makes it easy to import them into
another project.

3.3.1 Add Descriptors

To select the type of additional chemical descriptor to define for a site or
component.
Access:
• In the RACHEL dialog press Chemical Descriptors: Edit.
• Then, in the RACHEL - Modify Chemical Descriptors dialog, select Site
Descriptor or Component Descriptor or All.
• Then, press Add Descriptors.
Site level descriptors entail parameters that govern an entire optimization site,
that is the combination of components that complement a user defined region.
Component level descriptors are used to restrict the selection of substituent
candidates for each defined component.
Descriptors                    Applicability
ATOMS - Number of Atoms        Component
ATTACH - Specific Linkages     Component
ATYPES - Atom Types            Site or Component
BONDS - Bonded Atoms           Site or Component
LINKS - Component Linkages     Site
MW - Molecular Weight          Site
PHARM - Pharmacophore          Site
RATOMS - # of Ring Atoms       Component
RBONDS - Rotatable Bonds       Site
The descriptors you are likely to use most often are LINKS, ATYPES, and
BONDS. These will mainly be used to filter out unwanted chemical constructions
or components from use in derivative structures.

3.3.2 ATOMS Descriptor

A component descriptor to specify the number of atoms in each component.
Access:
• In the Select Component Descriptor Type dialog, set the Descriptor Types
to ATOMS and press OK.
• Or in the RACHEL - Modify Chemical Descriptors dialog, select a
descriptor of type ATOMS in the list and press Edit Descriptor.
Low, High  The minimum and maximum numbers of atoms in components
to be retrieved from the database.
3.3.3 ATTACH Descriptor

A component descriptor to indicate that a specific component must attach to
another designated component through a specific atom type.
Access:
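• In the Select Component Descriptor Type dialog, set the Descriptor Types
to ATTACH and press OK.
• Or in the RACHEL - Modify Chemical Descriptors dialog, select a
descriptor of type ATTACH in the list and press Edit Descriptor.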
Atom Type List Select the desired atom type for the attachment atom of
the component.

Select Press the button then click the atom in the ligand that is
the anchor point for this component.
3.3.4 ATYPES Descriptor

A component or site descriptor to specify the occupancy requirements of
selected atom types for a specific component or for an entire site derivative.
Access:
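• In the Select Component Descriptor Type dialog, set the Descriptor Types
to ATYPES and press OK.
• Or in the RACHEL - Modify Chemical Descriptors dialog, select a
descriptor of type ATYPES in the list and press Edit Descriptor.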
Atom Types Select one or more SYBYL atom types in the list.
Operator =, <, or >.
Value Click the appropriate arrow or drag the appropriate
slider to specify either integer or fractional values.
Examples:
N.4 N.3 N.2 N.1 N.ar N.am N.pl3 > 1
Two or more atoms in the specified component or site must be nitrogens.
C.3 C.2 C.1 C.ar H > 0.99

The desired component or site must be entirely hydrocarbon.
F Cl Br I = 0
The desired component or site must be void of halogens.
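The three ATYPES examples above can be reproduced with a small Python sketch. It assumes, based only on these examples, that a whole-number value is compared with the count of matching atoms and that a value with a fractional part is compared with the fraction of matching atoms; the function name `atypes_ok` and the toy atom lists are hypothetical.

```python
import operator

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

def atypes_ok(atom_types, selected, op, value):
    """Check an ATYPES-style constraint against a list of SYBYL atom types.

    Assumption: a whole-number value is compared with the count of matching
    atoms; a value with a fractional part is compared with their fraction.
    """
    count = sum(1 for t in atom_types if t in selected)
    if float(value).is_integer():
        measured = count
    else:
        measured = count / len(atom_types) if atom_types else 0.0
    return OPS[op](measured, value)

NITROGENS = {"N.4", "N.3", "N.2", "N.1", "N.ar", "N.am", "N.pl3"}
HYDROCARBON = {"C.3", "C.2", "C.1", "C.ar", "H"}
HALOGENS = {"F", "Cl", "Br", "I"}

# Toy atom-type lists: an ethylamine-like and a benzene-like component.
ethylamine_like = ["C.3", "C.3", "N.3"] + ["H"] * 7
benzene_like = ["C.ar"] * 6 + ["H"] * 6

one_nitrogen_passes = atypes_ok(ethylamine_like, NITROGENS, ">", 1)
benzene_is_hydrocarbon = atypes_ok(benzene_like, HYDROCARBON, ">", 0.99)
benzene_has_no_halogen = atypes_ok(benzene_like, HALOGENS, "=", 0)
```

With these inputs the single-nitrogen component fails the "> 1" test, while the benzene-like component passes both the hydrocarbon and the halogen-free tests.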

3.3.5 BONDS Descriptor

A component or site descriptor to specify the occupancy requirements of
bonded atoms within a component. This descriptor has no effect on rotatable
bonds between components.
Access:
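• In the Select Component Descriptor Type dialog, set the Descriptor Types
to BONDS and press OK.
• Or in the RACHEL - Modify Chemical Descriptors dialog, select a
descriptor of type BONDS in the list and press Edit Descriptor.
Atom Types  Select one SYBYL atom type in each of the side-by-side lists. Up
to six pairs of atom types may be selected in the dialog.
Value and Sliders  Click the appropriate arrow or drag the appropriate
slider to specify either integer or fractional values.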
Examples:
C.ar C.ar = 6
If used as a component-level descriptor, the specified component must
contain six aromatic bonds.
If used as a site-level descriptor, a component with six aromatic bonds must
be present in the site.

O.3 H N.3 H N.am H > 0
If used as a component-level descriptor, this ensures that candidates for the
specified component contain a potential hydrogen bond donor.
3.3.6 LINKS Descriptor

A site descriptor to specify the types of atoms involved in rotatable bonds.
Access:
• In the Select Component Descriptor Type dialog, set the Descriptor Types
to LINKS and press OK.
• In the RACHEL - Modify Chemical Descriptors dialog, select a descriptor
of type LINKS in the list and press Edit Descriptor.
Atom Types  Select one SYBYL atom type in each list.
Value and sliders  Click the appropriate arrow or drag the appropriate
slider to specify either integer or fractional values.
At the bottom of the standard list of atom types is another one where all the
names are in parentheses. Use these to specify the type of any atom associated
with a ring.
Examples:
C.3 (C.3) > 0
At least one rotatable bond must be present between an sp3 carbon and an
sp3 carbon located in a ring structure in the site derivative.
C.3 C.3 > 0.5

More than 50% of the site derivative rotatable bonds must consist of bonded
sp3 carbon atoms.

O.3 O.3 = 0
No bonded oxygens may be present between site components.
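The LINKS examples above, including the parenthesized ring notation, can be illustrated with a short Python sketch. The data layout and function names are hypothetical. It is also an assumption of this sketch that a plain type such as C.3 matches both ring and non-ring atoms, whereas (C.3) matches ring atoms only; the manual does not state this.

```python
def parse(token):
    """'(C.3)' means a C.3 atom in a ring; 'C.3' is the plain type."""
    in_ring = token.startswith("(") and token.endswith(")")
    return (token.strip("()"), in_ring)

def matches(end, pattern):
    """Assumption: a plain pattern matches any atom of that type,
    a parenthesized pattern matches ring atoms only."""
    atom_type, in_ring = end
    want_type, want_ring = pattern
    return atom_type == want_type and (in_ring or not want_ring)

def link_stats(links, a, b):
    """Return (count, fraction) of rotatable bonds joining types a and b."""
    pa, pb = parse(a), parse(b)
    hits = sum(
        1 for x, y in links
        if (matches(x, pa) and matches(y, pb))
        or (matches(x, pb) and matches(y, pa))
    )
    return hits, (hits / len(links) if links else 0.0)

# Each rotatable bond: ((atom type, in ring?), (atom type, in ring?))
links = [
    (("C.3", False), ("C.3", True)),
    (("C.3", False), ("C.3", False)),
    (("C.3", False), ("O.3", False)),
]
ring_count, _ = link_stats(links, "C.3", "(C.3)")   # example: > 0
_, sp3_fraction = link_stats(links, "C.3", "C.3")   # example: > 0.5
oxy_count, _ = link_stats(links, "O.3", "O.3")      # example: = 0
```

This toy site derivative would satisfy all three example descriptors: one link reaches a ring sp3 carbon, two of three links join sp3 carbons, and no link joins two oxygens.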
3.3.7 MW Descriptor
A site descriptor to specify the range of molecular weight.
Access:
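• In the Select Component Descriptor Type dialog, set the Descriptor Types
to MW and press OK.
• In the RACHEL - Modify Chemical Descriptors dialog, select a descriptor
of type MW in the list and press Edit Descriptor.
Low, High  The extreme values of molecular weight for each component
to be added to the selected site.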
3.3.8 PHARM Descriptor

A site descriptor to specify that derivatives of this site must place atom(s) of the
specified type(s) within a certain distance of designated coordinates.
Access:
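• In the Select Component Descriptor Type dialog, set the Descriptor Types
to PHARM and press OK.
• In the RACHEL - Modify Chemical Descriptors dialog, select a descriptor
of type PHARM in the list and press Edit Descriptor.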

Pick Atom  Press this button then click the atom that RACHEL will
attempt to replace with another scaffold. Usual candidates
are a hydrogen bond donor or acceptor in the
original ligand, or a receptor atom, or a water atom in
the receptor active site. The selected atom will be
bridged to the growing chain.
XYZ  As an alternative to selecting an existing atom, enter a
position’s 3D coordinates in the fields.
Mol2  Press this button to select an atom in another molecule.
Desired Atom Types  Select at least one SYBYL atom type in the list.
Error Use the arrows or the slider to define the radius (in Å)
of the sphere centered on the designated coordinates.
Example:
1.223, -2.546, 0.443 O.3 O.2 O.co2 0.50
RACHEL will attempt to select components that will place a hydrogen bond
acceptor (O.3, O.2, or O.co2) within 0.50 Å of the 3D coordinate position
{1.223, -2.546, 0.443}.
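The geometric test implied by the PHARM example can be sketched in Python as a simple distance check. The function name `pharm_ok` and the toy coordinates of the derivative atoms are hypothetical; only the target position, the atom types, and the 0.50 Å error come from the example above.

```python
import math

def pharm_ok(atoms, center, wanted_types, error):
    """True if any atom of a wanted type lies within `error` Å of center."""
    return any(
        atom_type in wanted_types and math.dist(xyz, center) <= error
        for atom_type, xyz in atoms
    )

center = (1.223, -2.546, 0.443)
acceptors = {"O.3", "O.2", "O.co2"}
derivative = [
    ("C.3", (1.300, -2.500, 0.400)),   # close, but not an acceptor type
    ("O.2", (1.500, -2.300, 0.600)),   # acceptor roughly 0.40 Å from the target
]
satisfied = pharm_ok(derivative, center, acceptors, 0.50)
too_strict = pharm_ok(derivative, center, acceptors, 0.30)
```

Tightening the error radius from 0.50 Å to 0.30 Å causes the same derivative to fail, which shows how the Error setting controls how strictly the position is enforced.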
3.3.9 RATOMS Descriptor

A component descriptor to specify the number of ring atoms.
Access:
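• In the Select Component Descriptor Type dialog, set the Descriptor Types
to RATOMS and press OK.
• Or in the RACHEL - Modify Chemical Descriptors dialog, select a
descriptor of type RATOMS in the list and press Edit Descriptor.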

Low, High  The minimum and maximum numbers of ring atoms in
components to be retrieved from the database.
3.3.10 RBONDS Descriptor

A site descriptor to specify the number of rotatable bonds.
Access:
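• In the Select Component Descriptor Type dialog, set the Descriptor Types
to RBONDS and press OK.
• In the RACHEL - Modify Chemical Descriptors dialog, select a descriptor
of type RBONDS in the list and press Edit Descriptor.
Low, High  The minimum and maximum numbers of rotatable
bonds for each component to be added to the selected
site.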
Note: RACHEL allows a maximum of ten rotatable
bonds summed over all defined sites.

3.4 RACHEL Scoring Function Parameters

3.4.1 Edit Scoring Function Parameters
To direct the scoring function by stressing some chemical properties
over others.
Access: In the RACHEL dialog press Scoring Function: Edit.
Base offset value  Sets the appropriate scale and range given the training
set data. Available only for Scoring Functions.
Nonpolar Interactions  Estimates the non-polar interaction energy between
ligand and receptor. Higher values reflect the appropriate
association between hydrophobic regions of the
receptor with hydrophobic portions of the ligand.
Higher values are conducive to increased ligand receptor
affinity.
Electrostatic Interactions  Estimates the electrostatic interaction energy between
ligand and receptor. Lower values reflect more complementary
associations between oppositely charged
ligand and receptor atoms. Thus, lower values are conducive
to increased ligand receptor binding.
Steric Interactions  Estimates the steric complementarity between ligand
and receptor. Higher values indicate a tighter association
between ligand and receptor.
Steric Strain  Estimates the steric strain between ligand and receptor.
In contrast to the above term, lower values indicate a
tighter association and fewer inappropriate steric contacts
between ligand and receptor.

Molecular Wt  Measures the molecular weight of the ligand. The
molecular weight of the ligand is divided by 100 to
generate this value. In general, molecular weights near
500 are considered optimal for drug candidates.
Number of Rotatable Bonds  Measures the number of rotatable bonds in the ligand
structure. In general, the smaller the number of rotatable
bonds, the lower the entropic costs to ligand receptor
association.
LogP Estimate  Estimates the partition coefficient of the ligand. It is
based on the XLOGP algorithm as published by Wang
et al. (J. Chem. Inf. Comput. Sci. 1997, 37, 615-621).
Nonpolar Fraction  Calculates the fraction of non-polar atoms in the ligand.
Weighting factors  To stress or diminish the impact of a particular parameter,
increase or decrease the corresponding scalar,
respectively. The practical range of weighting factors is
0.0–5.0.
The default values for the parameters accessible through this dialog are stored
in the text file $RACHEL_HOME/Rachel_scoredef.
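The effect of the weighting factors can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the score is the base offset plus a weighted sum of the individual terms; the manual does not give the functional form here, and the term names, values, and function names below are hypothetical. Only the division of the molecular weight by 100 and the practical weight range of 0.0 to 5.0 come from the descriptions above.

```python
def molecular_weight_term(mw):
    """The molecular weight is divided by 100 to generate the term value."""
    return mw / 100.0

def weighted_score(terms, weights, base_offset=0.0):
    """Assumed form: base offset plus the sum of weight * term value."""
    for name, w in weights.items():
        if not 0.0 <= w <= 5.0:
            raise ValueError(f"weight for {name} outside practical range 0.0-5.0")
    return base_offset + sum(weights.get(name, 1.0) * value
                             for name, value in terms.items())

terms = {
    "nonpolar": 2.0,             # made-up term values for illustration
    "rotatable_bonds": 4.0,
    "molecular_wt": molecular_weight_term(450.0),
}
weights = {"nonpolar": 1.0, "rotatable_bonds": 0.5, "molecular_wt": 2.0}
score = weighted_score(terms, weights)
```

Doubling a weight doubles the contribution of that term to the total, which is the sense in which a scalar stresses or diminishes a parameter.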
3.4.2 Train
To select the type of RACHEL training function.
Access: In the RACHEL dialog press Scoring Function: Train.
Generate Target Function with Single Complex  A target function is formed by simply
averaging the descriptor values of the highest affinity complexes in
the training set. For theoretical background see Automated
Elucidation of the Target Function on page 114.
Generate Scoring Function with Multiple Complexes  Use this option if there is a
sufficient number (> 10) of 3D structures that contain a wide range
(> 3 log units) of binding affinities. For theoretical background see
Automated Elucidation of the Scoring Function on page
111.
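The two training options can be sketched in Python as follows. The averaging follows the description of the target function above; the `top_k` cutoff, the dictionary layout, and the assumption that a larger affinity value means tighter binding (for example a pKi) are hypothetical choices made for this illustration.

```python
def target_function(complexes, top_k):
    """Average each descriptor over the top_k highest-affinity complexes."""
    ranked = sorted(complexes, key=lambda c: c["affinity"], reverse=True)[:top_k]
    names = ranked[0]["descriptors"].keys()
    return {n: sum(c["descriptors"][n] for c in ranked) / len(ranked)
            for n in names}

def enough_for_scoring_function(affinities):
    """Rule of thumb above: more than 10 structures spanning > 3 log units."""
    return len(affinities) > 10 and (max(affinities) - min(affinities)) > 3.0

training = [
    {"affinity": 9.0, "descriptors": {"logp": 2.0, "rot_bonds": 4.0}},
    {"affinity": 8.0, "descriptors": {"logp": 3.0, "rot_bonds": 6.0}},
    {"affinity": 5.0, "descriptors": {"logp": 7.0, "rot_bonds": 12.0}},
]
target = target_function(training, top_k=2)
can_fit = enough_for_scoring_function([c["affinity"] for c in training])
```

With only three complexes the rule of thumb is not met, so the target function option would be the appropriate choice for this toy training set.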

3.4.3 Import
With experience, you may become comfortable with a set of parameters that
produces desired search results. You may also wish to use the same values for
another project.
Access: In the RACHEL dialog press Scoring Function: Import then use a
file browser to select the project directory from which you want to import the
scoring function.
3.4.4 Predict
To use the current scoring function (established with the training set or
imported from another project) to predict the binding affinity of ligands. You
will be prompted for the location of the ligand(s) and of the receptor.
Access: In the RACHEL dialog press Scoring Function: Predict.

3.5 RACHEL Component Database

A RACHEL database consists of components, the fundamental building blocks
that will be used to generate new derivative compounds. Read about Rachel
component databases on page 101.
3.5.1 Open or Create a RACHEL Database

To select an existing or to create a new RACHEL database.
Access: In the RACHEL dialog press Select.
Note: You must have write permission to the directory and database file that
you will create or add components to.
Sub-Directories  Use the standard file browsing techniques to select the
directory that contains or in which to create a RACHEL
database.
Files  If you want to open an existing RACHEL database,
select it from the list.
Other Directories  Lists the current directory and your home directory.
New  Press this button to create a new RACHEL database.
You can then edit the path and file name in the adjacent
field.

3.5.2 Add Structures to a RACHEL Database

To extract components from a set of molecules and store them in a RACHEL
database.
Access: Press Add Structures in the RACHEL dialog.
Note: You must have write permission to the directory and database file that
you will create or add components to.
Sub-Directories  Use the standard file browsing techniques to select the
directory that contains the molecules of interest.
Files  Select the file that contains the molecules from which
the structural components will be extracted. The file
must be in Multi-Mol2 format. After selection the file
name appears in the information box at the bottom of
the dialog.
Other Directories  Lists the current directory and your home directory.

3.5.3 View a RACHEL Database

Tools to explore the database.
Access: Press View in the RACHEL dialog.
Database and Component Definitions
Database Full path of the currently open database. Use the file
browser to select another database.
Num Atoms Specify the minimum and maximum number of atoms
in the components to be extracted by Search.
Ring Atoms Specify the minimum and maximum number of ring
atoms in the components to be extracted by Search.
Mol Wt  Specify the range of molecular weight for the components
to be extracted by Search.
Num Links Specify the minimum and maximum number of links in
the components to be extracted by Search.
Descriptors
Add Descriptors Access the Select Component Descriptor Type dialog.

Edit Descriptor Edit the descriptor selected in the list below.
Delete Descriptor  Delete the descriptor selected in the list below.

Explore the Database
Search  Perform the component search. You must do this in
order to navigate the database.
Prev, slider, Next  Tools to navigate within the database.
ID# The ID number of the component currently displayed.

3.6 RACHEL Search Status

To monitor the progress of a RACHEL combinatorial search and review the hits
already generated.
Access: Press Status in the RACHEL dialog.
Hit Directory The full path to the directory containing the derivative
compounds.
Note: Because all the hits are saved in individual .mol2
files, adding the extension .mdb to the directory name
makes it easier later to review the hits in a Mol2 database
or in a molecular spreadsheet.
Summary  Name and location of the project files and the full
description of the search parameters, chemical descriptors,
and scoring function.
Cavity  Generate an extended radius active site cavity surrounding
the generated ligand space.
Export Structure Select a structure in the results list to save it to a .mol2
file in a SAVE subdirectory within the location defined
by the Hit Directory.
Results For each successful hit (compound derivative), the
name of the .mol2 file and its RACHEL score.
Viewer Access the Viewer dialog.

Refresh  Reload all the structures found so far. Note that this
operation clears all molecule areas and associated background
images (such as MOLCAD surfaces).
Stop Search Stops the search at the end of the current iteration. You
will be prompted to confirm this action.
Close Close this dialog and return to the RACHEL dialog.
3.6.1 View Structures

Cycle through the hits produced by a RACHEL search.
Access: Press Viewer in the RACHEL - Status dialog.
Original Ligand Toggle the display of the original ligand, superposed

(in green) on the compound currently displayed.
<< Jump to the previous screenful of hits.
<- Display the previous hit.
-> Display the next hit.
>> Jump to the next screenful of hits (about 7).
[X] Close the dialog.

3.7 RACHEL Utilities

Draw a cage of atoms around a set of aligned ligands to represent a pseudo
receptor when the structure of the actual receptor is unknown.
This technique is based on the Active Analog Approach developed by G. R.
Marshall, C.D. Barry, H.E. Bosshard, R.A. Dammkoehler, D.A. Dunn,
Computer-Assisted Drug Design, ACS Symposium Series 112, Olson, E. C. and
Christoffersen, R. E., Eds., Amer. Chem. Soc., Washington, D. C., 205-226
(1979).
Access: Press Utilities in the RACHEL dialog.
Generate Pseudo-Receptor  Access the Pseudo-receptor Generation dialog to create
a Mol2 file containing the pseudo-receptor surrounding
the volume of a given ligand cluster.
Generate extended radius cavity  Generate an extended radius active site cavity for a
ligand-receptor complex (read more on page 100).
View saved molecular structures  Access the RACHEL - Ligand Viewer to display the
receptor cavity and the ligands saved manually during
the search (in the SAVE directory within the project).
3.7.1 Generate Pseudo Receptor

Create a pseudo-receptor as a cloud of points around aligned ligands.
Ligands (MultiMol2)  Read in a multi-mol2 file and compute Gasteiger
charges for each molecule.

Pseudo Receptor  Name of the output file that will contain the pseudo
receptor. The default file extension is .mol2.
Modify vdW Select one or more atoms in the Atom Expression dialog
then enter the amount of “additional vdW clearance” at
the keyboard.
3.7.2 Extended Radius Cavity

Create a wire surface of the receptor cavity. The surface is drawn by adding
1.5 Å to the van der Waals radii of the receptor atoms near the ligand.
This functionality requires:

• a ligand: already in a molecule area or read in from a .mol2 file
• a receptor: already in a molecule area or read in from a .mol2 file
• the name of the output file that will contain the cavity. Two files are
created: .cnt and .dsp.
Atoms of the ligand involved in hydrogen bonds with the receptor penetrate the
mesh surface.
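The radius extension described above can be sketched in Python. The 1.5 Å increment comes from the text; the `cutoff` distance that decides which receptor atoms count as "near the ligand" is a hypothetical parameter, as are the function name and the toy coordinates.

```python
import math

EXTRA_RADIUS = 1.5  # Å added to each van der Waals radius

def extended_radii(receptor, ligand_xyz, cutoff):
    """Return (xyz, vdW radius + 1.5 Å) for receptor atoms near the ligand.

    `cutoff` is a hypothetical distance (Å) that defines "near".
    """
    near = []
    for xyz, vdw in receptor:
        if any(math.dist(xyz, atom) <= cutoff for atom in ligand_xyz):
            near.append((xyz, vdw + EXTRA_RADIUS))
    return near

# Toy receptor: (coordinates, van der Waals radius in Å)
receptor = [((0.0, 0.0, 0.0), 2.0), ((20.0, 0.0, 0.0), 1.5)]
ligand_xyz = [(3.0, 0.0, 0.0)]
shell = extended_radii(receptor, ligand_xyz, cutoff=6.0)
```

Only the first receptor atom lies within the cutoff, so the resulting cavity shell contains one sphere with a radius of 3.5 Å.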
3.7.3 Ligand Viewer
Directory  Use the browser to open the SAVE directory within the
project’s directory.
Receptor  File containing the receptor used for the project. This
file is retrieved automatically.
Generate Extended Radius Cavity  Display a mesh surface of the receptor cavity.
The surface is drawn by adding 1.5 Å to the van der Waals
radii of the receptor atoms near the ligand.
Slider Use Prev and Next to scroll through the saved ligands.

4. RACHEL Theory
RACHEL (Real-time Automated Combinatorial Heuristic Enhancement of Lead
compounds) was designed to optimize weak binding lead compounds in an
automated, combinatorial fashion. RACHEL can be classified as a “builder-
type” drug refinement program.
• Extracting Building Blocks from Corporate Databases on page 102
• Intelligent Component Selection System on page 104
• Development of a Component Specification Language on page 107
• User-directed Structure Generation
• Filtration of Components Using Constraints
• Template Driven Structure Generation
• Novel Techniques to Estimate Ligand-Receptor Binding on page 111
• Automated Elucidation of the Scoring Function
• Automated Elucidation of the Target Function

4.1 Extracting Building Blocks from Corporate Databases

RACHEL, like other builder-type programs, works as follows:
• A database of chemical fragments is used to derivatize a lead compound
by replacing weak binding regions with components that will improve
receptor complementarity.
• These compounds are then scored by calculating their affinity for the
receptor.
• Those compounds that bind tightly with the receptor are then saved
while those that bind poorly are discarded.
• The new population of compounds is then processed to form the next
generation of derivatives.
• Over time, a lead compound is iteratively refined into a set of tight
binding structures.
RACHEL’s unique advantage is that it pulls building block components directly
from the user’s corporate database, utilizing the scientific institution’s
intellectual property in the design of new drugs. The figure below demonstrates this
process.
Figure 2 Extraction of Components from Corporate Structural Database
The corporate structural database (on the left) may contain hundreds of
thousands of compounds. All structures are composed of non-rotatable chemical
groups separated by rotatable bonds as defined by the laws of chemistry. These
non-rotatable groups represent the components or fundamental building blocks
that will be used to generate new derivative compounds (on the right).
RACHEL first isolates these components by identifying the rotatable bonds in
the structure (red arrows in the figure below). Each individual component is
then isolated, identified with a unique label that describes its distinct chemical
architecture, and stored in the component database along with a description of
its chemical composition. A unique component label is used to register each
fragment and prevent the storage of redundant chemical groups.
Figure 3 Separation and Isolation of Components at Rotatable Bonds
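As a rough sketch of the extraction step, the following Python treats a structure as a graph, cuts it at the rotatable bonds, and registers each resulting component under a label so that duplicates are stored only once. The data layout and the functions `split_components`, `label`, and `register` are assumptions made for illustration. The toy label here counts elements only, whereas a real label would also have to encode connectivity.

```python
from collections import Counter

def split_components(atoms, bonds):
    """atoms: {index: element}; bonds: [(i, j, is_rotatable)]."""
    neighbors = {i: set() for i in atoms}
    for i, j, is_rotatable in bonds:
        if not is_rotatable:  # rotatable bonds are the cut points
            neighbors[i].add(j)
            neighbors[j].add(i)
    seen, components = set(), []
    for start in atoms:
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], []
        while stack:  # depth-first walk over non-rotatable bonds
            atom = stack.pop()
            group.append(atom)
            for nxt in neighbors[atom] - seen:
                seen.add(nxt)
                stack.append(nxt)
        components.append(sorted(group))
    return components

def label(atoms, group):
    """Toy label built from element counts, e.g. 'C1' or 'O1'."""
    counts = Counter(atoms[i] for i in group)
    return "".join(f"{element}{n}" for element, n in sorted(counts.items()))

def register(database, atoms, bonds):
    """Store each component once, keyed by its label."""
    for group in split_components(atoms, bonds):
        database.setdefault(label(atoms, group), group)
    return database

# Heavy atoms of ethanol (C-C-O); both bonds are treated as rotatable here.
atoms = {0: "C", 1: "C", 2: "O"}
bonds = [(0, 1, True), (1, 2, True)]
database = register({}, atoms, bonds)
print(sorted(database))
```

The two carbon fragments share a label, so only one of them is stored. This is the same mechanism that lets a large structure database collapse into a much smaller component database.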
There are major advantages to extracting components in the manner described
above.
• The storage of unique components allows the compression of a massive
corporate database into a much smaller and manageable form. Typically,
a corporate database containing hundreds of thousands of structures may
be comprised of only 5,000 individual components because a few
components, such as methyl, hydroxyl, and amine groups, are utilized
over and over again.
• Unique chemical constructions, for which only proprietary synthetic
methods are known, are stored and available for use in future ligand
design. This allows the research scientist to take advantage of patented
corporate chemistry and preserve the competitive edge gained from prior
research.

4.2 Intelligent Component Selection System
The goal of builder-type programs is to generate derivatives that are
complementary to the active site. Both steric (size and shape) and electrostatic
forces must be considered. The difficulty in accomplishing this lies in the sheer
number of potential component combinations. While random selection of
fragments for assembly ensures an adequate sampling of components, it may
lead to the selection of improper fragments, generating poor derivatives.
The RACHEL software faces a far greater problem. While other builder-type
applications contain databases with 100 components or fewer, RACHEL can
extract upwards of 40,000-50,000 components, depending upon the size and
diversity of the corporate database. Thus, the number of potential fragment
combinations is nearly immeasurable. Clearly, a method is needed to rapidly
focus on the appropriate combinations that are likely to satisfy binding
requirements.
The greatest benefit of RACHEL’s component extraction method is that a
massive fragment property index of the entire corporate database is created.
Along with the atomic coordinates of each component, a wealth of chemical
information characterizing each building block is stored. Data such as the size
of the component, atom composition, connectivity, ring structure, and
electrostatic charge are included. As such, a means of rapidly cross-referencing
chemical components on demand is available.
Figure 4 Generation of Fragment Property Index
The figure above demonstrates how this fragment property index is generated.
The image on the left depicts a representative component database. Using the
stored chemical attributes, the database is sorted and mapped into a
multi-dimensional array, where each axis represents a different descriptor. In this
example, only size, polarity, and valence (number of connections) are shown for
simplicity. Each axis provides a gradient along which components can be
distinguished. As a result, components that are similar with respect to the various
descriptors are grouped together.
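One simple way to picture such an index is a dictionary keyed by binned descriptor values, so that similar components land in the same cell. The sketch below is an assumption-laden illustration: the bin thresholds, the polarity scale, and the component property values are invented for the example and are not taken from RACHEL.

```python
from collections import defaultdict

def size_bin(n_atoms):
    # Hypothetical thresholds for the size axis.
    if n_atoms < 6:
        return "small"
    return "medium" if n_atoms < 12 else "large"

def polarity_bin(polarity):
    # Hypothetical 0..1 polarity scale for the polarity axis.
    return "polar" if polarity >= 0.5 else "nonpolar"

def build_index(components):
    """Group component names by (size, polarity, valence) cell."""
    index = defaultdict(list)
    for c in components:
        key = (size_bin(c["atoms"]), polarity_bin(c["polarity"]), c["valence"])
        index[key].append(c["name"])
    return index

# Illustrative components; the property values are made up.
components = [
    {"name": "methyl",    "atoms": 4,  "polarity": 0.0, "valence": 1},
    {"name": "hydroxyl",  "atoms": 2,  "polarity": 0.8, "valence": 1},
    {"name": "carboxyl",  "atoms": 4,  "polarity": 0.9, "valence": 1},
    {"name": "phenyl",    "atoms": 11, "polarity": 0.0, "valence": 1},
    {"name": "naphthyl",  "atoms": 17, "polarity": 0.0, "valence": 1},
    {"name": "indenyl",   "atoms": 16, "polarity": 0.0, "valence": 1},
    {"name": "methylene", "atoms": 3,  "polarity": 0.0, "valence": 2},
]

index = build_index(components)
print(index[("large", "nonpolar", 1)])
```

Looking up a cell returns every component that shares those characteristics, which is the kind of on-demand cross-referencing the index is meant to provide.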
This fragment property index offers a powerful means to improve the
generation of complementary ligands. Over time, builder-type programs evolve
compounds with improved binding. A moderate affinity structure has
reasonable steric and electrostatic complementarity with the active site.
However, components can still be added, deleted, or substituted to augment
receptor interaction.
Although simplistic, random selection of substituent fragments is absolutely
necessary to ensure adequate sampling of the database and to generate truly
novel solutions. RACHEL implements random sampling in the initial stages of
lead compound optimization. Early derivatives that are generated are weak
binding at best. Thus, random component sampling increases the chances of
finding the appropriate components to improve receptor interaction.
However, random sampling often diminishes the complementarity of reasonable
binding compounds. This is the result of replacing satisfactory components with
poor ones. For example, if a methyl group or a highly charged fragment were to
replace a large, hydrophobic ring on the ligand, it would ruin interaction with
the receptor at that component.
Instead, RACHEL incorporates a heuristic active site mapping algorithm as
shown in the figure below to determine the optimal chemical characteristics to
complement a given region of the active site.
Figure 5 RACHEL Active Site Mapping
This technique maps chemical characteristics of the receptor, such as positive
charge, negative charge, and active site volume as a function of distance along
the active site axis. Using this active site map, RACHEL can determine the
chemical characteristics most likely to complement the receptor at a given
component location. RACHEL then determines a list of candidate fragments
and substitutes them in a combinatorial fashion.
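The idea of profiling receptor properties along the active site axis can be sketched as follows. This is a simplified illustration under stated assumptions, not RACHEL's mapping algorithm: the function `map_active_site`, the slab width, and the atom coordinates and charges are hypothetical, and the count of lining atoms per slab is used here only as a crude stand-in for a volume term.

```python
def map_active_site(receptor_atoms, origin, axis, slab=2.0):
    """Accumulate receptor charge in slabs along the active site axis.

    receptor_atoms: [(x, y, z, charge)]; axis: unit vector along the site.
    """
    profile = {}
    for x, y, z, charge in receptor_atoms:
        # Signed distance of the atom along the axis, measured from the origin.
        distance = sum((p - o) * a for p, o, a in zip((x, y, z), origin, axis))
        key = int(distance // slab)
        entry = profile.setdefault(key, {"positive": 0.0, "negative": 0.0, "atoms": 0})
        if charge > 0:
            entry["positive"] += charge
        elif charge < 0:
            entry["negative"] += -charge
        entry["atoms"] += 1
    return profile

# Made-up receptor atoms lining a site that runs along the x axis.
receptor = [
    (0.5, 1.0, 0.0, 0.4),
    (1.0, -1.0, 0.0, -0.1),
    (4.5, 1.0, 0.0, -0.8),
    (5.0, 0.0, 1.0, -0.7),
]
profile = map_active_site(receptor, origin=(0.0, 0.0, 0.0), axis=(1.0, 0.0, 0.0))
for key in sorted(profile):
    print(key, profile[key])
```

In this toy profile the deeper slab is dominated by negative charge, which would suggest a positively charged or polar component at that depth.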

Figure 6 RACHEL Intelligent Component Selection System
In the example above, RACHEL determines that the naphthalene group (blue)
and carboxylic acid group (red) of a ligand derivative should be replaced with
other components to improve binding. The naphthalene group is large and very
non-polar since it consists solely of carbon and hydrogen. Conversely, the carboxylic
acid group is quite small, but highly polar. Using the active site map as
described above, RACHEL determines that these characteristics are indeed ideal
for complementing the receptor at each respective component. Using the
fragment property index, RACHEL can cross-reference other database
components that exhibit similar characteristics, as shown in the red and blue boxes on
the right. These components are then combinatorially used to generate a new
family of derivatives for testing. Each derivative retains the optimal receptor
binding characteristics. However, enough variability is generated to potentially
improve receptor complementarity.

4.3 Development of a Component Specification Language
4.3.1 User-directed Structure Generation
The ability to instantly cross-reference components by chemical composition
also permits user-directed structure generation, a powerful and unique feature of
RACHEL. This technology permits the true application of virtual combinatorial
chemistry. The inspiration for this technology stems from earlier work
published by the author of RACHEL [“DBMAKER: A set of programs to generate
three-dimensional databases based upon user-specified criteria” Ho, C.M.W.,
and Marshall, G.R. J. Comp-Aided Mol. Design 1995, 9, 65-86].
The figure below demonstrates this with an example. The lead compound
scaffold in the center contains an amide bond with various sidechains extending
from it.
Figure 7 RACHEL User-directed Structure Generation
Biochemical characterization of this lead compound determined that three
chemical groups make up the pharmacophore.
• The first group, shown in blue, must contain a large ring system.
Crystallographic analysis revealed that single and bicyclic rings are
capable of binding, as long as they are planar. Thus, the rings must be
aromatic. Any atom types may be accepted.
• The second group, shown in green, has different requirements. Here too
a cyclic component is desirable. However, the binding pocket in this
region is smaller and more spherical. Thus, only single rings are
acceptable although they need not be aromatic. In addition, this region is
very hydrophobic; thus, only hydrocarbon components are acceptable.
• The third group, shown in red, is quite different from the other two. This
region of the active site is highly charged and requires a small polar
group to interact with. Thus, no ring structures are acceptable.
Furthermore, heteroatoms (nitrogen, oxygen) are required.
Table 1 Chemical Requirements for Derivative Groups in Figure 7
Blue Derivatives                 Green Derivatives              Red Derivatives
(+) Ring structures - aromatic   (+) Ring structure - single    (-) Ring structures
Molecular weight < 200           Molecular weight < 100         Molecular weight < 50
# Atoms < 25                     # Atoms < 20                   # Atoms < 8
(+) Any atom type                (+) C, H only                  (+) N, O required
Using the individual databases, shown in Figure 7 as the blue, green, and red
boxes, RACHEL combinatorially generates all possible derivatives within the
constraints of the active site. In so doing, an immense number of diverse
chemical structures may be constructed and tested in a defined and controlled
manner.
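To make the Table 1 requirements concrete, the sketch below filters a small component list into blue, green, and red sub-databases and then enumerates every combination. It is written in plain Python rather than RACHEL's component specification language, whose syntax is not shown in this chapter. The component list, its approximate property values, and the function names are assumptions for illustration.

```python
from itertools import product

# Illustrative components; molecular weights and atom counts are approximate.
COMPONENTS = [
    {"name": "naphthyl",    "rings": 2, "aromatic": True,  "mw": 127.2, "atoms": 17, "elements": {"C", "H"}},
    {"name": "phenyl",      "rings": 1, "aromatic": True,  "mw": 77.1,  "atoms": 11, "elements": {"C", "H"}},
    {"name": "pyridyl",     "rings": 1, "aromatic": True,  "mw": 78.1,  "atoms": 10, "elements": {"C", "H", "N"}},
    {"name": "cyclohexyl",  "rings": 1, "aromatic": False, "mw": 83.2,  "atoms": 17, "elements": {"C", "H"}},
    {"name": "cyclopentyl", "rings": 1, "aromatic": False, "mw": 69.1,  "atoms": 14, "elements": {"C", "H"}},
    {"name": "hydroxyl",    "rings": 0, "aromatic": False, "mw": 17.0,  "atoms": 2,  "elements": {"O", "H"}},
    {"name": "amino",       "rings": 0, "aromatic": False, "mw": 16.0,  "atoms": 3,  "elements": {"N", "H"}},
    {"name": "carboxyl",    "rings": 0, "aromatic": False, "mw": 45.0,  "atoms": 4,  "elements": {"C", "H", "O"}},
    {"name": "methyl",      "rings": 0, "aromatic": False, "mw": 15.0,  "atoms": 4,  "elements": {"C", "H"}},
]

def blue(c):
    # Aromatic single or bicyclic rings, any atom type.
    return 1 <= c["rings"] <= 2 and c["aromatic"] and c["mw"] < 200 and c["atoms"] < 25

def green(c):
    # Single ring, hydrocarbon only.
    return c["rings"] == 1 and c["mw"] < 100 and c["atoms"] < 20 and c["elements"] <= {"C", "H"}

def red(c):
    # No rings, small, and at least one nitrogen or oxygen.
    return c["rings"] == 0 and c["mw"] < 50 and c["atoms"] < 8 and bool(c["elements"] & {"N", "O"})

blue_db = [c["name"] for c in COMPONENTS if blue(c)]
green_db = [c["name"] for c in COMPONENTS if green(c)]
red_db = [c["name"] for c in COMPONENTS if red(c)]

# Every combination of one blue, one green, and one red component.
derivatives = list(product(blue_db, green_db, red_db))
print(len(derivatives))
```

The number of derivatives is the product of the three sub-database sizes, which is why even modest filters over a large component database yield a very large but still controlled set of structures.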
4.3.2 Filtration of Components Using Constraints

Another feature unique to RACHEL’s component specification language is the
removal of structures that are undesirable because they are unstable or not
synthetically feasible. The constraints shown below may be used in a variety of
ways to limit the generation of these unacceptable structures.
Figure 8 RACHEL Component Constraints
In this hypothetical chemical structure, the geometric shapes represent different
components while the lines connecting them correspond to rotatable bonds.
• The ATOM and RATOM constraints govern how many atoms a
particular component can possess.
• The LINK constraint limits the atom types that can be utilized in
rotatable bonds.
• The PHARM constraint signifies that a specific atom type must be
present at a precise location in the active site.
• The #CMPNTS restriction places upper and lower bounds on the total
number of components a structure can possess.
• The ATYPE constraint stipulates how many atoms of a specific type can
be present in both individual components as well as the entire structure.
• The BOND constraint places limits on the types of bonded atoms that
can be present within a component.
As one can see, this again gives the user a tremendous amount of control over
the structures generated by RACHEL.
4.3.3 Template Driven Structure Generation

An additional feature that is unique to RACHEL is an automated method to
generate diversity using templates. These templates are an integral part of the
component specification language. As is normally the case, a computational
chemist is not sure exactly what derivative components might complement the
receptor. However, specific chemical groups may be desired at general locations
in the active site. These may be pharmacophoric elements, or proprietary
chemical structures for which synthetic methods are available. This is illustrated
in Figure 9 below.
Figure 9 Template Driven Structure Generation
In the upper left is a portion of a characterized lead compound: a carbonyl
group (red) and a phenyl ring (green) are required to satisfy the pharmacophore
for receptor binding. Given this information, RACHEL allows you to define a
chemical template (shown in the upper right) to generate appropriate structures.
The lead compound fragment and the two pharmacophoric groups are separated
by wildcard designations, which denote where chemical variability can occur.
RACHEL will then generate chemically diverse structures using the template as
shown in Figure 9. The static portions of the template are left untouched and
they are incorporated into every generated derivative. However, the wildcard
regions allow RACHEL to insert various components in a random manner to
link these pharmacophoric elements together. Constraints can be placed on
these variable regions using the component specifications described above. The
use of these templates enables RACHEL to fully explore the chemical diversity
within the corporate database and maintain the fundamental groups necessary to
achieve receptor binding.
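Template expansion can be pictured with a short Python sketch. The template notation used here (a list with "*" marking each wildcard region), the linker names, and the `expand` function are hypothetical and are not RACHEL's actual template syntax.

```python
import random

TEMPLATE = ["lead_fragment", "*", "carbonyl", "*", "phenyl"]  # "*" marks a wildcard region
LINKERS = ["methylene", "ether_O", "amine_NH"]                # hypothetical linker components

def expand(template, linkers, rng, max_len=2):
    """Fill each wildcard with 1..max_len randomly chosen components."""
    structure = []
    for item in template:
        if item == "*":
            structure.extend(rng.choice(linkers) for _ in range(rng.randint(1, max_len)))
        else:
            structure.append(item)  # static portions are copied unchanged
    return structure

rng = random.Random(0)
derivatives = [expand(TEMPLATE, LINKERS, rng) for _ in range(5)]
for derivative in derivatives:
    print("-".join(derivative))
```

Every generated structure contains the static groups in their original order, while the wildcard regions vary from one derivative to the next. Constraints on the variable regions could be added by filtering the linker list before expansion.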

4.4 Novel Techniques to Estimate Ligand-Receptor Binding
4.4.1 Automated Elucidation of the Scoring Function
The calculation of receptor binding affinity for each newly generated derivative
ligand remains the most challenging aspect of drug design. Not only is this task
very difficult, it also is critical for the success of the program. The difficulty is
that the accurate determination of ligand receptor binding involves complex,
intensive, and time-consuming quantum chemical calculations. On the other
hand, a typical ligand refinement program can generate and sample new structures
at the rate of several hundred to several thousand per minute. Thus, in order to
achieve such high throughput, there must be some compromise in the accuracy
of the binding calculation. That compromise is in the utilization of scoring
functions.
In essence, a scoring function is an equation that estimates the binding affinity
of a ligand to the receptor using descriptors that can be rapidly computed for the
ligand-receptor interaction.
Deriving a scoring function requires two pieces of experimental data for each
ligand-receptor pair in the training set:
• the actual binding affinity must be measured using a biological assay;
• the three-dimensional structure of the ligand bound within the receptor
must be determined using X-ray or NMR techniques.
The determinants of binding include steric interaction energy, electrostatic
interaction energy, and hydrophobicity. Given the three-dimensional structure
of a particular compound bound within the active site, it is possible to calculate
values for these descriptors. For review, see Head, R.D. et al., J. Am. Chem.
Soc. 1996, 118: 3959-3969.
For example,
• The steric interaction energy is calculated as the number of receptor
atoms that are within a specific distance (e.g., 5 Å) of any ligand atom.
The higher the value, the more interactions between ligand and receptor
atoms.
• The electrostatic interaction energy is computed using Coulomb’s law.
• The hydrophobicity is represented by LogP, which is a measure of the
compound’s solubility in oil versus water. The higher the value, the
more greasy and oily the compound.

In short, these descriptors are simple and very easy to calculate. This allows for
the rapid determination of characteristics that relate to ligand binding strength.
It is important to note that this example is very simplistic. In reality, some
scoring functions contain over twenty terms.
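The first two descriptors in the simplified example are easy to compute from coordinates and charges, as the sketch below shows. It follows the simplified definitions given above (a contact count and a pairwise Coulomb sum) and is not RACHEL's code. The function names, the dielectric parameter, and the atom data are assumptions. LogP is omitted because estimating it requires a fragment-based or atom-based model that this chapter does not describe.

```python
import math

def steric_contacts(ligand, receptor, cutoff=5.0):
    """Number of receptor atoms within `cutoff` angstroms of any ligand atom."""
    return sum(
        1 for r in receptor
        if any(math.dist(r[:3], l[:3]) <= cutoff for l in ligand)
    )

def coulomb_energy(ligand, receptor, dielectric=1.0):
    """Pairwise Coulomb sum in kcal/mol (charges in e, distances in angstroms)."""
    k = 332.06  # Coulomb constant in kcal*angstrom/(mol*e^2)
    return sum(
        k * l[3] * r[3] / (dielectric * math.dist(l[:3], r[:3]))
        for l in ligand for r in receptor
    )

# Made-up atoms as (x, y, z, partial charge).
ligand = [(0.0, 0.0, 0.0, 1.0)]
receptor = [(3.0, 0.0, 0.0, -1.0), (0.0, 4.0, 0.0, 0.0), (10.0, 0.0, 0.0, 0.5)]

print(steric_contacts(ligand, receptor))
print(coulomb_energy(ligand, receptor))
```

Two of the three receptor atoms fall inside the 5 Å cutoff, and the attraction to the nearby negative charge outweighs the repulsion from the distant positive one, so the electrostatic term is negative.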
Figure 10 Derivation of Scoring Function
The figure above presents four complexes whose binding affinity has been
measured and whose descriptors have been calculated. Statistical tools, such as
partial least squares regression, are then employed to relate the numerical trends
in the descriptors with the corresponding binding affinities. In the resulting
equation, estimated affinity is a function of the calculated descriptors (steric,
electrostatic, and logP). Coefficients (A, B, C) relate the calculated descriptors
to the actual affinities and are determined by the statistical analysis. In the
example, as steric interaction energy increases, so does the biological binding
activity. Thus, the coefficient A is positive. On the other hand, a negative
electrostatic interaction energy is conducive to tighter binding since opposite
charges attract. Therefore, the corresponding coefficient B is negative. LogP
follows a similar trend as steric interaction energy; thus, coefficient C is
positive.
Once a scoring function has been derived, it can be employed to estimate
binding affinities very rapidly. Given a newly designed ligand or structural
derivative that has been docked within the active site, the descriptors of binding
are calculated then multiplied by the derived coefficients of the scoring
function. The resulting terms are summed to determine the estimated binding
affinity of the ligand in question.
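The derivation and use of a three-term scoring function can be sketched as follows. Note the substitution: ordinary least squares is used here in place of partial least squares to keep the example self-contained, and the equation has no intercept term. The training values are synthetic numbers invented for illustration, and `solve`, `fit`, and `score` are hypothetical helper names.

```python
def solve(matrix, vector):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(vector)
    aug = [row[:] + [v] for row, v in zip(matrix, vector)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit(descriptors, affinities):
    """Ordinary least squares via the normal equations (X^T X) c = X^T y."""
    n = len(descriptors[0])
    xtx = [[sum(row[i] * row[j] for row in descriptors) for j in range(n)]
           for i in range(n)]
    xty = [sum(row[i] * y for row, y in zip(descriptors, affinities))
           for i in range(n)]
    return solve(xtx, xty)

def score(coefficients, descriptor_values):
    """Estimated affinity = A*steric + B*electrostatic + C*logP."""
    return sum(c * d for c, d in zip(coefficients, descriptor_values))

# Synthetic training set: (steric, electrostatic, logP) and measured affinity.
training_descriptors = [
    [40.0, -12.0, 2.0],
    [55.0, -20.0, 3.5],
    [30.0, -5.0, 1.0],
    [62.0, -8.0, 4.0],
]
training_affinities = [7.4, 11.25, 4.5, 9.8]

A, B, C = fit(training_descriptors, training_affinities)
print(A, B, C)
print(score([A, B, C], [50.0, -15.0, 3.0]))  # estimate for a new derivative
```

With this synthetic data the fitted A and C come out positive and B negative, matching the sign pattern described for Figure 10. Scoring a new derivative is then a single multiply-and-sum, which is what makes the approach fast enough for high-throughput screening.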
RACHEL uses the following descriptors to generate its scoring function:
• Steric complementarity
• Steric strain
• Electrostatic interaction energy
• Nonpolar - nonpolar interaction energy
• Molecular weight
• Number of rotatable bonds
• LogP estimation
• Nonpolar atom fraction
Currently, several hundred high quality ligand-receptor complexes in the public
domain can be employed for scoring function development. Pharmaceutical
firms have access to far more proprietary structures. However, even with all
these structures and with the powerful statistical tools to analyze them, scoring
functions still remain mediocre at best.
According to the laws of thermodynamics, ∆G = ∆H - T∆S.
• ∆G is the Gibbs free energy of binding, that is, the energy that is released
when ligand and receptor bind. This is the actual thermodynamic
property that we are trying to estimate with the scoring function.
• ∆H is the enthalpy change, which is closely related to the change in
internal energy. It is grossly approximated by
the calculated descriptors. Efforts to improve the accuracy of these
approximations often increase calculation time drastically.
• T∆S is an entropy term and is indicative of the relative gain or loss of
disorder when ligand and receptor bind. Perhaps the biggest influence on
entropy is the behavior of the water molecules in the active site that are
displaced when binding occurs. This term is often disregarded because it
is time-consuming to calculate it with any degree of accuracy.
Consequently, ∆G is at best very crudely estimated by scoring functions. For
review, see A. Murcko et al. J. Med. Chem. 1995, 38, 4953-4967.
Generalized Scoring Functions
In general, proprietary, generalized scoring functions have been derived using a
wide variety of structures. This approach has significant shortcomings.
1. Receptor systems vary considerably in their chemical makeup. In some
systems, electrostatic interactions dominate the ligand binding force. In
other systems, hydrophobic interactions overshadow the other forces
involved. Thus, a master scoring function to estimate binding affinity for all
ligand-receptor systems is an all-purpose tool that does not excel at
anything. Using such a variety of ligand-receptor systems in the training set
adds considerable noise to the data, which diminishes its predictive power.
2. The wealth of structure-activity data gathered by a research lab while
developing a drug candidate is not exploited to its full potential by a generalized
scoring function.

Focused Scoring Functions
RACHEL offers the unique ability to utilize the corporate structure-activity
data, thereby retaining the competitive edge gained through research and
development. By incorporating the necessary statistical and analytical tools,
RACHEL generates focused scoring functions to estimate the binding affinity of
new compounds.
By limiting the training set to structures binding within the same receptor, a
focused scoring function is biased towards the interactions that govern ligand
association with the target active site. If hydrophobic contacts predominate, the
hydrophobic descriptors will be emphasized. Conversely, if electrostatic forces
are important to binding, those descriptors will be accentuated. Even something
as simple as the size of the active site can have a tremendous impact on the
allowable ligands. This is a descriptor that cannot be adequately represented in
generalized scoring functions. Given their built-in adaptability, focused scoring
functions have a greater predictive power when estimating ligand-receptor
binding.
4.4.2 Automated Elucidation of the Target Function

Even with structure-activity data pertaining to a target receptor, difficulties in
generating accurate scoring functions may arise. First, there must be an
adequate number of compounds to make the analysis statistically valid.
Figure 11 Problems in Deriving Scoring Functions
Imagine the green and red dots to be structure-activity data points for individual
ligand-receptor complexes. The lines passing through them represent potential
scoring functions attempting to describe their distribution.
• In the graph on the left, the dataset contains a large number of
complexes whose activities cover a wide range of values. This wide
distribution allows for an easy determination of a best-fit line. The
scoring function generated from this set thoroughly represents the data.
• The dataset in the middle graph contains too few compounds to generate
an accurate fit of the data. Notice the ambiguity that exists in
determining the best-fit line. Any scoring function derived from this dataset
has little predictive value.
• In the graph on the right there is no lack of data. However, money and
time constraints may have limited studies of poorly binding compounds,
resulting in a cluster of high-affinity data points. This graph shows that
it is difficult to elucidate an accurate scoring function when the
structure-activity data is not broad enough.
In situations where the dataset is either too small or too clustered, RACHEL
offers another means of generating a focused scoring system from proprietary
structure activity data. When RACHEL determines that the derived scoring
function offers little predictive value, it switches to a target function. A target
function is formed by simply averaging the descriptor values of the highest
affinity complexes in the training set. These “ideal” descriptor values are then
used as a guide to determine if newly generated derivative structures will be
kept or discarded. This is illustrated in the figure below.
Figure 12 Use of Target Function to Screen Compounds
In this three-dimensional graph, the axes represent the three descriptors of
ligand receptor binding (steric, electrostatic, LogP). The blue cube is a plot of
the “ideal” descriptor values that have been averaged from the optimal binding
ligands in the user’s structure-activity data. This blue cube represents the target
values against which all derivative compounds will be compared. The descriptor
values for each derivative structure are then plotted. The structures whose
descriptor values are closest to the target are retained (green). All other
structures are rejected (red).
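A target function of this kind is straightforward to sketch. The code below averages the descriptor values of the highest-affinity complexes and keeps derivatives that fall within a distance tolerance of that target. The function names, the tolerance, the use of plain Euclidean distance, and all numeric values are assumptions for illustration. In practice the descriptors would likely need to be scaled to comparable ranges before distances are meaningful.

```python
import math

def target_function(training_set, top_n=2):
    """Average the descriptor values of the highest-affinity complexes."""
    best = sorted(training_set, key=lambda c: c["affinity"], reverse=True)[:top_n]
    n = len(best[0]["descriptors"])
    return [sum(c["descriptors"][i] for c in best) / len(best) for i in range(n)]

def screen(derivatives, target, tolerance):
    """Keep derivatives whose descriptors lie within `tolerance` of the target."""
    return [name for name, values in derivatives.items()
            if math.dist(values, target) <= tolerance]

# Synthetic training data: descriptors are (steric, electrostatic, logP).
training_set = [
    {"affinity": 9.1, "descriptors": (60.0, -18.0, 3.0)},
    {"affinity": 8.7, "descriptors": (56.0, -22.0, 3.4)},
    {"affinity": 5.2, "descriptors": (30.0, -5.0, 1.0)},
]
derivatives = {
    "d1": (57.0, -19.0, 3.1),
    "d2": (35.0, -6.0, 1.5),
    "d3": (59.0, -21.5, 3.0),
}

target = target_function(training_set)
kept = screen(derivatives, target, tolerance=3.0)
print(target, kept)
```

Here the two best complexes define the target, and only the derivatives that resemble them are retained. The weakly binding training compound has no influence on the result.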
The primary advantage of using a target function is its ease of implementation.
No longer is a large training set of compounds required. Even a single
compound can be used as a model for optimal ligand receptor binding. By
simply extracting the descriptor values of the best compounds, it is possible to
avoid many of the pitfalls in scoring function development that result from data
artifacts. In addition, the characteristics of the ligand-receptor association that
foster improved binding are allowed to drive the development of future
structures.
The disadvantage in using target functions is the lack of extrapolation. Because
the system uses the properties of previously characterized ligands, it is unable to
predict whether a new derivative compound can potentially bind better to the
receptor. Being unable to quantitate the binding relative to the other structures
in the training set, RACHEL simply builds structures that mimic the
characteristics of the best compounds.
Fortunately, this is often the exact task at hand for pharmaceutical chemists. By
the time a drug development project has reached maturity, the ligands that have
been developed are often optimal binding compounds. Therefore, a target
function is usually sufficient as it allows the drug designer to construct alternate
chemical architecture that retains optimal binding characteristics.

