
INFORMS Journal on Computing
Vol. 21, No. 3, Summer 2009, pp. 349–362
ISSN 1091-9856 | EISSN 1526-5528 | DOI 10.1287/ijoc.1090.0340
© 2009 INFORMS

Grid-Enabled Optimization with GAMS


Michael R. Bussieck
GAMS Software GmbH, 50933 Cologne, Germany, mbussieck@gams.com

Michael C. Ferris
Computer Sciences Department, University of Wisconsin–Madison, Wisconsin 53706,
ferris@cs.wisc.edu
Alexander Meeraus
GAMS Development Corporation, Washington, DC 20007, ameeraus@gams.com

We describe a framework for modeling optimization problems for solution on a grid computer. The framework is easy to adapt to multiple grid engines and can seamlessly integrate evolving mechanisms from particular computing platforms. It facilitates the widely used master-worker model of computing and is shown to be flexible and powerful enough for a large variety of optimization applications. In particular, we summarize a number of new features of the GAMS modeling system that provide a lightweight, portable, and powerful framework for optimization on a grid. We provide downloadable examples of its use for embarrassingly parallel financial applications, decomposition of complementarity problems, and for solving very difficult mixed-integer programs to optimality. Computational results are provided for a number of different grid engines, including multicore machines, a pool of machines controlled by the Condor resource manager, and the grid engine from Sun Microsystems.
Key words: algebraic modeling language; grid computing; decomposition
History: Accepted by Karen Aardal, Area Editor for Design and Analysis of Algorithms; received September
2007; revised September 2008; accepted February 2009. Published online in Articles in Advance June 12, 2009.

1. Introduction

There are probably two main sources for parallelism within optimization algorithms. First and foremost, there is the opportunity to use parallel computations to aid in the search for global solutions, typically in a nonconvex (or discrete) setting. Important techniques of this kind either involve multiple trial points or search processes, including pattern searches, evolutionary algorithms (Alba 2005, Goldberg 1989), heuristics (Linderoth et al. 2001), or multistart methods, or computations to efficiently explore a complete enumeration of a large set of trial points, including branch-and-bound (Gendron and Crainic 1994) or branch-and-cut methods (Ralphs et al. 2003). Second, optimization algorithms have used building blocks, most prominently decomposition and parallel linear algebra techniques, to exploit the computational powers of high-performance machines. We refer to Grama and Kumar (1995, 1999) for lists of references, whereas the texts of Butnariu et al. (2001) and Censor and Zenios (1997) provide a fuller perspective.

Because commodity computational components have become increasingly cheap and accessible, there has been an increasing interest in the last decade in grid computing (Foster and Kesselman 1999, Livny and Raman 1999), a somewhat poorly defined but well-known notion. Grid computing treats a confederation of loosely coupled heterogeneous computing resources as a single object and attempts to execute algorithms on such a platform. Here, there have been some notable successes in the area of optimization, including the solution of large-scale travelling salesman problems (Applegate et al. 1998), the processing of difficult quadratic assignment problems (Anstreicher et al. 2002, Wright 2001), and the resolution to optimality of some hard mixed-integer programs (Ferris et al. 2001). The attraction of such an environment is that it can provide an enormous amount of computing resources—many of which are simply commodity computing devices with the ability to run commercial quality codes—to a larger community of users. As such, grid computing is sometimes called computing for the masses, or poor man's parallelism. This is the platform that we intend to exploit in this paper.

As a particular example, the results contained here use the Condor system (Epema et al. 1996, Litzkow et al. 1988), a resource management tool that can provide a user with enormous computing resources. Condor has been extensively deployed on collections of idle workstations, dedicated clusters, and multiprocessor architectures running a variety of different operating systems. An important feature of this system (from a user's perspective) is that whenever

individual machines are updated, the power of the overall grid increases seamlessly. Furthermore, the system is available for download.¹

¹ Livny, M. The Condor Project: High throughput computing. http://www.cs.wisc.edu/condor.

However, we believe that grid computational resources are not enough to make parallel optimization mainstream. Setting aside the issue of data collection, it is imperative that we provide simple and easy-to-use tools that allow distributed algorithms to be developed without knowledge of the underlying grid engine. Although it is clear that efficiency may depend on what resources are available, we believe it is imperative to provide high-level methodology to allow a modeler to partition the solution into sufficiently compute-intensive subproblems, each of which generates information to guide the overall solution process. Stochastic programming is perhaps a key example, where in most of the known solution techniques, large numbers of scenario subproblems need to be generated and solved (Linderoth et al. 2006).

The programming paradigm envisioned in this paper is the master-worker model in which a master program generates a large number of independent subproblems that can be solved in parallel by the workers. Once the subproblems finish, the master program performs additional computations and creates a new set of subproblems. This paradigm is supported by the MW (Goux et al. 2000) API within Condor. The mixed-integer programming (MIP) solver FATCOP (Chen and Ferris 2000, Chen et al. 2001) uses the MW API, as do several other applications. We aim to take the abstraction one level higher and allow Condor (or any other grid computing system) to be used directly from within a modeling language. Our linkage of grid computing to modeling systems is an attempt to allow grid technology to be exercised by (non-grid expert) businesses and application modelers. Previous work in this vein can be found for example in Dolan et al. (2008) and Ferris and Munson (2000).

A modeling language (Bisschop and Meeraus 1982, Fourer et al. 1990) provides a natural, convenient way to represent mathematical programs. These languages typically have efficient procedures to handle vast amounts of data and can quickly instantiate a large number of models. For this reason, modeling languages are heavily used in practical applications. This paper outlines some basic grid computing tools within the general algebraic modeling system (GAMS) that facilitate the parallel and asynchronous solution of models. Three examples, from simple to complex, will be used to illustrate the use of grid computing facilities: tracing of efficiency frontiers and scenario evaluations, implementing parallel decomposition methods, and developing asynchronous algorithms to solve extremely difficult mixed-integer programming problems to optimality.

The use of a modeling language is an essential factor in the management and solution of realistic, application-driven, large-scale problems and allows us to take advantage of the numerous options for solvers and model types, therefore enhancing the applicability of these tools. The representation of large-scale optimization problems in a modeling language naturally reveals decomposable structures that together with application knowledge lead to effective parallelization. Modeling languages with integrated procedural components allow the implementation of such techniques at the application level. Further discussion of the procedural and declarative components in modeling languages is described in Bussieck and Meeraus (2003). Although we will use GAMS, the system we are intimately familiar with, most of what will be said could as well be applied to other algebra-based modeling systems such as AIMMS, AMPL, MOSEL, MPL, OPL, and others.

The GAMS grid facility allows multiple optimization problems to be instantiated or generated from a given set of models. Each of these problems is solved concurrently using grid computing resources. A number of commercial grid computing resources are now available on an as-you-go basis, and optimization software is beginning to appear. For example, GAMS and its grid facility are now available on Sun Microsystem's Network.com.

The rest of this paper is organized as follows. In §2, we outline the grid computing tools that are available through GAMS. The following sections then outline specific examples of the use of these tools, including applications in finance (§3), decomposition of global economic models (§4), and difficult mixed-integer programs (§5). Computational results detailing specific application of these tools to the solution of various optimization problems on a number of grid computing engines are provided, coupled with the use of advanced features of the modeling system for generation, collection, and analysis of results.

2. The GAMS Grid Facility

The computational grid and modeling languages form a synergistic combination. Linking them together gives us expressive power and allows us to easily generate simple parallel programs. Successful applications of our mechanism should possess two key properties: they should generate a large number of independent tasks, and each individual task should take a long time to complete. Applications with the above two properties cannot be reasonably performed serially. Furthermore, the model generation time and scheduling overhead are ameliorated by the resources spent solving each individual task.
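The master-worker flow used throughout this paper can be sketched in ordinary Python. This is purely an illustration of the paradigm, not part of GAMS: solve_subproblem is a hypothetical stand-in for one long-running solve, and the master builds each new batch of subproblems from the previous batch's results.

```python
from concurrent.futures import ThreadPoolExecutor

def solve_subproblem(task):
    # Hypothetical stand-in for one independent, long-running solve.
    return task + 1

def master(tasks, rounds=2):
    results = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        for _ in range(rounds):
            # Workers process the current batch of subproblems in parallel;
            # the master waits for the batch, then creates the next one.
            batch = list(pool.map(solve_subproblem, tasks))
            results.append(batch)
            tasks = [2 * r for r in batch]  # next batch derived from results
    return results

print(master([1, 2, 3]))  # → [[2, 3, 4], [5, 7, 9]]
```

Because each batch consists of independent tasks, the worker pool here could equally be a multicore machine or a Condor pool; only the dispatch mechanism changes.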

We will first review what happens during the synchronous solution step and then introduce the asynchronous or parallel solution steps. When GAMS encounters a solve statement during execution, it proceeds in three basic steps:

Step 1 (Generation). The symbolic equations of the model are used to instantiate the model using the current state of the GAMS database. This instance contains all information and services needed by a solution method to attempt a solution. This representation is independent of the solution subsystem and computing platform.

Step 2 (Solution). The model instance is handed over to a solution subsystem, and GAMS will wait until the solver subsystem terminates.

Step 3 (Update). The detailed solution and statistics are used to update the GAMS database. We refer to this process as loading the solution.

In most cases, the time taken to generate the model and update the database with the solution will be much smaller than the actual time spent in a specific solution subsystem. Often the model generation takes just a few seconds, whereas the time to obtain an optimal solution may take a few minutes to several hours.

If the solutions of a collection of models are unaffected by their order of computation, we can solve them in parallel and update the database in random order. All we need is a facility to generate models, submit them for solution, and continue. At a convenient point in our program, we will then look for the completed solutions and update the database accordingly. We will term this first phase the submission loop and the subsequent phase the collection loop.

Submission Loop. In this phase we will generate and submit models for solution that can be solved independently.

Collection Loop. The solutions of the previously submitted models are collected as soon as a solution is available. It may be necessary to wait for some solutions to complete by putting the GAMS program to sleep.

We now illustrate the use of the basic grid facility. Assume that we have a simple transportation model that is parameterized by given supply and demand information. The following statements instantiate those parameters and then solve the model, saving the objective value afterwards into a report parameter.

   demand = 42; cost = 14;
   solve mymodel min obj using minlp;
   report = obj.l;

Note that a model in the GAMS language is just a collection of symbolic relationships—the "equations." A solve statement simply takes those relationships, instantiates the model using the current state of the GAMS database, and passes it to some solution engine. Once a solution is obtained, the results are merged back into the GAMS database. To carry out multiple replications, a loop construct can be used. Note the lack of a dummy index in the construct loop(scenario, ...), which iterates over all elements in the set scenario. The only changes required are to instantiate different data and to save appropriate values from each solution.

   loop(scenario,
      demand = sdemand(scenario);
      cost = scost(scenario);
      solve mymodel min obj using minlp;
      report(scenario) = obj.l );

However, the solve statement in this loop is blocking. Essentially, the solver takes the (scalar-level) model that was generated, solves it, and returns the solution back to the modeling system without releasing the process handle. We can make the solve statement non-blocking by setting the model attribute solvelink to a value of 3 or using the compile time symbolic constant %solvelink.AsyncGrid%. This value tells GAMS to generate and submit the model for solution, and continue without waiting for the completion of the solution step. This is now a submission loop.

   mymodel.solvelink=%solvelink.AsyncGrid%;
   loop(scenario,
      demand = sdemand(scenario);
      cost = scost(scenario);
      solve mymodel min obj using minlp;
      h(scenario) = mymodel.handle );

We obviously cannot save any solution values until the instance has actually solved. Instead, we need to save some information that will later allow us to identify an instance and check when the solve is complete. The model attribute handle provides this unique identifier. We store those handle values—in this case, in the parameter h—to be used later to collect the solutions. The following collection loop retrieves the solutions.

   loop(scenario$handlecollect(h(scenario)),
      report(scenario) = obj.l );

The function handlecollect will interrogate the solution process and will load the solution and related information if the solution process has been completed for the given handle. If the solution is not ready, the function will return a value of zero and will continue with the next element in the set scenario. For readers not familiar with GAMS syntax, the $ should read like a such that condition or a compact if statement. The above collection loop has one major flaw. If a solution was not ready, it will not be retrieved. We need to call this loop several times until all solutions have been

retrieved or until a time limit is reached. We will use a repeat-until construct and the handle parameter h to control the loop to look only for those solutions that have not been loaded yet, as shown below.

   repeat
      loop(scenario$handlecollect(h(scenario)),
         report(scenario) = obj.l;
         h(scenario)=0 );
      display$sleep(card(h)*0.1) 'sleep a bit';
   until card(h)=0 or timeelapsed > 100;

We use the handle parameter to control the collection loop. Once we have extracted a solution, we will set the handle parameter to zero. Before running the collection loop again, we may want to wait a while to give the system time to complete more solution steps. This is done with the conditional display statement, which just executes a sleep command for 0.1 seconds times the number of solutions not yet retrieved. The final wrinkle is to terminate after 100 seconds of elapsed time, even if we did not get all the solutions. This is important, because if one of the solution steps fails, our program would never terminate. The parameter h will now contain the handles of the failed solves for later analysis. Note that even if the time limit is reached, the submitted jobs are not interrupted. The grid engine will continue to process these models until a solution is found (or the solver decides to terminate). At a later stage, a new GAMS job can be run using the existing handle information and can collect all outstanding results. This is a key feature of our grid mechanism: the submission and collection loops are only interconnected by the handle parameters. As a final note, we have made no assumptions about what kind of solvers and what kind of computing environment we will operate. The above example is completely platform- and solver-independent, and it runs on a Windows laptop or on a massive grid system without any changes to the GAMS source code.

There are three handle functions that allow more control of the program. These are illustrated in the following modified version of the collection loop.

   repeat
      loop(scenario,
         if(handlestatus(h(scenario))=%handlestatus.Ready%,
            mymodel.handle = h(scenario);
            execute_loadhandle mymodel;
            display$handledelete(h(scenario))
               'could not remove handle';
            h(scenario)=0 ));
      display$sleep(card(h)*0.1) 'sleep a bit';
   until card(h)=0 or timeelapsed > 100;

The function handlestatus returns the current state of the solution process for a specific handle. If the return value is %handlestatus.Ready%, we have a solution and can proceed to load all or parts of the solution. This is carried out in two steps. First, we have to signal to the model which solution we want to load, and then we use the procedure execute_loadhandle to merge the solution into the current database. We also use the function handledelete to remove the instance from our system.

A few words are in order regarding the implementation of the solver interface. The solution is returned in a GAMS data exchange (GDX) container. This container is a high-performance, platform-independent data exchange with APIs for most programming languages. Data contained in GDX meet all syntactic and semantic rules of GAMS information and are used to communicate between GAMS system components and any other external system such as databases, spreadsheets, and many other systems. GDX ensures data quality and provides one very important service: it manages the mapping of different namespaces. The GAMS data model is a subset of a relational data model that accesses data by descriptors and not location. Programming the mappings from data structures suitable for algorithms or programming languages to relational data spaces is error-prone and expensive; GDX automates this step.

The actual submission of the model instance to the operating system for further processing is done via scripts. Whenever a solve statement is encountered while executing a GAMS program, control is passed to a script that is responsible for running the solver on the problem instance and passing back the solution to GAMS. In the grid environment, we simply use the file system to give each instance its own environment and its own directory. The script then schedules the solver execution. The solvers generate the solution file and a flag to signal completion. The collection loop will understand the completion signal and will commence retrieval. This submission script centralizes all information required to tailor the system to a specific grid engine and is easily customizable. Note that this implementation conforms to the master-worker paradigm, with the GAMS program being the master and the grid resources providing the workers. GAMS generates tasks and is responsible for the synchronization of the results, whereas the grid processes individual tasks and simply reports back results. A key feature of this design is its easy modification for different grid systems. All that is required is a tailoring of the script to the current grid resource, multicore machine, Condor, or an Amazon cloud.

Grid systems like Condor are highly fault tolerant. In particular, Condor guarantees the eventual completion of all tasks that have been given to it. Thus, if a machine becomes unavailable or unresponsive (because of hardware failures, for example), the task is reallocated to another worker. However, other grid engines may not be as effective at implementing fault tolerance, so we provide an additional

modeler-controllable method to deal with the situation of a failed optimization step. When GAMS submits an optimization task to the grid, a resubmission script is also generated automatically. If a user determines that an optimization task did not perform as expected, the task can be resubmitted using the following syntax:

   rc = handlesubmit(handle(scenario));

The return code indicates if the resubmission was successful or not. Although this process is essentially redundant, its additional robustness has proven useful when extremely large grid engines have been used.

In the following sections we will illustrate the use of the grid facility in research and production environments with three sets of examples. The first set shows how to exploit the ability to do parallel solution when tracing efficiency frontiers or evaluating independent scenarios. The second set implements parallel decomposition schemes for complementarity problems such as those arising in economic analyses. The last set of examples shows how one can extend the simple grid facility to implement sophisticated methods to solve very difficult mixed-integer models.

Before proceeding with these examples, we would like to give a brief summary of features introduced in this section. It is important to note that no new language constructs were required to program and manage the asynchronous execution of the model solution step; only minor extensions to existing structures have been added.

Instance Identification. The identity of a specific model instance, the model handle, is simply encoded into a GAMS scalar parameter value, which can be managed like any other data item.

Solution Strategy. The existing model attribute <model>.solvelink, which specifies the implementation of the solution process of a model instance, has two additional values to specify what type of asynchronous processing should be used. A new attribute, the <model>.handle, is used to communicate instances of model instantiation and the collection of solution values.

Solution Management. Four new functions were needed to manage the asynchronous model instances. The function handlecollect checks to see if the solution of a model instance has been completed. If a solution to the instance exists, it will be collected and merged into the existing GAMS database; solution-specific model attributes are reset. handlestatus returns the current status of the model instance without taking any further action. handledelete will attempt to remove the model instance from the system. handlesubmit resubmits a model instance for solution. The outcome of the attempt to solve a model instance is stored in a GDX container, which gives independent access to all solution process related information, including the solution values if the process was successful. Instead of using the handlecollect function to retrieve all solution values, the procedure execute_loadhandle has been added to collect all or parts of a solution.

Process Management. We may not be able to find the desired solution with just one GAMS process or submission. This could be because the solution process may have failures, may take days or weeks to complete, or has to interact with other external systems. This requires fail-safe design and time-distributed processes. The existing save and restart facilities, which allow GAMS processes to be interrupted and restarted, respectively, at a later time, possibly on a different computing platform, provide effective process management. The new GAMS process parameter griddir allows us to conveniently name a collection of model instance-related information, usually the name of a node in a file system. This facilitates the sharing of grid-related information between different processes.

System Management. The actual implementation of the grid submission and management procedure is concentrated in one single script file, the gmsgrid script. Each GAMS installation comes with a default gmsgrid script that implements the grid facility for a given platform using standard operating system functionality. These scripts can easily be modified to adapt to different system environments or application needs. A GAMS user does not have to have any knowledge about the existence or content of those scripts.

3. Processing Independent Scenarios

The most immediate use of parallel solution is for the generation of independent scenarios arising in many practical applications. Monte Carlo simulations, scenario analyses, and the tracing of efficiency frontiers are just a few examples. The modifications to the existing sequential GAMS code are minor and require no understanding of any platform-specific features. No additional constraints are imposed on the application. We illustrate those features by using the model QMEANVAR from the GAMS model library. This model is used to restructure an investment portfolio using the traditional Markowitz-style return variance trade-offs (Markowitz 1952) under additional trading restrictions, which make the model a mixed-integer quadratic programming model, which in GAMS is classified as a mixed-integer quadratically constrained program (MIQCP). Practical models of this kind can be very large, the scope for using advanced starting points is limited, and the model instantiation time is very small compared to

the solution time. The potential savings in elapsed time are then closely related to the number of processing nodes available, which makes implementation on multi-CPU systems attractive.

Before one starts to convert an application from serial to parallel operation, it is important to verify that the parallel features are working as advertised. This can easily be accomplished by taking the existing application and setting the <model>.solvelink option to the value %solvelink.AsyncSimulate%. This will instruct GAMS to execute all the solve statements in serial mode as before but use the asynchronous solution strategy. Once we have verified that our model and the required solvers can operate in asynchronous mode on the different target platforms, we are ready to parallelize the code.

As we have shown in the previous section, we need to separate the serial solution loop into a submission and collection loop. The original serial loop is as follows.

   Loop(p,
      ret.fx = rmin + (rmax-rmin)/(card(p)+1)*ord(p);
      Solve minvar min var using miqcp;
      xres(i,p) = x.l(i);
      report(p,i,'inc') = xi.l(i);
      report(p,i,'dec') = xd.l(i) );

We only need to define a place where we store the solution handle and set the solution strategy for this model to %solvelink.AsyncGrid%, which instructs GAMS to generate an instance of the model and submit this instance to the grid system for solution. Instead of saving the solution values, we save only the handle to use when we retrieve the solution.

   parameter h(p) solution handle;
   minvar.solvelink=%solvelink.AsyncGrid%;
   Loop(p,
      ret.fx = rmin + (rmax-rmin)/(card(p)+1)*ord(p);
      Solve minvar min var using miqcp;
      h(p) = minvar.handle; );

In the following collection loop, we set a real-time limit and mark the missed points.

   Repeat
      loop(p$handlecollect(h(p)),
         xres(i,p) = x.l(i);
         report(p,i,'inc') = xi.l(i);
         report(p,i,'dec') = xd.l(i);
         h(p) = 0 );
      display$sleep(card(h)*0.1) 'sleep some time';
   until card(h) = 0 or timeelapsed > maxtime;
   xres(i,p)$h(p) = na;

The results of the model are now ready to be processed further by GAMS or passed on to some other system for further analysis and visualization. No other parts of the GAMS program had to be changed. The complete grid-ready model is also available via the GAMS model library under the name QMEANVAG.

There may remain one more question: Are the solutions from the serial model the same as from the parallel one? This kind of question arises in many situations when doing maintenance or enhancements of existing applications. It is simple to capture a snapshot of the current state of some or all the data items, model inputs, and results in a GDX container. The easiest way is to add the GAMS parameter GDX to the job submission for both versions. The two GDX containers can then be compared with a special difference utility called gdxdiff, which produces a new GDX container that contains only the differences. Those differences can then be further processed or visualized in the GAMS IDE. It should be noted that there could be large differences in the equations and variables because the order in which solutions are retrieved is different, but all other items should be the same. In a shell environment this would look like

   gams qmeanvar gdx=qmeanvar
   gams qmeanvag gdx=qmeanvag
   gdxdiff qmeanvar qmeanvag qmeandiff RelEps=1e-12

It is typical for strategic modeling applications to require the generation of large numbers of scenarios, and their solutions need to be retained for further analysis. Furthermore, we may just want to submit a number of instances and disconnect from the system, analyze the results, and prepare a new set of scenarios. This working style can be supported by splitting our GAMS code into several parts. The first one will contain the complete model, data definition, and possibly the submission loop. Once all problems have been submitted, the program will terminate. To be able to inquire about the status of each instance and collect solutions, we need to save the GAMS environment for later restart and provide a known place where we can find the solution information. The GAMS parameters save and griddir (or shorthand versions, s and gdir) need to be added when submitting the job. When using a Windows command shell the syntax is as follows.

   > gams ... save=<sfile> gdir=<gdir>

The GAMS environment and the information in the grid directory are platform independent. For example, you can start submitting the jobs on a Windows platform and continue your analysis on a Sun Solaris system running on SPARC hardware. After some time we may want to check the status of the work. This task can be carried out by the following code.

   parameter status(p,*);
   acronym Waiting, Ready;
   status(p,'status') = Waiting;
   loop(p$(%handlestatus.Ready%=handlestatus(h(p))),

   minvar.handle = h(p);
   execute_loadhandle minvar;
   status(p,'solvestat') = minvar.solvestat;
   status(p,'modelstat') = minvar.modelstat;
   status(p,'seconds')   = minvar.resusd;
   status(p,'status')    = Ready );
display status;

To run the above program, we will have to restart from the previously saved environment and provide the location of the grid information. A job submission then may look like

> gams ... restart=<sfile> gdir=<gdir>

The output of this run may then look like

---- 173 PARAMETER status

      solvestat   modelstat   seconds   status
p1        1.000       1.000     0.328   Ready
p2        1.000       1.000     0.171   Ready
p3                                      Waiting
p4                                      Waiting
p5        1.000       1.000     0.046   Ready

Once we are satisfied that a sufficient number of instances completed successfully, we are ready to execute the second part of our program using the same restart and gdir values.

The model solutions retained are stored in GDX containers and can be operated on like any other GDX container. In large-scale applications, it may not be feasible to merge all solution values. We only need to extract certain ones.

4. Decomposition Approaches
Although it is clear that many model solution procedures can be enhanced via the use of embarrassingly parallel techniques implemented on a grid computer, there are an enormous number of problems that take too long to solve in a serial environment and do not have such easy parallelization. The difficulties include the sheer size of the problems, a requirement of global or stochastic optimization guarantees, or a combinatorial explosion in the search processes related to discrete choices.

In such cases, a modeler can resort to a problem decomposition approach. The GAMS language is rich enough to allow algorithms such as Benders' decomposition, column generation, Jacobi and Gauss-Seidel iterative schemes for systems of equations, and even branch-and-bound algorithms to be coded directly. Many of these approaches generate subproblems that can be solved concurrently, and the grid extensions of GAMS enable these subproblems to be solved on the grid in such a manner. In this section we give a prototype example of such a decomposition, namely, an asynchronous Jacobi method for a complementarity system.

The particular example we have in mind arises from a large class of models that are widely used in applied general equilibrium analysis. These problems use realistic (possibly large-scale) data generated from a suite of modeling tools (Codsi and Pearson 1988) for general and partial equilibrium models called GEMPACK. A key feature of these models is the availability of a configurable data set that facilitates large global models to be built for studying many different trade and policy issues. Horridge and Rutherford have created a number of utilities that facilitate translating the data files from GEMPACK format to GAMS format (http://www.monash.edu.au/policy/gp-gams.htm).

These models provide a large number of interesting, large-scale complementarity problems whose solutions are used for issues such as tariff reform, tolling, and economic policy evaluation. Many problems are formulated as square systems of nonlinear equations, but in other cases, changes in activities or regimes lead more naturally to formulations as complementarity systems. Furthermore, these problems often have a natural decomposition structure readily available because of underlying spatial or market properties in the system being modeled. As such, this problem class provides a rich source of problems of various sizes and difficulties for our grid engine.

4.1. Asynchronous Jacobi Method
We outline how to implement various iterative schemes for the solution of mixed complementarity problems (MCPs). For simplicity, rather than detailing all the equations and inequalities that occur in the general MCP setting, which would obscure the point we are trying to emphasize here, we provide a simple example of grid solution of systems of linear equations formulated as an MCP:

parameter A(i,j), b(i);
* specific instantiation of data
variables x(i); equations e(i);

e(i).. sum(j, A(i,j)*x(j)) =e= b(i);

model lin /e.x/; solve lin using mcp;

In the cited applications, each element of i might represent a region of the world, and each element of j a particular amount of good or a related price. The complementarity condition is represented here using e.x. When the problem size becomes too large, we can use a partitioning scheme where the model domain is split into a collection of nonoverlapping subdomains.
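Before turning to the GAMS implementation, the partitioning itself can be sketched in a few lines of Python (our illustration, not part of the paper's code). The `partition` helper below mirrors the GAMS assignment `active(k,i) = ceil(ord(i)*card(k)/card(i)) = ord(k)`; the function name and the 0-based return convention are our own choices.

```python
import math

def partition(n, k):
    """Assign each of n indices to one of k roughly equal contiguous blocks.

    Mirrors the GAMS rule active(k,i) = ceil(ord(i)*card(k)/card(i)) = ord(k).
    Returns a list of k index lists (the 'active' sets); from block j's point
    of view, every index outside blocks[j] is 'fixed'.
    """
    blocks = [[] for _ in range(k)]
    for i in range(1, n + 1):                 # GAMS ord() is 1-based
        blocks[math.ceil(i * k / n) - 1].append(i - 1)
    return blocks

blocks = partition(10, 3)
# every index lands in exactly one block, and the blocks cover 0..9
```

The same rule could just as easily encode a problem-specific split (by region or market) instead of this uniform one.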
In our motivating GEMTAP example, the splitting is based on trade boundaries, or loosely coupled markets, for example. A simple code to do this is outlined below, where we use the two dimensional sets active and fixed to denote those variables that are part of partition k and those that should be treated as fixed in partition k, respectively.

sets k problem partition blocks / block_1*block_%b% /
     active(k,i) active vars in partition k
     fixed(k,i)  fixed vars in partition k;
alias(kp,k);

active(k,i) = ceil(ord(i)*card(k)/card(i)) = ord(k);
fixed(k,i)  = not active(k,i);

A popular method for solving such systems is the Gauss-Seidel method, whereby the problem is split into several blocks. Each block problem is solved sequentially with the variables that are not in the block being fixed at their current values. An excellent reference for this method and the ones we outline below is Bertsekas and Tsitsiklis (1989).

parameter res(iters) sum of residual
          tol        convergence tolerance / 1e-3 /
          iter       iteration counter;

lin.holdfixed = 1; ! treat fixed vars as constants

x.l(i) = 0; res(iters) = 0; res('iter0') = smax(i, abs(b(i)));

loop(iters$(res(iters) > tol),
   loop(k,
      x.fx(i)$fixed(k,i) = x.l(i);
      solve lin using mcp;
      x.lo(i)$fixed(k,i) = -inf;
      x.up(i)$fixed(k,i) =  inf );
   res(iters+1) = smax(i, abs(b(i) - sum(j, A(i,j)*x.l(j)))) );

Note several points here. The first is that because some variables are fixed, the holdfixed option of GAMS generates a model simply in the smaller block dimension space. Second, when the solution is read back into GAMS, a merge is performed, and hence only the values of the variables that have been updated by the solver are changed.

This process, while reducing the size of the problems being solved, may take large numbers of iterations to converge and is carried out in a serial fashion. For parallelization or grid solution, an even simpler technique is commonly used and is typically referred to as a Jacobi scheme. In this setting, each block problem is solved concurrently with the variables that are not in the block being fixed at their current values. After all block subproblems are completed, the variables are updated simultaneously with the block solutions before the next iteration of the process is started.

parameter h(k) handles;
lin.solvelink = %solvelink.AsyncGrid%;

x.l(i) = 0; res(iters) = 0; res('iter0') = smax(i, abs(b(i)));

loop(iters$(res(iters) > tol),

   loop(k, ! submitting loop
      x.fx(i)$fixed(k,i) = x.l(i);
      solve lin using mcp; h(k) = lin.handle;
      x.lo(i)$fixed(k,i) = -inf;
      x.up(i)$fixed(k,i) =  inf );

   repeat ! collection loop
      loop(k$handlecollect(h(k)),
         display$handledelete(h(k)) 'could not remove handle';
         h(k) = 0 ); ! mark problem as solved
   until card(h) = 0;

   res(iters+1) = smax(i, abs(b(i) - sum(j, A(i,j)*x.l(j)))) );

Note here a couple of points. The GAMS grid option is used to spawn all the block subproblems in the first loop. The second loop retrieves all the solutions, using the built-in merge procedure of GAMS to overwrite the appropriate variable values. Although such codes are easy to write with the extensions described in this paper, it is typically the case that the Jacobi process is slow to converge (more so than the previous Gauss-Seidel scheme) because it does not use the most up-to-date information as soon as it is available.

For this reason, an asynchronous scheme is often preferred, and the GAMS grid option can be used to facilitate such a process very easily.

parameter curres intermediate residual values;

lin.solvelink = %solvelink.AsyncGrid%;

x.l(i) = 0; res(iters) = 0; res('iter0') = smax(i, abs(b(i)));
iter = 0;

loop(k, ! initial submission loop
   x.fx(i)$fixed(k,i) = x.l(i);
   solve lin using mcp;
   h(k) = lin.handle;
   x.lo(i)$fixed(k,i) = -inf;
   x.up(i)$fixed(k,i) =  inf );

repeat ! retrieve and submit
   loop(k$handlecollect(h(k)),
      display$handledelete(h(k)) 'could not remove handle';
      h(k) = 0;
      iter = iter + 1;
      curres = smax(i, abs(b(i) - sum(j, A(i,j)*x.l(j))));
      res(iters)$(ord(iters) = iter + 1) = curres;
      if(curres > tol,
         loop(kp$(h(kp)=0 and
                  smax(active(kp,i), abs(b(i) - sum(j, A(i,j)*x.l(j)))) > tol),
            x.fx(i)$fixed(kp,i) = x.l(i);
            solve lin using mcp; ! submit new problem
            h(kp) = lin.handle;
            x.lo(i)$fixed(kp,i) = -inf;
            x.up(i)$fixed(kp,i) =  inf ) ) );
until card(h) = 0 or iter ge card(iters);
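The asynchronous submit-and-collect pattern can be imitated on a single machine with Python's `concurrent.futures`. The sketch below is our illustration (a toy diagonally dominant 3-by-3 system, not the paper's code): whenever a block finishes, its value is merged back, and only blocks whose residual is still large are resubmitted, just as the GAMS loop over `kp` does.

```python
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

# toy diagonally dominant system A x = b with exact solution x = (1, 1, 1)
A = [[4.0, 1.0, 0.0],
     [1.0, 5.0, 2.0],
     [0.0, 2.0, 6.0]]
b = [5.0, 8.0, 8.0]
n, tol = 3, 1e-9
x = [0.0] * n

def residual(i):
    return abs(b[i] - sum(A[i][j] * x[j] for j in range(n)))

def solve_block(i, snapshot):
    # "solve" block i: one equation in one unknown, others held at snapshot
    return (b[i] - sum(A[i][j] * snapshot[j] for j in range(n) if j != i)) / A[i][i]

with ThreadPoolExecutor(max_workers=n) as pool:
    pending = {pool.submit(solve_block, i, list(x)): i for i in range(n)}
    while pending:
        done, _ = wait(pending, return_when=FIRST_COMPLETED)
        for fut in done:
            x[pending.pop(fut)] = fut.result()     # merge the block solution
        running = set(pending.values())
        for k in range(n):                         # resubmit stale blocks only
            if k not in running and residual(k) > tol:
                pending[pool.submit(solve_block, k, list(x))] = k
```

Because the matrix is strictly diagonally dominant, this chaotic update order still converges, which is the same property that makes the GAMS asynchronous scheme safe for the loosely coupled market models described above.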
Note that all the block subproblems are spawned initially. After GAMS detects that a subproblem is solved, the results are retrieved, and the residual of the system of equations is calculated. If this is not small enough, then each block subproblem that is not currently running, but for which the residual in this block is large, is spawned. Note that if a current block already has a small residual, a new subproblem is not spawned but may be started at a later time when a different set of variables is updated. This process typically converges much faster than the Jacobi scheme and in a parallel or grid environment outperforms the (serial) Gauss-Seidel scheme shown previously.

The complete example from this section is available in the GAMS model library and was successfully run on all three of our grid engines, namely, as background processes on a multiprocessor desktop, using the Sun grid, or using the Condor resource manager. Furthermore, precisely the same scheme has been used to solve multiple applied general equilibrium examples mentioned at the start of this section and as reported by Rutherford and colleagues.2

5. Solving Intractable Mixed-Integer Problems
MIP has been a proving ground for parallel computation for the last 15 years (Gendron and Crainic 1994). Although most modern commercial MIP solvers support symmetric multiprocessing, or SMP, based on shared memory, the early 1990s also saw implementations of the branch-and-bound (B&B) algorithm on distributed memory systems, including academic codes (e.g., Eckstein 1994), which lay the groundwork for the PICO solver (Eckstein et al. 2001) as well as commercial codes like parallel OSL on IBM SP2. The tree search in the B&B algorithm is a clear invitation for massive parallel processing. The master process farms out the relatively expensive operation of solving the linear programs (LPs) at the nodes to the workers. Even though communication standards like PVM or MPI have developed over time, no commercial MIP solver supports a distributed memory environment today. One reason for the failure of this simple parallelization scheme is the large volume of data communicated between master and worker compared to the relatively short solution times of the LPs at the nodes.

In this section we will describe a different method for solving MIPs on a grid using an a priori decomposition scheme of the solution space. We use the GAMS grid facility for managing the subproblems on the grid. Moreover, we discuss some extensions to the solver links that satisfy the need for communication between the running jobs, making this a complex example of grid-enabled optimization.

5.1. Basic Decomposition of the Feasible Space
There are various ways of decomposing the feasible region of a MIP with discrete variables L_I <= x_I <= U_I. For example, the Hamming distance to a reference point, enforced by an additional constraint, for MIPs with L_I = 0 and U_I = 1 provides an a priori decomposition into |I| subproblems. The bisection of bound ranges of some discrete variables x_I' with I' a subset of I trivially produces a decomposition of 2^|I'| subproblems. Ferris et al. (2001) applied a few rounds of manual strong branching with a breadth-first-search node selection strategy to generate 256 subproblems of the Seymour problem. These 256 problems were solved in parallel with CPLEX 6.6.

The advantages of a priori decomposition schemes are clear. The ratio between work and communication at the worker exceeds the one from simple parallel B&B algorithms. Off-the-shelf industrial strength MIP solvers can be used at the workers. Subproblems arising from bound tightening allow for an additional round of MIP preprocessing that in general results in a tighter relaxation and faster solution times. The problem with a priori decompositions is that the computational effort required to solve the subproblems differs significantly. In some early experiments with the bisection method,3 up to 95% of the subproblems were quite simple to solve, whereas the remaining problems were almost as difficult as the original one. It is not obvious how to determine the level of difficulty of a subproblem prior to solving it. One way of ranking subproblems is to look at the value of their LP relaxation (possibly improved by extensive MIP preprocessing and cut generation). The design of the branch-and-bound algorithm suggests that on average, the closer the value of the LP relaxation of a subproblem is to the value of the root relaxation, the longer it takes to solve the subproblem. This measure has its limitations: for example, a subproblem could have an LP relaxation value close to that of the root relaxation, but the feasible region is extremely small; so exploring the subproblem will quickly terminate. Nevertheless, computational experiments show that subproblem generation according to this measure produces decompositions with subproblems of a similar but reduced level of difficulty compared to the original problem. Moreover, we can use a MIP solver to produce such decompositions without deeper knowledge of the problem itself and the importance of the discrete decision variables.

2 Rutherford, T. MPSGE (Section on Decomposition). http://www.mpsge.org/mainpage/mpsge.htm.
3 OR/MS Today advertisement, February 2006. http://www.gams.com/presentations/orms_condor.pdf.
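The bound-range bisection just described can be made concrete with a short sketch. The Python fragment below is our illustration (not code from the paper): it enumerates the 2^|I'| subproblems obtained by halving the bound range of each variable in a chosen subset I'. The helper name `bisect_decomposition` and the toy bounds are hypothetical.

```python
from itertools import product

def bisect_decomposition(bounds, split_vars):
    """bounds: dict var -> (lo, up); split_vars: the subset I' to bisect.

    Returns 2**len(split_vars) subproblems, each a dict of tightened bounds;
    variables outside I' keep their original box in every subproblem.
    """
    halves = []
    for v in split_vars:
        lo, up = bounds[v]
        mid = (lo + up) // 2
        halves.append([(v, (lo, mid)), (v, (mid + 1, up))])
    subproblems = []
    for choice in product(*halves):
        sub = dict(bounds)            # start from the original box
        sub.update({v: rng for v, rng in choice})
        subproblems.append(sub)
    return subproblems

subs = bisect_decomposition({"x1": (0, 7), "x2": (0, 3), "x3": (0, 1)},
                            ["x1", "x2"])
# 2**2 = 4 subproblems whose boxes partition the original feasible region
```

Each returned bound box would correspond to one grid job; the uneven difficulty of these boxes is exactly the weakness discussed above.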
5.2. Decompositions by Branch and Bound
At any time during the B&B algorithm,4 the set of open nodes J = {1, ..., k} corresponds to a decomposition of the relevant unexplored feasible region of a mixed-integer program. The minimum5 of the incumbent and the optimum objective values of all subproblems defined by the open nodes J determines the optimal solution of the original problem: z_opt = min{z_incb, z_opt^1, z_opt^2, ..., z_opt^k}, where z_opt^j is the optimum objective value of the subproblem corresponding to the jth open node.

MIP solvers like COIN-OR's CBC, CPLEX, and Xpress provide strategies for variable branching (e.g., strong branching) and node selection (e.g., best bound) that focus on improving the smallest (for minimization problems) LP relaxation value of all open nodes. This value is also known as the best bound or best estimate. Decompositions based on open nodes from B&B trees developed by such strategies tend to produce subproblems with similar LP relaxation values that are significantly reduced from the initial root relaxation value. Following the suggested relation between the level of difficulty and the deviation of subproblem LP relaxation values from the root, the scheme will produce subproblems of equal but reduced level of difficulty. For example, Figure 1 shows for model DICE6 (Bosch 2003, Gardner 2001) the relative difference of the LP values at 16 open nodes from the root node LP value with different settings for CPLEX parameter mipemphasis.

[Figure 1: Relative Distance of LP at Node from Root LP. Two series are compared, "Emphasis feasibility" and "Emphasis moving best bound"; the vertical axis runs from 0 to 0.6.]

Using the approach described above, we can use any MIP solver for automatic generation of a priori decompositions that provides tree development strategies that focus on moving the best bound and that give access to the open nodes during the B&B algorithm. Depending on the number of available machines in the grid, we can stop the B&B algorithm as soon as the number of open nodes reaches a specified number n, resulting in n subproblems.

5.3. Submission to the Grid
The following work has been implemented using GAMS/CPLEX (Bixby et al. 1997), but as outlined in previous sections, a very similar implementation is possible with COIN-OR's CBC or Xpress.

The GAMS/CPLEX interface supports the branch-and-cut-and-heuristic facility (BCH) (Bussieck and Meeraus 2007), allowing GAMS to supply user cuts and incumbents to the running branch-and-cut algorithm inside CPLEX using CPLEX callback functions. A minor extension to this facility resulted in the GAMS/CPLEX option dumptree, providing the tightened bounds of discrete variables for each of the subproblems corresponding to the open nodes as soon as their number reaches the specified value n. If we want to further process the subproblems on the GAMS language level, the variables and their tightened bound values need to be provided in the namespace of the original GAMS model and not in the internal namespace of CPLEX (usually x1, x2, ...). When GAMS generates a model to be passed to a solver, a dictionary is created that allows the mapping from the solver variable and constraint space to the GAMS namespace, and vice versa. The GAMS/CPLEX option dumptree stores the tightened bounds of the subproblems for offline processing in separate GDX containers using the dictionary to determine the original GAMS namespace.

After an initial solve of the original problem using GAMS/CPLEX with option dumptree n, at most7 n subproblems, defined by the individual GDX bound containers, are available for submission to the grid. The submission and collection is similar to the first example. Here, the data for different scenarios and subproblems do not come from some external data source but get loaded from the GDX containers.

5.4. Need for Communication Between Jobs
The n submitted jobs run completely independently without communication before completion. Unlike the first example, we are not necessarily interested in optimal solutions of all n subproblems. We just need the best solution of all n subproblems.

4 We assume the reader is familiar with the B&B algorithm; otherwise, refer to textbooks like Nemhauser and Wolsey (1988).
5 Throughout the section we assume a minimization problem.
6 See http://www.gams.com/modlib/libhtml/dice.htm.
7 Before the subproblem gets dumped to a GDX container, CPLEX will solve the LP relaxation of the subproblem. In case the node is integer feasible or infeasible, which would give a trivial subproblem, the dumping of the GDX bound container is skipped, resulting potentially in less than n subproblems.
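Since only the best of the n subproblem solutions matters, the collector reduces to a running minimum over completed jobs. A minimal Python sketch of this master loop follows (our illustration; `solve_subproblem` is a hypothetical stand-in for a GAMS/CPLEX run over one GDX bound container).

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def solve_subproblem(spec):
    # stand-in for solving one subproblem to optimality: here each "spec"
    # is just a list of candidate objective values and we return its minimum
    return min(spec)

subproblem_specs = [[7, 3, 9], [5, 6], [4, 8, 2], [10]]

best = float("inf")
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(solve_subproblem, s) for s in subproblem_specs]
    for fut in as_completed(futures):
        best = min(best, fut.result())   # z_incb = min over collected jobs

# best == 2 for this toy data
```

The next section refines exactly this loop: instead of waiting passively, the master pushes the running minimum back to the workers so that hopeless subproblems can be cut off early.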
If we can determine that a subproblem will not provide the best solution, we can terminate the job before it finishes regularly. Assume the minimum of the incumbents of all (running) jobs is z_incb = min_{i=1,...,n} z_incb^i. Obviously, we can terminate all subproblems for which the best bound is not smaller than z_incb.

The BCH facility can be configured so CPLEX calls out to a user GAMS program whenever a new incumbent is found. In our case, the GAMS program communicates the new incumbent z_incb^i of subproblem i to the master job, which is the process of collecting results from the subproblems. If this new incumbent is better than the current best incumbent z_incb of all subproblems, z_incb is updated and communicated to all other subproblems. The value z_incb is used as a cutoff value instructing the solver to stop processing nodes whose LP relaxation value is larger than z_incb. If the best bound of a subproblem is larger than z_incb, processing of all open nodes will be stopped and the job will terminate immediately.

Because CPLEX allows a running optimization process to read a new option file, the updated cutoff value is communicated to a worker via an option file. The need to process this file could be triggered by the master sending a signal to the worker. This puts more work on the master, which could limit scalability. Instead, we implemented a simple polling scheme that is performed by the worker and solely relies on basic filesystem operations. This type of file-based communication is straightforward to implement on grid systems with a shared file system. Although the Condor system supports communication via a shared file system, requiring this resource will limit the number of available machines significantly (because Condor runs on heterogeneous networks across the world that do not share a file system). However, Condor can mimic the operations of a shared file system using a utility called Condor Chirp. This utility ensures certain named files on the master file system are fetched to appropriate workers, and vice versa. These files then act as signals to communicate information to and from the workers.

In the introduction, we discussed some specifics of the Condor system. Not all machines in the Condor pool at the University of Wisconsin are dedicated machines; some of them are desktop workstations. The Condor system ensures that the interactive user of the workstation does not suffer performance loss from background processes by vacating Condor jobs as soon as a user starts an interactive session. Condor supports checkpointing of jobs, meaning that a job frequently creates checkpoint files that allow restarting from the last checkpoint and hence minimize wasted CPU time in case of a vacated job. Unfortunately, Condor's checkpointing requires the use of special link libraries, which are incompatible with some third-party software (CPLEX in our case). Hence, a vacated GAMS/CPLEX job needs to be started from scratch, which can result in a significant amount of wasted CPU time.

5.5. MIP Strategy and Repartitioning
The branch-and-bound algorithm works toward closing the gap between the incumbent and the best bound. Different variable and node selections (among other strategies like cut generation, use of primal heuristics, etc.) emphasize the improvement of the incumbent or improvement of the best bound. Although a sequential MIP solver needs to balance its computational emphasis for moving the incumbent and the best bound, this is much easier in a parallel setting. In addition to solving the n subproblems for which the MIP solver focuses on moving the best bound, we have one additional job solving the original problem with heavy emphasis on finding good incumbents. GAMS/CPLEX can be easily instructed to place emphasis on improving the incumbent or the best bound by using the CPLEX meta-option mipemphasis 1 (feasibility) or mipemphasis 3 (best bound).

In some harder cases, the partitioning strategy may become inefficient on the grid machine in that most of the partitioned jobs have completed their execution, but a very small number of jobs continue to have a non-zero gap. In this case, it is important to further repartition those jobs using the dumptree option and to add the newly formed jobs to the list of outstanding tasks. These new tasks either augment the pool of work that needs to be processed or replace the job that was running. In some cases, the repartitioning can significantly reduce the overall computational time. When should a repartition be carried out? One trigger for this could simply be time (where a fixed time limit is set for each job), and another could be the number of remaining jobs compared to the size of the available grid. The first is easy to implement, and an example of the use of this dynamic repartitioning is given in Ferris et al. (2009). In the latter case, we could prioritize the jobs for repartitioning based on the current value of the gap. We have not demonstrated this in an application, but an implementation in GAMS would require exactly the same features that we use to write out an incumbent solution.
5.6. Numerical Results
All the techniques described in the previous sections have been implemented for the DICE model in a new model called DICEGRID available from the GAMS model library. Although the DICEGRID model is of educational and reference value, this approach has also been used on a set of very difficult MIP problems. A source of such MIP problems that are publicly available is MIPLIB (Bixby et al. 1992, 1998). The computational experiments were carried out in 2006 on the Condor pool at the University of Wisconsin, on four problems from the latest version of MIPLIB 2003 (Achterberg et al. 2006) that were unsolved at the time: A1C1S1, ROLL3000, SWATH, and TIMTAB2. The following table contains relevant figures.

                                    A1C1S1   ROLL3000        SWATH      TIMTAB2
Optimal solution                  11,503.4     12,890      467.407*   1,096,557
Number of subproblems                1,089        986         1,001        3,320
Maximum CPU time
  of individual job (h)              159.9        1.2            —         153.2
Wall clock time (h)                   18.5       29.7         424.0        171.5
Total CPU time (h)                 3,700.3       50.9       8,135.1      2,744.7
Fraction of wasted CPU time (%)        6.7        0.6          42.8         13.1
CPLEX B&B nodes                  1,921,736    400,034    22,458,649   17,092,215

* Not solved to optimality.

Whereas the problems A1C1S1 and ROLL3000 were solved with no modification of our described approach, the TIMTAB2 problem required some additional help. For TIMTAB2 we generated 3,320 subproblems and solved these jobs over a period of three days using (at times) over 500 processing units. These units included both Linux and Windows machines, some of which had a shared file system and some of which did not. Not only was this problem solved to optimality, but a new solution (of value 1,096,557) was generated. The solution required not only the large computational resources from Condor but also a collection of problem-specific cuts generated by colleagues in Berlin (Liebchen 2006) for these types of problems. It is important to notice that problem-specific expertise, coupled with large amounts of computing resources, facilitated this solution.

The standard scheme failed to solve the SWATH problem. After almost a year of cumulative CPU time, there were still 538 of the 1,001 problems unfinished. Even after repartitioning and adding user-defined cuts, the problem remained unsolved after over 17 years of CPU time, of which 2/3 was wasted, exploring over 1.2 billion nodes in about four months of wall clock time. However, by understanding the problem structure (SWATH is a 20-node generalized traveling salesman problem with supernodes involving additional constraints; see the GAMS model library for the original and improved formulation) and generating four rounds of subtour elimination constraints (resulting in 22 additional cuts), the problem was solved to optimality within seconds. Such experience cautions the applicability of pure brute-force methods consuming large amounts of computational resources and ensures that further work identifying particular structures and problem-specific enhancements is imperative.

5.7. Portability of Models
In August 2007, GAMS Development Corporation and Sun Microsystems teamed up to provide GAMS users access to the commercial grid computing facility at Sun's Network.com. We repeated the experiments with the simplest of the four problems, ROLL3000, in this environment without the need for changing a single line of GAMS code. We decomposed the MIPs into 256 subproblems (rather than 1,000 as for the Condor pool). ROLL3000 finished in less than an hour of wall clock time, consuming less than eight hours of CPU time. The unchanged model for ROLL3000 was also solved on a four-core Sun SPARC Solaris workstation. The scheduling of the 256 jobs was left entirely to the operating system. ROLL3000 was solved in about two hours of wall clock time using about six CPU hours on the four processors. The large difference in computational effort for ROLL3000 between the experiments on the Condor pool and the Sun grid and workstation can be mainly attributed to two facts. In 2006, on the Condor pool we used CPLEX 9, whereas the more recent experiments on the Sun grid and the workstation used CPLEX 10. Second, the individual machines in the Sun grid (and the same holds for the four-core workstation8) are uniformly equipped with high-powered CPUs and sufficient memory.9 The quality of the machines in the Condor pool is diverse and on average significantly worse compared to the other two computing environments.

6. Conclusions
In this paper we have shown a number of ways to harness the power of a computational grid for the solution of optimization problems that are formulated in a modeling language. The paper describes GAMS grid, a lightweight, portable, and powerful set of extensions of GAMS specifically used for managing optimization solution strategies on a grid.

The paper includes a number of expository examples describing the use of these features to implement both parallel algorithms and distributed solution approaches within a number of important application areas.

8 Sun Fire X2200 M2 Server with two 2218 processors (dual-core) with 16 GB of RAM.
9 The Sun grid consists of Sun Fire dual-processor Opteron-based servers with 4 GB of RAM per CPU.
(http://www.gams.com) and in the Online Supple- Censor, Y., S. A. Zenios. 1997. Parallel Optimization: Theory, Algo-
ment to this paper. We believe the simplicity and rithms and Applications. Oxford University Press, New York.
Chen, Q., M. C. Ferris. 2000. FATCOP: A fault tolerant Condor-
generality of the framework lends itself well to a
PVM mixed integer programming solver. SIAM J. Optimization
large number of business and research problems. The 11(4) 1019–1036.
design facilitates the use of all solvers and model types Chen, Q., M. C. Ferris, J. T. Linderoth. 2001. FATCOP 2.0: Advanced
currently available within GAMS on a grid engine. features in an opportunistic mixed integer programming
The framework uses a master-worker computa- solver. Ann. Oper. Res. 103(1–4) 17–32.
Codsi, G., K. R. Pearson. 1988. GEMPACK: General-purpose soft-
tional model that we believe is sufficiently scalable
ware for applied general equilibrium and other economic
and flexible for many optimization algorithms and modellers. Comput. Econom. 1(3) 189–207.
applications. It is already available in current releases Dantzig, G. B., P. Wolfe. 1960. Decomposition principle for linear
of GAMS and demonstrably useful for hard optimiza- programs. Oper. Res. 8(1) 101–111.
tion problems. As particular instantiations of these fea- Dolan, E. D., R. Fourer, J.-P. Goux, T. S. Munson. 2008. Kestrel:
An interface from optimization modeling systems to the NEOS
tures, we have described the use of GAMS grid and
server. INFORMS J. Comput. 20(4) 525–538.
the CPLEX solver in conjunction with grid facilities Eckstein, J. 1994. Parallel branch-and-bound algorithms for general
provided by the Condor resource manager or the Sun mixed integer programming on the CM-5. SIAM J. Optimization
grid engine to solve MIP problems that have evaded 4(4) 794–814.
solution by other means. Eckstein, J., C. A. Phillips, W. E. Hart. 2001. PICO: An object-
oriented framework for parallel branch and bound. Inher-
ently Parallel Algorithms in Feasibility and Optimization and
Acknowledgments
This work is supported in part by Air Force Office of Scientific Research Grant FA9550-07-1-0389 and National Science Foundation Grants DMI-0521953, DMS-0427689, and IIS-0511905.