1993
Dolk, Daniel R.
Elsevier Science Publishers B.V.
files in a DBMS as application-independent, integrable resources (via joins, for example), we can think of models as sharable resources which can be combined in ways unanticipated by their original developers. Although 'models as data' is old hat by now, examining model integration from a database analogy perspective nevertheless yields the following insights:

(1) Database theory (at least relational theory) is not sufficient to build an equivalent theory of models which includes model integration. Although data management philosophy has provided a convenient metaphor for model management, database theory has been as frequently confounding as enlightening. Part of the reason for this is that the focus of model management is on the schema rather than on the relation.

(2) All roads lead to object-oriented environments for implementation of integrated modeling environments. This is not surprising since models are complex data structures requiring complex manipulations. However, object-oriented is not a substitute for a theory which encompasses model manipulation as well as representation.

We will discuss model integration from four different dimensions: organizational, definitional, procedural, and implementation. Initially, we take an organizational perspective to argue that effective strategic planning requires the integration of models developed for specific functional and operational applications. From a technical perspective, we view model integration in two dimensions: definitional and procedural in accordance with the classical dichotomy of programming languages (functional languages excepted). Definitional integration corresponds to schema integration while procedural integration corresponds to process synchronization. Each of these is surveyed in detail. Taking an implementation perspective, we discuss the system requirements of an integrated modeling environment (IME) which supports these technical concepts of model integration, and note the applicability of object-oriented concepts. We summarize by considering the theoretical perspective and suggest that research is needed to develop model integration from a loose federation of database-related and programming language-related concepts into a robust theory of models which unifies model definition and manipulation.

2. Organizational dimension: Strategic modeling

The utility of modeling has all too often been circumspect in organizations because of the technological barriers models present and the subsequent reluctance of management to use them. This is an ongoing battle which is continually being fought by the operations research community and to which model management can make genuine contributions by improving the accessibility and comprehensibility of models.

Models are also underutilized because the organizational perspective from which they are developed is often too limited, that is, at the functional, operational level rather than at the control and strategic levels. Models developed at the operational level tend to stand in isolation and frequently don't provide the necessary information for organization-wide planning. For example, a production model developed without consideration of the organization's accounting model may maximize total output when it makes more sense to minimize unit cost. As one moves up the organizational pyramid, modeling requirements must evolve towards strategic objectives which, in turn, will place greater emphasis upon model aggregation and integration (fig. 1).

Consider a firm which has developed the following independent models (this example has been adapted from [3]):
(1) an econometric marketing model which forecasts demand in terms of sales volume for a product for the next fiscal year;
(2) a discrete event simulation manufacturing model which estimates the required expense to produce enough of the product to meet a specified demand;
(3) a transportation model which determines the minimal cost of distributing the product to customers;
(4) a pricing model which calculates a price for a product given demand volume, and production and distribution expenses; and
(5) a financial model which determines the revenues and net income from sales of the product given demand volume, manufacturing and distribution expenses, and product price.

The relatively narrow view of a model developed at the operational level is often unable to provide, or contribute to, the broader sensitivity analysis demanded by upper management. Suppose management asks the question "what effect will replacing two machines in the production process have on net income?", or alternatively, "what will happen to revenues if demand for our product softens as a result of decreased spending by the Department of Defense?" No single model can provide the desired information. A response to these 'what if' queries requires linking and running several or all of the models. Figure 2
[Fig. 2: diagram not recoverable from the source. Surviving labels: initial price, volume, expense, price, price', "Converge?", price := (price' + price)/2, &FIN.]
shows how these models might be interconnected conceptually by having the outputs of one serve as the inputs to another. Price computation is shown as an iterative process which will require multiple passes through the first four models to achieve convergence.

Another ramification of building models from strictly a functional or operational perspective is that each model is likely to be developed as a standalone tool in a separate software environment using different languages. For example, the marketing model may be developed in SAS, the production model in Simscript, the transportation and pricing models in GAMS, and the financial model in a spreadsheet. These separate software environments with their unique languages further isolate models from one another and restrict their integrability and potential utility. In order to satisfy the sensitivity analyses above, four different modeling environments and languages must somehow be linked. This represents a formidable programming challenge which may very well be prohibitively complex and expensive to implement. The following sections discuss the foundations of an integrated modeling environment (IME) which can overcome this technological barrier to model integration.

3. Definitional dimension: Schema integration

There are two dimensions which must be considered in the process of model integration: definitional (model representation) and procedural (model manipulation). Definitional integration involves the logical linking of similar model representations whereas procedural integration concerns the linking of processes to form operators
[Figure 3 graphic not recoverable from the source. Surviving node labels: &MKT, &MFG, P/a/, U/a/, V/a/, PROD/pe/.]
Fig. 3. Structured model genus graphs and associated schemas for the marketing and manufacturing models [16].
D.R. Dolk, J.E. Kottemann /Model integration 55
which subsequently manipulate these integrated representations. These two dimensions will be considered in turn in the next two sections.

A necessary prerequisite for model integration is that models be cast in some lingua franca so that they may eventually be joined. A large part of model management research has been devoted to the development of formal model representation schemes that facilitate this. Structured modeling [15], logic modeling [26], and graph grammars [22] are three such formalisms. Structured modeling shares a common ancestry with the data modeling approaches underlying database management, particularly the entity-relationship model [8]. Structured modeling goes well beyond entity-relationship, however, and has particular relevance to applications in operations research and management science. Logic modeling originates largely from artificial intelligence concepts and relies (usually) on first order logic, not only for representation of models but for manipulation as well. As a result, the dichotomy between definition and procedure is less of a problem with this approach. Graph grammars provide a graph-based paradigm for model representation which is especially effective for node-arc problems such as network analysis. This approach is also in concert with the trend toward graphical user interfaces which now characterize contemporary operating system environments.

These three approaches to model definition are as much complementary as they are competing formalisms (see [7]). We will initially use structured modeling to demonstrate definitional integration but will discuss contributions to model integration from logic modeling and graph grammars as well. We assume the reader has some familiarity with structured modeling; consult [15,17] for more details.

To illustrate integration at the definitional level, we provide simple examples of the marketing and manufacturing models discussed in Section 2, represent them as structured models, and then integrate them as described in [16]. Figure 3 shows structured model genus graphs and schemas for the two models.

The genus graphs probably provide the most intuitive medium for understanding how to integrate these two models, so we will describe the major steps involved in the integration process from this perspective:
(1) identify places where the graphs can be 'joined';
(2) 'join' the graphs;
(3) modify the resultant graph to maintain structural consistency;
(4) regenerate the associated relational schemata for storing the model's elemental detail (data).

The first step is perhaps the most important one in model integration. Identifying commonalities between models where they can eventually be 'joined' is the problem of variable correspondence as we describe in the section about procedural integration. This involves discerning which components of models are really the same or which components can be made the same by simple transformations. In our simple example, PROD (product) and V (sales volume) appear in both models although V is a calculated quantity in the &MKT model and a simple attribute in the &MFG model.

Variable correspondence can be considerably complex. Assume, for example, that the volume in the &MKT model is expressed in units whereas the volume in the &MFG model is in 100's of units. The integration process must be able to recognize this and make the necessary conversions to the underlying elemental detail to ensure that the integrated model is dimensionally consistent. Structured modeling has no features to support this critical aspect of integration. Typing schemes to support model integration are discussed below.

Assuming that PROD and V are resolved to be the same entities, this suggests a 'join' which substitutes the &MKT genus graph for the PROD → V link in the &MFG genus graph (fig. 4). Once two models have been 'joined' graphically, it's then necessary to check whether the integrated model satisfies the structural properties of a structured model. This requires that we look at the new model schema.

Fig. 4. Structured model genus graph and schema for integrated marketing and manufacturing model.

The result of the graphical 'join' is reflected in the model schema by first concatenating the two schemas, dropping the superfluous PROD paragraph in &MFG and then deciding which of the V paragraphs to retain. In this scenario, V is an output of the marketing model which serves as an input to the &MFG model (see fig. 2), so V as calculated in &MKT is what we need to retain. Therefore we drop the V paragraph in the &MFG module from the integrated schema which now constitutes a valid structured model schema (fig. 4).

Structured modeling was developed partly to provide more information about a model than previous 'black box' representations that show only model inputs and outputs. Structured modeling does not deal with inputs or outputs explicitly since these can be viewed as application-dependent designations. Nevertheless it is often useful for variable correspondence determination to know which variables are inputs and/or outputs. Output variables in structured modeling will usually be designated as either variable attribute (va) or function (f) elements. Model inputs usually will be fixed attribute (a) elements although it's conceivable that primitive entity (pe) elements could also be inputs.

The last step in the integration process is determining the resultant relational schemata from the new schema. Structured model genus graphs form functional dependency graphs which can be translated into relational schemata in third normal form. The relational schemata for the original marketing and manufacturing models and the integrated model are as follows:

(1) &MKT: PROD(prod_id, p, v);
(2) &MFG: PROD(prod_id, u, v, e);
(3) &MKT_MFG: PROD(prod_id, p, u, v, e).

Notice that, in this case, the schemata corresponding to the integrated model could be formed as a view joining prod_id of the original relations (assuming they were named differently, of course).

As we suggested above, model integration introduces the need for typing schemes and inheritance schemes to facilitate variable correspondence. In general, one may need rather extensive knowledge about a variable's type in order to resolve two variables and subsequently integrate their associated models. This has led to some interesting research in typing schemes to support
model integration. Bradley and Clemence [4,5] have developed a concept hierarchy typing calculus which assigns units, dimensions, and concepts to model variables. If two variables are similar in these three attributes, then they can be used to 'join' models, perhaps through some intermediate transformations.

In the domain of logic modeling a similar effort is underway. Quiddity is an approach to typing which is broader in scope than the concept hierarchy but with the same objectives for model integration [1]. By defining the quiddity of variables, their similarity and mergeability can be determined and implemented, if possible.

It is tempting to view model integration as a direct corollary of the relational join. This is a naive approach, however. The appropriate database analogy for definitional integration is not the relational join but rather schema integration. In other words, definitional integration involves 'joining' at the conceptual model level rather than at the relational level. This in turn requires development of typing and inheritance schemes, much like the work currently being done in object-oriented databases [32].

4. Procedural dimension: Process integration

Definitional integration is only one side of the integration coin. Even if we can find robust ways to integrate the logical description of models, the question still remains of how we manipulate this newly created object called an integrated model. As the previous section indicated, we are not dealing with so tidy a world as relational theory
PROCESS INTEGRATED_MODEL
# Declare models.
MODEL &MKT, &MFG, &DIST, &PRICE, &FIN
with its properties of completeness and transitive closure. What then are the counterparts of relational algebra and calculus which apply to model manipulation? This section attempts to lay a groundwork for this issue.

One of the tenets of model definition is that representation and manipulation are separable functions. For example, a model representation should be as independent as possible from any solver(s) which eventually may act upon it. This is vital to the ultimate comprehensibility of models. In the optimization world, model representation has traditionally been tightly bound to the data structures required by solution algorithm software. This has resulted in models remaining relatively inaccessible to the decision makers whom they were intended to benefit.

The separation of the modeling world into representations and solvers has significant ramifications for model integration. If we connect two or more models at the logical level, we need to determine the corresponding action at the solver level. For example, consider a situation where we have separate transportation models for eastern and western regions and we want to integrate them into a national transportation model. In this case, which is one primarily of model aggregation or homogeneous integration, the solver will be exactly the same for the national model as for each of the regional models. Only the logical schemas have to be integrated.

Our example from Section 2, on the other hand, requires a more complex approach. In this case, the notion of using the same solver for the integrated model can be rejected out of hand because the models are fundamentally of such different types. What makes more sense in multi-paradigmatic, or heterogeneous integration, is to concatenate each of the model's solvers in roughly the same way as their schemas in order to derive an integrated solver. This solver is, in fact, nothing more than a process which controls the individual processes corresponding to each model's solver. Figure 5 indicates how this might be accomplished in a hypothetical language which we can think of as a model manipulation language (MML). Our assumption here is that we have structured model schemas for each of the five models (&MKT, &MFG, &TRANSP, &PRICE, and &FIN respectively), and a corresponding library of solvers which can be invoked via a SOLVE command. For the time being, we will also assume that the &MKT and &MFG models are more complex than shown in Section 3, although the variable correspondence will remain the same.

There are several aspects of this MML process that bear mentioning:
(1) models are the basic objects being manipulated (see footnote 2);
(2) variable correspondence is handled explicitly, for example the conversion of VOLUME in &MKT from units to 100's of units in &MFG. This process can be done automatically with an appropriate typing scheme [5];
(3) SOLVE executes a process which solves a model, for example, the statement "SOLVE &MKT USING SAS" when executed, would invoke the SAS program;
(4) SOLVE implies an underlying transformation which converts the elemental detail tables (the model's data) to the appropriate data structures for the specified solver. For example, solving &PRICE requires that elemental detail tables be converted to GAMS format before the GAMS program is executed;
(5) the order of processes is important. &MKT must be executed before &MFG since &MFG requires as input the VOLUME output from &MKT;
(6) processes may be run in parallel. The "SOLVE CONCURRENTLY &PRICE ... &TRANSP" command is meant to indicate that these two models could be solved in parallel, perhaps in two separate windows.

The MML in fig. 5 is a simplified version of a model integration control language (MICL) proposed by Kottemann and Dolk [24]. Besides variable correspondence and sequentiality, the MICL supports model (process) synchronization as well. Synchronization occurs when two concurrent processes must exchange variables during their respective executions. For example, if the pricing model were geographically sensitive and the transportation model were price sensitive, it may be necessary for the &PRICE and &TRANSP

Footnote 2: There are a number of manipulations which can be performed on models besides solve, e.g., evaluate, retrieve, modify, create, etc. For the purposes of this discussion, however, we will restrict the scope to the solve operation only. Extension of the concepts to other operations is straightforward.
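The notion of an integrated solver as a process that controls each model's solver can be illustrated with a minimal Python sketch. This is not the MML of fig. 5. Every function name, coefficient, and the markup rule below is a hypothetical placeholder standing in for the real SAS, Simscript, GAMS, and spreadsheet solvers; only the execution order, the units-to-hundreds conversion, and the price averaging step (price := (price' + price)/2, as labelled in fig. 2) are taken from the text.

```python
"""Sketch of an integrated solver: a controlling process that runs each
model's solver in dependency order. All solver bodies and coefficients
are hypothetical placeholders, not taken from the paper."""


def solve_mkt(price):
    # Marketing model: forecast sales volume (in units) at a given price.
    return max(1.0, 50_000.0 - 400.0 * price)


def solve_mfg(volume_hundreds):
    # Manufacturing model: production expense; expects volume in 100's of units.
    return 20_000.0 + 150.0 * volume_hundreds


def solve_transp(volume_hundreds):
    # Transportation model: distribution expense (assumed same scale as &MFG).
    return 5_000.0 + 40.0 * volume_hundreds


def solve_price(volume_units, m_expense, d_expense, margin=0.25):
    # Pricing model: unit price that covers expenses plus an assumed markup.
    return (1.0 + margin) * (m_expense + d_expense) / volume_units


def solve_fin(volume_units, m_expense, d_expense, price):
    # Financial model: revenue and net income.
    revenue = price * volume_units
    return revenue, revenue - m_expense - d_expense


def run_volume_and_expenses(price):
    volume_units = solve_mkt(price)          # &MKT must run before &MFG
    volume_hundreds = volume_units / 100.0   # variable correspondence: units -> 100's
    m_expense = solve_mfg(volume_hundreds)
    d_expense = solve_transp(volume_hundreds)
    return volume_units, m_expense, d_expense


def integrated_solver(initial_price, tol=1e-6, max_passes=100):
    price = initial_price
    for _ in range(max_passes):
        volume_units, m_expense, d_expense = run_volume_and_expenses(price)
        new_price = solve_price(volume_units, m_expense, d_expense)
        converged = abs(new_price - price) < tol
        price = (new_price + price) / 2.0    # averaging step from fig. 2
        if converged:
            break
    else:
        raise RuntimeError("price did not converge")
    # Final pass at the converged price, then the financial model.
    volume_units, m_expense, d_expense = run_volume_and_expenses(price)
    revenue, net_income = solve_fin(volume_units, m_expense, d_expense, price)
    return {
        "price": price,
        "volume_units": volume_units,
        "m_expense": m_expense,
        "d_expense": d_expense,
        "revenue": revenue,
        "net_income": net_income,
    }


if __name__ == "__main__":
    print(integrated_solver(initial_price=10.0))
```

A 'what if' query of the kind raised in Section 2 would then amount to changing one placeholder solver (for example, the manufacturing expense function) and rerunning `integrated_solver`. The sketch runs everything sequentially; concurrent execution and synchronization are left out.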
appealing prospect. In this respect, database analogies seem apropos, with model definition as the counterpart of data definition and model integration the equivalent of data manipulation. As discussed before, however, direct database analogies can be misleading. We review briefly the requirements which model integration implies for an IME and suggest possible blueprints for building such a system.

One way of looking at IME requirements is to determine the extent to which model integration can be automated. For example, if we want to integrate two or more model schemas, what support can an IME provide? A necessary condition is a common model definition formalism and associated language such as structured modeling and SML [19] in which schemas over a wide class of models can be created and linked. An associated requirement is that transformation facilities be available to convert other model representations to this common definition formalism. This would provide interfaces to other external modeling systems such as AMPL [14], for example, and would allow modelers familiar with these systems to work comfortably within the IME.

Even with a common internal model representation, it is unreasonable to expect a system to do the entire process of model schema integration without some human intervention. Some steps can be reasonably achieved automatically, however. For example, identification of synonyms (variables with the same semantics) and homonyms (variables with the same names but different semantics) is possible if there is a sufficiently powerful variable typing scheme such as concept hierarchy or quiddity in effect. The act of 'joining' models can be done as well, both at the graphical and schema levels. In the case of structured modeling, an IME could also inform the modeler of schema errors and inconsistencies existing in the 'joined' schema. Once a semantically and syntactically correct integrated schema was developed, the IME could then generate the new relational schemata and offer the modeler the choice of creating these new relations explicitly or building views from existing relations.

From a process integration perspective, an IME supporting 'automatic' integration would facilitate the conversion of a process diagram such as fig. 2 into the integrated solver of fig. 5. The degree to which this can be done fully automatically depends on the complexity of the model integration, particularly with respect to the difficulty of the variable correspondence involved and the degree of process synchronization required. Muhanna and Pick [29] have implemented such a system under a simplified set of assumptions which minimizes these difficulties. Their SYMM system supports a graphical interface for representing the model integration (similar to fig. 2), but there is no associated model manipulation or control language. In the general case, as we've
Table 1
IME requirements for supporting model integration and selected references

IME requirement | Relevant research
Uniform internal model definition scheme capable of representing many classes of models. | Geoffrion [15,17], Jones [22], Lee and Krishnan [26]
Conversion of external model definition schemes into internal scheme. | Maturana [28], Bhargava and Kimbrough [2], Chari and Krishnan [7]
Robust typing and inheritance at both the variable and model level. | Bradley and Clemence [4], Bhargava et al. [1], Liang [25]
Model manipulation language based on message passing to support solver integration. | Muhanna and Pick [29], Kottemann and Dolk [24]
Model solution libraries with transformation routines for conversion of internal data structures to solver data structures. | Eck et al. [13], Maturana [28], Ramirez et al. [30]
Graphical user interfaces and views for supporting model definition and integration. | Jones [23], Muhanna and Pick [29], Ma et al. [27], Greenberg and Murphy [20]
DBMS tools for model management. | Dolk [11], Desai [10]
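The typing requirement listed in Table 1 can be made concrete with a small Python sketch of synonym and homonym detection. It is only a loose illustration under stated assumptions: the `TypedVariable` record, the `classify` rule, and the example variables are hypothetical, and the sketch is far simpler than the concept hierarchy calculus of Bradley and Clemence or the quiddity approach. It borrows only the idea that a variable carries a concept, a dimension, and units, and reuses the units versus 100's-of-units example for sales volume.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedVariable:
    model: str         # owning model, e.g. "&MKT"
    name: str          # variable name as written in the schema
    concept: str       # what the variable means (assumed free-text label)
    dimension: str     # physical or economic dimension
    unit_scale: float  # base units represented by one unit of this variable


def classify(a: TypedVariable, b: TypedVariable) -> str:
    """Return 'synonym', 'homonym', or 'unrelated' for two variables."""
    if a.concept == b.concept and a.dimension == b.dimension:
        return "synonym"      # same semantics, whatever the names
    if a.name == b.name:
        return "homonym"      # same name, different semantics
    return "unrelated"


def conversion_factor(source: TypedVariable, target: TypedVariable) -> float:
    """Multiplier that converts a value of `source` into the units of `target`."""
    if classify(source, target) != "synonym":
        raise ValueError("variables do not correspond; no conversion defined")
    return source.unit_scale / target.unit_scale


# Hypothetical typing of the volume variable in the two models.
mkt_v = TypedVariable("&MKT", "V", "sales volume", "quantity", unit_scale=1.0)
mfg_v = TypedVariable("&MFG", "V", "sales volume", "quantity", unit_scale=100.0)

# Hypothetical homonym: "P" assumed to mean price in one model and
# reorder point in an invented inventory model.
mkt_p = TypedVariable("&MKT", "P", "price", "currency per unit", unit_scale=1.0)
inv_p = TypedVariable("&INV", "P", "reorder point", "quantity", unit_scale=1.0)

if __name__ == "__main__":
    print(classify(mkt_v, mfg_v), conversion_factor(mkt_v, mfg_v))
    print(classify(mkt_p, inv_p))
```

Exact string matching on `concept` is the weakest part of this sketch; a real typing scheme would compare positions in a hierarchy of concepts rather than labels.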
shown in Section 4, some form of model manipulation language will be necessary, and useful, for specifying this form of model integration.

At the tool level, IMEs clearly require graphical user interfaces for specifying and integrating models, whether at the schema or the process level. Jones [23], for example, has developed a graph-based modeling system wherein model representation and integration are done entirely at a graphical level. Again, however, the user is responsible for ensuring proper variable correspondence across models. The DBMS is also a vital tool needed by an IME for handling the complex data manipulation that characterizes large scale modeling and model integration. Finally, we should note that improvements in operating systems over the years may very well provide some of the model integration features we've been discussing. For example, the Mach version of Unix, which is the host operating system for the NeXT computer, provides advanced process communication capabilities which could be adapted as the basis for a message passing MML.

Table 1 summarizes some of the major requirements for an IME which result from model integration. It's interesting to note that most of the research has been directed towards model definition with only tentative forays into the model manipulation area. One anticipated benefit from thinking in a model integration context is that the scope of model management research will broaden to include this dimension.

Another reason that model manipulation has been largely ignored can be traced to the 'object-oriented' phenomenon. Numerous authors have recognized the applicability of this design methodology to model management and the associated benefits of models as objects, solvers bound to these models, and inheritance hierarchies. This has probably done as much harm as good in the advancement of model management research. Object-oriented approaches are too often used as an implementation panacea for sweeping difficult conceptual and theoretical problems under the rug. Model integration is one such problem.

Having said this, it will perhaps appear contradictory to now claim that object-oriented environments are promising implementation vehicles for IMEs. Nevertheless, the object-oriented paradigm has been shown to be feasible and beneficial for building IMEs. For example, Dempster and Ireland [9] have implemented a debt management system using the frame-based KEE™ environment and Desai [10] has proposed implementing structured modeling representations in an object-oriented DBMS. Not surprisingly, the need for typing/inheritance schemes, message passing, and process coordination discussed in the previous sections leads us directly into the object-oriented camp. We briefly describe our own proposal for an object-oriented IME which we call Communicating Structured Models (CSM).

CSM is based upon structured modeling as the internal definition medium, concept hierarchies as the variable and model typing scheme, and CSP as the process integration formalism. The basic idea is to develop an MML similar in structure to the hypothetical example in fig. 5, which will exist as a shell around SML. CSML (Communicating SML) will allow users to integrate models either schematically in the homogeneous case through graphical interfaces (similar to CASE tools) or procedurally in the multi-paradigmatic case through an appropriate language syntax. CSML must support at least the following features:
(a) the basic structured programming constructs of sequence, selection, and iteration;
(b) demons;
(c) embedded SML statements for model definition;
(d) parallel execution of processes;
(e) transformation operators to solver data structures;
(f) embedded SQL statements for data manipulation.

In short, CSML would require many of the characteristics of a discrete event simulation programming language such as Simscript™ but with hooks to model schemas, solvers, and relational data as well.

Implementation of CSML would itself constitute a significant software integration effort, undoubtedly requiring some existing object-oriented environment as a foundation. This only reinforces the link between IMEs and object-oriented concepts. However, we reemphasize our view that object-oriented is primarily an implementation choice for building modeling environments rather than a substitute for model theory. Although we have taken pains to describe the challenges of model integration independent of any particular
ence on System Sciences III, IEEE Computer Society (1990) 474-483.
[14] R. Fourer, D.M. Gay, and B.W. Kernighan, A modeling language for mathematical programming, Management Science 36, 5 (May 1990) 519-554.
[15] A.M. Geoffrion, An introduction to structured modeling, Management Science 33, 5 (May 1987) 547-588.
[16] A.M. Geoffrion, Reusing structured models via model integration, Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences, IEEE Computer Society (1989) 601-611.
[17] A.M. Geoffrion, The formal aspects of structured modeling, Operations Research 37, 1 (January-February 1989) 30-51.
[18] A.M. Geoffrion, Integrated modeling systems, Computer Science in Economics and Management 2 (1989) 3-15.
[19] A.M. Geoffrion, SML: A model definition language for structured modeling, Western Management Science Institute, UCLA, Los Angeles, CA (November 1989).
[20] H.J. Greenberg and F.H. Murphy, Views of mathematical programming models and their instances, University of Colorado at Denver, Denver, CO (May 1991).
[21] C.A.R. Hoare, Communicating Sequential Processes (Prentice-Hall, Englewood Cliffs, NJ, 1985).
[22] C.V. Jones, An introduction to graph-based modeling systems, Part 1: Overview, ORSA Journal of Computing 2, 2 (1990) 136-151.
[23] C.V. Jones, An integrated modeling environment based on attributed graphs and graph-grammars, Forthcoming in Decision Support Systems.
[24] J.E. Kottemann and D.R. Dolk, Process-oriented model integration, Proceedings of the Twenty-First Hawaii International Conference on System Sciences III (IEEE Computer Society Press, 1988).
[25] T-P. Liang, Analogical reasoning and case-based learning in model management systems, Forthcoming in Decision Support Systems.
[26] R.M. Lee and R. Krishnan, Logic as an integrated modeling framework, Computer Science in Economics and Management 2 (1989).
[27] P. Ma, F.H. Murphy, and E.A. Stohr, Design of a graphics interface for linear programming, Communications of the ACM 32, 8 (1989) 996-1012.
[28] S. Maturana, Integration of a mathematical programming solver into a modeling environment, Anderson Graduate School of Management, UCLA, Los Angeles, CA (October 1988).
[29] W.A. Muhanna and R.A. Pick, Composite models in SYMM, Proceedings of the Twenty-First Hawaii International Conference on System Sciences III (IEEE Computer Society Press, 1988) 418-427.
[30] R.G. Ramirez, C. Ching, and R.D. St. Louis, Independence and mappings in model-based decision support systems, Forthcoming in Decision Support Systems.
[31] E. Shapiro, ed., Concurrent Prolog Collected Papers, Volumes 1 and 2 (The MIT Press, 1987).
[32] M. Stonebraker and G. Kemnitz, The POSTGRES next generation database management system, Communications of the ACM 34, 10 (October 1991) 78-92.