<html>
<head>
<title>
DL94: Digital Library as a Foundation for Decision Support Systems 
</title>
</head>

<body>

<!--#include virtual="/DL94/header.ihtml" -->

<h1>Digital Library as a Foundation for Decision Support Systems </h1> 
<p>
<p>
Sulin Ba[1],  Aimo Hinkkanen[2], and Andrew B. Whinston[1]<p>
<i>
[1] Department of Management Science and Information Systems, Graduate School of
Business, CBA 5.202, The University of Texas at Austin, Austin, TX 78712, USA,
{sulin, abw}@bongo.cc.utexas.edu<p>
[2] Department of Mathematics, University of Illinois at
Urbana-Champaign, Urbana, IL  61801,  USA, aimo@symcom.math.uiuc.edu</i><p>
<p>
<p>
<p>
<p><b>Abstract</b><p>
Organizations often face complicated decision making problems.  As corporate
knowledge becomes increasingly dispersed, there is a need to analyze
organization wide issues that incorporate a wide range of knowledge
representations and data types, an analysis that computers can support.<p>
The growth of distributed computing and the emergence of research on digital
libraries provide new insights into decision support systems (DSS) research.  In
this paper, we look at the digital library from a business point of view:  what
services provided by a digital library would be particularly useful to
industries?  The digital library in this context is a repository of executable
documents whose parts are scattered across different platforms on the network.
We propose to use this digital library to build an enterprise wide problem
solving system based on executable documents that contain knowledge represented
in a mathematical form, given that a considerable amount of company information
is mathematical.  The system is designed to answer "what if" and "what to do"
questions and to provide explanations for its answers.<p>
<p><b>Keywords</b>: Decision support systems, digital library, document composition,
organizational decision support.<p>
<p>
<p>
<p>
<b>1.  Introduction</b><p>
During the past 25 years, great progress has been made in research on and
commercial applications of decision support systems (DSS).  Conceived
originally as the application of computing technology to support decision
making, DSS research focused on the implementation of tools from operations
research.  By allowing end users to state business problems in a higher level
language and letting software translate requests, build suitable models, access
the required databases, integrate and execute the models, and finally return
answers, such systems let decisions be made more effectively.<p>
While there are still many challenges and opportunities within this paradigm,
significant changes have taken place in the environment surrounding DSS, so that
radically new approaches are required.  These changes are driven by end-user
demand and by advances in technology.  Instead of focusing on highly structured
aspects of company operations that can be modeled using operations research
tools, there is a need to analyze organization wide issues that incorporate a
wide range of knowledge representations and data types.  Companies face complex
decision problems that must be resolved collectively by several individuals
and that involve multiple phases, including specification of the problem,
discovery of and support for alternatives, and eventual solution and
implementation.  There is, in effect, a complex process underlying decision
making that can be supported by computers.<p>
The growth of distributed computing and the emergence of research on the
"digital library" provide new insights into DSS research.  The main purpose of
this project is to build an enterprise wide problem solving system based on
executable documents that contain knowledge represented in a mathematical form.
The system is designed to answer "what if" and "what to do" questions and to
provide explanations for its answers.  The enormous amount of scattered
information, whatever its form, will be stored in a digital library that serves
as a repository of executable documents.  When users pose a query, the system
extracts all, and only, the relevant documents from the library, executes them,
and returns an answer based on a composite model assembled from different
pieces of knowledge.<p>
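This query-answering loop can be sketched in code.  The following is a minimal illustration, not the paper's implementation: the fragment names, the dictionary-based "library," and the toy formulas are all assumptions made for the example.<p>

```python
def answer_query(query_vars, library, facts):
    """Select all, and only, the fragments a query needs, then execute
    the composite model they form.

    library maps each derivable variable to (input_vars, function);
    facts holds the raw data already known."""
    # Phase 1: walk the dependency structure to collect the relevant fragments.
    needed, stack = set(), list(query_vars)
    while stack:
        v = stack.pop()
        if v in needed or v in facts:
            continue
        needed.add(v)
        stack.extend(library[v][0])
    # Phase 2: execute the assembled model (evaluation only ever
    # touches fragments collected in `needed`).
    env = dict(facts)
    def evaluate(v):
        if v not in env:
            inputs, f = library[v]
            env[v] = f(*[evaluate(u) for u in inputs])
        return env[v]
    return {v: evaluate(v) for v in query_vars}

# A toy "digital library" of two executable fragments:
library = {
    "revenue": (("units", "price"), lambda u, p: u * p),
    "profit":  (("revenue", "cost"), lambda r, c: r - c),
}
answer = answer_query({"profit"},
                      library,
                      {"units": 100, "price": 3.0, "cost": 120.0})
```

A query for "profit" pulls in only the two fragments above and none that might also sit in the library, mirroring the "all, and only, the relevant information" requirement.<p>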
Since the mid-1980s, electronic documents have become more powerful and more
widely available each year.  The term electronic document has come to encompass
a wide variety of knowledge forms including text (reference volumes, books,
journals, newspapers, etc.), illustrations, tables, mathematical equations,
scientific data, scanned images, video, voice, hypertext links, and animation
[2].  In the meantime, digital networks and the number of their users are
growing exponentially.  The massive information sources available on the
network (i.e., the Internet) have formed the basic ingredient of a digital
library.  With the National Science Foundation's initiative [13], research on
digital libraries may yield revolutionary results as to how knowledge is stored
and disseminated.<p>
Organizational knowledge is very complex in terms of the level of detail and
the level of analysis.  It is of heterogeneous types that can be dynamic or
static, qualitative or quantitative.  For example, while the knowledge base for
a company's production and inventory distribution system is largely
quantitative, some functional relationships in the same company could be
represented in a qualitative form: "If you increase partnership, productivity
will increase."  There could also be statements expressed in logical form, such
as "Don't cut off service to an elderly customer before x months."
Organizational knowledge is becoming increasingly distributed and
heterogeneous, which makes the digital library concept valid in an
organizational setting.  In our earlier work [8], we developed the idea of
using a compositional modeling approach to model organizations that contain
heterogeneous knowledge (functional relationships, company specific knowledge,
internal empirical data stored in databases, etc.).  That system aims to
predict or explain the performance of organizations, answering "what if" and
"what to do" questions in particular.  Taking the same idea, we can develop an
organization wide problem solving system that answers user queries by
assembling pieces of knowledge, which we call model fragments, stored in our
digital library.<p>
The digital library in our context can be thought of as a repository of
executable documents whose parts are scattered on different platforms across
the network.  The problem solving process can then be tied to identifying
parts of documents as mathematical formulas (or groups of formulas) that can be
executed.  Each part is a model fragment that can be composed with others to
form an executable compositional model.  When users retrieve information from
the library in response to a particular query, the system should be able to
isolate the relevant model fragments, execute them, and return a sufficient
answer to the user.<p>
However, the digital library concept is far more complex than simply a set of
existing documents that are interconnected and digitized.  The capture of data
of all forms, the categorization and organization of electronic information in
a variety of formats, the browsing, searching, filtering, abstracting, and
summarizing techniques, the combination of large volumes of data, and the
utilization of distributed databases are all in the realm of digital library
research.  Our research builds on all of these but goes beyond this scope.  We
propose to use the digital library for enterprise modeling, that is, to look at
the digital library from a business point of view:  what services provided by a
digital library would be particularly useful to industries?  How could they
enhance competitiveness and time-to-market [6]?  What would happen to a
company's profit if it invests in a new product?  The main reason for choosing
mathematical documents as our first step in developing such a system is that a
considerable amount of company information is mathematical: for example,
spreadsheets in the accounting department and empirical operational data.
Logical data and qualitative relationships are likewise in the realm of
mathematics.<p>
We propose to use Mathematica to represent electronic documents.  Mathematica
is a completely integrated software package capable of numeric, symbolic, and a
wide range of graphical computation.  It offers a flexible structure for a
great deal of symbolic manipulation and numerical calculation.  In addition,
many of the built-in functions in Mathematica can be used as building blocks to
create users' own customized programs, routines, or applications.  The notebook
interface of Mathematica allows users to mix unlimited amounts of text,
graphics, equations, and even sound into an organized, live, presentation
quality document.  This document can be saved, edited, and read by any other
computer having a notebook version of Mathematica.  In our approach, documents,
as well as formulas inside documents, are represented in the Mathematica
language.<p>
Some work has been done in developing mathematical software to integrate
numerical computation, mathematical typesetting, computer algebra, and
"technical electronic mail" (mail that contains formatted mathematical
expressions) [1].  The CaminoReal system developed at Xerox PARC can handle
direct manipulation of mathematical expressions, whether as part of technical
documents or as inputs and outputs of computational engines.  There are two
unique features in CaminoReal: first, the tight coupling of its computation
facilities with a sophisticated document system, which opens interesting
opportunities for computed and interactive documents; second, its access to
computational "algebra servers" on the local network.  The framework we are
proposing is a step further in the sense that the compositional method can
greatly enhance the system's flexibility in answering user queries and
manipulating documents in a distributed fashion.<p>
In this paper, we intend to point out how a digital library could be developed
and utilized as a foundation for enterprise wide decision support systems.  In
section 2, we discuss some major issues involved in organizing the documents
and maintaining consistency in the library.  Algorithm design issues, such as
data representation and time scales, are discussed in section 3.  Section 4
focuses on the qualitative optimization aspect, which is important for a
decision support system operating in a heterogeneous environment.  Section 5
concludes the paper.<p>
<p>
<p>
<b>2.  Documents in Digital Library</b><p>
In our framework, we mainly focus on mathematical documents, that is, documents
that contain mathematical formulas and equations, and we represent them using
Mathematica.  These documents will be interconnected in such a way that the
ones relevant to a particular query can be pulled out to form a composite model
sufficient to answer the query.  Some specific issues in organizing the
documents are discussed in the following subsections.<p>
<p>
<b>2.1.  Heterogeneity of Documents</b><p>
An important issue that needs to be addressed is how to deal with the
heterogeneity of documents.  In most cases, documents have different formats,
which makes the interchange of documents a difficult task.  To make matters
worse, the language that represents the knowledge can differ greatly from
document to document.  Although we are dealing only with mathematical
documents, the relationships inside these documents can be very diverse: they
could be in logical, qualitative, or quantitative form.  For example, the
inference rules in an existing expert system may contain a series of Horn
clauses, whereas a spreadsheet is represented by cells containing numbers
and/or formulas.  Besides tables, formulas, and equations, there is always
discussion text, such as the origin of the data, the analytical procedure, or
the method of data collection.  With the development of multimedia
representation, documents could include voice and animation as well.  All of
these increase the heterogeneity of the documents in our digital library.  The
question is how to separate these different pieces in such a way that they can
later be assembled to answer queries in a suitable context.<p>
<p><b>2.2. Composition of Documents</b><p>
One important question to be answered in this framework is, given a query, how
to decide which fragments to use and how to put them together.  In the modeling
process, it is crucial to focus on the relevant aspects of the problem of
interest, that is, to include all the relevant objects and constraints, exclude
irrelevant ones, and ignore unnecessary details.<p>
The compositional modeling method we are proposing (for a discussion and an
example, see [5]) contains three levels.  First, some model fragments may be
combined into components before any query takes place at all, since these
fragments often appear together as a meaningful unit.  They can therefore be
combined into reusable building blocks and stored in the library independently
of query execution.  That is, these model fragments are simply grouped
together; the actual combination takes place at the same time as the execution
of the whole model for a particular query.<p>
At the second level, after a query is issued, we need to find appropriate
model fragments and/or components that are sufficient and consistent to model
the situation of interest.  A challenging issue is how documents from
heterogeneous sources can be found; some distributed index structures might be
needed to complete this task.  For each query, one model will often suffice.
However, multiple models may appear suitable or possible.  In this case, some
heuristics are needed to decide which model to take.  For example, we could
choose the one that has the smallest number of fragments/components.<p>
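The fragment-count heuristic just mentioned can be sketched in a few lines.  This is a minimal illustration under our own assumptions (candidate models represented as sets of fragment names; ties broken deterministically by name), not a prescription from the paper.<p>

```python
def choose_model(candidates):
    """Among candidate composite models (each a set of fragment names),
    pick the one with the fewest fragments; break ties by sorted names
    so the choice is deterministic."""
    return min(candidates, key=lambda m: (len(m), sorted(m)))

# Three hypothetical candidate models for the same query:
models = [{"f1", "f2", "f3"}, {"f4", "f5"}, {"f6", "f7"}]
best = choose_model(models)   # the first two-fragment model
```

Richer heuristics (e.g., preferring fragments with fewer governing assumptions) would slot in by extending the key function.<p>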
The execution of the model is the third level in combining a set of documents.
This is a rather complicated process, since new types of combination may be
needed.  For example, the output of one fragment may be the input of another.
Some questions arise here:  Should the execution be done sequentially or in
parallel?  How is the convergence of the execution ensured if it is done in
parallel?  These are challenging problems that have to be worked out.<p>
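One standard way to frame the parallel case is as a fixed-point iteration: every fragment recomputes its output from the previous iterate, and execution stops when nothing changes.  The sketch below is an illustration of that idea only; the two mutually dependent fragments are invented for the example, and convergence is guaranteed here only because they form a contraction.<p>

```python
def execute_parallel(fragments, state, tol=1e-9, max_iter=1000):
    """Jacobi-style parallel execution: all fragments fire against the
    previous state; stop when no output moves more than tol.
    Convergence is not guaranteed in general, hence the iteration cap."""
    for _ in range(max_iter):
        new = {v: f(state) for v, f in fragments.items()}
        merged = {**state, **new}
        if all(abs(merged[v] - state[v]) <= tol for v in fragments):
            return merged
        state = merged
    raise RuntimeError("parallel execution did not converge")

# Two mutually dependent toy fragments (a contraction, so it converges):
fragments = {
    "x": lambda s: 0.5 * s["y"] + 1.0,
    "y": lambda s: 0.5 * s["x"],
}
result = execute_parallel(fragments, {"x": 0.0, "y": 0.0})
```

Sequential execution in dependency order avoids the convergence question entirely but is only possible when the fragment dependencies are acyclic.<p>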
<p><b>2.3. Maintaining Model Consistency</b><p>
Some research on digital libraries has been concerned with the integration of
documents in different formats that are created using different
hardware/software, which is also one of our concerns.  However, since we are
linking and executing documents, we have to be concerned with the integration
of their contents as well.  With all the model fragments in the library, we
need to find a way to isolate all the coherent and adequate composite models
for a particular query.  Coherent means that all the assumptions and user-posed
constraints are satisfied, whereas adequate means that the composite model is
able to answer the query, taking into account all the assumptions that need to
be included in the model.<p>
Since there is an enormous amount of information in the library that is
scattered across the network and constantly updated, it is inevitable that
inconsistencies exist, which will result in contradictory models.  Therefore,
it is crucial for the system to maintain consistency within each composite
model.  The underlying rationale is that each model fragment has its own
governing assumptions, that is, the context, or set of conditions, in which it
holds.  [This builds on the idea of assumption-based truth maintenance systems,
developed by de Kleer in the artificial intelligence area (see [4]).  However,
we will not discuss the technical details in this paper.]  When choosing model
fragments to be combined, these assumptions or constraints have to be
satisfied, i.e., the chosen model fragments have to be valid in that context.
We cannot combine model fragments that are inconsistent with each other.  For
example, when forecasting sales, we have to take the season into account: for a
summer forecast, we need to use the data and relationships that hold in summer,
while another set of data and relationships is needed for a winter forecast.
The idea is that the model fragments to be combined have to be carefully chosen
so that they are consistent with each other and satisfy the modeling
assumptions and constraints.<p>
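The season example can be made concrete with a small filter over governing assumptions.  This is a hedged sketch: representing each fragment's context as a simple key-value dictionary is our assumption for the example, far simpler than a full assumption-based truth maintenance system.<p>

```python
# Toy fragments, each carrying the assumptions under which it holds:
fragments = [
    {"name": "summer_sales", "assumes": {"season": "summer"}},
    {"name": "winter_sales", "assumes": {"season": "winter"}},
    {"name": "base_costs",   "assumes": {}},   # holds in any context
]

def select_coherent(fragments, context):
    """Keep only fragments whose governing assumptions all agree with
    the query context, so no contradictory pair can enter the model."""
    return [f["name"] for f in fragments
            if all(context.get(k) == v for k, v in f["assumes"].items())]

summer_model = select_coherent(fragments, {"season": "summer"})
```

For a summer forecast this admits the summer relationships and the context-free ones, and excludes the winter fragment outright.<p>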
While the system needs to maintain consistency when choosing model fragments,
it also needs to give explanations for its answers: for each answer it returns,
the problem solving system must take responsibility for its conclusions by
providing rational explanations of how it reached them.  For example, it is not
adequate for the system to simply tell an engineer that his new design for an
airplane does not work.  Instead, if the system points out that no material
will stand the projected stresses imposed by the design, the engineer will have
a way of going back and modifying the design.  In other words, an enterprise
wide decision support system must have the capability of tracing which
assumptions or constraints lead to a conclusion.<p>
<p><b>2.4.  "Cataloging" of Documents</b><p>
Traditional libraries use catalogs to organize their documents.  Each document
is assigned a classification number according to which documents are organized
and located.  These classification systems have an internal structure that
reveals the relationships between different categories and/or documents.  For
example, in the Library of Congress classification system, each category has a
classification number assigned to it, and categories are arranged in a
hierarchical order by attaching more digits to a category than to its super
category.  With the massive number of documents in our digital library, we also
need some sort of "catalog" to describe the relationships between documents.  A
graph/tree structure may be needed to show the interrelationships and
dependencies among different pieces of documents (model fragments), which also
helps decide how the documents should be combined.<p>
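A dependency graph of this kind directly yields a combination order via topological sorting.  The sketch below uses Python's standard library for illustration; the fragment names and their dependencies are invented for the example.<p>

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical "catalog": each fragment maps to the fragments whose
# outputs it consumes (its predecessors in the dependency graph).
deps = {
    "profit":  {"revenue", "cost"},
    "revenue": {"demand"},
    "cost":    {"demand"},
    "demand":  set(),
}
# static_order() yields an execution order in which every fragment
# appears after all of its dependencies.
order = list(TopologicalSorter(deps).static_order())
```

The same structure also answers the combination question from section 2.2: the subgraph reachable from a query's fragments is exactly the set of documents that must be pulled out.<p>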
<p><b>2.5.  Incorporating SGML</b><p>
One main strategy emerging for making documents computable across applications
and platforms is tagging languages.  So far, the most widely used tagging
language is the Standard Generalized Markup Language (SGML), a document
encoding mechanism designed to enable the "markup" of the information content
of documents.  A basic design goal of SGML was to ensure that documents encoded
according to its provisions are transportable from one hardware/software
environment to another without loss of information.  The structure of documents
can therefore be understood or interpreted by other software applications that
have SGML data interpretation capability.  SGML provides a method for
describing the relationship between the structure and the content of a
document.  It also enables documents to be stored and exchanged independently
of formatting, software applications, and computing platforms.<p>
The Electronic Publishing Special Interest Group has begun to refine its SGML
application for the markup of complex mathematics and tables [14], which
suggests the incorporation of SGML in our problem solving system.  The notion
of a document type definition (DTD) introduced by SGML enables documents to be
formally defined by their constituent parts and their structures.  For example,
a document designer might write a DTD that enables the analytical discussion of
a mathematical paper to be marked up as such.  The primary purpose is that the
text identified as forming part of a paper's analytical discussion can then be
organized in a particular way when the SGML source document is combined with
other SGML documents, giving explanations for the particular model derived.
If, at some later date, that analytical discussion turns out to be useful in
another context, it can easily be extracted and combined with other
documents.<p>
There are several reasons for proposing that SGML be used in our problem
solving system.  First, since the library operates in a heterogeneous
environment containing documents scattered on different platforms across a
heterogeneous network (i.e., a set of computers connected using different
character encoding schemes), SGML is well suited for document interchange in
our context.  Second, as we mentioned in section 2.3, documents in the library
need to be constantly updated, and documents in a structured, machine readable
form are easily modifiable by people.  Third, the compositional approach in our
problem solving system requires documents to be used in multiple ways and for
multiple purposes.  Consider, as an example, a company's annual report
containing a great deal of data.  These data might be used as input to a
spreadsheet analysis; the spreadsheet, in turn, can be presented in different
ways; and the data can also be exported to a database system.  Finally, SGML
guarantees that documents are independent of the lifetime of any single
application, which is an important property for documents such as technical
manuals [7].<p>
<p><b>3.  The Design of Algorithms</b><p>
Algorithms and software are needed to achieve the composition of documents in a
suitable environment and format.  This involves two levels: the composition of
documents and the execution of documents.  Composition, as mentioned above,
refers to choosing appropriate fragments and/or components.  Since we are
focusing on mathematical documents, the isolation of the mathematics in each
document gives rise to problems that do not arise in an "ordinary" digital
library, because we need to link and execute the fragments.<p>
The execution of documents is part of the problem solving process in the
digital library.  Given a query of a suitable type/format, we need to put
together a model --- a combined virtual document in the spirit of the digital
library --- for the user.  Finding appropriate fragments is the first step.  We
need to decide what types of documents can appear as fragments and how to use
Mathematica to represent them.  During execution, some issues that have to be
addressed are:  How are results moved from one fragment to interact with
another?  How are results from several fragments combined?<p>
There may be many suitable models for a particular query, and some criteria are
needed to automatically choose among them.  After a model has been fixed, the
system needs to decide in which order to execute the fragment documents and in
which order to put intermediate results together.  Another very important
consideration is that the mathematical expressions and data in documents might
be in quite different forms, e.g., qualitative, quantitative, or logical.  Some
documents may even contain incomplete information.  There are existing tools
and algorithms that handle quantitative information well (e.g., GAMS, a linear,
quadratic, and integer mathematical programming system that has been used to
solve real industrial problems of respectable size).  However, when different
forms are intertwined, i.e., there are qualitative, quantitative, and logical
data in one model, the execution of the model becomes a difficult endeavor.  We
need a formal way of integrating different data representations.  All of these
issues have to be considered when designing the algorithm.  We propose to
develop this algorithm based on an existing one, Rules-Constraints-Reasoning
(RCR).  Developed by Kiang et al. [10], RCR is a method of reasoning with
imprecise knowledge that is aimed mainly at discrete dynamic systems.  It
proposes a model representation that is essentially an interval-based
abstraction of difference equation systems.  (See [10] for a detailed and
formal description.)  First, we discuss some data representation problems that
are important to the design of algorithms.<p>
<p><b>3.1.  Data Representation</b><p>
Suppose that data in some documents are represented in a monotonic form; for
example, a certain variable is increasing or decreasing on a certain interval.
A system taking this form of data will also give its conclusions in the form of
monotonicity.  Typically, there will be a huge number of possibilities for the
resulting dynamic behavior, many of which may never occur in a practical
situation.  Therefore, the RCR algorithm uses a numerical data representation
dealing with sets (described by means of finitely many numbers), rather than
listing all the states of behavior.<p>
However, there may be situations where it is desirable to incorporate into the
model description not only numerical data but logical data as well.  The
question arises as to how to best handle such data.  In recent years, a
connection has emerged between logical deduction and integer programming (cf.
[9]).  By reformulating a problem of satisfying a set of logical conditions as
one of solving a set of inequalities over the integers, one can often solve the
problem in a relatively brief period of time by certain integer programming
methods.<p>
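The translation from logical conditions to integer inequalities can be illustrated on a two-variable example.  The conditions and the brute-force check below are our own toy illustration; a real system would hand the same 0/1 inequalities to an integer programming solver rather than enumerate.<p>

```python
from itertools import product

# Encode "A or B" and "not (A and B)" as linear inequalities over
# 0/1 variables a and b:
#   A or B         ->  a + b >= 1
#   not (A and B)  ->  a + b <= 1
constraints = [
    lambda a, b: a + b >= 1,
    lambda a, b: a + b <= 1,
]
# Enumerate the integer points satisfying every inequality:
solutions = [(a, b) for a, b in product((0, 1), repeat=2)
             if all(c(a, b) for c in constraints)]
```

The surviving points are exactly the truth assignments where precisely one of A, B holds, i.e., the solutions of the original logical problem.<p>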
The RCR algorithm deals with qualitative reasoning when both the initial
information and certain specifications of the system may not be completely
known.  The uncertainty is expressed by saying that, at a given time, a
quantity, whose value need not be exactly known, belongs to a set of a suitable
type.  For computational purposes, the set should be describable by finitely
many parameters.  So, for a subset of the set of real numbers, unions of
finitely many intervals seem suitable; in practice, single intervals have been
used so far.  The algorithm then amounts to a propagation of intervals,
starting with those initially given.  [This is not the same as so-called
interval arithmetic.  The idea of propagating information using intervals has
also been considered by Davis [3], mainly in connection with certain linear
systems.]  The interval obtained for a quantity at a given time then gives the
best information about that quantity that can be obtained.  The accuracy may be
improved by using more than one interval, creating the issue of how much gain
can be expected from making the computations longer and more complicated.  This
algorithm does not incorporate logical (non-numerical) information at this
point.<p>
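A flavor of interval propagation can be given on a single difference equation.  This is not the RCR algorithm itself, only a toy step for x' = a*x + b in which both the state x and the coefficient a are known only as intervals; the nonnegative endpoints are chosen so that endpoint products bound the image.<p>

```python
def propagate(x, a, b):
    """One propagation step for x' = a*x + b when x and a are known
    only as intervals.  With all endpoints nonnegative, the products
    of the endpoints bound the image interval."""
    products = [ai * xi for ai in a for xi in x]
    return (min(products) + b, max(products) + b)

x = (1.0, 2.0)                       # initial uncertainty about x
for _ in range(3):                   # propagate three time steps
    x = propagate(x, a=(0.4, 0.6), b=1.0)
```

Because the coefficient interval lies below 1, the propagated interval settles down rather than exploding, which is the kind of "best obtainable information" behavior described above.<p>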
It does not seem practical to produce all possible behaviors indicated by
logical-type variables.  Suppose we have a situation where ten time intervals
and two logic variables are studied.  During each time interval considered, a
logic variable can be true or false, so there are 4<sup>10</sup> = 1,048,576
possible ways for this behavior to unfold.  Even if 99% of them are ruled out,
more than 10,000 still remain, and it seems doubtful that a human being
inspecting the answer would be able to comprehend it.  Therefore, asking a
question like "What are all the possible states of qualitative behavior that
could arise from a given set of assumptions?" does not seem very useful.  It
might be more practical to ask what the final result is after a certain period
of time, in other words, whether a statement is true or false at that time.
Looking at the problem this way not only cuts down the length of the output,
but also allows the kind of logical reasoning that can apparently be phrased in
terms of integer programming.<p>
We need to develop a unifying data representation language that can represent,
in addition to precisely known numerical and logical data, qualitative
knowledge and incompletely known data.  This would include knowledge that is
currently represented using various languages in logic.  It seems that, in
order to incorporate statements of logical type, it will be desirable to code
all such information numerically, so that ultimately the computational system
only has to deal with numbers.  This should allow us to tie the numerical and
logical components of the system together, making the ranges of certain
quantities or the bounds for certain functions dependent on the truth values of
certain logical variables.  The description of logical conditions should be
attainable in many ways, and many possibilities could be allowed in the system
for added flexibility.  For example, in circumscription [12], logical
conditions can be described by giving some main rules and then a number of
exceptions.  This is analogous to describing upper and lower bounds for
functions by formulas that involve something simple most of the time, but a few
different function forms on a number of "exceptional" intervals.<p>
When introducing logic variables, we need to incorporate variables that take
only integer values (possibly only 0 and 1, as in the case of logic), and
therefore deal with the case when the most natural type of set containing the
possible values of a variable is not a full interval of the real axis, but a
discrete set.  Data representation involving both numerical and logical (say 0
or 1 valued), or more generally integer valued, variables could be handled
using Cartesian products, or finite unions of Cartesian products in the case of
several variables and many conditions.  Thus, the case when a logical variable
takes the value 1 and simultaneously, or conditionally on the logical variable
being equal to 1, a numerical variable is known to lie in the interval [4, 6],
would be expressed using the Cartesian product {1} x [4, 6].  (If the logical
variable can be 0 in the particular situation considered and is 0, the
numerical variable might belong to another interval, say [3, 5], and the whole
thing would be expressed as ({0} x [3, 5]) &cup; ({1} x [4, 6]).)<p>
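This union-of-Cartesian-products representation is easy to realize concretely.  The sketch below encodes exactly the example above as a list of (logic value, interval) pieces; the membership test is our own illustrative addition.<p>

```python
# The state set ({0} x [3, 5]) U ({1} x [4, 6]) from the example,
# stored as a union of {logic value} x [lo, hi] pieces:
state = [
    (0, (3.0, 5.0)),   # {0} x [3, 5]
    (1, (4.0, 6.0)),   # {1} x [4, 6]
]

def possible(logic_value, x, state):
    """Is the pair (logic_value, x) in the union of Cartesian products?"""
    return any(v == logic_value and lo <= x <= hi
               for v, (lo, hi) in state)
```

So the numerical value 4.5 is possible when the logic variable is 1, while 5.5 is ruled out when it is 0, exactly as the conditional intervals dictate.<p>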
One of the challenges is to make sure that we can smoothly integrate genuinely
numerical methods with those of integer programming in order to deal with
logical information in a unified fashion.  It is worth noting that one way to
deal with integer information is to consider the corresponding "continuous"
problem over the real numbers and check at the end of the computation whether
any integer (or 0 or 1 valued) solutions were generated.  Many methods of
integer programming proceed using this idea.<p>
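A one-dimensional toy makes the relax-then-check idea concrete.  The problem, its closed-form continuous optimum, and the neighbor check below are all our own illustrative assumptions; real integer programming methods apply the same idea to far larger problems via branch-and-bound.<p>

```python
import math

def continuous_then_integer(target, lo, hi):
    """Minimize (x - target)^2 over the real interval [lo, hi] in
    closed form, then check the neighboring integers and keep the
    feasible one with the best objective value."""
    x_star = min(max(target, lo), hi)          # continuous optimum
    base = math.floor(x_star)
    candidates = sorted({base, base + 1})      # integer neighbors
    feasible = [x for x in candidates if lo <= x <= hi]
    return min(feasible, key=lambda x: (x - target) ** 2)

best_int = continuous_then_integer(2.3, 0, 5)
```

Here the continuous optimum 2.3 is infeasible as an integer, and checking its two neighbors recovers the integer solution.<p>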
Another important issue is the development of the theoretical principles needed
to create a deductive system that starts with a given system specification and
initial conditions expressed in terms of the unifying language, and deduces the
state of the system at a future time.<p>
<p><b>3.2.  Time Scales and Compositional Modeling</b><p>
We recognize that finding ways to organize the knowledge used in each
individual problem is an integral part of the task, as that knowledge may
consist of a large number of pieces of information that are to be collected
together and used in a coherent way.  We need an algorithm that allows
information propagation over time, from an initial moment to some future
moment, and draws the best possible inferences from this mechanism.  One aspect
of the algorithm design, then, is to devise ways of incorporating both
numerical and logical data into such a system.<p>
Some issues related to the model building process, which thus need to be
considered in the algorithm, are time scales and compositional modeling.  Note
that when applying an algorithm to solve a problem, it is assumed that a model
describing some real situation has been developed, satisfying suitable criteria
so that the system description is in accordance with the algorithm
specifications.  There is, however, the question of how to model huge, complex
enterprises or other organizations, involving perhaps thousands of variables
and relations.  This is of interest not only for model building but also for
computational reasons: how can the computations be organized most effectively
when so many variables and equations potentially have to be taken into
account?<p>
As we mentioned in section 2, the approach we intend to follow is based on the
idea of compositional modeling.  This means that one forms model fragments from
the different pieces of data and descriptions of various parts of the system,
each relating together perhaps only a few of the many variables.  A suitable,
generally rather small, number of fragments is joined together to form
so-called components.  There are then a great number of interrelationships and
dependencies among the components.  However, to solve a specific problem, one
may need to deal only with a limited number of variables, and hence with
relatively few components.  We therefore need a systematic way of determining
the minimal set of components, consisting of all, and only, those relevant to
the analysis.<p>
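One simple, assumed realization of this selection step treats components as nodes linked whenever they share a variable, and closes the set of query variables under those links by breadth-first search.  The component names and variables below are hypothetical.<p>

```python
from collections import deque

# Hypothetical component library: each component lists the variables it
# relates; two components are linked when they share a variable.
components = {
    "sales_model":      {"Sales", "AdSpend"},
    "inventory_model":  {"Inventory", "Sales"},
    "payroll_model":    {"Wages", "Headcount"},
    "facilities_model": {"Rent", "Headcount"},
}

def relevant_components(query_vars):
    """Return the minimal set of components needed to analyze the query
    variables: start from components touching a query variable and close
    under shared-variable dependencies."""
    selected = set()
    frontier = deque(name for name, vs in components.items()
                     if vs & set(query_vars))
    while frontier:
        name = frontier.popleft()
        if name in selected:
            continue
        selected.add(name)
        for other, vs in components.items():
            if other not in selected and vs & components[name]:
                frontier.append(other)
    return selected
```

A query about Inventory pulls in only the inventory and sales fragments, leaving payroll and facilities out of the computation.<p>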
Another issue is the question of time scales.  In most real life situations,
there are many different processes at work, proceeding at different speeds (for
an example related to medicine/biology, see [11]).  Furthermore, the points of
view that analysts want their models to reflect may depend on daily, monthly,
quarterly, or annual changes or updates.  In a complex organization, the model
fragments would presumably involve very different time scales.  Thus there is
not only the question of separating or interconnecting such fragments or
components, but also of designing the algorithm so that it takes time scales
into account when a problem requires the input of several of them.<p>
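The interplay of fragments on different time scales can be shown with a toy simulation in which a fast (daily) fragment and a slow (monthly) fragment are composed in one loop.  All names and rates below are invented for illustration and belong to no particular enterprise model.<p>

```python
def simulate(days):
    """Compose two model fragments on different time scales: cash is
    updated daily, while debt is repaid once a month."""
    cash, debt = 100.0, 1200.0
    daily_sales = 5.0
    monthly_payment = 100.0
    for day in range(1, days + 1):
        cash += daily_sales            # fast fragment: daily update
        if day % 30 == 0:              # slow fragment: monthly update
            cash -= monthly_payment
            debt -= monthly_payment
    return cash, debt

cash, debt = simulate(90)   # three months of interacting time scales
```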
<b><p>
4.  Optimization</b><p>
Suppose that we are using the RCR algorithm to predict the behavior of a
partially known dynamic system.  One of the variables, <i>L</i>, is initially
restricted to an interval [<i>c, d</i>].  Consider a variable <i>M</i>, which
may or may not be the same as <i>L</i>, at a given time <i>t</i>.  The
algorithm will predict that the values of <i>M</i> will then lie in an interval
[<i>a, b</i>].  Now, the numbers <i>a</i> and <i>b</i> will be obtained by
applying the RCR computations and they depend on the values of <i>c</i> and
<i>d</i>.<p>
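The flavor of such a computation can be shown with ordinary interval arithmetic.  This is only a generic illustration, not the RCR algorithm itself, and the linear dynamics below is an assumption made for the example.<p>

```python
def scale_iv(k, iv):
    """Multiply interval iv by a scalar k >= 0 (nonnegative assumed here)."""
    return (k * iv[0], k * iv[1])

def add_iv(u, v):
    """Sum of two intervals."""
    return (u[0] + v[0], u[1] + v[1])

def propagate(m0, l_iv, steps, decay=0.9):
    """Propagate an interval enclosure for M through the toy dynamics
    M(t+1) = decay * M(t) + L, where L is known only to lie in l_iv.
    Each step yields a guaranteed interval [a, b] for M at that time."""
    m = m0
    for _ in range(steps):
        m = add_iv(scale_iv(decay, m), l_iv)
    return m

# M(0) known exactly to be 10; L somewhere in [1, 2]; enclose M(3).
a, b = propagate((10.0, 10.0), (1.0, 2.0), steps=3)
```

The bounds <i>a</i> and <i>b</i> computed this way are functions of the endpoints of the input interval, which is exactly the dependence exploited in the next step.<p>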
Instead of considering <i>c</i> and <i>d</i> to be given and fixed, we now
assume that they depend on an external variable <i>x</i> (which is not part of
the model so far).  Then all quantities of the model that were obtained by
using <i>c</i> and <i>d</i> become functions of <i>x</i>.  Hence the question
arises as to how to optimize the values of <i>M</i> at time <i>t</i>.  In an
application to economics or management, the variable <i>M</i> might, for
example, be related to profits, income, or the stability of the enterprise.<p>
To optimize the interval [<i>a, b</i>], one could consider many derived
quantities.  Suppose that we want to make the lower bound <i>a</i> as high as
possible.  In other words, we want to maximize <i>a</i> as a function of
<i>x</i>.  The calculation that yields <i>a</i> as a function of <i>x</i> is
usually so complicated that it is not feasible to expect to find an explicit
formula for <i>a</i> in terms of <i>x</i>.  Here is the challenge of finding
computational procedures to solve this extremal problem.<p>
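When <i>a</i>(<i>x</i>) can only be evaluated by rerunning the whole computation, derivative-free methods are one natural option.  As an illustration (not a prescription), a golden-section search over a unimodal <i>a</i>(<i>x</i>) might look as follows; the quadratic stand-in for <i>a</i>(<i>x</i>) is of course an assumption.<p>

```python
import math

def maximize_scalar(f, lo, hi, tol=1e-6):
    """Golden-section search for the maximizer of a unimodal black-box
    function f on [lo, hi] -- usable when f(x), such as the lower bound
    a(x) of a predicted interval, has no explicit formula and can only
    be obtained by running the whole computation."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) > f(d):        # maximum lies in [a, d]
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:                  # maximum lies in [c, b]
            a, c = c, d
            d = a + inv_phi * (b - a)
    return (a + b) / 2

# Toy stand-in for a(x), peaked at x = 1.5.
x_star = maximize_scalar(lambda x: -(x - 1.5) ** 2, 0.0, 4.0)
```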
As we suggested in the last section, logical variables would be involved in
many cases.  Functions/quantities to be optimized, as described above, could
depend on logical variables.  In general, they would be expected to depend on
both numerical and logical variables, resulting in combined or hybrid
optimization.<p>
The current directions taken by businesses offer motivation for the study of
qualitative optimization.  Many corporations are interested in or have
undertaken enterprise modeling, to keep track of the development of the
organization, financially and otherwise.  Effectiveness demands the setting of
concrete corporate/organizational goals.  To achieve them, different lines of
strategy can be considered.  Alternative states of planning and behavior arise
and must be determined.  For example, the company may be simply asking whether
or not a certain loan should be taken out, and this reduces to a question
involving a logical variable.<p>
Thus, there could be a number of logical and numerical variables whose values
are to be chosen when the senior management plans the future of the
organization.  We can then use the (usually incompletely known) system and
optimize some given quantity at a future time to determine the best
configuration of logical and numerical variables to be chosen at the present
time.  When constructing models like this for a large organization, the number
of variables and functions involved will be very large, and the model will
often be so complicated that it would prove difficult to make decisions on a
"common sense" basis.  Therefore, it is important to
have a computational system to assist in decision making, by giving the best
conclusions that can be obtained, in view of the incomplete information given
to the model, and to make choices that will optimize key quantities chosen with
regard to the corporate goals.<p>
Therefore, the theoretical development of an optimization approach to
qualitative systems and the theoretical development of the foundations of
robust optimization (optimization of a quantity in an incompletely known
system) are needed.  Logic variables could also be used as control variables
with respect to which a given quantity in the system is to be optimized, the
final result being a hybrid situation where the control variables can be of
numerical or logical type, and where the concept of optimization might include
a combination of optimizing a numerical quantity, and taking into account some
logical variables for which a configuration deemed to be optimal is sought.
This involves defining a preferential order among the configurations that one
may obtain for such variables.<p>
A simple example of qualitative and logic optimization will show the flavor of
the issues we have in mind.  Consider the financial situation of a
company, described by the variables Cash <i>C(t)</i>, Sales <i>S(t)</i>,
Inventory <i>I(t)</i>, Profit <i>P(t)</i>, Debt <i>D(t)</i>, and Stability
<i>X(t)</i>, which is defined in terms of other quantities.  Also there is the
logic variable Raid <i>R(t)</i>, which is true if, at time <i>t</i>, the
company buys another company of a size whose qualitative description is assumed
to be given.  We assume that the model gives equations or inequalities
describing how these quantities are related to each other and in particular,
how the condition <i>R(t)</i> = 1 will affect the other quantities.  Let us
assume, for example, that we try to maximize the midpoint <i>q</i> =
(<i>a</i>+<i>b</i>)/2 of the interval [<i>a, b</i>] containing the possible values of <i>X(5)</i> (stability
after 5 years from the starting point).  As control variables that may be
varied in this experiment, we may take, for example, the limits for the
intervals containing <i>I(0), R(0),</i> and perhaps some or all future values
of <i>R(t)</i> for 1 &lt;= t &lt;= 4.  Now <i>q</i> will depend on both
numerical and logic variables.  If we generalize and expand the model, and
define <i>X</i> to depend also on some logic variables, then both the control
variables and the variables to be optimized would involve a mixture of
numerical and logic variables.  Thus, we need an algorithm able to handle such
a very general situation.<p>
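In the absence of such an algorithm, the example can at least be emulated by brute-force enumeration of the logic controls.  Every equation and constant below is invented for illustration and is not the model described above; only the overall shape of the problem is retained.<p>

```python
from itertools import product

def stability_after(r_sequence, i0_interval):
    """Toy dynamics standing in for the model: stability X is helped by
    low inventory and hurt (short term) by an acquisition (R(t) = 1),
    which pays off at later times.  Returns an interval [a, b] for X(5)."""
    lo, hi = 1.0, 1.0
    i_lo, i_hi = i0_interval
    for t in range(5):
        cost = 0.3 if r_sequence[t] else 0.0        # acquisition cost at t
        gain = 0.2 if any(r_sequence[:t]) else 0.0  # payoff of earlier raids
        lo = lo - cost + gain - 0.01 * i_hi
        hi = hi - cost + gain - 0.01 * i_lo
    return lo, hi

def best_plan(i0_interval):
    """Hybrid optimization by enumeration: choose the boolean controls
    R(0)..R(4) that maximize the midpoint q = (a+b)/2 of X(5)."""
    def midpoint(r):
        a, b = stability_after(r, i0_interval)
        return (a + b) / 2
    return max(product([0, 1], repeat=5), key=midpoint)
```

Under these invented dynamics an early acquisition maximizes five-year stability; with thousands of variables, such exhaustive enumeration of configurations becomes infeasible, which is precisely why a more refined algorithm is needed.<p>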
A rather crude view of optimization is the following.  There are a number of
different scenarios to be studied that correspond to different initial values
for certain variables.  If there were only finitely many scenarios, then, at
least in theory, one could perform all the calculations for each of them
individually, and then compare the final results and check which one appears
the most satisfactory.  Thus one answers a whole series of "what if" questions.
We intend to develop a more sophisticated method: one that answers such
questions without performing the calculations for each scenario separately, but
instead provides the inference power to arrive at the answer directly.  At this
point the method is at the level of calculus-type analysis; we hope to improve
it further with techniques that reduce the branching which occurs when
different alternatives for the formulas used in functional definitions are
encountered.<p>
<b><p>
5.  Conclusions</b><p>
Research in digital libraries should go beyond categorizing and organizing
electronic information, and beyond searching and filtering large volumes of
data.  The tremendous amount of information scattered across the network is a
powerful resource with which modern organizations can capture strategic and
tactical advantages in today's tough competition, if they fully utilize it.
This provides the main motivation for the idea of using a digital library as
the foundation for an enterprise wide decision support system.  However, as we
discussed in this paper, many problems have to be solved before such a system
can be put into use; this is a great challenge, but a great opportunity as
well.<b><p>
<p>
<p>
<p>
References</b><p>
[1]	Arnon, D., R. Beach, K. Mcisaac, and C. Waldspurger (1988) "CaminoReal: an
interactive mathematical notebook."  In <u>EP88: Document Manipulation and
Typography.</u>  Proceedings of the International Conference on Electronic
Publishing, Document Manipulation and Typography.  Nice (France).  Apr. 20-22.
Ed. by J.C. van Vliet.  Cambridge Univ. Press.  Cambridge.<p>
<p>
[2]	Bier, E. A. and A. Goodisman (1990) "Documents as user interfaces." In
<u>EP90</u>: Proceedings of the International Conference on Electronic
Publishing, Document Manipulation &amp; Typography.  Gaithersburg,  Maryland,
September.  Ed. by R. Furuta.  Cambridge University Press.  Cambridge.<p>
<p>
[3]	Davis, E. (1987) "Constraint propagation with interval labels."
<u>Artificial Intelligence</u>. 32: 281-331.<p>
<p>
[4]	de Kleer, J. (1986) "An assumption-based TMS." <u>Artificial
Intelligence.</u>  28: 127-162.<p>
<p>
[5]	Falkenhainer, B. and K. D. Forbus (1991) "Compositional modeling: finding
the right model for the job."  <u>Artificial Intelligence.</u>  51: 95-144.<p>
<p>
[6]	Fox, E. A. (1993)  Source book on digital libraries.  Virginia Tech.<p>
<p>
[7]	Herwijnen, E. van (1990)  Practical SGML.  Dordrecht/Boston/London: Kluwer
Academic Publishers.<p>
<p>
[8]	Hinkkanen, A., K. R. Lang, and A. B. Whinston (1993) "On the Usage of
Qualitative Reasoning as Approach Towards Enterprise Modeling."  forthcoming in
<u>Annals of Operations Research.</u><p>
<p>
[9]	Hooker, J. N. (1988) "A quantitative approach to logical inference."
<u>Decision Support Systems</u>.  4: 45-69.<p>
<p>
[10]	Kiang, M. Y., A. B. Whinston, and A. Hinkkanen (1993) "An interval
propagation method for solving qualitative difference equation systems."  in
<u>Qualitative Reasoning and Decision Technologies.</u>  ed. by N. P. Carrete
and M. G. Singh.  International Center for Numerical Methods in Engineering,
Barcelona,  Spain.<p>
<p>
[11]	Kuipers, B. (1988) "Qualitative simulation using time-scale abstraction."
<u>Artificial Intelligence in Engineering</u>.  3(4): 185-191.<p>
<p>
[12]	Lifschitz, V. (1993) "Circumscription."  manuscript for a chapter in
<u>Handbook of Logic in AI and Logic Programming.</u>  University of Texas at
Austin.<p>
<p>
[13]	National Science Foundation:  Request for Proposals on Digital Libraries.
1993.<p>
<p>
[14]	Wright, H. (1992) "SGML frees information."  <u>Byte.</u>  June:
279-286.<p>
<p>
<!--#include virtual="/DL94/footer.ihtml" -->
</body></html>