
ACTIVE, A PLATFORM FOR BUILDING INTELLIGENT OPERATING ROOMS

D. GUZZONI (1), C. BAUR (1), A. CHEYER (2)

(1) VRAI Group – EPFL – 1015 Lausanne – Switzerland
(2) AIC – SRI International – Menlo Park, CA – USA

Today computers are part of the standard equipment of modern surgery rooms. They assist surgeons in performing complex procedures that would not be possible otherwise. However, despite the availability of more powerful and complex computer systems, their user interfaces have not been adapted to fully leverage their potential. A new type of software, behaving as an independent intelligent assistant, is needed to better assist surgeons and their staff. Building an intelligent assistant is a difficult task that requires expertise in many fields ranging from artificial intelligence to core software and hardware engineering. We believe that providing a unified tool and methodology to create intelligent software will bring many benefits to this area of research. Our solution, the Active framework, introduces the original concept of Active Ontologies to model and implement intelligent applications. Based on suggestions and constant evaluations from surgeons, an Active based assistant for endoscopic neurosurgery is under development. Using natural modalities such as speech recognition and hand gestures, it enables surgeons to interact with the computer-based equipment of the operating room as if it were a fully active member of the team. In a broader context, Active aims to ease the development of intelligent software by making required technologies more accessible. It will help foster research and innovation, and ease the development cycle and deployment of this new type of application.

INTRODUCTION

Although computer systems have grown in power and access more networked content and services, computer interfaces have not changed. Conventional user interfaces with simple direct manipulation commands are no longer sufficient to fully leverage such rich and dynamic environments [1]. The medical field is no exception.
Figure 1 : Active Editor

Computers are now part of the standard equipment used in modern surgery rooms. To fully leverage this new context, modern software systems should behave as intelligent assistants able to observe and sense their environment (for instance, human inputs) and to analyze a situation by mapping sensed inputs into a model of what tasks and events may be happening [2]. They would then understand and anticipate what the user might need, and finally act to produce relevant and useful behaviour. The development of intelligent assistants requires expertise in many fields [3].

Perception of human activities is typically based on techniques such as computer vision or speech recognition. Understanding the meaning of input signals is performed by natural language processors, dialog systems or activity recognition mechanisms. Reaction, decision making strategies and complex task execution are the responsibility of planning systems. Finally, as planning unfolds, various actions are taken by the system. Based on their nature and purpose, intelligent systems act through a wide range of modalities. They communicate with humans, gather information or physically change their environment.

Designing and implementing intelligent assistant software is also a difficult task. Due to the variety and complexity of technologies required, intelligent assistants are made of a collection of components written in many different programming languages. Connecting various heterogeneous programs, sometimes remotely, requires strong technical knowledge and careful deployment policies. Testing and debugging distributed heterogeneous
systems is also a complex task. To identify and correct bugs, events and associated values need to be tracked from one component to another. Finally, combining many different approaches, tools and technologies limits the overall performance and extensibility of the system.

We believe that providing a unified tool and methodology to create intelligent software will solve many of the problems described above and bring many benefits to this area of research. It will allow more researchers and engineers to work in the field by providing a bridge between core AI technologies and practical engineering.

This paper introduces our implementation of this vision, the Active framework. The next section is dedicated to related work on building intelligent assistants. The section Active Framework outlines Active's original concepts, architecture and current implementation. The following section presents how the Active framework is used to implement an intelligent assistant in the context of neurosurgery. Finally, a conclusion presents directions for our future work.

RELATED WORK

By definition, intelligent interactive systems are based on various AI techniques. Relevant efforts related to our research can be classified into three categories.

First, the area of interface agents aims at creating intelligent user interfaces to assist humans in specific domains [4]. For instance, the Internet is an environment where intelligent assistants can leverage a vast amount of information and services to help users with complex tasks [5]. Scheduling meetings, managing an agenda and communicating also represent applications where intelligent assistants are relevant [6].

Intelligent assistants are also relevant in the domain of heterogeneous smart spaces, instrumented rooms able to sense their environment and act upon events and conditions. In the surgical field, modern operating rooms are becoming such smart spaces. Many components can now be connected and controlled so that intelligent assistant software can be deployed to assist surgeons and their staff. Existing smart space projects are designed and optimized for specific domains, and implemented using proprietary frameworks and methods. Our goal is to provide a more generic intelligent system toolkit, composed of a suite of tools and methodologies to rapidly design and deploy complex software into smart spaces.

Our work also relates to the field of multi agent framework research. In this area, existing heterogeneous AI based components are turned into agents able to form communities working together with humans to help them solve problems. In this context, the Open Agent Architecture (OAA) [7] introduces the powerful concept of delegated computing. Requests and plans are delegated to a facilitator in charge of orchestrating actions based on declared capabilities of agents. Thanks to its ease of deployment and clean design, OAA is used in a large number of projects. Though very powerful, OAA does not provide a unified methodology to create intelligent systems. It rather provides a
framework where heterogeneous elements, written in many programming languages, are turned into OAA compatible agents to form intelligent communities. Similarly, the Retsina [8] framework is an advanced multi agent architecture to build distributed intelligent systems. It is based on four classes of agents: interface agents that interact with users, task agents that carry out plans, information retrieval agents, and middle agents that help match agents that request services with agents that provide services. Though very efficient in producing independent reactive behavior, Retsina would not be suited as a unified methodology to implement basic AI components such as natural language processors or multimodal fusion engines. In addition, the design of Retsina uses different formalisms for communication, domain representation and reasoning technique. In contrast, our aim is to use the same formalism for all intelligent assistant aspects.

Finally, undertaking tasks on behalf of a user and attempting to understand what actions are being carried out involves planning. BDI based systems [9] provide goal oriented reactive planning in dynamic and partially known environments. Beliefs represent the model and state of the world and a plan library defines how to achieve goals. Intentions are activated plans elected and picked from the library to reach some goals. The list of intentions is constantly evaluated against beliefs, thus providing a reactive behavior to the system. Many BDI implementations [10] [11] are available and have proved their relevance in the field of intelligent systems. BDI based engines would be well suited to be the core of our research, where dynamic decisions need to be made to respond to an event. Their design is nevertheless constrained to dynamic planning and would not be suited to implement tasks such as natural language processing or modality fusion.

ACTIVE FRAMEWORK

1. Conceptual Overview
Our solution, the Active framework, provides a unified tool and methodology to ease the development of intelligent software. Active is based on the original concept of Active Ontologies, used to model and implement applications. A conventional ontology is defined as a formal representation for domain knowledge, with distinct classes, attributes, and relations among classes; it is a data structure. An Active Ontology is a processing formalism where distinct processing elements are arranged according to ontology notions; it is an execution environment. An Active Ontology is made up of interconnected processing elements called Concepts, graphically arranged to represent the domain objects, events, actions, and processes that make up an application. Concepts communicate with each other through channels, passing state information, hypotheses, and requests.
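
The contrast between a conventional ontology (a data structure) and an Active Ontology (an execution environment) can be pictured with a minimal Python sketch. Everything named here is an assumption made for illustration: the `Concept` class, its `connect` and `send` methods and the string messages are hypothetical and are not taken from the Active implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# A conventional ontology is plain data: classes, attributes, relations.
STATIC_ONTOLOGY = {
    "Command": {"attributes": ["subject", "verb"], "relations": {"controls": "Device"}},
    "Device": {"attributes": ["name"], "relations": {}},
}


@dataclass
class Concept:
    """Hypothetical processing element: a domain notion plus behaviour."""
    name: str
    rule: Callable[[str], Optional[str]]  # maps an incoming message to an outgoing one
    channels: List["Concept"] = field(default_factory=list)
    log: List[str] = field(default_factory=list)

    def connect(self, other: "Concept") -> None:
        """Open a channel from this concept to another one."""
        self.channels.append(other)

    def send(self, message: str) -> None:
        """Record and process a message, then pass the result along every channel."""
        self.log.append(message)
        result = self.rule(message)
        if result is not None:
            for target in self.channels:
                target.send(result)


# Same domain notions as in STATIC_ONTOLOGY, now as connected processing elements.
device = Concept("Device", lambda word: f"device={word}" if word == "endoscope" else None)
command = Concept("Command", lambda hypothesis: None)  # root: only records what it receives
device.connect(command)

device.send("endoscope")  # recognised, forwarded as a hypothesis
device.send("hello")      # not recognised, nothing is forwarded
print(command.log)        # ['device=endoscope']
```

The dictionary only describes the domain, whereas the connected `Concept` objects react to input and propagate hypotheses, which is the distinction the overview draws.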
Figure 2 : Active Application Design

2. Technology
The Active framework implementation is a Java based software suite designed to be extensible and open. The Active Editor (shown in Figure 1) is a design environment used by developers to model, deploy and test Active applications. The Active Server is a scalable runtime engine that hosts and executes one or more Active applications.

A plug-in mechanism enables researchers to package AI functionality so that developers can apply and combine the concepts quickly and easily. To ensure ease of integration and extensibility, components of the Active platform communicate through web service (SOAP) interfaces.

3. Active based application design
An Active powered application is composed of one or more Active Ontologies deployed and executed on the Active server and a community of sensors and actuators integrated as SOAP web services (see Figure 2). Sensors (user interface, speech recognizer, stereo camera or any physical measuring probe) report events captured in the environment through the SOAP interface of the Active server.

In response to incoming events, an Active Ontology in charge of natural language interpretation attempts to construct structured commands. Such an Active Ontology (see Figure 1) defines the structure of valid commands and, within the same unified context, specifies processing rules to turn the static ontology-like domain definition into a dynamic execution environment.
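
The idea of a command structure that also carries its own processing rules can be illustrated with a small tree of rating elements. The Python sketch below is hypothetical: the class names, the vocabulary-only rating rule (event ordering is not checked) and the confidence formula are assumptions chosen for illustration, not Active's actual API. Leaf elements rate incoming words, one kind of parent keeps only its best-rated child, and another assembles all of its children into a structured command.

```python
from typing import List, Optional, Tuple

Rating = Tuple[float, object]  # (confidence, value)


class Sensor:
    """Leaf concept: rates a word by membership in a known vocabulary (assumed rule)."""
    def __init__(self, name: str, vocabulary: List[str]):
        self.name, self.vocabulary = name, set(vocabulary)

    def rate(self, words: List[str]) -> Optional[Rating]:
        for word in words:
            if word in self.vocabulary:
                return (1.0, word)
        return None  # nothing recognised


class Selection:
    """Node concept: keeps the single best rating among its children."""
    def __init__(self, name: str, children: list):
        self.name, self.children = name, children

    def rate(self, words: List[str]) -> Optional[Rating]:
        ratings = [r for r in (c.rate(words) for c in self.children) if r is not None]
        return max(ratings, key=lambda r: r[0]) if ratings else None


class Gathering:
    """Node concept: builds a structured object from all of its children."""
    def __init__(self, name: str, children: list):
        self.name, self.children = name, children

    def rate(self, words: List[str]) -> Rating:
        parts = {c.name: c.rate(words) for c in self.children}
        found = {k: r[1] for k, r in parts.items() if r is not None}
        confidence = len(found) / len(parts)  # assumed scoring: share of filled slots
        return (confidence, found)


# The tree is both the definition of a valid command and the code that builds one.
subject = Sensor("subject", ["endoscope"])
verb = Sensor("verb", ["zoom", "move"])
complement = Selection("complement", [Sensor("zoom_dir", ["in", "out"]),
                                      Sensor("move_dir", ["left", "right"])])
command = Gathering("command", [subject, verb, complement])

confidence, value = command.rate("endoscope zoom in".split())
print(confidence, value)
# 1.0 {'subject': 'endoscope', 'verb': 'zoom', 'complement': 'in'}
```

In this sketch an utterance that matches no vocabulary yields a confidence of 0.0 and an empty command, so a later stage could reject it or wait for further input.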
An Active Ontology in charge of natural language interpretation is made out of two types of concepts: sensor concepts and node concepts.

Sensor concepts are specialized filters that sense incoming events and rate them according to their possible meaning. A rating defines the degree of confidence about the possible meaning of the corresponding sensed signal. Typically, sensor concepts generate ratings by testing the ordering of events and whether their values belong to a known vocabulary set. Sensors use channels to report their results to their parents, the node concepts.

There are two types of node concepts: gathering nodes and selection nodes. Gathering nodes create and rate a structured object made out of ratings coming from all their children. Selection nodes pick the single best rating coming from their children. Node concepts are also part of the hierarchy and report ratings to their own parent nodes. Through this bottom up execution, input signals are incrementally assembled up the domain tree to produce a structured command at the root node.

For instance, when the surgeon says "endoscope zoom in", the sequence of words "endoscope", "zoom", "in" will be submitted to the network. Each word is rated by the sensors of the network. "endoscope" will be rated as a subject, "zoom" as a verb and "in" as a zoom complement. The node complement is of type selection and picks the best rated value coming from its children. At the top of the network, the node command is of type gathering and assembles values from its children to create the final command.

Since sensors report events to the Active server through a web service interface, they can be heterogeneous, distributed and easily added. Active is a test-bed for multimodal applications where multiple sensors can contribute to make up a command.

For instance, a surgeon can say "endoscope, follow my tool" while gesturing to the left. The speech recognizer will contribute by reporting all recognized words and the gesture recognizer will report a gesture going from left to right. The language processing Active Ontology, using its bottom up network of concepts, will assemble these fragments to generate a full command.

Concepts remember their current ratings, so the dialog context between the user and Active is maintained. After successfully issuing the command "endoscope zoom in", the user can simply say "in" or "out" to further control the zoom factor.

Once a structured command has been generated at the language processing stage, it is passed to another Active Ontology in charge of validation and resolution. The incoming command is deconstructed, following a top down scheme, to verify that each element is valid and semantically correct.

Complete and valid commands are processed by a final stage, implemented as another Active Ontology, which performs actions and communicates. Since Active applications interact with their environment through a set of loosely coupled services, actuators are not known at design time and have to be dynamically chosen at runtime based on their availability, the environment context and user preferences.

This concept of delegated computing [7] is implemented by another specialized
Active Ontology. Registered service providers are rated and picked at runtime by a delegation broker. As an example, if a message has to be communicated, the delegation Active Ontology will analyze the current situation to decide which service provider is best suited to do the job. Selection is based on many factors such as dialog context, user preferences, location, reliability or cost. Service integration through a delegation mechanism provides a powerful plug and play approach where components can be dynamically integrated.

NEUROSURGERY INTELLIGENT ENVIRONMENT

Following the methodology described in the previous section, an intelligent operating assistant for neurosurgery is under development. The system is implemented as a multimodal system allowing surgeons to retrieve and manipulate pre-operative data (a set of CT scans and a reconstructed 3D model of the area to operate). In addition, live images coming from a powered image source (endoscope or microscope) are displayed along with vital patient information. Surgeons and their staff interact with the system through a combination of hand gestures, using a contact-less mouse [12], and voice recognition. Commands are issued to control the powered endoscope, navigate through pre-operative data and choose which information to show on the main display. The prototype is implemented over five Active Ontologies deployed on an Active server and a community of SOAP enabled sensors and actuators. Input sensors are speech recognition, vision based gesture recognition and probes used to monitor patient vital signs. Actuators are the main user interface, a robotic endoscope holder and a speech synthesizer.

The system is evaluated and reviewed by surgeons and medical equipment suppliers on a regular basis. For the first time, a natural and intuitive computer interface enables them to interact with computers as though they were active members of the team. In addition, a service-based architecture federates the computer based systems present in the operating room to centralize all interactions through the same set of multimodal channels. It saves surgeons from learning about different system designs and limits the number of user interfaces they have to deal with.

Since the system is built as a community of distributed services, multiple surgeons can collaborate from different locations by dynamically connecting their own user interfaces on a shared network.

The major problem we see for a broader deployment of our system is the standardization of the operating room components. Operating room communication protocols are being developed, but they are not open and use proprietary technologies.

SUMMARY AND FUTURE WORK

The Active framework provides a unified tool and approach for rapidly developing applications incorporating robust natural language interpretation,
dialog management, multimodal fusion and brokering of web services. As such, Active aims to unleash the immense potential of intelligent software by making required technologies more easily accessible.

Its goal is to foster research and innovation in this new field of software design by helping launch more academic and commercial projects. Active has been used in various domains, such as intelligent spaces and ubiquitous mobile communications.

In the medical field, where computers are part of the standard equipment of surgery rooms, an Active based intelligent operating environment is under development and evaluation. This software assistant enables surgeons to interact with computer systems as if they were active members of the team.

More work remains to be done on both implementation and methodology aspects of Active. To perform realistic clinical tests, we are working on integrating real operating room components with the Active framework. While Active has proven techniques for basic language processing and service orchestration, further investigation needs to be done on activity recognition and plan execution. Our philosophy is to use the Active framework to unify these two disciplines and perform them in a single environment. Active could then look at the activity of a user, understand what is being attempted, and proactively provide relevant assistance or take over the execution of the task.

ACKNOWLEDGEMENTS

This research has been supported by SRI International and the NCCR Co-Me of the Swiss National Science Foundation.

BIBLIOGRAPHY

[1] MAES P.
Agents that reduce work and information overload
Communications of the ACM, 1995, 38.

[2] SOWA J.F.
Architectures for intelligent systems
Special Issue on Artificial Intelligence of the IBM Systems Journal, 2002, 41 : 331-349.

[3] WINIKOFF M., PADGHAM L., HARLAND.
Simplifying the development of intelligent agents
Australian Joint Conference on Artificial Intelligence, 2001, 557-568.

[4] MIDDLETON S.E.
Interface agents: A review of the field, 2002.

[5] MORRIS J., REE P., MAES P.
Sardine: dynamic seller strategies in an auction marketplace
ACM Conference on Electronic Commerce, 2000, 128-134.

[6] BERRY P., MYERS K., URIBE T., YORKE-SMITH N.
Constraint solving experience with the CALO project
Proceedings of CP05 Workshop on Constraint Solving under Change and Uncertainty, Sitges, Spain, 2005, 4-8.

[7] CHEYER A., MARTIN D.
The open agent architecture
Journal of Autonomous Agents and Multi-Agent Systems, 2001, 4(1) : 143-148.

[8] SYCARA K., DECKER K., PANNU A.S., WILLIAMSON M., ZENG D.
Distributed intelligent agents
IEEE Expert, 1996.

[9] RAO A.S., GEORGEFF M.P.
BDI-agents: from theory to practice
Proceedings of the First Intl. Conference on Multiagent Systems, San Francisco, 1995.

[10] MYERS K.
A procedural knowledge approach to task-level control
In Proceedings of AIPS-96, AAAI Press, 1996, 158-165.

[11] NORLING E., RITTER F.E.
Embodying the JACK agent architecture
Australian Joint Conference on Artificial Intelligence, 2001, 368-377.

[12] GRAETZEL C., FONG T.W., GRANGE S., BAUR C.
A non-contact mouse for surgeon-computer interaction
Technology and Health Care, 2004, 12(3) : 245-257.
