Building an Interface Agent for ArcInfo

Daniel Campos D., Aleksey Y. Naumov, and Stuart C. Shapiro

http://proceedings.esri.com/library/userconf/proc96/TO50/PAP049/P4...
Contents: Abstract; Introduction; The Interface Agent; How the Interface Agent Works; Testing the Interface Agent; Discussion and Conclusions

Abstract

This paper describes a knowledge-based interface agent whose mission is to help users with no knowledge of ArcInfo to access and process spatial data stored in ArcInfo databases. The agent uses a client-server scheme and operates over a LAN or WAN. It receives and processes requests written in plain English, and it interacts with the user when the user's concepts may not match the representation of data in the ArcInfo database. The agent builds sequences of commands that provide the requested information, sends them to an ArcInfo server, receives the results, and presents them to the user. A prototype of the interface agent was built using SNePS (Semantic Network Processing System) and Common Lisp on a Sun SPARCstation. The project faces some of the challenges raised in the work of Shapiro, Chalupsky & Chou [5] and proposes some possible solutions to them.

Introduction

The development of GIS technology has made it available to a growing number of people from different disciplines and with different backgrounds. However, the productivity they can achieve is limited by their lack of technical knowledge about GIS tools. Interface agents are software tools that aim to reduce the gap between the user's knowledge and the technical knowledge required to operate a software system. For the purposes of this work, we assume that the user is familiar neither with ArcInfo nor with the content and structure of the ArcInfo database, and that he/she is not necessarily working on the machine on which ArcInfo is available. Our problem is to build an intelligent interface agent that helps ArcInfo users retrieve the information they want, either building complex queries for them or helping them to build such queries during the interaction.

Shapiro, Chalupsky & Chou describe, from both the design and the implementation standpoint, an interface between ArcInfo and SNePS, the Semantic Network Processing System, a knowledge representation, reasoning and acting system developed by Stuart C. Shapiro et al. at the State University of New York at Buffalo [4]. As the authors concluded, "many interesting questions need to be answered before a full fledged Natural Language front-end to ArcInfo can be developed". Specifically, we address two issues. One is the possibility of using ArcInfo documentation files as a source of information about geographic attributes, in order to partially automate the construction of a domain knowledge base and domain lexicon. These documentation files are supposed to comply with the new Federal Data Transfer Standard [2] and ideally should accompany every spatial data set used in ArcInfo. The second issue is the use of an intelligent agent to perform spatial queries in a more natural way, using multimedia language (a combination of Natural Language and pointing).

Currently, queries to INFO, the ArcInfo database module, require knowledge of attribute values (for example, "select landuse-code 3", where 3 might stand for "agriculture", or "select vegetation 'forest'"). More complex queries are harder to formulate and may require additional operations, such as statistical procedures, before they can be stated correctly (for example, to retrieve information about the highest points in an elevation data set, the user first has to run statistics on this data set to obtain the exact value). Although menu-based interfaces, such as ArcTools, simplify the user's task, they usually constrain the possibilities of interaction to those defined by design, while still requiring familiarity with concepts and database structure.

The main technical problems faced in this project were: the design and implementation of a mechanism for linking ArcInfo and SNePS through the Internet, in such a way that the system can operate independently of the actual locations of ArcInfo and the interface agent; the design and implementation of a mechanism to automate the process of building a knowledge base and a domain dictionary using the information stored in ArcInfo meta-data files; the definition of a grammar for a dialect of English that enables the system to understand Natural Language queries and enables a user to teach it plans that can be used to answer those queries; and the design and implementation of a mechanism for negotiating the meanings of unknown words.
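The two difficulties just described can be made concrete with a small Python sketch over a toy attribute table. This is not INFO or ArcInfo code: the records, the `LANDUSE_CODES` mapping and both function names are assumptions invented for illustration. The sketch only shows that selecting by meaning requires knowing the stored code, and that a "highest points" query needs a statistics pass before the selection can be stated.

```python
# Toy stand-in for an attribute table; every name and value here is hypothetical.
LANDUSE_CODES = {"agriculture": 3, "forest": 1}

PARCELS = [
    {"id": 1, "landuse_code": 3, "elevation": 120},
    {"id": 2, "landuse_code": 1, "elevation": 450},
    {"id": 3, "landuse_code": 3, "elevation": 450},
]


def select_by_landuse(records, term):
    """Translate the user's word into the stored code, then select on the code."""
    code = LANDUSE_CODES[term]  # the step a user unfamiliar with the data cannot do
    return [r for r in records if r["landuse_code"] == code]


def select_highest(records):
    """Two passes: compute the maximum first, then select the records that reach it."""
    top = max(r["elevation"] for r in records)  # the preliminary "statistics" step
    return [r for r in records if r["elevation"] == top]
```

An interface agent of the kind described here would hide both the code lookup and the preliminary statistics step from the user.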


The Interface Agent

Our goal is to provide inexperienced users of ArcInfo with an interface that helps them obtain information from ArcInfo databases. We assume that users are familiar neither with ArcInfo nor with the content and structure of ArcInfo workspaces. Consequently, the agent must be able to: build sequences of ArcInfo commands that provide answers to typical queries to an ArcInfo database; infer, if possible, the best matches between the user's concepts and ArcInfo data, interacting with him/her in the case of a possible mismatch between the user's concepts and the data representation in the ArcInfo workspace; and generate text and graphical output for the user.

In order to do that, our interface agent must be able to plan sequences of ArcInfo commands aimed at providing the information requested by the user. For a plan to be performed, its preconditions must be satisfied. Preconditions refer to beliefs of the intelligent agent. We identify at least two different types of beliefs in our agent: beliefs about general characteristics of the ArcInfo system, and beliefs about the ArcInfo workspace used during a particular interactive session. A small demonstration that shows the main features of the intelligent agent is currently available.

Beliefs about general characteristics of the ArcInfo system

The interface agent has some general beliefs about the ArcInfo system. We collected some generic information about ArcInfo and represented it as a propositional semantic network using SNePS. Some examples:

ArcInfo data types are coverage, grid, table, image, and TIN. SNePS representation:

    (assert superclass ARC-datatype subclass {coverage, grid, table, image, tin})

Coverages can represent points, lines and polygons. SNePS representation:

    (assert superclass coverage subclass {polygon-coverage line-coverage point-coverage})

If a coverage is a polygon coverage, its system attributes are: area, perimeter, and internal ID (#). SNePS representation:

    (assert forall ($x) ant (member *x class polygon-coverage) cq (object *x system-attribute (AREA PERIMETER "#")))

Beliefs about the ArcInfo workspace used during a particular interactive session

In order to be able to build queries for ArcInfo, our intelligent agent needs to have some beliefs about the content and structure of the active ArcInfo workspace. A typical ArcInfo workspace is a directory that contains data sets (each forms a subdirectory) and a common ``info'' subdirectory. The ``info'' subdirectory stores all INFO files associated with all data sets in this workspace, as well as user-defined INFO files. In our prototype we deal only with the most common type of ArcInfo data set, a coverage. Each coverage represents geographic features of a particular kind, for a particular geographic area, for example, roads, land parcels, or vegetation classes. Geometric objects in a coverage (points, lines, and polygons) represent the location of geographic features, while attributes of objects in a database associated with the coverage represent properties of features.

One of the problems inherent in the management of data sets, such as coverages, is associating objects in coverages and their attributes with the features in the real world. Most often, this association is maintained by the owner of the information, who knows which real world features each data set stands for. The owner of the information knows, for example, that a coverage named STREAMS will contain streams, which are represented by lines (arcs). Attributes of arcs stand for certain properties of streams; for example, LENGTH represents the lengths of streams, DISCHARGE means annual discharge, and so on. The owner of the information also knows about other INFO and text files which are related to data sets. Since this knowledge largely resides with the user, it is very difficult to formalize.

In the previous version of this demo these data were directly programmed in the demo. However, for real workspaces containing multiple data sets and INFO files associated with them, this would require a huge effort and substantial knowledge of both ArcInfo and SNePS, which cannot be assumed of the large majority of users. Therefore, an automated procedure for describing a given ArcInfo workspace was developed. This procedure examines a workspace and writes a set of propositions about it in the SNePS language.

In this project we combine several sources to obtain the most complete possible knowledge about an ArcInfo workspace. These sources are:

1. ArcInfo documentation files

Until recently, ArcInfo did not provide an easy way to keep track of information about coverages in an automated manner. Executive Order 12906 of April 1994 "Coordinating Geographic Data Acquisition and Access: The National Spatial Data Infrastructure" [2] required all federal agencies to document their spatial data sets according to the guidelines established by the Federal Geographic Data Committee. To meet the meta-data (data about data) needs of federal agencies, Environmental Systems Research Institute (producers of ArcInfo software) distributed DOCUMENT, a meta-data management tool for ArcInfo users. DOCUMENT is used to create and update documentation files associated with ArcInfo data sets. Documentation files consist of four INFO (ArcInfo database module) files for every data set. Three of them are considered relevant to this project:

<data set>.DOC - general description of the data set; contains the theme (topic) of the data set and its short description.
<data set>.ATT - attribute description file; contains the feature classes of a data set and descriptions of their attributes.
<data set>.NAR - narrative file; contains additional information about the data set.

Among other things, documentation files contain information which is directly relevant to this project: names of data sets, their topics and brief descriptions, descriptions of feature classes and their attributes, etc. Taking into account the popularity of ArcInfo and emerging meta-data standards for spatial data sets, it seems beneficial to be able to extract essential information about a particular ArcInfo workspace from documentation files produced by DOCUMENT. This would allow us to automate the construction of the semantic network knowledge-base in which we represent knowledge about the workspace at SNePS-level, which would otherwise have to be constructed manually.

2. INFO files associated with feature classes (point, arc, polygon) of coverages

Every feature class in an ArcInfo coverage (point, arc, polygon, etc.) has an INFO file associated with it. The name of the file is composed of the name of the coverage and a standard extension: .PAT (point/polygon attribute table), .AAT (arc attribute table), etc. These files are examined to obtain information about attributes of the corresponding features in the coverage.

3. Related INFO files in the workspace

Often a workspace contains INFO files related to features in coverages. These data files, produced by users, can contain additional information which can be linked to features in coverages when needed. For example, the column (attribute) VEGCODE in the polygon attribute table of coverage VEGETATION (VEGETATION.PAT) can represent vegetation classes with integers 1 through 5 (to save space and increase performance). The related INFO file VEGCLASS.DAT will have the same attribute (to establish a relate) and other attributes. It may contain an attribute VEGCLASS with definitions of vegetation classes: ``forest'' for ``1'', ``woodland'' for ``2'', and so on.

A relate is a link maintained in the ArcInfo relational database between INFO files, most often between feature attribute tables and some data files. Every ArcInfo relate (link between INFO files) is stored as a record in a relate file. A relate file is an INFO file with particular attributes (columns). Since the structure of a relate file is always the same, these files can be found in the workspace and the names of related files can be obtained.

AML program to describe ArcInfo workspace

An AML (Arc Macro Language) program was created to analyze the content of a workspace and produce files containing the resulting SNePS network and workspace-specific dictionary. The program receives as arguments the path of an ArcInfo workspace and a path for remote copies of output files. The following algorithm outlines the way in which the ``arcinfo-workspace.aml'' program analyzes the workspace.

1. A predefined set of arc labels is written to the output network file.
2. A list of coverages in the workspace is created.
3. The workspace is searched for related files. Names of all discovered relate and related files are written to a list.
4. Every coverage is then processed in the following way.
   4.1. If documentation files for this coverage exist, they are examined. Coverage theme and description are extracted. All descriptions of attributes available in documentation for these INFO files are extracted. A list of the INFO files described is created.
   4.2. A list of feature attribute tables (INFO files associated with feature classes in the coverage) is created. If any of these INFO files is missing from the list created at step 4.1, it is added to it.
   4.3. Every INFO file in the list is described. Name, description (if available from documentation, or a default description for some types of INFO files), and feature class (if associated with a feature class) are extracted. Every attribute in this INFO file is then described in the following way: name of attribute, description (if available from documentation, or a default description for some types of attributes). If an attribute is character, all unique values of this attribute in this INFO file are described (values described by column). If this file is one of the relate and related files (discovered at step 3), it is described by record (values described by row).
5. If some of the relate and related files (discovered at step 3) were not described in step 4, step 4.3 is repeated for those files.

All names of coverages, INFO files and values of character attributes are added to a lexicon file when they are encountered.

The ``arcinfo-workspace.aml'' program was tested on several ArcInfo workspaces and was generally able to describe them (with different degrees of detail) regardless of the status of documentation (present or absent, more or less complete). A partial view of the resulting SNePS knowledge-base for one of the test workspaces is shown in Figure 1.
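The workspace-description procedure can be sketched in Python over an in-memory stand-in for a workspace. This is only an illustration under stated assumptions, not the AML program: the dictionary layout, the tuple-shaped "facts" (used here in place of SNePS propositions), and the names `describe_table` and `describe_workspace` are all hypothetical. The sketch keeps the ideas from the text: every table attribute is described, character attributes have their unique values listed by column, related files are additionally described by record, and every name and character value ends up in a lexicon.

```python
def describe_table(name, columns, docs, facts, lexicon, by_record=False):
    """Describe one INFO-like table given as {attribute: [one value per row]}."""
    lexicon.add(name)
    for attr, values in columns.items():
        facts.append(("attribute", name, attr, docs.get((name, attr), "undocumented")))
        # Only character attributes have their unique values listed (by column).
        if values and all(isinstance(v, str) for v in values):
            for v in sorted(set(values)):
                facts.append(("value", name, attr, v))
                lexicon.add(v)
    if by_record:  # related files are also described row by row
        for row in zip(*columns.values()):
            facts.append(("record", name, row))


def describe_workspace(coverages, related, docs):
    """Return (facts, lexicon) for coverages plus any related files not yet covered."""
    facts, lexicon, described = [], set(), set()
    for cov, tables in coverages.items():
        lexicon.add(cov)
        facts.append(("coverage", cov, docs.get(cov, "undocumented")))
        for name, columns in tables.items():
            describe_table(name, columns, docs, facts, lexicon, by_record=name in related)
            described.add(name)
    for name, columns in related.items():  # leftovers, as in the final step
        if name not in described:
            describe_table(name, columns, docs, facts, lexicon, by_record=True)
    return facts, lexicon


# Hypothetical workspace modelled on the VEGETATION / VEGCLASS.DAT example.
COVERAGES = {"VEGETATION": {"VEGETATION.PAT": {"VEGCODE": [1, 2]}}}
RELATED = {"VEGCLASS.DAT": {"VEGCODE": [1, 2], "VEGCLASS": ["forest", "woodland"]}}
DOCS = {"VEGETATION": "vegetation cover of the island"}

facts, lexicon = describe_workspace(COVERAGES, RELATED, DOCS)
```

In this toy run the row-wise facts for VEGCLASS.DAT are what tie the integer code 1 to the word "forest", which is the link a natural language query needs.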

Figure 1. Partial view of the SNePS knowledge-base automatically generated from the files in a particular workspace.

Connecting SNePS and ArcInfo

The current version of ArcInfo offers the possibility of running ArcInfo as a server, and provides a way to communicate with this server from an external application. We use this client-server mechanism to establish the connection between SNePS and ArcInfo. Figure 2 shows the structure of the communication layers of the system.
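As a rough illustration of this client-server arrangement, the following Python sketch simulates the exchange in a single process. It does not use ArcInfo's actual inter-application interface: `FakeArcServer`, `arc_connect` and `arc_query` are hypothetical names, and the "0" success status is an assumption. The conventions it imitates (a non-negative identifier on a successful connection, a reply made of a status followed by the result, and "ERROR!" on failure) follow the description of the C-level client in this section.

```python
class FakeArcServer:
    """Stands in for an ArcInfo server; maps procedure numbers to callables."""

    def __init__(self, procedures):
        self.procedures = procedures


_SERVERS = []  # registry, so that a connection can be named by a small integer


def arc_connect(server):
    """Register a server and return a non-negative id, or -1 if there is none."""
    if server is None:
        return -1
    _SERVERS.append(server)
    return len(_SERVERS) - 1


def arc_query(server_id, procedure, argument):
    """Run a numbered procedure with a string argument.

    The reply is '<status> <result>' on success and 'ERROR!' otherwise.
    """
    if server_id < 0 or server_id >= len(_SERVERS):
        return "ERROR!"
    handler = _SERVERS[server_id].procedures.get(procedure)
    if handler is None:
        return "ERROR!"
    return "0 " + handler(argument)
```

A client would call `arc_connect` once and then issue `arc_query` calls, checking each reply for "ERROR!" before using the result.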

Figure 2. Architecture of SNePS-ArcInfo client-server connection.

The following modules compose the C-level of the client (arcclient):

ARCCONNECT: This function connects the client to a server identified by the contents of the file connect.arc. If the connection is successful, it returns a non-negative server identification.

ARCQUERY: This function sends a request to execute a specified procedure number to a server, with a string as an argument for the procedure. An ArcInfo command can be evaluated or a request for a result processed. In the first case, the execution can continue immediately. In the second case, a response is expected from the server. The results of the request are concatenated to the status of the request and returned in the return string.

ARCCOMMAND: This function returns the contents of the container of the string returned by ARC, or "ERROR!" if the request to the ARC server fails.

At C-level, message passing features are included to acknowledge errors or receive something from the ArcInfo server. Client procedures, written in C, were attached to a set of primitive actions in SNePS, in such a way that the agent is able to perform the act of connecting and communicating with ArcInfo either by queries or simple requests.

Using this client-server mechanism, we developed a model of interaction between the user, the interface agent and ArcInfo. Part of this model, the interactions during the initial phase of the system's operation, is shown in Figure 3. During this phase the client-server connection between ArcInfo and SNePS is established; the user is asked for a workspace (as was stated earlier, for simplicity we assume that the whole database is within one ArcInfo workspace); the workspace is examined to filter out all relevant information, which is written to two SNePS source files, a network file and a lexicon file; the files are remotely copied to the machine running SNePS; and the network and lexicon files are loaded in SNePS to form the basis of the knowledge base of the intelligent agent.

Figure 3. Interaction between the user, the interface agent and ArcInfo during the initial phase.

How the Interface Agent Works

At this point our interface agent has some beliefs (knowledge) about the general organization of ArcInfo and about the content of the workspace in question. The usefulness of this interface agent depends on its ability to process various kinds of requests from users to ArcInfo. If we assume the request-response model of interaction between a user and ArcInfo, which is reasonable for novice users, then the task of the interface agent is to interpret the user's request, translate it into ArcInfo command(s), receive some results back from ArcInfo, and present them to the user. Depending on the type of query and response (display, text, listing of attributes, etc.), the last two steps may be omitted. The critical task (at least, at this stage) is to be able to convert the user's input into sequences of ArcInfo commands. In order to do this, for every type of user request, the agent must have plans - sequences of actions which are taken if their preconditions are satisfied. The next section covers some of the plans present in our interface agent.

Development of plans for some typical spatial queries to ArcInfo

Before any plans can be developed for our interface agent, we need to broadly identify the kinds of tasks which people attempt to do using GIS. What would be a typical query, what types of queries exist? This question definitely deserves dedicated research, which is beyond the scope of this paper. Mark and Gould emphasize that underlying the design of decision-support systems are questions such as "What can people do with computers?", and the even more fundamental question, "What do people do?". Answering the question "What do GIS users do?" for various sets of GIS tasks would be a prerequisite to the design of user interfaces [3, p. 1429]. In this paper we cannot attempt to cover all "various sets of GIS tasks"; rather we limit our task to developing plans suitable for some typical queries that users perform in ArcInfo.

To get an idea of the questions that users might try to solve utilizing ArcInfo, a questionnaire was distributed (Appendix). Based on a few responses received and on personal experience, sample questions (question is used interchangeably with request or query here) were formulated. The choice of questions was also limited by practical considerations: only questions that can be answered using our sample database were considered. These questions definitely reflect an approach based on personal experience with GIS and ArcInfo, as well as some biases in the use of GIS, database construction, etc. Given all these limitations, it is still hoped that these questions are adequate as a starting point.

We started with the basic questions of type "Where is...", "Show...", such that they refer directly to values of attributes of coverage features and most generally can be answered by showing the requested features graphically, possibly, in relation to other features. Here are two questions of this kind used in our demonstration.

Show all forests in the study area.
Where is the Moneron creek?

Assuming that these features are identified in coverage attribute tables or related files, queries of this kind can essentially be answered by retrieving information from INFO. For example, the question Where is the Moneron creek? could be responded to with the following set of commands in Arcplot:

    reselect {path to database} {streams} {arc} {name} = '{Moneron}'
    linecolor red
    arcs {path to database} {Streams}

where elements in { } are provided (through inference) by the interface agent. Here is the actual plan that the agent follows to carry out this task:

    If a value is related to an attribute and the value is related to a feature and the value is related to a coverage then a plan to locate the value of a type is to say "reselect " and then say the coverage and then say "space" and then say the feature and then say "space" and then say the attribute and then say " = '" and then say the value and then issue "quote" and then issue "linecolor red" and then say "arcs %.db%/" and then send the coverage.

where %.db% is an AML variable that contains the path to the database.

Other questions require some spatial processing, such as overlay or buffering.

What is the total area of woodlands?
What portion of the island is covered with woodlands?
What percent of area within 100 ft of streams is forested?

To this point only a few plans have been developed.
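The plan for "Where is ..." questions quoted in this section can be read as a guarded command template. The Python sketch below is one possible rendering, not the SNePS plan itself: the `beliefs` dictionary, the function name `locate_plan` and the default `db_var` argument are assumptions made for illustration. The precondition (the value is tied to an attribute, a feature class and a coverage) is modelled as a dictionary lookup, and the "say"/"issue" steps become string assembly.

```python
def locate_plan(beliefs, value, db_var="%.db%"):
    """Return Arcplot-style command strings, or None when the precondition fails.

    beliefs maps a value to a (coverage, feature class, attribute) triple.
    """
    match = beliefs.get(value)
    if match is None:  # the value is not related to anything the agent knows
        return None
    coverage, feature, attribute = match
    return [
        f"reselect {coverage} {feature} {attribute} = '{value}'",
        "linecolor red",
        f"arcs {db_var}/{coverage}",
    ]


# Hypothetical belief of the kind the workspace description would supply.
BELIEFS = {"Moneron": ("STREAMS", "arc", "NAME")}
commands = locate_plan(BELIEFS, "Moneron")
```

Returning None instead of a partial command list mirrors the rule that a plan is performed only when its preconditions are satisfied.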

Representations of plans such as the one for locating a feature are stored in a semantic network knowledge base (the system's knowledge base), and used to translate Natural Language queries, plan definitions, and requests to sequences of ArcInfo commands.

Natural Language Processing and Meaning negotiation

In order to process natural language queries, the intelligent agent uses four data sources: an analysis/generation grammar of a subset of English that represents the syntactical/semantical structures involved in the dialog between the user and the intelligent agent; a system's dictionary that contains descriptions of words that are not domain-specific (e.g.: the, of); the domain-specific dictionary automatically built from the ArcInfo workspace; and the system's beliefs stored in the system's knowledge base. These data sources are used by a parser in the analysis and generation of English sentences, plan definitions, and requests to ArcInfo.

We modified the current parser of SNePS in such a way that it can handle possible mismatches between the user's input and the representation of the ArcInfo workspace in the agent's knowledge base. These mismatches will be quite common for users who are not familiar with the content and/or structure of the ArcInfo database (i.e., the type of users this interface is developed for). In order to deal with this problem, our intelligent agent uses forward and reduction inference to infer, from the agent's representation of the ArcInfo workspace, possible matches between the user's and ArcInfo terms, and asks the user to select one of the possible alternatives (if they exist). The selected alternative is used for further processing of the query, and the new concept is added to the lexicon. The meaning negotiation process and the components involved in the processing of natural language queries are shown in Figure 4.

Figure 4. Structure of the meaning negotiation process

Testing the Interface Agent

The questions we used in developing our interface agent relate to the sample ArcInfo database. The database consists of five coverages for Moneron Island, located 50 km north of Hokkaido, Japan, in the north-west Pacific. The coverages are: SHORE - shore line, STREAMS - streams and rivers, TOPOGRAPHY - elevation contour lines, TRAILS - trails, and VEGETATION - vegetation cover of the island. All data sets were created from paper maps by one of the authors. Meta-data files were created using DOCUMENT in ArcInfo. Overall, these data sets are typical ArcInfo data sets and provide a simple but realistic basis for geographic problem solving using the ArcInfo geographic information system.

Figure 5 shows a sample run of the prototype. The top-right screen shows the activity of the ARC server, the bottom-right screen shows the SNePS agent and the left screen shows the answer to the request "find all woodlands" provided by accessing the ArcInfo workspace.
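A request such as "find all woodlands" can only be answered if the user's word can be tied to a term in the workspace. The meaning negotiation step described in this paper can be sketched in Python as follows. This is a simplified illustration, not the modified SNePS parser: inference is replaced by a caller-supplied `candidates_for` function, the user's selection by a `choose` callback, and all names are hypothetical.

```python
def negotiate(word, lexicon, candidates_for, choose):
    """Resolve a user's word to a workspace term, extending the lexicon.

    lexicon: dict mapping known words to workspace terms.
    candidates_for: stand-in for inference; returns possible workspace terms.
    choose: stand-in for the dialog with the user; picks one alternative.
    """
    if word in lexicon:  # already known, nothing to negotiate
        return lexicon[word]
    options = candidates_for(word)
    if not options:  # no alternatives exist, so the word stays unknown
        return None
    chosen = choose(word, options)
    lexicon[word] = chosen  # the new concept is remembered for later queries
    return chosen


# Hypothetical data: the user says "trees"; the workspace knows two classes.
HINTS = {"trees": ["forest", "woodland"]}
LEXICON = {"forest": "forest"}
asked = []


def pick_second(word, options):
    """Simulated user who always selects the second alternative."""
    asked.append(word)
    return options[1]


first = negotiate("trees", LEXICON, lambda w: HINTS.get(w, []), pick_second)
second = negotiate("trees", LEXICON, lambda w: HINTS.get(w, []), pick_second)
```

Because the chosen term is stored in the lexicon, the second request for the same word is resolved without asking the user again.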

Figure 5. Sample run of the interface agent - screen dump.

Many more plans are needed, and much more research on tasks carried out with GIS is required to define "typical questions" and provide automated answers to them.

Discussion and Conclusions

Different kinds of spatial problems exist, and expressing a problem in a human language may not necessarily be the easiest way to solve it. Different types of users exist, with different knowledge and skills. The system we are developing is mostly aimed at a particular type of user, the one without sufficient knowledge of ArcInfo and, possibly, without much GIS experience. Natural Language interfaces seem to be helpful for this type of user as long as they offer the possibility of expressing requests in the same way the users may express them to a human consultant.

The use of natural language interfaces does not exclude the possibility of using other alternatives if they are found useful in a particular situation during the user-computer interaction. Instead of ruling out a particular way of interaction, it might be interesting to test in which cases it might be useful. The ideal interface for a GIS system seems to be one in which the interaction is performed in a multimedia language that may include (among others) natural language, gestures (e.g. pointing to icons, menus and images), voice and text commands. In the future, we will consider connecting the interface agent and ArcTools, to provide the user with a unified interface in which Natural Language is one possible way of reaching the user's goals.

Interface agents such as the one proposed in this paper could be used for research in spatial cognition: how people think about spatial problems, and how useful linguistic devices can be for expressing their needs for processing spatial information. These research questions could be investigated with the help of tools like the one we are developing.

As it was research oriented, efficiency and speed were not considered important in the development of the prototype. As a result, our current prototype has a response time that might not be satisfactory for real time interaction. Response time can be improved if the system orientation changes from research to production.

Appendix. Information requested from ArcInfo users

We are doing research on interfaces to GIS, and we are interested in knowing about the types of problems GIS is most often used to solve, and particularly about the ways these problems are stated by the users. In other words, we are interested in the typical questions GIS users ask, if problems can be formulated as questions in plain English. Obviously, these questions vary from one application to another, but there may be generic features that are fairly common (for example, questions that imply operations on one dataset [query to coverage.PAT, buffering] versus questions that require operations on several datasets [overlay]). We would appreciate your help. To make it more specific, imagine that you have a study area, for which the following datasets are available:

1. Vegetation. Attributes: vegetation classes (forests, woodlands, shrubs, meadows)
2. Streams. Attributes: stream names (Cold, Cedar, etc., you can add some)
3. Trails. Attributes: trail names (use any names you want)
4. Shoreline. Attributes: none
5. Topography. Attributes: absolute elevation values (0 - 450 meters)

Given these data, what questions would you ask about the study area? TYPE and STRUCTURE of question are more important than any specifics; you can ask any questions and add more datasets if you need.

Examples of questions:

"Show all forests in the study area"
"Where is the Cold creek?"
"Show places higher than 400 m"
"At what altitude do trails cross Cedar stream?"
"List all areas with meadows that are steeper than 10 degrees"
"Find sections of trails which pass through woodlands on gentle (< 5 degrees) slopes"
"What percentage of area within 100 ft of streams is forested?"
and so on.

We would appreciate your questions, as well as any remarks/comments/etc. Thank you very much.

References

1. Campos D., Naumov A., and Schimpf W. "Building an Interface Agent for ArcInfo". Project report - CS642, Department of Computer Science, State University of New York at Buffalo, December 1995.
2. Executive Order 12906 of April 1994 "Coordinating Geographic Data Acquisition and Access: The National Spatial Data Infrastructure". The White House.
3. Mark D., and Gould M.D. Interacting with Geographic Information: A Commentary. Photogrammetric Engineering and Remote Sensing, 57, 11, 1991, pp. 1427-1430.
4. Shapiro S.C. "The SNePS semantic network processing system" in N.V. Findler, ed., Associative Networks: The Representation and Use of Knowledge by Computers. Academic Press, New York, 1987, pp. 179-203.
5. Shapiro S.C., Chalupsky H., & Chou H.C. "Connecting ArcInfo and SNACTor". Technical Report, Department of Computer Science, SUNYAB, Buffalo, NY, 1991.
6. Shapiro S.C., and Rapaport W.J. The SNePS Family. Computers & Mathematics with Applications 23, 2-5 (January-March, 1992), pp. 243-275. Reprinted in F. Lehmann, Ed., Semantic Networks in Artificial Intelligence. Pergamon Press, Oxford, 1992, pp. 243-275.
7. Shapiro S.C. and The SNePS Implementation Group. SNePS 2.3 User's Manual. Department of Computer Science, State University of New York at Buffalo, Buffalo, NY, August 29, 1995.

Daniel Campos D.
Department of Computer Science
State University of New York at Buffalo
226 Bell Hall, Buffalo, NY 14260-2000
E-mail: campos@cs.buffalo...

Aleksey Y. Naumov
State University of New York at Buffalo
105 Wilkeson Quadrangle
Buffalo, New York 14261-0023

Stuart C. Shapiro
Professor
Department of Computer Science
State University of New York at Buffalo
226 Bell Hall, Buffalo, NY 14260-2000

Acknowledgments

This research represents part of Research Initiative #10, "Spatio-Temporal Reasoning in GIS", of the National Center for Geographic Information and Analysis, supported by a grant from the National Science Foundation (SBR-88-10917); support by NSF is gratefully acknowledged.
