SPATIAL DATA ANALYSIS WITH SPATIAL STATISTICS AND GIS

Shuming Bao (1) and Luc Anselin (2)

(1) China Data Center, University of Michigan, 1080 S. University, Ann Arbor, MI 48109-1106. TEL: (734) 647-9610, FAX: (734) 764-5540; Email: sbao@umich.edu
(2) University of Texas at Dallas, Richardson, TX 75013-0688. TEL: (972) 883-2088, FAX: (972) 883-2735; Email: lanselin@utdallas.edu

ABSTRACT: The extension of the functional capacity of geographic information systems (GIS) with tools for exploratory spatial data analysis (ESDA) has been an increasingly active area of research in recent years. In this paper, two operational implementations that link spatial analysis software with a GIS are considered more closely. They consist of a linkage between the SpaceStat software for spatial data analysis and the ArcView GIS (based on so-called loose coupling), and the S-PLUS for ArcView (based on so-called close coupling). The emphasis is on the implementation of methods of exploratory spatial data analysis to describe spatial distributions, visualize spatial patterns, and assess the presence of spatial association. Conceptual and technical issues related to the implementation of these approaches are addressed and some ideas are formulated on future directions for linking ESDA and GIS.

I. INTRODUCTION

In this paper, we contrast two different approaches to linking a GIS and a spatial statistical module. One is based on a loose coupling strategy by means of an efficient interchange of input and output files between the SpaceStat spatial data analysis software [Anselin (1992, 1995a)] and the ArcView GIS [ESRI (1997)]. The other is designed as a seamless integration (close coupling) between the S-PLUS statistical computing environment [Mathsoft (1996a)] and the ArcView GIS. For each of these approaches we briefly describe the overall architecture, linkage mechanism and operational implementation. We close with a comparison of the relative merits of these approaches and some thoughts on future developments.

II. STRATEGIES IN LINKING SPACESTAT AND S-PLUS WITH ARCVIEW GIS

As one of the most popular GIS software packages, ArcView is primarily geared to the manipulation of spatial vector data. It has recently been extended with optional modules for the analysis of raster data (Spatial Analyst extension), network data (Network Analyst extension) and three-dimensional data (3-D Analyst extension). Although ArcView offers little spatial statistics functionality, the object-oriented Avenue scripting language supported by ArcView (Version 2.1 and higher) allows external modules or programs to be integrated into the ArcView environment by customizing the ArcView GUI with user-developed programs. Both the SpaceStat-ArcView Link and S-PLUS for ArcView are implemented by customizing the ArcView user interface (Menu, Button, Tools) and adding functions written in Avenue and C, and are deployed as extensions to ArcView GIS.

SpaceStat [Anselin (1992, 1995a)] is a DOS-based software package for the analysis of spatial data. It is written in the GAUSS programming environment [Aptech (1995)] (GAUSS is a PC-based product of Aptech Systems, Inc. of Kent, Washington) and is distributed with a GAUSS Runtime Module. SpaceStat includes a broad range of test statistics for both global and local spatial autocorrelation, as well as econometric estimation methods and specification tests for regression models that incorporate spatial dependence (spatial autoregressive models). In addition, SpaceStat has extensive capabilities to

Li, B., et al., (eds.) Geoinformatics and Socioinformatics The Proceedings of Geoinformatics'99 Conference Ann Arbor, 19-21 June, 1999, pp. 1-17

Copyright © 1999 The Association of Chinese Professionals in GIS - Abroad, 151 Hilgard Hall, University of California, Berkeley, CA 94720-3110, USA. All rights reserved. ISBN 0-9651441-3-5. Printed in Ann Arbor, Michigan.

construct, manipulate and analyze spatial weight matrices, an important tool in the analysis of spatial autocorrelation.

The linkage between ArcView and SpaceStat is based on a loose coupling approach by means of a bi-directional data transfer with ArcView as the visualization engine and SpaceStat as the spatial data analysis engine, as illustrated by the data flow chart in Figure 1. Its main objective is to provide an efficient way to display the results of spatial statistical analyses by means of the GIS and to obtain locational information for use in the statistical analysis from the GIS. The SpaceStat-ArcView Link is characterized by the following features: (1) a focus on exploratory spatial data analysis (ESDA) of lattice data, (2) a division of labor between the ArcView GIS used for the visualization of the statistical results and SpaceStat used for the statistical computation, and (3) an implementation targeted at PC platforms and Windows environments. The linkage is static and indirect in the sense that all the SpaceStat commands have to be issued within SpaceStat and cannot be called from within ArcView (and vice versa). The functions added to ArcView fall into two categories: (1) data output to file formats compatible with SpaceStat, and (2) joining and mapping of SpaceStat output (report files) in a View window.

S-PLUS (MathSoft 1997a), a comprehensive statistical software package with over 2,000 functions, provides powerful capabilities for graphical data analysis and statistical modeling. The added module S+SpatialStats (MathSoft 1996b) provides analytical functionality to handle geostatistical data, lattice data, and point data, such as variogram, co-variogram, kriging, Moran's I statistic, Geary's C, local spatial association, and the estimation of spatial regression models.

The interface between ArcView and S-PLUS is based on a close coupling approach by means of a bi-directional linkage. The ArcView GIS serves as the visualization engine, while S-PLUS is used for spatial data analysis. The main objective of the interface is to provide a comprehensive and efficient tool for spatial data analysis that can be accessed from within a GIS environment. S-PLUS for ArcView is characterized by the following features: (1) a seamless integration of S-PLUS with ArcView, (2) access to the full range of S-PLUS functions, and (3) a graphical user interface that hides the full complexity of the linkage mechanisms.

S-PLUS for ArcView is implemented as an "extension" to the ArcView GIS software. It is implemented primarily in the ArcView environment, by means of a set of Avenue programs that call special purpose functions included in a DLL (Dynamic Link Library) and that are associated with menu items [see Bao and Martin (1997)]. Once the S-PLUS extension is loaded into ArcView, the S-PLUS window is launched and a connection between S-PLUS and ArcView is established. The customary user interface is augmented with two new menus and user interaction is carried out by means of a set of dialogs, constructed using ESRI's ArcView Dialog Designer [ESRI (1997)]. Each dialog invokes a number of Avenue scripts and C programs contained in a DLL.

The linkage between S-PLUS and ArcView is dynamic and bi-directional. Data transfer occurs primarily between ArcView shapefiles and S-PLUS data frame objects. Within ArcView, selected fields and records are extracted from a shapefile and exported to S-PLUS as a data frame object. Conversely, data from an S-PLUS data frame object can be imported into ArcView as a generic table and joined with a selected theme's attribute table. Spatial weight matrices are constructed using scripts that exploit the geo-locational information from ArcView shapefiles and are saved in S-PLUS as spatial neighbor objects, which can then be used in spatial statistical analyses such as spatial autocorrelation and spatial regression. Similarly, results are saved in S-PLUS as data frame objects that can be imported into ArcView for visualization. Summary reports from S-PLUS analyses are output to ASCII text files and displayed in Notepad, a standard text editor of Microsoft Windows.

Finally, S-PLUS graphs are incorporated into layouts as external graphic files saved in PostScript format, for output with other ArcView elements in the usual fashion.

III. OPERATIONAL ISSUES ON THE SPACESTAT-ARCVIEW LINK AND S-PLUS FOR ARCVIEW

Data Transfer and Conversion

Since SpaceStat currently still uses the DOS version of GAUSS and ArcView runs under Windows, no direct communication can be established between the two programs. Although the recent MS Windows platforms (Windows 95 and Windows NT) allow SpaceStat to run in a multi-tasking environment with ArcView in a separate window, there is no direct mechanism to call internal SpaceStat functions from the ArcView environment. Data are passed between SpaceStat and ArcView using auxiliary files with standardized file names and data formats. Specifically, spatial information is moved from ArcView to SpaceStat for analysis, and location-specific results are passed back from SpaceStat to ArcView for visualization. Location-specific results include computed spatially transformed variables (such as spatially lagged variables to construct spatial bar charts or spatial pie charts), outliers (to be visualized in a box plot or box map), a Moran scatterplot, and local indicators of spatial association such as the Local Moran and Gi statistics (Anselin and Bao 1996, 1997). All data transfers are via intermediate ASCII (text) files with formats designed to be compatible with both SpaceStat and ArcView.

In S-PLUS for ArcView, data and commands are passed between the two environments using an automation technique. Specifically, attributes and spatial information, along with the S-PLUS commands for analysis, are moved from ArcView to S-PLUS, and location-specific results are passed back to ArcView for visualization. Location-specific results, including estimated values from kriging, spatial regression, and local spatial statistics, can be integrated with ArcView tables.

To export data from ArcView to S-PLUS, users can select a subset of records from the attribute table associated with the current theme or coverage. The selected fields and records are extracted from the shape files, and transferred to S-PLUS as a data frame object. Data exported from ArcView to S-PLUS are packaged into a SafeArray, which can then be sent to S-PLUS via Automation. Once S-PLUS receives it, it writes it out to the current S-PLUS working directory. All data exported to S-PLUS are stored as S-PLUS objects (data frames).

To import data from S-PLUS, users can select one or more columns from a given S-PLUS data frame. The selected data are imported into ArcView as a new table: the data imported from S-PLUS are first exported to a text file, which is then added into ArcView as a new table and linked with the current ArcView attribute table. Conversely, an S-PLUS data frame containing X-Y coordinates may be imported into ArcView as a new point theme. For image data, S-PLUS for ArcView allows users to manually export raster format grid files into S-PLUS.

Spatial Data via Non-spatial Attribute Data

For spatial statistics, a fundamental element is the spatial weight, a measurement of spatial linkages or proximity of observations. The spatial weight matrices represent the strength of the potential interaction between locations. The spatial weight can be defined as a matrix, denoted by $W$ ($n \times n$). A general spatial weight matrix can be defined by a symmetric binary contiguity matrix, which can be generated from topological information from geo-locational data by using adjacency or distance criteria:

Adjacency criterion:

$$w_{ij} = \begin{cases} 1 & \text{if location } j \text{ is adjacent to } i, \\ 0 & \text{if location } j \text{ is not adjacent to } i. \end{cases}$$

Distance criterion:

$$w_{ij}(d) = \begin{cases} 1 & \text{if location } j \text{ is within distance } d \text{ from } i, \\ 0 & \text{otherwise.} \end{cases}$$

Figure 2 is an example of how the spatial contiguity weight matrix is constructed using the adjacency criterion. WEIGHTA is a binary variable that represents a neighborhood relationship between locations. For ease of interpretation, another general spatial weight matrix is defined in row standardized form, in which the row elements sum to one (see WEIGHTB in the table). The resulting spatial weight matrix is asymmetric.

Since the contiguity matrix cannot differentiate the strength of spatial linkages between adjacent locations, some more complex spatial weight matrices are proposed for more precise measurement of spatial linkages. Cliff and Ord (1981) suggested a combination of distance measure and the relative length of the border between spatial units, which is defined as $w_{ij} = (d_{ij})^{-a} (\beta_{ij})^{b}$, with $d_{ij}$ as the distance between location $i$ and $j$, $\beta_{ij}$ as the proportion of the interior boundary of location $i$ which is in contact with location $j$, and $a$ and $b$ as parameters. Dacey (1968) defined the spatial weight matrix by taking into account the relative area of the spatial units: $w_{ij} = d_{ij} \, \alpha_i \, \beta_{ij}$, with $d_{ij}$ as a binary contiguity factor, $\alpha_i$ as the share of unit $i$ in the total area of all space in the study, and $\beta_{ij}$ as the boundary measure used above. Bodson and Peeters (1975) introduced a general accessibility weight by combining the influence of several channels of communication between spatial units into a logistic function: $w_{ij} = \sum_j k_j \{ a / [1 + b \exp(-c_j d_{ij})] \}$, with $k_j$ as the relative importance of the means of communication $j$ (such as roads, railways and other communication links), $d_{ij}$ as the distance between unit $i$ and $j$, and $a$, $b$ and $c_j$ as parameters, which need to be estimated.

The spatial weights can be defined by spatial neighborhood or spatial distance based on spatial information. Examples of spatial information are coordinates (such as the X and Y coordinates of the centroid of a polygon) and topological information on the spatial arrangement of selected points or areal units (such as spatial neighbor contiguity). Although the spatial information is usually not transparent to users, it can be extracted from the shape files by programs in Avenue scripts. The units of X and Y coordinates depend on the current projection of the selected data layer. A number of procedures to construct spatial weight matrices using the topological information given by various GIS systems have been suggested (Anselin et al., 1992; Can, 1992; Ding and Fotheringham, 1992).

In the SpaceStat-ArcView Link, a new function, programmed in Avenue, has been added to the Data Menu to allow users to add the X and Y coordinates (of points or centroids of polygons) into the attribute table. Those X and Y coordinates can then be exported along with other selected variables to an external ASCII data file in the format consistent with SpaceStat. In Shape files, the topological relationship of polygons provides information for spatial neighborhood. Several added functions in the SpaceStat-ArcView Link have been provided for defining spatial weights, which include Rook Weights from Shape File and Queen Weights from Shape File. Similarly, for point data, the spatial weights can be defined by using X and Y coordinates exported to the SpaceStat dataset.
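The distance criterion and the row-standardized form described above can be sketched in a few lines of Python. This is a minimal illustration and not the SpaceStat or Avenue implementation: the function names `distance_weights` and `row_standardize` are hypothetical, "within distance d" is assumed to mean less than or equal to d, and the diagonal is assumed to be zero.

```python
import math

def distance_weights(coords, d):
    """Binary distance-criterion weights from X-Y coordinates.

    w[i][j] = 1 if location j lies within distance d of location i
    (assumed here to mean <= d), 0 otherwise; the diagonal is set to 0.
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= d:
                w[i][j] = 1.0
    return w

def row_standardize(w):
    """Scale each row so its elements sum to one.

    Rows without any neighbor are left as all zeros (an assumption).
    """
    out = []
    for row in w:
        total = sum(row)
        out.append([v / total if total > 0 else 0.0 for v in row])
    return out

# Four points on a line; the last one is isolated at this distance band.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (5.0, 0.0)]
W = distance_weights(coords, d=1.5)
W_std = row_standardize(W)
```

In this small example the binary matrix `W` is symmetric, while `W_std` is not: the first point has one neighbor and the second has two, so `W_std[0][1]` is 1.0 but `W_std[1][0]` is 0.5. This is the asymmetry of the row-standardized form noted in the text.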

With X and Y coordinates information, SpaceStat can calculate distance-based weights matrices (including great circle distance). Based on the binary spatial weights created by the SpaceStat-ArcView Link, users can use SpaceStat to construct more complicated spatial weights, for example through standardization and manipulation of spatial weights matrices. SpaceStat also provides computations for roots and other characteristics of the weights matrix. Appendix A (Bao 1997b) gives an example of how the spatial weights can be created by using topological information (spatial neighborhood) from Shape files.

S+SpatialStats, an added module to S-PLUS, provides analytical functions for geostatistical data, lattice data, and point data. The data transferred to S-PLUS are stored as data frame objects. To enable S-PLUS objects to be linked to the ArcView attribute tables, an identity variable is necessary for each joined table. Similar to the SpaceStat-ArcView Link, S-PLUS for ArcView provides an option for users to include X and Y coordinates of points or centroids of polygons when they transfer attribute data to S-PLUS from ArcView.

In S-PLUS/S+SpatialStats, the spatial weights for irregular and regular lattices are stored in an object structure called spatial.neighbor. Since S-PLUS doesn't have direct access to geographical data such as ARC/INFO files or shapefiles, users have to define spatial weights externally and save the spatial neighborhood information in a text file, which can then be imported into S-PLUS and converted into a spatial neighbor object (spatial.neighbor). The Spatial Neighbor function in S-PLUS for ArcView provides an easy tool for constructing spatial neighbor objects. In S-PLUS for ArcView, users can use the Spatial Neighbor function (an added menu) to construct binary spatial neighbor objects for spatial statistics and modeling by means of Avenue's buffering and spatial query functions for ArcView shapefile data. The spatial weights can be defined by using the topological information (adjacency criteria). Several options are provided in calculating the spatial weights, such as First Order Neighbor Weights, Adjusted First Order Neighbor Weights (a combination of adjacency and distance criteria), and Higher Order Neighbor Weights.

First Order Neighbor Weights constructs a binary spatial neighbor object based on the adjacency of spatial units. The spatial weight element (w[i, j]) of the spatial weight matrix W is one if polygon j is adjacent to polygon i, and zero otherwise.

Adjusted First Order constructs a spatial neighbor weight based not only on the topological relationship but also on the geo-spatial distance between spatial units. A spatial weight is first defined by using the adjacency criteria. Then, an average (centroid-to-centroid) distance between the neighbor polygons and the polygon (i) is calculated. Any polygon beyond the defined neighbor polygons of a polygon (i) will be included as a neighbor polygon if its spatial distance (centroid-to-centroid) to the polygon (i) is less than the average distance. The element (w[i, j]) of the weight matrix W is 1 if polygon j is adjacent to polygon i under the above criteria, and 0 otherwise.

Higher Order Neighbor Weights are constructed by using similar criteria. The second order spatial weight matrix is based on the first order spatial weight, and the nth order spatial weight matrix is based on the (n-1)th order spatial weight. The element (w[i, j]) of the nth order weight matrix W is 1 if polygon j is adjacent to the neighbors of order n-1 of polygon i, and 0 otherwise.

The Spatial Neighbor function also provides several distance-based methods for both point and polygon shapefiles. The user can specify distance values, using border-to-border or centroid-to-centroid measurement options. The distance units are specified in the ArcView properties dialog. Those simple binary weights can then be transformed to a more complicated spatial weight by assigning different weights in S-PLUS.
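The Adjusted First Order and Higher Order rules described above can be sketched as follows. This is an illustrative reading of the prose and not the Avenue code of S-PLUS for ArcView: the function names are hypothetical, neighbors are held in plain Python dictionaries instead of spatial.neighbor objects, and the higher order rule is assumed to exclude the polygon itself and any polygon already reached at a lower order.

```python
import math

def adjusted_first_order(first_order, centroids):
    """Add non-adjacent polygons lying closer than the average neighbor distance.

    first_order: dict mapping polygon id -> set of adjacent polygon ids.
    centroids:   dict mapping polygon id -> (x, y) centroid.
    """
    result = {}
    for i, nbrs in first_order.items():
        adjusted = set(nbrs)
        if nbrs:
            avg = sum(math.dist(centroids[i], centroids[j]) for j in nbrs) / len(nbrs)
            for j in centroids:
                if j != i and j not in nbrs and math.dist(centroids[i], centroids[j]) < avg:
                    adjusted.add(j)
        result[i] = adjusted
    return result

def higher_order_neighbors(first_order, order):
    """Neighbors of order n: polygons adjacent to the order n-1 neighbors.

    Assumption: polygon i itself and lower-order neighbors are excluded.
    """
    result = {}
    for i in first_order:
        reached = {i}
        frontier = {i}
        for _ in range(order):
            step = set()
            for k in frontier:
                step |= first_order[k]
            frontier = step - reached
            reached |= frontier
        result[i] = frontier
    return result

# Polygon "a" touches "b" (distance 1) and "c" (distance 5); "d" is not
# adjacent to "a" but lies at distance 2, below the average of 3.
adjacency = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}, "d": set()}
centroids = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 5.0), "d": (-2.0, 0.0)}
adjusted = adjusted_first_order(adjacency, centroids)

# A chain of four polygons a-b-c-d for the higher order rule.
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
second = higher_order_neighbors(chain, 2)
third = higher_order_neighbors(chain, 3)
```

In the first example, "d" becomes a neighbor of "a" while "a" does not become a neighbor of "d", since "d" has no adjacent polygons from which to compute an average distance. The adjusted weights need not be symmetric under this reading.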

Functions of Spatial Statistics

The current capabilities of spatial statistics in SpaceStat include: descriptive statistics such as mean, variance, standard deviation, quartiles, interquartile distance, third and fourth moment, test for normality, and correlation coefficients; statistics for spatial autocorrelation and spatial association such as join count statistics, Moran's I and Geary's C for global autocorrelation, Hubert-Golledge QAP statistics (implemented for the generic case, as well as for Moran, Geary and absolute differences), Wartenberg multivariate spatial autocorrelation, Moran's I scatterplot, Getis-Ord G and Gi statistics (for local spatial associations), Local Morans and Gearys (LISA), and GLISA (Bao and Henry, 1996) for local spatial associations; and regression methods, namely least squares regression with diagnostics for spatial dependence (including Moran's I, Lagrange Multiplier tests, Kelejian-Robinson test), maximum likelihood estimation of regression models with spatial autoregressive errors and spatial autoregressive dependent variable (with diagnostics), GMM estimation of models with spatial autoregressive dependent variable, bootstrap estimation of models with spatial autoregressive dependent variable, robust least squares regression (Jackknife), and two-step and ML estimation of heteroskedastic models (groupwise heteroskedasticity, random coefficients).

S-PLUS for ArcView provides the following functions of spatial statistics for geostatistical data, lattice data (polygon data), and point data. The available functions for geostatistical data include contour plots, 3-D point clouds, variogram plots and boxplots, empirical variogram estimation including robust methods, directional variograms and correlograms for exploring anisotropy, variogram models including spherical and exponential, ordinary and universal kriging, kriging prediction at arbitrary locations with standard errors, and parametric and nonparametric trend surfaces. The functions for lattice data include "Binning" of high density data into a regular lattice of counts, Geary and Moran spatial autocorrelation coefficients, spatial regression models including conditional and simultaneous autoregressive models, and nearest neighbor search. The functions for point data include point maps that include region boundaries, spatial randomness tests, Ripley's K-functions, simulation of spatial random processes, and local intensity estimation. With the built-in menu functions in S-PLUS for ArcView, users can conduct various spatial analyses from the ArcView environment without switching to S-PLUS, such as spatial autocorrelation, local spatial associations and spatial linear regressions.

Visualization of Spatial Data

One advantage of linking spatial statistics with GIS is that users can visualize spatial data in different ways. These include: visualizing the spatial distribution of data in GIS maps before further statistical analyses and modeling; visualizing spatial data in various statistical graphics such as the histogram and the empirical variogram, which may reveal more of the underlying nature of spatial data; and visualizing the results from spatial statistics and models in GIS maps, which can be powerful in exploratory spatial data analyses.

In the SpaceStat-ArcView Link, the functions implemented for exploratory spatial data analyses are organized into three groups: (1) visualization of the spatial distribution of the data, (2) visualization of spatial autocorrelation in attribute variables, and (3) visualization of local spatial association. These functions are each associated with specific output files generated by the corresponding SpaceStat commands [see Anselin (1994, 1995b, 1997) for more technical details].

The first group of functions for spatial data visualization are simple descriptive statistics: Histogram, Box Plot and Box Map. The Histogram is implemented as a standard bar chart for the current selected feature displayed in the View window, following an "Equal Interval" classification [note that in ArcView version 3.0a histogram is included as a standard feature].
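Two of the global statistics listed in this section, Moran's I and Geary's C, can be written out compactly. The sketch below uses the standard textbook forms (see Cliff and Ord 1981) and is not taken from SpaceStat or S+SpatialStats. The function names `morans_i` and `gearys_c` are hypothetical, the weights are passed as a plain list of lists, and no inference (normal approximation or permutation) is included.

```python
def morans_i(x, w):
    """Global Moran's I: (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2."""
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]                      # deviations from the mean
    s0 = sum(sum(row) for row in w)                # sum of all weights
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(v * v for v in z)

def gearys_c(x, w):
    """Global Geary's C: ((n-1) / (2 S0)) * sum_ij w_ij (x_i - x_j)^2 / sum_i z_i^2."""
    n = len(x)
    mean = sum(x) / n
    s0 = sum(sum(row) for row in w)
    sq_diff = sum(w[i][j] * (x[i] - x[j]) ** 2 for i in range(n) for j in range(n))
    return ((n - 1) / (2 * s0)) * sq_diff / sum((v - mean) ** 2 for v in x)

# Four locations on a line with binary contiguity weights.
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
x = [1.0, 2.0, 3.0, 4.0]
I = morans_i(x, w)
C = gearys_c(x, w)
```

For this smoothly increasing variable, I is positive (1/3) and C is below one (0.3); both point to positive spatial autocorrelation.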

The Box Plot is implemented as a quartile map using the data from the quartile report generated by SpaceStat. A "graphic" box plot is also added to the View, which includes the upper quartile, lower quartile, median, outliers, and mean. The Box Map is a quartile map augmented with outlier indicators generated from the SpaceStat box map Report File (boxmap.txt). In Anselin and Smirnov (1997), these simple descriptions of spatial distributions are implemented as fully dynamically linked windows by means of external DLL functions instead of Avenue scripts. In this link, the role of the Avenue scripts is limited to providing a shell for special-purpose functions included in a DLL (Dynamic Link Library), resulting in increased speed and flexibility.

The second group is derived from spatial transformations in SpaceStat. The commands include the Spatial Lag Bar Chart and Spatial Lag Pie Chart. Using a Report File (sptran.txt) from SpaceStat that contains the spatial lags for the variables of interest, both functions create ArcView spot symbols for a graphic representing respectively a pie chart or bar chart for all selected polygons. These functions are especially useful for comparing analytical results, such as fitted values and residuals from classical or spatial regression analyses.

The third group visualizes the results of local indicators for spatial autocorrelation computed in SpaceStat. These include a Moran Scatterplot and Map, LISA Local Moran Map and G-Stat Map. Each of these functions requires the input of a SpaceStat Report File with a fixed file name prefix (such as MS_, LM_, or GI_) followed by the name of the spatial weights file for which the spatial statistics were constructed.

In addition to those functional menus, a number of tool buttons are implemented for the identification of a dynamic linkage between the maps, tables and charts in different application windows. Once those tools are activated, a rudimentary form of dynamic linking is established between the selected spatial units in different "views" of the data [see Anselin and Smirnov (1997) for details].

S-PLUS for ArcView provides direct access to S-PLUS objects and statistical graphs. In addition, variables from S-PLUS data frame objects may be displayed in an ArcView map using the Color Classification and Spatial Bar/Pie Chart options. S-PLUS has a wide variety of editable graphics plot types. The rich graphical features of S-PLUS can be accessed with the Import Graph function. With the added menus to the ArcView GUI (Graphical User Interface), users can easily generate the statistical graph by using the point-and-click interface. Those statistical graphs include empirical variogram, histograms, Box Plots, Bar Plots, Pie Charts, Scatter Plot Matrix, QQ Quartile Plots (QQ Normal Plot with Line, QQ Plot without Line), and various 2D and 3D plots such as Scatter-line plots (Smoothing Spline, Robust, Regression, Polynomial Fit, and Exponential Fit), Coarse Surface, Grid Surface, Image, and Contour Plot. Those S-PLUS graphs can be imported into an ArcView layout directly and combined with ArcView maps and charts for output.

Customization and Extendibility

Both the SpaceStat-ArcView Link and S-PLUS for ArcView are implemented primarily in the ArcView environment, by means of customized menus associated with Avenue programs and C programs. They are distributed as extensions to ArcView. Several new menus will be added to the ArcView GUI after those extensions are loaded.

In the SpaceStat-ArcView Link, two additional menus and a few extra buttons and tools have been added to the standard View window: a Data menu and a SpaceStat menu. The Data menu (Figure 3) consists of nine commands divided into three categories: (1) the auxiliary manipulation of spatial information such as adding the X-Y centroid coordinates of polygons and constructing an indicator variable for selected locations, (2) the construction of spatial boundary files based on the information in an ArcView Shape file, and (3) the data transfer between ArcView and SpaceStat, such as exporting selected attribute data from ArcView to SpaceStat, and importing and joining output from the SpaceStat report files into ArcView.
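The Box Map described earlier in this section, a quartile map with outliers flagged separately, can be illustrated with a short classification routine. This is a sketch under stated assumptions and does not reproduce how SpaceStat writes its boxmap.txt report: the function name `box_map_classes` is hypothetical, outliers are assumed to be values more than 1.5 times the interquartile range beyond the quartiles (the usual box plot convention), and quartiles are computed with Python's `statistics.quantiles` using the inclusive method.

```python
import statistics

def box_map_classes(values, hinge=1.5):
    """Assign each value to a quartile class or an outlier class.

    hinge: multiple of the interquartile range used for the outlier
    fences (1.5 is assumed here).
    """
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - hinge * iqr
    high_fence = q3 + hinge * iqr
    classes = []
    for v in values:
        if v < low_fence:
            classes.append("lower outlier")
        elif v > high_fence:
            classes.append("upper outlier")
        elif v <= q1:
            classes.append("Q1")
        elif v <= q2:
            classes.append("Q2")
        elif v <= q3:
            classes.append("Q3")
        else:
            classes.append("Q4")
    return classes

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
classes = box_map_classes(values)
```

Each class would then be mapped to one legend color in the View. In the example the value 100 falls above the upper fence and is flagged as an upper outlier, while the remaining values are spread over the four quartile classes.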

The SpaceStat menu (Figure 4) consists of eleven commands divided into five categories:

(1) Box Map and Percentile Map. Box Map creates a new View with a quartile map for a selected variable with the outliers highlighted (a box map). Percentile Map creates a new View with a percentile map for a selected variable.

(2) Spatial Lag Bar Chart and Pie Chart. The Bar Chart creates a new View with a bar chart map showing the value of a selected variable and its spatial lag (can also be used for any spatial smoother computed in SpaceStat). Spatial Lag Pie Chart creates a new View with a pie chart map showing the value of a selected variable and its spatial lag.

(3) Spatial Smoother creates a new View with a quintile map for the spatially smoothed values of a selected variable. The smoother may be a spatial lag, window average, or any of the rate smoothers computed by SpaceStat. Moran Scatterplot Map creates a new View with a unique value map with four colors corresponding to the four quadrants of the Moran Scatterplot of a selected variable.

(4) LISA Local Moran Map creates a new View with a unique value map for those locations with a significant Local Moran statistic. G-Stat Map is similar to LISA Local Moran Map but uses the Gi or Gi* statistic. Moran Significance Map creates a new View with a combination of a Moran Scatterplot Map and a Local Moran map, showing the quadrant of the Moran Scatterplot only for those locations with a significant Local Moran statistic.

(5) Residual Map creates a new View with a standard deviational map for the residuals of any spatial regression in SpaceStat. Predicted Map creates a new View with a bar chart map showing the observed and predicted values for any spatial regression in SpaceStat.

S-PLUS for ArcView has two added menus: the S-PLUS Menu and the Spatial Stats Menu. The S-PLUS menu (Figure 5) consists of eleven functions divided into five categories: (1) data transfer between ArcView and S-PLUS, (2) auxiliary manipulation of spatial information, (3) spatial data visualization, (4) linear regression, and (5) executing S-PLUS commands from ArcView. The Linear Regression function allows users to build a regression equation using variables from either the current ArcView theme or an S-PLUS data frame. A summary report of the regression is saved in a text file, and predicted values and residuals are saved in an S-PLUS data frame object, which can easily be joined to the current theme table.

The Spatial Statistics menu contains four functions: (1) Spatial Neighbor, (2) Spatial Autocorrelation, (3) Spatial Association, and (4) Spatial Regression [see Bao and Martin (1997) and MathSoft (1998) for technical details]. The Spatial Statistics menu provides direct access to S-PLUS/S+SpatialStats functions such as spatial autocorrelation (Moran's I and Geary's C), spatial association (Generalized Local Indicators of Spatial Association - GLISA), and spatial linear regression. The variables can be selected from either an ArcView theme table or an S-PLUS data frame object, but a spatial neighbor object must have been predefined and be consistent with the selected variables for those spatial statistics. Spatial association options include General Local Moran and Local Geary by Bao and Henry (1996), which are derived from the LISA statistic by Anselin (1995). Spatial linear regression includes three types of spatial error models (Cressie 1993): SAR - Simultaneous Autoregressive Model (Whittle 1954), CAR - Conditional Autoregressive Model (Bartlett 1971, Besag 1974), and MA - Moving Average Model (Cliff and Ord 1981). The summarized results are output to text files and the estimates are saved in S-PLUS objects that can then be joined with an ArcView theme table for map visualization.
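The Moran Scatterplot Map and LISA Local Moran Map in the SpaceStat menu described above both rest on the same two quantities: a standardized variable and its spatial lag. The sketch below shows one common way to compute them, following the Local Moran form of Anselin (1995) with a row-standardized weights matrix. It is an illustration and not the SpaceStat implementation: the function name `moran_scatterplot` and the quadrant labels are hypothetical, values are standardized with the population standard deviation, a value of exactly zero is treated as "high", and the significance test needed for the Moran Significance Map is omitted.

```python
import math

def moran_scatterplot(x, w_std):
    """Return (quadrants, local_moran) for variable x.

    w_std: row-standardized weights as a list of lists.
    Quadrants: HH and LL indicate positive local association,
    HL and LH indicate negative local association.
    """
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    z = [(v - mean) / sd for v in x]
    # Spatial lag: weighted average of the neighbors' standardized values.
    lag = [sum(w_std[i][j] * z[j] for j in range(n)) for i in range(n)]
    quadrants = []
    for zi, li in zip(z, lag):
        own = "H" if zi >= 0 else "L"
        nbr = "H" if li >= 0 else "L"
        quadrants.append(own + nbr)
    local_moran = [zi * li for zi, li in zip(z, lag)]
    return quadrants, local_moran

# Four locations on a line, row-standardized contiguity weights.
w_std = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
x = [1.0, 2.0, 3.0, 4.0]
quadrants, local_moran = moran_scatterplot(x, w_std)
```

With standardized values and row-standardized weights, the average of the local statistics equals the global Moran's I for the same weights, which is also the slope of the regression line in the Moran scatterplot. A four-color unique value map follows directly from `quadrants`.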

Finally, the Execute S-PLUS Commands feature provides ready access to all S-PLUS commands from the ArcView environment. All S-PLUS objects are listed on this dialog window, and users can type in S-PLUS commands on the command line, or make use of several auxiliary button functions.

For users to customize their own GUI menus, S-PLUS for ArcView provides a new routine, SPLUS.Evaluate, for Avenue scripts. SPLUS.Evaluate is a script that ArcView users can call from their own Avenue scripts, allowing programmatic access to the rich S-PLUS language of over 2,000 functions. Users can create ArcView dialogs for their specialized application that fire off user-developed analyses and create technically sophisticated graphs in S-PLUS. This allows analysts to develop their own specially customized dialogs and analyses for deployment to others. Appendix B is an example of an Avenue program, which demonstrates how users can apply the SPLUS.Evaluate procedure to create their new menu function as an extension to the existing S-PLUS for ArcView GUI. By applying this sample Avenue script, users can call the variogram function of S-PLUS from ArcView directly. The empirical variogram is plotted in S-PLUS as a graphsheet that can then be imported into an ArcView Layout for output together with other charts and maps in ArcView. The calculated results will be added to an ArcView attribute table.

IV. STRATEGIES FOR FUTURE DEVELOPMENT

The linkages between spatial statistical functionality and a GIS outlined in this paper illustrate some important concepts. In order to establish an effective linkage, it is necessary to develop efficient formats and data structures to enable a bi-directional data exchange between the GIS and the statistical software. These data structures must respect the complexities incorporated in spatial data, such as location, projection and topology. This can be implemented in a fairly simple manner, with obvious limitations, as with the text file formats used in the SpaceStat-ArcView Link. Alternatively, a more elaborate strategy can be pursued, as in the geospatial objects implemented in S-PLUS for ArcView.

An altogether different issue pertains to the types of statistical techniques that are most effectively included in an integrated framework. Clearly, the temptation will exist to use any technique that is available, even though many/most standard statistical approaches (such as classical linear regression) become inappropriate in the presence of spatial autocorrelation, which is predominant in the spatial data sets manipulated by GIS. Instead of linking a comprehensive statistical (or spatial statistical) module with the GIS as a single piece of software, it may perhaps be more effective to implement selected methods in small self-contained software applets that can be invoked from within the GIS. Clearly, the reverse strategy is promising as well: to implement small applets incorporating GIS functionality.

Acknowledgments

Research on the SpaceStat-ArcView Link reported in this paper was supported in part by Grant SBR-9410612 from the U.S. National Science Foundation and carried out while Shuming Bao was a Visiting Scholar at the Regional Research Institute, West Virginia University, from 1996 to 1997. S-PLUS for ArcView is a product of MathSoft, Inc. Shuming Bao was involved in the development of the initial beta version of S-PLUS for ArcView while a Research Scientist at MathSoft, Inc. Research reported in this paper was supported by MathSoft, Inc., and funded by a U.S. Army SBIR project. The opinions expressed in the paper are solely those of the authors and do not imply an endorsement by MathSoft, Inc. Special thanks to Juergen for his valuable comments on the draft of the paper.

References

Anselin, L., 1992. SpaceStat: A Program for the Analysis of Spatial Data. National Center for Geographic Information and Analysis, University of California, Santa Barbara, CA.
Anselin, L., 1994. Exploratory Spatial Data Analysis and Geographic Information Systems, in: M. Painho (ed.), New Tools for Spatial Analysis, Luxembourg: Eurostat, pp. 45-54.
Anselin, L., 1995a. SpaceStat Version 1.80 User's Guide. Regional Research Institute, West Virginia University, Morgantown, WV.
Anselin, L., 1995b. Local Indicators of Spatial Association - LISA. Geographical Analysis 27, 93-115.
Anselin, L., 1997 (in press). Interactive Techniques and Exploratory Spatial Data Analysis, in: P. Longley, M. Goodchild, D. Maguire and D. Rhind (eds.), Geographical Information Systems: Principles, Techniques, Management and Applications, Cambridge: Geoinformation International.
Anselin, L. and S. Bao, 1996. The SpaceStat.apr User's Guide. Regional Research Institute, West Virginia University, Morgantown, WV.
Anselin, L. and S. Bao, 1997. Exploratory Spatial Data Analysis Linking SpaceStat and ArcView, in: M. Fischer and A. Getis (eds.), Recent Developments in Spatial Analysis, Springer-Verlag (in press).
Anselin, L. and A. Getis, 1992. Spatial Statistical Analysis and Geographic Information Systems. The Annals of Regional Science 26, 19-33.
Anselin, L., R. Dodson and S. Hudak, 1993. Linking GIS and Spatial Data Analysis in Practice. Geographical Systems 1, 3-23.
Anselin, L. and O. Smirnov, 1997. The SpaceStat Extension for ArcView 3.0. Regional Research Institute, West Virginia University.
Aptech, 1995. The GAUSS System Version 3.0. Maple Valley, WA: Aptech Systems Inc.
Bailey, T. C., 1994. A Review of Statistical Spatial Analysis in Geographical Information Systems, in: S. Fotheringham and P. Rogerson (eds.), Spatial Analysis and GIS, London: Taylor & Francis, pp. 13-44.
Bao, S., 1996. Avenue Program for Creating the Spatial Neighbor Weight Matrix. CPGIS Newsletter, Vol. 5 (1), http://www.acpgis.org/.
Bao, S. and D. Martin, 1997a. Integrating S-PLUS with ArcView in Spatial Data Analysis: An Introduction to the S+ArcView Link. ESRI's Users Conference, San Diego, CA.
Bao, S., 1997b. User's Reference for the S-PLUS for ArcView. Seattle, WA: Mathsoft, Inc.
Bao, S., M. Henry, D. Barkley and K. Brooks, 1995. RAS - an integrated Regional Analysis System with ARC/INFO. Computers, Environment, and Urban Systems 19, 1: 37-56.
Can, A., 1996. Weight Matrices and Spatial Autocorrelation Statistics Using a Topological Vector Data Model. International Journal of Geographical Information Systems 10, 1009-1017.
Ding, Y. and A. S. Fotheringham, 1992. The Integration of Spatial Analysis and GIS. Computers, Environment and Urban Systems 16, 3-19.
ESRI, 1997. ArcView GIS. Redlands, CA: Environmental Systems Research Institute.
Farley, J. A., W. F. Limp and J. Lockhart, 1990. The Archeologist's Workbench: Integrating GIS, Remote Sensing, EDA and Database Management, in: K. Allen, S. Green and E. Zubrow (eds.), Interpreting Space: GIS and Archaeology, London: Taylor & Francis, pp. 141-164.

Fischer, M. and P. Nijkamp, 1993. Geographic Information Systems, Spatial Modelling and Policy Evaluation. Berlin: Springer-Verlag.
Fischer, M. M., H. Scholten and D. Unwin, 1996. Spatial Analytical Perspectives on GIS in Environmental and Socio-Economic Sciences. London: Taylor & Francis.
Fotheringham, A. S. and P. Rogerson, 1994. Spatial Analysis and GIS. London: Taylor & Francis.
Goodchild, M. F., 1987. A Spatial Analytical Perspective on Geographical Information Systems. International Journal of Geographical Information Systems 1, 327-334.
Goodchild, M. F., 1992. Geographical Information Science. International Journal of Geographical Information Systems 6, 31-45.
Goodchild, M. F., R. Haining, S. Wise et al., 1992. Integrating GIS and Spatial Analysis - Problems and Possibilities. International Journal of Geographical Information Systems 6, 407-423.
Haining, R., 1994. Designing Spatial Data Analysis Modules for Geographical Information Systems, in: S. Fotheringham and P. Rogerson (eds.), Spatial Analysis and GIS, London: Taylor & Francis, pp. 45-63.
MathSoft, 1996a. S-PLUS User's Guide Version 4.0 for Windows. Seattle: MathSoft, Inc.
MathSoft, 1996b. S+SpatialStats User's Manual for Windows and Unix. Seattle: MathSoft, Inc.
MathSoft, 1997a. S+GISLink. Seattle: MathSoft, Inc.
MathSoft, 1998. S-PLUS for ArcView GIS User's Guide, Version 1.0. Seattle: MathSoft, Inc.
Symanzik, J., T. Kötter, S. Schmelzer, S. Klinke, D. Cook and D. Swayne, 1997. Spatial Data Analysis in the Dynamically Linked ArcView/XGobi/XploRe Environment. Computing Science and Statistics 29 (forthcoming).
Williams, I., W. Limp and F. Briuer, 1990. Using Geographic Information Systems and Exploratory Data Analysis for Archaeological Site Classification and Analysis, in: K. Allen, S. Green and E. Zubrow (eds.), Interpreting Space: GIS and Archaeology, London: Taylor & Francis, pp. 239-273.

Appendix A. Avenue program for creating the spatial neighbor weight matrix.

' This program should be implemented under the View window.
' input: the current active theme, and list of field aliases.
' output: neighbor.txt.

theProject = av.GetProject
theView = av.GetActiveDoc
theTheme = theView.GetActiveThemes.Get(0)
theFTab = theTheme.GetFTab

' Get the name for neighbor file to export
theDir = theProject.GetWorkDir.GetFullName
theFN = FileName.Make(theDir.AsString+"\"+"neighbor.txt")

' Get the list of fields in FTab
numeric_fields = {}   ' Numeric fields in the VTab
field_aliases = {}    ' List of field aliases
all_fields = theFTab.GetFields

' Build list of numeric fields from all fields.
for each f in all_fields
  if (f.IsTypeNumber) then
    numeric_fields.Add(f)
    field_aliases.Add(f.GetAlias)
  end
end 'for

fname = MsgBox.ListAsString(field_aliases, "Select the variable for identification:", "Neighbor Matrix")
if (fname <> nil) then
  ' Match the alias to the actual field object
  theField = numeric_fields.Get(field_aliases.FindByValue(fname))
else
  exit
end

' Distance threshold: 0 selects only polygons that touch the current one
theDis = 0

' Set modified flag, clear any previous selection
av.GetProject.SetModified(true)
theSelection = theFTab.GetSelection.Clone
theFTab.GetSelection.ClearAll
theFTab.UpdateSelection

' Find the total number of records
theCount = theFTab.GetNumRecords

' Define an output file for storing adjacency relationship
outFile = LineFile.Make(theFN, #FILE_PERM_WRITE)

' Find out adjacent polygons for each individual polygon
theBitMap = theFTab.GetSelection
theFlag = 0
for each rec in theFTab
  numrec = (rec+1)/theCount*100
  theIndex = theFTab.ReturnValueNumber(theField, rec)
  theRecord = rec
  theFTab.GetSelection.ClearAll
  theFTab.GetSelection.Set(theRecord)
  ' theFTab.UpdateSelection
  theFTab.SelectByFTab(theFTab, #FTAB_RELTYPE_ISWITHINDISTANCEOF, theDis, #VTAB_SELTYPE_NEW)
  selBitMap = theFTab.GetSelection
  ' Each output line starts with the polygon ID, followed by its neighbors
  theString = theIndex.AsString
  for each rec2 in selBitMap
    recVal = theFTab.ReturnValue(theField, rec2)
    if (recVal <> theIndex) then
      theString = theString++recVal.AsString
    end 'if
  end
  if (theString <> theIndex.AsString) then
    theFlag = 1
  end 'if
  ' Write results to file and clear the selection
  outFile.WriteElt(theString)
  theFTab.GetSelection.ClearAll
  ' theFTab.UpdateSelection
  av.SetStatus(numrec)
end 'for

' Restore the original selection and close the output file
theFTab.SetSelection(theSelection)
theFTab.UpdateSelection
outFile.Close
if (theFlag = 0) then
  MsgBox.Info("No spatial weight matrix created: zero element.", "Neighbor Matrix")
  exit
end
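As a rough illustration of how the file written by the Appendix A script could be consumed on the statistical side, the following Python sketch parses neighbor lists and checks that contiguity is symmetric. It assumes each line of neighbor.txt holds a polygon ID followed by the IDs of its neighbors separated by blanks, which is what the ++ concatenation in the script suggests; the function names and the sample lines are hypothetical.

```python
def read_neighbor_lines(lines):
    """Parse lines of the form 'id n1 n2 ...' into {id: [n1, n2, ...]}."""
    neighbors = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue                      # skip blank lines
        neighbors[fields[0]] = fields[1:]  # an ID alone means no neighbors
    return neighbors

def is_symmetric(neighbors):
    """Contiguity should be mutual: if j neighbors i, then i neighbors j."""
    return all(i in neighbors.get(j, [])
               for i, js in neighbors.items() for j in js)

# Hypothetical file content for four polygons.
sample = ["1 2 3", "2 1 3 4", "3 1 2 4", "4 2 3"]
neighbors = read_neighbor_lines(sample)
print(neighbors)
print(is_symmetric(neighbors))
```

IDs are kept as strings on purpose, since the exact number formatting produced by AsString in Avenue is not specified in the script.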

Appendix B. Avenue program for customizing the S-PLUS for ArcView window

'SPLUS.Variogram.Example
'Input: dataset ($z, $x, $y)
'Output: 1. S-PLUS Graphsheet (distance via gamma)
'        2. ArcView Table: Empirical Variogram (distance, gamma)

'Initial inputs
dataSet = "coal.ash"
dataZ = "coal"
dataX = "x"
dataY = "y"

'Create a temporary file for output
theProject = av.GetProject
theWorkDir = theProject.GetWorkDir.GetFullName.AsFileName
theFN = theWorkDir.MakeTmp("sptemp","txt")
theFN2 = theFN.AsString.Substitute("\","\\")
splusOutput = String.MakeBuffer(2^10)

'S-PLUS commands (the response is dataZ, located by dataX and dataY)
splusCommand = "stemp.vario <- variogram("+dataZ+"~loc("+dataX+","+dataY+"), data="+dataSet+")"
splusCommand2 = "stemp.vario2 <- data.frame(dist=stemp.vario$distance, gamma=stemp.vario$gamma)"
splusCommand3 = "guiPlot(DataSetValues=data.frame(stemp.vario2$dist, stemp.vario2$gamma))"
splusCommand4 = "stemp.vario2$dist"
splusCommand5 = "stemp.vario2$gamma"

'Calculate the empirical variogram
splusResult = av.Run("SPLUS.Evaluate", { splusCommand, splusOutput })
if (splusResult <> 0) then
  MsgBox.Warning(splusOutput.AsString.BasicTrim("","\").Left(80), "Error in variogram: "+splusResult.AsString)
  exit
end

'Extract the variables of distance and gamma to a new data frame
splusResult = av.Run("SPLUS.Evaluate", { splusCommand2, splusOutput })
if (splusResult <> 0) then
  MsgBox.Warning(splusOutput.AsString.BasicTrim("","\").Left(80), "Error in dataframe: "+splusResult.AsString)
  exit
end

'Plot empirical variogram
splusResult = av.Run("SPLUS.Evaluate", { splusCommand3, splusOutput })
if (splusResult <> 0) then
  MsgBox.Warning(splusOutput.AsString.BasicTrim("","\").Left(80), "Error in plot: "+splusResult.AsString)
  exit
end

'Write the empirical variogram to a text file
splusCommand = "write.table("+"stemp.vario2"+", file = """+theFN2+""", sep = ""\t"")"
splusResult = av.Run("SPLUS.Evaluate", {splusCommand})

'Add the S-PLUS output to ArcView as a table
theVTab = VTab.Make(theFN, false, false)
if (theVTab.AsString = "") then
  MsgBox.Info("The selected object is not a table or nil!", "Add Table from S-PLUS")
  exit
end
theTable = Table.Make(theVTab)
if (theTable = nil) then
  MsgBox.Info("The selected object is not a table or nil!", "Add Table from S-PLUS")
  exit
end
av.GetProject.AddDoc(theTable)
theTable.GetWin.Open
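For readers without access to S-PLUS, the (distance, gamma) pairs returned by the variogram call in Appendix B can be approximated with a short Python sketch. It uses the classical estimator, gamma(h) = sum of squared differences over pairs in lag bin h divided by twice the number of pairs; whether this matches the default method and binning of the S-PLUS variogram function is an assumption, and the function name, lag settings and toy data are hypothetical.

```python
import math

def empirical_variogram(x, y, z, lag_width, n_lags):
    """Classical estimator; returns (mean distance, gamma, pair count) per bin."""
    sq_sums = [0.0] * n_lags
    dist_sums = [0.0] * n_lags
    counts = [0] * n_lags
    n = len(z)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.hypot(x[i] - x[j], y[i] - y[j])
            k = int(d // lag_width)          # index of the distance bin
            if k < n_lags:
                sq_sums[k] += (z[i] - z[j]) ** 2
                dist_sums[k] += d
                counts[k] += 1
    # Empty bins are dropped rather than reported with gamma = 0.
    return [(dist_sums[k] / counts[k], sq_sums[k] / (2 * counts[k]), counts[k])
            for k in range(n_lags) if counts[k] > 0]

# Toy data: three points on a line.
result = empirical_variogram([0, 1, 2], [0, 0, 0], [1, 3, 6], lag_width=1, n_lags=3)
for dist, gamma, pairs in result:
    print(dist, gamma, pairs)
```

The double loop is quadratic in the number of points, which is fine for a sketch but would need a spatial index or vectorized distances for large data sets.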

Figure 1. The data flow chart of the S-PLUS for ArcView.

Figure 2. Construct the spatial contiguity weight matrix using the adjacency criterion. [Map of four polygons labeled 1 to 4, with the resulting weights listed below.]

ROW.ID   COL.ID   WEIGHTA   WEIGHTB
1        2        1         0.5
1        3        1         0.5
2        1        1         0.33
2        3        1         0.33
2        4        1         0.33
3        1        1         0.33
3        2        1         0.33
3        4        1         0.33
4        2        1         0.5
4        3        1         0.5
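The two weight columns shown with Figure 2 follow directly from the adjacency lists: WEIGHTA is a binary contiguity weight, and WEIGHTB appears to be its row-standardized counterpart (each row's weights sum to one, rounded to two decimals). A minimal Python sketch, with a hypothetical helper name and the adjacency read off the ROW.ID/COL.ID pairs of the figure:

```python
def weight_table(neighbors):
    """Expand {id: [neighbor ids]} into (ROW.ID, COL.ID, WEIGHTA, WEIGHTB) rows."""
    rows = []
    for row_id in sorted(neighbors):
        cols = sorted(neighbors[row_id])
        for col_id in cols:
            weight_a = 1                         # binary contiguity weight
            weight_b = round(1 / len(cols), 2)   # row-standardized weight
            rows.append((row_id, col_id, weight_a, weight_b))
    return rows

# Adjacency of the four polygons in Figure 2.
neighbors = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2, 4], 4: [2, 3]}
table = weight_table(neighbors)
for row in table:
    print(*row)
```

Row standardization makes the spatial lag of a variable an average over neighbors, which is why polygons 1 and 4 (two neighbors) get 0.5 while polygons 2 and 3 (three neighbors) get 0.33.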

Figure 3. SpaceStat menu of the SpaceStat-ArcView Link.

Figure 4. The Data menu of the SpaceStat-ArcView Link.

Figure 5. S-PLUS menu of the S-PLUS for ArcView.
