
INTERMEDIATE GIS CONCEPTS AND SKILLS WITH ARCGIS

Stacey Maples, GIS Specialist & Instruction Coordinator at the Sterling Memorial Library Map Department - stacey.maples@yale.edu

This session builds upon the skills and concepts introduced in the "Introduction to Geographic Information Systems and ArcGIS for Spatial Analysis" session. Participants are expected to have attended that workshop or to have comparable experience with ArcGIS 10. Topics will include: Use of Relates & Relationship Classes; Geoprocessing of geographic data; Geocoding of street addresses; Overlay Analysis; and Advanced Manipulation of Tabular Data.

Part of the Yale University Library Map Collection GIS Workshop Series

GIS RESOURCES:
Yale Map Dept. Website: www.library.yale.edu/maps
Yale GIS Listserv: http://mailman.yale.edu/mailman/listinfo/gis-l
Yale GIS Support Portal: http://guides.library.yale.edu/GIS

DOWNLOAD TUTORIAL DATA FROM THE GIS LIBGUIDE

1. Go to http://guides.library.yale.edu/gis and open the Yale GIS Workshops Tab.
2. Download the Datasets for the INTERMEDIATE GIS CONCEPTS AND SKILLS WITH ARCGIS Workshop (Right-click; Save Target As; Save to c:\temp).

POINT PATTERN ANALYSIS


EXTRACT THE DATA TO THE C:\TEMP FOLDER

1. Browse to the C:\Temp folder, where you saved the data.
2. Right-click on EX_02_John_Snow.zip and select Extract All.
3. Accept all defaults to extract the data file to C:\Temp.
4. Browse into the EX_02_John_Snow folder and double-click on EX_02_John_Snow.mxd to open it.

GENERATE NEAR TABLE AND RELATE

1. Open the Search Window by hovering over the tab at the right side of the Data Frame, or by using Main Menu>Windows>Search Window.
2. Search on the term Near and click on the Generate Near Table tool result to launch the tool.
3. Use the deaths_reverse_geocode layer as the Input Features and the Water_Pumps_Rev_Geocode layer as the Near Features.
4. Save the Output Table as Death2Pump_Near_Table in your Snow_Cholera_Data.gdb.
5. Make sure that the Find only the closest feature option is checked and click OK to run the tool.
6. Right-click and Open the resulting Death2Pump_Near_Table to examine the results.

Note that this table contains IN_FID and NEAR_FID fields. These fields match the Input features (IN_FID) to their nearest Near features (NEAR_FID) using the OBJECTID for each layer. This table can be used to create a relationship class that allows you to select associated features between these two feature classes.
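
To make the structure of the near table concrete, the following is a minimal pure-Python sketch of what a closest-feature search produces. It is not the ArcGIS implementation: the function name generate_near_table, the sample coordinates, and the use of planar straight-line distance are all assumptions made for illustration. Only the IN_FID and NEAR_FID field names come from the tutorial; NEAR_DIST is included here as the distance column.

```python
import math

def generate_near_table(input_pts, near_pts):
    """Return one row per input feature, pointing at its closest near feature.

    input_pts and near_pts map an OBJECTID to an (x, y) coordinate pair.
    """
    rows = []
    for in_fid, (x, y) in input_pts.items():
        # Measure the distance to every near feature and keep the smallest.
        near_fid, near_dist = min(
            ((fid, math.hypot(x - nx, y - ny)) for fid, (nx, ny) in near_pts.items()),
            key=lambda pair: pair[1],
        )
        rows.append({"IN_FID": in_fid, "NEAR_FID": near_fid, "NEAR_DIST": near_dist})
    return rows

# Invented sample coordinates (not taken from the Snow dataset).
deaths = {1: (0.0, 0.0), 2: (10.0, 0.0), 3: (4.0, 3.0)}
pumps = {1: (0.0, 1.0), 2: (9.0, 0.0)}

near_table = generate_near_table(deaths, pumps)
for row in near_table:
    print(row)
```

Each row carries the OBJECTID of a death location and the OBJECTID of its nearest pump, which is why the table can serve as the link between the two layers.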

JOIN THE DEATHS DATA TO THE NEAR TABLE

1. Close the Death2Pump_Near_Table.
2. Right-click on the deaths_reverse_geocode layer and select Joins and Relates>Join.
3. Join the deaths_reverse_geocode layer to the Death2Pump_Near_Table, matching the layer's OBJECTID field to the table's IN_FID field as the key fields.
4. Examine the results to make sure the two tables are now joined.

RELATE THE DEATHS DATA TO THE WATER PUMPS

5. Right-click on the deaths_reverse_geocode layer again and select Joins and Relates>Relate.
6. Relate the deaths_reverse_geocode layer to the Water_Pumps_Rev_Geocode layer, matching the NEAR_FID field to the OBJECTID field as the key fields, and name the Relate Death2Pumps. Click OK.
7. Open the Attribute Table for the Water_Pumps_Rev_Geocode layer and select the Broadwick Pump record by clicking on the gray box at the far left.
8. Now use the Related Tables dropdown button (at the top left of the Attribute Table window) to select the Death2Pumps Relate. Note that the Attribute Table for the deaths_reverse_geocode layer should now be open, and the features in that layer that correspond to the Broadwick Pump should be selected (both in the table and in the Data Frame).
9. Right-click on the Num_Cases field header and select Statistics to examine a basic statistical summary of the deaths nearest the Broadwick Pump.
10. Close the Selection Statistics window and clear the selection using Main Menu>Selection>Clear Selected Features.

THIESSEN POLYGON (SPATIAL ALLOCATION)


Thiessen polygons allocate space in an area of interest to a single feature per polygon. That is, every location within a Thiessen polygon is closer to the point that was used to generate that polygon than to any other point in the feature set. In this case, we will create a set of Thiessen polygons based upon the locations of the Water Pumps in our project.

1. Use the Search Tab to search for the Thiessen Polygon tool, using the search term Thiessen, and launch the tool from the result.
2. Select Water_Pumps_Rev_Geocode as the Input Features and save the Output Feature Class as Pump_Thiessen in the Snow_Cholera_Data.gdb.
3. Set the Output Fields option to ALL.
4. Click on the Environments button at the bottom of the window and expand the Processing Extent option.
5. Set the Processing Extent to Same as layer extent.
6. Click OK twice to apply the Environment Setting and run the Thiessen Polygon tool.

SPATIAL JOIN (POINT AGGREGATION)


Now that you have created the Thiessen polygon layer, you will allocate each of the deaths to one of the Thiessen polygons. To do this, we will use the Spatial Join tool.

1. Right-click on the deaths_reverse_geocode layer and select Joins and Relates>Join.
2. Change the Method dropdown to Join data from another layer based on spatial location.
3. Save the Output Layer as Deaths_Allocated in your Snow_Cholera_Data.gdb. Click OK.
4. The resulting layer is added to the Map Document. Open its attribute table to confirm that the attributes of the Water Pumps have been transferred.
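
As a rough illustration of what a point-in-polygon spatial join does, the sketch below assigns each point to the polygon that contains it using a standard ray-casting test. This is a simplified stand-in and not the ArcGIS Spatial Join tool: the function names, the two square polygons, and the sample points are all hypothetical, and polygons are assumed to be simple rings without holes.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: count how many ring edges a ray to the right crosses."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def spatial_join(points, polygons):
    """Map each point ID to the name of the first polygon containing it, or None."""
    joined = {}
    for pid, (x, y) in points.items():
        joined[pid] = next(
            (name for name, ring in polygons.items() if point_in_polygon(x, y, ring)),
            None,
        )
    return joined

# Hypothetical polygons standing in for two Thiessen polygons.
polygons = {
    "Pump A": [(0, 0), (5, 0), (5, 5), (0, 5)],
    "Pump B": [(5, 0), (10, 0), (10, 5), (5, 5)],
}
points = {1: (1, 1), 2: (7, 2), 3: (20, 20)}

allocated = spatial_join(points, polygons)
print(allocated)
```

A point that falls outside every polygon receives None, which corresponds to a record with no transferred attributes.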

SUMMARY STATISTICS
1. Use the Search Window to search on the term Summary and open the Summary Statistics tool.
2. Select the Deaths_Allocated Table as the Input.
3. Save the Output Table to your Snow_Cholera_Data.gdb and name it Deaths_Summary_by_Pumps.
4. For the Statistics Field(s), select the Num_Cases field twice, and set the Statistic Type to SUM for one entry and MEAN for the other.
5. Assign REV_Addres_1 (the address field from the Water Pump data layer) as the Case field and click OK.
6. Open the resulting table and sort descending on the SUM_Num_Cases field. Note that the Broadwick Pump has the highest value for all three significant attributes: FREQUENCY (No. of households), SUM_Num_Cases (Total Deaths) and MEAN_Num_Cases (Mean Deaths per Household).
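
The grouping performed in the steps above can be sketched in plain Python as follows. The function name summary_statistics and the three sample records are invented for illustration; only the field names (Num_Cases, REV_Addres_1, FREQUENCY, SUM_Num_Cases, MEAN_Num_Cases) come from the tutorial, and the real tool may differ in details.

```python
from collections import defaultdict

def summary_statistics(rows, stat_field, case_field):
    """Group rows by the case field and report count, sum, and mean per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[case_field]].append(row[stat_field])
    return {
        case: {
            "FREQUENCY": len(values),
            "SUM_" + stat_field: sum(values),
            "MEAN_" + stat_field: sum(values) / len(values),
        }
        for case, values in groups.items()
    }

# Invented records: one row per household, tagged with its allocated pump.
deaths_allocated = [
    {"REV_Addres_1": "Pump A", "Num_Cases": 3},
    {"REV_Addres_1": "Pump A", "Num_Cases": 1},
    {"REV_Addres_1": "Pump B", "Num_Cases": 2},
]

summary = summary_statistics(deaths_allocated, "Num_Cases", "REV_Addres_1")
for pump, stats in summary.items():
    print(pump, stats)
```

FREQUENCY counts the rows in each group, which is why it can be read as the number of households per pump.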

SPATIAL CENTRAL TENDENCY


SPATIAL MEAN
1. Search for and open the Mean Center tool.
2. Select the deaths_reverse_geocode layer as the Input Feature Class.
3. Save the Output Feature Class to the Snow_Cholera_Data.gdb and name it Deaths_Spatial_Mean.
4. Do not assign a Weight Field yet. Click OK to calculate the Mean Center.
5. Change the Symbology for the Deaths_Spatial_Mean layer to something that contrasts with the other symbologies.

WEIGHTED SPATIAL MEAN
1. Run the Mean Center tool again, this time assigning the deaths_reverse_geocode_Num_Cases field as the Weight Field.
2. Save the Output Feature Class to the Snow_Cholera_Data.gdb and name it Deaths_Weighted_Spatial_Mean.
3. Apply a symbology to the Deaths_Weighted_Spatial_Mean layer.
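
The difference between the two runs of the Mean Center tool can be seen in a short sketch. The mean center is the average of the x and y coordinates; the weighted version multiplies each coordinate by its weight (here, the number of cases) before averaging. The function name mean_center and the sample values are assumptions for illustration.

```python
def mean_center(points, weights=None):
    """Return the (weighted) average x and y of a list of (x, y) points."""
    if weights is None:
        weights = [1.0] * len(points)  # unweighted: every point counts once
    total = sum(weights)
    x = sum(w * px for w, (px, _) in zip(weights, points)) / total
    y = sum(w * py for w, (_, py) in zip(weights, points)) / total
    return x, y

# Invented household locations and case counts.
households = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0)]
num_cases = [1, 1, 2]

spatial_mean = mean_center(households)
weighted_mean = mean_center(households, num_cases)
print(spatial_mean, weighted_mean)
```

In this example the weighted center moves toward (4, 4), the household with the most cases.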

STANDARD DISTANCE
1. Search for and open the Standard Distance tool.
2. Select deaths_reverse_geocode as the Input Feature Class.
3. Save the Output Feature Class to the Snow_Cholera_Data.gdb and name it Deaths_Standard_Distance.
4. Select deaths_reverse_geocode_Num_Cases as the Weight Field.
5. Click OK to calculate the Standard Distance.
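
Standard distance is the spatial counterpart of a standard deviation: it measures how far, on average, the points spread around their mean center. The sketch below uses the common weighted formulation (the square root of the weighted mean squared distance from the weighted mean center). The function name and sample values are assumptions, and the tool's circle-size option is not modeled here.

```python
import math

def standard_distance(points, weights=None):
    """Return the (weighted) mean center and the standard distance around it."""
    if weights is None:
        weights = [1.0] * len(points)
    total = sum(weights)
    cx = sum(w * x for w, (x, _) in zip(weights, points)) / total
    cy = sum(w * y for w, (_, y) in zip(weights, points)) / total
    # Weighted mean of the squared distances from the center.
    variance = sum(
        w * ((x - cx) ** 2 + (y - cy) ** 2) for w, (x, y) in zip(weights, points)
    ) / total
    return (cx, cy), math.sqrt(variance)

# Invented locations at the corners of a 2 x 2 square.
households = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]

center, radius = standard_distance(households)
weighted_center, weighted_radius = standard_distance(households, [3, 1, 1, 1])
print(center, radius)
print(weighted_center, weighted_radius)
```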


INTERPOLATION (HOT SPOTS)

INVERSE DISTANCE WEIGHTED (IDW) INTERPOLATION
1. Search for and open the IDW tool.
2. Select the deaths_reverse_geocode layer as the Input Point Features.
3. Select deaths_reverse_geocode.Num_Cases as the Z Value Field.
4. Save the Output Raster to the Snow_Cholera_Data.gdb and name it IDW_Deaths.
5. Set the Output Cell Size to 10 (this is in meters).
6. Leave the remaining settings at their defaults and click OK to calculate the IDW raster.
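
For readers curious about what IDW computes for each raster cell, here is a minimal sketch of the estimate at a single location. Each sample contributes its value weighted by 1 / distance^power, so nearby samples dominate. The function name idw and the sample values are assumptions, and this sketch uses every sample point, whereas the tool limits the search to a neighborhood of nearby points.

```python
import math

def idw(x, y, samples, power=2.0):
    """Estimate a value at (x, y) from (sx, sy, value) samples by inverse distance weighting."""
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in samples:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0:
            return value  # the location coincides with a sample point
        weight = 1.0 / dist ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

# Invented samples: (x, y, Num_Cases).
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]

midpoint_estimate = idw(1.0, 0.0, samples)   # equally distant from both samples
near_first = idw(0.5, 0.0, samples)          # closer to the first sample
print(midpoint_estimate, near_first)
```

Running this function once per cell center, on a grid with 10-meter spacing, would produce the kind of surface that the steps above write to IDW_Deaths.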


AREAL INTERPOLATION OF ATTRIBUTES

In this tutorial, we will perform what is referred to as Areal Interpolation of Census Attributes. We have a set of boundaries (in this case the Major Watershed Basins of Connecticut, our CT_Major_Basins Layer) for which we would like to summarize the population. Our problem is that these watershed boundaries do not correspond with the geographic units that the U.S. Census uses to collect and tabulate demographic data. Some of the Census Block Groups in our CT_Block_Groups layer overlap more than one Watershed basin unit. In the following steps, we will calculate the proportion of overlap for each Census Block Group, relative to the Watershed Boundaries, and use these proportions to assign an appropriate estimate of the population to each watershed.

CALCULATING GEOMETRY FOR A DATA LAYER

First, we need to determine the initial area of each of our intact Census Block Groups. We can refer to these as the Parent features.

1. Right-click on the CT_Block_Groups Layer and open the Attribute Table.
2. Take a few seconds to examine the data available in this dataset. This data describes the demographic characteristics of every Census Block Group in our area of interest.
3. Click the Options button at the top of the Attribute Table and select Add Field...
4. Add a Field with Name = AREA and Type = Float.
5. Click OK.
6. Scroll to the far right of the Attribute Table to view the newly added AREA Field.
7. Right-click on the AREA field header and select Calculate Geometry. Click Yes when warned about calculating outside of an Edit Session.
8. Change the Units to Square Miles US [sq mi].
9. Click OK.
10. Note that the AREA Field should now be populated with the new values.
11. Close the Attribute Table.

GEOPROCESSING: USING THE UNION TOOL

Now, we need to merge the Block Group and Watershed boundary files, so that those Block Groups that span more than one watershed will be split into their sub-units of overlap, or child features. To do this, we will use a technique generically referred to as Geoprocessing. Geoprocessing is the act of applying any number of spatially transforming tools to a dataset. In this case, we will use the Union Tool to create a new dataset.

1. Search for and open the Union Tool.
2. Select the CT_Major_Basins and CT_Block_Groups Layers as the Input Features.
3. Click on the Show Help>> button at the bottom of the Dialog Box and note that the Help System is context-sensitive.
4. Save the Output Feature Class to your C:\temp\Intermediate_GIS_Skills\CT_Watershed_Data.gdb and name it Union.
5. Leave the remaining options at their default settings.
6. Click OK to apply the Union Tool.
7. Click Close once the process has completed.
8. You should be left with a new Union Layer at the top of your Table of Contents.

CALCULATING THE NEW AREA OF THE UNION RESULTS

Now we need to calculate the NEW AREA of those Child Block Groups that were split by the Union Process and then the proportion of their original AREA.

1. Right-click on the Union Layer and open the Attribute Table.
2. Click on the Options button and select Add Field.
3. Add a new field: Name = SUBAREA, Type = Float. Click OK.
4. Add a new field: Name = WEIGHT, Type = Float. Click OK.
5. Add a new field: Name = WTPOP, Type = Short Integer. Click OK.
6. Scroll to the right of the Attribute Table to find the newly added SUBAREA Field.
7. Right-click on the SUBAREA field header and select Calculate Geometry.
8. Change the Units to Square Miles US [sq mi].
9. Click OK to apply the calculation.

CALCULATING THE WEIGHT VALUE

Now we will calculate the proportion of the child area to parent area, which will be used as a weight to apply to the demographics we are interested in. First, we must exclude those polygons that have an AREA = 0 (these are coastal slivers and are not important to the results of our analysis).

1. Click on the Select by Attributes button.
2. In the Query Argument panel, at the bottom of the Select by Attributes Dialog Box, enter the query: "AREA" <> 0
3. This will select only those records that do not have an AREA = 0.
4. Click on the Verify button to check your SQL Query Syntax.
5. Click Apply.
6. Click Close.
7. Right-click on the WEIGHT field header and select Field Calculator.
8. Use the Field Calculator to build the following argument: [SUBAREA] / [AREA]
9. Click OK to apply the calculation and note that, because you have an active selection, the calculation is only applied to the selected subset of records, thus avoiding a divide-by-zero error.
10. Finally, scroll to the far right of the Attribute Table, right-click on the WTPOP field header and select Field Calculator.
11. Use the Field Calculator to build the following argument: [POP2004] * [WEIGHT]
12. Click OK to apply the calculation.
13. Save your work.
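
The arithmetic behind the WEIGHT and WTPOP fields can be sketched in a few lines of Python. The field names (AREA, SUBAREA, POP2004, MAJOR) come from the tutorial; the function name, the basin names, and all of the numbers are invented for illustration. Rounding WTPOP to a whole number is an assumption made here because the field is a Short Integer; the tutorial does not state how ArcMap rounds.

```python
from collections import defaultdict

def allocate_population(union_rows):
    """Apportion each block group's population to basins by share of area."""
    totals = defaultdict(int)
    for row in union_rows:
        if row["AREA"] == 0:
            continue  # skip slivers, mirroring the "AREA" <> 0 selection
        weight = row["SUBAREA"] / row["AREA"]    # [SUBAREA] / [AREA]
        wtpop = round(row["POP2004"] * weight)   # [POP2004] * [WEIGHT]
        totals[row["MAJOR"]] += wtpop
    return dict(totals)

# Invented union output: one block group (4 sq mi, 1000 people) split across
# two basins, one block group wholly inside Basin A, and one sliver.
union_rows = [
    {"MAJOR": "Basin A", "AREA": 4.0, "SUBAREA": 1.0, "POP2004": 1000},
    {"MAJOR": "Basin B", "AREA": 4.0, "SUBAREA": 3.0, "POP2004": 1000},
    {"MAJOR": "Basin A", "AREA": 2.0, "SUBAREA": 2.0, "POP2004": 500},
    {"MAJOR": "Basin B", "AREA": 0.0, "SUBAREA": 0.1, "POP2004": 0},
]

basin_population = allocate_population(union_rows)
print(basin_population)
```

The split block group contributes 250 people to Basin A and 750 to Basin B, so its full population of 1000 is preserved. This approach assumes that population is spread evenly within each block group.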


SUMMARY STATISTICS
Now that we have a set of Census Boundary files that correspond to the watershed, and estimates of the population of those new boundary units, we need to summarize those population estimates for each of our watershed units.

1. On the Attribute Table, click on the Clear Selection button.
2. Search for and open the Summary Statistics Tool.
3. Select the Union Layer as the Input Table.
4. Browse to the CT_Watershed_Data.gdb and save the Output Table as Population_Summary.
5. Select WTPOP as the Statistics Field, and select SUM as the Statistic Type.
6. Select MAJOR as the Case field.
7. Click OK.
8. Click Close when the tool completes.
9. Click on the Source Tab at the bottom of the Table of Contents.
10. Right-click on the Population_Summary Table and open it to observe the population counts for the watersheds.
11. Close the Attribute Table.
12. Save your work.


JOINS AND MULTIPART FEATURES

1. Join the Summary Table to the MAJOR_BASIN_POLY using the MAJOR field as a key field and observe the results. Note that two of the records are repeated. This is because the Hudson and Southeast Coast features are each represented in the dataset as two distinct features. The fix for this is to dissolve these features into single multipart features.
2. Remove the Join by right-clicking on the Major_Basin layer and selecting Joins and Relates>Remove Joins>Remove all joins.
3. Search for and open the Dissolve tool.
4. Use the Dissolve Tool to dissolve the repeated features into single multipart entries based upon the MAJOR field. SUM the ACREAGE & AREA_SQMI fields. Call the result BASIN_DISSOLVE and save it to the CT_Watershed_Data.gdb.
5. Join the Summary Table to the BASIN_DISSOLVE feature class and examine the attribute table.
6. Remove the original CT_MAJOR_BASIN feature class and save your Map Document.


ADD BASEMAP FROM ARCGIS ONLINE

1. On the Main Menu, go to File>Add Data>Add Basemap and add a basemap of your choice to the Map Document.
2. Turn off all other layers in the Map Document.

CREATING A FEATURE CLASS FROM A TABLE OF XY COORDINATES

1. Click on the View by Source button at the top of the Table of Contents.
2. Add the CT_TRI_Facilities table to ArcMap.
3. Right-click the CT_TRI_Facilities table, open it, and examine the data.
4. Close the CT_TRI_Facilities table, right-click on it and select Display XY Data.
5. Edit the Coordinate System to Geographic>North America>NAD 1983.
6. Export the Events layer to the CT_Watershed_Data.gdb as TRI_SITES, using the coordinate system of the Data Frame, and add the new Feature Class when prompted.
7. Remove the CT_TRI_Facilities Events Layer.

RELATIONSHIP CLASSES IN THE GDB

1. Right-click the CT_Watershed_Data.gdb and add a New>Relationship Class:
   a. Name = ToxicSites_to_Chem
   b. Origin Table = TRI_SITES
   c. Destination Table = CT_TRI_Chemicals
   d. Simple Relationship
   e. Prefix to_ to the Relationship Labels
   f. Cardinality = One to Many
   g. No attributes
   h. TRIFID = primary/foreign key
2. Once the relationship class is established, use Select by Attributes to select all records in the CT_TRI_Chemicals table where:
"CHEMNAME" IN( 'LEAD' , 'LEAD COMPOUNDS' )
3. Use the Related Tables tool to select the related TRI_SITES that release Lead and Lead Compounds into the environment.
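
Conceptually, following a one-to-many relationship from selected chemical records back to their sites is a key lookup on TRIFID. The sketch below illustrates that idea only; it does not use any ArcGIS API. The function name related_sites, the TRIFID values, and the NAME field are invented, while CHEMNAME, TRIFID, and the two chemical names come from the tutorial.

```python
def related_sites(chemicals, sites, chem_names):
    """Return the sites whose TRIFID appears on any selected chemical record."""
    # Equivalent of: "CHEMNAME" IN ('LEAD', 'LEAD COMPOUNDS')
    selected_ids = {row["TRIFID"] for row in chemicals if row["CHEMNAME"] in chem_names}
    # Equivalent of following the relate back to the origin table.
    return [site for site in sites if site["TRIFID"] in selected_ids]

# Invented records: one site can have many chemical rows (one to many).
tri_sites = [
    {"TRIFID": "T1", "NAME": "Site One"},
    {"TRIFID": "T2", "NAME": "Site Two"},
    {"TRIFID": "T3", "NAME": "Site Three"},
]
tri_chemicals = [
    {"TRIFID": "T1", "CHEMNAME": "LEAD"},
    {"TRIFID": "T1", "CHEMNAME": "TOLUENE"},
    {"TRIFID": "T2", "CHEMNAME": "TOLUENE"},
    {"TRIFID": "T3", "CHEMNAME": "LEAD COMPOUNDS"},
]

lead_sites = related_sites(tri_chemicals, tri_sites, {"LEAD", "LEAD COMPOUNDS"})
print([site["NAME"] for site in lead_sites])
```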


GEOCODE THE ADDRESS DATA

1. Right-click the Schools_Addresses Table and select Geocode Addresses.
2. Use ZIPTXT as the ZIP field, save the results to the GDB as Geocoding_Result_01, and click OK.
3. Examine the Interactive Rematch interface after the Automatic Geocode (look at the ZipCode in the Address Table and Reference Data).

FINDING THE NEAREST FEATURES

1. Search for and open the Generate Near Table tool.
2. Use the Generate Near Table tool to create a table that identifies the TRI sites within 5 miles of each school in Geocoding_results_02.
3. Be sure to uncheck the option to Find only closest feature.
4. Save this table to the CT_Watershed_Data.gdb and name it NEAR_Schools_to_TRI.
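
With the closest-feature option unchecked and a search radius set, the near table holds one row for every input and near feature pair within that radius, rather than one row per input feature. A minimal sketch of that behavior follows; the function name, the planar coordinates, and the unitless radius of 5 are assumptions for illustration and do not reproduce the tool's handling of units or projections.

```python
import math

def near_within(input_pts, near_pts, radius):
    """Return a row for every (input, near) pair no farther apart than radius."""
    rows = []
    for in_fid, (x, y) in input_pts.items():
        for near_fid, (nx, ny) in near_pts.items():
            dist = math.hypot(x - nx, y - ny)
            if dist <= radius:
                rows.append({"IN_FID": in_fid, "NEAR_FID": near_fid, "NEAR_DIST": dist})
    return rows

# Invented coordinates in arbitrary planar units.
schools = {1: (0.0, 0.0), 2: (20.0, 0.0)}
tri_sites = {1: (3.0, 4.0), 2: (0.0, 2.0), 3: (20.0, 6.0)}

rows = near_within(schools, tri_sites, radius=5.0)
for row in rows:
    print(row)
```

Here school 1 appears twice because two sites fall within the radius, and school 2 does not appear at all because its only nearby site is 6 units away.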


RELATIONSHIP CLASSES IN THE MAP DOCUMENT

TRI_SITES TO THE NEAR TABLE

1. Right-click on the TRI_SITES layer and create a Relate to the NEAR_Schools_to_TRI table based on its ObjectID and the NEAR_FID.
2. Call this Relate TRI_2_NEAR.

SCHOOLS TO THE NEAR TABLE

1. Right-click on the Geocoding_Results_01 layer and create a Relate to the NEAR_Schools_to_TRI table based on the IN_FID and the OBJECTID of the Schools.

EXPLORING RELATED TABLES


2. Open the Geocoding_Results_02 layer and select The Strong School (using any method of selection you prefer).
3. Use the Related Tables tools to track through the table relationships until you have a selection of related CT_TRI_Chemicals records. The selection of chemical records you have created represents the compounds being released within 5 miles of the Strong School.

SUMMARY STATISTICS
1. Search for and open the Summary Statistics tool.
2. Run the Summary Statistics tool on the active selection in the CT_TRI_Chemicals table, using the CHEMNAME field as the Case field and the TTLAIR and TTLSURFWAT fields as the Statistics Fields, with SUM as the Statistic Type.
3. Name the Output Table Strong_School_Exposure and save it to the CT_Watershed_Data.gdb.
4. Open the resulting Summary Table and examine the results.
