
Data Science Made Easy in ArcGIS

Using Python and R

James Jones, Lu Zhang and Lain Graham


James_Jones@esri.com; l.zhang@esri.com and lgraham@esri.com
Workshop Agenda

• Introduction
• Python
• R
• Conclusion / Q&A
Data Science
From Core to Community

• Core analytics in ArcGIS


- Maximize performance and utility
- E.g. Spatial Statistics, Geostatistics, Spatial Analyst
- E.g. GeoAnalytics, Insights, ArcGIS Python SDK
• The interoperability of the ArcGIS platform makes workflows more efficient
- Techniques and methodologies continue to develop
- Data availability continues to increase
• The data science community is vast and evolving
- ArcGIS extends directly via scripting APIs
- E.g. Python, R, Java
Data Science Community
Python
Data Science Community
R

• Well over 12,000 packages to enhance core


• Most widely used statistical software in the world
• Diverse and powerful
- Universities, Government, Industry
- Finance, Ecology, Statistics
- Machine learning, predictive analytics
Battle of Bands
Which one is best?

• Functionality
- Python: General programming language
- R: Tailored towards statistics and data analysis
• Package Management
- Python: Yes (PyPI and Conda)
- R: Yes (CRAN)
• Scalability
- Python: Individual machines to distributed computing environments
- R: Standalone computer or individual servers
• Display of Data
- Python: Large number of libraries for graphical display of data
- R: Numerous libraries for making incredible graphics
• Use Cases
- Python: General Programming, Data Science, Web Development
- R: Lingua franca of data science
• Integrates with ArcGIS
- Python: YES!
- R: YES!


What are you most comfortable with?
What is the best tool for the job?
Data Science Process

• Ask a question
• Get the data
- Download
- Scrape
- ETL
- Enrich
• Explore the data
- Chart
- Summary & Descriptive Statistics
- Develop Hypotheses
- Identify Patterns
• Model the data
- Regression
- Machine Learning
- “Big Data”
• Communicate
- Jupyter Notebooks
- Written/Textual Reports
- Web Apps
- Story Maps
- PowerPoint
Getting the Data

Non-GIS Data
• Large number of libraries to download, transform, condition, and prepare data
• Generic web libraries (Requests, urllib)
• API-specific libraries (Tweepy)
• Ability to scrape and parse existing websites (Scrapy and BeautifulSoup)

GIS Data
• ArcPy allows native access to all Esri data formats and the full functionality of ArcGIS Desktop
• ArcGIS API for Python allows for access to and interaction with content within your Web GIS
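
As a rough illustration of the download-and-prepare step for non-GIS data, the sketch below uses only the Python standard library (urllib and csv). The function names, the URL argument, and the column names name, x and y are assumptions made for this example; they do not come from the workshop.

```python
import csv
import io
import urllib.request


def fetch_text(url, timeout=30):
    """Download a text resource (not called below; it needs network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_points(csv_text):
    """Turn CSV text with name, x and y columns into clean point records."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            x, y = float(row["x"]), float(row["y"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or non-numeric coordinates
        records.append({"name": (row.get("name") or "").strip(), "x": x, "y": y})
    return records


# Inline sample standing in for a downloaded file (hypothetical values).
sample = "name,x,y\nA, -77.03,38.90\nB,not_a_number,39.29\nC,-76.61,39.29\n"
points = parse_points(sample)
print(points)
```

Records cleaned this way could then be handed to ArcPy or the ArcGIS API for Python to create features; that step is left out here because it depends on your ArcGIS environment.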
Exploring Data

• Jupyter Notebooks are your friend
• ArcGIS API for Python was designed to work natively with the Jupyter Notebook
• Ability to create a Spatially Enabled DataFrame (SEDF)
• SEDF requires ArcGIS API for Python version 1.5
• Built on the core Pandas DataFrame, which allows for rapid exploration of data
• Can create maps, charts and a variety of other objects quickly using a common syntax
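
To make the idea of rapid exploration concrete, here is a minimal sketch of the kind of summary statistics you would typically compute first. It uses only the Python standard library rather than the Spatially Enabled DataFrame itself, and the table contents, the field name vacancy_rate and the describe helper are hypothetical. In a notebook with pandas available, DataFrame.describe() reports comparable numbers.

```python
import statistics


def describe(records, field):
    """Return count, mean, sample standard deviation, min and max for one field."""
    values = [r[field] for r in records if r.get(field) is not None]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }


# Hypothetical attribute table: one dictionary per feature.
tracts = [
    {"tract": "001", "vacancy_rate": 4.0},
    {"tract": "002", "vacancy_rate": 6.0},
    {"tract": "003", "vacancy_rate": None},  # missing value is ignored
    {"tract": "004", "vacancy_rate": 8.0},
]
summary = describe(tracts, "vacancy_rate")
print(summary)
```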
Modeling the Data

• Many native tools to enable modeling of data (Space-Time Pattern Mining, Density-based Clustering, etc.)
(Figure labels: Apps, Desktop, APIs)
• Integration of popular third-party Machine Learning/Deep Learning libraries
- Scikit-Learn
- TensorFlow
- PyTorch
- NLTK
Machine Learning Tools in ArcGIS
Classification
• Maximum Likelihood Classification
• Random Trees
• Support Vector Machine

Prediction
• Empirical Bayesian Kriging
• Areal Interpolation
• EBK Regression Prediction
• Ordinary Least Squares Regression and Exploratory Regression
• Geographically Weighted Regression

Clustering
• Spatially Constrained Multivariate Clustering
• Multivariate Clustering
• Density-based Clustering
• Image Segmentation
• Hot Spot Analysis
• Cluster and Outlier Analysis
• Space Time Pattern Mining
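
Of the prediction methods listed, ordinary least squares is the simplest to show in a few lines. The sketch below fits a single-predictor model in plain Python to illustrate the underlying arithmetic. It is not the ArcGIS tool, which handles multiple explanatory variables and reports diagnostics, and the function names and data are hypothetical.

```python
def ols_fit(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variance in ys explained by the fit (ys must vary)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical data in which the response follows y = 1 + 2x exactly.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
intercept, slope = ols_fit(xs, ys)
fit_quality = r_squared(xs, ys, intercept, slope)
print(intercept, slope, fit_quality)
```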
Python Demo
Jupyter Notebooks and data integration
The R-ArcGIS Bridge
An Introduction
R?
• A widely used statistical programming language
• More than 10,000 Packages
• Powerful and open source
- Universities, Government, Industry
- Finance, Ecology, Statistics
- Machine learning, predictive analytics
ArcGIS?
• A GIS platform
• Open Standards / OGC Compliant
- Universities, Government, Industry
- Finance, Ecology, Statistics
The R-ArcGIS Bridge
• The R-ArcGIS Bridge provides the ability to integrate R and ArcGIS functionality.
Who Can Use the R-ArcGIS Bridge?

• GIS Analyst
• Data Scientist
• Developers
(Diagram labels: ArcGIS, R script, R)
Version Requirements for the R-ArcGIS Bridge

• ArcGIS Pro: 1.1 (or later)
• ArcMap: 10.3.1 (or later)
• R: 3.2.2 (or later)
• RStudio: Recommended!


Installing the Bridge / Getting Started

• ArcGIS Pro Project Tab
• RStudio

Top 3 Most Useful R-ArcGIS Bridge Functions:


• arc.open()
- Reads GIS data currently stored in a shapefile, geodatabase, layer, or table into an R environment (containing spatial information and attribute information)

• arc.select()
- Returns your data set as an R data frame object. Lets you work with specific attributes from your data (dplyr / tidyr) and construct SQL queries

• arc.write()
- Saves your data to a shapefile, geodatabase, or table for use in an ArcGIS environment
Vector Support

• Ability to read and write vector data (points, lines, polygons)

• Support for key R objects and spatial packages
- R data frame object
- Compatibility with sp
- Compatibility with sf

• Customize data manipulations
- Craft SQL queries / Subset by specific columns
- Reproject data as needed

• Maintain spatial geometries when working with dplyr


Raster Support

• Ability to read and write raster data


- Handle big raster data with the ability to read in chunks by bands
- Compatibility with CRF format and Mosaic Datasets

• Customize selections and subsets


- Create subsets by bands or pixel rows and columns
- Resample options available
- Select desired pixel format for specific analyses
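
The value of reading a raster in chunks is easiest to see in a small example. The sketch below is a generic plain-Python illustration of block-wise processing, not the arcgisbinding API: the function names, the in-memory band, and the nodata value -9999 are all hypothetical. It computes a mean while touching only a few rows at a time.

```python
def iter_row_blocks(n_rows, block_rows):
    """Yield (start, stop) row ranges that cover n_rows in blocks."""
    for start in range(0, n_rows, block_rows):
        yield start, min(start + block_rows, n_rows)


def blockwise_mean(read_block, n_rows, block_rows, nodata=None):
    """Average all valid pixels while holding only one block at a time."""
    total, count = 0.0, 0
    for start, stop in iter_row_blocks(n_rows, block_rows):
        for row in read_block(start, stop):
            for value in row:
                if value != nodata:
                    total += value
                    count += 1
    return total / count if count else None


# Hypothetical single-band raster kept in memory for the demo; a real reader
# would fetch each block of rows from disk instead.
band = [
    [1, 2, 3],
    [4, -9999, 6],
    [7, 8, 9],
    [10, 11, 12],
]
blocks = list(iter_row_blocks(len(band), 3))
mean_value = blockwise_mean(lambda a, b: band[a:b], len(band), block_rows=3, nodata=-9999)
print(blocks, mean_value)
```

The same pattern applies per band: only the block currently being processed needs to fit in memory, which is what keeps large rasters manageable.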
How is Microsoft R useful?
Microsoft R

Raster data can become a big data problem, quickly

Mosaics: Data structure to store/process rasters in space and time

(Figure labels: Mosaics, Time-Series Rasters/Mosaic, Time)
Working with Big Data?

• Microsoft R Open is a publicly available R distribution for big data

• Contains almost all CRAN libraries

• Window-based operations and image operators speed up drastically

• Setup and usage from ArcGIS Pro are exactly the same as with traditional R


R-ArcGIS Bridge Demo

Predict Home Vacancy Rates in Washington DC based on Vacant Home Locations in Baltimore (XY Coordinates)
• Ask a question
• Get the data: ArcGIS
• Explore the data: ArcGIS / RStudio
• Model the data: RStudio
• Communicate: ArcGIS
Resources
Learn More on Using the R-ArcGIS Bridge

Resources from UC 2018:


- https://github.com/R-ArcGIS
Getting Started:
- Analyzing Crime Using Statistics and the R-ArcGIS Bridge Learn Lesson
- Using the R-ArcGIS Bridge Introductory Web Course
Creating R Script Tools:
- Integrating R Scripts into Geoprocessing Tools Web Course
- arcgisbinding Package Vignette
Powerful, In-depth Workflows in ArcGIS and R
- Identify an Ecological Niche for African Buffalo
Upcoming Sessions at Developers Summit (1/31/2019)

• ArcGIS API for Python: Introduction to Scripting your Web GIS


- Thursday, January 31st @ 10:30 am

• ArcGIS Pro: Scripting with Python


- Thursday, January 31st @ 10:30 am

• ArcGIS API for Python: Advanced Scripting


- Thursday, January 31st @ 1:30 pm

• Expand your Analysis with ArcGIS GeoAnalytics Server


- Thursday, January 31st @ 2:30 pm

• Maps & Charts: Making Visual Interfaces Accessible


- Thursday, January 31st @ 3:30 pm
Print Your Certificate of Attendance
Print Stations Located at L Street Bridge

Wednesday
6:30 pm – 9:00 pm
Networking Reception
National Building Museum
Please Take Our Survey on the App
• Download the Esri Events app and find your event
• Select the session you attended
• Scroll down to find the feedback section
• Complete answers and select “Submit”
