-Social media (Volunteered Geographic Information), cell phone data + temporal stamp
Geolocation is explicit or implicit
-Explicit: from a Global Positioning System (GPS)
-Implicit: coordinates are estimated
Why?
Why is spatial data necessary?
If several layers of data are georeferenced, we can examine causal relationships
-Are clusters of disease related to the environment?
-Are epidemics related to travel?
-Co-occurrence to explain crime patterns
How?
Acquiring implicit spatial data?
Geocoding: a process that converts addresses to geolocations on the surface of the Earth

DATE   TIME  CRIME     LOCATION           CITY     STATE
21994  2100  GAMBLING  0599 AMHERST E ST  BUFFALO  NY
32594  2102  GAMBLING  0553 FOURTH ST     BUFFALO  NY
32694  2006  GAMBLING  0111 SPRING ST     BUFFALO  NY
32894  2205  GAMBLING  0015 HARBOR WK     BUFFALO  NY
41894  1544  GAMBLING  0497 BUSTI AV      BUFFALO  NY
42994  1920  GAMBLING  0039 HARBOR WK     BUFFALO  NY
42994  2104  GAMBLING  0548 FOURTH ST     BUFFALO  NY
50294  2140  GAMBLING  0087 WOODLAWN AV   BUFFALO  NY
50494  1702  GAMBLING  0087 NORTHLAND AV  BUFFALO  NY
52294  2225  GAMBLING  0261 LOCUST ST     BUFFALO  NY
52794  1939  GAMBLING  0457 FOURTH ST     BUFFALO  NY
60994  1941  GAMBLING  0515 CLINTON ST    BUFFALO  NY

911 calls for gambling
Buffalo, NY (1994) batchgeo.com
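
Before records like the 911 calls above can be geocoded, each row has to be turned into a clean date and a one-line address. The sketch below shows one possible way to do that. It assumes the DATE column is month-day-year without a leading zero (so 21994 would be 19 Feb 1994) and that the leading zeros in the house numbers are padding; neither is confirmed by the slides. The helper names parse_date and to_query are hypothetical, and no geocoding service is actually called.

```python
from datetime import date, datetime

# Three rows copied from the 911 call table; the remaining rows
# would be handled the same way.
ROWS = [
    ("21994", "2100", "GAMBLING", "0599 AMHERST E ST", "BUFFALO", "NY"),
    ("32594", "2102", "GAMBLING", "0553 FOURTH ST", "BUFFALO", "NY"),
    ("60994", "1941", "GAMBLING", "0515 CLINTON ST", "BUFFALO", "NY"),
]


def parse_date(raw: str) -> date:
    """Parse a date assumed to be MDDYY or MMDDYY (assumption, not from the slides)."""
    padded = raw.zfill(6)  # "21994" -> "021994"
    return datetime.strptime(padded, "%m%d%y").date()


def to_query(location: str, city: str, state: str) -> str:
    """Build a one-line address string that a geocoding service could accept."""
    number, _, street = location.partition(" ")
    # Assumption: leading zeros in the house number are padding only.
    return f"{int(number)} {street}, {city}, {state}"


for raw_date, raw_time, crime, location, city, state in ROWS:
    print(parse_date(raw_date).isoformat(), raw_time, crime,
          "|", to_query(location, city, state))
```

Each query string could then be passed to whichever geocoder is available, which would return the estimated (implicit) coordinates.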
Visualize location of events in space.
Suggest and identify location of clusters.
Examples…
How is spatial data used?
Brent geese (Netherlands): different scales reveal different patterns
USGS maps tweets of earthquake in Virginia, 2011: time offset from the first tweet
Examples…
How is spatial data used?
Where 2 individuals can meet
Difference in travel time
New York City, Monday AM
Seattle, Saturday PM
Examples…
How is spatial data used?
Through geocoding and GIS, we can obtain travel estimates from origins to destinations
So what?
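
As a rough stand-in for the GIS-based travel estimates mentioned above, the sketch below computes the great-circle (haversine) distance between a geocoded origin and destination and converts it to minutes using an assumed average speed. This is only a crude proxy: a real GIS would route along the road network. The function names, the 50 km/h default speed, and the example coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def travel_minutes(origin, destination, speed_kmh=50.0):
    """Straight-line travel time in minutes at an assumed constant speed."""
    km = haversine_km(*origin, *destination)
    return 60.0 * km / speed_kmh


# Hypothetical (lat, lon) pairs for an origin and a destination.
home = (28.50, -81.40)
hospital = (28.60, -81.20)
print(round(haversine_km(*home, *hospital), 1), "km")
print(round(travel_minutes(home, hospital), 1), "minutes at 50 km/h")
```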
Geocoded children with spina bifida (birth defect) born in Florida, 1998-2007, along with their hospitalizations
Travel pattern
Challenges…
Statistical issues
1 Nearby observations generally tend to exhibit similarity (Tobler's first law of geography).
2 If this is the case, we violate the assumption of regression models that observations are independent.
  Example: Snow ~ elevation, temperature, latitude
3 Coefficients will vary across geographical space, since nearby locations tend to exhibit similarities.
4 Spatial pattern in the residuals
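
One common way to check for a spatial pattern in regression residuals is global Moran's I. The sketch below is a minimal pure-Python version using binary distance-band weights (two locations are neighbors if they lie within a chosen distance). The function name morans_i, the band parameter, and the two residual grids are hypothetical illustrations, not data from the slides.

```python
import math


def morans_i(values, coords, band):
    """Global Moran's I with binary weights: w_ij = 1 if 0 < d_ij <= band."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    cross = 0.0   # sum of w_ij * dev_i * dev_j
    w_sum = 0.0   # sum of all weights
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= band:
                cross += dev[i] * dev[j]
                w_sum += 1.0
    return (n / w_sum) * (cross / sum(x * x for x in dev))


# Hypothetical residuals on a 4x4 grid of unit-spaced locations.
coords = [(r, c) for r in range(4) for c in range(4)]
clustered = [1.0 if c < 2 else -1.0 for r, c in coords]          # west high, east low
checkerboard = [1.0 if (r + c) % 2 == 0 else -1.0 for r, c in coords]

print(round(morans_i(clustered, coords, band=1.0), 3))     # 0.667
print(round(morans_i(checkerboard, coords, band=1.0), 3))  # -1.0
```

With no spatial autocorrelation the expected value is -1/(n-1), which is close to zero. Clearly positive values suggest that similar residuals cluster together, and clearly negative values suggest that neighbors tend to differ.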
Challenges…
Other statistical issues
1 Different datasets may be collected at different scales, raising the issue of the Modifiable Areal Unit Problem (MAUP)
  [Figure: MAUP illustration; values shown: 17 11.7 13.7 17 15 22 18 10 7 5 10 12 18 11.7 9]
2 There is inherent spatial and temporal uncertainty in the data
Challenges…
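
The MAUP point above can be illustrated with a toy example: the same cell values, aggregated into two different sets of zones, give quite different zone averages. The grid values below are made up for illustration only, and zone_means is a hypothetical helper.

```python
# Made-up cell values on a 4x4 grid (illustration only, not real data).
GRID = [
    [2, 4, 10, 12],
    [4, 6, 12, 14],
    [20, 22, 1, 3],
    [22, 24, 3, 5],
]


def zone_means(grid, zone_of):
    """Average the cell values in each zone; zone_of(row, col) gives the zone label."""
    totals, counts = {}, {}
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            zone = zone_of(r, c)
            totals[zone] = totals.get(zone, 0) + value
            counts[zone] = counts.get(zone, 0) + 1
    return {zone: totals[zone] / counts[zone] for zone in totals}


blocks = zone_means(GRID, lambda r, c: (r // 2, c // 2))  # four 2x2 blocks
strips = zone_means(GRID, lambda r, c: c)                 # four vertical strips

print(blocks)  # {(0, 0): 4.0, (0, 1): 12.0, (1, 0): 22.0, (1, 1): 3.0}
print(strips)  # {0: 12.0, 1: 14.0, 2: 6.5, 3: 8.5}
```

Both zonings cover the same 16 cells, yet the block averages range from 3 to 22 while the strip averages range only from 6.5 to 14, so conclusions drawn from aggregated data depend on how the zones are drawn.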
What to do about it?
1 Test for autocorrelation in residuals
2 Conduct spatially-based regression
  Local and global level
3 Evaluate if residuals are lower and pattern appears more random with spatial regressions
Conclusions…
Is Spatial data Special?
1 Some individuals may argue that spatial data is no different from ordinary X and Y data.
2 Yes, spatial data is special. Location does matter. Nearby events are more correlated with one another.
3 Spatial data calls for special techniques to tackle autocorrelation problems.
4 As software improves and is able to handle real 3D, spatial data is going to become even more important.