November 2012, Volume 51, Issue 7.
Edzer Pebesma
Abstract
This document describes classes and methods designed to deal with diﬀerent types of spatiotemporal data in R implemented in the R package spacetime, and provides exam ples for analyzing them. It builds upon the classes and methods for spatial data from package sp, and for time series data from package xts. The goal is to cover a number of useful representations for spatiotemporal sensor data, and results from predicting (spa tial and/or temporal interpolation or smoothing), aggregating, or subsetting them, and to represent trajectories. The goals of this paper are to explore how spatiotemporal data can be sensibly represented in classes, and to ﬁnd out which analysis and visualisation methods are useful and feasible. We discuss the time series convention of representing time intervals by their starting time only. This vignette is the main reference for the R package spacetime; it been published as Pebesma (2012), but is kept uptodate with the software.
Keywords:˜Time series analysis, spatial data, spatiotemporal statistics, Geographic Informa tion Systems.
Spatiotemporal data are abundant, and easily obtained. Examples are satellite images of parts of the earth, temperature readings for a number of nearby stations, election results for voting districts and a number of consecutive elections, trajectories for people or animals possibly with additional sensor readings, disease outbreaks or volcano eruptions.
Schabenberger and Gotway (2004) argue that analysis of spatiotemporal data often happens conditionally, meaning that either ﬁrst the spatial aspect is analysed, after which the temporal aspects are analysed, or reversed, but not in a joint, integral modelling approach, where space and time are not separated. As a possible reason they mention the lack of good software,
2
spacetime: SpatioTemporal Data in R
data classes and methods to handle, import, export, display and analyse such data. This R (R Development Core Team 2011) package is a start to ﬁll this gap.
Spatiotemporal data are often relatively abundant in either space, or time, but not in both. Satellite imagery is typically very abundant in space, giving lots of detail in high spatial resolution for large areas, but relatively sparse in time. Analysis of repeated images over time may further be hindered by diﬀerence in light conditions, errors in georeferencing resulting in spatial mismatch, and changes in obscured areas due to changed cloud coverage. On the other side, data from ﬁxed sensors give often very detailed signals over time, allowing for elaborate modelling, but relatively little detail in space because a very limited number of sensors is available. The cost of an in situ sensor network typically depends primarily on its spatial density; the choice of the temporal resolution with which the sensors register signals may have little eﬀect on total cost.
Although for example Botts, Percivall, Reed, and Davidson (2007) describe a number of open standards that allow the interaction with sensor data (describing sensor characteristics, requesting observed values, planning sensors, and processing raw sensed data to predeﬁned events), the available statistical or geographic information system (GIS) software for this is in an early stage, and scattered. This paper describes an attempt to combine available infrastructure in the R statistical environment with ideas from statistical literature (Cressie and Wikle 2011) and data base literature (Guting _{¨} and Schneider 2005) to a set of useful classes and methods for manipulating, plotting and analysing spatiotemporal data. A number of case studies from diﬀerent application areas will illustrate its use.
The paper is structured as follows. Section˜2 describes how spatiotemporal data are usually recorded in tables. Section˜3 describes a number of useful spatiotemporal layouts. Section˜4 introduces classes and methods for data, based on these layouts. Section˜5 presents a number of useful graphs for spatiotemporal data, and implementations for these. Section˜6 discusses the spatial and temporal footprint, or support, of data, and how time intervals are dealt with in practice. Section˜7 presents a number of worked examples, some of which include statistical analysis on the spatiotemporal data. Section˜8 points to further material, including further vignettes in package spacetime on spatiotemporal overlay and aggregation, and on using proxy data sets to PostgreSQL tables when data are too large for R. Section˜9 ﬁnishes with a discussion.
This paper is also found as a vignette in package spacetime, which implements the classes and methods for spatiotemporal data described here. The vignette is kept uptodate with the software.
For reasons of simplicity, spatiotemporal data often come in the form of single tables. If this is the case, they come in one of three forms:
timewide where diﬀerent columns reﬂect diﬀerent moments in time,
spacewide where diﬀerent columns reﬂect diﬀerent measurement locations or areas, or
long formats where each record reﬂects a single time and space combination.
Journal of Statistical Software
3
Alternatively, they may be stored in diﬀerent, related tables, which is more typical for rela tional data bases, or in tree structures which is typical for xml ﬁles. We will now illustrate the diﬀerent singletable formats with simple examples.
2.1. Timewide format 

Spatiotemporal data for which each location has data for each time can be provided in two socalled wide formats. An example where a single column refers to a single moment or period in time is found in the North Carolina Sudden Infant Death Syndrome (sids) data set (Symons, Grimson, and Yuan 1983) available from package maptools LewinKoh, Bivand, Pebesma, Archer, Baddeley, Bibiko, Dray, Forrest, Friendly, Giraudoux, Golicher, Rubio, Hausmann, Hufthammer, Jagger, Luque, MacQueen, Niccolai, Short, Stabler, and Turner (2011), which is in the timewide format: 

R> 
library("foreign") 

R> 
read.dbf(system.file("shapes/sids.dbf", package="maptools"))[1:5,c(5,9:14)] 

NAME 
BIR74 
SID74 
NWBIR74 
BIR79 
SID79 
NWBIR79 

1 
Ashe 
1091 
1 
10 
1364 
0 
19 

2 
Alleghany 
487 
0 
10 
542 
3 
12 

3 
Surry 
3188 
5 
208 
3616 
6 
260 

4 
Currituck 
508 
1 
123 
830 
2 
145 

5 
Northampton 
1421 
9 
1066 
1606 
3 
1197 

where columns refer to a particular time: SID74 contains to the infant death syndrome cases for each county at a particular time period (19741984). 

2.2. Spacewide format 

The Irish wind data (Haslett and Raftery 1989) available from package gstat (Pebesma 2004), 

for which the ﬁrst six records and 9 of the stations (abbreviated by RPT, VAL, ... by ) 
are shown 

R> 
data("wind", 
package 
= "gstat") 

R> 
wind[1:6,1:12] 

year 
month 
day 
RPT 
VAL 
ROS 
KIL 
SHA 
BIR 
DUB 
CLA 
MUL 

1 
61 
1 
1 
15.04 
14.96 
13.17 
9.29 
13.96 
9.87 
13.67 
10.25 
10.83 

2 
61 
1 
2 
14.71 
16.88 
10.83 
6.50 
12.62 
7.67 
11.50 
10.04 
9.79 

3 
61 
1 
3 
18.50 
16.88 
12.33 
10.13 
11.17 
6.17 
11.25 
8.04 
8.50 

4 
61 
1 
4 
10.58 
6.63 
11.75 
4.58 
4.54 
2.88 
8.63 
1.79 
5.83 

5 
61 
1 
5 
13.33 
13.25 
11.42 
6.17 
10.71 
8.21 
11.92 
6.54 
10.92 

6 
61 
1 
6 
13.21 
8.12 
9.96 
6.67 
5.37 
4.50 
10.67 
4.42 
7.17 
are in spacewide format: each column refers to another wind measurement location, and the rows reﬂect a single time period; wind was reported as daily average wind speed in knots (1 knot = 0.5418 m/s).
4
spacetime: SpatioTemporal Data in R
2.3. 
Long format 

Finally, panel data are shown in long form, where the full spatiotemporal information is held in a single column, and other columns denote location and time. In the Produc data set (Baltagi 2001), a panel of 48 observations from 1970 to 1986 available from package plm (Croissant and Millo 2008), the ﬁrst ﬁve records and nine columns are 

R> 
data("Produc", package 
= 
"plm") 

R> 
Produc[1:5,1:9] 

state 
year 
pcap 
hwy 
water 
util 
pc 
gsp 
emp 

1 
ALABAMA 
1970 
15032.67 
7325.80 
1655.68 
6051.20 
35793.80 
28418 
1010.5 

2 
ALABAMA 
1971 
15501.94 
7525.94 
1721.02 
6254.98 
37299.91 
29375 
1021.9 

3 
ALABAMA 
1972 
15972.41 
7765.42 
1764.75 
6442.23 
38670.30 
31303 
1072.3 

4 
ALABAMA 
1973 
16406.26 
7907.66 
1742.41 
6756.19 
40084.01 
33430 
1135.5 

5 
ALABAMA 
1974 
16762.67 
8025.52 
1734.85 
7002.29 
42057.31 
33749 
1169.8 

where the ﬁrst two columns denote space and time (the default assumption for package plm), and e.g., pcap reﬂects private capital stock. 

None of these examples has strongly referenced spatial or temporal information: it is from the data alone not clear that the number 1970 refers to a year, or that ALABAMA refers to a state, and where this state is. Section 7 shows for each of these three cases how the data can be converted into classes with strongly referenced space and time information. 

3. Spacetime layouts 

In the following we will use the word spatial feature (Herring 2011) to denote a spatial entity. This can be a particular spatial point (location), a line or set of lines, a polygon or set of polygons, or a pixel (grid or raster cell). For a particular feature, one or more measurements are registered at particular moments in time. 

Four layouts of spacetime data will be discussed next. Two of them reﬂect lattice layouts, one that is eﬃcient when a particular spatial feature has data values for more than one time point, and one that is most eﬃcient when all spatial feature have data values at each time point. Two others reﬂect irregular layouts, one of them specializes to trajectories. 

3.1. 
Spatiotemporal full grids 
A full spacetime grid of observations for spatial features (points, lines, polygons, grid cells) ^{1}
s _{i} ,i = 1,
n and observation time t _{j} , j
= 1,
m is obtained when the full set of n × m nm. We choose to cycle spatial features ﬁrst,
...
,
, set of observations z _{k} is stored, with k = 1,
...
, so observation k corresponds to feature s _{i} , i = ((k − 1) % n) + 1 and with time moment t _{j} ,
...
j = ((k − 1)/n) + 1, with / integer division and % integer division remainder (modulo). The t _{j} are assumed to be in time order.
In this data class (top left in Figure˜1), for each spatial feature, the same temporal sequence of data is sampled. Alternatively one could say that for each moment in time, the same set of
^{1} note that neither spatial features nor time points need to follow a regular layout
Journal of Statistical Software
5
3rd
3rd
Spatial features
Spatial features
2nd
2nd
1st
1st
Spatial features
STF: full grid layout
STS: sparse grid layout
1st 
3rd 
4th 
1st 
2nd 
4th 
Time points 
Time points 
STI: irregular layout
STT: trajectory




1st 
2nd 
4th 

Time points 
Time points 
Figure 1: Four spacetime layouts: (i) the topleft: full grid (STF) layout stores all spacetime combinations; (ii) topright: the sparse grid (STS) layout stores only the nonmissing space time combinations on a lattice; (iii) bottomleft: the irregular (STI) layout: each observation has its spatial feature and time stamp stored, in this example, spatial feature 1 is stored twice – the fact that observations 1 and 4 have the same feature is not registered; (iv) bottom right:
simple trajectories (STT), plotted against a common time axis. It should be noted that in both gridded layouts the grid is in spacetime, meaning that spatial features can be gridded, but can also be any other nongridded type (lines, points, polygons).
6
spacetime: SpatioTemporal Data in R
spatial entities is sampled. Unsampled combinations of (space, time) are stored in this class, but are assigned a missing value NA.
It should be stressed that for this class (and the next) the word grid in spatiotemporal grid refers to the layout in spacetime, not in space. Examples of phenomena that could well be represented by this class are regular (e.g., hourly) measurements of air quality at a spatially irregular set of points (measurement stations), or yearly disease counts for a set of administrative regions. An example where space is gridded as well could be a sequence of rainfall images (e.g., monthly sums), interpolated to a spatially regular grid.
3.2. Spatiotemporal sparse grids
A sparse grid has the same general layout, with measurements laid out on a space time grid (top right in Figure˜1), but instead of storing the full grid, only nonmissing valued observations z _{k} are stored. For each k, an index [i, j] is stored that refers which spatial feature i and time point j the value belongs to.
Storing data this way may be eﬃcient
If full spacetime lattices have many missing or trivial values (e.g., when one want to store features or grid cells where ﬁres were registered, discarding those that did not),
If a limited set of spatial features each have diﬀerent time instances (e.g., to record the times of crime cases for a set of administrative regions), or,
If for a limited set of times the set of spatial features varies (e.g., locations of crimes registered per year, or spatially misaligned remote sensing images).
3.3. Spatiotemporal irregular data
Spacetime irregular data cover the case where time and space points of measured values have no apparent organisation: for each measured value the spatial feature and time point is stored, as in the long format. This is equivalent to the (maximally) sparse grid where the index for observation k is [k, k], and hence can be dropped. For these objects, n = m equals the number of records. Spatial features and time points need not be unique, but are replicated in case they are not.
Any of the gridded types can be represented by this layout, in principle, but this would have the disadvantages that
Spatial features and time points need to be stored for each data value, and would be redundant,
The regular layout is lost, and needs be retrieved indirectly, Spatial and temporal selection would be ineﬃcient, as the grid structure forms an index.
Examples of phenomena that are best served by this layout could be spatiotemporal point processes, such as crime or disease cases or forest ﬁres. Other phenomena could be measure ments from mobile sensors (when the trajectory sequence is not of importance).
Journal of Statistical Software
7
3.4. Interval time, moving objects, trajectories
In their book“moving objects databases”, Guting _{¨} and Schneider (2005) distinguish 10 diﬀerent data types in spacetime. In particular, they deﬁne for point features ^{2} .
a Sets of events without temporal duration (time is an instant), e.g., accidents, lightning, birth, death;
b Sets of events with a temporal duration but no movement, e.g., a tree, a (point in the) capital of a country, people’s home address;
c (Sets of) moving points, e.g., the trajectories of one or more persons, or birds.
To accomodate this typology we distinguish three cases, shown in ﬁgure 2:
(i) Time is instant and the feature is not moving (it may only exist at a time instant), (ii) Time is interval, objects do not move during this interval, (iii) Time is instant and features move (objects exist between time instants and may move) along a trajectory.
When time reﬂects intervals, it means that the spatial feature (spatial location or extent of the observation) or its associated data values does not change during this interval, but reﬂects the value or state during this interval. An examples is the yearly mean temperature of a country or of a set of locations, or the existence (duration) of a nation with a particular layout of its boundaries.
Time instants can reﬂect the moments of change (e.g., the start of the meteorological summer), or events with a zero or negligible duration (e.g., an earthquake, a lightning).
Movement reﬂects the fact that moving objects exist and may change location during a time interval. For moving object data, time instants reﬂect the location at a particular moment, and movement occurs between registered (time, feature) pairs, and must be continuous.
Trajectories cover the case where sets of (irregular) spacetime points form sequences, and depict a trajectory. Their grouping may be simple (e.g., the trajectories of two persons on diﬀerent days), nested (for several objects, a set of trajectories representing diﬀerent trips) or complex (e.g., with objects that split, merge, or disappear).
Examples of trajectories can be human trajectories, mobile sensor measurements (where the sequence is kept, e.g., to derive the speed and direction of the sensor), or trajectories of tornados where the tornado extent of each time stamp can be reﬂected by a diﬀerent polygon.
The diﬀerent layouts, or types, of spatiotemporal data discussed in Section˜3 have been implemented in the spacetime R package, along with methods for import, export, coercion, selection, and visualisation.
^{2}
They obtain 10 types by adding the singular/atomic form of a and b, distinguishing area from point features.
and doubling this
set
of
5
by
8
spacetime: SpatioTemporal Data in R
Figure 2: Time instant (top left), object movement (top right), time interval with consecutive (bottom left) or arbitrary (bottom right) intervals. s _{1} refers to the ﬁrst feature/location, t _{1} to the ﬁrst time instance or interval, thick lines indicate time intervals, arrows indicate movement. Filled circles denote start time, empty circles end times, intervals are rightclosed.
4.1. Classes
The classes for the diﬀerent layouts are shown in Figure˜3. Similar to the classes of package sp (Pebesma and Bivand 2005; Bivand, Pebesma, and GomezRubio 2008), the classes all derive from a base class ST which is not meant to represent actual data. The ﬁrst order derived classes specify particular spatiotemporal geometries (i.e., only the spatial and temporal information), the second order derived classes augment each of these with actual data, in the form of a data.frame.
To store temporal information, we chose to use objects of class xts in package xts (Ryan and Ulrich 2011) for time, because
It extends the functionality of package zoo (Zeileis and Grothendieck 2005),
It supports several basic types to represent time or date:
Date,
POSIXt,
timeDate,
yearmon, and yearqtr, It has good tools for aggregation over time using arbitrary aggregation functions, essen tially deriving this from package zoo (Zeileis and Grothendieck 2005), It has a ﬂexible syntax to select time periods that adheres to ISO 8601 ^{3} .
An overview of the diﬀerent time classes in R is found in Ripley and Hornik (2001). Further advice on which classes to use is found in Grothendieck and Petzoldt (2004), or in the CRAN task view on time series analysis.
Journal of Statistical Software
9
Figure 3: Classes for spatiotemporal data in package spacetime. Arrows denote inheritance, lower side of boxes list slot name and type, green lines indicate possible coercions (both ways).
For spatial interpolation, we used the classes deriving from Spatial in package sp (Pebesma and Bivand 2005; Bivand et˜al. 2008) because
They are the dominant set of classes in R for dealing with spatial data, They are interfaced to key external libraries through packages rgdal and rgeos, and, They provide a single interface to dealing with points, lines, polygons and grids.
We do not use xts or Spatial objects to store spatiotemporal data values, but we use data.frame to store data values. For purely temporal information the xts objects can be used, and for purely spatial information the sp objects can be used. These will be recycled appropriately when coercing to a long format data.frame.
The spatial features supported by package sp are twodimensional for lines and polygons, but may be higher (three) dimensional for spatial points, pixels and grids.
4.2. Methods
The main methods for spatiotemporal data implemented in packages spacetime are listed in table 1. Their usage is illustrated in examples that follow.
4.3. Creation
Construction of spatiotemporal objects essentially needs speciﬁcation of the spatial, the tem poral, and the data values. The documentation of stConstruct contains examples of how
10
spacetime: SpatioTemporal Data in R
method 
what it does 

stConstruct 
Creates STFDF or STIDF objects from single or multiple tables 

[[, 
$, $< 
Select or replace data values 
[ 
Select spatial and/or temporal subsets, and/or data vari ables 

as coerce to other spatiotemporal objects, xts, Spatial, 

stplot 
matrix, or data.frame create spatiotemporal plots, see Section˜5 

over 
overlay: retrieve index or data values of one object at the 

aggregate 
locations and times of another aggregate data values over particular spatial, temporal, or 

spatiotemporal domains 
Table 1: Methods for spatiotemporal data in package spacetime.
this can be done from long, spacewide, and timewide tables, or from shapeﬁles. A simple toy example for a full grid layout with three spatial points and four time instances is given below. First, the spatial object is created:
R> 
sp 
= 
cbind(x 
= 
c(0,0,1), 
y 
= c(0,1,1)) 

R> 
row.names(sp) 
= 
paste("point", 
1:nrow(sp), 
sep="") 

R> 
library(sp) 

R> 
sp 
= 
SpatialPoints(sp) 
Then, the time points are deﬁned as four time stamps, one hour apart, starting Aug 5 2010, 10:00 GMT.
R> time = as.POSIXct("20100805", tz = "GMT")+3600*(10:13)
Next, a data.frame with the data values is created:
R> 
m 
= 
c(10,20,30) # 
means 
for 
each 
of 
the 
3 point 
locations 

R> 
values 
= 
rnorm(length(sp)*length(time), 
mean 
= 
rep(m, 
4)) 

R> 
IDs 
= paste("ID",1:length(values), 
sep 
= 
"_") 

R> 
mydata 
= 
data.frame(values 
= 
signif(values, 
3), 
ID=IDs) 
And ﬁnally, the STFDF object can be created by:
R> 
library(spacetime) 

R> 
stfdf 
= 
STFDF(sp, 
time, 
data 
= 
mydata) 
In this case, as no endTime is speciﬁed, it is assumed that time reﬂects time instances. Altnatively, one could specify endTime, as in
R> stfdf = STFDF(sp, time, mydata, time+60)
Journal of Statistical Software
11
where the time intervals (Section˜6.1) are one minute.
When given a long table, stConstruct creates an STFDF object if all space and time combi nations occur only once, or else an object of class STIDF, which might be coerced into other representations.
4.4. Overlay and aggregation
Aggregation of data values to a coarser spatial or temporal form (e.g., to a coarser grid, aggre gating points over administrative regions, aggregating daily data to monthly data, or aggre gation along an irregular set of spacetime points) can be done using the method aggregate. To obtain the required aggregation predicate, i.e., the grouping of observations in spacetime, the method over is implemented for objects deriving from ST. Grouping can be done based on spatial, temporal, or spatiotemporal predicates. It takes care of the case whether time reﬂects time instances or time intervals (see section 6.1). These methods eﬀectively provide a spatiotemporal equivalent to what is known in geographic information science as the spatial overlay.
4.5. Space and time selection with [
The idea behind the [ method for classes in sp was that objects would behave as much as possible similar to matrix and data.frame. For a data.frame, a construct like a[i,j] selects row(s) i and column(s) j. For objects deriving from Spatial, rows were taken as the spatial features (points, lines, polygons, pixels) and columns as the data variables ^{4} .
For the spatiotemporal data classes described here, a[i,j,k] selects spatial features i, tem poral instances j, and data variable(s) k. Unless drop=FALSE is added to such a call, selecting a single time or single feature results in an object that is no longer spatiotemporal, but either snapshot of a particular moment, or history at a particular feature (Galton 2004).
Similar to selection on spatial objects in sp and time series objects in xts, space and time indices can be deﬁned by index or boolean vectors, but also by higherlevel expressions such as spatial areas and time periods. For instance, the selection
R> air_quality[2:3, 1:10, "PM10"]
yields air quality data for the second and third spatial features, and the ﬁrst 10 time instances. Higherlevel spatial and temporal expressions can be used, and
R> air_quality[Germany, "2008::2009", "PM10"]
selects the PM10 measurements for the years 20089, lying in Germany, when Germany is a Spatial object (e.g., a SpatialPolygons) that deﬁnes Germany.
4.6. Coercion to long and wide tables
Spatiotemporal data objects can be coerced to the corresponding purely spatial objects. Objects of class STFDF will be represented in timewide form, where only the ﬁrst (selected) data variable is retained:
^{4} a convention that was partially broken for class SpatialGridDataFrame, where a[i,j,k] could select the kth data variable of the spatial grid selection with spatial grid row(s) i and column(s) j, unless the length of i equals the number of grid cells.
12
spacetime: SpatioTemporal Data in R
R> 
xs1 
= as(stfdf, 
"Spatial") 

R> 
class(xs1) 

[1] 
"SpatialPointsDataFrame" 

attr(,"package") 

[1] 
"sp" 

R> 
xs1 

coordinates 
X2010.08.05.10.00.00 
X2010.08.05.11.00.00 

point1 
(0, 
0) 
9.66 
9.64 

point2 
(0, 
1) 
21.20 
19.50 

point3 
(1, 
1) 
29.90 
32.10 

X2010.08.05.12.00.00 
X2010.08.05.13.00.00 

point1 
8.98 
11.4 

point2 
19.60 
19.8 

point3 
30.20 
29.8 
as time values are diﬃcult to retrieve from these column names, this object gets the proper time values as an attribute:
R> attr(xs1, "time")
[1] 
"20100805 
10:00:00 
GMT" 
"20100805 
11:00:00 
GMT" 
[3] 
"20100805 
12:00:00 
GMT" 
"20100805 
13:00:00 
GMT" 
Objects of class STSDF or STIDF will be represented in long form, where time is added as additional column:
R> 
x 
= 
as(stfdf, 
"STIDF") 

R> 
xs2 
= 
as(x, "Spatial") 

R> 
class(xs2) 

[1] 
"SpatialPointsDataFrame" 

attr(,"package") 

[1] 
"sp" 
R> xs2[1:4,]
coordinates 
values 
ID 
time 


0) 
9.66 
ID_1 
20100805 
10:00:00 

1) 
21.20 
ID_2 
20100805 
10:00:00 

1) 
29.90 
ID_3 
20100805 
10:00:00 

0) 
9.64 
ID_4 
20100805 
11:00:00 
Journal of Statistical Software
13
5.1. stplot: panels, spacetime plots, animation
The stplot method can create a few specialized plot types for the classes in the spacetime package. They are:
multipanel plots In this form, for each time step (selected) a map is plotted in a separate panel, and the strip above the panel indicates what the panel is about. The panels share x and yaxis, no space needs to be lost by separating white space, and a common legend is used. Three types are implemented for STFDF data:
The x and y axis denote space, an example for gridded data is shown in Figure˜4, for polygon data in Figure˜9. The stplot is a wrapper around spplot in package sp, and inherits most of its options,
The x and y axis denote time and value; one panel for each spatial feature, colors may indicate diﬀerent variables (mode="tp"); see Figure˜5 (left),
The x and y axis denote time and value; one panel for each variable, colors may denote diﬀerent features (mode="ts"); see Figure˜5 (right).
For both cases with time is on the yaxis (Figure˜5), values over time for diﬀerent variables or features are connected with lines, as is usual with time series plots. This can be changed to symbols by specifying type=’p’.
spacetime plots Spacetime plots show data in a spacetime crosssection, with e.g., space on the xaxis and time on the yaxis. (See also Figure˜1.)
Hovm¨oller diagrams (Hovm¨oller 1949) are an example of these for full spacetime lat tices, i.e., objects of class STFDF. To obtain such a plot, the arguments mode and scaleX
should be considered; some special care is needed when only the x or yaxis needs to
be plotted instead of the spatial index (1
n);
details are found in the stplot documen
... tation. An example of a Hovm¨ollerstyle plot with station index along the xaxis and
time along the yaxis is obtained by
R> 
scales=list(x=list(rot 
= 
45)) 

R> 
stplot(wind.data, 
mode 
= 
"xt", 
scales 
= 
scales, 
xlab 
= 
NULL) 
and shown in Figure˜6. Hovm¨oller plots.
Note that the yaxis direction is opposite to that of regular
animated plots Animation is another way of displaying change over time; a sequence of spplots, one for each time step, is looped over when the parameter animate is set to a positive value (indicating the time in seconds to pause between subsequent plots).
Time series plots Time series plots are a fairly common type of plot in R. Package xts has a plot method that allows univariate time series to be plotted. Many (if not most) plot routines in R support time to be along the x or yaxis. The plot in Figure˜7 was generated by using package lattice (Sarkar 2008), and uses a colour palette from package RColorBrewer (Neuwirth 2011).
14
spacetime: SpatioTemporal Data in R
0.5
0.0
−0.5
−1.0
Figure 4: Spacetime interpolations of wind (square root transformed, detrended) over Ireland using a separable product covariance model, for 10 time points regularly distributed over the month for which daily data was considered (April, 1961).
6.1. Time periods or time instances
Most data structures for time series data in R have, explicitly or implicitly, for each record a time stamp, not a time interval. The implicit assumption seems to be (i) the time stamp is a moment, (ii) this indicates either the real moment of measurement / registration, or the start of the interval over which something is aggregated (summed, averaged, maximized). For ﬁnancial “Open, high, low, close” data, the “Open” and “Close” refer to the values at the moment the stock exchange opens and closes, meaning time instances, whereas “high” and “low” are aggregated values – the minimum and maximum price over the time interval between opening and closing times.
Package lubridate (Grolemund and Wickham 2011) allows one to explicitly deﬁne and com pute with time intervals (e.g., Allen (1983)). It does not provide structures to attach these intervals to time series data. As xts does not support these times as index, spacetime does also not support it.
According to ISO 8601:2004, a time stamp like 201005 refers to the full month of May, 2010, and so reﬂects a time interval. As a selection criterion, xts will include everything inside the following interval:
R> 
library(xts) 
R> 
.parseISO8601( 201005 ) 
$first.time
Journal of Statistical Software
15
values
values
pcap hwy util water
●
●
●
●
100000
50000
0
1970
1975
1980
1985
california
1970
1975
1980
1985
100000
50000
0
time
arizona alabama
california arkansas
●
●
●
●
100000
50000
0
1970
1975
1980
1985
pcap
util
1970
1975
1980
1985
100000
50000
0
time
Figure 5: Time series for four variables and four features plotted with stplot, with mode="tp"
(left) and mode="ts" (right); see also Section˜7.2.
[1] 
"20100501 
CEST" 

$last.time 

[1] 
"20100531 
23:59:59 
CEST" 
and this syntax lets one deﬁne, unambiguously, yearly, monthly, daily, hourly or minute
intervals, but not e.g. 10 ˜ or 30minute intervals. For a particular interval, the full speciﬁcation
is needed:
R> .parseISO8601( 20100501T13:30/20100501T13:39 )
$first.time 

[1] 
"20100501 
13:30:00 
CEST" 
$last.time 

[1] 
"20100501 
13:39:59 
CEST" 
When matching two sequences of time (Figure˜8) in order to overlay or aggregate, it matters
whether each of the sequences reﬂect instances, one of them reﬂects time intervals and the
other instances, or both reﬂect time intervals. All of these cases are accommodated for.
Objects in spacetime register both (start) time and end time. By default, objects with gridded
spacetime layout (Figure˜1) of class or deriving from STF or STS assume interval time, and
STI and STT objects assume instance time.
When no end times are supplied by creation and time intervals are assumed, the assumption
is that time intervals are consecutive (Figure˜2), and the last interval (for which no end time
is present) has a length identical to the second last interval (Figures˜2 and 8).
16
spacetime: SpatioTemporal Data in R
time
Malin Head
Belmullet
_{R}_{o}_{c}_{h}_{e}_{'}_{s} _{P}_{o}_{i}_{n}_{t} Valentia Roslare Kilkenny Shannon Birr
Mullingar
Clones
Claremorris
Dublin
Apr 24
Apr 17
Apr 10
Apr 03
1.0
0.5
0.0
−0.5
−1.0
−1.5
Figure 6: Spacetime (Hovm¨oller) plot of wind station data.
6.2. Spatial support
All examples above work with spatial points, i.e., data having a point support. The assump
tion of data having points support is implicit for SpatialPoints features. For polygons, the
assumption will be that values reﬂect aggregates (e.g., sums, or averages) over the polygon.
For gridded data, it is ambiguous whether the value at the grid cell centre is meant (e.g.,
for DEM data) or an aggregate over the grid cell (typical for remote sensing imagery). The
Spatial* objects of package sp have no explicit information about the spatial support.
This section shows how existing data in various formats can be converted into ST classes, and
how they can be analysed and/or visualised.
7.1. North Carolina SIDS
As an example, the North Carolina Sudden Infant Death Syndrome (sids) data will be used.
Journal of Statistical Software
17
1961
●
●
●
●
●
●
●
●
●
●
●
●
Figure 7: Time series plot of daily wind speed at 12 stations, used for interpolation in Figure˜4.
time
Figure 8: Matching two time sequences, assuming x reﬂects time intervals, and y reﬂects time
instances. Note that the last interval extends the last time instance of x.
These data were ﬁrst analysed by Symons et˜al. (1983), and ﬁrst published and analysed
in a spatial setting by Cressie and Chan (1989). They are available from package maptools
(LewinKoh et˜al. 2011). The data are sparse in time (aggregated to 2 periods of unequal
length, according to the documentation in package spdep), but have polygons in space. First,
we will prepare the spatial data:
R> 
library("maptools") 

R> 
fname 
= system.file("shapes/sids.shp", package="maptools")[1] 

R> 
nc 
= readShapePoly(fname, proj4string=CRS("+proj=longlat 
+datum=NAD27")) 
then, we construct the time sequence:
R> 
time 
= as.POSIXct(c("19740701", "19790701"), 
tz 
= 
"GMT") 

R> 
endTime 
= as.POSIXct(c("19780630", "19840630"), 
tz 
= 
"GMT") 
and we construct the data values table, in long form, by
18
spacetime: SpatioTemporal Data in R
R> 
data 
= data.frame( 

+ 
BIR 
= 
c(nc$BIR74, nc$BIR79), 

+ 
NWBIR 
= c(nc$NWBIR74, 
nc$NWBIR79), 

+ 
SID 
= 
c(nc$SID74, nc$SID79)) 
These three components are put together by function STFDF:
R>
nct
=
STFDF(sp
=
as(nc,
"SpatialPolygons"),
time,
data,
endTime)
7.2. Panel data
The panel data discussed in Section˜2 are imported as a full spatiotemporal data.frame
(STFDF), and linked to the proper state polygons of maps. We can obtain the states polygons
from package map (Brownrigg and Minka 2011) by:
R> 
library("maps") 

R> 
states.m 
= 
map( state , plot=FALSE, 
fill=TRUE) 

R> 
IDs 
< 
sapply(strsplit(states.m$names, 
":"), function(x) 
x[1]) 

R> 
library("maptools") 

R> 
states 
= 
map2SpatialPolygons(states.m, 
IDs=IDs) 

we obtain the time points by: 

R> 
yrs 
= 1970:1986 

R> 
time 
= 
as.POSIXct(paste(yrs, "0101", 
sep=""), 
tz 
= "GMT") 

We obtain the data table (already in long format) by 

R> 
library("plm") 

R> 
data("Produc") 
When combining all this information, we do not need to reorder states because states and
Produc
order
states alphabetically.
present in Produc table (record 8):
We need to deselect District of Columbia, which is not
R> 
# 
deselect 
District 
of Columbia, polygon 
8, which 
is 
not present 
in 
Produc: 

R> 
Produc.st 
= STFDF(states[8], 
time, Produc[order(Produc[2], Produc[1]),]) 

R> 
library(RColorBrewer) 

R> 
stplot(Produc.st[,,"unemp"], 
yrs, col.regions 
= brewer.pal(9, "YlOrRd"),cuts=9) 
produces the plot shown in Figure˜9.
Time and state were not removed from the data table on construction; printing these data
after coercion to data.frame can then be used to verify that time and state were matched
correctly.
The routines in package plm can be used on the data, when back transformed to a data.frame,
when index is used to specify which variables represent space and time (the ﬁrst two columns
from the data.frame no longer contain state and year). For instance, to ﬁt a panel linear
model for gross state products (gsp) to private capital stock (pcap), public capital (pc), labor
input (emp) and unemployment rate (unemp), we get
Journal of Statistical Software
19
Figure 9: Unemployment rate per state, over the years 19701986.
R> 
zz 
< plm(log(gsp) 
~ 
log(pcap) 
+ 
log(pc) 
+ log(emp) 
+ 
unemp, 

+ 
data 
= as.data.frame(Produc.st), 
index 
= c("state", 
"year")) 
where the output of summary(zz) is left out for brevity. More details are found in Croissant
and Millo (2008) and Millo and Piras (2012).
7.3. Interpolating Irish wind
This worked example is a modiﬁed version of the analysis presented in demo(wind) of package
gstat (Pebesma 2004). This demo is rather lengthy and reproduces much of the original
analysis in Haslett and Raftery (1989). Here, we will reduce the material and focus on the
use of spatiotemporal classes.
First, we will load the wind data from package gstat. It has two tables, station locations
in a data.frame, called wind.loc, and daily mean wind speed in data.frame wind. We
now convert character representation (such as 51d56’N) to proper numerical coordinates, and
convert the station locations to a SpatialPointsDataFrame object. A plot of these data is
shown in Figure˜10.
20
spacetime: SpatioTemporal Data in R
R> 
library("gstat") 

R> 
data("wind") 

R> 
wind.loc$y 
= as.numeric(char2dms(as.character(wind.loc[["Latitude"]]))) 

R> 
wind.loc$x 
= as.numeric(char2dms(as.character(wind.loc[["Longitude"]]))) 

R> 
coordinates(wind.loc) 
= 
~x+y 

R> 
proj4string(wind.loc) 
= 
"+proj=longlat 
+datum=WGS84" 
Figure 10: Station locations for Irish wind data.
The ﬁrst thing to do with the wind speed values is to reshape these data. Unlike the North
Carolina SIDS data of section 7.1, for we have few spatial and many time points, and so the
data in data.frame wind come in spacewide form with stations time series in columns:
R> wind[1:3,]
year month day RPT VAL 
ROS 
KIL 
SHA 
BIR 
DUB 
CLA 
MUL 
61 1 1 15.04 
13.17 
9.29 
13.96 
9.87 
13.67 
10.25 
10.83 
14.71
61 1 2 
10.83 
6.50 
12.62 
7.67 
11.50 
10.04 
9.79 
61 1 3 18.50 
12.33 
10.13 
11.17 
6.17 
11.25 
8.04 
8.50 
Journal of Statistical Software
21
CLO
BEL
MAL
1 15.04
12.58
18.50
2 13.83
9.67
17.54
3 12.71
7.67
12.75
We will recode the time columns to an appropriate time data structure,
R> 
wind$time 
= 
ISOdate(wind$year+1900, wind$month, 
wind$day) 

R> 
wind$jday 
= 
as.numeric(format(wind$time, 
%j )) 
and then subtract a smooth time trend of daily means (not exactly equal, but similar to the
trend removal in the original paper):
R> 
stations 
= 
4:15 

R> 
windsqrt 
= 
sqrt(0.5148 
* as.matrix(wind[stations])) 
# 
knots 
> 
m/s 

R> 
Jday 
= 1:366 

R> 
windsqrt 
= 
windsqrt 
 mean(windsqrt) 

R> 
daymeans 
= 
sapply(split(windsqrt, 
wind$jday), 
mean) 

R> 
meanwind 
= 
lowess(daymeans 
~ 
Jday, 
f 
= 0.1)$y[wind$jday] 

R> 
velocities 
= apply(windsqrt, 
2, function(x) 
{ 
x 
 meanwind 
}) 
Next, we will match the wind data to its location, by connecting station names to location
coordinates, and create a spatial points object:
R> 
wind.loc = wind.loc[match(names(wind[4:15]), wind.loc$Code),] 

R> 
pts 
= 
coordinates(wind.loc[match(names(wind[4:15]), 
wind.loc$Code),]) 

R> 
rownames(pts) 
= wind.loc$Station 

R> 
pts 
= 
SpatialPoints(pts, 
CRS("+proj=longlat +datum=WGS84")) 
Then, we project the longitude/latitude coordinates and country boundary to UTM zone
29, using spTransform in package rgdal (Keitt, Bivand, Pebesma, and Rowlingson 2011) for
coordinate transformation:
R> 
library("rgdal") 

R> 
utm29 
= 
CRS("+proj=utm 
+zone=29 
+datum=WGS84") 

R> 
pts 
= 
spTransform(pts, 
utm29) 
And now we can construct the spatiotemporal object from the spacewide table with veloci
ties:
R> 
wind.data 
= stConstruct(velocities, 
space 
= 
list(values 
= 
1:ncol(velocities)), 

+ 
time 
= 
wind$time, 
SpatialObj 
= pts, 
interval 
= TRUE) 

R> 
class(wind.data) 

[1] 
"STFDF" 

attr(,"package") 

[1] 
"spacetime" 
22
spacetime: SpatioTemporal Data in R
For plotting purposes, we can obtain country boundaries from package maps:
R> 
library("maptools") 

R> 
m 
= 
map2SpatialLines( 

+ 
map("worldHires", 
xlim 
= 
c(11,5.4), 
ylim 
= 
c(51,55.5), 
plot=F)) 

R> 
proj4string(m) 
= "+proj=longlat 
+datum=WGS84" 

R> 
m 
= 
spTransform(m, 
utm29) 

For interpolation, we can deﬁne a grid over the area: 

R> 
grd 
= SpatialPixels(SpatialPoints(makegrid(m, 
n 
= 300)), 

+ 
proj4string 
= proj4string(m)) 
Next, we (arbitrarily) restrict observations to those of April 1961:
R> wind.data = wind.data[, "196104"]
and choose 10 time points from that period to form the spatiotemporal prediction grid:
R> 
n 
= 
10 

R> 
library(xts) 

R> 
tgrd 
= 
seq(min(index(wind.data)), 
max(index(wind.data)), 
length=n) 

R> 
pred.grd = 
STF(grd, 
tgrd) 
We will interpolate with a separable exponential covariance model, with ranges 750 km and
1.5 days:
R> 
v 
= list(space 
= vgm(0.6, 
"Exp", 750000), 
time 
= 
vgm(1, "Exp", 
1.5 
* 
3600 
* 
24)) 

R> 
pred 
= krigeST(values 
~ 
1, 
wind.data, pred.grd, 
v) 

R> 
wind.ST 
= STFDF(grd, 
tgrd, 
data.frame(sqrt_speed 
= 
pred)) 
then creates the STFDF object with interpolated values, the results of which are shown in
Figure˜4, created by
R> 
layout 
= list(list("sp.lines", 
m, col= grey ), 

+ 
list("sp.points", 
pts, 
first=F, cex=.5)) 

R> 
stplot(wind.ST, col.regions=brewer.pal(11, "RdBu")[c(10,11)], 

+ 
at=seq(1.375,1,by=.25), 

+ 
par.strip.text 
= list(cex=.7), sp.layout 
= 
layout) 
7.4. Calculation of EOFs
Empirical orthogonal functions from STFDF objects can be computed in spatial form (default),
e.g., from data values
R> eof.data = EOF(wind.data)
or alternatively from modelled, or interpolated values:
Journal of Statistical Software
23
R> eof.int = EOF(wind.ST)
By default, spatial EOFs are competed; alternatively they can be obtained in temporal form:
R> eof.xts = EOF(wind.ST, "temporal")
the resulting object is of the appropriate subclass of Spatial in the spatial form, or of class xts
in the temporal form. Figure˜11 shows the 10 spatial EOFs obtained from the interpolated
wind data of Figure˜4.
Figure 11: EOFs of spacetime interpolations of wind over Ireland (for spatial reference, see
Figure˜4), for the 10 time points at which daily data was chosen above (April, 1961).
7.5. Trajectory data: ltraj in package adehabitatLT
Trajectory objects of class ltraj in package adehabitatLT (Calenge, Dray, and RoyerCarenzi
2008) are lists of bursts, sets of sequential, connected spacetime points at which an object is
registered. An example ltraj data set is obtained by ^{5} :
R> 
library("adehabitatLT") 

R> 
data("puechabonsp") 

R> 
locs 
= 
puechabonsp$relocs 

R> 
xy 
= 
coordinates(locs) 

R> 
da 
= 
as.character(locs$Date) 

R> 
da 
= 
as.POSIXct(strptime(as.character(locs$Date),"%y%m%d", 
tz 
= 
"GMT")) 

R> 
ltr 
= 
as.ltraj(xy, 
da, 
id 
= 
locs$Name) 
^{5} taken from adehabitatLT, demo(mangltraj)
24
spacetime: SpatioTemporal Data in R
R> 
foo 
= 
function(dt) 
dt 
> 100*3600*24 

R> 
l2 
= 
cutltraj(ltr, 
"foo(dt)", 
nextr 
= 
TRUE) 
and these data, converted to STTDF can be plotted, as panels by time and id by
R> 
sttdf 
= 
as(l2, 
"STTDF") 
R> 
stplot(sttdf, 
by="time*id") 
which is shown in Figure˜12.
Figure 12: Trajectories, split by id (rows) and by time (columns).
7.6. Country shapes in cshapes
The cshapes (Weidmann, Kuse, and Gleditsch 2010) package contains a GIS dataset of country
boundaries (19462008), and includes functions for data extraction and the computation of
distance matrices. The data set consist of a SpatialPolygonsDataFrame, with the following
data variables:
Journal of Statistical Software
25
R> 
library("cshapes") 

R> 
cs 
= 
cshp() 

R> 
names(cs) 

[1] "CNTRY_NAME" 
"AREA" 
"CAPNAME" 
"CAPLONG" 
"CAPLAT" 

[6] "FEATUREID" 
"COWCODE" 
"COWSYEAR" 
"COWSMONTH" "COWSDAY" 

[11] "COWEYEAR" 
"COWEMONTH" 
"COWEDAY" 
"GWCODE" 
"GWSYEAR" 

[16] "GWSMONTH" 
"GWSDAY" 
"GWEYEAR" 
"GWEMONTH" 
"GWEDAY" 

[21] "ISONAME" 
"ISO1NUM" 
"ISO1AL2" 
"ISO1AL3" 
R> row.names(cs) = paste(as.character(cs$CNTRY_NAME), 1:244)
where two data bases are used, “COW” (correlates of war project ^{6} ) and “GW” Gleditsch and
Ward (1999). The variables COWSMONTH and COWEMONTH denote the start month
and end month, respectively, according to the COW data base.
In the following fragment, we create the time index:
R> 
begin 
= 
as.POSIXct(paste(cs$COWSYEAR, cs$COWSMONTH, cs$COWSDAY, 
sep=""), 

+ 
tz 
= 
"GMT") 

R> 
end 
= 
as.POSIXct(paste(cs$COWEYEAR, cs$COWEMONTH,cs$COWEDAY, 
sep=""), 

+ 
tz 
= 
"GMT") 
and the spatiotemporal object:
R>
st
=
STIDF(geometry(cs),
begin,
as.data.frame(cs),
end)
A possible query would be which countries are found at 7 East and 52 North,
R> 
pt = SpatialPoints(cbind(7, 
52), CRS(proj4string(cs))) 

R> 
as.data.frame(st[pt,,1:5]) 

V1 V2 
sp.ID 
time 

9.41437 50.57623 
Federal Republic 
188 
19550505 

51.09070 
Germany 
187 
19901003 

endTime timeIndex 
CNTRY_NAME 
AREA 
CAPNAME 

19901002 188 
Federal Republic 
247366.4 
Bonn 

187 
Germany 
356448.2 
Berlin 

CAPLONG CAPLAT 

50.73333 

52.51667 
which turns out to be the Germany Federal Republic and Germany, that is, before and after
the merge. No data before 195555 is available.
^{6} Correlates
of
War
Project.
2008.
State
System
Membership
List,
v2008.1.
Online,
26
spacetime: SpatioTemporal Data in R
Searching past email discussion threads on the rsiggeo (R Special Interest Group on using
GEOgraphical data and Mapping) email list may be a good way to look for further material,
before one considers posting questions. Search strings, e.g., on the google search engine may
look like:
spacetime site:stat.ethz.ch
where the search keywords should be made more precise.
The excellent book Statistics for spatiotemporal data (Cressie and Wikle 2011) provides a
large number of methods for the analysis of mainly geostatistical data. A demo script, which
can be run by
R> 
library(spacetime) 
R> 
demo(CressieWikle) 
downloads the data from the book web site, and reproduces a number of graphs shown in the
book. It should be noted that the the book examples only deal with STFDF objects.
Section˜7.3 contains an example of a spatial interpolation with a spatiotemporal separable or
Much more than documents.
Discover everything Scribd has to offer, including books and audiobooks from major publishers.
Cancel anytime.