Article info

Keywords: STARMA; Spatio-temporal; Time series; Taxi demand prediction

Abstract

The spatio-temporal variation in the demand for transportation, particularly taxis, in the highly dynamic urban space of a metropolis such as New York City is impacted by various factors such as commuting, weather, road work and closures, disruptions in transit services, etc. This study endeavors to explain the user demand for taxis through space and time by proposing a generalized spatio-temporal autoregressive (STAR) model. It deals with the high dimensionality of the model by proposing the use of LASSO-type penalized methods for tackling parameter estimation. The forecasting performance of the proposed models is measured using the out-of-sample mean squared prediction error (MSPE), and the proposed models are found to outperform other alternative models such as vector autoregressive (VAR) models. The proposed modeling framework has an easily interpretable parameter structure and is suitable for practical application by taxi operators. The efficiency of the proposed model also helps with model estimation in real-time applications.
© 2018 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.
https://doi.org/10.1016/j.ijforecast.2018.10.001
0169-2070/© 2018 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.
Please cite this article in press as: Safikhani, A., et al., Spatio-temporal modeling of yellow taxi demands in New York City using generalized STAR models.
International Journal of Forecasting (2018), https://doi.org/10.1016/j.ijforecast.2018.10.001.
2 A. Safikhani et al. / International Journal of Forecasting ( ) –
only 3,150 pickups in the Bronx on an average day. The use of GPS enabled the spatio-temporal historical demand for taxis in the year of 2015 to be disaggregated to several sub-regions within the city (see Fig. 2).

These taxis travelled approximately 460 million miles in 2015 (New York City Taxi & Limousine Commission, 2016), the distribution of which can be seen in Fig. 3. Due to the myriad of factors that can impact demand, which may or may not be known in advance, there is scope for taxis to drive around seeking rides, some of which could be included in the number of trips under one mile in Fig. 3.

This study models the demand for taxis as a dynamic spatio-temporal process. The historical GPS-enabled spatio-temporal demand for taxis in the year 2015 (provided by the Taxi and Limousine Commission of New York City) is used and aggregated to several sub-regions within the city.

We examine the demand for taxis in NYC through space and time using a spatio-temporal autoregressive model. The spatio-temporal autoregressive moving average (STARMA) model is a well-established spatio-temporal process that was first introduced by Pfeifer and Deutsch (1980, 1981) and has been applied in many different disciplines since, such as economics and finance (Giacomini & Granger, 2004; Hernández-Murillo & Owyang, 2006; Shoesmith, 2013), social science (Pfeifer & Deutsch, 1980; Sartoris, 2005), transportation (Cheng, Wang, Harworth, Heydecker, & Chow, 2011; Duan, Mao, Zhang, & Wang, 2016; Kamarianakis & Prastacos, 2003), climatology (Kyriakidis & Journel, 1999), health sciences (Baklanov et al., 2007), etc. Modeling the demand through time in all sub-regions simultaneously (a vector autoregressive model, VAR) is a high-dimensional problem, since the number of parameters in the model is proportional to the square of the number of sub-regions. A STAR-type modeling approach reduces the number of parameters dramatically by governing the neighborhood structure between the regions. This structure is also useful for capturing the spatial dependence of the demand between the regions, and makes the results more interpretable.

In this research, we use the spatio-temporal structure for predicting the yellow taxi demand for different zip codes of Manhattan, a borough of NYC. For this purpose, GPS taxi demand data for two ordinary days in October
2015 are disaggregated based on the zip code. Such an approach requires the use of a zoning system for studying the spatial characteristics of the data. The system should divide the study area into smaller districts in order to assess the existence of spatial correlations among them. However, districts cannot be too small (like census tracts), since the GPS devices are not 100% accurate and can cause pick-up counts in each district to be less accurate.

We also ensure temporal variation in the demand by disaggregating the GPS data into 15-min intervals, resulting in 96 time points per day. Other studies (e.g. Qian, Ukkusuri, Yang, & Yan, 2017) have also decomposed such data by time (15-min intervals) and space (zip code basis). In addition to the use of spatio-temporal structure for taxi demand prediction, the main contribution of this paper is the introduction of new penalty functions in the STAR model. Various penalty functions are used in this paper for parameter estimation in the proposed generalized STAR model, and we show that they improve the accuracy of the taxi demand prediction. The double hierarchical group LASSO (DHGLASSO) is the new penalty scheme that is introduced in this paper for penalizing parameters in the hierarchical structure of modeling in both the space and time dimensions. We evaluate the forecasting performance of the proposed method using the out-of-sample mean squared prediction error (MSPE), and the results show that the proposed model outperforms some of the alternative algorithms such as ARMA and VAR models.

Given that there are about 12 million taxi trips a month, amounting to 2 GB of data, a demand forecasting model with a good spatial and temporal predictive accuracy is very useful. In particular, the proposed model has the ability to forecast the taxi demand a few steps into the future at various locations in NYC, and this enables the agency's framework to plan taxi positioning and provide demand-sensitive taxi dispatching for various locations and specific times of the day over the year. When using a model for placement of taxis, the yellow taxis may not operate as yellow taxis per se. In other words, they could serve demand-based rides as well as curb-side pickups. In fact, some yellow taxis are already using app-based ride providers such as Curb. The objective of the proposed modeling framework would be to direct taxis to certain zones at certain times. Actual dispatch to pick up a certain passenger is not a part of the proposed framework, though it could be done using the actual location (within the zone) of a passenger requesting a ride through an app-based ride-providing service. The main benefit of this approach is that it avoids taxis cruising for rides, especially across different zones, which would decrease the unnecessary distance travelled, with the associated fuel costs and pollution. In addition, from a policy standpoint, the spatio-temporal structure inferred from the demand data provides a basis to allow regulating agencies to explore cordon pricing initiatives.

This paper is organized as follows. Section 2 discusses the literature on time series modeling in transportation and short-term taxi demand prediction. Section 3 provides a detailed description of the spatio-temporal modeling and formulation of taxi demand using the STARMA approach. Section 4 presents findings for various types of STARMA models and prediction errors, and compares them with those from other time series models. Finally, we present our conclusions and future research directions.

2. Literature review

With the rise of intelligent systems and increases in the availability of data, taxi pick-up demand prediction has recently come to the attention of many scholars. It is obvious that the taxi demand in any given zone changes from one time interval to another in real time. Hence, time series models can form a strong statistical tool for capturing this time variation in taxi demand, figuring out the correlations within the taxi data, and producing real-time predictions. For univariate data, a well-known family of time series models called ARIMA (autoregressive integrated moving average) models can be beneficial, and have been applied to many transportation-related problems (Moghimi, Safikhani, Kamga, & Hao, 2017). However,
in the dense urban transportation network that we consider, with many different areas or zip codes that each have their own demand dynamics and may be correlated with one another, the taxi demand variation of a given zip code is not related only to its own values, but also to the demand in the neighboring zip codes. Since there are too many parameters to estimate due to multiple zip codes, VAR models, as the most common multivariate time series model, will not be able to perform well and will do a poor job of forecasting the taxi demand. This study attempts to mitigate this problem by applying a multivariate spatial–temporal time series model, which is discussed in detail in Section 3.

Some of the primary research about taxi demand has aimed to find the factors that influence taxi demand. Schaller (1999) developed a citywide empirical time series regression model of NYC taxis in an attempt to understand the relationship between the taxicab revenue per mile and economic activity in the city, taxi supply, taxi fares, and bus fares. Later, Schaller (2005) tried to figure out the relationships between taxi demand and other factors including the city size, the availability and cost of privately owned autos, the use of complements to taxicabs, the cost of taxi usage, the taxi service quality, the presence of competing modes, and the presence of a senior or disabled population. With the emergence of GPS technology, subsequent extensive research into the use of spatial information has been applied in the context of transportation-related problems. GPS-based systems are also used to track the taxis of New York City and to analyze the taxi ridership. Yang and Gonzales (2017) processed the New York City GPS taxi data and used the negative binomial method to capture the variation in the taxi pick-up demand. Their study used six explanatory variables, namely population, education, median age, median income per capita, employment by industry sector, and transit accessibility. Correa, Xie, and Ozbay (2017) performed an empirical analysis in order to explore the spatial dependence between Uber and taxi pick-up data. The results from Moran's I tests confirmed the significant spatial correlation in both taxi and Uber demand.

Several studies have considered prepositioning taxis so as to reduce wait times (Chang, Tai, & Hsu, 2010; Yuan, Zheng, Zhang, Xie, & Sun, 2011) using spatio-temporal clustering. Time series models such as ARIMA have also been tested for taxi demand prediction (Moreira-Matias et al., 2013; Qian et al., 2017; Sayarshad & Chow, 2016). Moreira-Matias et al. (2013) proposed a methodology for the prediction of short-term taxi demand at 30-min time intervals. Their methodology is an ensemble of three predictive models, namely a time-varying Poisson model, a weighted time-varying Poisson model, and an ARIMA model. They found that their proposed model outperformed all three models run individually. The recent study by Qian et al. (2017) also used artificial neural networks to combat nonlinearities in the taxi demand. Furthermore, they attempted to capture spatio-temporal variations using conditional random fields. The proposed model and two other algorithms (ARIMA and ANN) were run in four different scenarios and their performances evaluated. The results reported that the proposed model outperformed both of the other algorithms, with mean absolute percentage errors (MAPE) close to 0.1. The case study used in the current paper is the same as that used by Qian et al. (2017).

It is well-known that spatial information can increase the accuracy of prediction, especially for traffic congestion and at longer horizons. The idea of capturing spatial information in time series studies of transportation-related problems was first introduced in the study by Okutani and Stephanedes (1984) for the prediction of traffic flow. Later, the spatial concept was deployed in the study by Kamarianakis and Prastacos (2003) for forecasting the relative velocity on major roads in Athens, Greece, in a method referred to as the space–time autoregressive integrated moving average (STARIMA) model. The model is quite different from traditional ARIMA models due to the inclusion of spatial information regarding neighboring links for traffic forecasting. They compared the forecasting performances of four models, namely the historical average, ARIMA, VARMA, and STARIMA. The results demonstrated that there are no significant differences among the last three models, although all three performed better than the historical average. Spatial–temporal modeling is also used in various other areas of transportation; for example, the traffic condition of the downstream section of a road is highly correlated with that observed upstream. Stathopoulos and Karlaftis (2003) considered the spatial information from four consecutive loop detectors in the area upstream of the study section for predicting the traffic flow in the downstream of an urban corridor, while the same idea was used in the studies by Cheng et al. (2011) and Duan et al. (2016) for predicting the traffic speed of the downstream link.

One of the most important parts of STARIMA modeling is the spatial weighting matrix, which indicates the spatial dependency between multiple time series. The optimal spatial weighting matrix varies with the nature of the problem, and determining it requires some engineering judgment. In general, two approaches have been used for selecting the neighboring dependence: (a) correlation-coefficient assessment and (b) distance adjustment. The values in STARIMA's weighting matrix can vary by time and location. In one method that has been developed, called General STARIMA, the spatial parameters are designed to vary by location, instead of having fixed values over all locations (Min, Hu, & Zhang, 2010). Another approach that is associated with the weighting matrix is to consider only the link/zone that is adjacent to the target link/zone. It can be elaborated by a ring of dependency, defined by the "order". For instance, a first-order adjacency matrix represents the dependency between the study link/zone and its immediately adjacent links/zones (first-order links/zones). A second-order adjacency matrix captures the dependency on a zone that is not directly adjacent to the study zone but is immediately adjacent to one of its first-order links/zones. It can also be expanded to a third-order adjacency matrix, and so forth. First- and second-order adjacency-weighting matrices were used in the study by Kamarianakis, Prastacos, and Kotzinos (2004). On the other hand, it is more practical to use the distance between the two links/zones, where the value of the dependency decreases as the distance increases.
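The order-based adjacency matrices described above can be sketched in code. The following is a minimal illustration, not the authors' procedure: it assumes that a binary zone-adjacency matrix is available and that the neighborhood order between two zones equals the smallest number of zone boundaries crossed to get from one to the other, computed by breadth-first search. The function names `hop_distances` and `neighborhood_weights` are hypothetical, and the rows are scaled to sum to 1, which is one common normalization.

```python
from collections import deque

import numpy as np


def hop_distances(adj):
    """Shortest-path hop counts between zones; -1 marks unreachable pairs."""
    k = adj.shape[0]
    dist = np.full((k, k), -1, dtype=int)
    for src in range(k):
        dist[src, src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in np.flatnonzero(adj[u]):
                if dist[src, v] < 0:  # first visit gives the shortest hop count
                    dist[src, v] = dist[src, u] + 1
                    queue.append(v)
    return dist


def neighborhood_weights(adj, max_order):
    """Return [W0, W1, ..., W_max_order], with W0 = I and each row of Wl summing to 1.

    Assumption: zones i and j are l-th order neighbors when their hop distance is l.
    Rows with no l-th order neighbor are left as all zeros.
    """
    dist = hop_distances(adj)
    weights = [np.eye(adj.shape[0])]
    for order in range(1, max_order + 1):
        w = (dist == order).astype(float)
        row_sums = w.sum(axis=1, keepdims=True)
        w = np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)
        weights.append(w)
    return weights


# Toy example (hypothetical): four zones arranged in a line, 0 - 1 - 2 - 3.
adj = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
W = neighborhood_weights(adj, max_order=2)
print(W[1])  # first-order weights
print(W[2])  # second-order weights
```

In the toy example, zone 1 has two first-order neighbors (zones 0 and 2), so each receives weight 0.5, while its only second-order neighbor is zone 3. A distance-based alternative would replace the 0/1 indicator with a decreasing function of distance before normalizing.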
multivariate time series with more than 100 components. However, only 39 of the zip codes have enough non-zero counts to keep them in the model. It is worth noting that the zip code zoning system includes some small zones (even as small as a block/building) that should be removed from the inputs, as the demand in these small zones is not of interest. Thus, the dataset ultimately consists of $k = 39$ locations and $T = 96$ time points. Fig. 4 shows the sample ACFs of the first five components of the data, which imply the existence of a strong temporal dependence. Hence, a multivariate time series model is chosen for analyzing this dataset.

where $\varepsilon(t) = (\varepsilon_1(t), \ldots, \varepsilon_k(t))$ is a $k$-variate normal variable with mean zero and

$$E\left(\varepsilon(t)\,\varepsilon(t+s)'\right) = \begin{cases} \sigma^2 I_k, & s = 0, \\ 0, & \text{otherwise.} \end{cases}$$

Also, the $W^{(l)}$s are $k \times k$ weighting matrices that govern the $l$th neighborhood location, with $W^{(0)} = I_k$. Denote the $i$th row of $W^{(l)}$ by $W_i^{(l)}$. One possible choice for $W^{(l)}$ is to set $W^{(l)}(i, j) = 1$ if the $i$th and $j$th locations are $l$th level neighbors, and $W^{(l)}(i, j) = 0$ otherwise. These matrices are then normalized in such a way that the sum of each row is 1. Finally, for each $i = 1, 2, \ldots, k$, and
Table 1
Results for October 6th data with η = 1.
Model                  MSPE     MRPE     AIC        BIC
VAR                    1.7153   4.8259   216.4933   257.1222
STAR (univariate AR)   0.2815   2.4463   176.9854   178.0271
LASSO                  0.2977   1.8467   173.8735   174.9153
HGLASSO                0.2977   1.8467   173.8735   174.9153
DHGLASSO               0.2977   1.8467   173.8735   174.9153

Table 2
Results for October 6th data with η = 2.
Model                  MSPE     MRPE     AIC        BIC
STAR                   0.2707   1.9913   177.3313   179.4148
LASSO                  0.2728   1.9616   177.0052   179.0353
HGLASSO                0.2909   1.8942   176.0614   178.1449
DHGLASSO               0.2907   1.9543   178.6925   180.6425

Table 3
Results for October 6th data with η = 3.
Model                  MSPE     MRPE     AIC        BIC
STAR                   0.2932   2.1346   178.7531   181.8784
LASSO                  0.3254   2.1413   177.3218   179.1115
HGLASSO                0.301    2.114    175.1811   178.3064
DHGLASSO               0.2821   1.9472   176.3991   179.3107

(2014, 2017) and Song and Bickel (2011). For this purpose, the time points are divided into three parts (usually at equal distances), $0 < T_1 < T_2 < T$. The estimation procedure for fixed values of $\lambda$ is applied for the first part, i.e., $t = 1, 2, \ldots, T_1$; then the mean squared prediction error (MSPE) for predicting one step ahead is calculated over all $k$ time series components on the time interval $[T_1 + 1, T_2]$:

$$\mathrm{MSPE} = \frac{1}{k (T_2 - T_1)} \sum_{i=1}^{k} \sum_{t=T_1+1}^{T_2} \left( Y_i(t) - P_{T_1} Y_i(t) \right)^2, \qquad (11)$$

$$\mathrm{MRPE} = \frac{1}{k (T_2 - T_1)} \sum_{i=1}^{k} \sum_{t=T_1+1}^{T_2} \ldots$$
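As a rough illustration of Eq. (11), the sketch below averages the squared one-step-ahead errors over all $k$ series and over the time points $T_1 + 1, \ldots, T_2$. It rests on several assumptions that are not taken from the paper: the data are stored as a $T \times k$ array, the demand values are synthetic, and a simple persistence forecast (predicting each value by the previous one) stands in for the model-based predictor $P_{T_1} Y_i(t)$. The function name `mspe` is hypothetical, and the split points use the floor-of-thirds choice.

```python
import numpy as np


def mspe(y, y_pred, t1, t2):
    """Mean squared prediction error over t = t1+1, ..., t2 (1-based) and all k series.

    y and y_pred are (T, k) arrays; row index r holds time point t = r + 1.
    """
    window = slice(t1, t2)  # rows t1 .. t2-1, i.e. time points t1+1 .. t2
    err = y[window] - y_pred[window]
    k = y.shape[1]
    return float(np.sum(err ** 2) / (k * (t2 - t1)))


T, k = 96, 39                  # 15-min intervals in one day, number of zip codes
t1, t2 = T // 3, 2 * T // 3    # split points: floor(T/3) and floor(2T/3)

rng = np.random.default_rng(0)
y = rng.standard_normal((T, k))          # synthetic stand-in for scaled demand
y_pred = np.vstack([y[:1], y[:-1]])      # persistence forecast (assumption)

print(f"T1 = {t1}, T2 = {t2}, MSPE = {mspe(y, y_pred, t1, t2):.4f}")
```

In an actual tuning loop, `y_pred` would come from the model fitted on $t = 1, \ldots, T_1$ for each candidate $\lambda$, and the $\lambda$ giving the smallest value on this window would be retained.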
4. Results

This section applies the proposed methods to the yellow taxi demand data on different days, and calculates their prediction performances under different scenarios. Based on the sample ACFs of the data, $p$ is chosen to be 1, and the calculation of the AIC/BIC also supports this selection. However, before applying different methods to this dataset, it needs to be scaled properly. For this purpose, the sample mean is subtracted from each time series that corresponds to a zip code, and the resulting series are divided by the sample standard deviation, so that all time series have the same scales. Also, the weighting matrices $W$ are chosen for five different neighborhood levels based on the authors' judgment, or, more specifically, by counting the numbers of boundaries between the target zip code and its neighbors. For example, a zip code that is adjacent to the target zip code is considered as the first-order neighborhood; zip codes adjacent to the first-order neighborhood are the second-order neighborhood for the target zip code, and so on. This study extends the neighborhood up to five levels by means of an eyeballing procedure. October 6th and 7th are chosen for this research because they are typical weekdays, being away from both weekends and days with special events. Two different approaches have been considered for evaluating the performance of the developed model. First, we consider time points from October 6th only; and, second, the two days of October 6th and 7th are merged to give a longer range of time points.

4.1. Case study using data from October 6th only

Considering data from October 6th, only $T = 96$ time points are available. A rolling window scheme is used to divide $T$ into three parts, with $T_1$ being set to $\lfloor T/3 \rfloor$ and $T_2$ to $\lfloor 2T/3 \rfloor$. Different orders of neighborhood ($\eta$) are chosen, and the MSPE, mean squared relative prediction error (MRPE), AIC and BIC (see Lutkepohl, 2007, for the definitions and formulas) values are reported in each case. Tables 1–4 show the results for $\eta = 1, 2, 3, 4$, respectively. Put simply, $\eta = 1$ uses only the past values of the study zone itself, with no neighborhood information; $\eta = 2$ adds the information of the first-order neighborhood; and so on up to the third-order neighborhood, which corresponds to $\eta = 4$. The VAR model does not perform well relative to STAR-based models due to the huge number of parameters involved. Based on the MSPE, the STAR and LASSO models for $\eta = 2$ outperform the rest. The difference between the prediction performances of these two methods, STAR and LASSO, and those of the other methods is statistically significant, with p-values of less than 0.01 based on the Diebold–Mariano test statistics developed by Diebold and Mariano (1995). This means that the inclusion of the first-order neighborhood structure improves the forecasting performance of the STAR model. Meanwhile, the proposed spatio-temporal structure using the topology and zip code based disaggregation of Manhattan with a first-order neighborhood ($\eta = 2$) performs the best in this case study. Also, it
Table 5
Results for October 6th and 7th data combined, with η = 1.
Model                  MSPE     MRPE     AIC        BIC
VAR                    0.7103   14.544   204.3445   230.15
STAR (univariate AR)   0.253    3.9068   182.6419   183.3035
LASSO                  0.2527   3.8983   182.5923   183.254
HGLASSO                0.2527   3.8983   182.5923   183.254
DHGLASSO               0.2527   3.8983   182.5923   183.254

Table 6
Results for October 6th and 7th data combined, with η = 2.
Model                  MSPE     MRPE     AIC        BIC
STAR                   0.2273   4.0633   178.3005   179.6239
LASSO                  0.2273   4.0633   178.3005   179.6239
HGLASSO                0.2273   4.0633   178.3005   179.6239
DHGLASSO               0.2273   4.0633   178.3003   179.6237

Table 7
Results for October 6th and 7th data combined, with η = 3.
Model                  MSPE     MRPE     AIC        BIC
STAR                   0.2249   4.1741   177.9721   179.9571
LASSO                  0.2248   4.1703   177.9496   179.9347
HGLASSO                0.2249   4.1742   177.957    179.9421
DHGLASSO               0.2238   4.1062   177.6838   179.6519

Table 8
Results for October 6th and 7th data combined, with η = 4.
Model                  MSPE     MRPE     AIC        BIC
STAR                   0.2247   4.4178   178.271    180.9008
LASSO                  0.2244   4.3892   178.0977   180.6765
HGLASSO                0.2247   4.419    178.2357   180.8654
DHGLASSO               0.2224   4.162    177.6367   180.2156

Table 9
Results for October 6th and 7th data combined, with η = 5.
Model                  MSPE     MRPE     AIC        BIC
STAR                   0.2279   3.5851   178.7162   182.0077
LASSO                  0.2257   3.5265   178.0827   181.1196
HGLASSO                0.2277   3.5857   178.6503   181.9418
DHGLASSO               0.2212   3.835    177.8113   180.95

Table 10
Results for October 6th and 7th data combined, with η = 6.
Model                  MSPE     MRPE     AIC        BIC
STAR                   0.2405   3.5611   178.4606   182.4137
LASSO                  0.238    3.4261   177.7624   181.2913
HGLASSO                0.238    3.5376   178.2116   182.1647
DHGLASSO               0.2291   3.9304   178.8057   182.4703
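To make it easier to compare the combined-day results across neighborhood orders, the short sketch below collects the MSPE columns of Tables 5–10 and reports, for each method, the value of η with the smallest MSPE. The numbers are transcribed from the tables; the dictionary layout and the helper `best_eta` are illustrative choices only. The VAR row is left out because it is reported for η = 1 only, and the η = 1 entry for STAR is the univariate AR model.

```python
# MSPE values from Tables 5-10 (October 6th and 7th combined), for eta = 1, ..., 6.
mspe_by_model = {
    "STAR":     [0.2530, 0.2273, 0.2249, 0.2247, 0.2279, 0.2405],
    "LASSO":    [0.2527, 0.2273, 0.2248, 0.2244, 0.2257, 0.2380],
    "HGLASSO":  [0.2527, 0.2273, 0.2249, 0.2247, 0.2277, 0.2380],
    "DHGLASSO": [0.2527, 0.2273, 0.2238, 0.2224, 0.2212, 0.2291],
}


def best_eta(values):
    """Return (eta, MSPE) for the smallest MSPE; eta is 1-based."""
    idx = min(range(len(values)), key=values.__getitem__)
    return idx + 1, values[idx]


best = {model: best_eta(values) for model, values in mspe_by_model.items()}
for model, (eta, value) in best.items():
    print(f"{model:9s} best eta = {eta}, MSPE = {value:.4f}")
```

On these tabulated values, the smallest MSPE for STAR, LASSO and HGLASSO occurs at η = 4, while for DHGLASSO it occurs at η = 5. This is only a reading of the tables and says nothing about whether the differences are statistically significant.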
Fig. 5. MSPE results for October 6th and 7th data combined, with η = 1, 2, . . . , 6.
Fig. 6. Neighborhood-level estimated coefficients for lower Manhattan (zip codes: 10004, 10002, 10280).
Fig. 7. Neighborhood-level estimated coefficients for midtown Manhattan (zip codes: 10019, 10022, 10128).
Fig. 8. Neighborhood-level estimated coefficients for upper Manhattan (zip codes: 10021, 10028, 10027).
References

Baklanov, A., Hänninen, O., Slørdal, L. H., Kukkonen, J., Bjergene, N., Fay, B., et al. (2007). Integrated systems for forecasting urban meteorology, air pollution and population exposure. Atmospheric Chemistry and Physics, 7, 855–874.
Beck, A., & Teboulle, M. (2009). A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1), 183–202.
Chang, H., Tai, Y., & Hsu, J. Y. (2010). Context-aware taxi demand hotspots prediction. International Journal of Business Intelligence and Data Mining, 5(1), 3–18.
Cheng, T., Wang, J., Harworth, J., Heydecker, B. G., & Chow, A. H. F. (2011). Modeling dynamic space–time autocorrelations of urban transport network. GeoComputation, Session 5A: Network Complexity, 215–220.
Correa, D., Xie, K., & Ozbay, K. (2017). Exploring the taxi and uber demands in New York City: an empirical analysis and spatial modeling. In Transportation research board's 96th annual meeting, Washington, D.C.
Cressie, N. (2015). Statistics for spatial data. John Wiley & Sons.
Di Giacinto, V. (1994). Su una generalizzazione dei modelli spazio-temporali autoregressivi media mobile (STARMAG). In Atti della XXXVII riunione scientifica SIS, Sanremo, Aprile 1994, vol. H.
Di Giacinto, V. (2006). A generalized space–time ARMA model with an application to regional unemployment analysis in Italy. International Regional Science Review, 29(2), 159–198.
Diebold, F. X., & Mariano, R. S. (1995). Comparing predictive accuracy. Journal of Business & Economic Statistics, 13, 134–145.
Duan, P., Mao, G., Zhang, C., & Wang, S. (2016). STARIMA-based traffic prediction with time-varying lags. In IEEE 19th international conference on intelligent transportation system, Rio, Brazil.
Giacomini, R., & Granger, C. W. J. (2004). Aggregation of space–time processes. Journal of Econometrics, 118, 7–26.
Hernández-Murillo, R., & Owyang, M. T. (2006). The information content of regional employment data for forecasting aggregate conditions. Economics Letters, 90(3), 335–339.
Jenatton, R., Mairal, J., Obozinski, G., & Bach, F. (2011). Proximal methods for hierarchical sparse coding. Journal of Machine Learning Research (JMLR), 12, 2297–2334.
Kamarianakis, Y., & Prastacos, P. (2003). Forecasting traffic flow conditions in an urban network: comparison of multivariate and univariate approaches. Transportation Research Record: Journal of the Transportation Research Board, 1857, 74–84.
Kamarianakis, Y., Prastacos, P., & Kotzinos, D. (2004). Bivariate traffic relations: A space–time modeling approach. In AGILE proceedings (pp. 465–474).
Kyriakidis, P. C., & Journel, A. G. (1999). Geostatistical space–time models: a review. Mathematical Geology, 31, 651–683.
LeSage, J. P. (1997). Bayesian estimation of spatial autoregressive models. International Regional Science Review, 20(1–2), 113–129.
Lutkepohl, H. (2007). New introduction to multiple time series analysis. Springer.
Min, X., Hu, J., & Zhang, Z. (2010). Urban traffic network modeling and short-term traffic flow forecasting based on GSTARIMA model. In 13th international IEEE annual conference on intelligent transportation systems, September 19–22, Madeira Island, Portugal.
Moghimi, B., Safikhani, A., Kamga, C., & Hao, W. (2017). Cycle-length prediction in actuated traffic-signal control using ARIMA model. Journal of Computing in Civil Engineering, 32(2), 04017083.
Moreira-Matias, L., Gama, J., Ferreira, M., Mendes-Moreira, J., & Damas, L. (2013). Predicting taxi-passenger demand using streaming data. IEEE Transactions on Intelligent Transportation Systems, 14(3), 1393–1402.
New York City Taxi & Limousine Commission (2016). TLC factbook. http://www.nyc.gov/html/tlc/downloads/pdf/2016_tlc_factbook.pdf.
Nicholson, W. B., Bien, J., & Matteson, D. S. (2014). Hierarchical vector autoregression. arXiv preprint, arXiv:1412.5250.
Nicholson, W. B., Matteson, D. S., & Bien, J. (2017). VARX-L: structured regularization for large vector autoregressions with exogenous variables. International Journal of Forecasting, 33(3), 627–651.
Okutani, I., & Stephanedes, Y. J. (1984). Dynamic prediction of traffic volume through Kalman filtering theory. Transportation Research, Part B (Methodological), 18(1), 1–11.
Pfeifer, P. E., & Deutsch, S. J. (1980). A three-stage iterative procedure for space–time modeling. Technometrics, 22(1), 35–47.
Pfeifer, P. E., & Deutsch, S. J. (1981). Variance of the sample space–time autocorrelation function. Journal of the Royal Statistical Society. Series B. Statistical Methodology, 43, 28–33.
Qian, X., Ukkusuri, S. V., Yang, C., & Yan, F. (2017). A model for short-term taxi demand forecasting accounting for spatio-temporal correlations. In Transportation research board annual 2017, Washington D.C.
Sartoris, A. (2005). A STARMA model for homicides in the city of Sao Paulo. In Proceedings of the spatial economics workshop, Kiel Institute for World Economics, 8–9 April, 2005, Kiel, Germany.
Sayarshad, H. R., & Chow, J. Y. J. (2016). Survey and empirical evaluation of nonhomogeneous arrival process models with taxi data. Journal of Advanced Transportation, 50, 1275–1294.
Schaller, B. (1999). Elasticities for taxicab fares and service availability. Transportation, 26, 283–297.
Schaller, B. A. (2005). Regression model of the number of taxicabs in U.S. cities. Journal of Public Transportation, 8, 63–78.
Shoesmith, G. L. (2013). Space–time autoregressive models and forecasting national, regional and state crime rates. International Journal of Forecasting, 29(1), 191–201.
Song, S., & Bickel, P. (2011). Large vector auto regressions. arXiv preprint arXiv:1106.3915.
Stathopoulos, A., & Karlaftis, M. G. (2003). A multivariate state space approach for urban traffic flow modeling and prediction. Transportation Research Part C: Emerging Technologies, 11(2), 121–135.
Terzi, S. (1995). Maximum likelihood estimation of a generalized STAR(p, lp) model. Journal of the Italian Statistical Society, 4(3), 377–393.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B. Statistical Methodology, 58, 267–288.
Yang, C., & Gonzales, E. (2017). Modeling taxi demand and supply in New York City using large-scale taxi GPS data. In P. Thakuriah, N. Tilhun, & M. Zellner (Eds.), Seeing cities through big data: research, methods and applications in urban informatics (pp. 405–425). Springer.
Yuan, J., Zheng, Y., Zhang, L., Xie, X., & Sun, G. (2011). Where to find my next passenger. In Proceedings of the 13th international conference on ubiquitous computing, Beijing, China, September 17–21 (pp. 109–118). New York, NY: ACM.