
Journal of Transport Geography 80 (2019) 102524
Assessing the extent of modifiable areal unit problem in modelling freight (trip) generation: Relationship between zone design and model estimation results

Agnivesh Pani^a, Prasanta K. Sahu^a,⁎, Aitichya Chandra^b, Ashoke K. Sarkar^b

^a Department of Civil Engineering, Birla Institute of Technology and Science Pilani, Hyderabad 500078, India
^b Department of Civil Engineering, Birla Institute of Technology and Science Pilani, Pilani 333031, Rajasthan, India
ARTICLE INFO

Keywords: Zoning systems; Modifiable areal unit problem; Land use; Aggregation; Model diagnostics

ABSTRACT

There is a growing interest in incorporating spatial indicators into freight demand model systems. The indicators are measured for different areal units (e.g., census tracts or block groups) and are often used as proxy variables or aggregation layers. Model estimation results vary according to the choice of these areal units and an analyst is thus confronted with a popular decision challenge termed as ‘modifiable areal unit problem’ (MAUP). The variability in results due to MAUP arises since areal units can be modified in theoretically infinite ways (in terms of shape, orientation and size) and magnitude of aggregation loss in information will vary for each alternative zoning system. In effect, how well the zonal (aggregated) characteristics can describe the establishment-level (disaggregated) observations is inversely related to MAUP effects. Little is known, however, about the extent of MAUP effects in freight generation (FG) models and freight trip generation (FTG) models. This study diagnoses the implications of MAUP effects in FG and FTG models by designing alternate zoning systems (by means of different zonal variables and clustering techniques) and assessing the sensitivity of model estimation results within a framework of comparative analysis (by means of hierarchical linear models). Study results assess the presence of MAUP as alternate zoning systems resulted in wide variation in the estimated coefficients for zonal characteristics (e.g., industrial area, land value, number of establishments, distance to primary arterial) in terms of magnitude, statistical significance, and even in the direction of association (sign of the coefficient). The implication of this finding is that an analyst may design different or even counterproductive policy instruments based on the way data is aggregated to capture the role of land-use, spatial effects and built-environment in influencing freight travel patterns. MAUP effects are also found to be dependent on the metric in which freight is measured (i.e., FG or FTG) and direction in which flow is measured (i.e., production or attraction). Overall, this research improves the understanding of the parameter sensitivity and performance sensitivity of freight demand model systems to alternative spatial representations of an establishment's relative location. The research findings strongly encourage analysts to acknowledge that the results of freight travel analyses with spatial indicators are sensitive to the definition of areal units.
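
The zoning effect described in the abstract can be illustrated with a toy calculation. The sketch below is not the study's hierarchical linear model. It assumes six made-up establishments, a hypothetical attribute `x`, hypothetical weekly truck trips `y`, and two hypothetical zone assignments, and it simply fits a least-squares line to the zone means. All names and numbers are assumptions chosen for illustration.

```python
from statistics import mean

def ols_slope(xs, ys):
    """Slope b of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def zonal_slope(x, y, zone_ids):
    """Average x and y within each zone, then fit a line to the zone means."""
    zones = sorted(set(zone_ids))
    x_means = [mean([xi for xi, z in zip(x, zone_ids) if z == zone]) for zone in zones]
    y_means = [mean([yi for yi, z in zip(y, zone_ids) if z == zone]) for zone in zones]
    return ols_slope(x_means, y_means)

# Made-up establishment-level data (illustrative only, not from the study).
x = [1, 2, 3, 4, 5, 6]   # hypothetical zonal attribute measured at each site
y = [5, 2, 7, 4, 9, 6]   # hypothetical weekly truck trips

zoning_a = [0, 0, 1, 1, 2, 2]  # hypothetical zoning system A (three zones)
zoning_b = [0, 1, 0, 1, 0, 1]  # hypothetical zoning system B (two zones)

slope_a = zonal_slope(x, y, zoning_a)
slope_b = zonal_slope(x, y, zoning_b)
print(slope_a, slope_b)  # 1.0 -3.0
```

With these numbers the fitted slope is positive under zoning A and negative under zoning B although the six establishments are unchanged, which mirrors the kind of sign reversal the abstract reports.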
⁎ Corresponding author.
E-mail address: prasantsahu222@gmail.com (P.K. Sahu).
https://doi.org/10.1016/j.jtrangeo.2019.102524
Received 19 March 2019; Received in revised form 3 September 2019; Accepted 6 September 2019
0966-6923/ © 2019 Elsevier Ltd. All rights reserved.

1. Introduction

Freight trip generation (FTG) models that predict the number of truck trips (Alho and Abreu e silva, 2014; Gunay et al., 2016; Holguín-Veras et al., 2012) and freight generation (FG) models that predict the quantity of tonnage transported in truck trips (Holguín-Veras et al., 2016; Pani et al., 2018; Sanchez-Diaz, 2017) have become fairly routine components of freight demand model systems. These models can be categorized, according to their spatial scale, into three (Gonzalez-Feliu and Sánchez-Díaz, 2019): macroscopic (observations at city-level), mesoscopic (observations at corridor-level or neighborhood-level or zonal-level) or microscopic (observations at establishment-level). The microscopic models have widely recognized advantages over mesoscopic and macroscopic models as follows: they are more policy sensitive (Holguín-Veras et al., 2012); they are more likely to be spatially transferable (Landau, 1978); they have a direct connection with explanatory variables such as employment (Holguín-Veras et al., 2016) and they reflect the economic behavior of decision makers in a freight system (Cantillo et al., 2014). Although the microscopic models provide FG/FTG rates for each single establishment, they are typically estimated
using data aggregated by various classification systems, such as: (i) industrial classification systems (Holguín-Veras et al., 2012; Holguín-Veras et al., 2016; Pani and Sahu, 2019a), (ii) land-use classification systems (Holguín-Veras et al., 2012) or (iii) zoning systems (Alho and Abreu e silva, 2014). The industrial aggregation is motivated by the fact that freight demand reflects the inputs and outputs of an economic process and, therefore, the ability of business size measures (e.g., employment, area) to predict FG/FTG would be challenged if there is internal disparity in the economic activity performed by the establishments. Conversely, land-use or zonal aggregation is motivated by the importance of grouping establishments based on the spatial and locational determinants of freight demand. The adequacy of each aggregation criterion, in effect, depends on how homogeneous the resulting establishment classes are in the determinants of FG/FTG, whether it is based on economic activity or on spatial and locational determinants.

An emerging literature on freight demand modelling makes a strong case for industrial aggregation (Gonzalez-Feliu and Sánchez-Díaz, 2019; Holguín-Veras et al., 2016; Pani and Sahu, 2019a) of FG or FTG data. This is based on the reasoning that it enables the direct usage of official statistics regularly released by census bureaus, such as employment, for FG/FTG model application. The industrial aggregation, however, does not take into account the differences in freight activity due to the variations in land-use patterns (Sánchez-Díaz et al., 2016). For instance, an establishment located in a highly dense commercial area is likely to generate a different level of FG/FTG compared to an establishment located in the suburbs, even if it belongs to the same industry category. FG/FTG patterns could also differ between two establishments when one is located in a large city and another is located in a mid-sized or small city. These differences in FG/FTG patterns beyond the establishment level are also likely to result from the logistical adjustments made by vendors to deliver supplies to land-uses of various characteristics (Holguín-Veras et al., 2016). Land-use classification systems, such as Land-Based Classification Standards (LBCS), could explain some of these differences owing to the general characteristics of land-use covered in their ‘a priori’ classes (Holguín-Veras et al., 2012). The weakness, however, is that the resulting land-use classes often group together establishments that belong to different economic sectors with fundamentally different FG/FTG patterns (Holguín-Veras et al., 2016). The inability of both industrial and land-use classification systems to satisfactorily account for the cross-sectional differences in FG/FTG underlines the importance of investigating macro-structural covariates (e.g., spatial proxies of locations, establishment density, proximity to arterials) based on zoning systems. The adequacy of these zoning systems, however, depends on how well the resulting zones match the FG/FTG patterns of the establishments that have been included. In cases for which there is a good match, zoning systems possibly have the potential to overcome the shortcomings of both industrial aggregations and land-use class aggregations.

The incorporation of macro-structural covariates based on zoning systems has found considerable application in traffic safety analysis (Xu et al., 2014), passenger travel behavior modelling (Mitra and Buliung, 2012), location choice modelling (Guo and Bhat, 2004) and health inequality analysis (Stafford et al., 2008). On a theoretical front, this approach could well be extended to freight demand modelling to capture the potential factors that influence FG/FTG patterns at a spatial scale (e.g., mix of industry sectors in the neighborhood), beyond the point-level predictors (e.g., employment, area). This is consistent with the findings from literature that inclusion of location proxies and spatial effects enhances FTG models (Sánchez-Díaz et al., 2016) and FG models (Pani et al., 2018). It also supports the popular assertion that freight movement is necessitated by the need to fulfill freight orders distributed over space and, therefore, demand is influenced and constrained by the surrounding spatial structure and urban environment (Ortúzar and Willumsen, 2011). In accordance, the factors that could be incorporated in FG/FTG model specifications are categorized as follows: (i) locational characteristics (proximity to arterials, freight corridors and seaports); (ii) economic characteristics of land (land value, extent of vacant industrial area); (iii) sociodemographic characteristics (population density, number of workers) and (iv) industrial characteristics (mix of industry sectors present in the area, proximity to other establishments). However, incorporation of predictors based on these zoning systems makes the model coefficients sensitive to the choice of zoning systems, an issue that is widely recognized as the modifiable areal unit problem or MAUP (Fotheringham and Wong, 1991; Openshaw, 1977; Wong, 2009). The variability or inconsistency in model parameters is mainly due to the fact that we can modify these areal boundaries in theoretically infinite ways (in terms of shape, orientation and size) and the magnitude of information loss will vary for each competing alternative zoning system. In effect, how well the aggregated zonal freight data can describe the establishment-level observations is inversely related to MAUP effects. The existing literature on FG/FTG modelling has largely used local ‘a priori’ zoning systems based on census units (Sánchez-Díaz et al., 2016) or buffer units (Kawamura and Miodonski, 2012), or zoning systems based on cluster analysis (Alho and Abreu e silva, 2014). By and large, these zoning systems are chosen with respect to available data or practical reasons, and very little attention is paid to demonstrating the link between the choice of zoning system and the quality of the model. It is thus imperative to identify which sets of freight-related variables can be used to design zoning systems that better represent freight activity and how this affects forecasting accuracy. To the best of our knowledge, existing works in the microscopic freight demand modelling literature focus on producing the models or enhancing those models using macro-structural covariates, but not on the impacts of the way the built-environment and land-use are discretized (by means of different variables and methods used to design zoning systems) and its accuracy within a framework of comparative analysis (by comparing different zoning systems and assessing the implications on model performance).

This paper is organized as follows. Section 2 reviews the literature on modelling freight generation, introduces MAUP and presents some of the applications in the transportation literature. Section 3 covers different aspects of research design, such as the methodological approach and data description. The results and discussions are given in Section 4 and research implications are presented in Section 5. The final section concludes the paper. It is expected that this study will highlight the MAUP issues that are generally not considered while developing FG or FTG models and offer insights into the extent of parameter sensitivity to data aggregation in the context of developing freight demand model systems.

2. Background, research review and problem statement

This section synthesizes the literature, provides a background and delineates the research motivation from three well-defined standpoints. First, it examines the different aspects of FG and FTG with due focus given to the spatial scale, data aggregation, choice of aggregation layers and explanatory variables. Second, a detailed background on the modifiable areal unit problem is expounded with a summary of the literature. The problem statement and research design are presented as the third standpoint.

2.1. Freight generation and freight trip generation

Freight travel analyses typically make a distinction between FG and FTG because they are outputs of two fundamentally different processes: the former is a direct reflection of the production process at the establishments and the latter is influenced by logistical decisions that are aimed at minimizing the total transportation cost (Sanchez-Diaz, 2017). This distinction is also reflected in the economic order quantity (EOQ) model (Tavasszy and De Jong, 2014) which shows that the increase in quantity of demand (FG) can be satisfied through an increase in
shipment size without necessarily increasing the shipment frequency (FTG). FG and FTG quantifications serve different purposes on a practical front as well; the former is crucial for assessing the freight needs of establishments and the latter is key for envisaging the traffic impacts of industrial developments. Additionally, it is important to further distinguish FG/FTG based on the direction of freight flow into two: (i) freight trip attraction (FTA) or freight attraction (FA), which considers deliveries received by the establishment for processing, storing or selling to the customers; (ii) freight trip production (FTP) or freight production (FP), which refers to pick-up requests forwarded by an establishment for use at other establishments. This distinction is important because FP/FTP are driven by different factors compared to FA/FTA (Sánchez-Díaz et al., 2016). It is thus important to model FG (i.e., FP and FA) and FTG (i.e., FTP and FTA) together to provide comprehensive forecasts on the likely usage of proposed new facilities and the extent of changes in freight activity due to operational or policy instruments (Tavasszy and De Jong, 2014). The other important aspects of developing FG/FTG models are the spatial scale and level of data aggregation used for estimating the models. These are crucial because the ability of modelling techniques (e.g., ordinary least squares (OLS) regression) and explanatory variables (e.g., employment) to produce accurate estimates of FG/FTG depends on the spatial scale and aggregation level of data.

2.1.1. Spatial scale and data aggregation

FG/FTG models estimated at a macro or meso state of the system (e.g., total number of trips produced or attracted to each zone) have high levels of data aggregation and are often referred to as aggregated FG/FTG models. A great majority of freight demand model systems applied in practice have been aggregated due to the ease in collecting information at this level of a transportation system (Ortúzar and Willumsen, 2011). The aggregated approach is also conducive to obtaining quick estimates of future demand because it is relatively easier to forecast the aggregated explanatory variables (e.g., population per zone) than the disaggregated explanatory variables (e.g., establishments in an industry sector). However, aggregated models are widely disparaged for their inaccuracy (Tavasszy and De Jong, 2014), aggregation bias (Pani et al., 2018), lack of behavioral foundation (Holguín-Veras et al., 2012), ineffectiveness in revealing causal relationships and differences concerning the fundamental trip-making units (Ha and Combes, 2016) and the inability to capture policy-relevant variables (Cantillo et al., 2014). Disaggregated models, on the other hand, focus on estimation of FG/FTG for a specific establishment at a micro state of the system and can bring together several advantages. They are sensitive to a mix of behaviorally-meaningful variables explaining freight trip decision-making, such as business size measures (e.g., employment, area) or establishment characteristics (e.g., existence of a supply chain, truck ownership). This approach has the potential to enhance the behavioral fidelity of freight demand models by recognizing that travel is a by-product of establishments' need to fulfill freight orders. The freight data used to develop disaggregated FG/FTG models are also, however, aggregated in some form or another (Gonzalez-Feliu and Sánchez-Díaz, 2019). These provide, for example, a continuous equation of FG/FTG predictions for establishments in an industry sector (Holguín-Veras et al., 2016) or land use type (Holguín-Veras et al., 2012). In order to achieve a slightly lower level of aggregation, one can also estimate these models for individual establishments but include locational binary variables (Pani et al., 2018) or industrial binary variables (Sánchez-Díaz, 2018). The understated presence of these aggregations in disaggregated FG/FTG models may be analogized as modelling freight flows in several layers, each layer representing the FG/FTG generated by an industry sector, zone, or land use type.

The presence of an aggregation layer implies that predictions based on FG/FTG model coefficients are, at best, averages for the conglomeration of establishments in that aggregation layer rather than exact values for an individual establishment. The patterns and trends underlying disaggregated freight data are thus lost or averaged out when aggregation layers are adopted for model estimation (Novak et al., 2011). Notwithstanding this information loss, the necessity of an aggregation procedure can scarcely be questioned since the alternative is to develop a tremendous assemblage of individual relations that govern the travel demand of millions of establishments, which cannot be handled by any model system. These aggregation layers are thus essential to reduce the computational intensity. The model quality is also improved during this process because establishments within each aggregated layer share common characteristics, which reduces the internal variability of data (Pani and Sahu, 2019a). Limiting the scope of modelling to these aggregated layers also allows improved understanding of freight travel by subsets of decision makers, their interrelations, planning scopes and policy implications (Pani and Sahu, 2019a). The accuracy of disaggregated freight demand model systems, in effect, hinges on the homogeneity of aggregation layers.

2.1.2. Choice of aggregation layers

The aggregation layers typically adopted for estimating FG/FTG models can be categorized into three: (i) Industrial aggregation (e.g., classification systems such as NAICS, ISIC); (ii) Spatial aggregation (e.g., Land-use classification systems or local zoning systems); (iii) Temporal aggregation (e.g., time of the year or type of the day). The first two categories are related to cross-sectional data and the third is related to time-series data. A summary of these aggregation layers and their respective spatial scale for estimating FG/FTG models in literature is presented in Table 1. The choice of aggregation category tends to be contentious and, perhaps more often, depends on the availability of data, prevalent classification systems in each country and anticipated accuracy of the model systems (Pani and Sahu, 2019a). Six of these studies used industrial aggregation and estimated FG/FTG models at microscopic level. Given the economic nature of freight demand models (input materials processed and transformed into outputs), it is typically expected that industrial aggregation provides internal consistency for economic activity performed by establishments in a group (Holguín-Veras et al., 2016). The industrial aggregation also allows generalizability of model estimation results since industrial categories remain the same despite the variations in urban space, economic development or industrial productivity (Gonzalez-Feliu and Sánchez-Díaz, 2019). However, this aggregation approach is rather reductive due to the lack of recognition given to the interactions between land use and spatial patterns that together influence freight travel patterns (Sánchez-Díaz et al., 2016). As disaggregated freight data are collected with reference to a location dimension (geocodes of establishments), two problems stem from choosing industrial aggregation: (i) spatial correlation among observations, and (ii) spatial heterogeneity in relationships that are modeled (see LeSage and Pace, 2009 for detailed explanations on these two aspects). The traditional FG/FTG models using techniques such as OLS regression or multiple classification analysis have largely ignored these two connotations of spatial dependency, barring the notable exception of Sánchez-Díaz et al. (2016). This deficiency can be overcome by choosing spatial aggregation layers as a grouping criterion or by using spatial indicators as macro-structural covariates (Martinez et al., 2007).

A wide array of spatial aggregation levels (9 out of 15 studies) can be seen in Table 1, out of which four are estimated at mesoscopic scale (i.e., census tracts) and five are estimated at a much finer level of disaggregation, the microscopic scale (i.e., establishments). For instance, Novak et al. (2011) developed a set of spatial regression models to estimate aggregated FG for census tracts, while Alho and Abreu e silva (2014) estimated disaggregated FTG by spatial aggregation using land use clusters or zones. Indeed, the spatial aggregations adopted in disaggregated FG/FTG models are considerably smaller in spatial scale than those in aggregated FG/FTG models. Nonetheless, spatially aggregated FG/FTG models can be hypothesized to present the overall system-wide
Table 1
Overview of spatial scale and data aggregation used for estimating freight generation and freight trip generation models.

No | Authors (Year) | Location | Dependent Variable | Methodology | Spatial Scale of Observations | Data Aggregation Layer for Model Development
1 | (Sánchez-Díaz, 2018) | Stockholm (Sweden) | FTA; FTP; FTG | OLS Regression – Linear and Nonlinear Regression Models | Microscopic (Service Sector) | Industrial Aggregation (Vector of Industrial binary variables)
2 | (Pani et al., 2018) | Kerala (India) | FP; FA | OLS Regression – Single Variable, Multiple Variable, Pooled Regression Models | Microscopic (Shippers) | Spatial Aggregation (Vector of locational binary variables)
3 | (Sanchez-Diaz, 2017) | Gothenburg (Sweden) | FTA; FWA⁎ | FTA – Discrete-Continuous Model; FWA – Discrete Choice Model | Microscopic (Commercial Sector) | Industrial Aggregation (SNI – Industrial Classifications)
4 | (Ha and Combes, 2016) | France | FP; FA | OLS Regression – ANOVA Models and ANCOVA Models | Microscopic (Shippers) | Industrial Aggregation (Vector of Industrial binary variables)
5 | (Gunay et al., 2016) | Kocaeli (Turkey) | FTG | Conditional Models (Binary Logit and Linear Regression Models) | Microscopic (Shippers, Carriers Receivers) | Industrial Aggregation (NACE based ‘posteriori’ segments)
6 | (Sánchez-Díaz et al., 2016) | New York (USA) | FTA | OLS Regression Models and Spatial Econometric Models | Microscopic (Receivers) | Industrial Aggregation (NAICS based ‘priori’ segments)
7 | (Alho and Abreu e silva, 2014) | Lisbon (Portugal) | FTA | OLS Regression Models; Generalized Linear Models | Microscopic (Receivers) | Spatial Aggregation (Zoning System) and Industrial Aggregation (EIC)
8 | (Cantillo et al., 2014) | Colombia (Nation-wide) | FTG; FG | OLS Regression Models | Mesoscopic (Municipal Tracts) | Spatial Aggregation (Zoning System) and Industrial Aggregation
9 | (Holguín-Veras et al., 2012) | Manhattan, Brooklyn, New York (USA) | FTA | OLS Regression Models; Average Trip Rates; Multiple Classification Analysis | Microscopic (Receivers, Carriers) | Spatial Aggregation (Zoning System and Land Use Classifications) and Industrial Aggregation (NAICS)
10 | (Kawamura and Miodonski, 2012) | Texas (USA) | FA | OLS Regression Models | Mesoscopic (Census Tracts) | Spatial Aggregation (Data Collected at Aggregated Census Tract Level)
11 | (Novak et al., 2011) | USA (Nation-wide) | FP | OLS Regression Models; Spatial Regression Models | Mesoscopic (Census Tracts) | Spatial Aggregation (Data Collected at Aggregated Census Tract Level)
12 | (Wagner, 2010) | Hamburg (Germany) | FTA | OLS Regression Models; Average Trip Rates | Microscopic (Logistic Operators) | Spatial Aggregation (Zoning System)
13 | (Wisetjindawat et al., 2006) | Tokyo (Japan) | FP; FA | OLS Regression Models | Microscopic (Retailers) | Spatial Aggregation (Zoning System) and Industrial Aggregation
14 | (Holguín-veras et al., 2002) | USA (Nation-wide) | FTG | OLS Regression Models | Microscopic (Container terminals) | Temporal Aggregation (Typical Days and Busy Days)
15 | (Garrido and Mahmassani, 2000) | USA (Nation-wide) | FTG | Space-time Multinomial Probit Model | Mesoscopic (Census Tracts) | Spatial Aggregation (Data Collected at Aggregated Census Tract Level)

Abbreviations: NACE - Nomenclature Statistique des Activités Economiques dans la Communauté Européenne (Europe); EIC - Establishment Industry Classifications (Lisbon).
⁎ FWA – Freight Weight Attraction Models in which FG is modeled as an ordinal variable.
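
The counts quoted in Section 2.1.2 (9 of the 15 studies with a spatial aggregation layer, four at mesoscopic and five at microscopic scale) can be rechecked directly from Table 1. The sketch below transcribes only the scale and aggregation columns; the short labels ("micro", "meso", "industrial", "spatial", "temporal") and the variable names are shorthand assumed here for illustration, not the authors' notation.

```python
# Rows transcribed from Table 1: (study number, spatial scale, aggregation layers).
# Labels are shorthand assumed for this sketch.
TABLE_1 = [
    (1, "micro", {"industrial"}),
    (2, "micro", {"spatial"}),
    (3, "micro", {"industrial"}),
    (4, "micro", {"industrial"}),
    (5, "micro", {"industrial"}),
    (6, "micro", {"industrial"}),
    (7, "micro", {"spatial", "industrial"}),
    (8, "meso", {"spatial", "industrial"}),
    (9, "micro", {"spatial", "industrial"}),
    (10, "meso", {"spatial"}),
    (11, "meso", {"spatial"}),
    (12, "micro", {"spatial"}),
    (13, "micro", {"spatial", "industrial"}),
    (14, "micro", {"temporal"}),
    (15, "meso", {"spatial"}),
]

# Keep every study whose aggregation layer includes a spatial component.
spatial = [row for row in TABLE_1 if "spatial" in row[2]]

n_spatial = len(spatial)
n_meso = sum(1 for _, scale, _ in spatial if scale == "meso")
n_micro = sum(1 for _, scale, _ in spatial if scale == "micro")
print(n_spatial, n_meso, n_micro)  # 9 4 5
```

Studies that combine spatial and industrial aggregation are counted as spatial here, which is the reading under which the table reproduces the 9, 4 and 5 quoted in the text.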
effects (e.g., land use changes, industrial agglomeration, proximity to arterials) that are not obvious while using disaggregated data. The two components of spatial aggregation layers are scale of analysis and zoning system configuration; the former refers to the extent of map size or level of planning and the latter refers to the definition of areal units for structuring and modelling the study area (Ortega et al., 2014). These spatial units can be wide-ranging, such as regions (Cantillo et al., 2014), counties (Garrido and Mahmassani, 2000; Novak et al., 2011), cities (Pani et al., 2018), census tracts (Kawamura and Miodonski, 2012), land use categories (Holguín-Veras et al., 2012) and traffic analysis zones (Alho and Abreu e silva, 2014). However, most of the above spatial units have specific usage (e.g., administrative, census, elections) and may not be the optimal zoning systems for freight demand modelling.

This is reflected in the micro-simulation analysis conducted by Macharis and Melo (2011), which revealed that despite the existence of non-ignorable differences between zones, freight characteristics remained the same for all zones. Another popular assertion is that spatial aggregations are limited because they tend to group disparate establishments in terms of their economic activity (Holguín-Veras et al., 2016). The potential of standardizing these spatial aggregation levels by designing new zoning systems based on FG/FTG determinants is not yet explored, barring a few exceptions in literature. For instance, Alho and Abreu e silva (2014) defined six land-use clusters using variables such as numbers of buildings, dwellings and residents for Lisbon's establishment-based freight survey. This possibility was explored further by defining homogeneous urban zones in terms of logistic needs (Alho and de Abreu Silva, 2015). A comprehensive understanding of contributing factors associated with FG/FTG patterns at both aggregated and disaggregated level is thus a prerequisite for developing optimal zoning systems. The review of findings on the impact of explanatory variables is given below for augmenting this discussion.

2.1.3. Explanatory variables

The research findings related to various explanatory variables that affect FG/FTG are synthesized in Table 2: a plus sign represents a positive correlation, a negative sign indicates a negative correlation and zero specifies statistical insignificance of the predicted relationship. The positive relationship between FG/FTG patterns and microscopic establishment factors indicating business size (employment, area) is consistent with neoclassical economics, which describes production functions as using input variables such as land, employment, capital, etc. to produce a designated quantity of products. Thus, larger business size measures are expected to result in larger outputs in a competitive market-based economy which eliminates inefficient businesses. 11 of the 15 studies (73%) have thus used employment and 9 studies (60%) have used some variant of floor area to predict FG/FTG patterns. A few other studies have also found that a positive micro relationship extends, although to a lesser extent, to factors such as years in business (Pani et al., 2018), number of items (Alho and Abreu e silva, 2014), number of vendors (Gunay et al., 2016), share of transport costs (Ha and Combes, 2016), existence of a supply chain (Sánchez-Díaz, 2018) and warehouse availability (Alho and Abreu e silva, 2014). It is notable that floor area is rarely used as a sole explanatory variable and is more often used along with employment. A comparative assessment between area-based and employment-based FG models revealed that area is a skewed indicator for representing business size in cities with dense commercial activities where acquiring land is difficult (Pani et al., 2018). This finding underlines the importance of macro-level (e.g., city type) characteristics on performance of micro-level parameters. In fact, 13 of the 15 FG/FTG models (87%) establish statistically significant relationships between FG or FTG by an establishment and macroscopic factors surrounding the establishment, such as land use (Holguín-Veras et al., 2012), socio-demographic characteristics (Kawamura and

The variables that capture, even in a proxy manner, the establishment location are found to enhance the explanatory power of FG/FTG models (Holguín-veras et al., 2002). The proximity of an establishment to large traffic generators or major arterials (Sánchez-Díaz et al., 2016) and ports (Pani et al., 2018) is reported to have a bearing on the freight activity, although it is not statistically investigated in detail yet. A multizone attribute signifying the spatial impact of an establishment's proximity to a port – “port influence” – is found to have a positive effect on FG (Novak et al., 2011). The commercial attractiveness of a location (measured in land value) is also surmised to influence freight activity based on the logical notion that a premium space would generate more FG/FTG than an isolated space in a suburban area (Pani et al., 2018; Sánchez-Díaz et al., 2016). The attractiveness of a zone is also measured by number of retailers, sales employees and other land use variables (Tavasszy and De Jong, 2014). The unemployment rate in a zone, on the other hand, is found to have a negative association with FTG (Garrido and Mahmassani, 2000). The per capita income and block size (road length divided by number of intersections) of a zone are, as one would expect, positively correlated with FTG (Kawamura and Miodonski, 2012). The results from OLS and spatial econometric models for FTG (Sánchez-Díaz et al., 2016), ANCOVA-based pooled regression models for FG (Pani et al., 2018) and spatial regression models for FG (Novak et al., 2011) further corroborate that macroscopic characteristics, even as proxy variables, enhance the predictive ability of FG/FTG models. The spatial autocorrelation observed for retail establishments also suggests that retail establishments located in zones with high retail employment tend to have more FTA than the ones with low retail employment (Sánchez-Díaz et al., 2016). The macroscopic characteristics, in essence, could act as valuable surrogates for individual FG/FTG patterns by an establishment. The emergence of Geographical Information System (GIS) and freely available computer programs providing satellite imagery, such as Google Earth, opens the possibility for incorporating locational spatial effects in FG/FTG models.

However, any research effort to incorporate spatial dimensions in econometric models involves two main decisions on the scale of analysis and zoning system configuration (Ortega et al., 2014). The challenge here is to create new zoning system configurations that better represent the area-level variations in factors that might influence FG/FTG patterns. Alternate zoning system configurations can be created in a theoretically infinite number of ways using various sets of variables and an analyst is thus confronted with a popular decision conundrum termed as ‘modifiable areal unit problem’ (MAUP). Given the relative novelty of this discussion in the context of freight demand analyses, the next subsection is dedicated to briefly summarizing the background and literature on MAUP.

2.2. Modifiable areal unit problem

Zoning aggregates the smallest spatial units, also known as basic spatial units (BSUs), which are arbitrary or ‘modifiable’ in nature, on the basis of similar characteristics to create homogeneous zone systems. MAUP arises in a situation like this since the model estimation results become a function of these units and are influenced by the number and choice of zone boundaries (Fotheringham and Wong, 1991). In order to adequately represent the true relationship between measures of built-environment and travel demand, MAUP needs to be assessed and reported in the context of various possible spatial representations available for measuring aggregated variables (Biehl et al., 2018).

2.2.1. Impacts on transport-related spatial analyses

The two components of MAUP – scaling effect and the zoning effect – are investigated separately in the context of transport-related spatial
Miodonski, 2012), industrial characteristics (Cantillo et al., 2014; analyses. First. The scaling effect of MAUP refer to the variability in
Novak et al., 2011) and network characteristics (Sánchez-Díaz et al., results when the same modelling approach is applied to different spatial
2016). dimensions. For example, travel demand patterns are differently
Table 2
Effect of disaggregated and aggregated variables on FG and FTG.
Variables 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Microscopic Variables
Establishment Characteristics (Continuous)
Employment + + + + + + + + + + +
Variants of Floor Area + + + + + + + + +
Years in Business +
Number of Vendors/Suppliers + +
Number of Carriers ○ +
Share of Transport Costs in Sale Price +
Number of Items +
Number of Berths/Boxes/Containers +
Establishment Characteristics (Categorical)
Existence of a Supply Chain ●
Existence of a Loading Dock ●
Existence of Stock Monitoring ●
Warehouse Availability ●
Delivery Urgency ●
Orders once per month or season ●
Macroscopic Variables
Industrial Characteristics (Continuous)
GDP by Industry sector +
Employment by Industry Sector +
Industrial Characteristics (Categorical)
Industry Sector ● ● ● ● ● ●
Commodity Type ○ ●
Spatial and Land Use Characteristics (Continuous)
Land-market value +
Number of Livestock +
Road or Intersection density 0
Highway or Railway Length 0 +
Population or Population density 0 +
Household Density −
Per Capita Income +
Block Size (Road Length / Number of Intersections) +
Median Number of Rooms per Household −
Employment in Census Zones +
Number of Vehicle Registration +
Unemployment Rate −
Wages +
Spatial and Land Use Characteristics (Categorical)
Land use Types ●
Geographic Location ● ○ ● ● ●
Location in a Zoning System ● ●
Network Characteristics (Continuous)
Distance to Truck Route 0
Distance to Primary Network 0
Distance to a Large Traffic Generator 0
Width of Front Street +
Port influence +
NOTE: (1) The column headings correspond to the numeric order of studies presented in Table 1; (2) a plus sign (+) indicates a positive relationship; a negative sign (−) indicates a negative relationship; ● indicates statistical significance; 0 and ○ represent statistical insignificance.
described in lower aggregation levels of spatial units (e.g., blocks, block groups or TAZs) and higher aggregation levels of spatial units (e.g., counties or states). Numerous studies in traffic safety analysis (Xu et al., 2014) or travel demand analyses (Biehl et al., 2018; Ortega et al., 2014) have shown that coarser scales (census tract) have a higher correlation between two variables of interest (e.g., trip production and employment) than finer scales (block group) due to reduced variation in aggregated data. The zoning effect, on the other hand, represents how the statistical relationships and, in turn, the resultant conclusions are altered by the choice of different spatial partitioning schemes, even if the number of areal units is kept the same. For example, consider a city divided into two zoning system configurations: one considers the city as a collection of zip codes and another divides the city area into homogeneous areal units in terms of population density. The zoning effect of MAUP implies that the city structure is represented differently under these alternative partitioning schemes and, in effect, different statistical results and conclusions may be yielded by the two zoning systems for the same modelling approach. There is substantial empirical evidence to suggest that the results of travel demand analyses and traffic safety analyses are influenced by the zoning effect of MAUP (Martinez et al., 2007; Viegas et al., 2009). For example, Zhang and Kukadia (2005) employed two distinct zoning schemes (grid-based representation and census-based representation) and revealed the sensitivity of urban form metrics (population density, network pattern and land use balance) while predicting travel mode choice decisions in Boston. Spatial interaction models related to location choice (Guo and Bhat, 2004), travel time (Stępniak and Jacobs-Crisioni, 2017) and territorial cohesion (Ortega et al., 2014) are also found to suffer from MAUP. Apart from the impacts on model calibration results, MAUP is found to have major implications for the temporal transferability of model parameters. For instance, the forecasting performance of the gravity model is found to be compromised when the zoning systems are either too coarse or too fine grained (Cabrera Delgado and Bonnel, 2016). The choice of zoning systems is thus fundamentally important in transport-related spatial analysis and the challenge here is to create new areal units that better reflect
variation in the determinants of the modelling unit under consideration.

2.2.2. Design of zoning systems
Designing optimal zoning systems using an improved aggregation/clustering technique based on spatial, land use and transport characteristics is relevant in this context to reduce the error caused by MAUP (Fotheringham and Wong, 1991). The earliest work towards developing optimal zoning systems was a hierarchical heuristic aggregation procedure that used ‘Ward's method’ based on maximization of an objective function (Openshaw, 1977). The use of cluster analysis to attain homogeneous geographic areas by aggregating BSUs for traffic demand estimation is particularly notable in the literature. For instance, cluster analysis supported by a Geographic Information System (GIS) tool was used to develop traffic analysis zones for urban planning in Urbana and Champaign, Illinois, USA (You et al., 1998). Other recent zone delineation methodologies include the Thiessen Polygon Method (Wang et al., 2014), Regionalization (Lee et al., 2014), Automated Zone Design (Stafford et al., 2008) and Density-based spatial aggregation (Martinez et al., 2007). The use of an automated zoning procedure (AZP) in the form of hierarchical/non-hierarchical cluster analysis on a GIS platform to incorporate the necessary constraints of homogeneity and contiguity is recommended as a reliable strategy (Stafford et al., 2008). The initial zoning system obtained in AZP is iteratively refined based on a heuristic optimization approach which reassigns objects to neighboring regions. The measures of within-cluster sum of squares (WSS) and between-cluster sum of squares (BSS) in AZP help to maintain the optimum compactness and exclusiveness of zones. Spatial autocorrelation measures are generally used in the AZP literature for assessing the values of a variable of interest (e.g., socioeconomic characteristics) of aggregated BSUs in neighboring or proximal locations (Griffith, 2009). To identify this relationship between BSUs and to indicate the degree of concentration or dispersion of clustered units, Moran's I spatial autocorrelation coefficient is widely used (Haynes et al., 2007). Another advantage of AZP is that it allows model performance to be selected as one of the goals of the optimization process (Xu et al., 2014). Furthermore, AZP also helps to ensure that there is a homogeneous spatial process underlying the data. For instance, Yannis et al. (2007) revealed that AZP based on spatial homogeneity in transport characteristics (vehicle ownership, fuel consumption) and road safety parameters (speed violations) uncovered regional effects on road accidents in Greece; these spatial patterns were otherwise inconsistent in traditional ad-hoc geographical clusters. This idea was subsequently explored in several studies using alternate methodologies or variables (Lee et al., 2014; Stafford et al., 2008; Viegas et al., 2009) to improve model performance. For example, traffic safety analysis zones (TSAZ) created using homogeneity of crash rates (Lee et al., 2014) had a better fit than traditional TAZ-based models. The findings from the existing literature clearly imply that the statistical results become more reliable if the spatial units of analysis are made homogeneous using mathematical clustering algorithms.

2.3. Problem statement and research design

Despite the longstanding recognition in the geography literature and, to a lesser extent, in the transportation literature, the impacts of alternative spatial representations or MAUP on FG/FTG models have barely received any research attention. That is, the question is how different zoning systems lead to different results, and an important research gap is evident in that none of the existing studies have attempted to compare how the design of zoning schemes influences model estimation results and statistical fit. This paper contributes to this research gap by designing a wide range of alternate zoning systems and assessing their impacts on the quality of the resulting FG or FTG models. More specifically, the paper attempts to answer the following research questions: (i) which freight-related variables can be used to demarcate alternative zoning systems and maximize the internal homogeneity in terms of overall land-use patterns and economic activity; (ii) how do competing alternatives of zoning systems perform and what inferences do they provide; (iii) does the design of zoning systems have an impact on the quality of the resulting FTG or FG models and, if yes, to what extent; and (iv) which zoning systems minimize the impact of MAUP in modelling freight generation and freight trip generation? To address these research questions, a three-step spatial-statistical investigation approach is proposed based on a comprehensive dataset consisting of both microscopic (establishment-level) and macroscopic freight data. The first step employs ad-hoc geographical clustering and two clustering techniques – K-means clustering and density-based clustering – to produce alternate zoning systems which achieve spatial homogeneity in various sets of freight-related variables. The second step builds on the first in that it assesses and compares the optimality of these alternate sets of zoning systems by measuring the degree of spatial autocorrelation (i.e., similarity across units grouped within a zone). The third step aims to assess the impact of each zoning system on model performance and, in turn, quantify the zoning effect of MAUP in modelling FG and FTG.

3. Methodological approach and data

This section describes the research design followed to develop, assess and compare the potential of alternate zoning systems for modelling FG/FTG patterns. The first part explains the methodological approach and analysis methods. In the second part, the research context of the study area and the data used in this study are presented.

3.1. Analysis methods

The methodological approach consists of three steps, explained as follows. In the first step, different zoning systems based on varying parameters related to socioeconomic, land use and locational characteristics of the study area are developed using ad-hoc geographical clustering (without the use of any clustering algorithms) and automated zone design procedures (AZP) based on mathematical clustering algorithms. In the next step, the degree of homogeneity achieved by each of these zoning systems is measured and compared using Moran's I spatial autocorrelation coefficient. The third and final step involves estimation of hierarchical linear models using both zonal (aggregated) data and establishment (disaggregated) data to compare the model performance of different zoning systems. The analysis methods used in each of these steps are explained below in sequence.

3.1.1. Clustering approach
The first clustering approach – ad-hoc geographical clustering – involves clipping of BSUs according to the existing rail and road boundaries in the study area (Stafford et al., 2008). This approach is well-recognized in the literature because physical barriers to movement are conceivably important in determining an establishment's freight travel patterns. The second approach – AZP – involves the use of clustering techniques to aggregate BSUs into different zoning systems, and two popular (unsupervised) clustering algorithms are adopted in this study: the k-means clustering algorithm (KMCA) and the density-based spatial clustering of applications with noise (DBSCAN) algorithm. The fundamental approaches in KMCA and DBSCAN vary since the former is based on the idea of distance from a central clustering point and the latter is based on the concept of reachability, i.e., how many neighbors a point has within a radius. The clustering algorithms used in the aggregation process can be briefly explained as follows. KMCA iteratively optimizes
the zoning process in two major steps: (i) reassigning all BSUs to the nearest zone centroid and (ii) recomputing the zone centroids, with reallocation repeated until an optimum set of zones is obtained. The ‘distance’ function or ‘similarity’ function/feature for KMCA is a user input parameter. The mathematical formulation for KMCA can be found in Everitt et al. (2011). The within-cluster sum of squares (WSS) is a measure of the compactness of the resultant zones and the between-cluster sum of squares (BSS) is a measure of the uniqueness of the cluster. In KMCA, WSS should decrease and BSS should increase in each k-means iteration (Everitt et al., 2011). The mathematical formulation of WSS and BSS is given in Eqs. (1) and (2). The DBSCAN algorithm, on the other hand, aggregates the points of high density in data space separated by points of low density. This is based on a ‘distance’ or ‘similarity’ function and does not need an input for the initial seed of groups (unlike KMCA). More information about the DBSCAN algorithm and its implementation can be found in Kriegel et al. (2011).

$$\mathrm{WSS} = \sum_{k=1}^{k} \sum_{i=1}^{n} (X_i - M)^2, \quad X_i \in C_k \tag{1}$$

$$\mathrm{BSS} = \sum_{k=1}^{k} n_k (\bar{X}_n - M)^2 \tag{2}$$

where $C_1$ to $C_k$ indicate the individual zones; $k$ represents the number of zones; $n$ denotes the number of BSUs in a zone $C_k$; $X_i$ to $X_n$ are the values of the variable of interest in each BSU; $\bar{X}_n$ is the sample average; $M$ is the mean of the similarity feature inside a zone $C_k$; and $n_k$ is the number of data points inside the $k$th zone.

3.1.2. Spatial autocorrelation coefficient
Once zones are delineated, the optimality of zones is measured in terms of their internal homogeneity in variables that the literature identifies (see Table 2) as covariates of freight activity. The idea is to check whether the BSUs forming a cluster are actually similar to each other and, if so, how much similarity exists within the cluster. This is done based on a parameter known as Moran's I spatial autocorrelation coefficient. The Moran's I parameter measures the overall spatial autocorrelation of the data set, i.e., how similar BSUs are to the other BSUs aggregated within their cluster. The Moran's I parameter is computationally intensive due to the inclusion of the spatial constraint and its value ranges from −1 to 1. If the Moran's I parameter is −1, it denotes perfect clustering of dissimilar values (presence of perfect dispersion). If the Moran's I parameter is 0, it denotes that there is no autocorrelation (presence of perfect randomness). The interval centered around zero within which autocorrelation is inferred to be trivial, or in other words, within which the null hypothesis (i.e., geographic features in the study area are randomly distributed and show no spatial dependence) is not rejected, depends on the confidence level chosen for the analysis (Blazquez et al., 2018). Similarly, a Moran's I parameter around 1 denotes perfect clustering of similar values. However, a large positive Moran's I parameter (close to 1) is an indicator of positive spatial autocorrelation with an uneven pattern, which is undesirable mainly because it indicates very high between-cluster heterogeneity which can lead to several distinct zones with no interzonal movement at all. Hence, the concept of expected Moran's I, denoted by E[I], is used in this study for comparing the optimality of zoning systems (Novak et al., 2011). For a cluster of ‘n’ BSUs, E[I] should be equal to −1/(n − 1). The Moran's I autocorrelation coefficient can be calculated for the overall cluster or a specific attribute of the cluster, and the degree of clustering will be based on their positive convergence to E[I]. The mathematical expression of Moran's I spatial autocorrelation coefficient is given in Eq. (3).

$$\text{Moran's } I = \frac{n \sum_{i=1}^{n} \sum_{j=1}^{n} W_{ij} (X_i - \bar{X})(X_j - \bar{X})}{\left(\sum_{i=1}^{n} \sum_{j=1}^{n} W_{ij}\right) \sum_{i=1}^{n} (X_i - \bar{X})^2} \tag{3}$$

where $n$ = number of BSUs; $X_i$ = value of $X$ in the $i$th BSU; $X_j$ = value of $X$ in the $j$th BSU; $\bar{X}$ = mean of $X$ over all BSUs; $W_{ij}$ = 1 if BSUs $i$ and $j$ are spatially adjacent and 0 if they are not.

3.1.3. Hierarchical linear modelling
Hierarchical linear modelling (HLM) is an efficient tool to understand, analyze and model outcome variables that are clustered in hierarchical levels (Woltman et al., 2012). HLM can distinguish spatial clustering and thus allows sensitivity analysis to assess multiple spatial arrangements. For instance, establishments share variance within zones (Level 1) and between zones (Level 2). HLM is a compound form of Ordinary Least Squares (OLS) regression which can predict variance in outcome variables when predictor variables are embedded in these varying hierarchical levels (level 1 and level 2) by simultaneously investigating the relationships between and within hierarchical levels. The model structure of this two-level HLM is explained in Eqs. (4) to (7). For the purpose of this analysis, two types of HLMs are estimated: unconstrained (null) models and means-as-outcomes models. The null model shown in Eq. (8) estimates the variability in the outcome variable by level-2 zones and suggests whether HLM is applicable or not. The results for the null model include the intra-class correlation coefficient (ICC), which is a measure of the percentage of total variance that can be explained by the presence of clusters at the higher level. If the ICC is very low, the HLM analyses will yield the same results as those obtained from a traditional regression analysis. The expression for ICC is given in Eq. (9). The output also involves estimation of the −2 log likelihood, which can be used to assess the goodness of fit of the model, with smaller values indicating a better fit. Subsequently, the means-as-outcomes models given in Eq. (10) are developed to analyze the relationship between level-2 predictor variables (i.e., zonal characteristics) and the grouped outcome variable. In these models, the measure of variance in the level-1 outcome variable explained by the level-2 predictor variable ($r^2$) is computed using Eq. (11) and compared for different zoning systems.

HLM Model Structure:

$$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij} \tag{4}$$

$$\beta_{0j} = \gamma_{00} + \gamma_{01} G_j + U_{0j} \tag{5}$$

$$\beta_{1j} = \gamma_{10} + \gamma_{11} G_j + U_{1j} \tag{6}$$

$$Y_{ij} = \gamma_{00} + \gamma_{10} X_{ij} + \gamma_{01} G_j + \gamma_{11} G_j X_{ij} + U_{1j} X_{ij} + U_{0j} + r_{ij} \tag{7}$$

Unconstrained (Null) Model:

$$Y_{ij} = \gamma_{00} + U_{0j} + r_{ij} \tag{8}$$

$$\mathrm{ICC} = \frac{\tau_{00}}{\tau_{00} + \sigma^2} \tag{9}$$

Means as Outcomes Model:

$$Y_{ij} = \gamma_{00} + \gamma_{01} G_j + U_{0j} + r_{ij} \tag{10}$$

$$r^2 = \frac{\tau_{\text{null model}} - \tau_{\text{means model}}}{\tau_{\text{null model}}} \tag{11}$$

where $Y_{ij}$ is the dependent variable for the $i$th level 1 unit clustered within the $j$th level 2 unit; $X_{ij}$ is the value of the level 1 predictor; $\beta_{0j}$ is the intercept for the $j$th level 2 unit; $\beta_{1j}$ is the slope for the $j$th level 2 unit; $r_{ij}$ is a random error associated with the $i$th level 1 unit nested within the $j$th level 2 unit; $G_j$ is the value of the level 2 predictor; $\gamma_{00}$ is the overall mean intercept adjusted for $G$; $\gamma_{10}$ is the overall mean slope adjusted for $G$; $\gamma_{01}$ is the regression coefficient associated with $G$ relative to the level 1 intercept; $\gamma_{11}$ is the regression coefficient associated with $G$ relative to the level 1 slope; $U_{0j}$ is the random effect of the $j$th level 2 unit adjusted for $G$ on the intercept; $U_{1j}$ is the random effect of the $j$th level 2 unit adjusted for $G$ on the slope.

Fig. 1. Study Area Map Showing Geocoded Locations of Surveyed Establishments.

polygons that contained zonal characteristics obtained from public freight data sources. The final sample comprises 184 establishments with complete observations. The data mainly consisted of information regarding establishment characteristics (employment, area) and freight flow details (FP and FA in tons/day; FTP and FTA in trips/day).
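As a rough numerical illustration of Eqs. (9) and (11), the sketch below estimates the two variance components of a null model and converts them into an ICC. It is only a sketch under stated assumptions: the components come from a one-way ANOVA (method-of-moments) estimator for a balanced design, which stands in for the likelihood-based HLM estimation described above, and the function names and toy data are hypothetical.

```python
from statistics import mean, variance


def variance_components(groups):
    """One-way ANOVA (method-of-moments) estimates for a balanced two-level design.

    groups: one list of establishment-level outcomes per zone, all of equal length.
    Returns (tau00, sigma2): between-zone and within-zone variance estimates.
    NOTE: this is a stand-in for the likelihood-based HLM fit, not that fit itself.
    """
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes the same number of units in every zone")
    sigma2 = mean(variance(g) for g in groups)        # pooled within-zone variance
    msb = n * variance([mean(g) for g in groups])     # between-zone mean square
    tau00 = max((msb - sigma2) / n, 0.0)              # negative estimates truncated at 0
    return tau00, sigma2


def icc(tau00, sigma2):
    """Share of total variance lying between zones, as in Eq. (9)."""
    return tau00 / (tau00 + sigma2)


def explained_share(component_null, component_means):
    """Proportional reduction of a variance component, in the form of Eq. (11)."""
    return (component_null - component_means) / component_null


# Hypothetical toy data: two zones with two establishments each.
zones = [[1.0, 3.0], [5.0, 7.0]]
tau00, sigma2 = variance_components(zones)
print(tau00, sigma2, icc(tau00, sigma2))
```

In this toy case most of the variance lies between the two zones, so the ICC is high; an ICC close to zero would instead suggest that a single-level regression gives much the same answer.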
3.2. Data collection and preparation

The research is undertaken in Jaipur, the capital and largest city of Rajasthan in India. The city consists of 77 municipal wards that are divided into 8 administrative zones according to the Census of 2011. The data used in this study include disaggregate-level data collected by an establishment-based freight survey (EBFS) and zonal-level data collected from publicly available freight data sources.

3.2.1. Establishment-level (disaggregated) data
The EBFS used in this study collected micro-level data from shippers (manufacturing units, wholesalers, and raw material production sites) in Jaipur. The sampling frame for the EBFS was developed using the economic census, which provided a list of all industrial establishments in Jaipur. The final sampling frame consisted of 31,725 establishments and simple random sampling was adopted for the EBFS since auxiliary information (e.g., industry sector, employment level) was missing for many records in the sampling frame. The survey was administered through face-to-face interviews with logistics managers or firm owners of the establishments. More information about the sampling strategy, socio-demographic characteristics, response rates and sample representativeness can be found in Pani and Sahu (2019b). The locations of surveyed establishments are geo-located using ArcMap® 10.3.1 and the geo-referenced map is presented in Fig. 1. The geo-located establishments were mapped to zoning

3.2.2. Zonal-level (aggregated) data
The aggregated variables found significant for FG/FTG models in the literature were collected for all the BSUs and divided into four groups: (a) socio-economic characteristics, which included three variables: number of households (NHH), population density (PDE) and number of workers (NWS); (b) industrial characteristics, comprising two variables: industrial area (INA) and number of establishments (NES); (c) economic land characteristics, such as vacant commercial area (VCA) and land value (LVA); and (d) locational characteristics, which include three variables, namely the network distance of every ward centroid from the nearest National Highway (DNH), the network distance of every ward centroid from the city centroid (DCC) and the network distance of every ward centroid from the nearest terminal of the dedicated freight corridor (DFC). The first part of the zonal characteristics was collected from District Census Data (Directorate of Census Operations Rajasthan, 2011). The data described for the second part was collected from a publicly available industrial data source, the RIICO GIS website (RIICO, 2018). The third part of the zonal characteristics was calculated with the location-allocation tool of ArcGIS/ArcMap using a digitized network map.

4. Results and discussion

In this section, the analysis results are presented. The alternate zone
systems are presented in the first part. Subsequently, the optimality of these zone systems is assessed using Moran's I coefficient. Finally, the model performance of alternate zone systems is explored using hierarchical linear models and model estimation results are discussed in detail.

Fig. 2. Alternate Zoning Systems based on Freight-Related Variables.

4.1. Designing alternate zoning systems for FG/FTG modelling

A total of 8 administrative zones are defined a priori for the study area as per the census tracts. Density-based clustering (DBSCAN) of establishment density yielded 8 optimal zones in the study area as well. The number of zones in each alternate zoning system is therefore kept as eight to enable model performance comparisons. The resultant zone systems are illustrated in Fig. 2 and the nomenclature behind their definition is explained below.

• Census Zones (ZS-A) - This zoning system is defined according to the existing administrative boundaries without using any clustering technique. The system consists of 77 wards that are grouped ‘a priori’ into 8 zones for administrative purposes.
• Zones based on Physical Boundaries (ZS-B) - This zoning system is delineated based on physical features (i.e., movement barriers such as road and rail boundaries) separating the BSUs. In order to achieve this, the major rail and road lines were extracted and aligned over the digitized map of the study area. The BSUs were subsequently clipped according to the existing rail and road boundaries. These wards were merged together to form zones in ArcGIS considering the contiguity criteria.
• Zones based on Sociodemographic Characteristics (ZS-C) - In this case, the zoning system is defined according to sociodemographic characteristics. The process considered three sociodemographic variables, i.e., number of households, number of workers and population density of each ward, as the similarity criteria for clustering.
• Zones based on Locational Characteristics (ZS-D) - The zoning system ZS-D is defined using locational characteristics: the network distances of each ward centroid from the DFC, from the nearest NH and from the city centroid were taken as the similarity criteria for clustering.
• Zones based on Industrial Characteristics (ZS-E) - This zoning system is delineated based on industrial characteristics, which consisted of two variables: industrial area and number of establishments per ward.
• Zones based on Economic Characteristics of Land (ZS-F) - In order to accommodate the ongoing land use pattern in the zone delineation process, ZS-F considers the vacant industrial space and the industrial land value (in INR) per ward as the input criteria for KMCA. With the use of these two variables, the propensity for commercial expansion in the future can be accommodated in this zoning system.
• Zones defined by Density based clustering of Population Density (ZS-G) - The nomenclature ZS-G indicates that the zoning system is based on
the technique of density-based clustering using the DBSCAN algorithm. In this case, the population density of each ward is used for clustering similar BSUs.
• Zones defined by Density based clustering of Establishment Density (ZS-H) - ZS-H is formed based on the DBSCAN algorithm, which uses establishment density (i.e., number of establishments per sq. km) as the clustering criterion.

This study used Rook's two-way contiguity criteria (a BSU shares both edge and corner with other BSUs) for adjusting the clustering algorithms so that geographically dispersed BSUs are not grouped together. This is achieved by multiplying the ‘distance’ function with the ‘connectivity’ matrix (obtained from GIS tools). The resulting ‘distance’ function used in the aggregation process is essentially a ‘constrained-distance’ function. ArcGIS/ArcMap and GeoDa software are used as the GIS tools for performing these analyses. It may be noted that ZS-A and ZS-B (ZS stands for Zoning System) are designed using ad-hoc geographical clustering, ZS-C to ZS-F are designed using KMCA, and ZS-G and ZS-H are designed using the DBSCAN algorithm. Each of these zoning systems is different either in the modelling objective or in the clustering methodology, and the next sections examine which zoning systems are better in optimality and how well they represent the area-level variation in determinants of FG/FTG patterns.

Table 3
Variation of ICC, -2LL across zone systems.
Dependent                              Zone Systems
variable     ZS-A      ZS-B      ZS-C   ZS-D      ZS-E      ZS-F      ZS-G   ZS-H
FA
  ICC (%)    –         –         –      –         2.66⁎     1.27⁎     –      1.19⁎
  -2 LL      –         –         –      –         1015.1    1016.2    –      1016.4
FTA
  ICC (%)    4.56⁎⁎⁎   –         –      3.03⁎     4.5⁎⁎     4.8⁎      –      5.52⁎⁎⁎
  -2 LL      776.3     –         –      776.8     776.5     776.5     –      776.1
FP
  ICC (%)    –         4.36⁎⁎    –      3.84⁎⁎    1.56⁎     1.75⁎⁎    –      2.24⁎⁎
  -2 LL      –         1133.9    –      1095.4    1097.2    1097.2    –      1097.3
FTP
  ICC (%)    1.94⁎⁎    –         –      1.35⁎     3.87⁎⁎    3.41⁎     –      2.48⁎
  -2 LL      789.2     –         –      789.2     788.1     788.4     –      788.9
LL – log likelihood, − Not Significant.
⁎⁎⁎ 99% Confidence Limit.
⁎⁎
⁎

space. The zoning systems, based on population density may thus be inappropriate for developing freight demand model systems.
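Since the optimality assessment relies on Moran's I computed over a binary contiguity matrix, a minimal sketch of the coefficient is given here. It assumes the commonly used normalised form of Eq. (3), with the adjacency matrix taken as given (for example, exported from a GIS tool); the function names and the four-unit example are hypothetical and are not taken from the study data.

```python
def morans_i(values, w):
    """Global Moran's I for values x_i and a binary adjacency matrix w.

    w[i][j] is 1 when BSUs i and j are spatially adjacent and 0 otherwise.
    Assumes at least one adjacent pair and non-constant values.
    """
    n = len(values)
    x_bar = sum(values) / n
    z = [x - x_bar for x in values]                   # deviations from the mean
    s0 = sum(sum(row) for row in w)                   # total weight
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    ss = sum(d * d for d in z)                        # sum of squared deviations
    return (n / s0) * (cross / ss)


def expected_i(n):
    """Expected Moran's I under spatial randomness, E[I] = -1 / (n - 1)."""
    return -1.0 / (n - 1)


# Hypothetical example: four BSUs in a row, each adjacent to its immediate neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = [1.0, 2.0, 3.0, 4.0]        # neighbouring values are similar
alternating = [1.0, 0.0, 1.0, 0.0]   # neighbouring values are dissimilar
print(morans_i(smooth, chain), morans_i(alternating, chain), expected_i(4))
```

For this chain, the smoothly increasing values give I = 1/3 and the alternating values give I = −1, against E[I] = −1/3, which matches the reading of the coefficient as running from perfect dispersion to clustering of similar values.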
4.2. Assessing the optimality of zoning systems
This section presents the results of the optimality assessment by comparing the degree of spatial autocorrelation accomplished by the zoning systems in each of the ten variables involved in the design nomenclature. The cluster Moran's I and Expected Moran's I estimated for each zoning system in terms of all the variables are presented in the Appendix (Fig. A.1 and Fig. A.2). Although the existing census boundaries (ZS-A) have a weak theoretical basis and are largely subjected to quinquennial electoral changes, Fig. A.1 shows that these zones are internally homogeneous in terms of socio-demographic characteristics such as number of households, population density and number of workers. This is in line with the findings in the health literature (Stafford et al., 2008) that ad-hoc administrative boundaries are appropriate for modelling characteristics of a population, such as health inequalities. This finding is further corroborated by the similar variation in Moran's I coefficient in the case of ZS-A and the zoning system designed using socio-demographic characteristics (ZS-C). However, it is evident that census boundaries are incapable of reflecting the magnitude of commercial and land use activities within the region, the factors that intuitively are associated with freight movements. For instance, ZS-A shows negative spatial autocorrelation for variables such as industrial area, number of establishments, land-value, etc. An overall comparison of the variation in Moran's I underlines the popular assertion (Fotheringham and Wong, 1991; Xu et al., 2014) that there is no such thing as an optimal zoning system, since a zoning system optimal for one set of variables may not be optimal for another. However, zoning systems based on physical boundaries (ZS-B), industrial characteristics (ZS-E) and economic characteristics (ZS-F) show positive spatial autocorrelation across all variables. This suggests that it is important to acknowledge road-rail networks in the study area and prevailing economic and industrial characteristics of

4.3. Comparing the model performance of alternate zoning systems
The viability of the newly developed zoning systems for FG/FTG analysis is investigated by estimating HLM models. First, unconstrained (null) HLM models were estimated to understand what percentage of the variance in FTP (trips/day), FP (tons/day), FTA (trips/day) and FA (tons/day) is attributable to the zonal level and how much to the establishment level. These models are devoid of predictors and give only τ00 and σ2, from which the ICC, log likelihood and chi-squared test statistic are estimated; they help to establish the statistical justification for running an HLM analysis. The magnitude of the ICC indicates whether there is statistically significant between-zone variability that cannot be explained by disaggregate (level-1) data. Even if the ICC values are relatively low, it does not imply that a non-hierarchical analysis is adequate for the data, since omission of hierarchical levels can lead to under- as well as over-estimation of the variation between zones (Haynes et al., 2007; Stafford et al., 2008). The results from the null model are shown in Table 3. It is evident from the table that the ICCs vary between 1 and 6% at the zonal level. The most prominent implication of these results is that they give statistical evidence for the variance remaining unexplained at the disaggregate level, given the absence of a suitable zoning system. The variation in ICC for FTA/FTP/FA/FP across zoning systems suggests that the aggregation technique adopted for zone delineation influences the model estimation results.
The presence of zonal-level variation, as revealed by the ICC values, motivates further analyses to explain the variability in the mean FG/FTG by establishments across zones (intercept) using zonal variables related to industrial characteristics, economic characteristics of land, socio-demographic characteristics and locational characteristics. Mean-as-outcome models are estimated for decomposing the zonal (level-2) variability in FG/FTG patterns by establishments (level-1) for those
land use while designing zoning systems. The comparison also reveals zoning systems which have statistically significant level of ICC and the
that homogeneity achieved using KMCA algorithm is significantly results are presented in Appendix B (Table B.1 and Table B.2). The
better than DBSCAN algorithm. It may be noted that the DBSCAN al- model results reveal that the zonal (aggregated) characteristics, as a
gorithm using population density produces a zoning system (ZS-G) testament to MAUP, exhibit varying magnitude and direction of asso-
having negative spatial autocorrelation for all the variables. This is due ciation with FG/FTG across zoning systems. For instance, industrial
to the clustering of dissimilar values (e.g., BSUs with high values of area per zone (INA) shows positive association with FA for ZS-E (0.106
certain characteristics have neighbor BSUs with smaller values of that tons/day increment for every 1 km2), although the relationship gets
characteristics) and may be driven by the uneven development in city inverted for ZS-F (0.015 tons/day reduction for every 1 km2). The

Fig. 3. Effect of Aggregated Variables on (a) FTP, (b) FTA, (c) FP and (d) FA patterns. [Four panels; x-axis: aggregated variables (INA, NES, VCA, LVA, PDE, NHH, NWS, DDFC, DCC, DNH), grouped into industrial, economic, socio-demographic and locational characteristics; y-axis: explanatory power (r²), 0.0 to 1.0; one series per zoning system.]

metric of measurement (i.e., tons or trips) also plays a crucial role in influencing the relationship between freight demand and zonal characteristics. An example of this alternating influence is that the unitary effect of vacant commercial area per zone (VCA) fluctuates between −0.306 tons/day per km² (FA) and 0.228 trips/day per km² (FTA) for the same zoning system (ZS-F). This contrast corroborates the axiom that FG and FTG are outputs of two fundamentally different processes (Holguín-Veras et al., 2016; Pani and Sahu, 2019a; Sahu and Pani, 2019; Tavasszy and De Jong, 2014) and that an increase in FG may not necessarily result in an increase in FTG due to the flexibility in shipment size. The nature and degree of MAUP effects also depend upon the direction of freight flow, since land value, for example, shows a positive relationship with FTP for ZS-F, even though the relationship is non-existent with FTA. This underlines the importance of distinguishing pick-up requests (FTP or FP) and delivery requests (FTA or FA) during the freight data collection stage itself.
The model estimation results for locational variables suggest the following connotations regarding logistic operations: (i) Distance to major arterial (DNH) shows a significant negative association with freight demand across most of the zoning systems, regardless of the metric of measurement; (ii) Distance to city center (DCC) explains the variability in freight travel when freight is measured in trips (FTP and FTA); this effect is notably absent when freight is measured in tons (FP and FA). The first connotation is logical since an establishment located near a major arterial is more accessible to customers and is therefore more likely to receive freight orders, which will result in more FTA or FTP. The second connotation suggests that suburbanization of warehousing possibly leads to an increase in congestion, since freight trips are also effectively increasing in this transition. This linkage between suburbanization and freight trips is also reflected by the fact that vacant commercial area (VCA), a parameter strongly linked with suburbs, is positively correlated with FTP and FTA. The model estimation results also reveal that the tendency of establishments to cluster geographically, termed establishment agglomeration, leads to reduced freight needs (in tonnage), since number of establishments (NES) is negatively associated with FP and FA. This can be attributed to the economic advantages gained from geographic proximity in these logistic clusters. The operational advantages and asset-sharing in these logistic clusters, however, possibly lead to an increased number of freight trips, since NES is positively associated with FTP and FTA.
The variability in the explanatory power (r²) of the model estimation results is presented in Fig. 3 to indicate the zonal level variability in FG/FTG patterns of establishments (level-1) that can be captured by univariate analyses of zonal (level-2) variables. For instance, an r² value of
0.45 for predicting FTP using VCA based on the ZS-E configuration suggests that 45% of the zonal level variability in FTP patterns (i.e., corresponding to an ICC of 3.87%) can be explained using vacant commercial area in a zone. A closer look at these results suggests that zonal level variability in FG/FTG is better explained in those zoning systems with better overall homogeneity (e.g., ZS-E, ZS-F and ZS-H), barring the exception of ZS-B. The variables that explain this variability are, however, strongly sensitive to the metric in which freight flow is measured. The zonal level variability in FTG (both FTP and FTA) is better explained using socio-demographic characteristics and locational characteristics. The variability in FG (both FP and FA), on the other hand, is strongly linked to industrial characteristics and economic characteristics of land. This dichotomy in explanatory power may be indicative of the disparate factors influencing FG and FTG; the former being a direct reflection of the production process at the establishments and the latter being influenced by logistical decisions that are aimed at minimizing the total transportation cost. The distance to major arterial (DNH) explains zonal variability in both FG and FTG in most of the zoning systems.

5. Conclusions and research implications
This research is the first to examine the potential impacts of MAUP in the measurement of FG/FTG patterns in freight demand model systems by designing alternate zoning systems. For this purpose, disaggregated daily freight travel from establishments and aggregated characteristics of census wards were collected for the city of Jaipur, India. The study results quantify the presence of MAUP since alternate zoning systems resulted in wide variation in the estimated coefficients for zonal characteristics (e.g., industrial area, land value, number of establishments, distance to primary arterial) in terms of magnitude, statistical significance, and even the direction of association (sign of the coefficient). The implication of this finding is that one can design different or even counterproductive policy instruments based on the way data is aggregated to capture the role of land use, spatial effects and built environment in influencing freight travel patterns. The magnitude and scope of MAUP impacts are also found to be dependent upon the metric of measuring freight (FG or FTG) and the direction in which freight flow is measured (i.e., production or attraction). Depending upon the variation in these factors, zonal characteristics may appear to have a pseudo association with FG/FTG; conversely, in other scenarios, they obscure an underlying strong correlation between FG/FTG and zone characteristics, making the relationship appear weak or even negative. Given that spatial data aggregation inherently leads to MAUP, a general workable solution or method that totally avoids MAUP does not exist. The present research implications towards addressing MAUP in freight demand model systems are three-fold: (i) alternative spatial aggregation units need to be examined in their scope and magnitude of MAUP before reporting the linkages between spatial dimension and freight travel; (ii) the dichotomy in the variables that explain the zonal level variability in FG and FTG patterns suggests that spatial indicators will have disparate influence on freight demand model systems based on the metric of measuring freight flow; and (iii) industrial aggregation levels are viable alternatives for bypassing the MAUP effects associated with spatial aggregation levels. The first research implication, as demonstrated in this study, suggests that an analyst needs to determine which zonal characteristics are sensitive to the variation in areal unit definition, and to what extent, and which zoning system minimizes the impacts of MAUP. More specifically, alternative zoning systems need to be visualized or analyzed to arrive at the actual underlying patterns in data. For example, the developed zoning systems using physical boundaries (road and railway network), industrial characteristics (industrial area, number of establishments) and economic characteristics (land value, vacant industrial space) offer satisfactory levels of internal homogeneity and show positive spatial autocorrelation across all variables. It is therefore recommended to practitioners and policymakers that freight demand model systems be developed by designing zoning systems based on the physical features and the prevailing economic and industrial characteristics. The importance of designing better zoning systems is also reflected in the results of this study, since the associations between FG/FTG and zonal characteristics (e.g., distance to arterial, land value) are not present when census boundaries are adopted as the zoning system. The second research implication is that the zonal level variability in FTG is better explained using socio-demographic characteristics and locational characteristics, whereas FG is better explained using industrial characteristics and economic characteristics of land. Finally, the third research implication is that this study underlines the case made in the literature for industrial aggregation of freight data. That is, when spatial units are behaviorally defined based on economic activity (i.e., industrial aggregation), areal units become non-modifiable and MAUP ceases to exist in model systems. In sum, this research highlights the effects of MAUP that are generally not considered while developing FG or FTG models and offers insights into the extent of parameter sensitivity and performance sensitivity during spatial aggregation of freight data. The findings will also encourage analysts to acknowledge that the results of freight travel analyses may vary widely according to the definition of the units of analysis.
This research, however, is not without limitations. First, the geographical units considered in this analysis (i.e., census wards) are pre-aggregated based on smaller units known as census blocks. This means that it is not possible to capture clusters and segregation patterns within a census block. A grid-based approach to aggregating freight data may produce more tractable and stable results than the statistical areal units. Another limitation is that the number of geographical units used at the larger levels of geography was relatively small, and the results may therefore be limited in their generalizability to larger cities in developed countries. There are several future avenues that researchers can take to better understand the link between spatial representations and freight demand model parameters. For example, future research needs to provide the missing insights on the geographical extent of freight activity and the spatial scale of homogeneous areas in order to model the scale effect of MAUP. Future studies are also recommended to explore how the aggregation algorithms can be improved to achieve better homogeneity in the zoning systems and, in turn, improve the explanatory power of FG/FTG models involving spatial indicators.

Declaration of Competing Interest
None.

Acknowledgement
This research was supported through the Research Initiation Grant (RIG Head 06/03/302) by Birla Institute of Technology and Science (BITS) Pilani. The authors would also like to acknowledge the establishment owners for their participation and cooperation during the survey.

Appendix A. Optimality assessment
Fig. A.1. Degree of Homogeneity Achieved in Zoning Systems – (a) ZS-A, (b) ZS-B, (c) ZS-C and (d) ZS-D. [Each panel plots cluster Moran's I and expected Moran's I for the ten variables; axis range −0.05 to 0.05.]
Legend: INA: Industrial Area; NHH: Number of Households; NES: Number of Establishments; NWS: Number of Workers; VCA: Commercial Vacant Area; DDFC: Distance from DFC; LVA: Industrial Land Value; DCC: Distance from City Centroid; PDE: Population Density; DNH: Distance from NH.
Fig. A.2. Degree of Homogeneity Achieved in Zoning Systems – (a) ZS-E, (b) ZS-F, (c) ZS-G and (d) ZS-H. [Each panel plots cluster Moran's I and expected Moran's I for the ten variables; axis range −0.05 to 0.05; variable abbreviations as in Fig. A.1.]
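
The Moran's I statistics plotted in Fig. A.1 and Fig. A.2 can be illustrated with a minimal sketch of the generic global Moran's I and its expectation under no spatial autocorrelation, −1/(n − 1). This is only an illustration under stated assumptions: the binary contiguity weights, the four-unit layout and the values below are hypothetical, and this section does not state how the spatial weights for the cluster Moran's I were constructed in the study.

```python
def morans_i(values, weights):
    """Global Moran's I for one variable observed on n areal units.

    values  : list of n numbers (e.g., a zonal variable per basic spatial unit)
    weights : n x n list of lists, weights[i][j] > 0 when units i and j are neighbors
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)            # sum of all spatial weights
    cross = sum(weights[i][j] * dev[i] * dev[j]      # weighted cross-products
                for i in range(n) for j in range(n))
    ss = sum(d * d for d in dev)                     # total sum of squares
    return (n / s0) * (cross / ss)


def expected_morans_i(n):
    """Expected Moran's I under the null hypothesis of no spatial autocorrelation."""
    return -1.0 / (n - 1)


# Hypothetical example: four units in a row, each a neighbor of the adjacent ones.
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = [1.0, 2.0, 3.0, 4.0]    # similar values sit next to each other
checker = [1.0, 4.0, 1.0, 4.0]   # high values sit next to low values

print("smooth :", morans_i(smooth, chain_weights))
print("checker:", morans_i(checker, chain_weights))
print("expected under no autocorrelation:", expected_morans_i(len(smooth)))
```

A value above the expected Moran's I indicates clustering of similar values (positive spatial autocorrelation), while a value below it indicates that dissimilar values are adjacent, which is the pattern reported for ZS-G.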
Appendix B. Comparative assessment of model performance
Table B.1
Effect of Aggregated Variables on FTA & FTP.
Zoning systems Model coefficients Level-2 Predictors (Gj)
INA NES VCA LVA PDE NHH NWS DDFC DCC DNH
Mean as Outcomes Model-I (FTA)

ZS-H ICC = 5.52% γ00 2.786⁎⁎ 2.778⁎⁎ 2.761⁎⁎ 2.811⁎⁎ 2.879⁎⁎ 2.672⁎⁎ 2.663⁎⁎ 2.793⁎⁎ 2.645⁎⁎ 2.760⁎⁎
γ01 – 0.001⁎ – – – 0.0001⁎ 0.0001⁎⁎ – 0.101⁎ –
U0 – 0.611⁎⁎ – – – 0.341⁎⁎ 0.271⁎ – 0.318⁎ –
r2 0 0.20 0 0 0 0.54 71 0 0.60 0
ZS-F ICC = 4.8% γ00 2.742⁎⁎ 2.740⁎⁎ 2.735⁎⁎ 2.749⁎⁎ 2.740⁎⁎ 2.687⁎⁎ 2.687⁎⁎ 2.707⁎⁎ 2.698⁎ 2.783⁎⁎
γ01 0.004⁎ 0.0001⁎ 0.228⁎ – 0.0001⁎ 0.0001⁎ 0.0001⁎ 0.0001⁎ 0.001⁎ −0.005⁎
U0 0.589⁎⁎ 0.583⁎ 0.523⁎ – 0.568⁎⁎ 0.498⁎ 0.498⁎ 0.546⁎ 0.519⁎ 0.532⁎
r2 0.27 0.29 0.43 0 0.32 0.48 0.48 0.37 0.44 0.41
ZS-A ICC = 4.54% γ00 2.818⁎⁎ 2.808⁎⁎ 2.794⁎⁎ 2.831⁎⁎ 2.815⁎⁎ 2.669⁎⁎ 2.659⁎⁎ 2.832⁎⁎ 2.649⁎⁎ 2.699⁎⁎
γ01 – – – – 0.0001⁎ 0.0001⁎ 0.0001⁎⁎ – 0.108⁎ −0.233⁎
U0 – – – 0.601⁎⁎ 0.498⁎⁎ 0.378⁎ 0.323⁎ – 0.344⁎ 0.515⁎⁎
r2 0 0 0 0 0.14 0.51 0.64 0 0.59 0.08
ZS-E ICC = 4.5% γ00 2.744⁎⁎ 2.746⁎⁎ 2.769⁎⁎ 2.756⁎⁎ 2.757⁎⁎ 2.818⁎⁎ 2.840⁎⁎ 2.751⁎⁎ 2.762⁎⁎ 2.713⁎⁎
γ01 0.189⁎ 0.001⁎ 1.807⁎ – 0.0001⁎ 0.0001⁎ 0.0001⁎ 0.059⁎ 0.060⁎ −0.255⁎
U0 0.528⁎ 0.533⁎ 0.514⁎ 0.562⁎⁎ 0.512⁎ 0.243⁎ 0.100⁎ 0.479⁎ 0.389⁎ 0.401⁎
r2 0.03 0.02 0 0 0.09 0.80 0.97 0.20 0.47 0.44
ZS-D ICC = 3.03% γ00 2.723⁎⁎ 2.718⁎⁎ 2.700⁎⁎ 2.709⁎⁎ 2.743⁎ 2.684⁎⁎ 2.680⁎⁎ 2.736⁎⁎ 2.643⁎⁎ 2.708⁎⁎
γ01 – – – – – 0.001⁎ 0.001⁎⁎ – 0.088⁎ −0.192⁎
U0 – – – – – 0.104⁎ 0.037⁎ – 0.283⁎ 0.342⁎
r2 0 0 0 0 0 0.92 0.99 0 0.41 0.15
Mean as Outcomes Model-II (FTP)

ZS-E ICC = 3.87% γ00 3.399⁎⁎ 3.405⁎⁎ 3.417⁎⁎ 3.371⁎⁎ 3.337⁎⁎ 3.423⁎⁎ 3.425⁎⁎ 3.365⁎⁎ 3.349⁎⁎ 3.322⁎⁎
γ01 – 0.001⁎ 8.975⁎ – 0.001⁎ 0.001⁎⁎ 0.001⁎⁎⁎ – 0.089⁎ –
U0 – 0.429⁎ 0.322⁎ – 0.420⁎ 0.047⁎ 0.033⁎ – 0.331⁎ –
r2 0 0.02 0.45 0 0.06 0.99 0.99 0 0.42 0
ZS-F ICC = 3.41% γ00 3.326⁎⁎ 3.322⁎⁎ 3.308⁎⁎ 3.348⁎⁎ 3.387⁎⁎ 3.289⁎⁎ 3.289⁎⁎ 3.335⁎⁎ 3.298⁎⁎ 3.404⁎⁎
γ01 – 0.001⁎ 0.361⁎ 0.020⁎ – 0.001⁎ 0.001⁎ – 0.001⁎ −0.001⁎
U0 – 0.466⁎ 0.365⁎ 0.393⁎ – 0.449⁎ 0.449⁎ – 0.463⁎ 0.449⁎
r2 0 0.07 0.39 0.06 0 0.08 0.08 0 0.02 0.08
ZS-H ICC = 2.48% γ00 3.257⁎⁎ 3.249⁎⁎ 3.253⁎⁎ 3.297⁎⁎ 3.204⁎⁎ 3.178⁎⁎ 3.192⁎⁎ 3.309⁎⁎ 3.130⁎⁎ 3.171⁎⁎
γ01 – – 11.638⁎ – – 0.001⁎⁎ 0.001⁎⁎ – 0.119⁎⁎ −0.26⁎⁎
U0 – – 0.243⁎ – – 0.062⁎ 0.037⁎ – 0.078⁎ 0.338⁎
r2 0 0 0.50 0 0 0.97 0.99 0 0.95 0.05
ZS-A ICC = 1.94% γ00 3.252⁎⁎ 3.252⁎⁎ 3.258⁎⁎ 3.060⁎⁎ 3.125⁎⁎ 3.145⁎⁎ 3.169⁎⁎ 3.313⁎⁎ 3.113⁎⁎ 2.985⁎⁎
γ01 – – 10.487⁎ – – 0.001⁎ 0.001⁎ – 0.126⁎ –
U0 – – 0.269⁎ 0.452⁎⁎ – 0.194⁎ 0.099⁎ – 0.126⁎ –
r2 0 0 0.23 0 0 0.61 0.90 0 0.83 0
ZS-D ICC = 1.35% γ00 3.205⁎⁎ 3.199⁎⁎ 3.199⁎⁎ 3.182⁎⁎ 3.188⁎⁎ 3.229⁎⁎ 3.231⁎⁎ 3.273⁎⁎ 3.144⁎⁎ 3.248⁎⁎
γ01 – – – – 0.001⁎⁎ 0.001⁎⁎ 0.001⁎⁎ – 0.121⁎⁎ 0.287⁎⁎
U0 – – – – 0.238⁎ 0.025⁎ 0.023⁎ – 0.031⁎ 0.191⁎
r2 0 0 0 0 0.13 0.99 0.99 0 0.99 0.43
Note: − Non-significant Results. ⁎ 95% Confidence Level. ⁎⁎ 99% Confidence Level.
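
For readers mapping the rows of Tables B.1 and B.2 to a model, the following is a sketch of the textbook two-level mean-as-outcome formulation, written with the symbols used in the tables (γ00, γ01, U0, Gj). It is an assumption that the study's own specification follows this standard form; the paper's equations lie outside this section, and the tables do not state whether the U0 row reports a variance or a standard deviation.

```latex
% Level 1 (establishment i in zone j): outcome varies around the zone mean
Y_{ij} = \beta_{0j} + r_{ij}, \qquad r_{ij} \sim N(0, \sigma^{2})

% Level 2 (zone j): the zone mean depends on one aggregated variable G_j
\beta_{0j} = \gamma_{00} + \gamma_{01} G_{j} + U_{0j}, \qquad U_{0j} \sim N(0, \tau_{00})

% Intraclass correlation, from the null model (no G_j term)
\mathrm{ICC} = \frac{\tau_{00}^{\mathrm{null}}}{\tau_{00}^{\mathrm{null}} + \sigma^{2}}

% Share of zone-level variance explained by G_j
r^{2} = \frac{\tau_{00}^{\mathrm{null}} - \tau_{00}^{(G_j)}}{\tau_{00}^{\mathrm{null}}}
```

Under this reading, γ00 is the intercept, γ01 is the unitary effect of the zonal variable, and r² measures the reduction in between-zone variance relative to the null model.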
Table B.2
Effect of Aggregated Variables on FA & FP.
Zoning systems Model coefficients Level-2 Predictors (Gj)
INA NES VCA LVA PDE NHH NWS DDFC DCC DNH
Mean as outcomes model-III (FA)

ZS-E ICC = 2.66% γ00 2.514⁎⁎ 2.514⁎⁎ 2.501⁎⁎ 2.513⁎⁎ 2.469⁎⁎ 2.522⁎⁎ 2.521⁎⁎ 2.497⁎⁎ 2.492⁎⁎ 2.351⁎⁎
γ01 0.106⁎ 0.001⁎ – −0.027⁎ – – – – – −0.564⁎
U0 0.776⁎ 0.776⁎ – – – – – – – 0.615⁎
r2 0.12 0.12 0 0.21 0 0 0 0 0 0.18
ZS-F ICC = 1.27% γ00 2.573⁎⁎ 2.571⁎⁎ 2.572⁎⁎ 2.539⁎⁎ 2.415⁎⁎ 2.488⁎⁎ 2.488⁎⁎ 2.485⁎⁎ 2.524⁎⁎ 2.771⁎⁎
γ01 −0.015⁎ −0.001⁎ −0.306⁎ −0.026⁎ – – – – – −0.021⁎⁎
U0 0.513⁎ 0.522⁎ 0.453⁎ – – – – – – 0.082⁎
r2 0.07 0.07 0.06 0.41 0 0 0 0 0 0.97
ZS-H ICC = 1.19% γ00 2.736⁎⁎ 2.732⁎⁎ 2.692⁎⁎ 2.677⁎⁎ 2.433⁎⁎ 2.579⁎⁎ 2.559⁎⁎ 2.531⁎⁎ 2.556⁎⁎ 2.289⁎⁎
γ01 0.153⁎ −0.003⁎ 0.052⁎ −0.040⁎⁎ – – – – – –
U0 0.092⁎ 0.091⁎ 0.074⁎ – – – – – – –
r2 0.96 0.96 0.97 0.98 0 0 0 0 0 0
Mean as outcomes model-IV (FP)

ZS-B ICC = 4.36% γ00 4.686⁎⁎ 4.532⁎⁎ 4.589⁎⁎ 4.474⁎⁎ 4.387⁎ 4.396⁎⁎ 4.390⁎ 4.458⁎ 4.503⁎ 4.417⁎⁎
γ01 −3.051⁎⁎ −0.008⁎⁎ −32.534⁎ −0.076⁎ – – – – – –
U0 0.081⁎ 0.086⁎ 0.859⁎ – – – – – – –
r2 0.99 0.99 0.41 0.68 0 0 0 0 0 0
ZS-D ICC = 3.84% γ00 4.379⁎⁎ 4.377⁎⁎ 4.374⁎⁎ 4.468⁎⁎ 4.375⁎⁎ 4.122⁎ 4.115⁎⁎ 4.179⁎ 4.217⁎ 4.075⁎⁎
γ01 −1.679⁎ −0.004⁎ −27.790⁎ −0.063⁎ 0.001⁎ – – – – −0.475⁎
U0 0.850⁎ 0.869⁎ 0.884⁎ – 0.962⁎ – – – – 0.953⁎
r2 0.31 0.28 0.25 0.45 0.12 0 0 0 0 0.13
ZS-H ICC = 2.24% γ00 4.184⁎⁎ 4.174⁎⁎ 4.176⁎⁎ 4.337⁎⁎ 3.671⁎⁎ 3.913⁎⁎ 3.921⁎⁎ 4.066⁎⁎ 3.918⁎⁎ 3.560⁎⁎
γ01 −0.873⁎ −0.002⁎⁎ – −0.057⁎ – – – – – −0.896⁎
U0 0.944⁎ 0.957⁎⁎ – – – – – – – 0.741⁎
r2 0.43 0.63 0 0.79 0 0 0 0 0 0.10
ZS-F ICC = 1.75% γ00 4.204⁎⁎ 4.201⁎ 4.191⁎ 4.108⁎⁎ 4.083⁎⁎ 4.128⁎⁎ 4.127⁎⁎ 4.156⁎⁎ 4.169⁎⁎ 4.548⁎⁎
γ01 −0.016⁎ – – – – – – – – −0.032⁎
U0 0.254⁎ – – – – – – – – 0.106⁎
r2 0.01 0 0 0 0 0 0 0 0 0.98
ZS-E ICC = 1.56% γ00 4.046⁎⁎ 4.046⁎⁎ 4.067⁎⁎ 4.108⁎⁎ 4.117⁎ 4.114⁎ 4.118⁎ 4.085⁎ 4.092⁎ 3.859⁎⁎
γ01 −0.473⁎ −0.001⁎ – −0.031⁎ – – – – – −0.712⁎
U0 0.663⁎ 0.677⁎ – – – – – – – 0.378⁎
r2 0.01 0.01 0 0.12 0 0 0 0 0 0.66
Note: − Non-significant Result. ⁎ 95% Confidence limit. ⁎⁎ 99% Confidence limit.
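
The ICC and r² columns reported in Tables B.1 and B.2 can be illustrated with a short numerical sketch. It assumes the standard HLM definitions (ICC as the zone-level share of total variance in the null model, r² as the proportional reduction in zone-level variance once a zonal predictor is added). The function names and all variance values below are hypothetical and are not taken from Table 3 or Appendix B.

```python
def intraclass_correlation(tau00, sigma2):
    """Share of total variance that lies between zones (level 2)."""
    return tau00 / (tau00 + sigma2)


def level2_variance_explained(tau00_null, tau00_conditional):
    """Proportional reduction in zone-level variance after adding a zonal predictor."""
    return (tau00_null - tau00_conditional) / tau00_null


# Hypothetical variance components (illustrative only).
tau00_null = 0.04            # between-zone variance, null model
sigma2 = 0.96                # within-zone (establishment-level) variance
tau00_with_predictor = 0.022  # between-zone variance after adding one zonal variable

icc = intraclass_correlation(tau00_null, sigma2)
r2 = level2_variance_explained(tau00_null, tau00_with_predictor)

print(f"ICC = {icc:.2%} of total variance is at the zonal level")
print(f"r2  = {r2:.2f} of that zonal variance is explained by the predictor")
```

The two quantities answer different questions: a small ICC means that little of the total variance sits at the zonal level, while a high r² means that a zonal variable accounts for most of that small share. This is why a zoning system with an ICC of a few percent can still show r² values close to 1 in the tables.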
References
Alho, A., Abreu e Silva, J., 2014. Freight-trip generation model. Transp. Res. Rec. J. Transp. Res. Board 2411, 45–54. https://doi.org/10.3141/2411-06.
Alho, A.R., de Abreu Silva, J., 2015. Utilizing urban form characteristics in urban logistics analysis: a case study in Lisbon, Portugal. J. Transp. Geogr. 42, 57–71. https://doi.org/10.1016/j.jtrangeo.2014.11.002.
Biehl, A., Ermagun, A., Stathopoulos, A., 2018. Community mobility MAUP-ing: a socio-spatial investigation of bikeshare demand in Chicago. J. Transp. Geogr. 66, 80–90. https://doi.org/10.1016/j.jtrangeo.2017.11.008.
Blazquez, C.A., Picarte, B., Calderón, J.F., Losada, F., 2018. Spatial autocorrelation analysis of cargo trucks on highway crashes in Chile. Accid. Anal. Prev. 120, 195–210. https://doi.org/10.1016/j.aap.2018.08.022.
Cabrera Delgado, J., Bonnel, P., 2016. Level of aggregation of zoning and temporal transferability of the gravity distribution model: the case of Lyon. J. Transp. Geogr. 51, 17–26. https://doi.org/10.1016/j.jtrangeo.2015.10.016.
Cantillo, V., Jaller, M., Holguín-Veras, J., 2014. The Colombian strategic freight transport model based on product analysis. PROMET - Traffic&Transportation 26, 487–496. https://doi.org/10.7307/ptt.v26i6.1460.
Directorate of Census Operations Rajasthan, 2011. District Census Handbook, Jaipur. Census of India, Jaipur.
Everitt, B., Landau, S., Leese, M., Stahl, D., 2011. Quality and quantity. In: Cluster Analysis, 5th ed. Wiley Series in Probability and Statistics. https://doi.org/10.1007/BF00154794.
Fotheringham, A.S., Wong, D.W.S., 1991. The modifiable areal unit problem in multivariate statistical analysis. Environ. Plan. A 23, 1025–1044. https://doi.org/10.1068/a231025.
Garrido, R.A., Mahmassani, H.S., 2000. Forecasting freight transportation demand with the space-time multinomial probit model. Transp. Res. Part B Methodol. 34, 403–418. https://doi.org/10.1016/S0191-2615(99)00032-6.
Gonzalez-Feliu, J., Sánchez-Díaz, I., 2019. The influence of aggregation level and category construction on estimation quality for freight trip generation models. Transp. Res. Part E Logist. Transp. Rev. 121, 134–148. https://doi.org/10.1016/j.tre.2018.07.007.
Griffith, D.A., 2009. Spatial autocorrelation. Int. Encycl. Hum. Geogr. 308–316. https://doi.org/10.1016/b978-008044910-4.00522-8.
Gunay, G., Ergun, G., Gokasar, I., 2016. Conditional freight trip generation modelling. J. Transp. Geogr. 54, 102–111. https://doi.org/10.1016/j.jtrangeo.2016.05.013.
Guo, J.Y., Bhat, C.R., 2004. Modifiable areal units problem or perception in modeling of residential location choice? Transp. Res. Rec. J. Transp. Res. Board 1898, 138–147.
Ha, D.-H., Combes, F., 2016. Building a Model of Freight Generation with a Commodity Flow Survey, Commercial Transport. Springer International Publishing, Cham.
Haynes, R., Daras, K., Reading, R., Jones, A., 2007. Modifiable neighbourhood units, zone design and residents' perceptions. Heal. Place 13, 812–825. https://doi.org/10.1016/j.healthplace.2007.01.002.
Holguín-Veras, J., López-Genao, Y., Salam, A., 2002. Truck-trip generation at container terminals results from a nationwide survey. Transp. Res. Rec. 1790 (01), 89–96. https://doi.org/10.3141/1790-11.
Holguín-Veras, J., Jaller, M., Sanchez-Diaz, I., Wojtowicz, J., Campbell, S., Levinson, H., Lawson, C., Powers, E.L., Tavasszy, L., 2012. NCFRP 19: Freight Trip Generation and Land Use: Final Report. Washington DC, United States.
Holguín-Veras, J., Lawson, C., Wang, C., Jaller, M., González-Calderón, C., Campbell, S., Kalahashti, L., Wojtowicz, J., Ramirez, D., 2016. Using commodity flow survey microdata and other establishment data to estimate the generation of freight, freight trips, and service trips. NCFRP 37. https://doi.org/10.17226/24602.
Kawamura, K., Miodonski, D., 2012. Examination of the relationship between built environment characteristics and retail freight delivery. Transp. Res. Board 91st Annu. Meet. 1–13.
Kriegel, H.P., Kröger, P., Sander, J., Zimek, A., 2011. Density-based clustering. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 1, 231–240. https://doi.org/10.1002/widm.30.
Landau, U., 1978. Aggregate prediction with disaggregate models: behavior of the aggregation bias. Transp. Res. Rec. J. Transp. Res. Board 100–105.
Lee, J., Abdel-Aty, M., Jiang, X., 2014. Development of zone system for macro-level traffic safety analysis. J. Transp. Geogr. 38, 13–21. https://doi.org/10.1016/j.jtrangeo.2014.04.018.
LeSage, J., Pace, R.K., 2009. Introduction to Spatial Econometrics, Journal of the Royal Statistical Society: Series A (Statistics in Society). Chapman & Hall/CRC. https://doi.org/10.1111/j.1467-985x.2010.00681_13.x.
Macharis, C., Melo, S., 2011. City distribution and urban freight transport: multiple perspectives. Business Economics. https://doi.org/10.4337/9780857932754.00001.
Martinez, L.M., Viegas, J.M., Silva, E.A., 2007. Zoning decisions in transport planning and their impact on the precision of results. Transp. Res. Rec. 58–65. https://doi.org/10.3141/1994-08.
Mitra, R., Buliung, R.N., 2012. Built environment correlates of active school transportation: neighborhood and the modifiable areal unit problem. J. Transp. Geogr. 20, 51–61. https://doi.org/10.1016/j.jtrangeo.2011.07.009.
Novak, D.C., Hodgdon, C., Guo, F., Aultman-Hall, L., 2011. Nationwide freight generation models: a spatial regression approach. Networks Spat. Econ. 11, 23–41. https://doi.org/10.1007/s11067-008-9079-2.
Openshaw, S., 1977. Optimal zoning systems for spatial interaction models. Environ. Plan. A Econ. Sp. 9, 169–184. https://doi.org/10.1068/a090169.
Ortega, E., López, E., Monzón, A., 2014. Territorial cohesion impacts of high-speed rail under different zoning systems. J. Transp. Geogr. 34, 16–24. https://doi.org/10.1016/j.jtrangeo.2013.10.018.
Ortúzar, J. de D., Willumsen, L.G., 2011. Modelling Transport, Modelling Transport. https://doi.org/10.1002/9781119993308.
Pani, A., Sahu, P.K., 2019a. Comparative assessment of industrial classification systems for modeling freight production and freight trip production. Transp. Res. Rec. 2019. https://doi.org/10.1177/0361198119834300.
Pani, A., Sahu, P.K., 2019b. Planning, designing and conducting establishment-based freight surveys: a synthesis of the literature, case-study examples and recommendations for best practices in future surveys. Transp. Policy. https://doi.org/10.1016/j.tranpol.2019.04.006.
Pani, A., Sahu, P.K., Patil, G.R., Sarkar, A.K., 2018. Modelling urban freight generation: a case study of seven cities in Kerala, India. Transp. Policy 69, 49–64. https://doi.org/10.1016/j.tranpol.2018.05.013.
RIICO, 2018. RIICO GIS: Rajdhara, Rajasthan State Industrial Development and Investment Corporation, Government of Rajasthan, India. www.gis.rajasthan.gov.in/riico/.
Sahu, P.K., Pani, A., 2019. Freight generation and geographical effects: modelling freight needs of establishments in developing economies and analyzing their geographical disparities. Transportation (Amst). https://doi.org/10.1007/s11116-019-09995-5.
Sanchez-Diaz, I., 2017. Modeling urban freight generation: a study of commercial establishments' freight needs. Transp. Res. Part A Policy Pract. 102, 3–17. https://doi.org/10.1016/j.tra.2016.06.035.
Sánchez-Díaz, I., 2018. Potential of implementing urban freight strategies in the accommodation and food services sector. Transp. Res. Rec. https://doi.org/10.1177/0361198118796926.
Sánchez-Díaz, I., Holguín-Veras, J., Wang, X., 2016. An exploratory analysis of spatial effects on freight trip attraction. Transportation (Amst). 43, 177–196. https://doi.org/10.1007/s11116-014-9570-1.
Stafford, M., Duke-Williams, O., Shelton, N., 2008. Small area inequalities in health: are we underestimating them? Soc. Sci. Med. 67, 891–899. https://doi.org/10.1016/j.socscimed.2008.05.028.
Stępniak, M., Jacobs-Crisioni, C., 2017. Reducing the uncertainty induced by spatial aggregation in accessibility and spatial interaction applications. J. Transp. Geogr. 61, 17–29. https://doi.org/10.1016/j.jtrangeo.2017.04.001.
Tavasszy, L., De Jong, G., 2014. Modelling Freight Transport, First. ed. Elsevier, London. https://doi.org/10.1016/B978-0-12-410400-6.00008-2.
Viegas, J.M., Martínez, L.M., Silva, E.A., 2009. Effects of the modifiable areal unit problem on the delineation of traffic analysis zones. Environ. Plan. B Plan. Des. 36, 625–643. https://doi.org/10.1068/b34033.
Wagner, T., 2010. Regional traffic impacts of logistics-related land use. Transp. Policy 17, 224–229. https://doi.org/10.1016/j.tranpol.2010.01.012.
Wang, S., Sun, L., Rong, J., Yang, Z., 2014. Transit traffic analysis zone delineating method based on Thiessen polygon. Sustain 6, 1821–1832. https://doi.org/10.3390/su6041821.
Wisetjindawat, W., Sano, K., Matsumoto, S., 2006. Commodity distribution model incorporating spatial interactions for urban freight movement. Transp. Res. Rec. J. Transp. Res. Board 1966, 41–50. https://doi.org/10.3141/1966-06.
Woltman, H., Feldstain, A., MacKay, C., Rocchi, M., 2012. An introduction to hierarchical linear modeling. Tutor. Quant. Methods Psychol. 8, 52–69. https://doi.org/10.2307/2095731.
Wong, D., 2009. Modifiable Areal Unit Problem. In: SAGE Handb. Spat. Anal, pp. 105–123.
Xu, P., Huang, H., Dong, N., Abdel-Aty, M., 2014. Sensitivity analysis in the context of regional safety modeling: identifying and assessing the modifiable areal unit problem. Accid. Anal. Prev. 70, 110–120. https://doi.org/10.1016/j.aap.2014.02.012.
Yannis, G., Papadimitriou, E., Antoniou, C., 2007. Multilevel modelling for the regional effect of enforcement on road accidents. Accid. Anal. Prev. 39, 818–825. https://doi.org/10.1016/j.aap.2006.12.004.
You, J., Nedović-Budić, Z., Kim, T.J., 1998. A GIS-based traffic analysis zone design: implementation and evaluation. Transp. Plan. Technol. 21, 69–91.
Zhang, M., Kukadia, N., 2005. Metrics of urban form and the modifiable areal unit problem. Transp. Res. Rec. J. Transp. Res. Board 1902, 71–79. https://doi.org/10.3141/1902-09.