
Transportation Research Part C 83 (2017) 134–145

Contents lists available at ScienceDirect

Transportation Research Part C


journal homepage: www.elsevier.com/locate/trc

Data-driven fuel consumption estimation: A multivariate adaptive regression spline approach
Yuche Chen ⇑, Lei Zhu, Jeffrey Gonder, Stanley Young, Kevin Walkowicz
National Renewable Energy Laboratory, 15013 Denver West Parkway, Golden, CO 80401, United States

Article info

Article history:
Received 13 April 2017
Received in revised form 3 August 2017
Accepted 3 August 2017
Available online 12 August 2017

Keywords:
Data-driven analytics
Fuel consumption estimation
Multivariate adaptive regression spline
Eco-routing

Abstract

Providing guidance and information to drivers to help them make fuel-efficient route choices remains an important and effective strategy in the near term to reduce fuel consumption from the transportation sector. One key component in implementing this strategy is a fuel-consumption estimation model. In this paper, we developed a mesoscopic fuel consumption estimation model that can be implemented into an eco-routing system. Our proposed model presents a framework that utilizes large-scale, real-world driving data, clusters road links by free-flow speed, and fits one statistical model for each cluster. This model includes predicting variables that were rarely or never considered before, such as free-flow speed and number of lanes. We applied the model to a real-world driving data set based on a global positioning system travel survey in the Philadelphia-Camden-Trenton metropolitan area. Results from the statistical analyses indicate that the independent variables we chose influence the fuel consumption rates of vehicles. But the magnitude and direction of the influences are dependent on the type of road links, specifically free-flow speeds of links. A statistical diagnostic is conducted to ensure the validity of the models and results. Although the real-world driving data we used to develop statistical relationships are specific to one region, the framework we developed can be easily adjusted and used to explore the fuel consumption relationship in other regions.
© 2017 Elsevier Ltd. All rights reserved.

1. Introduction

The transportation sector is a big energy consumer and one of the largest greenhouse gas (GHG) emissions contributors.
Governments around the world are taking steps to address the energy and GHG emissions problems caused by transporta-
tion (Chen and Fan, 2013, 2014; Zhang et al., 2016; Jiang et al., 2016a, 2016b). Clearly, a portfolio of strategies should be
employed by the transportation sector to mitigate GHG emissions and dependence on fossil fuels (Morrison and Chen,
2011; Chen et al., 2017a, 2017b; Jiang et al., 2015; Yi and Bauer, 2017). Although renewable sources of transportation fuels
and alternative fuel vehicles are playing roles in this process, reducing fuel consumption and GHG emissions of the existing
fleet, where more than 90% are conventional internal combustion engine vehicles, remains an important and effective
approach in the near term (Hu and Chen, 2016; Chen and Meier, 2016; Jiang et al., 2014; Chen and Borken-Kleefeld, 2016).
One particular area of interest is to provide guidance to drivers so that they can achieve better fuel efficiency during driv-
ing. This is broadly known as eco-driving, which includes two types of tactics. One is to offer microscopic operational tips to
drivers (such as maintaining steady speed, smoothing acceleration) to achieve better fuel efficiency. Many studies looked

⇑ Corresponding author.
E-mail address: Yuche.Chen@nrel.gov (Y. Chen).

http://dx.doi.org/10.1016/j.trc.2017.08.003
0968-090X/© 2017 Elsevier Ltd. All rights reserved.

into this tactic (Asadi and Vahidi, 2011; Manzie et al., 2007; Wan et al., 2016; Ubiergo and Jin, 2016), and it was reported that
on average a 5–15% fuel efficiency improvement could be obtained (Onoda, 2009). Although the potential fuel savings are
intriguing, realizing them requires drivers to closely follow microscopic operational tips, which is not feasible and can potentially cause
safety concerns (Barkenbus, 2010). Emerging technologies, such as vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I)
communications, make better eco-driving strategies available and address the safety concerns. For example, Wan
et al. (2016) proposed a Speed Advisory System (SAS), which utilizes communications between vehicles to reduce idling time
at red lights and achieve fuel-minimizing driving. Ubiergo and Jin (2016) presented a hierarchical eco-driving strategy
based on traffic signal control and V2I communications with the purpose of reducing travel delays and saving fuels by
smoothing speed trajectory. The second, often overlooked, tactic is to guide drivers to choose fuel-efficient routes. The route
choice can take into consideration road conditions and traffic conditions, which are predominant factors in determining the driving
cycle and fuel efficiency. It was claimed that once the route is selected, the aforementioned microscopic tactics seem to
have a relatively small influence on fuel efficiency (Nie and Li, 2013). In addition, emerging technologies provide better data
to make accurate and real-time fuel consumption estimation for routing optimization. For example, Zulkefli et al. (2017)
established a hardware-in-the-loop system to assess fuel consumption of connected and autonomous vehicles with accurate
tracking of vehicle speed and measuring of engine operating condition. Zheng and Liu (2017) developed a volume estimation
method using connected vehicle trajectory data to estimate traffic volume at signalized intersections. The volume estimation
can later be used to estimate travel delay and fuel wasted on idling. Mo et al. (2017) utilized license plate recognition (LPR) data
and a systematic LPR data-mending method to estimate vehicle speed profiles. Other studies investigated the
accuracy of network-level speed and volume data (Kim and Coifman, 2014). Although the above data are not fuel consumption
data directly, they can be used as accurate proxies for fuel data. Timely and accurate fuel consumption
estimates for different routes can help drivers choose eco-routing options and reduce fuel use and GHG emissions.
Given the important role of eco-routing in saving transportation fuel consumption, in this study, we developed a fuel con-
sumption estimation model to be implemented into an eco-routing guidance system. The emerging technologies provide
possibilities to implement better eco-routing strategies. Still, methodology for fuel estimation in eco-routing applications
remains a knowledge gap in the literature, particularly regarding how to utilize newly available data to accurately estimate fuel consumption.
Most of the recent literature focused on designing control strategies rather than fuel estimation methods. There are three
types of fuel consumption estimation models: macroscopic, microscopic, and mesoscopic. Macroscopic models (such as
the Motor Vehicle Emission Simulator [MOVES] lookup table approach; see footnote 1) estimate fuel consumption through lookup tables, which
are easy to use but cannot demonstrate accuracy (EPA, 2010). Microscopic models (such as VT-Micro and the Comprehensive Modal
Emissions Model [CMEM]) can accurately estimate fuel consumption at the same time stamp (usually each second) as simulated
speed trajectories, but traffic simulations are complex and time demanding and thus not suitable for eco-routing systems, which
require fuel estimates to be generated promptly for route choice purposes (Brooker et al., 2015; Nagel and Scheicher, 1994;
Rakha et al., 2004). Mesoscopic models combine the advantages of the other two models. They apply large-scale real-world driv-
ing data to a microscopic model to obtain accurate fuel estimations and establish a statistical relationship between fuel con-
sumption and influencing factors (such as speed and acceleration) which can be used to promptly evaluate fuel consumption
of routes without conducting traffic simulation.
Our proposed model distinguishes itself from existing mesoscopic models by (1) developing a framework to utilize large-scale
real-world driving data, (2) applying a clustering prediction algorithm for better-fitting performance, and (3) considering
predicting variables not considered before, such as link free-flow speed and number of lanes. Specifically, we
adopted a Multivariate Adaptive Regression Spline (MARS) (Friedman, 1991) approach to optimally determine clusters of
free-flow speed and fit one regression curve for each cluster. Although the real driving data we used to develop a statistical
relationship are specific to one region, the framework we developed can be easily adjusted and utilized to explore fuel con-
sumption relationship in other regions.
The rest of the paper is organized as follows. Section 2 reviews relevant literature on fuel consumption estimation models.
Section 3 describes the proposed fuel consumption estimation model and Section 4 discusses the data applied to the model.
Results, statistical diagnostic, and comparison with other models are presented in Section 5. Section 6 concludes the paper
and discusses directions and future applications.

2. Literature review

In this section, we summarized the characteristics of three types of fuel consumption models, namely macroscopic,
microscopic, and mesoscopic. The three types of models differ based on how vehicle activities and fuel consumptions are
aggregated over time and space.
Macroscopic models typically estimate vehicle fuel consumption rate based on factors such as average travel speed, vehi-
cle type, and model year. Usually, the estimation relationship is in the format of a lookup or mapping table, such as the tables
in some major energy and transportation emission inventory models (Annual Energy Outlook (EIA, 2016), MOBILE6 (EPA,

1 Note that MOVES is considered to be a multi-resolution model, i.e., macroscopic, mesoscopic and microscopic, based on the approach users choose within
the MOVES model. Here, the macroscopic model specifically refers to the look-up table approach to find the fuel consumption/GHG emission rate at a given speed
range.

2003), etc.). Predicting parameters are often in discrete format. For example, MOBILE6 can estimate the fuel consumption
rate of vehicles by vehicle type, fuel type, and average speed at 5 mph intervals. Some studies utilized macroscopic fuel consumption
models in conducting eco-routing, trip assignment, or related problems (Penic and Upchurch, 1992; Sugawara and
Niemeier, 2002). But one major disadvantage of using macroscopic models in eco-routing problems is that they do not consider
heterogeneity in driving; therefore, two different driving trajectories with the same average speed will produce the
same fuel consumption.
After realizing the disadvantages of macroscopic models, eco-routing researchers started to look into microscopic fuel
consumption models. Microscopic models estimate fuel consumption at the most granular level, usually every second of
vehicle travel. They are based on either a statistical or physical approach. An early effort in applying a physical approach
was the CMEM (Barth et al., 1996), which estimates engine-out energy and emissions at each second based on the total
engine output power using simplified physical models. Rakha and colleagues developed the VT-Micro model, which statistically
estimates fuel consumption based on second-by-second vehicle travel speed and acceleration (Rakha et al., 2004).
Researchers have utilized microscopic fuel consumption models in their eco-routing problems. Rakha et al. (2012) integrated
the VT-Micro model in the INTEGRATION microscopic traffic assignment and simulation framework for modeling an eco-
routing problem. Traffic simulations on each link produced second-by-second drive profiles, which were applied to VT-
Micro to estimate fuel consumption. Nie and Li (2013) proposed an environmentally constrained shortest-path problem
and developed a carbon dioxide emissions (directly related to fuel consumption) estimation model based on CMEM. The sys-
tem utilizes the CMEM model to simulate carbon dioxide emissions at each second and dynamically determines optimal
routing strategies to minimize environmental impact. To sum up, these microscopic fuel consumption estimation models
are well suited to work with second-by-second vehicle trajectory data. However, obtaining such data through a traffic
simulation model is time-consuming, particularly in a large network application (Nagel and Scheicher, 1994).
Mesoscopic fuel consumption models have drawn attention in the recent decade. Mesoscopic models statistically predict
fuel consumption of vehicles using major influencing factors, based on large-scale real-world driving data (Barth and
Boriboonsomsin, 2008; Rakha et al., 2011; Li et al., 2017). Compared with macroscopic models, they use more influencing
factors and provide estimations based on statistical relationships rather than lookup tables, which could improve the accuracy
of prediction. Compared with microscopic models, they do not require second-by-second travel trajectories, which
reduces time demand and makes them more suitable for eco-routing applications. Li et al. (2017) investigated the effects of different
data segregation methods on mesoscopic modeling for vehicle energy consumption. Specifically, they tested a variety of
novel methods for performance comparison. Barth and Boriboonsomsin (2008) used a sample of 241 trips on freeway mainlines
in California to investigate the relationship between average speed and carbon dioxide emissions through a fourth-order
polynomial regression model. The fitted "U" curve shows that a vehicle's carbon dioxide emission per distance reaches
the minimum when it is traveling around 40 mph. Although the results are solid, like the majority of mesoscopic models,
they are simplified by only considering speed and/or acceleration. This calls for consideration of other important factors that
influence fuel consumption, which is achieved by our proposed model. In addition, our model clusters road links by their
free-flow speed and fits one curve for each cluster to improve prediction accuracy.
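
To make the fourth-order polynomial idea concrete, the following minimal sketch fits such a curve with NumPy and locates the speed at which the fitted rate is lowest. The coefficients and data points are invented so that the fit can be checked; they are not values from Barth and Boriboonsomsin (2008) or from this study.

```python
import numpy as np

# Hypothetical trip-level data: average speed (mph) and an emission rate.
# The rates are generated from a made-up quartic, not from measurements.
true_coeffs = np.array([2.0e-4, -0.03, 1.8, -50.0, 900.0])  # highest order first
speeds = np.linspace(5.0, 80.0, 60)
rates = np.polyval(true_coeffs, speeds)

# Fit rate = c4*v^4 + c3*v^3 + c2*v^2 + c1*v + c0 by least squares.
fitted = np.polyfit(speeds, rates, deg=4)

# Locate the speed with the lowest fitted rate on a fine grid.
grid = np.linspace(speeds.min(), speeds.max(), 751)
best_speed = grid[np.argmin(np.polyval(fitted, grid))]
```

A single curve of this kind depends on average speed only, which is the simplification the proposed model tries to move beyond.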

3. Methodology

This paper attempts to establish a mesoscopic fuel consumption estimation model to be implemented into an eco-routing
system. In our model, the fuel consumption rate (gallons per 100 miles) will be predicted based on factors such as average
speed, average acceleration, free-flow speed, road grade, and the number of lanes on each road link. We chose to predict fuel
consumption at the road link level for two reasons. First, a road link is the smallest element in a transportation network, and
physical features on one road link are consistent as presented in the network database. Second, an eco-routing decision is a
combination of different road links. Therefore, predicting fuel consumption at the link level is appropriate in our fuel consumption
model, which aims to be implemented in an eco-routing system. Fig. 1 is the flowchart of the framework we established
in this study.
There are three modules in this framework. The first module is called "data preparation." We start by preparing a travel
trajectory data set based on GPS-captured speed trajectories (such as those hosted in the Transportation Secure Data
Center) map-matched to a road network map system. The resulting data set is processed through a procedure for mode imputation
and speed filtration to ensure vehicle driving data were captured accurately. The products from the data preparation step are
road link trips. Each trip profile contains the second-by-second speed trajectory and location of a vehicle. In the "model
implementation" module, we apply the speed trajectory of each link trip to CMEM (a microscopic model that estimates
second-by-second fuel consumption; see footnote 2) to obtain the fuel consumption rate and other aggregated metrics such as average speed, acceleration,
and road grade of the link trip. The results for each link trip can be seen as an observation record, and all of the observations are
input into a MARS statistical model to find the relationship between fuel consumption and independent variables. Finally, in the
"model application" module, whenever a user needs to make an eco-routing decision, the system will prepare travel and road

2 Although it would be good to have actual fuel consumption data through field collection, we did not have them in our data set. However, the proposed
framework and methodology, which are the main focus of the study, can still be applied without any adjustments. In addition, the CMEM model uses a physical
power-demand approach based on a parameterized analytical representation of fuel consumption, which has been shown to be accurate in estimation (Barth
and Boriboonsomsin, 2008; Barth et al., 1996; Boriboonsomsin and Barth, 2009; Boriboonsomsin et al., 2012).


[Figure: flowchart of the three-module framework]

Fig. 1. Layout of the system framework.

link information and apply it to the regression model established in "model implementation." If no travel information specific
to the driver is available for a link, the model will utilize averaged information for the same link from other drivers. Although
each user's data and estimated coefficients in the statistical models will be different, the methodology and framework remain
the same. The proposed framework can therefore provide user-specific fuel consumption estimates for eco-routing
decisions.
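
The fallback rule for missing driver-specific information can be sketched as follows. This is only an illustration under assumed data structures: the `records` dictionary, the function name `link_inputs`, and the sample values are hypothetical and not part of the paper's system.

```python
from statistics import mean

def link_inputs(link_id, driver_id, records):
    """Return the average speed (mph) to use for a link.

    `records` maps (driver_id, link_id) -> list of observed average speeds.
    Driver-specific observations are preferred; otherwise observations from
    all drivers on the same link are pooled.
    """
    own = records.get((driver_id, link_id), [])
    if own:
        return mean(own)
    pooled = [s for (_, l), speeds in records.items() if l == link_id for s in speeds]
    if not pooled:
        raise KeyError(f"no observations for link {link_id!r}")
    return mean(pooled)

# Hypothetical observations for two drivers on two links.
records = {("d1", "A"): [30.0, 34.0], ("d2", "A"): [40.0], ("d2", "B"): [50.0, 54.0]}
own_speed = link_inputs("A", "d1", records)       # d1 has its own data on link A
fallback_speed = link_inputs("B", "d1", records)  # d1 has none on B; use other drivers
```

The same pattern would apply to any other link-level input, such as average acceleration.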
One major contribution of our paper is to consider free-flow speed in predicting link travel fuel consumption and to
develop a clustering-based regression model to improve predicting accuracy. Several existing studies investigated and
proved the influence of road characteristics on fuel consumption estimation (Ericsson et al., 2006; Boriboonsomsin et al.,
2012). But road characteristics are mainly considered as categorical variables, which impact the prediction accuracy and
limit its application in eco-routing problems. To address this issue, we use free-flow speed of a link as a proxy to categorize
road characteristics (see footnote 3). Specifically, we utilized the MARS approach to determine the optimal (in terms of goodness-of-fit) clusters of
free-flow speed and fit one multivariate regression curve for all links within the same free-flow speed range. This can be seen as
producing a kink in the dimensions of independent variables, and the kink effect is produced by hinge functions. There are several
non-parametric techniques for multivariate regressions. For example, nonparametric multiplicative regression (NPMR) is a
form of regression based on multiplicative kernel estimation, which weighs the influence of observations according to the dis-
tance of each observation to the predictor (McCune, 2006). But NPMR is particularly suitable when the response variable is a
quantitative or binary variable (McCune, 2006). Another form of non-parametric technique is the additive model (AM), which
assumes the response is the sum of arbitrary smooth univariate functions of explanatory variables (Buja et al., 1989). This
approach avoids the traditional assumption of linearity in explanatory variables but retains the assumption that explanatory
variable effects are additive. Although the AM is a widely used model, it was not selected here because the assumed
univariate function for each predictor has a significant influence on the results, and a misspecified assumption
will largely reduce the accuracy of the model. Among the non-parametric models, the MARS approach satisfies our needs
in this study best, because it automatically models nonlinearities and interactions between variables, is suitable for responses in
continuous form, and is easy to explain. The MARS model was also chosen for similar reasons by other studies focusing on estimating
fuel consumption (Oduro et al., 2015; Silva, 2014; Wang et al., 2015).
A traditional multivariate regression model with K independent variables has the standard form $Y = \sum_{k=1}^{K} C_k X_k$. In our
multivariate adaptive regression spline model, we adjusted the form to be:

3 Free-flow speed of a road link is closely related to roadway characteristics, such as roadway type, type of intersection at link ends, etc. The inclusion of a
continuous independent variable, such as free-flow speed, can bring the benefits of better understanding and quantifying road characteristics’ influence on fuel
consumption.

\[
\log Y = \sum_{k=1}^{K-1} C_k\big(h(X_K)\big) X_k + C_K\big(h(X_K)\big) X_K \qquad (1)
\]

$h(X_K)$ is the hinge function that determines the interval $X_K$ belongs to, which further determines the coefficients $C_1$ to $C_K$. The
hinge function is a key part of the MARS model. It has zero value for part of its range and thus can be used to partition the data
into disjoint regions, each of which can be treated independently. If the data need to be partitioned into only
two parts using one knot, the hinge functions can be conceptually expressed as $a_1 \max(0, x - X_1) + a_2 \max(0, X_1 - x)$, with $X_1$ as
the divider for partitioning or clustering, and the regression value will vary depending on the cluster the independent variable
falls in. It is possible to have more than one divider; in this case, the hinge functions can be expressed as
$a_1 \max(0, x - X_1) + a_2 \max(0, X_1 - x) + a_3 \max(0, x - X_2) + \dots + a_k \max(0, x - X_{k-1})$ with $k$ clusters. In this case, an
independent variable $x < X_1$ will have coefficient $a_1$, $X_1 < x < X_2$ will have coefficient $a_1 + a_2$, and so on.
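
As an illustration of how a pair of hinge functions partitions a variable around one knot, the sketch below builds the two mirrored hinge terms and recovers their coefficients by ordinary least squares. The knot location, slopes, and data are made up for the example, and the knot is fixed by hand; a full MARS procedure also searches over knot locations, which is not shown here.

```python
import numpy as np

def hinge_pair(x, knot):
    """Return the mirrored hinge terms max(0, x - knot) and max(0, knot - x)."""
    x = np.asarray(x, dtype=float)
    return np.maximum(0.0, x - knot), np.maximum(0.0, knot - x)

# Illustrative values only: a single knot at 55 with made-up slopes.
knot = 55.0
x = np.array([30.0, 45.0, 55.0, 60.0, 70.0])
right, left = hinge_pair(x, knot)

a1, a2 = 0.5, 0.2
y = a1 * right + a2 * left

# With the knot fixed, the coefficients follow from ordinary least squares.
design = np.column_stack([right, left])
coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
```

Each hinge term is zero on one side of the knot, so each coefficient is identified only by the observations on the other side.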
Note, we assume only the Kth variable (i.e., free-flow speed) could influence the coefficients of all independent variables.
We took the natural log of the dependent variable $Y$ (fuel consumption rate) so that the percent change in $Y$ for every unit
increment in $X_i$ can be calculated as $(e^{C_i} - 1) \times 100\%$ (see footnote 4). The variables and reasons to include them in this model are discussed
below:

• S represents average traveling speed on a link. Average speed is an important driving profile characteristic and the most
common variable considered in fuel consumption model literature. Existing studies have used a high-order functional
curve to represent the relationship between fuel consumption rate and average travel speed (Barth and
Boriboonsomsin, 2008; Ahn et al., 2002). We also include average travel speed up to the fourth order as independent
variables.
• a represents average acceleration on a link. Acceleration is defined as the change rate of a vehicle’s velocity with respect
to time and could be positive or negative. This variable is the average of a first-order derivative of vehicle velocity, which
is an indicator for variation of vehicle speed.
• σa represents standard variance of second-by-second acceleration on a link. Acceleration is one major influencing factor
on fuel consumption, but using only the average acceleration, as several previous studies did (Ahn et al., 2002;
Boriboonsomsin and Barth, 2009), can overlook the dynamic driving conditions of vehicles (see footnote 5). Therefore,
we included this variable in our statistical model.
• r represents road grade of a link. There are studies investigating the road grade’s impacts on fuel consumption of vehicles
(Boriboonsomsin and Barth, 2009; Wood et al., 2014). The grade of a link is determined by $(E_e - E_s)/L \times 100\%$, where $E_e$ and $E_s$
are elevations of the ending and starting points of a link trip, and $L$ stands for link length. This determination method is
consistent with existing research studying impacts of road grade on fuel consumption. The above literature also shows a
possible non-linear relationship between fuel consumption and road grade. Therefore, we include linear and quadratic
terms of road grade in our MARS model.
• Lane represents the number of lanes on a road link. Studies have found that excessive lane changing could interrupt traffic
flow and cause extra acceleration and braking events (Ioannou and Stefanovic, 2005; Awal et al., 2015; Atagoziyev et al.,
2016). In addition, other studies showed that narrowed lane width leads to more lane changing behavior (Macbeth, 1998;
Martens et al., 1997). Specifically, Macbeth (1998) reported that on Toronto’s arterial roads, drivers tended to travel faster
and might change lanes to pass slower vehicles, which might adversely influence vehicle fuel efficiency. Therefore, we
included number of lanes as an independent variable in our model. It is worth noting that environmental factors (all
infrastructure features outside of the vehicle itself) have an influence on a vehicle’s fuel consumption. These factors
include lane width, curvature of link, etc. This paper is one of the first papers to consider those environmental factors,
although only the number of lanes variable is considered due to data availability. But the proposed framework facilitates
the process of including other environmental factors in the statistical regression once they become available. This is
also one of the authors’ future research directions.
• FFS represents free-flow speed of a link. No previous studies have considered this variable in their fuel consumption
estimation model. But studies have shown that road characteristics (as categorical variables) influence fuel consumption of
vehicles (Ericsson et al., 2006; Boriboonsomsin et al., 2012). By including FFS as a continuous predicting variable in our model,
we can better understand the quantitative relationship between road characteristics and fuel consumption. In addition,
this variable enables indirectly studying impacts of traffic congestion on fuel consumption. If FFS is increased by 1 and the
average speed remains unchanged, it means the traffic congestion index is increased by $\frac{1}{FFS - S} \times 100\%$. The increased traffic
congestion results in a fuel consumption rate change of $(e^{C_{FFS}} - 1) \times 100\%$.
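
The link-level driving variables described in the list above can be computed from a second-by-second speed trace roughly as follows. This is a sketch under stated assumptions: the function name, argument names, and sample traces are hypothetical, samples are assumed to be exactly one second apart, and the population standard deviation is used for the acceleration spread, which the text does not specify.

```python
import numpy as np

def link_features(speed_mph, elev_start, elev_end, length):
    """Summarize a 1 Hz speed trace (mph) for one link trip.

    `elev_start`, `elev_end`, and `length` must share one length unit.
    """
    v = np.asarray(speed_mph, dtype=float)
    accel = np.diff(v)  # mph/s, because samples are one second apart
    return {
        "S": v.mean(),                                   # average speed
        "a": accel.mean(),                               # average acceleration
        "sigma_a": accel.std(),                          # spread of acceleration (assumed population std)
        "r": (elev_end - elev_start) / length * 100.0,   # grade in percent
    }

# Two made-up traces with the same average speed and average acceleration.
steady = link_features([30, 30, 30, 30, 30], 100.0, 102.0, 200.0)
jerky = link_features([30, 31, 30, 29, 30], 100.0, 102.0, 200.0)
```

The two traces share the same S and a, and only sigma_a separates them, which is the situation that motivates including the acceleration spread as a predictor.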

The above discussion shows that our proposed fuel consumption model not only considers variables that were used in existing
studies, but also includes variables that have not been explored or are less explored before. Particularly, the introduction
of the multivariate adaptive regression spline approach, which fits curves for links within different free-flow speed
clusters, could potentially improve the accuracy of fuel consumption estimation.

4 $\frac{Y_2 - Y_1}{Y_1} = \frac{\exp\left(\sum_{k \neq i} C_k X_k + C_i (X_i + 1)\right)}{\exp\left(\sum_{k \neq i} C_k X_k + C_i X_i\right)} - 1 = \exp(C_i) - 1$
5 Think about two driving profiles, one with a constant acceleration of 0 mph/s, and another with instantaneous acceleration fluctuating between −1 and
1 mph/s but averaging 0 mph/s. When all other driving parameters are held the same, the latter should have higher fuel consumption due to the more
dynamic driving conditions.

4. Data

The proposed statistical model requires input data for defined independent and dependent variables at the road link level.
We prepared the input data based on a vehicle travel trajectory data set in the Philadelphia-Camden-Trenton metropolitan
area. Delaware Valley Regional Planning Commission (DVRPC) obtained the data set in the 2013 household travel survey by
installing global positioning system devices on 728 light-duty vehicles (LDVs) from about 500 households (DVRPC, 2013).
The data is currently hosted in the National Renewable Energy Laboratory’s (NREL’s) Transportation Secure Data Center
(NREL, 2015). The data set contains 6764 vehicle trips with a second-by-second speed trajectory. We divided each trip into
multiple subtrips, where a subtrip is defined as a subset of a trip within the same road link. To ensure representativeness of
travel patterns on subtrips, we excluded subtrips less than 60 s, which resulted in 48,752 subtrips or link trips. The average
link length of all subtrips is 0.6 mile and over 70% of the subtrips are within a 0.2–1.0 mile range. For each subtrip, we cal-
culate the value for each independent variable defined in our proposed model (see Table 1).
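
A minimal sketch of the subtrip step, assuming each trip is already map-matched so that every one-second sample carries a link identifier. The `(link_id, speed_mph)` tuple layout and the function name are hypothetical.

```python
from itertools import groupby

def split_into_subtrips(samples, min_seconds=60):
    """Group consecutive 1 Hz samples by link and drop short subtrips.

    `samples` is a list of (link_id, speed_mph) tuples in time order,
    one per second.
    """
    subtrips = []
    for link_id, group in groupby(samples, key=lambda s: s[0]):
        speeds = [speed for _, speed in group]
        if len(speeds) >= min_seconds:  # exclude subtrips shorter than 60 s
            subtrips.append((link_id, speeds))
    return subtrips

# A made-up trip: 90 s on link A, 20 s on link B, 60 s on link C.
trip = [("A", 25.0)] * 90 + [("B", 30.0)] * 20 + [("C", 40.0)] * 60
kept = split_into_subtrips(trip)
```

Because `groupby` only merges consecutive samples, a vehicle that returns to the same link later in a trip produces a separate subtrip.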
The dependent variable of the proposed model is the fuel consumption rate of each subtrip. The ideal data would be measured
fuel consumption of those subtrips. However, such data are hard to obtain. As a result, we ran the second-by-second
speed trajectory of each subtrip through the CMEM model to obtain fuel consumption at the same time interval. Since
we have the vehicle information for each subtrip, we applied the vehicle categorizing decision tree in the CMEM model to find
the specific vehicle category to be simulated (Barth et al., 1996). Then we weighted the fuel consumption results based on the
fleet mix of the 728 vehicles in the survey (see footnote 6). Therefore, the fuel consumption estimations are results of an averaged LDV reflecting
the LDV fleet mix. The CMEM model has been validated by comparing the model fuel consumption outputs to the Environmental
Protection Agency (EPA) fuel economy test data for a variety of types of light-duty vehicles and has been widely utilized by U.S.
Department of Transportation (DOT)-sponsored projects (Barth et al., 1996; Boriboonsomsin and Barth, 2009; Boriboonsomsin
et al., 2012). The approach of using a simulation model to estimate fuel consumption to be used in a mesoscopic model was also
adopted in other studies, such as Barth and Boriboonsomsin (2008). Based on the simulated fuel consumption, we calculated the
fuel consumption rate (gallons per 100 miles) for each subtrip and used those values as inputs of dependent variable.
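
The aggregation from second-by-second output to a subtrip-level rate can be sketched as below. The function names, the per-second fuel values, the category labels, and the fleet shares are all invented for illustration; they are not CMEM outputs or CMEM vehicle categories.

```python
def fuel_rate_gal_per_100mi(fuel_gal_per_s, speed_mph):
    """Aggregate 1 Hz fuel use and speed into gallons per 100 miles."""
    total_fuel = sum(fuel_gal_per_s)                   # gallons
    total_miles = sum(v / 3600.0 for v in speed_mph)   # mph over 1 s = miles / 3600
    if total_miles == 0:
        raise ValueError("subtrip covers no distance")
    return total_fuel / total_miles * 100.0

def fleet_weighted_rate(rates_by_category, fleet_share):
    """Weight per-category rates by fleet shares that sum to 1."""
    return sum(rates_by_category[c] * fleet_share[c] for c in fleet_share)

# Made-up numbers: 60 s at 30 mph burning 0.0004 gal/s.
rate = fuel_rate_gal_per_100mi([0.0004] * 60, [30.0] * 60)
mixed = fleet_weighted_rate({"car": 4.0, "truck": 6.0}, {"car": 0.75, "truck": 0.25})
```

In this example the subtrip covers 0.5 mile and burns 0.024 gallon, which corresponds to 4.8 gallons per 100 miles.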
Before running our fuel consumption model, we present some summary statistics of our input data. Fig. 2 is a heat plot
showing the fuel consumption rate (gallons per 100 miles) based on different free-flow speeds and average travel speeds for
all link trips. The speeds are grouped at 2.5-mph intervals, and the color of each "block" shows the average fuel consumption
rate of all link trips with free-flow and average speeds falling in that range. The patterns show that for a trip traveling on
a road with a free-flow speed of less than 55 mph, the fuel consumption rate keeps decreasing as the actual average speed
increases and reaches the minimum rate when the average speed is close to the free-flow speed. But this does not hold true
for situations when the FFS of roads is greater than 55 mph. In these cases, the average fuel consumption rate initially
decreases as the average speed increases, reaching the minimum when the average speed is between 50 and 55 mph and
then gradually increasing with further increases in average speed. The general trend of fuel consumption rate with respect
to changes in average speed is aligned with those from other studies (Barth and Boriboonsomsin, 2008; Boriboonsomsin
et al., 2012). But looking horizontally across Fig. 2, fuel consumption patterns differ at various free-flow
speeds, which indicates heterogeneity of fuel consumption rates in terms of free-flow speeds of links. For example, when
the average speed is fixed at 40 mph, the simulated fuel consumption rate is different depending on the free-flow speed of
traveling links and reaches the minimum when the link’s free-flow speed is 40 mph. This observation supports inclusion of
free-flow speed as an independent variable in our prediction model.
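
The grid behind such a heat plot can be built by assigning each link trip to a 2.5-mph cell in both speed dimensions and averaging the rates per cell. The sketch below assumes plain Python lists as inputs; the function name and the three sample link trips are made up.

```python
import numpy as np

def binned_mean(ffs, speed, rate, width=2.5):
    """Average `rate` over cells of `width` mph in free-flow and average speed."""
    ffs_bin = np.floor(np.asarray(ffs) / width).astype(int)
    spd_bin = np.floor(np.asarray(speed) / width).astype(int)
    cells = {}
    for i, j, r in zip(ffs_bin, spd_bin, rate):
        cells.setdefault((int(i), int(j)), []).append(r)
    # Key each cell by its lower bin edges in mph.
    return {(i * width, j * width): float(np.mean(v)) for (i, j), v in cells.items()}

# Three made-up link trips: (free-flow speed, average speed, fuel rate).
grid = binned_mean(ffs=[40.0, 41.0, 57.0], speed=[31.0, 32.0, 52.0], rate=[5.0, 4.0, 3.0])
```

Reading the resulting cells along one average-speed bin across free-flow speed bins corresponds to the comparison at a fixed average speed described above.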
As we mentioned earlier, the free-flow speed variable is indirectly related to the traffic congestion level, and it is
important to consider traffic congestion in predicting fuel consumption. Fig. 3 shows the average congestion level and cumulative
vehicle miles traveled (VMT) of trips on links grouped into 5-mph free-flow speed bins. The data set is the same one used to
generate Fig. 2, but here we focus on the traffic congestion of links. The congestion index is defined as $\mathrm{FFS}/S - 1$,
where $S$ is the average speed. A congestion index of 30%
(drivers need to spend 30% more time traveling on a link compared with no-traffic, free-flow driving conditions) can already
be considered severe congestion. About 65% of VMT is traveled on freeway or highway roads (FFS > 55 mph), which
have a congestion index lower than 30%. However, the remaining 35% of VMT is traveled on links with a congestion index greater
than 30%, some even greater than 100%. Therefore, ignoring the congestion level might lead to less accurate fuel
consumption estimates for 35% of VMT.
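
As a small illustration of the congestion index defined above, the following sketch computes FFS/S - 1 for a single link trip. The function names and the choice to treat 30% as an inclusive threshold are assumptions made here for illustration; they are not taken from the study's implementation.

```python
def congestion_index(ffs_mph: float, avg_speed_mph: float) -> float:
    """Return FFS / S - 1, the fractional extra travel time relative to free flow."""
    if ffs_mph <= 0 or avg_speed_mph <= 0:
        raise ValueError("speeds must be positive")
    return ffs_mph / avg_speed_mph - 1.0


def is_severely_congested(ffs_mph: float, avg_speed_mph: float,
                          threshold: float = 0.30) -> bool:
    """Flag a link trip at or above the 30% level discussed in the text.

    Assumption: the threshold is inclusive; the text does not specify this.
    """
    return congestion_index(ffs_mph, avg_speed_mph) >= threshold


# Hypothetical link trip: FFS of 40 mph traversed at an average of 20 mph.
# The trip takes twice the free-flow time, so the index is 40 / 20 - 1 = 1.0.
print(congestion_index(40.0, 20.0))
```

An index of 1.0 corresponds to the "greater than 100%" end of the range mentioned above, where travel takes at least twice as long as under free-flow conditions.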

6
Note that we tried to match each of the 728 light-duty vehicles to one vehicle category in the CMEM model database using a CMEM categorizing decision
tree. We did not consider electric vehicles because the focus of this study is fuel consumption and there were only two electric vehicles in the 728-vehicle fleet.

Table 1
Independent variable definitions.

Variable    Definition
x1  S       Average speed (mph)
x2  a       Acceleration (mph/s), the rate of change of a vehicle’s velocity; can be positive or negative
x3  r       Road grade (%), the ratio of “rise” (vertical distance) to “run” (horizontal distance)
x4  Lane    Number of lanes on a road link
x5  FFS     Free-flow speed (mph)
x6  ra      Variance of acceleration

Fig. 2. Free-flow speed, average speed, and average fuel consumption rate (gallons per 100 miles) of observed link trips.

Fig. 3. Congestion index and VMT distribution versus free-flow speed of trips.

5. Results

5.1. Model estimate results

We ran the proposed fuel consumption estimation model using the prepared inputs discussed in Section 4. Specifically,
we randomly selected 80% of the observed data as the training data set to develop predicting coefficients (in Section 5.1) and
used the remaining 20% of the data for cross validation (in Section 5.3). Table 2 presents estimates of the coefficients for the
independent variables based on the training data set with about 37,400 observations. The multivariate adaptive regression spline
approach optimally determined five link free-flow speed clusters and fitted one regression curve for links within each cluster
(see Model 1 to Model 5 in Table 2). Interestingly, the five clusters’ free-flow speed ranges are aligned with those of the five
major road categories, which can be seen as evidence of the validity of the clustering approach. In Table 2, only
coefficients that are statistically significant at the 5% level are shown, and the results indicate that the majority of coefficients
are significant at the 0.1% level. Mathematically, the coefficient $c_i$ of a variable $x_i$ means that, holding all other variables
unchanged, every unit increment (or decrement) in $x_i$ increases (or decreases) the predicted variable (i.e., fuel consumption
rate in this study) by $|e^{c_i} - 1| \times 100\%$ (approximately equal to $c_i \times 100\%$ when $c_i$ is close to 0).
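
The percentage interpretation of a coefficient can be checked numerically. The sketch below assumes, as the formula above implies, that the models are linear in the logarithm of the fuel consumption rate; the function name is hypothetical, and the value 0.019 is the magnitude the text reports for acceleration in Model 4.

```python
import math


def percent_effect(coef: float) -> float:
    """Percent change in the predicted rate for a one-unit increase in a variable,
    assuming the response enters the model in logarithmic form."""
    return (math.exp(coef) - 1.0) * 100.0


# Magnitude reported in the text for acceleration in Model 4 (1.9% per mph/s).
c_accel = 0.019
exact = percent_effect(c_accel)      # (e^c - 1) * 100
approx = c_accel * 100.0             # linear approximation, valid for small c
print(f"exact: {exact:.2f}%  linear approximation: {approx:.2f}%")
```

For coefficients this small the two values agree to within a few hundredths of a percentage point, which is why the linear reading is a reasonable shorthand; for the larger road grade coefficients the exact form should be used.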
Model 1 corresponds to estimations on road links where the FFS is below 16 mph. The road links in this category can be
classified as “local collector” roads, which are mainly in residential areas. Results show that, of all variables, only road grade and
the linear and square terms of average speed are statistically significant at the 5% level. Overall, the signs of the coefficients and
the trends of fuel consumption are consistent with results from several previous studies (Barth and Boriboonsomsin, 2008;
Ericsson et al., 2006; Boriboonsomsin et al., 2012), although the coefficient values differ. The coefficients indicate that,
when traveling on links with FFS below 16 mph, the fuel consumption rate decreases monotonically as average speed
increases. They also show that for every 1% increase in road grade, the fuel consumption rate will increase by 7.2%
($e^{0.069} - 1$). This is aligned with results of previous studies (Boriboonsomsin and Barth, 2009; Wood et al., 2014;
Fernandez and Long, 1995).
Model 2 corresponds to estimations on road links with FFSs between 16 and 34 mph. These are collector roads that serve
to move traffic from local or residential streets to arterial roads. Seven variables are statistically significant, i.e., the intercept,
the first- to third-order terms of average speed, FFS, the interaction of FFS and speed, and road grade. The coefficients show that,
holding other variables unchanged, increasing average speed results in a decreased fuel consumption rate. Within this free-flow
speed cluster, a 1% increase in road grade leads to an 11% increase in fuel consumption rate, which is a larger effect than
that of the 1–16 mph cluster. In fact, the same 1% increase in road grade brings an even larger fuel impact in clusters with
higher free-flow speeds, such as a 16% increase in Model 4 and a 23% increase in Model 5. This increasing trend of fuel impact
for a 1% road grade increase was also
observed in other studies (Boriboonsomsin and Barth, 2009; Wood et al., 2014) with a similar order of magnitude in value. In
addition, the coefficient of FFS means that, for every 1 mph increase in FFS (keeping the average speed unchanged, this
means traffic is more congested), the fuel consumption rate will increase by 2.8%. This impact diminishes in
higher free-flow speed clusters, such as the 1.4% increase in Model 4 and the 1.2% increase in Model 5.
Models 3, 4, and 5 correspond to the free-flow speed clusters (34, 44), (44, 56), and (56+), respectively. The signs and
magnitudes of Model 3’s coefficients are similar to those of Model 2, with the only exception being that S⁴ is a significant
independent variable. Model 4 has all
variables in Model 3 plus one additional variable, a (acceleration). It shows that for every 1 mph/s increase in
acceleration, fuel consumption increases by 1.9%. This result is consistent with that of an empirical study by
the EPA (Jones, 1980). Model 5’s coefficient for variable a implies a 2.2% increase, which is slightly larger than that of Model 4.
Vehicles usually travel at higher speeds on links with higher free-flow speeds, such as those in the free-flow speed cluster for
Model 5 compared with
Model 4. Because aerodynamic drag (accounting for 50%–70% of vehicle drag force at high speed) is proportional to the square of
travel speed, the energy needed to achieve the same acceleration is greater for vehicles traveling at higher speeds, as
in Model 5. Thus, it is reasonable that acceleration has a larger impact on fuel consumption in Model 5 than in
Model 4.
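
In an eco-routing application, each link must first be routed to the model of its free-flow speed cluster before a prediction is made. The sketch below shows one way to do this using the cluster boundaries listed in Table 2. The names `FFS_BREAKPOINTS`, `MODEL_NAMES`, and `select_model` are hypothetical, and the half-open interval convention is an assumption, since the paper does not state how links exactly on a boundary are assigned.

```python
from bisect import bisect_right

# FFS boundaries (mph) between the five clusters, as listed in Table 2.
FFS_BREAKPOINTS = [16.0, 34.0, 44.0, 56.0]
MODEL_NAMES = ["Model 1", "Model 2", "Model 3", "Model 4", "Model 5"]


def select_model(ffs_mph: float) -> str:
    """Return the name of the cluster model covering the given free-flow speed.

    Assumption: intervals are half-open, [low, high), so a link whose FFS is
    exactly 16 mph is assigned to Model 2.
    """
    if ffs_mph < 0:
        raise ValueError("free-flow speed must be non-negative")
    # bisect_right counts how many breakpoints are <= ffs_mph,
    # which is exactly the zero-based index of the cluster.
    return MODEL_NAMES[bisect_right(FFS_BREAKPOINTS, ffs_mph)]


for ffs in (10.0, 30.0, 40.0, 50.0, 65.0):
    print(ffs, "->", select_model(ffs))
```

A lookup of this kind is cheap compared with simulating a second-by-second trajectory, which is consistent with the stated goal of rapid route-level estimation.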

5.2. Multicollinearity diagnostics

Multicollinearity is a major concern in regression analysis. It is a phenomenon in which two or more
independent variables in a regression model are highly correlated. Strong correlation between independent variables can
invalidate the estimation results for an individual predictor, even though it does not reduce the predictive power or reliability
of the model as a whole (Beasley et al., 1980; Breusch and Pagan, 1979). We conducted a multicollinearity test for the variables
in our models. When multicollinearity exists, the variance of the estimated coefficients is inflated. The variance inflation
factor (VIF) is a statistic that quantifies how much the variance is inflated and identifies the severity of multicollinearity
in regression analysis (Kutner et al., 2005). Table 3 shows the VIF test statistics for all predicting variables in our five
models. For each variable, a VIF greater than 5 means that multicollinearity exists between that variable and other
variables in the model (O’Brien, 2007). We found that most of the VIFs are smaller than 5, except those
of the variables S, S², S³ and S⁴. We expected those VIFs to be large because S², S³ and S⁴ are simply higher
powers of variable S. According to Kutner et al. (2005), however, this is not a cause for concern because the
“p-value for the variables is not affected by the multicollinearity, which could be simply demonstrated by reducing the
correlations by centering the variables before creating higher-power variables, and the p-values for all variables will not be
influenced.” As S², S³ and S⁴ are added to the models, their VIFs with other variables remain around 1, which means those
variables are not closely correlated with any of the S², S³ and S⁴ variables. Thus, we conclude that the models are free of
multicollinearity.
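
The VIF calculation and the centering argument can be illustrated with a short sketch. It uses synthetic data generated here (not the study's data), computes each VIF as 1/(1 - R²) from regressing one predictor on the others, and compares a raw S and S² pair with a centered version. The function name `vif` and all variable names are assumptions made for this example.

```python
import numpy as np


def vif(X: np.ndarray) -> np.ndarray:
    """VIF of each column of X: 1 / (1 - R^2) from regressing it on the other columns."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Regress column j on an intercept plus all remaining columns.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1.0 - resid.var() / y.var()
        out[j] = 1.0 / (1.0 - r2)
    return out


rng = np.random.default_rng(0)            # synthetic data, not the paper's
speed = rng.uniform(5.0, 70.0, 2000)      # stand-in for average speed S (mph)
grade = rng.normal(0.0, 2.0, 2000)        # stand-in for road grade r (%)

raw = np.column_stack([speed, speed**2, grade])
centered_speed = speed - speed.mean()
centered = np.column_stack([centered_speed, centered_speed**2, grade])

print("raw VIFs [S, S^2, r]:     ", np.round(vif(raw), 2))
print("centered VIFs [S, S^2, r]:", np.round(vif(centered), 2))
```

In this synthetic setting the raw speed terms show inflated VIFs while the unrelated grade variable stays near 1, and centering speed before squaring removes most of the inflation, which mirrors the reasoning attributed to Kutner et al. (2005) above.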

Table 2
Model estimation results.

Model 1 Model 2 Model 3 Model 4 Model 5
FFS (0, 16) FFS (16, 34) FFS (34, 44) FFS (44, 56) FFS (56+)
Intercept 1.7E+0** 2.2E+0*** 2.6E+0*** 1.6E+0*** 1.8E+0***
S 1.2E-1** 2.3E-1*** 4.5E-1*** 4.1E-1*** 2.5E-1***
S² 5.6E-3* 2.3E-3*** 3.3E-3*** 3.2E-3*** 7.8E-3***
S³ 3.2E-5*** 7.1E-5*** 6.3E-5*** 6.4E-5***
S⁴ 6.6E-7*** 7.5E-7*** 7.3E-7***
FFS 2.4E-2*** 1.9E-2*** 1.4E-2*** 1.2E-2***
r 6.3E-2*** 1.1E-1*** 1.6E-1*** 2.1E-1*** 2.9E-1***
r² 2.6E-2** 5.2E-2*** 6.6E-2**
a 1.4E-2** 1.9E-2** 2.2E-2***
Lane 1.3E-2**
ra 1.9E-3*** 2.3E-3** 2.4E-3** 2.9E-3* 3.5E-3*
Adj. R² 0.87 0.85 0.82 0.85 0.88
# Obs. 4136 12,447 9928 6692 15,549
* Significant at the 5% level.
** Significant at the 1% level.
*** Significant at the 0.1% level.

Table 3
Variance inflation factor for variables in Model 1 to Model 5.

Model 1 Model 2 Model 3 Model 4 Model 5
S 3.88 4.85 14.15 15.33 8.02
S² 3.88 19.54 35.12 36.87 21.77
S³ 10.19 69.25 71.91 23.89
S⁴ 32.77 33.84 19.10
FFS 1.03 1.05 1.05 1.05
r 2.15 2.38 2.94 3.86 2.75
r² 12.89 14.85 9.22
a 1.56 1.43 1.24
ra 2.76 1.79 1.84 1.53 1.41
Lane 1.14

5.3. Comparison with other studies

We compare the prediction performance of our model with that of similar studies or models. Specifically, several mesoscopic
and macroscopic models were compared. Microscopic models were not considered in the comparison because they are the
bases of mesoscopic models and their estimates are seen as the most accurate; however, the second-by-second
speed trajectories they require are time-consuming to generate, which makes microscopic models unsuitable for use in a prompt
eco-routing decision system. Specifically, the models we compare are the lookup table models for EMFAC and MOVES, the one in Barth
and Boriboonsomsin (2008), and MOVES (drive profile/VSP method). MOVES and EMFAC are widely used emission inventory
models. They provide lookup tables that map average vehicle travel speed to an estimated fuel consumption
rate. The average speeds are usually classified into bins with 5-mph intervals, and the fuel consumption estimates are
harmonized for each vehicle type, e.g., gasoline passenger car. The model developed by Barth and Boriboonsomsin (2008)
estimates average fuel consumption for a representative set of light-duty vehicles recorded on road.⁷ Basically, all of the
compared models have a similar vehicle set and none of them considers road grade as a predictor of fuel consumption.
Table 4 summarizes the comparison results.
We adopted the Mean Absolute Percentage Error (MAPE) as the comparison metric. This metric is widely used in previous
studies to validate models. Let $N$ be the prediction sample size, and let $E_n$ and $E'_n$ be the true and predicted values
of the $n$th observation.
MAPE is calculated as the average absolute deviation of the predicted value from the true value, relative to the true value,
over the whole sample, as shown in Eq. (2). The larger this value, the less accurately a model predicts the true value.

\[
\mathrm{MAPE} = \frac{1}{N} \sum_{n=1}^{N} \left| \frac{E'_n - E_n}{E_n} \right| \tag{2}
\]
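
The MAPE definition translates directly into code. The sketch below is a minimal implementation under the assumption that all true values are nonzero; the function name `mape` and the sample numbers are illustrative only and are not results from the study.

```python
from typing import Sequence


def mape(true_values: Sequence[float], predicted_values: Sequence[float]) -> float:
    """Mean absolute percentage error, returned as a fraction (0.10 means 10%)."""
    if len(true_values) != len(predicted_values):
        raise ValueError("inputs must have the same length")
    if len(true_values) == 0:
        raise ValueError("inputs must not be empty")
    total = 0.0
    for e_true, e_pred in zip(true_values, predicted_values):
        if e_true == 0:
            raise ValueError("true values must be nonzero")
        total += abs((e_pred - e_true) / e_true)
    return total / len(true_values)


# Hypothetical fuel consumption rates (gallons per 100 miles) for two link trips.
true_rates = [4.0, 5.0]
predicted_rates = [4.4, 4.5]   # one 10% overestimate, one 10% underestimate
print(f"MAPE = {mape(true_rates, predicted_rates):.1%}")
```

Because the absolute value is taken before averaging, over- and underestimates do not cancel, so the metric reflects typical error size rather than bias.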

As mentioned in Section 5.1, 80% of the 48,752 link trips were used to develop the regression models, and the remaining
20% were treated as a cross-validation data set and applied to the fitted regression models to generate predicted values.
The predicted fuel consumption rates were compared with true values⁸ to calculate prediction performance statistics. The
MAPEs of our models are reported by free-flow speed cluster. The results show that the average absolute deviation
of fitted values from true values is 9.3% over all clusters combined. Another interesting observation is that MAPE is lowest
for the two clusters with the highest free-flow speeds (9.1% and 6.1%) and reaches its minimum on highways or freeways. This is
reasonable because vehicles usually
drive more smoothly on high free-flow speed roads, which makes the predicting parameters more stable.

Table 4
Model accuracy comparison between proposed model and other models.

Model                                                       MAPE
Our Model             FFS (1, 16)                           11.8%
                      FFS (16, 34)                          12.1%
                      FFS (34, 44)                          12.8%
                      FFS (44, 56)                          9.1%
                      FFS (56+)                             6.1%
                      All                                   9.3%
Macroscopic Models    MOVES-Lookup Table                    42.1%
                      EMFAC-Lookup Table                    47.3%
Mesoscopic Model      MOVES-Link Drive Schedule             26.3%
                      Barth and Boriboonsomsin (2008)       21.2%

7
Some preliminary comparisons show that the vehicle set in Barth and Boriboonsomsin (2008) is similar to the one in our data, which should also
be consistent with the real-world-based vehicle sets of EMFAC and MOVES.
8
Note that, in this study, the true value is the fuel consumption calculated by simulating a second-by-second speed trajectory on a microscopic model (CMEM).
As we discussed, the CMEM model is proven to achieve reasonable accuracy and is widely utilized by U.S. DOT-sponsored projects (Barth and Boriboonsomsin,
2008; Barth et al., 1996; Boriboonsomsin and Barth, 2009; Boriboonsomsin et al., 2012). Thus, it is reasonable to use CMEM-simulated results as true fuel
consumption.

For the “MOVES-Lookup Table” model, we estimated fuel consumption rates based on the “fuel-average speed” lookup
table in MOVES. The average speed of each link trip in the cross-validation data set is used to generate estimates, which were
compared with true values. A similar approach was used for the “EMFAC” model, except that the “fuel-average speed” lookup
table is from California’s EMFAC model. The results show that the MAPEs of those macroscopic models are 42% and 47%,
respectively, which are larger (less accurate) than that of our proposed model.
The cross-validation data set was also applied to the regression model setup proposed in Barth and Boriboonsomsin
(2008),⁹ and the estimated results were compared with true values to calculate MAPE. The MAPE of our model (9.3%) is smaller than
that of Barth and Boriboonsomsin’s model, which is 21.2%. This could be attributed to the more complex setup of our
model as well as the free-flow speed cluster approach. However, as noted in Footnote 9, the difference might also be
due to differences in the fleet mix of the driving data behind the two models. Thus, the comparison results should be used cautiously.
For the “MOVES-Link Drive Schedule” approach, we first calculated the Operating Mode at each second of vehicle driving
based on instantaneous speed and Vehicle Specific Power (VSP¹⁰) on each link trip. We then generated the Operating
Mode Distribution and applied the hourly emission/fuel consumption rates to it to obtain the
fuel consumption for each link trip. Finally, we compared the true values with the calculated values to obtain the MAPE.
The MAPE of the “MOVES-Link Drive Schedule” approach is 26.3%, which is higher (less accurate) than that of either our model
or the model developed in Barth and Boriboonsomsin (2008).

6. Conclusion

Providing guidance and information to drivers to achieve better fuel efficiency remains an important and effective
approach in the near term for reducing energy consumption and GHG emissions from the transportation sector. In this paper,
we proposed a fuel consumption estimation framework, which utilizes a multivariate adaptive regression spline (MARS) approach,
to be implemented into an eco-routing system. Our model adds several new features to a growing body of environmentally
sensitive routing model literature. First, we developed a framework showing how to utilize large-scale real-world driving
data to establish a mesoscopic fuel estimation model. Second, we clustered road links by their free-flow speed and applied
the MARS algorithm to optimally fit one regression curve for each cluster to achieve higher prediction accuracy. Last
but not least, we included several predicting variables rarely or never considered before, such as number of lanes and road
grade, to fully explore determinants of vehicle fuel consumption.
A large-scale real-world vehicle speed trajectory data set from the Philadelphia area was applied to the methodology
framework we proposed. The estimated coefficients revealed the relationships between the independent variables and vehicle
fuel consumption. In addition, statistical validation processes were conducted to ensure the validity of the model and the results.
Compared with similar macroscopic and mesoscopic fuel estimation models, our model achieves higher accuracy due
to the added features in the model, which shows its potential for implementation into an eco-routing system for rapidly
estimating the fuel consumption of routes.
It is worth noting that the regression analysis was based on vehicle driving trajectories in the Philadelphia-Camden-
Trenton metropolitan area. It would be interesting to re-examine the relationships using driving trajectory data covering
a wider geographical region or an area elsewhere in the nation.

9
Note that the model in Barth and Boriboonsomsin (2008) was based on data from a representative LDV fleet mix in Southern
California, whereas our proposed model was based on a representative LDV fleet mix in the Philadelphia area. The differences in results between these two studies
might be due to differences in fleet mix. However, there are no available data to evaluate the fleet mix in Barth and Boriboonsomsin (2008). Thus, the differences in results
should be interpreted with caution and treated only as a reference.
10
VSP is a vehicle load metric representing the sum of loads from aerodynamic drag, acceleration, rolling resistance, etc. It was first developed by Jimenez
(1998).

Acknowledgment

This work was supported by the U.S. Department of Energy under Contract No. DE-AC36-08GO28308 with the National
Renewable Energy Laboratory through the SMART Mobility laboratory consortium. The views and opinions expressed in this
paper are those of the authors alone. The authors also would like to thank the anonymous reviewers for their comments and
suggestions.

References

Ahn, K., Rakha, H., Trani, A., Van Aerde, M., 2002. Estimating vehicle fuel consumption and emissions based on instantaneous speed and acceleration levels. J.
Transport. Eng. 128 (2), 182–190.
Asadi, B., Vahidi, A., 2011. Predictive cruise control: utilizing upcoming traffic signal information for improving fuel economy and reducing trip time. IEEE
Trans. Control Syst. Technol. 19 (3), 707–714.
Atagoziyev, M., Schmidt, K.W., Schmidt, E.G., 2016. Lane change scheduling for autonomous vehicles. In: 14th IFAC Symposium on Control in Transportation
Systems. http://dx.doi.org/10.1016/j.ifacol.2016.07.011 (accessed 10 June. 2017).
Awal, T., Murshed, M., Ali, M., 2015. An efficient cooperative lane-changing algorithm for sensor- and communication-enabled automated vehicles. In: IEEE
Intelligent Vehicles Symposium (IV). http://dx.doi.org/10.1109/IVS.2015.7225900 (accessed 1 June. 2017).
Barkenbus, J.B., 2010. Eco-driving: an overlooked climate change initiative. Energy Policy 38 (2), 762–769.
Barth, M., Boriboonsomsin, K., 2008. Real-world carbon dioxide impacts of traffic congestion. Transport. Res. Rec.: J. Transport. Res. Board 2058, 163–171.
Barth, M., An, F., Norbeck, J., Ross, M., 1996. Modal emissions modeling: a physical approach. Transport. Res. Rec.: J. Transport. Res. Board 1520, 81–88.
Beasley, D.B., Huggins, L.F., Monke, E.J., 1980. ANSWERS: a model for watershed planning. Trans. Am. Soc. Agric. Eng. 23, 938–944.
Boriboonsomsin, K., Barth, M., 2009. Impacts of road grade on fuel consumption and carbon dioxide emissions evidenced by use of advanced navigation
systems. Transport. Res. Rec.: J. Transport. Res. Board 2139, 21–30.
Boriboonsomsin, K., Barth, M., Zhu, W., Vu, A., 2012. Eco-routing navigation system based on multisource historical and real-time traffic information. IEEE
Trans. Intell. Transp. Syst. 13 (4).
Breusch, T.S., Pagan, A.R., 1979. A simple test for heteroscedasticity and random coefficient variation. Econometrica 47 (5), 1287–1294.
Brooker, A., Gonder, J., Wang, L., Wood, E., Lopp, S., Ramroth, L., 2015. FASTSim: A Model to Estimate Vehicle Efficiency, Cost and Performance. SAE Technical
Paper 2015-01-0973. http://dx.doi.org/10.4271/2015-01-0973.
Buja, A., Hastie, T.J., Tibshirani, R., 1989. Linear smoothers and additive models. Ann. Stat. 17, 453–555.
Chen, Y., Borken-Kleefeld, J., 2016. NOx emissions from diesel passenger cars worsen with age. Environ. Sci. Technol. 50 (7), 3327–3332.
Chen, Y., Fan, Y., 2013. Transportation fuel portfolio design under evolving technology and regulation: a California case study. Transport. Res. Part D:
Transport Environ. 24, 76–82.
Chen, Y., Fan, Y., 2014. Coping with technology uncertainty in transportation fuel portfolio design. Transport. Res. Part D: Transport Environ. 32, 354–361.
Chen, Y., Hu, K., Zhao, J., Li, G., Johnson, J., Zietsman, J., 2017a. In-use energy and CO2 emissions impact of a plug-in hybrid and battery electric vehicle based
on real-world driving. Int. J. Environ. Sci. Technol. http://dx.doi.org/10.1007/s13762-017-1458-0 (in press).
Chen, Y., Meier, A., 2016. Fuel consumption impacts of auto roof racks. Energy Policy 92, 325–333.
Chen, Y., Zhang, Y., Fan, Y., Hu, K., Zhao, J., 2017b. A dynamic programming approach for modeling low-carbon fuel technology adoption considering
learning-by-doing effect. Appl. Energy 185 (1), 825–835.
Delaware Valley Regional Planning Commission, 2013. 2012-2013 Household Travel Survey for the Delaware Valley Region. No. 14033.
<http://www.dvrpc.org/Products/14033/>.
Ericsson, E., Larsson, H., Brundell-Freij, K., 2006. Optimizing route choice for lowest fuel consumption – potential effects of a new driver support tool. Transp.
Res. Part C 14, 369–383.
EIA, 2016. Annual Energy Outlook 2016. U.S. Department of Energy.
EPA, 2003. User’s Guide to MOBILE6.1 and MOBILE6.2: Mobile Source Emission Factor Model. United States Environmental Protection Agency, Technical
Report, EPA420-R-03-010.
EPA, 2010. Motor Vehicles Emission Simulator (MOVES) 2010 User Guide. United States Environmental Protection Agency, Technical Report, EPA-420-B-09-
041.
Fernandez, P.C., Long, J.R., 1995. Grades and other load effects on On-Road Emissions: An on-board Analyzer Study. In Fifth CRC On-Road Vehicle Emission
Workshop, San Diego, California, Coordinating Research Council, Alpharetta, Ga., 1995.
Friedman, J.H., 1991. Multivariate adaptive regression splines. Ann. Stat. 19 (1), 123–141.
Hu, K., Chen, Y., 2016. Technological growth of fuel efficiency in European automobile market 1975–2015. Energy Policy 98, 142–148.
Ioannou, P.A., Stefanovic, M., 2005. Evaluation of ACC vehicles in mixed traffic: lane change effects and sensitivity analysis. IEEE Trans. Intell. Transport. Syst.
6 (1), 79–89.
Jiang, H., Dai, E., Gao, W., Zhang, J.J., Zhang, Y., Muljadi, E., 2016a. Spatial-temporal synchrophasor data characterization and analytics in smart grid fault
detection, identification and impact causal analysis. IEEE Trans. Smart Grid 7 (5), 2525–2536.
Jiang, H., Zhang, Y., Zhang, J.J., Gao, D.W., Muljadi, E., 2015. Synchrophasor-based auxiliary controller to enhance the voltage stability of a distribution system
with high renewable energy penetration. IEEE Trans. Smart Grid 6 (4), 2107–2115.
Jiang, H., Zhang, J.J., Gao, W., Wu, Z., 2014. Fault detection, identification, and location in smart grid based on data-driven computational methods. IEEE
Trans. Smart Grid 5 (6), 2947–2956.
Jiang, H., Zhang, Y., Muljadi, E., Zhang, Y., Gao, W., 2016b. A short-term and high-resolution distribution system load forecasting approach using support
vector regression with hybrid parameters optimization. IEEE Trans. Smart Grid 99.
Jimenez, J.L., 1998. Understanding and Quantifying Motor Vehicle Emissions with Vehicle Specific Power and TILDAS Remote-Sensing Ph.D. thesis.
Massachusetts Institute of Technology, Cambridge, Massachusetts.
Jones, R., 1980. Quantitative Effects of Acceleration Rate on Fuel Consumption Technical Report. Environmental Protection Agency.
Kim, S., Coifman, B., 2014. Comparing INRIX speed data against concurrent loop detector stations over several months. Transport. Res. Part C: Emerg.
Technol. 49, 559–572.
Kutner, M.H., Nachtsheim, C., Neter, J., Li, W., 2005. Applied Linear Statistical Models. McGraw-Hill Irwin Publishers, pp. 434–435.
Li, W., Wu, G., Zhang, Y., Barth, M., 2017. A comparative study on data segregation for mesoscopic energy modeling. Transport. Res. Part D: Transport
Environ. 50, 70–82.
Macbeth, A.G., 1998. Calming Arterials in Toronto. In: 68th Annual Meeting of the Institute of Transportation Engineers, Toronto, Ontario, Canada.
Manzie, C., Watson, H., Halgamuge, S., 2007. Fuel economy improvements for urban driving: hybrid vs. intelligent vehicles. Transport. Res. Part C 15, 1–16.
Martens, M., Comte, S., Kaptein, N., 1997. The Effects of Road Design on Speed Behaviour: A Literature Review. TNO Report, RO-96-SC.202.
<https://pdfs.semanticscholar.org/f5b9/c77cbecb12b6ac750a2ec106456c09a33321.pdf> (accessed on 10 May 2017).
McCune, B., 2006. Nonparametric multiplicative regression for habitat modeling. J. Veget. Sci. 17, 819–830.
Mo, B., Li, R., Zhan, X., 2017. Speed profile estimation using license plate recognition data. Transport. Res. Part C: Emerg. Technol. 82, 358–378.
Morrison, G.M., Chen, Y., 2011. How will changes in the ethanol market affect California’s Low Carbon Fuel Standard? Transport. Res. Rec.: J. Transport. Res.
Board 2252, 16–22.
Nagel, K., Scheicher, A., 1994. Microscopic traffic modelling on parallel high performance computers. Parallel Comput. 20, 125–146.
National Renewable Energy Laboratory, 2015. Transportation Secure Data Center. National Renewable Energy Laboratory. <www.nrel.gov/tsdc> (accessed
January 15, 2015).
Nie, Y.M., Li, Q., 2013. An eco-routing model considering microscopic vehicle operating conditions. Transport. Res. Part B: Methodol. 55, 154–170.
O’Brien, R.M., 2007. A caution regarding rules of thumb for variance inflation factors. Qual. Quant. 41, 673–690.
Oduro, S.D., Metia, S., Duc, H., Hong, G., Ha, Q.P., 2015. Multivariate adaptive regression splines models for vehicular emission prediction. Visual. Eng. 3 (13).
Onoda, T., 2009. IEA policies—G8 recommendations and an afterwards. Energy Policy 37 (10), 3823–3831.
Penic, M.A., Upchurch, J., 1992. Transyt-7f: enhancement for fuel consumption, pollution emissions, and user costs. Transp. Res. Rec. 1360.
Rakha, H., Ahn, K., Trani, A., 2004. Development of VT-micro model for estimating hot stabilized light duty vehicle and truck emissions. Transport. Res. Part
D: Transport Environ. 91, 49–74.
Rakha, H., Ahn, K., Moran, K., 2012. Integration framework for modeling eco-routing strategies: logic and preliminary results. Int. J. Transport. Sci. Technol.
13, 259–274.
Rakha, H., Yue, H., Dion, F., 2011. VT-Meso model framework for estimating hot-stabilized light-duty vehicle fuel consumption and emission rates. Can. J.
Civ. Eng. 38 (11), 1274–1286.
Silva, A., 2014. Estimating Fuel Consumption from GPS Data. Dissertation for Master in Information Engineering. University of Porto.
Sugawara, S., Niemeier, D., 2002. How much can vehicle emissions be reduced?: Exploratory analysis of an upper boundary using an emissions-optimized
trip assignment. Transport. Res. Rec.: J. Transport. Res. Board 1815, 29–37.
Ubiergo, G.A., Jin, W., 2016. Mobility and environment improvement of signalized networks through vehicle-to-infrastructure communications. Transport.
Res. Part C: Emerg. Technol. 68, 70–82.
Wan, N., Vahidi, A., Luckow, A., 2016. Optimal speed advisory for connected vehicles in arterial roads and the impact on mixed traffic. Transport. Res. Part C:
Emerg. Technol. 69, 548–563.
Wang, L., Duran, A., Gonder, J., Kelly, K., 2015. Modeling Heavy/Medium-Duty Fuel Consumption based on Drive Cycle Properties. SAE Technical Paper, 2015-
01-2812.
Wood, E., Burton, E., Duran, A., Gonder, J., 2014. Contribution of road grade to the energy use of modern automobiles across large datasets of real-world drive
cycles. National Renewable Energy Laboratory. NREL/CP-5400-61108.
Yi, Z., Bauer, P.H., 2017. Adaptive multi-resolution energy consumption prediction for electric vehicles. IEEE Trans. Veh. Technol. http://dx.doi.org/10.1109/
TVT.2017.2720587 (in press).
Zhang, D., Zhan, Q., Chen, Y., Li, S., 2016. Joint optimization on logistics infrastructure investments and subsidies in a regional logistics network with CO2
emission reduction targets. Transport. Res. Part D (in press).
Zheng, J., Liu, H.X., 2017. Estimating traffic volumes for signalized intersections using connected vehicle data. Transport. Res. Part C: Emerg. Technol. 79,
347–362.
Zulkefli, M.A.M., Mukherjee, P., Sun, Z., Zheng, J., Liu, H.X., Huang, P., 2017. Hardware-in-the-loop testbed for evaluating connected vehicle applications.
Transport. Res. Part C: Emerg. Technol. 78, 50–62.
