
energies

Article
A Data-Driven Method for Energy Consumption
Prediction and Energy-Efficient Routing of Electric
Vehicles in Real-World Conditions
Cedric De Cauwer 1, *, Wouter Verbeke 1 , Thierry Coosemans 1 , Saphir Faid 2
and Joeri Van Mierlo 1
1 Mobility, Logistics and Automotive Technology Research Centre (MOBI), Electrotechnical Engineering and
Energy Technology (ETEC) Department, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium;
wouter.verbeke@vub.be (W.V.); thierry.coosemans@vub.be (T.C.); joeri.van.mierlo@vub.be (J.V.M.)
2 Punch Powertrain, Industriezone Schurhovenveld 4125, 3800 Sint-Truiden, Belgium;
Saphir.Faid@punchpowertrain.com
* Correspondence: cedric.de.cauwer@vub.be; Tel.: +32-2-629-2838

Academic Editor: Michael Gerard Pecht


Received: 11 March 2017; Accepted: 21 April 2017; Published: 1 May 2017

Abstract: Limited driving range remains one of the barriers for widespread adoption of electric
vehicles (EVs). To address the problem of range anxiety, this paper presents an energy consumption
prediction method for EVs, designed for energy-efficient routing. This data-driven methodology
combines real-world measured driving data with geographical and weather data to predict the
consumption over any given road in a road network. The driving data are linked to the road
network using geographic information system software that allows trips to be separated into segments
with similar road characteristics. The energy consumption over road segments is estimated using a
multiple linear regression (MLR) model that links the energy consumption with microscopic driving
parameters (such as speed and acceleration) and external parameters (such as temperature). A neural
network (NN) is used to predict the unknown microscopic driving parameters over a segment
prior to departure, given the road segment characteristics and weather conditions. The complete
proposed model predicts the energy consumption with a mean absolute error (MAE) of 12–14% of
the average trip consumption, of which 7–9% is caused by the energy consumption estimation of the
MLR model. This method allows for prediction of energy consumption over any route in the road
network prior to departure, and enables cost-optimization algorithms to calculate energy efficient
routes. The data-driven approach has the advantage that the model can easily be updated over time
with changing conditions.

Keywords: electric vehicle (EV); energy consumption; prediction; routing

1. Introduction and State-of-the-Art


The electric vehicle (EV) has great potential in reducing the impact of the transport sector on
global warming by decreasing greenhouse gas (GHG) emissions, particularly in combination with low
emission electricity production, and improving local air quality by having no tail-pipe emissions [1].
Despite the EV’s environmental benefits, its market penetration and widespread adoption are progressing
only moderately, with a market share still below 1% for passenger vehicles in the European
Union [2]. Consumer EV adoption behavior is found to be influenced by attitudinal factors related to
the high initial purchase cost, the consumer’s perception of supportive policy and attitude towards
technical features [3]. Both the high purchase cost and limited range are a result of the current
development state of the battery technology. Limited by the specific energy and cost of the battery,

Energies 2017, 10, 608; doi:10.3390/en10050608 www.mdpi.com/journal/energies



most current commercial vehicles have a battery pack with a capacity of no more than 30 kWh,
resulting in a New European Driving Cycle (NEDC) range of at most 250 km [4], which can decrease
significantly for real-world use where the energy consumption is reported to increase up to 60% [5–7].
This limited driving range combined with an absence of a vast and dense public charging infrastructure
network enforces the need for accurate range estimation to address the problem of range anxiety [8].
The estimation of the driving range is a combination of both estimating the remaining energy in
the battery and predicting the future energy consumption. Most studies regarding range estimation
are focused on the prediction of the variable energy consumption and assume that the remaining energy
in the battery, in the form of state-of-charge (SoC) and state-of-health (SoH), is given. Energy
consumption of an EV depends on the characteristics of the vehicle and its drivetrain, the drive cycle
(the speed profile driven) and auxiliary consumption. In real-world driving, this speed profile—and
therefore energy consumption—is extremely variable and dependent on both road characteristics [9,10],
such as road type and altitude profile, and driving style [11,12]. Additionally, the speed profile is
affected by a number of external influences, such as traffic [13], weather [14] and driver mood, which
either influence the behavior of or impose a behavior on the driver and trigger the use of auxiliaries.
The energy consumption of auxiliaries is heavily dependent on the weather. Field tests and long term
trials show that these auxiliaries are responsible for an important portion of the average real-world
consumption [5,7,15].
This complex system of energy consumers and their influencing factors makes predicting the
energy consumption difficult. As reported in [16], energy estimation models are generally created
for the purpose of EV drivetrain design and optimization [17,18], assessment of the influences on the
energy consumption [10,19,20], global energy consumption or grid impact due to the introduction
of EVs or hybrid vehicles [14,15], or (all-electric) range prediction [21]. Energy estimation for the
purpose of range prediction either relies on vehicle simulations where drivetrains and vehicle behavior
are being simulated [13,22], sometimes down to the component level, or statistical models. Vehicle
simulation models require calibration and validation using real-life tests or roller bench tests, and use
detailed speed profiles or drive cycles as input for their estimation. Statistical models rely on the
availability of real-world data and vary in the extent to which they can be linked with the physical
underlying principles and speed profile [16,23–25]. An important part of any energy estimation model
for the prediction of energy consumption in real-world circumstances is thus the prediction of the
speed profile driven. The speed profiles for real-world energy prediction are often presented in a
discrete set of drive cycles or a combination of these drive cycles [26,27].
Energy-efficient routing allocates an energy cost to all the links or segments in a road network and
applies cost-optimization algorithms to determine the path with the lowest energy consumption.
For a prediction over the road network, whether it be driver-centric or non-centric (with a set
destination), individual predictions over the individual road segments have been proposed for
EVs [23,28], and combustion engine vehicles [29]. The speed profile driven over a road segment will
depend on the road characteristics, the vehicle performance, the traffic and the driver. Driving
behavior can change average speed (i.e., speeding or conservative driving) and the aggressiveness
of acceleration. A third factor of driving behavior is the capability to anticipate behavior of other
vehicles or traffic lights to avoid slowing down and accelerating again, which in combination with
intelligent traffic systems (ITS) has proven to reduce fuel consumption in internal combustion engine
(ICE) vehicles [30] and energy consumption in EVs [13]. Traffic density can influence the driving
behavior by imposing a de facto maximum speed or a higher frequency of stops and accelerations.
Weather, in the form of temperature, rain and daylight might influence the driving behavior towards a
more cautious style to lower the risk for accidents [12].
The goal of this paper is to develop a data-driven method to predict the energy consumption
of an EV, usable for energy-efficient routing. The prediction must be performed on the individual
segments in a road network, and account for external disturbances that influence the speed profile
and auxiliary consumption. By calculating the energy consumption over the complete road network,
energy-optimal solutions can be calculated using cost-optimization algorithms known as shortest-path
algorithms. The proposed model applies a statistical model for the energy consumption estimation,
based on the underlying physical principles, and a machine learning technique that accounts for the
external disturbances on the speed profile. By separating the model in this way, it benefits from
the power and flexibility of data mining techniques, while preserving the interpretability (because of its
strong link with the underlying physical principles) of the computationally simple statistical model.
Both the statistical model and machine learning are based on real-world measured data, so external
influences are implicitly present in the data, and the model is not calibrated to only specific conditions.
The data-driven approach of this method will allow the developed model to be easily updated over
time and adjusted to changing conditions.

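As an illustration of how a shortest-path algorithm can turn per-segment energy costs into an
energy-optimal route, the following minimal sketch uses Dijkstra's algorithm. The choice of Dijkstra,
the function name `lowest_energy_route`, the toy network and its energy values are assumptions made
for illustration only. The sketch also assumes non-negative segment costs, which would not hold where
regenerative braking makes the net consumption of a segment negative.

```python
import heapq


def lowest_energy_route(graph, source, target):
    """Dijkstra's algorithm on a directed graph whose edge weights are energy costs.

    graph: dict mapping node -> list of (neighbour, energy_kwh) pairs.
    Assumes every energy cost is non-negative (an assumption of this sketch).
    Returns (total_energy_kwh, path), or (inf, []) if the target is unreachable.
    """
    best = {source: 0.0}      # lowest known energy to reach each node
    previous = {}             # predecessor of each node on its best path
    queue = [(0.0, source)]
    visited = set()
    while queue:
        energy, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            # Walk the predecessor chain back to the source.
            path = [node]
            while node in previous:
                node = previous[node]
                path.append(node)
            return energy, path[::-1]
        for neighbour, cost in graph.get(node, []):
            if cost < 0:
                raise ValueError("negative energy cost; Dijkstra does not apply")
            candidate = energy + cost
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))
    return float("inf"), []


# Hypothetical toy network: the direct link A->C costs more energy than A->B->C.
toy_network = {
    "A": [("B", 0.20), ("C", 0.55)],
    "B": [("C", 0.25)],
    "C": [],
}
energy, route = lowest_energy_route(toy_network, "A", "C")
```

In this toy network the detour via B is selected because the sum of its two segment costs is lower
than the cost of the direct link.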
2. The Proposed Energy Prediction Model

The proposed method applies a machine learning technique and a statistical method to real-world
measured driving data and energy consumption data of EVs, weather data and geographical
information. The real-world driving, energy, weather, and geographical data are first linked to the
road characteristics of the individual road segments by location, using global positioning system (GPS)
coordinates. The data are then used to train a NN that predicts the speed profile (translated into
microscopic driving parameters) from the road characteristics, weather and traffic-related parameters,
and construct an energy consumption estimation model using multiple linear regression (MLR).
The regression model estimates the energy consumption based on some measurable road and external
parameters, and the predicted values for the microscopic driving parameters from the NN. A schematic
overview of the proposed energy prediction method, its inputs and output and flow of calculations is
given in Figure 1.

Figure 1. Schematic overview of the proposed energy prediction models and their flow of calculations.
MLR: multiple linear regression; GPS: global positioning system.

Although the flow of calculations moves as indicated in Figure 1, the logic for the layout of the
proposed method was derived in another sequence. To provide the reader the same feel and logic
behind the build-up of the model, the individual parts of the proposed method will be presented in
the same order as they were developed. In the remaining part of this section, first a description of the
available data is given, then the energy consumption estimation model is presented, followed by the
method to link the vehicle monitoring data with the road network (segmentation method), and finally
the NN for speed profile prediction is presented.

2.1. Description of the Available Data

The model is built by combining information from datasets originating from different sources.
The data consists of vehicle monitoring data, a road network database, weather data and an altitude
map. The vehicle monitoring data consists of two datasets. One dataset consists of 30 EVs which
were monitored for a period of more than 1 year. The vehicles were monitored as part of the Flemish
Living Labs project EVteclab [31,32]. These vehicles are of the Ford Connect EV model, which is a
Ford Connect transformed to an EV drivetrain by the Punch Powertrain company [33]. The vehicles
were monitored with a logger that measured both GPS data and data from the vehicle controller area
network (CAN). The GPS data were logged at a 1 Hz frequency, the CAN data at a 5 Hz frequency.
The GPS data provided the timestamp, latitude, longitude, and vehicle speed. Vehicle accelerations are
calculated as the discrete derivative of the GPS speed. Although GPS speed measurements themselves
are accurate, the 1 Hz measurement frequency can introduce some loss of accuracy, especially in the
calculations of the accelerations. The CAN data provided information on the energy consumption in
the form of battery voltage, current and SoC. This dataset will be referred to as Dataset 1. The second
dataset concerns three 2014 Nissan Leaf used as taxis in the Brussels Capital Region. They are driven
24/7 by multiple drivers per vehicle. As for Dataset 1, the GPS is logged with a 1 Hz frequency,
while the CAN-bus data is logged with a 1 Hz frequency. This dataset will be referred to as Dataset 2.
The vehicle specifications of both vehicles are presented in Table 1.

Table 1. Overview of the vehicle specifications of both vehicle models in the two datasets. EV: electric vehicle.

Reference Name   Vehicle Model                       Mass (kg)   Motor Power (kW)   Top Speed (km/h)   Torque (Nm)     Battery Capacity (kWh)   Driving Range (km)
Dataset 1        Punch Powertrains Ford Connect EV   1900        60                 120 (limited)      300 (limited)   27                       130
Dataset 2        Nissan Leaf (2014)                  1601        80                 144                300 (limited)   24                       199
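
The calculation of vehicle accelerations as the discrete derivative of the 1 Hz GPS speed can be
sketched as follows. This is a minimal illustration, assuming a backward difference and a speed trace in
km/h; the function name `accelerations_from_speed` and the sample values are hypothetical and not
taken from the datasets.

```python
def accelerations_from_speed(speed_kmh, sample_period_s=1.0):
    """Backward-difference acceleration in m/s^2 from a speed trace in km/h.

    sample_period_s is the logging interval (1 s for a 1 Hz GPS signal).
    The result has one element fewer than the input trace.
    """
    speed_ms = [v / 3.6 for v in speed_kmh]  # km/h -> m/s
    return [
        (speed_ms[i] - speed_ms[i - 1]) / sample_period_s
        for i in range(1, len(speed_ms))
    ]


# Hypothetical 1 Hz speed samples in km/h.
speeds = [0.0, 3.6, 10.8, 10.8, 7.2]
acc = accelerations_from_speed(speeds)
```

Because each value is a difference of two consecutive 1 Hz samples, any speed change shorter than one
second is averaged out, which is one way the loss of accuracy at this logging frequency arises.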

The road database consists of data on the Belgian road network, where the monitored vehicles
were predominantly driven. The road network database has navigating capabilities and contains
information per segment such as road type, segment length, expected speed over the segment and
whether the road was a one-way road, a bridge or a tunnel. The database did not contain any
information on the presence of traffic lights and pedestrian crossings, nor did it mention the local
speed limit. For the Brussels Capital Region, the road database information was extended with the
presence of pedestrian crossings, traffic lights and speed bumps by adding these layers, provided by
Brussels UrbIS® ©, to the base road network. The vehicles in Dataset 1 were driven in a mix of highway,
rural and urban roads, while the vehicles in Dataset 2 were predominantly driven in a dense urban
road network.
The geographical data consists of a 3 arc-second precision digital elevation map (DEM) coming
from the shuttle radar topography mission (SRTM) that provides altitude information on the major
part of the globe. The altitude information was extracted from the DEM for each GPS coordinate of the
driving data with the use of the geographic information system (GIS) software ArcGIS. To link the
driving data to the road network, their GPS coordinates and the road database were joined spatially.
The resulting dataset is thus a combination of vehicle GPS data, road information per segment and
altitude. To visually illustrate the data, Figure 2 shows the road network in the Brussels Capital Region
with part of Dataset 2’s driven trips and the color-scaled altitude map as used in ArcGIS.
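
The altitude extraction for a GPS coordinate can be pictured as a lookup in a regular grid with
3 arc-second spacing. The sketch below is only an illustration of that idea under stated assumptions:
the study itself used ArcGIS, whereas the nearest-cell lookup, the north-up tile layout, the function name
`dem_altitude` and the toy tile values are all hypothetical.

```python
CELL_DEG = 3.0 / 3600.0  # 3 arc-seconds expressed in degrees


def dem_altitude(dem, north_lat, west_lon, lat, lon):
    """Nearest-cell altitude lookup in a north-up elevation raster.

    dem: list of rows (row 0 is the northernmost), each a list of altitudes in m.
    north_lat, west_lon: coordinates of the centre of cell [0][0] (assumed layout).
    """
    row = round((north_lat - lat) / CELL_DEG)  # rows increase towards the south
    col = round((lon - west_lon) / CELL_DEG)   # columns increase towards the east
    if not (0 <= row < len(dem) and 0 <= col < len(dem[0])):
        raise ValueError("coordinate outside the elevation tile")
    return dem[row][col]


# Hypothetical 2 x 2 tile with its north-west cell centred at (50.85, 4.35).
toy_dem = [
    [50.0, 52.0],
    [48.0, 47.0],
]
altitude_nw = dem_altitude(toy_dem, 50.85, 4.35, 50.85, 4.35)
altitude_se = dem_altitude(toy_dem, 50.85, 4.35, 50.85 - CELL_DEG, 4.35 + CELL_DEG)
```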
The weather data was measured and provided by the Royal Meteorological Institute (RMI) of
Belgium and contained temperature, wind speed and direction, and precipitation on an hourly basis
for weather stations close to the respective regions in Flanders (Flanders, Belgium) where the vehicles
of each dataset were driven. The weather data are considered sufficiently accurate and representative
for the whole of the vehicle monitoring datasets because of the limited area of the regions where the
vehicles were driven.
As the procedure to link the altitude and road information with the GPS coordinates is
computationally intensive, only a representative selection of the vast amount of vehicle monitoring data
was taken to establish the scientific value of the proposed methodology. The selection is considered
representative if it covers a sufficient part of the road network (i.e., all types of roads) under various
conditions. Therefore, the selection for Dataset 1 contained multiple vehicles driven in different parts
of the region, on a variety of road types and spread out over multiple months of monitoring. After
filtering the selected data, the used data consisted of 3700 km driven by three different vehicles for
Dataset 1 and 10,700 km driven by 2 vehicles in Dataset 2.

Figure 2. Part of the Dataset 2 driven trips on the road network and altitude map for the Brussels
Capital Region as used in the geographic information system (GIS) software ArcGIS.

2.2. Energy Estimation Model


The model is based on the underlying physical model that describes the forces acting on a vehicle
in motion. The mechanical energy dE required at the wheels to cover the distance ds is written as:

$$dE = \frac{1}{3600}\left[ m g \left(f\cos\varphi + \sin\varphi\right) + \frac{1}{2}\left(\rho C_x A\right)\left(\frac{v_{EV} + v_w}{3.6}\right)^2 + \left(m + m_f\right)\frac{dv}{dt} \right] ds \qquad (1)$$

dE     Mechanical energy required at the wheels to drive a distance ds [kWh]
m      Total vehicle mass [kg]
m_f    Fictive mass of rolling inertia [kg]
g      Gravitational acceleration [m/s²]
f      Vehicle coefficient of rolling resistance [-]
φ      Road gradient angle [°]
ρ      Air density [kg/m³]
C_x    Drag coefficient of the vehicle [-]
A      Vehicle equivalent cross section [m²]
v_EV   Vehicle speed between the point i and the point j [km/h]
v_w    Wind speed projected to the opposing direction of the driving direction [km/h]
ds     Distance driven from point i to point j [km]
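
To make the four terms of the force balance concrete, the following minimal sketch evaluates them
numerically. It assumes that speeds in km/h are converted to m/s by dividing by 3.6, that dv/dt is
supplied in m/s², and that the force in N multiplied by the distance in km and divided by 3600 gives kWh.
The function name `wheel_energy_kwh` and all default parameter values except the 1900 kg mass from
Table 1 are hypothetical placeholders, not measured vehicle properties.

```python
import math


def wheel_energy_kwh(ds_km, v_ev_kmh, v_w_kmh, accel_ms2, grade_deg,
                     m=1900.0, m_f=50.0, f=0.01, rho=1.2, cx_a=1.0, g=9.81):
    """Mechanical energy at the wheels over a distance ds_km, in kWh.

    m is the Dataset 1 vehicle mass from Table 1; m_f, f, rho and cx_a
    (the product Cx * A) are illustrative assumptions.
    """
    phi = math.radians(grade_deg)
    # Rolling resistance and potential energy (road gradient) term, in N.
    rolling_and_grade = m * g * (f * math.cos(phi) + math.sin(phi))
    # Aerodynamic drag with headwind superposed on the vehicle speed, in N.
    aero = 0.5 * rho * cx_a * ((v_ev_kmh + v_w_kmh) / 3.6) ** 2
    # Inertial term including the fictive mass of the rotating parts, in N.
    inertia = (m + m_f) * accel_ms2
    # N * km = kJ, and 1 kWh = 3600 kJ.
    return (rolling_and_grade + aero + inertia) * ds_km / 3600.0


# One kilometre at a constant 50 km/h on a flat road without wind.
flat_cruise = wheel_energy_kwh(1.0, 50.0, 0.0, 0.0, 0.0)
# Same kilometre with a 20 km/h headwind.
with_headwind = wheel_energy_kwh(1.0, 50.0, 20.0, 0.0, 0.0)
```

A negative result, for instance on a sufficiently steep descent, corresponds to energy that is available at
the wheels rather than required.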
The terms in (1) represent respectively the rolling resistance, potential energy, aerodynamic loss
and inertial energy. Assuming, to first order, that the rolling resistance coefficient, drag coefficient, air
density and vehicle mass are constant, the energy consumption can be described as a linear combination
of the kinematic parameters ds, v² ds, (dv/dt) ds, and h = ds sin φ. To represent the consumption of the
auxiliaries, the formula was then extended with a time-linear, temperature-scaled term. The simplified
linear representation of the energy consumption of the EV can now be written as:

$$E_{EV} = B_1 s + B_2 \left(v_{EV} + v_w\right)^2 s + B_3\, a\, s + B_4 h + B_5\, Aux_T\, Aux_t\, t \qquad (2)$$

Aux_T  Temperature scaling
Aux_t  Fraction of time the auxiliaries are switched on
t      Time
s      Distance
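
How regression coefficients for a linear model with these five predictors could be obtained is sketched
below. This is a minimal, dependency-free illustration and not the authors' implementation: it assumes
ordinary least squares without an intercept, solved through the normal equations, with a denoting the
acceleration. The helper names, the coefficient values in `true_b` and the synthetic, noise-free data
are all hypothetical.

```python
import random


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def predictors(s, v_ev, v_w, a, h, aux_T, aux_t, t):
    """Predictor row for the five terms of the simplified linear energy model."""
    return [s, (v_ev + v_w) ** 2 * s, a * s, h, aux_T * aux_t * t]


def fit_mlr(rows, energies):
    """Least-squares coefficients (no intercept) via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * e for r, e in zip(rows, energies)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic, noise-free observations generated from hypothetical coefficients.
rng = random.Random(0)
true_b = [0.12, 1.5e-5, 0.03, 0.004, 2.0e-4]
rows = [
    predictors(s=rng.uniform(0.2, 3.0), v_ev=rng.uniform(20, 120),
               v_w=rng.uniform(-10, 10), a=rng.uniform(-0.5, 0.5),
               h=rng.uniform(-20, 20), aux_T=rng.uniform(0, 1),
               aux_t=rng.uniform(0, 1), t=rng.uniform(30, 300))
    for _ in range(200)
]
energies = [sum(b * x for b, x in zip(true_b, row)) for row in rows]
estimated_b = fit_mlr(rows, energies)
```

With measured data the energies contain noise, so the estimated coefficients would only approximate
the underlying physical quantities; a numerically more robust solver (for example a QR-based
least-squares routine) would be preferable for large or poorly scaled datasets.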

By applying MLR to the real-world driving and energy data, the coefficients of the linear
combination in (2) are determined. The effect of wind speed on energy consumption does not seem
very significant at first because, in general, wind speed is moderate compared to vehicle speed and the
driving direction shifts frequently during a trip. However, upon close examination of the outliers in the
results of the energy estimation using (2), it was established that the wind can have a large influence
on energy consumption in some cases. Therefore, wind speed was added to the predictor variables
in (2) by projecting it on the driving direction. Figure 3 presents an example of energy consumption
estimation using (2) for a measured trip with reported heavy headwinds. It depicts the individual
contributions of the regression terms in a cumulative way along the progression of a trip, with and
without the superposition of headwind on the speed predictor. Superposing the projected wind speed
on the vehicle speed in the aerodynamic term of the energy equation reduced the error from around
30% to only a few percent over the trip.

Figure 3. Speed profile, cumulative energy measured, and the cumulative energy and its
individual contributions estimated from the regression model for a trip with strong headwind. The top
figure does not take into account the headwind, whereas the lower figure shows the result of the
regression when superposing the headwind on the vehicle speed.
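
Projecting the wind on the driving direction can be sketched as follows. The sketch assumes that the
wind direction is reported as the direction the wind blows from, in degrees clockwise from north (the
usual meteorological convention), and that the driving direction is taken as the initial bearing between
two consecutive GPS fixes. The function names and the example coordinates are hypothetical.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from fix 1 to fix 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def headwind_kmh(wind_speed_kmh, wind_from_deg, heading_deg):
    """Wind component opposing the direction of travel (positive = headwind).

    wind_from_deg: direction the wind blows FROM (assumed convention).
    heading_deg: driving direction, degrees clockwise from north.
    """
    angle = math.radians(wind_from_deg - heading_deg)
    return wind_speed_kmh * math.cos(angle)


# Hypothetical example: driving due north into a 30 km/h northerly wind.
heading = bearing_deg(50.85, 4.35, 50.86, 4.35)
v_w = headwind_kmh(30.0, 0.0, heading)
```

The resulting v_w is the value that is added to the vehicle speed in the aerodynamic term; a tailwind
yields a negative value and a pure crosswind a value close to zero.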
The sensitivity analysis of the energy demand, presented in [34], highlights the effect of a variable
rolling resistance and, to a lesser extent, vehicle mass on the energy consumption of the EV. The rolling
resistance coefficient can vary considerably because of many factors, such as road surface, road wetness,
tires and tire pressure, with reported variations up to 65% [22,35], and therefore requires extensive
measurements to characterize it. There also exist methods to estimate vehicle mass and the rolling
resistance coefficient online [36]. If explicit measurements of rolling resistance and vehicle mass are
available, these parameters can easily be drawn out of the regression coefficients and added explicitly
to the predictors in (2) to account for their variability.
By constructing the model using the MLR based on the vehicle dynamics equation, the method is
both computationally simple and interpretable, owing to the causal relations in the model.
To allow the MLR to detect the individual influences on the energy consumption more easily, the trips
are split into shorter segments, so more variability resides in the measured data.

2.3. Segmentation Method


The trips can be split into shorter segments with more distinct conditions to avoid over-aggregation
and a loss of variability in the data. The energy estimation is done on these segments, and the estimates
are later recombined for an estimation on trip level. Based on (2), the formula for the simplified linear
representation of the energy consumption as a function of its predictors now becomes:

$$\Delta E_{trip} = \sum \Delta E_{segments} = \sum_{j}^{segments} \Big[ B_1 \Delta s_j + B_2 \sum_{i}^{n} \left(v_{EV,i} + v_{w,i}\right)^2 \Delta s_i + B_3\, CMF_j^{+} \Delta s_j + B_4\, CMF_j^{-} \Delta s_j + B_5\, \Delta H_{pos,j} + B_6\, \Delta H_{neg,j} + B_7\, Aux_{T,j}\, \Delta t_j + \varepsilon \Big] \qquad (3)$$

with:

$$CMF_j = \frac{\sum_{i=2}^{n} \left( v_{EV,i}^2 - v_{EV,i-1}^2 \right)}{\Delta s_j} \qquad (4)$$

$$AF_j = \sum_{i}^{n} \left(v_{EV,i} + v_{w,i}\right)^2 \Delta s_i \qquad (5)$$

where:

B_i      : regression coefficients
ΔE       : energy
v_EV,i   : vehicle speed at time t_i
v_w,i    : wind speed value projected on the driving direction at time t_i
Δs       : distance
Δs_i     : distance driven between t_(i−1) and t_i
Aux_T    : temperature scaling
Aux_t    : fraction of time auxiliaries are switched on
Δt       : time
ΔH_pos   : positive elevation changes
ΔH_neg   : negative elevation changes
ε        : error term
n        : number of data points in segment j

The constant motion factor (CMF), defined in (4), is the sum of kinetic energy changes per unit
distance and is equivalent to the acceleration term in (2). Because the sum of the positive and negative
kinetic energy changes over a segment are not necessarily equal, they are split up in CMF+ and CMF−
in (3). The CMF and aerodynamic factor (AF), defined in (5), are a translation of the speed profile for
this method and represent respectively the performed accelerations and driving speed.
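
The translation of a measured speed profile into CMF+, CMF− and AF for one segment can be sketched as
follows. This is a minimal illustration under stated assumptions: the squared-speed changes are sorted by
sign to form CMF+ and CMF−, both are divided by the total segment distance, and the AF sum runs over
the samples for which a driven distance since the previous sample exists. The function name
`speed_profile_factors` and the sample values are hypothetical.

```python
def speed_profile_factors(v_ev, v_w, ds):
    """Return (cmf_pos, cmf_neg, af) for one segment.

    v_ev[i], v_w[i]: vehicle speed and projected wind speed at sample i [km/h].
    ds[i - 1]: distance driven between sample i - 1 and sample i [km],
               so ds has one element fewer than v_ev.
    """
    if len(v_ev) != len(v_w) or len(ds) != len(v_ev) - 1:
        raise ValueError("inconsistent input lengths")
    segment_length = sum(ds)
    pos = 0.0
    neg = 0.0
    for i in range(1, len(v_ev)):
        delta = v_ev[i] ** 2 - v_ev[i - 1] ** 2  # change in squared speed
        if delta > 0:
            pos += delta
        else:
            neg += delta
    # Aerodynamic factor: squared air speed weighted by distance driven.
    af = sum((v_ev[i] + v_w[i]) ** 2 * ds[i - 1] for i in range(1, len(v_ev)))
    return pos / segment_length, neg / segment_length, af


# Hypothetical four-sample segment without wind.
cmf_pos, cmf_neg, af = speed_profile_factors(
    v_ev=[0.0, 10.0, 20.0, 10.0],
    v_w=[0.0, 0.0, 0.0, 0.0],
    ds=[0.1, 0.1, 0.1],
)
```

The sum of the two CMF values only reflects the difference between the final and initial squared speed,
which is why the positive and negative parts carry separate regression coefficients.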
The method to split the trips into micro-trips or segments is an important part in the complete
proposed method. The simplified linear representation of the energy consumption expressed in (3)
requires a minimum of aggregation of data points, but over-aggregation of the predictors leads to
loss of variability. A common practice in many analyses [37] consists of splitting trips into micro-trips
of equal duration. However, this method leads to an arbitrary division in segments, as there is no
link between the road characteristics and the segments. By splitting the trips into segments in an
arbitrary way, driving and road conditions are not represented uniformly over the segments and their
representation cannot be controlled. Additionally, splitting trips into segments of equal duration makes
the duration predictor constant, making it hard to detect its relation to the dependent variable through
linear regression. One method to divide trips into segments with variable duration is to link the data
points to the road segments by location. The segment length will then be variable and the speed
profile allocated to a specific road segment with its own characteristics. This method is therefore the
most sensible with respect to the complete proposed method. Applying this segmentation method,
the length of the obtained segments depends entirely on the length of the road segments. The road
segments' lengths range from a few tens of meters up to three kilometers, with a high concentration
of very short segments. For very short segments, the number of data points per segment becomes too
low to obtain accurate results. Hence, sequential very short segments within one trip (containing
fewer than 100 data points) with identical road types were aggregated to combined segments up to
100 data points.
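
The aggregation of sequential short segments can be sketched as follows. This is one possible
interpretation, stated as an assumption: consecutive segments of the same road type that each hold
fewer than 100 data points are merged until the combined segment reaches the threshold. The function
name `aggregate_short_segments`, the tuple layout and the example trip are hypothetical.

```python
MIN_POINTS = 100  # threshold below which a segment counts as very short


def aggregate_short_segments(segments, min_points=MIN_POINTS):
    """Merge consecutive short segments of identical road type within one trip.

    segments: list of (road_type, n_points) tuples in driving order.
    Returns a list of (road_type, n_points, n_merged) tuples, where n_merged
    is the number of original road segments combined into the entry.
    """
    merged = []
    for road_type, n_points in segments:
        can_merge = (
            merged
            and merged[-1][0] == road_type
            and merged[-1][1] < min_points   # previous entry is still short
            and n_points < min_points        # current segment is short too
        )
        if can_merge:
            prev_type, prev_points, prev_count = merged[-1]
            merged[-1] = (prev_type, prev_points + n_points, prev_count + 1)
        else:
            merged.append((road_type, n_points, 1))
    return merged


# Hypothetical trip: three short urban segments are combined, then a new one starts.
trip = [("urban", 30), ("urban", 40), ("urban", 50), ("urban", 20),
        ("highway", 400), ("rural", 60), ("urban", 10)]
combined = aggregate_short_segments(trip)
```

Short segments that are not adjacent to another short segment of the same road type remain as they
are in this sketch.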

2.4.Speed
2.4. SpeedProfile
ProfilePrediction
Prediction
In the energy model presented above, the speed profile is translated into two predictors: the
speed-related AF and the acceleration-related CMF. The AF and CMF are highly variable and unknown
prior to departure. All other predictors in (3) are either known or directly measurable for a chosen
route. To be able to predict the energy consumption over the route, the values of these two predictors
of the energy estimation model must be predicted. If we want to enable energy-efficient routing, this
prediction must be done for each individual segment of the road network to allocate an energy cost.
Because the interactions between the road characteristics, traffic situation and driver are complex
and likely to have non-linear and interdependent relations with the driving speed and accelerations
performed, the decision was taken to develop a model based on machine learning. The estimation
technique used is a neural network (NN) [38]. The NN is a powerful technique for black box function
approximation, capable of predicting non-linear, complex relations. A NN is trained to link attributes
from the road and the traffic of the road segments with the actual measured AF and CMF. Figure 4
illustrates the principle of the NN inputs and outputs.

Figure 4. Schematic overview of the neural network (NN), its inputs and outputs.

The available road-related attributes were the road type, altitude differences, indication of the
average speed, and crossings, and were extended with the presence of traffic lights, speed bumps and
pedestrian crossings for Dataset 2. In case sequential very short segments of the same road type were
aggregated to have sufficient data points, as explained in Section 2.3, their road-related attributes
were aggregated as well. The traffic light information was added as static information to the road
database and merely indicates its presence on a segment, without information on signal phases.
The crossings were categorized as left turn, right turn or straight through, according to the magnitude
of the angle. The measured average speed over a segment could have been used as a predictor, as the
real-time average speed can be imported using real-time traffic services. However, because no data
from traffic services were available that would allow verification of this assumption, it was opted not
to do so and to accept a more conservative performance of the prediction. The dataset did not contain
explicit characteristics of traffic, but the weather characteristics (temperature and precipitation), time
of the day, and day of the week were considered implicit indicators of traffic state. Although the
prediction of CMF and AF by the NN is based on many road-related attributes, weather characteristics
and implicit traffic indicators, these parameters do not comprise all the attributes, or account for all
complex interactions, that influence the speed profile. Unique events, such as accidents, sport events or
road works, will have an influence on the traffic state [39,40]. Individual driving style can modify the
speed profile, while the traffic light status can change it fundamentally. As this information was not
present in the available datasets, it presents some limitations of the model in its current state.
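To make the NN inputs listed above concrete, the following sketch encodes one road segment as a numeric feature vector. The field names, the category lists (including a "none" crossing category), the scaling of the hour, and the one-hot encoding itself are assumptions for illustration; the text does not specify how the attributes were encoded.

```python
# Hypothetical category lists; the real road database may use other classes.
ROAD_TYPES = ("highway", "rural", "urban")
CROSSINGS = ("none", "left", "right", "straight")


def one_hot(value, categories):
    """Return a list with 1.0 at the position of `value` and 0.0 elsewhere."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]


def encode_segment(seg):
    """Turn a dict of segment attributes into a flat list of floats."""
    features = []
    features += one_hot(seg["road_type"], ROAD_TYPES)
    features += one_hot(seg["crossing"], CROSSINGS)
    features += [
        float(seg["altitude_gain_m"]),
        float(seg["altitude_loss_m"]),
        float(seg["speed_indication_kmh"]),
    ]
    # Static presence flags (available for Dataset 2 only, per the text).
    features += [
        float(seg["traffic_light"]),
        float(seg["speed_bump"]),
        float(seg["pedestrian_crossing"]),
    ]
    # Weather and time act as implicit traffic indicators.
    features += [float(seg["temperature_c"]), float(seg["precipitation_mm"])]
    features += [seg["hour"] / 24.0]
    features += one_hot(seg["weekday"], range(7))
    return features


example = {
    "road_type": "urban", "crossing": "left",
    "altitude_gain_m": 2.5, "altitude_loss_m": 0.0, "speed_indication_kmh": 50,
    "traffic_light": True, "speed_bump": False, "pedestrian_crossing": True,
    "temperature_c": 12.0, "precipitation_mm": 0.0, "hour": 8, "weekday": 0,
}
x = encode_segment(example)
```

A vector of this kind would be the input of the NN, with the measured AF and CMF of the segment as the two training targets.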

3. Results
The proposed model is a combination of a NN for the prediction of the CMF and AF (representing
the speed profile) on the road segments, followed by the MLR model to estimate the energy
consumption from the predicted CMF, the predicted AF, and the remaining measurable parameters
in (3). Based on the schematic overview of the model, presented in Figure 1, a detailed overview of the
proposed model with its inputs and outputs is given in Figure 5.

Figure 5. Detailed overview of the proposed model for energy consumption prediction. AF: aerodynamic
factor; CMF: constant motion factor.

To construct the data-driven model, the selected datasets are first split up into 80–20% for
training-testing of the entire model as a cascade of the NN and MLR. The 80% for training is then
split up in 90% in training and 10% in validation of the NN specifically. The data partitioning and
data process flow is illustrated in Figure 6.
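The two-level split can be sketched as below. The random shuffle, the fixed seed and the function name are assumptions; the text does not state how the records were assigned to each part.

```python
import random


def partition(records, test_fraction=0.2, validation_fraction=0.1, seed=0):
    """Split records into NN-training, NN-validation and test parts.

    First 80-20% (training-testing of the complete cascade), then the
    training part 90-10% (training-validation of the NN only).
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # assumption: random assignment

    n_test = round(len(shuffled) * test_fraction)
    test = shuffled[:n_test]
    training_pool = shuffled[n_test:]

    n_validation = round(len(training_pool) * validation_fraction)
    validation = training_pool[:n_validation]
    training = training_pool[n_validation:]
    return training, validation, test


training, validation, test = partition(range(1000))
```

With 1000 records this yields 720 records for NN training, 80 for NN validation and 200 for testing the cascade.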

Figure 6. Overview of the data partition and data process flow for the proposed energy consumption
prediction model.

There is no specific test set for the NN, as the test set will serve to evaluate the complete cascade
model. The filtering process focused on mainly three issues: the correct spatial joining between
the GPS coordinates and the road database (for example if the vehicles drove on a factory site—the
spatial joining would then incorrectly link those GPS points to the nearest road segment), the correct
synchronization between the CAN data and GPS data, and the occurrence of a charging event during
the segment. The results of the energy estimation model, NN prediction and the complete proposed
model for energy consumption prediction will be presented in Sections 3.1–3.3 respectively.

3.1. Energy Estimation Model


Applying the MLR with the segmentation based on (3) to Dataset 1 and Dataset 2 results in the
correlation coefficient, regression coefficients and p-values presented in Table 2. Comparison of the
results for Dataset 1 and Dataset 2 shows that the energy estimation model is vehicle-specific, as the
regression coefficients are different, but the coefficients have similar orders of magnitude and trends.
All p-values for the regression coefficients B1–B7 are below 0.0001, indicating these terms are highly significant.
The MLR also generates an intercept, which is a constant term (or offset) that equals the prediction
when all predictors are zero. However, the vehicle dynamics in (1), leading to the simplified linear
representation of the energy consumption in (2), do not have a constant term. This means no physical
interpretation can be given to this intercept term and a part of the variability which is not explained by
the model is contained within the intercept term for a better fit.
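The role of the intercept can be illustrated by evaluating the fitted linear model directly. The sketch below uses the Dataset 2 coefficients from Table 2; the predictor names and the example predictor values are hypothetical, since the definitions and units of the predictors in (3) are not repeated here.

```python
# Regression coefficients for Dataset 2, copied from Table 2.
DATASET_2 = {
    "intercept": -0.0024,
    "rolling_resistance": 0.0541,        # B1
    "aerodynamic": 5.23e-6,              # B2
    "positive_accelerations": 1.92e-5,   # B3
    "negative_accelerations": 1.21e-5,   # B4
    "positive_altitude": 0.00341,        # B5
    "negative_altitude": 0.00259,        # B6
    "auxiliaries": 0.297,                # B7
}


def estimate_energy(coefficients, predictors):
    """Evaluate intercept + sum(B_i * x_i) for one segment."""
    energy = coefficients["intercept"]
    for name, coefficient in coefficients.items():
        if name != "intercept":
            energy += coefficient * predictors[name]
    return energy


# With all predictors at zero, the model returns the intercept itself.
zero = {name: 0.0 for name in DATASET_2 if name != "intercept"}
energy_at_zero = estimate_energy(DATASET_2, zero)

# Hypothetical predictor values, chosen only to show the arithmetic.
example = dict(zero, rolling_resistance=1.0, positive_altitude=10.0, auxiliaries=0.1)
energy_example = estimate_energy(DATASET_2, example)
```

Here `energy_at_zero` equals the (negative) intercept, which has no physical counterpart in the vehicle dynamics and only absorbs unexplained variability.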

Table 2. Overview of the regression coefficients and p-values of the energy estimation model for
Dataset 1 and Dataset 2.

MLR Results of the Energy Estimation Model

Coefficient                    Dataset 1 (R^2 = 0.96)        Dataset 2 (R^2 = 0.93)
                               Bi              p-value       Bi              p-value
Intercept                      −0.0071         <0.5          −0.0024         <0.05
Rolling Resistance (B1)        0.0670          <0.0001       0.0541          <0.0001
Aerodynamic (B2)               1.10 × 10^−5    <0.0001       5.23 × 10^−6    <0.0001
Positive Accelerations (B3)    2.75 × 10^−5    <0.0001       1.92 × 10^−5    <0.0001
Negative Accelerations (B4)    1.64 × 10^−5    <0.0001       1.21 × 10^−5    <0.0001
Positive Altitude (B5)         0.00423         <0.0001       0.00341         <0.0001
Negative Altitude (B6)         0.00389         <0.0001       0.00259         <0.0001
Auxiliaries (B7)               0.136           <0.0001       0.297           <0.0001

For both datasets, the coefficient for the positive altitude changes is larger than the coefficients
for the negative altitude changes, which is physically correct as only part of the potential energy
will be recovered in kinetic energy or electrical energy through regenerative braking. The ratio of
the negative to the positive coefficient for Dataset 2, 0.00259/0.00341 = 76%, is consistent with the
typical drivetrain efficiency [5] and efficiency of regenerative braking [41]. However, for Dataset 1
this value of 0.00389/0.00423 ≈ 92% is high. It should also be mentioned that for Dataset 1, the value of
these coefficients was sensitive to the trips contained in the dataset and had subsets where the negative
coefficient was larger than the positive, which is physically impossible because it would mean more
energy is recovered during downhill than what was consumed during uphill. In energy-efficient
routing this must be avoided because it will potentially lead to a suggestion of routes with many
altitude differences instead of avoiding them. The reason for the sensitivity of the coefficients for
the altitude contribution in Dataset 1 is suspected to be the small variability of the altitude in the
geographic region of Dataset 1, which makes it harder to detect its influence in the variability of the
energy consumption. However, because the altitude differences are so small in this region, this will
also introduce low average absolute errors compared to the total energy estimation.
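The plausibility argument for the altitude coefficients can be turned into a simple check before the coefficients are used for routing. The function names are illustrative; the coefficient values are those of Table 2.

```python
def recovery_ratio(b_positive_altitude, b_negative_altitude):
    """Ratio of the downhill to the uphill altitude coefficient."""
    if b_positive_altitude <= 0:
        raise ValueError("the uphill coefficient must be positive")
    return b_negative_altitude / b_positive_altitude


def is_physically_plausible(b_positive_altitude, b_negative_altitude):
    """A vehicle cannot recover more energy downhill than it spent uphill."""
    return 0.0 <= recovery_ratio(b_positive_altitude, b_negative_altitude) <= 1.0


ratio_dataset_1 = recovery_ratio(0.00423, 0.00389)
ratio_dataset_2 = recovery_ratio(0.00341, 0.00259)
```

A fit on a data subset for which `is_physically_plausible` returns `False` would reward routes with many altitude differences and should be rejected for routing purposes.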

3.2. Speed Profile Prediction


The data for the NN is split up into 90% for training and 10% for validation. The regression plots
for the prediction of the AF and CMF on Dataset 2 are given in Figures 7 and 8.

Figure 7. Regression plot for the prediction of the aerodynamic predictor over the segments by the NN
on the training set and validation set of Dataset 2.

Figure 8. Regression plot for the prediction of the CMF over the segments by the NN on the training
set and validation set of Dataset 2.

The correlation coefficient of 0.97 and 0.93 for the two respective predictions is high. However,
there is a spread of the data points around the blue line, especially for the CMF prediction, which
indicates not all the variability is explained by the current predictors. This is due to the fact that the
driven speed and accelerations are complex variables with a lot of interdependent influencing factors
and they are sensitive to random events. When including explicit information on traffic density and
driving style, more variability of the data can be explained, similarly to what was observed after
including the traffic lights and speed bumps in Dataset 2.
Table 3 presents the correlation coefficient R and the mean squared error (MSE) for the prediction
of the AF and CMF for both Dataset 1 and Dataset 2. The values obtained in Table 3 have a very stable
behavior when calculated multiple times.

Table 3. Correlation coefficient (R) and mean squared error (MSE) for the prediction of the AF and
CMF for both Dataset 1 and Dataset 2.

Variable    Dataset 1                Dataset 2
            R       MSE              R       MSE
AF          0.97    2.57 × 10^6      0.97    4.17 × 10^5
CMF         0.91    5.37 × 10^6      0.93    9.37 × 10^5

According to the values presented in Table 3, the NN performs better for Dataset 2. This is
attributed to the extension of the input parameters for the prediction with traffic lights, pedestrian
crossings and speed bumps in Dataset 2. This information was not available in Dataset 1. However,
the large difference in the MSE between the two datasets is also partly due to the different sizes of the
road segments in both sets. On average, the larger road segments in Dataset 1 will cause larger absolute
errors with similar relative errors.

3.3. Energy Consumption Prediction Model


The proposed method is a combination of the MLR for the energy consumption estimation, based
on the vehicle dynamics equation, and a NN for the prediction of the speed profile. Using a multistep
model can lead to an increased loss of accuracy as the error is accumulated in each step. To benchmark
the results, the results of the proposed model were compared to two other models. The first benchmark
model is a NN prediction that was trained with the same parameters as the proposed model in the
input, and the energy consumption in the output. The second benchmark model is a simple calculation
of the energy consumption by multiplication of the distance with the total real-world measured average
energy consumption. The models’ performance will be analyzed through two absolute error indicators
which are calculated on the part of the data preserved for testing: root-mean-square error (RMSE)
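and mean absolute error (MAE). If the error is defined as the target value ($v_t$) minus the predicted
value ($v_p$), the indicators are calculated as:

$$\varepsilon = v_t - v_p \tag{6}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{1}^{n} \varepsilon^{2}}{n}} \tag{7}$$

$$\mathrm{MAE} = \frac{\sum_{1}^{n} |\varepsilon|}{n} \tag{8}$$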
The root-mean-square error penalizes predictions far off target more heavily than the mean absolute
error does. The performance indicators for the NN–MLR model and the two benchmark models for
Dataset 1 and Dataset 2 are presented in Table 4. Table 4 also indicates the average consumed energy
per segment, <E>, for both datasets.
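The indicators in (6)–(8), together with the relative indicator MAE/<E> used in the tables, can be computed as in the following sketch. The function names and the example values are illustrative only.

```python
import math


def errors(targets, predictions):
    """Error per sample, defined as target minus prediction, as in (6)."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    return [t - p for t, p in zip(targets, predictions)]


def rmse(targets, predictions):
    """Root-mean-square error, as in (7)."""
    eps = errors(targets, predictions)
    return math.sqrt(sum(e * e for e in eps) / len(eps))


def mae(targets, predictions):
    """Mean absolute error, as in (8)."""
    eps = errors(targets, predictions)
    return sum(abs(e) for e in eps) / len(eps)


def relative_mae(targets, predictions):
    """MAE divided by the mean target value, i.e. MAE/<E>."""
    mean_target = sum(targets) / len(targets)
    return mae(targets, predictions) / mean_target


# Illustrative values (kWh): three exact predictions and one far off target.
measured = [0.10, 0.20, 0.30, 0.40]
predicted = [0.10, 0.20, 0.30, 0.80]
example_rmse = rmse(measured, predicted)
example_mae = mae(measured, predicted)
example_relative_mae = relative_mae(measured, predicted)
```

In this example the single large error gives an RMSE of about 0.2 kWh against an MAE of about 0.1 kWh, which shows how the RMSE weights outliers more strongly.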

Table 4. Performance indicators for the energy prediction model and the two benchmark models for
comparison on segment level. RMSE: root-mean-square error; MAE: mean absolute error; <E>: average
consumed energy per segment.

Dataset      Performance    NN–MLR        NN            Average Consumption    <E> (kWh)
             Indicator      Prediction    Prediction    Prediction
Dataset 1    RMSE (kWh)     0.0733        0.0923        0.0723                 0.159
             MAE (kWh)      0.0516        0.0615        0.0503
             MAE/<E>        0.32          0.38          0.32
Dataset 2    RMSE (kWh)     0.0267        0.0335        0.0341                 0.0336
             MAE (kWh)      0.0190        0.0230        0.0232
             MAE/<E>        0.57          0.68          0.69

Regarding the performance indicators, Table 4 shows that for Dataset 1, the NN–MLR and
average consumption predictions have a similar performance and outperform the direct NN prediction.
For Dataset 2, the NN–MLR clearly outperforms the two other prediction methods, which have a
similar performance. This is consistent with the observation from Table 3 in Section 3.2, which
showed that the prediction of the AF and CMF had better results in Dataset 2 due to the presence of
traffic lights and speed bumps in the NN predictors. In relative terms (MAE/<E>), the performance of
all three prediction methods is significantly lower in Dataset 2 compared to Dataset 1. This can be
explained by the difference in the composition of the road network of both datasets. Dataset 1 contains
a mixture of highway, rural and urban roads, with a considerable number of rural roads, while
Dataset 2 primarily consists of dense urban roads. This results in significantly shorter road segments
with much more diverse conditions and lower average energy consumption. This also contributes to
the power of the NN prediction stage in the NN–MLR and its better performance in Dataset 2.
To visualize the energy consumption prediction for the individual segments, the regression plot for
the prediction of the energy consumption of the segments in Dataset 2 is given in Figure 9.

Figure 9. Regression plot for the NN–MLR prediction of the energy consumption on the segments in
Dataset 2.

If the segments are recombined to the original trips, we can evaluate the performance of the
prediction methods on trip level. The same performance indicators for the prediction of the segments
are calculated for the trips and presented in Table 5. The performance indicators on trip level have the
same trend as on segment level: the average consumption prediction performs best for Dataset 1 by a
small margin, the NN–MLR performs best in Dataset 2 by a significant margin. The values of the MAE
per average consumed energy decrease significantly for the prediction on trip level. This is because
the errors on the segment prediction are symmetrically distributed and partly cancel each other out
when recombined to trips. To visualize the energy consumption prediction for the individual trips,
the regression plot for the prediction of the energy consumption of the trips in Dataset 2 is given in
Figure 10. The figure demonstrates the good results for the NN–MLR prediction on trip level.
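The cancellation of segment errors on trip level can be reproduced with a small sketch. The trip identifiers and energy values below are invented for illustration and are not taken from the datasets.

```python
from collections import defaultdict


def mae_over_mean(pairs):
    """MAE/<E> for a list of (measured, predicted) energy pairs."""
    n = len(pairs)
    mean_absolute_error = sum(abs(m - p) for m, p in pairs) / n
    mean_energy = sum(m for m, _ in pairs) / n
    return mean_absolute_error / mean_energy


def recombine_to_trips(segments):
    """Sum measured and predicted segment energies per trip identifier."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for trip_id, measured, predicted in segments:
        totals[trip_id][0] += measured
        totals[trip_id][1] += predicted
    return [(measured, predicted) for measured, predicted in totals.values()]


# (trip id, measured kWh, predicted kWh); errors of both signs per trip.
segments = [
    ("A", 0.10, 0.12), ("A", 0.10, 0.08), ("A", 0.10, 0.11),
    ("B", 0.20, 0.18), ("B", 0.20, 0.23),
]
segment_level = mae_over_mean([(m, p) for _, m, p in segments])
trips = recombine_to_trips(segments)
trip_level = mae_over_mean(trips)
```

Because over- and underestimates within a trip partly offset each other, `trip_level` is smaller than `segment_level` in this example, in line with the trend between Table 4 and Table 5.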

Table 5. Performance indicators for the energy prediction model and the two benchmark models for
comparison on trip level.

Dataset      Performance    NN–MLR        NN            Average Consumption    <E> (kWh)
             Indicator      Prediction    Prediction    Prediction
Dataset 1    RMSE (kWh)     0.605         0.539         0.471                  2.4
             MAE (kWh)      0.335         0.364         0.316
             MAE/<E>        0.14          0.15          0.13
Dataset 2    RMSE (kWh)     0.142         0.178         0.178                  0.78
             MAE (kWh)      0.0917        0.119         0.119
             MAE/<E>        0.12          0.15          0.15

Figure 10. Regression plot for the NN–MLR prediction of the energy consumption on the trips in
Dataset 2.

To evaluate how much of the prediction error in the NN–MLR is attributed to the MLR,
the measured values of the CMF and AF can be inserted into the MLR, instead of the values predicted
by the NN stage. Table 6 shows the performance indicators on segment and trip level for
Dataset 1 and Dataset 2, in the case the measured values of the AF and CMF are the inputs to the MLR.
The results show a MAE of 7.1% and 8.5% of the mean energy consumed per trip for Dataset 1 and
Dataset 2, respectively. This represents approximately half of the mean error created by the NN–MLR.
It is important to observe the high value of MAE/<E> for the estimation on the segments in Dataset 2.
The large error for short segments demonstrates the incapability of the MLR to estimate the
instantaneous power consumption and the need for aggregation of data points.
Table 6. Performance indicators of the MLR energy estimation model on trip and segment level,
considering the measured CMF and aerodynamic predictor as input.

Indicator                     Dataset 1                      Dataset 2
                              RMSE      MAE       MAE/<E>    RMSE      MAE       MAE/<E>
MLR estimation for segment    0.0350    0.0231    0.15       0.0146    0.0133    0.39
MLR estimation for trip       0.275     0.170     0.071      0.465     0.0663    0.085

The results presented here are averages for the prediction on segments originating from a random
selection from all road segments in the datasets, which are not limited to a specific selection of an
area in the road network or specific road types (such as highway or arterials), as is often done in
literature [29]. The NN–MLR has a better overall performance than the two benchmark models and
has several other advantages. It performs significantly better when diverse conditions that influence
energy consumption and driving behavior are present. This illustrates the power of a NN in the
prediction of these non-linear systems. If the energy consumption is close to the average energy
consumption (in less diverse conditions), the NN–MLR loses part of its advantage compared to the
average consumption model, because, by cascading both models, the error produced in the NN is
propagated in the MLR. However, cascading both models grants more flexibility and preserves the link
with the underlying physical relationships. This facilitates interpretation of (causal) relations between
the inputs and outputs of the model.

4. Conclusions
This paper presents a data-driven energy consumption prediction method for EVs, suited for
energy-efficient routing. It uses a cascade of a NN and a linear regression model. The MLR model is
used to estimate the energy consumption, given a number of predictor variables, while the NN serves
to predict the unknown predictor variables (inputs) of the MLR. The proposed method predicts the
energy consumption on the individual segments of the road network, allowing a cost allocation to each
link in the road network, so cost-optimization algorithms can define energy-efficient routes. The MLR
is performed on smaller parts of trips (segments) to capture more variability in the data. It was decided
to segment the trips based on the actual road segments in the network instead of an arbitrary division
in order to allocate driving parameters to the road characteristics of the segments. The NN is trained
to predict the speed profile, here translated in an AF and CMF (representing accelerations), from road-,
traffic-, and weather-related attributes. It is the cascade of first the NN for speed profile prediction
and thereafter the regression for energy estimation that forms the proposed energy prediction model.
To evaluate its performance, the proposed NN–MLR model is compared to two benchmark models.
A first benchmark model is a NN that directly predicts the energy consumption from the road and
traffic related attributes, omitting the regression part, and the second benchmark model is a simple
estimation calculated with the total average consumption. The NN–MLR prediction has an overall
better performance than both benchmark models. In a dense urban environment, subject to more
diverse conditions, the NN–MLR prediction has a significantly better performance than the other
two. When recombined to trips, the performance increases as errors on segments are symmetrically
distributed and partly cancel each other out. For the proposed complete energy prediction model,
approximately half of the total error can be allocated to the NN prediction of the CMF and AF.
The proposed NN–MLR has a MAE that is 12–14% of the average trip consumption of which only
7–9% is caused by the MLR energy estimation itself. These results are averages for the prediction
originating from a random selection from all road segments in the datasets, and are not limited to a
specific selection of an area in the road network or specific road types, as is often done in literature.
The model results show it is, on average, able to predict the energy consumption more precisely
than an average consumption model. It distinguishes different energy consumption influencing factors
per road segment (such as road characteristics, weather, altitude differences), making this approach
suited for energy consumption prediction for any given road in the network prior to departure and
enables cost-optimization algorithms to calculate energy-efficient routes. Furthermore, by separating
the model in a stage for the prediction of the speed profile and a stage for the energy consumption
estimation (MLR), it benefits from the power and flexibility of data mining techniques, while preserving
the interpretability of the results because of the preserved link with the underlying physical model.
The data-driven approach allows this method to be easily applied to other EVs and allows for the
developed model to be easily updated over time to adjust to changing conditions.
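Once an energy cost is allocated to each link of the road network, a standard shortest-path search can return the route with the lowest predicted consumption. The sketch below uses Dijkstra's algorithm as one possible cost-optimization algorithm; the graph, its node names and its energy costs are hypothetical. Dijkstra's algorithm assumes non-negative link costs, so links with a net energy recovery (negative cost) would require a different algorithm.

```python
import heapq


def min_energy_route(graph, source, target):
    """Return (route, energy) with the lowest summed link energy.

    `graph` maps a node to a list of (neighbor, energy_kwh) links.
    Assumption: all link energies are non-negative.
    """
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # outdated queue entry
        for neighbor, energy in graph.get(node, []):
            if energy < 0:
                raise ValueError("negative link costs are not supported")
            new_cost = cost + energy
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if target not in best:
        return None, float("inf")
    route = [target]
    while route[-1] != source:
        route.append(previous[route[-1]])
    route.reverse()
    return route, best[target]


# Hypothetical network with predicted energy per link in kWh.
network = {
    "A": [("B", 0.30), ("C", 0.10)],
    "B": [("D", 0.20)],
    "C": [("B", 0.15), ("D", 0.50)],
}
route, energy = min_energy_route(network, "A", "D")
```

In this example the route A, C, B, D is selected, since its summed link energy is lower than that of the two shorter alternatives.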

Acknowledgments: The authors would like to acknowledge: the Agency for Innovation by Science and Technology
in Flanders (IWT) as the funder for the PhD grant of the first author; Punch Powertrain for their contribution in
providing data; and Flanders Make for the support to our team.
Author Contributions: Cedric De Cauwer processed the data, performed the data-analysis, developed the
architecture of the proposed model and wrote the paper; Wouter Verbeke contributed to the data-analysis and the
development of the machine learning part of the model; Saphir Faid provided part of the vehicle monitoring data
and reviewed the writing; Thierry Coosemans and Joeri Van Mierlo provided guidance in the data-analysis and
reviewed the writing.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Messagie, M.; Boureima, F.S.; Coosemans, T.; Macharis, C.; van Mierlo, J. A range-based vehicle life cycle
assessment incorporating variability in the environmental assessment of different vehicle technologies and
fuels. Energies 2014, 7, 1467–1482. [CrossRef]
2. European Alternative Fuel Observatory. Available online: http://www.eafo.eu/ (accessed on 24 March 2017).
3. Rezvani, Z.; Jansson, J.; Bodin, J. Advances in consumer electric vehicle adoption research: A review and
research agenda. Transp. Res. D Transp. Environ. 2015, 34, 122–136. [CrossRef]
4. Grunditz, E.; Thiringer, T. Performance Analysis of Current BEVs—Based on a Comprehensive Review of
Specifications. IEEE Trans. Transp. Electr. 2016, 2, 270–289. [CrossRef]
5. De Cauwer, C.; Maarten, M.; Coosemans, T.; van Mierlo, J.; Heyvaert, S. Electric vehicle use and energy
consumption based on real-world electric vehicle fleet trip and charge data and its impact on existing EV
research models. In Proceedings of the EVS28, International Electric Vehicle Symposium and Exhibition,
Kintex, Korea, 3–6 May 2015.
6. Laurikko, J.; Granstrom, R.; Haakana, A. Realistic estimates of EV range based on extensive laboratory and
field tests in Nordic climate conditions. In Proceedings of the 2013 World Electric Vehicle Symposium and
Exhibition (EVS27), Barcelona, Spain, 18–20 October 2013.
7. De Vroey, L.; Jahn, R.; el Baghdadi, M.; van Mierlo, J. Plug-to-wheel energy balance—Results of a two
years experience behind the wheel of electric vehicles. In Proceedings of the EVS27 International Battery,
Hybrid and Fuel Cell Electric Vehicle Symposium 11, Barcelona, Spain, 18–20 October 2013.
8. Nilsson, M. Electric Vehicles: The Phenomenon of Range Anxiety. 2011. Available online:
http://e-mobility-nsr.eu/fileadmin/user_upload/downloads/info-pool/the_phenomenon_of_range_anxiety_elvire.pdf
(accessed on 24 April 2017).
9. Sentoff, K.M.; Aultman-Hall, L.; Holmén, B.A. Implications of driving style and road grade for accurate
vehicle activity data and emissions estimates. Transp. Res. D Transp. Environ. 2015, 35, 175–188. [CrossRef]
10. Yao, E.; Yang, Z.; Song, Y.; Zuo, T. Comparison of electric vehicle’s energy consumption factors for different
road types. Discret. Dyn. Nat. Soc. 2013, 2013, 328757. [CrossRef]
11. Bar, T.; Nienhuser, D.; Kohlhaas, R.; Zollner, J.M. Probabilistic driving style determination by means of a
situation based analysis of the vehicle data. In Proceedings of the 2011 14th International IEEE Conference
on Intelligent Transportation Systems (ITSC), Washington, DC, USA, 5–7 October 2011; pp. 1698–1703.
12. Ellison, A.B.; Greaves, S.P.; Bliemer, M.C.J. Driver behaviour profiles for road safety analysis. Accid. Anal. Prev.
2015, 76, 118–132. [CrossRef] [PubMed]
13. Wu, X.; He, X.; Yu, G.; Harmandayan, A.; Wang, Y. Energy-Optimal Speed Control for Electric Vehicles on
Signalized Arterials. IEEE Trans. Intell. Transp. Syst. 2015, 16, 2786–2796.
14. Yuksel, T.; Michalek, J.J. Effects of Regional Temperature on Electric Vehicle Efficiency, Range, and Emissions
in the United States. Environ. Sci. Technol. 2015, 49, 3974–3980. [CrossRef] [PubMed]
15. Kambly, K.R.; Bradley, T.H. Estimating the HVAC energy consumption of plug-in electric vehicles.
J. Power Sources 2014, 259, 117–124. [CrossRef]
16. De Cauwer, C.; van Mierlo, J.; Coosemans, T. Energy Consumption Prediction for Electric Vehicles Based on
Real-World Data. Energies 2015, 8, 8573–8593. [CrossRef]
17. Wager, G.; McHenry, M.P.; Whale, J.; Bräunl, T. Testing energy efficiency and driving range of electric vehicles
in relation to gear selection. Renew. Energy 2014, 62, 303–312. [CrossRef]
18. Dib, W.; Chasse, A.; Moulin, P.; Sciarretta, A.; Corde, G. Optimal energy management for an electric vehicle
in eco-driving applications. Control Eng. Pract. 2014, 29, 299–307. [CrossRef]
19. Shankar, R.; Marco, J. Method for estimating the energy consumption of electric vehicles and plug-in hybrid
electric vehicles under real-world driving conditions. Intell. Transp. Syst. IET 2013, 7, 138–150. [CrossRef]
20. Badin, F.; le Berr, F.; Castel, G.; Pasquier, M. Energy efficiency evaluation of a Plug-in Hybrid Vehicle under
European procedure, Worldwide harmonized procedure and actual use. In Proceedings of the EVS28
International Electric Vehicle Symposium and Exhibition, Kintex, Korea, 3–6 May 2015.
21. Neaimeh, M.; Hill, G.A.; Hübner, Y.; Blythe, P.T. Routing systems to extend the driving range of electric
vehicles. Intell. Transp. Syst. IET 2013, 7, 327–336. [CrossRef]
22. Wang, J.; Besselink, I.; Nijmeijer, H. Electric vehicle energy consumption modelling and prediction based on
road information. In Proceedings of the EVS28 International Electric Vehicle Symposium and Exhibition,
Kintex, Korea, 3–6 May 2015.
23. Ondruska, P.; Posner, I. Probabilistic attainability maps: Efficiently predicting driver-specific electric vehicle
range. In Proceedings of the 2014 IEEE Intelligent Vehicles Symposium, Dearborn, MI, USA, 8–11 June 2014;
pp. 1169–1174.
24. Karbowski, D.; Pagerit, S.; Calkins, A. Energy consumption prediction of a vehicle along a user-specified
real-world trip. In Proceedings of the 26th EVS international Electric Vehicle Symposium and Exhibition,
Los Angeles, CA, USA, 6–9 May 2012; Volume 3, pp. 2081–2092.
25. Zhang, R.; Yao, E. Electric vehicles’ energy consumption estimation with real driving condition data.
Transp. Res. D Transp. Environ. 2015, 41, 177–187. [CrossRef]
26. Lee, T.K.; Filipi, Z.S. Synthesis and validation of representative real-world driving cycles for plug-in hybrid
vehicles. In Proceedings of the 2010 IEEE Vehicle Power and Propulsion Conference (VPPC), Lille, France,
1–3 September 2010.
27. Wang, H.; Zhang, X.; Ouyang, M. Energy consumption of electric vehicles based on real-world driving
patterns: A case study of Beijing. Appl. Energy 2015, 157, 710–719. [CrossRef]
28. Grubwinkler, S.; Hirschvogel, M.; Lienkamp, M. Driver- and situation-specific impact factors for the energy
prediction of EVs based on crowd-sourced speed profiles. In Proceedings of the 2014 IEEE Intelligent Vehicles
Symposium, Dearborn, MI, USA, 8–11 June 2014; pp. 1069–1076.
29. Boriboonsomsin, K.; Barth, M.J.; Zhu, W.; Vu, A. Eco-Routing Navigation System Based on Multisource
Historical and Real-Time Traffic Information. IEEE Trans. Intell. Transp. Syst. 2012, 13, 1694–1704. [CrossRef]
30. Jiménez, F.; Cabrera-Montiel, W. System for road vehicle energy optimization using real time road and traffic
information. Energies 2014, 7, 3576–3598. [CrossRef]
31. Coosemans, T.; Lebeau, K.; Macharis, C.; Lievens, B.; van Mierlo, J. Living labs for electric vehicles in
Flanders. World Electr. Veh. J. 2012, 5, 1005–1010.
32. Vlaamse Proeftuin Elektrische Voertuigen. Available online:
http://proeftuin-ev.be/?_ga=1.210822173.940857972.1470907495 (accessed on 1 November 2016).
33. Home|Punchpowertrain. Available online: http://www.punchpowertrain.com/ (accessed on 1 November 2016).
34. Asamer, J.; Graser, A.; Heilmann, B.; Ruthmair, M. Sensitivity analysis for energy demand estimation of
electric vehicles. Transp. Res. D Transp. Environ. 2016, 46, 182–199. [CrossRef]
35. Ejsmont, J.; Sjögren, L.; Świeczko-Żurek, B.; Ronowski, G. Influence of Road Wetness on Tire-Pavement
Rolling Resistance. J. Civ. Eng. Archit. 2015, 9, 1302–1310.
36. Wang, J.; Besselink, I.; Nijmeijer, H. Online prediction of battery electric vehicle energy consumption.
In Proceedings of the 29th EVS international Electric Vehicle Symposium and Exhibition, Montréal, QC,
Canada, 19–22 June 2016.
37. André, M.; Keller, M.; Sjödin, Å.; Gadrat, M. The Artemis European tools for estimating the transport
pollutant emissions. In Proceedings of the 18th International Emission Inventories Conference, Baltimore,
MD, USA, 14–17 April 2009.
38. Bishop, C.M. Neural Networks for Pattern Recognition; Oxford University Press: Oxford, UK, 1995.
39. Zhang, R.; Shu, Y.; Yang, Z.; Cheng, P.; Chen, J. Hybrid Traffic Speed Modeling and Prediction Using
Real-World Data. In Proceedings of the 2015 IEEE International Congress on Big Data, BigData Congress,
New York, NY, USA, 27 June–2 July 2015; pp. 230–237.
40. Walker, G.; Calvert, M. Driver behaviour at roadworks. Appl. Ergon. 2015, 51, 18–29. [CrossRef] [PubMed]
41. Advanced Powertrain Research Facility. Nissan Leaf Testing and Analysis; Advanced Powertrain Research
Facility: Argonne, IL, USA, 2012.

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
