Multivariate Short-Term Traffic Flow Prediction Based On Real-Time Expressway Toll Plaza Data Using Non-Parametric Techniques

32 Int. J. Vehicle Information and Communication Systems, Vol. 7, No. 1, 2022
Annu Mor* and Mukesh Kumar

University Institute of Engineering and Technology,
Panjab University,
Chandigarh, Punjab, India
Email: annu_mor@pu.ac.in
Email: mukesh_rai9@pu.ac.in
*Corresponding author
Abstract: Accurate real-time traffic flow prediction is a vital component of an
Intelligent Transportation System (ITS). Real-time traffic flow prediction
helps transportation authorities as well as travellers with better route guidance.
In this study, a novel approach is proposed for accurate toll plaza traffic
prediction by introducing heterogeneous data sources in addition to traffic volume
data. Toll data is analysed together with exogenous factors such as weather
conditions and holidays. Ten non-parametric techniques are applied for traffic
prediction on a real-time multivariate data set. The proposed approach is
validated using data collected from the Pinjore-Kalka Toll Plaza, Chandigarh,
India. The performance of the non-parametric models is compared on the
basis of mean square error, absolute mean square error, coefficient of
determination and correlation. The experimental results reveal that the random
forest regression technique outperforms the other techniques and achieves an
accuracy of 90%. The proposed approach can further be used as a proxy measure
of Level of Service (LOS) to use the existing infrastructure more efficiently
in smart city applications.
Keywords: traffic flow; ITS; intelligent transportation system; non-parametric
technique; multivariate time series data set; proxy measure level of service.
Reference to this paper should be made as follows: Mor, A. and Kumar, M.
(2022) ‘Multivariate short-term traffic flow prediction based on real-time
expressway toll plaza data using non-parametric techniques’, Int. J. Vehicle
Information and Communication Systems, Vol. 7, No. 1, pp.32–50.
Biographical notes: Annu Mor is a research scholar pursuing a PhD at UIET,
Panjab University, Chandigarh. She received her Master’s in Engineering
(2012) and Bachelor of Engineering (2010) in Computer Science and
Engineering. She is GATE qualified and has authored many peer-reviewed research
papers in international conferences and journals. Her research interests are
artificial intelligence, image processing and data mining.
Mukesh Kumar is an Associate Professor and received his Doctorate of
Philosophy in 2014 from UIET Panjab University, Chandigarh. He received his
Masters in Technology (2004) and Bachelors of Technology (2002) in
Computer Science and Engineering. He has authored over 25 peer reviewed
research papers in international conferences and journals. His research interests
are data mining, algorithms and soft computing.
Copyright © 2022 Inderscience Enterprises Ltd.

1 Introduction
In the era of urbanisation and economic development, the number of vehicles is
growing exponentially and the existing road infrastructure cannot handle such a large
number of vehicles. To relieve congestion, two approaches can be considered. One is
to expand the existing infrastructure, which is seldom feasible because resources such
as usable land and operating budget are limited. The other is to make better use of
the existing infrastructure through various control strategies (Lécué et al., 2014).
These control strategies include short-term traffic prediction, which has been an active
research area in Intelligent Transportation Systems (ITS) for decades. Traffic prediction
provides insightful information for designing traffic planning rules and for enhancing
traffic safety and traffic control.
Owing to the stochastic, time-varying characteristics of traffic flow and travel time,
accurately predicting traffic flow is a difficult task. Short-term traffic forecasting
depends primarily on traffic data. Widely deployed traffic detectors now provide data
quickly and have enabled many traffic prediction studies based on this data source.
These studies rely mainly on stationary sensors (inductive loop, camera) (Bezuglov and
Comert, 2016), floating car data (Ma et al., 2015), GSM probe data (Liu et al., 2014) or
toll plaza data to forecast travel speed, travel time or traffic flow over short-term
durations (Bharti et al., 2017; Su et al., 2018).
For accurate traffic prediction, various time series models have been proposed.
Traffic prediction models fall broadly into two approaches: data-driven and
less-data-demanding. Data-driven approaches include methods such as Bayesian Network
(Xu et al., 2014a), regression (Lu and Shladover, 2014), Artificial Neural Network
(Zhang et al., 1998), Random Forest (Guo et al., 2018), Decision Tree (Xu et al., 2014b)
and Deep Learning (Polson and Sokolov, 2017; Zhang et al., 2017; Wu et al., 2018), which
try to capture the relationship between the available data and the corresponding output.
Less-data-demanding approaches include Kalman Filter (Zhao et al., 2017), Grey Model
(Smith and Demetsky, 1997), ARIMA (Autoregressive Integrated Moving Average)
(Badhrudeen et al., 2016) and SARIMA (Seasonal Autoregressive Integrated Moving
Average) (Kumar and Vanajakshi, 2015). As traffic flow in urban environments generally
exhibits spatiotemporal behaviour characterised by non-uniform patterns, it is
challenging for a single prediction method to capture such dynamics. Traffic forecasting
models are gradually shifting from less-data-demanding to data-driven or Computational
Intelligence (CI) approaches (Vlahogianni et al., 2014). CI approaches are more robust
to outliers, noise and missing data. They handle complex data without prior knowledge
of the problem, learn patterns from data and can recognise relationships in traffic
flow even when these are not readily detectable (Chakroborty et al., 2016).
In recent years, a large number of short-term prediction models have been proposed for
highways, interstates and cities using different traffic sensor data (Xu et al., 2014a;
Lu and Shladover, 2014; Zhang et al., 1998; Guo et al., 2018; Xu et al., 2014b; Polson
and Sokolov, 2017; Zhang et al., 2017; Wu et al., 2018; Christopher et al., 2016). In
developing countries like India, little attention has been paid to toll plazas, which
still need to be explored for traffic flow prediction. Toll plazas are required to
generate revenue for the maintenance and enhancement of surface transportation. However,
toll plazas hinder the smooth flow of traffic, a situation that needs to be
addressed to minimise congestion (Wang et al., 2017). An improperly designed toll
plaza causes inconvenience to travellers and reduces mobility at various locations
(Zhu et al., 2017). Traffic flow prediction at toll plazas, which is still largely
unexplored, helps to design effective traffic guidance strategies (Kim et al., 2016).
Toll plaza design includes the number of toll lanes, the merging patterns and the
service provided in each toll lane (Antônio et al., 2016). These characteristics help
to reduce the accident rate, road construction and operational costs, and to enhance
traffic capacity.
Recent studies have primarily focused on microscopic simulation models that use
simulated data for discrete events based on traffic flow theory (Bartin, 2018; Abadi
et al., 2015). On the other hand, most traditional prediction models are based on a
single data type such as volume, speed, travel time or occupancy. The information they
provide is not sufficient for decision making by the public or traffic administration,
especially in the Indian context, because of heterogeneous vehicle types and limited
data resources.
Different machine learning methods use different approaches to learn relationships
from a training data set (Christopher et al., 2016). No single prediction model gives
the best results for all traffic variables under non-recurrent traffic conditions
(traffic caused by holidays, events and weather conditions) at a particular location
(Kim and Wang, 2016).
The short-term traffic forecast involves the prediction of traffic flow over a time
horizon of a few seconds to a few hours into the future using current and historical
measurements of traffic variables (Habtemichael and Cetin, 2016; Qi and Ishak, 2014;
Guo and Williams, 2010). Previous studies have compared ANN with the grey model
(Badhrudeen et al., 2016) on daytime data, but a comparison of different non-parametric
techniques with each other for short-term traffic prediction seems to be missing, as
does the use of heterogeneous data sources for traffic prediction. In this paper, we
focus on one particular problem: how to choose among traffic prediction models based on
non-parametric approaches for a multivariate data set collected from different sources?
Multivariate traffic flow prediction is essential to capture traffic trends and to
design traffic rules over a certain future period for advanced traveller information
systems. Based on the discussion above, the contributions of this study are the
following:
• A real-time data-based approach is proposed for traffic flow prediction under
the influence of exogenous factors such as weather conditions and holidays.
• Traffic similarity patterns are analysed for day and night durations to examine the
phenomenon known as heteroskedasticity (Fosgerau and Fukuda, 2012), including
holidays, for Indian scenarios.
• An appropriate time-series interval is selected for the short-term traffic prediction
approach with respect to geometry.
• Different non-parametric techniques are compared, which helps in making a suitable
choice of prediction approach.
• An overall accuracy of 90% is achieved for multivariate short-term traffic flow
prediction using the random forest regression technique.
The remainder of this paper is organised as follows: the related work on short-term
traffic flow prediction is reviewed first, followed by the proposed framework and the
data set used, and then the results and performance assessment metrics. Finally,
conclusions are discussed.
Table 1 Summary of previous research studies related to prediction models for traffic flow
| References | Techniques | Model type | Variables: traffic parameter | Input data: data set | Input data: duration | Context | Prediction duration (min) | Prediction step |
|---|---|---|---|---|---|---|---|---|
| Wu et al. (2018) | DNN | Non-parametric | Traffic flow | PeMS | 14 months | Inter-state | 15 | 1 |
| Zhao et al. (2017) | Deep learning (LSTM) | Non-parametric | Traffic volume, lane occupancy, average velocity | Cameras, induction coils, velocity radars | 6 months | City | 15 | 1 |
| Polson and Sokolov (2017) | CNN | Non-parametric | Vehicle type | Camera data | 300 images | Single lane | 10 | 1 |
| Zhang et al. (2017) | Deep learning (ST-ResNet) | Non-parametric | Traffic volume, weather, event | TaxiBJ, BikeNYC | 3 years and 5 months | City | 15 | multi |
| Smith and Demetsky (1997) | Grey level | Parametric | Traffic speed, travel time | Loop data set (data aggregation to 1 min) | 24 days | Highway | 10 | multi |
| Kim and Wang (2016) | Bayesian Network | Non-parametric | Time-of-day, incident, weather, traffic state | Loop detector with 3-min duration | 608 days | Highway | 5 | multi |
| Badhrudeen et al. (2016) | ANN, grey model | Both | Traffic volume, traffic speed | Detector | 9 days | Highway | 10 | 1 |
| Habtemichael and Cetin (2016) | KNN, SARIMA, Kalman Filter | Both | Traffic volume | Detector/sensors, 12 data sets | 12 months and 15 days | Highway | 15 | 1 |
| Qi and Ishak (2014) | Hidden Markov model (HMM) | Parametric | Speed data | Detector | 5-min | Inter-state | 10 | 1 |
| Guo and Williams (2010) | ARIMA, Holt-Winters | Parametric | Traffic volume | Loop detector | Simulation data, observed data | Junction | 15 | multi |
| Kumar and Vanajakshi (2015) | SARIMA | Parametric | Traffic volume | Collect-R camera | 3 days with 1-min duration | 3-lane arterial roadway | 15 | 1 |
| Habtemichael and Cetin | KNN | Non-parametric | Delay time, congestion level | Floating car data, GSM probe data, stationary sensors (loop, camera) | 7 months | Road segment | 15 | multi |
2 Related work
Traffic flow prediction has been a very active area in Intelligent Transportation Systems
(ITS) for the last few decades. The short-term traffic forecast involves the prediction of
traffic flow over a time horizon of a few seconds to a few hours into the future using
current and historical measurements of traffic variables. In general, traffic forecasting
models fall into two categories: parametric and non-parametric. A summary of different
prediction models from both categories is given in Table 1.
Previous studies have concluded that prediction models in intelligent transportation
systems are shifting towards data-intensive machine learning models. Different machine
learning models use different strategies to find relationships in the training data set,
and their performance differs across scenarios. Regression techniques are a suitable
approach for traffic prediction because of their simple, interpretable structure and
lower data demand. This study therefore applies regression techniques to toll plaza data
together with external factors that affect traffic volume, such as weather conditions
and holidays.
3 Proposed framework
3.1 Data set

The toll plaza sensor data contains the total volume by vehicle type, mode of payment,
vehicle entry time and vehicle details at 1-min resolution. The traffic time series is
decomposed into 15-min intervals, because the related work shows that traffic flow
patterns become stable at this duration. The first ten days of data (1st to 10th July
2016) are used for training and the next four days (11th to 14th July) for model
testing. In this study, the toll plaza data contains day and night entries. The
collected traffic data is analysed for non-recurrent traffic scenarios including
holidays as well as weather conditions; a description of the data is given in Table 2.
Table 2 Description of data set
| Toll plaza data | | Exogenous factors | |
|---|---|---|---|
| Data type | Sensor data | Holiday | 1 |
| Location | Pinjore-Kalka | Weather condition | 1–3 (1=Rain, 2=Sunny, 3=Cloudy) |
| | | Temperature (°C) | 24–40 |
| | | Wind speed (mph) | 6–13 |
| | | Humidity (%) | 60–100 |
The time series is decomposed into n sampled data points per 24-hour period. Suppose
traffic flow is sampled on N consecutive days with a sample time interval of i = 15 min;
then n = (60/15) × 24 = 96 points per day (96 × 14 = 1344 points over the 14 days), and
the data can be written as a series of 1-D vectors:
$$D_{i1} = [D_{i1}(1), D_{i1}(2), \ldots, D_{i1}(n)] \tag{1}$$
$$D_{iN} = [D_{iN}(1), D_{iN}(2), \ldots, D_{iN}(n)] \tag{2}$$
where N = number of days, i = time interval (15 min), D_{i1} = traffic volume series for
day one, D_{i1}(1) = traffic volume during the first time interval of day one,
D_{i1}(2) = traffic volume during the next time interval and D_{iN} = traffic volume
series for day N.
Traffic varies with the current time of day as well as the day of the week. In addition,
the total traffic passing through the individual lanes, with all vehicle types treated
as homogeneous, can be estimated as:
$$T_f(t) = \sum_{L=1}^{m} X_L(t) \tag{3}$$
where T_f(t) = total traffic flow and X_L(t) = number of vehicles that passed through
lane L, summed over the m lanes.
The combined 1-minute data were aggregated to 15-minute intervals.
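The lane summation of equation (3) and the 1-minute to 15-minute aggregation can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the lane counts are made up, and the data is assumed to arrive as one list of per-minute counts per lane.

```python
def lane_total(per_lane_counts):
    """Equation (3): total flow at one time step, summed over all lanes."""
    return sum(per_lane_counts)


def aggregate_to_intervals(minute_counts, interval=15):
    """Sum consecutive 1-minute totals into fixed-length intervals.

    A trailing partial interval is kept as its own (shorter) bin.
    """
    return [
        sum(minute_counts[start:start + interval])
        for start in range(0, len(minute_counts), interval)
    ]


# Hypothetical data: 30 minutes of counts for three lanes (made-up numbers).
lanes = [[2] * 30, [1] * 30, [3] * 30]
per_minute = [lane_total(step) for step in zip(*lanes)]   # 6 vehicles each minute
flow_15min = aggregate_to_intervals(per_minute)
print(flow_15min)  # [90, 90]
```

With 15-minute bins, a full day yields 96 values, which is the length of each daily vector in equations (1) and (2).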
3.2 Data integration and processing

Data filtering is indispensable for accurate and reliable prediction results. Multivariate
traffic data contains many missing values, which must be addressed first. It is vital to
focus not only on prediction accuracy but also on the reliability of prediction
performance when data is missing. The following steps are carried out for data smoothing:
1) Missing data value imputation: As raw traffic data contains random and missing
values, the data is smoothed first. In this study, the median method is used to
impute the missing values.
$$T_i(n) = \frac{n + 1}{2} \tag{4}$$
where T_i(n) is the position of the median of attribute i among its sorted values and n
is the number of data points in the set.
2) Data normalisation: Real-time toll plaza traffic volume is combined with exogenous
variables (temperature, humidity, visibility, barometer and holidays) collected from
the Internet. The whole data set is then normalised to the range [0, +1]. Scaling is
done to prevent attributes with wider numeric ranges from dominating those with
smaller numeric ranges.
$$P_i = \frac{W_i}{\sum_{i=1}^{n} W_i} \tag{5}$$
where n = total number of attributes, W_i = weight for one particular attribute and
i = one particular attribute.
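The two pre-processing steps can be illustrated with a short sketch. Median imputation follows the text directly. For the scaling step the sketch uses min-max scaling, which is one common way to reach the range [0, 1] and is an assumption here, since the paper does not spell out the exact procedure. The function names and readings are hypothetical.

```python
from statistics import median


def impute_median(values):
    """Replace missing entries (None) with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


humidity = [60, None, 80, 100, None]      # made-up readings with two gaps
filled = impute_median(humidity)          # [60, 80, 80, 100, 80]
scaled = min_max_scale(filled)            # [0.0, 0.5, 0.5, 1.0, 0.5]
print(filled, scaled)
```

The median is preferred over the mean in this kind of sketch because a few extreme readings do not shift it.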
Traffic flow trends are affected by holidays: the holiday (6 July, Eid-ul-Fitr) has a
higher traffic volume than a normal day, as shown in Figure 1.
Furthermore, some other traffic trends can be inferred. The traffic volume follows a
similar pattern from day to day because of regular daily travel; it is well known that
road traffic exhibits strong cyclic patterns with morning and evening peak hours.
However, traffic conditions in the peak periods are more dynamic than those in the
non-peak periods, a phenomenon known as heteroskedasticity (Fosgerau and Fukuda, 2012).
Traffic flow on weekends is much higher than on weekdays. Heavy vehicles cause traffic
bottlenecks during night hours, as there is no separate lane for heavy vehicles or for
emergency vehicles such as ambulances and defence vehicles. Different traffic
similarity patterns are analysed for day and night durations for the two-week data set,
as shown in Figure 2.
Figure 1 Traffic flow on weekends with holiday or normal data
Figure 2 Two-week traffic flow pattern similarity for 15-min time intervals over 24 hours
• Non-parametric-based framework for prediction traffic (NPBF-PT): In this study, a
framework is proposed for short-term traffic flow prediction. It is composed of five
steps: Data Collection, Data Pre-processing, Data Integration, Model Selection and
Performance Evaluation, as shown in Figure 3. In the final step, the prediction
methods are analysed using mean square error, absolute mean square error, coefficient
of determination and correlation. Five-fold cross-validation is performed to evaluate
the stability of the methods.
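The five-fold cross-validation mentioned in the framework can be sketched as an index split. The paper does not say how the folds were formed, so contiguous, unshuffled folds are assumed here (a natural choice for time-ordered data), and the function name is hypothetical.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous folds of near-equal size."""
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if f < n_samples % k else 0) for f in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(12, k=5))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

Each model would be fitted on `train_idx` and scored on `test_idx`, and the five scores averaged to judge stability.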
• Model development: Deciding in advance which regression technique would give the best
results was difficult, so different regression techniques were applied to find which
one best suits the multivariate data set. A brief description of all techniques
applied in this study is given below:
• Gradient descent: Here, gradient descent denotes linear regression fitted by gradient
descent, an optimisation algorithm that minimises the error of a model on the
training data (Stearns et al., 2017). The minimum is approached iteratively by
taking steps in the negative direction of the gradient of the given function (Jordan
et al., 2006). The derivative is computed from the training sample points before
each update, and the step size is controlled by a configurable learning rate, alpha
(Raza and Zhong, 2018).
$$P_i = P_i - \frac{a}{n} \sum_{i=1}^{n} h(\cdot) \tag{6}$$
where P_i = predicted value for a particular data sample, n = total number of data
points, a = learning rate and h(·) = function gradient.
• K-nearest neighbour (KNN): KNN is used for regression as well as classification.
It works on a similarity measure over the search space and assigns a label to an
unlabelled object based on its nearest neighbours (Habtemichael and Cetin, 2016).
The two important parameters affecting the performance of KNN are the number of
neighbours and the distance measure between data points (Guo et al., 2017).
Figure 3 Short-term traffic flow prediction framework based on non-parametric techniques
Candidate distance measures include correlation distance and weighted Euclidean distance;
the Euclidean distance is the one mostly used to measure the distance between data
points, as shown in equation (7). A small number of neighbours is enough to give good
results (Myung et al., 2011). KNN is poorly suited to sparse data sets in which most
features are zero, and its performance is also very low when the number of features is
large (a hundred or more). The basic parameters used for KNN are historical values,
number of neighbours and prediction function (Jiber et al., 2018).
$$D(t+1) = \frac{1}{k} \sum_{j=1}^{K} \frac{D_j(t+1) / Dist_j}{\sum_{j=1}^{K} 1 / Dist_j} \tag{7}$$
where Dist_j is the distance to the j-th neighbour, K = number of neighbours and
D_j(t+1) = value of the j-th neighbour at the next time step.
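A minimal KNN regression sketch is given below. It uses Euclidean distance and the standard normalised inverse-distance weighting of the neighbours' next-step values, which is an assumption about the intended prediction function. The function name and the history values are hypothetical.

```python
import math


def knn_predict(history, query, k=5):
    """Predict the next value as an inverse-distance weighted neighbour average.

    history: list of (feature_vector, next_value) pairs from past observations.
    """
    nearest = sorted(
        (math.dist(features, query), value) for features, value in history
    )[:k]
    # An exact match has zero distance; return its value to avoid dividing by zero.
    for distance, value in nearest:
        if distance == 0.0:
            return value
    weights = [1.0 / distance for distance, _ in nearest]
    weighted_sum = sum(w * value for w, (_, value) in zip(weights, nearest))
    return weighted_sum / sum(weights)


# Made-up history: one feature (e.g. flow in the current interval) and the next flow.
history = [([1.0], 10.0), ([2.0], 20.0), ([4.0], 40.0), ([10.0], 100.0)]
print(knn_predict(history, [3.0], k=2))   # 30.0
```

In the example the two nearest neighbours are equally distant, so the prediction is their plain average.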
• Ridge regression: Ridge Regression (RR) is a regularised form of linear regression.
It reduces over-fitting by shrinking the regression coefficients (Hoerl and Kennard,
1970). It can be used with Kernel Methods (KMs) for traffic forecasting and is well
suited to data sets with multicollinearity (Haworth et al., 2014). When
multicollinearity occurs, the least-squares estimates are unbiased but their
variances are large, so they may be far from the true value (Li et al., 2014). Ridge
regression can be combined with different base prediction models for more accurate
and stable results, which is known as the ensemble learning approach (Zhan et al.,
2018). The basic notation used for ridge regression is
$$Z = XA + e \tag{8}$$
where Z = dependent variable, X = independent variable, A = regression coefficients to
be learned and e = errors.
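The shrinkage effect can be seen in a small closed-form sketch. It solves the ridge normal equations (XᵀX + λI)A = XᵀZ without an intercept, using a hand-written Gaussian elimination so that it needs no external library. The penalty name `lam`, the helper names and the data are assumptions for illustration.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]      # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                      # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ridge_fit(X, z, lam=0.1):
    """Return A minimising ||z - X A||^2 + lam * ||A||^2 (no intercept term)."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xtz = [sum(row[i] * zi for row, zi in zip(X, z)) for i in range(p)]
    return solve(xtx, xtz)


X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
z = [1.0, 2.0, 3.0]
print(ridge_fit(X, z, lam=0.0))   # ordinary least squares: approximately [1.0, 2.0]
print(ridge_fit(X, z, lam=1.0))   # shrunk coefficients: approximately [0.875, 1.375]
```

Raising `lam` pulls both coefficients towards zero, which is the trade of a little bias for lower variance described above.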
• Bayesian ridge: Bayesian ridge is primarily used to find the probability distribution
over the ridge parameters. It is similar to classical ridge regression and differs in
its prior parameters (Müller and Guido, 2017). The parameters are estimated by
maximising the marginal log-likelihood, with the prior
$$p(\omega \mid \lambda) = \mathcal{N}(\omega \mid 0, \lambda^{-1} I_i) \tag{9}$$
Bayesian ridge regression deals with hierarchical data structures. Bayesian forecasting
is a learning process that reassesses prior knowledge, expressed as an initial
distribution, to predict the posterior (Huang and Abdel-aty, 2014). Bayesian ridge is a
probabilistic model of the regression problem whose hyperparameters α1, α2, µ1 and µ2
are set to 10^-6 (by default). These priors are known as non-informative, and the model
parameters are calculated by maximising the marginal log-likelihood.
• Least absolute shrinkage and selection operator (LASSO): Lasso, also known as a
penalised regression method, performs shrinkage as well as variable selection.
Lasso regression is more suitable for high-dimensional and dynamic data sets
(Kamarianakis et al., 2012). The lasso method can be used to build a graphical
Bayesian network model for prediction. It works on the assumption of a stationary
time series and may fail if this assumption is not met (Arnold and Abe, 2007). Time
series also play a vital role in finding the underlying causal relationships in
traffic patterns. A more detailed description of lasso regression can be found in
Li et al. (2015). As the multivariate data set used in this study is dynamic in
nature, lasso suits the application. The lasso model shows better prediction results
than ridge regression. It is estimated by minimising the sum of squares plus a
penalty term, where λ is a non-negative regularisation parameter:
$$\hat{I}^{lasso} = \arg\min_{\beta} \left\{ \sum_{i=1}^{p} (U - V_i \beta_i)^2 + \lambda \sum_{j=1}^{p} |\beta_j| \right\} \tag{10}$$
• Random forest: Random Forest was developed by Leo Breiman and is based on ensemble
learning methods (Breiman, 2001). Random Forest uses the bagging technique, also
known as bootstrap aggregating, to train its models (Stearns et al., 2017). A Random
Forest is a collection of decision trees that differ slightly from each other.
Overfitting of the training data set is reduced by randomly selecting the variables
used to branch each decision tree and node, which changes the way the trees are
constructed (Guo et al., 2018; Verikas et al., 2011). The learning process for random
forest can be performed as follows:
Pseudo code for random forest

Input: Training random vector space T(Fm, Fn)
Output: Predicted results set, Q = {Q1, Q2, ..., Qn}
1 Training_Set_Generation(M, N)   // N = observation data points with M features;
                                  // T(Fm, Fn) = original training set sample points
2 Bagging_Method()
    while (i < k)   // k = playback times [Iteration_times = 120]
        while (j < M)
            BAG_IN = M'        // M' random features
            BAG_OUT = M – X    // remaining random feature samples
            BS = Best_Feature(BAG_IN)
            return Best_Feature(BAG_IN)
        while (P < M)   // the feature that gives the best split is used to split
                        // the node iteratively
            Best_Split = Best_F(Mq)
            Current_Node = Best_Split
            BS = Current_Node
            U = mini pruning.ADD(Tk(Fm, Fn))   // new training set of k times
        End
        End
        Return (GENERATE_TREE(BS))
    End
3 GENERATE_TREE(BS)
    while (i < C)   // C = total number of decision trees generated
        Training_Set_Generation(M, N)
        Bagging_Method()   // go to Step 2
        /* Decision_Tree = mini pruning (i = 1 to M) Best_Split */
        Random_Forest
    End
The study shows that the feature importance provided by Random Forest, obtained by
aggregating the feature importance over the decision trees, is more reliable. In random
forest regression, there is no need for data normalisation or scaling of the data set.
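The bagging and random feature selection described in this section can be sketched in plain Python. This is a simplified illustration rather than the authors' implementation: trees split on the sum of squared errors, leaves store the mean target, and the defaults of 8 trees and depth 7 mirror Table 3. All function names, the seed and the toy data are assumptions.

```python
import random


def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, targets, feature_ids):
    """Return (score, feature, threshold) of the lowest-SSE split, or None."""
    best = None
    for f in feature_ids:
        # Dropping the largest value keeps both sides of every split non-empty.
        for threshold in sorted({row[f] for row in rows})[:-1]:
            left = [t for row, t in zip(rows, targets) if row[f] <= threshold]
            right = [t for row, t in zip(rows, targets) if row[f] > threshold]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best


def build_tree(rows, targets, max_depth, n_features, rng):
    if max_depth == 0 or len(set(targets)) == 1:
        return mean(targets)                          # leaf: average target
    feature_ids = rng.sample(range(len(rows[0])), n_features)
    split = best_split(rows, targets, feature_ids)
    if split is None:                                 # chosen features are constant
        return mean(targets)
    _, f, threshold = split
    left = [(r, t) for r, t in zip(rows, targets) if r[f] <= threshold]
    right = [(r, t) for r, t in zip(rows, targets) if r[f] > threshold]
    return (f, threshold,
            build_tree([r for r, _ in left], [t for _, t in left],
                       max_depth - 1, n_features, rng),
            build_tree([r for r, _ in right], [t for _, t in right],
                       max_depth - 1, n_features, rng))


def predict_tree(node, row):
    while isinstance(node, tuple):                    # internal nodes are tuples
        f, threshold, left, right = node
        node = left if row[f] <= threshold else right
    return node


def fit_forest(rows, targets, n_trees=8, max_depth=7, n_features=1, seed=40):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]    # bootstrap sample
        forest.append(build_tree([rows[i] for i in idx], [targets[i] for i in idx],
                                 max_depth, n_features, rng))
    return forest


def predict_forest(forest, row):
    return mean([predict_tree(tree, row) for tree in forest])


# Toy data: two made-up, equally informative features and a step-shaped target.
rows = [[x, 19 - x] for x in range(20)]
targets = [0.0 if x < 10 else 1.0 for x in range(20)]
forest = fit_forest(rows, targets)
print(predict_forest(forest, [0, 19]), predict_forest(forest, [19, 0]))
```

Averaging over trees grown on different bootstrap samples and feature subsets is what reduces the variance of any single tree.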
• Adaboost regression: Adaptive boosting is an ensemble method that combines
multiple trees (weak learners) for prediction. It is primarily used for
classification and was originally formulated for binary classification (Leshem et
al., 2007). It is a meta-estimator that is sensitive to noisy data and outliers but
less prone to the over-fitting problem. Each learner may be weak, but the combined
performance of all learners is better than random guessing, and the combination can
be fitted under an exponential loss (Giot, 2014; Freund and Schapire, 1997), as
shown in equation (11):
$$H(x) = \mathrm{sign}\left( \sum_{i=1}^{n} \alpha_i h_i(x) \right) \tag{11}$$
where H(x) is the combined prediction, α_i is the learning factor (weight) of the i-th
weak learner and h_i(x) is its prediction for the data sample point x.
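The weighted vote in equation (11) can be illustrated with three hand-made threshold rules. The learner weights and thresholds below are invented for the example. In practice they would be learned during boosting.

```python
def adaboost_predict(x, learners, alphas):
    """Combine weak learners by the sign of their weighted vote, as in equation (11)."""
    score = sum(alpha * h(x) for alpha, h in zip(alphas, learners))
    return 1 if score >= 0 else -1


# Three hypothetical decision stumps, each voting +1 or -1.
learners = [
    lambda x: 1 if x > 2 else -1,
    lambda x: 1 if x > 5 else -1,
    lambda x: 1 if x > 8 else -1,
]
alphas = [0.4, 0.7, 0.3]          # made-up learner weights

print(adaboost_predict(3, learners, alphas))   # -1  (0.4 - 0.7 - 0.3 < 0)
print(adaboost_predict(6, learners, alphas))   # 1   (0.4 + 0.7 - 0.3 > 0)
```

For regression, the same idea applies with a weighted combination of the learners' numeric outputs in place of the sign.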
• Multi-layer perceptron (MLP): MLPs are non-linear, data-driven, self-adaptive
methods for finding the functional relationship between input and output variables
from a training data set when that relationship is unknown or hard to determine
(Zhang et al., 1998). The MLP, also known as a feed-forward neural network, is a
member of the Artificial Neural Network (ANN) family. ANN is a well-known approach
to traffic prediction because of its proficiency in handling multi-dimensional data.
In this study, a feed-forward neural network is used with different parameters
optimised. The parameters are the number of input variables with different weights,
the number of hidden layers and the activation function (rectified linear unit, or
relu), with different numbers of iterations used to calculate the output variable
(Lippi et al., 2013). Cross-validation is applied for the selection of parameters.
X  Wij  H  B (12)
where H is hidden layers, Wij weight matrix, B is activation functions and  adjustment
parameters
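A forward pass through a small feed-forward network with ReLU hidden units can be sketched as follows. The layer sizes and weights are made up and far smaller than the [20, 20] hidden layers listed in Table 3, and training (weight updates) is left out.

```python
def relu(vector):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in vector]


def dense(inputs, weights, biases):
    """One fully connected layer: weights @ inputs + biases."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def mlp_forward(x, layers):
    """layers: list of (weights, biases); ReLU on every layer except the last."""
    for weights, biases in layers[:-1]:
        x = relu(dense(x, weights, biases))
    weights, biases = layers[-1]
    return dense(x, weights, biases)      # linear output for regression


# Hypothetical network: 2 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),
    ([[2.0, 1.0]], [0.1]),
]
print(mlp_forward([1.0, 3.0], layers))    # approximately [1.1]
```

The first hidden unit receives a negative input and is switched off by ReLU, so only the second unit contributes to the output.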
• Support vector machines (SVM): SVM is based on statistical learning theory and
performs well on scaled data by mapping the input-output relationship for non-linear
regression problems (Wu et al., 2004). SVM works with different kernel functions
such as linear, polynomial and Radial Basis Function (RBF); RBF has been found more
suitable for traffic forecasting under different scenarios (Cortes and Vapnik, 1995).
Hence, the RBF kernel is used in this study. SVM is primarily a classification
method rather than a regression method, but it gives good results for regression by
setting an epsilon value for error tolerance, which is determined from the problem
domain by optimising parameters (Mingheng et al., 2013).
$$M(a, b) = \langle \phi(a), \phi(b) \rangle \tag{13}$$
For traffic, the RBF kernel is:
$$M(A_i, B_i) = \exp\left( -\gamma \lVert A_i - B_i \rVert^2 \right), \quad \gamma > 0 \tag{14}$$
where A_i represents the input vector and B_i the output vector using the RBF kernel
function.
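The RBF kernel is easy to evaluate directly. In the sketch below, `gamma` is the kernel width parameter. Its value of 0.5 is an arbitrary choice for illustration, not a setting taken from the paper.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """RBF kernel: exp(-gamma * squared Euclidean distance between a and b)."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)


print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))   # 1.0 for identical vectors
print(rbf_kernel([0.0, 0.0], [1.0, 1.0]))   # exp(-1), about 0.368
```

The kernel value falls from 1 towards 0 as two samples move apart, so a larger `gamma` makes each training point influence a smaller neighbourhood.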
• Decision tree: The decision tree regression technique is easy to use, robust against
missing values and quick to produce a decision. A decision tree learns a sequence of
if/else questions that follow a flow-chart structure in a recursive way. Each
internal node contains a test on a particular attribute, and each leaf node contains
instances with the same class label (Xu et al., 2014b). For an unknown instance, the
algorithm traverses the nodes and returns the most relevant leaf node for the target
variable. For missing data values, the tree is not partitioned down to a leaf;
instead, the prediction is made by averaging over the sub-tree traversed (Dombi and
Zsiros, 2005). In this study, the number of decision trees is decided by the growing
and pruning method; a more detailed description can be found in Chang and Bong
(1996), Quinlan (1986) and Traffic et al. (2018).
4 Results and discussion
4.1 Performance assessment Metrics

Several measures can be employed to assess the accuracy of a time series prediction
model, and errors play an essential role in evaluating its performance. The indicators
used here are Mean Square Error (MSE), Absolute Mean Square Error (AMSE), R2
(coefficient of determination) and correlation (r) (Ma et al., 2015; Lu and Shladover,
2014; Guo et al., 2018). Five-fold cross-validation was used.
MSE measures the error between the actual value and the predicted value. AMSE measures
the degree of deviation between the actual volume and the predicted value. Correlation
measures the strength of the linear relationship between observed and predicted values.
The values of r range from –1 to 1; the higher the r-value, the stronger the
correlation. R2, the coefficient of determination, measures the proportion of the
variance of the dependent variable that is explained by the model. It shows how closely
the data are approximated by the regression line. Its value ranges between 0 and 1. A
value of zero means that the model explains none of the variance in the actual values
(Bartin, 2018; Lippi et al., 2013; Traffic et al., 2018).
$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{O_i - P_i}{O_i} \right)^2 \tag{15}$$
$$AMSE = \frac{1}{n} \sum_{i=1}^{n} \frac{|O_i - P_i|}{O_i} \tag{16}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (P_i - O_i)^2}{\sum_{i=1}^{n} (O_i - \bar{O})^2} \tag{17}$$
$$r = \frac{\sum_{i=1}^{n} (O_i - \bar{O})(P_i - \bar{P})}{\sqrt{\sum_{i=1}^{n} (O_i - \bar{O})^2 \, \sum_{i=1}^{n} (P_i - \bar{P})^2}} \tag{18}$$
where O_i is the observed traffic variable, P_i the predicted traffic variable, n the
total sample data size, \bar{O} the average of all observed values and \bar{P} the
average of all predicted values.
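The four indicators can be computed with a few lines of Python. MSE and AMSE are taken here in their relative form (each error divided by the observed value), which is how equations (15) and (16) appear to be written; treat that reading as an assumption. The observed and predicted values are made up and are not results from the paper.

```python
import math


def mse_relative(obs, pred):
    """Mean of squared relative errors."""
    return sum(((o - p) / o) ** 2 for o, p in zip(obs, pred)) / len(obs)


def amse_relative(obs, pred):
    """Mean of absolute relative errors."""
    return sum(abs(o - p) / o for o, p in zip(obs, pred)) / len(obs)


def r_squared(obs, pred):
    """Coefficient of determination, equation (17)."""
    mean_o = sum(obs) / len(obs)
    ss_res = sum((p - o) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def pearson_r(obs, pred):
    """Correlation between observed and predicted values, equation (18)."""
    mean_o = sum(obs) / len(obs)
    mean_p = sum(pred) / len(pred)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


observed = [10.0, 20.0, 30.0, 40.0]      # made-up volumes
predicted = [12.0, 18.0, 33.0, 39.0]     # made-up predictions
print(mse_relative(observed, predicted))
print(amse_relative(observed, predicted))
print(r_squared(observed, predicted))
print(pearson_r(observed, predicted))
```

The relative forms are undefined when an observed value is zero, so intervals with no traffic would need separate handling.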
4.2 Model estimation

The model estimation includes the estimation of different learning parameters, i.e.
learning rate (alpha), max-iter, max_depth, n_estimator and hidden layer (Raza and
Zhong, 2018; Breiman, 2001). In this study, the parameters are set through empirical
experiments, as listed in Table 3, and from literature surveys (Müller and Guido,
2017; Verikas et al., 2011; Quinlan, 1986).
Table 3 Learning parameters for various regression techniques
| Approach | Method used | Module name | Learning parameters |
|---|---|---|---|
| Gradient descent | lr | Linear regression | Alpha=0.01 |
| KNN | sm | Clustering | k=5 |
| Ridge | lr | Linear regression | Alpha=0.1 |
| Bayesian ridge | brd | Probabilistic model | Alpha_1=1e-06, Lambda_1=1e-06, Max-iter=150, Alpha_2=1e-06, Lambda_2=1e-06, Tol=0.01 |
| LASSO | Pr | Penalised Reg. | Alpha=0.01, Max_iter=300 |
| Random forest | rf | Bagging | Max-depth=7, Max_iter=450, n_estimator=8 |
| Adaboost | wl | Ensemble | Alpha=0.1, n_estimator=10 |
| Multilayer Perceptron (MLP) | neuralnet | Feed-Forward NN | Activation=relu, Max_iter=150, Randomstate=40, hidden_layer=[20,20] |
| SVM | nlr | Non-linear Regression | Kernel='rbf' |
| Decision_Tree | rpart | Rpart | Number_tree=5, Max-Depth=25 |
The whole data set is normalised again for the multi-layer perceptron neural network so
that each variable has a mean of 0 and a variance of 1.
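Standardisation to zero mean and unit variance can be sketched as below. The population standard deviation is used, which is an assumption since the paper does not specify the estimator, and the function name and sample values are made up.

```python
from statistics import fmean, pstdev


def standardise(values):
    """Shift and scale values to mean 0 and (population) variance 1."""
    mu = fmean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]      # a constant column carries no signal
    return [(v - mu) / sigma for v in values]


volumes = [2, 4, 4, 4, 5, 5, 7, 9]        # made-up values: mean 5, std 2
z = standardise(volumes)
print(z)   # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

In a train/test setting, the mean and standard deviation would be computed on the training data only and then reused for the test data.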
4.3 Experimental results

All regression techniques are trained for short-term traffic flow prediction using the
learning parameters listed in Table 3. The input variables comprise time series,
humidity, visibility, barometer and holiday. Total traffic volume is the output variable
to be predicted. The whole data set is partitioned into a training set and a testing set.
Five-fold cross-validation was used to obtain average errors and thereby evaluate the
stability of the accuracy. The performance of the models was evaluated using the R2
(R-two) metric described earlier. The results on the testing data set for the
non-parametric methods and their corresponding performance metric values are shown in
Figure 4. It can be seen that Random Forest performs best in terms of R2 (coefficient
of determination).
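The five-fold procedure can be sketched as below. This is only an illustration under stated assumptions: the folds are contiguous and unshuffled, the error averaged over folds is the mean square error, and the mean predictor (`fit_mean`, `predict_mean`) is a hypothetical stand-in for any of the regression techniques in Table 3.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_validated_mse(x, y, fit, predict, k=5):
    """Average the mean square error of a model over k folds."""
    fold_errors = []
    for train, test in k_fold_indices(len(x), k):
        model = fit([x[i] for i in train], [y[i] for i in train])
        squared = [(y[i] - predict(model, x[i])) ** 2 for i in test]
        fold_errors.append(sum(squared) / len(squared))
    return sum(fold_errors) / len(fold_errors)


# Hypothetical stand-in model: always predict the training mean.
def fit_mean(x_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, x_value):
    return model


# Made-up inputs and normalised volumes, for illustration only.
x = list(range(10))
y = [0.1, 0.2, 0.3, 0.4, 0.5, 0.5, 0.4, 0.3, 0.2, 0.1]
print(cross_validated_mse(x, y, fit_mean, predict_mean))
```

Each observation is used for testing exactly once, so the averaged error reflects how stable a technique is across different portions of the data.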
Figure 4 Results comparison of different regression techniques with performance metrics
[Bar chart for the multivariate data set. Y-axis: performance metric (0 to 1); x-axis: non-parametric techniques; series: mean square error, absolute mean square error, R_Two, correlation]
Among all applied techniques, Random Forest regression achieves the highest accuracy, up
to 90%, for short-term traffic flow prediction, while decision tree regression, gradient
descent regression and MLP regression achieve r2_score values of up to 89%, 87% and 85%,
respectively. As shown in Figure 5, these four techniques gave better results than the others.
Figure 5 Observed and predicted traffic volume on testing data
R2 values are calculated from equation (17) for the better-performing methods, and sample
values are given in Table 4. Random Forest attained the maximum R2 values for
short-term traffic flow prediction.
The results show that the Random Forest regression technique performs well on limited
data and is robust against overfitting. Regression techniques learn the importance of
exogenous variables without weights being explicitly assigned to the variables.
These results show that prediction performance improves on the multivariate data set
compared with single-variable data. Figures 6 and 7 show the weekend and weekday traffic
patterns, respectively, predicted using the random forest regression technique.
Table 4 Sample data set of regression techniques for observed vs. predicted values

Actual vol.   Random forest          Decision tree          Gradient descent       MLP
              Predicted  R2-error    Predicted  R2-error    Predicted  R2-error    Predicted  R2-error
0.211         0.190      0.021       0.187      0.024       0.183      0.027       0.179      0.032
0.422         0.410      0.012       0.375      0.047       0.367      0.055       0.358      0.064
0.851         0.765      0.086       0.757      0.094       0.740      0.111       0.723      0.128
0.939         0.845      0.064       0.835      0.104       0.816      0.123       0.798      0.141
Note: *Bold numbers are highest R-two values in each column
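The error columns in Table 4 appear to be the absolute difference between the actual and the predicted normalised volume; this reading is an assumption inferred from the tabulated numbers rather than something the text states. A small sketch that recomputes the second row (actual volume 0.422) under that assumption:

```python
# Second row of Table 4: actual volume and the four predicted values.
actual = 0.422
predicted = {
    "Random forest": 0.410,
    "Decision tree": 0.375,
    "Gradient descent": 0.367,
    "MLP": 0.358,
}

# Assumed definition of the error column: |actual - predicted|, rounded to 3 decimals.
errors = {name: round(abs(actual - value), 3) for name, value in predicted.items()}

for name, error in errors.items():
    print(f"{name}: {error:.3f}")
```

Under this reading, a smaller value in an error column means the prediction is closer to the observed volume.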
Figure 6 Comparison of observed and predicted traffic volume using random forest during
weekend (one unit = 15 min)
Figure 7 Comparison of observed and predicted traffic volume using random forest during
weekday (one unit = 15 min)
5 Conclusions
Short-term traffic prediction approaches play a vital role in providing useful information
for Intelligent Transportation Systems and smart cities. Traffic patterns are affected by
non-recurrent factors such as holidays, events and bad weather conditions. These factors
introduce uncertainty into multivariate short-term traffic prediction. In this study,
different non-parametric techniques are applied to reduce this uncertainty and enhance
prediction accuracy. The proposed approach is validated using 14 days of traffic volume
data, aggregated at 15-min intervals, from the Chandigarh expressway toll plaza. The first
10 days of data are used for training and the remainder for testing. Five-fold
cross-validation is performed to evaluate the accuracy of the applied techniques.
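The size of the two partitions follows from the aggregation interval. The sketch below assumes a complete record with no missing 15-min intervals, which the text does not confirm; the function name `chronological_split` and the placeholder series are illustrative.

```python
INTERVALS_PER_DAY = 24 * 60 // 15  # 96 fifteen-minute intervals per day


def chronological_split(series, train_days=10):
    """Split a time-ordered series into the first train_days and the remainder."""
    cut = train_days * INTERVALS_PER_DAY
    return series[:cut], series[cut:]


# Placeholder series standing in for 14 days of aggregated traffic volume.
series = list(range(14 * INTERVALS_PER_DAY))
train, test = chronological_split(series)
print(len(series), len(train), len(test))
```

Splitting by time rather than at random keeps every test observation later than all training observations, which mirrors how a forecast would be used in operation.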
The experimental results demonstrate significant differences between the predictions of
the different non-parametric techniques. The results reveal that the Random Forest
method outperforms all the other applied techniques, especially on the multivariate
real-time data set. A limitation of the proposed methodology is the lack of multi-step
prediction. This feature is highly desirable for traffic flow prediction problems, as
the future traffic state commonly depends on the previous state. In addition, the
spatial features of traffic parameters were not taken into consideration.
Future work can extend this study by applying a semantic knowledge base as an inference
mechanism. The predicted results can be used to design a queuing model that provides the
probability density function for the queue at toll lanes. More data should be collected
from multiple toll lanes on inter-urban expressways to obtain better results.
The non-parametric techniques presented in this study have the potential to forecast
future scenarios using present data. The Random Forest technique handles multivariate
attributes robustly, and its implicit variable selection makes the algorithm suitable
for prediction at intersections as well as on link-based and path-based road networks.
It can provide important information to toll plaza authorities and town planners for
decision making, policy design and infrastructure network related problems.
6 Data availability statement
Some or all data, models, or code generated or used during the study are proprietary or
confidential in nature and may only be provided with restrictions. The restricted toll
plaza data items include vehicle type, mode of payment, exempted vehicles and entry time
of vehicle.
References
Abadi, A., Rajabioun, T. and Ioannou, P.A. (2015) ‘Traffic flow prediction for road transportation
networks with limited traffic data’, IEEE Transactions on Intelligent Transportation Systems,
Vol. 16, pp.653–662.
Antônio, M., Caldas, D.F. and Sacramento, K.T. (2016) ‘Simulation model of discrete events
applied to the planning and operation of a toll plaza’, Journal of Transport Literature, Vol. 10,
pp.40–44.
Arnold, A. and Abe, N. (2007) ‘Temporal causal modeling with graphical granger methods’,
Knowledge Discovery in Databases, Vol. 10, pp.66–75.
Badhrudeen, M., Raj, J. and Vanajakshi, L.D. (2016) ‘Short-term prediction of traffic parameters –
performance comparison of a data-driven and less-data-required approaches’, Journal of
Advanced Transportation, Vol. 50, pp.647–666.
Bartin, B. (2018) ‘The use of learning classifier systems in microscopic toll plaza simulation
models’, Transportation Research Annual Meeting, Vol. 97, pp.1–6.
Bezuglov, A. and Comert, G. (2016) ‘Short-term freeway traffic parameter prediction: application
of grey system theory models’, Expert Systems with Applications, Vol. 62, pp.284–292.
Bharti, M., Saxena, S. and Kumar, R. (2017) ‘Intelligent resource inquisition framework on
internet-of-things’, Computer Electric Engineering, Vol. 58, pp.265–281.
Breiman, L. (2001) ‘Random forests’, Machine Learning, Vol. 45, pp.5–32.
Chakroborty, P., Gill, R. and Chakraborty, P. (2016) ‘Analyzing queuing at toll plazas using a
coupled, multiple-queue, queuing system model: application to toll plaza design’,
Transportation Planning and Technology, Vol. 39, pp.675–692.
Chang, K. and Bong, S. (1996) ‘An intelligent approach to time series identification by a neural
network driven decision tree classifier’, Decision Support Systems, Vol. 17, pp.183–197.
Christopher, I., Scofield, L. and Us, W.A. et al. (2016) Predicting Expected Road Traffic
Conditions Based on Historical and Current Data, US 9,257,041B2.
Cortes, C. and Vapnik, V. (1995) ‘Support-vector networks’, Machine Learning, Vol. 20,
pp.273–297.
Dombi, J. and Zsiros, A. (2005) ‘Learning multicriteria classification models from examples:
decision rules in continuous space’, European Journal of Operational Research, Vol. 160,
pp.663–675.
Fosgerau, M. and Fukuda, D. (2012) ‘Valuing travel time variability: characteristics of the travel
time distribution on an urban road’, Transportation Research Part C: Emerging Technologies,
Vol. 24, pp.83–101.
Freund, Y. and Schapire, R.E. (1997) ‘A decision-theoretic generalization of online learning and
an application to boosting*’, Journal of Computer and System Sciences, Vol. 55, pp.119–139.
Giot, R. (2014) ‘Predicting bikeshare system usage up to one day ahead’, in Proceedings of the
IEEE Symposium on Computational Intelligence in Vehicles and Transportation Systems
(CIVTS), pp.1–8.
Guo, F., Krishnan, R. and Polak, J. (2017) ‘The influence of alternative data smoothing prediction
techniques on the performance of a two-stage short-term urban travel time prediction
framework’, Journal of Intelligent Transportation Systems, Vol. 21, pp.214–226.
Guo, F., Polak, J.W. and Krishnan, R. (2018) ‘Predictor fusion for short-term traffic forecasting’,
Transportation Research Part C: Emerging Technologies, Vol. 92, pp.90–100.
Guo, J. and Williams, B.M. (2010) ‘Real-time short-term traffic speed level forecasting and
uncertainty quantification using layered Kalman filters’, Transportation Research Record:
Journal of the Transportation Research Board, Vol. 75, pp.28–37.
Habtemichael, F.G. and Cetin, M. (2016) ‘Short-term traffic flow rate forecasting based on
identifying similar traffic patterns’, Transportation Research Part C: Emerging Technologies,
Vol. 66, pp.61–78.
Haworth, J., Shawe-taylor, J., Cheng, T. and Wang, J. (2014) ‘Local online kernel ridge regression
for forecasting of urban travel times’, Transportation Research Part C: Emerging
Technologies, Vol. 46, pp.151–178.
Hoerl, A.E. and Kennard, R.W. (1970) ‘Ridge regression biased estimation for non-orthogonal
problems’, Technometrics, Vol. 12, pp.55–67.
Huang, H. and Abdel-aty, M. (2014) ‘Multilevel data and Bayesian analysis in traffic safety’,
Accident Analysis and Prevention, Vol. 42, pp.1556–1565.
Jiber, M., Lamouik, I., Ali, Y. and Sabri, M.A. (2018) ‘Traffic flow prediction using neural
network’, Proceedings of the International Conference on Intelligent Systems and Computer
Vision (ISCV), pp.1–4.
Jordan, M., Bishop, M., Kleinberg, J. and Schölkopf, B. (2006) ‘Pattern recognition and machine
learning’, in Jordan, M., Kleinberg, J., Schölkopf, B. (Eds): Information Science and
Statistics, 1st ed., Springer, pp.144–194.
Kamarianakis, Y., Shen, W. and Wynter, L. (2012) ‘Real-time road traffic forecasting using
regime-switching space-time models and’, Applied Stochastic Models in Business and
Industry, Vol. 28, pp.297–315.
Kim, C., Kim, D., Kho, S., Kang, S. and Chung, K. (2016) ‘Dynamically determining the toll plaza
capacity by monitoring approaching traffic conditions in real-time’, Journal of Applied
Sciences, Vol. 6, pp.87–90.
Kim, J. and Wang, G. (2016) ‘Diagnosis and prediction of traffic congestion on urban road network
using Bayesian networks’, Transportation Research Record: Journal of the Transportation
Research Board, Vol. 95, pp.108–118.
Kumar, S.V. and Vanajakshi, L. (2015) ‘Short-term traffic flow prediction using seasonal ARIMA
model with limited input data’, European Transport Research Review, Vol. 7, pp.21–25.
Lécué, F., Tallevi-diotallevi, S. and Hayes, J. et al. (2014) ‘Web semantics: science, services and
agents on the world wide web smart traffic analytics in the semantic web with STAR-CITY:
scenarios, system and lessons learned in Dublin City’, Web Semantic: Science, Services and
Agents on the World Wide Web, Vols. 27/28, pp.26–33.
Leshem, G., Ritov, Y. and Adaboost, A. (2007) ‘Traffic flow prediction using adaboost algorithm
with random forests as a weak learner’, Proceeding World Academy of Science, Engineering
and Technology, Vol. 21, pp.193–198.
Li, L., Chen, X. and Zhang, L. (2014) ‘Multi-model ensemble for freeway traffic state estimations’,
IEEE Transactions on Intelligent Transportation Systems, Vol. 15, pp.1323–1336.
Li, L., Su, X., Wang, Y., Lin, Y., Li, Z. and Li, Y. (2015) ‘Robust causal dependence mining in big
data network and its application to traffic flow predictions’, Transportation Research Part C:
Emerging Technologies, Vol. 58, pp.292–307.
Lippi, M., Bertini, M. and Frasconi, P. (2013) ‘Short-term traffic flow forecasting: an experimental
comparison of time-series analysis and supervised learning’, IEEE Transactions on Intelligent
Transportation Systems, Vol. 14, pp.871–882.
Liu, X., Chien, S.I. and Chen, M. (2014) ‘An adaptive model for highway travel time prediction’,
Journal of Advanced Transportation, Vol. 48, pp.642–654.
Lu, X. and Shladover, S.E. (2014) ‘Review of variable speed limits and advisories’, Transportation
Research Record, Vol. 24, pp.15–23.
Ma, X., Tao, Z., Wang, Y., Yu, H. and Wang, Y. (2015) ‘Long short-term memory neural network
for traffic speed prediction using remote microwave sensor data’, Transportation Research
Part C: Emerging Technologies, Vol. 54, pp.187–197.
Mingheng, Z., Yaobao, Z., Ganglong, H. and Gang, C. (2013) ‘Accurate multistep traffic flow
prediction based on SVM’, Mathematical Problems in Engineering, Vol. 20, pp.1–8.
Müller, A.C. and Guido, S. (2017) Introduction to Machine Learning with Python, O'Reilly Media.
Myung, J., Kim, D., Kho, S. and Park, C. (2011) ‘Travel time prediction using k nearest neighbor
method with combined data from vehicle detector system and automatic toll collection
system’, Transportation Research Record: Journal of the Transportation Research Board,
Vol. 56, pp.51–59.
Polson, N.G. and Sokolov, V.O. (2017) ‘Deep learning for short-term traffic flow prediction’,
Transportation Research Part C: Emerging Technologies, Vol. 79, pp.1–17.
Qi, Y. and Ishak, S. (2014) ‘A Hidden Markov Model for short term prediction of traffic conditions
on freeways’, Transportation Research Part C: Emerging Technologies, Vol. 43, pp.95–111.
Quinlan, J.R. (1986) ‘Induction of decision trees’, Machine Learning, Vol. 1, pp.81–106.
Raza, A. and Zhong, M. (2018) ‘Lane-based: short-term urban traffic parameters forecasting using
multivariate artificial neural network and locally weighted regression models: a genetic
approach’, The Canadian Journal of Civil Engineering, Vol. 12, pp.32–38.
Smith, B.L. and Demetsky, M.J. (1997) ‘Traffic flow forecasting comparison of modeling
approaches’, Journal of Transportation Engineering, Vol. 123, pp.261–266.
Stearns, B., Rangel, F. and Rangel, F. (2017) ‘Scholar performance prediction using boosted
regression trees techniques’, European Symposium on Artificial Neural Network,
Computational Intelligence and Machine Learning, Vol. 5, pp.26–28.
Su, S.S.F.C., Tsai, H.N.P. and Cheng, C. (2018) ‘Using machine learning and big data approaches
to predict travel time based on historical and real-time data from Taiwan electronic toll
collection’, Soft Computing, Vol. 22, pp.5707–5718.
Traffic, I., Using, P. and Models, T. (2018) ‘Intersection traffic prediction using decision tree
models’, Symmetry (Basel), Vol. 10, pp.80–87.
Verikas, A., Gelzinis, A. and Bacauskiene, M. (2011) ‘Mining data with random forests: a survey
and results of new tests’, Pattern Recognition, Vol. 44, pp.330–349.
Vlahogianni, E.I., Karlaftis, M.G. and Golias, J.C. (2014) ‘Short-term traffic forecasting: where
we are and where we are going’, Transportation Research Part C: Emerging Technologies,
Vol. 43, pp.3–19.
Wang, Y., Ning, J. and Li, S. (2017) ‘Research on merging pattern after toll based on simulation’,
Journal of Advances in Social Science, Education and Humanities Research, pp.509–515.
Wu, C., Ho, J. and Lee, D.T. (2004) ‘Travel-time prediction with support vector regression’, IEEE
Trans. Intell. Transp. Syst., Vol. 5, pp.276–281.
Wu, Y., Tan, H., Qin, L., Ran, B. and Jiang, Z. (2018) ‘A hybrid deep learning based traffic flow
prediction method and its understanding’, Transportation Research Part C: Emerging
Technologies, Vol. 90, pp.166–180.
Xu, Y., Kong, Q. and Liu, Y. (2014b) ‘Short-term traffic volume prediction using classification and
regression trees’, IEEE Intelligent Vehicle Symposium, Vol. 4, pp.493–498.
Xu, Y., Kong, Q., Klette, R. and Liu, Y. (2014a) ‘Accurate and interpretable Bayesian MARS for
traffic flow prediction’, IEEE Transactions on Intelligent Transportation Systems, Vol. 15,
pp.2457–2469.
Zhan, H., Gomes, G., Li, X.S., Madduri, K., Sim, A. and Member, S. (2018) ‘Consensus ensemble
system for traffic flow prediction’, IEEE Transactions on Intelligent Transportation Systems,
Vol. 19, pp.3903–3914.
Zhang, G., Patuwo, B.E. and Hu, M.Y. (1998) ‘Forecasting with artificial neural networks: the state
of the art’, International Journal of Forecasting, Vol. 14, pp.35–62.
Zhang, J., Zheng, Y. and Qi, D. (2017) ‘Deep spatio-temporal residual networks for citywide
crowd flows prediction’, AAAI Conference on Artificial Intelligence, Vol. 31, pp.1655–1661.
Zhao, Z., Chen, W., Wu, X., Chen, P.C.Y. and Liu, J. (2017) ‘LSTM network: a deep
learning approach for short-term traffic forecast’, IET Intelligent Transport Systems, Vol. 11,
pp.68–75.
Zhu, D., Dong, F., Shi, W. and Zheng, S. (2017) ‘The model of toll station planning’, Journal of
Advances in Intelligent System Research, pp.390–394.
