M21 Data Mining With Various Optimization Methods PDF
Article info

Keywords: Traffic noise; Artificial intelligence; Genetic algorithm; Hooke and Jeeves; Simulated annealing; Particle swarm optimization; Software

Abstract

Road traffic represents the main source of noise in urban environments that is proven to significantly affect human mental and physical health and labour productivity. Thus, in order to control noise sound level in urban areas, it is very important to develop methods for modelling the road traffic noise. As observed in the literature, the models that deal with this issue are mainly based on regression analysis, while other approaches are very rare. In this paper a novel approach for modelling traffic noise that is based on optimization is presented. Four optimization techniques were used in simulation in this work: genetic algorithms, Hooke and Jeeves algorithm, simulated annealing and particle swarm optimization. Two different scenarios are presented in this paper. In the first scenario the optimization methods use the whole measurement dataset to find the most suitable parameters, whereas in the second scenario optimized parameters were found using only some of the measurement data, while the rest of the data was used to evaluate the predictive capabilities of the model. The goodness of the model is evaluated by the coefficient of determination and other statistical parameters, and results show agreement of high extent between measured data and calculated values in both scenarios. In addition, the model was compared with classical statistical model, and superior capabilities of proposed model were demonstrated. The simulations were done using the originally developed user friendly software package.

© 2013 Elsevier Ltd. All rights reserved.
0957-4174/$ - see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.eswa.2013.12.025
3994 V. Nedic et al. / Expert Systems with Applications 41 (2014) 3993–3999
motorcycles, and the number of trucks, and got the ANN model with 5 inputs. In terms of the parameters involved in the CoRTN (Calculation of Road Traffic Noise) model (Quartieri et al., 2009), which was initially developed in 1975 by the Transport and Road Research Laboratory and the Department of Transport of the United Kingdom, the ANN model that was used in Givargis and Karimi (2010) has 5 input variables: the total hourly traffic flow, the percentage of heavy vehicles, the hourly mean traffic speed, the gradient of the road, and the angle of view. The authors tested the developed model on data collected on Tehran's roads and found no significant differences between the outputs of the developed ANN and the calibrated CoRTN model. In Gündoğdu et al. (2005) a genetic algorithm was used to model the traffic noise in relation to traffic composition (vehicles per hour), the road gradient and the ratio of building height to road width. In Rahmani et al. (2011) the proposed model is a function of total equivalent traffic flow and equivalent traffic speed. In both papers the authors used MATLAB to find the optimized values of model parameters.

In this paper an application of four optimization techniques for the prediction of traffic noise is presented. These techniques are: genetic algorithms, Hooke and Jeeves algorithm, simulated annealing, and particle swarm optimization. The proposed model consists of five variables: the number of light motor vehicles, the number of medium trucks, the number of heavy trucks, the number of buses and the average traffic flow speed. All optimized models are tested on data measured on a Serbian road using the originally developed user friendly software package.

2. Problem formulation

The most suitable measure for depicting traffic noise emission is the equivalent sound pressure level (L_eq), which is expressed in units of dB(A) and corresponds to a fictitious noise source emitting steady noise, which in a specific period of time contains the same acoustic energy as the observed source with fluctuating noise. For a number of discrete measurements (N), L_eq for time period T is expressed by the following equation:

L_{eq} = 10 \log_{10}\left( \frac{1}{T} \sum_{i=1}^{N} 10^{L_i/10} \right) \qquad (1)

where L_i is the sound pressure level that corresponds to the i-th measurement.

In order to reduce the noise it is necessary to know the functional relationship between the equivalent sound pressure level and the influential parameters. L_eq is correlated with numerous parameters, such as numbers and types of vehicles, their velocities, type of road surface, width and slope of the road, height of buildings facing the road, etc. As mentioned in the introduction, in this paper the following variables were considered: the number of light motor vehicles (LMV), the number of medium trucks (STV), the number of heavy trucks (TTV), the number of buses (BUS) and the average traffic flow speed (V_avg). A brief description of how these variables were measured is given in the following section.

3. Data sampling

possible. Simultaneously, variations in traffic flow, traffic speed and composition of traffic flow were measured. For that reason the surveys also cover, for the same time periods, the following parameters: the number of light motor vehicles, the number of medium trucks, the number of heavy trucks, the number of buses, and the average traffic speed in the given time periods.

Measurements were taken in accordance with recommendations for road traffic noise measurement; the microphone was mounted away from reflecting facades, at a height of 1.2 m above ground level and 7.5 m away from the central line of the road. During the measurements care was taken that climate conditions were as similar as possible (no wind, no rain) in order to eliminate their influence.

4. Mathematical model and methods

The equivalent sound pressure level is assumed to be modeled by the following equation:

L_{eq} = N_1 \log_{10}(LMV) + N_2 \log_{10}(STV) + N_3 \log_{10}(TTV) + N_4 \log_{10}(BUS) + N_5 V_{avg}^{N_6} + N_7 \log_{10}(V_{avg}) \qquad (2)

where N_i (i = 1, ..., 7) are coefficients. The problem thus reduces to finding the coefficients N_i such that the assumed model best fits the experimental data. For that purpose genetic algorithms, Hooke and Jeeves algorithm, simulated annealing, and particle swarm optimization are used. These techniques are briefly described in the following subsections.

4.1. Genetic algorithms

Genetic algorithms (Rao, 1996) are a class of evolutionary algorithms that can be used in a large number of different application areas. The principle of genetic algorithms is based on Darwin's theory of evolution, by which the fittest individuals have the best chances to survive. Genetic algorithms operate with a set of individuals (chromosomes) called a population. The information
The simulated annealing algorithm (Rao, 1996) was originally inspired by the process of annealing in metallurgy, in which metals are heated to a high temperature and then cooled very slowly. Slow cooling allows the metal to alter its physical properties and achieve its most regular crystal lattice configuration, which corresponds to the minimal energy state and which might not occur if the metal is cooled too fast. Simulated annealing involves a temperature variable, which simulates this heating process. This variable is initially set to a high value and then gradually diminishes as the algorithm runs. Temperature reduction is specified by the annealing schedule, which is most often defined as geometric cooling. Geometric cooling means that the temperature in each step is multiplied by a temperature reduction factor, which is less than 1. The key algorithmic feature of simulated annealing is its ability to escape local optima by allowing changes that worsen the objective function value according to its probability function. The probability of accepting non-improving solutions depends on a temperature parameter, which is reduced over time; hence when the temperature variable is high the algorithm accepts worse solutions more frequently. This makes it possible for the algorithm to jump out of local optima in the early stage of execution. As the temperature is reduced the chance of accepting worse solutions is also reduced. The simulated annealing algorithm is presented as follows:

The steps of the particle swarm optimization algorithm are:

1. Initialize each particle with a random velocity and random position.
2. Calculate fitness values for each particle; if the current fitness is better than the particle best value so far, save current position as particle best.
3. Choose the particle with the best fitness of all particles and assign its fitness value to global best.
4. Calculate, for each particle, the new velocity and position according to update rules.
5. Repeat steps 2–4 until an interruption criterion is reached.

5. Simulation

The simulation was done using the originally designed, user friendly software (Fig. 3).

To assess the quality of proposed models two scenarios were applied. In the first scenario the whole measurement dataset was used to find the model that best fits those data, whereas in the second scenario we wanted to test also the predictive capabilities of
developed model. Therefore the measurement dataset was split in two parts by randomly selecting data. The first part, which contains 100 data points (80% of the total number of measurements), was used to optimize the model, and the second part, which contains the other 24 data points (20% of the total number of measurements), was used to test the predictive capabilities. In both scenarios the goodness of fit is assessed by the coefficient of determination (R^2) and other statistical parameters.

The parameters that were used for each optimization method are given in Table 1.

The optimized parameters of the proposed model for the various optimization methods are shown in Table 2 for scenario 1 and in Table 3 for scenario 2.

Capabilities of the models to calculate noise levels are assessed by certain statistical parameters and by performing statistical tests. The goodness of fit of the models is appraised using the mean error (ME), the mean absolute error (MAE), the mean absolute relative error (MARE), the coefficient of correlation (R) and the coefficient of determination (R^2). Comparative statistical tests (F-test and paired t-test) were applied to compare calculated traffic noise levels with the measured values.

These statistical analyses and tests are applied to the various optimization methods that were used to find the optimum model parameters. In addition, in order to compare the proposed model with classical regression models for traffic noise modelling, the same analysis and tests are applied to the widely used statistical regression model developed by Fagotti et al. (Quartieri et al., 2009). The Fagotti regression model is given by the following expression:

where Q_L is the number of light vehicles per hour, Q_P is the number of heavy vehicles per hour, Q_M is the number of motorcycles per hour, and Q_BUS is the number of buses per hour.

The comparative analysis for scenario 1 is presented in Table 4, whereas the comparative analysis for scenario 2 (testing the model with the data that was not used in optimization) is presented in Table 5. Based on these data it could be concluded that there is relatively good agreement between measured data and computational results for all optimization methods for both scenarios. This
Table 1
Parameters that were used for optimization.
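For simulated annealing, settings of this kind would include the initial temperature and the temperature reduction factor of the geometric cooling schedule described in Section 4. The following is a minimal Python sketch of that scheme, not the authors' software. The Metropolis-style acceptance rule exp(-delta/T), the single-coordinate neighbour move, the test function `sphere`, and all numeric settings (`t_start`, `t_end`, `cooling`, `step`) are assumptions made for illustration and are not the values of Table 1.

```python
import math
import random


def simulated_annealing(objective, start, t_start=100.0, t_end=1e-3,
                        cooling=0.95, step=0.5, seed=0):
    """Minimize `objective` starting from `start` (a list of floats)."""
    rng = random.Random(seed)
    current = list(start)
    current_cost = objective(current)
    best, best_cost = list(current), current_cost
    temperature = t_start
    while temperature > t_end:
        # Propose a neighbour by perturbing one randomly chosen coordinate.
        candidate = list(current)
        i = rng.randrange(len(candidate))
        candidate[i] += rng.uniform(-step, step)
        cost = objective(candidate)
        delta = cost - current_cost
        # Improvements are always accepted; a worse point is accepted with
        # probability exp(-delta / temperature) (assumed acceptance rule).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        # Geometric cooling: multiply by a reduction factor below 1.
        temperature *= cooling
    return best, best_cost


def sphere(x):
    """Hypothetical test objective with its minimum at the origin."""
    return sum(v * v for v in x)


start = [3.0, -2.0]
best, best_cost = simulated_annealing(sphere, start)
```

At high temperature the exponential is close to 1, so worse candidates are accepted often; as the temperature shrinks the same cost increase is accepted less and less, which matches the behaviour described in the text.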
Table 2
Optimized parameters found by various optimization methods for whole dataset (NDATA = 124).

Opt. method      N1      N2      N3      N4      N5        N6        N7
GA               3.8     0.322   3.729   1.114   14.464    0.245     29.997
Hooke & Jeeves   3.166   0.358   3.726   0.908   20.51     0         22.36
Sim. Annealing   3.603   0.555   3.567   1.158   155.662   182.254   33.087
PSO              2.799   0.703   3.77    1.118   98.684    0.916     32.671
Table 3
Optimized parameters found for modelling dataset (NDATA = 100) for various optimization methods.

Opt. method      N1      N2      N3      N4      N5        N6        N7
GA               6.112   0.301   3.29    0.961   29.918    0.857     29.96
Hooke & Jeeves   2.741   0.949   3.391   0.994   13.742    0.304     7.085
Sim. Annealing   3.49    0.831   3.19    1.123   46.183    113.006   33.301
PSO              3.327   0.73    3.312   1.189   58.232    0.788     32.263
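The coefficients N1 to N7 in Tables 2 and 3 are those of the model in Eq. (2). As a rough illustration of what each optimizer has to evaluate, the Python sketch below computes the model output and a fitting objective. It assumes that the speed term has the form N5 * V_avg ** N6, and it uses the sum of squared errors as the objective; the paper does not state its objective function, so that choice is an assumption. The names `predict_leq`, `sum_squared_error`, `demo_coeffs` and `demo_sample` are hypothetical, and the demo numbers are made up rather than taken from the tables.

```python
import math


def predict_leq(coeffs, lmv, stv, ttv, bus, v_avg):
    """Evaluate the model of Eq. (2), assuming the speed term N5 * V_avg ** N6.

    All vehicle counts and the speed must be positive, because log10 is
    undefined at zero.
    """
    n1, n2, n3, n4, n5, n6, n7 = coeffs
    return (n1 * math.log10(lmv) + n2 * math.log10(stv)
            + n3 * math.log10(ttv) + n4 * math.log10(bus)
            + n5 * v_avg ** n6 + n7 * math.log10(v_avg))


def sum_squared_error(coeffs, samples):
    """Assumed objective: sum of squared differences to the measured L_eq.

    `samples` is a list of (lmv, stv, ttv, bus, v_avg, measured_leq) tuples.
    """
    return sum((predict_leq(coeffs, *s[:5]) - s[5]) ** 2 for s in samples)


# Made-up coefficients and one made-up sample, for illustration only.
demo_coeffs = (1.0, 1.0, 1.0, 1.0, 2.0, 0.0, 10.0)
demo_sample = (100, 10, 10, 10, 10.0, 18.0)
demo_prediction = predict_leq(demo_coeffs, *demo_sample[:5])
demo_error = sum_squared_error(demo_coeffs, [demo_sample])
```

Any of the four optimizers can then be run on `sum_squared_error` with the seven coefficients as the search variables.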
Table 4
Statistical analysis for various optimization methods for whole dataset (NDATA = 124, Scenario 1).
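The statistical parameters named in Section 5 (ME, MAE, MARE, R and R^2) could be computed along the following lines. This is a sketch under stated assumptions, not the authors' code: errors are taken as calculated minus measured, MARE divides each absolute error by the measured value, and R^2 is taken as the square of the Pearson correlation coefficient R. The function name `fit_statistics` and the sample values are hypothetical.

```python
import math


def fit_statistics(measured, calculated):
    """Return ME, MAE, MARE, R and R2 for two equally long sequences."""
    n = len(measured)
    errors = [c - m for m, c in zip(measured, calculated)]
    me = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    mare = sum(abs(e) / abs(m) for e, m in zip(errors, measured)) / n
    mean_m = sum(measured) / n
    mean_c = sum(calculated) / n
    # Pearson correlation between measured and calculated values.
    cov = sum((m - mean_m) * (c - mean_c)
              for m, c in zip(measured, calculated))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    r = cov / math.sqrt(var_m * var_c)
    return {"ME": me, "MAE": mae, "MARE": mare, "R": r, "R2": r * r}


# Made-up noise levels in dB(A), for illustration only.
measured = [60.0, 62.0, 64.0]
calculated = [61.0, 62.0, 63.0]
stats = fit_statistics(measured, calculated)
```

In this made-up example the errors cancel, so ME is zero while MAE is not, which shows why the two measures are reported together.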
Table 5
Statistical analysis for various optimization methods for test dataset (NDATA = 24, Scenario 2).
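The two test statistics referred to in the text can be sketched with the Python standard library. The forms used here are the textbook ones and are assumptions about the procedure: the F statistic is the ratio of the two sample variances (larger over smaller), and the paired t statistic is the mean of the per-sample differences divided by its standard error. The paper's exact procedure, degrees of freedom and table values are not reproduced. The function names and data are hypothetical.

```python
import math
import statistics


def f_statistic(measured, calculated):
    """Ratio of sample variances, larger over smaller, so the result is >= 1."""
    v1 = statistics.variance(measured)
    v2 = statistics.variance(calculated)
    return max(v1, v2) / min(v1, v2)


def paired_t_statistic(measured, calculated):
    """Paired t statistic computed on the per-sample differences."""
    diffs = [m - c for m, c in zip(measured, calculated)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


# Made-up noise levels in dB(A), for illustration only.
measured = [60.0, 62.0, 64.0, 66.0]
calculated = [61.0, 62.0, 63.0, 67.0]
f_value = f_statistic(measured, calculated)
t_value = paired_t_statistic(measured, calculated)
```

Each value is then compared with the tabulated critical value for the chosen significance level; for the t statistic it is the absolute value that is compared.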
Fig. 4. Side by side comparison of measured data and calculation obtained using GA for scenario 1.
Fig. 5. Comparison of measured data (T) and calculation (A) obtained using GA for scenario 1.
Fig. 6. Predictive capabilities of proposed models that are obtained using different optimization methods.
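As one example of the optimization methods compared here, the five particle swarm optimization steps listed in Section 4 can be sketched in Python. The text does not give the update rules, so the inertia-weight velocity update used below is an assumed, commonly used variant, and the stopping criterion is simply a fixed number of iterations. The function `pso`, the test objective `sphere` and all parameter values (`inertia`, `c1`, `c2`, `bounds`) are hypothetical.

```python
import random


def pso(objective, dim, n_particles=20, n_iter=100, bounds=(-5.0, 5.0),
        inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over `dim` variables with a basic particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Step 1: random positions and velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
           for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [float("inf")] * n_particles
    g_pos, g_val = pos[0][:], float("inf")
    # Step 5: repeat steps 2-4 for a fixed iteration budget (assumed criterion).
    for _ in range(n_iter):
        for k in range(n_particles):
            # Step 2: evaluate fitness and update the particle best.
            val = objective(pos[k])
            if val < best_val[k]:
                best_val[k], best_pos[k] = val, pos[k][:]
            # Step 3: update the global best.
            if val < g_val:
                g_val, g_pos = val, pos[k][:]
        for k in range(n_particles):
            # Step 4: assumed inertia-weight velocity and position update.
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[k][d] = (inertia * vel[k][d]
                             + c1 * r1 * (best_pos[k][d] - pos[k][d])
                             + c2 * r2 * (g_pos[d] - pos[k][d]))
                pos[k][d] += vel[k][d]
    return g_pos, g_val


def sphere(x):
    """Hypothetical test objective with its minimum at the origin."""
    return sum(v * v for v in x)


best_pos, best_val = pso(sphere, dim=2)
```

For the noise model the objective would be a fitting error over the seven coefficients, so `dim` would be 7.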
is also obvious from Fig. 4, which shows a side by side comparison of measured data and calculated values, and from Fig. 5, which shows those data in a normalized coordinate system. For the sake of conciseness only results obtained using GA are presented.

The correlation coefficient between the measurements and the calculated values of the proposed model is relatively high, which indicates that there is a close relationship between the measurements and the results of the model for all optimization methods. On the other hand, the Fagotti regression shows by far worse results (R = 0.629 for scenario 1 and R = 0.555 for scenario 2).

The high values of the coefficient of determination for both scenarios imply good modelling and predictive capabilities of the proposed model. All statistical parameters show that the proposed model is superior to the classical regression model, particularly concerning predictive capabilities (Table 5).

The null hypothesis of the F-test states that the dispersions of the calculated and measured noise values are approximately equal. For the degrees of freedom k1 = 23, k2 = 23 and p < 0.05, the calculated F value for the various optimization methods and the Fagotti model is less than the table value F_{0.05}^{23,23} = 1.44 (Table 5). It can be concluded that the dispersions of the measured and calculated values do not differ significantly, and there is no basis for rejecting the null hypothesis of equal dispersion of the data sets. As far as the paired t-test is concerned, the mean values of the measured and calculated noise values are approximately equal. For the degree of freedom k = 46 and p < 0.05, the calculated t is less than the table value t_{0.05}^{46} = 1.96 (Table 5), and there is no reason to reject the null hypothesis of equality of means of the data sets. This is not the case with the Fagotti model, where the calculated t value is greater than the table value and the null hypothesis of equality of means of the data sets must be rejected.

Predictive capabilities are illustrated in Fig. 6, which shows a side by side comparison of measured data that was not used in optimization, calculated values obtained using the various optimization methods, and calculated values obtained using the Fagotti model.

7. Conclusion

In this paper a novel approach for modelling of traffic noise is presented. This approach is based on advanced optimization techniques, in contrast to regression analysis, which is the most widely used method for traffic noise modelling. Four optimization techniques were used to optimize the proposed model: genetic algorithms, Hooke and Jeeves algorithm, simulated annealing, and particle swarm optimization. The model is optimized and tested using an originally developed user friendly software package. The goodness of the model is evaluated by statistical parameters and compared with experimental results and with the widely used regression model for traffic noise modelling developed by Fagotti et al. All statistical analyses show that the developed model is precise. Statistical tests showed a high correlation of the proposed model with the measured values, which makes the model significant.

The proposed model operates with relatively simple input values (parameters and structure of traffic flow) that are easily measurable on site and that are already monitored on the traffic network. Validation has shown that by measuring and monitoring traffic flow parameters it is possible, by means of the developed model, to predict the traffic noise level far better than with classical statistical models. That makes it possible to take certain measures toward reducing traffic noise by changing the influential traffic flow parameters and the traffic regime. Based on the noise level calculated by means of the proposed model, it is possible to appraise the noise level in existing as well as in newly designed traffic arteries or detours, or during a traffic regime change on an existing network.

Traffic noise is a complex phenomenon that depends on many influential factors. Apart from intrinsic parameters such as the numbers of vehicles in a specific group of vehicles and the average speed of the traffic flow, there is also a group of parameters that are specific to the area under investigation, such as speed limits, road surface, driving skills and habits, type of intersections, traffic lights' schedules, etc. The proposed model for the analysis of traffic noise is shown to be a reliable tool for the practical calculation of equivalent noise levels based on the traffic flow structure for a typical road in Serbia. The weakness of the model is that it does not take into account the area specific parameters. Therefore future research might include these specific parameters in the model in order to make it more general.

References

Brink, M. (2011). Parameters of well-being and subjective health and their relationship with residential traffic noise exposure: A representative evaluation in Switzerland. Environment International, 37, 723–733.
Cammarata, G., Cavalieri, S., & Fichera, A. (1995). A neural network architecture for noise prediction. Neural Networks, 8, 963–973.
Guarnaccia, C., Lenza, T. L. L., Mastorakis, N. E., & Quartieri, J. (2011). A comparison between traffic noise experimental data and predictive models results. International Journal of Mechanics, 5, 379–386.
Fyhri, A., & Klæboe, R. (2009). Road traffic noise, sensitivity, annoyance and self-reported health: A structural equation model exercise. Environment International, 35, 91–97.
Givargis, S., & Karimi, H. (2010). A basic neural traffic noise prediction model for Tehran's roads. Journal of Environmental Management, 91, 2529–2534.
Gündoğdu, Ö., Gökdağ, M., & Yüksel, F. (2005). A traffic noise prediction method based on vehicle composition using genetic algorithms. Applied Acoustics, 66, 799–809.
Quartieri, J., Mastorakis, N. E., Iannone, G., Guarnaccia, C., D'Ambrosio, S., Troisi, A., & Lenza, T. L. L. (2009). A review of traffic noise predictive models. In The 5th WSEAS international conference on applied and theoretical mechanics, Puerto De La Cruz, Canary Islands.
Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. In Proc. of IEEE international conference on neural network, Piscataway, NJ (pp. 1942–1948).
Pirrera, S., De Valck, E., & Cluydts, R. (2010). Nocturnal road traffic noise: A review on its assessment and consequences on sleep and health. Environment International, 36, 492–498.
Rahmani, S., Mousavi, S. M., & Kamali, M. J. (2011). Modeling of road-traffic noise with the use of genetic algorithm. Applied Soft Computing, 11, 1008–1013.
Rao, S. S. (1996). Engineering optimization: Theory and practice. John Wiley & Sons.
Steele, C. (2001). A critical review of some traffic noise prediction models. Applied Acoustics, 62, 271–287.