
Transportation Research Part C 105 (2019) 126–144
A dual model/artificial neural network framework for privacy analysis in traffic monitoring systems
Edward S. Canepa a,⁎, Christian G. Claudel b
a Electrical Engineering Department, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
b Department of Civil, Architectural and Environmental Engineering, University of Texas Austin, USA
ARTICLE INFO
Keywords: Traffic estimation; Mixed integer linear programming; Neural network
ABSTRACT
Most large scale traffic information systems rely on a combination of fixed sensors (e.g. loop detectors, cameras) and user generated data, the latter in the form of probe traces sent by smartphones or GPS devices onboard vehicles. While this type of data is relatively inexpensive to gather, it can pose multiple privacy risks, even if the location tracks are anonymous. In particular, an attacker may be able to infer user location tracks from anonymous location data, which affects users' privacy. In this article, we propose a new framework for analyzing a variety of privacy problems arising in transportation systems. The state of traffic is modeled by the Lighthill–Whitham–Richards traffic flow model, which is a first order scalar conservation law with a concave flux function. Given a set of traffic flow data, we show that the constraints resulting from this partial differential equation are mixed integer linear inequalities for some decision variable. These constraints allow us to determine the likelihood of two distinct location tracks being generated by the same vehicle. We then use these model-based reidentification metrics to train an artificial neural network classifier. Numerical implementations are performed on experimental data from the Mobile Century experiment, and show that this framework significantly outperforms naive reidentification techniques.
1. Introduction
The convergence of mobile sensing, communication and computing has led to the rise of a new class of systems known as
cyberphysical systems, which are physical systems sensed and actuated by “cyber” agents, an example of which is the transportation
network. In transportation systems, a new form of sensing has emerged over the past few years in the form of probe vehicles. In this
paradigm, the vehicles themselves transmit their speed and location anonymously (http://traffic.berkeley.edu/) to a central server,
which uses this data in conjunction with fixed sensor data (http://pems.eecs.berkeley.edu) to generate real-time traffic maps. While
systems such as the Mobile Millennium system (http://traffic.berkeley.edu/) have successfully demonstrated the concept, multiple
issues remain in terms of privacy (Hoh et al., 2008). Unlike fixed sensors which are difficult to tamper with, it is relatively easy for an
attacker to generate fake data and inject it in the system to modify the estimates with dire consequences, in particular if the traffic
estimates are used for optimal traffic control (traffic lights, ramp metering). In addition, since the user data is sent to a central server,
privacy breaches are a real possibility, even if the tracks are anonymous (Krumm, 2007). A possible approach to reduce the privacy
breach is to decrease the sampling frequency (the frequency with which probe vehicles provide position updates), although this could
affect the service quality as discussed in Hoh et al. (2006a). Because of the increasing penetration of social networks among users,
which give details about workplace or home address, even anonymous tracks could carry substantial privacy risks, provided that an
attacker is able to reliably reconstruct the location track from a set of anonymous location data.
⁎ Corresponding author.
E-mail addresses: edward.canepa@kaust.edu.sa (E.S. Canepa), christian.claudel@utexas.edu (C.G. Claudel).
https://doi.org/10.1016/j.trc.2019.05.031
Received 25 September 2016; Received in revised form 13 March 2019; Accepted 24 May 2019
Available online 06 June 2019
0968-090X/ © 2019 Elsevier Ltd. All rights reserved.
The privacy issues of probe-based traffic information systems have been explored in numerous articles such as (Rass et al., 2008;
Krumm, 2007; Eichler, 2006; Hoh et al., 2008; Hoh et al., 2006b). An interesting approach for large data is the notion of differential
privacy, which provides a quantitative privacy requirement to develop a rigorous privacy mechanism that minimizes the impact on
performance (Le Ny, 2013; Le Ny et al., 2013). However, in order to bring differential privacy into practice there are some challenges
to be addressed depending on the application (Dwork and Pottenger, 2013). These articles do not rely on traffic flow models for the
analysis of probe data, and thus do not make the full use of all information available to an attacker (traffic information is typically
publicly available). For instance, naive vehicle matching algorithms based on a constant velocity assumption such as the scheme used
in Hoh et al. (2008) are only valid when the vehicle density is uniform on the highway. While they perform very well when the
density of vehicles is low, they can lead to overly optimistic privacy metrics in the converse situation, since the probe and fixed sensor
data constrain the possible density profile of the highway, which brings additional information that an attacker can exploit.
One of the biggest difficulties arising when dealing with probe-based traffic flow information systems is the integration of the
effects of the model. Indeed, such systems are traditionally modeled by partial differential equations (PDEs), for which very few
mathematical tools are available for control and estimation. An additional difficulty is the integration of the probe data into the PDE,
which is computationally cumbersome in general (N.G.F., 1998; Claudel and Bayen, 2010a).
We showed earlier (Claudel and Bayen, 2011; Canepa and Claudel, 2012) that the constraints of a PDE model can be written in an
explicit (and tractable) form whenever the underlying model is a first order scalar conservation law with convex flux function,
encoded here by the classical Lighthill–Whitham–Richards (LWR) PDE. For general convex or concave flux functions, the model
constraints are mixed integer convex, and boil down to mixed integer linear inequalities for specific flux functions such as the
triangular flux function (Daganzo, 1994; Daganzo, 2006). Since the constraints of the model are encoded in a tractable form, the
resulting framework is very useful for solving a variety of transportation engineering problems: estimation (Canepa and Claudel,
2012), boundary control, model parameter estimation, which all result in optimization problems with mixed integer convex constraints.
The same framework can be extended to study privacy problems, which is the contribution of this article. In the present case,
we use this framework to quantify the likelihood that two arbitrary vehicle tracks originate from the same vehicle. We then use these
metrics to train a neural network classifier, which matches vehicle tracks together. We then illustrate these results on numerical
implementations using experimental data from the Mobile Century field test (Herrera et al., 2009) and from the Mobile Millennium
system (http://traffic.berkeley.edu/). The results are very promising: this reidentification method significantly outperforms naive
vehicle matching, and thus raises new questions regarding user data protection in traffic sensing systems, particularly when high
quality traffic estimates are available for the attacker.
The rest of this article is organized as follows. Section 2 describes the Hamilton–Jacobi traffic flow model used in this study.
Section 2.5 presents a semi-analytic solution framework, which enables the derivation of explicit PDE model constraints in Section 3.
We use these explicit model constraints and data constraints to derive a measure of the vehicular label difference between two
location tracks, by solving an optimization problem in Section 4. The results of this optimization problem are the input to a neural
network classifier, which we implement and validate in Section 5. This dual model/neural-network based approach yields very good
results, reducing the number of missed reidentifications by almost 40% in comparison with naive reidentification approaches.
2. Framework definition
2.1. Variable definitions
In this article, we use the following notations, which are compared to the notations of Daganzo (2006) in Table 1.
Table 1
Link between (Daganzo, 2006) and the current article.
Concept                 | Notation of Daganzo (2006)  | Notation in this article
Moskowitz function      | $N$                         | $M$
Fundamental diagram     | $Q(\cdot)$                  | $\psi(\cdot)$
Capacity                | $q_{\max}$                  | $v_f \cdot k_c$
Free flow speed         | $w(0)$                      | $v_f$
Congestion speed        | $w(\kappa)$                 | $w$
Maximal density         | $\kappa$                    | $\kappa$
Minimum principle       | Newell's minimum principle  | Inf-morphism property
Representation formula  | Variational Principle       | Lax-Hopf formula
Decision variable       | $x'$                        | $u = -x'$
Convex transform        | $R(\cdot)$                  | $\varphi^*(\cdot) = R(-\cdot)$
Data                    | Boundary data $D$           | Value condition $c$
Computational principle | Dynamic programming         | Semi-analytic formula
2.2. Hamilton–Jacobi formulation of the LWR traffic flow model
Let us assume that the road section is a spatial domain defined by $Z = [\xi, \chi]$, where $\xi$ and $\chi$ are the upstream and downstream
boundaries respectively. We assume that the state of the system is described by a scalar function $M(\cdot,\cdot)$ of both time and space, known
as the Moskowitz function (Moskowitz, 1965; Newell, 1993b). The Moskowitz function is a macroscopic description of traffic flow which
can be thought of as follows: let consecutive integer labels be assigned to vehicles entering the highway at location $x = \xi$. The value
$M(t, x)$ can then be read as the (continuously interpolated) label of the vehicle located at position $x$ at time $t$.
One of the most common models used to describe traffic flow is known as the Lighthill–Whitham–Richards (LWR) model (Lighthill
et al., 1956; Richards, 1956). With this assumption, the Moskowitz function satisfies a Hamilton–Jacobi (HJ) PDE evolution equation:
$$\frac{\partial M(t,x)}{\partial t} - \psi\left(-\frac{\partial M(t,x)}{\partial x}\right) = 0 \tag{1}$$
The function $\psi(\cdot)$ defined in Eq. (1) is the Hamiltonian, or Fundamental Diagram. Eq. (1) represents the Fundamental
Diagram relation, since $\frac{\partial M(t,x)}{\partial t}$ represents the flow at $(t, x)$ and $-\frac{\partial M(t,x)}{\partial x}$ represents the density at $(t, x)$.
Several classes of weak solutions to Eq. (1) exist, such as viscosity solutions (Crandall and Lions, 1983; Bardi and Capuzzo-
Dolcetta, 1997) or Barron-Jensen/Frankowska (B-J/F) solutions (Barron and Jensen, 1990; Frankowska, 1993) used in the present
article. The B-J/F solutions to Eq. (1) are fully characterized by a Lax-Hopf formula (Aubin et al., 2008; Claudel and Bayen, 2010a),
which was initially derived using the control framework of viability theory (Aubin, 1991).
In the remainder of this article, we assume that the Hamiltonian is given by the following formula:
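$$\psi(\rho) = \begin{cases} v_f \, \rho & : \rho \in [0, k_c] \\ w(\rho - \kappa) & : \rho \in [k_c, \kappa] \end{cases} \tag{2}$$
where
$$k_c = \frac{-w \kappa}{v_f - w}$$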
Such a Hamiltonian is often referred to in the transportation literature as a triangular fundamental diagram (Daganzo, 1994; Daganzo,
2006), and is commonly used to model traffic flow because of its robustness.
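As a minimal sketch of the triangular Hamiltonian (2), the function below evaluates the flow for a given density. It assumes the sign convention in which the congestion speed `w` is negative and `kappa` denotes the maximal density; the numerical parameter values are illustrative assumptions and are not taken from the article.

```python
def critical_density(v_f, w, kappa):
    """Critical density k_c = -w * kappa / (v_f - w), assuming w < 0 < v_f."""
    return -w * kappa / (v_f - w)


def triangular_flux(rho, v_f, w, kappa):
    """Triangular fundamental diagram: flow as a function of density rho."""
    if not 0.0 <= rho <= kappa:
        raise ValueError("density must lie in [0, kappa]")
    k_c = critical_density(v_f, w, kappa)
    if rho <= k_c:
        return v_f * rho          # free flow branch
    return w * (rho - kappa)      # congested branch


# Illustrative parameters (assumptions, not values from the article)
V_F = 30.0     # free flow speed, m/s
W = -5.0       # congestion speed, m/s (negative)
KAPPA = 0.14   # maximal density, veh/m

K_C = critical_density(V_F, W, KAPPA)
CAPACITY = triangular_flux(K_C, V_F, W, KAPPA)
```

Both branches meet at the critical density, so the capacity can be read from either of them.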
2.3. Lax-Hopf formula
Solving the HJ PDE (1) requires the definition of value conditions, which encompass the concept of initial, upstream, downstream
and internal boundary conditions. The value conditions are known as boundary data in Daganzo (2006).
Definition 2.1 (Value condition). A value condition $c(\cdot,\cdot)$ is a lower semicontinuous function ranging in $\mathbb{R} \cup \{+\infty\}$. The effective domain
(Rockafellar, 1970) of $c(\cdot,\cdot)$ is
$$\mathrm{Dom}(c) = \{(t, x) \in \mathbb{R}_+ \times Z \mid c(t, x) < +\infty\}.$$
In all applications of this work, the value conditions are assumed to be affine functions of space and time, defined on a line
segment of $\mathbb{R}_+ \times Z$.
Physically, the effective domain of definition $\mathrm{Dom}(c)$ of a value condition $c$ represents the subset of the space time domain $\mathbb{R}_+ \times Z$
in which we want the value condition to apply. For instance, imposing an upstream boundary condition $c_{\text{upstream}}(\cdot,\cdot)$ amounts to
constraining the value of the flow on the set $\mathrm{Dom}(c_{\text{upstream}}) = \mathbb{R}_+ \times \{\xi\}$, i.e. constraining the value of the flow at the upstream boundary,
and for all times.
Given an arbitrary value condition c(·,·) , we define its associated solution Mc (·,·) to (1) by the following Lax-Hopf formula (Aubin
et al., 2008; Claudel and Bayen, 2010a).
Proposition 2.2 (Lax-Hopf formula). Let $c(\cdot,\cdot)$ be a value condition, as in Definition 2.1. The B-J/F solution $M_c(\cdot,\cdot)$ to (1) with Hamiltonian (2)
associated with $c(\cdot,\cdot)$ is defined (Aubin et al., 2008; Claudel and Bayen, 2010a) by:
$$M_c(t, x) = \inf_{(u, T) \in [-v_f, -w] \times \mathbb{R}_+} \left( c(t - T, x + Tu) + T k_c (u + v_f) \right) \tag{3}$$
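To make the Lax-Hopf formula (3) concrete, the sketch below approximates the infimum by a brute-force grid search over $(u, T)$ for an initial condition of constant density. This is only an illustration and not the semi-analytic method used in the article; it assumes that the congestion speed is negative, so that $u$ ranges over $[-v_f, -w]$, and all parameter values and helper names are hypothetical.

```python
import math

V_F, W, KAPPA = 30.0, -5.0, 0.14      # illustrative parameters (assumptions)
K_C = -W * KAPPA / (V_F - W)          # critical density


def psi(rho):
    """Triangular Hamiltonian (2)."""
    return V_F * rho if rho <= K_C else W * (rho - KAPPA)


def constant_density_initial(rho, x_lo, x_hi):
    """Affine value condition defined at t = 0 on [x_lo, x_hi], +inf elsewhere."""
    def c(t, x):
        if abs(t) < 1e-9 and x_lo <= x <= x_hi:
            return -rho * (x - x_lo)
        return math.inf
    return c


def lax_hopf(c, t, x, n_u=141, n_T=101):
    """Grid approximation of the infimum in (3) over u in [-v_f, -w], T in [0, t]."""
    best = math.inf
    for i in range(n_u):
        u = -V_F + (V_F - W) * i / (n_u - 1)
        for j in range(n_T):
            T = t * j / (n_T - 1)
            value = c(t - T, x + T * u)
            if value < math.inf:
                best = min(best, value + T * K_C * (u + V_F))
    return best


M_free = lax_hopf(constant_density_initial(0.01, 0.0, 1000.0), 10.0, 500.0)
M_jam = lax_hopf(constant_density_initial(0.08, 0.0, 1000.0), 10.0, 500.0)
```

For a uniform density $\rho$, the result can be compared with the affine function $-\rho x + \psi(\rho) t$, whose partial derivatives satisfy Eq. (1) directly.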
The structure of the Lax-Hopf formula (3), implies the following important property, known as inf-morphism property. This
property was first conjectured by Newell (Newell, 1993a), and is also known as Newell’s minimum principle.
Proposition 2.3 (Inf-morphism property). Let the value condition $c(\cdot,\cdot)$ be the minimum of a finite number of lower semicontinuous functions:
$$\forall (t, x) \in [0, t_{\max}] \times Z, \quad c(t, x) := \min_{j \in J} c_j(t, x) \tag{4}$$
The solution $M_c(\cdot,\cdot)$ associated with the above value condition can be decomposed (Aubin et al., 2008; Claudel and Bayen, 2010a; Claudel
and Bayen, 2010b) as:
$$\forall (t, x) \in [0, t_{\max}] \times Z, \quad M_c(t, x) = \min_{j \in J} M_{c_j}(t, x) \tag{5}$$
The above proposition has considerable importance in experimental problems, in which the value condition function is typically a
set of piecewise affine functions (Claudel and Bayen, 2011). In this case, the value condition c(·,·) can be decomposed as (4), where
the functions cj (·,·) are all affine functions. By the Lax-Hopf formula (3), one can easily compute the function Mcj associated with
cj (·,·) analytically, since it amounts to solving a one dimensional linear program with few constraints (Claudel and Bayen, 2010b).
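The inf-morphism property (5) can be checked numerically on a small case. The sketch below, written under the same sign assumption as above ($w < 0$, $u \in [-v_f, -w]$), solves two affine initial blocks separately and jointly with a grid approximation of (3), and compares the two results. The parameter values and function names are illustrative assumptions.

```python
import math

V_F, W, K_C = 30.0, -5.0, 0.02   # illustrative parameters (assumptions)


def initial_block(rho, x_lo, x_hi, offset):
    """Affine initial condition offset - rho * (x - x_lo) on [x_lo, x_hi] at t = 0."""
    def c0(x):
        return offset - rho * (x - x_lo) if x_lo <= x <= x_hi else math.inf
    return c0


def solve_initial(c0, t, x, n_u=351):
    """Grid approximation of (3) for an initial condition (only T = t contributes)."""
    best = math.inf
    for i in range(n_u):
        u = -V_F + (V_F - W) * i / (n_u - 1)
        best = min(best, c0(x + t * u) + t * K_C * (u + V_F))
    return best


c1 = initial_block(0.01, 0.0, 500.0, 0.0)
c2 = initial_block(0.08, 500.0, 1000.0, -0.01 * 500.0)


def c_min(x):
    """Value condition (4): pointwise minimum of the two blocks."""
    return min(c1(x), c2(x))


t, x = 10.0, 450.0
M_joint = solve_initial(c_min, t, x)                              # solve c directly
M_split = min(solve_initial(c1, t, x), solve_initial(c2, t, x))   # use (5)
```

Solving each block separately is what makes the approach cheap in practice: each component has a closed form, and the global solution is just their pointwise minimum.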
2.4. Model constraints for piecewise affine value conditions
In the remainder of this article, we decompose the value condition $c(\cdot,\cdot)$ into affine block value conditions $c_j$, $j \in J$, each representing
some measurement data. The relation between block value conditions and measurement data is outlined in Section 2.5.
One of the specificities of our problem is that the functions $c_j(\cdot,\cdot)$ are not exactly known from the measurement data: measurement
data only constrain the values of some of their coefficients. Thus, from a given experimental dataset, one cannot define the function
$c(\cdot,\cdot)$ uniquely.
In the remainder of this article, we assume that the model constraints hold for a given value condition candidate c(·,·) if and only
if the following condition is satisfied:
$$\forall (t, x) \in \mathrm{Dom}(c), \quad M_c(t, x) = c(t, x) \tag{6}$$
The above condition can be physically interpreted as follows. The function c(·,·) represents a desired value of the boundary data,
while the function Mc (·,·) represents the actual value of the boundary data. These can be different: for instance it may be that the
desired inflow on a highway section is too high (given the current traffic state), and thus the actual inflow can be lower than the
desired inflow.
Using the inf-morphism property (5), one can rewrite (6) as follows:
Proposition 2.4 (Model compatibility of block value conditions). Let $c(\cdot,\cdot) = \min_{j \in J} c_j(\cdot,\cdot)$ be given, and let $M_c(\cdot,\cdot)$ be defined as in (3). The
value condition $c(\cdot,\cdot)$ satisfies (6) if and only if the following inequality constraints are satisfied:
$$M_{c_j}(t, x) \ge c_i(t, x) \quad \forall (t, x) \in \mathrm{Dom}(c_i), \ \forall (i, j) \in J^2 \tag{7}$$
The proof of this proposition is available in Claudel and Bayen (2011).
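A sampled version of the compatibility test (7) can be sketched as follows. The example uses two adjacent initial blocks: at $t = 0$ only $T = 0$ is admissible in (3), so the solution component of an initial block coincides with the block itself on the initial time line, and (7) reduces to requiring that the two blocks agree at their shared endpoint. The helper names, sampling scheme and numerical values are assumptions made for illustration.

```python
import math


def initial_block(rho, x_lo, x_hi, offset):
    """Affine initial block, finite only at t = 0 and x in [x_lo, x_hi]."""
    def c(t, x):
        if t == 0 and x_lo <= x <= x_hi:
            return offset - rho * (x - x_lo)
        return math.inf
    return c


def is_model_compatible(blocks, solutions, domains, tol=1e-9):
    """Check M_{c_j} >= c_i at sampled points of Dom(c_i), for every pair (i, j)."""
    return all(
        M_j(t, x) >= c_i(t, x) - tol
        for c_i, points in zip(blocks, domains)
        for (t, x) in points
        for M_j in solutions
    )


def sample(x_lo, x_hi, n=11):
    """Sample points of the initial time line between x_lo and x_hi."""
    return [(0, x_lo + (x_hi - x_lo) * i / (n - 1)) for i in range(n)]


c1 = initial_block(0.01, 0.0, 500.0, 0.0)
c2_ok = initial_block(0.08, 500.0, 1000.0, -5.0)    # matches c1 at x = 500
c2_bad = initial_block(0.08, 500.0, 1000.0, -6.0)   # jump of one vehicle at x = 500
domains = [sample(0.0, 500.0), sample(500.0, 1000.0)]

# At t = 0 each solution component equals its own block (see lead-in).
result_ok = is_model_compatible([c1, c2_ok], [c1, c2_ok], domains)
result_bad = is_model_compatible([c1, c2_bad], [c1, c2_bad], domains)
```

A sampled check can only refute compatibility; the article instead relies on the explicit inequalities derived from the closed-form solution components.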

When the above compatibility property is satisfied, all value conditions can be imposed in the strong sense (Strub and Bayen,
2006), i.e. the solution to the HJ PDE (1) will be identical to the value conditions that we have imposed, on the domain of definition
of these value conditions.
The above compatibility conditions are not sufficient to solve the problem entirely. As mentioned earlier, we consider B-J/F solutions
to (1), which are only lower semicontinuous in general, and thus can be discontinuous, while the actual Moskowitz function
should be continuous (Newell, 1993b; Canepa and Claudel, 2012). Hence, the Moskowitz function Mc (·,·) has to satisfy additional
continuity conditions, which are derived in Section 3.1.
We now define the affine initial, boundary and internal condition functions that will play the role of building blocks to construct
the value condition c(·,·) of the problem, as described in (4).
2.5. Affine initial, boundary and internal conditions
Multiple types of value conditions can be incorporated into the estimation problem. In the present article, we include initial,
boundary and internal conditions. The initial and boundary conditions are typically measured (with some error) using fixed sensors,
such as inductive loop detectors, magnetometers or traffic cameras. Similarly, the internal conditions are partially measured using
probe vehicle trajectories (Work et al., 2010).
2.6. Definition of affine initial, boundary and internal conditions
The formal definition of initial, upstream, downstream and internal conditions associated with the HJ PDE (1) is the subject of
the following definition.
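Definition 2.5. [Affine initial, boundary and internal conditions] Let us define $\mathbb{K} = \{0, \dots, k_{\max}\}$, $\mathbb{N} = \{0, \dots, n_{\max}\}$ and
$\mathbb{M} = \{0, \dots, m_{\max}\}$. For all $k \in \mathbb{K}$, $n \in \mathbb{N}$ and $m \in \mathbb{M}$, we define the following functions, respectively called initial, upstream,
downstream (boundary) and internal conditions:
$$M_k(t, x) = \begin{cases} -\sum_{i=0}^{k-1} \rho(i) X - \rho(k)(x - kX) & \text{if } t = 0 \text{ and } x \in [kX, (k+1)X] \\ +\infty & \text{otherwise} \end{cases} \tag{8}$$
$$\gamma_n(t, x) = \begin{cases} \sum_{i=0}^{n-1} q_{\text{in}}(i) T + q_{\text{in}}(n)(t - nT) & \text{if } x = \xi \text{ and } t \in [nT, (n+1)T] \\ +\infty & \text{otherwise} \end{cases} \tag{9}$$
$$\beta_n(t, x) = \begin{cases} \sum_{i=0}^{n-1} q_{\text{out}}(i) T + q_{\text{out}}(n)(t - nT) - \sum_{k=0}^{k_{\max}} \rho(k) X & \text{if } x = \chi \text{ and } t \in [nT, (n+1)T] \\ +\infty & \text{otherwise} \end{cases} \tag{10}$$
$$\mu_m(t, x) = \begin{cases} L_m + r_m (t - t_{\min}(m)) & \text{if } x = x_{\min}(m) + \frac{x_{\max}(m) - x_{\min}(m)}{t_{\max}(m) - t_{\min}(m)} (t - t_{\min}(m)) \text{ and } t \in [t_{\min}(m), t_{\max}(m)] \\ +\infty & \text{otherwise} \end{cases} \tag{11}$$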
In the above definition, $T$ and $X$ represent the time and space segmentation of the boundary conditions and initial condition
respectively. The coefficients used in the above definition can be physically interpreted as follows:
1. The initial condition (8) represents a constant density $\rho(k)$ over the interval $[kX, (k+1)X]$.
2. The upstream condition (9) represents a constant flow $q_{\text{in}}(n)$ over the interval $[nT, (n+1)T]$.
3. The downstream condition (10) represents a constant flow $q_{\text{out}}(n)$ over the interval $[nT, (n+1)T]$.
4. The internal condition (11) represents a vehicle starting from $x_{\min}(m)$ at time $t_{\min}(m)$, driving with the speed $\frac{x_{\max}(m) - x_{\min}(m)}{t_{\max}(m) - t_{\min}(m)}$, until
time $t_{\max}(m)$. The vehicle has an initial label $L_m$, and is passed by $r_m$ vehicles per time unit.
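The affine blocks of Definition 2.5 translate directly into small functions. The sketch below builds an initial condition (8) and an internal condition (11) as callables returning `math.inf` outside their effective domain. The function names, the tolerance used to test membership of the probe trajectory, and the numerical values are assumptions made for illustration.

```python
import math


def initial_condition(rho, X, k):
    """Block k of Eq. (8) for a piecewise constant density profile rho[0..k_max]."""
    offset = -sum(rho[i] * X for i in range(k))   # vehicles upstream of block k
    def M_k(t, x):
        if t == 0 and k * X <= x <= (k + 1) * X:
            return offset - rho[k] * (x - k * X)
        return math.inf
    return M_k


def internal_condition(L_m, r_m, t_min, t_max, x_min, x_max, tol=1e-9):
    """Eq. (11): probe vehicle driving at constant speed between two GPS fixes."""
    speed = (x_max - x_min) / (t_max - t_min)
    def mu_m(t, x):
        on_path = abs(x - (x_min + speed * (t - t_min))) <= tol
        if on_path and t_min <= t <= t_max:
            return L_m + r_m * (t - t_min)
        return math.inf
    return mu_m


# Illustrative data (assumptions): two 500 m cells and one probe track
M_1 = initial_condition([0.01, 0.03], 500.0, 1)
mu_0 = internal_condition(L_m=3.0, r_m=0.1, t_min=10.0, t_max=20.0,
                          x_min=100.0, x_max=200.0)

value_initial = M_1(0, 600.0)      # inside block 1 at t = 0
value_internal = mu_0(15.0, 150.0)  # halfway along the probe trajectory
```

In the estimation problem the densities, flows, `L_m` and `r_m` are unknowns rather than given numbers; fixing them here only serves to show the shape of each block.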
In practical problems, the initial, boundary and internal conditions defined above are usually not known exactly. In particular, we
do not usually know the exact values of the initial densities $\rho(\cdot)$, the boundary flows $q_{\text{in}}(\cdot)$ and $q_{\text{out}}(\cdot)$, as well as the coefficients $L_m$
and $r_m$ of the internal conditions. Some coefficients such as $\rho(\cdot)$, $q_{\text{in}}(\cdot)$ and $q_{\text{out}}(\cdot)$ can be known with some uncertainty using flow or
traffic density sensors, but some coefficients such as $L_m$ and $r_m$ simply cannot be measured experimentally by any traffic sensor. All of
these unknown variables will act as part of our decision variable for the Mixed Integer Linear Program (MILP) derived in Section 3.
Note that the coefficients $x_{\min}(\cdot)$, $x_{\max}(\cdot)$, $t_{\min}(\cdot)$ and $t_{\max}(\cdot)$ are known with high accuracy since they are typically measured with a
GPS, and will thus not be part of the problem's decision variable.
2.7. Analytical solutions to affine initial, boundary and internal conditions
Given the affine initial, upstream, downstream and internal conditions defined above, the corresponding solutions
$M_{M_k}(\cdot,\cdot)$, $M_{\gamma_n}(\cdot,\cdot)$, $M_{\beta_n}(\cdot,\cdot)$ and $M_{\mu_m}(\cdot,\cdot)$ defined by the Lax-Hopf formula (3) can be computed explicitly (Claudel and Bayen, 2011;
Mazare et al., 2011) as closed-form expressions. These expressions, in the case of the triangular fundamental diagram, can be found in
Appendix A of this article.
The closed-form expressions of $M_{M_k}(\cdot,\cdot)$, $M_{\gamma_n}(\cdot,\cdot)$, $M_{\beta_n}(\cdot,\cdot)$ and $M_{\mu_m}(\cdot,\cdot)$ are very important: they enable one to compute the
solution to the HJ PDE (1) semi-analytically for a very low computational cost using the inf-morphism property (Claudel and Bayen,
2010a; Mazare et al., 2011). They also enable one to write the model compatibility condition (7) as a set of linear
inequalities in a specific decision variable.
These expressions can be physically interpreted as follows. The solution Mc (·,·) associated to c(·,·) represents the largest value of
the Moskowitz function that would be the solution to a problem involving c(·,·) and other value conditions. The functions Mc (·,·) are
only defined on a subset of the entire computational domain, called domain of influence (Claudel and Bayen, 2010b), and limited by
the maximal and minimal velocity of the characteristics.
3. Constraints arising from model and measurement data
We consider a set of block boundary conditions $c_j$ defined as in Section 2.6, with unknown coefficients. Let us call $V$ the vector
space of unknown coefficients. Our measurement data (from the data set) constrain the possible values of these coefficients. Such
constraints are called data constraints, and are outlined in Section 3.2 below. Similarly, the model compatibility conditions (7) also
constrain the possible values of the unknown coefficients. Such constraints are called model constraints, and are outlined in Section
3.1. An important and nontrivial result of Claudel and Bayen (2011) is that all these constraints are explicit, and also tractable. A list
of all constraints can be found in Canepa and Claudel (2012) and Canepa and Claudel (2013b).
3.1. Model constraints
The model constraints are derived in Appendix B. These constraints have an important property outlined below.
Fact 3.1 (Linear inequality property). The model constraints derived in Appendix B are linear in the variables
$\rho(1), \rho(2), \dots, \rho(k_{\max})$, $q_{\text{in}}(1), \dots, q_{\text{in}}(n_{\max})$, $q_{\text{out}}(1), \dots, q_{\text{out}}(n_{\max})$, $L_1, \dots, L_{m_{\max}}$ and $r_1, \dots, r_{m_{\max}}$.
While the HJ PDE model constraints (7) ensure that the initial, boundary and internal conditions can all be applied in the strong
sense, they do not ensure the continuity of the solution (the BJ/F solution to the HJ PDE (1) is only lower-semicontinuous in general
(Aubin et al., 2008)). In this article, we look for continuous solutions, since they correspond to the physically meaningful solutions.
Non-continuous solutions violate the principle of conservation of vehicles. The necessary and sufficient conditions for the continuity
of the function $M_c(\cdot,\cdot)$ defined by (5) are outlined in Proposition 3.2 below.
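Proposition 3.2 (Continuity constraints with relaxation term). Let a set of initial, boundary and internal conditions be defined as in (8), and
let the corresponding partial solutions be defined as $M_{M_k}(\cdot,\cdot)$, $M_{\gamma_n}(\cdot,\cdot)$, $M_{\beta_n}(\cdot,\cdot)$ and $M_{\mu_m}(\cdot,\cdot)$. Let us also assume that the model constraints (7)
are satisfied. Let $M_p(\cdot,\cdot)$ be defined as $M_p(\cdot,\cdot) = \min_{k, n, m \mid m \neq p} (M_{M_k}(\cdot,\cdot), M_{\gamma_n}(\cdot,\cdot), M_{\beta_n}(\cdot,\cdot), M_{\mu_m}(\cdot,\cdot))$. The solution $M(\cdot,\cdot)$ to the HJ PDE (1)
defined by $M(\cdot,\cdot) = \min_{k, n, m} (M_{M_k}(\cdot,\cdot), M_{\gamma_n}(\cdot,\cdot), M_{\beta_n}(\cdot,\cdot), M_{\mu_m}(\cdot,\cdot))$ is continuous if and only if the following conditions are satisfied
(with $h = 0$):
$$\forall p \in \mathbb{M}, \quad M_p(t_{\min}(p), x_{\min}(p)) = \mu_p(t_{\min}(p), x_{\min}(p)) \pm h \tag{12}$$
$$\begin{aligned}
\mu_m(t_{\min}(m), x_{\min}(m)) \pm h = \min\big( & M_{M_k}(t_{\min}(m), x_{\min}(m)), \ M_{\gamma_n}(t_{\min}(m), x_{\min}(m)), \\
& M_{\beta_n}(t_{\min}(m), x_{\min}(m)), \ M_{\mu_p}(t_{\min}(m), x_{\min}(m)) \big) \quad \forall k \in \mathbb{K}, \ \forall n \in \mathbb{N}, \ \forall (m, p) \in \mathbb{M}^2 \\
\mu_m(t_{\max}(m), x_{\max}(m)) \pm h = \min\big( & M_{M_k}(t_{\max}(m), x_{\max}(m)), \ M_{\gamma_n}(t_{\max}(m), x_{\max}(m)), \\
& M_{\beta_n}(t_{\max}(m), x_{\max}(m)), \ M_{\mu_p}(t_{\max}(m), x_{\max}(m)) \big) \quad \forall k \in \mathbb{K}, \ \forall n \in \mathbb{N}, \ \forall (m, p) \in \mathbb{M}^2
\end{aligned} \tag{13}$$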
The term h represents the relaxation term: it is a measure of the distance to feasibility of the original problem introduced in
Canepa and Claudel (2013a), and allows us to incorporate model uncertainty into this problem (the lower the value of h, the better
the data fits the model). This term represents the integration errors associated with incorrectly-predicted densities and flows between
value conditions. This term can only be introduced thanks to the lower-semicontinuity of the BJ/F solutions to (1): alternative
frameworks such as frameworks derived from computational methods associated with the LWR model (including for example the
CTM), the wave-front tracking framework, or the Variational method framework, all consider continuous solutions to the HJ PDE (1).
With such constraints, the solution to the HJ PDE is only lower semicontinuous in general.
The lower semicontinuity of M can be interpreted as follows. The function M that is solution to the HJ PDE (1) does not match the
true value of the experimental Moskowitz function because of model noise. The presence of the h term allows this uncertainty to be
considered, by allowing discrepancies between the model prediction of the value of the label of a floating car, and its actual value,
yielding a nonzero h (a zero h indicates that the available measurement data can be fitted perfectly using the LWR model).
The inequality constraints (12) can be written as a set of mixed integer linear inequalities involving the continuous variables
$\rho(1), \rho(2), \dots, \rho(k_{\max})$, $q_{\text{in}}(1), \dots, q_{\text{in}}(n_{\max})$, $q_{\text{out}}(1), \dots, q_{\text{out}}(n_{\max})$, $L_1, \dots, L_{m_{\max}}$ and $r_1, \dots, r_{m_{\max}}$, as well as auxiliary integer variables.
The proof of (12) is straightforward, and follows directly (Claudel and Bayen, 2011) from the piecewise affine structure of the
partial solutions $M_{M_k}(\cdot,\cdot)$, $M_{\gamma_n}(\cdot,\cdot)$, $M_{\beta_n}(\cdot,\cdot)$ and $M_{\mu_m}(\cdot,\cdot)$.
The fact that (12) can be written as a set of mixed integer linear inequalities is more involved. It can be shown that since
$M_p(\cdot,\cdot) = \min_{k, n, m \mid m \neq p} (M_{M_k}(\cdot,\cdot), M_{\gamma_n}(\cdot,\cdot), M_{\beta_n}(\cdot,\cdot), M_{\mu_m}(\cdot,\cdot))$, Eq. (12) can be written as a set of inequalities involving the continuous
variables $\rho(1), \rho(2), \dots, \rho(k_{\max})$, $q_{\text{in}}(1), \dots, q_{\text{in}}(n_{\max})$, $q_{\text{out}}(1), \dots, q_{\text{out}}(n_{\max})$, $L_1, \dots, L_{m_{\max}}$ and $r_1, \dots, r_{m_{\max}}$, as well as boolean variables. An
example of such derivation is shown in Canepa and Claudel (2012) for the case in which $m_{\max} = 1$. These inequalities can be further
rewritten as mixed integer linear inequalities using the piecewise affine dependency of the partial solutions with respect to the
variables $\rho(1), \rho(2), \dots, \rho(k_{\max})$, $q_{\text{in}}(1), \dots, q_{\text{in}}(n_{\max})$, $q_{\text{out}}(1), \dots, q_{\text{out}}(n_{\max})$, $L_1, \dots, L_{m_{\max}}$ and $r_1, \dots, r_{m_{\max}}$.
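The way an equality involving a minimum turns into mixed integer linear inequalities can be illustrated with the standard big-M encoding. This is a generic sketch and not necessarily the derivation used in the cited references; the symbols $z$, $a_i$, $b_i$ and the constant $C$ are notational assumptions, with each $a_i$ standing for one affine partial solution evaluated at a fixed point and $C$ an upper bound on $\max_i a_i - \min_i a_i$ over all admissible values.

```latex
z = \min_{i \in \{1, \dots, N\}} a_i
\quad \Longleftrightarrow \quad
\exists\, b \in \{0, 1\}^N : \
\begin{cases}
z \le a_i, & i = 1, \dots, N \\
z \ge a_i - C\,(1 - b_i), & i = 1, \dots, N \\
\sum_{i=1}^{N} b_i = 1
\end{cases}
```

The binary variable $b_i$ selects the term that attains the minimum: for the selected index the two inequalities force $z = a_i$, while for the other indices the second inequality is inactive as long as $C$ is large enough.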
In the remainder of this article, we define $y$ as the decision variable of the problem, which contains the continuous variables
$\rho(1), \rho(2), \dots, \rho(k_{\max})$, $q_{\text{in}}(1), \dots, q_{\text{in}}(n_{\max})$, $q_{\text{out}}(1), \dots, q_{\text{out}}(n_{\max})$, $L_1, \dots, L_{m_{\max}}$ and $r_1, \dots, r_{m_{\max}}$, with the relaxation term $h$ and additional integer variables used to encode the continuity
constraints and the internal conditions.
With the above assumptions, the model and relaxed continuity constraints are mixed integer linear (since $y$ contains integer
variables):
$$A y \le b \tag{14}$$
Note that the number of integer variables in y is a function of the configuration of the internal conditions, and is not a function of
$k_{\max}$, $n_{\max}$ and $m_{\max}$ only. Thus, the size of $y$ is difficult to predict or even estimate in advance, which can pose a problem for real time
systems, though an upper bound on the size of $y$ can be found for all values of $k_{\max}$, $n_{\max}$ and $m_{\max}$.
Physically, the model constraints are related to the concept of weak/strong solutions to hyperbolic conservation laws (Strub and
Bayen, 2006). For a given dataset, the initial, upstream, downstream and internal conditions do not necessarily strongly apply, i.e. the
solution to (1) does not necessarily reflect all these constraints. For example, one cannot simultaneously impose a completely
jammed initial condition with a nonzero upstream condition. Eq. (14) represents the set of conditions ensuring that solutions are
physical (continuity constraints), and that all initial, upstream, downstream and internal conditions strongly apply simultaneously.
3.2. Data constraints
Similarly, the unknown coefficients of the initial, boundary and internal conditions have to satisfy data constraints to be compatible
with the observations. The data constraints express the fact that the true values of the condition coefficients should be close
to the corresponding measurements, within the corresponding sensor specifications.
Hypothesis 3.3 (Data constraints). In the remainder of our article, we assume that the data constraints are linear in the unknown
coefficients of the initial, boundary and internal conditions, and can thus be written symbolically as
$$C y \le d \tag{15}$$
where $y$ is the decision variable defined earlier.

Different choices of error models that yield convex data constraints are available such as the example outlined below.
Example of convex data constraints. Consider a sensor measuring the boundary flows $(q_{\text{in}}(0), \dots, q_{\text{in}}(n_{\max}))$ with 6% relative
uncertainty, a loop detector measuring the initial density $\rho(3)$ with 12% absolute uncertainty, and no downstream sensor. In this
situation, the constraints are convex inequalities (indeed, linear inequalities) in the decision variable:
$$\begin{aligned}
& 0.94\, q_{\text{in}}^{\text{measured}}(n) \le q_{\text{in}}(n) \le 1.06\, q_{\text{in}}^{\text{measured}}(n) \quad \forall n \in [0, n_{\max}] \\
& \rho(3)^{\text{measured}} - 0.12\, \rho_m \le \rho(3) \le \rho(3)^{\text{measured}} + 0.12\, \rho_m
\end{aligned} \tag{16}$$
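Interval constraints of this kind can be stacked into the form $C y \le d$ of (15) by writing each two-sided bound as two rows. The sketch below does this for the relative uncertainty on the inflows only; the decision vector is assumed to contain just the inflow coefficients, and the function names and measurement values are hypothetical.

```python
def box_rows(index, lo, hi, n_vars):
    """Encode lo <= y[index] <= hi as two (coefficients, bound) rows of C y <= d."""
    upper = [0.0] * n_vars
    upper[index] = 1.0        #  y[index] <= hi
    lower = [0.0] * n_vars
    lower[index] = -1.0       # -y[index] <= -lo
    return [(upper, hi), (lower, -lo)]


def build_data_constraints(q_in_measured, rel_err):
    """Stack relative-error bounds on every inflow coefficient into (C, d)."""
    n_vars = len(q_in_measured)
    rows = []
    for n, q in enumerate(q_in_measured):
        rows += box_rows(n, (1 - rel_err) * q, (1 + rel_err) * q, n_vars)
    C = [coeffs for coeffs, _ in rows]
    d = [bound for _, bound in rows]
    return C, d


def satisfies(C, d, y, tol=1e-12):
    """Return True when every row of C y <= d holds for the candidate y."""
    return all(
        sum(c * v for c, v in zip(row, y)) <= bound + tol
        for row, bound in zip(C, d)
    )


q_measured = [0.40, 0.50]   # hypothetical inflow measurements, veh/s
C, d = build_data_constraints(q_measured, 0.06)
```

A candidate such as `[0.41, 0.50]` satisfies these rows, whereas `[0.45, 0.50]` violates the upper bound on the first inflow.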
4. Applications to privacy analysis
We now present some applications of this framework to privacy problems occurring in probe-based traffic information systems.
Though we present only one complete example for compactness, more problems could be posed as mixed integer linear programs in
the same framework. Examples of such problems include:
• Real time assessment of vulnerability to attacks on given location tracks
• Offline assessment of worst case effects of attacks
Fig. 1. Naive (non model-based) reidentification method. In this figure, our objective is to determine if IC 2, IC 3, IC 4 or IC 5 originate
from the same vehicle as IC 1. The dotted line represents the actual trajectory of the vehicle that generated IC 1, which ends in IC 3 in this figure. The
upper subfigure shows the non model-based reidentification scheme described in Hoh et al. (2008), in which the velocity of the vehicle is assumed to
be constant, and the successor of IC 1 is chosen as the closest to the predicted position of the vehicle (circle), assuming a constant velocity (dashed
line).
4.1. Naive reidentification method
The naive (kinematic-based) reidentification method of Hoh et al. (2008) is based on the idea that vehicles maintain their velocity
between two location tracks. The principle is illustrated in Fig. 1 below.
While the naive reidentification method performs very well in free flow conditions (since vehicles are indeed free to choose and
maintain their velocities), the method can fail during congestion events. In the latter case, vehicles are not free to select their
velocities anymore, and the effects of the model (back-propagating shockwaves and speed reduction) become apparent. This is
illustrated in the following section.
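The constant-velocity idea behind the naive method can be sketched in a few lines. The code below extrapolates the last observed position of a track at its last observed speed and selects the candidate whose first observation is closest to that prediction. It is an illustration of the principle only: the exact matching metric of Hoh et al. (2008) may differ, and the data layout and values are assumptions.

```python
def predict_position(last, t):
    """Constant-velocity extrapolation from the last fix (t0, x0, v0) to time t."""
    t0, x0, v0 = last
    return x0 + v0 * (t - t0)


def naive_reidentify(last, candidates):
    """Return the index of the candidate (t, x) closest to the predicted position."""
    errors = [abs(x - predict_position(last, t)) for (t, x) in candidates]
    return min(range(len(candidates)), key=errors.__getitem__)


# Hypothetical data: last fix of the tracked vehicle and first fixes of candidates
last_fix = (0.0, 0.0, 25.0)                      # time (s), position (m), speed (m/s)
candidates = [(10.0, 180.0), (10.0, 240.0), (12.0, 330.0)]

best = naive_reidentify(last_fix, candidates)
```

Because the prediction ignores the traffic state between the two fixes, a vehicle that slows down in a congested zone ends up far from its predicted position, which is the failure mode described above.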
4.2. Model-based reidentification method
An internal condition of the form (11) can be interpreted as follows. The coefficients $L_m$ and $r_m$ respectively represent the initial
value of the Moskowitz function corresponding to the internal condition (at position $x_{\min}(m)$ and at time $t_{\min}(m)$), and the passing rate
(number of vehicles passing the probe vehicle per unit time).
By construction, vehicle trajectories correspond to the isolines of the Moskowitz function (Newell, 1993b), assuming that no
passing occurs. In this situation, two internal conditions $\mu_m$ and $\mu_n$ are generated by the same vehicle if and only if $L_m = L_n$.
Evidently, the general assumption that vehicles are not allowed to pass each other does not hold in practice, but it is in most
situations a very good approximation. Under this approximation, the problem of reidentification (Hoh et al., 2008) (i.e. do the
internal conditions µm and µn originate from the same vehicle?) can be posed as:
$$
\begin{aligned}
\text{Minimize} \quad & |L_m - L_n| \\
\text{s.t.} \quad & A y \le b \\
& C y \le d
\end{aligned}
\tag{17}
$$
If the optimal value of (17) is zero, $\mu_m$ and $\mu_n$ can (though need not) originate from the same vehicle. Indeed, in that case there exists an arrangement of the initial, boundary and internal condition blocks that is compatible with both $\mu_m(\cdot,\cdot)$ and $\mu_n(\cdot,\cdot)$ originating from the same vehicle. However, this arrangement may not correspond to the true state of the system, which is unknown. Hence, $\mu_m(\cdot,\cdot)$ and $\mu_n(\cdot,\cdot)$ may or may not originate from the same vehicle. Conversely, if the optimal value of (17) is strictly positive, $\mu_m$ and $\mu_n$ are guaranteed to originate from two different vehicles. When more than one plausible option has a label difference of zero, the relevant secondary metrics of the reidentification problem are the relaxation term $h$ (which is a measure of the quality of the model fit) or the time difference between the actual location track and the expected location track of the vehicle to be reidentified, assuming that its velocity is constant. The latter parameter is the basis of the reidentification method developed in Hoh et al. (2008).
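The decision rule just described (discard candidates with a strictly positive optimal value of (17), then rank the remaining ones by a secondary metric) can be sketched as below. The function name, the numerical tolerance and the secondary scores are assumptions made for illustration; the MILP itself is assumed to have been solved elsewhere.

```python
def rank_candidates(label_gaps, secondary, tol=1e-6):
    """Return candidate indices that are compatible with the query track.

    label_gaps[i]: optimal value of problem (17) for candidate i.
    secondary[i]:  tie-break score (e.g. relaxation term h or time
                   difference to the constant-velocity prediction);
                   lower is better.
    tol:           hypothetical tolerance for treating a value as zero.
    """
    # A strictly positive optimum rules the candidate out.
    feasible = [i for i, gap in enumerate(label_gaps) if gap <= tol]
    # Among the remaining candidates, prefer the best secondary score.
    return sorted(feasible, key=lambda i: secondary[i])


# One compatible candidate and one excluded candidate
# (secondary scores are made up).
single = rank_candidates([0.0, 41.0], [0.3, 0.1])

# Two compatible candidates: the secondary metric decides the order.
tied = rank_candidates([0.0, 0.0, 14.0], [0.5, 0.2, 0.0])
```

Note that an empty result is possible: it means that, under the model and data constraints, none of the candidates can be the successor.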
4.3. Vehicle reidentification example
Our objective is now to illustrate the performance of the MILP-based framework described above on a non trivial case of vehicle
reidentification.
In all subsequent numerical experiments, we consider the Mobile Century (http://traffic.berkeley.edu/, 2019; Work et al., 2010)
dataset. The Mobile Century field test demonstrated the use of Nokia N-95 cellphones as mobile traffic sensors in February 2008, and
was a joint UC-Berkeley/Nokia project.
For all numerical applications, a spatial domain of 3.858 km is considered, located between the PeMS (http://pems.eecs.berkeley.edu) VDSs (vehicle detection stations) 400536 and 400284 on Highway I-880 N around Hayward, California. The data used in this implementation was generated on February 8th, 2008, between 18:30 and 18:55 (local time). In our scenario, we consider inflow and outflow data $q_{in}^{measured}(\cdot)$ and $q_{out}^{measured}(\cdot)$ generated by the above PeMS stations, i.e. we do not assume to know any initial density data. We also consider internal condition data (i.e. probe vehicle data) generated by the GPS units of Nokia N-95 smartphones during the experiment.
We choose the data constraints as $(1-e)\, q_{in/out}^{measured}(n) \le q_{in/out}(n) \le (1+e)\, q_{in/out}^{measured}(n)$ $\forall n \in [0, n_{max}]$, where $e = 0.01 = 1\%$. These constraints imply that the relative vehicle count error from the fixed sensors is less than 1%, which corresponds to the typical performance of loop detectors.
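As a small worked example, the interval allowed for each flow variable under these data constraints can be computed as follows. The function name and the sample flow values are hypothetical; only the relative error e = 1% comes from the text.

```python
def flow_bounds(measured_flows, rel_err=0.01):
    """Lower and upper bounds on each flow decision variable q(n),
    given the measured flow and a relative error rel_err."""
    return [((1.0 - rel_err) * q, (1.0 + rel_err) * q) for q in measured_flows]


# Hypothetical measured inflows (vehicles per second) for three time steps.
bounds = flow_bounds([0.50, 0.40, 0.45])
lower_0, upper_0 = bounds[0]   # roughly 0.495 and 0.505
```

Each pair translates into two linear inequalities on the decision variable, which is why the data constraints fit into the linear system of problem (17).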
We divided the spatial domain into six segments of equal distance X = 643 m. We also set T = 30 s as the aggregation time for the
flow data (T is determined by the granularity of PeMS data). All MILPs have been implemented using IBM Ilog Cplex working on a
Macbook operating MacOS X. The problems described in this article are tractable: they typically involve tens of variables and
hundreds of constraints, and are solved in less than a second.
In the present case, we consider 20 blocks of upstream (9) and downstream (10) boundary conditions. We also consider 3 blocks of internal conditions (11), extracted from the Mobile Century dataset. Among these 3 blocks, two originate from the same Mobile Century test vehicle, and one originates from another Mobile Century test vehicle. The layout is illustrated in Fig. 2.
Fig. 2. Vehicle reidentification problem layout. In this problem, we consider one block of internal condition (left) generated by a given probe vehicle. We also have two additional blocks of internal condition, generated after the first one. Among these two blocks, one comes from the same vehicle that generated block #1.
Vehicle reidentification problems are at the core of user privacy analysis for probe-based traffic information systems. In particular, the authors of Hoh et al. (2008) and Hoh et al. (2010) note that the average distance to confusion is an important metric to evaluate user privacy.
However typical algorithms such as the one used in Hoh et al. (2008) do not take into account the effects of the flow model. For
instance, the reidentification model used in Hoh et al. (2008) assumes that the velocity of vehicles is more or less constant, and looks
for the best candidate within a region of the space–time domain satisfying this constraint. If we apply this procedure to the problem
described in Fig. 2, it is easy to visually check that GPS track #3 is the most probable successor of GPS track #1, since it is in the
alignment of track #1. However, in this specific case GPS track #2 is actually the successor to GPS track #1, and GPS track #3 has been
generated by another probe vehicle. The model-based reidentification scheme (17) is able to capture this fact: minimizing $|L_1 - L_2|$
gives 0, while minimizing $|L_1 - L_3|$ gives 41. Thus, the nonzero optimal value of (17) rules out GPS track #3 as a possible successor to
GPS track #1, a result that does not seem obvious at all by looking at the configuration in Fig. 2. The density maps corresponding to
the computations of (17) are illustrated in Fig. 3.
This result suggests that the framework can help in the vehicle reidentification problem, which is important for privacy analysis.
Indeed, it is very likely that if an attacker gains access to some private probe vehicle data, he or she can also gain access to additional
traffic flow measurements from sensors, which are sometimes even public (for instance the PeMS system operating in California, see
(http://pems.eecs.berkeley.edu)). Hence, the example described above suggests that attacks on anonymous location tracks can be
much more damaging than initially thought.
Due to the low overall accuracy of the LWR scheme, the results may not be very robust in practice. In particular, while the LWR
model is very robust, the output of the estimation problem can be sensitive to model changes. This is illustrated in Fig. 4 below.
In the free flow regime, the LWR model predicts that all vehicles will have identical speeds, equal to v (free flow speed parameter).
Fig. 3. Example of reidentification. The corresponding scenario is described in Fig. 2. Top: Solution to the reidentification problem (17), with an objective $|L_1 - L_2|$. Bottom: Solution to the same problem with an objective $|L_1 - L_3|$. A nonzero optimum means that both tracks cannot be generated by the same vehicle, according to both the model and the available data.
Fig. 4. Model parameter uncertainty effects. In this figure, we consider the same reidentification problem, but with different model parameters.
For the upper subfigure, the free flow speed parameter vf of the fundamental diagram (2) is chosen to be 65 mph. For the lower subfigure, the same
parameter is chosen as 60 mph, all other parameters of (2) being identical. As one can see from both figures, a small change of this parameter has a
considerable impact on reidentification. In the first case (vf = 65 mph), the vehicle is properly reidentified with a zero optimal value of (17), while
the vehicle cannot be reidentified for vf = 60 mph with an optimal value of 14. Note that both parameters are realistic for the dataset considered, as
the uncertainty on vf is on the order of 10%.
In practice, vehicles are free to adopt different speeds around v, and thus our framework will underperform in comparison with non model-based reidentification methods (Hoh et al., 2008) in which vehicles are assumed to maintain their current speeds. In particular, it will sometimes incorrectly predict that two given components of the same vehicle track correspond to distinct vehicles if the speed of the vehicle differs too much from v, as illustrated in Fig. 4.
To address this problem, we consider a dual machine learning/model based approach that is detailed below. The output of the
reidentification algorithm presented earlier is fed to an artificial neural network classifier. The supervisory data is the correct vehicle
classification, which we obtain from our traffic measurement dataset.
5. Artificial neural network formulation
We now present an implementation of the reidentification framework presented earlier. The dataset involves fixed sensor data
(obtained from inductive loop detectors in the present case) and probe vehicle data.
5.1. Artificial neural network framework
Neural networks can be trained to perform complex functions in various fields, including pattern recognition, identification, classification, speech, vision, and control systems (Li, 1994). For this framework, a pattern recognition network is used with the following inputs:
• label difference $|L_m - L_n|$
• model relaxation term $h$ (used as a measure of how well the data fits the model)
• Euclidean distance between $v_m^{meas}$ and $L_n$
Fig. 5. Training process flow chart. The weights are adjusted with actual vehicle reidentification examples.
Fig. 6. Flow chart of the privacy framework. This figure illustrates the flow chart of the training process and normal working process of the framework.
The supervisory training data is the correct vehicle matching data. The neuron weights are adjusted during a training process (see chart in Fig. 5), in which the chosen data set is generated from congested scenarios covering the complete Mobile Century dataset.
Training was run for 2500 iterations. We used batch training, in which the weights and biases are only updated after all the inputs have been presented.
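To make the notion of batch training concrete, the sketch below accumulates the gradient over the whole training set and applies a single update per pass. It is deliberately simplified: it trains one logistic unit with plain gradient descent rather than the network and optimizer used in this work, and the data, learning rate and function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def batch_train(samples, labels, epochs=2500, lr=0.5):
    """Batch gradient descent for a single logistic unit (illustration only)."""
    n_features = len(samples[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        # Present every input first, accumulating the gradient...
        for x, y in zip(samples, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            err = sigmoid(z) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        # ...and only then update the weights and the bias once.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical one-dimensional toy data: small values are class 0.
w, b = batch_train([[0.0], [1.0], [2.0], [3.0]], [0, 0, 1, 1])
p_low = sigmoid(w[0] * 0.0 + b)
p_high = sigmoid(w[0] * 3.0 + b)
```

In online (incremental) training, by contrast, the update inside the outer loop would be applied after each individual sample.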
The reidentification framework is available once the weights of the neurons have been adjusted through training. The full flow chart of the framework is shown in Fig. 6.
The performance of this vehicle reidentification framework on experimental traffic data is illustrated in Section 5.3.
5.2. Artificial neural network implementation
The neural network is defined to solve the previously described privacy analysis problem. It consists of 9 inputs, a hidden layer of 20 neurons and 2 outputs, as shown in Fig. 7. For this implementation, the Matlab® neural network toolbox was used (see Fig. 8).
Fig. 7. Neural network layout. The number of inputs and outputs in the network are directly defined by the number of possible tracks that the car could take (in our Example 2).
Fig. 8. Training performance. The best MSE during training was 0.0406.
The number of inputs in the network is defined as follows: the measured speed of the condition to be identified, together with the label difference, relaxation term, distance to predicted position and measured speed of each possible option. For the example defined before, with two candidate options, nine inputs are needed ($1 + 2 \times 4 = 9$). The output array size is equal to the number of options for identification, with a one for the selected track and zeros for the rest.
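The input and output layout described above can be sketched as follows. The dictionary keys, function names and numerical values are hypothetical and only serve to show how one query speed plus four features per candidate yields nine inputs for two candidates.

```python
def build_inputs(v_query, candidates):
    """Assemble the network input vector.

    v_query:    measured speed of the condition to be identified.
    candidates: one dict per possible option, with the (hypothetical) keys
                'label_gap', 'relaxation', 'distance' and 'v_meas'.
    """
    features = [v_query]
    for c in candidates:
        features += [c["label_gap"], c["relaxation"], c["distance"], c["v_meas"]]
    return features


def one_hot(index, n_options):
    """Target vector: one for the selected track, zero for the rest."""
    return [1 if i == index else 0 for i in range(n_options)]


# Made-up feature values for two candidate tracks.
options = [
    {"label_gap": 0.0, "relaxation": 0.2, "distance": 310.0, "v_meas": 12.0},
    {"label_gap": 41.0, "relaxation": 0.1, "distance": 25.0, "v_meas": 27.0},
]
x = build_inputs(26.0, options)   # 1 + 2 * 4 = 9 values
y = one_hot(0, len(options))      # [1, 0]
```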
The function used during the training process was scaled conjugate gradient backpropagation, which is widely used in pattern recognition neural networks (Møller, 1993). For the training process, eight thousand random scenarios with real traffic data were generated. 70% of the scenarios were used for training and 30% of them were used for validation. The computational time required to train the network on an iMac with an Intel i5 at 2.5 GHz was 3 min 15 s.
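A minimal sketch of a 70%/30% random split is given below. How the scenarios were actually partitioned is not specified here, so the shuffling, the seed and the function name are assumptions; the scenarios are stood in for by integer identifiers.

```python
import random


def split_scenarios(scenarios, train_fraction=0.7, seed=0):
    """Shuffle the scenarios and split them into training and validation sets."""
    shuffled = list(scenarios)
    random.Random(seed).shuffle(shuffled)   # reproducible shuffle
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Eight thousand scenario identifiers, as in the training set size above.
train_set, validation_set = split_scenarios(range(8000))
sizes = (len(train_set), len(validation_set))   # (5600, 2400)
```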
The percentage of correct answers during training was 95.4%. Once the neural network weights have been defined, the full framework is ready to be tested on a privacy test.
5.3. Privacy Framework analysis
The reidentification scheme described in Hoh et al. (2008) attempts to match vehicles from both subsets by assuming a constant velocity of the transmitting vehicle, as illustrated in Fig. 9. This scheme could be improved by checking model feasibility constraints beforehand, which we illustrate in the same figure.
Fig. 9. Dual model/neural network reidentification principle. In these figures, our objective is to determine whether IC 2 or IC 3 originates from the same vehicle as IC 1. The dotted line (bottom subfigure) represents the actual trajectory of the vehicle that generated IC 1, which ends in IC 3. The upper subfigure shows the inputs that will be used in the dual model/artificial neural network framework for the reidentification. The lower subfigure represents the option chosen by the framework.
Fig. 10. Model based reidentification framework. The diagonal cells show for how many (and what percentage) of the examples the trained network correctly estimates the classes of observations. That is, it shows what percentage of the true and predicted vehicles match. The off diagonal cells show where the classifier has made mistakes. The column on the far right of the plot shows the accuracy for each predicted class, while the row at the bottom of the plot shows the accuracy for each true class. The rate of correct matching of 92.2% is reported on the lower right cell of this figure.

Testing with classical traffic flow parameters (obtained from the Highway Capacity Manual (Highway Capacity Manual, 1985)) on 8800 randomly drawn reidentification problems (in each problem, the actual internal condition is to be chosen among 2 candidates) leads to the results shown in Fig. 10.
As one can see from these figures, the proposed algorithm outperforms the naive reidentification scheme on real data by 6.2%.
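The quantities reported in a confusion matrix such as the one of Fig. 10 can be computed as sketched below. The layout (rows are predicted classes, columns are true classes) follows the description in the caption of Fig. 10; the counts are hypothetical and are not the values of the experiment.

```python
def confusion_summary(matrix):
    """matrix[i][j]: number of examples predicted as class i whose true class is j.

    Returns the overall rate of correct matching, the accuracy for each
    predicted class (far right column of the plot) and the accuracy for
    each true class (bottom row of the plot).
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    overall = sum(matrix[i][i] for i in range(n)) / total
    per_predicted = [matrix[i][i] / sum(matrix[i]) for i in range(n)]
    per_true = [matrix[j][j] / sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return overall, per_predicted, per_true


# Hypothetical counts for a two-candidate problem (100 examples in total).
overall, per_predicted, per_true = confusion_summary([[45, 5], [3, 47]])
```

With these made-up counts the overall rate is 92 correct matches out of 100; the lower right cell of the plot reports this overall figure.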
The 8800 randomly chosen scenarios used for this experiment were taken under different traffic situations covering the complete
Mobile Century dataset, including free flow, recurring congestion and a road accident captured during this experiment (Amin et al.,
2008). The relaxation term h was added to the decision variable in order to avoid model infeasibility and to be able to use a large dataset for the analysis. Useful information for the neural network comes from the measured speeds of the internal conditions ($v^{meas}$) in each scenario, since they give an insight into the traffic conditions and improve the matching. Under free flow conditions, the performance of the naive approach improves and is just 2% worse than the toolbox; in congested conditions, however, the difference can be up to 10 or 12%.
Since the Lighthill–Whitham–Richards traffic flow model is not a perfect description of traffic flow propagation, the benefits of the method depend on its relative accuracy on the considered dataset. Handling model uncertainty (which is not done in the current framework) is thus important to consider, since it will probably lead to better matching performance. As a final comparison, the neural network was evaluated in two different modes:
• Velocity Data
• Model and Velocity Data
As can be seen in Fig. 11, velocity and position data reduced the error by 4% and the model-derived features reduced the error by an additional 4%. The framework was tested with different model parameters, maintaining the improvement reported over small changes (less than 10%) around the optimal point in all three model parameters. The maximum value of correct reidentifications during the parameter sweep was 96% with the following configuration:
• Free flow velocity $v_f = 65$ mph
• Critical density $k_c = 140$ vehicles per mile
• Congested velocity $w = 15$ mph
Fig. 11. Percentage of wrong reidentification match for different approaches. Each of these error metrics is determined by solving 8000 independent reidentification problems involving experimental measurement data.
6. Conclusion
In this article, we introduce a new framework for solving privacy problems on systems modeled by Hamilton–Jacobi equations,
such as the highway transportation network. Using a semi-analytical expression of the solutions to the Hamilton–Jacobi equation, we
formulate the problem of checking the consistency of the data with respect to the model as a Mixed Integer Linear Program (MILP)
and then use this information as an input to a neural network. The method does not require any approximation or Monte-Carlo
simulations to operate, and is tractable. We illustrate the performance of the method on an experimental dataset containing fixed
sensor as well as probe data. We show that the method is able to correctly match the trace of a car more than 90% of the time, which is considerably better than the commonly used naive approach.
Future work will be dedicated to the generalization of the method to allow model uncertainty with the purpose of improving the
accuracy of the matching. Another direction is the detection of spoofing cyber-attacks, taking into account the relaxation term as a
parameter to indicate the possible attack or sensor failure.
Appendix A. Explicit solutions to affine initial, boundary and internal conditions
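$$
M_{M_k}(t,x) =
\begin{cases}
+\infty & \text{if } x \le kX + wt \text{ or } x \ge (k+1)X + vt \\[4pt]
-\sum_{i=0}^{k-1} \rho(i) X + \rho(k)(tv + kX - x) & \text{if } kX + tv \le x \text{ and } (k+1)X + tv \ge x \text{ and } \rho(k) \le \rho_c \\[4pt]
-\sum_{i=0}^{k-1} \rho(i) X + \rho_c (tv + kX - x) & \text{if } kX + tv \ge x \text{ and } kX + tw \le x \text{ and } \rho(k) \le \rho_c \\[4pt]
-\sum_{i=0}^{k-1} \rho(i) X + \rho(k)(tw + kX - x) - \rho_m t w & \text{if } kX + tw \le x \text{ and } (k+1)X + tw \ge x \text{ and } \rho(k) \ge \rho_c \\[4pt]
-\sum_{i=0}^{k} \rho(i) X + \rho_c (tw + (k+1)X - x) - \rho_m t w & \text{if } (k+1)X + tv \ge x \text{ and } (k+1)X + tw \le x \text{ and } \rho(k) \ge \rho_c
\end{cases}
\tag{A.1}
$$

$$
M_{\gamma_n}(t,x) =
\begin{cases}
+\infty & \text{if } t \le nT + \frac{x - \xi}{v} \\[4pt]
\sum_{i=0}^{n-1} q_{in}(i) T + q_{in}(n)\left(t - \frac{x - \xi}{v} - nT\right) & \text{if } nT + \frac{x - \xi}{v} \le t \text{ and } t \le (n+1)T + \frac{x - \xi}{v} \\[4pt]
\sum_{i=0}^{n} q_{in}(i) T + \rho_c v \left(t - (n+1)T - \frac{x - \xi}{v}\right) & \text{otherwise}
\end{cases}
\tag{A.2}
$$

$$
M_{\beta_n}(t,x) =
\begin{cases}
+\infty & \text{if } t \le nT + \frac{x - \chi}{w} \\[4pt]
-\sum_{k=0}^{k_{max}} \rho(k) X + \sum_{i=0}^{n-1} q_{out}(i) T + q_{out}(n)\left(t - \frac{x - \chi}{w} - nT\right) - \rho_m (x - \chi) & \text{if } nT + \frac{x - \chi}{w} \le t \text{ and } t \le (n+1)T + \frac{x - \chi}{w} \\[4pt]
-\sum_{k=0}^{k_{max}} \rho(k) X + \sum_{i=0}^{n} q_{out}(i) T + \rho_c v \left(t - (n+1)T - \frac{x - \chi}{v}\right) & \text{otherwise}
\end{cases}
\tag{A.3}
$$

$$
M_{\mu_m}(t,x) =
\begin{cases}
L_m + r_m\left(t - \dfrac{x - x_{\min}(m) - v^{meas}(m)(t - t_{\min}(m))}{v - v^{meas}(m)} - t_{\min}(m)\right)
& \text{if } x \ge x_{\min}(m) + v^{meas}(m)(t - t_{\min}(m)) \\
& \text{and } x \ge x_{\max}(m) + v(t - t_{\max}(m)) \\
& \text{and } x \le x_{\min}(m) + v(t - t_{\min}(m)) \\[6pt]
L_m + r_m\left(t - \dfrac{x - x_{\min}(m) - v^{meas}(m)(t - t_{\min}(m))}{w - v^{meas}(m)} - t_{\min}(m)\right)
+ k_c (v - w) \dfrac{x - x_{\min}(m) - v^{meas}(m)(t - t_{\min}(m))}{w - v^{meas}(m)}
& \text{if } x \le x_{\min}(m) + v^{meas}(m)(t - t_{\min}(m)) \\
& \text{and } x \le x_{\max}(m) + w(t - t_{\max}(m)) \\
& \text{and } x \ge x_{\min}(m) + w(t - t_{\min}(m)) \\[6pt]
L_m + r_m (t_{\max}(m) - t_{\min}(m)) + (t - t_{\max}(m))\, k_c \left(v - \dfrac{x - x_{\max}(m)}{t - t_{\max}(m)}\right)
& \text{if } x \le x_{\max}(m) + v(t - t_{\max}(m)) \\
& \text{and } x \ge x_{\max}(m) + w(t - t_{\max}(m)) \\[6pt]
+\infty & \text{otherwise}
\end{cases}
\tag{A.4}
$$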
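As an illustration of how these piecewise affine solutions can be evaluated numerically, the sketch below implements the solution component associated with an upstream boundary block, as we read (A.2). Several points are assumptions: the upstream boundary position is written `xi` and defaults to 0, the value is taken to be finite on the boundary of the domain of influence (strict inequality in the first case), and the parameter values in the example are made up.

```python
import math


def upstream_component(t, x, n, q_in, T, v, rho_c, xi=0.0):
    """Value at (t, x) of the solution component generated by the n-th
    upstream boundary block (flows q_in, time step T, free flow speed v,
    critical density rho_c). Returns +inf outside the domain of influence."""
    # Time at which the characteristic of speed v through (t, x)
    # left the upstream boundary.
    tau = t - (x - xi) / v
    if tau < n * T:
        return math.inf
    if tau <= (n + 1) * T:
        # Vehicles counted at the boundary up to time tau.
        return sum(q_in[:n]) * T + q_in[n] * (tau - n * T)
    # After the block ends, extend at capacity flow rho_c * v.
    return sum(q_in[:n + 1]) * T + rho_c * v * (tau - (n + 1) * T)


# Hypothetical parameters: 30 s blocks, v = 30 m/s, capacity 0.9 veh/s.
q = [0.5, 0.4]
inside = upstream_component(25.0, 300.0, 0, q, T=30.0, v=30.0, rho_c=0.03)
outside = upstream_component(5.0, 300.0, 0, q, T=30.0, v=30.0, rho_c=0.03)
later = upstream_component(50.0, 300.0, 0, q, T=30.0, v=30.0, rho_c=0.03)
```

Here `inside` counts the 7.5 vehicles that entered during the first 15 s of the block, `outside` is infinite because the point cannot be reached from the block, and `later` adds capacity flow after the end of the block.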
Appendix B. Explicit model constraints
The model constraints (7) can be expressed as the following finite set of convex inequality constraints:
$$
\begin{cases}
M_{M_k}(0, x_p) \ge M_p(0, x_p) & \forall (k,p) \\
M_{M_k}(pT, \chi) \ge \beta_p(pT, \chi) & \forall k, p \\
M_{M_k}\left(\frac{\chi - x_{k+1}}{v}, \chi\right) \ge \beta_p\left(\frac{\chi - x_{k+1}}{v}, \chi\right) & \forall k, p \text{ s.t. } \frac{\chi - x_{k+1}}{v} \in [pT, (p+1)T] \\
M_{M_k}(pT, \xi) \ge \gamma_p(pT, \xi) & \forall k, p \\
M_{M_k}\left(\frac{\xi - x_k}{w}, \xi\right) \ge \gamma_p\left(\frac{\xi - x_k}{w}, \xi\right) & \forall k, p \text{ s.t. } \frac{\xi - x_k}{w} \in [pT, (p+1)T]
\end{cases}
\tag{B.1}
$$

$$
\begin{cases}
M_{M_k}(t_{\min}(m), x_{\min}(m)) \ge \mu_m(t_{\min}(m), x_{\min}(m)) & \forall k, m \\
M_{M_k}(t_{\max}(m), x_{\max}(m)) \ge \mu_m(t_{\max}(m), x_{\max}(m)) & \forall k, m \\
M_{M_k}(t_j(m,k), x_j(m,k)) \ge \mu_m(t_j(m,k), x_j(m,k)) & \forall k, m \text{ s.t. } t_j(m,k) \in [t_{\min}(m); t_{\max}(m)], \; j = 1, \dots, 4
\end{cases}
\tag{B.2}
$$

$$
\begin{cases}
M_{\gamma_n}(pT, \xi) \ge \gamma_p(pT, \xi) & \forall (n,p) \\
M_{\gamma_n}(pT, \chi) \ge \beta_p(pT, \chi) & \forall (n,p) \\
M_{\gamma_n}\left(nT + \frac{\chi - \xi}{v}, \chi\right) \ge \beta_p\left(nT + \frac{\chi - \xi}{v}, \chi\right) & \forall (n,p) \text{ s.t. } nT + \frac{\chi - \xi}{v} \in [pT, (p+1)T]
\end{cases}
\tag{B.3}
$$

$$
\begin{cases}
M_{\gamma_n}(t_{\min}(m), x_{\min}(m)) \ge \mu_m(t_{\min}(m), x_{\min}(m)) & \forall n, m \\
M_{\gamma_n}(t_{\max}(m), x_{\max}(m)) \ge \mu_m(t_{\max}(m), x_{\max}(m)) & \forall n, m \\
M_{\gamma_n}(t_5(m,n), x_5(m,n)) \ge \mu_m(t_5(m,n), x_5(m,n)) & \forall n, m \text{ s.t. } t_5(m,n) \in [t_{\min}(m); t_{\max}(m)]
\end{cases}
\tag{B.4}
$$

$$
\begin{cases}
M_{\beta_n}(pT, \xi) \ge \gamma_p(pT, \xi) & \forall (n,p) \\
M_{\beta_n}\left(nT + \frac{\xi - \chi}{w}, \xi\right) \ge \gamma_p\left(nT + \frac{\xi - \chi}{w}, \xi\right) & \forall (n,p) \text{ s.t. } nT + \frac{\xi - \chi}{w} \in [pT, (p+1)T] \\
M_{\beta_n}(pT, \chi) \ge \beta_p(pT, \chi) & \forall (n,p)
\end{cases}
\tag{B.5}
$$

$$
\begin{cases}
M_{\beta_n}(t_{\min}(m), x_{\min}(m)) \ge \mu_m(t_{\min}(m), x_{\min}(m)) & \forall n, m \\
M_{\beta_n}(t_{\max}(m), x_{\max}(m)) \ge \mu_m(t_{\max}(m), x_{\max}(m)) & \forall n, m \\
M_{\beta_n}(t_6(m,n), x_6(m,n)) \ge \mu_m(t_6(m,n), x_6(m,n)) & \forall n, m \text{ s.t. } t_6(m,n) \in [t_{\min}(m); t_{\max}(m)]
\end{cases}
\tag{B.6}
$$

$$
\begin{cases}
M_{\mu_m}(pT, \xi) \ge \gamma_p(pT, \xi) & \forall (m,p) & \text{(vii)(a)} \\
M_{\mu_m}(t_7(m), \xi) \ge \gamma_p(t_7(m), \xi) & \forall (m,p) \text{ s.t. } t_7(m) \in [pT, (p+1)T] & \text{(vii)(b)} \\
M_{\mu_m}(t_8(m), \xi) \ge \gamma_p(t_8(m), \xi) & \forall (m,p) \text{ s.t. } t_8(m) \in [pT, (p+1)T] & \text{(vii)(c)}
\end{cases}
\tag{B.7}
$$

$$
\begin{cases}
M_{\mu_m}(pT, \chi) \ge \beta_p(pT, \chi) & \forall (m,p) & \text{(viii)(a)} \\
M_{\mu_m}(t_9(m), \chi) \ge \beta_p(t_9(m), \chi) & \forall (m,p) \text{ s.t. } t_9(m) \in [pT, (p+1)T] & \text{(viii)(b)} \\
M_{\mu_m}(t_{10}(m), \chi) \ge \beta_p(t_{10}(m), \chi) & \forall (m,p) \text{ s.t. } t_{10}(m) \in [pT, (p+1)T] & \text{(viii)(c)}
\end{cases}
\tag{B.8}
$$

$$
\begin{cases}
M_{\mu_m}(t_{\min}(p), x_{\min}(p)) \ge \mu_p(t_{\min}(p), x_{\min}(p)) & \forall (m,p) & \text{(ix)(a)} \\
M_{\mu_m}(t_{\max}(p), x_{\max}(p)) \ge \mu_p(t_{\max}(p), x_{\max}(p)) & \forall (m,p) & \text{(ix)(b)} \\
M_{\mu_m}(t_j(m,p), x_j(m,p)) \ge \mu_p(t_j(m,p), x_j(m,p)) & \forall (m,p) \text{ s.t. } t_j(m,p) \in [t_{\min}(p), t_{\max}(p)], \; j = 11, \dots, 15 & \text{(ix)(c)–(ix)(g)}
\end{cases}
\tag{B.9}
$$
where the coefficients $t_j(m,k)$ and $x_j(m,k)$ ($j = 1, \dots, 4$), $t_5(m,n)$, $x_5(m,n)$, $t_6(m,n)$, $x_6(m,n)$, $t_7(m)$, $t_8(m)$, $t_9(m)$, $t_{10}(m)$, and $t_j(m,p)$ and $x_j(m,p)$ ($j = 11, \dots, 15$) are given by Eqs. (B.10), (B.11) and (B.12) below:
$$
\begin{aligned}
t_1(m,k) &= \frac{x_{\min}(m) - (k+1)X - v^{meas}(m)\, t_{\min}(m)}{v - v^{meas}(m)} \\
t_2(m,k) &= \frac{x_{\min}(m) - kX - v^{meas}(m)\, t_{\min}(m)}{w - v^{meas}(m)} \\
t_3(m,k) &= \frac{x_{\min}(m) - kX - v^{meas}(m)\, t_{\min}(m)}{v - v^{meas}(m)} \\
t_4(m,k) &= \frac{x_{\min}(m) - (k+1)X - v^{meas}(m)\, t_{\min}(m)}{w - v^{meas}(m)} \\
x_j(m,k) &= x_{\min}(m) + v^{meas}(m)\left(t_j(m,k) - t_{\min}(m)\right), \quad j = 1, \dots, 4
\end{aligned}
\tag{B.10}
$$

$$
\begin{aligned}
t_5(m,n) &= \frac{nTv - v^{meas}(m)\, t_{\min}(m) + x_{\min}(m) - \xi}{v - v^{meas}(m)} \\
x_5(m,n) &= x_{\min}(m) + v^{meas}(m)\left(t_5(m,n) - t_{\min}(m)\right) \\
t_6(m,n) &= \frac{nTw - v^{meas}(m)\, t_{\min}(m) + x_{\min}(m) - \chi}{w - v^{meas}(m)} \\
x_6(m,n) &= x_{\min}(m) + v^{meas}(m)\left(t_6(m,n) - t_{\min}(m)\right) \\
t_7(m) &= \frac{\xi - x_{\min}(m) + w\, t_{\min}(m)}{w} \\
t_8(m) &= \frac{\xi - x_{\max}(m) + w\, t_{\max}(m)}{w} \\
t_9(m) &= \frac{\chi - x_{\min}(m) + v\, t_{\min}(m)}{v} \\
t_{10}(m) &= \frac{\chi - x_{\max}(m) + v\, t_{\max}(m)}{v}
\end{aligned}
\tag{B.11}
$$

and

$$
\begin{aligned}
t_{11}(m,p) &= \frac{x_{\min}(m) - x_{\min}(p) + v^{meas}(p)\, t_{\min}(p) - v^{meas}(m)\, t_{\min}(m)}{v^{meas}(p) - v^{meas}(m)} \\
t_{12}(m,p) &= \frac{x_{\max}(m) - x_{\min}(p) + v^{meas}(p)\, t_{\min}(p) - v\, t_{\max}(m)}{v^{meas}(p) - v} \\
t_{13}(m,p) &= \frac{x_{\min}(m) - x_{\min}(p) + v^{meas}(p)\, t_{\min}(p) - v\, t_{\min}(m)}{v^{meas}(p) - v} \\
t_{14}(m,p) &= \frac{x_{\max}(m) - x_{\min}(p) + v^{meas}(p)\, t_{\min}(p) - w\, t_{\max}(m)}{v^{meas}(p) - w} \\
t_{15}(m,p) &= \frac{x_{\min}(m) - x_{\min}(p) + v^{meas}(p)\, t_{\min}(p) - w\, t_{\min}(m)}{v^{meas}(p) - w} \\
x_j(m,p) &= x_{\min}(p) + v^{meas}(p)\left(t_j(m,p) - t_{\min}(p)\right), \quad j = 11, \dots, 15
\end{aligned}
\tag{B.12}
$$
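The coefficients above are intersection times between straight lines in the space–time plane. As an example, the sketch below computes the time and position at which the lines supporting two internal conditions cross, following the expression of $t_{11}(m,p)$ and $x_{11}(m,p)$ in (B.12). The function and argument names are hypothetical, and the check that the crossing time lies in $[t_{\min}(p), t_{\max}(p)]$ is left to the caller.

```python
def trajectory_crossing(xmin_m, tmin_m, v_m, xmin_p, tmin_p, v_p):
    """Crossing point of the lines supporting internal conditions m and p.

    Each line starts at (tmin, xmin) and has slope v (measured speed).
    Returns (t11, x11), or None if the two lines are parallel.
    """
    if v_p == v_m:
        return None  # parallel trajectories never cross
    t11 = (xmin_m - xmin_p + v_p * tmin_p - v_m * tmin_m) / (v_p - v_m)
    x11 = xmin_p + v_p * (t11 - tmin_p)
    return t11, x11


# Made-up example: m starts at x = 0 m with 10 m/s, p starts 100 m ahead
# with 5 m/s, both at t = 0 s; m catches up after 20 s at x = 200 m.
crossing = trajectory_crossing(0.0, 0.0, 10.0, 100.0, 0.0, 5.0)
```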
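Proof. Note that $\forall (k,n) \in [0, k_{max}] \times [0, n_{max}]$, $\mathrm{Dom}(M_k) \cap \mathrm{Dom}(\gamma_n) = \emptyset$ and that $\forall (k,n) \in [0, k_{max}] \times [0, n_{max}]$, $\mathrm{Dom}(M_k) \cap \mathrm{Dom}(\beta_n) = \emptyset$. Thus, the set of inequality constraints (7) can be written in the case of initial, upstream, downstream and internal conditions as:

$$
\begin{cases}
M_{M_k}(0, x) \ge M_p(0, x) & \forall x \in [pX, (p+1)X], \; \forall (k,p) \\
M_{M_k}(t, \xi) \ge \gamma_p(t, \xi) & \forall t \in [pT, (p+1)T], \; \forall (k,p) \\
M_{M_k}(t, \chi) \ge \beta_p(t, \chi) & \forall t \in [pT, (p+1)T], \; \forall (k,p) \\
M_{M_k}(t, x) \ge \mu_m(t, x) & \forall t \in [t_{\min}(m), t_{\max}(m)], \; x = x_{\min}(m) + v^{meas}(m)(t - t_{\min}(m)), \; \forall (k,m) \\
M_{\gamma_n}(t, \xi) \ge \gamma_p(t, \xi) & \forall t \in [pT, (p+1)T], \; \forall (n,p) \\
M_{\gamma_n}(t, \chi) \ge \beta_p(t, \chi) & \forall t \in [pT, (p+1)T], \; \forall (n,p) \\
M_{\gamma_n}(t, x) \ge \mu_m(t, x) & \forall t \in [t_{\min}(m), t_{\max}(m)], \; x = x_{\min}(m) + v^{meas}(m)(t - t_{\min}(m)), \; \forall (n,m) \\
M_{\beta_n}(t, \xi) \ge \gamma_p(t, \xi) & \forall t \in [pT, (p+1)T], \; \forall (n,p) \\
M_{\beta_n}(t, \chi) \ge \beta_p(t, \chi) & \forall t \in [pT, (p+1)T], \; \forall (n,p) \\
M_{\beta_n}(t, x) \ge \mu_m(t, x) & \forall t \in [t_{\min}(m), t_{\max}(m)], \; x = x_{\min}(m) + v^{meas}(m)(t - t_{\min}(m)), \; \forall (n,m) \\
M_{\mu_k}(t, x) \ge \mu_m(t, x) & \forall t \in [t_{\min}(m), t_{\max}(m)], \; x = x_{\min}(m) + v^{meas}(m)(t - t_{\min}(m)), \; \forall (k,m)
\end{cases}
\tag{B.13}
$$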
The inequalities outlined in Eqs. (B.1)–(B.9) are written as a finite set of inequalities owing to the piecewise affine structure of the solutions (A.1), (A.2), (A.3) and (A.4).
References
Amin, S., Andrews, S., Apte, S., Arnold, J., Ban, J., Benko, M., Bayen, A.M., Chiou, B., Claudel, C., Claudel, C., Dodson, T., Elhamshary, O., Flens-batina, C., Gruteser,
M., carlos Herrera, J., Herring, R., Hoh, B., Jacobson, Q., Iwuchukwu, T., Lew, J., Litrico, X., Luddington, L., Margulici, J., Mortazavi, A., Pan, X., Rabbani, T.,
Racine, T., Sherlock-thomas, E., Sutter, D., Tinka, A., 2008. Mobile century using gps mobile phones as traffic sensors: A field experiment.
Aubin, J.-P., 1991. Viability Theory. Systems and Control: Foundations and Applications. Birkhäuser, Boston, MA.
Aubin, J.-P., Bayen, A.M., Saint-Pierre, P., 2008. Dirichlet problems for some Hamilton-Jacobi equations with inequality constraints. SIAM J. Control Optimizat. 47 (5),
2348–2380.
Bardi, M., Capuzzo-Dolcetta, I., 1997. Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman equations. Birkhäuser, Boston, MA.
Barron, E.N., Jensen, R., 1990. Semicontinuous viscosity solutions for Hamilton-Jacobi equations with convex Hamiltonians. Commun. Partial Different. Eq. 15,
1713–1742.
Canepa, E.S., Claudel, C.G., September 2012. Exact solutions to traffic density estimation problems involving the Lighthill-Whitham-Richards traffic flow model using
Mixed Integer Linear Programing. In: Proceedings of the 15th International IEEE Conference on Intelligent Transportation Systems, Anchorage, AK.
Canepa, E.S., Claudel, C.G., 2013a. A framework for privacy and security analysis of probe-based traffic information systems. In: Proceedings of the 2nd ACM International Conference on High Confidence Networked Systems. ACM, pp. 25–32.
Canepa, E.S., Claudel, C.G., January 2013. Spoofing cyber attack detection in probe-based traffic monitoring systems using mixed integer linear programming. In:
Proceedings of the IEEE International Conference on Computing, Networking and Communications, San Diego, CA.
Claudel, C.G., Bayen, A.M., 2010a. Lax-Hopf based incorporation of internal boundary conditions into Hamilton-Jacobi equation. Part I: theory. IEEE Trans. Autom.
Control 55 (5), 1142–1157. https://doi.org/10.1109/TAC.2010.2041976.
Claudel, C.G., Bayen, A.M., 2010b. Lax-Hopf based incorporation of internal boundary conditions into Hamilton-Jacobi equation. Part II: Computational methods. IEEE
Trans. Autom. Control 55 (5), 1158–1174. https://doi.org/10.1109/TAC.2010.2045439.
Claudel, C.G., Bayen, A.M., 2011. Convex formulations of data assimilation problems for a class of Hamilton-Jacobi equations. SIAM J. Control Optimizat. 49,
383–402.
Crandall, M.G., Lions, P.-L., 1983. Viscosity solutions of Hamilton-Jacobi equations. Trans. Am. Mathe. Soc. 277 (1), 1–42.
Daganzo, C., 1994. The cell transmission model: a dynamic representation of highway traffic consistent with the hydrodynamic theory. Transp. Res. 28B (4), 269–287.
Daganzo, C.F., 2006. On the variational theory of traffic flow: well-posedness, duality and applications. Networks Heterog. Media 1, 601–619.
Dwork, C., Pottenger, R., 2013. Toward practicing privacy. JAMIA 20 (1), 102–108.
Eichler, S., 2006. Anonymous and authenticated data provisioning for floating car data systems. In: 10th IEEE Singapore International Conference on Communication
systems, 2006. ICCS 2006. IEEE, pp. 1–5.
Frankowska, H., 1993. Lower semicontinuous solutions of Hamilton-Jacobi-Bellman equations. SIAM J. Control Optimizat. 31 (1), 257–272.
Herrera, J.C., Work, D.B., Herring, R., Ban, X.J., Jacobson, Q., Bayen, A.M., 2009. Evaluation of traffic data obtained via GPS-enabled mobile phones: The Mobile
Century field experiment. Transport. Res. Part C: Emerg. Technol.
Highway Capacity Manual, 1985. Transportation Research Board, Washington, DC.
Hoh, B., Gruteser, M., Xiong, H., Alrabady, A., 2006a. Enhancing security and privacy in traffic-monitoring systems. Pervasive Comput., IEEE 5 (4), 38–46.
Hoh, B., Gruteser, M., Xiong, H., Alrabady, A., 2006b. Enhancing security and privacy in traffic-monitoring systems. Pervasive Comput., IEEE 5 (4), 38–46.
Hoh, B., Gruteser, M., Herring, R., Ban, J., Work, D., Herrera, J., Bayen, A.M., Annavaram, M., Jacobson, Q., 2008. Virtual trip lines for distributed privacy-preserving
traffic monitoring. In: Proceedings of the 6th International Conference on Mobile Systems, Applications, and Services. ACM, pp. 15–28.
Hoh, B., Gruteser, M., Xiong, H., Alrabady, A., 2010. Achieving guaranteed anonymity in gps traces via uncertainty-aware path cloaking. IEEE Trans. Mobile Comput. 9
(8), 1089–1107.
http://pems.eecs.berkeley.edu.
http://traffic.berkeley.edu/.
Krumm, J., 2007. Inference attacks on location tracks. Pervasive Comput. 127–143.
Le Ny, J., Dec 2013. On differentially private filtering for event streams. In: 2013 IEEE 52nd Annual Conference on Decision and Control (CDC), pp. 3481–3486.
Le Ny, J., Pappas, G.J., 2013. Privacy-preserving release of aggregate dynamic models. In: Proceedings of the 2Nd ACM International Conference on High Confidence
Networked Systems, HiCoNS ’13. ACM, New York, NY, USA, pp. 49–56.
Li, E.Y., 1994. Artificial neural networks and their business applications. Inf. Manage. 27 (5), 303–313.
Lighthill, M.J., Whitham, G.B., 1956. On kinematic waves. II. A theory of traffic flow on long crowded roads. Proc. Roy. Soc. London 229 (1178), 317–345.
Mazare, P.E., Dehwah, A., Claudel, C.G., Bayen, A.M., 2011. Analytical and grid-free solutions to the lighthill-whitham-richards traffic flow model. Transport. Res. Part
B: Methodol. 45 (10), 1727–1748.
Møller, M.F., 1993. A scaled conjugate gradient algorithm for fast supervised learning. Neural Networks 6 (4), 525–533.
Moskowitz, K., 1965. Discussion of ‘freeway level of service as influenced by volume and capacity characteristics’ by D.R. Drew and C. J. Keese. Highway Res. Rec. 99,
43–44.
Newell, G.F., 1993a. A simplified theory of kinematic waves in highway traffic, part I: general theory. Transport. Res. B 27B (4), 281–287.
Newell, G.F., 1993b. A simplified theory of kinematic waves in highway traffic, Part (I), (II) and (III). Transport. Res. B 27B (4), 281–313.
Newell, G.F., November 1998. A moving bottleneck. Transport. Res. Part B: Methodol. 32(7), 531–537.
Rass, S., Fuchs, S., Schaffer, M., Kyamakya, K., 2008. How to protect privacy in floating car data systems. In: Proceedings of the fifth ACM International Workshop on
VehiculAr Inter-NETworking. ACM, pp. 17–22.
Richards, P.I., 1956. Shock waves on the highway. Operat. Res. 4 (1), 42–51.
Rockafellar, R., 1970. Convex Analysis. Princeton University Press, Princeton, NJ.
Strub, I.S., Bayen, A.M., 2006. Weak formulation of boundary conditions for scalar conservation laws. Int. J. Robust Nonlinear Control 16 (16), 733–748.
Work, D., Blandin, S., Tossavainen, O., Piccoli, B., Bayen, A., 2010. A distributed highway velocity model for traffic state reconstruction. Appl. Res. Math. eXpress
(ARMX) 1, 1–35.