x_{k+1} = A x_k + B u_k + w_k, ∀k ∈ [0, N − 1], (1)

where x_k ∈ R^n is the system state, w_k ∈ R^n is the process noise and N is the time horizon. The initial state, x_0, is assumed to be a Gaussian random variable with mean x̄_0 and covariance Σ_0, i.e., x_0 ∼ N(x̄_0, Σ_0). At each time step, a noisy measurement of the state is taken, defined by

y_k = C x_k + v_k, ∀k ∈ [0, N − 1], (2)

where y_k ∈ R^p and v_k ∈ R^p are the measurement output and noise of the sensor at time k, respectively. The process and measurement noise have zero mean Gaussian distributions, w_k ∼ N(0, Σ_w) and v_k ∼ N(0, Σ_v). The process noise, measurement noise and initial state are assumed to be mutually independent. For notational convenience, the state, control inputs, measurements, and noise parameters for all time steps are concatenated to form

X = [x_0^T ... x_N^T]^T, U = [u_0^T ... u_{N−1}^T]^T,
Y = [y_0^T ... y_{N−1}^T]^T, W = [w_0^T ... w_{N−1}^T]^T,

and V = [v_0^T ... v_{N−1}^T]^T. Using this compact notation, the system equations can be written as

X = T_0 x_0 + H U + G W, (3)
Y = C X + V, (4)

where a_i ∈ R^{nN} and b_i ∈ R. In this work, the environment parameters, a_i and b_i, are assumed to be deterministic. The expected value of the control inputs is also constrained to lie in a feasible region F_U; the control inputs could also be constrained probabilistically.

Finally, the stochastic control problem can be stated as a chance constrained optimization problem in Program P2.1.

Chance Constrained Problem

  minimize E(f(X, U))
  subject to
    X = T_0 x_0 + H U + G W
    Y = C X + V                      (P2.1)
    W ∼ N(0, Σ_w)
    V ∼ N(0, Σ_v)
    E(U) ∈ F_U
    P(X ∉ F_X) ≤ δ

In the current problem formulation there are two complications that prevent solving the optimization program P2.1 directly: (i) optimizing the controller that is used in the closed-loop system, and (ii) evaluating and satisfying the chance constraints P(X ∉ F_X) ≤ δ. The next section addresses the first issue by developing the controller that is used to provide recourse into the system to reduce the uncertainty of the system state.

III. CLOSED-LOOP SYSTEM

A feedback controller can be used to shape the uncertainty of the system state, facilitating the satisfaction of the chance constraints and improving the objective function cost. As has been considered in the past [5], [12], only the set of affine causal output feedback controllers will be considered, i.e.,

u_k = ū_k + Σ_{τ=0}^{k} F_{k,τ} y_τ, ∀k = 0, ..., N − 1. (7)

The control law could also have been defined in terms of a reference output, y_τ^r, i.e., u_k = ū_k + Σ_{τ=0}^{k} F_{k,τ} (y_τ − y_τ^r). The compact form of Eqn. (7) is given by

U = Ū + F Y. (8)

... constraints and improve the objective. In general, the program is not convex in the original design variables F and Ū. However, [5], [12] showed that by a change of variables the problem can be cast as a convex optimization problem. This is accomplished by using the Youla parametrization as follows,

Q = F (I − C H F)^{−1}, (16)

and the original gain matrix F can be efficiently recovered by

F = (I + Q C H)^{−1} Q, (17)

resulting in a block lower triangular matrix. Also, define

r = (I + Q C H) Ū (18)

such that

Ū = (I + Q C H)^{−1} r = (I − F C H) r. (19)
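To make the compact notation of Eqns. (3)–(4) concrete, here is a minimal numeric sketch (not from the paper; the double-integrator dimensions and all values are illustrative) that builds the stacked matrices T_0, H and G and checks them against a step-by-step simulation of Eqn. (1):

```python
# Sketch: build T0, H, G of Eqn. (3) for a double integrator and verify
# X = T0 x0 + H U + G W against a direct simulation of Eqn. (1).
import numpy as np

n, m, N = 2, 1, 4                       # state dim, input dim, horizon
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])   # double-integrator dynamics
B = np.array([[0.5 * dt**2], [dt]])

# X stacks x_0 ... x_N, so T0 = [I; A; A^2; ...; A^N]
T0 = np.vstack([np.linalg.matrix_power(A, k) for k in range(N + 1)])

# H and G are block lower triangular:
# x_k = A^k x0 + sum_{j<k} A^(k-1-j) (B u_j + w_j)
H = np.zeros(((N + 1) * n, N * m))
G = np.zeros(((N + 1) * n, N * n))
for k in range(1, N + 1):
    for j in range(k):
        Akj = np.linalg.matrix_power(A, k - 1 - j)
        H[k * n:(k + 1) * n, j * m:(j + 1) * m] = Akj @ B
        G[k * n:(k + 1) * n, j * n:(j + 1) * n] = Akj

rng = np.random.default_rng(0)
x0 = np.array([0.0, 1.0])
U = rng.normal(size=N * m)
W = rng.normal(scale=0.01, size=N * n)

# Step-by-step simulation of x_{k+1} = A x_k + B u_k + w_k
xs = [x0]
for k in range(N):
    xs.append(A @ xs[-1] + B @ U[k * m:(k + 1) * m] + W[k * n:(k + 1) * n])
X_sim = np.concatenate(xs)

X_stacked = T0 @ x0 + H @ U + G @ W     # Eqn. (3)
assert np.allclose(X_sim, X_stacked)
```

Whether X stacks x_0 through x_N or x_1 through x_N only shifts the first block row of T0, H and G; the construction is otherwise identical.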
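The change of variables in Eqns. (16)–(19) can be checked numerically. In this sketch (mine, not the paper's), S stands in for the strictly block lower triangular product CH, and F is an arbitrary causal (lower triangular) output feedback gain:

```python
# Numeric round-trip check of the Youla parametrization, Eqns. (16)-(19).
import numpy as np

rng = np.random.default_rng(1)
n = 5
S = np.tril(rng.normal(size=(n, n)), k=-1)   # stands in for CH (strictly causal)
F = np.tril(rng.normal(size=(n, n)))         # causal output feedback gain
I = np.eye(n)

Q = F @ np.linalg.inv(I - S @ F)             # Eqn. (16): Q = F (I - CHF)^-1
F_rec = np.linalg.inv(I + Q @ S) @ Q         # Eqn. (17) recovers F exactly
assert np.allclose(F_rec, F)
assert np.allclose(Q, np.tril(Q))            # Q inherits the causal structure

Ubar = rng.normal(size=n)
r = (I + Q @ S) @ Ubar                       # Eqn. (18)
# Eqn. (19) rests on the identity (I + QCH)^-1 = I - FCH:
assert np.allclose((I - F @ S) @ r, Ubar)
```

The last assertion is the matrix identity behind Eqn. (19): expanding (I + QCH)(I − FCH) and substituting Q − F = QCHF gives the identity exactly.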
A. Ellipsoidal Relaxation

The ellipsoidal relaxation method has been previously developed by [5] to simplify the joint chance constraints and is presented here for completeness.

Let z ∈ R^{n_z} and z ∼ N(z̄, Σ_z). The chance constraint P(h_i^T z > g_i for some i) ≤ δ is equivalent to P(h_i^T z ≤ g_i, ∀i) ≥ 1 − δ, which is also equivalent to

α ∫_{P_z} exp(−(1/2)(z − z̄)^T Σ_z^{−1} (z − z̄)) dz ≥ 1 − δ, (24)

where α = 1/√((2π)^{n_z} det(Σ_z)) and P_z is the feasible region for z. There is no straightforward way of handling this integral constraint, therefore it was suggested to tighten the constraint as follows. If the constraint

z̄ + E_r ⊂ P_z (25)

is ensured to be satisfied for the ellipsoid

E_r = {ξ : ξ^T Σ_z^{−1} ξ ≤ r^2}, (26)

with an appropriately chosen r, then the original constraint P(h_i^T z ≤ g_i, ∀i) ≥ 1 − δ is implied by

α ∫_{z̄+E_r} exp(−(1/2)(z − z̄)^T Σ_z^{−1} (z − z̄)) dz ≥ 1 − δ. (27)

After simplifying Eqn. (27) to a one-dimensional integral, r is chosen such that

(1 / (2^{n_z/2} Γ(n_z/2))) ∫_0^{r^2} χ^{n_z/2 − 1} exp(−χ/2) dχ = 1 − δ. (28)

Now the original constraint P(h_i^T z ≤ g_i, ∀i) ≥ 1 − δ can be replaced by requiring z̄ + E_r ⊂ F_z, which is equivalent to requiring

h_i^T z̄ + r √(h_i^T Σ_z h_i) ≤ g_i, ∀i. (29)

B. Boole's Inequality

The chance constraint P(X ∉ F_X) can be converted into univariate integrals by using Boole's inequality to conservatively bound the probability of violation. Consequently, from Eqn. (6) and Boole's inequality, the probability of the state not being contained inside the feasible region is bounded by

P(X ∉ F_X) = P(X ∈ ⋃_{i=1}^{N_FX} {X : a_i^T X > b_i}) ≤ Σ_{i=1}^{N_FX} P(a_i^T X > b_i). (30)

Now that the multivariate constraint has been converted to univariate constraints in Eqn. (30), they can be efficiently evaluated through

P(a_i^T X > b_i) = P(y_i > b_i) = 1 − Φ((b_i − a_i^T X̄) / √(a_i^T Σ_X a_i)). (31)

The function Φ(·) is the Gaussian cumulative distribution function, which does not have an analytic solution, but it can be efficiently evaluated using a series approximation or a pre-computed lookup table. If the constraints

1 − Φ((b_i − a_i^T X̄) / √(a_i^T Σ_X a_i)) ≤ δ_i (32)

are satisfied, then the original chance constraint will also be satisfied.

In the current problem formulation, both δ_i and Σ_X are variables, and there is not a direct way of handling them. However, if either variable is fixed, then the optimization problem can be solved efficiently. This property is exploited in the proposed solution methodology in Section V.

For a fixed covariance, Σ_X, and δ ≤ 0.5, the constraints in Eqn. (32) are convex since the function Φ(x) is concave in the range x ∈ [0, ∞) [8], [13].

If the risk allocation, δ_i, is fixed such that Σ_i δ_i ≤ δ, then the constraints in Eqn. (32) can be converted into equivalent second order cone constraints as follows. The constraints require

1 − Φ((b_i − a_i^T X̄) / √(a_i^T Σ_X a_i)) ≤ δ_i, (33)

which can be simplified to

a_i^T X̄ + Φ^{−1}(1 − δ_i) √(a_i^T Σ_X a_i) ≤ b_i, (34)

where Φ^{−1} is the inverse of the Gaussian cumulative distribution function. Finally, the standard deviation of the constraint uncertainty can be converted into a second order cone constraint,

√(a_i^T Σ_X a_i) = ‖ [(I + HQC) T_0 Σ_0^{1/2}, (I + HQC) G Σ_W^{1/2}, H Q Σ_V^{1/2}]^T a_i ‖. (35)

C. Comparison

The previous two techniques are equally valid for simplifying and bounding the joint chance constraints, but the ellipsoidal relaxation technique is very conservative even for a small number of states. For example, for two states and two constraints the ellipsoidal relaxation method's conservativeness in the violation probability is 72.9%, whereas using Boole's inequality results in only 0.7% conservativeness. Figure 1 shows a comparison of the two constraint relaxation techniques, in which the original constraints are the red solid lines, the blue dash line is the ellipsoidal relaxation technique, and the green dash-dot line is for Boole's inequality with a uniform risk allocation. The feasible region for each technique is below the corresponding lines. The ellipsoidal relaxation technique clearly requires a larger backoff from the original constraints, resulting in a larger over-approximation.
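The gap between the two relaxations can also be seen in a small backoff calculation (illustrative numbers; this is not the paper's 72.9%/0.7% example): for n_z = 2 and two constraints, compare the radius r from Eqn. (28) with the Boole backoff Φ^{−1}(1 − δ/N) from Eqn. (34):

```python
# Compare the per-constraint backoff (in standard deviations) demanded by
# the ellipsoidal relaxation, Eqn. (28), versus Boole's inequality with a
# uniform risk allocation, Eqn. (34). Standard library only.
import math

def Phi(x):
    """Standard Gaussian CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def Phi_inv(p, lo=-10.0, hi=10.0):
    """Inverse Gaussian CDF by bisection (adequate for a sketch)."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

delta, n_constraints = 0.1, 2

# Ellipsoid: for n_z = 2 the chi-square CDF in Eqn. (28) is 1 - exp(-r^2/2),
# so r solves 1 - exp(-r^2/2) = 1 - delta.
r = math.sqrt(-2.0 * math.log(delta))

# Boole: each constraint is backed off by Phi^{-1}(1 - delta/N) sigmas.
beta = Phi_inv(1.0 - delta / n_constraints)

print(f"ellipsoidal backoff r = {r:.3f} sigma")    # ~2.146
print(f"Boole backoff beta   = {beta:.3f} sigma")  # ~1.645
assert r > beta   # the ellipsoid requires the larger backoff
```

The ellipsoid's backoff grows with the state dimension n_z, while the Boole backoff depends only on the per-constraint risk, which is one way to see why the ellipsoidal relaxation becomes conservative quickly.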
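The probability evaluation of Eqn. (31) that underlies the conversion in Eqn. (34) can be sanity-checked by Monte Carlo (a sketch with made-up numbers, not results from the paper): the predicted violation probability should match the empirical frequency for Gaussian samples.

```python
# Monte Carlo check of Eqn. (31): P(a^T X > b) = 1 - Phi((b - a^T Xbar)/sigma)
# for Gaussian X, with sigma = sqrt(a^T Sigma a). Illustrative numbers.
import math
import numpy as np

def Phi(x):
    """Standard Gaussian CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

rng = np.random.default_rng(2)
Xbar = np.array([1.0, 0.5])
Sigma = np.array([[0.30, 0.05], [0.05, 0.20]])
a = np.array([1.0, 2.0])
b = 4.0

sigma = math.sqrt(a @ Sigma @ a)
p_violation = 1.0 - Phi((b - a @ Xbar) / sigma)     # Eqn. (31)

# Empirical estimate from samples of X ~ N(Xbar, Sigma)
samples = rng.multivariate_normal(Xbar, Sigma, size=200_000)
p_empirical = np.mean(samples @ a > b)
assert abs(p_violation - p_empirical) < 5e-3
```

Given this, the deterministic constraint of Eqn. (34) holds exactly when p_violation ≤ δ_i, which is the equivalence exploited by the SOCP formulation.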
Fig. 1. A comparison of the two joint chance constraint relaxation techniques. The original constraints are the red solid lines, the ellipsoidal relaxation technique is shown as the blue dash line, and the Boole's inequality method is shown as the green dash-dot line with a uniform risk allocation. The feasible region for the state is below the corresponding lines.

D. Optimization Problem

Using either method for bounding the chance constraints, the Program P2.1 can be simplified to P4.1.

Chance Constrained Problem

  minimize_{Q,r} E(f(X, U))
  subject to
    X = (I + HQC) T_0 x_0 + H r + (I + HQC) G W + H Q V
    U = Q C T_0 x_0 + r + Q C G W + Q V
    W ∼ N(0, Σ_w)
    V ∼ N(0, Σ_v)                          (P4.1)
    E(U) ∈ F_U
    L = I + HQC
    ν_i = ‖ [L T_0 Σ_0^{1/2}, L G Σ_W^{1/2}, H Q Σ_V^{1/2}]^T a_i ‖
    a_i^T X̄ + β_i ν_i ≤ b_i, ∀i
    Q lower triangular

In the optimization program P4.1, β_i = r for the ellipsoidal relaxation technique, or β_i = Φ^{−1}(1 − δ_i) for Boole's inequality. Given a fixed controller F or a fixed risk allocation δ_i, the Program P4.1 is a convex optimization problem which can be efficiently solved. However, optimizing over both the controller parameters and the risk allocation simultaneously is not a straightforward procedure, given the complication of β_i and ν_i being involved in a multiplicative constraint.

V. SOLUTION METHODOLOGY: TWO-STAGE METHOD

Given the complexity of simultaneously optimizing the risk allocation as well as the controller parameters, an iterative two-stage optimization scheme is utilized. The upper stage allocates the risk associated with each individual constraint, while the lower stage solves a second order cone program (SOCP) for the optimal control parameters given the current risk allocation. The difficulty in solving this problem is devising the iterative risk allocation scheme.

A. Upper Stage

There are many heuristic methods that can be devised for the upper stage, which allocates the risk for each constraint.

1) Bisection Method: The simplest heuristic is to use the bisection method presented in Algorithm 1 to adjust the uniform allocation of the risk until the actual risk is within a specified threshold of the allowed value. The algorithm starts with two uniform risk allocations, constructed as described in Algorithm 2, whose true probabilities of constraint violation (calculated in lines 3 and 6 of Algorithm 2) lie above and below the desired value. A uniform risk allocation at the midpoint of the current two allocations that bracket the desired violation is then passed to the lower stage to solve for the controller parameters. For this solution, if the true probability of constraint violation, calculated in line 1 of Algorithm 1, is less than the allowed risk, then it replaces the current lower bound; otherwise it replaces the upper bound. This is continued until the actual risk is within a threshold of the allowed risk.

Algorithm 1 Bisection Method for Risk Allocation
 1: δ_true = Σ_i [1 − Φ((b_i − a_i^T X̄) / √(a_i^T Σ_X a_i))]
 2: if |δ_true − δ| ≤ ε then
 3:   Solution found.
 4: end if
 5: if δ_true ≥ δ then
 6:   δ̄ = δ_i
 7: else
 8:   δ̲ = δ_i
 9: end if
10: δ_i = 0.5 (δ̲ + δ̄)

Algorithm 2 Initialization for the Bisection Method
 1: δ̲ = δ / N_FX
 2: Perform optimization of P4.1 with β_i = Φ^{−1}(1 − δ̲)
 3: δ̲_true = Σ_i [1 − Φ((b_i − a_i^T X̄) / √(a_i^T Σ_X a_i))]
 4: δ̄ = 0.5
 5: Perform optimization of P4.1 with β_i = Φ^{−1}(1 − δ̄)
 6: δ̄_true = Σ_i [1 − Φ((b_i − a_i^T X̄) / √(a_i^T Σ_X a_i))]
 7: if δ̄_true < δ then
 8:   Solution found, no need to iterate.
 9: end if
10: δ_i = 0.5 (δ̲ + δ̄)

2) Optimal Risk Given Fixed Controller: As was stated before, if the controller parameters, F or Q, are fixed (and hence the covariance Σ_X), then the problem simplifies to a convex optimization program. Consequently, this problem can be efficiently solved for the optimal risk allocation, δ_i ∀i, for the particular controller parameters. However, there is no guarantee that this risk allocation is the optimal risk allocation for the original problem.

B. Lower Stage

Once the risk allocation has been fixed, the program P4.1 simplifies to a standard second order cone program which can be solved efficiently by many standard solvers.
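The interaction of the two stages can be sketched as follows. This is an illustration only: the lower-stage SOCP of P4.1 is replaced here by a synthetic, monotone stand-in map from the uniform per-constraint risk δ_i to the resulting true violation probability, which preserves the bracketing structure the bisection of Algorithms 1 and 2 relies on.

```python
# Skeleton of the two-stage scheme: bisection upper stage (Algorithms 1-2)
# around a stand-in lower stage. Synthetic numbers throughout.
N_constraints = 4
delta_allowed = 0.10
eps = 1e-5

def lower_stage_true_violation(delta_i):
    """Stand-in for: solve P4.1 with beta_i = Phi^{-1}(1 - delta_i), then
    sum the true per-constraint violations as in line 1 of Algorithm 1.
    Any increasing map with the right bracketing behavior illustrates the
    search; this one is purely synthetic (Boole's bound is conservative)."""
    return N_constraints * delta_i * 0.7

# Algorithm 2: initialize the bracket [delta_lb, delta_ub]
delta_lb = delta_allowed / N_constraints   # conservative uniform allocation
delta_ub = 0.5                             # loose uniform allocation
assert lower_stage_true_violation(delta_lb) < delta_allowed
assert lower_stage_true_violation(delta_ub) > delta_allowed

# Algorithm 1: bisect on the uniform allocation until the true violation
# is within eps of the allowed risk
iterations = 0
while True:
    delta_i = 0.5 * (delta_lb + delta_ub)
    true_viol = lower_stage_true_violation(delta_i)
    iterations += 1
    if abs(true_viol - delta_allowed) <= eps:
        break
    if true_viol >= delta_allowed:
        delta_ub = delta_i        # too risky: becomes the new upper bound
    else:
        delta_lb = delta_i        # conservative: becomes the new lower bound

print(f"converged in {iterations} iterations, delta_i = {delta_i:.6f}")
```

In the actual method each evaluation of the stand-in function is a full SOCP solve, so the linear convergence of bisection translates directly into the iteration counts reported in Section VI.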
VI. APPLICATIONS: MOTION PLANNING

The above method is applicable to many different fields and is evaluated here on a stochastic motion planning problem. In this example, the system has double integrator dynamics with

f(X̄, Ū) = (x̄_N − x_ref)^T Q_obj (x̄_N − x_ref) + Ū^T R_obj Ū

... uncertainty is too large for the allowed risk. The solution for the ellipsoidal relaxation technique is shown in Figure 2(b). Here, the system also had to take the suboptimal route through the top of the environment because the relaxation method for the joint constraints was too conservative. The objective function cost is 0.186 with a probability of failure of 0.004, which is significantly less than the allowed risk.

The solution using the two-stage algorithm for optimizing the output feedback controller parameters is shown in Figures 2(c) and 2(d). Figure 2(c) shows the solution from the binary search upper stage algorithm. The algorithm took 16 iterations for ε = 1 × 10^−5 and the objective function's value was 0.120. The solution for the optimal risk given fixed controller upper stage is shown in Figure 2(d), which took only 1 iteration. The algorithm was initialized with a uniform allocation of the risk, δ_i = δ/N_FX, and the trajectory for this allocation is shown as the black line. The final solution is shown as the blue line with the 90% confidence ellipsoids around it. The initial objective cost is 0.146 and the final objective cost is 0.116, which is a 26% relative improvement.

Fig. 2. The solution and the confidence ellipsoids for a nonconvex environment. (a) Solution for an assumed LQR trajectory tracking controller. (b) The solution for the ellipsoidal relaxation method. (c) The solution from the two stage algorithm using the bisection method for the upper stage. (d) The solution for the optimal risk given fixed controller upper stage. The black line is the solution from the SOCP solver assuming a uniform allocation of risk. The blue line, with the confidence ellipsoids around it, is the solution using the controller parameters from the first SOCP stage and optimizing the risk for each constraint.

VII. CONCLUSION

The planning problem for a linear, Gaussian stochastic system with state constraints was formulated as a stochastic optimal control problem. A two-stage solution methodology was proposed that allocates the risk in one stage and optimizes the feedback controller in the other. This methodology reduces the conservatism in the probability of violation calculation and outperforms the other proposed methods in the example shown.

REFERENCES

[1] P. J. Campo and M. Morari, "Robust model predictive control," in Proceedings of the 1987 American Control Conference, (Minneapolis, MN), June 1987.
[2] W. Langson, I. Chryssochoos, S. V. Raković, and D. Mayne, "Robust model predictive control using tubes," Automatica, vol. 40, pp. 125–133, January 2004.
[3] A. Charnes and W. Cooper, "Deterministic equivalents for optimizing and satisfying under chance constraints," Operations Research, no. 11, pp. 18–39, 1963.
[4] A. Prékopa, Stochastic Programming. Kluwer Scientific, 1995.
[5] D. van Hessem, Stochastic inequality constrained closed-loop model predictive control with application to chemical process operation. PhD thesis, Delft University of Technology, 2004.
[6] L. Blackmore, "Robust path planning and feedback design under stochastic uncertainty," in Proceedings of the AIAA Guidance, Navigation, and Control Conference and Exhibit, (Honolulu, HI), August 2008.
[7] L. Blackmore, "A probabilistic particle control approach to optimal, robust predictive control," in Proceedings of the AIAA Guidance, Navigation, and Control Conference, (Keystone, Colorado), August 2006.
[8] L. Blackmore and M. Ono, "Convex chance constrained predictive control without sampling," in Proceedings of the AIAA Guidance, Navigation, and Control Conference, (Chicago, Illinois), August 2009.
[9] A. Prékopa, "The use of discrete moment bounds in probabilistic constrained stochastic programming models," Annals of Operations Research, vol. 85, pp. 21–38, 1999.
[10] M. Ono, L. Blackmore, and B. C. Williams, "Chance constrained finite horizon optimal control with nonconvex constraints," in Proceedings of the 2010 American Control Conference, (Baltimore, Maryland), June 2010.
[11] M. P. Vitus, S. L. Waslander, and C. J. Tomlin, "Locally optimal decomposition for autonomous obstacle avoidance with the tunnel-MILP algorithm," in Proceedings of the 47th IEEE Conference on Decision and Control, (Cancun, Mexico), December 2008.
[12] J. Skaf and S. Boyd, "Design of affine controllers via convex optimization," IEEE Transactions on Automatic Control, vol. 55, pp. 2476–2487, November 2010.
[13] M. P. Vitus and C. J. Tomlin, "Closed-loop belief space planning for linear, gaussian systems," in 2011 IEEE International Conference on Robotics and Automation, (Shanghai, China), May 2011.