
ISE521 Term Project

By:
Muhammad Dzulqarnain Al Firdausi
g202210120

Supervised by:
Dr. Syed Mujahid
ISE 521 TERM PROJECT

1. Introduction

This term project derives the dual function of the unsupervised learning model in (Goerigk and Kurtz, 2023), which is model (7) on page 3 of the reference, and reports experiments in which the resulting dual problem is solved.

2. Unsupervised learning model

The original model from (Goerigk and Kurtz, 2023) is as follows:

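$$
\begin{aligned}
\text{minimize}\quad & R^2 + \frac{1}{N\nu}\sum_{i=1}^{m}\xi_i \\
\text{s.t.}\quad & \lVert \phi(\boldsymbol{c}_i) - \bar{\boldsymbol{c}} \rVert_2^2 \le R^2 + \xi_i, \quad i = 1, \dots, m \\
& \bar{\boldsymbol{c}} \in \mathbb{R}^n,\; R \ge 0,\; \boldsymbol{\xi} \in \mathbb{R}^m_+
\end{aligned}
$$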

In this project, the mapping $\phi$ is omitted and the problem is rewritten in standard form as:

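$$
\begin{aligned}
\text{minimize}\quad & R^2 + \frac{1}{N\nu}\sum_{i=1}^{m}\xi_i \\
\text{s.t.}\quad & \lVert \boldsymbol{c}_i - \bar{\boldsymbol{c}} \rVert_2^2 - R^2 - \xi_i \le 0, \quad i = 1, \dots, m \\
& -R \le 0,\; -\boldsymbol{\xi} \le \boldsymbol{0}
\end{aligned}
$$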
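To make the roles of $R$, $\boldsymbol{\xi}$ and $\bar{\boldsymbol{c}}$ concrete, the following minimal sketch evaluates the objective of the standard-form problem for a given centre and radius. It assumes each slack is set to its smallest feasible value, $\xi_i = \max(0, \lVert \boldsymbol{c}_i - \bar{\boldsymbol{c}} \rVert_2^2 - R^2)$. The function name `primal_objective`, its argument names, and the toy data are illustrative assumptions and do not come from the reference.

```python
def primal_objective(points, centre, radius, big_n, nu):
    """Return (R^2 + 1/(N*nu) * sum(xi), xi) using the smallest feasible slacks."""
    if radius < 0:
        raise ValueError("R must be non-negative")
    slacks = []
    for c in points:
        dist_sq = sum((ck - bk) ** 2 for ck, bk in zip(c, centre))
        # Smallest xi_i >= 0 with ||c_i - c_bar||^2 - R^2 - xi_i <= 0.
        slacks.append(max(0.0, dist_sq - radius ** 2))
    return radius ** 2 + sum(slacks) / (big_n * nu), slacks


# Hypothetical toy data: three points on a line, ball centred on the middle one.
value, xi = primal_objective([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)],
                             centre=(1.0, 0.0), radius=1.0, big_n=3, nu=0.5)
print(value, xi)
```

Only the third point lies outside the ball here, so it is the only one with a positive slack.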

The steps to construct the dual problem from this NLP are as follows:

a. Decide the parameters and variables of the problem:

Parameters are: 𝑁 and 𝜈

Variables are: 𝑅, 𝝃 and 𝒄̅

b. Expand the term:

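$$
\begin{aligned}
\text{minimize}\quad & R^2 + \frac{1}{N\nu}\sum_{i=1}^{m}\xi_i \\
\text{s.t.}\quad & (\boldsymbol{c}_i - \bar{\boldsymbol{c}})^{T}(\boldsymbol{c}_i - \bar{\boldsymbol{c}}) - R^2 - \xi_i \le 0, \quad i = 1, \dots, m \\
& -R \le 0,\; -\boldsymbol{\xi} \le \boldsymbol{0}
\end{aligned}
$$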

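c. Construct the Lagrangian. Let $\boldsymbol{x}$ and $\boldsymbol{z}$ be the dual variables, where $\boldsymbol{x}, \boldsymbol{z} \in \mathbb{R}^m$, and keep the constraint $R \ge 0$ explicit.

$$
\mathcal{L}(R, \boldsymbol{\xi}, \bar{\boldsymbol{c}}, \boldsymbol{x}, \boldsymbol{z}) = R^2 + \frac{1}{N\nu}\sum_{i=1}^{m}\xi_i + \sum_{i=1}^{m} x_i\big((\boldsymbol{c}_i - \bar{\boldsymbol{c}})^{T}(\boldsymbol{c}_i - \bar{\boldsymbol{c}}) - R^2 - \xi_i\big) - \boldsymbol{z}^{T}\boldsymbol{\xi}
$$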

d. Build the dual function.


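$$
\begin{aligned}
d(\boldsymbol{x}, \boldsymbol{z}) &= \inf_{R \ge 0,\, \boldsymbol{\xi},\, \bar{\boldsymbol{c}}} \Big[ R^2 + \frac{1}{N\nu}\sum_{i=1}^{m}\xi_i + \sum_{i=1}^{m} x_i\big((\boldsymbol{c}_i - \bar{\boldsymbol{c}})^{T}(\boldsymbol{c}_i - \bar{\boldsymbol{c}}) - R^2 - \xi_i\big) - \boldsymbol{z}^{T}\boldsymbol{\xi} \Big] \\
&= \inf_{R \ge 0,\, \boldsymbol{\xi},\, \bar{\boldsymbol{c}}} \Big[ R^2 + \frac{1}{N\nu}\sum_{i=1}^{m}\xi_i + \sum_{i=1}^{m} x_i\big(\boldsymbol{c}_i^{T}\boldsymbol{c}_i + \bar{\boldsymbol{c}}^{T}\bar{\boldsymbol{c}} - 2\boldsymbol{c}_i^{T}\bar{\boldsymbol{c}} - R^2 - \xi_i\big) - \boldsymbol{z}^{T}\boldsymbol{\xi} \Big] \\
&= \inf_{R \ge 0,\, \boldsymbol{\xi},\, \bar{\boldsymbol{c}}} \Big[ R^2 + \frac{1}{N\nu}\sum_{i=1}^{m}\xi_i + \sum_{i=1}^{m} \big(x_i\boldsymbol{c}_i^{T}\boldsymbol{c}_i + x_i\bar{\boldsymbol{c}}^{T}\bar{\boldsymbol{c}} - 2x_i\boldsymbol{c}_i^{T}\bar{\boldsymbol{c}} - x_i R^2 - x_i\xi_i\big) - \boldsymbol{z}^{T}\boldsymbol{\xi} \Big] \\
&= \sum_{i=1}^{m} x_i\boldsymbol{c}_i^{T}\boldsymbol{c}_i + \inf_{R \ge 0,\, \boldsymbol{\xi},\, \bar{\boldsymbol{c}}} \Big[ R^2 + \frac{1}{N\nu}\sum_{i=1}^{m}\xi_i + \sum_{i=1}^{m} \big(x_i\bar{\boldsymbol{c}}^{T}\bar{\boldsymbol{c}} - 2x_i\boldsymbol{c}_i^{T}\bar{\boldsymbol{c}} - x_i R^2 - x_i\xi_i\big) - \boldsymbol{z}^{T}\boldsymbol{\xi} \Big]
\end{aligned}
$$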

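Let $\boldsymbol{e} \in \mathbb{R}^m$ denote the all-ones vector, and let

$$
\boldsymbol{\Gamma} = \begin{bmatrix} \boldsymbol{c}_1^{T} \\ \vdots \\ \boldsymbol{c}_m^{T} \end{bmatrix} \in \mathbb{R}^{m \times n}, \qquad
\boldsymbol{\Delta} = \begin{bmatrix} \boldsymbol{c}_1^{T}\boldsymbol{c}_1 \\ \vdots \\ \boldsymbol{c}_m^{T}\boldsymbol{c}_m \end{bmatrix} \in \mathbb{R}^{m}, \qquad
\boldsymbol{c}_i \in \mathbb{R}^n,\; i = 1, \dots, m
$$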

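$$
\begin{aligned}
d(\boldsymbol{x}, \boldsymbol{z}) &= \boldsymbol{x}^{T}\boldsymbol{\Delta} + \inf_{R \ge 0,\, \boldsymbol{\xi},\, \bar{\boldsymbol{c}}} \Big[ R^2 + \frac{1}{N\nu}\boldsymbol{e}^{T}\boldsymbol{\xi} + \boldsymbol{x}^{T}\boldsymbol{e}\,\bar{\boldsymbol{c}}^{T}\bar{\boldsymbol{c}} - 2\boldsymbol{x}^{T}\boldsymbol{\Gamma}\bar{\boldsymbol{c}} - \boldsymbol{x}^{T}\boldsymbol{e}R^2 - \boldsymbol{x}^{T}\boldsymbol{\xi} - \boldsymbol{z}^{T}\boldsymbol{\xi} \Big] \\
&= \boldsymbol{x}^{T}\boldsymbol{\Delta} + \inf_{R \ge 0}\big(R^2 - \boldsymbol{x}^{T}\boldsymbol{e}R^2\big) + \inf_{\boldsymbol{\xi}}\Big(\frac{1}{N\nu}\boldsymbol{e}^{T}\boldsymbol{\xi} - \boldsymbol{x}^{T}\boldsymbol{\xi} - \boldsymbol{z}^{T}\boldsymbol{\xi}\Big) + \inf_{\bar{\boldsymbol{c}}}\big(\boldsymbol{x}^{T}\boldsymbol{e}\,\bar{\boldsymbol{c}}^{T}\bar{\boldsymbol{c}} - 2\boldsymbol{x}^{T}\boldsymbol{\Gamma}\bar{\boldsymbol{c}}\big)
\end{aligned}
$$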
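The move from summation notation to matrix notation can be checked numerically. The sketch below builds $\boldsymbol{\Gamma}$ and $\boldsymbol{\Delta}$ for a small hypothetical data set and confirms that $\sum_i x_i(\boldsymbol{c}_i^{T}\boldsymbol{c}_i + \bar{\boldsymbol{c}}^{T}\bar{\boldsymbol{c}} - 2\boldsymbol{c}_i^{T}\bar{\boldsymbol{c}})$ equals $\boldsymbol{x}^{T}\boldsymbol{\Delta} + \boldsymbol{x}^{T}\boldsymbol{e}\,\bar{\boldsymbol{c}}^{T}\bar{\boldsymbol{c}} - 2\boldsymbol{x}^{T}\boldsymbol{\Gamma}\bar{\boldsymbol{c}}$. All numeric values and the helper `dot` are assumptions made for illustration.

```python
def dot(u, v):
    """Inner product of two equally long sequences."""
    return sum(a * b for a, b in zip(u, v))


points = [(1.0, 2.0), (3.0, -1.0), (0.0, 4.0)]   # hypothetical c_1, c_2, c_3
x = [0.2, 0.5, 0.3]                              # hypothetical multipliers
c_bar = (0.5, 1.5)                               # hypothetical centre

gamma = [list(c) for c in points]                # Gamma: row i is c_i^T
delta = [dot(c, c) for c in points]              # Delta_i = c_i^T c_i

# Summation form of the terms that involve x_i and the data.
sum_form = sum(xi * (dot(c, c) + dot(c_bar, c_bar) - 2 * dot(c, c_bar))
               for xi, c in zip(x, points))

# Matrix form: x^T Delta + (x^T e) c_bar^T c_bar - 2 x^T Gamma c_bar.
gamma_c_bar = [dot(row, c_bar) for row in gamma]
matrix_form = dot(x, delta) + sum(x) * dot(c_bar, c_bar) - 2 * dot(x, gamma_c_bar)

print(abs(sum_form - matrix_form) < 1e-12)
```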

The first inner minimum is achieved by:

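$$
\frac{\partial\,(R^2 - \boldsymbol{x}^{T}\boldsymbol{e}R^2)}{\partial R} = 0
$$

$$
2R - 2\boldsymbol{x}^{T}\boldsymbol{e}R = 0
$$

$$
2R(1 - \boldsymbol{x}^{T}\boldsymbol{e}) = 0
$$

If $R \neq 0$, then $\boldsymbol{x}^{T}\boldsymbol{e} = 1$.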

The second inner minimum is achieved by:

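$$
\frac{\partial\,\big(\frac{1}{N\nu}\boldsymbol{e}^{T}\boldsymbol{\xi} - \boldsymbol{x}^{T}\boldsymbol{\xi} - \boldsymbol{z}^{T}\boldsymbol{\xi}\big)}{\partial \boldsymbol{\xi}} = 0
$$

$$
\frac{\boldsymbol{e}}{N\nu} - \boldsymbol{x} - \boldsymbol{z} = 0
$$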

The third inner minimum is achieved by:

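$$
\frac{\partial\,\big(\boldsymbol{x}^{T}\boldsymbol{e}\,\bar{\boldsymbol{c}}^{T}\bar{\boldsymbol{c}} - 2\boldsymbol{x}^{T}\boldsymbol{\Gamma}\bar{\boldsymbol{c}}\big)}{\partial \bar{\boldsymbol{c}}} = 0
$$

$$
2\boldsymbol{x}^{T}\boldsymbol{e}\,\bar{\boldsymbol{c}} - 2(\boldsymbol{x}^{T}\boldsymbol{\Gamma})^{T} = 0
$$

$$
\bar{\boldsymbol{c}} = \frac{\boldsymbol{\Gamma}^{T}\boldsymbol{x}}{\boldsymbol{x}^{T}\boldsymbol{e}}
$$

Since $\boldsymbol{x}^{T}\boldsymbol{e} = 1$, we have:

$$
\bar{\boldsymbol{c}} = \boldsymbol{\Gamma}^{T}\boldsymbol{x}
$$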

The dual function will be:

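$$
\begin{aligned}
d(\boldsymbol{x}, \boldsymbol{z}) &= \boldsymbol{x}^{T}\boldsymbol{\Delta} + \inf_{\boldsymbol{\xi}}\Big(\frac{1}{N\nu}\boldsymbol{e}^{T}\boldsymbol{\xi} - \boldsymbol{x}^{T}\boldsymbol{\xi} - \boldsymbol{z}^{T}\boldsymbol{\xi}\Big) + \inf_{\bar{\boldsymbol{c}}}\big(\bar{\boldsymbol{c}}^{T}\bar{\boldsymbol{c}} - 2\boldsymbol{x}^{T}\boldsymbol{\Gamma}\bar{\boldsymbol{c}}\big) \\
&= \boldsymbol{x}^{T}\boldsymbol{\Delta} + (\boldsymbol{\Gamma}^{T}\boldsymbol{x})^{T}(\boldsymbol{\Gamma}^{T}\boldsymbol{x}) - 2\boldsymbol{x}^{T}\big(\boldsymbol{\Gamma}(\boldsymbol{\Gamma}^{T}\boldsymbol{x})\big) \\
&= \boldsymbol{x}^{T}\boldsymbol{\Delta} + \boldsymbol{x}^{T}\boldsymbol{\Gamma}\boldsymbol{\Gamma}^{T}\boldsymbol{x} - 2\boldsymbol{x}^{T}\boldsymbol{\Gamma}\boldsymbol{\Gamma}^{T}\boldsymbol{x} \\
&= \boldsymbol{x}^{T}\boldsymbol{\Delta} - \boldsymbol{x}^{T}\boldsymbol{\Gamma}\boldsymbol{\Gamma}^{T}\boldsymbol{x}
\end{aligned}
$$

The $\boldsymbol{\xi}$-term vanishes in the second line because its coefficient $\frac{\boldsymbol{e}}{N\nu} - \boldsymbol{x} - \boldsymbol{z}$ is zero.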
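As a sanity check on this closed form, the sketch below evaluates $\boldsymbol{x}^{T}\boldsymbol{\Delta} - \boldsymbol{x}^{T}\boldsymbol{\Gamma}\boldsymbol{\Gamma}^{T}\boldsymbol{x}$ and compares it with the weighted spread $\sum_i x_i \lVert \boldsymbol{c}_i - \bar{\boldsymbol{c}} \rVert_2^2$ at $\bar{\boldsymbol{c}} = \boldsymbol{\Gamma}^{T}\boldsymbol{x}$. The two agree whenever $\boldsymbol{x}^{T}\boldsymbol{e} = 1$. The function name `dual_value` and the data are hypothetical.

```python
def dual_value(points, x):
    """Return (x^T Delta - x^T Gamma Gamma^T x, c_bar) with c_bar = Gamma^T x."""
    n = len(points[0])
    c_bar = [sum(xi * c[k] for xi, c in zip(x, points)) for k in range(n)]
    x_delta = sum(xi * sum(v * v for v in c) for xi, c in zip(x, points))
    # x^T Gamma Gamma^T x is the squared norm of Gamma^T x.
    return x_delta - sum(v * v for v in c_bar), c_bar


points = [(1.0, 2.0), (3.0, -1.0), (0.0, 4.0)]   # hypothetical data
x = [0.2, 0.5, 0.3]                              # hypothetical weights, summing to 1

d, c_bar = dual_value(points, x)
spread = sum(xi * sum((ck - bk) ** 2 for ck, bk in zip(c, c_bar))
             for xi, c in zip(x, points))
print(d, spread)
```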
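The dual problem maximizes the dual function. Written as a minimization, it is:

$$
\begin{aligned}
\text{minimize}\quad & \boldsymbol{x}^{T}\boldsymbol{\Gamma}\boldsymbol{\Gamma}^{T}\boldsymbol{x} - \boldsymbol{x}^{T}\boldsymbol{\Delta} \\
\text{s.t.}\quad & \frac{\boldsymbol{e}}{N\nu} - \boldsymbol{x} - \boldsymbol{z} = 0 \\
& \boldsymbol{x}^{T}\boldsymbol{e} = 1 \\
& \boldsymbol{x}, \boldsymbol{z} \ge \boldsymbol{0}
\end{aligned}
$$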


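Simplify the first constraint by rewriting it as an inequality constraint:

$$
\frac{\boldsymbol{e}}{N\nu} - \boldsymbol{x} - \boldsymbol{z} = 0 \;\Longrightarrow\; \frac{\boldsymbol{e}}{N\nu} = \boldsymbol{x} + \boldsymbol{z}
$$

Since $\boldsymbol{x} \ge \boldsymbol{0}$ and $\boldsymbol{z} \ge \boldsymbol{0}$, it follows that $\frac{\boldsymbol{e}}{N\nu} \ge \boldsymbol{x}$ and $\frac{\boldsymbol{e}}{N\nu} \ge \boldsymbol{z}$, so $\boldsymbol{z}$ can be eliminated. We then introduce $\mathbf{K} = \boldsymbol{\Gamma}\boldsymbol{\Gamma}^{T}$, where $\mathbf{K} \in \mathbb{R}^{m \times m}$.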

Therefore, the dual problem after simplification becomes:

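$$
\begin{aligned}
\text{minimize}\quad & \sum_{i,j=1}^{m} x_i x_j \mathrm{K}_{i,j} - \sum_{i=1}^{m} x_i \Delta_i \\
\text{s.t.}\quad & 0 \le x_i \le \frac{1}{N\nu}, \quad i \in [m] \\
& \sum_{i=1}^{m} x_i = 1
\end{aligned}
$$

where $\boldsymbol{\Delta} = \begin{bmatrix} \boldsymbol{c}_1^{T}\boldsymbol{c}_1 & \cdots & \boldsymbol{c}_m^{T}\boldsymbol{c}_m \end{bmatrix}^{T} \in \mathbb{R}^{m}$ and the last constraint is $\boldsymbol{x}^{T}\boldsymbol{e} = 1$.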
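For readers without a commercial solver, a problem of this shape can also be approximated with a simple first-order method. The sketch below uses the Frank-Wolfe method with exact line search, which is an assumption made for illustration and is not the solver used in this project. The function name `solve_dual_frank_wolfe`, the parameter `upper` (standing for $\frac{1}{N\nu}$), and the toy data are all hypothetical. It works with $\boldsymbol{\Gamma}$ directly, so $\mathbf{K}$ is never formed.

```python
def solve_dual_frank_wolfe(points, upper, iters=5000, tol=1e-12):
    """Minimise x'Kx - x'Delta over {0 <= x_i <= upper, sum(x) = 1}."""
    m, n = len(points), len(points[0])
    assert m * upper >= 1.0, "box too tight for sum(x) = 1"
    delta = [sum(v * v for v in c) for c in points]

    def capped_vertex(order):
        # Fill coordinates in the given order up to the cap until the mass is 1.
        s, left = [0.0] * m, 1.0
        for i in order:
            s[i] = min(upper, left)
            left -= s[i]
        return s

    x = capped_vertex(range(m))                      # feasible starting point
    for _ in range(iters):
        centre = [sum(x[i] * points[i][k] for i in range(m)) for k in range(n)]
        # Gradient of x'Kx - x'Delta is 2 * Gamma (Gamma' x) - Delta.
        grad = [2.0 * sum(points[i][k] * centre[k] for k in range(n)) - delta[i]
                for i in range(m)]
        # Linear minimisation over the feasible set: cheapest coordinates first.
        s = capped_vertex(sorted(range(m), key=lambda i: grad[i]))
        d = [s[i] - x[i] for i in range(m)]
        gap = -sum(grad[i] * d[i] for i in range(m))  # Frank-Wolfe gap
        if gap <= tol:
            break
        gd = [sum(d[i] * points[i][k] for i in range(m)) for k in range(n)]
        curv = sum(v * v for v in gd)                 # d'Kd
        step = 1.0 if curv == 0.0 else min(1.0, gap / (2.0 * curv))
        x = [x[i] + step * d[i] for i in range(m)]
    return x


# Hypothetical toy data: the third point lies inside the ball spanned by the first two.
toy = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.5)]
x_opt = solve_dual_frank_wolfe(toy, upper=1.0)
centre = [sum(xi * c[k] for xi, c in zip(x_opt, toy)) for k in range(2)]
print(x_opt, centre)
```

On this toy instance the weight concentrates on the two points that define the enclosing ball, and the interior point receives zero weight.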

3. Experiment

The following experiment solves the dual problem above on randomly generated data, following (Goerigk and Kurtz, 2023). To generate three different types of data sets, (Goerigk and Kurtz, 2023) built on (Shang et al., 2017) and used the sklearn Python package for the Gaussian data (first type) and the mixed Gaussian data (second type), in which there are two Gaussian distributions whose mean points lie in different quadrants. The third type of data set is sampled uniformly from a polyhedron constructed with budgeted uncertainty (Bertsimas and Sim, 2004). The polyhedral set is as follows:

$$
\mathcal{U} = \Big\{ c : c_i = \underline{c}_i + \bar{c}_i\,\delta_i,\; \sum_{i=1}^{N} \delta_i \le \Gamma,\; \delta \in [0,1]^N \Big\}
$$

where $\underline{c}_i$ (the lower bound) and $\bar{c}_i$ (the upper bound) are randomly chosen and $\Gamma = \frac{N}{2}$. Figure 1 shows the distribution of the three data sets, numbered according to their type (first, second, and third type).
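One simple way to draw uniform samples from a set of this form is rejection sampling on $\delta$, sketched below. This is an assumed approach for illustration and is not necessarily how the reference generated its data. The names `sample_budgeted`, `lower`, and `c_bar`, as well as the numeric values, are hypothetical.

```python
import random


def sample_budgeted(lower, c_bar, budget, rng):
    """Draw (c, delta) with c_i = lower_i + c_bar_i * delta_i and sum(delta) <= budget."""
    while True:
        delta = [rng.random() for _ in lower]   # uniform on [0, 1)^N
        if sum(delta) <= budget:                # reject draws that exceed the budget
            c = [lo + cb * dl for lo, cb, dl in zip(lower, c_bar, delta)]
            return c, delta


rng = random.Random(0)                          # fixed seed for reproducibility
lower, c_bar = [1.0, 2.0], [0.5, 1.5]           # hypothetical bounds, N = 2
budget = len(lower) / 2                         # Gamma = N / 2
samples = [sample_budgeted(lower, c_bar, budget, rng) for _ in range(500)]
print(samples[0])
```

Because accepted $\delta$ vectors are uniform on the budgeted polytope and the map to $c$ is an affine scaling, the resulting points are uniform on the set as well.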


Figure 1 Three different data sets used in the experiment


The experiment was conducted in Google Colaboratory, with Gurobi 10.0.1 as the solver (the code is in Appendix 1). Each data set contains 500 data points in two dimensions and was scaled with StandardScaler from the scikit-learn library. The parameter was set to $\nu = 0.1$, and the optimal dual objective function values are shown in Table 1.
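For reference, the scaling step can be sketched without scikit-learn. By default, StandardScaler subtracts the column mean and divides by the population standard deviation, which the hypothetical helper `standardize` below reproduces for columns that are not constant. The toy data are an assumption.

```python
import statistics


def standardize(points):
    """Column-wise (value - mean) / population std; assumes no constant column."""
    columns = list(zip(*points))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return [[(v - mu) / sd for v, mu, sd in zip(row, means, stds)]
            for row in points]


scaled = standardize([(1.0, 10.0), (2.0, 20.0), (3.0, 60.0)])  # hypothetical data
print(scaled)
```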

Table 1 Solver results of dual problem

Type              Variable values                                          Objective function value
Gaussian          x_21 = 0.33642,  x_22 = 0.384137, x_23 = 0.279444        -0.36554
Mixed Gaussian    x_1 = 0.0805,    x_2 = 0.428258,  x_8 = 0.491242         -0.10353
Polyhedral        x_1 = 0.446546,  x_5 = 0.384137,  x_8 = 0.166006         -0.505937


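Since we know that $2R(1 - \boldsymbol{x}^{T}\boldsymbol{e}) = 0$ and $\bar{\boldsymbol{c}} = \boldsymbol{\Gamma}^{T}\boldsymbol{x}$, we can obtain the values of $R$ and $\bar{\boldsymbol{c}}$. For example, for the Gaussian data the calculation is: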

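$$
\bar{\boldsymbol{c}} = \boldsymbol{\Gamma}^{T}\boldsymbol{x}
$$

where the Gaussian input data are

$$
\boldsymbol{\Gamma}^{T} = \begin{bmatrix} -0.397169517 & 4.22330985 & \cdots & 0.39206806 \\ 2.12439709 & 5.39420061 & \cdots & 1.74221526 \end{bmatrix} \in \mathbb{R}^{2 \times 500}
$$

and $\boldsymbol{x} \in \mathbb{R}^{500 \times 1}$ has all entries equal to zero except $x_{21} = 0.33642$, $x_{22} = 0.384137$, and $x_{23} = 0.27944$. Then:

$$
\bar{\boldsymbol{c}} = \begin{bmatrix} 3.30929686 \\ 1.73924688 \end{bmatrix}
$$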

To calculate 𝑅, the steps are as follows:

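1. We have $x_i \neq 0$; therefore, by complementary slackness, the corresponding constraint $i$ is tight:

$$
\lVert \boldsymbol{c}_i - \bar{\boldsymbol{c}} \rVert_2^2 - R^2 - \xi_i = 0
$$

2. There are two scenarios:

$$
\begin{cases}
x_i = \dfrac{1}{N\nu} \;\Rightarrow\; z_i = 0 \\[2ex]
x_i < \dfrac{1}{N\nu} \;\Rightarrow\; z_i = \dfrac{1}{N\nu} - x_i
\end{cases}
$$

3. In our experiment, we set $\nu = 0.1$ and we have $N = 500$, so the scenario $x_i < \frac{1}{N\nu}$ applies. Since $z_i$ is the dual variable for $\xi_i$ and $z_i \neq 0$, it follows that $\xi_i = 0$. Then we have:

$$
\lVert \boldsymbol{c}_i - \bar{\boldsymbol{c}} \rVert_2^2 = R^2
$$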


Here $\boldsymbol{c}_i$ is the original data point whose index $i$ corresponds to a non-zero variable $x_i$ in the optimal solution.
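The case distinction in the steps above can be sketched as a small classifier that recovers $z_i = \frac{1}{N\nu} - x_i$ and labels each point from its dual value. The function name `classify`, the tolerance, the label strings, and the sample values are assumptions for illustration. The labels follow from complementary slackness under the model as stated.

```python
def classify(x, cap, tol=1e-9):
    """Label each point from its dual value x_i, where cap stands for 1/(N*nu)."""
    labels = []
    for xi in x:
        zi = cap - xi                       # from e/(N*nu) - x - z = 0
        if xi <= tol:
            # z_i > 0 forces xi_i = 0, and the ball constraint may be slack.
            labels.append("inside or on the ball")
        elif zi > tol:
            # x_i > 0 makes the constraint tight and z_i > 0 forces xi_i = 0.
            labels.append("on the boundary")
        else:
            # x_i at its cap: the constraint is tight and xi_i may be positive.
            labels.append("on or outside the ball")
    return labels


labels = classify([0.0, 0.3, 0.5, 0.2], cap=0.5)   # hypothetical dual values
print(labels)
```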

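$$
\lVert \boldsymbol{c}_{21} - \bar{\boldsymbol{c}} \rVert_2^2 = R^2
$$

$$
\left\lVert \begin{bmatrix} -2.51706196 \\ 0.12433039 \end{bmatrix} - \begin{bmatrix} 3.30929686 \\ 1.73924688 \end{bmatrix} \right\rVert_2^2 = R^2
$$

$$
R = 6.046
$$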
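The arithmetic for the Gaussian data can be reproduced in a few lines. The sketch below uses only the two vectors reported above; the variable names are illustrative.

```python
import math

c_21 = (-2.51706196, 0.12433039)   # support point reported above
c_bar = (3.30929686, 1.73924688)   # centre reported above

# R is the Euclidean distance between the support point and the centre.
radius = math.dist(c_21, c_bar)
print(round(radius, 3))
```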

The summary of $R$ values for each data set is presented in Table 2.

Table 2 Values of $R$

Type              $\bar{\boldsymbol{c}}$                 $R$
Gaussian          (3.30929686, 1.73924688)               6.046
Mixed Gaussian    (0.04091851, -0.0928156)               3.2176
Polyhedral        (-2.73523145, -1.74632099)             7.11292

The boundaries obtained from the computed values of $\bar{\boldsymbol{c}}$ and $R$ are plotted in Figure 2.

Figure 2 Boundaries for the three datasets


Appendix 1
Gurobi code for the dual model.
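from gurobipy import GRB, Model, quicksum

# `x` and `data_ori` are assumed to be defined earlier
# (the data set as an array with one row per data point).
m = Model('ISE_521')

# Parameter nu
fraction = 0.1

# Dual variables (called x in Section 2) with bounds 0 <= alpha_i <= ub.
# Note: data_ori.ndim is the number of array axes, not the number of rows.
alpha = m.addMVar(len(x), name='alpha', lb=0,
                  ub=1 / (data_ori.ndim * fraction))

# Index ranges over the data points
i_index = list(range(len(x)))
j_index = list(range(len(x)))

# Constraint: the dual variables sum to 1
m.addConstr(quicksum(alpha[i] for i in i_index) == 1)

# Objective: sum_ij alpha_i alpha_j c_i^T c_j - sum_i alpha_i c_i^T c_i
obj_1 = quicksum(
    alpha[i] * alpha[j] * (data_ori[i].T @ data_ori[j])
    for i in i_index
    for j in j_index)
obj_2 = quicksum(
    alpha[i] * (data_ori[i].T @ data_ori[i])
    for i in i_index)
m.setObjective(obj_1 - obj_2, GRB.MINIMIZE)

# Solve the model
m.optimize()

# Print the non-zero variable values
m.printAttr('x')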

References

Bertsimas, D., Sim, M., 2004. The Price of Robustness. Operations Research 52, 35–53. https://doi.org/10.1287/opre.1030.0065

Goerigk, M., Kurtz, J., 2023. Data-driven robust optimization using deep neural networks. Computers & Operations Research 151, 106087. https://doi.org/10.1016/j.cor.2022.106087

Shang, C., Huang, X., You, F., 2017. Data-driven robust optimization based on kernel learning. Computers & Chemical Engineering 106, 464–479. https://doi.org/10.1016/j.compchemeng.2017.07.004
