Courant Institute
Learning with a Large Number of Experts
Regret of RWM is $O(\sqrt{T \ln N})$.
Informative even for a very large number of experts.
What if there is overlap between experts?
RWM with path experts
FPL with path experts
Can we do better?
[Littlestone and Warmuth, 1989; Kalai and Vempala, 2004]
Better Bounds in Structured Case?
Outline
Learning Scenario
Component Hedge Algorithm
Regret Bounds
Applications & Lower Bounds
Conclusion & Open Problems
Learning Scenario
Assumptions:
Structured concept class $\mathcal{C} \subseteq \{0,1\}^d$, composed of components: $C^t \in \mathcal{C}$ indicates which components are used at trial $t$.
Losses are linear over components: at trial $t$, a loss vector $\ell^t \in [0,1]^d$ is revealed and concept $C^t$ incurs loss $C^t \cdot \ell^t$.
Goal:
minimize expected regret after $T$ trials:
$$R_T = \sum_{t=1}^{T} \mathbb{E}[C^t] \cdot \ell^t \;-\; \min_{C \in \mathcal{C}} \sum_{t=1}^{T} C \cdot \ell^t$$
Component Hedge Algorithm
[Koolen, Warmuth, and Kivinen, 2010]
CH maintains weights $w^t \in \mathrm{conv}(\mathcal{C}) \subseteq [0,1]^d$ over the components at each round $t$.
Update:
1. weights: $\hat{w}_i^t = w_i^{t-1}\, e^{-\eta \ell_i^t}$
2. relative entropy projection: $w^t := \operatorname{argmin}_{w \in \mathrm{conv}(\mathcal{C})} \Delta(w \,\|\, \hat{w}^t)$
where $\Delta(w \,\|\, v) = \sum_{i=1}^{d} \left( w_i \ln \frac{w_i}{v_i} + v_i - w_i \right)$
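As an illustration (my own sketch, not code from the slides), the multiplicative step and the unnormalized relative entropy $\Delta$ can be written directly; the projection step is omitted here because it depends on the structure of $\mathrm{conv}(\mathcal{C})$. The function names `hedge_step` and `relative_entropy` are my own:

```python
import math

def hedge_step(w, losses, eta):
    """Multiplicative update: w-hat_i = w_i * exp(-eta * loss_i)."""
    return [wi * math.exp(-eta * li) for wi, li in zip(w, losses)]

def relative_entropy(w, v):
    """Unnormalized relative entropy:
    Delta(w || v) = sum_i ( w_i ln(w_i / v_i) + v_i - w_i )."""
    total = 0.0
    for wi, vi in zip(w, v):
        if wi > 0:
            total += wi * math.log(wi / vi)
        total += vi - wi
    return total
```

Note that $\Delta(w \,\|\, w) = 0$ and the update only shrinks the weights of components that incurred loss.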
Component Hedge Algorithm
Prediction:
1. Decomposition of weights into concepts: $w^t = \sum_{C \in \mathcal{C}} p_C\, C$ with $p_C \ge 0$ and $\sum_{C} p_C = 1$.
2. Predict with concept $C$ drawn with probability $p_C$, so that the expected prediction equals $w^t$.
Efficiency
Regret Bounds
Since each concept uses at most $M$ components and $\ell^t \in [0,1]^d$, the best concept's cumulative loss is at most $MT$, and the regret satisfies $R_T \le O(M\sqrt{T \ln d})$.
Matching lower bounds in applications.
Comparison of CH, RWM and FPL
CH has significantly better regret bounds:
CH: $R_T \le O(M\sqrt{T \ln d})$
RWM: $R_T \le O(M\sqrt{MT \ln d})$
FPL: $R_T \le O(M\sqrt{dT \ln d})$
Applications
On-line Shortest Path Problem (SPP)
G = (V , E ) is a directed graph.
s is the source and t is the destination.
Each $s$-$t$ path is an expert.
The loss is additive over edges.
Unit Flow Polytope
$w_{u,v} \ge 0, \quad \forall (u,v) \in E$
$\sum_{v \in V} w_{s,v} = 1$
$\sum_{v \in V} w_{v,u} = \sum_{v \in V} w_{u,v}, \quad \forall u \in V \setminus \{s, t\}$
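A small membership checker for these constraints can be sketched as follows (my own helper, with an assumed edge-dictionary representation of the graph):

```python
def is_unit_flow(edges, s, t, tol=1e-9):
    """Check membership in the unit flow polytope: nonnegative edge
    weights, unit outflow at the source s, and flow conservation at
    every node other than s and t. `edges` maps (u, v) to w_{u,v}."""
    if any(w < -tol for w in edges.values()):
        return False
    nodes = {u for u, _ in edges} | {v for _, v in edges}
    inflow = {u: 0.0 for u in nodes}
    outflow = {u: 0.0 for u in nodes}
    for (u, v), w in edges.items():
        outflow[u] += w
        inflow[v] += w
    if abs(outflow[s] - 1.0) > tol:
        return False
    return all(abs(inflow[u] - outflow[u]) <= tol
               for u in nodes if u not in (s, t))
```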
Example of Unit Flow Polytope
[Figure: an example unit flow on a small graph with vertices 1-7; the edge weights (values such as 0, 0.25, and 0.5) satisfy the constraints above.]
Entropy Projection on Unit Flow Polytope
$$\min_{w} \sum_{(u,v) \in E} \left( w_{u,v} \ln \frac{w_{u,v}}{\hat{w}_{u,v}} + \hat{w}_{u,v} - w_{u,v} \right)$$
subject to:
$w_{u,v} \ge 0, \quad \forall (u,v) \in E$
$\sum_{v \in V} w_{s,v} = 1$
$\sum_{v \in V} w_{v,u} = \sum_{v \in V} w_{u,v}, \quad \forall u \in V \setminus \{s, t\}$
Dual problem
$$\max_{\lambda \in \mathbb{R}^V} \; \Big\{ \lambda_s - \lambda_t - \sum_{(u,v) \in E} \hat{w}_{u,v}\, e^{\lambda_u - \lambda_v} \Big\}$$
No constraints.
Only $|V|$ variables.
Primal solution: $w_{u,v} = \hat{w}_{u,v}\, e^{\lambda_u - \lambda_v}$
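One way to solve this dual, sketched here as an assumption rather than as the slides' prescribed method, is exact coordinate ascent over $x_u = e^{\lambda_u}$: each coordinate update has a closed form (balance inflow and outflow at the node), and the primal is recovered as $w_{u,v} = \hat{w}_{u,v}\, x_u / x_v$:

```python
import math

def entropy_project_flow(w_hat, s, t, n_iters=200):
    """Approximate relative-entropy projection of w_hat onto the unit
    flow polytope by coordinate ascent on the dual. Uses x_u =
    exp(lambda_u); the primal is w_uv = w_hat_uv * x_u / x_v.
    Assumes strictly positive w_hat on the edges of a DAG."""
    nodes = {u for u, _ in w_hat} | {v for _, v in w_hat}
    x = {u: 1.0 for u in nodes}
    for _ in range(n_iters):
        # Source constraint: sum_v w_hat_sv * x_s / x_v = 1.
        x[s] = 1.0 / sum(wh / x[v] for (u, v), wh in w_hat.items() if u == s)
        # Flow conservation at internal nodes:
        # x_u^2 = (sum_a w_hat_au * x_a) / (sum_b w_hat_ub / x_b).
        for u in nodes:
            if u in (s, t):
                continue
            inflow = sum(wh * x[a] for (a, b), wh in w_hat.items() if b == u)
            outflow = sum(wh / x[b] for (a, b), wh in w_hat.items() if a == u)
            x[u] = math.sqrt(inflow / outflow)
    return {(u, v): wh * x[u] / x[v] for (u, v), wh in w_hat.items()}
```

On a small graph, a few hundred sweeps suffice; the flow constraints then hold up to numerical tolerance.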
Convex Decomposition
Example of Convex Decomposition
[Figure sequence over five slides: the unit flow from the earlier example is decomposed step by step. At each step a source-to-sink path carrying positive flow is selected, its minimum edge weight (e.g. 0.25) is subtracted from every edge on the path and recorded as the path's coefficient, until no flow remains.]
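The greedy procedure illustrated above can be sketched as follows (a minimal implementation of my own, assuming the flow lives on a DAG given as an edge dictionary):

```python
def decompose_flow(edges, s, t, tol=1e-9):
    """Greedily decompose an s-t flow into a convex combination of
    paths: trace a path through edges with positive weight, subtract
    its bottleneck weight from every edge on it, and record that
    weight as the path's coefficient. Returns (coefficient, path)
    pairs; each peel zeroes at least one edge, so there are at most
    |E| paths."""
    edges = dict(edges)  # work on a copy
    out = []
    while True:
        # Trace a path from s to t through edges carrying positive flow.
        path, u = [s], s
        while u != t:
            nxt = next((v for (a, v), w in edges.items()
                        if a == u and w > tol), None)
            if nxt is None:
                return out  # no remaining flow out of s
            path.append(nxt)
            u = nxt
        # Peel off the bottleneck weight along this path.
        c = min(edges[(path[i], path[i + 1])] for i in range(len(path) - 1))
        for i in range(len(path) - 1):
            edges[(path[i], path[i + 1])] -= c
        out.append((c, path))
```

For a unit flow, the coefficients returned sum to 1, giving exactly the convex decomposition CH needs for prediction.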
Regret Bounds for SPP
Lower Bounds
Conclusions
References
Wouter Koolen, Manfred K. Warmuth, and Jyrki Kivinen. Hedging structured concepts. In COLT, 2010.
Nick Littlestone and Manfred K. Warmuth. The weighted majority algorithm. In FOCS, 1989, pp. 256-261.
Adam Kalai and Santosh Vempala. Efficient algorithms for online decision problems. J. Comput. Syst. Sci., 71(3):291-307, 2005.
Eiji Takimoto and Manfred K. Warmuth. Path kernels and multiplicative updates. JMLR, 4:773-818, 2003.
T. van Erven, W. Kotlowski, and M. K. Warmuth. Follow the leader with dropout perturbations. In COLT, 2014.
Nicolò Cesa-Bianchi and Gábor Lugosi. Combinatorial bandits. In COLT, 2009.
Regret Bounds
Proof of CH Regret Bound
1. Bound:
$(1 - e^{-\eta})\, w^{t-1} \cdot \ell^t \le \Delta(C \,\|\, w^{t-1}) - \Delta(C \,\|\, w^t) + \eta\, C \cdot \ell^t$
using $1 - e^{-\eta x} \ge (1 - e^{-\eta})\, x$ for $x \in [0,1]$ and the Generalized Pythagorean Theorem.
2. Sum over trials $t$:
$(1 - e^{-\eta}) \sum_{t=1}^{T} w^{t-1} \cdot \ell^t \le \Delta(C \,\|\, w^0) - \Delta(C \,\|\, w^T) + \eta\, C \cdot \ell^{\le T}$
where $\ell^{\le T} = \ell^1 + \cdots + \ell^T$.
3. Use $w^{t-1} \cdot \ell^t = \mathbb{E}[C^t] \cdot \ell^t$:
$$\sum_{t=1}^{T} \mathbb{E}[C^t] \cdot \ell^t \le \frac{\Delta(C \,\|\, w^0) - \Delta(C \,\|\, w^T) + \eta\, C \cdot \ell^{\le T}}{1 - e^{-\eta}}$$
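As a quick numerical sanity check (my own, not part of the slides), the inequality $1 - e^{-\eta x} \ge (1 - e^{-\eta})\, x$ used in step 1 can be verified on a grid over $[0,1]$; it holds because the left side is concave in $x$ and the two sides agree at $x = 0$ and $x = 1$:

```python
import math

def check_exp_inequality(eta, n=1000):
    """Verify 1 - exp(-eta*x) >= (1 - exp(-eta)) * x on a grid of
    x in [0, 1]. The left side is concave in x and matches the
    (linear) right side at both endpoints, so it dominates between."""
    return all(
        1 - math.exp(-eta * (k / n)) >= (1 - math.exp(-eta)) * (k / n) - 1e-12
        for k in range(n + 1)
    )
```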
Proof of CH Regret Bound