
Matrix Completion and Tensor Decomposition and Completion


Matt Olfat
UC Berkeley IEOR
Overview

Matrix completion motivation and methods

Background on tensors and basic generalizations from matrices

Overview of challenges and methods for tensor decomposition

Existing tensor completion methods

Proposed works

Matrix Sensing & Completion
Problem:
•  Minimize rank of matrix X subject to affine constraints A(X) = b
•  When A is a sampling operator, the problem is called Matrix Completion
•  The more general case is called Matrix Sensing, which contains vector cardinality minimization as a special case (a minimal sketch of the sampling operator follows)
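To make the sampling operator concrete, here is a minimal numpy sketch; the function name `sampling_operator` and the toy data are illustrative, not from the slides:

```python
import numpy as np

# A(X) = b: a sampling operator simply reads off X at the observed positions.
def sampling_operator(X, rows, cols):
    """Return the vector of observed entries X[rows[k], cols[k]]."""
    return X[rows, cols]

rng = np.random.default_rng(0)
U, V = rng.standard_normal((5, 2)), rng.standard_normal((5, 2))
X_true = U @ V.T                                # rank-2 ground truth
rows, cols = rng.integers(0, 5, 12), rng.integers(0, 5, 12)
b = sampling_operator(X_true, rows, cols)       # data for the completion problem
```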

Motivation:
•  Motivated by compressed sensing, Candès & Tao (2005)
•  Popularized by the Netflix Prize, recommending movies based on sparse ratings
•  Applications include image compression, sensor positioning, multiclass learning

Challenges:
a) X can lie in the null space of A (e.g., a sampling operator that never observes entry (1,1) cannot distinguish e₁e₁ᵀ from the zero matrix):

e₁e₁ᵀ = [1 0 0; 0 0 0; 0 0 0]

b) Matrix rank is highly non-convex (a convex combination of two rank-1 matrices can have rank 2; see the numerical check below):

α [1 0; 0 0] + (1 − α) [0 0; 0 1] = [α 0; 0 1−α]
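A quick numerical check of example (b), as a minimal numpy sketch:

```python
import numpy as np

# A convex combination of two rank-1 matrices yields a rank-2 matrix,
# illustrating that rank is not a convex function.
A = np.array([[1.0, 0.0], [0.0, 0.0]])   # rank 1
B = np.array([[0.0, 0.0], [0.0, 1.0]])   # rank 1
alpha = 0.5
C = alpha * A + (1 - alpha) * B          # diag(alpha, 1 - alpha)
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(B), np.linalg.matrix_rank(C))
# -> 1 1 2
```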

10/5/16 Olfat, M. 3
Matrix Sensing & Completion Approaches
Convex Relaxation: rank(X) → ‖X‖_*
•  Recht, Fazel & Parrilo (2010) show the nuclear norm ball is the tightest convex relaxation of the rank ball
•  Define the r-isometry constant: δ_r = min δ such that (1 − δ)‖X‖_F ≤ ‖A(X)‖ ≤ (1 + δ)‖X‖_F for all X with rank(X) ≤ r
•  Show that min ‖X‖_* gives the exact solution under the Restricted Isometry Property: δ_{5r} < 1/10
•  Give cases where RIP holds with high probability
•  Candès & Recht (2009) build on the nuclear norm relaxation specifically for matrix completion (a small solver sketch follows this list)
•  Can give the exact solution given O(n^{1.2} r log n) samples for small r, or O(n^{1.25} r log n) samples for any r
•  The constant depends on the coherence of the matrix: μ(X) = (n/r) max_{1≤i≤n} ‖P_X e_i‖²
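As an illustration of nuclear-norm minimization for matrix completion, a minimal sketch using cvxpy; the mask-based constraint is one standard way to encode A(X) = b, and this is not the cited authors' code:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n, r = 20, 2
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))   # rank-2 target
mask = (rng.random((n, n)) < 0.5).astype(float)                 # observed pattern

# min ||X||_*  s.t.  X agrees with M on the observed entries
X = cp.Variable((n, n))
problem = cp.Problem(cp.Minimize(cp.normNuc(X)),
                     [cp.multiply(mask, X) == mask * M])
problem.solve()
print("relative error:", np.linalg.norm(X.value - M) / np.linalg.norm(M))
```

For a random incoherent low-rank instance like this, with enough observed entries the recovered X typically matches M to solver precision, consistent with the Candès & Recht guarantees.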

Alternating Least Squares: X := UVᵀ, min ‖X − U_t V_tᵀ‖_F²

•  Iterate U_{t+1} = argmin_U ‖X − U V_tᵀ‖_F², then V_{t+1} = argmin_V ‖X − U_{t+1} Vᵀ‖_F²
•  In the case of matrix completion, the objective becomes min ‖b − A(U_t V_tᵀ)‖²
•  Initialize via the singular vectors of a clipped SVD
•  Uses less memory and converges faster than the convex relaxation method (a minimal sketch follows this list)
•  Jain, Netrapalli & Sanghavi (2013) prove the first theoretical bounds on ALS
•  For matrix sensing, t > 2 log(‖X‖_F / ε) ⟹ ‖X − U_t V_tᵀ‖_F ≤ ε
•  For matrix completion, t = O(log(‖X‖_F / ε)) ⟹ ‖X − U_t V_tᵀ‖_F ≤ ε if |A| = O(κ⁴(X) r^{4.5} n log n log(r‖X‖_F / ε))
•  Hardt (2013) improves these results by a factor of r⁴ κ(X)⁵ for matrix completion
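A minimal numpy sketch of alternating least squares for matrix completion (row-by-row least squares over the observed entries; a random initialization replaces the clipped-SVD initialization for brevity, and all names are illustrative):

```python
import numpy as np

def als_complete(M, mask, r, iters=50, seed=0):
    """Alternating least squares: fit X = U @ V.T to the observed entries of M."""
    rng = np.random.default_rng(seed)
    n1, n2 = M.shape
    U = rng.standard_normal((n1, r))
    V = rng.standard_normal((n2, r))
    for _ in range(iters):
        # Update each row of U using only the columns observed in that row.
        for i in range(n1):
            obs = mask[i]
            U[i] = np.linalg.lstsq(V[obs], M[i, obs], rcond=None)[0]
        # Symmetric update for the rows of V.
        for j in range(n2):
            obs = mask[:, j]
            V[j] = np.linalg.lstsq(U[obs], M[obs, j], rcond=None)[0]
    return U, V

# Usage: recover a random rank-2 matrix from roughly half of its entries.
rng = np.random.default_rng(1)
M = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))
mask = rng.random(M.shape) < 0.5
U, V = als_complete(M, mask, r=2)
print(np.linalg.norm(U @ V.T - M) / np.linalg.norm(M))
```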

Tensor Decomposition
Origination:
•  First proposed by Hitchcock (1927a, b)
•  First multi-way model using tensors by Cattell (1944a, b) in the field of psychometrics
•  Employed sporadically throughout the 1960s
•  Spread to chemistry with Appellof & Davidson (1981)
•  Of more general interest in the 2000s
•  Full survey by Kolda & Bader (2009)

Applications:
•  Signal processing
•  Numerical linear algebra
•  Image compression
•  Data mining
•  Neuroscience
•  Quantum physics

Generalizations from Matrices:

•  A tensor is a multi-dimensional array X ∈ ℝ^{I₁ × I₂ × ··· × I_N}
•  The space is equipped with the inner product ⟨X, Y⟩ = Σ_{i₁∈[I₁]} ··· Σ_{i_N∈[I_N]} x_{i₁...i_N} y_{i₁...i_N}
•  A rank-one tensor is defined as X = a^(1) ∘ ··· ∘ a^(N), with a^(i) ∈ ℝ^{I_i} for all i
•  The standard decomposition is CANDECOMP/PARAFAC (CP): X ≈ [λ; A^(1), ..., A^(N)] = Σ_{i∈[r]} λ_i a_i^(1) ∘ ··· ∘ a_i^(N)
•  A generalization of the SVD for matrices (see the sketch below)
•  Unlike matrix decompositions, the CP decomposition is often unique (specifics in Kruskal (1989))
•  The rank of a tensor is the smallest r admitting an exact CP decomposition
•  Higher-order SVD and power methods proposed by De Lathauwer et al. (2000a, b)
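A minimal numpy sketch of these definitions for a third-order tensor (building a CP tensor from factor matrices and evaluating the inner product; all names are illustrative):

```python
import numpy as np

def cp_tensor(lam, factors):
    """Sum of r rank-one tensors: sum_i lam[i] * a_i^(1) ∘ a_i^(2) ∘ a_i^(3)."""
    A, B, C = factors
    return np.einsum('r,ir,jr,kr->ijk', lam, A, B, C)

rng = np.random.default_rng(0)
r, shape = 2, (4, 5, 6)
lam = rng.standard_normal(r)
factors = [rng.standard_normal((n, r)) for n in shape]
X = cp_tensor(lam, factors)          # a tensor of CP rank at most 2

Y = rng.standard_normal(shape)
inner = np.sum(X * Y)                # <X, Y>: sum over all matching entries
print(X.shape, inner)
```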

Tensor Decomposition
Some Problems:
•  Hillar & Lim (2012) show that finding most standard decompositions for tensors is NP-hard
•  Clipping smallest components in CP decomposition gives low-rank approximation, but not always the best
•  Finding rank is therefore NP-hard in general as well
•  In fact, De Silva & Lim (2008) show that finding low-rank approximations of tensors is ill-posed
•  However, they also show how the hyperdeterminant can be used in finding tensor rank (more later)
•  The decomposition can depend on whether the tensor is taken over the real or complex numbers. For the tensor X with frontal slices

X₁ = [1 0; 0 1], X₂ = [0 −1; 1 0]

X ∈ ℝ^{2×2×2} → rank(X) = 3; X ∈ ℂ^{2×2×2} → rank(X) = 2

Approaches:
•  Anandkumar et al. (2012) efficiently solve for low CP-rank orthogonal symmetric tensors via the power method (a minimal sketch follows this list)
•  Anandkumar et al. (2014) efficiently find approximations for low CP-rank incoherent tensors
•  These generally depend on a whitening step to make the tensor symmetric first, but this is computationally expensive
•  Ge et al. (2015) suggest an online stochastic gradient descent method for decomposition
•  But it only provably converges to a local solution
•  In the case of matrix completion, Ge et al. (2016) showed that all local optima are global optima when the set of matrices is restricted to be PSD
•  This raises similar questions for tensors, but one would first need to rigorously define PSD tensors
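A minimal sketch of the tensor power method for a symmetric third-order tensor: one run recovering a single component of an orthogonally decomposable tensor. This illustrates the idea behind Anandkumar et al. (2012) under those assumptions; it is not their implementation, and which component it finds depends on the random start:

```python
import numpy as np

def tensor_power_iteration(T, iters=100, seed=0):
    """One robust eigenvector of a symmetric 3-way tensor T:
    iterate v <- T(I, v, v) / ||T(I, v, v)||."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(T.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = np.einsum('ijk,j,k->i', T, v, v)    # contract T along modes 2 and 3
        v = w / np.linalg.norm(w)
    lam = np.einsum('ijk,i,j,k->', T, v, v, v)  # eigenvalue T(v, v, v)
    return lam, v

# Usage: an orthogonally decomposable tensor T = 2 e1∘e1∘e1 + 1 e2∘e2∘e2.
e1, e2 = np.eye(3)[0], np.eye(3)[1]
T = 2 * np.einsum('i,j,k->ijk', e1, e1, e1) + np.einsum('i,j,k->ijk', e2, e2, e2)
lam, v = tensor_power_iteration(T)
print(round(lam, 3), np.round(v, 3))   # for this seed: ~2.0 and ~e1
```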

Tensor Completion Approaches
Alternating Least Squares:
min_X rank(X) s.t. A(X) = b  →  min ‖A(X) − A(Σ_{i∈[r]} a_i^(1) ∘ ··· ∘ a_i^(N))‖_F²
•  Generalization of ALS for matrix completion proposed by Jain & Oh (2014)
•  Initialize via iterations of tensor power method
•  Generally cannot guarantee local optimality, but can show initialization starts off close to global optimum
•  Show |A| = O(μ⁶ κ⁴(X) r⁵ n^{3/2} (log n)⁴ log(r‖X‖_F / ε)) → error ≤ ε for a third-order tensor
•  However, this only works for low-rank orthogonal decompositions and has a sub-optimal dependence on the tensor rank (a minimal sketch of the alternating scheme follows)
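A minimal numpy sketch of this alternating scheme for a third-order tensor. A random initialization stands in for the tensor power method initialization, so convergence to the global solution is not guaranteed; all names are illustrative:

```python
import numpy as np

def tensor_als_complete(T, mask, r, iters=50, seed=0):
    """Fit T[i,j,k] ~ sum_l A[i,l]*B[j,l]*C[k,l] on observed entries via ALS."""
    rng = np.random.default_rng(seed)
    n1, n2, n3 = T.shape
    A, B, C = (rng.standard_normal((n, r)) for n in (n1, n2, n3))
    I, J, K = np.nonzero(mask)                 # observed index triples
    vals = T[I, J, K]
    for _ in range(iters):
        for factor, rows, u, ui, v, vi in ((A, I, B, J, C, K),
                                           (B, J, A, I, C, K),
                                           (C, K, A, I, B, J)):
            # For each slice index, regress the observed values on elementwise
            # products of the corresponding rows of the other two factors.
            for i in range(factor.shape[0]):
                sel = rows == i
                if not sel.any():
                    continue
                Z = u[ui[sel]] * v[vi[sel]]    # (m, r) design matrix
                factor[i] = np.linalg.lstsq(Z, vals[sel], rcond=None)[0]
    return A, B, C

# Usage: recover a random rank-2 tensor from ~60% of its entries.
rng = np.random.default_rng(1)
G = [rng.standard_normal((10, 2)) for _ in range(3)]
T = np.einsum('ir,jr,kr->ijk', *G)
mask = rng.random(T.shape) < 0.6
A, B, C = tensor_als_complete(T, mask, r=2)
print(np.linalg.norm(np.einsum('ir,jr,kr->ijk', A, B, C) - T) / np.linalg.norm(T))
```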

Convex Relaxation: rank(X) → ‖X‖_* ?

•  A new line of work seeks to design further relaxations of the rank ball via new norms or decompositions
•  Rauhut & Stojanac (2015) construct the θ_k-norm based on concepts from computational algebraic geometry
•  Specifically, they design nested sets based on relaxations of the polynomial ideal generated by the hyperdeterminant of the tensor, which provably converge to the convex hull of the original ideal
•  They use a Gröbner basis to formulate an SDP that efficiently minimizes these norms under affine constraints
•  They use the norm of order 1 to recover third-order tensors, but do not provide theoretical bounds (next slide)
•  Nie & Wang (2014) use a similar approach to recover best rank-1 approximations
•  Aswani (2016) also defines a new decomposition and an objective function that allows a randomized approach

Results from Theta Norm Approach
Promising results that merit further investigation

Proposed Works
Conduct a more extensive empirical study of recently proposed norm relaxation methods for tensors

Attempt to prove theoretical bounds
•  Based on the previous task
•  Bounds do not currently exist
•  But the methods provide promising results

Attempt to construct better convex relaxations
•  One option is to build on those currently proposed
•  Another is to try to generalize methods from integer programming

Attempt to generalize results regarding local optima of matrices to tensors
•  To help validate or simplify existing ALS methods
•  Cut out the costly initialization step
•  Otherwise, attempt to show that exact generalizations cannot exist

