Tensor Data Analysis
Multilinear Algebra
Preliminaries I
Learning Objectives
• Define a tensor
• Define basic tensor terminology
• Apply the inner and outer products
• Perform basic tensor operations
What is a Tensor?
A tensor is a multidimensional array. Examples include:
• Vectors (order-1 tensors), e.g., a power signal (Watt) sampled over time (sec)
• Matrices (order-2 tensors)
• Higher-order arrays, whose fibers and slices are obtained by fixing all indices but one or two:
(a) Mode-1 (column) fibers: $\boldsymbol{x}_{:jk}$; (b) Mode-2 (row) fibers: $\boldsymbol{x}_{i:k}$; (c) Mode-3 (tube) fibers: $\boldsymbol{x}_{ij:}$
(a) Horizontal slices: $\boldsymbol{X}_{i::}$; (b) Lateral slices: $\boldsymbol{X}_{:j:}$; (c) Frontal slices: $\boldsymbol{X}_{::k}$ (or $\boldsymbol{X}_k$)
Multilinear Algebra
Preliminaries II
Learning Objectives
• Compute tensor multiplication
• Utilize Kronecker product
• Define Khatri-Rao product
• Compute Hadamard product
Tensor Multiplication
• The n-mode product multiplies a tensor by a matrix (or a vector) in mode n.
• The n-mode (matrix) product of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_n \times \cdots \times I_N}$ with a matrix $\boldsymbol{U} \in \mathbb{R}^{J \times I_n}$ is denoted by $\mathcal{X} \times_n \boldsymbol{U}$ and is of size $I_1 \times \cdots \times I_{n-1} \times J \times I_{n+1} \times \cdots \times I_N$. Elementwise,
$$(\mathcal{X} \times_n \boldsymbol{U})_{i_1 \cdots i_{n-1}\, j\, i_{n+1} \cdots i_N} = \sum_{i_n=1}^{I_n} x_{i_1 i_2 \cdots i_n \cdots i_N}\, u_{j i_n}$$
Equivalently, in terms of mode-n unfoldings, $\boldsymbol{Y}_{(n)} = \boldsymbol{U} \boldsymbol{X}_{(n)}$.
• Example: Suppose $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$ and $\boldsymbol{U} \in \mathbb{R}^{J \times I_2}$. The 2-mode product $\mathcal{X} \times_2 \boldsymbol{U}$ is obtained by multiplying each mode-2 (row) fiber by the matrix $\boldsymbol{U}$.
Tensor Multiplication
• The n-mode (vector) product of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ with a vector $\boldsymbol{v} \in \mathbb{R}^{I_n}$ is denoted by $\mathcal{X}\, \bar{\times}_n\, \boldsymbol{v}$. The result is of order $N-1$. Elementwise,
$$(\mathcal{X}\, \bar{\times}_n\, \boldsymbol{v})_{i_1 \cdots i_{n-1} i_{n+1} \cdots i_N} = \sum_{i_n=1}^{I_n} x_{i_1 i_2 \cdots i_N}\, v_{i_n}$$
MATLAB code
X = tenrand([5,3,4])
A = rand(4,5)
Y = ttm(X, A, 1) %<-- X times A in mode-1.
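The same mode-1 product can be sketched in plain numpy via the unfolding identity $\boldsymbol{Y}_{(n)} = \boldsymbol{U}\boldsymbol{X}_{(n)}$. The helpers `unfold`, `fold`, and `mode_n_product` below are illustrative names, not Tensor Toolbox functions:

```python
import numpy as np

def unfold(X, n):
    """Mode-n unfolding: arrange the mode-n fibers as columns."""
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def fold(M, n, shape):
    """Inverse of unfold for a target tensor shape."""
    full_shape = [shape[n]] + [s for i, s in enumerate(shape) if i != n]
    return np.moveaxis(M.reshape(full_shape), 0, n)

def mode_n_product(X, U, n):
    """Compute X x_n U via the identity Y_(n) = U @ X_(n)."""
    Y_shape = list(X.shape)
    Y_shape[n] = U.shape[0]
    return fold(U @ unfold(X, n), n, Y_shape)

X = np.random.rand(5, 3, 4)
A = np.random.rand(4, 5)
Y = mode_n_product(X, A, 0)   # mode-1 in MATLAB's 1-based indexing
print(Y.shape)                # (4, 3, 4)
```

Mode 1 of size 5 is replaced by $J = 4$, matching the size rule above.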
Example: 𝑛𝑛-mode Vector Product
MATLAB code
X = tenrand([5,3,4])
A = rand(5,1)
Y = ttv(X, A, 1) %<-- X times A in mode 1.
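A numpy equivalent of the vector product is a single contraction; `np.tensordot` sums over the chosen mode and drops it, so the order decreases by one:

```python
import numpy as np

# n-mode vector product: contract mode n of X with v, dropping that mode.
X = np.random.rand(5, 3, 4)
v = np.random.rand(5)

Y = np.tensordot(X, v, axes=([0], [0]))  # mode-1 (first axis) contraction
print(Y.shape)  # (3, 4) -- the result is of order N - 1
```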
Kronecker Product
The Kronecker product of matrices $\boldsymbol{A} \in \mathbb{R}^{I \times J}$ and $\boldsymbol{B} \in \mathbb{R}^{K \times L}$ is denoted by $\boldsymbol{A} \otimes \boldsymbol{B}$. The result is a matrix of size $IK \times JL$, defined by
$$\boldsymbol{A} \otimes \boldsymbol{B} = \begin{bmatrix} a_{11}\boldsymbol{B} & \cdots & a_{1J}\boldsymbol{B} \\ \vdots & \ddots & \vdots \\ a_{I1}\boldsymbol{B} & \cdots & a_{IJ}\boldsymbol{B} \end{bmatrix} = \begin{bmatrix} \boldsymbol{a}_1 \otimes \boldsymbol{b}_1 & \boldsymbol{a}_1 \otimes \boldsymbol{b}_2 & \cdots & \boldsymbol{a}_J \otimes \boldsymbol{b}_{L-1} & \boldsymbol{a}_J \otimes \boldsymbol{b}_L \end{bmatrix}$$
Example: Kronecker Product
MATLAB code
A=[1 2;3 4;5 6]
B=[1 2 3;4 5 6]
C=kron(A,B)
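The same example in numpy (`np.kron` mirrors MATLAB's `kron`), confirming the $IK \times JL$ size rule:

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])   # 3 x 2  (I x J)
B = np.array([[1, 2, 3], [4, 5, 6]])     # 2 x 3  (K x L)

C = np.kron(A, B)   # size IK x JL = 6 x 6
print(C.shape)      # (6, 6)
print(C[0])         # [1 2 3 2 4 6]
```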
Khatri-Rao Product
The Khatri-Rao product is the "matching columnwise" Kronecker product.
Given matrices $\boldsymbol{A} \in \mathbb{R}^{I \times K}$ and $\boldsymbol{B} \in \mathbb{R}^{J \times K}$, their Khatri-Rao product, denoted by $\boldsymbol{A} \odot \boldsymbol{B}$, is a matrix of size $IJ \times K$ computed by
$$\boldsymbol{A} \odot \boldsymbol{B} = \begin{bmatrix} \boldsymbol{a}_1 \otimes \boldsymbol{b}_1 & \boldsymbol{a}_2 \otimes \boldsymbol{b}_2 & \cdots & \boldsymbol{a}_K \otimes \boldsymbol{b}_K \end{bmatrix}$$
If $\boldsymbol{a}$ and $\boldsymbol{b}$ are vectors, the Khatri-Rao and Kronecker products are identical, i.e.,
$$\boldsymbol{a} \otimes \boldsymbol{b} = \boldsymbol{a} \odot \boldsymbol{b}$$
Example: Khatri-Rao Product
MATLAB code
A=[1 2;3 4;5 6]
B=[1 2;3 4]
C=khatrirao(A,B)
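Numpy has no built-in Khatri-Rao, but the columnwise Kronecker product is a one-line broadcast; `khatri_rao` below is an illustrative helper mirroring the Toolbox's `khatrirao`:

```python
import numpy as np

def khatri_rao(A, B):
    """Columnwise Kronecker product: result is (I*J) x K."""
    I, K = A.shape
    J, K2 = B.shape
    assert K == K2, "A and B must have the same number of columns"
    # Column k of the result is kron(A[:, k], B[:, k])
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, K)

A = np.array([[1, 2], [3, 4], [5, 6]])  # 3 x 2
B = np.array([[1, 2], [3, 4]])          # 2 x 2
C = khatri_rao(A, B)
print(C.shape)   # (6, 2)
print(C[:, 0])   # equals kron(A[:,0], B[:,0]) = [1, 3, 3, 9, 5, 15]
```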
Hadamard Product
The Hadamard product of matrices $\boldsymbol{A} \in \mathbb{R}^{I \times J}$ and $\boldsymbol{B} \in \mathbb{R}^{I \times J}$, denoted by $\boldsymbol{A} \ast \boldsymbol{B}$, is the elementwise matrix product, i.e., the resulting $I \times J$ matrix is computed by
$$(\boldsymbol{A} \ast \boldsymbol{B})_{ij} = a_{ij}\, b_{ij}$$
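As a quick numpy check, `*` on arrays of the same shape is already the elementwise (Hadamard) product:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

C = A * B   # elementwise: [[5, 12], [21, 32]]
print(C)
```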
Rank of Tensor
The rank of a tensor $\mathcal{X}$, denoted $\operatorname{rank}(\mathcal{X})$, is the smallest number $R$ of rank-one tensors whose sum can generate $\mathcal{X}$. If $R$ equals the rank of the tensor, the CP decomposition is exact (and, under mild conditions, unique). In unfolded form,
$$\boldsymbol{X}_{(2)} \approx \boldsymbol{B}\boldsymbol{\Lambda}\left(\boldsymbol{C} \odot \boldsymbol{A}\right)^{\top};\qquad \boldsymbol{X}_{(k)} \approx \boldsymbol{A}^{(k)}\boldsymbol{\Lambda}\left(\boldsymbol{A}^{(n)} \odot \cdots \odot \boldsymbol{A}^{(k+1)} \odot \boldsymbol{A}^{(k-1)} \odot \cdots \odot \boldsymbol{A}^{(1)}\right)^{\top}.$$
CP Decomposition: Computation
CP decomposition can be obtained by solving the following optimization problem:
$$\min_{\boldsymbol{A},\boldsymbol{B},\boldsymbol{C},\boldsymbol{\lambda}} \left\| \mathcal{X} - [\![\boldsymbol{\lambda};\, \boldsymbol{A}, \boldsymbol{B}, \boldsymbol{C}]\!] \right\|^2 = \left\| \mathcal{X} - \sum_{r=1}^{R} \lambda_r\, \boldsymbol{a}_r \circ \boldsymbol{b}_r \circ \boldsymbol{c}_r \right\|^2$$
Alternating Least Squares (ALS) can be used to solve the optimization problem. ALS iteratively solves for one factor matrix, given all the others. For example, given $\boldsymbol{B}$ and $\boldsymbol{C}$, $\hat{\boldsymbol{A}} = \boldsymbol{A}\boldsymbol{\Lambda}$ is obtained from
$$\min_{\hat{\boldsymbol{A}}} \left\| \boldsymbol{X}_{(1)} - \hat{\boldsymbol{A}}\left(\boldsymbol{C} \odot \boldsymbol{B}\right)^{\top} \right\|_F^2 \;\Longrightarrow\; \hat{\boldsymbol{A}} = \boldsymbol{X}_{(1)}\boldsymbol{H}\left(\boldsymbol{H}^{\top}\boldsymbol{H}\right)^{-1}, \quad \text{where } \boldsymbol{H} = \boldsymbol{C} \odot \boldsymbol{B}$$
After finding $\hat{\boldsymbol{A}}$, $\boldsymbol{A}$ is recovered by normalizing its columns, $\boldsymbol{a}_r = \hat{\boldsymbol{a}}_r / \|\hat{\boldsymbol{a}}_r\|$, with $\lambda_r = \|\hat{\boldsymbol{a}}_r\|$.
CP Decomposition: Computation
The ALS algorithm for an n-mode tensor and a given $R$:

Initialize (e.g., with random entries) $\boldsymbol{A}^{(1)}, \boldsymbol{A}^{(2)}, \ldots, \boldsymbol{A}^{(n)}$, $\boldsymbol{A}^{(k)} \in \mathbb{R}^{I_k \times R}$
Repeat until convergence
    for $k = 1, \ldots, n$ do
        $\boldsymbol{V} = \boldsymbol{A}^{(1)\top}\boldsymbol{A}^{(1)} \ast \cdots \ast \boldsymbol{A}^{(k-1)\top}\boldsymbol{A}^{(k-1)} \ast \boldsymbol{A}^{(k+1)\top}\boldsymbol{A}^{(k+1)} \ast \cdots \ast \boldsymbol{A}^{(n)\top}\boldsymbol{A}^{(n)}$
        $\boldsymbol{A}^{(k)} = \boldsymbol{X}_{(k)}\left(\boldsymbol{A}^{(n)} \odot \cdots \odot \boldsymbol{A}^{(k+1)} \odot \boldsymbol{A}^{(k-1)} \odot \cdots \odot \boldsymbol{A}^{(1)}\right)\left(\boldsymbol{V}^{\top}\boldsymbol{V}\right)^{-1}\boldsymbol{V}^{\top}$
        Normalize the columns of $\boldsymbol{A}^{(k)}$ and store the norms in $\boldsymbol{\lambda}$
    end for
Return $\boldsymbol{A}^{(1)}, \boldsymbol{A}^{(2)}, \ldots, \boldsymbol{A}^{(n)}$ and $\boldsymbol{\lambda}$
Typical convergence criteria include little or no change in the objective function value, little or no change in the factor matrices, or reaching a predefined number of iterations.
Example: CP Decomposition
MATLAB code
X = sptenrand([5 4 3], 10)
P = parafac_als(X,2)
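The CP-ALS updates above can be sketched in numpy for a third-order tensor. This is a minimal illustration, not a replacement for the Toolbox's `parafac_als`; the helper names are hypothetical, and the Khatri-Rao factor order is adapted to numpy's row-major unfolding (the slides use the column-major convention):

```python
import numpy as np

def unfold(X, n):
    """Mode-n unfolding with numpy's row-major ordering of remaining modes."""
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def khatri_rao(A, B):
    """Columnwise Kronecker product of (I x R) and (J x R) -> (IJ x R)."""
    I, R = A.shape
    J, _ = B.shape
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, R)

def cp_als(X, R, n_iter=100, seed=0):
    """ALS for a 3rd-order CP model X ~ sum_r lam_r a_r o b_r o c_r."""
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((s, R)) for s in X.shape)
    for _ in range(n_iter):
        # Each step is a linear least-squares solve for one factor,
        # holding the other two fixed.
        A = unfold(X, 0) @ np.linalg.pinv(khatri_rao(B, C).T)
        B = unfold(X, 1) @ np.linalg.pinv(khatri_rao(A, C).T)
        C = unfold(X, 2) @ np.linalg.pinv(khatri_rao(A, B).T)
    lam = np.linalg.norm(A, axis=0)   # normalize one factor, keep the weights
    A = A / lam
    return lam, A, B, C

# Fit an exactly rank-2 tensor and report the relative reconstruction error
rng = np.random.default_rng(42)
A0, B0, C0 = (rng.standard_normal((s, 2)) for s in (6, 5, 4))
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
lam, A, B, C = cp_als(X, R=2)
Xhat = np.einsum('r,ir,jr,kr->ijk', lam, A, B, C)
print(np.linalg.norm(X - Xhat) / np.linalg.norm(X))  # small relative error
```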
Example: Heat Transfer Data
An image stream is generated from the following heat transfer process:
[Figure: heat images over the x–y plane, (a) without noise and (b) with noise]
Example: Decoupling Spatial and Temporal Patterns
For $R = 1$, the factors $\boldsymbol{a}_1$, $\boldsymbol{b}_1$, and $\boldsymbol{c}_1$ are computed: $\boldsymbol{a}_1 \circ \boldsymbol{b}_1$ gives the spatial pattern and $\boldsymbol{c}_1$ gives the temporal pattern.
Example: Rank Selection
We can choose $R$ using $\mathrm{AIC} = 2\left\| \mathcal{X} - \sum_{r=1}^{R} \lambda_r\, \boldsymbol{a}_r \circ \boldsymbol{b}_r \circ \boldsymbol{c}_r \right\|^2 + 2k$, where $k$ is the number of model parameters. The AIC is minimized at $R = 4$.
Example: Temporal and Spatial Patterns
For $R = 4$, the spatial patterns $\boldsymbol{a}_r \circ \boldsymbol{b}_r$ (e.g., $\boldsymbol{a}_3 \circ \boldsymbol{b}_3$ and $\boldsymbol{a}_4 \circ \boldsymbol{b}_4$) and the corresponding temporal patterns are extracted.
Topics on High-Dimensional Data Analytics
Tensor Data Analysis
• The core tensor captures the interaction among different modes. Since often $P < I$, $Q < J$, $R < K$, the core tensor can be viewed as a compressed version of the original tensor.
• In most cases the factor matrices are assumed to be column-wise orthogonal (though this is not required).
• CP decomposition can be seen as a special case of Tucker in which the core tensor is superdiagonal and $P = Q = R$.
n-Rank of Tensor
• Consider an n-mode tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times \cdots \times I_N}$. The n-rank of this tensor, denoted by $R_n = \operatorname{rank}_n(\mathcal{X})$, is defined as the column rank of the mode-n unfolding $\boldsymbol{X}_{(n)}$.
• The n-rank is different from the tensor rank introduced before.
• Given the set of n-ranks, we can find an exact Tucker decomposition of $\mathcal{X}$ with these ranks.
• If a rank used for the decomposition is smaller than the corresponding n-rank, a truncated Tucker decomposition is obtained.
Tucker Decomposition: HOSVD
• Higher Order Singular Value Decomposition (HOSVD) is a simple method for performing
Tucker decomposition.
• The idea is to find the factor matrices for each mode separately such that the maximum
variation of the mode is captured.
• This can be done by performing an SVD on each mode-k unfolding $\boldsymbol{X}_{(k)}$ and keeping the $R_k$ leading left singular vectors, denoted $\boldsymbol{A}^{(k)}$.
• The core tensor is then obtained by $\mathcal{G} = \mathcal{X} \times_1 \boldsymbol{A}^{(1)\top} \times_2 \boldsymbol{A}^{(2)\top} \times_3 \cdots \times_n \boldsymbol{A}^{(n)\top}$.
• The truncated HOSVD is not optimal with respect to the least-squares lack-of-fit measure.
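The per-mode SVD plus core projection can be sketched in numpy; `unfold` and `hosvd` are illustrative helper names under the assumption of a third-order tensor:

```python
import numpy as np

def unfold(X, n):
    """Mode-n unfolding: arrange the mode-n fibers as columns."""
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def hosvd(X, ranks):
    """Truncated HOSVD: SVD each mode-k unfolding, keep R_k left singular vectors."""
    factors = []
    for k, Rk in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(X, k), full_matrices=False)
        factors.append(U[:, :Rk])
    # Core: G = X x_1 A1' x_2 A2' ... ; each tensordot consumes the current
    # leading axis and appends the projected axis at the end, so after all
    # modes the axes come back in their original order.
    G = X
    for A in factors:
        G = np.tensordot(G, A, axes=([0], [0]))
    return G, factors

X = np.random.rand(5, 4, 3)
G, factors = hosvd(X, [2, 2, 1])
print(G.shape)  # (2, 2, 1)
```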
Tucker Decomposition: ALS
• A natural approach for finding the Tucker decomposition components is to minimize the
squared error between the original tensor and the reconstructed one. That is
$$\min_{\mathcal{G};\, \boldsymbol{A}^{(1)}, \boldsymbol{A}^{(2)}, \ldots, \boldsymbol{A}^{(n)}} \left\| \mathcal{X} - [\![\mathcal{G};\, \boldsymbol{A}^{(1)}, \boldsymbol{A}^{(2)}, \ldots, \boldsymbol{A}^{(n)}]\!] \right\|^2$$
$$\text{s.t. } \mathcal{G} \in \mathbb{R}^{R_1 \times \cdots \times R_N}; \; \boldsymbol{A}^{(k)} \in \mathbb{R}^{I_k \times R_k} \text{ and column-wise orthogonal for all } k$$
This is equivalent to
$$\max_{\boldsymbol{A}^{(1)}, \ldots, \boldsymbol{A}^{(n)}} \left\| \boldsymbol{A}^{(k)\top} \boldsymbol{X}_{(k)} \left( \boldsymbol{A}^{(n)} \otimes \cdots \otimes \boldsymbol{A}^{(k+1)} \otimes \boldsymbol{A}^{(k-1)} \otimes \cdots \otimes \boldsymbol{A}^{(1)} \right) \right\|^2$$
subject to the same constraints.
Tucker Decomposition: ALS
• Using ALS, given all factor matrices except $\boldsymbol{A}^{(k)}$, the solution is obtained by applying the SVD to $\boldsymbol{X}_{(k)}\left(\boldsymbol{A}^{(n)} \otimes \cdots \otimes \boldsymbol{A}^{(k+1)} \otimes \boldsymbol{A}^{(k-1)} \otimes \cdots \otimes \boldsymbol{A}^{(1)}\right)$ and keeping the $R_k$ leading left singular vectors:
$$\max_{\boldsymbol{A}^{(k)}} \left\| \mathcal{X} \times_1 \boldsymbol{A}^{(1)\top} \times_2 \boldsymbol{A}^{(2)\top} \times_3 \cdots \times_n \boldsymbol{A}^{(n)\top} \right\|^2 = \left\| \boldsymbol{A}^{(k)\top} \boldsymbol{X}_{(k)} \left( \boldsymbol{A}^{(n)} \otimes \cdots \otimes \boldsymbol{A}^{(k+1)} \otimes \boldsymbol{A}^{(k-1)} \otimes \cdots \otimes \boldsymbol{A}^{(1)} \right) \right\|^2$$
$$\text{s.t. } \boldsymbol{A}^{(k)} \in \mathbb{R}^{I_k \times R_k} \text{ and column-wise orthogonal}$$
Tucker Decomposition: Computation
The ALS algorithm for an n-mode tensor and given ranks $R_1, R_2, \ldots, R_n$:

Initialize $\boldsymbol{A}^{(1)}, \boldsymbol{A}^{(2)}, \ldots, \boldsymbol{A}^{(n)}$, $\boldsymbol{A}^{(k)} \in \mathbb{R}^{I_k \times R_k}$, using HOSVD
Repeat until convergence
    for $k = 1, \ldots, n$ do
        $\mathcal{Y} = \mathcal{X} \times_1 \boldsymbol{A}^{(1)\top} \times_2 \boldsymbol{A}^{(2)\top} \cdots \times_{k-1} \boldsymbol{A}^{(k-1)\top} \times_{k+1} \boldsymbol{A}^{(k+1)\top} \cdots \times_n \boldsymbol{A}^{(n)\top}$
        $\boldsymbol{A}^{(k)} \leftarrow$ the $R_k$ leading left singular vectors of $\boldsymbol{Y}_{(k)}$
    end for
$\mathcal{G} = \mathcal{X} \times_1 \boldsymbol{A}^{(1)\top} \times_2 \boldsymbol{A}^{(2)\top} \times_3 \cdots \times_n \boldsymbol{A}^{(n)\top}$
Return $\boldsymbol{A}^{(1)}, \boldsymbol{A}^{(2)}, \ldots, \boldsymbol{A}^{(n)}$ and $\mathcal{G}$
Typical convergence criteria include little or no change in the objective function value, little or no change in the factor matrices, or reaching a predefined number of iterations.
Example: Tucker Decomposition
MATLAB code
X = sptenrand([5 4 3], 10)
T = tucker_als(X,[2 2 1]) %<-- best rank(2,2,1)
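The Tucker-ALS loop above (often called HOOI) can be sketched in numpy for a third-order tensor. The helpers `unfold`, `project`, and `hooi` are illustrative names, not Toolbox functions, and convergence is replaced by a fixed iteration count for simplicity:

```python
import numpy as np

def unfold(X, n):
    """Mode-n unfolding: arrange the mode-n fibers as columns."""
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def project(X, A, k):
    """Apply A' in mode k (X x_k A'), keeping the axis order."""
    return np.moveaxis(np.tensordot(X, A, axes=([k], [0])), -1, k)

def hooi(X, ranks, n_iter=20):
    """Tucker-ALS (HOOI), initialized with the HOSVD factors."""
    N = X.ndim
    A = [np.linalg.svd(unfold(X, k), full_matrices=False)[0][:, :r]
         for k, r in enumerate(ranks)]
    for _ in range(n_iter):
        for k in range(N):
            Y = X
            for m in range(N):
                if m != k:
                    Y = project(Y, A[m], m)   # project all modes except k
            A[k] = np.linalg.svd(unfold(Y, k), full_matrices=False)[0][:, :ranks[k]]
    G = X
    for m in range(N):
        G = project(G, A[m], m)               # core tensor
    return G, A

X = np.random.rand(5, 4, 3)
G, A = hooi(X, [2, 2, 1])
print(G.shape)  # (2, 2, 1)
```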
Example: Heat Transfer Data
The same heat transfer image stream is used:
[Figure: heat images over the x–y plane, (a) without noise and (b) with noise]
Example: Rank Selection
n-ranks: 1-rank = 21; 2-rank = 21; 3-rank = 10.
Try different rank combinations and use AIC to decide which one is optimal:

MATLAB code
X1 = tenmat(X,1); % unfold the data once
for i = 1:4
    for j = 1:4
        for k = 1:4
            T = tucker_als(X,[i,j,k]);
            T1 = tenmat(full(T),1);
            dif = tensor(X1-T1);
            err(i,j,k) = innerprod(dif,dif);
            AIC(i,j,k) = 2*err(i,j,k) + 2*(i+j+k);
        end
    end
end

The minimum AIC is attained at rank [3,3,3].
Example: Temporal and Spatial Patterns
For rank = [3,3,3]:
• Core tensor $\mathcal{G} \in \mathbb{R}^{3 \times 3 \times 3}$
• Spatial patterns $\boldsymbol{a}_i \circ \boldsymbol{b}_j \in \mathbb{R}^{21 \times 21}$ (e.g., $\boldsymbol{a}_3 \circ \boldsymbol{b}_3$)
• Temporal factor $\boldsymbol{C} \in \mathbb{R}^{10 \times 3}$
Degradation Analysis using Image Streams
• A degradation image stream that contains the degradation information of a system can be used to predict the system's remaining lifetime.

Prediction: Scalar Response and Tensor Predictors
[Figure: historical signals from System 1, System 2, ..., System N and real-time signals from a fielded system feed a prediction model that outputs the time-to-failure (TTF).]
Challenge: high dimensionality (too many parameters), e.g., $2 + 20 \times 20 \times 20 = 8{,}002$.
Solution: exploit the inherent low-dimensional structure of high-dimensional data.
Dimension Reduction
Decomposition further reduces the number of parameters, e.g., from $2 + 20 \times 20 \times 20 = 8{,}002$ to $2 + 10 \times 10 \times 10 = 1{,}002$, using
(1) CP decomposition; or
(2) Tucker decomposition.
Dimension Reduction using CP
CANDECOMP/PARAFAC (CP) decomposition
Reformulated regression
Model Estimation using MLE
Maximum likelihood estimation (MLE)
Rank selection
Dimension Reduction using Tucker
Tucker decomposition
Reformulated regression
Case Study
Experiments: degradation tests for
rolling element bearings
Each image: 40 × 20 pixels
Time-to-failure: [15, 55] minutes
Case Study
[Figure: cutting depth (mm) settings of 0.4, 0.8, and 1.2; the coefficient tensor A is to be estimated.]
High-dimensionality challenge
• Number of parameters: A is of dimension 210 × 64 × 2 = 26,880.
Tucker Decomposition of Tensor Coefficient
Reduce the dimension of $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times p}$ by Tucker decomposition (basis expansion):
$$\mathcal{A} = \mathcal{B} \times_1 \mathbf{U}^{(1)} \times_2 \mathbf{U}^{(2)} + \epsilon_A$$
where $\mathbf{U}^{(k)} \in \mathbb{R}^{I_k \times P_k}$ are orthogonal factor matrices and $\mathcal{B} \in \mathbb{R}^{P_1 \times P_2 \times p}$ is the core tensor, with $P_k < I_k$.
Optimal settings:
• speed: 80 m/min
• cutting depth: 0.8250 mm