KD V 2 Relational Learning 1415
Relational Learning II 10
Prof. Dr. Rudi Studer, Dr. Achim Rettinger*, Dipl.-Inform. Lei Zhang, M.Sc. Aditya Mogadala, M.Sc. Steffen Thoma
{rudi.studer, achim.rettinger, l.zhang, aditya.mogadala, steffen.thoma}@kit.edu
2 Institut AIFB
Chapter 5 - 2
Matrix Factorization
Tensor Decomposition
What is Different?
Most of the data that is available
in the newly emerging era of big
data does not look like this
[Figure: schema graph around the class Person with the labels knows, #BlogPosts, posted, dateOfBirth, Date, holds, has, OnlineChat, residence, attends, Account, Image]
Recap: The Learning Tasks (I)
[Figure: RDF graph with the nodes topic110 (skos:prefLabel "Machine Learning"), person100 and person200 (foaf:name "Jane Doe", foaf:gender "female") and the edges foaf:knows, foaf:topic_interest and foaf:gender; a "?" marks the unknown value]
Recap: The Learning Tasks (II)
[Figure: RDF graph with the nodes topic110 (skos:prefLabel "Machine Learning") and person100 and the edges foaf:knows, foaf:topic_interest and foaf:gender; a "?" marks an unknown foaf:topic_interest link]
… link prediction, …
Recap: The Learning Tasks (III)
[Figure: RDF graph with the nodes topic110 (skos:prefLabel "Machine Learning") and person100 and the edges foaf:knows, foaf:topic_interest and foaf:gender; a "?" marks the unknown]
… clustering, …
Recap: The 3 most essential dimensions
defining a KD problem:
- Data representation
- Method (learning algorithm)
- Task (application)
- Example:
Data Representation      Method                 Task
Feature vector + Label   Perceptron             Classification
Graph                    Matrix Factorization   Recommendation
Feature vector           K-Means                Clustering
Recap: Methods: 3 Dimensions [Domingos12]
- Classifier representation
- Evaluation function
- Optimization
Table 1: The three components of learning algorithms.
Representation               Evaluation              Optimization
Instances                    Accuracy/Error rate     Combinatorial optimization
  K-nearest neighbor         Precision and recall      Greedy search
  Support vector machines    Squared error             Beam search
Hyperplanes                  Likelihood                Branch-and-bound
  Naive Bayes                Posterior probability   Continuous optimization
  Logistic regression        Information gain          Unconstrained
Decision trees               K-L divergence              Gradient descent
Sets of rules                Cost/Utility                Conjugate gradient
  Propositional rules        Margin                      Quasi-Newton methods
  Logic programs                                       Constrained
Neural networks                                          Linear programming
Graphical models                                         Quadratic programming
  Bayesian networks
  Conditional random fields
[Domingos12] "A Few Useful Things to Know about Machine Learning"
Recap: Feature Vector – Data Representation
[Figure: entity User with the attributes Gender, Occupation, Age]
\[
\begin{pmatrix}
x_{1,1} & \cdots & x_{1,N} \\
\vdots  & \ddots & \vdots  \\
x_{M,1} & \cdots & x_{M,N}
\end{pmatrix}
\]
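As a minimal sketch of this representation, the snippet below encodes a few users as rows of an M x N feature matrix. The user records, the attribute vocabularies and the helper name `encode_user` are illustrative assumptions and do not come from the lecture.

```python
# Hypothetical user records; attribute names follow the slide (Gender, Occupation, Age).
users = [
    {"gender": "female", "occupation": "student", "age": 23},
    {"gender": "male", "occupation": "engineer", "age": 41},
    {"gender": "female", "occupation": "engineer", "age": 35},
]

# Assumed vocabularies for the two categorical attributes.
GENDERS = ["female", "male"]
OCCUPATIONS = ["student", "engineer"]


def encode_user(user):
    """Map one user to a fixed-length numeric feature vector (one matrix row)."""
    gender = [1.0 if user["gender"] == g else 0.0 for g in GENDERS]
    occupation = [1.0 if user["occupation"] == o else 0.0 for o in OCCUPATIONS]
    return gender + occupation + [float(user["age"])]


# X has M rows (one per user) and N columns (one per feature).
X = [encode_user(u) for u in users]
M, N = len(X), len(X[0])
```

Categorical attributes are one-hot encoded here so that every user maps to a vector of the same length; other encodings would work equally well.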
Recap: Feature Vector - Method
- Decision Trees
- Perceptron
- Linear Classifiers
- Neural Networks
- Support Vector Machines
- Naive Bayes Classifier
- AdaBoost
Recap: Feature Vector - Task
- Limits:
  - Multi-task learning (related but different situations)
  - Time-series data (stocks, text, ...)
  - Single output with many states (tags, movies, ...)
  - Multiple outputs (movie ratings, knows-relation, ...)
  - Relational domains (most real-world problems)
Beyond Vector Representations
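[Figure: classes User and Movie connected by the relation likes; User has the attribute Occupation]
\[
\begin{pmatrix}
x_{1,1}   & \cdots & x_{1,N_2}   & x_{1,N_2+1}   & \cdots & x_{1,N_2+M}   \\
\vdots    &        & \vdots      & \vdots        &        & \vdots        \\
x_{N_1,1} & \cdots & x_{N_1,N_2} & x_{N_1,N_2+1} & \cdots & x_{N_1,N_2+M}
\end{pmatrix}
\]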
Single Relational Representation - Methods
- hierarchical Bayes
- multi-label prediction
- mixed models
- hierarchical linear models
- collaborative filtering
- canonical correlation analysis
- multivariate regression
- structured output prediction
- principal component analysis
- matrix factorization
Single Relational Representation - Task
Factorizing Single Relations
Method: Low-rank Matrix Factorization
[Figure: rating matrix r with the axes user and item, approximated by a product of low-rank factors]
\[
\hat{r}_{ui} = q_i^T p_u
\]
http://www2.research.att.com/~volinsky/papers/ieeecomputer.pdf
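A minimal sketch of the prediction step: the estimated rating is the inner product of an item factor vector q_i and a user factor vector p_u. The function name `predict` and the two-dimensional factor values are hypothetical.

```python
def predict(q_i, p_u):
    """Estimated rating: inner product of item factors q_i and user factors p_u."""
    return sum(q * p for q, p in zip(q_i, p_u))


# Hypothetical 2-dimensional latent factors for one movie and one user.
q_movie = [1.0, 0.5]
p_user = [2.0, 4.0]
r_hat = predict(q_movie, p_user)
```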
Example: Movie recommendation
- q and p are latent feature vectors of movies (items) and users
- The number of latent dimensions is much smaller than the number of users or movies (e.g., 40)
- Problem:
  - Sparsity (overfitting)
  - Matrix dimensions
Learning objective
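\[
\min \sum_{(u,i) \in K} \left( r_{ui} - q_i^T p_u \right)^2
\]
With regularization
\[
\min \sum_{(u,i) \in K} \left( r_{ui} - q_i^T p_u \right)^2 + \beta \left( \|q_i\|^2 + \|p_u\|^2 \right)
\]
where K is the set of (user, item) pairs with a known rating r_ui.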
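The regularized objective can be evaluated directly, which is handy for monitoring training. The sketch below assumes ratings are stored in a dictionary keyed by (user, item) and factors in dictionaries of lists; these containers, the function name `regularized_loss` and the toy values are assumptions made for illustration.

```python
def regularized_loss(ratings, Q, P, beta):
    """Sum over observed (u, i) of squared error plus beta * (||q_i||^2 + ||p_u||^2)."""
    total = 0.0
    for (u, i), r_ui in ratings.items():
        pred = sum(q * p for q, p in zip(Q[i], P[u]))
        norm_q = sum(q * q for q in Q[i])
        norm_p = sum(p * p for p in P[u])
        total += (r_ui - pred) ** 2 + beta * (norm_q + norm_p)
    return total


# Toy data: the keys of `ratings` play the role of the set K of observed pairs.
ratings = {("u1", "i1"): 5.0, ("u2", "i1"): 3.0}
Q = {"i1": [1.0, 1.0]}                          # item factors
P = {"u1": [2.0, 2.0], "u2": [1.0, 1.0]}        # user factors

loss_plain = regularized_loss(ratings, Q, P, beta=0.0)
loss_reg = regularized_loss(ratings, Q, P, beta=0.1)
```

With beta set to zero the function reduces to the plain squared error over the observed entries; a positive beta adds the penalty on the factor norms.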
Stochastic Gradient Descent
\[
\mathrm{Err}_{ui} = r_{ui} - q_i^T p_u
\]
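The extracted slide only preserves the error term, so the following is a sketch of how stochastic gradient descent could use it, not the lecture's exact procedure. The update is the gradient step for the regularized squared-error objective, with the constant factor 2 absorbed into the step size. The step size `alpha`, the initialization range, the number of epochs and the toy ratings are all assumptions.

```python
import random


def sgd_factorize(ratings, n_factors=2, alpha=0.05, beta=0.02, epochs=200, seed=0):
    """Fit user factors P and item factors Q to the observed ratings with SGD."""
    rng = random.Random(seed)
    users = sorted({u for u, _ in ratings})
    items = sorted({i for _, i in ratings})
    # Small random initialization (assumed; the slides do not specify one).
    P = {u: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for i in items}
    observed = list(ratings.items())
    for _ in range(epochs):
        rng.shuffle(observed)
        for (u, i), r_ui in observed:
            # Err_ui = r_ui - q_i^T p_u
            err = r_ui - sum(q * p for q, p in zip(Q[i], P[u]))
            for f in range(n_factors):
                q_f, p_f = Q[i][f], P[u][f]
                Q[i][f] += alpha * (err * p_f - beta * q_f)
                P[u][f] += alpha * (err * q_f - beta * p_f)
    return P, Q


def squared_error(ratings, P, Q):
    """Sum of squared errors over the observed ratings."""
    return sum(
        (r - sum(q * p for q, p in zip(Q[i], P[u]))) ** 2
        for (u, i), r in ratings.items()
    )


# Hypothetical observed ratings for two users and three items.
ratings = {("u1", "i1"): 5.0, ("u1", "i2"): 3.0, ("u2", "i1"): 4.0, ("u2", "i3"): 1.0}
P, Q = sgd_factorize(ratings)
final_error = squared_error(ratings, P, Q)
```

Only observed entries are visited, which is what makes the approach practical for sparse rating matrices.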
Single Relation – Level 2: One Binary Relation + Features
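[Figure: classes User and Movie connected by the relation likes; User has the attribute Occupation]
\[
\begin{pmatrix}
x_{1,1}   & \cdots & x_{1,N_2}   & x_{1,N_2+1}   & \cdots & x_{1,N_2+M}   \\
\vdots    &        & \vdots      & \vdots        &        & \vdots        \\
x_{N_1,1} & \cdots & x_{N_1,N_2} & x_{N_1,N_2+1} & \cdots & x_{N_1,N_2+M}
\end{pmatrix}
\]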
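One plausible reading of the matrix above, stated here as an assumption: each of the N_1 rows is a user, the first N_2 columns hold attribute features and the remaining M columns hold the likes relation to each of M movies. The sketch below builds such rows from hypothetical toy data.

```python
# Assumed toy data: user attribute vectors and observed likes(user, movie) pairs.
user_features = {
    "u1": [1.0, 0.0],   # N_2 = 2 attribute columns (e.g. a one-hot occupation)
    "u2": [0.0, 1.0],
}
movies = ["m1", "m2", "m3"]                      # M = 3 relation columns
likes = {("u1", "m1"), ("u1", "m3"), ("u2", "m2")}

rows = []
for user, features in user_features.items():
    # One indicator column per movie: 1.0 if the user likes it, else 0.0.
    relation_part = [1.0 if (user, m) in likes else 0.0 for m in movies]
    rows.append(features + relation_part)
```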
Single Relation – Level 3: One N-ary Relation
[Figure: tensor for the n-ary relation interest with the axes PersonURI (Person1, Person2, ...), TopicURI (Topic1, Topic2, ...), TimeID (morning, noon, evening) and LocationURI; unknown entries are marked "?"]
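A minimal sketch of how such an n-ary relation could be stored as a three-way array, here restricted to the axes person, topic and time. The observed statements are hypothetical, and `None` stands in for the entries marked "?" in the figure.

```python
persons = ["Person1", "Person2"]
topics = ["Topic1", "Topic2"]
times = ["morning", "noon", "evening"]

# Hypothetical observations of interest(person, topic, time): 1 = holds, 0 = does not hold.
observed = {
    ("Person1", "Topic1", "morning"): 1,
    ("Person2", "Topic2", "noon"): 1,
    ("Person1", "Topic2", "evening"): 0,
}

# tensor[p][t][s] is the observed value, or None where the entry is unknown.
tensor = [
    [[observed.get((p, t, s)) for s in times] for t in topics]
    for p in persons
]

# Count the unknown cells that a factorization model would have to predict.
unknown = sum(cell is None for plane in tensor for row in plane for cell in row)
```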
Factorizing Multiple Relations
[Figure: schema graph around the class Person with the labels knows, #BlogPosts, posted, dateOfBirth, Date, holds, has, OnlineChat, residence, attends, Account, Image]
Multi-Relational Representation - Algorithm
Multi-Relational Representation - Application
- Graph structures (web, networks)
Multiple Relations – Level 1: Many Binary Relations, Two Entity-Classes
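SUNS
[Figure: graph over the entities e1 ... e6 and the corresponding data matrix X; rows are the Persons e1 ... e6, columns are grouped by predicate1, predicate2, predicate3 and indexed by e1, e2, ... within each group; a 1 marks an observed statement]
http://www.dbs.ifi.lmu.de/~tresp/papers/ESWC2012-Huang-Tresp.pdf
\[
\hat{X}
= U_r \,\mathrm{diag}_r\!\left( \frac{d_i^3}{d_i^2 + \lambda} \right) V_r^T
= U_r \,\mathrm{diag}_r\!\left( \frac{d_i^2}{d_i^2 + \lambda} \right) U_r^T X
= X V_r \,\mathrm{diag}_r\!\left( \frac{d_i^2}{d_i^2 + \lambda} \right) V_r^T
\]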
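The reconstruction formula can be sketched with a truncated singular value decomposition. The example assumes NumPy is available; the function name `suns_reconstruct`, the toy matrix and the values of the rank and of lambda are illustrative assumptions, not settings from the cited paper.

```python
import numpy as np


def suns_reconstruct(X, rank, lam):
    """Regularized rank-r reconstruction U_r diag(d_i^3 / (d_i^2 + lam)) V_r^T."""
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    U_r, d_r, Vt_r = U[:, :rank], d[:rank], Vt[:rank, :]
    shrunk = d_r ** 3 / (d_r ** 2 + lam)   # shrinks small singular values the most
    return U_r @ np.diag(shrunk) @ Vt_r


# Hypothetical data matrix: rows are persons, columns are (predicate, object) pairs.
X = np.array([
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
])

X_hat = suns_reconstruct(X, rank=2, lam=1.0)
```

Entries of `X_hat` that are zero in `X` but receive a comparatively high score can be read as candidate statements; how to threshold or rank them is left open here.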
Different Relations + Features
SUNS with Aggregation
[Figure: graph over the entities e1 ... e6 and the data matrix X with columns grouped by relation1, relation2, relation3, extended by additional columns labelled "Aggregated Information" (e.g. the value 0.2 for e5)]
[Figure: data matrix whose rows are entities of several classes (Persons e1 ... e6, Cities, Movies ek, ek+1, ...) and whose columns are grouped by predicate1, predicate2, predicate3]
Options:
- One SUNS model for each set of entities (defined appropriately)
- Or: One global SUNS model (scalability problems)
Decomposing Multigraphs
[Figure: multigraph over the entities Homer, Bart, Topic1, Topic2, ... with the relations knows, interest and hates, shown as one entity-by-entity matrix per relation; cells are marked "?" or "-"]
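As a sketch of the data structure behind this picture, a multigraph can be stored as one adjacency matrix per relation, all over the same entity list. The triples below are hypothetical and only reuse the entity and relation names from the figure.

```python
entities = ["Homer", "Bart", "Topic1", "Topic2"]
relations = ["knows", "interest", "hates"]

# Hypothetical (subject, relation, object) triples.
triples = [
    ("Homer", "knows", "Bart"),
    ("Bart", "interest", "Topic1"),
    ("Homer", "hates", "Topic2"),
]

index = {e: k for k, e in enumerate(entities)}

# One entities-by-entities 0/1 matrix per relation; stacked, they form a three-way tensor.
slices = {r: [[0] * len(entities) for _ in entities] for r in relations}
for subj, rel, obj in triples:
    slices[rel][index[subj]][index[obj]] = 1
```

Because every slice shares the same row and column index, a factorization can learn one latent representation per entity that is reused across all relations.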
Movie Recommendation Tensor
[Figure: tensor whose rows are entities of several classes (Persons e1 ... e6, Cities, Movies ek, ek+1, ...) with one slice per predicate (predicate1, predicate2, predicate3)]
- We obtain …
- The s… all ent…
- The o… by all…
- The re… the pr…
US-Politics Tensor
http://www.cip.ifi.lmu.de/~nickel/iswc2012-slides/#/
Multiple Relations – Level 3: Many N-ary Relations, Many Entity-Classes
Knowledge Discovery Lecture WS14/15
Date        Topic                           Block
22.10.2014  Introduction                    Basics, Overview
29.10.2014  Design of KD-experiments
05.11.2014  Linear Classifiers
12.11.2014  Data Warehousing & OLAP
19.11.2014  Non-Linear Classifiers (ANNs)   Supervised Techniques,
26.11.2014  Kernels, SVM                    Vector+Label Representation
03.12.2014  (cancelled)
10.12.2014  Decision Trees
17.12.2014  IBL & Clustering                Unsupervised Techniques
07.01.2015  Relational Learning I
14.01.2015  Relational Learning II          Semi-supervised Techniques,
21.01.2015  Relational Learning III         Relational Representation
28.01.2015  Text Mining
04.02.2015  Guest Lecture                   Meta-Topics
11.02.2015  Challenge, Exam Q&A