KD V3: Relational Learning WS14/15
Institut AIFB
RECAP: Feature Vector – Data Representation

N instances (e.g. users) described by M features (e.g. gender, age, occupation) are encoded as an M × N matrix:

$$\begin{pmatrix} x_{1,1} & \cdots & x_{1,N} \\ \vdots & & \vdots \\ x_{M,1} & \cdots & x_{M,N} \end{pmatrix}$$
RECAP: Feature Vector - Method
Decision Trees
Perceptron
Linear Classifiers
Neural Networks
Support Vector Machines
Naive Bayes Classifier
AdaBoost
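One of the listed methods, the perceptron, fits this representation directly: each column of the feature matrix is one training example. A minimal sketch, with purely illustrative toy data:

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=1.0):
    """Classic perceptron: labels y in {-1, +1}, learns a linear boundary."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:   # misclassified -> update
                w += lr * yi * xi
                b += lr * yi
    return w, b

# Toy, linearly separable feature vectors (illustrative only)
X = np.array([[1.0, 1.0], [2.0, 1.5], [-1.0, -1.0], [-2.0, -0.5]])
y = np.array([1, 1, -1, -1])
w, b = train_perceptron(X, y)
preds = np.sign(X @ w + b)
print(preds)
```

On linearly separable data the update rule is guaranteed to converge; the other listed methods consume the same feature-vector-plus-label representation.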
RECAP: Feature Vector - Task
Limits:
Multi-task learning (related but different situations)
Time-series data (stocks, text,...)
Single output with many states (tags, movies,...)
Multiple outputs (movie ratings, knows-relation,...)
Relational domains (most real world problems)
RECAP: Single Relational Representation - Data

The "likes" relation between N1 users and N2 movies and the M user attributes (e.g. occupation) are stacked into one N1 × (N2 + M) matrix; the first N2 columns hold the relation, the remaining M columns the attributes:

$$\begin{pmatrix} x_{1,1} & \cdots & x_{1,N_2} & x_{1,N_2+1} & \cdots & x_{1,N_2+M} \\ \vdots & & \vdots & \vdots & & \vdots \\ x_{N_1,1} & \cdots & x_{N_1,N_2} & x_{N_1,N_2+1} & \cdots & x_{N_1,N_2+M} \end{pmatrix}$$
RECAP: Single Relational Representation - Methods
hierarchical Bayes
multi-label prediction
mixed models
hierarchical linear models
collaborative filtering
canonical correlation analysis
multivariate regression
structured output prediction
principal component analysis
matrix factorization
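As an illustration of the matrix-factorization entry above, a minimal SGD sketch on a toy user × movie rating matrix (all numbers, the latent dimension, and the hyperparameters are illustrative):

```python
import numpy as np

# Toy user x movie rating matrix; 0 marks "unobserved"
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
mask = R > 0

rng = np.random.default_rng(0)
k = 2                                    # latent dimensions
U = 0.1 * rng.standard_normal((R.shape[0], k))
V = 0.1 * rng.standard_normal((R.shape[1], k))

lr, reg = 0.02, 0.02
for _ in range(2000):                    # SGD over the observed entries only
    for i, j in zip(*mask.nonzero()):
        err = R[i, j] - U[i] @ V[j]
        U[i] += lr * (err * V[j] - reg * U[i])
        V[j] += lr * (err * U[i] - reg * V[j])

pred = U @ V.T                           # predictions for ALL cells, incl. unobserved
rmse = np.sqrt(((pred - R)[mask] ** 2).mean())
print(round(rmse, 3))
```

The filled-in cells of `pred` are the recommendations; the same idea underlies most collaborative-filtering entries in the list.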
RECAP: Single Relational Representation - Task
RECAP: Multi-Relational Representation - Data
Multi-Relational Representation - Application for graph structures (web, networks)
Chapter 5 - 3
The Beauty of PGMs
The 3 most essential dimensions defining a KD problem:
Data representation
Method (learning algorithm)
Task (application)
Example:
Data Representation       Method                  Task
Feature vector + label    Perceptron              Classification
Graph                     Matrix factorization    Recommendation
Feature vector            k-means                 Clustering
Graphical Models: 3 Dimensions
Representation
aka "classifier representation", "ML model"
An abstract representation formalism to encode a model about the world
Learning
aka "training"
Building a concrete world model by learning from real-world observations
Calculate a "joint probability distribution"
Inference
aka "prediction", "task"
Machinery to answer questions, given one specific situation in the concrete world model
Calculate the "posterior distribution"
Example: Features of persons
Representation
Network of random variables
One node (random variable) for gender, one for age, one for occupation
Edges according to dependencies between nodes
Learning
Estimate the parameters of the probability distribution of each random variable
Observe many real-world persons and their gender, age and occupation
"Count" the probabilities of their gender, age and occupation
Inference
Given a concrete person's age and occupation, calculate the probability of their gender
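The count-based learning and inference steps for the persons example can be sketched on a tiny, purely illustrative table of observations:

```python
from collections import Counter

# Illustrative observations: (gender, age group, occupation)
people = [
    ("f", "young", "student"), ("f", "young", "student"),
    ("m", "young", "student"),
    ("m", "old", "engineer"), ("m", "old", "engineer"),
    ("f", "old", "engineer"),
]

# "Learning": count joint occurrences of the attribute combinations
joint = Counter(people)

def p_gender_given(age, occupation):
    """Inference: posterior over gender, given the other two attributes."""
    total = sum(c for (g, a, o), c in joint.items()
                if a == age and o == occupation)
    return {g: sum(c for (g2, a, o), c in joint.items()
                   if g2 == g and a == age and o == occupation) / total
            for g in ("f", "m")}

print(p_gender_given("young", "student"))  # f is twice as likely as m here
```

Counting the full joint like this only works for a handful of variables; the graphical-model machinery below exists precisely to avoid enumerating every attribute combination.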
From simple to complex probabilistic models
[Figure: networks over the random variables Gender, Age and Occupation, modelling P(gender | age, occupation)]
Graphical Models – (In)Dependence
Independence (no edge between Gender and Occupation):
P(x, y) = P(x) P(y)
Dependence (edge between Gender and Occupation):
P(x, y) = P(x | y) P(y)
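Note that P(x, y) = P(x | y) P(y) is the chain rule and holds for any joint distribution; independence is the special case where it collapses to P(x) P(y). Both factorizations can be checked numerically on small joint tables (the numbers below are illustrative):

```python
import numpy as np

# Joint P(gender, occupation) as a 2x2 table (illustrative numbers);
# rows: gender, columns: occupation
independent = np.outer([0.5, 0.5], [0.4, 0.6])   # constructed as P(x) P(y)
dependent = np.array([[0.30, 0.20],
                      [0.10, 0.40]])             # a joint with dependence

def is_independent(P, tol=1e-12):
    """True iff P(x, y) == P(x) P(y) for every cell."""
    px = P.sum(axis=1)   # marginal over the columns
    py = P.sum(axis=0)   # marginal over the rows
    return np.allclose(P, np.outer(px, py), atol=tol)

print(is_independent(independent))
print(is_independent(dependent))
```

The `dependent` table has the same marginals as the first one, so dependence is only visible in the joint, not in the marginals.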
Graphical Models - Observations
[Figure: variable y (Gender) with observed variables x1 (Age) and x2 (Occupation)]
Model parameters as nodes
[Figure: parameter nodes added to the network alongside the variables x1 and x2]
Background: Bayes Rule
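Bayes' rule, which underlies the learning and inference steps above, reads:

```latex
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}
\qquad\text{i.e.}\qquad
\text{posterior} = \frac{\text{likelihood} \times \text{prior}}{\text{evidence}}
```

For the running example: P(gender | age, occupation) = P(age, occupation | gender) P(gender) / P(age, occupation).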
Complex Dependencies: Bayesian Networks
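A Bayesian network over random variables x1, ..., xn encodes the joint distribution as a product of local conditional distributions, one per node given its parents in the graph:

```latex
P(x_1, \dots, x_n) = \prod_{i=1}^{n} P\bigl(x_i \mid \mathrm{pa}(x_i)\bigr)
```

Complex dependencies are thus captured in the graph structure, while each individual factor (CPD) stays small.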
Problem
Naïve Bayes Classifier
[Figure: class node y with feature nodes x1, x2, ...]
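A minimal sketch of this classifier, assuming binary features that are conditionally independent given the class, with Laplace smoothing (the data and feature names are illustrative):

```python
import numpy as np

# Illustrative binary data: y = class, columns = features x1, x2
X = np.array([[1, 0], [1, 0], [1, 1], [0, 1], [0, 1], [0, 0]])
y = np.array([0, 0, 0, 1, 1, 1])

def fit_naive_bayes(X, y, alpha=1.0):
    """Estimate P(y) and P(x_j = 1 | y) with Laplace smoothing alpha."""
    classes = np.unique(y)
    prior = np.array([(y == c).mean() for c in classes])
    cond = np.array([(X[y == c].sum(axis=0) + alpha) /
                     ((y == c).sum() + 2 * alpha) for c in classes])
    return classes, prior, cond

def predict(x, classes, prior, cond):
    """argmax_c P(c) * prod_j P(x_j | c)  -- the naive factorization."""
    likes = np.where(x == 1, cond, 1 - cond).prod(axis=1)
    return classes[np.argmax(prior * likes)]

classes, prior, cond = fit_naive_bayes(X, y)
print(predict(np.array([1, 0]), classes, prior, cond))
```

The "naive" part is exactly the figure's structure: the joint factorizes as P(y) times one independent factor per feature.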
From Bayesian Nets to Relational PGMs
Relational Schema and its Instantiation
Relational Skeleton and Probabilistic
Dependencies
CPD and dependency graph
Probabilistic modeling of an instance graph
Plate representation
Making use of the Schema
(this is just a figure not a plate representation)
Introducing Latent Classes (now this is a relational model)
Dirichlet Distribution
Dirichlet Process
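A quick illustration of both concepts: a Dirichlet sample is itself a probability vector, and a Dirichlet process can be realized via its stick-breaking construction, truncated here for practicality (the concentration parameter and truncation level are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Dirichlet distribution: a distribution over probability vectors
theta = rng.dirichlet(alpha=[1.0, 1.0, 1.0])
print(theta.sum())   # components are >= 0 and sum to 1

# Dirichlet process via (truncated) stick breaking with concentration a:
#   beta_k ~ Beta(1, a),  pi_k = beta_k * prod_{l<k} (1 - beta_l)
def stick_breaking(a, truncation=100):
    betas = rng.beta(1.0, a, size=truncation)
    remaining = np.concatenate([[1.0], np.cumprod(1 - betas)[:-1]])
    return betas * remaining

pi = stick_breaking(a=2.0)
print(pi.sum())      # approaches 1 as the truncation grows
```

The weights `pi` decay quickly, so only a few latent classes receive notable mass; this is what lets a non-parametric model avoid fixing the number of classes in advance.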
Non-parametric latent PGM (this one is called "IHRM")
http://www.aifb.kit.edu/images/e/eb/Xu_socialnetmining_SNMwNRM.pdf
Parameters of the IHRM
Parameter Estimation: Expectation Maximization (cf. k-means)
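A minimal EM sketch for a one-dimensional, two-component Gaussian mixture (not the IHRM itself; data and initialization are illustrative). The comparison with k-means is in the E-step: EM computes soft responsibilities where k-means would make hard cluster assignments:

```python
import numpy as np

rng = np.random.default_rng(1)
# Illustrative 1-D data from two well-separated clusters
x = np.concatenate([rng.normal(-4, 1, 200), rng.normal(4, 1, 200)])

# Initial guesses for the two components
mu = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

for _ in range(50):
    # E-step: soft responsibilities (k-means: hard nearest-centroid labels)
    dens = pi * normal_pdf(x[:, None], mu, sigma)      # shape (n, 2)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters from responsibility-weighted data
    nk = resp.sum(axis=0)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    pi = nk / len(x)

print(np.sort(mu))   # close to the true means -4 and 4
```

Each iteration never decreases the data likelihood; k-means is the limiting case of hard assignments with shared, shrinking variances.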
Parameter Estimation for IHRM
Features of IHRM
IHRM as an example of a Relational PGM
Famous (non-relational) Probabilistic Graphical Models
The relation to models such as the Hidden Markov Models is outlined; the figure below might aid in understanding the relationship between hidden Markov models and Bayesian networks.
[Figure: HMM, KF (Kalman filter), NN]
Data representations
• Graphs, Matrices, Entity-Relationship Models, RDF
Learning algorithms
• Hierarchy of suitable algorithms, ranging from simple feature-vector-based to multi-relational / logical representations
Applications
• Social, biological, and computer networks; domains with complex dependencies between heterogeneous variables which violate the i.i.d. assumption.
Knowledge Discovery Lecture WS14/15

22.10.2014  Introduction (Basics, Overview)
29.10.2014  Design of KD-experiments
05.11.2014  Linear Classifiers
12.11.2014  Data Warehousing & OLAP
19.11.2014  Non-Linear Classifiers (ANNs)
26.11.2014  Kernels, SVM                      [Supervised Techniques, Vector+Label Representation]
03.12.2014  (cancelled)
10.12.2014  Decision Trees
17.12.2014  IBL & Clustering                  [Unsupervised Techniques]
07.01.2015  Relational Learning I
14.01.2015  Relational Learning II            [Semi-supervised Techniques, Relational Representation]
21.01.2015  Relational Learning III
28.01.2015  Text Mining
04.02.2015  Guest lecture                     [Meta-Topics]
11.02.2015  Challenge, exam Q&A