Carlos-Automl School PDF

recent advances in metalearning:
from metafeatures to streams
Carlos Soares
csoares@fe.up.pt
automated ml @ ICMC 2018

todo
• the world where automated ml lives

– a world of many models
– needs model management
– metalearning/automl can help
– but opportunities and challenges are still open
• a couple of interesting issues
[somewhat selfish perspective]
– systematic dataset characterization
– model selection for stream data
carlos soares @ automated ml @ ICMC 2018 2

the world where automated ml lives
lots of data + lots of detail + lots of problems + lots of models
= extreme data mining
(adapted from Soulié-Fogelman)
lots of data
• data mining typically
associated with big
volumes of data
• but big is getting
bigger
from H. Tirri, What Do People Really Do – On the power of Reality Mining for E-learning, EDEN 2005 Annual Conference
+ lots of detail
• possible to collect more detailed data
– including about time and space
– ... images and sound
• about people and their behavior

– e.g. social relations, cards, call center, web and mobile
devices
• ... products, their production, distribution and usage
– e.g. machines, RFID, sensors in cars and clothes
• ... processes
– e.g. email and collaborative environments

+ lots of problems
• new applications in business

– from marketing to manufacturing control, product
development, supply chain management, etc.
• ... government/science
– e.g. tax fraud detection, catastrophe management and
environmental monitoring
• ... and people

– e.g. search, spam detection, mail organization

+ lots of models
• more specific knowledge
• ... that is, models for smaller subsets

– e.g. [Fogelman 06]
• “broadband communications company moved from 5 cross-sell
models per year to 1600;
• A wireless communications company that produces 700 CRM models
per year;”
• ... eventually, individual entities

– e.g. a recommendation model for each customer
– e.g. soft sensors
– e.g. UPV’s project with large retail company
• 50 million models to predict the sales of products

domains for automated ml:
predictive maintenance
• improved service to customers
– increased availability
– more efficient/effective operation
– more effective maintenance
• improved operation of OEM
– more effective maintenance planning
– better machine design, customization and retrofitting
domains for automated ml:
taxi fleet management
• improved service to
customers and drivers
– less waiting time
– more information about trip
• improved operation of fleet

– more suitable distribution of
taxis
– more revenue
– less fuel consumption
http://www.gta.ufrj.br/ensino/eel879/trabalhos_vf_2010_2/lemos/introducao.html

extreme data mining
lots of data + lots of detail + lots of problems + lots of models
= extreme data mining
= lots of data mining
(adapted from Soulié-Fogelman)
traditional DM methodology
source: Chapman P, Clinton J, Kerber R, et al. CRISP-DM 1.0: Step-by-Step Data Mining Guide. 2000.
this is not possible!

[the dream]
shameless copy of someone who prepares more beautiful slides than I do

... but maybe this is...

metalearning: summary
• metalearning for algorithm selection
– induce model from metadata to predict the best algorithm on a new dataset

(meta)data
i    xi,1   xi,2     xi,3   decision
1    0.7    327.2    0      A          (train)
2    -0.6   1234.2   1      B
3    ...    ...      ...    ...

decision = 1.04 × x1 + 0.38 × x1 + …

     -0.8   37.2     1      ?          (predict)
     0.2    14.32    1      ?
     ...    ...      ...    ...
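The train/predict split on this slide can be sketched in a few lines of Python. This is only an illustration: the first two metadata rows reuse the values shown above, the other two rows are invented, and a 1-nearest-neighbour rule stands in for whatever meta-model is actually induced (the slide shows a linear one). All function names are hypothetical.

```python
import math

# Hypothetical metadata: metafeature vector -> best algorithm.
# Rows 1-2 come from the slide; rows 3-4 are invented for illustration.
METADATA = [
    ((0.7, 327.2, 0.0), "A"),
    ((-0.6, 1234.2, 1.0), "B"),
    ((0.9, 410.0, 0.0), "A"),
    ((-0.4, 980.5, 1.0), "B"),
]

def column_ranges(rows):
    """Min and max of each metafeature, used to rescale values to [0, 1]."""
    return [(min(col), max(col)) for col in zip(*rows)]

def rescale(x, ranges):
    """Rescale one metafeature vector so no metafeature dominates the distance."""
    return tuple((v - lo) / (hi - lo) if hi > lo else 0.0
                 for v, (lo, hi) in zip(x, ranges))

def train(metadata):
    """'Induce' the meta-model: here it just stores the rescaled meta-examples."""
    ranges = column_ranges([x for x, _ in metadata])
    return ranges, [(rescale(x, ranges), y) for x, y in metadata]

def predict(model, x):
    """Recommend the algorithm of the most similar known dataset."""
    ranges, examples = model
    q = rescale(x, ranges)
    return min(examples, key=lambda ex: math.dist(q, ex[0]))[1]

meta_model = train(METADATA)
for new_dataset in [(-0.8, 37.2, 1.0), (0.2, 14.32, 1.0)]:
    print(new_dataset, "->", predict(meta_model, new_dataset))
```

Any classifier could replace the nearest-neighbour rule; the point is only that the meta-level problem is an ordinary supervised learning task.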

autoML: metalearning++
can’t resist it: his slides look so much better than mine!

metadata: data volume
• large volume
– sufficient meta-examples
– … how to store data and metadata?
Han, Chen, Dong, Pei, Wah, Wang, Cai, “Stream Cube: an architecture for multi-dimensional analysis of data streams”; Distributed and Parallel Databases, Springer, 2005
Lo, Kao, Ho, Lee, Chui, Cheung, “OLAP on sequence data”, SIGMOD, 2008
Vanschoren, “Understanding machine learning performance with experiment databases,” PhD Thesis, Katholieke Universiteit Leuven, 2010

metadata: data schema
• constant
• meta-features characterizing individual variables
– same across different datasets
• base-level features as meta-data
Rossi, Carvalho, Soares, Souza, “MetaStream: A meta-learning based method for periodic algorithm selection in time-changing data.” Neurocomputing 127: 52-64 (2014)
metadata: multiple sources
• different levels of similarity between data distributions
– model reuse
Caruana, Niculescu-Mizil, Crew, Ksikes. Ensemble Selection from Libraries of Models. In: Proc. of the 21st Int. Conf. on Machine Learning; 2004.
Pinto, Soares, Space Allocation in the Retail Industry: A Decision Support System Integrating Evolutionary Algorithms and Regression Models, to be published in ECMLPKDD 2013
Rethinking the Essence, Flexibility and Reusability of Advanced Model Exploitation (REFRAME project – http://reframe-d2k.org/)
– … with adaptation of models
Vilalta, Giraud-Carrier, Brazdil, Soares. Inductive Transfer. In Encyclopedia of Machine Learning. Springer US; 2010:545-548
Zaki. Editorial: Online, Interactive and Anytime Data Mining. SIGKDD Explorations. 2002;3(2)
Felix, Soares and Jorge; Metalearning for multiple-domain Transfer Learning, MetaSel Workshop 2015
– … or the whole process
Suyama, Negishi, Yamaguchi. CAMLET: A Platform for Automatic Composition of Inductive Applications Using Ontologies. In Proc. ICML Workshop on Recent Advances in Meta-Learning and Future Work; 1999:59-65
Serban, Vanschoren, Kietz, Bernstein. A survey of intelligent assistants for data analysis. ACM Computing Surveys. 2012;(in press)
e-lico project (http://www.e-lico.eu/)
PlanLearn workshop @ ECAI 2012 (http://datamining.liacs.nl/planlearn.html)
metadata: ... often distributed
• use “local” data
– more relevant
– … but possibly not sufficient for reliable model
• or higher-level data
– more data
– … but possibly less relevant
• use domain information
• but aggregation has costs
– computational
– … transmission
Nozari and Soares, Meta-Learning to Choose the Level of Analysis in Nested Data: A Case Study on Error Detection in Foreign Trade Statistics, Proc. of the International Joint Conference on Neural Networks (IJCNN), 2015

metadata: dynamic distributions
• changing and possibly re-occurring concepts
– monitoring/understanding data
Kosina, Gama, Sebastião: Drift Severity Metric. ECAI 2010: 1119-1120
– … and models
Gama, Kosina. Learning about the learning process. In: Proc. of the 10th Int. Conf. on Advances in Intelligent Data Analysis. Springer-Verlag; 2011:162-172.
Kadlec, Gabrys. Local Learning-Based Adaptive Soft Sensor for Catalyst Activation Prediction. American Institute of Chemical Engineers J. 2011;57(5):1288-1301.
– re-evaluate selection often
Rossi, Carvalho, Soares, Souza, “MetaStream: A meta-learning based method for periodic algorithm selection in time-changing data.” Neurocomputing 127: 52-64 (2014)
Jan N. van Rijn, Geoffrey Holmes, Bernhard Pfahringer, Joaquin Vanschoren: Algorithm Selection on Data Streams. Discovery Science 2014: 325-336
– evolving models
Gama, Knowledge Discovery from Data Streams (2010), Chapman & Hall/CRC Press
Mendes-Moreira; Jorge; Soares & Freire de Sousa. Ensemble learning: a study on different variants of the dynamic selection approach. In Proc. 6th Int. Conf. on Machine Learning and Data Mining, Springer, 2009, LNAI 5632, 191-205.
– … and metamodels

meta-development:
metafeature design
• hardest problem in metalearning
– as in any learning problem
Rossi, Carvalho, Soares, Souza, “MetaStream: A meta-learning based method for periodic algorithm selection in time-changing data.” Neurocomputing 127: 52-64 (2014)
• methodology for systematic design of metafeatures
Pinto, Soares, Mendes-Moreira; A Framework To Decompose And Develop Metafeatures. MetaSel@ECAI 2014; 32-36

(meta-)development:
infrastructure
• similar infrastructure across problems
– meta-level problem structure tends to be similar
– … even if base-level problems are diverse
• AaaS++
– currently being developed
• RS
• classification/regression
Amazon Machine Learning Web Services, https://aws.amazon.com/machine-learning/
Vanschoren, van Rijn, Bischl, and Torgo. OpenML: networked science in machine learning. SIGKDD Explorations 15(2), pp 49-60, 2013.
Abreu, Soares, Camacho; Distributed Environment Framework for Optimization Experiments. ICCSA (Workshops/Short Papers/Posters) 2014; 256-259
Félix, Soares, Jorge, Vinagre; Monitoring Recommender Systems: A Business Intelligence Approach. ICCSA (6) 2014; 277-288

wrap-up (part I)
• model management
– exciting field
• e.g. autoML (http://www.automl.org/)
– new challenges
• do not forget the basic issues
– … not all of them, at least
• learn from other areas
– e.g., algorithm portfolios
Smith-Miles. Cross-disciplinary perspectives on meta-learning for algorithm selection. ACM Comput. Surv. 2008;41(1):1-25
Kotthoff, Algorithm Selection for Combinatorial Search Problems: A Survey. AI Magazine, 2014

todo

[somewhat selfish perspective]

metalearning: summary
• metalearning for algorithm selection
– induce model from (meta)data to predict the best algorithm on a new dataset

metadata: attributes
• represent dataset using a vector of metafeatures
[figure: each dataset (columns xi,1 … and a decision column) is passed through a characterization step that outputs a vector such as 2399, 49, 1, 0.65, …; the labelled components are the number of examples, the number of symbolic variables, the number of continuous variables, and the max. correlation continuous-target; each vector plus a target (e.g. A or B) becomes one row of the metadata]
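A minimal sketch of such a characterization step, assuming a dataset stored as a dict of columns and a numeric target. The four metafeatures mirror the labels in the figure (examples, symbolic variables, continuous variables, maximum correlation with the target); the function names and the toy data are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def characterize(columns, target):
    """Map a dataset to a metafeature vector.

    columns: dict mapping variable name -> list of values
    target:  list of numeric target values
    """
    continuous = {name: vals for name, vals in columns.items()
                  if all(isinstance(v, (int, float)) for v in vals)}
    n_symbolic = len(columns) - len(continuous)
    max_corr = max((abs(pearson(vals, target)) for vals in continuous.values()),
                   default=0.0)
    return (len(target), n_symbolic, len(continuous), round(max_corr, 2))

# Toy dataset, invented for illustration.
toy = {
    "size": [1.0, 2.0, 3.0, 4.0],
    "color": ["r", "g", "r", "b"],
    "weight": [2.0, 1.0, 4.0, 3.0],
}
print(characterize(toy, [1.0, 2.0, 3.0, 4.0]))
```

Running `characterize` on every dataset in a collection and attaching the best algorithm to each vector yields the metadata table.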
systematic metafeature
generation
• hardest problem in metalearning
– as in any learning problem
• methodology for systematic design of metafeatures

fitting existing metafeatures in
the framework: simple ones
METAFEATURE | OBJECT | META-FUNCTION | POST-PROCESSING
Number of examples | Examples Set | Count | Non-aggregated
Class entropy | Target, Examples Col | Entropy | Descriptive Statistic
Absolute mean correlation between numeric features | Attr, Examples Col | Correlation | Descriptive Statistic
Correlation between numeric features | Attr, Examples Col | Correlation | Non-aggregated
Average degree of dataset characterization graph | Examples Set | Degree | Descriptive Statistic
Jensen-Shannon distance between dataset and bootstrap | Examples Set, Bootstrap Set | Jensen-Shannon distance | Descriptive Statistic
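The object / meta-function / post-processing decomposition can be read as function composition. The sketch below is an illustration under that reading, not the framework's actual implementation: `metafeature`, `non_aggregated` and `descriptive_mean` are hypothetical names, and the mean is just one possible descriptive statistic.

```python
import math
from collections import Counter
from statistics import mean

# Meta-functions: computed on a single object (here, one column of values).
def count(values):
    return len(values)

def entropy(values):
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())

# Post-processing: how the per-object results become metafeature values.
def non_aggregated(results):
    return list(results)

def descriptive_mean(results):
    return [mean(results)]

def metafeature(objects, metafunction, post_processing):
    """Apply the meta-function to every object, then post-process the results."""
    return post_processing([metafunction(obj) for obj in objects])

# Toy data, invented for illustration.
target = ["a", "a", "b", "b"]
attributes = [["x", "x", "x", "y"], ["u", "v", "u", "v"]]

n_examples = metafeature([target], count, non_aggregated)
attr_entropies = metafeature(attributes, entropy, non_aggregated)
mean_attr_entropy = metafeature(attributes, entropy, descriptive_mean)
print(n_examples, attr_entropies, mean_attr_entropy)
```

Swapping any one of the three components produces a different metafeature, which is what makes systematic enumeration possible.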

fitting existing metafeatures in
the framework: landmarkers
METAFEATURE | OBJECT | META-FUNCTION | POST-PROCESSING
Decision stump landmarker | Predictions, Target (Examples Col) | Accuracy | Non-aggregated
Highest class probability | Predictions | Max | Non-aggregated
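For readers unfamiliar with landmarkers, here is a minimal sketch of the decision stump landmarker: a one-split classifier is fitted, and its accuracy on held-out examples is used as a metafeature. The stump implementation, the holdout evaluation and the toy data are assumptions made for illustration.

```python
from collections import Counter

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def fit_stump(X, y):
    """Pick the (feature, threshold) split with the fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            l_lab, r_lab = majority(left), majority(right)
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, j, t, l_lab, r_lab)
    if best is None:  # no valid split: always predict the majority class
        m = majority(y)
        return (0, float("inf"), m, m)
    return best[1:]

def predict_stump(stump, row):
    j, t, l_lab, r_lab = stump
    return l_lab if row[j] <= t else r_lab

def stump_landmarker(X_train, y_train, X_test, y_test):
    """Metafeature value: accuracy of the stump's predictions against the target."""
    stump = fit_stump(X_train, y_train)
    preds = [predict_stump(stump, row) for row in X_test]
    return sum(p == t for p, t in zip(preds, y_test)) / len(y_test)

# Toy data, invented for illustration.
X_train = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0]]
y_train = ["a", "a", "b", "b"]
X_test = [[1.5, 0.0], [3.5, 9.0], [2.5, 2.0]]
y_test = ["a", "b", "a"]
print(stump_landmarker(X_train, y_train, X_test, y_test))
```

In the table's terms, the objects are the predictions and the target, the meta-function is accuracy, and no aggregation is applied.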

systematic metafeature
generation

results: single function
[figure: critical difference (CD) diagrams on a rank axis from 5 to 1, one column of panels per meta-function (ENTROPY, CORRELATION) and one row per base-level algorithm (C5.0, SVM, RF); compared metafeature sets: Default, Traditional, Set Systematic, Syst.ReliefF, Syst.CFS]

conclusions
• generating metafeatures has typically been an art
– rather than science/engineering
• framework to generate metafeatures systematically
– taking advantage of the common form of datasets

motivation & goals
• Periodic algorithm selection for non-stationary environments
• MetaStream: a metalearning approach
– Meta-model relates data characteristics to base-level model performance
[figure: Difference between SVM and Random Forests; y-axis: NMSE SVM − NMSE Random Forests (−0.6 to 0.6); x-axis: Batches of 25 examples (0 to 200)]

Base-level
• For each new batch of n test examples:
– Induces regression models using most recent training data
– Predicts target value for test examples
– Evaluates predictions when true target values become available
[figure: base data (x, yb) split into a Training window and a Test window; a model is induced on the training window; a feature extractor turns the windows into a meta-example (x, ym)]

Meta-level: meta-data
[figure: model 1 and model 2 are induced on the Training window of the base data (x, yb); the feature extractor produces the meta-example attributes x, and the meta-example target ym is the BEST ALGORITHM on the Test window]

Meta-level: learning and
prediction
[figure: a learning algorithm induces a meta-model from the meta-data (x, ym); for the current Training and Test windows of the base data (x, yb), the feature extractor builds a new meta-example, and the meta-model predicts its ym]
Meta-features issues: base level

Meta-features issues: meta level

systematic generation of
metafeatures

results with SVM

results with RF

selecting model for single
observation

conclusions
• adaptation of metalearning approach to stream data
• systematic approach to generate metafeatures
– taking advantage of the constant morphology
• mixed results
• towards integrated meta & base learning?

“THE” Book
Metalearning – Applications
to Data Mining
Pavel Brazdil
Christophe Giraud-Carrier
Carlos Soares
Ricardo Vilalta
http://www.springer.com/computer/artificial/book/978-3-540-73262-4
