Professional Documents
Culture Documents
Machine Learning
Marco Brambilla, Emanuele Della Valle - Data Science for Business Innovation
Agenda
• Reality, Abstraction and Modeling
• Artificial Intelligence (AI)
• Machine Learning (ML)
• Supervised, Unsupervised and Reinforced Learning
• Training, Testing, Validation
• Cross-validation
Marco Brambilla, Emanuele Della Valle - Data Science for Business Innovation 2
Abstraction and human mind
• Complexity of reality –> Simplification
• The human mind continuously re-works reality by applying cognitive
processes
• Abstraction: capability of finding the commonality in many different
observations:
• Generalize specific features of real objects (generalization)
• Classify the objects into coherent clusters (classification)
• Aggregate objects into more complex ones (aggregation)
Marco Brambilla, Emanuele Della Valle - Data Science for Business Innovation 3
Models
Model: a simplified or partial representation of reality, defined
in order to accomplish a task or to reach an agreement
Marco Brambilla, Emanuele Della Valle - Data Science for Business Innovation
The Role of Models
Data Science
AI MODELS ML
Big Data
Marco Brambilla, Emanuele Della Valle - Data Science for Business Innovation 5
Artificial Intelligence
• Many Interpretations
• Artificial General Intelligence (aka. Strong AI or Full AI)
• General intelligent actions
• Discerning problems
• Acting as humans would
• … up to self-consciousness
• Restricted AI (aka. Weak or Applied AI or Narrow AI)
• Implementing intelligence in specific applications or problem solving tasks
• No full cognitive abilities
Marco Brambilla, Emanuele Della Valle - Data Science for Business Innovation 6
Machine Learning
“The field of machine learning is concerned with the question of
how to construct computer programs
that automatically improve with experience.”
Tom Mitchell (1997)
Marco Brambilla, Emanuele Della Valle - Data Science for Business Innovation 7
Supervised Learning
• Task to be learnt: to extract a description / labeling or pattern from
the data, based on the training
• Training examples labeled by a (human) supervisor
• Use it to predict the output for further examples
• Performance measured as how accurate the output is
• Example applications
• Credit approval
• Medical diagnosis
• Fraud detection
• Text and image labeling or classification
Marco Brambilla, Emanuele Della Valle - Data Science for Business Innovation 8
Unsupervised Learning
• Task to be learnt: finding interesting patterns/ groups/ categories in
the data based on evidence
• No pre-labeled data à detection of facts from raw data
• Performance measured as how good / meaningful the groups /
patterns are
• Example applications
• Customer segmentation
• User behavior categorization
• Grouping of items by similarity
Marco Brambilla, Emanuele Della Valle - Data Science for Business Innovation 9
Reinforcement Learning
• The machine learns through trial-and-error interactions
• The goal is to maximize the amount of reward received from the
environment
• Iterative process
• Trained through interactions with the environment
• Rewards assigned upon success
• Performance measured as amount of rewards collected
• Example applications
• Robot learning
• Games
Marco Brambilla, Emanuele Della Valle - Data Science for Business Innovation 10
Training, Validation, and Test
Split process into: training, validation, and test
MODEL
Training
Validation
Optimize the predictive capability of the model using the validation set
MODEL
Training Validation
Testing
See how well the model works on the test set (unseen before)
MODEL
Testing
Dataset
Dataset
Dataset
MODEL
set set set
Marco Brambilla, Emanuele Della Valle - Data Science for Business Innovation 18
Overfitting
• Learns «too much» from
existing data
• Does not generalize well
on new data
Marco Brambilla, Emanuele Della Valle - Data Science for Business Innovation
Overfitting
• Learns «too much» from Training Data
existing data Test Data
Error
Cycles / Data
Marco Brambilla, Emanuele Della Valle - Data Science for Business Innovation 20
Bias
• Learning bias
• Algorithmic bias
Marco Brambilla, Emanuele Della Valle - Data Science for Business Innovation 21