Machine Learning Block 2

02 WHAT IS MACHINE LEARNING
1. WHAT IS LEARNING?
• Learning consists of acquiring new knowledge or skills that allow us to change our behavior
or response to external stimuli. We change and adapt our behavior according to the
circumstances (this applies to humans and to computers alike).
• Humans are very good at pattern recognition, and ML tries to mimic this human behaviour.
Our brain has become used to recognizing patterns, and we are great at it, because it helps us survive.
We are very bad at being precise (judging by eye, telling colors apart, etc.), which is where computers
excel, but we humans are very good at classifying and understanding the environment.
• The way a machine learns (acquires new skills) is through a change in its physical
components or programming. The goal of machine learning is that these new skills get
acquired automatically.
Are software version updates ML? No, because the machine has not learned anything by itself when it
receives new software and configurations. These are requirements, not automatic learning. The full
explanation is further down in the slides.
• A system is said to learn if it can modify its behavior after a set of experiences, so
that it can perform the task more accurately, more efficiently, or perform tasks beyond its
intended capabilities.
2. WHAT IS MACHINE LEARNING?
• Set of techniques and tools to obtain information that we did not previously know
from available data.
• System that learns to combine known input data to make useful predictions about data
we do not know.
MACHINE LEARNING IS MATHEMATICS: applying maths to data.
It is not hitting a button and data coming out, nor is it programming, nor is it obtaining
information by looking at the data.
ML is a set of mathematical tools and techniques that we apply to data to obtain new knowledge.
Machine learning applies to a wide variety of problems. Among them are the following:
• Classification and estimation: We can take advantage of the available data to classify them
and assign some value of interest to each data point. For example, if we identify the relevant
characteristics of visitors to a mall, we can estimate the probability that they will enter a
certain store.
• Clustering: It consists of grouping unorganized data into different categories. It is one of the
basic functions of human knowledge, which allows us to distinguish whether an object is a
chair or a weapon without having seen it before. It has great application in image
processing, autonomous navigation, etc.
• Optimization: Machine Learning is closely related to optimization problems, since the basis
of ML methods is to optimize some type of function (loss, error, distance, etc.). Unlike
classical optimization methods, Machine Learning is capable of continuing to optimize as new
data arrive.
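To make the optimization idea concrete, here is a minimal sketch in Python. It assumes a one-parameter model y ≈ w·x and a mean squared error loss minimized by gradient descent. The toy data, the learning rate and the function names are illustrative assumptions, not part of the course material.

```python
# Toy data generated by y = 2x (an assumption made for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

def mse_loss(w):
    """Mean squared error of the model y = w * x on the toy data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def mse_gradient(w):
    """Derivative of the loss with respect to the parameter w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def gradient_descent(w=0.0, learning_rate=0.01, steps=500):
    """Repeatedly move w a small step against the gradient of the loss."""
    for _ in range(steps):
        w -= learning_rate * mse_gradient(w)
    return w

w_learned = gradient_descent()
print(round(w_learned, 3), round(mse_loss(w_learned), 6))
```

The "learning" here is just the loop that reduces the loss: the parameter w is never set by hand, it is found from the data.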
3. BIG DATA
Big data refers to a series of techniques, technologies, commercial platforms and mechanisms
to deal with the enormous amount of data that floods us continuously and be able to obtain
information from it.
Input data from a big data system can come from: Social Media, Server log files, IoT Sensors,
Images, videos and sounds, Digital content on web pages, Customer transaction information
with companies, Vehicle and device telemetry, Health Data, etc.
We need new tools to deal with such an amount of data. Among the tools that have been
developed for big data are Hadoop, Cloudera, the R language, the Python language, NoSQL databases,
etc. Obviously, the machine learning algorithms are part of the treatment of the data, but due
to the specific characteristics of dealing with such high volumes, a different name is used.
Be able to differentiate ML, Big Data, Big Learning, Data Mining and AI.
Data mining is a function: a user who analyses the data.
4. DATA MINING
• Set of techniques to analyze data, trying to look for new features and useful information
for the user.
• The data can come from a massive source of data (big data) or it can be a small volume. The
important thing is the analysis that is carried out with them.
• Data Mining uses the techniques developed by machine learning, always supervised by a user who
is the one who expects to obtain results.
5. ARTIFICIAL INTELLIGENCE (AI)

Since the beginning of structured science, scientists and engineers have tried to model human
intelligence through machines. Artificial intelligence is about trying to imitate human
intelligence when interacting with the environment, solving problems and making decisions.
Some of the features that AI seeks to mimic are:
1. Perception and recognition of the environment
2. Ability to learn from experiences
3. Adapting to changes
4. Natural language interaction (speaking and listening)
5. Human logical rules
6. Troubleshooting (problem solving)
7. Achievement of planned objectives
8. External human or animal physical form (cybernetics)
From these characteristics, we can see that AI has a lot in common with Machine Learning
(learning, adaptation, recognition, ...), although ML expands the scope of AI with the
development of specific learning algorithms, data mining, etc. When we look at the historical
development of ML we will see that the two have evolved in parallel.
Is AI a subset of ML, or ML a subset of AI? (See the "AI vs. ML" note in section 6.)
Is AI the application of ML techniques to humans, or ML the application of AI techniques to make
a machine learn? There is no clear answer in the classification and we can approach it in
both ways. In any case, both are techniques of Computer Science applied to specific areas.
NOTE ON DEEP LEARNING

Deep learning relates to ML because, when you do not have the right features for your data, you apply a
deep learning algorithm to find out what to consider. Deep learning is built on neural networks, but
neural networks are a technology that comes from AI. For example, a deep learning system analyzes
whether an image shows a dog or a cat.
6. MACHINE LEARNING TERMINOLOGY

- FEATURE: A feature is an input of our model, e.g. days of the week, images of different
vehicles, or Spotify songs.
- LABEL: What we want to predict, e.g. the money a customer will spend in a store, or whether it
will rain tomorrow. Not all models have labels: there are times when you have data but
no label. In unsupervised learning there is no label.
• Labelled examples: {features, label}: (x,y)
• Unlabelled examples: {features, ?}: (x,?)
We use the labelled examples to train our model or to check whether it works (for example, a set of
emails with an indicator of whether each one is spam or not), while the unlabelled ones correspond to
data that we want to classify, or for which we have to predict a label.
- MODEL: system, program or algorithm that produces new knowledge inferred from the input data.
It defines the relationship between features and labels. In our previous example, the model is the
mechanism that learns to obtain the estimated expense from the characteristics of a person.
Note (AI vs. ML): In ML the machine learns by itself and develops new skills and a new understanding
of the world. In AI you program what the environment will be, so the machine already knows how to
act: you program the robot to avoid an obstacle, not to learn how to avoid obstacles. The only caveat
is that nowadays the two are mixed, and many AI systems have ML built in.
ML is for gaining new knowledge; AI is not (DALL-E). Troubleshooting?
Example of AI: deciding where to place wind turbines so that the use of space and wind is optimal.
It is not ML because for another scenario it has to be reprogrammed; it learns nothing by itself,
only from ...
- TRAINING DATA: data used to train our model in a new skill, that is, to teach the model to obtain
the labels from the features.
- VALIDATION DATA / TESTING DATA: data used to validate and verify the correctness of a model.
- INFERENCE: obtaining predictions (labels) for features whose labels we do not know.
- REGRESSION: predicting continuous values, that is, y ∈ ℝ, where y are the
labels. Examples: the estimated expense per customer, the expected price of a share, etc.
- CLASSIFICATION: in this case the predicted values are discrete. For example, whether an email is
spam, or whether a song is pop, funk, disco, hip-hop, ...
- CATEGORICAL DATA: non-numerical data, e.g. a postal code (the digits do not express a quantity; they classify).
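The terms above can be tied together in a small sketch. It assumes a single feature (store visits per month) and a continuous label (monthly spend), so it is a regression problem. All numbers and names are invented for illustration.

```python
# Labelled examples (x, y): x = store visits per month, y = monthly spend in euros.
# The values are invented for illustration.
labelled = [(1, 12.0), (2, 22.0), (3, 32.0), (4, 42.0), (5, 52.0), (6, 62.0)]

training_data = labelled[:4]   # used to fit the model
testing_data = labelled[4:]    # held back to check the model

def fit_line(examples):
    """Least-squares fit of y = a * x + b (simple linear regression)."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    a = (sum((x - mean_x) * (y - mean_y) for x, y in examples)
         / sum((x - mean_x) ** 2 for x, _ in examples))
    b = mean_y - a * mean_x
    return a, b

# The model is the pair (a, b) learned from the training data.
a, b = fit_line(training_data)

def predict(x):
    """Inference: estimate the label for a feature value."""
    return a * x + b

# Testing: largest prediction error on examples the model has not seen.
test_error = max(abs(predict(x) - y) for x, y in testing_data)

# Unlabelled example (x, ?): we only know the feature.
unlabelled_x = 10
print(a, b, test_error, predict(unlabelled_x))
```

For a classification problem the structure would be the same, but the label would be discrete (spam / not spam) instead of a real number.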
OUTLIER: Analyse the system/environment and what you are predicting. Outlier detection is critical
in all ML models. Outliers are data whose value is separated from the rest, either because it lies at
a distant extreme, because it is exceptional, or because it is an error: data far from the mean.
The most important thing is to understand first and then analyse. The following are examples of outliers:
• A temperature reading of T = -10 ºC at 03:00 PM in July in the Mojave Desert.
• A DNI (Spanish national ID number) between 10 and 99. These numbers are reserved for the Royal Family.
• Foreign tourists in Spain in 2020, during COVID.
• Robert Wadlow (2.72 m tall), the tallest human in history.
What you should do is detect them and, if they make no sense, delete them. If there are millions of
data points, one value does not matter (the tallest person in the world, at about 3 m, hardly matters
relative to the whole world, but will affect the average height of a village of 30 people... unless
everyone in that village is very tall).
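The village example can be checked with a few lines of Python. This is a minimal sketch that assumes everyone else is exactly 1.70 m tall and uses a simple "more than 3 standard deviations from the mean" rule. Both the heights and the threshold are illustrative assumptions, and other detection rules exist.

```python
from statistics import mean, pstdev

# Heights in metres for a hypothetical village of 30 people, all 1.70 m tall.
village = [1.70] * 30
village_with_outlier = village[:-1] + [2.72]   # one person has Robert Wadlow's height

# How much does a single outlier move the mean of 30 people?
shift_small = mean(village_with_outlier) - mean(village)

# The same single value inside a sample of one million people.
n_large = 1_000_000
shift_large = (1.70 * (n_large - 1) + 2.72) / n_large - 1.70

def find_outliers(values, threshold=3.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]

print(round(shift_small, 4), shift_large, find_outliers(village_with_outlier))
```

In the small village the mean moves by about 3.4 cm, while in the large sample the shift is around a thousandth of a millimetre. That is why the same value may need to be removed in one dataset and can be ignored in another.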
7. TYPES OF MACHINE LEARNING ALGORITHMS

We can classify ML algorithms from different perspectives. One of them is based on whether or
not we have labelled data to learn from.
1. Supervised learning: A type of learning in which the dataset contains labelled examples and
there is a so-called supervisor who provides information to the learner, which tries
to learn a function from input to output.
2. Unsupervised learning: Unlike supervised learning, the dataset contains unlabelled examples
and the purpose is to group them into categories.
3. Semi-supervised learning: A type of learning that uses both labelled and
unlabelled examples to learn a model.
4. Reinforcement learning: In this type of machine learning there is no supervisor to teach the
models; algorithms called agents learn from their own experience.
7.1 Supervised learning
In this model we start from a series of data (characteristics) already classified with their
corresponding label. We call this dataset a training set. The process consists of two phases. In the
first phase the algorithm is provided with the training data for analysis, and in the second phase
it is provided with new unclassified data, for which it has to obtain the corresponding label.
For example, we can train our system to recognize whether a vehicle is a truck or not by
providing it with a set of training images, so that when we pass it a new image
that it has never seen, it decides whether it is a truck or not.
Image recognition is one of the main applications of supervised learning. This recognition does
not have to be of well-defined shapes: recognizing the characters in a handwritten letter, or
whether an image provokes "disgust", is also part of supervised learning (obviously a computer
does not know what "disgust" is, but it can learn to classify it).
- DECISION TREES: Decision trees are a method that allows us to structure the information
we know in branches and leaves, so that when a new data point arrives it can automatically go
through the tree and reach the correct decision (leaf).
- NAÏVE BAYES CLASSIFIER: It consists of applying Bayes' theorem to a set of random variables
that are assumed to be independent. This method classifies the data according to the probability
that a certain characteristic will or will not occur.
- LINEAR REGRESSION: It consists of obtaining the unknown parameters of a linear
model by minimizing the difference between the observed values and the values the model predicts.
- LOGISTIC REGRESSION: Used when the variable our model predicts takes discrete values. For
example, we can use this method to determine if a volcano is going to erupt, if our product
will sell well, or if we should invest in a certain company.
- SUPPORT VECTOR MACHINES: It consists of dividing the data set into regions while
maintaining a certain distance (margin) between the boundary and the data. Used to solve complex
and multidimensional cases. (It is an AI method for classifying information.)
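The two phases of supervised learning (training, then labelling new data) can be sketched with the simplest possible decision tree: a single split, sometimes called a decision stump. The vehicle lengths below are invented, and using one feature and one threshold is a simplifying assumption, not how a full decision tree library works.

```python
# Labelled training set: (vehicle length in metres, is_truck). Values are invented.
training_set = [(3.8, False), (4.2, False), (4.6, False), (5.0, False),
                (7.5, True), (9.0, True), (12.0, True), (16.5, True)]

def train_stump(examples):
    """Phase 1: pick the length threshold that misclassifies the fewest training examples."""
    lengths = sorted(x for x, _ in examples)
    # Candidate thresholds are the midpoints between consecutive lengths.
    candidates = [(lo + hi) / 2 for lo, hi in zip(lengths, lengths[1:])]

    def errors(threshold):
        # Count examples whose predicted label differs from the true label.
        return sum((x > threshold) != label for x, label in examples)

    return min(candidates, key=errors)

def predict(threshold, length):
    """Phase 2: label a new, unseen vehicle (the single branch of the tree)."""
    return length > threshold

threshold = train_stump(training_set)
print(threshold, predict(threshold, 4.0), predict(threshold, 10.0))
```

A real decision tree repeats this split search recursively on several features, producing the branches and leaves described above.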
7.2 Unsupervised learning
You do not have to train the system with labelled examples. These types of algorithms are used when
there are no labels for the data. Let's imagine that we provide the algorithm with a list of vehicles
like the one we saw earlier. An unsupervised algorithm will group the different vehicles into the most
reasonable categories, based on the input characteristics (size, color, number of wheels, etc.).
In each execution of the algorithm (that is, with each new data point) it will place the vehicle in one
of the existing categories (or perhaps create a new one). Main algorithms:
- CLUSTERING ANALYSIS: The clustering method analyzes and divides the elements into
different groups (clusters) with similar characteristics, as we have seen in the previous
example. In this way, when we have a new data point, the algorithm will decide to which group it
belongs, based on its similarity.
- PRINCIPAL COMPONENT ANALYSIS: This method is used to reduce the dimensionality of the dataset
(the number of features). PCA consists of an orthogonal linear transformation that maps the data
into a new coordinate system, so that the greatest variance of the data lies on the first axis, the
second greatest on the second axis, etc. In this way it preserves the directions that have the
greatest impact on the variance.
- SINGULAR VALUE DECOMPOSITION: Singular Value Decomposition (SVD) and Independent
Component Analysis (ICA) are techniques that allow us to split a signal into a sum of other
signals. ICA in particular looks for components that are statistically independent and have a
non-Gaussian distribution. An example of the application of these techniques could be the analysis
of conversations in a meeting. We could have several microphones distributed around the room,
which would capture the composite signals. Using SVD or ICA we could separate the individual
conversations from the complete signal.
Note: ICA is different from PCA.
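As an illustration of clustering analysis, here is a minimal sketch of k-means, one common clustering algorithm (the notes do not name a specific one). The vehicles are described by two invented features (length in metres, number of wheels), and the initial centroids are simply the first k points, which is a naive assumption.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def assign(point, centroids):
    """Index of the centroid (cluster) nearest to the point."""
    return min(range(len(centroids)), key=lambda i: squared_distance(point, centroids[i]))

def kmeans(points, k, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then recompute centroids."""
    centroids = list(points[:k])   # naive initialisation: the first k points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[assign(p, centroids)].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [centroid(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids

# Unlabelled vehicles: (length in metres, number of wheels). Values are invented.
vehicles = [(4.0, 4), (12.0, 6), (4.5, 4), (13.0, 6), (3.5, 4), (14.0, 6)]
centroids = kmeans(vehicles, k=2)

# A new vehicle is placed in the group it is most similar to.
new_vehicle = (5.0, 4)
print(centroids, assign(new_vehicle, centroids))
```

No labels are used anywhere: the algorithm only sees the features and discovers two groups that a person would probably call "cars" and "trucks".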
7.3 Reinforcement learning
These types of algorithms make use of the observations collected during their execution: they
learn from the environment through repeated iterations. Reinforcement learning
methods resemble those used to train animals, which are rewarded or punished according to
the result of their actions.
Reinforcement methods are trained by continuous execution, by trial and error, and can perform well.
The specific models are called agents, and for each action that the agent executes we can give it
a reward or a punishment. The agent is never told what the solution is before execution, which
makes it possible to tackle complex problems without specifying the solution in advance. As in
psychology, among the reinforcement options we have:
1. Positive and negative reinforcement
2. Positive and negative punishment
Reinforcement learning methods are not supervised, but they are not considered
unsupervised either, since that name is reserved for methods that group unlabelled data. A field
where these methods are widely applied is video games, in which machines are taught to play through
reinforcement learning (War Games).
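The reward/punishment loop can be sketched with tabular Q-learning, one standard reinforcement learning algorithm (the notes do not name a specific one). The environment is a hypothetical corridor of 5 cells: the agent starts in cell 0, is rewarded for reaching cell 4 and punished for bumping into the left wall. The reward values, learning rate and discount factor are illustrative assumptions.

```python
import random

N_STATES = 5                        # corridor cells 0..4; cell 4 is the goal
ACTIONS = {"left": -1, "right": +1}

def step(state, action):
    """Environment: returns (next_state, reward, done) for one action."""
    target = state + ACTIONS[action]
    if target < 0:
        return state, -1.0, False   # punishment: bumped into the wall
    if target == N_STATES - 1:
        return target, 1.0, True    # reward: reached the goal
    return target, 0.0, False

def train(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    """Q-learning: estimate the value of each (state, action) pair by trial and error."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(list(ACTIONS))   # explore by acting at random
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate towards reward + discounted value of the next state.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q

q_table = train()
# Learned policy: in each non-goal cell, take the action with the highest estimated value.
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

The agent is never told that "go right" is the solution. It discovers it from the rewards and punishments alone.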
7.4 Deep learning
It consists of a series of methods derived from neural networks, which try to simulate the
complete behavior of a brain. Deep Learning is designed to work with large volumes of data, or
options, unlike other ML neural network methods. Deep Learning methods are able to
identify the most relevant characteristics of a data set and process them.
A DL system that has to analyze an image to detect whether it shows a dog or another animal will
analyze the shape of the head, the number and type of legs, the exterior color, the eyes, etc., and
will select which characteristics are the most important to detect whether it is a dog or not.
As we have said, the fundamental difference between ML and DL is that DL operates on large volumes
of data. DL needs large volumes to be able to determine which are the most important
characteristics for the specific problem. For this reason, a DL system consumes much more time in
the training phase than other ML methods.
8. LEARNING BY TYPE OF REASONING

We can classify learning by the reasoning strategy used. For this purpose, we can use Michalski's
Inference Equation, which states the following:
- If we have the premise P and combine it with everything we know in advance (that is, the
set of rules S), we logically obtain a consequence C that will be the result of our learning.
(P ∪ S → C) If you add new data S to a set of data P, you obtain a different combination C.
Premise P includes everything we know about the system, for example the pressure,
temperature and humidity data of the atmosphere in Madrid.
S denotes the rules related to the system and to the premise P, e.g., the ideal gas equation
(where P is the pressure, not the premise):
P·V=n·R·T
Consequence C is the learning, or new knowledge, that we obtain from the previous ones. In our
case it would be the weather forecast for tomorrow.
8.1 Deductive reasoning
If we know P and S, applying logical rules we can obtain C.
• P = My cat is called Tara
• S = All cats meow
• C = Tara meows

• P = Jennifer has COVID
• S = If you have COVID then you have a fever
• C = Jennifer has a fever
If we analyze it well, we are not really learning anything new, since C was implicit in P and S
(applying a brief reasoning), but we have gained new knowledge of the system.
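Deduction can be sketched as mechanical rule application. In this minimal example the rules S are written as simple "if premise then conclusion" pairs already instantiated for Tara and Jennifer, which is a simplifying assumption (a real reasoner would handle "all cats" with variables).

```python
# Premises P (known facts) and rules S ("if premise then conclusion").
facts = {"Tara is a cat", "Jennifer has COVID"}
rules = [("Tara is a cat", "Tara meows"),
         ("Jennifer has COVID", "Jennifer has a fever")]

def deduce(facts, rules):
    """Forward chaining: apply the rules until no new consequence C appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known - set(facts)   # only the newly derived consequences

consequences = deduce(facts, rules)
print(sorted(consequences))
```

Every consequence was already implicit in P and S; the code only makes it explicit.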
8.2 Inductive reasoning
Inductive reasoning happens when we know several values of P and C (features and labels) and
what we want to infer are the rules S that can be applied to them.
• P1 = Tara is a cat.
• C1 = Tara meows
• P2 = Sebastian is a cat
• C2 = Sebastian meows
• S = Cats meow
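Induction goes the other way: from observed (P, C) pairs to a candidate rule S. The sketch below proposes a rule only if every observed example supports it. The data structure, the function name and the hypothetical counterexample are illustrative assumptions.

```python
# Observed examples: premise (kind of animal) and consequence (whether it meows).
observations = [
    {"name": "Tara", "kind": "cat", "meows": True},
    {"name": "Sebastian", "kind": "cat", "meows": True},
]

def induce_rule(observations, kind, attribute):
    """Return (kind, attribute, support) if every observed example of `kind` has `attribute`."""
    relevant = [o for o in observations if o["kind"] == kind]
    if relevant and all(o[attribute] for o in relevant):
        return kind, attribute, len(relevant)
    return None

rule = induce_rule(observations, "cat", "meows")
print(rule)

# A single counterexample (hypothetical) is enough to withdraw the rule.
counterexample = {"name": "Silent cat", "kind": "cat", "meows": False}
revised = induce_rule(observations + [counterexample], "cat", "meows")
print(revised)
```

Unlike deduction, the induced rule is never certain: it is only as good as the examples seen so far. This is the situation of an ML model trained on labelled data.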
8.3 Abductive reasoning
In this case we know S and C and try to obtain P. To some extent it means going back in time to
understand what caused consequence C. It is applied continuously in medical science.
• S1 = If you have COVID you have a fever.
• C = Jennifer has a fever
• Does Jennifer have COVID? → Sure: P1 = Jennifer has COVID ...
But there are hundreds of fever-causing diseases. For instance, we could have the following rule:
• S2 = If you have the flu you have a fever
and P will be very different.
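Abduction can be sketched as a search for every premise that would explain the observed consequence. The rule table below contains the two rules from the example plus one invented rule for contrast; it is an illustrative assumption, not medical knowledge.

```python
# Rules S: each cause mapped to the consequences it produces.
rules = {
    "COVID": {"fever"},
    "flu": {"fever"},
    "sprained ankle": {"ankle pain"},   # invented rule, for contrast
}

def abduce(observed, rules):
    """Return every cause whose rule would explain the observed consequence."""
    return sorted(cause for cause, effects in rules.items() if observed in effects)

candidates = abduce("fever", rules)
print(candidates)
```

The result is a list of candidate premises, not a single answer. Choosing between them needs more evidence, such as a test.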
8.4 Analog reasoning
This is a mix of inductive and deductive reasoning. In this case we know several pairs of P and C
from which we obtain S, and then apply S directly to the next P.
✓ P1 = María has celiac disease / C1 = María does not eat wheat
✓ P2 = Bob is from Canada / C2 = Bob eats a lot of spaghetti
... and now Paula appears:
✓ P3 = Paula is Canadian
✓ From (P2, C2) ➔ S3 (induction) = Canadians eat a lot of spaghetti.
✓ C3 (deduction) = Paula eats a lot of spaghetti. Is that right?
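The two steps (induce S from known pairs, then deduce C for a new P) can be sketched as follows. The code also counts how many examples support each induced rule, which is one way to see why the conclusion about Paula is doubtful. The representation and function names are illustrative assumptions.

```python
# Known (name, premise, consequence) triples.
examples = [
    ("María", "has celiac disease", "does not eat wheat"),
    ("Bob", "is Canadian", "eats a lot of spaghetti"),
]

def induce(examples):
    """Induction: turn each premise into a rule and count the examples supporting it."""
    rules = {}
    for _name, premise, consequence in examples:
        _, support = rules.get(premise, (consequence, 0))
        rules[premise] = (consequence, support + 1)
    return rules

def deduce(name, premise, rules):
    """Deduction: apply the induced rule to a new individual."""
    if premise not in rules:
        return None
    consequence, support = rules[premise]
    return f"{name} {consequence}", support

rules = induce(examples)
conclusion, support = deduce("Paula", "is Canadian", rules)
print(conclusion, support)
```

The rule about Canadians rests on a single example (support = 1), so the conclusion follows mechanically but is weakly justified.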
EXERCISES: DATA LOSS PREVENTION
Data Loss Prevention (DLP) consists of detecting and preventing a company's sensitive data
from leaving the organization, or being destroyed in an unauthorized manner. Companies need to
prevent data loss, or leakage, for their own interest and to comply with regulations, for example GDPR.
A DLP System analyzes the activity of users and devices to detect possible information leaks
(intentional, or involuntary), monitoring the following elements:
- Emails
- Chats from Teams, Whatsapp, Messenger ...
- Data entered in a web form
- Sharepoint Documents / Google Docs / Dropbox ...
- Data exchanged with CRM, ERP, etc. systems
Once a DLP system has determined that an event is a data loss event, it can block the connection of
the user/device and alert the Security Department.
Obviously, a DLP system cannot raise an event against a user just because he enters a DNI on a
website, because it may be his own, or it may be something his work requires. The same goes for
sending photos, entering bank account numbers on a website, etc. The DLP system has to determine
clearly what is an illicit activity and what is not.
Note: difference between reinforcement learning and deep learning.
Questions:
1. What labels would you use in your system?
2. What features (input data) would you use in your system?
3. Give an example of each type of ML algorithm (supervised, unsupervised,
reinforcement) applied to DLP. For example, we could apply an unsupervised learning
algorithm to determine whether an incoming email contains phishing or not.
4. Search the Internet for 3 DLP systems and describe one of them, explaining how it
uses ML.
Exam note: the exam has questions about data, but you may also be asked to define what learning is,
or to say whether a specific case is ML, deep learning or AI.
There are also different types of reasoning, and you have to choose which one applies.