Representing Neural Networks
Daniel Manrique
2021
Textbooks, tutorials, and articles
• D. Manrique (2021). From Artificial Cells to Deep Learning: An Evolutionary Story. Archivo Digital UPM, Madrid.
• Python Tutorial: https://docs.python.org/3/tutorial/
• T. P. Lillicrap, A. Santoro, L. Marris, C. J. Akerman, and G. Hinton (2020). Backpropagation and the brain. Nature Reviews Neuroscience, 21, 335-346.
• A. Géron (2019). Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow: Concepts, Tools, and Techniques for Building Intelligent Systems. O'Reilly, CA, USA.
• M. Abadi et al. (2016). TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. arXiv:1603.04467v2.
• J. M. Font, D. Manrique, and J. Ríos (2009). Redes de Neuronas Artificiales y Computación Evolutiva [Artificial Neural Networks and Evolutionary Computation]. Fundación General de la UPM, Madrid, Spain.
• S. Haykin (1999). Neural Networks: A Comprehensive Foundation, 2nd Edition. Prentice Hall, Ontario, Canada.
Machine learning
Learning
• Labelled data (supervised learning).
(Image credits: Laura Edell, 2015, https://scorecardstreet.wordpress.com/2015/12/09/is-machine-learning-the-new-epm-black/; Ash Booth, Iris flower database, http://www.ashbooth.com/blog/tag/machine-learning-2/)
Unsupervised learning
(Image credit: Franki Chamaki, 2016, http://www.frankichamaki.com/data-driven-market-segmentation-more-effective-marketing-to-segments-using-ai/)
(Figure: clustering example grouping unlabelled examples: Carla, María, Alejandra, Pablo, Miguel, Samuel.)
• Unsupervised Image-to-Image Translation Networks: https://arxiv.org/abs/1703.00848
Reinforcement learning
Artificial neural networks
An artificial neural network is a machine learning system
inspired by the natural (animal) nervous system, composed
of processing elements, units, or (artificial) neurons
interconnected by weighted connections.
(Figure: a layered neural network with kernels W[1], W[2], and W[3]. Image: https://medium.com/autonomous-agents/mathematical-foundation-for-activation-functions-in-artificial-neural-networks-a51c9dd7c089#.r0uddzxdd)
Features
• Learning from a set of examples.
• Neural network learning is about finding the weights that make the neural network exhibit the desired behavior.
• This set of weights is called a solution to the problem.
• The learning process changes the synaptic weights (parameters) between neurons to adapt their responses and achieve the expected network behavior: to fit the dataset.
• Generalization: giving adequate answers to unseen data.
(Figure: MNIST input examples fed through a network with kernels W[1], W[2], and W[3].)
The Neural Network Zoo: http://www.asimovinstitute.org/neural-network-zoo/
A little bit about history
(Timeline: GANN, Manrique & Ríos, 2001; EANN, Manrique, 2013.)
(Figure: a feed-forward network with inputs $x_1, x_2, x_3$, kernels $W^{[1]}$ and $W^{[2]}$, and outputs $y_1, y_2$.)
Components of FF neural networks
Neuron: the basic processing unit of the network. A neuron i receives information from multiple inputs X, processes it, and sends out a single output or response $y_i$ that is transmitted identically to multiple neurons.
• $x_j$ is the j-th input to neuron i.
• $X$ is the input vector to neuron i, a column vector of dimension $n_x \times 1$.
• $y_i$ is the output from neuron i.
Components of FF neural networks
Synapse: a directed connection from neuron j in the previous layer ℓ-1 to another neuron i in the current layer ℓ, and from neuron i to neuron k in the following layer ℓ+1.
Components of FF neural networks
Synaptic weight: a real number $w_{ij}$ representing the strength of the connection between neuron j in the previous layer and neuron i. A large weight means that the information communicated through the connection makes a significant contribution to the new state of the receptor neuron i.
(Figure: neuron i with inputs $x_1, \dots, x_{n_x}$ and synaptic weights $w_{i1}, \dots, w_{i n_x}$; $X$ is the $n_x \times 1$ input vector.)
Components of FF neural networks
The net input of neuron i is the weighted sum of its inputs plus a bias term $b_i$:
$net_i = \sum_{j=1}^{n_x} x_j\, w_{ij} + b_i$
(Figure: neuron i combining inputs $x_1, \dots, x_{n_x}$ through weights $w_{i1}, \dots, w_{i n_x}$ and bias $b_i$.)
Components of FF neural networks
Activation: a neuron's level of excitation $a_i$, given by the activation function $f(net_i)$. It is usually the output of the neuron:
$y_i = f(net_i)$
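The slides define $f(net)$ only abstractly. As a minimal sketch, here are three common choices of activation function (sigmoid, tanh, ReLU) in Python with NumPy; picking these three, and the function names, is my illustration, not the slides'.

```python
import numpy as np

# Three common activation functions (illustrative choices; the slides
# only define f(net) abstractly).

def sigmoid(net):
    """Logistic function: squashes the net input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-net))

def tanh(net):
    """Hyperbolic tangent: squashes the net input into (-1, 1)."""
    return np.tanh(net)

def relu(net):
    """Rectified linear unit: passes positive net input, zeroes the rest."""
    return np.maximum(0.0, net)

net = np.array([-2.0, 0.0, 3.0])
print(sigmoid(net), tanh(net), relu(net))
```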
Components of FF neural networks
Kernel: the matrix of weights corresponding to layer ℓ, denoted $W^{[\ell]}$. We do not consider the input layer, since it has neither a kernel nor an activation function.
The activation function is the same for all neurons in layer ℓ, but it may differ for neurons in different layers.
(Figure: input layer receiving the input vector X; hidden layer 1 with activation $f^{[1]}$ and kernel $W^{[1]}$; hidden layer 2 with $f^{[2]}$ and $W^{[2]}$; output layer with $f^{[3]}$ and $W^{[3]}$ producing the output vector Y.)
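As a minimal sketch (variable names are mine, not the slides'), a kernel $W^{[\ell]}$ can be stored as an $n_\ell \times n_{\ell-1}$ NumPy array and applied to the previous layer's output vector, with one activation function shared by the whole layer:

```python
import numpy as np

def dense_layer(a_prev, W, b, f):
    """Forward pass through one layer l: f(W[l] a[l-1] + b[l]).
    a_prev: (n_prev, 1) output column vector of layer l-1 (or the input X).
    W:      (n_curr, n_prev) kernel of layer l.
    b:      (n_curr, 1) bias vector of layer l.
    f:      activation function shared by all neurons of layer l.
    """
    return f(W @ a_prev + b)

rng = np.random.default_rng(0)
X = np.array([[0.5], [-1.0]])                # input vector, n_x = 2
W1 = rng.normal(size=(3, 2))                 # kernel W[1]: 3 neurons, 2 inputs
a1 = dense_layer(X, W1, np.zeros((3, 1)), np.tanh)
print(a1.shape)                              # (3, 1)
```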
The artificial neuron
(Figure: a single neuron with inputs $x_1, \dots, x_{n_x}$ forming the $n_x \times 1$ column vector $X$, weights $w_1, \dots, w_{n_x}$ forming the $1 \times n_x$ row vector $W$, a bias $b$, the net input, and the activation function.)
$net = \sum_{j=1}^{n_x} x_j\, w_j + b = WX + b$
$y = f(net) = f(WX + b)$
Neural network dynamics
(Figure: a feed-forward network with inputs $x_1, x_2$, a first hidden layer of three neurons with outputs $y_1, y_2, y_3$ and activation $f^{[1]}$, a second hidden layer of two neurons with outputs $y_4, y_5$ and activation $f^{[2]}$, and one output neuron with activation $f^{[3]}$. Each neuron's output is computed from the previous layer's outputs, with $w_{i,0}$ the bias weight applied to the constant input 1:)
$y_1 = f^{[1]}(w_{1,0} \cdot 1 + w_{1,x_1} x_1 + w_{1,x_2} x_2)$
$y_2 = f^{[1]}(w_{2,0} \cdot 1 + w_{2,x_1} x_1 + w_{2,x_2} x_2)$
$y_5 = f^{[2]}(w_{5,0} \cdot 1 + w_{5,1} y_1 + w_{5,2} y_2 + w_{5,3} y_3)$
$y = f^{[3]}(w_{6,0} \cdot 1 + w_{6,4} y_4 + w_{6,5} y_5)$
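A minimal numeric sketch of these dynamics in Python; the weight values and activation choices are illustrative, not from the slides:

```python
import numpy as np

# 2-3-2-1 feed-forward network from the slide, written with kernels.
# Each kernel row holds one neuron's weights; the first column is the
# bias weight w_{i,0} applied to the constant input 1.
f1 = f2 = np.tanh           # illustrative activation choices
f3 = lambda net: net        # linear output

W1 = np.array([[0.1, 0.5, -0.3],        # neuron 1: w_{1,0}, w_{1,x1}, w_{1,x2}
               [0.0, -0.2, 0.7],        # neuron 2
               [0.2, 0.4, 0.1]])        # neuron 3
W2 = np.array([[0.1, 0.3, -0.5, 0.2],   # neuron 4: w_{4,0}, w_{4,1..3}
               [-0.1, 0.6, 0.1, -0.4]]) # neuron 5
W3 = np.array([[0.05, 0.9, -0.7]])      # output:   w_{6,0}, w_{6,4}, w_{6,5}

def layer(W, f, a_prev):
    a = np.concatenate(([1.0], a_prev))  # prepend the constant input 1
    return f(W @ a)

x = np.array([0.5, -1.0])
y = layer(W3, f3, layer(W2, f2, layer(W1, f1, x)))
print(y)   # network output
```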
Case study: California housing dataset
Median house value regression and classification
Dataset: m = 20,640 examples corresponding to districts in California, ranging from 600 to 3,000 people.
$n_x$ = 9 attributes.
Label: median house value.
$n_y$ = 3 classes (classification); $n_y$ = 1 (regression).

Attributes and output:
Long.   | Lat.  | Age | Rooms | Beds | Pop. | HouH. | Inc.   | Ocean proximity | Median H. Val.
-122.23 | 37.88 | 41  | 880   | 129  | 322  | 126   | 8.3252 | Near Bay        | 452600
-121.97 | 37.57 | 21  | 4342  | 783  | 2172 | 789   | 4.6146 | <1h ocean       | 247600
-124.17 | 41.8  | 16  | 2739  | 480  | 1259 | 436   | 3.7557 | Near ocean      | 109400
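A minimal loading sketch in Python with pandas, assuming the dataset is available locally as a CSV file named housing.csv; the file name and column names are assumptions matching the commonly distributed version of this dataset, not something stated in the slides:

```python
import pandas as pd

# Assumed local copy of the California housing dataset (file name and
# column names are assumptions).
housing = pd.read_csv("housing.csv")
print(housing.shape)        # expected: (20640, 10) -> 9 attributes + label
print(housing.head(3))
print(housing["ocean_proximity"].value_counts())
```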
Median house value classification problem
Attributes: longitude, latitude, median age, total rooms, total bedrooms, population, households, median income, ocean proximity.
Median house value classes, in thousands of dollars: Cheap [15, 141.3]; Averaged [141.4, 230.2]; Expensive [230.3, 500].

Attributes and output:
Long.   | Lat.  | Age | Rooms | Beds | Pop. | HouH. | Inc.   | Ocean proximity | Class
-122.23 | 37.88 | 41  | 880   | 129  | 322  | 126   | 8.3252 | Near Bay        | Expensive
-121.97 | 37.57 | 21  | 4342  | 783  | 2172 | 789   | 4.6146 | <1h ocean       | Expensive
-124.17 | 41.8  | 16  | 2739  | 480  | 1259 | 436   | 3.7557 | Near ocean      | Cheap
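A sketch of this discretization with pandas, continuing the loading sketch above; the column names median_house_value and mhv_class are assumptions:

```python
import pandas as pd

# Bin the median house value (in dollars) into the three classes defined
# above (class boundaries from the slide; column names are assumptions).
bins = [15_000, 141_300, 230_200, 500_000]
labels = ["Cheap", "Averaged", "Expensive"]
housing["mhv_class"] = pd.cut(housing["median_house_value"],
                              bins=bins, labels=labels,
                              include_lowest=True)
print(housing["mhv_class"].value_counts())
```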
ANN project stages
Christof Angermueller et al., 2016. Deep learning for computational biology, Molecular Systems Biology.
• Cleaning and preparing data (see the sketch after this list):
1. The total_bedrooms attribute has 207 missing values (na or nan); these examples are removed, since they are very few compared to the whole dataset.
2. ISLAND has only 5 samples, not enough to generalize, so this class is removed. The remaining ocean proximity values are discretized: <1h ocean: 0; Inland: 1; Near bay: 2; Near ocean: 3.
3. The dataset is randomized.
4. Classes are encoded: first discretized and then one-hot encoded: Cheap: 1,0,0; Averaged: 0,1,0; Expensive: 0,0,1.
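A hedged pandas sketch of these four steps, continuing the loading and discretization sketches above; the column names and category spellings are assumptions matching the common CSV distribution, and the shuffling seed is arbitrary:

```python
import pandas as pd

# 1. Drop the 207 examples with missing total_bedrooms.
housing = housing.dropna(subset=["total_bedrooms"])

# 2. Drop the 5 ISLAND samples, then discretize ocean proximity
#    (category spellings are assumptions).
housing = housing[housing["ocean_proximity"] != "ISLAND"]
mapping = {"<1H OCEAN": 0, "INLAND": 1, "NEAR BAY": 2, "NEAR OCEAN": 3}
housing["ocean_proximity"] = housing["ocean_proximity"].map(mapping)

# 3. Randomize (shuffle) the dataset.
housing = housing.sample(frac=1, random_state=42).reset_index(drop=True)

# 4. One-hot encode the already-discretized class labels.
one_hot = pd.get_dummies(housing["mhv_class"])  # columns: Cheap, Averaged, Expensive
```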
Data cleaning and preparing
5. Attributes and the house value are scaled to the range [-1, 1]. Example rows after scaling, with one-hot encoded labels:

Long.   | Lat.  | Age   | Rooms | Beds  | Pop.  | HouH. | Inc.  | O. prox. | H. val. | Class
-0.5623 | 0.763 | 0.17  | -0.87 | -0.74 | -0.85 | -0.91 | 0.93  | 0.33     | 0.87    | 0,0,1
0.4297  | 0.477 | -0.13 | 0.44  | 0.32  | 0.44  | 0.52  | 0.23  | -1       | -0.45   | 1,0,0
0.0212  | -0.75 | 0.21  | 0.67  | 0.87  | 0.94  | 0.9   | 0.64  | -0.33    | 0.34    | 0,1,0
-0.6171 | -0.11 | -0.37 | -0.31 | -0.23 | -0.34 | -0.27 | -0.34 | 1        | -0.82   | 1,0,0

6. The correlation matrix between all pairs of attributes has been calculated to visualize their dependencies. The results show that total_rooms, total_bedrooms, population, and households are highly (positively) correlated.
7. Finally, a partition of the dataset is created with three subsets: 16,342 samples (80%) for training the neural model, 2,043 (10%) for development testing, and 2,043 (10%) for final testing.
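A minimal sketch of the 80/10/10 partition in Python; since the dataset was already shuffled in step 3, contiguous slices suffice. The arrays X and Y below are placeholders standing in for the cleaned data:

```python
import numpy as np

# Placeholder arrays standing in for the cleaned, shuffled dataset
# (20,428 examples, 10 scaled columns, 3-class one-hot labels).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(20428, 10))
Y = np.eye(3)[rng.integers(0, 3, size=20428)]

m = X.shape[0]
n_train = int(0.8 * m)        # 16,342 training samples
n_dev = (m - n_train) // 2    # 2,043 development-test samples

X_train, Y_train = X[:n_train], Y[:n_train]
X_dev,   Y_dev   = X[n_train:n_train + n_dev], Y[n_train:n_train + n_dev]
X_test,  Y_test  = X[n_train + n_dev:], Y[n_train + n_dev:]
print(len(X_train), len(X_dev), len(X_test))  # 16342 2043 2043
```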
Dataset partition
(Figure: the scaled dataset, with columns Long., Lat., Age, Rooms, Beds, Pop., HouH., Inc., O. proximity, and H. val., partitioned into the training set 📉, the development test set 🔍, and the final test set 🔒.)
Computational resources
Our choice
Other open source DL libraries
• H2O: Python, R. h2o.ai, 2014.
Lecture slides "Representing Neural Networks" from the master course "Intelligent Systems".
© 2021 Daniel Manrique