AI Lec-06

AI Foundations and Applications
6. Artificial Neural Networks
Thien Huynh-The
HCM City Univ. Technology and Education
Jan, 2023
How is a neuron modeled?
HCMUTE AI Foundations and Applications 03/18/2024 2

Why are neural network used?
• Neuronal networks can theoretically estimate any function, regardless of its

complexity.
• But what are the differences between neural networks and other methods of
machine learning? The answer is based on the Inductive Bias phenomenon, a
psychological phenomenon.
• An Inductive Bias of Linear Regression is the linear relationship between X and Y. In this way, a
line or hyperplane gets fitted to the data.
• When X and Y have a complex relationship, it can get difficult for a Linear Regression method
to predict Y. For this situation, the curve must be multi-dimensional or approximate to the
relationship.

What is a FeedForward Neural Network
• Feed forward neural networks are ANN in which nodes do not form loops. This type
of neural network is also known as a multi-layer neural network as all information is
only passed forward.
• During data flow, input nodes receive data, which travel through hidden layers, and
exit output nodes. No links exist in the network that could get used to by sending
information back from the output node.
• A feed forward neural network approximates functions in the following way:
• An algorithm calculates classifiers by using the formula y = f*(x).
• Input x is therefore assigned to category y.
• According to the feed forward model, y = f(x;θ). This value determines the closest
approximation of the function.

Applications

Working principle of FFNN
• Working principle
• This model multiplies inputs with weights as they enter the layer.
• The weighted input values get added together to get the sum
• The sum of the values rises above a certain threshold, set at zero, the output value is usually
1, while if it falls below the threshold, it is usually -1.

Working principle of FFNN
• As a FFNN model, the single-layer perceptron often

gets used for classification.
• Through training, neural networks can adjust their
weights based on a property called the delta rule,
which helps them compare their outputs with the
intended values.
• As a result of training and learning, gradient descent
occurs. This process gets known as back-
propagation. If this is the case, the network's hidden
layers will get adjusted according to the output values
produced by the final layer.

Layers of FFNN

Layers of FFNN
Input layer:
• The neurons of this layer receive input and pass it on to the other layers of the network.
• Feature or attribute numbers in the dataset must match the number of neurons in the input
layer.
Output layer:
• According to the type of model getting built, this layer represents the forecasted feature.
Hidden layer:
• Input and output layers get separated by hidden layers. Depending on the type of model, there
may be several hidden layers.
• There are several neurons in hidden layers that transform the input before actually transferring
it to the next layer. This network gets constantly updated with weights in order to make it
easier to predict.

Layers of FFNN
Neuron weights:
• Neurons get connected by a weight, which measures
their strength or magnitude. Similar to linear
regression coefficients, input weights can also get
compared, normally values between 0 and 1.
Neuron:
• Artificial neurons get used in FFNN, which later get
adapted from biological neurons.
• Neurons function in two ways: they create weighted
input sums and they activate the sums to make them
normal.
• Activation functions can either be linear or nonlinear.
Neurons have weights based on their inputs. During
the learning phase, the network studies these
weights.
Layers of FFNN
Activation function:
• Neurons are responsible for making decisions in this
area.
• According to the activation function, the neurons
determine whether to make a linear or nonlinear
decision. Since it passes through so many layers, it
prevents the cascading effect from increasing neuron
outputs.
• An activation function can be classified into three
major categories: sigmoid, Tanh, and Rectified Linear
Unit (ReLu).

Calculations in FFNN
So easy, right?

Calculations in FFNN

Training ANN
• Already know how to calculate estimated output

• Training a neural network is a process of using an
optimization algorithm to find a set of neuron
weights to best map inputs to outputs.
Mean squared error
Cross-entropy loss
The loss of a network measure the cost
incurred from incorrect prediction

Training ANN
• Loss function
• In other word, this is the way to minimize the

loss
The loss of a network measure the cost

incurred from incorrect prediction
As the purpose of training ANN is to find W
that the Loss function is minimum

Gradient Descent
03/18/2024 HCMUTE 16
Gradient Descent
• Giả sử một người đang lạc ở trên đỉnh núi như

trên, trời sắp tối và đỉnh núi rất lạnh khi về đêm
nên cần phải đi xuống thung lũng để dựng trại,
xung quanh toàn là sương mù nên anh ta không
thể xác định chính xác thung lũng ở đâu. Làm thế
nào để đi xuống thung lũng?
• Vì đam mê toán học, trong đầu anh ấy lóe lên ý
tưởng rằng sẽ cố gắng xác định đồ thị hàm số
biểu diễn bề mặt của dãy núi, không chần chừ,
anh ta bắt tay vào thực hiện.
• Bằng một cách nào đó anh ấy đã tìm ra được hình
dạng của dãy núi (như hình bên)

Gradient Descent
• Bằng một phép nội suy, anh ta suy ra được phương trình của đồ thị hàm số
• Đạo hàm của nó:
Giải phương trình đạo hàm bằng 0 và tìm dc

thung lũng gần đó

Gradient Descent
• Trong thực tế ko ai làm như thế? Vì tìm ra

hàm biểu diễn cho đồ thị hay vấn đề thực
tế rất khó
• Thực tế, chúng ta sẽ tiến hành các bước sau
• Bước 1: Khảo sát vị trí đang đứng và tất cả những
vị trí xung quanh có thể nhìn thấy, sau đó xác định
vị trí nào hướng xuống dốc nhiều nhất.
• Bước 2: Di chuyển đến vị trí đã xác định ở bước 1.
• Bước 3: Lặp lại hai các bước trên đến khi nào tới
được thung lũng.
Hay nói cách khác là đi men theo mặt dốc, đây chính
là ý tưởng của thuật toán Gradient Descent.

Example
Giả sử chúng ta đang quan tâm đến một hàm số một biến
có đạo hàm mọi nơi. Nhắc lại vài điều đã quá quen thuộc:
• Điểm local minimum x∗ của hàm số là điểm có đạo hàm

f’(x∗) bằng 0. Hơn thế nữa, trong lân cận của nó, đạo
hàm của các điểm phía bên trái x∗ là không dương, đạo
hàm của các điểm phía bên phải x∗ là không âm.
• Đường tiếp tuyến với đồ thị hàm số đó tại 1 điểm bất

kỳ có hệ số góc chính bằng đạo hàm của hàm số tại
điểm đó.
Điểm màu xanh lục là điểm local minimum (cực tiểu),

và cũng là điểm làm cho hàm số đạt giá trị nhỏ
nhất
• local minimum để thay cho điểm cực tiểu
• global minimum để thay cho điểm mà tại đó hàm
số f(x) đạt giá trị nhỏ nhất

Example
Trong hình phía trên, các điểm bên trái của điểm local
minimum màu xanh lục có đạo hàm âm, các điểm bên phải có
đạo hàm dương.
Và đối với hàm số này, càng xa về phía trái của điểm local
minimum thì đạo hàm càng âm, càng xa về phía phải thì đạo
hàm càng dương.

nhất

Example
• Việc tìm global minimum của các hàm mất mát

trong các thuật toán học máy rất phức tạp, thậm
chí là bất khả thi. Thay vào đó, người ta th ường cố
gắng tìm các điểm local minimum, và ở một mức độ
nào đó, coi đó là nghiệm cần tìm của bài toán.
• Các điểm global minimum là nghiệm của phương
trình đạo hàm bằng 0. Tuy nhiên, trong hầu hết
các trường hợp, việc giải phương trình đạo hàm
bằng 0 là bất khả thi. Nguyên nhân có thể đến từ
sự phức tạp của dạng của đạo hàm, từ việc các
điểm dữ liệu có số chiều lớn, hoặc từ việc có quá
nhất
nhiều điểm dữ liệu.

Example
• Thay vì cố gắng tìm global minimum thì việc tìm

local minimum thông qua phương pháp xấp xỉ sẽ
khả thi hơn, đặc biệt là với sự giúp đỡ của máy tính
thông qua việc tính toán với nhiều vòng lập
• Hướng tiếp cận phổ biến nhất là xuất phát từ một
điểm mà chúng ta coi là gần với nghiệm của bài
toán, sau đó dùng một phép toán lặp để tiến dần
đến điểm cần tìm, tức đến khi đạo hàm gần với 0

nhất

Example
• Giả sử xt là điểm ta tìm được sau vòng lặp th ứ t. Ta

cần tìm một thuật toán để đưa xt về càng gần x∗
càng tốt.
• Nếu đạo hàm của hàm số tại xt : f’(xt)>0 thì xt nằm

về bên phải so với x∗ (và ngược lại). Để điểm tiếp
theo xt+1 gần với x∗ hơn, chúng ta cần di chuyển xt
về phía bên trái, tức về phía âm. Nói các khác, chúng
ta cần di chuyển ngược dấu với đạo hàm:
Điểm màu xanh lục là điểm local minimum (cực tiểu), xt+1=xt+
nhất trong đó Δ là một đại lượng ngược dấu với đạo hàm
f(xt).

Example
• xt càng xa x∗ về phía bên phải thì f’(xt) càng

lớn hơn 0 (và ngược lại). Vậy, lượng di chuyển
Δ, một cách trực quan nhất, là tỉ lệ thuận
với − f’(xt)
• Hai nhận xét phía trên cho chúng ta một cách
cập nhật đơn giản là (với η là learning rate –
tốc độ học)
xt+1=xt −ηf’(xt)
nhất

Example
• Dùng phương pháp gradient descent, tìm điểm cực tiểu của phương trình sau
f(x)=x2+5sin(x)
• Kiểm tra với python code

https://machinelearningcoban.com/2017/01/12/gradientdescent/

Gradient Descent in ANN

Python
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
print('load data from MNIST')
mnist = tf.keras.datasets.mnist
(x_train,y_train), (x_test,y_test) = mnist.load_data()
# get the digit 0 - 9
dig = np.array([1,3,5,7,9,11,13,15,17,19])
x = x_train[dig,:,:]
y = np.eye(10,10)
plt.subplot(121)
plt.imshow(x[0])
plt.subplot(122)
plt.imshow(x[1])
x = np.reshape(x,(-1,784))/255

Python
def sigmoid(x):
return 1./(1.+np.exp(-x))
W = np.random.uniform(-0.1,0.1,(10,784))
# matrix multiplication
o = sigmoid(np.matmul(x,W.transpose()))
print('output of first neuron with 10 digits ', o[:,0])
fig = plt.figure()
plt.bar([i for i, _ in enumerate(o)],o[:,0])
plt.show()

#training process
n = 0.05 # set learning rate
num_epoch = 10 # number of epochs for training
for epoch in range(num_epoch):
loss =np.power(o-y,2).mean()
#calculate update for all wegihts in matrix
dW =np.transpose((y-o)*o*(1-o))@x
#update
W=W+n*dW
print(loss)
print('output of the first neuron with 10 input digits ', o[:,0])
fig = plt.figure()
plt.bar([i for i, _ in enumerate(o)],o[:,0])
plt.show()

AI Lec-06

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

AI Lec-06

Uploaded by

Copyright:

Available Formats

AI Foundations and Applications

6. Artificial Neural Networks

HCMUTE AI Foundations and Applications 03/18/2024 2

• Neuronal networks can theoretically estimate any function, regardless of its

HCMUTE AI Foundations and Applications 03/18/2024 3

HCMUTE AI Foundations and Applications 03/18/2024 4

HCMUTE AI Foundations and Applications 03/18/2024 5

HCMUTE AI Foundations and Applications 03/18/2024 6

• As a FFNN model, the single-layer perceptron often

HCMUTE AI Foundations and Applications 03/18/2024 7

HCMUTE AI Foundations and Applications 03/18/2024 8

HCMUTE AI Foundations and Applications 03/18/2024 9

HCMUTE AI Foundations and Applications 03/18/2024 11

HCMUTE AI Foundations and Applications 03/18/2024 12

HCMUTE AI Foundations and Applications 03/18/2024 13

• Already know how to calculate estimated output

Mean squared error

HCMUTE AI Foundations and Applications 03/18/2024 14

• In other word, this is the way to minimize the

The loss of a network measure the cost

HCMUTE AI Foundations and Applications 03/18/2024 15

• Giả sử một người đang lạc ở trên đỉnh núi như

HCMUTE AI Foundations and Applications 03/18/2024 17

• Đạo hàm của nó:

Giải phương trình đạo hàm bằng 0 và tìm dc

HCMUTE AI Foundations and Applications 03/18/2024 18

• Trong thực tế ko ai làm như thế? Vì tìm ra

HCMUTE AI Foundations and Applications 03/18/2024 19

• Điểm local minimum x∗ của hàm số là điểm có đạo hàm

• Đường tiếp tuyến với đồ thị hàm số đó tại 1 điểm bất

Điểm màu xanh lục là điểm local minimum (cực tiểu),

HCMUTE AI Foundations and Applications 03/18/2024 20

Điểm màu xanh lục là điểm local minimum (cực tiểu),

HCMUTE AI Foundations and Applications 03/18/2024 21

• Việc tìm global minimum của các hàm mất mát

HCMUTE AI Foundations and Applications 03/18/2024 22

• Thay vì cố gắng tìm global minimum thì việc tìm

Điểm màu xanh lục là điểm local minimum (cực tiểu),

HCMUTE AI Foundations and Applications 03/18/2024 23

• Giả sử xt là điểm ta tìm được sau vòng lặp th ứ t. Ta

• Nếu đạo hàm của hàm số tại xt : f’(xt)>0 thì xt nằm

HCMUTE AI Foundations and Applications 03/18/2024 24

• xt càng xa x∗ về phía bên phải thì f’(xt) càng

HCMUTE AI Foundations and Applications 03/18/2024 25

• Kiểm tra với python code

HCMUTE AI Foundations and Applications 03/18/2024 26

HCMUTE AI Foundations and Applications 03/18/2024 27

HCMUTE AI Foundations and Applications 03/18/2024 28

HCMUTE AI Foundations and Applications 03/18/2024 29

HCMUTE AI Foundations and Applications 03/18/2024 30

You might also like