
Widely Used Activation Functions in Neurons

Moin Mostakim

Department of Computer Science and Engineering


Faculty of School of Data Science

October 2023



Contents

1 Sigmoid Activation Function

2 Hyperbolic Tangent (Tanh) Activation Function

3 Rectified Linear Unit (ReLU) Activation Function

4 Leaky Rectified Linear Unit (Leaky ReLU) Activation Function

5 Exponential Linear Unit (ELU) Activation Function

6 Swish Activation Function

7 Gated Linear Unit (GLU) Activation Function

8 Softmax Activation Function


Sigmoid Activation Function

Formula: σ(x) = 1 / (1 + e^(−x))
Range: (0, 1)
First-order Derivative: σ′(x) = σ(x) · (1 − σ(x))

[Figure: plot of σ(x) for x ∈ [−5, 5]]

Output:
• Shape: S-shaped curve.
• Use Cases: Binary classification, sigmoid neurons in the output layer.
• Benefits: Smooth gradient, suitable for converting network outputs to probabilities.
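
Not part of the original slides: a minimal NumPy sketch of the sigmoid and its derivative, to make the formulas above concrete. The function names and sample inputs are illustrative choices.

    import numpy as np

    def sigmoid(x):
        # sigma(x) = 1 / (1 + e^(-x))
        return 1.0 / (1.0 + np.exp(-x))

    def sigmoid_derivative(x):
        # sigma'(x) = sigma(x) * (1 - sigma(x))
        s = sigmoid(x)
        return s * (1.0 - s)

    x = np.array([-5.0, 0.0, 5.0])
    print(sigmoid(x))             # approx [0.0067, 0.5, 0.9933]
    print(sigmoid_derivative(x))  # peaks at 0.25 at x = 0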



Hyperbolic Tangent (Tanh) Activation Function

Formula: tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))
Range: (−1, 1)
First-order Derivative: tanh′(x) = 1 − tanh²(x)

[Figure: plot of tanh(x) for x ∈ [−2, 2]]

Output:
• Shape: S-shaped curve similar to sigmoid.
• Use Cases: Regression, classification.
• Benefits: Centered around zero, mitigates the vanishing gradient problem, and provides smooth gradients.
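
Not part of the original slides: a minimal NumPy sketch of tanh and its derivative, mirroring the formulas above. np.tanh is used instead of the raw exponential ratio because it is the numerically stable built-in.

    import numpy as np

    def tanh(x):
        # (e^x - e^(-x)) / (e^x + e^(-x)); np.tanh computes the same quantity stably
        return np.tanh(x)

    def tanh_derivative(x):
        # tanh'(x) = 1 - tanh(x)^2
        return 1.0 - np.tanh(x) ** 2

    x = np.array([-2.0, 0.0, 2.0])
    print(tanh(x))             # approx [-0.964, 0.0, 0.964]
    print(tanh_derivative(x))  # approx [0.071, 1.0, 0.071]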



Rectified Linear Unit (ReLU) Activation Function

Formula: ReLU(x) = max(0, x)


Range: [0, ∞)
First-order Derivative:
ReLU′(x) = 0 if x < 0, 1 if x ≥ 0

Output:
• Shape: Linear for positive values, zero for negatives.
• Use Cases: Hidden layers in most neural networks.
• Benefits: Efficient, mitigates vanishing gradient, induces sparsity.
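
Not part of the original slides: a minimal NumPy sketch of ReLU and its derivative. Following the slide's convention, the derivative at x = 0 is taken to be 1.

    import numpy as np

    def relu(x):
        # max(0, x), applied elementwise
        return np.maximum(0.0, x)

    def relu_derivative(x):
        # 0 for x < 0, 1 for x >= 0 (convention at x = 0 as on the slide)
        return np.where(x >= 0, 1.0, 0.0)

    x = np.array([-3.0, 0.0, 3.0])
    print(relu(x))             # [0. 0. 3.]
    print(relu_derivative(x))  # [0. 1. 1.]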



Leaky Rectified Linear Unit (Leaky ReLU) Activation
Function
Formula: LeakyReLU(x, α) = x if x ≥ 0, αx if x < 0
Range: (−∞, ∞)

First-order Derivative:
LeakyReLU′(x, α) = 1 if x ≥ 0, α if x < 0

Output:
• Shape: Linear for positive values, non-zero slope for negatives.
• Use Cases: Alternative to ReLU to prevent the "dying ReLU" problem.
• Benefits: Addresses the "dying ReLU" issue, retains sparsity.
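
Not part of the original slides: a minimal NumPy sketch of Leaky ReLU and its derivative. The slides leave α unspecified; α = 0.01 below is a commonly used default, chosen only for illustration.

    import numpy as np

    def leaky_relu(x, alpha=0.01):
        # x for x >= 0, alpha * x for x < 0
        return np.where(x >= 0, x, alpha * x)

    def leaky_relu_derivative(x, alpha=0.01):
        # 1 for x >= 0, alpha for x < 0
        return np.where(x >= 0, 1.0, alpha)

    x = np.array([-3.0, 0.0, 3.0])
    print(leaky_relu(x))             # [-0.03  0.    3.  ]
    print(leaky_relu_derivative(x))  # [0.01 1.   1.  ]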



Exponential Linear Unit (ELU) Activation Function

Formula: ELU(x, α) = x if x ≥ 0, α(e^x − 1) if x < 0
Range: (−α, ∞)

First-order Derivative:
ELU′(x, α) = 1 if x ≥ 0, αe^x if x < 0

Output:
• Shape: Linear for positive values, with a smooth exponential curve that saturates at −α for negative values.
• Use Cases: An alternative to ReLU with smoother gradients.
• Benefits: Smoother gradients, better training on negative values.
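
Not part of the original slides: a minimal NumPy sketch of ELU and its derivative. The slides leave α unspecified; α = 1.0 below is the common default, chosen only for illustration.

    import numpy as np

    def elu(x, alpha=1.0):
        # x for x >= 0, alpha * (e^x - 1) for x < 0
        return np.where(x >= 0, x, alpha * (np.exp(x) - 1.0))

    def elu_derivative(x, alpha=1.0):
        # 1 for x >= 0, alpha * e^x for x < 0
        return np.where(x >= 0, 1.0, alpha * np.exp(x))

    x = np.array([-3.0, 0.0, 3.0])
    print(elu(x))             # approx [-0.950, 0.0, 3.0]
    print(elu_derivative(x))  # approx [0.0498, 1.0, 1.0]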



Swish Activation Function

Formula: Swish(x) = x · σ(x)


Range: approximately [−0.28, ∞) (Swish has a global minimum of about −0.28 near x ≈ −1.28)
First-order Derivative: Swish′ (x) = Swish(x) + σ(x) · (1 − Swish(x))
Output:
• Shape: Smooth, non-monotonic curve.
• Use Cases: Considered in some architectures as an alternative to
ReLU.
• Benefits: Smoothness, performance improvements observed in
deep networks.
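
Not part of the original slides: a minimal NumPy sketch of Swish and the derivative identity quoted above, Swish′(x) = Swish(x) + σ(x) · (1 − Swish(x)).

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def swish(x):
        # Swish(x) = x * sigma(x)
        return x * sigmoid(x)

    def swish_derivative(x):
        # Swish'(x) = Swish(x) + sigma(x) * (1 - Swish(x))
        s = sigmoid(x)
        return swish(x) + s * (1.0 - swish(x))

    x = np.array([-2.0, 0.0, 2.0])
    print(swish(x))             # approx [-0.238, 0.0, 1.762]
    print(swish_derivative(x))  # approx [-0.091, 0.5, 1.091]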



Gated Linear Unit (GLU) Activation Function

Formula: GLU(x) = x · σ(g(x))


Range: (-∞, ∞)
First-order Derivative:
GLU′(x) = σ(g(x)) + x · g′(x) · σ(g(x)) · (1 − σ(g(x)))
Output:
• Shape: Complex, involving a sigmoid gate.
• Use Cases: Used in architectures like the Transformer and other
sequence-to-sequence models.
• Benefits: The sigmoid gate controls information flow, enabling models that capture dependencies in sequences better than standard RNNs.
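
Not part of the original slides: a minimal NumPy sketch of the gated form x · σ(g(x)). The gate function g is left unspecified on the slide, so the linear map below is purely hypothetical; in practice (e.g., gated convolutional or Transformer-style layers) g is a learned projection of the input.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def glu(x, g):
        # GLU(x) = x * sigma(g(x)); g is any callable standing in for a learned transform
        return x * sigmoid(g(x))

    # Hypothetical gate: g(x) = 2x + 1
    g = lambda x: 2.0 * x + 1.0
    x = np.array([-1.0, 0.0, 1.0])
    print(glu(x, g))  # approx [-0.269, 0.0, 0.953]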



Softmax Activation Function

Formula (for class i): Softmax(x)_i = e^(x_i) / Σ_j e^(x_j)

Range: (0, 1)
First-order Derivative:
∂Softmax(x)_i / ∂x_j = Softmax(x)_i · (δ_ij − Softmax(x)_j)
Output:
• Shape: Probability distribution over classes.
• Use Cases: Used in the output layer of multi-class classification
for probability distribution over classes.
• Benefits: Converts scores to class probabilities, essential for
classification tasks.
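
Not part of the original slides: a minimal NumPy sketch of softmax and its Jacobian, following the formulas above. Subtracting the maximum before exponentiating is a standard numerical-stability trick and does not change the result.

    import numpy as np

    def softmax(x):
        # e^(x_i) / sum_j e^(x_j), shifted by max(x) for numerical stability
        z = x - np.max(x)
        e = np.exp(z)
        return e / np.sum(e)

    def softmax_jacobian(x):
        # J[i, j] = softmax(x)_i * (delta_ij - softmax(x)_j)
        s = softmax(x)
        return np.diag(s) - np.outer(s, s)

    x = np.array([1.0, 2.0, 3.0])
    print(softmax(x))           # approx [0.090, 0.245, 0.665]
    print(softmax_jacobian(x))  # each row and column sums to 0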

