Activations
ReLU
PROs:
• Solves the vanishing gradient problem
• Computationally efficient
• Faster convergence
• Default choice for hidden layers
CONs:
• Neurons with negative input die ("dying ReLU")
• Sensitive to initialization
• Not differentiable at 0
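To make the CONs concrete, here is a minimal NumPy sketch (not from the original slides): negative inputs are zeroed, so those neurons also get zero gradient, and the gradient at exactly 0 is a convention rather than a true derivative.

```python
import numpy as np

def relu(x):
    # max(0, x): negatives are zeroed, positives pass through unchanged
    return np.maximum(0.0, x)

def relu_grad(x):
    # Gradient is 0 for x < 0 and 1 for x > 0. ReLU is not differentiable
    # at x = 0; returning 0 there is a common convention, not a derivative.
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))       # [0. 0. 0. 0.5 2.]
print(relu_grad(x))  # [0. 0. 0. 1. 1.] -- negative side is "dead"
```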
Tanh
PROs:
• Zero-centered range (better for optimization)
• Smooth gradient
CONs:
• Vanishing gradient problem
• Computationally expensive
• Not suitable for deeper networks
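A quick sketch (my addition, not from the slides) of both properties: tanh is zero-centered with outputs in (-1, 1), and its derivative 1 - tanh(x)^2 decays toward 0 for large |x|, which is exactly where the vanishing gradient comes from.

```python
import numpy as np

def tanh_act(x):
    # Zero-centered output in (-1, 1)
    return np.tanh(x)

def tanh_grad(x):
    # d/dx tanh(x) = 1 - tanh(x)^2: equals 1 at x = 0 and shrinks toward 0
    # as |x| grows -- saturated units pass almost no gradient back.
    return 1.0 - np.tanh(x) ** 2

print(tanh_act(np.array([0.0])))   # [0.]
print(tanh_grad(np.array([0.0])))  # [1.]
print(tanh_grad(np.array([3.0])))  # already below 0.01 -- near-vanished
```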
Sigmoid
PROs:
• Output suitable for binary classification
• Used for multi-label classification
• Smooth gradient
CONs:
• Vanishing gradient problem
• Computationally expensive
• Compresses large input ranges into a narrow output band
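A minimal sketch (my addition): the sigmoid gradient s(x)(1 - s(x)) peaks at only 0.25, so stacking many sigmoid layers multiplies gradients that are at most 0.25 each, which is one way to see the vanishing-gradient CON.

```python
import numpy as np

def sigmoid(x):
    # Maps any real input into (0, 1) -- usable as a binary-class probability
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # s'(x) = s(x) * (1 - s(x)); its maximum value is 0.25 at x = 0,
    # so gradients shrink by at least 4x per sigmoid layer.
    s = sigmoid(x)
    return s * (1.0 - s)

print(sigmoid(np.array([0.0])))       # [0.5]
print(sigmoid_grad(np.array([0.0])))  # [0.25] -- the maximum possible
```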
Softmax
Range: (0, 1)
PROs:
• Interpretable as the likelihood of a class
• Works well with categorical cross-entropy (CCE) loss
• Optimal for multi-class classification
CONs:
• Doesn't work for multi-label classification
• Vulnerable to imbalanced datasets
• Numerically unstable for large input values (overflow errors)
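The overflow CON is commonly handled by subtracting the maximum logit before exponentiating; softmax is shift-invariant, so the result is unchanged. A minimal sketch (my addition, not from the slides):

```python
import numpy as np

def softmax(z):
    # Subtracting z.max() before exp() avoids overflow for large logits
    # without changing the output (softmax is invariant to constant shifts).
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

# Naive exp(1000.0) overflows to inf; the shifted version stays finite.
print(softmax([1000.0, 1000.0]))  # [0.5 0.5]
print(softmax([1.0, 2.0, 3.0]))   # sums to 1, largest logit gets most mass
```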
GeLU
PROs:
• Smooth gradient
• Dynamic gating makes the network adaptable
• Used in SoTA transformer models (GPT, BERT, SAM)
CONs:
• More computationally expensive than ReLU
• Reduced interpretability
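For illustration (my addition), here is the widely used tanh approximation of GELU, the variant found in common BERT/GPT implementations; the extra tanh and cubic term are also why it costs more than a plain ReLU:

```python
import numpy as np

def gelu(x):
    # Tanh approximation of GELU:
    # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    # Smoothly gates x by (approximately) the probability that a standard
    # normal variable is below x, instead of ReLU's hard 0/1 cutoff.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi)
                                    * (x + 0.044715 * x ** 3)))

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(gelu(x))  # small negatives are slightly negative, not hard-zeroed
```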
Rules of thumb