Which of the following functions can be used as an activation function in the output layer if we wish to predict the probabilities of n classes (p1, p2, …, pn) such that the sum of all pi equals 1?
Softmax
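As a minimal sketch of why softmax fits this role, the function below (using only the standard library) maps arbitrary real-valued logits to positive values that sum to 1:

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)
print(sum(probs))  # the probabilities sum to 1
```

Note the larger the logit, the larger its probability, and the outputs always form a valid distribution regardless of the input values.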
Which of the following activation functions can't be used at the output layer to classify an image?
ReLU
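A quick illustration of why ReLU is unsuitable here: its outputs are unbounded above and are not normalized, so they cannot be read as class probabilities.

```python
def relu(x):
    # ReLU clips negative values to zero and passes positive values through.
    return max(0.0, x)

logits = [2.0, -1.0, 0.5]
outputs = [relu(x) for x in logits]
print(outputs)       # [2.0, 0.0, 0.5]
print(sum(outputs))  # 2.5, not 1 -- these are not a probability distribution
```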
What is the name of the function in the following statement “A perceptron adds up all the weighted
inputs it receives, and if it exceeds a certain value, it outputs a 1, otherwise it just outputs a 0”?
Heaviside function
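The quoted statement can be sketched directly in code; the threshold value below is an arbitrary choice for illustration:

```python
def heaviside(z, threshold=0.0):
    # Outputs 1 if the input exceeds the threshold, otherwise 0.
    return 1 if z > threshold else 0

def perceptron(inputs, weights, threshold=0.0):
    # Sum up all the weighted inputs, then apply the step function.
    z = sum(w * x for w, x in zip(weights, inputs))
    return heaviside(z, threshold)

# Weighted sum: 0.6*1.0 + 0.4*0.5 = 0.8, which exceeds 0.5, so output is 1.
print(perceptron([1.0, 0.5], [0.6, 0.4], threshold=0.5))
```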
The output of the activation function should be symmetric about zero so that the gradients do not shift in a particular direction.
Zero-Centered
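A small sketch of the difference: sigmoid outputs are always positive, so their mean sits above zero, while tanh is symmetric about zero. (The specific input values are arbitrary, chosen symmetric around 0 for illustration.)

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
sig_mean = sum(sigmoid(x) for x in xs) / len(xs)
tanh_mean = sum(math.tanh(x) for x in xs) / len(xs)

# Sigmoid outputs lie in (0, 1), so the mean is pushed above zero;
# tanh outputs lie in (-1, 1) and cancel out over symmetric inputs.
print(sig_mean)   # 0.5
print(tanh_mean)  # 0.0
```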
Activation functions are applied after every layer and need to be calculated millions of times in deep networks. Hence, they should be computationally inexpensive to calculate.
Computationally inexpensive