Why do we need a non-linear activation function?

If we only allow linear activation functions in a neural network, the output is just a [linear transformation](https://en.wikipedia.org/wiki/Linear_map#Matrices) of the input, which is not enough to form a [universal function approximator](https://en.wikipedia.org/wiki/Universal_approximation_theorem). Such a network can be represented as a single matrix multiplication, so stacking layers adds no expressive power: a network with only linear activations is effectively one layer deep, no matter how many layers its architecture has. Since real-world problems are non-linear, non-linearity is needed in the activation function so that the network can produce a non-linear decision boundary from non-linear combinations of the weights and inputs.
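A minimal NumPy sketch of the collapse argument above (the weights, shapes, and seed here are arbitrary, chosen only for illustration): two stacked linear layers are exactly equal to one layer with the product matrix, while inserting a ReLU between them breaks that equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "layers" with purely linear activations: y = W2 @ (W1 @ x)
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))
x = rng.normal(size=(3,))

two_linear_layers = W2 @ (W1 @ x)

# ...collapse into a single linear map: y = (W2 @ W1) @ x
one_layer = (W2 @ W1) @ x
print(np.allclose(two_linear_layers, one_layer))  # True: the depth was illusory

# With a non-linearity (e.g. ReLU) between the layers, no single
# matrix reproduces the mapping in general:
relu = lambda z: np.maximum(z, 0)
nonlinear = W2 @ relu(W1 @ x)
print(np.allclose(nonlinear, one_layer))  # False in general
```

The first check holds for any choice of weights and input, which is why adding linear layers cannot increase expressive power; the second fails for generic weights, which is what lets deep networks with non-linear activations approximate non-linear functions.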
