
The Challenge of Vanishing/Exploding Gradients in Deep Neural Networks 🌋

==========================

📎 As is well known: the vanishing gradient problem tends to appear when using Tanh or Sigmoid activation functions, while the exploding gradient problem tends to appear when using the ReLU family of activation functions.

💡 What if we aggregate two activation functions? Could that balance the vanishing and exploding problems? 🤔

__________________________________________________________

For example, aggregating Tanh & Leaky ReLU. 👇

📌 AD2(z) = Tanh(z) + 0.5*Leaky_Relu(z) <-- blue line in image
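
A minimal sketch of this aggregated activation in NumPy (the names `ad2` and `leaky_relu` are my own labels; the notebook's actual implementation may differ):

```python
import numpy as np

def leaky_relu(z, alpha=0.01):
    # Standard Leaky ReLU: identity for z >= 0, small slope alpha for z < 0
    return np.where(z >= 0, z, alpha * z)

def ad2(z, alpha=0.01):
    # Aggregated activation from the post: Tanh(z) + 0.5 * Leaky_ReLU(z)
    return np.tanh(z) + 0.5 * leaky_relu(z, alpha)
```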

__________________________________________________________

Notably, learning speed & accuracy do not decrease when using the aggregation compared to using Leaky ReLU alone. 📊

(Please check the notebook at this Colab link --> https://lnkd.in/dkhkfWJA.)

__________________________________________________________

The plot shows the difference between Leaky_ReLU alone (green) and the aggregation of the two functions described in this post (blue). 📏

It seems to reduce exploding gradients because it shrinks the activation values, doesn't it? 🤔
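
A quick check of that intuition, assuming the `ad2` / `leaky_relu` sketch above: for large positive z, the Tanh term saturates near 1, so AD2 grows with slope ≈ 0.5 instead of 1, roughly halving both the forward activations and the gradient passed back through the unit.

```python
z = np.array([1.0, 5.0, 10.0, 50.0])
print(leaky_relu(z))  # [ 1.  5. 10. 50.]
print(ad2(z))         # ~[1.26, 3.50, 6.00, 26.00]  (Tanh term saturates at 1)
```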
