SGD, Momentum & Exploding Gradient
Gradient descent is a fundamental method for training deep neural networks. It aims to minimize a loss function \(\mathcal{L}\) by repeatedly updating the model parameters \(\theta\) in the direction that reduces the loss:

\[\theta_{t+1} = \theta_t - \eta \, \nabla_\theta \mathcal{L}(\theta_t),\]

where \(\eta\) is the learning rate. In stochastic gradient descent (SGD), the gradient is estimated from only a mini-batch of the data rather than the full dataset, giving a cheap but noisy approximation of the direction of steepest descent. However, for large networks or more complicated problems, this algorithm may fail to converge to a good solution. Let's find out why this happens and how we can fix it.
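To make the update rule concrete, here is a minimal sketch of gradient descent on a toy quadratic loss. The function and variable names (`grad_loss`, `theta`, `lr`) are illustrative, not from this article, and the gradient is computed exactly; in real SGD it would be estimated from a mini-batch.

```python
import numpy as np

def grad_loss(theta):
    # Gradient of the toy loss L(theta) = 0.5 * ||theta||^2 is simply theta.
    return theta

theta = np.array([2.0, -3.0])  # initial parameters
lr = 0.1                       # learning rate (step size eta)

for step in range(100):
    g = grad_loss(theta)       # in SGD, g would come from a mini-batch
    theta -= lr * g            # step in the direction of steepest descent

print(theta)  # converges close to the minimizer [0, 0]
```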
[Figure: Training failure, SGD can't classify the spiral pattern]