
Deep Learning

Classification in Depth – Cross-Entropy & Softmax

Fashion-MNIST is a dataset created by Zalando Research as a drop-in replacement for MNIST. It consists of 70,000 grayscale images (28×28 pixels) categorized into 10 different classes of clothing, such as shirts, sneakers, and coats. Your mission? Train a model to classify these fashion items correctly!

Figure: Fashion-MNIST dataset visualization
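
As a concrete starting point, here is a minimal sketch of loading the dataset and computing a softmax cross-entropy loss for one batch, assuming PyTorch and torchvision are available (the post's own pipeline may differ; the untrained linear classifier here is just a placeholder model):

```python
# Minimal sketch (assumes PyTorch/torchvision): load Fashion-MNIST and compute
# a softmax cross-entropy loss for one batch with an untrained linear classifier.
import torch
from torch import nn
from torchvision import datasets, transforms

transform = transforms.ToTensor()                 # 28x28 pixels -> float tensor in [0, 1]
train_set = datasets.FashionMNIST(root="data", train=True,
                                  download=True, transform=transform)
loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # 10 clothing classes
criterion = nn.CrossEntropyLoss()                 # log-softmax + negative log-likelihood

images, labels = next(iter(loader))               # images: [64, 1, 28, 28], labels: [64]
loss = criterion(model(images), labels)
print(loss.item())                                # ~2.3 for random weights
```

With random weights the loss sits near \(\ln 10 \approx 2.3\), i.e. the model is guessing uniformly over the 10 classes; training should push it well below that.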

SGD, Momentum & Exploding Gradient

Gradient descent is a fundamental method for training a deep neural network. It aims to minimize the loss function \(\mathcal{L}\) by updating the model parameters in the direction that reduces the loss. By using only a mini-batch of the data at each step, we can cheaply estimate the direction of steepest descent. However, for large networks or more complicated problems, this algorithm alone may fail! Let's find out why this happens and how we can fix it.

Figure: Training failure – SGD can't classify the spiral pattern
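​
To make the update rule concrete: with momentum, the velocity accumulates gradients as \(v \leftarrow \beta v + \nabla_\theta \mathcal{L}\) and the parameters move as \(\theta \leftarrow \theta - \eta v\). Below is a minimal NumPy sketch of a single step; the hyperparameters and the stand-in gradient are illustrative rather than taken from the post, and the clipping line shows one common remedy for exploding gradients:

```python
# Minimal sketch of one SGD-with-momentum step (illustrative values only).
import numpy as np

def sgd_momentum_step(theta, velocity, grad, lr=0.01, beta=0.9):
    """One update: v <- beta * v + grad;  theta <- theta - lr * v."""
    velocity = beta * velocity + grad
    theta = theta - lr * velocity
    return theta, velocity

theta = np.zeros(3)                        # parameters
v = np.zeros(3)                            # momentum buffer
grad = np.array([0.5, -1.0, 2.0])          # stand-in for dL/dtheta on one mini-batch

# one common remedy for exploding gradients: clip the gradient before the update
grad = np.clip(grad, -1.0, 1.0)

theta, v = sgd_momentum_step(theta, v, grad)
print(theta)                               # [-0.005  0.01  -0.01]
```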

Mastering Neural Networks – Linear Layer and SGD

The human brain remains one of the greatest mysteries: it is the most complex object in the known universe. The processes underlying it, and consciousness itself, are still not understood. Artificial neural networks borrow their name from biology and help popularize deep learning, but we can't say for sure what mechanism in biological neural networks allows intelligence to arise.

Figure: Training result – visualized decision boundaries

Weight Initialization Methods in Neural Networks

Weight initialization is crucial in training neural networks, as it sets the starting point for the optimization algorithm. The activation function applies a non-linear transformation in our network, and different activation functions serve different purposes. Choosing the right combination of weight initialization and activation function is key to better performance: Xavier initialization is well suited to Sigmoid or Tanh in feedforward networks, while He initialization pairs well with ReLU for faster convergence, especially in CNNs. Matching them improves training efficiency and final model quality.

Figure: Comparison of different initialization methods
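
For reference, here is a minimal sketch of the two schemes compared above, assuming PyTorch (the layer sizes are arbitrary placeholders, not the post's architecture):

```python
# Minimal sketch (assumes PyTorch): Xavier init for a Tanh layer, He init for a ReLU layer.
from torch import nn

tanh_layer = nn.Linear(256, 128)
relu_layer = nn.Linear(256, 128)

# Xavier (Glorot): weight variance ~ 2 / (fan_in + fan_out), keeps signals
# well-scaled through Sigmoid/Tanh layers.
nn.init.xavier_uniform_(tanh_layer.weight, gain=nn.init.calculate_gain("tanh"))
nn.init.zeros_(tanh_layer.bias)

# He (Kaiming): weight variance ~ 2 / fan_in, compensates for ReLU zeroing
# roughly half of the activations.
nn.init.kaiming_normal_(relu_layer.weight, nonlinearity="relu")
nn.init.zeros_(relu_layer.bias)

print(tanh_layer.weight.std().item(), relu_layer.weight.std().item())
```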