Learn Everything You Need to Be an AI Researcher
Master the fundamentals and publish your own papers
Under active development
Mathematics Fundamentals
Essential math concepts for AI
PyTorch Fundamentals
Working with tensors and PyTorch basics
1. Creating Tensors
Building blocks of deep learning
2. Tensor Addition
Element-wise operations on tensors
3. Matrix Multiplication
The core operation in neural networks
4. Transposing Tensors
Flipping dimensions and axes
5. Reshaping Tensors
Changing tensor dimensions
6. Indexing and Slicing
Accessing and extracting tensor elements
7. Concatenating Tensors
Combining multiple tensors
8. Creating Special Tensors
Zeros, ones, identity matrices, and more
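To ground the lesson list above, here is a minimal PyTorch sketch touching each of the eight operations; the values and variable names are illustrative only.

```python
import torch

# 1. Creating tensors from Python data
a = torch.tensor([[1., 2.], [3., 4.]])

# 2. Element-wise addition (shapes must match or broadcast)
b = a + torch.ones(2, 2)

# 3. Matrix multiplication with the @ operator
c = a @ b

# 4. Transposing swaps the two dimensions
t = c.T

# 5. Reshaping reinterprets the same four elements as a 1x4 tensor
r = c.reshape(1, 4)

# 6. Indexing and slicing work like NumPy: first row, all columns
row = a[0, :]

# 7. Concatenating along an existing dimension (dim=0 stacks rows)
stacked = torch.cat([a, b], dim=0)

# 8. Special constructors: zeros, ones, and the identity matrix
z, o, eye = torch.zeros(2, 2), torch.ones(2, 2), torch.eye(2)
```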
Neuron From Scratch
Understanding the fundamental unit of neural networks
1. What Is a Neuron
The basic building block of neural networks
2. The Linear Step
Weighted sums and bias in neurons
3. The Activation Function
Introducing non-linearity to neurons
4. Building a Neuron in Python
Implementing a single neuron from scratch
5. Making a Prediction
How a neuron turns inputs into an output
6. The Concept of Loss
Measuring prediction error
7. The Concept of Learning
How neurons adjust their parameters
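A compact sketch of this whole module in plain Python, assuming a sigmoid activation and a squared-error loss; every name here is illustrative.

```python
import math

def neuron(x, w, b):
    # The linear step: weighted sum of inputs plus bias
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    # The activation: sigmoid squashes z into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

# Making a prediction for one input
x, w, b = [0.5, -1.0], [0.8, 0.2], 0.1
y_hat = neuron(x, w, b)

# Loss: squared error against a target
y = 1.0
loss = (y_hat - y) ** 2

# Learning: nudge each parameter against its gradient,
# using d(loss)/dz = 2 * (y_hat - y) * y_hat * (1 - y_hat)
lr = 0.1
grad_z = 2 * (y_hat - y) * y_hat * (1 - y_hat)
w = [wi - lr * grad_z * xi for wi, xi in zip(w, x)]
b = b - lr * grad_z
```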
Activation Functions
Understanding different activation functions
1. ReLU
Rectified Linear Unit - The most popular activation function
2. Sigmoid
The classic S-shaped activation function
3. Tanh
Hyperbolic tangent - Zero-centered activation
4. SiLU
Sigmoid Linear Unit - The Swish activation
5. SwiGLU
Swish-Gated Linear Unit - Advanced activation
6. Softmax
Multi-class classification activation function
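All six functions above can be sketched in a few lines of PyTorch. The SwiGLU weights `W` and `V` below are illustrative stand-ins for learned projection matrices.

```python
import torch
import torch.nn.functional as F

x = torch.randn(4)

relu = torch.relu(x)        # max(0, x)
sigmoid = torch.sigmoid(x)  # 1 / (1 + exp(-x)), outputs in (0, 1)
tanh = torch.tanh(x)        # zero-centered, outputs in (-1, 1)
silu = F.silu(x)            # x * sigmoid(x), a.k.a. Swish

# SwiGLU gates one linear projection with the SiLU of another.
W, V = torch.randn(4, 8), torch.randn(4, 8)
swiglu = F.silu(x @ W) * (x @ V)

# Softmax turns a vector of scores into a probability distribution
probs = torch.softmax(x, dim=-1)  # non-negative, sums to 1
```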
Neural Networks from Scratch
Build neural networks from the ground up
1. Architecture of a Network
Understanding neural network structure and design
2. Building a Layer
Constructing individual network layers
3. Implementing a Network
Putting together a complete neural network
4. The Chain Rule
Mathematical foundation of backpropagation
5. Calculating Gradients
Computing derivatives for network training
6. Backpropagation in Action
Understanding the backpropagation algorithm
7. Implementing Backpropagation
Coding the backpropagation algorithm from scratch
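Below is one minimal way to wire these lessons together: a two-layer network trained by hand-derived backpropagation in NumPy, assuming sigmoid hidden units and a squared-error loss. All shapes and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Architecture: 2 inputs -> 3 hidden units (sigmoid) -> 1 output
W1, b1 = rng.normal(size=(2, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

x = np.array([[0.5, -0.2]])  # one training example
y = np.array([[1.0]])

for step in range(100):
    # Forward pass through both layers
    h = sigmoid(x @ W1 + b1)
    y_hat = h @ W2 + b2
    loss = ((y_hat - y) ** 2).mean()

    # Backward pass: the chain rule, one layer at a time
    d_yhat = 2 * (y_hat - y)                # dL/dy_hat
    dW2, db2 = h.T @ d_yhat, d_yhat.sum(0)  # output-layer gradients
    d_h = d_yhat @ W2.T * h * (1 - h)       # back through the sigmoid
    dW1, db1 = x.T @ d_h, d_h.sum(0)        # hidden-layer gradients

    # Gradient descent update on every parameter
    for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        p -= 0.1 * g
```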
Attention Mechanism
Understanding attention and self-attention
1. What Is Attention
Understanding the attention mechanism
2. Self-Attention from Scratch
Building self-attention from the ground up
3. Calculating Attention Scores
Computing query-key similarities
4. Applying Attention Weights
Using attention scores to weight the value vectors
5. Multi-Head Attention
Running several attention heads in parallel
6. Attention in Code
Implementing attention mechanisms in Python
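A minimal single-head sketch of the mechanism described above, with illustrative shapes; `Wq`, `Wk`, and `Wv` stand in for learned projections.

```python
import torch

def self_attention(x, Wq, Wk, Wv):
    # Project the input into queries, keys, and values
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    # Attention scores: scaled dot products of queries and keys
    scores = Q @ K.transpose(-2, -1) / K.shape[-1] ** 0.5
    # Softmax turns scores into weights that sum to 1 per query
    weights = torch.softmax(scores, dim=-1)
    # Each output is a weighted average of the value vectors
    return weights @ V

# Illustrative shapes: 5 tokens, model width 16
x = torch.randn(5, 16)
Wq, Wk, Wv = (torch.randn(16, 16) for _ in range(3))
out = self_attention(x, Wq, Wk, Wv)  # shape (5, 16)
```

Multi-head attention runs several such heads in parallel on lower-dimensional subspaces and concatenates their outputs.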
Transformer Feedforward
Feedforward networks and Mixture of Experts
1. The Feedforward Layer
Understanding transformer feedforward networks
2. What Is Mixture of Experts
Introduction to the MoE architecture
3. The Expert
Understanding individual expert networks
4. The Gate
Routing and gating mechanisms in MoE
5. Combining Experts
Merging the outputs of multiple experts
6. MoE in a Transformer
Integrating mixture of experts into transformers
7. MoE in Code
Implementing mixture of experts in Python
8. The DeepSeek MLP
DeepSeek's advanced MLP architecture
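A toy sketch of this module's two halves: a standard feedforward expert, and a top-1 gate that routes each token to a single expert. This is a minimal illustration of the routing idea, not DeepSeek's exact design.

```python
import torch
import torch.nn as nn

class Expert(nn.Module):
    """One feedforward 'expert': a standard transformer MLP."""
    def __init__(self, d_model, d_ff):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model)
        )
    def forward(self, x):
        return self.net(x)

class MoE(nn.Module):
    """Top-1 mixture of experts: the gate picks one expert per token."""
    def __init__(self, d_model, d_ff, n_experts):
        super().__init__()
        self.experts = nn.ModuleList(
            Expert(d_model, d_ff) for _ in range(n_experts)
        )
        self.gate = nn.Linear(d_model, n_experts)
    def forward(self, x):  # x: (tokens, d_model)
        gate_probs = torch.softmax(self.gate(x), dim=-1)
        top_p, top_idx = gate_probs.max(dim=-1)  # chosen expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top_idx == i
            if mask.any():
                # Scale each expert's output by its gate probability
                out[mask] = top_p[mask, None] * expert(x[mask])
        return out

moe = MoE(d_model=16, d_ff=64, n_experts=4)
y = moe(torch.randn(10, 16))  # 10 tokens in, 10 tokens out
```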
Building a Transformer
Complete transformer implementation from scratch
1. Transformer Architecture
Understanding the complete transformer structure
2. RoPE Positional Encoding
Rotary position embeddings for transformers
3. Building a Transformer Block
Constructing individual transformer layers
4. The Final Linear Layer
Output projection and prediction head
5. Full Transformer in Code
Complete transformer implementation
6. Training a Transformer
Training process and optimization
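One way to sketch a block and the final linear layer using PyTorch built-ins; this pre-norm layout leans on `nn.MultiheadAttention` rather than from-scratch code, and omits RoPE and the causal mask for brevity.

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    """Pre-norm block: attention and feedforward, each with a residual."""
    def __init__(self, d_model=64, n_heads=4, d_ff=256):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model)
        )
    def forward(self, x):
        h = self.norm1(x)
        # Self-attention sublayer plus residual connection
        x = x + self.attn(h, h, h, need_weights=False)[0]
        # Feedforward sublayer plus residual connection
        return x + self.ff(self.norm2(x))

block = TransformerBlock()
tokens = torch.randn(1, 10, 64)        # (batch, sequence, width)
out = block(tokens)                    # same shape as the input

# The final linear layer projects model width onto the vocabulary
lm_head = nn.Linear(64, 1000)
logits = lm_head(out)                  # (1, 10, vocab_size)
```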
Large Language Models
Understanding LLM training and optimization
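As a preview of this module, a single next-token training step might look like the sketch below; the tiny `model` is a stand-in for a real transformer, and every hyperparameter is illustrative.

```python
import torch
import torch.nn.functional as F

# A stand-in for any network mapping token ids to vocabulary logits,
# e.g. the transformer assembled in the previous module.
model = torch.nn.Sequential(
    torch.nn.Embedding(1000, 64), torch.nn.Linear(64, 1000)
)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

tokens = torch.randint(0, 1000, (2, 33))  # toy batch of token ids

# Next-token prediction: inputs are shifted one step behind targets
inputs, targets = tokens[:, :-1], tokens[:, 1:]
logits = model(inputs)                    # (batch, seq, vocab)

# Cross-entropy over the vocabulary at every position
loss = F.cross_entropy(logits.reshape(-1, 1000), targets.reshape(-1))
loss.backward()
optimizer.step()
optimizer.zero_grad()
```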