LLM Tutorial

Build a modern LLM.
From NumPy to agents.

A 30-chapter tutorial on how modern language models work, from NumPy primitives through pre-training, post-training, inference, and agents.

What you'll learn

NumPy implementations of every primitive: attention, transformers, RoPE, MoE routing, selective scan
The transformer end-to-end, plus alternative architectures (state-space models such as Mamba)
Pre-training: data, training infrastructure, distributed training, scaling laws
Post-training: SFT, RLHF, DPO, RLVR, Constitutional AI, LoRA, distillation
Inference: KV caches, FlashAttention, PagedAttention, quantization, speculative decoding
Agents: tool use, retrieval, reasoning, building harnesses and frameworks from scratch
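
To give a feel for the NumPy-first approach, here is a minimal sketch of one primitive from the list above: single-head causal scaled dot-product attention. The function names (softmax, causal_attention), the tensor shapes, and the random inputs are illustrative assumptions, not code taken from the tutorial's chapters.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row max before exponentiating for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def causal_attention(q, k, v):
    """Single-head attention; q, k, v each have shape (seq_len, d_head)."""
    seq_len, d_head = q.shape
    # Similarity of every query with every key, scaled by sqrt(d_head).
    scores = q @ k.T / np.sqrt(d_head)
    # Causal mask: position i may not attend to positions j > i.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    weights = softmax(scores, axis=-1)
    # Each output row is a weighted average of the value rows.
    return weights @ v, weights


# Hypothetical toy inputs: 4 tokens, head dimension 8.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(4, 8))
v = rng.normal(size=(4, 8))
out, weights = causal_attention(q, k, v)
```

Each row of weights sums to 1 and is zero above the diagonal, so the first token can only attend to itself and its output equals its own value vector.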