jaydeep raijada

Writing

Notes & experiments

Mostly on post-training, RL environments, and small-model experiments. Write-ups of what I learn while training things.

July 2, 2026
5D Parallelism #1: What needs to be stored in the GPU during LLM training?
The first post in a series building up to 5D parallelism. Before we split a model across many GPUs, let's figure out what actually fills a single GPU: parameters, gradients, optimizer states, and activations, the last of which usually take up the most memory.
June 23, 2026
PagedAttention, simply explained
Why naive KV cache allocation wastes GPU memory, and how PagedAttention fixes it with block-based paging.
April 27, 2026
Small monitors, large attackers: training a 1.5B sabotage detector with GRPO
10 reward designs, 2 training frameworks, 5 GPU runs. The one-line reward function that finally moved the needle, and why temperature matters more than reward shape.