Learn the AI stack without pretending it is obvious.

LocalsOnly.run now has two public learning tracks: a hands-on NVIDIA/DGX Spark course, and a field guide to the papers behind modern LLMs. One teaches the hardware stack. The other gives you the research taste to know why the stack matters.

TRACK 01 · NVIDIA COURSE

Week 3 is live.

Fine-Tuning Deep Dive is open now: LoRA, QLoRA, DPO, dataset craft, and a capstone where you train Llama 3.1 8B into a real estate analyst on your own DGX.

WK 03 LIVE · QLoRA · Unsloth

TRACK 02 · LLM PAPER FIELD GUIDE

Read with taste.

Digestible guides to 31 essential and bonus papers: attention, scaling laws, open weights, FlashAttention, RAG, instruction tuning, preference optimization, reasoning, agents, MoE, and interpretability.

31 guides · takeaways · build instincts

The 12-week program

~7 days each. ~1–2 hours per day. One hands-on project per week.

WK 01

What & Why of CUDA

Origin story · the moat · DGX vs Mac · pretenders · landscape · your bet · fine-tune capstone

LIVE

WK 02

Inference at Scale

vLLM · TensorRT-LLM · continuous batching · paged attention

LIVE

WK 03

Fine-Tuning Deep Dive

LoRA · QLoRA · DPO · SFT vs RLHF · evals that matter

LIVE

WK 04

Quantization & the FP4 Revolution

FP16 · BF16 · INT8 · FP4 · AWQ · GPTQ · Marlin kernels

SOON

WK 05

Local Agents & Tool Use

Function calling · MCP · tool routing · multi-step reasoning

SOON

WK 06

RAG Done Right

Embeddings · vector DBs · rerankers · evaluation · the failure modes

SOON

WK 07

Multi-Modal

Stable Diffusion · Whisper · F5-TTS · LLaVA · vision agents

SOON

WK 08

Real-Time AI

Streaming · voice agents · sub-second latency engineering

SOON

WK 09

Custom CUDA via Triton

Your first real GPU kernel — in Python, not C++

SOON

WK 10

Production Serving

NVIDIA Triton Inference Server · monitoring · scaling · cost

SOON

WK 11

Training Your Own Architecture

Beyond fine-tuning · novel models · the deep cut (optional)

SOON

WK 12

Capstone — Ship a Real Product

Twelve weeks compounded into one shipped thing

SOON

What this is

Learning local LLMs from scratch. Building what I learn as I learn it.

One track follows the NVIDIA stack hands-on. The other makes the essential LLM papers easier to read. The point is simple: understand the ideas, run the experiments, and turn the confusing parts into something useful.

Build track: 12-week NVIDIA course

Research track: 31 paper guides

Hardware: NVIDIA DGX Spark · $4,700

Cost to follow: $0 (read-along free)

Follow along, fork anything useful, and build from wherever you are.

Start Week 3 → · Review Week 2 · Read the Apple comparison · Open the paper guide