The whole point of the box on your desk.
A 7-day, A-to-Z field guide to what CUDA is, why it matters, and why a $4,700 DGX Spark is a different animal from a similarly-priced Mac Studio with the same 128 GB of unified memory. Six days of mental models with small hands-on demos, then a capstone where you fine-tune Llama 3.1 8B on a public US real estate dataset and ship an AI analyst running on your own hardware. By Sunday night you'll know, to a depth most operators never reach, what you actually bought.
This is Week 1 of a 12-week curriculum to make you a working master of NVIDIA's stack on the DGX Spark. Each week stacks on the last. By Week 12 you ship a real product on hardware you understand from the silicon up.
The 12-week roadmap
Each week runs seven days at ~1–2 hours/day. Today is the start.
What you'll be able to answer by Sunday night
- What is CUDA, in one sentence — and what it isn't.
- Why "an Apple Mac Studio with 128 GB unified memory" is a different machine for AI work than your DGX Spark with 128 GB unified memory, even though the spec sheet looks identical.
- Which open-source AI techniques (FlashAttention, vLLM, Unsloth, TensorRT-LLM…) are CUDA-only, and why that matters for the next decade.
- Why AMD, Intel, Apple, Google and a half-dozen Chinese chip companies have all failed to dent the moat — and the one scenario in which they could.
- How to actually fine-tune a real model on a real public dataset on your own hardware, and what it cost you in dollars and watts.
The seven days
Day 1: What CUDA Actually Is
The 17-year origin story. The three layers (language, runtime, libraries). Demystifying a word everyone uses and few define. Hands-on: peek at the stack on your DGX.
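That "peek at the stack" can be sketched ahead of time. A minimal, hedged version, assuming the standard NVIDIA tool names (`nvidia-smi` ships with the driver, `nvcc` with the CUDA toolkit; exact paths and versions will differ per DGX OS image):

```python
import shutil
import subprocess

# Peek at the CUDA stack: locate the standard NVIDIA tools and, if present,
# ask each for its version string. Degrades gracefully on a machine that
# lacks them, so you can tell a driver problem from a PATH problem.
results = {}
for tool in ("nvidia-smi", "nvcc"):
    results[tool] = shutil.which(tool)

for tool, path in results.items():
    if path:
        out = subprocess.run([path, "--version"], capture_output=True, text=True)
        first_line = out.stdout.splitlines()[0] if out.stdout else "(no version output)"
        print(f"{tool}: {first_line}")
    else:
        print(f"{tool}: not found on PATH")
```

On a healthy Spark both lines should print version strings; "not found on PATH" on either one is your first debugging clue for Day 1.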
Day 2: The Software Stack — The Real Moat
cuBLAS, cuDNN, CUTLASS, NCCL, Triton, FlashAttention, TensorRT, vLLM. Seventeen years of NVIDIA-paid PhDs that nobody else can replicate overnight.
Day 3: DGX Spark vs Mac Studio
Same 128 GB, very different machines. Side-by-side on bandwidth, compute, FP4, software ecosystem. The single best lens for understanding why CUDA is the answer.
Day 4: The Pretenders
AMD ROCm, Intel Gaudi, Google TPU, AWS Trainium, Huawei Ascend, Cerebras. Why each is interesting, where each falls short, and the one threat NVIDIA actually watches.
Day 5: What CUDA Lets You Actually Do
The use-case landscape — training, inference, custom kernels, scientific simulation, real-time graphics+AI. What's exclusive to the green stack and what isn't.
Day 6: The Future of CUDA & Your Bet
Blackwell to Rubin. NVL fabric. Sovereign AI. Where NVIDIA could lose. How to think about your own 3-year compute strategy without becoming a fanboy.
Day 7: Project — AI Real Estate Analyst
Hands-on capstone. Pull a public Realtor.com dataset, fine-tune Llama 3.1 8B on it with LoRA, deploy via Ollama. By dinner you have an AI that drafts listings and reasons about US housing.
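The core trick Day 7 leans on is worth seeing in miniature. LoRA freezes the pretrained weight matrix W and trains only two small matrices A and B whose low-rank product is added on top. A toy NumPy sketch of that idea (shapes here are illustrative, not Llama's; the real run uses a training library, not hand-rolled matrices):

```python
import numpy as np

# LoRA in miniature: instead of updating the full d x d weight W, train a
# down-projection A (r x d) and an up-projection B (d x r), with r << d,
# and add their scaled product to W. B starts at zero, so at initialization
# the adapted model behaves exactly like the base model.
d, r, alpha = 8, 2, 16
rng = np.random.default_rng(0)
W = rng.standard_normal((d, d))          # frozen pretrained weight
A = rng.standard_normal((r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, zero-init

delta = (alpha / r) * (B @ A)            # low-rank update
W_eff = W + delta
print(np.allclose(W, W_eff))             # True: zero update at init
```

The payoff: you train d*r*2 numbers per matrix instead of d*d, which is why an 8B-parameter model fine-tunes comfortably in 128 GB of unified memory.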
What you need before Day 1
Hardware
- NVIDIA DGX Spark — powered, on your network, you can SSH in.
- ~250 GB free disk for Day 7's dataset and checkpoints.
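A quick pre-flight check for that disk requirement, as a hedged one-liner (checking `/` is an assumption; point it at whichever volume will actually hold Day 7's dataset and checkpoints):

```python
import shutil

# How much space is free on the target volume? The ~250 GB figure covers
# Day 7's dataset and checkpoints; "/" is a placeholder path.
free_gb = shutil.disk_usage("/").free / 1e9
print(f"{free_gb:.0f} GB free; want ~250 GB before Day 7")
```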
Mindset
- You don't need to write CUDA C++. The point of this week is the map, not the kernel.
- If a command breaks, paste it into Claude Code. That's allowed. That's the point.
- End each day with a 60-second "what surprised me" voice memo. The most important detail is usually the one that surprised you.