How to think about NVIDIA without becoming a fanboy.
Five days in, you understand the moat. Today is about the time horizon. Where is NVIDIA going? What's the bull case, the bear case, the boring case? And — given all of that — what's the rational AI compute strategy for your situation, from solo founder to mid-stage company? We'll end with a one-page framework you can actually use.
The roadmap NVIDIA has actually shown
NVIDIA publishes their architecture roadmap years in advance. As of mid-2026:
Blackwell (B100, B200, GB200, GB10)
The current generation. Native FP4 tensor cores. NVLink 5 fabric. NVL72 datacenter pods (72 GPUs that look like one big GPU to PyTorch). Your DGX Spark is the desk-friendly version — the GB10 is a Blackwell GPU married to a Grace ARM CPU on a single package.
Blackwell Ultra (B300)
Mid-cycle bump. Higher memory (288 GB HBM3e), modest compute lift. Mostly a margin and TAM expansion. Datacenter only — no consumer/desk variant.
Rubin · the next generation
Successor architecture, ships late 2026 / early 2027. HBM4. New tensor core formats. NVLink 6 with higher bandwidth. The first chip designed end-to-end in the post-ChatGPT era — explicitly optimized for transformer workloads at unprecedented scale.
Rubin Ultra · Feynman
Already on slides. Detail thin. The pattern is clear: a major architecture every 2 years, a mid-cycle Ultra in between. Each one ~2× the perf of the prior at roughly the same power envelope.
NVIDIA's three secret weapons that aren't chips
1 · The fabric — NVLink & NVSwitch
A single GPU is one thing. A cluster of GPUs is a different thing entirely. NVIDIA's fabric — NVLink between GPUs, NVSwitch tying a whole rack together, InfiniBand and Spectrum-X Ethernet between racks — lets thousands of GPUs work as one giant GPU. NVL72 puts 72 Blackwells in a single rack with shared memory. Nobody else has anything close. AMD has Infinity Fabric; it scales to single nodes, not racks.
Your DGX Spark uses NVLink-C2C to fuse its Grace CPU and Blackwell GPU into one memory space, plus a 200 Gbps ConnectX-7 link designed to pair two Sparks into one logical machine with 256 GB of unified memory. The same fabric philosophy, scaled down to a desk.
2 · The full-stack play
NVIDIA has stopped selling just chips. They sell "AI factories" — DGX systems, full software stacks, networking, cloud (DGX Cloud), enterprise software (NIM, NeMo), reference designs. The Apple-of-AI strategy: control every layer.
Your DGX Spark is the smallest model in this lineup. The same software you'll use this week scales — without changing — to a $400M datacenter. That portability is itself a moat.
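To make the portability claim concrete, here is a minimal sketch (my own illustration, not NVIDIA sample code): one PyTorch training loop that runs as-is on a single Spark GPU and, launched with torchrun, across however many GPUs the fabric exposes. The model, batch, and loss are placeholders.

```python
"""One training loop, any number of GPUs: run with `python train.py` on a
single DGX Spark, or `torchrun --nproc-per-node=8 train.py` on a bigger node."""
import os
import torch
import torch.nn as nn

def main():
    # torchrun sets LOCAL_RANK / WORLD_SIZE; on a lone Spark they default to 0 / 1.
    rank = int(os.environ.get("LOCAL_RANK", 0))
    world = int(os.environ.get("WORLD_SIZE", 1))
    device = torch.device(f"cuda:{rank}" if torch.cuda.is_available() else "cpu")

    model = nn.Linear(1024, 1024).to(device)          # stand-in for a real model
    if world > 1:
        torch.distributed.init_process_group("nccl")  # multi-GPU: wrap in DDP, nothing else changes
        model = nn.parallel.DistributedDataParallel(model, device_ids=[rank])

    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    batch = torch.randn(32, 1024, device=device)      # stand-in batch
    for _ in range(10):
        loss = model(batch).pow(2).mean()             # stand-in loss
        opt.zero_grad()
        loss.backward()
        opt.step()

if __name__ == "__main__":
    main()
```

The script never mentions how many GPUs exist; that indifference is the portability being claimed.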
3 · The CUDA developer base
By NVIDIA's own count, ~5M registered CUDA developers as of 2026. Every CS graduate in the last 10 years took a CUDA class. Every ML PhD wrote their dissertation on it. Apple has maybe 200k MLX developers; AMD has fewer still writing ROCm in earnest. Software platforms are won by mindshare, not features, and NVIDIA's mindshare lead is 25:1.
Where NVIDIA could actually lose
A clear-eyed bull is one who can name the bear case. Here are the three real ones, ranked by probability.
Inference ASICs eat the bottom 80% of inference
Big Tech designs custom chips for their own inference fleets (OpenAI/Broadcom, Meta MTIA, Anthropic + AWS Trainium, Google TPU). Inference becomes a commodity. NVIDIA loses ~30% of total addressable revenue but keeps training and the cutting edge. Margins compress, growth slows.
An AMD-led open standard catches on
If ROCm reaches genuine PyTorch parity and an open NVLink-equivalent standard emerges (UALink, in progress), enterprises start dual-sourcing. NVIDIA loses pricing power but keeps share. The current "70% gross margin" world ends.
A regulatory hammer
Antitrust action forces NVIDIA to open CUDA, license to competitors, or divest a part of the stack. Possible in the EU; less likely in the US. It would reshape the industry but probably leave NVIDIA dominant through the transition window.
None of these are imminent. None of them eliminate NVIDIA — they just turn it from "monopoly" into "dominant incumbent with high margins." If you're betting on a 5-year time horizon, NVIDIA staying dominant is the base case. The bear case is "still very profitable, just less unfairly so."
The sovereign AI moment
A trend specific to 2025–26 worth understanding: countries are buying NVIDIA fleets at the national level. UAE, Saudi Arabia, Singapore, the EU, Japan, India — all standing up "sovereign AI" infrastructure with NVIDIA as the default vendor. The UK government bought an NVIDIA supercomputer. France did. Germany did. This isn't an Anthropic-vs-OpenAI fight; it's a Nation-vs-Nation fight, and NVIDIA is the supplier.
For you, this matters two ways:
- NVIDIA's customer base is now governments, not just hyperscalers. Demand is structurally less elastic.
- The same drivers — privacy, geopolitical risk, data sovereignty — that make a country want its own datacenters apply, in miniature, to your company wanting its own DGX Spark.
Your strategy — what to actually do
Strategy advice has to be specific to who you are. Pick the card that matches your situation.
One DGX Spark · iterate cheap
You're building an MVP. You need fast iteration without metered API anxiety. One Spark covers everything until you have ~5,000 paying users.
Buy one. Use it for everything. Re-evaluate when you stop fitting on it.
Sparks for the team · cloud for spikes
Each engineer gets a Spark for development and small fine-tunes. Production hosting on cloud H100s for elastic scaling.
Buy one Spark per engineer. Use Lambda Labs / Together / Fireworks for production until break-even.
Hybrid · own training, rent serving
Buy enough on-prem GPUs for daily training and experimentation. Rent inference capacity for bursts. Avoid a giant capex commitment until usage is provably durable.
2× DGX Station OR 4× DGX Spark, plus reserved cloud capacity.
One DGX as the "AI sandbox"
You don't need to ship product on it. You need a place where your team can experiment, prototype, and prove out workflows without IT/legal blocking every test on the cloud.
Buy one Spark, give it to a curious operator or junior eng, set a 6-month learning budget.
Pilot before the big buy
Vendor pitches are 90% air. Run a 90-day Spark pilot on one real workload before committing to a $5M H100 cluster.
Buy 2 Sparks. Pick one workload. Measure. Decide.
Set your team's compute principles
Don't go shopping yet. Decide the policies first: privacy tier, model autonomy, vendor lock-in tolerance. Then buy.
Write a one-page "AI compute principles" doc. Then go shopping.
The framework: should you bet bigger on NVIDIA?
Three questions, in order. If you answer "yes" to all three, lean further in. If you answer "no" to any, slow down.
- Is the workload CUDA-only? If yes, NVIDIA isn't a choice, it's a precondition. If no, you have options.
- Is the data sensitive enough that local inference / private hosting matters? If yes, owning hardware beats renting. If no, cloud is fine.
- Is the volume high enough that owning beats renting? Plug into Day 6 of last week's calculator (or any equivalent). Above ~$1,500/month in inference spend, ownership wins; below that, cloud is more efficient. (The three questions are sketched in code below.)
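If you like your frameworks executable, here is a back-of-envelope sketch of the three questions in Python. The ~$1,500/month ownership threshold comes from the text above; the function name, parameters, and example numbers are illustrative, not a real calculator.

```python
# Back-of-envelope sketch of the three-question framework. The ~$1,500/month
# ownership threshold is from the text above; names and example numbers are mine.

def nvidia_bet(cuda_only: bool, data_sensitive: bool, monthly_inference_spend: float,
               ownership_threshold: float = 1_500.0) -> str:
    """Answer the three questions in order; any 'no' means slow down."""
    answers = {
        "Is the workload CUDA-only?": cuda_only,
        "Is the data sensitive enough that local/private hosting matters?": data_sensitive,
        "Is volume high enough that owning beats renting?":
            monthly_inference_spend >= ownership_threshold,
    }
    for question, yes in answers.items():
        if not yes:
            return f"Slow down: '{question}' came back no."
    return "All three yes: lean further into NVIDIA hardware."

# Example: a CUDA-bound, privacy-sensitive product spending $2,200/month on inference.
print(nvidia_bet(cuda_only=True, data_sensitive=True, monthly_inference_spend=2_200))
```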
For most businesses considering AI today, the right portfolio is "buy 1–4 DGX-class boxes for development & small fine-tunes; rent everything else." Almost nobody should buy a giant H100 cluster. Almost everyone should own at least one Spark.
The 5-year scenario
A reasonable forecast for May 2031 (your DGX Spark's likely effective end-of-life):
- NVIDIA still owns training and the frontier. Market cap higher than today, margin lower.
- Inference at huge scale has migrated 30-50% to custom ASICs (Anthropic's, OpenAI's, Meta's, Google's). NVIDIA still makes more inference money than today, just a smaller share of a much bigger pie.
- AMD has reached genuine PyTorch parity for top-30 workloads. Used as negotiating leverage at hyperscalers; rarely a sole vendor.
- Apple Silicon is the default for on-device AI. MLX has caught up to PyTorch for inference. Training still mostly NVIDIA.
- Your $4,700 Spark from 2026 is roughly equivalent to $47,000 in 2031 cloud rental — still useful, less impressive.
The asset gets less special over time. The skill of having mastered it does not.
Today's reflection
Pick the strategy card above that fits you. Write a one-paragraph answer to: "What is the role of NVIDIA hardware in our AI strategy for the next 18 months?" — specific enough that you'd say it on a board call. An opinion you'd defend, not a generic "we're going AI-first."