DAY 06 · THE FUTURE + YOUR BET

How to think about NVIDIA without becoming a fanboy.

Five days in, you understand the moat. Today is about the time horizon. Where is NVIDIA going? What's the bull case, the bear case, the boring case? And — given all of that — what's the rational AI compute strategy for your situation, from solo founder to mid-stage company? We'll end with a one-page framework you can actually use.

The roadmap NVIDIA has actually shown

NVIDIA publishes their architecture roadmap years in advance. As of mid-2026:

2024 · SHIPPED

Blackwell (B100, B200, GB200, GB10)

The current generation. Native FP4 tensor cores. NVLink 5 fabric. NVL72 datacenter pods (72 GPUs that look like one big GPU to PyTorch). Your DGX Spark is the desk-friendly version — the GB10 is a Blackwell GPU married to a Grace ARM CPU on a single package.

2025–26 · SHIPPING

Blackwell Ultra (B300)

Mid-cycle bump. Higher memory (288 GB HBM3e), modest compute lift. Mostly a margin and TAM expansion. Datacenter only — no consumer/desk variant.

2026 · ANNOUNCED

Rubin · the next generation

Successor architecture, ships late 2026 / early 2027. HBM4. New tensor core formats. NVLink 6 with higher bandwidth. The first chip designed end-to-end in the post-ChatGPT era — explicitly optimized for transformer workloads at unprecedented scale.

2027–28 · ROADMAPPED

Rubin Ultra · Feynman

Already on slides. Details are thin. The pattern is clear: a major architecture every 2 years, a mid-cycle Ultra in between. Each one ~2× the perf of the prior at roughly the same power envelope.
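That cadence implies simple compounding. A back-of-envelope sketch, assuming the stated ~2× uplift per major generation actually holds (real uplifts vary by workload, precision, and memory bandwidth):

```python
def relative_perf(year: int, base_year: int = 2024) -> int:
    """Perf relative to Blackwell (2024 = 1x), doubling every 2 years
    per the roadmap's stated cadence. Illustrative only."""
    return 2 ** ((year - base_year) // 2)

for year in (2024, 2026, 2028, 2030):
    print(year, f"{relative_perf(year)}x")  # 1x, 2x, 4x, 8x
```

By this rule of thumb, whatever you buy today is roughly an eighth as capable as the 2030-era part — which is exactly why the strategy questions later in this piece matter more than the spec sheet.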

NVIDIA's three secret weapons that aren't chips

1 · The fabric — NVLink & NVSwitch

A single GPU is one thing. A cluster of GPUs is a different thing entirely. NVIDIA's fabric — NVLink between GPUs, NVSwitch to stitch whole racks of them together — lets thousands of GPUs work as one giant GPU. NVL72 puts 72 Blackwells in a single rack with shared memory. Nobody else has anything close. AMD has Infinity Fabric; it scales to single nodes, not racks.

Your DGX Spark uses NVLink-C2C inside the package to fuse its Grace CPU and Blackwell GPU into one memory space, and a 200 Gbps ConnectX-7 link to pair two Sparks into one logical machine with 256 GB of unified memory. The same fabric philosophy, scaled down to a desk.

2 · The full-stack play

NVIDIA no longer sells just chips. They sell "AI factories" — DGX systems, full software stacks, networking, cloud (DGX Cloud), enterprise software (NIM, NeMo), reference designs. The Apple-of-AI strategy: control every layer.

Your DGX Spark is the smallest model in this lineup. The same software you'll use this week scales — without changing — to a $400M datacenter. That portability is itself a moat.

3 · The CUDA developer base

By NVIDIA's own count, ~5M registered CUDA developers as of 2026. Every CS graduate in the last 10 years took a CUDA class. Every ML PhD wrote their dissertation on it. Apple has maybe 200k MLX developers. AMD has even fewer writing ROCm in earnest. Software platforms are won by mindshare, not features, and NVIDIA's mindshare lead is 25:1.

Where NVIDIA could actually lose

A clear-eyed bull is one who can name the bear case. Here are the three real ones, ranked by probability.

PROBABILITY · MEDIUM-HIGH

Inference ASICs eat the bottom 80% of inference

Big Tech designs custom chips for their own inference fleets (OpenAI/Broadcom, Meta MTIA, Anthropic + AWS Trainium, Google TPU). Inference becomes a commodity. NVIDIA loses ~30% of total addressable revenue but keeps training and the cutting edge. Margins compress, growth slows.

PROBABILITY · MEDIUM

An AMD-led open standard catches on

If ROCm reached genuine PyTorch parity and an open NVLink-equivalent standard emerged (UALink, in progress), enterprises would start dual-sourcing. NVIDIA would lose pricing power but keep share. The current "70% gross margin" world would end.

PROBABILITY · LOWER

A regulatory hammer

Antitrust action forces NVIDIA to open CUDA, license to competitors, or divest a part of the stack. Possible in the EU; less likely in the US. Would reshape the industry but probably leaves NVIDIA dominant for the transition window.

None of these are imminent. None of them eliminate NVIDIA — they just turn it from "monopoly" into "dominant incumbent with high margins." If you're betting on a 5-year time horizon, NVIDIA staying dominant is the base case. The bear case is "still very profitable, just less unfairly so."

The sovereign AI moment

A trend specific to 2025–26 worth understanding: countries are buying NVIDIA fleets at the national level. UAE, Saudi Arabia, Singapore, the EU, Japan, India — all standing up "sovereign AI" infrastructure with NVIDIA as the default vendor. The UK government bought an NVIDIA supercomputer. France did. Germany did. This isn't an Anthropic-vs-OpenAI fight; it's a Nation-vs-Nation fight, and NVIDIA is the supplier.

For you, this matters two ways: sovereign demand keeps NVIDIA supply tight and prices firm for years to come, and it further entrenches CUDA as the global default — the same ecosystem you're learning on.

Your strategy — what to actually do

Strategy advice has to be specific to who you are. Pick the card that matches your situation.

Solo founder · creator

One DGX Spark · iterate cheap

You're building an MVP. You need fast iteration without metered API anxiety. One Spark covers everything until you have ~5,000 paying users.

Buy one. Use it for everything. Re-evaluate when you stop fitting on it.

Pre-seed / seed startup

Sparks for the team · cloud for spikes

Each engineer gets a Spark for development and small fine-tunes. Production hosting on cloud H100s for elastic scaling.

Buy one Spark per engineer. Use Lambda Labs / Together / Fireworks for production until break-even.

Series A · scaling

Hybrid · own training, rent serving

Buy enough on-prem GPUs for daily training and experimentation. Rent inference at burst. Avoid a giant capex commitment until usage is provably durable.

2× DGX Station OR 4× DGX Spark, plus reserved cloud capacity.

Operator / non-tech CEO

One DGX as the "AI sandbox"

You don't need to ship product on it. You need a place where your team can experiment, prototype, and prove out workflows without IT/legal blocking every test on the cloud.

Buy one Spark, give it to a curious operator or junior eng, set a 6-month learning budget.

Enterprise IT lead

Pilot before the big buy

Vendor pitches are 90% air. Run a 90-day Spark pilot on one real workload before committing to a $5M H100 cluster.

Buy 2 Sparks. Pick one workload. Measure. Decide.

Public company exec

Set your team's compute principles

Don't go shopping yet. Decide the policies first: privacy tier, model autonomy, vendor lock-in tolerance. Then buy.

Write a one-page "AI compute principles" doc. Then go shopping.

The framework: should you bet bigger on NVIDIA?

Three questions, in order. If you answer "yes" to all three, lean further in. If you answer "no" to any, slow down.

  1. Is the workload a CUDA-only one? If yes, NVIDIA isn't a choice, it's a precondition. If no, you have options.
  2. Is the data sensitive enough that local inference / private hosting matters? If yes, owning hardware beats renting. If no, cloud is fine.
  3. Is the volume high enough that owning beats renting? Plug your numbers into the calculator from last week's Day 6 (or any equivalent). Above ~$1,500/month in inference spend, ownership wins. Below that, cloud is more efficient.
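The three questions can be sketched as a tiny decision helper. A minimal sketch — the $1,500/month threshold is this article's rule of thumb, and the ~$4,000 hardware figure is an assumed illustrative price, not a quote:

```python
def should_bet_bigger(cuda_only: bool,
                      data_sensitive: bool,
                      monthly_inference_spend: float,
                      break_even: float = 1500.0) -> bool:
    """Yes to all three questions -> lean further in; any no -> slow down."""
    volume_high = monthly_inference_spend >= break_even  # Q3
    return cuda_only and data_sensitive and volume_high  # Q1 and Q2 and Q3

def payback_months(hardware_cost: float, monthly_cloud_spend: float) -> float:
    """Months until owned hardware pays for itself vs. the cloud bill."""
    return hardware_cost / monthly_cloud_spend

# Example: CUDA-only workload, sensitive data, $2,000/month inference spend.
print(should_bet_bigger(True, True, 2000.0))   # True  -> lean further in
print(should_bet_bigger(True, True, 800.0))    # False -> slow down

# Assumed illustrative price of ~$4,000 for a Spark-class box: at
# $1,500/month of displaced cloud spend it pays back in under 3 months.
print(round(payback_months(4000, 1500), 1))    # 2.7
```

The ordering matters: question 1 is a gate (CUDA-only means NVIDIA is a precondition, not a choice), while questions 2 and 3 only refine how much hardware to own.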

The honest answer

For most businesses considering AI today, the right portfolio is "buy 1–4 DGX-class boxes for development & small fine-tunes; rent everything else." Almost nobody should buy a giant H100 cluster. Almost everyone should own at least one Spark.

The 5-year scenario

Think forward to May 2031 — your DGX Spark's likely effective end-of-life.

The asset gets less special over time. The skill of having mastered it does not.

Today's reflection

Pick the strategy card above that fits you. Write a one-paragraph answer to: "what is the role of NVIDIA hardware in our AI strategy for the next 18 months?" — specific enough that you'd say it on a board call. An opinion you'd defend, not a generic "we're going AI-first."