Nvidia · Pricing

The H100's Price Has Almost Nothing to Do With the Chip

An H100 costs an estimated $3,320 to build and has sold for as much as $40,000. The reflex is to call that a hardware monopoly. It isn't. The premium is rent on seventeen years of software nobody wants to rewrite.

Take an H100 apart and the bill of materials is almost disappointing. Roughly $300 for the logic die. About $1,350 for the stacks of high-bandwidth memory. Around $750 to glue it all together with advanced packaging, and another $920 to test and assemble. Add it up and one of the most fought-over objects on earth costs an estimated $3,320 to build.[5] That same card moved through the market at $25,000 to $40,000 depending on form factor and the month you asked.[6] The gap between those two numbers is the whole story, and almost nobody names what the gap is actually paying for.
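
The arithmetic above can be checked in a few lines. This is a minimal sketch that uses only the analyst-model component estimates and street prices quoted in the text; the variable and function names are illustrative, and none of the figures are Nvidia disclosures.

```python
# Component estimates from the analyst model cited in the text (not disclosed by Nvidia).
bom_usd = {
    "logic_die": 300,
    "hbm_memory": 1_350,
    "advanced_packaging": 750,
    "test_and_assembly": 920,
}

# Estimated cost to build one card: the sum of the four components.
build_cost = sum(bom_usd.values())

# Street-price range quoted in the text.
street_low, street_high = 25_000, 40_000


def markup_multiple(price: float, cost: float) -> float:
    """Return how many times the build cost a given price represents."""
    return price / cost


print(f"Estimated build cost: ${build_cost:,}")
for price in (street_low, street_high):
    multiple = markup_multiple(price, build_cost)
    gap = price - build_cost
    print(f"${price:,} street price is {multiple:.1f}x cost, a gap of ${gap:,}")
```

At the top of the range the multiple comes out at roughly twelve times the estimated build cost, which is the ratio the rest of the piece sets out to explain.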

The official story is that Nvidia has a hardware monopoly: it makes the fastest AI chip, scarcity does the rest, and the price is just supply meeting demand. That story is half true and badly aimed. The H100 is fast: 80 billion transistors on a custom 4nm-class process, nearly 1,000 teraflops of FP16 compute, memory bandwidth measured in thousands of gigabytes a second.[10] But specs are catchable. The thing that isn't catchable is the seventeen years of software the buyer is quietly renting along with the silicon.

The chip is the cheap part of the chip

Here is the thesis a smart friend can repeat at dinner: Nvidia doesn't sell a GPU at a hardware price. It sells access to CUDA, and prices the GPU as the only door into it. CUDA is the programming layer: the libraries, the compilers, the cuDNN and cuBLAS routines that the dominant machine-learning framework, PyTorch, was built to lean on. Around 4 million developers already write against it, by Nvidia's own count.[9] When a lab buys an H100, the chip is not the product. The product is that everything they have already built runs the day it arrives. That's worth far more than $3,320, and Nvidia knows exactly how much more.

| | The silicon | The software |
|---|---|---|
| Roughly what it costs Nvidia | ~$3,320 to build a card | Built once, copied at near-zero cost |
| What a rival can match | Specs, transistor count, bandwidth | Years of ecosystem, 4M developers |
| Switching cost to the buyer | Buy a different card | 6–12 months of re-engineering, tens of millions |
| Where the price premium really sits | A few thousand dollars | Most of the $25,000–$40,000 |

What the buyer is actually paying for

This is why the popular '90% margin' line is a category error worth correcting. The ~88% figure that circulates is a per-chip margin implied by setting the estimated manufacturing cost against a $28,000 street price. It is an analyst model, not a disclosed number.[5] Nvidia's actual company-wide GAAP gross margin for FY2024 was 72.7%: $44.3 billion of gross profit on $60.9 billion of revenue, filed with the SEC.[1] Its data-center segment runs an estimated 74–75%.[8] Still extraordinary for a hardware company, but it is a software-economics margin wearing a hardware company's clothes. The chip carries the price the code commands.
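
The two margins measure different things, and a short calculation makes the distinction concrete. This is a sketch using the figures in the sources list: the $28,000 street price and $3,320 cost come from the analyst model, and the revenue and cost of revenue come from the FY2024 filing. The helper name `gross_margin` is illustrative.

```python
def gross_margin(revenue: float, cost: float) -> float:
    """Gross margin as a fraction of revenue: (revenue - cost) / revenue."""
    return (revenue - cost) / revenue


# Per-chip figure: analyst-model build cost against one assumed street price.
per_chip_margin = gross_margin(revenue=28_000, cost=3_320)

# Company-wide GAAP figure for FY2024, in billions of dollars, from the SEC filing.
company_margin = gross_margin(revenue=60.922, cost=16.621)

print(f"Per-chip (analyst model): {per_chip_margin:.1%}")
print(f"Company-wide GAAP FY2024: {company_margin:.1%}")
```

The per-chip number leaves out everything that is not manufacturing cost on that one card, while the GAAP number covers the full cost of revenue across every product line, so the two should not be quoted interchangeably.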

72.7%
Nvidia's actual FY2024 GAAP gross margin: not the 90% myth, but a software-grade margin earned by a company that ships physical silicon.[1]

Scarcity set the spike. Software set the floor.

It's tempting to credit the eye-watering 2023 prices to a plain chip shortage, and that's the second thing worth getting right. The bottleneck wasn't raw wafers. It was CoWoS, the advanced packaging step that mounts the logic die and its high-bandwidth memory stacks side by side on a silicon interposer, and that capacity was the explicit gating factor on H100 availability through 2024.[10] Scarcity at the packaging stage is what drove rental rates past $7 to $10 per GPU-hour at the 2023 peak.[7] But scarcity is the part that decays. By late 2025 those same cards rented for $2 to $4 an hour, and secondhand SXM5 boards that fetched $40,000 in late 2023 now move for as little as $6,000.[6][7] If the price were really just a hardware monopoly, that collapse would have been the end. It wasn't.

Because while the spot price fell, the segment margin didn't. Nvidia's data-center revenue went from a record $14.51 billion in the quarter ending October 2023[2] to a record $62.3 billion in the quarter ending January 2026[4], and margins stayed in the mid-70s. Scarcity is a candle; it burns down. The CUDA moat is the floor under the wax. When supply loosened, the premium didn't vanish, because the reason to pay it was never the shortage. It was the cost of leaving.

The switching-cost identity
Defensible premium ≈ cost to rebuild the running stack elsewhere, counted in lost time as well as dollars

A rival chip can be cheaper per teraflop and still lose, because the buyer's real expense is re-engineering. Porting an AI stack to AMD's ROCm can take 6–12 months, and hyperscalers peg the migration at tens of millions of dollars.[9] As long as that number stays larger than what the buyer would save by purchasing the alternative instead of H100s, Nvidia keeps the premium, and it gets to set the gap.
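
The comparison above can be turned into a rough break-even check. This is a sketch under loudly stated assumptions: the $30 million migration cost is a hypothetical point inside the "tens of millions" range, the $10,000 per-card price gap is a hypothetical number chosen for illustration, and the function names are invented here. It also ignores the 6–12 months of delay, which would only raise the bar further.

```python
import math


def breakeven_cards(migration_cost_usd: float, price_gap_per_card_usd: float) -> int:
    """Smallest fleet size at which per-card savings cover the one-time port."""
    if price_gap_per_card_usd <= 0:
        raise ValueError("price gap per card must be positive")
    return math.ceil(migration_cost_usd / price_gap_per_card_usd)


def switching_pays_off(cards_needed: int, migration_cost_usd: float,
                       price_gap_per_card_usd: float) -> bool:
    """True when total hardware savings exceed the one-time migration cost."""
    return cards_needed * price_gap_per_card_usd > migration_cost_usd


# Hypothetical inputs, not figures from the article's sources.
MIGRATION_COST = 30_000_000   # assumed one-time cost of porting the stack
PRICE_GAP = 10_000            # assumed saving per card on the rival chip

print(breakeven_cards(MIGRATION_COST, PRICE_GAP))
print(switching_pays_off(500, MIGRATION_COST, PRICE_GAP))       # a startup-sized order
print(switching_pays_off(100_000, MIGRATION_COST, PRICE_GAP))   # a hyperscaler-sized order
```

Under these assumed inputs the port only pays for itself for buyers of several thousand cards, which is one way to read why the premium holds for smaller customers while the largest ones look for exits.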

But isn't the moat already cracking?

The honest objection is that this can't last, and the evidence for it is real. AMD's ROCm reached roughly 85% CUDA parity for training workloads by 2025, PyTorch added native ROCm support, and the hyperscalers stopped waiting: Google, Amazon, Microsoft, and Meta have all built custom silicon for large slices of their own internal work, routing around Nvidia where the volume justifies the engineering.[8][9] Anyone who calls the CUDA moat 'insurmountable' is overselling it. It is contested, and it is eroding at the edges.[8]

But notice where the erosion happens: at the margins, among the handful of players large enough to amortize a multi-year, multi-million-dollar port across millions of their own chips. For everyone else — the startup, the lab, the enterprise that needs its model training next quarter — '85% parity' is not 'good enough.' It is the 15% that breaks at 2 a.m. on a deadline, plus the 6–12 months you don't have. The moat doesn't need to be insurmountable. It only needs to be more expensive to cross than the premium Nvidia charges to stay. So far, for almost everyone, it is.

Price the switching cost, not the part

When you can, build the position where the customer's real expense is the cost of leaving — and then price the thing they buy as the only door into it. Nvidia gives the software away to win developers, then charges for the silicon that runs it. The lesson isn't 'make the fastest chip'; competitors can do that. It's that ecosystems compound and specs don't. Two cautions: a switching-cost moat decays whenever a rival makes leaving cheaper (better tooling, automated porting, a framework that abstracts you away), so you defend it by lowering your own friction faster than they lower theirs. And a premium that looks like pure lock-in invites the biggest customers to fund their own way out — which is exactly what the hyperscalers are doing.

So the next time someone marvels that a $3,320 chip sells for $40,000, the marvel is aimed at the wrong object. Nvidia isn't charging twelve times cost for sand and solder. It's charging for the seventeen years of code that turns the sand into something a buyer can use the morning it arrives — and for the inconvenient truth that the cheapest H100 is usually the one you don't have to rewrite your company around. The silicon is what ships. The lock-in is what sells. And the price was never really about the chip.

Sources

Where this comes from — the filings, records, and reporting behind it.

  1. Primary · SEC filing · Documented
    Nvidia FY2024 (year ended Jan 28, 2024) total revenue was $60.922 billion, cost of revenue $16.621 billion, and gross profit $44.301 billion, implying a GAAP gross margin of approximately 72.7%.
  2. Primary · Company record · Documented
    Nvidia FY2024 Q3 earnings release (quarter ended Oct 29, 2023) reported record Data Center revenue of $14.51 billion, up 279% year-over-year, and guided Q4 non-GAAP gross margin of ~75.5%.
  3. Primary · SEC filing · Documented
    Nvidia's FY2023 10-K (year ended Jan 29, 2023) confirms the H100 started shipping in fiscal year 2023 and introduced the Hopper architecture with a Transformer Engine designed to accelerate AI transformer model training.
  4. Primary · Company record · Documented
    Nvidia Q4 FY2026 (quarter ended Jan 25, 2026) total GAAP revenue was $68.1 billion, up 73% year-over-year, and Data Center revenue reached a record $62.3 billion, up 75% year-over-year.
  5. Secondary · Attributed to source
    Estimated manufacturing cost of an NVIDIA H100 SXM5 is approximately $3,320: ~$300 for the TSMC 4N logic die (814mm²), ~$1,350 for HBM3 memory, ~$750 for CoWoS-S packaging, and ~$920 for test and assembly, implying a per-unit gross margin near 88% at a $28,000 street price. Note: this is an analyst-model estimate, not a disclosed Nvidia figure.
  6. Secondary · Widely reported
    H100 80GB GPU purchase prices ranged from ~$25,000 (PCIe) to ~$40,000 (SXM5) as of Q1 2026; secondary market SXM5 cards that sold for $40,000 in late 2023 now move for $6,000–$22,000; Nvidia does not publish a fixed retail price, and street prices move through OEM/channel partners.
  7. Secondary · Widely reported
    Early H100 rental prices exceeded $7–$10/GPU-hour in 2023, reflecting extreme scarcity and AI demand; by late 2025 the same H100 GPUs were available for $2–$4/hour across non-hyperscale marketplace providers, with hyperscaler on-demand rates in low single digits.
  8. Secondary · Attributed to source
    Nvidia's data center gross margin is approximately 74–75%, described as extraordinary for hardware, and defended primarily by the CUDA software lock-in premium; AMD's ROCm had achieved approximately 85% CUDA parity for training workloads as of 2025, making the moat contested but still material.
  9. Secondary · Attributed to source
    CUDA has approximately 4 million active developers as of 2026 (Nvidia's own estimate); PyTorch, the dominant ML framework, was developed with heavy CUDA optimization and has deep integration with Nvidia's cuDNN and cuBLAS libraries; porting an AI stack to AMD ROCm can require 6–12 months of engineering and hyperscalers estimate the migration cost at tens of millions of dollars.
  10. Secondary · Widely reported
    The H100 GH100 die contains 80 billion transistors on TSMC's 4N (4nm-class) process, delivering 989 TFLOPS FP16 and 3,350 GB/s HBM3 memory bandwidth in SXM5 form; the CoWoS advanced packaging bottleneck, not raw wafer supply, was the primary constraint on H100 availability through 2024.