Q: Why is the Nvidia H100 so expensive?

Not mainly because of the silicon. An H100 is estimated to cost around $3,320 to build, yet street prices ran from roughly $25,000 to $40,000. The premium pays for the CUDA software ecosystem: the libraries, the frameworks, and the roughly 4 million developers who already build on Nvidia. Competitors can match the raw specs but cannot easily replace that ecosystem.

Q: Can AMD or custom chips break Nvidia's pricing power?

At the margins, yes. AMD's ROCm reached roughly 85% CUDA parity for training by 2025, and hyperscalers like Google, Amazon, and Meta have built custom silicon for internal workloads. But porting a full AI stack can take 6–12 months and cost tens of millions, so the moat is contested and eroding, not broken.

The H100's Price Has Almost Nothing to Do With the Chip

Q: Does Nvidia really earn a 90% margin on the H100?

That figure is a category error. The ~88% number compares an estimated per-chip manufacturing cost to street price. Nvidia's actual company-wide GAAP gross margin for FY2024 was about 72.7%, and analysts estimate its data-center segment margin near 74–75% — extraordinary for hardware, but not 90%.

An H100 costs an estimated $3,320 to build and sold for as much as $40,000. The reflex is to call that a hardware monopoly. It isn't. The premium is rent on seventeen years of software nobody wants to rewrite.

Take an H100 apart and the bill of materials is almost disappointing. Roughly $300 for the logic die. About $1,350 for the stacks of high-bandwidth memory. Around $750 to glue it all together with advanced packaging, and another $920 to test and assemble. Add it up and one of the most fought-over objects on earth costs an estimated $3,320 to build.⁵ That same card moved through the market at $25,000 to $40,000 depending on form factor and the month you asked.⁶ The gap between those two numbers is the whole story — and almost nobody is paying for the thing in the gap.
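The arithmetic in that paragraph can be checked in a few lines. This is a minimal sketch that uses only the component estimates quoted above; the dictionary keys and the `markup_multiple` helper are illustrative names, not anything from Nvidia or the cited analysts.

```python
# Estimated H100 build cost in USD, using the component figures quoted above.
# These are analyst-model estimates, not numbers disclosed by Nvidia.
bom = {
    "logic_die": 300,
    "hbm_memory": 1_350,
    "advanced_packaging": 750,
    "test_and_assembly": 920,
}

build_cost = sum(bom.values())


def markup_multiple(street_price: float, cost: float) -> float:
    """Return how many times the estimated build cost a street price represents."""
    return street_price / cost


print(f"Estimated build cost: ${build_cost:,}")  # Estimated build cost: $3,320
for price in (25_000, 40_000):
    multiple = markup_multiple(price, build_cost)
    print(f"${price:,} street price is {multiple:.1f}x build cost")
# $25,000 street price is 7.5x build cost
# $40,000 street price is 12.0x build cost
```

At the two ends of the quoted price range, the card sells for roughly seven and a half to twelve times its estimated build cost.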

The official story is that Nvidia has a hardware monopoly: it makes the fastest AI chip, scarcity does the rest, and the price is just supply meeting demand. That story is half true and badly aimed. The H100 is fast — 80 billion transistors on a custom 4nm-class process, nearly 1,000 teraflops, memory bandwidth measured in thousands of gigabytes a second.¹⁰ But specs are catchable. The thing that isn't catchable is the seventeen years of software the buyer is quietly renting along with the silicon.

The chip is the cheap part of the chip

Here is the thesis a smart friend can repeat at dinner: Nvidia doesn't sell a GPU at a hardware price. It sells access to CUDA, and prices the GPU as the only door into it. CUDA is the programming layer — the libraries, the compilers, the cuDNN and cuBLAS routines that the dominant machine-learning framework, PyTorch, was built to lean on. Around 4 million developers already write against it, by Nvidia's own count.⁹ When a lab buys an H100, the chip is not the product. The product is that everything they have already built runs the day it arrives. That's worth far more than $3,320, and Nvidia knows exactly how much more.

	The silicon	The software
Roughly what it costs Nvidia	~$3,320 to build a card	Built once, copied at near-zero cost
What a rival can match	Specs, transistor count, bandwidth	Years of ecosystem, 4M developers
Switching cost to the buyer	Buy a different card	6–12 months of re-engineering, tens of millions
Where the price premium really sits	A few thousand dollars	Most of the $25,000–$40,000

What the buyer is actually paying for

This is why the popular '90% margin' line is a category error worth correcting. The ~88% figure that circulates is a per-chip manufacturing-cost-to-street-price ratio — an analyst model, not a disclosed number.⁵ Nvidia's actual company-wide GAAP gross margin for FY2024 was 72.7%: $44.3 billion of gross profit on $60.9 billion of revenue, filed with the SEC.¹ Its data-center segment runs an estimated 74–75%.⁸ Still extraordinary for a hardware company — but it is a software-economics margin wearing a hardware company's clothes. The chip carries the price the code commands.
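The two margins being confused here come from the same formula applied to different inputs. The sketch below assumes the $28,000 street price used in source 5 for the per-chip figure and the revenue and cost-of-revenue lines from source 1 for the company-wide figure; `gross_margin` is an illustrative helper name.

```python
def gross_margin(revenue: float, cost: float) -> float:
    """Gross margin as a fraction of revenue."""
    return (revenue - cost) / revenue


# Per-chip analyst model: estimated build cost against one street price.
per_chip = gross_margin(revenue=28_000, cost=3_320)

# Company-wide GAAP figures for FY2024, in billions of USD (source 1).
company_wide = gross_margin(revenue=60.922, cost=16.621)

print(f"Per-chip model:      {per_chip:.1%}")      # Per-chip model:      88.1%
print(f"Company-wide FY2024: {company_wide:.1%}")  # Company-wide FY2024: 72.7%
```

The first number leaves out everything that is not manufacturing cost for a single card; the second is the margin Nvidia actually reported across the whole business.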

72.7%

Nvidia's actual FY2024 GAAP gross margin — not the 90% myth, but a software-grade margin earned by a company that ships physical silicon¹

Scarcity set the spike. Software set the floor.

It's tempting to credit the eye-watering 2023 prices to a plain chip shortage, and that's the second thing worth getting right. The bottleneck wasn't raw wafers. It was CoWoS, the advanced packaging step that mounts the logic die and its high-bandwidth memory stacks side by side on a silicon interposer, and that capacity was the explicit gating factor on H100 availability through 2024.¹⁰ Scarcity at the packaging stage is what drove rental rates past $7 to $10 per GPU-hour at the 2023 peak.⁷ But scarcity is the part that decays. By late 2025 those same cards rented for $2 to $4 an hour, and secondhand SXM5 boards that fetched $40,000 in late 2023 now move for as little as $6,000.⁶ ⁷ If the price were really just a hardware monopoly, that collapse would have been the end. It wasn't.

Because while the spot price fell, the segment margin didn't. Nvidia's data-center revenue went from a record $14.51 billion in the quarter ending October 2023² to a record $62.3 billion in the quarter ending January 2026⁴ — and margins stayed in the mid-70s. Scarcity is a candle; it burns down. The CUDA moat is the floor under the wax. When supply loosened, the premium didn't vanish, because the reason to pay it was never the shortage. It was the cost of leaving.

The switching-cost identity

Defensible premium ≈ value of the running stack − cost (in time and dollars) to rebuild it elsewhere

A rival chip can be cheaper per teraflop and still lose, because the buyer's real expense is re-engineering. Porting an AI stack to AMD's ROCm can take 6–12 months, and hyperscalers peg the migration at tens of millions of dollars.⁹ As long as that number stays larger than the price gap between an H100 and the alternative, Nvidia keeps the premium — and it gets to set the gap.
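One way to read the comparison in that paragraph is as a break-even fleet size: the number of cards a buyer must purchase before the savings on cheaper hardware cover the one-off cost of porting. The sketch below is illustrative only. The $20 million migration cost and the $10,000 per-card price gap are hypothetical placeholders chosen to fit "tens of millions", and the function names are made up for this example. It also ignores the 6–12 months of lost time, which would push the break-even higher.

```python
def breakeven_cards(migration_cost: float, price_gap_per_card: float) -> float:
    """Fleet size at which hardware savings equal the one-off porting cost."""
    return migration_cost / price_gap_per_card


def switching_pays_off(cards: int, migration_cost: float, price_gap_per_card: float) -> bool:
    """True when total savings on cheaper cards exceed the cost of the port."""
    return cards * price_gap_per_card > migration_cost


# Hypothetical inputs, not figures from the article's sources.
MIGRATION_COST = 20_000_000   # one-off engineering cost of the port, USD
PRICE_GAP = 10_000            # saving per card versus an H100, USD

print(breakeven_cards(MIGRATION_COST, PRICE_GAP))             # 2000.0
print(switching_pays_off(500, MIGRATION_COST, PRICE_GAP))     # False
print(switching_pays_off(50_000, MIGRATION_COST, PRICE_GAP))  # True
```

Under these assumed numbers a buyer of a few hundred cards never recovers the port, while a buyer of tens of thousands does.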

But isn't the moat already cracking?

The honest objection is that this can't last, and the evidence for it is real. AMD's ROCm reached roughly 85% CUDA parity for training workloads by 2025, PyTorch added native ROCm support, and the hyperscalers stopped waiting: Google, Amazon, Microsoft, and Meta have all built custom silicon for large slices of their own internal work, routing around Nvidia where the volume justifies the engineering.⁸ ⁹ Anyone who calls the CUDA moat 'insurmountable' is overselling it. It is contested, and it is eroding at the edges.⁸

But notice where the erosion happens: at the margins, among the handful of players large enough to amortize a multi-year, multi-million-dollar port across millions of their own chips. For everyone else — the startup, the lab, the enterprise that needs its model training next quarter — '85% parity' is not 'good enough.' It is the 15% that breaks at 2 a.m. on a deadline, plus the 6–12 months you don't have. The moat doesn't need to be insurmountable. It only needs to be more expensive to cross than the premium Nvidia charges to stay. So far, for almost everyone, it is.

Price the switching cost, not the part

When you can, build the position where the customer's real expense is the cost of leaving — and then price the thing they buy as the only door into it. Nvidia gives the software away to win developers, then charges for the silicon that runs it. The lesson isn't 'make the fastest chip'; competitors can do that. It's that ecosystems compound and specs don't. Two cautions: a switching-cost moat decays whenever a rival makes leaving cheaper (better tooling, automated porting, a framework that abstracts you away), so you defend it by lowering your own friction faster than they lower theirs. And a premium that looks like pure lock-in invites the biggest customers to fund their own way out — which is exactly what the hyperscalers are doing.

So the next time someone marvels that a $3,320 chip sells for $40,000, the marvel is aimed at the wrong object. Nvidia isn't charging twelve times cost for sand and solder. It's charging for the seventeen years of code that turns the sand into something a buyer can use the morning it arrives — and for the inconvenient truth that the cheapest H100 is usually the one you don't have to rewrite your company around. The silicon is what ships. The lock-in is what sells. And the price was never really about the chip.

Sources

Where this comes from — the filings, records, and reporting behind it.

1
Primary · SEC filing · Documented
Nvidia FY2024 (year ended Jan 28, 2024) total revenue was $60.922 billion, cost of revenue $16.621 billion, and gross profit $44.301 billion, implying a GAAP gross margin of approximately 72.7%.
U.S. Securities and Exchange Commission / NVIDIA Corporation, NVIDIA Corp Form 10-K FY2024 (Annual Financial Statement Tables) · 2024-02-21
2
Primary · Company record · Documented
Nvidia FY2024 Q3 earnings release (quarter ended Oct 29, 2023) reported record Data Center revenue of $14.51 billion, up 279% year-over-year, and guided Q4 non-GAAP gross margin of ~75.5%.
U.S. Securities and Exchange Commission / NVIDIA Corporation, NVIDIA Announces Financial Results for Third Quarter Fiscal 2024 · 2023-11-21
3
Primary · SEC filing · Documented
Nvidia's FY2023 10-K (year ended Jan 29, 2023) confirms the H100 started shipping in fiscal year 2023 and introduced the Hopper architecture with a Transformer Engine designed to accelerate AI transformer model training.
U.S. Securities and Exchange Commission / NVIDIA Corporation, NVIDIA Corp Form 10-K FY2023 (nvda-20230129) · 2023-02-24
4
Primary · Company record · Documented
Nvidia Q4 FY2026 (quarter ended Jan 25, 2026) total GAAP revenue was $68.1 billion, up 73% year-over-year, and Data Center revenue reached a record $62.3 billion, up 75% year-over-year.
NVIDIA Newsroom / NVIDIA Corporation, NVIDIA Announces Financial Results for Fourth Quarter and Fiscal 2026 · 2026-02-26
5
Secondary · Attributed to source
Estimated manufacturing cost of an NVIDIA H100 SXM5 is approximately $3,320: ~$300 for the TSMC 4N logic die (814mm²), ~$1,350 for HBM3 memory, ~$750 for CoWoS-S packaging, and ~$920 for test and assembly, implying a per-unit gross margin near 88% at a $28,000 street price. Note: this is an analyst-model estimate, not a disclosed Nvidia figure.
Silicon Analysts, AI Chip Cost Bridge: Manufacturing Cost Breakdown for 18 Accelerators (2026) · 2026-04
6
Secondary · Widely reported
H100 80GB GPU purchase prices ranged from ~$25,000 (PCIe) to ~$40,000 (SXM5) as of Q1 2026; secondary market SXM5 cards that sold for $40,000 in late 2023 now move for $6,000–$22,000; Nvidia does not publish a fixed retail price, and street prices move through OEM/channel partners.
CloudZero, H100 GPU Cost In 2026: Buy, Rent, And Cloud Pricing Compared · 2026-05-20
7
Secondary · Widely reported
Early H100 rental prices exceeded $7–$10/GPU-hour in 2023, reflecting extreme scarcity and AI demand; by late 2025 the same H100 GPUs were available for $2–$4/hour across non-hyperscale marketplace providers, with hyperscaler on-demand rates in low single digits.
Silicon Data, H100 Rental Price Over Time (2023–2025): A Complete Market Analysis · 2025
8
Secondary · Attributed to source
Nvidia's data center gross margin is approximately 74–75%, described as extraordinary for hardware, and defended primarily by the CUDA software lock-in premium; AMD's ROCm had achieved approximately 85% CUDA parity for training workloads as of 2025, making the moat contested but still material.
PitchGrade Research, NVIDIA: AI Infrastructure Monopoly or Peak Cycle Risk in the Compute Stack? · 2026-02-23
9
Secondary · Attributed to source
CUDA has approximately 4 million active developers as of 2026 (Nvidia's own estimate); PyTorch, the dominant ML framework, was developed with heavy CUDA optimization and has deep integration with Nvidia's cuDNN and cuBLAS libraries; porting an AI stack to AMD ROCm can require 6–12 months of engineering and hyperscalers estimate the migration cost at tens of millions of dollars.
Alphastreet / PitchGrade Research / Introl Blog, Nvidia's CUDA Lock-In and Supply Scarcity Make Its AI Chip Moat Harder to Break Than It Looks · 2026-03-27
10
Secondary · Widely reported
The H100 GH100 die contains 80 billion transistors on TSMC's 4N (4nm-class) process, delivering 989 TFLOPS FP16 and 3,350 GB/s HBM3 memory bandwidth in SXM5 form; the CoWoS advanced packaging bottleneck, not raw wafer supply, was the primary constraint on H100 availability through 2024.
AIToolDiscovery / NVIDIA H100 Specs and Pricing 2026, NVIDIA H100 GPU: Price, Full Specs, and Cloud Rates 2026 · 2026-05
