Snowflake · Decision Forks

Snowflake Didn't Invent Splitting Storage From Compute. It Did Something Harder.

The textbook tells you Snowflake invented the separation of storage and compute. Its own SIGMOD paper says otherwise: the concept was already lying around in the public cloud. The real bet was building from scratch on commodity object storage — the one move Redshift and Oracle couldn't copy without burning their own codebase.



Almost every data warehouse shipped before 2015 made one quiet, expensive assumption: that the disks holding your data and the processors crunching it had to live in the same box. Buy more storage, you bought more compute you didn't need. Need more compute for one heavy query, you paid to scale the storage too. The two were welded together, and the weld was the business model. Snowflake's founders looked at that weld and decided not to cut it. They decided to build a system that had never had one.

The story you'll hear is that Snowflake invented the separation of storage and compute. That's the part that's wrong. Snowflake's own academic paper, presented at SIGMOD 2016, one of the database field's most prestigious conferences, frames the idea as something it found lying in the public cloud, not something it dreamed up. The genius wasn't the concept. The genius was the willingness to start from zero.

The idea was free. Almost nobody could afford to use it.

When Amazon's S3 and EC2 went mainstream, they handed the industry a gift few recognized: storage you could rent by the gigabyte and compute you could spin up and kill by the hour, as two completely independent meters. The architectural insight that you could disaggregate the two was, by then, not new; researchers had been circling it for years. Snowflake's SIGMOD 2016 paper, 'The Snowflake Elastic Data Warehouse,' is admirably honest about this: it describes building a new warehouse design on top of already-available public cloud resources, not on a breakthrough the company alone possessed.[1][8] The platform that resulted had been available since June 2015.[1]

So if the idea was sitting in plain sight, why did three founders building from a blank editor get a decade's head start? Because the incumbents could not pick the idea up without dropping the product they were already selling. That is the whole game.

"...inherent scalability and capacity constraints... not originally designed for the adoption of cloud-based workloads."[2]
Snowflake Inc., describing legacy database architectures in its 2020 S-1 filing

Why an incumbent can't simply un-weld its own product

Here is the mechanism, worked all the way down. A traditional warehouse (Redshift, Oracle, Teradata) keeps data on the same nodes that run the queries. That coupling isn't a bug to those systems; it's the foundation every other assumption rests on. The query planner assumes the data is local. The pricing assumes you scale both together. The codebase, written across years, is shot through with the premise. To separate storage from compute, an incumbent doesn't tune a setting; it rewrites the engine, and in doing so admits the architecture it has been selling is the problem. Snowflake had no such admission to make. It had no installed base to protect, no on-premises customers to keep happy, no legacy planner to honor. It could build a three-layer system (storage on commodity object storage, independent compute clusters, and a separate cloud-services layer holding the metadata) because it had nothing to preserve.[1] The blank page wasn't a disadvantage. It was the moat.

| | Legacy warehouse (Redshift, Oracle, Teradata) | Snowflake |
| --- | --- | --- |
| Where data lives | On the compute nodes themselves | On commodity cloud object storage |
| Scaling storage and compute | Tied together: buy one, buy both | Two independent meters |
| Cost of adopting separation | Rewrite the engine, abandon the codebase | It was the starting assumption |
| What it has to protect | An installed base built on coupling | Nothing |

The same idea, seen from inside two different starting points
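
To put rough numbers on the "buy one, buy both" row, here is a minimal Python sketch of the two billing shapes. Every capacity and price in it is a made-up assumption chosen for easy arithmetic, not a figure from Snowflake, Amazon, or any vendor, and the function names are hypothetical.

```python
from math import ceil

# All capacities and prices are hypothetical, chosen only for illustration.
TB_PER_NODE = 10        # storage bundled into each coupled node
CU_PER_NODE = 4         # compute units bundled into each coupled node
NODE_PRICE = 1000.0     # monthly price of one coupled node
PRICE_PER_TB = 25.0     # monthly price of one TB of object storage
PRICE_PER_CU = 200.0    # monthly price of one independent compute unit


def coupled_nodes(storage_tb: float, compute_units: float) -> int:
    """Nodes needed when every node carries a fixed slice of both resources.

    The larger of the two requirements decides the node count.
    """
    return max(ceil(storage_tb / TB_PER_NODE), ceil(compute_units / CU_PER_NODE))


def coupled_cost(storage_tb: float, compute_units: float) -> float:
    """Monthly bill when storage and compute are sold as one unit."""
    return coupled_nodes(storage_tb, compute_units) * NODE_PRICE


def stranded_compute(storage_tb: float, compute_units: float) -> float:
    """Compute units paid for under coupling but not actually needed."""
    return coupled_nodes(storage_tb, compute_units) * CU_PER_NODE - compute_units


def decoupled_cost(storage_tb: float, compute_units: float) -> float:
    """Monthly bill when storage and compute run on two independent meters."""
    return storage_tb * PRICE_PER_TB + compute_units * PRICE_PER_CU
```

Under these assumed numbers, a storage-heavy workload of 100 TB that needs only 4 compute units forces 10 coupled nodes: a bill of 10,000 a month with 36 compute units sitting idle, against 3,300 on separate meters. The gap is an artifact of the invented prices; the shape of the problem is not.
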
The cheap idea, the expensive starting point

When a powerful idea becomes freely available (cloud object storage, an open protocol, a new chip), the advantage rarely goes to whoever thought of using it first. It goes to whoever can build around it without contradicting what they already sell. Incumbents are slow not because they're foolish but because adopting the idea means confessing their current product was built on the wrong assumption. A challenger never has to make that confession, because it has no current product. The question to ask of any incumbent isn't 'do they see the threat?' It's 'can they answer it without breaking the thing that pays the bills?' Usually the honest answer is no.

The founders weren't generalists who stumbled onto cloud. Benoit Dageville and Thierry Cruanes came out of the deep end of database engineering at Oracle, and the third founder, Marcin Zukowski, brought something the Oracle veterans didn't have: he came not from a database vendor but from CWI, the Dutch national research institute, where the X100 project spun off into VectorWise, a company eventually acquired by Ingres, later rebranded Actian.[4][5] Zukowski left in 2012 to join Dageville and Cruanes, carrying vectorized query execution and lightweight compression into the new system.[4] The popular telling that all three were Oracle alumni misses the point: the engine got its speed from a research lineage that had nothing to unlearn about the cloud.

June 2015
the date Snowflake says its platform became available, five years before the public markets caught on [1]

Isn't this just a clever repackaging of S3?

The fair objection is sharp: if the idea was free and the cloud did the heavy lifting, isn't Snowflake just a thin wrapper around Amazon's storage, a middleman taking credit for plumbing it didn't build? It's the strongest case against the company, and it deserves a real answer. The answer is that the hard part was never the disaggregation; it was making the disaggregated parts behave like one coherent, fully managed system. Anyone can store files in object storage. What's hard is a multi-tenant SaaS warehouse that resizes compute on demand, holds a consistent view of metadata in a separate services layer, and lets dozens of independent compute clusters read the same data without stepping on each other, all while a customer just runs SQL and never touches a knob.[8] That coordination is the product. The honest counter to the 'thin wrapper' charge is that wrappers don't get written up at SIGMOD, and they don't take a decade to be convincingly copied. Snowflake's later filings make the same point in reverse: the platform now reads data whether it sits inside Snowflake or in external open formats like Apache Iceberg, because the value lives in the coordinating layer, not in owning the bytes.[3]
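
One way to picture that coordinating layer is a toy model of the three layers in Python. This is a sketch under loud assumptions: the class names, the immutable-file rule, and the snapshot mechanism are illustrative inventions for this article, not Snowflake's implementation or API.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectStore:
    """Stand-in for commodity object storage: immutable blobs addressed by key."""
    blobs: dict = field(default_factory=dict)

    def put(self, key, rows):
        # Assumption: files are never modified in place, only added.
        if key in self.blobs:
            raise ValueError("blobs are immutable; write a new key instead")
        self.blobs[key] = tuple(rows)

    def get(self, key):
        return self.blobs[key]


@dataclass
class MetadataService:
    """Stand-in for the services layer: records which files make up each table version."""
    versions: dict = field(default_factory=dict)  # table name -> list of file-key tuples

    def commit(self, table, keys):
        """Publish a new table version and return its version number."""
        history = self.versions.setdefault(table, [])
        history.append(tuple(keys))
        return len(history) - 1

    def snapshot(self, table):
        """Return the file list of the latest committed version."""
        return self.versions[table][-1]


@dataclass
class ComputeCluster:
    """Stand-in for one independent compute cluster; it owns no data of its own."""
    name: str
    store: ObjectStore
    meta: MetadataService

    def scan_sum(self, table, snapshot=None):
        """Sum every value in the table, as of a pinned snapshot or the latest one."""
        keys = self.meta.snapshot(table) if snapshot is None else snapshot
        return sum(value for key in keys for value in self.store.get(key))
```

In this toy, two clusters built over the same store and metadata service agree on every query result without talking to each other, and a reader holding an older snapshot keeps seeing the files that existed when it started, even after a writer commits a new version. Everything difficult about the real system (concurrency, failure, tenancy, cost) is what the sketch leaves out.
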

The market eventually agreed, loudly. When Snowflake went public in September 2020, it priced at $120 a share, opened at $245 (more than double), and closed up roughly 112% on the day, making it the largest software IPO on record at the time.[6] Berkshire Hathaway and Salesforce each agreed to buy $250 million of stock at the IPO price in a concurrent private placement, and Berkshire also agreed to buy about 4 million more shares in a secondary transaction.[6][7] Warren Buffett, who famously avoids technology he doesn't understand, wrote a check for an architecture diagram. That's how clearly the bet had paid off.

Don't chase the invention. Chase the unburdened builder.

When a new primitive arrives — cloud storage, a model API, a payment rail — the temptation is to ask who invented the clever use of it. Ask a better question: who can build on it without dragging a legacy along? The durable advantage almost never belongs to the originator of the idea. It belongs to the team whose starting assumptions match the new world, because they're the only ones who don't have to demolish their own house first. Snowflake didn't out-think Oracle. It out-started it.

Snowflake's whole rebellion fits in a single sentence: it refused to inherit the weld. The incumbents weren't blind to the separation of storage and compute — they were imprisoned by the products that couldn't survive it. So the company built the one thing a market leader structurally cannot build: the system that assumes its own architecture from the very first line, with no past to apologize for. The idea was free. Starting over was the price. Snowflake was the only one willing to pay it.


Sources

Where this comes from — the filings, records, and reporting behind it.

  1. Primary · Academic · Documented
     Snowflake's architecture is described as a 'multi-cluster, shared-data' design in the paper 'The Snowflake Elastic Data Warehouse,' authored by Benoit Dageville, Thierry Cruanes, Marcin Zukowski, Vadim Antonov, and others, presented at ACM SIGMOD/PODS 2016 (San Francisco, June 26–July 1, 2016). The paper's ACM DOI is 10.1145/2882903.2903741. The platform was stated to be available since June 2015.
  2. Primary · SEC filing · Documented
     Snowflake's S-1 filing (August 24, 2020) describes legacy database architectures as having 'inherent scalability and capacity constraints' and 'not originally designed for the adoption of cloud-based workloads,' and big data architectures as creating 'data integrity and governance challenges': the market rationale for Snowflake's architectural approach.
  3. Primary · SEC filing · Documented
     Snowflake's most recent annual report (10-K for FY ending January 31, 2026) states: 'Our platform is built on a cloud-native architecture that leverages the massive scalability and performance of the public cloud,' and describes customer consolidation of data into 'a single source of truth, whether stored in Snowflake or connected from external storage like Apache Iceberg tables.'
  4. Primary · Archival · Documented
     Marcin Zukowski co-founded CWI spin-off VectorWise (now Actian) after his PhD, then left in 2012 to co-found Snowflake when Thierry Cruanes and Benoit Dageville asked him to join. He brought vectorized query execution and lightweight compression techniques to the Snowflake architecture. He did NOT come from Oracle.
  5. Secondary · Widely reported
     VectorWise originated from the X100 research project at CWI (2003–2008), was spun off as a company in 2008, and was acquired by Ingres Corporation in 2011. Co-founders included Peter A. Boncz and Marcin Żukowski. Ingres later rebranded as Actian.
  6. Secondary · Widely reported
     Snowflake's IPO on September 16, 2020 priced at $120/share, opened at $245/share (~104% above IPO price), and closed up approximately 112% on the day, the largest software IPO in history at that time. Berkshire Hathaway and Salesforce each agreed to purchase $250 million of stock at the IPO price in a concurrent private placement; Berkshire also bought approximately 4 million additional shares in a secondary transaction from former CEO Bob Muglia.
  7. Secondary · Widely reported
     CNBC confirmed on September 16, 2020 that Berkshire Hathaway and Salesforce each agreed to buy $250 million of Snowflake stock at the IPO price in a concurrent private placement, and that Berkshire also agreed to buy 4.04 million shares in a secondary transaction.
  8. Primary · Company record · Documented
     Snowflake's own official resource page confirms 'The Snowflake Elastic Data Warehouse' paper was accepted for publication and presentation at SIGMOD 2016, and describes its purpose as explaining why Snowflake built a new cloud warehouse architecture and how it enables dynamic resizing, diverse data support, and heterogeneous workloads, differentiating from Amazon Redshift and Google BigQuery.