Last week, the tech world was shaken by a headline-grabber: Databricks has surged past the $100 billion valuation mark. And remember, this isn't a consumer internet giant, nor is it SpaceX; it's a software company that builds the data and AI infrastructure layer. In a capital market environment dominated by caution, Databricks has managed to defy gravity, pushing the ceiling of what a unicorn can be worth.

The obvious question is: why? What exactly gives Databricks the right to this number?

I did some digging. Let's look at why Databricks is valued at $100 billion, and more importantly, what software entrepreneurs might take away from it.

1. Revenue is the hard metric: Numbers speak louder than stories

When analyzing Databricks' valuation, forget the buzzwords. Let's start with the numbers.

- Valuation trajectory: At the end of 2024, Databricks was worth $62 billion. Less than a year later, its Series K financing pushed that to $100 billion, a 61% jump. Lead and participating investors included Thrive Capital, Insight Partners, and a16z. The fact that the same top-tier funds doubled down from Series J to Series K is not random betting; it's a vote of confidence.
- Revenue growth: By mid-2025, Databricks had reached ~$3.7 billion in annual recurring revenue (ARR), up 50% year over year. Compare that to Snowflake, whose growth slowed to 25-30%. In other words, at a comparable revenue scale, Databricks is growing roughly twice as fast.
- Profitability: Subscription gross margins exceed 80%, and more than 500 customers now generate contracts over $1 million annually. For FY2025, the company turned cash-flow positive: no longer burning cash for growth, but actually self-sustaining. That's crucial. For an open-source company, once ARR hits self-funding mode, profit expansion becomes almost inevitable.
- Customer base: Over 15,000 customers worldwide, spanning finance, retail, manufacturing, and internet giants.
Databricks has already become a piece of core enterprise infrastructure.

- Valuation multiple: $100B ÷ $3.7B ARR ≈ 27× ARR. Expensive? Maybe. But remember, Snowflake trades at a strong multiple despite growing at half the rate. Databricks' premium is justified.

At this scale, capital markets are brutally objective. A story alone won't cut it; the fundamentals must back it up. And by that measure, Databricks' numbers are rock solid.
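The multiple arithmetic is easy to sanity-check. Here is a minimal sketch using the figures cited in this article; the growth-adjusted comparison at the end is a rough heuristic along the lines of a PEG ratio, not a formal valuation model:

```python
# Sanity-check of the valuation multiple cited above (figures from this article).
valuation = 100e9   # reported Series K valuation, USD
arr = 3.7e9         # reported mid-2025 annual recurring revenue, USD

arr_multiple = valuation / arr
print(f"ARR multiple: {arr_multiple:.0f}x")  # ~27x

# Growth-adjusted view: multiple divided by growth percentage, a crude
# "price per point of growth" heuristic for comparing the two companies.
growth_databricks = 0.50    # 50% YoY, per the article
growth_snowflake = 0.275    # midpoint of the 25-30% range cited
print(f"Multiple per point of growth: {arr_multiple / (growth_databricks * 100):.2f}")
```

The point the heuristic makes: a 27× multiple on 50% growth is cheaper per unit of growth than a similar multiple on 25-30% growth, which is exactly why the premium over Snowflake can be defended.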
2. The moat: More than a tool, an open-source ecosystem standard

This is something I've been bullish on since last year, when Databricks acquired Tabular, the commercial company behind Apache Iceberg, for over $1 billion. From an AI ecosystem perspective, the Iceberg + Databricks combination is a formidable moat. In the coming "Agentic AI" era, the Lakehouse model will define how enterprises manage data. Just as Snowflake displaced Teradata a decade ago, Iceberg has the potential to displace Snowflake.

What makes Databricks special is that it was never just a point solution. From day one, it has been steadily building out an ecosystem.

The Lakehouse standard

- Acquisition of Tabular: Bringing the Iceberg core team in-house for over $1 billion.
- Lakehouse architecture: Blending the openness of data lakes with the analytical power of warehouses.
- Delta Lake UniForm: A single dataset readable by both Delta and Iceberg engines, with no more painful migrations or duplicate storage.
- Unified governance: Native Iceberg support, combined with Unity Catalog, makes "multi-format unified governance" possible in one step.

Unity Catalog: Governance as a competitive edge

For AI adoption, the biggest hurdle isn't the model; it's data governance. Access control, lineage, compliance, and cross-cloud consistency are all non-negotiable.
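To make the UniForm point above concrete: on Databricks, exposing an existing Delta table to Iceberg readers is a matter of table properties rather than a migration. The sketch below uses a hypothetical table name, and the property names follow Databricks' public UniForm documentation; check the docs for runtime and protocol version requirements before relying on it:

```sql
-- Enable Iceberg reads on an existing Delta table (hypothetical table name).
ALTER TABLE main.sales.orders SET TBLPROPERTIES (
  'delta.enableIcebergCompatV2' = 'true',
  'delta.universalFormat.enabledFormats' = 'iceberg'
);
```

After this, Iceberg clients read the same underlying Parquet files through Iceberg metadata that the platform generates alongside the Delta log, which is what "no duplicate storage" means in practice.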
Unity Catalog acts as the enterprise command center:

- Clear permissions, zero compliance headaches.
- Transparent lineage, full traceability.
- Cross-cloud compatibility, no vendor lock-in fears.

One of my mentors at IBM used to say: "First-rate companies make standards, second-rate companies make products, third-rate companies provide services."

That wisdom applies here. Open source isn't just about code; it's about shaping the standard. Once you control the standard, products and services naturally follow.

3. AI-Native: A Trifecta of Models, Agents, and Applications

Databricks has gone all-in on building an AI-native stack, with an integrated strategy across the model, engineering, and application layers:

- Model layer: Its open-source LLM, DBRX, performs competitively on benchmarks like MMLU and HumanEval, approaching the performance of proprietary leaders while still being trainable at a sustainable cost. It may not match OpenAI or Anthropic in sheer firepower, but for a company of this scale, not owning "hard currency" in models would mean risking dependency on others.
- Engineering layer: Mosaic AI + MLflow 3.0 bring agent training, evaluation, monitoring, and iteration into a unified platform. Enterprises don't need to build pipelines from scratch; they can leverage an out-of-the-box system.
- Application layer: The Databricks Apps platform has already seen adoption by 2,500+ organizations and 20,000+ applications within six months. The takeaway is clear: what enterprises really want is in-place data development combined with embedded AI apps.
- The Neon acquisition: By acquiring Neon, Databricks filled the database gap.
In the near future, AI agents won't just read from data lakes; they'll be able to directly operate OLTP applications. This shifts Databricks from being a back-end analytics platform into a front-end application enabler, completing the enterprise usage cycle.

In short, Databricks has evolved along the path data lake → lakehouse → AI platform → database applications, creating a closed loop where AI agents can not only read but also write, not only analyze but also act.
4. Lessons for Entrepreneurs: Capital Strategy + Ecosystem Building

For open-source founders and software entrepreneurs, the Databricks story offers several lessons worth reflecting on.

Capital is a means, not the end

Databricks didn't reach a $100B valuation by telling stories. It got there because its revenue, customers, and margins were defensible. Capital simply prepaid for its growth trajectory.

The point is not to treat fundraising as the finish line, but as an amplifier. You must first prove your model can make money; only then will capital accelerate your growth.

Think ecosystem, not just tools

Databricks was never just a single-point tool. It always pushed toward platformization:

- From Spark to the Lakehouse;
- From Delta Lake to governance with Unity Catalog;
- From data to AI-native.

Entrepreneurs can't stop at "building tools." You need to string upstream and downstream together to form a true moat.

Balance open source and commercialization

The foundation of Databricks is open source (Spark, Iceberg, MLflow). But its revenue engine comes from the commercial platform. It knows how to nurture open-source communities while charging enterprises for governance, security, and compliance: the must-pay capabilities.

Be ambitious in M&A and partnerships

Databricks moved fast: acquiring Tabular and Neon to close product gaps, while partnering with global giants like Microsoft, Google, and SAP to embed itself in broader ecosystems. Entrepreneurs often underestimate this, thinking success is just about tech investment.
But no company reaches tens of billions by going it alone. Both integration capacity and ecosystem positioning are critical.

Databricks' capital playbook

Its approach is textbook: fundraise → acquire → expand ecosystem.

- End of 2024: raised $10B in equity plus $5.25B in credit.
- By 2025: hit a $100B valuation and acquired Tabular and Neon, expanding products while patching weaknesses.
- On ecosystem: Microsoft, Google, SAP, Anthropic, Palantir are all partners.

Databricks is no longer just a cloud vendor; it has become part of the enterprise AI supply chain itself.

Implications for WhaleOps

For WhaleOps, the commercial company behind Apache SeaTunnel, Databricks' journey offers both validation and inspiration. WhaleOps has long bet on the future of data infrastructure, supporting Iceberg integration early on and building connectors for 200+ databases and data lakes.

As the industry shifts toward the Agentic AI era, where autonomous agents require seamless access to multi-modal data, WhaleOps is uniquely positioned. Its commitment to next-generation data storage and processing echoes the same playbook that propelled Databricks: early alignment with open-source standards, broad ecosystem integration, and a vision that sees AI not as an add-on but as the core driver of enterprise data platforms.
Just as Databricks turned Spark into a $100B enterprise, WhaleOps' ability to evolve SeaTunnel into the backbone of AI-driven data integration could mark the next chapter in enterprise data infrastructure.

Conclusion & Recap

Looking back at Databricks' financing journey, a very clear trajectory emerges:

- Early stage: Built on the foundation of the open-source Spark community, leveraging technical depth and community influence.
- Mid stage: The Lakehouse concept took shape. Open-source projects like Delta Lake and MLflow expanded the ecosystem and gradually won enterprise customers.
- Series J (late 2024): Valuation reached $62B, backed by $10B in equity financing plus $5.25B in credit facilities. This marked the start of the "capital-backed + acquisition-driven expansion" phase.
- Series K (mid-2025): Valuation surged past $100B. Existing investors doubled down, and new capital joined in. This signals not only confidence in Databricks' growth curve but also recognition of its entrenched industry position.

So why is Databricks worth $100B?
The logic comes down to three pillars:

Impressive fundamentals

- $3.7B ARR with 50% growth;
- 80%+ subscription gross margins;
- 15,000+ customers, including 500+ seven-figure contracts;
- Positive free cash flow.

A deep moat

- From Lakehouse standards to Delta/Iceberg interoperability and governance via Unity Catalog, Databricks has effectively seized control of the data standard.
- Acquisitions of Tabular and Neon plugged ecosystem gaps, strengthening its position end to end.

An AI-era necessity

- With DBRX, Mosaic AI, and the Apps platform, Databricks has become more than a data platform; it's evolving into the operating system for enterprise AI.
- The Neon acquisition pushes AI agents directly into the database application layer, creating a full closed loop.

At its core, Databricks' $100B valuation reflects a combination of sustained capital commitment, solid financial performance, systematic moat-building, and strategic positioning in the AI era.
Its value doesn't come from storytelling; it comes from the fact that Databricks has become a must-have option for enterprise AI.