Tabular, data platform from Apache Iceberg creators, scores $26M

California-based Tabular, an independent data platform built by the creators of Apache Iceberg, today announced $26 million in a fresh round of funding.

The company said it plans to use the capital to grow its offering and end ‘data lock-in’ by giving enterprises the option to evolve their architecture as needs change and new technologies emerge.

The round was led by Altimeter Capital, with participation from existing investor Andreessen Horowitz (a16z) as well as Zetta Venture Partners. It takes Tabular’s total capital raised to $37 million and comes as the data ecosystem continues to see massive consolidation, leaving a handful of “full-stack” vendors delivering closed data platforms with tightly coupled storage and compute layers.

The trend risks rent-seeking behavior that drives up costs and negatively impacts enterprises by stifling innovation.


How does Tabular help?

Back in 2015, Netflix engineers Daniel Weeks and Ryan Blue created the Iceberg table format to address issues with Apache Hive (previously integrated with Netflix infrastructure) and to improve data storage and query performance. The format provided complete database functionality on top of a cloud object store and was later donated to the Apache Software Foundation (ASF) for open development.

After Iceberg was open-sourced, Weeks and Blue teamed up in 2021 with another Netflix colleague, Jason Reid, to commercialize, as a managed service, the same Iceberg-based data platform that ran at Netflix.

The platform at Netflix was compute-agnostic, treating all compute sources equally (unlike monolithic database offerings). That approach gives enterprises an open-standards storage layer that can be attached to any compute engine, allowing them to mix and match according to their needs. The trio used the same approach to launch Tabular in 2021.

“Tabular is an ‘easy button’ for anyone wanting to use Iceberg for their storage,” Blue, now CEO of Tabular, told VentureBeat. “It’s a SaaS offering where the data stays in your account and we provide a simple managed service that secures and optimizes the data. You can think of it as a ‘headless’ data warehouse that can be efficiently queried by multiple engines. Our stance is that we are unopinionated about how you use your data, and therefore will support every compute engine equally well.”

Tabular architecture

As is typical with open-source software, Apache Iceberg requires manual data engineering to set up and use. Tabular sidesteps that problem by offering it as a managed service. It provides configuration-based ingestion for files, streams and change data capture (CDC) for data events; automatically tunes tables; vacuums old files and snapshots; and handles data expiration for compliance, along with many other time-consuming, low-level tasks. Plus, there’s centralized security with role-based access control tied to the storage layer.
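
To make the "vacuums old files and snapshots" task more concrete, the following is a minimal sketch of the kind of retention rule such a service might apply: expire table snapshots older than a cutoff while always keeping the most recent ones. It is an illustration only, not Tabular's or Iceberg's actual implementation. The names `Snapshot` and `select_expired`, and the `max_age` and `min_keep` parameters, are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for a table snapshot record."""
    snapshot_id: int
    committed_at: datetime


def select_expired(snapshots, now, max_age, min_keep=1):
    """Return snapshots older than max_age, never touching the newest min_keep.

    This only decides WHICH snapshots are eligible for expiry; deleting the
    data files that are no longer referenced would be a separate step.
    """
    # Newest first, so the first min_keep entries are always retained.
    ordered = sorted(snapshots, key=lambda s: s.committed_at, reverse=True)
    cutoff = now - max_age
    return [s for s in ordered[min_keep:] if s.committed_at < cutoff]


if __name__ == "__main__":
    now = datetime(2023, 9, 1)
    history = [
        Snapshot(1, datetime(2023, 8, 1)),
        Snapshot(2, datetime(2023, 8, 25)),
        Snapshot(3, datetime(2023, 8, 31)),
    ]
    for snap in select_expired(history, now, max_age=timedelta(days=7)):
        print("would expire snapshot", snap.snapshot_id)
```

Keeping a minimum number of recent snapshots, regardless of age, means a table that is rarely written to still has a valid current state after cleanup.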

“A data engineer would connect their cloud storage to Tabular and connect a compute engine such as AWS Athena or Google BigQuery (both in preview),” Blue explained. “They can then ingest data using our File Loader, mirror relational tables using our CDC capability or stream events to tables using Kafka Connect and access the tables. We analyze the tables continuously and optimize the file structure accordingly based on the shape of the data and query patterns we observe.”

Over the years, a number of data platforms, most recently Databricks and Snowflake, have embraced open-source Apache Iceberg tables in one way or another, but Tabular claims that none of them offers an independent solution and that this keeps data hostage.

“All of these companies have their [own] compute engines, so there is a significant risk that they will advantage their in-house compute engines over those of their competitors when it comes to performance enhancements that they release,” Blue noted. “We alone provide an independent, level playing field across compute environments.”

‘Thousands of signups’

While Blue did not share Tabular’s exact customer count, he noted that the company’s SaaS offering has seen ‘thousands’ of signups since its launch in March 2023.

Among them are a leading real estate marketplace, which is using Tabular to provide data lake security, and one of the world’s largest gaming studios that automates ingestion and optimization with Tabular and then uses the data from Snowflake. 

“These customers are able to get into production with us very quickly, and due to our automatic optimization, they see their query times and storage costs cut by as much as 50%,” Blue said.

With the fresh round of funding, Tabular will scale up these efforts. The company said it will focus on R&D and work towards expanding its served market from AWS to Google Cloud, Azure and MinIO object storage for on-premises and hybrid cloud deployment. 

Support for Google Cloud Platform, with GCS for storage and Google BigQuery for compute, is now in preview, the company said.

Originally appeared on: TheSpuzz