Voltron Data, provider of enterprise support services for Apache Arrow, today announced that it is adding an enterprise support offering for Ibis, an open source Python framework for standardizing analytics and queries across multiple backends. The move is a next step in building out Voltron Data’s portfolio of enterprise support offerings and furthers the company’s goal of continuing to invest in open source technologies. This announcement comes at The Data Thread, a virtual learning event hosted by Voltron Data to highlight Apache Arrow and its practical applications, as well as to bring together practitioners, industry leaders, and members of the Arrow community.
VentureBeat spoke with Voltron Data CEO Josh Patterson and discussed the importance of creating standards and communication across systems to eliminate pain points and reduce duplication of effort.
Apache Arrow and Ibis are both designed to provide standardized solutions to common pain points. Apache Arrow is an open source project that provides a unified, columnar format for representing data in memory, so that data does not have to be serialized and deserialized as it moves between systems that would otherwise each use their own format. Eliminating that duplicated work generates efficiencies and, as Patterson said, “just allows developers to focus on things that they want to focus on.”
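To make the serialization point concrete, here is a minimal sketch using only the Python standard library. It is not Arrow's API: the `prices` column and the use of `array`, `pickle`, and `memoryview` are illustrative assumptions, chosen to contrast handing data off by copying it through a serialized form with letting a second consumer read the same in-memory buffer.

```python
import array
import pickle

# One column of 64-bit integers stored contiguously in memory
# (illustrative stand-in for a column in a shared in-memory format).
prices = array.array("q", [10, 20, 30, 40])

# Serialization path: the hand-off encodes the column and decodes a full copy.
copied = pickle.loads(pickle.dumps(prices))

# Shared-layout path: a second consumer reads the same buffer without copying.
view = memoryview(prices)

# Change the original column to show which consumer is looking at the same memory.
prices[0] = 99

shares_memory = view[0] == 99    # the view observes the change
is_stale_copy = copied[0] == 10  # the deserialized copy does not
```

The copy costs time and memory on every hand-off, while the view costs neither. That is the kind of saving a common in-memory format is meant to deliver across tools.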
Ibis is a Python dataframe API that provides a standardized way to access data and query multiple backends from Python. Supported backends include both SQL databases and big data analytics engines, among them Google BigQuery, Heavy.ai (formerly OmniSci), PostgreSQL, and MySQL. Ibis lets developers write the same code against multiple systems and switch backends if needed while keeping the same framework. As Patterson said, “it gives users this freedom to not have to keep rewriting their code if the way that they do compute changes.”
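The idea of writing a query once and running it on different backends can be sketched in a few lines. This example deliberately does not use Ibis itself. The `TotalBy` class and the `run_on_sqlite` and `run_in_memory` functions are hypothetical names invented for illustration, and the two "backends" are the standard library's `sqlite3` module and plain Python dictionaries.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class TotalBy:
    """Hypothetical backend-neutral query: sum `value` per `key` in `table`."""
    table: str
    key: str
    value: str


def run_on_sqlite(query, conn):
    # Backend 1: compile the description to SQL and let the database execute it.
    sql = (
        f"SELECT {query.key}, SUM({query.value}) FROM {query.table} "
        f"GROUP BY {query.key} ORDER BY {query.key}"
    )
    return [tuple(row) for row in conn.execute(sql)]


def run_in_memory(query, rows):
    # Backend 2: evaluate the same description over plain Python dicts.
    totals = {}
    for row in rows:
        totals[row[query.key]] = totals.get(row[query.key], 0) + row[query.value]
    return sorted(totals.items())


rows = [
    {"region": "east", "amount": 5},
    {"region": "west", "amount": 7},
    {"region": "east", "amount": 3},
]

# Load the same rows into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (:region, :amount)", rows)

# The query is written once and handed to either backend unchanged.
query = TotalBy(table="sales", key="region", value="amount")
sqlite_result = run_on_sqlite(query, conn)
memory_result = run_in_memory(query, rows)
```

Both calls return the same per-region totals. Only the execution engine changes, which is the property Patterson describes: the user's code stays put when the compute layer is swapped.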
“The more we have people building on these standards like Arrow [and] Ibis, the more we can start building better and better systems,” said Patterson. “There’s a lot of waste in the ecosystem serializing data, there’s also a lot of waste just re-writing code, over and over again as systems change, and so these kind of standard front-ends are really important so we can start to reduce these inhibitors to innovation.”
A natural extension
Voltron Data was launched in February 2022 with an enterprise support offering for Apache Arrow. Voltron Data co-founder and CTO Wes McKinney is also the co-creator of Apache Arrow, Ibis, and the Python pandas project, and a committer and member of the project management committee for Apache Parquet. The company says the decision to introduce enterprise support for Ibis came in part as a result of high customer demand. Patterson said that “the same Arrow customers are asking… well what about Ibis? They see that the same people are maintaining these projects” and, as a result, “we just felt that it was a natural extension.”
The efficiencies and advantages generated by Apache Arrow and Ibis stem from their ability to provide a unified solution to users’ pain points. The more widely a standard is adopted, the more value it tends to bring, with pandas and its dataframe paradigm a case in point. Whether Ibis can achieve comparable adoption and value isn’t clear, but Voltron Data’s move makes sense, and it should help to standardize data-oriented software development rather than fragment it.