Today, Google opened its Data Cloud Summit with a bevy of announcements of new products and enhancements designed to help data scientists put the Google Cloud Platform to work. The company has invested heavily in artificial intelligence over the years, and its new products can help companies and users make sense of the flood of data with both traditional analysis and machine learning.
“Data is probably on the top of the agenda of every C-suite on the planet,” explained Gerrit Kazmaier, general manager and VP for databases, data analytics and Looker at Google Cloud. “Every company is a big data company. It is multiformat. It’s streaming and it’s everywhere.”
Google wants to compete for that demand with its cloud platform by offering sophisticated tools for applying artificial intelligence and machine learning. At the same time, it’s nurturing an open ecosystem so that companies can use and share data from wherever it may be captured. The new releases emphasize breaking down barriers between clouds from different vendors and supporting self-hosting by customers.
This open strategy can help Google battle large competitors like Amazon and Microsoft. Amazon Web Services offers close to a dozen different options for data storage, all tightly integrated with many platforms for data analysis through traditional reports or machine learning. Microsoft’s Azure also offers a wide range of options that leverage the company’s deep history with enterprise computing.
Google’s BigLake platform is designed to work with data wherever it is stored, whether locally on premises or in commercial clouds, including those of its competitors. The service can offer enterprises a chance to unify their data warehouses and lakes in one multi-cloud platform.
In the past, many companies created data warehouses, a well-governed model that combined good report generation with solid access control. Lately, some have been using the term “data lake” to describe systems that are optimized more for large size than for sophisticated tools. Google wants to absorb these different approaches with its BigLake model.
“By bringing these worlds together, we take goodness of one side and apply it onto the other side and that way you just make your storage infinite,” explained Sudhir Hasbe, a director at Google Cloud. “You can put as much data as you want. You get the richness of the governance and management that you want in your environment in a vastly changing regulatory environment. You can store all the data and manage it and govern it really well.”
One part of Google’s strategy is to create the Data Cloud Alliance, a collaboration between Google and Confluent, Databricks, Dataiku, Deloitte, Elastic, Fivetran, MongoDB, Neo4j, Redis and Starburst. The group wants to help standardize open formats for data so that information can flow as easily as possible between the different clouds across political and corporate barriers.
“We are excited to partner with Google Cloud and the members of this Data Cloud Alliance to unify access to data across clouds and application environments to remove barriers to digital transformation efforts,” said Mark Porter, CTO at MongoDB. “Legacy frameworks have made working with data hard for too many organizations. There couldn’t be a more timely and important data initiative to build faster and smarter data-driven applications for customers.”
At the same time, Google must also watch a growing number of smaller cloud vendors like Vultr or DigitalOcean, whose prices are often dramatically lower. Google’s deeper commitment to artificial intelligence research allows it to offer much more sophisticated options than any of these commodity cloud vendors.
“The one thing that sets Google truly apart is that we believe in developing one-of-a-kind technical products,” said Kazmaier. “Our mindset for innovation is rooted in understanding the data is a vast and limitless resource if you harness it in the right way. Most importantly, you need to have an open ecosystem around it for it to be successful.”
The Vertex AI Workbench is a tool that integrates Jupyter notebooks with the major components of Google’s Cloud, from data processing instances to serverless services to tools like Spark. The tool can draw information from any of these sources and feed it into analytic routines so data scientists can search for signals in the data. It becomes provisionally available in some regions on April 6 and everywhere by June.
“At Google Cloud, we’re removing the limits of data clouds to further close the Data-to-AI-Value gap,” said June Yang, VP of cloud AI and innovation at Google. “This capability enables teams to be able to build and train and deploy models five times faster than traditional notebooks.”
The company also wants to encourage teams and businesses to share some of the AI models that they create. The Vertex AI Model Registry, now in preview, will offer a way for data scientists and application developers to store and repurpose AI models.