Data catalog and governance provider Collibra today announced the release of a number of new features for its Data Intelligence Cloud platform, as well as new standalone product offerings and integrations with Snowflake, Azure Data Factory, and Google Cloud Storage.
The company says the new releases are designed to improve ease of access to data for more users across an organization. This announcement comes at Data Citizens ’22, a conference hosted by Collibra to bring together industry leaders and data professionals around a range of topics, including data quality, catalogs, privacy, governance and more. VentureBeat corresponded with Laura Sellers, chief product officer at Collibra, regarding these new releases.
New and improved Data Intelligence Cloud
The new features introduced to Collibra’s Data Intelligence Cloud platform are designed to make the platform more accessible to users regardless of technical expertise. The first of these is the addition of a new data marketplace, which provides a shopping-like experience from which users can access a company’s internal datasets. Sellers says that the data marketplace is designed “for the data consumer – the casual user of data who doesn’t need to know the ins and outs of the entire data ecosystem.”
Users search for datasets using a Google-like search interface, and datasets are surfaced based on the employee’s domain and role within the organization. Administrators can design marketplaces based on user roles: for example, a marketplace for business analysts that surfaces only Tableau, Power BI or Looker reports certified for business use. Admins can also design a marketplace based on data domain: for example, a marketplace specific to the marketing team that displays only datasets owned and curated by that department.
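To make the role-based and domain-based scoping concrete, here is a minimal sketch of how such filtering could be modeled. It is not Collibra's implementation or API: the `Asset` and `Marketplace` classes, their fields, the `search` helper and the sample catalog are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    name: str
    kind: str        # hypothetical field, e.g. "report" or "dataset"
    domain: str      # owning department
    certified: bool  # certified for business use


@dataclass(frozen=True)
class Marketplace:
    name: str
    allowed_kinds: frozenset = frozenset()    # empty set means "no restriction"
    allowed_domains: frozenset = frozenset()  # empty set means "no restriction"
    certified_only: bool = False

    def visible(self, assets):
        """Return the assets this marketplace is configured to surface."""
        result = []
        for asset in assets:
            if self.allowed_kinds and asset.kind not in self.allowed_kinds:
                continue
            if self.allowed_domains and asset.domain not in self.allowed_domains:
                continue
            if self.certified_only and not asset.certified:
                continue
            result.append(asset)
        return result


def search(marketplace, assets, query):
    """Case-insensitive name search restricted to what the marketplace surfaces."""
    q = query.lower()
    return [a.name for a in marketplace.visible(assets) if q in a.name.lower()]


# Hypothetical sample catalog.
CATALOG = [
    Asset("Campaign ROI dashboard", "report", "marketing", True),
    Asset("Raw clickstream", "dataset", "marketing", False),
    Asset("Quarterly revenue report", "report", "finance", True),
    Asset("Draft churn report", "report", "finance", False),
]

# A role-based marketplace (certified reports only) and a domain-based one.
analysts = Marketplace("Business analysts", allowed_kinds=frozenset({"report"}), certified_only=True)
marketing = Marketplace("Marketing", allowed_domains=frozenset({"marketing"}))
```

In this sketch, the analyst marketplace surfaces only the two certified reports, while the marketing marketplace surfaces both marketing-owned assets regardless of certification.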
The next feature of note is the Usage Analytics dashboard, which tracks platform users as well as the domains, communities and data assets that are being used. Visibility of information on users can be restricted to comply with local data privacy regulations. Admins can generate views of the data filtered by date range, showing which teams are using the platform and which data assets are being used.
In an example scenario, after setting up a new marketplace for a company’s marketing team, admins could use the Usage Analytics feature to check in on engagement with the marketplace, including which data assets are getting the most use and whether new users are engaging with the data. This information can be used to curate similar data assets that might also be useful to the marketing team. Information about user engagement or non-engagement can be used to reallocate product licenses or to direct resources to understand and remove barriers to engagement.
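The kind of date-filtered view described above can be sketched in a few lines. This is an illustrative assumption about the shape of the data, not Collibra's schema: the `EVENTS` records and the `usage_summary` function are hypothetical, and events are recorded per team rather than per user to reflect the option of restricting user-level visibility.

```python
from collections import Counter
from datetime import date

# Hypothetical usage events: (day, team, asset name).
EVENTS = [
    (date(2022, 11, 1), "marketing", "Campaign ROI dashboard"),
    (date(2022, 11, 2), "marketing", "Campaign ROI dashboard"),
    (date(2022, 11, 3), "marketing", "Raw clickstream"),
    (date(2022, 11, 3), "finance", "Quarterly revenue report"),
    (date(2022, 10, 15), "finance", "Quarterly revenue report"),
]


def usage_summary(events, start, end):
    """Count usage per team and per asset for events within [start, end]."""
    in_range = [e for e in events if start <= e[0] <= end]
    by_team = Counter(team for _, team, _ in in_range)
    by_asset = Counter(asset for _, _, asset in in_range)
    return by_team, by_asset


by_team, by_asset = usage_summary(EVENTS, date(2022, 11, 1), date(2022, 11, 30))
most_used = by_asset.most_common(1)  # the single most-used asset in the range
```

An admin reviewing a new marketing marketplace could read the per-asset counts to see what is getting the most use, and the per-team counts to see whether a team is engaging at all.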
This release also includes a new homepage for Collibra, which is intended to simplify the user’s experience as well as provide suggestions tailored to a user’s browsing history or display recently popular items.
The final feature being added to Collibra’s Data Intelligence Cloud platform is Workflow Designer, a tool embedded within the platform navigation that helps automate common data management tasks, such as granting users access to datasets and certifying new datasets. It aims to help users build processes more easily. For example, Sellers says, to build a workflow process, users can “drag an icon from the left sidebar for running a script, add it into the workflow process at the right point, load the needed script, etc.”
Workflow Designer also includes a form editor that helps users build forms to gather requisite information for business processes, including defining the information that needs to be gathered, adjusting the form layout, and adding dependencies for form components. A new “Apps” feature within Workflow Designer allows users to pull together “a defined set of processes and forms that are used together to automate a process (e.g., approve access to a dataset),” Sellers says. Once an App is built and verified, it can be exported and deployed into any environment.
New standalone offerings
Beyond the platform features, Collibra announced new standalone offerings. First up is Collibra Protect, which enables no-code policy creation and execution in Snowflake. Policies can restrict use or access based on sensitivity level or business purpose. For example, a policy can state that third-party marketing data may only be used for marketing research purposes, thus restricting access to that data to those responsible for conducting market research.
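Collibra Protect is a no-code product, so the following is not how users author policies. It is only a minimal sketch of the rule logic in the example above, under the assumption that a policy pairs a data category with the business purposes allowed to use it. The `Policy` class, the `is_allowed` function and the category and purpose labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    data_category: str           # hypothetical label, e.g. "third_party_marketing"
    allowed_purposes: frozenset  # business purposes permitted to use this category


def is_allowed(policies, data_category, purpose):
    """Deny when a policy covers the category but does not list the purpose.

    Categories with no policy are allowed in this sketch; a real system
    might default to deny instead.
    """
    for policy in policies:
        if policy.data_category == data_category and purpose not in policy.allowed_purposes:
            return False
    return True


POLICIES = [
    Policy("third_party_marketing", frozenset({"marketing_research"})),
]
```

With this policy in place, a request to use third-party marketing data for `marketing_research` passes, while the same request for a purpose such as `sales_outreach` is denied.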
Data Quality and Observability in the Cloud, the next offering, brings predictive analytics to data quality: this offering proactively identifies data quality issues across data sources through machine learning-generated checks and rules. Sellers says the value-add of “predictive, continuous, self-service data quality” is that it both frees up data professionals to focus on higher-impact tasks and ensures business users can still access high-quality data. Sellers says that Collibra Data Quality & Observability can be run on any cloud and that users can “connect to more than 40 databases and file systems to scan data where it resides via pushdown or pull-up processing.”
Data Quality Pushdown for Snowflake, currently in Beta, is designed for Snowflake customers, to help “eliminate egress charges and dependencies on Spark compute while running their [data quality] jobs,” Sellers says. With this feature, data is never read out of the Snowflake environment, which is designed to improve privacy regulation compliance and eliminate egress fees. “Pushdown is an alternative compute option for running a [data quality] job, where all processing … is submitted to the target data warehouse. To use pushdown, you can run a setup script that creates a dedicated Snowflake Virtual Warehouse and a service account user for DQ job runs. This designated service account user will need read access on all schemas covering the target data. Collibra will provide customers with a Snowflake Pushdown setup script [to] run to use this new feature,” she describes.
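Collibra's actual setup script is not reproduced in this article. As a rough idea of what the steps Sellers describes (a dedicated virtual warehouse, a service account user, and read access on the target schemas) involve, here is a sketch that generates standard Snowflake SQL statements. The function name, the warehouse size and auto-suspend settings, and the use of a dedicated role are assumptions for illustration, not details from Collibra.

```python
def pushdown_setup_sql(warehouse, role, user, schemas):
    """Build Snowflake statements for a dedicated warehouse and a read-only service user.

    `schemas` holds fully qualified names such as "SALES.PUBLIC".
    Sizing and suspend settings below are illustrative assumptions.
    """
    statements = [
        f"CREATE WAREHOUSE IF NOT EXISTS {warehouse} "
        "WITH WAREHOUSE_SIZE = 'XSMALL' AUTO_SUSPEND = 60 AUTO_RESUME = TRUE;",
        f"CREATE ROLE IF NOT EXISTS {role};",
        f"CREATE USER IF NOT EXISTS {user} DEFAULT_ROLE = {role} DEFAULT_WAREHOUSE = {warehouse};",
        f"GRANT ROLE {role} TO USER {user};",
        f"GRANT USAGE ON WAREHOUSE {warehouse} TO ROLE {role};",
    ]
    seen_databases = set()
    for qualified in schemas:
        database = qualified.split(".")[0]
        if database not in seen_databases:
            # Grant database usage once per database, not once per schema.
            seen_databases.add(database)
            statements.append(f"GRANT USAGE ON DATABASE {database} TO ROLE {role};")
        statements.append(f"GRANT USAGE ON SCHEMA {qualified} TO ROLE {role};")
        statements.append(f"GRANT SELECT ON ALL TABLES IN SCHEMA {qualified} TO ROLE {role};")
    return statements


# Hypothetical object names.
stmts = pushdown_setup_sql("DQ_WH", "DQ_ROLE", "DQ_SVC", ["SALES.PUBLIC", "SALES.RAW"])
```

The statements are only built as strings here; running them would require a Snowflake session with sufficient privileges, and customers should use the script Collibra provides rather than this sketch.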
In this release, Collibra has also added new integrations with Snowflake, Azure Data Factory and Google Cloud Storage. The integration with Snowflake provides end-to-end visibility of data stored in Snowflake Data Cloud, including column-level lineage and transformations. The integration with Azure Data Factory “automatically harvests and stitches lineage from Azure Data Factory so that [users] can get a complete picture of data flow from source to destination,” Sellers says. The integration with Google Cloud Storage allows retrieving, mapping and ingesting metadata from buckets, directories and files, allowing users to discover and govern Google Cloud Storage data within Collibra.
With its new platform enhancements, offerings and integrations, Collibra says it hopes to make data more accessible to more users. Expanding access to these capabilities should, in turn, help democratize data within organizations, benefiting both technical and business users.