How semantic-based knowledge graphs accelerate the value of data lakes


Democratizing data and generating insights have never been more important to achieving a competitive advantage. Whether the goal is advanced analytics to drive decision-making or modeling the complex relationships among people, places, and things in data that is too wide and too big to describe easily, knowledge graphs are making a difference in how information is found, used, and leveraged.

We may not realize it, but we are actually using a knowledge graph when searching Google for things such as a nearby restaurant that offers live music on a Tuesday. So it’s no surprise that Enterprise Knowledge Graphs (EKGs) are gaining popularity in the workplace as well. By helping extract, relate, and deliver knowledge as answers, and then surfacing recommendations and insights to every data-driven application, EKGs structure an organization’s information so it can supercharge BI and analytics and generate better results from chatbots and recommendation engines.
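To make the restaurant example concrete: at its core, a knowledge graph is a set of entities connected by typed relationships, and a question like the one above is answered by matching a pattern against those relationships. The following is a toy, in-memory sketch (the entity and predicate names are invented for illustration and are not any vendor’s schema):

```python
# Toy knowledge graph stored as (subject, predicate, object) triples.
# All names here are illustrative, not a real ontology.
triples = {
    ("BluesCorner", "type", "Restaurant"),
    ("BluesCorner", "offersLiveMusicOn", "Tuesday"),
    ("BluesCorner", "locatedIn", "Midtown"),
    ("PastaPlace", "type", "Restaurant"),
    ("PastaPlace", "locatedIn", "Midtown"),
}

def match(pattern):
    """Return subjects whose triples satisfy every (predicate, object) pair."""
    subject_sets = [
        {s for (s, p, o) in triples if p == pred and o == obj}
        for pred, obj in pattern
    ]
    return set.intersection(*subject_sets)

# "A nearby restaurant that offers live music on a Tuesday"
answer = match([("type", "Restaurant"),
                ("offersLiveMusicOn", "Tuesday"),
                ("locatedIn", "Midtown")])
print(answer)  # {'BluesCorner'}
```

Production systems express the same pattern-matching idea with query languages such as SPARQL, but the principle is the same: the question is a graph pattern, and the answer is whatever part of the graph fits it.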

However, a knowledge graph that is enabled with a semantic layer can take these benefits one step further by providing organizations with the foundation for an enterprise data fabric architecture. This combination allows cross-functional, cross-enterprise, and cross-organizational teams to ask and answer complex queries across domain silos, and it makes data sharing easy and accessible in support of existing and future organizational needs. It may also be why Gartner believes that by 2024, data fabric deployments will quadruple efficiency in data utilization while cutting human-driven data management tasks in half.

Pulling back the curtain: Why data fabrics are gaining in popularity

Massive advances in data management enable businesses to achieve untapped value by leveraging and connecting data from both inside and outside their organization. Open standards in the form of prescribed ontologies (i.e. semantic models) from FIBO in Financial Services to D3FEND in the cybersecurity domain are further promoting both data sharing and the data fabric concept and are making data re-use even more possible. 

The premise behind the semantic layer concept is not new; it came on the scene more than three decades ago thanks, in part, to BI vendors who were building purpose-built dashboards. Like other proprietary systems, adoption stalled, largely because these tools were too rigid and too complex, and they suffered from the same limitation as a physical relational database system: data is modeled to suit the structured query language rather than to reflect how data is related in the real world (i.e., many-to-many). Likewise, the rigidity of relational or graph database structures made it difficult to link and network the complex relationships contained within data warehouses and data lakes without changing the underlying data.

In response, organizations are turning to new approaches and are applying technologies like a knowledge-graph-powered semantic data layer that acts as the go-between for an organization’s storage and consumption layers. Acting as both the glue and the multiplier that connects all data, the knowledge graph delivers value to citizen data scientists and analysts in the context of the actual business use case without the need for additional IT involvement.

Data fabric in action: How a multi-carrier insurance company improved decision making

Similar to many large organizations, insurance companies face a number of challenges when it comes to data. Most notable is the lack of access to the internal and external sources users need for everyday decision-making. From policy administration to claims management to underwriting risk assessment, insurance professionals need a variety of data to perform daily tasks and are joining other industries in their efforts to make data FAIR — Findable, Accessible, Interoperable and Reusable.

Like others, the journey to FAIR for this multi-carrier insurance provider began by accumulating all of their data sources into a data lake, regardless of type. Once amassed, they began the process of cleansing, transforming, and disambiguating the data before moving to the data harmonization stage, which involved connecting data based on its meaning to deepen its context. They then added a semantic layer, delivered by a knowledge graph, to provide a connected fabric of cross-domain insights and to shift the focus to data analysis and processing so underwriters, risk analysts, agents and customer service teams could manage risk and deliver an exceptional customer experience.
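The harmonization stage described above can be sketched in miniature: records from two siloed systems are mapped onto one shared vocabulary of (subject, predicate, object) triples so they can be connected by meaning and queried across domains. The field names, predicates, and records below are invented for the example and do not reflect the insurer’s actual schemas:

```python
# Hypothetical records from two siloed source systems (invented fields).
policy_system = [{"policy_id": "P-100", "holder": "C-7", "line": "Auto"}]
claims_system = [{"claim_no": "CL-55", "policy": "P-100", "amount": 1200}]

def harmonize(policies, claims):
    """Map both source schemas onto one shared vocabulary of triples."""
    triples = set()
    for p in policies:
        triples.add((p["policy_id"], "heldBy", p["holder"]))
        triples.add((p["policy_id"], "coversLine", p["line"]))
    for c in claims:
        triples.add((c["claim_no"], "filedAgainst", c["policy"]))
        triples.add((c["claim_no"], "claimedAmount", c["amount"]))
    return triples

graph = harmonize(policy_system, claims_system)

# Cross-domain question: which policyholders have claims, and for how much?
answers = {
    (holder, amount)
    for (claim, p1, pol) in graph if p1 == "filedAgainst"
    for (pol2, p2, holder) in graph if p2 == "heldBy" and pol2 == pol
    for (claim2, p3, amount) in graph if p3 == "claimedAmount" and claim2 == claim
}
print(answers)  # {('C-7', 1200)}
```

The point of the sketch is that once both systems speak the same vocabulary, a question spanning policy administration and claims management becomes a single graph traversal rather than a bespoke cross-system integration.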

Why a semantic data layer is key to increasing productivity and developing insights

By organizing data in a knowledge graph, data scientists can significantly decrease the amount of time they spend wrangling data from external sources in support of ad hoc data analysis. New relationships between entities can be inferred without explicitly modeling them in the knowledge graph, using techniques such as statistical inference (deriving likely conclusions from premises known or assumed to be true) and logical inference (deducing the validity of statements from other statements by applying proof rules).

Moreover, the resulting semantic layer can serve as the enterprise data fabric foundation, providing enterprise-wide knowledge in support of new data-driven initiatives. This may be why Gartner said in the above-mentioned report that data fabric’s real value exists in its ability to make recommendations for more, different, and better data, reducing data management by up to 70%.

Taking the time to apply a semantic data layer is key to providing knowledge workers with critical just-in-time insight across a connected universe of data assets. By embarking on the data fabric journey, organizations can supercharge analytics and accelerate the value of earlier data lake investments by making the information available and usable to a broader audience. More importantly, it also allows data and analytics teams to focus on a singular business need quickly and easily, and then evolve the effort into an enterprise data fabric as organizational maturity scales to harness the power of all of their data assets. 

Navin Sharma is VP of Product at Stardog.

Originally appeared on: TheSpuzz