Inheritance in SQL to Power Complex Analytics

7 minutes reading

Introduction

One core of data management in relational databases lies in accurately modeling real-world hierarchies, a task that becomes exponentially more challenging with the concept of inheritance. Traditional SQL databases are not inherently designed to conveniently handle inheritance, leading to inefficient data representation and cumbersome query mechanisms. This inefficiency is a significant roadblock to fast delivery of analytics.

Timbr makes this task intuitive and easy to address, enabling data modeling with SQL Ontologies that represent the business model and encapsulate the complexity of relationships, so inheritance queries become much simpler and shorter.

The Challenge

Representing inheritance in SQL databases is loaded with challenges, primarily due to the relational model’s limitations in directly supporting hierarchical data structures.

The essence of the problem lies in modeling data in a way that reflects real-world entities and their relationships without compromising on the performance and scalability of database operations.

Traditionally, three main approaches have been employed, each with its set of trade-offs:

1. Single Table Inheritance: This approach involves using a single table to represent all classes in an inheritance hierarchy. While it simplifies the schema and avoids joins, it leads to tables with numerous nullable columns, which can become sparse and inefficient as the hierarchy grows. This inefficiency impacts query performance and complicates data integrity constraints.

2. Class Table Inheritance: Also known as table-per-class hierarchy, this model creates a separate table for each class in the hierarchy. It improves data integrity and avoids the sparsity issue of single table inheritance. However, it introduces complexity in querying, as retrieving a complete object hierarchy requires multiple joins across several tables, impacting performance and complicating query logic.

3. Concrete Table Inheritance: In this model, each concrete class in the hierarchy has its table, which includes columns for all attributes of the class, including those inherited from any parent classes. This approach eliminates the need for joins when querying instances of a concrete class but at the expense of data redundancy and the potential for inconsistency. Additionally, it complicates operations involving polymorphic associations and queries across different types in the hierarchy.

Each of these approaches attempts to balance the representation of inheritance with the operational efficiency of the database. However, they often require significant trade-offs in terms of complexity, performance, and maintainability, presenting a substantial barrier to effective data management and analytics.

Semantic SQL

Timbr intelligent semantic layer introduces a paradigm shift in how inheritance is modeled in SQL databases. By abstracting the complexities of relational data models into a more intuitive and flexible semantic framework, Timbr allows for a natural and efficient representation of hierarchical and polymorphic relationships.

Timbr semantic abstraction layer acts as a middleware between the user and the database, enabling the seamless translation of complex inheritance structures into optimized semantic SQL queries without the need for manual schema design or query optimization. This approach not only streamlines the modeling process but also significantly enhances query performance and scalability, as the semantic layer intelligently manages the intricacies of inheritance and polymorphism behind the scenes.

Semantic SQL is standard SQL used for querying SQL ontologies mapped to the data instead of directly querying the database. By querying the ontology’s concepts, users benefit from semantic graph and reasoning features: explicit relationships (one to one, one to many, many to many), hierarchies, inheritance and inference. Semantic SQL queries are considerably less complex and query size is reduced significantly.

The beauty of Semantic SQL is that the coder of a query still uses SQL capabilities to express the business questions from the analytical point of view, so coders can join elements and perform calculations, but they don’t need to write and debug the part of the query that is handling connections and expression of relations between the tables.

Semantic SQL enables users to express more with fewer logic, because the logic is implicit in the semantic model. For coders and tools that write/generate queries there is no major change, it is just easier and more intuitive.

Expressing and Querying a Sample Business Model using Semantic SQL

Let’s show how Semantic SQL enables inheritance, as well as acceleration of analytical workflows and time to value. For our example we use the Crunchbase open dataset. Crunchbase is a platform for finding information about private and public companies. Crunchbase information includes investments, mergers, acquisitions and funding information, founding members and individuals in leadership positions. We downloaded a 2013 dump composed of 11 tables and stored it in MySQL.

The Crunchbase ontology was modeled and mapped to the 11 tables following a simple automated process that can be done manualy no-code, or by coding. The resulting model with the highest hierarchy of concepts that represent the data in business terms is shown below.

Crunchbase SQL ontology high level concepts
Crunchbase SQL ontology high-level concept's tree

Inheritance

To explain the benefit of using “inheritance” and how it works we need some explanation of the underlying model logic.

Our business model or ontology uses abstract concepts (things) to represent the mapped data (the “real world”). An inheritance is a relationship between two or more hierarchies of concepts in which the attributes and relationships of the parent concepts are inherited to the lower hierarchy. This is very similar to inheritance in OODBs and PLs such as C++.

Example of inheritance relationship

Alcoholic and Non-Alcoholic Drinks inherit from Beverage, since (necessarily) anything that is a member of the Drinks class is a member of the Beverage class. The inheritance relationship is used to create a hierarchy of classes, typically with a maximally general class named thing at the top, and very specific classes like Orange Juice at the bottom. The critically important consequence of the inheritance relation is the inheritance of properties from the parent (inheriting) class to the child (inherited) class. Thus, anything that is necessarily true of a parent class such as Beverage is also necessarily true of all of its inherited child classes, one of them being Orange Juice.

In our model, the ontology hierarchy starts with concept thing. Continuing the hierarchy tree, we have organizations that can be companies (investee companies such as startups) or financial organizations (investors such as VCs funds).

Crunchbase ontology organization branch concepts

Let’s define an inheritance relationship in our model:

Using our ontology modeler, we use a SQL CREATE command to define an inheritance relationship (using the semantic SQL INHERITS command) such that: advertising company: thing -> organization -> company -> advertising_company (the blue labeled inheritance relationships were already in our model)

By defining this inheritance, we no longer need to specify characteristics of advertising_company besides a new category code: advertising. The rest of the characteristics are already implicit because of the previously defined relationships (organization->company). There can be numerous levels of hierarchy, and the inheritance of characteristics is implicit in the hierarchy.

Now, let’s see the benefits of inheritance when we create a query.

Query Example:

Our model includes the relationship between concept person and concept organization (has employee / works at). Since company inherits the characteristics of organization, it also inherits the relationships of organization with concept person.

Crunchbase SQL ontology person-organization relationships

Our query: we want to find all the persons that work in advertising companies.

We make use of the implicit hierarchy to query the advertising_company concept directly, and use the relationship has_employee to find the company’s employees:

The (partially displayed) results:

Inheritance query results

If we would query using relational SQL (query of the database), we can see the complexity of the inheritance that we would need to express:

So, we can clearly see that by querying the business model we reduce the complexity of the query and more importantly: this complexity is saved not only from the person doing this query but from all other persons that require to do queries that express the inheritance.

Transitive Relationship

Transitivity reasoning corresponds to relationships chained by a property.

Example of transitive relationship

We define relationships located at between: (i) tourist attractions Louvre and Eiffel Tower, and the city of Paris; (ii) between cities Marseille and Paris, and the country France, and (iii) between countries Italy and France, and the continent Europe. Transitivity of the relationship located at allows us to infer that the Eiffel Tower is located in Europe without need of explicitly express the intermediate relationships.

In our model we created transitive relationships that enable simplification of ownership detection.

Query example:

We want to find all the companies (direct or not) that were acquired by Microsoft up to a 3 level (a company purchased by a company purchased by Microsoft).

Since a company can be acquired and/or purchased, we use it in a transitive way. We can traverse the tree in levels using the * sign, and then the number of levels we want to search in the hierarchy.

The (partially displayed) results:

Transitive relationships query results

If we would query using relational SQL (query of the database), we can see the complexity of the transitivity that we would need to express:

Creating queries that require transitivity relationships is increasingly difficult as the chain of relationships is extended. Timbr semantic layer offers the best alternative to eliminate this complexity and does so in standard SQL, without need of transforming data or acquiring new skills.

Conclusion

Adopting Timbr for modeling inheritance brings numerous advantages to the table. It simplifies the data modeling process, enabling developers to represent complex real-world relationships more naturally and accurately. This simplification leads to cleaner, more maintainable code and reduces the likelihood of errors. Furthermore, Timbr’s efficient handling of inheritance structures significantly improves query performance, enabling faster, more reliable analytics. Perhaps most importantly, Timbr’s semantic layer facilitates greater flexibility and interoperability, making it easier to adapt and extend data models as requirements evolve. These benefits collectively result in accelerated development cycles, reduced maintenance overhead, and enhanced analytical capabilities, making Timbr an invaluable tool for organizations looking to leverage complex analytics.

How do you make your data smart?
timbr virtually transforms existing databases into semantic SQL knowledge graphs with inference and graph capabilities, so data consumers can deliver fast answers and unique insights with minimum effort.

Thank You!

Our team has received your inquiry and will follow up with you shortly.

In the meantime, we invite you to watch demo and presentation videos of Timbr in our Youtube channel:

Partner programs enquiry

The information you provide will be used in accordance with the terms of our

privacy policy.

Schedule Meeting

Model a Timbr SQL Knowledge Graph in just a few minutes and learn how easy it is to explore and query your data with the semantic graph

Model a Timbr SQL Knowledge Graph in just a few minutes and learn how easy it is to explore and query your data with the semantic graph

Graph Exploration

Register to try for free

The information you provide will be used in accordance with the terms of our privacy policy.

Talk to an Expert

The information you provide will be used in accordance with the terms of our

privacy policy.