Frequently Asked Questions
General
Timbr provides a graph-style layer featuring ontology modeling and semantic reasoning in SQL, for virtual integration of data from diverse data sources, including data lakes and data warehouses. Timbr introduces a new paradigm for knowledge representation and machine learning support of data consumption, alleviating difficulties in integration and data management and providing conventional SQL-fluent databases with relationship-rich, smart knowledge graph capabilities. Timbr enables querying data from abstracted views with no-joins SQL queries that save time and effort.
A SQL Knowledge Graph is the implementation of ontologies and graph theory in standard SQL. It has three components: (i) a virtual SQL ontology of connected, context-enriched concepts with inference capabilities and graph analytics features; (ii) a mapping of the virtual SQL ontology to existing databases accessible in SQL; and (iii) a query execution engine that translates SQL queries on the ontology into SQL queries pushed down to the underlying databases. The SQL Knowledge Graph closes the gap between knowledge representation and enterprise databases, legacy systems, data warehouses and data lakes, enabling smart, semantic data fabrics and digital twins without the need to change DBMS infrastructure.
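The three components can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for an existing source database and is not Timbr's implementation: the table names, the `MAPPINGS` dictionary and the `rewrite` helper are all hypothetical, and the "engine" is plain subquery expansion.

```python
import sqlite3

# Stand-in for an existing source database (table names are hypothetical).
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE crm_clients (client_id INTEGER, full_name TEXT, country_code TEXT);
    CREATE TABLE ref_countries (code TEXT, country_name TEXT);
    INSERT INTO crm_clients VALUES (1, 'Ada', 'GB'), (2, 'Linus', 'FI');
    INSERT INTO ref_countries VALUES ('GB', 'United Kingdom'), ('FI', 'Finland');
""")

# (i) + (ii): each virtual concept is mapped to SQL over the physical tables.
MAPPINGS = {
    "customer": """
        SELECT c.client_id AS customer_id, c.full_name AS name, r.country_name AS country
        FROM crm_clients AS c
        JOIN ref_countries AS r ON r.code = c.country_code
    """,
}

def rewrite(concept: str, columns: list[str], where: str = "1=1") -> str:
    """(iii): expand a query on a concept into SQL on the source tables."""
    cols = ", ".join(columns)
    return f"SELECT {cols} FROM ({MAPPINGS[concept]}) AS {concept} WHERE {where}"

# The user asks for customers by country and never writes the JOIN.
source_sql = rewrite("customer", ["name", "country"], "country = 'Finland'")
rows = db.execute(source_sql).fetchall()
print(rows)
```

The user-facing query names only the concept and its properties; the join between the two physical tables lives in the mapping and is executed by the source database.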
Timbr offers a fast, easy and no-risk implementation of the semantic graph. The main reasons are that there’s no need to move data or learn any new proprietary query languages to work with the Knowledge Graph.
Modeling a SQL ontology can be done either manually or automatically from an ERD, OWL ontologies, or from data catalogs.
The mapping of the data to the Knowledge Graph is also done either manually or semi-automatically.
Conceptual modeling is a representation of the real world. It is the first step of data modeling, a method developed to help with the design of databases and defining a formal vocabulary for the organization.
The process leading to the actual modeling and creation of databases leaves out information that is key to understanding and using data effectively. To make up for this missing information, enterprises have to code complex queries in complex applications.
Ontologies are an effective means to re-create the information left behind, giving back business meaning to the data, simplifying data access and delivering unique analytical capabilities.
An ontology defines a common vocabulary for an organization that needs to share information in a domain.
This includes machine-interpretable definitions of basic concepts in the domain and relations among them.
An ontology is structured as a graph, where every node on the graph represents a “concept.”
A concept could be anything: Person, Place, Customer, Car, Country, Product, Event etc.
SQL ontologies are ontologies that implement the Semantic Web in SQL and are designed to provide common business meaning to data distributed in varied sources and enable them as concepts with inference and graph traversal capabilities to facilitate discovery, use and access to data.
With Timbr, you can model and explore your ontology visually or in standard SQL. The SQL Ontology is exposed to the SQL user as a virtual schema with virtual tables (concepts) using any SQL client with JDBC/ODBC.
Graph Data Exploration is a unique feature of Timbr that automatically transforms non-graph data into a virtual graph to allow users to visualize and explore data relationships as a network. Data consumers, business analysts and domain experts can discover insights and answer questions without writing any code.
Semantic SQL makes it simple to write SQL queries with no JOIN or UNION statements. Semantic SQL queries are formulated in standard SQL and query the semantic business model (ontology) mapped to the data, instead of querying the data directly. Semantic SQL is also used to query views created with the semantic model. Users benefit from a 360° view of data, graph traversals and semantic reasoning features, so SQL queries become easier to understand and significantly shorter.
The Semantic Web is a project devised by Tim Berners-Lee, James Hendler and others, and adopted by the W3C (the World Wide Web Consortium, the main standards body for the Web). The Semantic Web implements ontologies so that machines connected to the Web “understand” each other by sharing a common meaning of data using a set of standards. The standards developed by the W3C define, among others, an ontology modeling language (OWL) and a query language (SPARQL).
Timbr implements the principles of the Semantic Web in standard SQL, meaning that both the ontology modeling and the queries are done in SQL.
Creating a SQL Knowledge Graph is a simple process:
1. Connect your databases to the virtual layer using JDBC connectors.
2. Model the SQL ontology visually or using Timbr SQL DDL statements, or import from other sources (data catalogs, OWL ontologies, ERD tools).
3. Map the ontology concepts to the data.
That’s it, your SQL Knowledge Graph is ready for use and can start delivering unique insights via SQL queries, graph data exploration, your BI tools, or using Timbr’s embedded charts and dashboard module.
The SQL Knowledge Graph serves as a virtual graph over all the enterprise data engines. It simplifies SQL queries, implements graph traversals in SQL and makes enterprise data easier to manage, so advanced analytics can be delivered with minimum effort. Organizations use it to integrate, analyze and explore their data sources and silos of information without the need to move or transform data. Data consumers benefit from 360° access to data to get fast answers to key business questions. By querying concepts instead of tables, SQL queries are significantly reduced in length and complexity. The SQL Knowledge Graph integrates with popular business intelligence tools so business analysts can focus on the business questions and derive deeper insights.
Timbr is not a database. Timbr is a platform used for creating virtual SQL Knowledge Graphs that enable semantic (ontology-based) graph capabilities on existing data engines (data warehouses and data lakes). The SQL Knowledge Graphs integrate data sources into a semantic data fabric queryable in SQL. Timbr does not require copying or transforming data (no ETL operations), new DBMS infrastructure, or the new skills that graph databases demand.
Exploring relationships is done by using Timbr’s Graph Data Exploration module.
Timbr enables knowledge representation and reasoning, a field of artificial intelligence, in SQL.
Timbr’s AI features include:
- Logical concept/object compositions combining objects or data types into more complex ones.
- Contextual modeling that enables contextual adaptation required to construct models for classes of real-world phenomena.
- Composition of semantic relationships.
- A semantic reasoner that infers knowledge from semantic relationships.
Knowledge representation is the field of artificial intelligence (AI) dedicated to representing information about the world in a form that a computer system can utilize to solve complex tasks such as diagnosing a medical condition or having a dialog in a natural language. Knowledge representation incorporates findings from psychology about how humans solve problems and represent knowledge in order to design formalisms that will make complex systems easier to design and build. Knowledge representation and reasoning also incorporates findings from logic to automate various kinds of reasoning, such as the application of rules or the relations of sets and subsets (Wikipedia).
Translated for the use of enterprises, knowledge representation provides the most efficient means for non-technical data consumers to access and retrieve data stored in databases (in the form of tables and columns for example), using abstract concepts that represent the real world (known as semantics in knowledge representation), such as “customer”, “product”, “employee”, “asset”, etc. This need arises from the fact that databases do not provide a means to use such concepts to give uniform meaning to data, because real world concepts such as “customer”, “product”, etc. are usually contained in several tables and columns or even multiple databases which are inaccessible to non-technical data consumers.
A virtual semantic graph is a layer that virtually integrates heterogeneous data from many sources and provides a common meaning to linked datasets. It focuses on the relationships between entities, is able to infer new knowledge out of existing information and makes possible advanced graph analytics. Timbr delivers these and additional data management capabilities.
Gartner describes the data fabric architecture as the means of supporting “frictionless access and sharing of data in a distributed network environment.” To make this architecture work, it is necessary to implement a means to understand and assemble heterogeneous data by providing it with business meaning and flexibly integrating sources of any structural type. This is challenging for several reasons, starting with the constraints of the architectural alternatives and ending with the complexity of understanding and querying the aggregated sources.
The available options haven’t changed much in the last decade: consolidating data into a data warehouse or graph-structured database, federating reporting, and data federation. Each alternative introduces implementation and operational constraints that involve investments in infrastructure and maintenance and varied skill sets. Moreover, until now there hasn’t been an integrative solution that resolves both the architectural and the “last mile” usability challenges, that is, one that encapsulates the complexity of the solution so that data consumers across the enterprise gain a 360° view of data and can easily generate complex queries to deliver advanced analytics and reporting quickly.
Data virtualization is a viable answer to this challenge, but it is costly because of the continuous maintenance that indexes require and because it lacks the relationship-rich semantic capabilities that are key to reducing complexity for end users and speeding up analytics on dynamic data sources.
The semantic data fabric is a flexible, reusable layer and set of data services used as the single source providing universal meaning and context to data for the entire organization. The data fabric integrates the on-premises and cloud data sources in use by the organization, giving them semantic capabilities to answer complex queries and to facilitate understanding and use of data. It provides consistent capabilities across on-premises and multiple cloud environments to accelerate digital transformation. Timbr enables the fastest and most convenient implementation of a semantic data fabric connected to your cloud and on-premises databases and business intelligence tools. Contact us to schedule a demo.
Timbr’s Master Data Management is stored in a Git repository. Users can connect their own Git repository or use the Git server that comes with Timbr. All the different tables are stored in Git. When querying the data later, users don’t need to query Git each time; instead, they can cache the data in Timbr.
SQL access to graph-based solutions means that queries are created in SQL and then converted into SPARQL or another graph query language. For the most part, this means that a set of SPARQL queries or components corresponding to SQL query statements has to be created in advance, so constant maintenance and the availability of SPARQL-skilled personnel are required. In addition, the data is stored in a graph database, so there’s a constant need to ETL data from the source into graph format.
In the case of a SQL knowledge graph, semantic SQL queries use standard SQL commands and are not translated into a separate graph query language. The data itself remains in place without needing to be transformed.
A knowledge graph is a knowledge base that uses a graph-structured data model or topology to integrate data. Knowledge graphs are often used to store interlinked descriptions of objects, events or concepts with free-form semantics. Knowledge graphs use ontologies to put data in context via linking and semantic metadata providing a framework for data integration, unification, analytics and sharing.
They are also prominently associated with and used by Google, Bing, and Yahoo, and with question-answering services such as Google Assistant, Siri and Alexa. All these examples were developed with proprietary tools.
For organizations that look to benefit from knowledge graphs, the available solutions in the market require significant changes in their IT departments. One reason is that most of the data in the world is stored in formats that are not compatible with the format in which data is stored in knowledge graphs, so data needs to be extracted from its current DBMS, transformed to a new format and loaded into a separate, suitable DBMS. Another reason is that, to use knowledge graphs, data engineers and consumers need to acquire new skills to model in OWL and query in SPARQL.
Different from most other solutions, the Timbr SQL Knowledge Graph platform creates a virtual layer that works in standard SQL to seamlessly connect to existing databases and is implemented without requiring new skills.
Contact us to learn how Timbr can help your organization join the knowledge revolution.
Yes. Timbr’s default implementation for graph algorithms is NetworkX, and it is applied automatically: when a user writes a SQL query, Timbr runs the algorithm behind the scenes. Timbr also supports NVIDIA’s cuGraph (GPU), enabling graph algorithms with higher performance.
- Computing power depending on the application size:
  Small-Medium: 4 CPU 16 GB RAM server.
  Large: 8 CPU 32 GB RAM server.
- Deployment options (Docker or Kubernetes):
  Docker/Linux: Linux image or automatic deployment via docker-compose can be installed on any Linux server (you can extend our YAML to add your security protocols, configurations, and customizations).
- Platform requirements:
  List of databases with type and version information to validate connectors.
  Timbr MySQL metadata DB (mandatory; configurable as a container or managed externally).
Use Cases
A semantic data catalog is an intelligent catalog/inventory of data assets that automates the sharing of common meanings of data across data silos and provides a means to define hierarchies and relationships featuring semantic reasoning. It serves as a queryable, AI-enabled knowledge encyclopedia of the organization. Timbr enables the fastest and most convenient implementation of semantic data catalogs connected to your databases and business intelligence tools, and can leverage existing data catalog solutions such as Collibra or Informatica. Contact us to schedule a demo.
A digital twin is a digital replica of potential and actual physical assets, processes, people, places, systems and devices that can be used for various purposes. The digital representation provides both the elements and the dynamics of how an Internet of Things (IoT) device operates and lives throughout its life cycle.
Digital twins have two important characteristics:
1. Each definition emphasizes the connection between the physical model and the corresponding virtual model or virtual counterpart.
2. This connection is established by generating real-time data using sensors.
Timbr helps enterprises create digital twins by enabling the definition of the virtual model using SQL ontologies and by connecting the virtual model to the data lakes that contain the sensor data. Contact us to schedule a demo to see why Timbr facilitates the fastest and most convenient implementation of digital twins.
Data mesh tries to solve three challenges with a centralized data lake/warehouse:
- Lack of ownership: who owns the data – the data source team or the infrastructure team?
- Lack of quality: the infrastructure team is responsible for quality but does not know the data well
- Organizational scaling: the central team becomes the bottleneck, such as with an enterprise data lake/warehouse
While a data mesh aims to solve many of the same problems as a data fabric (namely, the difficulty of managing data in a heterogeneous data environment), it tackles the problem in a fundamentally different manner. In short, while the data fabric seeks to build a single, virtual management layer atop distributed data, the data mesh encourages distributed groups of teams to manage data as they see fit, albeit with some common governance provisions.
The goal of a data mesh is to treat data as a product. Each source has its own data product manager/owner (who is part of a cross-functional team of data engineers) and is its own clearly focused domain with an autonomous offering. These domains become the fundamental building blocks of the mesh, leading to a domain-driven distributed architecture.
Another component in a data mesh is data infrastructure as a platform, which provides storage, pipeline, data catalog, and access control to the domains. The main idea is to avoid duplicating effort. This will allow each data product team to build its data products quickly. Note this data infrastructure platform should not become a data platform (it stays domain agnostic).
Timbr provides all the features required to successfully implement the enterprise data mesh. Contact us to learn more.
Modeling
Timbr offers a visual ontology modeler as well as a SQL DDL editor to easily model the ontology. Timbr can generate concepts from the database schema, from data catalogs or by importing existing OWL ontologies.
Timbr offers a visual data mapper to manually or semi-automatically select tables and columns from the database, as well as the option for coders to use SQL DDL statements. Timbr can filter, clean and transfer the data that is being mapped to the ontology. No ETL operations are needed.
Timbr implements the Semantic Web in SQL. For this reason there’s a correspondence between OWL ontologies and SQL ontologies that is used by Timbr to make the transformation. One example of such a process is Timbr DBpedia.
Timbr allows creating virtual primary keys (PKs) for concepts (used as unique identifiers), and foreign keys (FKs) to PKs in the ontology (used as relationships between concepts). As long as the ontology author maps the physical tables’ PKs to the ontology PKs, client joins follow these declarations. In the ontology, you create relationships between concepts using FK statements. In each relationship, you specify the properties in the ontology that represent the relationship (used for the JOIN).
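The idea of a declared relationship driving the JOIN can be sketched as follows. This is an illustration only, using Python's built-in sqlite3: the `RELATIONSHIPS` structure, the relationship name `made_by` and the `traverse` helper are hypothetical and do not reproduce Timbr's FK syntax.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (purchase_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (1, 'Ada'), (2, 'Linus');
    INSERT INTO purchase VALUES (10, 1, 25.0), (11, 1, 5.0), (12, 2, 40.0);
""")

# Hypothetical declaration: purchase.customer_id is an FK to customer's PK.
RELATIONSHIPS = {
    "made_by": {"source": "purchase", "target": "customer",
                "source_key": "customer_id", "target_key": "customer_id"},
}

def traverse(rel_name: str, select: str) -> str:
    """Build the JOIN implied by a declared relationship."""
    r = RELATIONSHIPS[rel_name]
    return (
        f"SELECT {select} FROM {r['source']} "
        f"JOIN {r['target']} "
        f"ON {r['source']}.{r['source_key']} = {r['target']}.{r['target_key']}"
    )

# The join condition comes from the declaration, not from the query author.
sql = (traverse("made_by", "customer.name, SUM(purchase.amount)")
       + " GROUP BY customer.name ORDER BY customer.name")
totals = db.execute(sql).fetchall()
print(totals)
```

Because the key properties are stored once with the relationship, every query that traverses it gets the same join condition.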
Yes, Timbr is accessible in JDBC/ODBC and the ontology can be created programmatically using Timbr SQL DDL statements:
CREATE CONCEPT (extension of CREATE TABLE statement)
CREATE MAPPING (extension of CREATE VIEW statement)
In many cases, we build small scripts to generate parts of the ontology programmatically.
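A small generator script of this kind might look like the sketch below. The exact Timbr DDL grammar is not shown in this FAQ, so the statement shape is an assumption that simply follows CREATE TABLE; the concept names and the `create_concept_sql` helper are hypothetical.

```python
# Hypothetical concept definitions: name -> {property: SQL type}.
CONCEPTS = {
    "customer": {"customer_id": "INT", "name": "VARCHAR"},
    "product": {"product_id": "INT", "title": "VARCHAR"},
}

def create_concept_sql(name: str, properties: dict[str, str]) -> str:
    """Render one CREATE CONCEPT statement, shaped like CREATE TABLE (assumed grammar)."""
    cols = ", ".join(f"{prop} {sql_type}" for prop, sql_type in properties.items())
    return f"CREATE CONCEPT {name} ({cols});"

statements = [create_concept_sql(n, p) for n, p in CONCEPTS.items()]
for stmt in statements:
    print(stmt)
```

The generated statements would then be sent over the JDBC/ODBC connection like any other SQL; check the real clauses against the Timbr DDL reference before using them.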
Timbr fully complies with OWL DL (the computable closed world version of OWL). Timbr’s SQL rewrite engine is an inference engine that automatically generates queries based on inference rules, including inheritance, transitivity and others. Inference can be defined through Timbr’s visual interface or via SQL statements. In addition, Timbr adds capabilities that are not found in OWL itself, only in extensions such as SWRL or SPIN rules. In Timbr, users simply add business logic using SQL. Timbr can also express aggregations as ontology concepts, which cannot be done in OWL.
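As an illustration of how a transitivity rule can be expressed purely as generated SQL, the sketch below uses a recursive common table expression in SQLite. It shows the general query-rewriting idea, not Timbr's engine; the `part_of` table and its data are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE part_of (child TEXT, parent TEXT);
    INSERT INTO part_of VALUES ('piston', 'engine'), ('engine', 'car'), ('wheel', 'car');
""")

# Query a rewriter could generate for a transitive "part of" relationship:
# start from the stored facts, then keep following parent links.
TRANSITIVE_PART_OF = """
    WITH RECURSIVE closure(child, ancestor) AS (
        SELECT child, parent FROM part_of
        UNION
        SELECT c.child, p.parent
        FROM closure AS c JOIN part_of AS p ON p.child = c.ancestor
    )
    SELECT ancestor FROM closure WHERE child = ? ORDER BY ancestor
"""

ancestors = [row[0] for row in db.execute(TRANSITIVE_PART_OF, ("piston",))]
print(ancestors)
```

Only two facts about the piston's lineage are stored directly; the fact that a piston is part of a car is inferred at query time, with nothing materialized.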
Integrations
Yes, Timbr is compatible with OWL DL and some OWL 2 inferences.
If there is a clear business value in adding more OWL 2 inferences, we can support them as well. Timbr’s inference engine is based on query-rewriting techniques. If queries perform slowly, Timbr can materialize the specific part of the knowledge that is required.
Yes. Timbr provides a comprehensive solution to integrate multiple databases in varied locations. Timbr is deployed on Kubernetes or Docker, at the user’s choice. Timbr also supports multi-cluster deployments, so users can deploy Timbr on Azure, Google Cloud or AWS. In general, Timbr recommends the cloud because of the managed services, though Timbr can also run on-premises. The user can decide whether to run queries locally on-premises or in the cloud to benefit from Timbr’s multi-cluster deployment.
Timbr connects to all popular data lakes, databases, BI tools, data science tools and notebooks, as well as various applications (APIs).
Once connected, the data can be queried in SQL, Python/R, dataframes, and natively in Apache Spark (SQL, Python, R, Java, Scala). GraphQL can be supported by integrating external open source projects that support the translation of GraphQL to SQL.