Building the Data Warehouse Semantic Layer

Kimball Group puts out some fantastic design tips, and one I especially liked in the last year was one from Joy Mundy (Design Tip #158 Making Sense of the Semantic Layer).  The arrival of this in my inbox was extremely timely because I was at the time involved in a disagreement with a BI consultant who was currently specialized in one of the new breed of BI tools.  His position was that semantic layers were unnecessary and the realm of “traditional, IT-centric reporting environments.”  If I didn’t understand that, well, I just didn’t understand the niche that particular product was fulfilling towards the goal of enabling ad-hoc end user reporting without IT involvement.  The argument started because I was inquiring about the capability of third-party solutions like Kalido of preconfiguring the semantic layer for the tool rather than having to manually configure it.

The cynical side of me leans towards Upton Sinclair’s quote in this area: “It is difficult to get a man to understand something, when his salary depends upon his not understanding it.” In other words, if a BI tool doesn’t yet support the ability to automatically create a semantic layer, it is understandable the vendor takes the approach that a semantic layer isn’t needed—right up until the point they delivery that feature.

In this area, Joy didn’t mince words:

“Is the semantic layer a mandatory component of a DW/BI architecture? The answer is yes, if you plan to open the doors to your DW/BI system for ad hoc use. Every hour you spend on building a rich semantic layer will pay off in improved adoption and success in the user community. Don’t simply run the wizard for your BI tool and generate a semantic layer that’s equivalent to the underlying tables (even if they’re dimensional). Spend the time to make it as rational and polished as you can.”

Let me provide just a few quick examples of how I think this can be useful for any BI tool, even for the new breed of BI tools that pride themselves on not requiring semantic layers.

1)  Let’s say you have 1,000 tables, all with German acronyms.  You already have all of the fields mapped with business friendly names to the unfriendly physical schema, and you even have business definitions for each field, but now it is time for some ad-hoc reporting.  Do you A) turn your end users loose on the repository as is, or B) do you spend a significant amount of time retyping your mappings and definitions into the BI tool import layer to make the fields easier for the end users to understand?  Obviously (B) is what happens in the real world, but why should I have to go through that effort?  Every mapping you make manually in the BI tool is something I should have been able to create automatically if I had direct access to a programmatic API on the BI layer side.

2)  Kalido can create ‘hybrid’ slowly changing dimensions that are consumed by joining to the surrogate key of an object and constraining it at report time with a SQL where clause for the version of time variance you want (the current hierarchy, the hierarchy at time of transaction, or the hierarchy at a specific point in time).  Most BI tools try to auto-join the keys, but in this case, auto-join by itself doesn’t work because the result needs to be further constrained.  For tools with configurable semantic layers, Kalido can provide the join the end user wants automatically. But for less sophisticated BI tools, manual work needs to be done for each join, if the BI tool supports it at all.  For BI tools that don’t support it, you might have to maintain a view layer to resolve the time variance issues to fit with how the tool prefers to see the data.  That’s right — you have to adapt your physical design to the limitations of the BI tool, which greatly conflicts with many of the BI tool messages that they can just accept the data the way it is.

The reason I feel so passionately about this is Kalido is able to automatically generate the semantic layer for leading BI tools with a simple 7-tab wizard we call the Kalido Universal Information Director (see attached image).

Kalido Universal Information Director is a wizard-based tool that automatically generates popular BI tool semantic layers from the business-model-driven Kalido Dynamic Information Warehouse. Both components are part of the Kalido Information Engine.

Kalido Universal Information Director is a wizard-based tool that automatically generates popular BI tool semantic layers from the business-model-driven Kalido Dynamic Information Warehouse. Both components are part of the Kalido Information Engine.

This wizard makes it fast and easy to generate BI tool semantic layers, and because it is reading from the Kalido business information model changes to the underlying warehouse structure can be rapidly reflected in your BI tool. We intend to add support for more BI tools, however we are limited in what we can do if the BI tool vendors themselves don’t have a programmatic API for us to talk to.  Fortunately, as the newer tools evolve and get more success in the enterprise, and their users are faced with significant manual effort to hook these products to larger enterprise data warehouses, we are seeing positive movement in opening up the BI configuration layers to third parties for some of these tools.  As Joy points out, some BI layers are “microscopically thin” while others are “rich and robust,” but Kalido should be able to save you significant design and build work in either case.  And that’s a good thing! This is another example of the automation Kalido brings to data warehousing.

If you are a data warehouse professional and aren’t subscribed to Kimball Design Tips, I highly encourage you to do so.  Even if you are an expert, there are always things to learn.

3 replies
  1. Tim Webber says:

    Stephen, can Kalido generate SAS semantic meta-data? I believe descriptions, table joins, caculated measures ect are all part of the meta-data in their Visual Analytics tool.

    • Stephen Pace says:

      @Tim: We have plenty of customers that use SAS with Kalido; however, at the moment you have to map SAS to the schemas that Kalido generates (just as you would if you had generated the schemas yourself using a custom built approach). While we can’t auto-generate SAS semantic metadata at the moment, we’d certainly consider adding that platform if we get enough customer demand. We’ve recently made a change that makes adding new BI tool platforms easier than it was in the past, but sometimes we are limited in what we can do by how open the target BI API and documentation really is. I haven’t personally looked at the SAS semantic layer to know if it would be easier or harder than other tools we have done like SAP BOBJ, IBM Cognos, or Microsoft SSAS. If you are aware of other solutions that do this for SAS or know of some good documentation that walks through the process, please pass them our way.

      • Tim Webber says:

        Thanks for the reply Stephen. I am in contact with a SAS pre-sales person that has been very helpful and I am sure he can point me in the right direction for SAS documentation. I am not aware of other DW automation software that Generates SAS metadata.

        I really enjoyed this article and the discussion/disagreement you had with the consultant using the “new breed of tool”. I have had similar discussions with enthusiasts using the other popular “new breed tool” over some of its modelling constraints (this tool could not properly model more than one fact table over conformed dimensions without serious hoop jumping). I love the “new breed of tools”. However, some of their developers/enthusiasts have “blinders” to go with those “rose colored glasses”.


Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply

Your email address will not be published. Required fields are marked *