Live and Learn… The Cost of Industry Standard Data Models

In a previous life I pushed for the development of industry data models at my employer… a large data warehousing company. The company was not interested so I left, started running a consulting firm, built a comprehensive Telco Data Model, and sold it to my ex-employer. It was the first of what is now a suite of industry-standard models offered by the firm.

Over time I have become less enamoured of these industry models. First, 15 years ago, I became concerned that too many IT Enterprise Data Modelers were trying to force their companies to all speak the same enterprise language. I spoke at conferences suggesting that an Enterprise Model was a canonical model… a Rosetta stone… that let all of the specific dialects of the enterprise (marketing-speak, finance-speak, and manufacturing-speak) be integrated and translated. It was not meant to be spoken. I said that the Enterprise Data Model was like Esperanto… the best language ever invented… but that no one needed to speak it, as English, Urdu, and Cantonese work just fine.

But as time has gone on the dialects in any given Enterprise have lost some of their uniqueness and there is a larger common dictionary available. This is in part the result of data warehousing… which provides a common tongue. And in part the result of MDM, which seeks to define a dictionary and grammar for the Enterprise. But I think that it is the widespread use of a single relational paradigm, more than anything else, that provided a grammar to standardize how we describe Enterprise data. The result is that each company has formed its own unique Enterprise language and the departmental dialects have softened.

So we were both a little right and wrong 15 years ago. The Enterprise modellers were right to believe that a universal Enterprise language would be a good thing. I was maybe correct in believing that no language should be imposed.

What has this to do with standard data models? First, a standard model imposes an outside language on your corporation. The nouns and verbs map to the business… but the sound feels a little foreign and there is a learning curve to get there… it has the uncomfortable feel of being a Bostonian in Atlanta or a Yankee in Australia… each speaks English but there is something different going on that requires extra thought and concentration.

But the real issue is the hidden cost to your BI eco-system. When you ETL data from your source systems… systems in your unique Enterprise dialect… into the “standard” dialect, more than a little effort is required.

I recently participated in a discussion with a company that had just requested bids for an end-to-end data warehouse/business intelligence re-development. They said that the bids for services… no hardware or software… bunched into three categories based on the fixed-price bid.

At one end were two companies that offered industry-standard data models. You could probably guess who they are. In the middle was a company that offered two pre-defined data mart data models in support of a very specific application requirement. At the other end were three companies that offered no data models… but would build them as part of the service. The bids within each group were remarkably similar.

Who was the least expensive? The firms that bid no data model were about 50% less expensive than the bidder that offered two pre-defined data mart models. The firms that bid no data models were about 1/2 the cost of the firms that offered industry-standard data models.

What would explain this? It is the ETL costs associated with mapping and translating the unique Enterprise language of this company to the offered standard that make the difference. Fitting your company's data dictionary into the dictionary from another dialect within the same industry is an expensive proposition… it cost twice as much to develop into a standard data model… and that is before counting the human cost of having to learn the new dialect.

Sometimes my current employer is under-rated due to its lack of industry-standard models. But I’m no longer so sure that these models are a good thing. Now may be the time to rethink.

I would like to acknowledge the article by Margy Ross at Kimball University (see http://www.informationweek.com/news/227400287) for starting my personal rethink on this topic.