12 May 2009

What is a conformed Dimension?

In Data Warehouse speak, you will hear many terms being thrown around. Today we will briefly discuss conformed dimensions and attempt to remove the trepidation and intimidation from this phrase.

Without getting into relational and dimensional design, in everyday life, when you have a sale of a product, there is a fact / event established in that a sale has been generated. All the attributes that 'fill in the details' are known as dimensions. For instance, I bought a new set of tyres for my vehicle (let's say a nice round figure like $400 for the set). When this sales event is propagated to the warehouse, the details are known as the dimensions, thus the dimensions provide the answers around:
  • Who? new customer or existing customer?
  • Where (which branch)?
  • What was the product sold?
  • When was the sale made?
  • Who made the sale?
  • Was there a supplementary upsell (wheel alignment, tyre disposal, etc.)?

For today's example, let's use the date and time of the sale as our example of a dimension.

In the fact record, once it has passed through a staging area, there should be a key like 20090512 (12th May 2009) to identify the date the sale was made (we could also delve into time of day, but for simplicity's sake, let's not).

This key 20090512 can serve as a key to the date dimension, in which a record will exist for the record identified by key value 20090512. This record might have columns (fields) like:
  • Year 2009
  • MonthNum 05
  • FinMonth 11 (In Australia, the federal tax year end is 30th June, so July is month 1 of the next tax year)
  • MonthName May
  • QTR 2
  • FinQtr 4
  • WeekNumCal 20
  • WeekNumFin 46

so when we query for Calendar Month = 5 and Calendar Year = 2009, we will get the 31 records in the date dimension for May 2009.
These records will then be joined to the facts, where the measures (how much) reside.

That is the easy part done with, dimensions are not that hard to understand really.

So what do we mean by a CONFORMED DIMENSION?

A conformed dimension is a dimension that has the same values for all areas of the business and the dimension that lends itself most readily to this is the date dimension, since April usually means the same thing to every department in the organization. So unless there are some departments that operate on a different fiscal calendar to the rest of the organization, we can now label the date dimension as conformed.

In the event that a particular division / department that wants to use the data warehouse who operates by a different set of dates to the rest of the organization, there are two choices:
  1. Create a custom dimension for this department to use, in which it does not conform to the rest of the organization, resulting in a non-conformed dimension
  2. Add the additional information into the conformed dimension so that the outlying department is able to make use of a single conformed dimension with some custom fields (columns) that no other department uses. This change needs to be included into the ETL / data integration layer so that the non-conformed fields of the dimension are populated and maintained properly.

Customer dimensions are a little more difficult to conform, since various divisions / departments in the organization often disagree in their elementary definitions of a customer. This is where a good business analyst working together with an experienced data modeler can bring in the harmony that is required for a data warehouse to present the value to organizations in that the dimensions are conformed and can be readily used by the different types of business users and they can get reliable, predictable, accurate results.

Creating the integration processes present the largest challenge in data warehouse projects and is often the most time consuming phase of any data warehousing project and conforming the dimensions is just one part of that.

Some dimension values will forever remain static, but some of these values change over time, like address details, Marital status. Some of these dimensions change slowly, some rapidly (like daily exchange rates, hourly pricing configurations). We'll address the historical aspect of slowly changing dimensions in a subsequent blog post.

Hopefully, this post has cleared up some of the mystery surrounding the basic meaning of what a conformed dimension.