How much does your data weigh?

Business Improvement via data metrics

Measurement can be key to improving data. But there are almost too many potential measures when it comes to data: every column, every row, every table, and every relationship can be measured. And that does not even get into the possibilities of metadata or data quality. With all these possibilities, coming up with a measurement scheme can seem too costly. And without proper focus, it will be.

So what to focus on?

The four areas that need the most focus are:

  1. Check if objectives are being met
  2. See how the expected “control points” are changing
  3. Make sure the processes put in place work as intended
  4. Watch to see when sizing and other assumptions will be violated

1. Objective Metrics: Check if objectives are being met

As part of Data Governance it is important that the business revisit this topic on a regular basis. Here are some examples of objectives I have discussed with clients recently:

  • Reduce the time it takes to onboard a new customer / product / location
  • Reduce bounced communications (e.g. mailings, emails, phone calls, …)
  • Improve Customer response (e.g. conversion rate, click throughs, ..)
  • Improve Compliance (e.g. Know Your Customer, Physician Spend, Conflict Minerals, …)

There are many other examples I could give, but in all cases this is one of the key areas to measure. As much as possible these items should be measured against historical data so that a baseline can be created for a before-and-after view.

Any new data governance initiative (e.g. an MDM or data cleansing implementation) needs to identify the requirements it is expected to meet. As these requirements are developed, the corresponding metrics to measure success should also be created. The data governance team should then review these metrics against the historical baseline going forward.
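As a simple illustration, here is a minimal sketch of a before-and-after objective metric in Python. The table and column names (onboarding_events, started_at, completed_at) and the go-live date are assumptions for illustration, not taken from any particular system.

    import sqlite3

    GO_LIVE = "2024-01-01"  # assumed go-live date of the initiative

    conn = sqlite3.connect("governance_metrics.db")  # hypothetical database

    # Average onboarding time, in days, before and after the initiative.
    query = """
        SELECT AVG(julianday(completed_at) - julianday(started_at))
        FROM onboarding_events
        WHERE completed_at {op} ?
    """

    baseline = conn.execute(query.format(op="<"), (GO_LIVE,)).fetchone()[0]
    current = conn.execute(query.format(op=">="), (GO_LIVE,)).fetchone()[0]

    print("Baseline onboarding time (days):", baseline)
    print("Post-initiative onboarding time (days):", current)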

2. “Control Point” Metrics: See how the expected “control points” are changing

“Control points” refers to the data elements that are expected to actually affect the objectives. For example, in the case of onboarding a customer, what are the data elements that would slow down the process? These could be invalid addresses, duplicate entries in the SFA tool, missing phone numbers, etc. Each of these potential causes is a “control point” and should be measured.

Each new data project should include a design showing what data changes need to occur to meet the requirements and goals. As these designs are created, metrics should be identified to measure. Note that some of these may be direct data measures, i.e. counting the customer records with and without a home phone. Others may be metadata measures, i.e. counting missing field descriptions for customer data sources. The data governance team should review control point metrics alongside the business objective measurements.
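To make that concrete, here is a minimal sketch of a few control point counts over customer records. It assumes the data arrives as a list of dicts exported from the SFA tool; the field names (home_phone, address, email) are illustrative assumptions.

    from collections import Counter

    def control_point_metrics(customers):
        """Count data defects that are expected to slow down onboarding."""
        metrics = Counter()
        seen_emails = set()
        for c in customers:
            if not c.get("home_phone"):
                metrics["missing_home_phone"] += 1
            if not c.get("address"):
                metrics["missing_address"] += 1
            email = (c.get("email") or "").strip().lower()
            if email and email in seen_emails:
                metrics["duplicate_email"] += 1  # crude duplicate proxy
            seen_emails.add(email)
        return metrics

    # Made-up sample records for illustration.
    customers = [
        {"email": "a@example.com", "home_phone": "555-0100", "address": "1 Main St"},
        {"email": "A@example.com", "home_phone": None, "address": ""},
    ]
    print(control_point_metrics(customers))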

3. Process Metrics: Make sure the processes put in place are working as intended

As new processes and systems are put in place, it is important to measure the activity of those systems. Like the control point metrics, the process metrics need to be based on the design work for data projects. These metrics will show whether the design is meeting functional and non-functional requirements, and they are a key way of ensuring SLAs are met.

Process metrics are also likely to be specific to underlying technology choices. For example, users of the Informatica MDM Hub can use a product like the Hub Analyzer by GlobalSoft (http://www.globalss.com/mdmsol_hubanalyzer). Tools like this can be vital in tracking day-to-day operations and can help in tuning system configuration. Process metrics should be collected and reviewed as early in the development cycle as possible to create baselines. They should be reviewed by the operations team on a regular basis, and the data governance team should track whether process metrics are varying unexpectedly.
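Even without a dedicated product, a basic process metric can be as simple as timing a job against its SLA. The sketch below assumes a 30-minute SLA and a hypothetical nightly load; it is not tied to any specific tool.

    import time

    SLA_SECONDS = 30 * 60  # assumed SLA: the batch load completes within 30 minutes

    def run_with_sla(job_name, job):
        """Run a job, record its duration, and flag SLA breaches."""
        start = time.monotonic()
        job()
        elapsed = time.monotonic() - start
        status = "OK" if elapsed <= SLA_SECONDS else "SLA BREACH"
        print(f"{job_name}: {elapsed:.1f}s [{status}]")
        return elapsed

    # Stand-in for a real batch load.
    run_with_sla("nightly_customer_load", lambda: time.sleep(1))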

4. Assumption Metrics: Watch to see when sizing and other assumptions will be violated

As part of the design process key assumptions should be collected. These should also be turned into metrics to ensure that the assumptions continue to hold. Collecting and reviewing these metrics allows more proactive planning if trends show they will be violated at some point. A common example of this is sizing assumptions. These metrics should be reviewed by the operations team, and by the data governance team whenever projections show limits being exceeded.
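For instance, a sizing assumption can be monitored with a simple trend projection. This sketch fits a linear trend to weekly row counts (made-up numbers) against an assumed 10-million-row capacity from the original design.

    # Weekly observations: (week number, row count). Sample data for illustration.
    observations = [(0, 6_200_000), (1, 6_500_000), (2, 6_900_000), (3, 7_200_000)]
    CAPACITY = 10_000_000  # assumed sizing limit from the original design

    # Ordinary least-squares fit of row count vs. week.
    n = len(observations)
    mean_x = sum(x for x, _ in observations) / n
    mean_y = sum(y for _, y in observations) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in observations)
             / sum((x - mean_x) ** 2 for x, _ in observations))
    intercept = mean_y - slope * mean_x

    weeks_to_limit = (CAPACITY - intercept) / slope
    print(f"Growth: ~{slope:,.0f} rows/week; limit reached around week {weeks_to_limit:.0f}")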

Focusing on a few metrics in each of these four areas will allow a data governance team to make sure data initiatives are on track and to identify new opportunities.

I am not suggesting these are the only metrics. Someone should always be looking at new potential metrics that are not part of the initial design. For these it is key to take a good “data science” approach and understand what actions the potential metrics suggest. If an action can’t be determined, more work needs to be done.

To help discover new metrics, it is best that key data assets be organized in such a way that metadata, data changes, and other operations can be measured at points in time in the future. In other words, design data repositories, both “Big Data” and “small data”, to be measured as potential “control points” in the future.
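One simple way to make a repository measurable is to capture time-stamped metric snapshots as a matter of course. This last sketch shows the idea; the table and metric names are illustrative assumptions.

    import sqlite3
    from datetime import datetime, timezone

    conn = sqlite3.connect("governance_metrics.db")  # hypothetical metrics store
    conn.execute("""
        CREATE TABLE IF NOT EXISTS metric_snapshots (
            captured_at  TEXT NOT NULL,
            metric_name  TEXT NOT NULL,
            metric_value REAL NOT NULL
        )
    """)

    def record_metric(name, value):
        """Append a time-stamped snapshot so any metric can be trended later."""
        conn.execute(
            "INSERT INTO metric_snapshots VALUES (?, ?, ?)",
            (datetime.now(timezone.utc).isoformat(), name, value),
        )
        conn.commit()

    record_metric("customers_missing_home_phone", 1423)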