June 14th, 2013 by Ron Dovich

One of the first things you must do in any reporting project is clearly define the metrics you want to measure.  Without a well-documented definition of each metric, the data will be misinterpreted and, in the worst case, it will give the business incorrect guidance.


This post may seem pretty obvious, but I just went through a relatively small project where a major outcome was to ensure that we were all describing the metrics using the same definitions.  That part of the project took longer than expected and generated the most “excitement” by far.  Interestingly, the metrics we were trying to define were not some obscure, domain-specific set of terms; they were common SaaS concepts like registrations and engagements.  The problem is not that people don’t understand the basic terminology.  It is in how you qualify that terminology for your particular company.  For example, what does it mean to be an “engaged user”?  Is it someone who comes to the site once a day…a week…a month?  What if they only come once a month but contribute a bunch of content or spend the whole day using the site?  Do you measure the amount of wall clock time they spend on the site, or maybe the number of pages they visit?  What if you have both an unauthenticated value proposition and an authenticated one?  Is a user “engaged” if they don’t authenticate?  You can see where this goes.
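
One way to force those questions into the open is to write the definition down as something executable.  The following is only a minimal sketch, not the definition we settled on: the `Visit` record, the `EngagementDefinition` fields and the `is_engaged` function are all hypothetical names, and the thresholds are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Visit:
    day: date
    authenticated: bool


@dataclass(frozen=True)
class EngagementDefinition:
    # Every field is a hypothetical knob the team has to agree on.
    window_days: int             # look-back period: 1, 7, 30, ...
    min_visit_days: int          # distinct days with a visit in the window
    require_authenticated: bool  # do anonymous visits count?


def is_engaged(visits, definition, as_of):
    """Return True if the visits satisfy the given definition as of a date."""
    window_start = as_of - timedelta(days=definition.window_days - 1)
    qualifying_days = {
        v.day
        for v in visits
        if window_start <= v.day <= as_of
        and (v.authenticated or not definition.require_authenticated)
    }
    return len(qualifying_days) >= definition.min_visit_days


# Illustrative data: one anonymous visit and one authenticated visit.
visits = [
    Visit(date(2013, 6, 3), authenticated=False),
    Visit(date(2013, 6, 10), authenticated=True),
]
as_of = date(2013, 6, 14)

monthly_any = EngagementDefinition(30, 1, False)
weekly_authenticated = EngagementDefinition(7, 2, True)

print(is_engaged(visits, monthly_any, as_of))           # True
print(is_engaged(visits, weekly_authenticated, as_of))  # False
```

The same user is “engaged” under one definition and not under the other, which is exactly the kind of disagreement that stays hidden until the parameters are written down.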

Simply getting agreement on the definition is not enough.  You also have to agree on the source of the data.  In our engagement example, do you use log files, Google Analytics or some other tracking mechanism?  They all might produce an answer, but is it the one you want to use as your source-of-truth about the business?  In some cases you may have to correlate data from multiple sources to get the metric you want, and that creates additional complexity.  In addition to identifying the source of the data, you should also identify the method used to get it.  Are you writing an ETL job to parse a log file or calling an API?  Those two methods may produce different answers because of the assumptions made by the developer.
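
A simple way to keep the source and the method attached to the definition is to record all three in one place.  The sketch below is an assumption about how that might look, not a description of our actual tooling: `MetricSpec`, `relative_gap` and the counts are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricSpec:
    name: str
    definition: str   # the agreed wording, in plain language
    source: str       # the system treated as source-of-truth
    method: str       # how the number is extracted from that source


def relative_gap(count_a, count_b):
    """Relative difference between two counts of the same metric."""
    larger = max(count_a, count_b)
    if larger == 0:
        return 0.0
    return abs(count_a - count_b) / larger


engaged_users = MetricSpec(
    name="engaged_users",
    definition="Distinct users with a visit on 2+ days in the last 7 days",
    source="web server log files",
    method="nightly ETL job that parses the logs",
)

# Made-up counts for the same week from two different sources.
from_logs = 1200
from_analytics = 1000

gap = relative_gap(from_logs, from_analytics)
print(f"{engaged_users.name}: sources differ by {gap:.1%}")
```

Checking the gap between two candidate sources before picking one makes the choice of source-of-truth a deliberate decision rather than an accident of whoever wrote the first report.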

The process of defining your terms is critical in any company that has more than one employee.  All of us carry a bunch of lexical baggage that will often generate bad assumptions.  The only way to avoid making the wrong assumptions is to get in a room, roll up your sleeves and work with all of the constituents who are going to use your reports.  In the end, make sure you have documented and agreed on your terms.