In today’s world of smart gadgets and free WiFi, the majority of businesses collect data on their users. Companies want to know where you are, how you use their website or app, whether you prefer a mobile phone or desktop, whether users run into problems in the same areas, and more. That data is analyzed to determine market strategy, budget, areas of opportunity, where most business is conducted and other areas of need. But collecting these metrics means nothing if data quality is low.
Data can be a very useful and powerful tool. There is a sentiment that the more data you can gather, the better. However, that is not necessarily the case. Some companies, along with some tech and marketing folks, will say to take all the data you can and store it in a big lake, where it sits until the company finds a use for it and fishes it out. But the truth is, data that is not governed is low quality. More often than not, companies end up with a costly swamp instead of a lake, and very few insights.
Data that is misaligned quickly becomes a data swamp. It’s murky. You can’t find anything in it. Misalignments occur in how dates are formatted, for example 1-1-1900 or 01-01-1900 or 1/1/1900. Or you might go in looking for a specific state, only to find that some systems abbreviate states, others write them out, others store the whole address in one field and some include foreign states.
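To make the date example concrete, here is a minimal Python sketch of pulling those three spellings into one canonical value. The function name `normalize_date` and the list of formats are assumptions for illustration, and the sketch assumes month-first dates; a real system would list the formats it actually receives and decide how to treat ambiguous ones.

```python
from datetime import date, datetime

# Formats assumed for illustration (month first). A real pipeline would
# list whatever formats its source systems actually produce.
KNOWN_FORMATS = ("%m-%d-%Y", "%m/%d/%Y")


def normalize_date(raw: str) -> date:
    """Parse a date string written in any known format into a date object."""
    cleaned = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue  # not this format, try the next one
    # Fail loudly rather than guess: an unparsed date is a governance signal.
    raise ValueError(f"unrecognized date format: {raw!r}")


if __name__ == "__main__":
    for value in ("1-1-1900", "01-01-1900", "1/1/1900"):
        print(value, "->", normalize_date(value).isoformat())
```

Once every system writes dates through something like this, the three variants collapse into a single value that can be compared and queried.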
There are so many variations because of poor governance that the data doesn’t align, and your ability to, say, find all customers who like a certain product and who live in NY or New York or NewYork or NY,NY or 10001 begins to suffer. If the data is inconsistent or not normalized, gleaning information becomes difficult.
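The New York example can be sketched the same way. The lookup tables below are hypothetical and deliberately tiny, and `normalize_state` is an illustrative name rather than anything from a particular library; the point is only that every variant has to be mapped to one agreed code before a query like this can work.

```python
import re
from typing import Optional

# Hypothetical, deliberately tiny lookup tables for illustration only.
STATE_ALIASES = {"ny": "NY", "newyork": "NY"}
ZIP_PREFIX_TO_STATE = {"100": "NY"}


def normalize_state(raw: str) -> Optional[str]:
    """Map a free-form state value to a two-letter code, or None if unknown."""
    text = raw.strip().lower()
    # A bare five-digit value is treated as a ZIP code.
    if re.fullmatch(r"\d{5}", text):
        return ZIP_PREFIX_TO_STATE.get(text[:3])
    # For "city,state" style values, keep only the part after the last comma.
    last_part = text.split(",")[-1]
    # Drop spaces and punctuation so "New York" and "NewYork" match.
    key = re.sub(r"[^a-z]", "", last_part)
    return STATE_ALIASES.get(key)


if __name__ == "__main__":
    for value in ("NY", "New York", "NewYork", "NY,NY", "10001", "Atlantis"):
        print(repr(value), "->", normalize_state(value))
```

Values the tables don’t recognize come back as `None` instead of being guessed at, so they can be reviewed rather than silently miscounted.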
For data quality to be high, it needs a few things. It needs to be governed. It needs to be robust enough to actually provide information, which means you need enough of it, and enough of it needs to be high quality. And you need to understand where the data is coming from. For instance, is it coming from a user who has purchased a product, from someone who has an account, or from someone who’s just passing through the website or using the app? It makes all the difference if you want to know, for instance, what features customers like.
So data needs context. And if you have enough data with context that is well governed (AKA high quality data), then you can get good insights for business intelligence and analytics initiatives. You can really start to understand and drive the direction of your business. You can start to apply more advanced methods of parsing through data, potentially with machine learning and artificial intelligence. You can categorize the data and begin to see patterns and trends.
If you don’t have high quality data, your first step in the process is working with the data. You will have to massage it, normalize it and parse it so that it can become meaningful. Only then can it be categorized and governed, and you can attempt to create context. But for analytics to work properly, you still have to have volume. Without volume, the sample isn’t large enough to support proper business decisions.
Low volume can be misleading, which is dangerous. Many companies out there want to immediately start making decisions based on analytics. Using analytics to drive your decisions is an excellent way to operate a business, but there has to be context. Early on, it is simply better to just collect the data (the useful stuff, not everything) and leave it alone. Don’t run analysis, don’t look at it, don’t do anything with it. Without a certain level of volume, the sample is going to be too small compared with your potential population, and the data will generally be misleading or skewed in one direction or another.
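A rough way to see why small samples mislead is the standard margin-of-error formula for a proportion. The sketch below uses the usual normal approximation at 95% confidence (z = 1.96) and assumes a simple random sample, which real analytics data often is not, so treat the numbers as a best case. The function name `margin_of_error` is illustrative.

```python
import math


def margin_of_error(sample_size: int, proportion: float = 0.5,
                    z: float = 1.96) -> float:
    """Approximate margin of error for an observed proportion.

    Uses the normal approximation z * sqrt(p * (1 - p) / n) and assumes
    a simple random sample. proportion=0.5 is the worst case.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


if __name__ == "__main__":
    for n in (25, 2500):
        print(f"n={n}: +/- {margin_of_error(n) * 100:.1f} percentage points")
```

With 25 users, a feature that half of them appear to like could plausibly be liked by anywhere from roughly 30% to 70% of the wider population. With 2,500 users the same reading is good to about two percentage points.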
The biggest factor in data quality is governance from the start, as a business is creating its analytics programs. If your business doesn’t have data governance, today is the day to start. Give your data the context it needs so that, as you’re collecting it, you know where it comes from. You know how users are behaving with your product, what they are using, what they are not finding useful and how long it takes them to use things. When you can pinpoint and sharpen the focus of what you provide to a user, you can then use that data not only to help convert sales, but to ensure customer satisfaction and encourage repeat customers.
Data quality is an important piece of analytics and metrics. Proper governance, context and volume all play a role in whether your data is high quality or not. High quality data will allow you to drive your business in the proper direction, toward growth and profitability.