While Data Quality may not be a hot topic, it’s an important one, because bad data could be a driving factor in keeping your organization, business unit, or department from achieving a return on your analytics investment.
Data quality issues create distrust in your analytics solutions, slow workflows, and generate extra work. The downstream effect is poor decision-making, which can be costly. These are the main costs of data quality issues:
>> Bad decisions or no decision at all. A decision is only as good as the information it is based upon, so making decisions based on bad data leads to lost opportunities and expensive rework, research, and remediation. The cost of bad decisions will inevitably surface, whether financially, through bad publicity, or by losing a competitive edge. What they say about “garbage in, garbage out” is true.
>> Decreased productivity. When staff don’t have to worry about data quality issues, they can be more productive. Instead of spending time validating and fixing data errors, they can focus on more profitable activities.
>> Missed decisions. When bad or inaccurate information hides an underlying opportunity, you lose the chance to get ahead of competitors, and this cost is almost never known or understood.
Because it’s hard to empirically measure the cost of bad data, directors and managers often don’t have the time or budget allocated to ensure good data. But prioritizing data quality as a core competency supports proactive decision-making and helps you thrive with your data.
One of our clients, a bridge manufacturer, had problems with late deliveries due to data quality issues. The procurement team ordered parts from a range of suppliers but often failed to enter shipping information into the ERP system. As a result, a day’s work would be planned out, only for the team to discover that parts hadn’t arrived. Employees sat idle waiting for work, or a manufacturing space that had been prepped needed to be cleared, which hurt completion dates and profitability. The solution was to change the procurement team’s work processes so that shipping data was captured and maintained accurately. The manufacturing team could then plan work more accurately and dramatically reduce idle time.
While this may appear to be a simple problem to address, long-standing organizational processes can be hard to change. But because we were able to express, in dollars, the cost of bad data each time employees were idle or work spaces went unused, the issue became much more real to our client and was then prioritized.
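To illustrate how a cost like this can be put into dollars, here is a minimal Python sketch. Every figure in it (worker counts, hours, hourly rate) is a hypothetical placeholder, not the client’s actual data, and the function name `idle_cost` is ours.

```python
# All figures are hypothetical placeholders, not the client's actual numbers.

def idle_cost(idle_workers: int, idle_hours: float, hourly_rate: float) -> float:
    """Labor cost of workers waiting on parts that did not arrive."""
    return idle_workers * idle_hours * hourly_rate


# Each entry represents one day on which planned work could not start.
incidents = [
    {"idle_workers": 6, "idle_hours": 4.0, "hourly_rate": 45.0},
    {"idle_workers": 3, "idle_hours": 8.0, "hourly_rate": 45.0},
]

total = sum(idle_cost(**incident) for incident in incidents)
print(f"Estimated idle-labor cost: ${total:,.2f}")
```

A fuller estimate would likely add the cost of clearing and re-prepping work spaces and any late-delivery penalties, but even a rough labor-only number can make the problem concrete for stakeholders.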
Bad data can take many forms, and it may not be readily obvious. Here’s what we commonly uncover with our clients:
>> Missing data: This is the most common issue, and it impedes accurate analysis. If this is your issue, you need to develop a strategy to address missing data. Do you suppress rows with missing data? Do you define default values? Your strategy must be unique to your issues and data.
>> Inaccurate data: When you have inaccurate data, the cause may not be discoverable without detailed profiling and cross verification. The root cause of inaccurate data is often difficult to find, and decisions based on incorrect information have a costly impact on your business.
>> Duplicate data: Duplicate data is often caused by errant business processes, which lead to a variety of issues. Luckily, it can be readily addressed through consistent profiling and remediation.
>> Wrong data source: Data is often acquired through a dizzying array of sources. At the advent of computer systems, human error was by far the most common issue due to manual input. Today, data comes from partners, clients, internal systems, and external systems, and companies often choose the most readily available sources, which are not necessarily the most trusted.
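As a rough illustration of what profiling for missing and duplicate data can look like, here is a small Python sketch using only the standard library. The sample records, field names, and helper functions are all hypothetical. It shows the two missing-data strategies mentioned above (suppressing rows and defining default values) side by side; which one is appropriate depends on your data.

```python
from collections import Counter

# Hypothetical sample records; None marks a missing value.
rows = [
    {"order_id": "A100", "supplier": "Acme", "ship_date": "2024-03-01"},
    {"order_id": "A101", "supplier": "Acme", "ship_date": None},
    {"order_id": "A102", "supplier": None, "ship_date": None},
    {"order_id": "A100", "supplier": "Acme", "ship_date": "2024-03-01"},
]


def is_missing(value):
    """Treat None and empty strings as missing."""
    return value is None or value == ""


def missing_counts(records, fields):
    """Count how many records have no value for each field."""
    return {f: sum(1 for r in records if is_missing(r.get(f))) for f in fields}


def duplicate_keys(records, key):
    """Return key values that appear in more than one record."""
    counts = Counter(r[key] for r in records)
    return sorted(k for k, n in counts.items() if n > 1)


def drop_incomplete(records, required):
    """Strategy 1: suppress rows that are missing any required field."""
    return [r for r in records if not any(is_missing(r.get(f)) for f in required)]


def fill_defaults(records, defaults):
    """Strategy 2: return copies with a default in place of each missing value."""
    return [
        {**r, **{f: d for f, d in defaults.items() if is_missing(r.get(f))}}
        for r in records
    ]


print(missing_counts(rows, ["order_id", "supplier", "ship_date"]))
print(duplicate_keys(rows, "order_id"))
```

Both strategy functions return new lists and leave the original records untouched, so the raw data stays available for comparison.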
At Analytics8, we believe that creating a culture of analytics is the foundational step to addressing data quality issues. 10 Critical Behaviors of Analytics Maturity is a great place to start. Beyond creating a data-driven culture, here are some tactical steps you can take now to improve your data:
1.) Conduct a data audit. To start, we recommend conducting a data audit to document things like what data you are collecting, where it’s kept, and who has access to it. Take a full inventory of all the data your company uses and processes.
2.) Assess your data quality issues. Identify and rank the business intelligence activities that are the most susceptible to the impact of bad data. For each of these, document the data’s path back to its source. Creating data flow diagrams is an excellent way to identify the “data path,” the process that data goes through from source to target analytics.
Then, flag poor sources and failing processes. You may discover an Excel spreadsheet right in the middle of your data flow that gets manually updated. Profile the data and use subject matter experts to validate it.
Do: Be iterative and agile in your assessment, and don’t be afraid to mark a source complete if it checks out. Document the sources and ensure a process is in place to govern future changes.
Don’t: Overfocus on creating documentation as part of the assessment. Doing so delays critical remediation efforts. Do only as much documentation as necessary and focus documentation efforts on what’s wrong.
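A data flow diagram can also be captured in a lightweight, machine-readable form. The sketch below is one possible way to do that in Python, not a prescribed method: it models a hypothetical data flow as a mapping from each node to the nodes that feed it, traces a report back to its sources, and flags any that are manually maintained. All node names and the `manual` set are invented for the example.

```python
# Hypothetical data flow: each node maps to the upstream nodes that feed it.
feeds = {
    "sales_dashboard": ["warehouse.sales"],
    "warehouse.sales": ["erp.orders", "regional_targets.xlsx"],
    "erp.orders": [],
    "regional_targets.xlsx": [],
}

# Nodes known to be updated by hand (an assumption for this example).
manual = {"regional_targets.xlsx"}


def upstream(node, graph):
    """Return every node that directly or indirectly feeds `node`."""
    seen = []
    stack = list(graph.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:  # also guards against cycles in the graph
            seen.append(current)
            stack.extend(graph.get(current, []))
    return seen


path = upstream("sales_dashboard", feeds)
flagged = sorted(n for n in path if n in manual)
print("Manually maintained sources:", flagged)
```

Keeping the flow in a simple structure like this is in the spirit of doing only as much documentation as necessary: it records the path and highlights what needs attention.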
3.) Evaluate the assessment and create a prioritization matrix. Once you’ve identified issues, you have action items that need to be prioritized. Prioritize those items by considering Business Value on one dimension and Feasibility of Remediation on the other. The items can then be categorized as high feasibility-high value, low feasibility-high value, low feasibility-low value, and high feasibility-low value.
Do: Communicate findings and gain critical consensus. Get everyone committed to resolving data quality issues early in the project. Consensus can often be reached when people understand the impact on their respective operations.
Don’t: Skip communication. Skipping this critical element will only create reactionary work later.
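The prioritization matrix described in this step can be sketched in a few lines of Python. The issues, the 1-to-5 scores, and the threshold of 3 are all hypothetical assumptions for illustration; in practice the scores would come from the consensus-building conversations described above.

```python
# Hypothetical issues scored 1-5 on each dimension.
issues = [
    {"name": "Missing shipping dates in ERP", "value": 5, "feasibility": 4},
    {"name": "Duplicate customer records", "value": 4, "feasibility": 2},
    {"name": "Inconsistent country codes", "value": 2, "feasibility": 5},
    {"name": "Legacy archive gaps", "value": 1, "feasibility": 1},
]


def quadrant(issue, threshold=3):
    """Label an issue by feasibility and business value (threshold is an assumption)."""
    feasibility = "high" if issue["feasibility"] >= threshold else "low"
    value = "high" if issue["value"] >= threshold else "low"
    return f"{feasibility} feasibility-{value} value"


# Group issue names by quadrant.
matrix = {}
for issue in issues:
    matrix.setdefault(quadrant(issue), []).append(issue["name"])

for label, names in matrix.items():
    print(f"{label}: {names}")
```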
4.) Put a plan in place that parallels the prioritization matrix. Many make the mistake of starting with what’s familiar. Refer to your prioritization matrix to identify what actions can be taken early on for high impact, and plan out what should be accomplished next.
Do: Work iteratively and stay agile. Get started quickly and work incrementally. Set the scope of work in short, manageable efforts and don’t be afraid to reprioritize between sprints.
Don’t: Make dramatic changes in scope in the middle of a sprint. It’s important to complete work that’s been started and to show progress with each iteration. Reprioritizing between sprints is fine, but avoid changing scope mid-sprint.
5.) Finally, as with any project, ensure that you have adequate time, tools, and skills on hand to address data quality. Even when data quality projects are difficult to justify among competing high priority projects, remember the importance of data quality to the success of your business.
Questions about where to start? Sign up for a data strategy session, and one of our analytics experts will consult with your company about your data and analytics strategies and processes.
To thrive with your data, your people, processes, and technology must all be data-focused. This may sound daunting, but we can help you get there. Sign up to meet with one of our analytics experts who will review your data struggles and help map out steps to achieve data-driven decision making.