Managing proprietary and public-facing data is key to your core business, growth and reputation. It also means keeping up with business and data growth while handling increased management complexity. Unfortunately, data quality is a term that is often misused or ignored, and it can even cause alarm in organizations. In this 3-part series, I’ll explore:
- how to measure or understand if you have data quality issues (this post)
- some of the bad habits we typically see that lead to bad data quality
- process errors that lead to data quality issues
I’ll also provide some advice on where you should focus and how we are able to help.
Four ways to measure the quality of your data
- Completeness and conformance: Completeness validates whether mandatory attributes have a value, and whether optional attributes have a value when a specific business condition requires it. Conformance, on the other hand, checks whether instances of data are stored in a format that is consistent with the business domain values. It also validates whether each data value is within the acceptable range for the given attribute.
- Duplication: Duplication (uniqueness) validates if a data object exists more than once within the data set. It can validate primary key uniqueness and check other composite field attributes to search for duplicates within the data.
- Consistency: Consistency verifies whether the data values in one field are in sync with other values within the same record set. For instance, one could check whether the zip code entered is consistent with the value entered for the state.
- Accuracy: Accuracy checks the degree to which data correctly represents the real-world objects it is intended to model. When a value cannot be systematically checked for correctness, accuracy is measured by how well the values agree with the master source of record.
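To make the measures above concrete, here is a minimal Python sketch of what each check might look like on a small record set. Everything in it is an assumption for illustration: the sample records, the field names, the toy zip-prefix lookup (`ZIP_PREFIXES`) and the stand-in master source (`MASTER`) are hypothetical, and this is not how the Healthcheck itself is implemented.

```python
import re

# Hypothetical sample records; field names and values are illustrative only.
records = [
    {"id": 1, "name": "Ada Smith", "state": "NY", "zip": "10001"},
    {"id": 2, "name": "",          "state": "CA", "zip": "90210"},
    {"id": 3, "name": "Bo Chen",   "state": "CA", "zip": "1000"},
    {"id": 1, "name": "Ada Smith", "state": "NY", "zip": "10001"},
    {"id": 4, "name": "Cy Patel",  "state": "NY", "zip": "90210"},
]

# Assumed toy lookup of leading zip digits per state (not real postal data).
ZIP_PREFIXES = {"NY": ("1",), "CA": ("9",)}

# Assumed stand-in for a master source of record, keyed by id.
MASTER = {1: "Ada Smith", 2: "Al Jones", 3: "Bo Chen", 4: "Cy Patel"}


def incomplete(rows, mandatory):
    """Completeness: indexes of rows missing any mandatory value."""
    return [i for i, r in enumerate(rows)
            if any(r.get(f) in (None, "") for f in mandatory)]


def nonconforming(rows, field, pattern):
    """Conformance: indexes of rows whose field does not match the format."""
    rx = re.compile(pattern)
    return [i for i, r in enumerate(rows)
            if not rx.fullmatch(str(r.get(field, "")))]


def duplicates(rows, key_fields):
    """Duplication: indexes of rows whose key was already seen earlier."""
    seen, dupes = set(), []
    for i, r in enumerate(rows):
        key = tuple(r.get(f) for f in key_fields)
        if key in seen:
            dupes.append(i)
        else:
            seen.add(key)
    return dupes


def inconsistent(rows):
    """Consistency: indexes of rows whose zip does not fit the state."""
    flagged = []
    for i, r in enumerate(rows):
        prefixes = ZIP_PREFIXES.get(r.get("state"))
        if prefixes and not str(r.get("zip", "")).startswith(prefixes):
            flagged.append(i)
    return flagged


def inaccurate(rows, master):
    """Accuracy: indexes of rows whose name disagrees with the master source."""
    return [i for i, r in enumerate(rows)
            if r["id"] in master and r["name"] != master[r["id"]]]


report = {
    "incomplete": incomplete(records, ["id", "name", "zip"]),
    "nonconforming": nonconforming(records, "zip", r"\d{5}"),
    "duplicates": duplicates(records, ["id"]),
    "inconsistent": inconsistent(records),
    "inaccurate": inaccurate(records, MASTER),
}
print(report)
```

Each function returns the positions of the offending rows rather than fixing them, which keeps measurement separate from remediation. Dividing each count by the total number of rows gives a simple per-measure score you can track over time.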
Don’t get ostrich syndrome: face data quality issues head on
It is often overwhelming for management to address the data quality issue when the proverbial “ostrich syndrome” kicks in. With “heads in sand,” the problem is ignored and dealing with poor data quality becomes an accepted way of operating.
What you can do right now
The first step is admitting there is a problem. Start by using these measures of data quality to assess your current state. This sounds like a lot of work (and it is), but it’s also possible to get help here. At Softchoice, we offer a Data Integrity Healthcheck, a service that provides data collection, analysis and reports with recommended actions for remediation.
As part of the Healthcheck, we:
- examine solution architectures for data integrity and compliance policies, procedures, incident response and risk management strategies
- review existing documentation
- look at sample outputs and configurations of existing storage solutions
- determine data integrity gaps, single points of failure and weaknesses
- assess scalability and performance
Whether you use our service or do this yourself, looking at the items above to get a clear understanding of your current state is a great place to start. Ultimately, your end goal will be to develop a comprehensive data governance program that addresses the common causes of poor data quality and ensures that you don’t end up back where you started.
Here’s a quick overview of some of the common problems I’ve encountered that lead to poor data quality:
- Bad Habits: Data cleansing means recognizing data issues and then correcting or augmenting the data as a routine part of good data hygiene. It’s just like brushing your teeth: if your users aren’t adhering to routines that keep your data healthy, it can lead to bigger, more expensive issues down the road. To learn more, read our 2nd blog post in this series on Top 3 Worst Data Quality Habits.
- Bad Processes: Data is much more stable than any person in your organization. This is why you need a strategy (also known as a metadata management strategy) to advise the IT and business communities in your organization on how to view and use the data that your company captures and stores. The biggest challenge here is making sure people stick to the new processes.
To learn more, skip to the 3rd post in this series and read Top 3 Process errors and how to fix them. All of this work will lead you down a path to clean reporting and quality business intelligence! If you want to learn more about how Softchoice can help make this complex process easier, check out our Data Integrity Healthcheck or leave your questions below and we will get back to you.