Data Quality Monitoring using Azure Data Catalog

In May 2019, in a county office in Utah, someone made a multimillion-dollar mistake.

A typo, possibly caused by a phone dropped on a keyboard, resulted in a perfectly normal residence being valued at $987 million, an overestimate of about $543 million in taxable value.

By the time the input error was discovered in October, it was already too late. Tax revenue estimates and budgets had been made and approved, causing revenue shortfalls for the next three years. The deficit is likely to be covered by an increased tax rate for county residents for years to come.

Here’s another costly error. In 1999, NASA lost a $125 million spacecraft headed for Mars because of a mismatch between the units of measurement used on board the spacecraft and those used by ground control.

Both examples illustrate the importance of proper metadata management and timely data quality checks. A simple data quality rule checking for outliers could have revealed the input error in Utah, while a metadata check on units of measurement could have spared NASA this catastrophic loss.
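
To make the Utah example concrete, here is a minimal sketch of such an outlier rule in Python. The valuations below are illustrative (the last one mimics the typo), and the robust z-score threshold is an assumption rather than anything a county actually uses.

import statistics

def flag_outliers(values, threshold=3.0):
    # Robust z-score: distance from the median, scaled by the median
    # absolute deviation (MAD), which a single outlier cannot distort.
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values) or 1.0  # guard against MAD = 0
    for v in values:
        score = 0.6745 * (v - median) / mad  # 0.6745 makes MAD comparable to a standard deviation
        if abs(score) > threshold:
            yield v, round(score, 1)

# Illustrative property valuations; the last entry mimics the input error.
valuations = [310_000, 287_500, 295_000, 402_000, 987_000_000]
for value, score in flag_outliers(valuations):
    print(f"Suspicious valuation: ${value:,} (robust z-score {score})")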

Data volumes are growing at an incredible pace as organisations rapidly digitise their business. This creates the need for extensive metadata management and for an automated way of checking for potential data quality issues. Luckily, this is now easily manageable: VIQTOR DAVIS has leveraged and extended Azure Data Catalog to do just that!

Azure Data Catalog

Azure Data Catalog is an enterprise-wide metadata catalogue that makes data asset discovery straightforward. As a fully managed cloud service, it enables users to register, enrich, discover, understand and consume data sources.
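
As an illustration of the discovery side, the sketch below queries the Azure Data Catalog search REST API for every asset carrying a given tag. The catalog name, tag and token handling are placeholders, and the exact response fields may differ from this simplified reading.

import requests

CATALOG = "DefaultCatalog"    # hypothetical catalog name
TOKEN = "<AAD-bearer-token>"  # obtained via Azure AD (e.g. with MSAL); omitted here

# Search the catalog for assets tagged with an (illustrative) DQ tag.
url = f"https://api.azuredatacatalog.com/catalogs/{CATALOG}/search/search"
params = {"searchTerms": "tags:dq_max_value", "api-version": "2016-03-30"}
headers = {"Authorization": f"Bearer {TOKEN}"}

response = requests.get(url, params=params, headers=headers)
response.raise_for_status()
for result in response.json().get("results", []):
    properties = result.get("content", {}).get("properties", {})
    print(properties.get("name"))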

Building on this service, VIQTOR DAVIS enables the end user to generate a data quality dashboard based on simple ‘metadata tags’. Data imported into Azure Data Catalog can be ‘tagged’, categorising it for a variety of purposes. These tags then form the basis for data quality rules defined in a central data quality (DQ) script. The script generates a report that is automatically converted into a dashboard; alternatively, the solution can be integrated with other notification or workflow systems.
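
The sketch below shows the core idea of such a central DQ script: tags select which rule runs against which column, and the results land in a report file that a dashboard tool can consume. The tag names, rules, sample data and report format are illustrative assumptions; in the real solution, the tagged columns would come from the catalog search rather than being hard-coded.

import csv

# Map each metadata tag to a data quality rule (both are illustrative).
RULES = {
    "dq_not_null": lambda values: all(v is not None for v in values),
    "dq_max_value": lambda values: max(values) <= 5_000_000,  # assumed plausibility cap
}

# The (table, column, tags, values) tuples would come from the catalog and
# the target database; hard-coded here to keep the sketch self-contained.
tagged_columns = [
    ("parcels", "taxable_value", ["dq_max_value"], [295_000, 310_000, 987_000_000]),
    ("parcels", "owner_name", ["dq_not_null"], ["Smith", None, "Jones"]),
]

with open("dq_report.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["table", "column", "rule", "passed"])
    for table, column, tags, values in tagged_columns:
        for tag in tags:
            writer.writerow([table, column, tag, RULES[tag](values)])
# dq_report.csv can then feed a dashboard tool such as Power BI.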

Once a set of data quality rules has been defined, applying it to tables and columns in a target database is as easy as selecting a rule and attaching it to a data object in Azure Data Catalog. This keeps data quality monitoring easy to manage as data objects or business rules change over time: instead of extensive development effort, the data quality dashboard can be brought up to date with just a few clicks.
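
Because the DQ script reads tags at run time, re-targeting a rule is a matter of adding or removing a tag in the catalog. The sketch below attaches a column-level tag through the Data Catalog REST API; the payload follows the documented annotation shape, but the identifiers are placeholders and the field details should be treated as assumptions.

import requests

CATALOG = "DefaultCatalog"           # hypothetical catalog name
ASSET_ID = "<asset-id-from-search>"  # id of a table asset found via search
TOKEN = "<AAD-bearer-token>"

# Attach a DQ tag to a single column of a registered table asset.
url = (f"https://api.azuredatacatalog.com/catalogs/{CATALOG}"
       f"/views/tables/{ASSET_ID}/tags")
payload = {
    "properties": {
        "tag": "dq_not_null",           # the rule to apply (illustrative)
        "columnName": "taxable_value",  # omit for a table-level tag
        "fromSourceSystem": False,
    }
}
response = requests.post(url, json=payload,
                         params={"api-version": "2016-03-30"},
                         headers={"Authorization": f"Bearer {TOKEN}"})
response.raise_for_status()
# On its next run, the central DQ script picks up the new tag automatically.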

Leveraging Azure Data Catalog for automated data quality checks can provide a cost-effective and user-friendly way to maintain high data quality, and can help you recognise potential issues before they have a negative impact on you or your customers.

If you would like to learn more about what data quality monitoring with Azure Data Catalog could do for your organisation, don’t hesitate to contact us!
