Understanding Data Quality and Why Teams Struggle with It | by Elliott Stam | Mar, 2024

Data quality: the catch-all term for business logic, reliability, validity, and consistency

Elliott Stam
Towards Data Science
The elephant in the room. (Photo by Alberico Iusso on Unsplash)

Conversations about data quality can be difficult, especially when the elephant in the room is an underperforming product.

Situations where these discussions play out typically include disappointed stakeholders, frustrated product managers, and misunderstood engineers.

Familiar phrases might bounce off the walls, including:

  • “fix the data”
  • “discrepancy”
  • “data validation”
  • “trust”
  • “data quality”

But there is a force at work preventing individuals from arriving at a common understanding. Words are being spoken, yet for some reason they aren’t landing. Reading between the lines of what each person says, it’s clear multiple definitions of “data quality” are at play.

The meaning and implication behind the words are different for each person. The validity of the team’s collective experience and perspective is undermined as they continue to talk past each other. The clock keeps ticking and they eventually exit the conversation without a clear resolution.

That’s quite the pickle, and it’s a common theme in data products.

The phrase “data quality” is widely used and can mean different things to different people. Let’s kick things off with a subjective definition of this term from the perspective of the three archetypes introduced above: stakeholders, product managers, and engineers.

Stakeholders: let’s assume these are less technical people who interact with data products (like dashboards) in their day-to-day operations. To them, good data quality means the information accurately reflects the real-world processes they interact with. When they see a dashboard, their first thought is to download/export the data to a spreadsheet so they can reconcile the numbers against other known quantities they trust.
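To make the stakeholder's spreadsheet habit concrete, here is a minimal sketch of that reconciliation in Python. Everything in it is hypothetical: the `reconcile` helper, the monthly figures, and the 1% tolerance are assumptions made for illustration, not anything taken from a real dashboard or from a specific tool.

```python
from math import isclose


def reconcile(dashboard, trusted, rel_tol=0.01):
    """Compare two {key: number} mappings (hypothetical helper).

    Returns {key: (dashboard_value, trusted_value)} for every key whose
    values differ by more than rel_tol, or that is missing from one side
    (the missing value is reported as None).
    """
    mismatches = {}
    for key in sorted(set(dashboard) | set(trusted)):
        d = dashboard.get(key)
        t = trusted.get(key)
        if d is None or t is None or not isclose(d, t, rel_tol=rel_tol):
            mismatches[key] = (d, t)
    return mismatches


# Made-up numbers: a dashboard export vs. a source the stakeholder trusts.
dashboard_export = {"2024-01": 1200.0, "2024-02": 980.0, "2024-03": 1510.0}
trusted_ledger = {"2024-01": 1200.0, "2024-02": 1005.0, "2024-04": 300.0}

mismatches = reconcile(dashboard_export, trusted_ledger)
for month, (dash_value, trusted_value) in mismatches.items():
    print(f"{month}: dashboard={dash_value} trusted={trusted_value}")
```

In this sketch, every row that fails to reconcile is, from the stakeholder's point of view, a data quality problem, regardless of which of the two sources is actually wrong.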

Product managers: the primary concern here is that the numbers in product x match the numbers in product y and tell a cohesive story. If the numbers match across…
