Top 5 data quality & accuracy challenges and how to overcome them


Every company today is data-driven, or at least claims to be. Business decisions are no longer made based on hunches or anecdotal trends as they were in the past. Concrete data and analytics now power businesses' most critical decisions.

As more companies leverage the power of machine learning and artificial intelligence to make critical decisions, there must be a conversation around the quality (the completeness, consistency, validity, timeliness and uniqueness) of the data used by these tools. The insights companies expect to be delivered by machine learning (ML) or AI-based technologies are only as good as the data used to power them. The old adage "garbage in, garbage out" comes to mind when it comes to data-based decisions.
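To make the quality dimensions named above more concrete, here is a minimal Python sketch that scores a small batch of records on completeness, validity, uniqueness and timeliness. The record fields, the email pattern and the 90-day freshness threshold are all assumptions chosen for illustration, not recommendations from the article. Consistency is left out because it usually means comparing the same element across several systems.

```python
from datetime import date
import re

# Hypothetical sample records; the field names are assumptions for illustration.
records = [
    {"id": 1, "email": "ana@example.com", "updated": date(2022, 6, 1)},
    {"id": 2, "email": None, "updated": date(2022, 6, 15)},
    {"id": 3, "email": "not-an-email", "updated": date(2021, 1, 10)},
    {"id": 3, "email": "li@example.com", "updated": date(2022, 6, 20)},
]

# Deliberately loose email shape check (assumption): something@something.something
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def quality_report(rows, as_of, max_age_days=90):
    """Return the share of rows that pass each quality dimension (0.0 to 1.0)."""
    total = len(rows)
    present = [r["email"] for r in rows if r["email"]]           # completeness
    valid = [e for e in present if EMAIL_PATTERN.match(e)]       # validity
    unique_ids = {r["id"] for r in rows}                         # uniqueness
    fresh = [r for r in rows
             if (as_of - r["updated"]).days <= max_age_days]     # timeliness
    return {
        "completeness": len(present) / total,
        "validity": len(valid) / total,
        "uniqueness": len(unique_ids) / total,
        "timeliness": len(fresh) / total,
    }


report = quality_report(records, as_of=date(2022, 7, 1))
print(report)
```

Which of these scores matters most depends on how the data will be used, which is the point the rest of this article develops.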

Statistically, poor data quality leads to increased complexity of data ecosystems and poor decision-making over the long term. In fact, roughly $12.9 million is lost every year due to poor data quality. As data volumes continue to increase, so will the challenges that businesses face with validating and remediating their data. To overcome issues related to data quality and accuracy, it's critical to first know the context in which the data elements will be used, as well as best practices to guide the initiatives along.

1. Data quality is not a one-size-fits-all endeavor

Data initiatives are not specific to a single business driver. In other words, determining data quality will always depend on what a business is trying to achieve with that data. The same data can impact more than one business unit, function or project in very different ways. Furthermore, the list of data elements that require strict governance may vary according to different data users. For example, marketing teams are going to need a highly accurate and validated email list, while R&D would be invested in quality user feedback data.

The best team to discern a data element's quality, then, is the one closest to the data. Only they will be able to recognize data as it supports business processes and ultimately assess accuracy based on what the data is used for and how.

2. What you don't know can hurt you

Data is an enterprise asset. However, actions speak louder than words. Not everyone within an enterprise is doing all they can to make sure data is accurate. If users don't recognize the importance of data quality and governance (or simply don't prioritize them as they should), they aren't going to make an effort to either anticipate data issues from mediocre data entry or raise their hand when they find a data issue that needs to be remediated.

This can be addressed practically by tracking data quality metrics as a performance goal to foster more accountability for those directly involved with data. In addition, business leaders must champion the importance of their data quality program. They should align with key team members about the practical impact of poor data quality. For instance, misleading insights shared in inaccurate reports for stakeholders can potentially lead to fines or penalties. Investing in better data literacy can help organizations create a culture of data quality to avoid making careless or ill-informed mistakes that damage the bottom line.

3. Don't try to boil the ocean

It is not practical to fix a large laundry list of data quality problems. It's not an efficient use of resources either. The number of data elements active within any given organization is huge and is growing exponentially. It's best to start by defining an organization's Critical Data Elements (CDEs), which are the data elements integral to the main function of a specific business. CDEs are unique to each business. Net Revenue is a common CDE for most businesses, as it's important for reporting to investors and other shareholders.

Since every company has different business goals, operating models and organizational structures, every company's CDEs will be different. In retail, for example, CDEs might relate to design or sales. On the other hand, healthcare companies will be more interested in ensuring the quality of regulatory compliance data. Although this is not an exhaustive list, business leaders might consider asking the following questions to help define their unique CDEs: What are your critical business processes? What data is used within those processes? Are these data elements involved in regulatory reporting? Will these reports be audited? Will these data elements guide initiatives in other departments within the organization?
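One rough way to apply the questions above is to record a yes/no answer per data element and rank the elements by how many answers are "yes". The sketch below does this in Python. The element names, the equal weighting of the questions and the cutoff of two "yes" answers are hypothetical choices for illustration; a real program would agree on these with the business owners.

```python
from dataclasses import dataclass


@dataclass
class DataElement:
    name: str
    in_critical_process: bool        # used within a critical business process?
    in_regulatory_reporting: bool    # involved in regulatory reporting?
    audited: bool                    # will the reports be audited?
    used_by_other_departments: bool  # guides initiatives elsewhere?


def cde_score(element: DataElement) -> int:
    """Count the 'yes' answers; each question is weighted equally (assumption)."""
    return sum([
        element.in_critical_process,
        element.in_regulatory_reporting,
        element.audited,
        element.used_by_other_departments,
    ])


# Hypothetical inventory with made-up answers.
elements = [
    DataElement("net_revenue", True, True, True, True),
    DataElement("newsletter_theme", False, False, False, False),
    DataElement("customer_email", True, False, False, True),
]

# Rank by score and keep elements with at least two 'yes' answers (arbitrary cutoff).
ranked = sorted(elements, key=cde_score, reverse=True)
candidate_cdes = [e.name for e in ranked if cde_score(e) >= 2]
print(candidate_cdes)
```

A simple count like this is only a starting point for discussion; it does not replace the judgment of the teams closest to the data.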

Validating and remediating only the most important elements will help organizations scale their data quality efforts in a sustainable and resourceful way. Eventually, an organization's data quality program will reach a level of maturity where there are frameworks (often with some level of automation) that categorize data assets based on predefined elements to remove disparity across the enterprise.

4. More visibility = more accountability = better data quality

Businesses drive value by understanding where their CDEs are, who is accessing them and how they're being used. In essence, there is no way for a company to identify its CDEs if it doesn't have proper data governance in place from the start. However, many companies struggle with unclear or non-existent ownership of their data stores. Defining ownership before onboarding more data stores or sources promotes commitment to quality and usefulness. It's also wise for organizations to set up a data governance program where data ownership is clearly defined and people can be held accountable. This can be as simple as a shared spreadsheet dictating ownership of the set of data elements, or it can be managed by a sophisticated data governance platform.
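As an illustration of the "shared spreadsheet" end of that spectrum, the following Python sketch checks an inventory of data elements against a simple ownership registry and lists the ones nobody owns. The element names, team names and the dictionary layout are assumptions made for the example; in practice the registry might be loaded from a spreadsheet export.

```python
# Hypothetical ownership registry, the kind of mapping a shared spreadsheet might hold.
ownership = {
    "net_revenue": "finance-team",
    "customer_email": "marketing-team",
    "user_feedback": "",  # listed, but no owner assigned yet
}

# Data elements discovered in the organization's data stores (assumed inventory).
inventory = ["net_revenue", "customer_email", "user_feedback", "shipment_status"]


def unowned_elements(elements, owners):
    """Return elements with no owner recorded, preserving inventory order."""
    return [e for e in elements if not owners.get(e, "").strip()]


missing = unowned_elements(inventory, ownership)
print(missing)
```

Running a check like this before onboarding a new data store is one lightweight way to make gaps in ownership visible.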

Just as organizations should model their business processes to improve accountability, they must also model their data in terms of data structure, data pipelines and how data is transformed. Data architecture attempts to model the structure of an organization's logical and physical data assets and data management resources. Creating this type of visibility gets at the heart of the data quality issue: without visibility into the *lifecycle* of data (when it's created, how it's used or transformed and how it's outputted), it's impossible to ensure true data quality.

5. Data overload

Even when data and analytics teams have established frameworks to categorize and prioritize CDEs, they are still left with thousands of data elements that need to be either validated or remediated. Each of these data elements can require multiple business rules that are specific to the context in which it will be used. However, these rules can only be assigned by the business users working with these unique data sets. Therefore, data quality teams will need to work closely with subject matter experts to identify rules for every unique data element, which can be extremely demanding, even when the elements are prioritized. This often leads to burnout and overload within data quality teams because they are responsible for manually writing a large number of rules for a variety of data elements. When it comes to the workload of their data quality team members, organizations must set realistic expectations. They may consider expanding their data quality team and/or investing in tools that leverage ML to reduce the amount of manual work in data quality tasks.
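The idea that one data element can carry several context-specific rules can be sketched as a small rule registry. In the Python example below, rules are keyed by a (data element, context) pair and each rule is a named check. The elements, contexts and checks are hypothetical; real rules would come from the subject matter experts described above.

```python
from typing import Callable, Dict, List, Tuple

# A rule is a human-readable name plus a check that returns True when the value passes.
Rule = Tuple[str, Callable[[object], bool]]


def _is_number(value: object) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


# Hypothetical rules keyed by (data element, usage context).
RULES: Dict[Tuple[str, str], List[Rule]] = {
    ("net_revenue", "investor_reporting"): [
        ("is a number", _is_number),
        ("is not negative", lambda v: _is_number(v) and v >= 0),
    ],
    ("customer_email", "marketing"): [
        ("contains @", lambda v: isinstance(v, str) and "@" in v),
    ],
}


def failed_rules(element: str, context: str, value: object) -> List[str]:
    """Return the names of the rules that the value fails in the given context."""
    rules = RULES.get((element, context), [])
    return [name for name, check in rules if not check(value)]


print(failed_rules("net_revenue", "investor_reporting", -5))
print(failed_rules("customer_email", "marketing", "ana@example.com"))
```

Keeping rules in a shared structure like this, rather than scattered across scripts, makes it easier to see how many rules a team is maintaining by hand and where automation might help.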

Data isn't just the new oil of the world: it's the new water of the world. Organizations can have the most intricate infrastructure, but if the water (or data) running through those pipelines isn't drinkable, it's useless. People who need this water must have easy access to it, they must know that it's usable and not tainted, they must know when supply is low and, lastly, the suppliers and gatekeepers must know who is accessing it. Just as access to clean drinking water helps communities in a variety of ways, improved access to data, mature data quality frameworks and a deeper data quality culture can protect data-reliant programs and insights, helping spur innovation and efficiency within organizations around the world.

JP Romero is Technical Manager at Kalypso



