Data quality can be a major challenge in any data modeling project. Issues can creep in from many sources, including typos, inconsistent naming conventions and integration problems. But data quality takes on even greater importance in big data projects, which involve a much larger volume, variety and velocity of data.
And because big data quality issues raise contextual concerns that vary across applications, data types, platforms and use cases, Faisal Alam, emerging technology lead at consultancy EY Americas, suggested adding a fourth V, for veracity, in big data management projects.
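To make the kinds of issues mentioned above concrete, a basic quality audit can scan incoming records for missing values, duplicate keys and inconsistent naming conventions before they propagate downstream. The following is a minimal Python sketch of such a check; the field names and rules are hypothetical illustrations, not drawn from any specific product:

```python
from collections import Counter

def profile_quality(records, key="customer_id"):
    """Run basic data quality checks on a list of record dicts.

    Returns a report covering missing values, duplicate keys, and
    inconsistent field naming (e.g. CamelCase vs. snake_case variants).
    """
    report = {"missing": Counter(), "duplicate_keys": [], "naming_variants": {}}

    seen = set()
    for rec in records:
        # Missing-value check: count fields that are None or empty strings.
        for field, value in rec.items():
            if value in (None, ""):
                report["missing"][field] += 1
        # Duplicate check on the designated record key.
        k = rec.get(key)
        if k in seen:
            report["duplicate_keys"].append(k)
        seen.add(k)

    # Naming-convention check: group field names by a canonical lowercase
    # form to surface variants like "ZipCode" vs. "zip_code".
    all_fields = {f for rec in records for f in rec}
    by_canon = {}
    for f in all_fields:
        canon = f.replace("_", "").lower()
        by_canon.setdefault(canon, []).append(f)
    report["naming_variants"] = {c: v for c, v in by_canon.items() if len(v) > 1}
    return report

records = [
    {"customer_id": "c1", "zip_code": "10001"},
    {"customer_id": "c1", "ZipCode": ""},     # duplicate key, empty value
    {"customer_id": "c2", "zip_code": None},  # missing value
]
print(profile_quality(records))
```

In a real pipeline, checks like these would typically run as automated validation rules at ingestion time, so that low-veracity records are flagged before they reach analytics or machine learning workloads.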
Why data quality for big data is important
Big data quality issues can lead not only to inaccurate algorithms, but also to serious accidents and injuries when faulty data drives real-world systems. At the very least, business users will be less inclined to trust the data and the applications built on it. In addition, companies may face government regulatory scrutiny if data quality and accuracy play a role in front-line business decisions.
Data can be a strategic asset only if there are enough processes and support mechanisms in place to govern and manage data quality, said V. “Bala” Balasubramanian, senior vice president of life sciences at digital transformation services provider Orion Innovation.
Poor-quality data can increase costs through frequent remediation, additional resource needs and compliance issues. It can also lead to impaired decision-making and…