Data Rich, Information Poor

September 1, 2022 | July 2nd, 2024 | Data Analytics, Data Quality | 3 mins read
Do you know how big a zettabyte is?

I was quite intrigued by this graph from Statista. It shows the amount of data created, consumed and stored each year, with projections through 2025 reaching 181 zettabytes. Much of this data may be transient (the same report indicates that only 2% of it is actually retained), but the growth is still phenomenal.
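To put those figures in perspective, here is a rough back-of-the-envelope sketch. It uses the decimal (SI) definition of a zettabyte, 10^21 bytes, and it assumes, purely for illustration, that the 2% retention figure applies to the 181 zettabyte projection. The variable and function names are hypothetical.

```python
# Decimal (SI) byte units: each prefix step is a factor of 1,000.
BYTES_PER_TERABYTE = 10**12
BYTES_PER_ZETTABYTE = 10**21


def zettabytes_to_terabytes(zb: float) -> float:
    """Convert zettabytes to terabytes using decimal (SI) units."""
    return zb * BYTES_PER_ZETTABYTE / BYTES_PER_TERABYTE


projected_zb = 181      # projection cited in the article
retained_share = 0.02   # retention figure cited in the article;
                        # applying it to the projection is an assumption

retained_zb = projected_zb * retained_share

print(f"1 ZB = {zettabytes_to_terabytes(1):,.0f} TB")
print(f"Retained (illustrative): {retained_zb:.2f} ZB")
```

In other words, one zettabyte is a billion terabytes, and even a 2% slice of the projection would be a few zettabytes.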

[Figure: Statista graph]

Data is one of the visible corollaries of growing digitalization across enterprises. As more and more business processes and consumer activities are digitized, they leave a visible data footprint, which becomes a pot of gold from which enterprises can extract value. This leads us to the question:

If we are so data-rich, why are we information-poor?

Have you ever wondered why?

Becoming data-rich doesn’t necessarily result in a culture of data proficiency. Ask any business leader in any organization whether quality data for data-driven decision-making is available when they need it the most. Despite the exponential growth in digital and data engineering investments, why is it still so difficult to get access to data for decision-making?

This is where it helps to understand the difference between Data and Information. Data is a collection of facts recorded at various points in time for future reference. Information is facts that have been organized, synthesized and presented with context so that they are meaningful.

One of the key goals of any data-driven organization should be to convert Data into Information: Information that is used to learn about something (customers, business activity, etc.) and eventually to make things better, with strong bottom-line results including improved quality of service, enhanced business operations and stronger customer engagement.

In this context, let us take a quick look at a very popular metaphor in the field of data: “Data is the new oil.” How reasonable is this comparison? The metaphor is a good way to convey the value that data can generate, but data is quite a complex resource, and it can have plenty of flaws that make its real-world application much more challenging.

What can be done?

  1. How good is the data from a quality perspective? Quality data is fundamental for any meaningful analytics. Whether you use a standard analytics toolset or sophisticated machine learning models, the outcome is only as good as the quality of the data that flows through them. Focus on implementing a metrics-based approach for factors such as accuracy, timeliness, completeness and consistency of data. Just as with oil, the quality of the data forms the very basis of the value generation exercise.
  2. How easy is it to understand the data? A Data Glossary, the metadata about the data, helps your Business and Technology users stay in sync. To consume the data, your users need to understand information about it such as meaning, relationships, origin, usage and type. If your users do not understand the data, they cannot locate it or use it effectively.
  3. How simple is it to use? Make information access simple with a search interface, and let your users ask questions in plain English without having to learn query languages or interpret complex dashboards and reports. Becoming data-driven is no longer about getting obsessed with metrics. It should be as simple as asking someone who has the answers, and it should raise our level of knowledge and awareness.
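As a rough illustration of the metrics-based approach in the first point, the sketch below scores a handful of records for completeness, timeliness and consistency. The records, field names, thresholds and function names are all hypothetical, chosen only for illustration. Accuracy is left out because measuring it requires a trusted reference to compare against.

```python
from datetime import datetime, timedelta

# Hypothetical customer records; None marks a missing value.
records = [
    {"id": 1, "email": "a@example.com", "country": "US", "updated": datetime(2024, 6, 30)},
    {"id": 2, "email": None, "country": "us", "updated": datetime(2024, 6, 29)},
    {"id": 3, "email": "c@example.com", "country": "IN", "updated": datetime(2023, 1, 15)},
    {"id": 4, "email": "d@example.com", "country": None, "updated": datetime(2024, 6, 28)},
]


def completeness(rows, field):
    """Share of rows in which the field is populated."""
    return sum(r[field] is not None for r in rows) / len(rows)


def timeliness(rows, field, as_of, max_age):
    """Share of rows whose timestamp is within max_age of as_of."""
    return sum(as_of - r[field] <= max_age for r in rows) / len(rows)


def consistency(rows, field, allowed):
    """Share of populated values that match an agreed set of codes."""
    values = [r[field] for r in rows if r[field] is not None]
    if not values:
        return 1.0  # nothing populated, so nothing inconsistent
    return sum(v in allowed for v in values) / len(values)


as_of = datetime(2024, 7, 1)  # fixed reference date so results are repeatable
scores = {
    "email completeness": completeness(records, "email"),
    "timeliness (30 days)": timeliness(records, "updated", as_of, timedelta(days=30)),
    "country consistency": consistency(records, "country", {"US", "IN"}),
}

for name, score in scores.items():
    print(f"{name}: {score:.0%}")
```

Tracking scores like these over time, rather than checking them once, is what turns data quality from an assertion into a measurable property.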

The mere onset of digital activity and the availability of high-volume data do not by themselves make the data valuable. What matters more is our ability to synthesize information and get answers. No wonder many enterprises that are Data rich and Information poor fall well short in their efforts to become data-driven.

The refined goal for any modern digital enterprise cannot just be to collect more data; it should be to synthesize that data into Information and make it easily accessible.