Pathway to Optimizing Your Data Assets
There’s such a thing as having too much data.
The book Rembrandts in the Attic takes its title from an analogy: patents that an organization secured in the past but never pursued can be like masterpieces stored away in an attic. Over time, situations change, and these patents could now be worth money to others, or could be worth defending. However, there may be so much clutter and trash sharing space with the masterpieces that the owner isn’t even aware of them, and as the years go by, time (and mice) can render even the most priceless asset worthless.
Although that book focused on IP in the sense of patentable ideas, I think the concept has an interesting parallel to another kind of IP: the data assets within an organization. Because of the never-ending stream of data coming into an organization, it’s all too often the case that the truly valuable data gets lost in the crowd with other data that is far less useful and valuable.
What makes some data valuable and other data not so much? The answer could be many things. Some data could be vital to a process for meeting a regulatory or reporting requirement, for example. Or it could be used to gain insights into a unique marketing opportunity. For example, imagine that a year ago, a junior analyst at your insurance company uncovered a solid correlation between customers who buy premium life insurance policies and people who are avid birdwatchers. The data was mentioned in an internal report, and then forgotten.
That would be a shame, because that particular data asset could not only have been used to increase sales but, with a little thoughtful analysis, might also have revealed other cohorts of consumers who would be likely customers. So the fact that the data is sitting in the organization’s attic, figuratively speaking, could mean significant lost opportunities.
Various solutions
So how do you make sure you’re aware of the “Rembrandt-level” data in your corporate attic, and pluck it from obscurity to where its value can shine? There are various ways, and some are decidedly better than others.
One wrong way is to simply decide you need more piles of stuff to look through in your attic — thinking that the more piles you have, the more chances of finding Rembrandts. To use a more common analogy, if you’re looking for a golden needle in a haystack, the solution is not to simply buy more hay. The fallacy is obvious: the ratio of needles to hay stays about the same.
A better option is to take the less-is-more approach. Instead of investing in mountains of hay, why not try finding or creating haystacks that have a higher likelihood of containing golden needles? Another way of saying this is that you want a better collection of data assets to begin with.
Getting to an ideal data set
In general, there are three primary paths to a less-is-more data environment.
1) Machine learning
Using self-tuning algorithms, a machine learning solution (machine learning being a branch of artificial intelligence, or AI) can process and analyze data, and then make decisions as to which data is more or less useful based on the correlations between data values. Ideally, at the same time it’s making these decisions, the solution is also becoming slightly more attuned to the nature of your data set. Based on my experience, however, the results seldom live up to expectations. A typical machine learning solution may indeed identify certain basic correlations, but it currently requires investments in human judgment and oversight that outweigh the insights it delivers.
2) Hire a Chief Data Officer (CDO)
Hiring a creative, experienced CDO can give your organization the ability to envision the desired end state of your data environment. The answers you’re seeking may already be within reach, or might require additional thinking about data, and more involved work on data quality and data architecture. That being said, your organization needs to be ready for the CDO’s leadership, with fundamental data governance and other data capabilities in place. And even then, the CDO will need time and resources to build a team.
3) Gather metadata to gain better insights
A third option, and the one I would recommend, is more involved than the first two, but the results it delivers are significantly greater. Gathering metadata helps you understand your data environment and resources at a truly granular level. To achieve this insight, I would definitely not recommend any product that promises to simply hand your metadata to you at the push of a few buttons. Rather, effectively gathering metadata requires a highly focused communication process occurring over a period of time, emphasizing face-to-face interactions with a diverse set of data stakeholders.
This process allows you and the participants to collaborate around your data assets in an incredibly focused way. Moreover, it eventually produces the metadata and “tribal knowledge” about your data that can reveal valuable and often surprising connections between data, business processes and overall performance metrics. In fact, metadata also adds to the impact of the two approaches mentioned above. It improves the chances that an AI solution will be more effective; likewise, creating a complete metadata repository gives the CDO more information from which to make decisions.
Most of all, this approach gives you unparalleled insights into what data you have, where it resides, and who among your people is the best authority for each type of data you have. This in turn gives you a far greater ability to understand exactly where the Rembrandts are hidden in your attic, and to take steps to curate and protect them and maximize their value.
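To make the idea of a metadata repository a little more concrete, here is a minimal sketch in Python of what a single entry might record: what the data is, where it resides, and who the best authority on it is. Every name in it (DataAssetRecord, its fields, the two helper functions, and the sample entries) is a hypothetical illustration rather than part of any particular product or method, and a real repository would be shaped by what the stakeholder conversations actually turn up.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataAssetRecord:
    """One metadata entry. All field names are illustrative assumptions."""
    name: str        # what the data is
    location: str    # where it resides
    authority: str   # who among your people knows it best
    used_by: List[str] = field(default_factory=list)  # business processes relying on it
    notes: str = ""  # "tribal knowledge" gathered in conversations


def assets_for_process(catalog: List[DataAssetRecord], process: str) -> List[DataAssetRecord]:
    """Return every asset that a given business process relies on."""
    return [record for record in catalog if process in record.used_by]


def assets_by_authority(catalog: List[DataAssetRecord]) -> Dict[str, List[str]]:
    """Group asset names under the person who is the authority for them."""
    grouped: Dict[str, List[str]] = {}
    for record in catalog:
        grouped.setdefault(record.authority, []).append(record.name)
    return grouped


# Sample entries, invented for illustration (the first echoes the
# birdwatcher example earlier in this article).
catalog = [
    DataAssetRecord(
        name="birdwatcher_policy_correlation",
        location="internal report (hypothetical)",
        authority="junior analyst",
        used_by=["marketing"],
        notes="Correlation between premium life policies and avid birdwatchers.",
    ),
    DataAssetRecord(
        name="regulatory_filing_extract",
        location="finance data warehouse (hypothetical)",
        authority="compliance lead",
        used_by=["regulatory reporting"],
    ),
]

# Which assets could marketing draw on, and who should they ask about them?
for record in assets_for_process(catalog, "marketing"):
    print(record.name, "->", record.authority)
```

Even a structure this small answers the question the forgotten report could not: if someone in marketing goes looking, the finding and the person who understands it are both one lookup away.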