Updated: May 29
This article is more an informed opinion than a description of the lay of the land for Data Governance. Prodago has been thinking about this a lot. In part, we are selfishly trying to clarify our messages and boil the problems affecting organizations down to their essential, painful components. But in part, we are also exposing what we all know to be true: Data Governance exists in most organizations, but it fails to live up to the value executives expect.
It helps to identify the problems and their root causes. This is the first step in finding a long-lasting solution.
The article aims to stir the conversation so that senior leaders can better understand where things break down and examine how we can address them.
How did we get here?
Traditional Data Governance was designed years ago, at a time when IT needed to involve the business in making decisions around data. It started with system access, data quality, and data definitions. Since then, it has evolved to manage quite a few more things (see 9 data aspects managed by successful Information Governance programs).
But quite a few pressures are forcing us to reconsider HOW we are to govern data. What makes it particularly difficult for leaders is that we are used to observing a problem and then solving it. But more problems to solve have been slowly creeping up on us. We just didn't see them coming!
If we step back a little, the combination of these trends together has created the perfect storm:
We are managing and generating more and more data.
The cloud offers virtually unlimited capacity, but it takes time to move everything there. So we find more and more environments where data sits, or that hold copies of it.
AI, analytics, and automation mean we need to use this data in more use cases.
Perhaps related to the proliferation of use cases, but certainly beyond argument: more people need the data to do their work.
Privacy regulation is now politically fashionable, and for good reason. The rules are getting more stringent, especially in the B2C space.
Now that we can automate decisions, we need to worry about bias, ethics, and acting responsibly; failing to do so puts the organization at risk.
The impact of this storm is that there is a lot more to do. New processes are required to both create value and protect against data risks in these areas:
Ensure data quality.
Use and share data.
Historically, Data Governance has not done as much in the areas of processing or using/sharing the data. Data Privacy and Automated Decisions are changing this. But what we want is to enhance Data Governance, not create more silos where, for example, Data Privacy is managed via a completely different set of processes.
What does it mean for organizations?
As Melvin Udall (the character played by Jack Nicholson) said in the movie "As Good As It Gets": "I'm drowning here, and you're describing the water!" We have no control over the external factors that got us into this pickle. Thankfully, we can derive useful and actionable information from the situation.
Data privacy is everyone's job
Now that breaches and hacks are in the news, companies must pay serious attention to protecting confidential information, not to mention the laws most companies are subject to. The problem is that protecting data and complying with rules require us to manage PROCESSES and DATA, not just data.
In the past, the quality of data in a data warehouse could be IT's responsibility, with some direction from project or domain data stewards. Ensuring the privacy of information, if you think about it, goes way, way beyond that. Just think about how data is used or shared, and the scope quickly escalates. How do you verify that everyone is following the rules?
We must recognize that just as we had to develop data quality KPIs to manage data quality, coming up with KPIs around data security and data privacy requires harmonizing and unifying how all stakeholders will participate. Regulations are becoming quite specific and prescriptive. The same data may need a different approach to comply with a rule depending on its lifecycle stage, where it sits, how it is aggregated, or how it is used.
By managing the processes that manage data, we can check:
whether the process is in place;
whether it is adequate to support compliance;
whether a person or group is duly accountable for executing it;
whether it has been completed and produced the expected result.
We can clearly create KPIs to track all of this.
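As a thought experiment, the four checks above can be sketched as a small data structure and a KPI function. This is a hypothetical Python sketch: the field names, process names, and scoring rule are invented for illustration, not taken from any real governance tool.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ProcessCheck:
    """Evidence gathered about one data-managing process.
    All field and process names here are illustrative only."""
    name: str
    in_place: bool                        # is the process defined and deployed?
    supports_compliance: bool             # is it adequate for the applicable rule?
    accountable_owner: Optional[str]      # person or group duly accountable
    completed_with_expected_result: bool  # did it run and produce what we expect?

def compliance_kpi(checks: List[ProcessCheck]) -> float:
    """A simple KPI: the fraction of processes passing all four checks."""
    def passes(c: ProcessCheck) -> bool:
        return (c.in_place
                and c.supports_compliance
                and c.accountable_owner is not None
                and c.completed_with_expected_result)
    return sum(passes(c) for c in checks) / len(checks)

checks = [
    ProcessCheck("consent capture", True, True, "Privacy Office", True),
    ProcessCheck("retention purge", True, False, None, False),
]
print(compliance_kpi(checks))  # 0.5 — one of two processes fully compliant
```

Once each check is recorded this way, the same records can feed dashboards or audit reports instead of being rediscovered manually every time.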
It's too big to do manually
Many organizations still enforce Data Governance policies and standards through ad hoc, manual, or outdated tools. Data teams try to vet reports and data sets, setting up custom rules all over the place and comparing expected numbers. Modern technology stacks and the explosion of data mean that this old way is inefficient and can't scale.
Even as the problem grows, it still needs to be managed. We will distribute the work to its rightful owners and automate the validation of its completion. Wherever process owners sit in the value chain, they will optimize some of the processes they are responsible for by automating them with technology, which might use AI to do some of the heavy lifting.
These are technologies we are used to, including lineage and mapping, data quality monitoring, etc. But because of the sheer size of the problem (see the previous section on how we got here), we can't just leave it to the various stakeholders or data stewards to figure out. We need to add discipline to the "orchestration" of how everyone needs to participate; we need to manage the processes of managing or manipulating data.
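To make the automation point concrete, here is a minimal, hypothetical sketch of rule-based validation in Python. The rules, field names, and datasets are invented for illustration; real data quality monitoring tools are far richer, but the principle is the same: declare the rules once, then run them everywhere, instead of hand-comparing expected numbers report by report.

```python
import re

# Illustrative rules only: each maps a rule name to a check over one record.
RULES = {
    "email_format": lambda rec: re.fullmatch(
        r"[^@\s]+@[^@\s]+\.[^@\s]+", rec.get("email", "")) is not None,
    "age_in_range": lambda rec: 0 <= rec.get("age", -1) <= 120,
}

def validate(records):
    """Return, per rule, the share of records that pass it."""
    results = {}
    for name, check in RULES.items():
        passed = sum(1 for r in records if check(r))
        results[name] = passed / len(records)
    return results

data = [
    {"email": "ana@example.com", "age": 34},
    {"email": "not-an-email", "age": 34},
]
print(validate(data))  # {'email_format': 0.5, 'age_in_range': 1.0}
```

The same pass rates can then be rolled up into the governance KPIs discussed earlier, or wired to alerts when a rule's pass rate drops below a threshold.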
Manage the processes that manage data
The body of practices we have developed over the years for Data Governance is still good. But there will be far more to do. Leaving things to data stewards to figure out will no longer work. We will have to map out the processes in detail, because this detail will be required for audits anyway. The good news is that it will also drive data operations.
Managing the processes that manage data, rather than just the data, will also help as privacy laws change. We will need agility to understand the operational impacts and react quickly.
Adding a layer of process management will give more visibility to everyone in the organization, not least to executives, who have been somewhat in the dark, having to trust that someone is doing their job well enough to prevent, say, a data breach.