What Do Great Data Governance Policies Look Like?
“What do you think good data governance policies look like?” the interviewer asked me.
I was interviewing for a data governance manager position when they asked me this. It was a bolt from the blue.
Up to that point, I had been writing documentation, conducting interviews, and performing governance tasks. I hadn’t considered how a good data governance policy was defined, or what characteristics good policies share.
Since then, I’ve boiled it down to 4 things:
- Publicly Known
- Realistic and Reasonable
- Enforceable
- Balances User Needs with Company Needs
Let’s get an overview before we delve into each.
Overview
The goal of governance policies is not solely to enforce a system, but to regulate the use of data to ensure:
- Consistency
- Chain of Responsibility
- Legal and Ethical Compliance
- Data Quality
- Trust Between Business Teams and Users
They serve as a framework and a guide for teams that work with machine learning models, build data models, or manage data, and for the business users who consume that data.
Data governance policies that are too conditional (if-then) can be problematic. They may be difficult to interpret and apply consistently in different situations. They can also be easily circumvented or ignored, which increases the risk of poor data insights, improper use of data/applications, or worse, legal liability.
Rigid rules can be a major pain point for data governance policies. They end up costing time and money to revise.
Policies should not be too vague or too specific. Their goal is to provide guidelines that are open enough to interpretation, and that are refined as data maturity increases, use cases change, or the business scales.
Now, let’s look into the 4 rules:
Publicly Known and Clear
Good data governance must be known to all who are being governed by it. The governed must reasonably participate in the formation of the governance laws. The rules must be clear.
To be known means that the policies being used must be easily accessible. Users need to know that policy rules exist.
Documentation of data governance rules must be accessible and available. It also means that users governed by these rules can easily locate the laws. A central repository, such as Microsoft SharePoint, should exist so users can view the policies and the logic and other context behind them.
Reasonable participation means those who are ruled by data governance policies must have a say in creating, revising, and removing laws. Data policies need to address both business need and actual user need.
Clarity means that important terms are defined and unambiguous. The wording of the policy is transparent and can be explained to the user. It also means that a data steward or other designated individual is available to explain a concept.
Users who are subject to regulations need to know what they are, so that they can adjust their workflows, operational processes, and work appropriately.
This makes users and teams more willing to adopt policies.
Communication is critical to the data governance process.
Realistic
All data rules must be based on realistic and reasonable use cases of data and follow legal rules and regulations. This prevents illegal, unusual, or unreasonable use cases from being used in the formulation of data governance policies.
A policy should not be too vague or too specific. Its goal is to provide guidelines that are open to interpretation and refined as data maturity increases.
Realistic rules also have reasonable use cases.
Reasonable use cases assume that the creator(s) of the use case:
- Have weighed the foreseeable risk of harm of a use case against its utility.
- Know the extent of the use case.
- Have assessed the likelihood that a use case will cause legal, financial, or reputational harm to users or the company.
- Have considered alternatives of lesser risk, and the costs of those alternatives.
- Have considered the current tools and resources available.
This standard assumes an ideal rational individual, who considers their actions before creating a use case.
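One way to make these criteria concrete is to record them as a checklist that a reviewer fills in for each proposed use case. The sketch below is only an illustration: the `UseCaseReview` class, its field names, and the example use case are all hypothetical, and a real review would involve judgment that a set of booleans cannot capture.

```python
from dataclasses import dataclass, fields


@dataclass
class UseCaseReview:
    """Hypothetical checklist mirroring the reasonableness criteria listed above."""
    name: str
    risk_weighed_against_utility: bool = False
    extent_known: bool = False
    harm_likelihood_assessed: bool = False
    lower_risk_alternatives_considered: bool = False
    tools_and_resources_considered: bool = False

    def unmet_criteria(self) -> list[str]:
        """Return the names of the criteria that have not been checked off."""
        return [
            f.name
            for f in fields(self)
            if f.name != "name" and not getattr(self, f.name)
        ]

    def is_reasonable(self) -> bool:
        """A use case passes only when every criterion has been addressed."""
        return not self.unmet_criteria()


# Example: a use case whose creator has addressed three of the five criteria.
review = UseCaseReview(
    name="Share churn scores with the marketing team",
    risk_weighed_against_utility=True,
    extent_known=True,
    harm_likelihood_assessed=True,
)
print(review.is_reasonable())
print(review.unmet_criteria())
```

Here the review would not pass, and the unmet items tell the creator what to go back and think through before the use case feeds into a policy.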
Policies need to be based on use cases that come from two levels: the business and the users. The business operates at a strategic and operational level, while users operate at a tactical or contributor level.
To enable reasonable and realistic use cases, it’s important that data governance policies have a review process, either through a data steward or a data governance council.
Must Be Enforceable
Governance policies don’t mean much if they aren’t enforceable. Without a means to ensure compliance, data governance policies are words on a document.
This has two parts. Governance policies must have a means of enforcement and must not discriminate.
Means of Enforcement.
There must be a data steward or an owner of the data. This person defines the schema, how the data is used, and enforces data contracts created by the producers of the data. There must be penalties and review for misuse of data. This helps keep trust in data and makes sure that policies are respected.
It would also be advisable for the data policies to have a sponsor in addition to a data steward and owner. This ensures the policies not only have backing, but also have the legitimacy to be enforced.
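To give a feel for what enforcing a data contract could look like in practice, here is a minimal sketch. It assumes a contract is nothing more than a dataset name, a steward, and a mapping of column names to expected types. The `DataContract` class, the `find_violations` function, and the sample dataset are hypothetical; real contracts usually cover much more, such as freshness, nullability, and ownership.

```python
from dataclasses import dataclass


@dataclass
class DataContract:
    """Hypothetical contract: a dataset, its steward, and expected column types."""
    dataset: str
    steward: str
    schema: dict  # column name -> expected Python type


def find_violations(contract: DataContract, record: dict) -> list[str]:
    """Compare one record against the contract and describe each mismatch."""
    violations = []
    for column, expected_type in contract.schema.items():
        if column not in record:
            violations.append(f"missing column: {column}")
        elif not isinstance(record[column], expected_type):
            violations.append(
                f"{column}: expected {expected_type.__name__}, "
                f"got {type(record[column]).__name__}"
            )
    # Columns the contract never mentions are flagged for review as well.
    for column in record:
        if column not in contract.schema:
            violations.append(f"unexpected column: {column}")
    return violations


# Example data, invented for illustration.
contract = DataContract(
    dataset="customer_orders",
    steward="jane.doe@example.com",
    schema={"order_id": int, "customer_email": str, "amount": float},
)
record = {
    "order_id": 1001,
    "customer_email": "a@example.com",
    "amount": "19.99",  # a string where the contract expects a float
    "notes": "rush",    # a column the contract does not define
}
for violation in find_violations(contract, record):
    print(violation)
```

A check like this only detects problems. Who gets notified, and what the penalty or review looks like, is still a decision for the steward and the policy itself.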
Must Not Discriminate.
Enforced data governance policy rules are applied to all the users of the data that the policy governs. This also assumes that the users of the data applied due diligence in understanding the policy, both for themselves and outside users.
“Must not discriminate” applies to the way that data is collected, used, and managed by employees — for internal use of the data.
Employees must ensure that they are not collecting or using data in a way that unfairly impacts or disadvantages individuals based on their personal characteristics.
Without enforcement, data governance policies and procedures may not be taken seriously and may not be followed consistently.
This can lead to problems such as data errors, inconsistencies, and breaches, which can be costly for an organization.
Must Balance Business and User Needs
Good governance policies should take into account the broader interests of business as well as the needs and accessibility of individual contributors / data teams.
Business needs revolve around using, storing, and obtaining data to answer questions. User needs center on the ability to access and use data for projects, analysis, etc.
Governance policies that are too strict cause bottlenecks for data teams and their workers, which leads to increased development time, higher cost, and an inability to scale. Policies that are too loose open the business up to risk from how the data is used and stored.
Involve stakeholders from different departments and levels of the organization in the data governance process. This can help ensure that the needs of different user groups are considered. Data governance policies and procedures must align with the overall goals and objectives of the organization.
Be transparent and communicative in the data governance process. This can help build trust and understanding with users, and ensure that they are aware of the policies and procedures in place, and why they are important.
Good governance policies should not suppress the business. Nor should they prevent business users from obtaining data, or inhibit larger strategic and operational objectives.
Conclusion
I’d like to emphasize that what is considered good data governance policy will differ from organization to organization, and even within industries.
Different organizations have different use cases, data products, users, and corporate culture. It’s important to make sure that governance rules are a framework that evolves with time, review, and experience.
Good governance will evolve. And while policies may be vague initially, they will grow more specific as governance and data culture at a company evolve.