I was recently reminded that even the best technology is not always enough to implement a successful solution. Purview may be a great data governance tool, but the key to success is to also implement a well-planned data governance program around it.
Within the last year, I had the unique opportunity to experience two Purview implementation projects with very different approaches to data governance. One organization started from the perspective of implementing a data governance program and finding the best tool to support that program. The other started from the perspective of implementing the technology and then trying to figure out how to use it to provide data governance.
| Organization One | Organization Two |
| --- | --- |
| Established a formal Data Governance Office and dedicated staff. | Assigned responsibility to an existing data analytics group. |
| Business stakeholder involvement from the start of planning, through tool assessment, pilot and into MVP launch. | No business stakeholder input. |
| User focused. | Technology/administration focused. |
As you may have guessed, Organization Two had many challenges. Without a clear understanding of the value to be provided to users, it was difficult to obtain support from the business and executive management and to get dedicated resources assigned to the project. As a result, the initiative was always a low priority and progress was slow.
Below are the keys to successfully implementing data governance with Purview from my experience. You’ll notice that the technical platform and tools are just one out of seven key components for success.
- Enterprise-wide data governance vision, goals, and objectives.
- Executive and stakeholder buy-in and support.
- Dedicated data governance staff and participation from business and data domain owners.
- Domain-specific use cases and requirements based on how users interact with the data assets.
- Purview Data Catalog and Data Governance technical platform and tools.
- Runbooks, documentation, and workflows as necessary for run-the-business policies and procedures.
- Change Management for pre-launch and post-launch recurring communications & training.
In the following approach, I will incorporate these keys to success with the lessons learned from the two projects I mentioned earlier, as well as others, to guide you toward a successful data governance and Purview implementation and help you avoid the common pitfalls.
Phase 1: Data Governance Planning and Purview Proof of Concept
The main goal of this phase is to ensure core data governance objectives can be met and the right stakeholders become engaged with the project. In order to do this, you first need to know what your objectives are.
Start with Key #1 and define your enterprise-wide data governance vision, goals, and objectives. There are many reasons you may be starting this program, from wanting to get to the next level of a data-driven organization to recovering from a data breach, and your goals and objectives should align with these reasons. Take the time to gather requirements for the program. This is also a good opportunity to start working on Keys #2 & 4, stakeholder buy-in and domain-specific use cases.
Microsoft provides some example objectives to get you started with thinking about what you want to achieve. The following is my favorite vision statement and example objectives:
“Our vision for the Data Governance Program is to improve the use and usefulness of data across the company to promote the development of actionable insights. We will drive the evolution of data asset management and empower business units to derive value from their data through policies, standards, and collaboration.”
Goals & Objectives:
- Global Data Policy: Inform employees about the guidelines, frameworks, and related requirements for proper management of data and AI tools.
- Data Governance Operating Model: Develop and disseminate an operating model to empower data experts (e.g. data owners and stewards) while engaging senior stakeholders in each organization to ensure data assets and products are aligned with business needs and organizational strategy.
- Centralized Data Issue Management Platform: Provide and manage a central platform and appropriate channels to allow for reporting and escalation of issues with data assets.
- Enterprise Data Catalog: Implement and manage a catalog that documents and enriches data assets and associated metadata, allowing for improved knowledge sharing, data discovery, lineage analysis, and data quality & health monitoring.
- Training: Train and support employees on the different data governance roles & responsibilities, processes, and tools.
As you can see, you do not need to have all the details worked out at this phase, just an understanding of the direction you need to go.
While you are working on your data governance goals and objectives, you can start on Key #5 by standing up Purview and getting some technical resources trained on how to use it. They should learn how to:
- Navigate the Microsoft Purview governance portal
- Connect Data Sources
- Scan a data source
- Configure lineage
- Search and browse data assets
- Understand Classification and Labelling capabilities
The goal for this phase is to configure Purview to demonstrate some critical end-to-end scenarios for your data governance objectives.
By the end of this phase, you will need to have Key #2 started. Your Data Governance Program might be an IT program, but it isn't just about the technology. It's about fostering an enterprise-wide "data as an asset" mindset and creating a foundation in which data assets can be effectively managed. For this, you want to have the support of a core group of executives and key business stakeholders who understand the value data governance will provide the organization and will champion the program and grow support.
Phase 2: Purview Pilot
In this phase, you will launch Purview with a small set of users, maybe a department or other business unit. You should start with Key #4 and develop in-depth requirements for each domain’s specific end-to-end scenarios, such as lineage from Azure Data Lake Storage to Azure Synapse DW to Power BI, as well as key scenarios that will provide value across your organization, such as glossary terms. Then continue with your Purview implementation.
- Integrate data sources for the pilot domains. You may have to perform a comprehensive data discovery and cataloging across your organization, if you don’t already have an inventory of your data estate. While integrating data sources for the pilot, start planning for the infrastructure components that will be needed.
- Establish an enterprise taxonomy for data classification and tagging for organizing data assets to ensure consistent data definitions. This will provide a framework for data, allowing different systems to communicate with a common language.
- Establish Governance Policies to align with regulatory requirements, security standards and data management best practices. Data standards and policies are crucial for ensuring the integrity and usability of data within your organization. These standards and policies help in reducing data silos, ensuring compliance with regulations, and facilitating data access and literacy across the organization.
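Before configuring lineage in Purview for the pilot, it can help to write the end-to-end scenario down as a small graph and confirm with the domain's stakeholders that it matches reality. The following is a minimal sketch in plain Python, not a Purview API call; the asset names are hypothetical placeholders for the Azure Data Lake Storage to Azure Synapse DW to Power BI example mentioned above.

```python
from collections import defaultdict

# Hypothetical asset names for the pilot scenario; replace with your own.
# Each pair reads as (source asset, asset that is built from it).
LINEAGE_EDGES = [
    ("adls://raw/sales", "synapse://dw/fact_sales"),
    ("synapse://dw/fact_sales", "powerbi://reports/sales_dashboard"),
]


def upstream_sources(asset, edges):
    """Return every asset that feeds into `asset`, nearest first."""
    # Index the edges by target so we can walk backwards from a report.
    parents = defaultdict(list)
    for source, target in edges:
        parents[target].append(source)

    found = []
    queue = list(parents[asset])
    while queue:
        current = queue.pop(0)
        if current not in found:
            found.append(current)
            queue.extend(parents[current])
    return found


report_lineage = upstream_sources("powerbi://reports/sales_dashboard", LINEAGE_EDGES)
```

Walking backwards from the report gives the list of sources a data steward would expect to see once lineage is configured in the catalog, which makes gaps easy to spot during the pilot.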
The goal of this phase is to show how you will be able to successfully roll out data governance across the organization. A successful pilot phase will solidify Keys #2 & 3 by strengthening executive and stakeholder support and convincing them of the need for dedicated data governance resources and assigned ownership of data governance tasks across the organization. By the end of this phase, you should be able to answer these questions:
- What are my data domains?
- Who are the SMEs for each domain?
- Who are the business data owners?
- Who are the data stewards/curators?
- Who are the technical data owners?
- Who are the data source admins?
- Who are the consumers and business users of the data catalog, asset discovery, and insights?
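One way to keep track of the answers is a simple role matrix per domain that flags which roles are still unassigned. The sketch below is illustrative only: the role keys mirror the questions above, while the domain names and people are hypothetical placeholders.

```python
# Role keys mirror the questions above; the naming is an assumption.
REQUIRED_ROLES = [
    "sme",
    "business_data_owner",
    "data_steward",
    "technical_data_owner",
    "data_source_admin",
    "consumers",
]


def missing_roles(domains):
    """Map each domain name to the roles that have nobody assigned yet."""
    gaps = {}
    for name, assignments in domains.items():
        unfilled = [role for role in REQUIRED_ROLES if not assignments.get(role)]
        if unfilled:
            gaps[name] = unfilled
    return gaps


# Hypothetical pilot domains with placeholder names.
pilot_domains = {
    "Finance": {
        "sme": ["A. Analyst"],
        "business_data_owner": ["B. Controller"],
        "data_steward": ["C. Steward"],
        "technical_data_owner": ["D. Engineer"],
        "data_source_admin": ["E. Admin"],
        "consumers": ["Finance reporting team"],
    },
    "Sales": {
        "sme": ["F. Lead"],
        "business_data_owner": [],
    },
}

gaps = missing_roles(pilot_domains)
```

A domain that still appears in `gaps` at the end of the pilot is a signal that ownership has not been settled and that Key #3 needs more attention before the MVP.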
Phase 3: MVP
It’s time to plan your production milestones!
Based on what you learned from your Purview Pilot, create a Center of Excellence and onboard the team of data stewards/curators, technical owners, and business data owners to finish the work needed to make Purview useful across your whole organization. They will then work with your data governance organization to define the domains, agree on what the implementation of data governance will accomplish for each domain, and jointly determine prioritization.
A business domain is a grouping of data with similar characteristics that is needed to satisfy business process requirements. Within the domains, there will be data products that harness the value of the business domain’s data. A data product is a curated and self-contained combination of data, metadata, semantics, and templates. This means it is data that can be easily used for analytics use cases and can be tailored to the business domain’s specific needs.
The Data Product Operating Model is a federated data governance operating model. This means the data in each domain is managed, maintained, and operated by a specific business domain team.
- Determine each domain's current level of data-driven activities and data governance readiness.
- Inventory current data-related pain points to understand the challenges currently faced and how data governance may help solve them.
- Map your current data flow between the sources and your analytics products.
- Inventory each domain’s current data products.
- Prepare a Data Glossary of the key data assets within each domain’s data product(s) to catalog your data, increasing its usability.
- Set up your Purview MVP implementation project to integrate the highest priority data domains and define data products.
- Determine potential resources to join the project team and help onboard the business domains.
- Create a plan to go to production, which should also include the key areas of infrastructure and security.
To successfully launch your data governance and Purview MVP solution, you should complete Keys 6 & 7. Data governance runbooks and supporting documentation should provide teams with the resources needed to implement data governance into their domains. Define the required processes for run-the-business policies and procedures and to manage issues. Issue Management is a crucial process for addressing, managing, and resolving data issues formally by defining ownership of data issues, tracking their resolution, and ensuring others are informed. Then develop your training and communications program for both pre- and post-launch.
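The three elements of Issue Management named above (ownership, tracked resolution, and keeping others informed) can be captured in a very small record structure when drafting the runbook. This is a minimal sketch under assumed names; the statuses, fields, and example values are hypothetical and not part of Purview.

```python
from dataclasses import dataclass, field
from datetime import date

# Assumed lifecycle for illustration; adapt to your own runbook.
STATUSES = ("open", "in_progress", "resolved")


@dataclass
class DataIssue:
    title: str
    domain: str
    owner: str                                    # who is accountable for resolution
    opened: date
    status: str = "open"
    watchers: list = field(default_factory=list)  # who must be kept informed
    history: list = field(default_factory=list)   # audit trail of status changes

    def update(self, new_status, note, on):
        """Record a status change and return the people to notify."""
        if new_status not in STATUSES:
            raise ValueError(f"unknown status: {new_status}")
        self.status = new_status
        self.history.append((on, new_status, note))
        return list(self.watchers)


# Hypothetical example issue.
issue = DataIssue(
    title="Duplicate customer records",
    domain="Sales",
    owner="Sales data steward",
    opened=date(2024, 1, 8),
    watchers=["Sales business data owner", "Data Governance Office"],
)
to_notify = issue.update("in_progress", "Deduplication rule drafted", date(2024, 1, 10))
```

Whatever tool ends up holding these records, the runbook should make the owner, the status history, and the notification list explicit for every issue.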
Phase 4: Data Governance and Purview Production Optimization
By the time you reach this phase you should have effective data governance processes in place. In this phase you will start to integrate the data and business domains that were not included in the MVP.
The delivery of data governance and the Purview Data Catalog at scale provides a source of truth rather than just granting access to the multiple systems of record. Data products should be built and highly governed to provide purpose-built data sets for specific use cases. These products should be aligned to business scenarios rather than technical or administrative requirements.
- Map out the relationships to data across the organization including lineage and other visual representations to make data assets easier to find. (“42% of organizations say at least half of their data is “dark”—that is, unknown or unused for business purposes.” - Microsoft)
- Organize data across the organization in a meaningful manner that can also be easily permissioned and reviewed.
- Create corporate and domain-level certified glossary terms to enable data curators to continually add metadata to their data assets that evolves with the business.
- Associate specific metadata that is meaningful to the users who interact with the data.
- Standardize data access workflows.
Data governance is an ongoing process. Regular assessments and adjustments are key to maintaining effective governance. And remember, this is just the start for many things data and analytics; data governance will also help your organization prepare for growing trends such as AI.
If you want to discuss more about what you can do to get started on data governance with Purview and take advantage of new AI capabilities, please don’t hesitate to contact us.