It is said that data is the oil of the digital age. Mine it, refine it and you’ll reap extraordinary rewards. But today many of our clients are shelving their Artificial Intelligence (AI) proofs of concept, instead of moving them into production. These companies know how to run pilots within business units. Yet, they struggle to organize large sets of clean, secure data needed for analytics and machine learning (ML).
But what if we think of data as water instead of oil? Then we start seeing data as an essential resource that ought to be high quality and travel securely from the source system to the consumer. Nobody disputes the importance of sustainable water management in our societies. Similarly, data governance is the springboard to a secure AI-driven future.
Data governance enables clean data to flow freely – and safely
A recent McKinsey survey found that AI high performers are three times more likely than others to have a clear data strategy and well-defined governance processes. With a new mindset and these four data governance best practices, every company can achieve similar results, i.e., increase revenue and cut costs from implementing AI in multiple areas:
- Master Data Management to ingest only the cleanest data: Imagine that your city just added a new source of water. The municipal water supplier will make sure the additional water is treated and made compliant with the city’s standards and only then mix it with the rest of the water.
All companies, but particularly large ones or those acquiring new businesses, must think of their master data in the same way. There needs to be a ‘single version of the truth’ for this data, which is non-transactional in nature and used across multiple departments. For example, sales, finance and procurement will all be using the same customer, product or vendor data. Clean, well-labeled master data is critical for aggregation and reporting, such as when summarizing sales by customer or product.
To get a single authoritative view of this critical data, companies can set business rules to manually clean and organize master data. Or better still, deploy tools to automate the process of cleansing, de-duplicating and synchronizing information used enterprise-wide and across business applications.
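To make the idea of automated cleansing and de-duplication concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular master data management tool works: the `CustomerRecord` fields, the source system names and the matching rule (same email address means same customer) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerRecord:
    source: str  # hypothetical source system, e.g. "crm" or "erp"
    name: str
    email: str


def cleanse(record: CustomerRecord) -> CustomerRecord:
    """Standardize formatting so equivalent records compare equal."""
    return CustomerRecord(
        source=record.source,
        # Collapse repeated whitespace and use consistent capitalization.
        name=" ".join(record.name.split()).title(),
        email=record.email.strip().lower(),
    )


def build_golden_records(records: list[CustomerRecord]) -> list[CustomerRecord]:
    """De-duplicate on the cleansed email address.

    Assumed business rule: two records with the same email are the same
    customer, and the first source encountered wins.
    """
    golden: dict[str, CustomerRecord] = {}
    for record in records:
        cleaned = cleanse(record)
        golden.setdefault(cleaned.email, cleaned)
    return list(golden.values())


# Made-up sample data from two source systems.
records = [
    CustomerRecord("crm", "  acme  corp ", "Billing@Acme.example "),
    CustomerRecord("erp", "ACME Corp", "billing@acme.example"),
    CustomerRecord("crm", "Globex Inc", "ap@globex.example"),
]
golden = build_golden_records(records)
for record in golden:
    print(record)
```

In practice the matching rule is the hard part: real tools typically combine several attributes and fuzzy matching rather than relying on a single exact key.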
- Data Security to protect the economy’s lifeblood: We want our water supply to be free from leaks and contaminants. Consumers similarly expect businesses to keep their sensitive information safe and sound.
To provide that security, organizations must focus on three things: availability, confidentiality and integrity of data. Together, these make sure that accurate and consistent information is available at the right time to authorized users.
Data security depends on the right policies and standards and on a sound architecture. Policies and standards represent the organization’s commitment to information security. They set the overall direction, specify a course of action, change infrequently and are mandatory, with any deviation requiring a formal exception process. The components of a security architecture include multiple tiers of virus protection, consistent encryption standards, user authentication with single sign-on, and role-based access control.
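As a rough illustration of role-based access control, the sketch below grants permissions to roles and roles to users, and then answers one question: may this user perform this action? The role names, user names and permission sets are hypothetical, and a production system would delegate this to an identity provider rather than in-memory dictionaries.

```python
# Hypothetical roles and the actions each one may perform.
ROLE_PERMISSIONS = {
    "analyst": {"read"},
    "data_steward": {"read", "write"},
    "admin": {"read", "write", "grant"},
}

# Hypothetical assignment of users to roles.
USER_ROLES = {
    "alice": {"analyst"},
    "bob": {"data_steward"},
}


def is_allowed(user: str, action: str) -> bool:
    """Return True if any of the user's roles grants the action.

    Unknown users and unknown roles get no permissions (deny by default).
    """
    roles = USER_ROLES.get(user, set())
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in roles)


print(is_allowed("alice", "read"))
print(is_allowed("alice", "write"))
print(is_allowed("bob", "write"))
```

The design choice worth noting is deny by default: anyone not explicitly mapped to a role gets no access.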
- Data Operations (data ops) to strengthen the plumbing: It’s the city’s job to maintain a strong water grid including pipes, pumps and service lines to keep clean water flowing safely from the source to the consumer.
Businesses have the same responsibility toward data. This is where data ops come in. Data ops constitute the entire infrastructure and pipeline through which data flows. Data originates at a source, which sends it to a centralized repository where it’s sliced and diced, and finally the analysis is reported via dashboards. As data flows through this complex system, data lineage tools like Orion Governance can track it, detect issues and fix errors.
Increasingly, the goal of data ops is to shorten the time between data ingestion, reporting and visualization. Accelerating data ops is becoming more important as companies turn to real-time analytics to support fast decision making. Fortunately, industry-standard tools such as Azure Data Services, Talend, and Informatica can improve and optimize data ops today.
- Data Stewardship to orchestrate it all: Just as sound water management is a broad, collective effort, data governance is a team effort. Every company must have senior, authorized people who act as data stewards. Data governance also needs C-level sponsors. Typically, the chief data officer (CDO) makes sure the data flowing across the enterprise is clean, consistent and aligned with business strategy, while the responsibility for keeping the pipes robust, safe and efficient lies with the chief information security officer (CISO).
Prepare for what’s next: the convergence of data governance with analytics governance
Today most companies understand “garbage in, garbage out”: any application of AI and ML will be only as good as the quality of the underlying data. Practices like organizing and cataloging data or creating a common dictionary to align the organization around the same language have become table stakes. To scale AI, companies must go much further and adopt the four practices I’ve described here.
Looking ahead, data governance will continue to evolve. Businesses are planning for a future in which analytics will be embedded throughout the business value chain. The Gartner 2020 report on Data Strategy predicts that by 2023, almost all of the world’s top 500 companies will have converged data governance and analytics governance. There’s a simple reason for that: insight gained from analytics cannot be meaningful without trustworthy underlying data.
What’s analytics governance and how does it integrate with data governance? We’ll discuss that in our next blog.