
Master Data Management for the Insurance Industry

CluedIn

Master Data Management (MDM) is a crucial process for many industries, including insurance. MDM involves the creation and management of a central repository of master data, which is used to support a wide range of business processes and decision-making activities. In the insurance industry, MDM is particularly important because of the large amount of data that insurers must manage in order to accurately assess risks, underwrite policies, and settle claims.

The Role of Master Data Management in Insurance

MDM plays a critical role in the insurance industry by providing a single source of high-quality data that can be used to support a range of business processes. This includes:

  • Risk Assessment: In order to accurately assess risk, insurers need access to a wide range of data, including demographic information, credit scores, and historical claims data. By consolidating this data, insurers can more easily analyze and leverage this information to identify trends and patterns that can help them make more educated underwriting decisions.
  • Underwriting: Once insurers have assessed risk, they must decide whether or not to underwrite a policy. This involves evaluating a range of factors, including the policyholder's history, the type of policy being offered, and the level of risk associated with the policy. By using MDM to manage this data, insurers can make more informed underwriting decisions, resulting in more accurate pricing and better risk management.
  • Claims Processing: In the event of a claim, insurers must quickly and accurately process the claim in order to satisfy the policyholder and minimize their own costs. MDM can be used to manage all of the data associated with the claim, including the policyholder's information, the type of claim being made, and any relevant documentation. This can help insurers quickly process claims and reduce the likelihood of fraud.
  • Compliance: The insurance industry is heavily regulated, with strict requirements for data management and reporting. MDM can help insurers ensure that they are meeting these requirements by supporting data governance policies and procedures, automatically categorizing and masking sensitive personal information and providing detailed data lineage.

What is the Business Impact of Master Data Management?

Currently, the biggest opportunity in MDM for insurance companies is the ability to organize data in new and innovative ways to enable advanced analytics, Artificial Intelligence (AI), Machine Learning (ML), and cognitive learning systems. Data-driven organizations are already using MDM architectures to “future-proof” their business by anticipating customer expectations and streamlining operations.

For example, CX management is the source of organic revenue growth for many insurers, and a modern MDM system can take the art and science of managing customer relationships to new levels. By consolidating data from individual policies and aggregating them into a customer/household view, or golden record, insurers can:

  • Use advanced analytics including AI to up-sell/cross-sell more efficiently and effectively
  • Determine customer channel preferences and communicate, service, market and sell accordingly
  • Understand the status of claims reported, paid and outstanding at the customer/household level
  • Develop customer-level and household-level profitability scores
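
To make the golden-record idea concrete, here is a minimal, illustrative Python sketch (the field names are hypothetical, not CluedIn's data model) of rolling individual policies up into a household-level view that also surfaces cross-sell gaps:

```python
# Illustrative only: roll individual policies up into a household "golden record".
from collections import defaultdict

policies = [
    {"policy_id": "P-001", "household_id": "H-10", "holder": "A. Jensen", "type": "auto", "premium": 820.0,  "open_claims": 0},
    {"policy_id": "P-002", "household_id": "H-10", "holder": "B. Jensen", "type": "home", "premium": 1240.0, "open_claims": 1},
    {"policy_id": "P-003", "household_id": "H-11", "holder": "C. Okafor", "type": "auto", "premium": 910.0,  "open_claims": 0},
]

households = defaultdict(lambda: {"members": set(), "products": set(),
                                  "total_premium": 0.0, "open_claims": 0})
for p in policies:
    h = households[p["household_id"]]
    h["members"].add(p["holder"])
    h["products"].add(p["type"])
    h["total_premium"] += p["premium"]
    h["open_claims"] += p["open_claims"]

for hid, h in sorted(households.items()):
    # Product lines the household does not yet hold are a simple cross-sell signal.
    gaps = {"auto", "home", "life"} - h["products"]
    print(hid, len(h["members"]), h["total_premium"], h["open_claims"], sorted(gaps))
```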

Diving a little deeper, once an MDM solution is in place, insurance firms benefit in a number of ways:

  • 360° customer view – MDM enables a holistic 360° customer view that greatly improves business insights around customer sentiment and demand. This view integrates back to the master data source, ensuring the validity and accuracy of the insights gained. The golden record takes innovation in sales, service, and marketing to new levels of creativity and personalization.
  • Streamlined Customer Data Integration (CDI) – Good MDM practices enable streamlined CDI, reducing the day-to-day data management burden and releasing resources to focus on value-driven projects.
  • New Cross-Selling Opportunities – Advanced analytics tools can reveal hidden insights previously unknown to the organization. Insurance firms can use this insight to identify cross-selling opportunities and to prioritize specific customers or demographics with tailored sales tactics.

Challenges of Master Data Management in Insurance

Data Quality: Insurance data can be complex and difficult to manage, with a wide range of data sources and formats. While traditional MDM systems have struggled to cope with semi-structured and unstructured data, augmented platforms such as CluedIn are capable of ingesting poor quality data in almost any format in order to consolidate, clean and enrich the data ready for use.
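
As a rough illustration of the kind of consolidation step described above (this is not CluedIn's actual pipeline, just a sketch), the following Python snippet normalises two messy source records and collapses them into one:

```python
# A toy normalisation and de-duplication pass over two inconsistent source records.
import re

raw_records = [
    {"name": "ACME Insurance Ltd.", "email": "Claims@ACME.COM ", "postcode": "ab1 2cd"},
    {"name": "Acme  Insurance Limited", "email": "claims@acme.com", "postcode": "AB1 2CD"},
]

def normalise(rec):
    name = re.sub(r"\s+", " ", rec["name"]).strip().lower()
    name = re.sub(r"\b(ltd|limited)\b\.?", "ltd", name)   # harmonise legal suffixes
    return {
        "name": name,
        "email": rec["email"].strip().lower(),
        "postcode": rec["postcode"].replace(" ", "").upper(),
    }

# De-duplicate on a simple composite key; real matching would be fuzzier.
deduped = {(r["email"], r["postcode"]): r for r in map(normalise, raw_records)}
print(list(deduped.values()))  # one consolidated record
```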

Data Integration: Insurance data is often siloed in different systems and databases, which can make it difficult to integrate this data into a single MDM repository. Historically, this would require significant data mapping and integration efforts. However, more advanced systems like CluedIn can easily cope with hundreds of different data sources.

Governance: MDM requires strong governance to ensure that the data is managed effectively and efficiently. This includes establishing clear policies and procedures for data management, as well as providing ongoing training and support to employees. A popular option for many organizations is to use a data governance platform in conjunction with an MDM system in order to ensure that data is handled in accordance with the governance standards set as well as being easily accessible and usable by business users in various teams.

Cost: Implementing a traditional MDM system was a costly endeavour, requiring significant investments in software, hardware, and personnel. The need to model and map data beforehand also added months to the length of time taken to realize any value from these investments. All of this has changed with the advent of augmented MDM systems, which remove the need for upfront data modelling and use modern technologies like Graph to allow the natural relationships between the data to emerge. Contemporary MDM systems are also Cloud-native, which means that they offer the advantages of both scale and efficiency inherent to the Cloud.

Conclusion

Despite the obvious benefits of MDM, the barriers of traditional approaches have, until now, prevented many insurers from investing in this technology. With many of those hurdles now cleared, the path has opened up for insurers who want to use their data to fuel the insights and innovations they need to remain competitive and profitable. Improvements in business processes, streamlining operations, and managing risk are all vital to the success of an insurance provider, and MDM provides the foundation of trusted, business-ready data that enables them.


Data Governance and Master Data Management. What is the difference and why do I need both?

CluedIn

Data Governance and Master Data Management (MDM) are both important components of managing an enterprise's data assets. While they have somewhat different goals and remits, they are complementary and work together to ensure that an organization's data is accurate, consistent, and secure. The close relationship between the two can often lead to confusion over which discipline is responsible for different areas of data management, and sometimes means that the terms are used interchangeably.

Let's start by defining what Data Governance and Master Data Management are:

Data Governance: 

Data Governance refers to the overall management of an organization's data assets. This is the process of managing the availability, usability, integrity, and security of the data. It involves establishing policies, procedures, and standards for data usage and ensuring that they are followed by everyone who interacts with the data. The primary objective of Data Governance is to ensure that data is properly managed and that it is used in a way that aligns with the organization's goals and objectives.

Some of the key components of Data Governance include:

  • Data policies: These are formal statements that outline how an organization's data should be managed, who has access to it, and how it should be used.
  • Data standards: These are established guidelines and rules that govern how data is collected, stored, and used across the organization.
  • Data stewardship: This is the process of assigning ownership and responsibility for managing specific data elements within an organization.
  • Data quality: This refers to the overall accuracy, consistency, completeness, and timeliness of an organization's data.
  • Data security: This involves protecting data from unauthorized access, theft, or loss.

Master Data Management

This is the process of creating and maintaining a single, accurate, and consistent version of data across all systems and applications within an enterprise. It involves identifying the most critical data elements that need to be managed, and then creating a master data record that serves as the authoritative source for those elements. The primary objective of MDM is to ensure that these critical data elements are accurate, complete, and consistent across the enterprise.
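
For illustration only, the sketch below shows a tiny survivorship rule of the sort a master data record might be built with: the same customer arrives from two hypothetical source systems and is merged into a single record, with the most recent non-empty value winning.

```python
# Illustrative survivorship rule: merge duplicate source records into one master record.
from datetime import date

crm  = {"customer_id": "C-42", "name": "Maria Silva", "phone": "",                 "updated": date(2023, 1, 5)}
bill = {"customer_id": "C-42", "name": "M. Silva",    "phone": "+44 20 7946 0000", "updated": date(2023, 3, 9)}

def merge(records):
    master = {}
    # Walk records oldest to newest so newer values win, but never overwrite with an empty value.
    for rec in sorted(records, key=lambda r: r["updated"]):
        for field, value in rec.items():
            if field != "updated" and value:
                master[field] = value
    return master

print(merge([crm, bill]))
# {'customer_id': 'C-42', 'name': 'M. Silva', 'phone': '+44 20 7946 0000'}
```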

Some of the key components of Master Data Management include:

  • Data modeling: This involves defining the structure and relationships between different data elements and creating a data model that represents the organization's master data.
  • Data integration: This involves integrating master data from various sources and systems to create a single, authoritative source of master data.
  • Data quality management: This involves ensuring that the master data is accurate, complete, and consistent across all systems and applications.
  • Data governance: This involves establishing policies, procedures, and standards for managing master data and ensuring that they are followed by everyone who interacts with the data.
  • Data stewardship: This involves assigning ownership and responsibility for managing specific master data elements within an organization.

It is fair to say that there are several areas of data management in which both Data Governance and Master Data Management have a role to play. For example, defining data quality standards and policies would most likely fall under the remit of Data Governance, whereas assuring the integrity, consistency, and relevance of individual records is the responsibility of Master Data Management. Similarly, data stewardship also has a foot in each camp. While it is generally Data Governance policies that specify how data should be managed and maintained, it is Master Data Management platforms that provide the tools for data stewards to ensure that these policies are followed.

The main differences between Data Governance and Master Data Management are:

  • Focus: Data Governance focuses on managing an organization's data assets as a whole, while MDM specifically targets critical data elements.
  • Scope: Data Governance covers all data assets within an organization, while MDM is concerned only with master data.
  • Objectives: Data Governance aims to ensure that data is properly managed and used in a way that is compliant and secure, and that aligns with the organization's goals and objectives. MDM aims to ensure that critical data elements are accurate, consistent and ready for use by all systems and applications.
  • Processes: Data Governance involves developing and implementing policies, procedures, and standards for managing data, while MDM involves creating and maintaining a single, authoritative source of master data.
  • Ownership: Data Governance involves designating ownership and responsibility for managing all data within an organization, while MDM enforces those roles and responsibilities for managing specific data assets.

Do I really need Data Governance and Master Data Management tools?

If you want to be able to use your data for value creation, and do so in a compliant and secure way, then the answer is yes.

Data Governance and Master Data Management are complementary disciplines in the sense that they both work towards ensuring the quality and integrity of an organization's data assets. Here are some of the specific ways in which they complement each other:

  1. Data Governance provides the framework for MDM: A robust Data Governance framework provides the foundation for MDM. It establishes the policies, standards, and procedures for data usage that MDM relies on to create and maintain accurate and consistent master data records.
  2. MDM ensures data consistency across systems: MDM provides a single, authoritative source of master data that is consistent across all systems and applications within an enterprise. This helps to ensure that data is not duplicated or inconsistent across different systems, which can lead to errors and inefficiencies.
  3. Data Governance ensures data security and privacy: Data Governance policies and procedures help to ensure that sensitive data is properly secured and that data privacy regulations are adhered to. MDM relies on these policies and procedures to ensure that master data records are secure and comply with data privacy regulations.
  4. MDM enables effective decision-making: With accurate and consistent master data records, organizations can make better decisions based on reliable data. Data Governance ensures that the data is trustworthy, while MDM ensures that the data is accurate and consistent across all systems.

Benefits of implementing Data Governance and Master Data Management

Improved data quality:
Data Governance ensures that data is properly managed and secured, while MDM ensures that critical data elements are accurate and consistent across all systems. Together, these concepts help to improve the overall quality of an organization's data.

Regulatory compliance:
Data Governance policies and procedures help to ensure that an organization complies with data privacy regulations and other regulatory requirements. MDM relies on these policies and procedures to ensure that master data records are compliant with these regulations.

Better decision-making:
Accurate and consistent data is essential for effective decision-making. With Data Governance and MDM, organizations can rely on trustworthy data to make better decisions.

Cost savings:
Inaccurate or inconsistent data can lead to costly errors and inefficiencies. Data Governance and MDM help to reduce these costs by ensuring that data is accurate, consistent, and properly managed.


Conclusion

Data Governance and Master Data Management are complementary yet independent disciplines of data management. Both have distinct areas of responsibility and roles to play within a data estate, and in practical terms, there is little overlap between the two. While Data Governance provides the overall framework within which Master Data Management operates, one doesn’t necessarily have to come before the other and either can work autonomously.

However, as with most technology fields, the real value comes from having a set of tightly integrated tools and systems that work together to deliver greater value than the sum of their individual parts. That is certainly the case with Data Governance and Master Data Management. Organizations are demanding more from their data than ever before – they want more insights, more intelligence, and as a result, more opportunities to grow the business. Meeting that need means that you can’t afford to waste valuable time and money wrangling with data that is of poor quality and difficult to access. In combination, Data Governance and Master Data Management can provide a reliable, trusted pipeline of data that is ready to deliver insight across the business, and that is what most organizations today need to succeed.


A Brief History of the Microsoft Intelligent Data Platform

CluedIn

The Microsoft Intelligent Data Platform is a suite of tools and services that enable businesses to manage and analyze large amounts of data. Although not officially launched until 2022, the origins of this powerful ecosystem can be traced back over 30 years. The platform has evolved over time to keep pace with changing technologies and business needs, and most recently was expanded to include technology, consulting and ISV partners to complement and build upon its capabilities.

Here's a brief history of the Microsoft Intelligent Data Platform:

The Origins of SQL Server (1989-1995)

The origins of the Microsoft Intelligent Data Platform can be traced back to the early days of SQL Server, which was first released in 1989 for the OS/2 operating system. SQL Server was designed to be a relational database management system (RDBMS) that could store and manage large amounts of data.
Over the years, SQL Server evolved and gained new features, such as support for stored procedures and triggers. Microsoft also released versions of SQL Server for Windows NT and Windows 2000, which helped make it a popular choice for enterprise-level applications.

The Rise of Business Intelligence (1995-2005)

In the late 1990s and early 2000s, the concept of business intelligence (BI) began to gain popularity. BI refers to the tools and processes that businesses use to analyze data and gain insights into their operations.
To meet the growing demand for BI tools, Microsoft released a suite of products under the banner of Microsoft Business Intelligence. These products included SQL Server Analysis Services, which allowed businesses to create multidimensional data models, and SQL Server Reporting Services, which enabled users to create reports and visualizations.

The Emergence of Big Data (2005-2010)

In the mid-2000s, the amount of data being generated by businesses began to grow exponentially. This trend was driven by the rise of the internet, social media, and other digital technologies.
To help businesses manage and analyze this growing amount of data, Microsoft introduced a new product called SQL Server Integration Services. This product allowed businesses to extract, transform, and load (ETL) data from a wide range of sources.

Microsoft launched its own Master Data Management offering - Master Data Services (MDS) - as part of Microsoft SQL Server 2008 R2 in 2010, and it has been included as a feature in every subsequent version of SQL Server. 

The Cloud Era (2010-Present)

In 2010, Microsoft launched its cloud computing platform, Azure. Azure enables businesses to build, deploy, and manage a wide range of applications and services in the cloud. It has since grown to become one of the leading cloud computing platforms, competing with other major cloud providers such as Amazon Web Services (AWS) and Google Cloud Platform.

To support the growing demand for cloud-based data management and analysis tools, Microsoft continued to evolve its suite of data tools and services. This included the release of products such as Azure Data Factory, which allows businesses to orchestrate data workflows in the cloud, and Azure Stream Analytics, which enables real-time data analysis.

Microsoft also embraced open-source technologies, such as Apache Hadoop and Apache Spark, which allowed businesses to analyze large amounts of data using distributed computing techniques.

In October 2022, Microsoft announced the creation of the Microsoft Intelligent Data Platform Partner Ecosystem, consisting of a select number of technology companies, consulting firms, and independent software vendors (ISVs) that offer solutions and services that complement the platform. CluedIn is one such partner, forming part of the Governance pillar of the platform alongside Microsoft Purview. CluedIn is a recommended Azure-native Master Data Management provider and has also been endorsed as a modern alternative to MDS.

Today, the Microsoft Intelligent Data Platform continues to evolve to meet the needs of businesses of all sizes. With its wide range of tools and services, the platform allows businesses to manage and analyze data in the cloud, on-premises, or in hybrid environments. The ultimate goal is to allow companies to realize more value from their data by shifting the emphasis away from day-to-day data management and towards value-creation opportunities.


Master Data Management for the Banking Industry

CluedIn

In a world where our personal data is held by a multitude of different organizations, banks hold the deepest and most personal datasets. Forget Google and Facebook, their datasets pale into insignificance when compared with the sheer volume of data held by banks. From employment and property history to investments, savings, credit scores, and transactions, banks have it all.

Data challenges in the Banking Industry

With a wealth of customer and other data at their disposal, banks should be in the best position to offer their customers personalized advice, products, and services. In reality, banking customers rarely receive the kind of tailored offers and bespoke advice they should. Banks are also struggling to streamline processes, manage costs, and drive efficiencies – there is still a lot of manual work required to integrate and clean data, which inhibits a bank’s ability to gain insights and apply intelligence-based technologies.

One of the main challenges for banks is the volume of data they have. Integrating, cleaning, and enriching so many different types of data from multiple systems is not an easy undertaking. This is probably why most banks are still grappling with creating a unified view of internal, structured data. In the meantime, the market has already moved on to addressing unstructured data and using external sources to enrich it in readiness for delivering insight.

Another major consideration for banks in relation to how they manage their data is meeting regulatory requirements and ensuring high levels of compliance at all times. Banks are subject to laws and regulations addressing everything from capital requirements, financial instruments, and payment services to consumer protection and promoting effective competition. All of which place restrictions and conditions on how banks manage their data and ensure its integrity.

Drivers of digital transformation and data modernization

The imperative for banks to evolve into more digitally-enabled, data-driven institutions comes from several distinct, but undeniably related areas.

The emergence of Cloud-native, agile new market entrants is forcing banks to follow their lead and take a more holistic view of their customers and their data. Customers don’t just want to be told which product to buy next – they want personalized advice in real-time. It’s not enough for a bank to know what its customer did, they need to know why they did it and what they are likely to need in the future. In general, it is estimated that banks have the potential to reduce churn by between 10% and 20% and increase customer activity by an average of 15%. This would substantially impact revenue and is why managing customer data and preparing it for use is one of the most important use cases for Master Data Management (MDM) in the banking industry.

Building lean, highly effective processes is also a top priority for banks that want to enhance efficiency and reduce costs. Automation, Machine Learning, and AI all have an important role to play in this effort, and there is a high degree of interest in these technologies amongst banks and other financial institutions. While results to date have been mixed, partially because of a lack of trusted, governed data to fuel such projects, analyst firm McKinsey is predicting a second wave of automation and AI emerging in the next few years, in which machines will do between 10 and 25 percent of work across bank functions, increasing capacity and freeing employees to focus on higher-value tasks and projects. To maximize the potential of this opportunity, banks first need to design new processes that support automated/AI work, and they will need a reliable supply of high-quality, integrated data to sustain them.

The compliance conundrum

One of the key drivers for effective data management in the banking sector is satisfying regulatory and compliance requirements. These regulations mean that having accurate, up-to-date information, full audit trails, and adequate data security protection is essential. Historically, this has led to friction between the need to sufficiently protect and report on data and the desire to use it to streamline operations and customize the customer experience.

That has changed as advances in data management technologies have developed to include provisions for meeting data protection and privacy standards. Modern Master Data Management and Data Governance platforms combine the delivery of a trusted single view with the assurance of rigorous data governance capabilities that allow banks to achieve full compliance and use their data with confidence. This is accomplished through a combination of features like Automated PII Detection, Automatic Data Masking, Data Sovereignty, Consent Management, and the setting of Retention Policies.
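
As a simplified illustration of what automated PII detection and masking involve (real platforms use trained classifiers and far broader pattern libraries than the two regular expressions assumed here), consider this Python sketch:

```python
# A minimal PII detection and masking pass over free-text notes (illustrative patterns only).
import re

PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "uk_sort_code": re.compile(r"\b\d{2}-\d{2}-\d{2}\b"),
}

def mask(text):
    findings = []
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            findings.append(label)
            text = pattern.sub(f"[{label.upper()} REDACTED]", text)
    return text, findings

note = "Customer jane.doe@example.com, sort code 20-00-00, queried her balance."
masked, found = mask(note)
print(masked)  # Customer [EMAIL REDACTED], sort code [UK_SORT_CODE REDACTED], queried her balance.
print(found)   # ['email', 'uk_sort_code']
```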

The time is now

Achieving fully governed, trusted data is no mean feat for a sector that accumulates a tremendous amount of data on a daily basis. It is however no longer a nice-to-have, as customers demand more from their financial providers and competitors are upping the ante in terms of convenience, flexibility, and experience. The longer a bank allows its technical data debt to grow, the harder it will be to remain competitive.

As margins shrink and new contenders enter the market, the pressure is on to find new ways of delighting customers and exceeding their expectations. For the vast majority of banks, the answers lie within their already extensive data reserves, and now is the time to tap into them.


What is Augmented Data Management?

CluedIn

There comes a time in every industry when the status quo is challenged. Often this comes in the form of a new market entrant with a vision to improve or fix what was broken about the old way of doing things. Sometimes, they don’t just make things better, they eradicate the problem altogether. Almost always, they advance the industry in the customer’s favour and force traditional providers to evolve in order to keep up.

This is what is happening right now in the world of Master Data Management (MDM). For 25 years organisations have had to accept that attempting to put their master data to use in a meaningful way would be expensive, slow, and almost wholly reliant on IT. It’s no wonder then that so many of them either tried and failed to implement MDM, or avoided it altogether. Meanwhile, advances in technology such as Cloud computing, Machine Learning (ML), Artificial Intelligence (AI) and Natural Language Processing (NLP) have continued apace, leaving traditional MDM solutions in the metaphorical dust.

All of that is changing with the advent of modern MDM systems. The new breed of MDM is built for the Cloud, enhanced by AI, Graph and/or NLP, and democratizes data for use across the business. In fact, modern MDM systems are so different from their legacy counterparts that in many ways they aren’t MDM systems at all.

Enter Augmented Data Management (ADM). Augmented Data Management utilizes advanced technologies such as Graph, AI and NLP to enhance and automate data management tasks related to quality, governance, preparation and enrichment. The automation piece is crucial as it takes the burden of manual, repetitive tasks away from the data engineering and IT teams, allowing them to focus on creating value. It also means that business users are empowered to use data to drive their own analytics and insight-driven initiatives.

In addition to the obvious simplification and automation benefits, true ADM is Cloud-native and delivers maximum value as part of a Cloud-based, integrated data management ecosystem. Data often lives in multiple complex and siloed systems within an organisation, from ERP platforms, Data Lakes and Data Warehouses to spreadsheets, presentations and PDFs. Many organisations have invested heavily in Business Intelligence (BI) and Analytics tools which rely on a consistent flow of high quality data, but the challenge has been getting the data out of its various repositories and into a state which is usable by these tools. ADM bridges this gap, delivering data that is ready for insight more quickly than ever before.

ADM realises the unfulfilled promise of what MDM should have been, and in the not-too-distant future will supersede MDM entirely. Organisations that embrace this advanced approach will finally be able to use their data to shorten product development cycles, accelerate go-to-market plans and maximise revenue-generating opportunities. In the quest to become data-driven, ADM is well on its way to becoming a non-negotiable requirement.
 

Why your Data Management Platform should be cloud-native

CluedIn

The phrase cloud-native gets bandied around a lot. If we go by the standard Wikipedia definition, cloud-native computing is an approach in software development that utilizes cloud computing to "build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds".

Does this definition really capture what it means to be cloud-native? We don’t think so. We believe there is a fundamental difference between this definition and what it is to be truly cloud-native.

Cloud-native goes so much deeper than just scale. If we take CluedIn, a cloud-native Master Data Management platform, we can identify ten characteristics that mark it out as cloud-native. We would argue that if your data management platform cannot meet these requirements, then not only is it not cloud-native, it is also falling short of delivering against modern MDM requirements. Let’s examine each in turn.

1. Getting your data-initiatives started is faster and easier

Your first experience of a cloud-native platform should be how easy it is to purchase, install, set up, and get started. A cloud-native platform installs in your own cloud tenant (e.g. Azure) with one click, a few details on a form, and that is it. You can apply custom settings such as your own HTTPS certificate or custom domains, but the experience is straightforward and simple. Long gone are the days of tedious installation processes where you need to worry about which operating system to choose or wait for your hardware to arrive.

Cloud-native doesn't simply mean running on Docker containers and Kubernetes. You can take the majority of platforms, even the old dinosaurs, and have them containerised. You can then quite easily have them orchestrated with Kubernetes. This is not cloud-native.

Installing CluedIn will set up the environment in an Azure Kubernetes Service. This makes sure it uses an Azure Key Vault, a firewall and all the best practices that you would expect - all rolled up into a single price. CluedIn can then be scaled in many different ways, including automatically, manually through the Azure Portal, or by redeploying the application with new Helm Chart updates (replicas).

Specifying your environment is all done using box sizes with a known number of boxes and associated cost. You want things to happen more quickly? You pay more – it’s as simple as that.

Cloud-native essentially means that the entire backbone of operations in your platform can run off a cloud provider’s stack. For example, being able to utilize cloud Platform-as-a-Service (PaaS) services to scale out processing power or storage. Cloud-“nativeness” should feel like using a Software-as-a-Service (SaaS) platform, just in your own tenant.

At the technical level, a cloud-native platform doesn't require containers, but this is clearly the direction the cloud is taking, and investments will be made that expect you to be on containers to reap the full benefits of the cloud. Having Kubernetes as the orchestration framework is only the start. To get the best out of Kubernetes, your application needs to be designed in a way that can take advantage of it.

Cloud-native means that costs should be transparent and predictable. Like most companies, you’re probably more than happy to spend money when the costs are clear and the ROI is obvious. What you probably don’t want are hidden or unpredictable costs.

Cloud-native does not necessarily have to mean SaaS. We believe that your most important data should stay in your environment, but to take full advantage of the cloud cost model, multi-tenancy is needed. The future of data sovereignty is having cloud providers offer smart ways of slicing the same digital assets and guarantee the tenancy and isolation. There is no easy technical way to solve this; it is more about trust - i.e. do you trust a particular vendor, say CluedIn, to host your data on your behalf? This is not the same as asking you to trust Microsoft with your data. Microsoft have thousands of full-time Infrastructure and DevOps people; CluedIn does not (yet!).

When asked what it means to be cloud-native, an industry veteran replied that "it needs to be multitenant." This is interesting, and not necessarily wrong. Rather it is a clear indication that cloud is about an architecture where data is stored once and consumed separately. Even from different companies.

At CluedIn we endeavour to deliver a SaaS experience, with the benefits of PaaS “under the hood”. The critical difference is that your master data always stays in your environment - even if it is used in a public cloud.

2. Access to elastic scale is at your fingertips (if your technology stack allows it)

Cloud quotas aside, companies theoretically have access to endless cloud resources on demand. This is immensely powerful in delivering accelerated time to value. But just because you installed an application in the cloud, it does not mean that the technology will take advantage of it (quite the opposite, in fact). With easy access to elastic scale comes a warning, as we have seen some customers’ budgets spiral out of control because they took advantage of the scaling possibilities. There are tools to help with this, but the temptation can be to ignore these rules because of a desire to just get the job done.

3. Other cloud-native services will play nicely with you

Very similar to the architectural wins we saw pre-cloud, the same principle of realising greater efficiencies from services that are co-located applies in the cloud.

Although companies embark on many different data-driven initiatives, they typically tend to fall into a handful of buckets. These include Reporting/Business Intelligence, Data Science and Process Automation, all of which are fuelled by data. Keeping the sovereignty of your data on the same backbone means that the insight-generating systems can get to the data more easily.

4. Cloud-native systems work with the economics of the cloud.

Moving to the cloud can be a double-edged sword, with scalability, flexibility and interoperability gains being counterbalanced by escalating costs. The cloud only makes economic sense if it is being used in the way it was intended to be. For example, elastic compute is only more efficient when you have the ability to scale down as well as up. You should be wary of tools that have made their way to the cloud rather than having been designed for it in the first instance. This is because an existing on-premises platform inevitably needs a major system overhaul to become truly cloud-native.

One of the benefits of choosing a cloud-native platform is that there is a better chance its architecture is sound and suited to that environment. That is a huge generalisation, and it does not mean that architectures built pre-cloud are less robust. Rather, it means that architectures built for the cloud should inherently be more robust as a result of the economic model imposed by cloud platforms.

The first fundamental element of designing for the cloud is to properly separate stateful from stateless components. This is critical to allowing parts of your application to scale up and down: the stateless parts of your application need to work with 100 instances or none at all, while for the stateful parts it is very cloud-like to have your data persisted in cheap storage. The benefit of this is that when your platform is not being used at all, you pay little or no money. A truly cloud-native solution will cater for the following (a simple retry sketch follows this list):

  • Unexpected network latency (the time taken for a service request to travel to the receiver and back).
  • Transient faults (short-lived network connectivity errors).
  • Blockages by long-running synchronous operations.
  • A host process that has crashed and is being restarted or moved.
  • An overloaded microservice that can’t respond for a short time.
  • An in-flight orchestrator operation such as a rolling upgrade or moving a service from one node to another.
  • Hardware failures.
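
By way of illustration, the sketch below shows one common way applications cater for the transient faults listed above: retrying with exponential backoff and jitter. It is a minimal example, not a substitute for the hardened retry policies built into the major cloud SDKs.

```python
# Minimal retry-with-backoff sketch for transient faults (illustrative only).
import random
import time

def call_with_retry(operation, attempts=5, base_delay=0.5):
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except (ConnectionError, TimeoutError) as err:
            if attempt == attempts:
                raise  # give up after the final attempt
            # Exponential backoff with jitter to avoid thundering-herd retries.
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, 0.1)
            print(f"Transient fault ({err!r}), retrying in {delay:.2f}s")
            time.sleep(delay)

# Example: a flaky downstream call that fails twice, then succeeds.
state = {"calls": 0}
def flaky():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("short-lived network error")
    return "ok"

print(call_with_retry(flaky))  # prints "ok" after two retries
```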

5. The majority, if not all, of the R&D investment is going into the cloud

The majority of technology vendors today are investing in making sure that their offering is suited to and available via the popular cloud providers. In fact, many providers are dropping support for their on-premises and non-cloud-native solutions. This is because the cloud offers the best possible benefits for all players involved, including the customer, the vendor and the market. A properly cloud-native platform is working towards providing customers with a solution that is easier, cheaper and faster.

6. Cloud-native embraces and welcomes the idea of huge data.

For years companies and vendors alike have grappled with the challenge of managing huge amounts of data. One of the benefits of major technological advances is that they don’t just make something easier or better, they eradicate the need to even think about it in the first place. The properly architected cloud solution acknowledges and takes advantage of the fact that data storage is cheap. If you look at the breakdown of the costs associated with CluedIn, 2% is storage. Why? Because storage size and the amount of data should not hinder your ambitions.

7. Cloud-native lets you focus on the things that are important.

Since going cloud-native, we have not spent a single day worrying about the security of our virtual machines in the cloud. We have not spent a single day hardening the boxes that are used to process our data. And we have seen the same thing happening with our customers. We realise at CluedIn that our speciality is data, and hence we rely on cloud services to manage our firewalls, networks, and more. We focus 100% on providing the best Master Data Management platform on the market, allowing our customers to focus on whatever it is that they do best too.

8. Cloud-native delivers the required services, as part of the service.

Because CluedIn is cloud-native, our customers inherit several cloud services at no extra cost. Something as simple as cost management is natively available in all services, if they have been designed to take advantage of the tooling. Industry analysts Gartner have developed a full FinOps category, but in our opinion a cloud-native service provides this inherently. In addition, there are numerous services required for security, policy management, performance efficiency, alerts and more. All of these are essential services that a cloud-native platform should deliver as standard, not as an afterthought.

9. Security, authentication and authorization form part of the ecosystem.

Whether it has come about as a result of advances in the technology itself or is specific to cloud-native platforms, security, authentication and authorization should be ingrained inside the services. Genuinely cloud-native solutions have embraced the idea that access, permissions, authentication and authorization are often provided by third-party services. This "context" provides more free-flowing access to services, connectivity and discovery once properly authenticated. For CluedIn, being cloud-native means being cloud-aware. This means that the CluedIn platform is aware that it is being hosted in the cloud and sits inside a specific security context - providing access to the services that sit underneath the platform.

10. Dependencies can become services, allowing products to move faster.

A cloud-native solution supports the idea that users should be able to take full advantage of the native services provided by cloud providers. When CluedIn is deployed into Microsoft Azure it uses the underlying managed disks, the underlying logging, the underlying security and has the option of using the respective database services. This allows our product team to walk in every day feeling like a team of 400+ developers, because we know that many of the wins we are achieving in our platform come as a knock-on effect of using the services that huge teams at Microsoft work on daily.

Summary

There’s a lot more to being cloud-native than simply being able to run scalable applications. Scalability is a big part of it, but true cloud-nativeness is a mindset and an attribute that runs through the very core of a platform. Most companies are already well on their way to realising the advantages the cloud offers, and there is no turning back the tide. Those who embrace truly cloud-native platforms will see the greatest scalability, efficiency and flexibility gains and reap the benefits of a cloud-first approach.


Data Mesh: The Next Frontier in Master Data Management?

CluedIn

Data is difficult to manage, especially master data. Master Data Management is essentially the process of controlling the way you define, store, organize and track information related to customers, products and partners across your organization. Traditional data modeling requires that all business entities are defined in advance and cannot change throughout the life of the enterprise system; this limits flexibility and prevents you from easily integrating new systems with existing ones. What if you could handle master data management in a more flexible and agile way? Could it make managing master data much easier?

Data Mesh is a relatively new approach to mastering data, and if its advocates are to be believed, could represent the future of master data management.

What is Data Mesh?

Data Mesh is a platform architecture that takes a decentralized approach to managing data. Fundamentally, it’s about treating data as something that is separate from any single platform or application. Under the Data Mesh philosophy, data is treated as a democratized product, owned by the entire organisation and managed by domain-specific teams. Each team is responsible for the quality, governance and sovereignty of its own domain, and this data is then provided for use by the rest of the business.

What problems is Data Mesh trying to solve?

The theory behind Data Mesh is a noble one. In essence, it is seeking to solve the fundamental and pervasive issues that have dogged traditional Master Data Management systems for years. In order to realise any value from these initiatives, organisations have been forced to wrestle their data into rigid structures and compositions before they can even get started. Cleaning, preparing and integrating data is hard, time-consuming and expensive, which is why so many Master Data Management projects fail before they've even begun. It's not uncommon in a traditional MDM project for it to take six months to get just one domain operational. Six months! Considering the speed at which customers, markets and competitors change, this timeline simply isn't acceptable. By rejecting the heavily centralized and monolithic models of the past, Data Mesh is trying to do a good thing. The question is whether Data Mesh is a practical alternative.

The challenges of Data Mesh

Despite the buzz around Data Mesh, there are some fundamental questions that need to be answered before you can decide if it's right for you. Firstly, you'll need to consider whether your organisation is operating at a scale at which Data Mesh makes sense. Complete decentralization of data brings its own challenges and risks, as it depends on each of those domain-based teams having the necessary skills and experience to manage the data they are responsible for. Without some level of centralised management, data silos, duplication and governance issues will inevitably arise.

Data Mesh is also a more expensive option. If we accept that every domain-based team will need people with some level of data management experience, and that there needs to be another team to oversee them, suddenly this looks very expensive. For all but the largest of organisations, Data Mesh is most likely cost-prohibitive.

It's not just about cost, it's also about value and building a compelling business case. In theory, federated ownership should lead to quicker learning cycles and accelerated ROI. In practice, if each domain-based team needs to invest in some level of data analytics and engineering expertise before they've even procured any technology, it's going to be very hard for some departments - let’s say the HR team which is responsible for Employee Data - to justify and build a compelling business case.

Which brings us to another point. You cannot buy a Data Mesh. There is no off-the-shelf product that will enable a federated approach to data ownership. It is about so much more than technology and tools. It is a topology, a guiding principle, and in order to realise its value it requires a mindset change and cultural shift that many organisations simply are not ready for.

A means to an end

We've established that Data Mesh is not right for everyone. But that doesn't mean what it seeks to achieve is wrong. Quite the opposite, in fact. There is only one way to allow organisations to use their data in ways that will actually help them to adapt to market shifts and serve their customers and stakeholders better. And that is to rip up the current master data management "rulebook" and start again.

It is possible to share data ownership between the IT team and business users without requiring the latter to be data engineers or scientists. It is possible to automate manual tasks like data cleaning, enrichment and integration and save hours of time and significant sums of money. Most importantly, it is possible for sets of data to be treated as products which are universally useful across the business. Truly democratized data can only come from a platform that benefits the many rather than just the few, which is where Data Mesh falls short. The future therefore lies in a modern approach to Master Data Management that has the same ambition as Data Mesh, but which makes the means of achieving it accessible to all.


The Future of Master Data Management: Traditional vs. Modern approaches

CluedIn

Master Data Management (MDM) has been around since the mid-1980s, but has really come to the fore in the last decade, with many of today’s data governance efforts built on top of existing MDM strategies. This has been driven by the advent of Big Data, an increased focus on Business Analytics and Intelligence, and growing adoption of Machine Learning and Artificial Intelligence.

For the past 25 years or so there have been no major leaps in how providers have built or provisioned their MDM offerings. Traditional MDM solutions still require you to implement strict controls over every aspect of your master data management process—from data acquisition to data storage, and from maintenance and modification to security and access control. These systems were built for the on-premises, siloed institutions of the past where data ownership lay almost exclusively with the IT department.

Modern approaches are more aligned to how most enterprises operate today - in a hybrid, highly distributed and fluid fashion. Data is a valuable business asset, which means that technology and business users are equally responsible for its maintenance and use. This does not mean that everyone in the business needs to be a data engineer or architect. What it does mean is that everyone is, to some extent or another, a data steward and a data citizen. It is the job of technology to enable these roles and ensure that everyone with a stake in an organisation's data benefits from its potential. Which is where Modern MDM comes in.

What is Master Data Management?

At a fundamental level, Master Data Management (MDM) is the process of creating and maintaining a single, consistent view of your organization's critical data. MDM is closely related to data governance, which can be thought of as rules for how data is collected, processed, stored and accessed. It also includes policies on how data should be handled, such as how long it should be retained and what access permissions are granted to different groups of people or individual data owners.

Master data is the set of identifiers that provides context about business data. The most common categories of master data are customers, employees, products, financial structures and locational concepts. Master data is different from reference data, which is used to classify or give context to other data rather than to identify core business entities. Whether there is still a need for reference data in the context of what can be achieved with modern MDM is debatable, but that's a discussion for another time.

What's the problem with traditional Master Data Management solutions?

It has been estimated by Gartner that up to 85% of MDM projects fail. That's a big number. Little wonder then that so many organisations have been burnt in the past and aren't exactly falling over themselves to start another MDM initiative.

There are a number of reasons why this figure is so high:

  1. The upfront planning process - data profiling, analysing and modelling is time consuming and expensive. Many traditional MDM projects take over a year to deliver any ROI at all.
  2. A domain-by-domain approach, such as that used by traditional MDM systems, causes complexity and creates new silos, restricting how the data can be used.
  3. Traditional MDM demands high manual and technical intervention, which is both costly and time-consuming.
  4. Because traditional MDM systems are built on relational databases with only direct relationships, connections are manual and add to the maintenance overhead.
  5. Due to the upfront profiling and modeling requirements, you're always playing catch-up with your data as it changes. This adds to the complexity and need for manual intervention, further delaying projects.

In spite of all of the above, the fact remains that businesses need to be able to use their data to fuel the projects that will move them forward. Whether these are customer, product, supplier or employee focused initiatives, they all rely on data to provide insights to inform them. At the moment, many organisations are using their data in this way, but the data is neither consistent nor reliable. Which means that the results and recommendations aren't trusted either.

The modern approach to Master Data Management

Modern MDM seeks to solve the above issues in a number of ways.

  • By managing all of your data - master, meta, reference, structured and unstructured. Suddenly, the potential use cases for your data have multiplied exponentially.
  • By eradicating the need to model your data upfront. Modern MDM embraces data in its "raw" form from hundreds, if not thousands, of data sources. The potential cost and time savings are huge.
  • By removing repetitive and manual tasks from the outset. Automating manual tasks like data cleaning reduces the burden on the client and frees time and resources to work on value-orientated tasks instead.
  • By being truly Cloud-native. Most traditional MDM platforms were not born in the Cloud, they were built for an on-premises, highly structured environment and then tweaked for the Cloud. Modern MDM platforms were built for the Cloud - which means that getting up and running is quicker and easier, you can scale up or down at pace, and you benefit from the Cloud economic model.
  • By providing proactive data governance. Establishing trust in data means having full visibility of its lineage and controlling what happens to your sensitive data in a transparent way. Meeting compliance requirements and demonstrating how data is protected won’t slow you down anymore.

You may be wondering what is so different about modern MDM systems that makes all of the above possible. One major difference is that modern MDM systems like CluedIn are built on a NoSQL, schema-less database called Graph. In the world of Graph, the relationships between the data are as important as the data itself.

A really simple way to think about it is the difference between organising your data into neat rows and columns in Excel and jotting it down on a whiteboard. With the whiteboard you can visualise the relationships between the data and add the connections as they emerge. This is exactly what Graph does - as the data is ingested, it allows the patterns and relationships to surface, and is then able to organise it into a natural data model. LinkedIn, Facebook and Google are all built on Graph, and the same principles of schema-less, scalable modeling now apply to MDM.
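
A toy Python sketch makes the whiteboard analogy concrete. The entity and relationship names below are invented for illustration; the point is that the "model" is nothing more than the edges that accumulate as records are ingested.

```python
# A toy graph: relationships accumulate as records arrive, with no upfront schema.
from collections import defaultdict

edges = defaultdict(set)   # adjacency list standing in for a graph store

def relate(source, relation, target):
    edges[source].add((relation, target))

# Records ingested in any order, no model defined in advance.
relate("customer:alice", "HOLDS_POLICY", "policy:P-001")
relate("policy:P-001", "COVERS", "asset:car-XYZ")
relate("customer:alice", "MEMBER_OF", "household:H-10")
relate("customer:bob",   "MEMBER_OF", "household:H-10")

# The household picture emerges from traversing relationships,
# not from a table layout decided up front.
for member in ("customer:alice", "customer:bob"):
    print(member, sorted(edges[member]))
```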

What does the future of Master Data Management look like?

In many ways, the future of Master Data Management doesn't look like Master Data Management at all. Where traditional MDM systems were siloed and slow, modern platforms are integrated and quick. Where the old way of approaching MDM dictated set rules and structures, the new way embraces freedom and flexibility. And if we accept that these concepts shouldn't only apply to Master data, but all data, then the concept of Master Data Management becomes almost entirely redundant.

At this point in time, CluedIn is the only MDM platform that uses Graph. This will change as established vendors and new market entrants recognise how powerful Graph can be when applied to the management of business data. And that's a good thing. Right now, forward-thinking businesses that want to use their data to react to market forces, competitive advancements and customer preferences have a very limited choice: traditional MDM or CluedIn. As the market continues in this direction, a new category will emerge and we will no longer talk about traditional or modern approaches to MDM. In fact, there's a very good chance that by that stage, we won't be talking about MDM at all.


The six questions you need to ask to become a data-driven business

CluedIn

The term “data-driven business” refers to an organisation that uses data to inform or enhance decision-making, streamline operational processes and ultimately to fuel revenue and growth. Whether or not it is possible for any business to be solely data-driven is another debate, but there is no doubt that those who get close to it are adept at turning data into insight, and at using that insight to propel the business forward. While most companies today would probably cite becoming data-driven as a crucial enabler of their wider goals, there aren’t many that have achieved it. Google, Facebook, McDonald's and Uber definitely fall into this group, but these are industry heavyweights and represent the exception rather than the rule.

What does that mean for everyone else vying to achieve data-driven status? Like many things in life, it starts with the basics and builds from there. Even the big boys had to start somewhere!

All truly data-driven businesses have something in common, aside from the obvious operational and competitive advantages. They can all answer six vital questions.

  • What data do we have?
  • Where is the data?
  • What is the quality of the data?
  • Who owns the data?
  • Who is responsible for each step of the data journey from start to finish?
  • What happened to the data as it transitioned from raw to insightful?

Why is it even important for you to be able to answer these questions in the first place? There are the obvious compliance and regulatory reasons why you should, but for now let’s focus on what your business could achieve if you had the answers to these questions.

What data do we have?

Once you have experienced one win as a result of seeing data really work for you, you're hooked. This could be using data to optimise processes, lower operational costs, find more customers, attract great talent, monitor trends in the market and much more besides. Knowing what data your company has in its arsenal is the first trigger for these types of insights. Insights can come from manual discovery, or from using technology to find patterns in the data and bring them to your attention. We believe in walking before you run, and it is not necessarily a bad thing to start by gaining insights through manual discovery.

For example, if you have a list of customers and a list of support tickets, you might want to know which geography generates the most support tickets. With a pattern-driven approach, it is not so much about asking questions of the data as about allowing the data to reveal interesting trends. The likelihood is that there will be patterns hidden in the data that you would not proactively ask for – e.g. churned customers took over 54 hours to have their support tickets resolved. This insight may then lead you to hire more customer support representatives to bring down the average resolution time, or to introduce an internal SLA that no ticket takes more than 24 hours to answer.
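
As a minimal sketch of the kind of manual discovery described above, the analysis might look something like this. The table and column names (geography, churned, resolution_hours) are hypothetical and the data is invented purely for illustration.

```python
# A rough sketch of the customer/support-ticket example above.
# Table and column names are hypothetical; adjust them to whatever
# your own systems actually expose.
import pandas as pd

customers = pd.DataFrame({
    "customer_id": ["C1", "C2", "C3"],
    "geography": ["UK", "DE", "UK"],
    "churned": [False, True, False],
})
tickets = pd.DataFrame({
    "customer_id": ["C1", "C2", "C2", "C3"],
    "resolution_hours": [12, 60, 55, 20],
})

joined = tickets.merge(customers, on="customer_id")

# Which geography raises the most support tickets?
print(joined.groupby("geography").size())

# Do churned customers wait longer for a resolution?
print(joined.groupby("churned")["resolution_hours"].mean())
```

Even this naive view is enough to surface the kind of pattern, such as slow resolutions correlating with churn, that you might never have thought to ask for.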

Where is the data?

Knowing where the data is and where it has come from is an important regulatory requirement, but in the context of achieving some type of insight, knowing the answer to these questions is vital to establishing trust in the data from across the organisation. If someone on the street handed you a credit card and said “Feel free to use this!”, the first thing you'd probably ask is where it came from. Without this lineage, there is no trust. And most notably, in this analogy, you would want to know whether the source of this credit card is reputable.

Also, although duplicate data is not necessarily a huge storage cost issue anymore, it is a big operational issue. Of course, this also depends on exactly how much duplicated data you have – petabytes of it can be quite costly! Knowing where your data is can therefore help you to reduce operational costs too.

What is the quality of the data?

In the era of fake news and AI bots that are indistinguishable from humans, it is more important than ever to establish the integrity of the data you are using to make decisions. There is a plethora of shades of data quality, and every shade correlates with a different level of confidence in the "usability" of the data. It should also be pointed out that there is no such thing as right or wrong when it comes to data: no matter how high quality the data is deemed to be, it will carry an inherent level of risk.

In the spirit of keeping things technology-agnostic and high-level, think about the times you have made a decision with confidence. What gave you that confidence? Was it that your research came from a reputable source? Was it that the voice of the crowd agreed with one approach? Was it your gut feeling?

Just like everyone else, you probably make decisions on a daily basis using a combination of these techniques to reach your final judgement. It's much the same with data: determining quality is about building up your confidence in making a decision. The challenge with data is that it doesn't have to adhere to any laws of physics, so any judgement made on data quality is a heuristic attempt to provide metrics on which a decision can be made with an acceptable level of confidence and risk. You can read more about how CluedIn interprets and measures the shades of data quality here.
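
To illustrate just how heuristic this is, here is a deliberately simple sketch of a confidence score. The metrics, weights and field names are invented for illustration only and are not how CluedIn measures data quality.

```python
# An invented, illustrative heuristic: blend a few measurable signals
# into a single confidence score for a record. The metrics and weights
# are examples only, not CluedIn's data quality model.

def completeness(record: dict, required_fields: list[str]) -> float:
    """Share of required fields that are actually populated."""
    filled = sum(1 for f in required_fields if record.get(f) not in (None, ""))
    return filled / len(required_fields)

def confidence(record: dict, required_fields: list[str], source_trust: float) -> float:
    """Weighted blend of completeness and how much we trust the source (0-1)."""
    return 0.6 * completeness(record, required_fields) + 0.4 * source_trust

record = {"name": "Acme Ltd", "email": "", "country": "UK"}
score = confidence(record, ["name", "email", "country"], source_trust=0.9)
print(f"confidence: {score:.2f}")  # 0.76 - usable, but with known gaps
```

Whatever the formula, the point is the same: the score doesn't make the data "right", it simply tells you how much risk you are accepting when you act on it.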

Why does data need ownership?

In many ways, it doesn't. In fact, it needs much more than ownership. This is why Data Governance has frameworks like the RACI model, in which the four dimensions of "ownership" (Responsible, Accountable, Consulted and Informed) are defined as the minimum requirements for an ownership matrix covering the data and the journey that data takes. Like any process you have within a business, if no-one is responsible for it, it often grinds to a halt. As you have probably experienced in other parts of your business, sometimes a task can be blocked for the most minuscule reason, but the bottom line is: it was blocked. This is often down to a lack of ownership for that part of the process.
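
As a purely hypothetical sketch, an ownership matrix for a handful of data-journey steps might be captured as simply as this; the roles and steps are examples, not a prescribed model.

```python
# A hypothetical RACI-style ownership matrix for a few data-journey steps.
# R = Responsible, A = Accountable, C = Consulted, I = Informed.
raci = {
    "ingestion":       {"R": "Data Engineer", "A": "Head of Data", "C": "Source Owner",  "I": "Data Steward"},
    "standardisation": {"R": "Data Steward",  "A": "Head of Data", "C": "Data Engineer", "I": "Analytics Team"},
    "deduplication":   {"R": "Data Steward",  "A": "Head of Data", "C": "Business SME",  "I": "Analytics Team"},
}

def accountable_for(step: str) -> str:
    """Who ultimately answers for this step if it grinds to a halt?"""
    return raci[step]["A"]

print(accountable_for("deduplication"))  # Head of Data
```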

Who is responsible for each step of the data journey from start to finish?

The data journey from source to insight has some very distinguishable steps, and each of these steps requires you to attack the data from a different angle. Irrespective of the technology you use to get from source to insight, the generic journey includes pulling data from a number of sources, integration, normalisation, standardisation, deduplication, linking, enrichment, profiling, mapping and transformation. (Honestly speaking, we could easily add another 10 or 15 stages, but let's stick with this list for now!). In many cases, each of these steps is a comprehensive task and responsibility in its own right. For example, the normalisation and standardisation of data is easily a full-time job for many data stewards. Hence, if a full supply chain of ownership across the steps in the process is not established, it should not be a surprise that the flow of usable data breaks down – often for the most mundane of reasons.
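
To picture how those steps chain together, here is a stub pipeline. Every stage is a heavily simplified placeholder working on invented record fields; it is a sketch of the generic journey, not a description of CluedIn's own processing.

```python
# A stub pipeline illustrating a few stages of the generic data journey.
# Each stage is a placeholder; real implementations are substantial
# responsibilities in their own right.

def normalise(records):
    # e.g. trim whitespace so that values compare cleanly
    return [{k: str(v).strip() for k, v in r.items()} for r in records]

def standardise(records):
    # e.g. map country names to a single code (illustrative mapping only)
    iso = {"United Kingdom": "GB", "UK": "GB"}
    return [{**r, "country": iso.get(r.get("country"), r.get("country"))} for r in records]

def deduplicate(records):
    # naive exact-match dedupe on email (real matching is far richer)
    seen, unique = set(), []
    for r in records:
        key = r.get("email")
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique

def run_pipeline(records):
    for stage in (normalise, standardise, deduplicate):
        records = stage(records)
    return records

raw = [
    {"email": "a@acme.com ", "country": "UK"},
    {"email": "a@acme.com", "country": "United Kingdom"},
]
print(run_pipeline(raw))  # one clean, standardised record
```

Note how the deduplication stage only works because normalisation and standardisation ran first; if the owner of one stage disappears, everything downstream quietly degrades.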

What happened to the data as it transitioned from raw to insightful?

Let's consider for a moment why it is that data needs lineage, with different parties taking responsibility for the entire data journey, yet other processes we run within the company don't demand the same level of scrutiny. Could it be that this lineage would actually be very useful in all parts of the business, but because of the digital nature of data it is inherently easier to build a digital footprint? The same cannot easily be said for passing Excel sheets from department to department, for example: an explanation of how a given Excel sheet "came to be" simply isn't something that can be produced through the use of Excel alone. The audit trail of the transformation of data from source to insight is often just as useful for "explainability" as it is for highlighting parts of the process that can be improved or are error-prone.
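
One simple way to picture such an audit trail, as a sketch rather than a description of any particular product, is to record a lineage entry every time a transformation touches a record.

```python
# A sketch of recording lineage as data moves from raw to insight.
# Field names and log structure are illustrative only.
from datetime import datetime, timezone

lineage_log = []

def apply_step(record: dict, step_name: str, transform) -> dict:
    """Apply a transformation and record what changed, when, and by which step."""
    before = dict(record)
    after = transform(record)
    lineage_log.append({
        "step": step_name,
        "at": datetime.now(timezone.utc).isoformat(),
        "before": before,
        "after": after,
    })
    return after

record = {"name": " acme ltd ", "country": "UK"}
record = apply_step(record, "normalise", lambda r: {**r, "name": r["name"].strip().title()})
record = apply_step(record, "standardise", lambda r: {**r, "country": "GB"})

for entry in lineage_log:
    print(entry["step"], "->", entry["after"])
```

The log is what lets you explain, after the fact, exactly how an insight "came to be" and which step to fix when something looks wrong.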

Summary 

Now that we have established the questions you need to answer in order to start your journey to being truly data-driven, we should look at how technology can help you both answer the questions and use those answers to best effect. The best way to do this is to approach it from both the asset and the record level – which in effect means getting both the bird's-eye and the granular view, and bringing them together in a way that makes sense. One powerful and increasingly popular combination is to use Microsoft Purview and CluedIn. To some degree, both Purview and CluedIn answer all of the questions above, but at different levels. The bottom line is, you need both, and in some ways you can't have one without the other, particularly if your data technology stack is housed within Microsoft Azure.

Read More

Driving data science with a data quality pipeline

CluedIn

High-quality, trusted data is the foundation of Machine Learning (ML) and Artificial Intelligence (AI), and it is essential to the accuracy and success of ML models. In this article, we'll look at how CluedIn drives your Data Science efforts by delivering the high-quality data you need.

CluedIn not only provides your teams with tooling that improves the quality of the data that is fed to your ML models, it also simplifies the iterations by which you can evaluate their effectiveness.

The five Vs of Data Quality

The term “data quality” is overused and can mean many things. As Machine Learning and Big Data are both still evolving fields, with developments in each complementing the other, we'll approach data quality from an angle you may already be familiar with – the five Vs of Big Data (Volume, Variety, Velocity, Value and Veracity).

Read More