
MDM Toolkit: 
Master Data Management Glossary for Stakeholder Understanding



Attribute:
A specific piece of data representing a particular aspect of an entity, such as a product's colour or a customer's address.

Augmented Data Modeling:
An approach that uses AI and machine learning techniques to assist in the creation and management of data models.



Big Data:
Large and complex data sets that traditional data processing software can't manage. Often includes data from social media, IoT devices, and more.



Canonical Model:
A design pattern used to communicate between different data formats where a common data format is defined to which all other data formats are mapped.
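A minimal sketch of the idea in Python (the field names and source formats are invented for illustration): each source format is mapped onto one shared canonical shape, so downstream systems only ever deal with that one format.

```python
# Two hypothetical source formats describing the same customer concept.
crm = {"FullName": "Ann Smith", "Ctry": "DK"}
erp = {"name": "Bob Jones", "country_iso": "UK"}

def to_canonical(record, field_map):
    """Map any source format onto the shared canonical customer shape."""
    return {canonical: record[source] for source, canonical in field_map.items()}

# One mapping per source format; both results share the same canonical keys.
canonical_crm = to_canonical(crm, {"FullName": "name", "Ctry": "country"})
canonical_erp = to_canonical(erp, {"name": "name", "country_iso": "country"})
```

With N formats, only N mappings to the canonical model are needed, instead of N×(N−1) pairwise mappings between formats.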

Change Data Capture (CDC):
A technique used to track and capture changes in data, such as inserts, updates, and deletes, enabling synchronization between databases or data warehouses.
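One simple way to illustrate CDC (real implementations usually read database transaction logs rather than compare snapshots) is to diff two snapshots of a table into the three event types:

```python
def capture_changes(before, after):
    """Diff two table snapshots (dicts keyed by record id) into the
    three CDC event types: inserts, updates, and deletes."""
    inserts = {k: after[k] for k in after.keys() - before.keys()}
    deletes = {k: before[k] for k in before.keys() - after.keys()}
    updates = {k: after[k] for k in after.keys() & before.keys()
               if after[k] != before[k]}
    return inserts, updates, deletes

before = {1: {"name": "Ann"}, 2: {"name": "Bob"}}
after = {1: {"name": "Ann"}, 2: {"name": "Bobby"}, 3: {"name": "Cyd"}}
ins, upd, dele = capture_changes(before, after)
```

Only the captured changes (here: one update, one insert) need to be shipped to the target system, instead of reloading the full dataset.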

CluedIn Cleanse:
A feature in CluedIn that cleans data by fixing inaccuracies, filling in missing values, and ensuring data consistency.

CluedIn Enrich:
A feature in CluedIn that enhances data by adding more context, details, or attributes from external sources.

CluedIn Discover:
CluedIn's capability to help users discover new insights, relationships, and patterns within their data.

Centralized MDM:
Centralized MDM involves establishing a centralized repository for master data while allowing individual systems to maintain local copies for operational purposes.

Co-existence MDM:
Co-existence MDM recognizes that different domains or business units within an organization may have unique requirements for managing master data.

Consolidated MDM:
Consolidated MDM, also known as Analytical MDM, focuses on integrating and harmonizing master data from multiple sources into a single, authoritative view.



Data Catalog:
A centralized repository that allows for efficient data management and data discovery, often containing metadata, data profiles, and data lineage.

Data Cleansing (or Data Cleaning):
The process of identifying and rectifying (or removing) errors and inconsistencies in data to improve its quality.
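A toy cleansing routine in Python (the field names and defaults are invented for illustration), covering two common fixes: trimming stray whitespace and filling missing values with defaults:

```python
def cleanse(record, defaults):
    """Trim whitespace and fill missing or empty values with defaults."""
    cleaned = {}
    for field, default in defaults.items():
        value = record.get(field)
        if isinstance(value, str):
            value = value.strip() or None  # empty strings count as missing
        cleaned[field] = value if value is not None else default
    return cleaned

raw = {"name": "  alice ", "country": ""}
clean = cleanse(raw, {"name": "unknown", "country": "N/A"})
```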

Data Consolidation:
The process of integrating and unifying data from various sources into a centralized data repository to achieve a single, accurate, and consistent view of master data. This process involves cleaning, transforming, and rationalizing data to ensure quality and consistency, thereby supporting better decision-making and compliance with governance standards.

Data Dictionary:
A centralized repository of metadata definitions used to describe the structure, type, and usage of data within a database, data warehouse, or data lake.

Data Domain:
A category of data, such as customer, product, or supplier data.

Data Governance:
The overall management of data availability, usability, integrity, and security. It involves a set of processes, roles, policies, and metrics to ensure the effective and efficient use of information.

Data Integration:
The process of combining data from different sources, providing a unified view or dataset.

Data Lake:
A storage repository that holds a vast amount of raw data in its native format until it's needed, providing more flexible data processing options compared to traditional databases.

Data Lifecycle:
The stages through which data passes, typically characterized as creation, storage, use, sharing, archiving, and deletion.

Data Lineage:
The process of tracking data from its source to its final form, including every touchpoint and transformation in between. It helps in understanding how data flows and transforms across systems.

Data Migration:
The process of transferring data from one system to another, which may involve transforming the data to a new format.

Data Model:
An abstract representation of organizational data, often represented visually.

Data Privacy:
The management of data to ensure the protection of personal information, complying with regulations such as GDPR or CCPA.

Data Profiling:
The process of examining data to gather statistics and information about its quality and the nature of the information contained.
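The kind of statistics profiling produces can be sketched in a few lines of Python; a real profiler would add type inference, pattern analysis, and more:

```python
from collections import Counter

def profile(column):
    """Gather basic quality statistics for one column of values."""
    non_null = [v for v in column if v is not None]
    return {
        "count": len(column),
        "nulls": len(column) - len(non_null),
        "distinct": len(set(non_null)),
        "most_common": Counter(non_null).most_common(1)[0][0] if non_null else None,
    }

stats = profile(["DK", "UK", "DK", None, "US"])
```

Such stats quickly surface quality issues, e.g. an unexpectedly high null count in a mandatory field.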

Data Quality:
A measure of the condition of data based on factors such as accuracy, completeness, reliability, and relevance.

Data Silo:
A repository of data that remains under the control of one department or team and is isolated from the rest of the organization, leading to disjointed data management.

Data Steward:
An individual responsible for managing and maintaining specific data in an organization.

Data Stewardship:
The management and oversight of an organization's data assets to provide business users with high-quality data that is easily accessible in a consistent manner.

Data Virtualization:
The ability to retrieve and manipulate data without requiring technical details about the data, such as how it is formatted or where it is physically located.

Data Warehouse:
A large, centralized database that integrates data from different sources, making it accessible for business intelligence activities.

Dimensional Modeling:
A design technique used in data warehousing to separate measurement data (facts) from descriptive data (dimensions), facilitating the understanding of data and improving query performance.
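The fact/dimension split can be sketched with plain Python dicts (table and field names are invented): measurements live in a fact table, descriptive context in a dimension table, and analysis joins the two:

```python
# Dimension table: descriptive attributes, keyed by surrogate id.
dim_product = {1: {"name": "Bolt", "category": "Fasteners"}}

# Fact table: measurements, referencing the dimension by key.
fact_sales = [
    {"product_id": 1, "qty": 10},
    {"product_id": 1, "qty": 5},
]

# A typical analytical query: aggregate facts grouped by a dimension attribute.
sales_by_category = {}
for fact in fact_sales:
    category = dim_product[fact["product_id"]]["category"]
    sales_by_category[category] = sales_by_category.get(category, 0) + fact["qty"]
```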

Duplication Detection:
The process of identifying duplicate records within a dataset.
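A basic approach, sketched here with invented example data, is to group records on a normalised key; production systems add fuzzy matching on top of this:

```python
def find_duplicates(records, key_fields):
    """Group records that share the same normalised key fields."""
    groups = {}
    for rec in records:
        key = tuple(str(rec.get(f, "")).strip().lower() for f in key_fields)
        groups.setdefault(key, []).append(rec)
    return [group for group in groups.values() if len(group) > 1]

records = [
    {"id": 1, "email": "a@x.com"},
    {"id": 2, "email": "A@X.COM "},  # same address, different formatting
    {"id": 3, "email": "b@x.com"},
]
dupes = find_duplicates(records, ["email"])
```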



ELT (Extract, Load, Transform):
ELT is a data integration approach where data is extracted from source systems and loaded into a target database or data warehouse; only after loading is the data transformed to meet operational or analytical requirements.

ETL (Extract, Transform, Load):
ETL is a data integration process where data is extracted from source systems, then transformed to meet specific operational needs or to comply with certain standards, and finally loaded into a target database or data warehouse.
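The three ETL steps can be shown end to end with Python's built-in sqlite3 as a stand-in target warehouse (the table and rows are invented); the defining point versus ELT is that the transform happens *before* the load:

```python
import sqlite3

# Extract: rows as they arrive from a hypothetical source system.
source_rows = [("ann", "dk"), ("bob", "uk")]

# Transform: apply business rules before loading (title-case names,
# upper-case country codes).
transformed = [(name.title(), country.upper()) for name, country in source_rows]

# Load: write the already-conformed rows into the target table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (name TEXT, country TEXT)")
db.executemany("INSERT INTO customers VALUES (?, ?)", transformed)
loaded = db.execute("SELECT name, country FROM customers ORDER BY name").fetchall()
```

In ELT the same rows would be loaded raw and the title-case/upper-case rules applied inside the warehouse afterwards.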

Entity:
A distinct object (like a customer, product, or location) about which data is stored.



Federated MDM:
An approach where master data is managed across multiple systems and processes but can be accessed and used in a unified manner.



Golden Record:
The most trusted version of a record in a system. In MDM, it refers to the most accurate and complete version of a data entity.

Graph Database:
A database that uses graph structures to represent and store data flexibly, without restricting it to an existing model, such that the connections between the data are as important as the data itself. 
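The "connections matter as much as the data" point can be illustrated with a toy in-memory edge list (the nodes and relationship types are invented); real graph databases answer such traversals with dedicated query languages:

```python
# A tiny graph stored as (source, relationship, target) edges.
edges = [
    ("ann", "PURCHASED", "bolt"),
    ("bob", "PURCHASED", "bolt"),
    ("ann", "KNOWS", "bob"),
]

def neighbours(node, relation):
    """Traverse outgoing edges of one relationship type from a node."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]

# "Who else purchased what Ann purchased?" -- a relationship-centric query.
also_bought = [src for src, rel, dst in edges
               if rel == "PURCHASED"
               and dst in neighbours("ann", "PURCHASED")
               and src != "ann"]
```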



Hierarchy Management:
The process of managing and organizing data in a hierarchical structure, such as product categories or organizational charts.

Hub and Spoke Model:
A type of MDM architecture where the MDM solution (hub) is centrally connected to different data sources (spokes).



Identity Resolution:
The process of ensuring that an entity (like a customer) is consistently and uniquely identified across different datasets.
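One common building block, shown here in a deliberately simplified form, is deriving a stable entity id from normalised identifying attributes, so the same customer resolves to the same id regardless of which dataset the record came from:

```python
import hashlib

def resolve_identity(record):
    """Derive a stable entity id from normalised identifying attributes.
    Real identity resolution uses many attributes and fuzzy matching;
    this keys on email alone for illustration."""
    key = record["email"].strip().lower()
    return hashlib.sha1(key.encode()).hexdigest()[:12]

crm = {"email": "Ann@Example.com", "source": "CRM"}
web = {"email": " ann@example.com", "source": "web"}
same_entity = resolve_identity(crm) == resolve_identity(web)
```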



Master Data:
The consistent and uniform set of identifiers and extended attributes that describe the core entities of an enterprise.

Match & Merge:
The process of identifying duplicate records and merging them into a single, most accurate version.
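A minimal sketch (field names invented, and with a deliberately simple merge policy of "last non-empty value wins"): records are matched on a normalised key, then each group is merged into one record:

```python
def match_and_merge(records, match_field):
    """Match records on a normalised key, then merge each group into one
    record, letting non-empty values from later records win."""
    merged = {}
    for rec in records:
        key = str(rec[match_field]).strip().lower()
        target = merged.setdefault(key, {})
        for field, value in rec.items():
            if value not in (None, ""):
                target[field] = value
    return list(merged.values())

records = [
    {"email": "a@x.com", "name": "Ann", "phone": ""},
    {"email": "A@x.com", "name": "", "phone": "555-0100"},
]
golden = match_and_merge(records, "email")
```

In practice the merge policy is driven by survivorship rules rather than simple recency.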

Metadata:
Data that describes other data. It provides information about a certain item's content.

Metadata Management:
The management of data that describes other data, ensuring consistency, accessibility, and proper usage of metadata across the organization.

Modern MDM (Master Data Management):
An evolved approach to MDM that leverages new technologies and methodologies to manage master data. Modern MDM often incorporates:

  • Real-Time Processing: Data is processed as it's generated or received, allowing for real-time insights and actions.
  • Schema-On-Read: Data schema is defined when it's read, offering flexibility in handling diverse data sources.
  • Decentralized Architecture: Data can be managed and accessed across multiple systems, often leveraging cloud technologies.
  • Collaborative Implementation: A more inclusive approach where business and IT teams collaborate closely, ensuring the MDM solution meets real-world needs.
  • Flexible Data Models: Adaptable data models that can evolve with business changes.
  • AI and Machine Learning: Leveraging AI technologies to improve data quality, match/merge processes, and derive insights.



Normalization:
A database design technique that reduces redundancy and dependency by organizing data into separate tables based on logical relationships.
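The effect of normalization can be sketched with plain Python structures (the tables and fields are invented): a flat table repeats the supplier's details on every product row, while the normalised form stores each supplier once and has products reference it by key:

```python
# Denormalised: supplier facts repeated on every product row.
flat = [
    {"product": "Bolt", "supplier": "Acme", "supplier_city": "Leeds"},
    {"product": "Nut", "supplier": "Acme", "supplier_city": "Leeds"},
]

# Normalised: supplier facts live once; products reference them by key.
suppliers = {}
products = []
for row in flat:
    suppliers.setdefault(row["supplier"], {"city": row["supplier_city"]})
    products.append({"product": row["product"], "supplier": row["supplier"]})
```

If Acme moves city, the normalised form needs one update instead of one per product row.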



Ontology:
A data model that defines the structure of data, including the types of data, the relationships between them, and the rules governing data.

Operational Data:
Data that is generated as a result of day-to-day business operations.



Reference Data:
Data used to classify or categorize other data, such as country codes or product codes.

Registry MDM:
An MDM approach where the master data is stored as a reference or index, rather than consolidating data into a physical master record.



Schema Evolution:
The ability to adapt the schema of a database as the model or requirements change over time.

Schema Mapping:
The process of creating associations or mappings between two or more data models.
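In its simplest form (the source and target field names here are invented), a schema mapping is a lookup table from source fields to target-model fields:

```python
def map_schema(record, mapping):
    """Rename source fields to target-model fields using a mapping table."""
    return {target: record[source]
            for source, target in mapping.items() if source in record}

crm_record = {"cust_nm": "Ann", "ctry_cd": "DK"}
mapping = {"cust_nm": "customer_name", "ctry_cd": "country_code"}
mapped = map_schema(crm_record, mapping)
```

Real mappings also handle type conversions and value translations, not just renames.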

Semantic Consistency:
Ensuring that data is consistently categorized and labelled across the organization.

Single View (or 360-Degree View):
A holistic, single view of an entity (like a customer) derived from data consolidated across various sources.

Source System:
A software system that is used as the source of data for the MDM system.

Structured Data:
Data that is organized in a specific manner or model, often in relational databases, making it easily searchable.

Survivorship Rules:
Rules that determine which data attributes will be chosen in the final master record during the match and merge process.
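Survivorship can be sketched as a per-attribute rule applied to the candidate values from each matched source record (the rules shown, "most complete wins" and "most recent wins", are illustrative examples, not a standard):

```python
def apply_survivorship(candidates, rules):
    """Build the surviving master record by applying a per-attribute
    rule to the candidate values from the matched source records."""
    survivor = {}
    for field, rule in rules.items():
        values = [rec[field] for rec in candidates if rec.get(field)]
        survivor[field] = rule(values) if values else None
    return survivor

candidates = [
    {"name": "Ann Smith", "updated": "2023-01-01"},
    {"name": "A. Smith", "updated": "2024-06-01"},
]
rules = {
    "name": lambda vs: max(vs, key=len),  # most complete value wins
    "updated": max,                       # most recent date wins
}
master = apply_survivorship(candidates, rules)
```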

Syndication:
The process of distributing and sharing master data across different IT systems and departments.



Traditional MDM (Master Data Management):
A data management approach that centralizes, cleanses, and synchronizes a company's master data across systems and applications. Traditional MDM often relies on:

  • Batch Processing: Data is processed in large batches rather than in real-time or near-real-time.
  • Schema-On-Write: Data schema is defined before data ingestion.
  • Centralized Architecture: A single, central repository where master data is stored and managed.
  • Top-Down Implementation: MDM solutions are often driven by IT departments, with business stakeholders providing requirements.
  • Rigid Data Models: Predefined data models that might not easily adapt to changing business needs.

Transactional Data:
Data related to business transactions, such as purchases. Unlike master data, which remains fairly consistent, transactional data changes frequently.



Unified Data Model:
A framework for representing data objects and their relationships in a single, consistent structure.

Unstructured Data:
Data that doesn't have a specific form or structure, such as emails, social media posts, and documents.



Version Control:
The process of tracking and managing changes to data over time.



Zero Modeling:
An approach in modern MDM solutions that allows for data to be ingested in its raw form without predefined models, and then makes sense of it over time.
