The modern medallion-style architecture has become a standard framework for managing data as it moves through successive stages of transformation and enrichment, typically organized into Bronze, Silver, and Gold zones. Each zone serves its own purpose: Bronze holds raw data, Silver holds cleansed data, and Gold holds highly refined, analytics-ready data.
As data pipelines evolve to better serve both engineering and business needs, CluedIn introduces a new zone into this model, which we call the “Platinum” zone. Here’s a look at why the Platinum zone matters, where CluedIn fits into the medallion architecture, and how this integration strengthens the data supply chain with business-approved data.
Introducing the “Platinum” Zone: CluedIn’s Role in Business-Certified Data
After working with many clients operating within the medallion-style architecture, we observed an opportunity to add a crucial layer between the Silver and Gold stages: the Platinum zone. Positioned here, CluedIn provides a distinct stage in which data receives business validation and quality assurance, gaining a "seal of approval" from business users. The Platinum zone bridges cleansed Silver data and refined Gold analytics, so that data is not only technically accurate but also validated in its business context. This makes the Platinum zone an ideal place to bring business stakeholders into the data pipeline, empowering them to define, approve, and be accountable for the quality of data.
A Flexible Architecture for Both Engineering and Business Needs
In this architecture, data teams have the best of both worlds. Engineers retain the flexibility to move data directly from the Silver to the Gold zone, enabling rapid data flows and analytics processing. Data that passes through the Platinum zone, by contrast, carries a signal that the business has vetted it: business stakeholders review and approve the data through CluedIn’s workflows, which adds an extra layer of accountability and quality. With CluedIn, companies can ensure that data certified in the Platinum zone has the buy-in of the lines of business, giving confidence that it carries both technical and contextual integrity.
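The two paths described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Dataset` class and the `certify`, `promote_to_gold`, and `is_business_certified` functions are hypothetical names invented for this example and are not part of CluedIn's API.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """A dataset moving through the zones (hypothetical model, not a CluedIn type)."""
    name: str
    zone: str = "silver"
    approved_by: list[str] = field(default_factory=list)


def certify(dataset: Dataset, approver: str) -> Dataset:
    """Business path: move Silver data into Platinum and record who approved it."""
    if dataset.zone != "silver":
        raise ValueError(f"{dataset.name} is in {dataset.zone}, expected silver")
    dataset.approved_by.append(approver)
    dataset.zone = "platinum"
    return dataset


def promote_to_gold(dataset: Dataset) -> Dataset:
    """Move data to Gold from Silver (fast path) or from Platinum (certified path)."""
    if dataset.zone not in ("silver", "platinum"):
        raise ValueError(f"{dataset.name} cannot be promoted from {dataset.zone}")
    dataset.zone = "gold"
    return dataset


def is_business_certified(dataset: Dataset) -> bool:
    """True only if at least one business approver signed off on the data."""
    return bool(dataset.approved_by)
```

In this sketch, `promote_to_gold(Dataset("clickstream"))` takes the engineering fast path, while `promote_to_gold(certify(Dataset("customers"), "finance-data-owner"))` goes through Platinum first. Both end up in Gold, but only the second reports `True` from `is_business_certified`.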
Supporting Open Formats for Operational and Analytical Workloads
CluedIn is built with flexibility in mind. Supporting open file formats like Parquet on both ingress and egress, CluedIn seamlessly integrates with diverse workloads and architectures. For analytical tasks, the platform can export data in Parquet format, making it accessible to downstream analytics tools. For operational workloads, where other formats may be preferred, CluedIn provides options to support the necessary formats without additional data transformation steps. This flexibility enables teams to integrate CluedIn within their existing workflows without compromising compatibility or performance, ensuring that data can flow smoothly across various platforms and applications.
Leveraging Data Catalogs for Streamlined Data Integration
In an ideal configuration, CluedIn doesn’t connect directly to source systems. Instead, it ingests data through a data catalog, streamlining the integration process and enhancing data governance. CluedIn’s support for asset retrieval from data catalogs allows it to act as a seamless part of the data pipeline, ingesting data from a cataloged source and preparing it for business validation in the Platinum zone. By treating the data catalog as the central entry point, CluedIn maintains a controlled, governed ingestion process that aligns with best practices in data management and security.
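As a rough sketch of the catalog-as-entry-point idea, the example below resolves every requested asset through a catalog before any ingestion is planned, so uncataloged sources are rejected up front. The `DataCatalog` class, the `CatalogAsset` fields, and the example path are all hypothetical stand-ins for illustration. They do not represent any particular catalog product or CluedIn's actual integration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogAsset:
    """Metadata a catalog might hold about one asset (hypothetical fields)."""
    name: str
    location: str  # where the data lives, e.g. a path in the lake
    owner: str


class DataCatalog:
    """In-memory stand-in for a real data catalog (assumption for this sketch)."""

    def __init__(self) -> None:
        self._assets: dict[str, CatalogAsset] = {}

    def register(self, asset: CatalogAsset) -> None:
        self._assets[asset.name] = asset

    def resolve(self, name: str) -> CatalogAsset:
        """Look up an asset by name; fail loudly if it was never cataloged."""
        try:
            return self._assets[name]
        except KeyError:
            raise LookupError(f"{name!r} is not a cataloged asset") from None


def plan_ingestion(catalog: DataCatalog, asset_names: list[str]) -> list[CatalogAsset]:
    """Resolve every requested asset through the catalog before ingesting anything."""
    return [catalog.resolve(name) for name in asset_names]
```

The design choice here is that the ingestion step never receives a raw connection string or path from the caller. It only receives names, and the catalog decides what those names point to.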
CluedIn Ingress as Another Data Lake: Schemaless on Entry, Structured on Egress
To maximize success, CluedIn recommends treating its ingestion like a data lake and avoiding pre-processing or preparation steps. Pushing raw, unstructured data into CluedIn allows the platform’s schemaless ingestion to capture all data as-is, so that no information or context is lost. Once the data is ingested, CluedIn’s data processing capabilities can transform and organize it as needed, structuring it appropriately for egress into the Platinum zone. This setup preserves the original data’s integrity on entry while enforcing a strict schema when data exits the platform, so that only high-quality data reaches downstream applications and analytics tools.
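The "schemaless on entry, structured on egress" pattern can be illustrated with a minimal sketch. Everything here is an assumption made for the example: `EGRESS_SCHEMA`, `ingest`, and `to_egress` are hypothetical names, and the processing that CluedIn performs between entry and exit is left out entirely.

```python
from typing import Any

# Hypothetical egress schema: field name -> required Python type.
EGRESS_SCHEMA: dict[str, type] = {"customer_id": str, "email": str, "age": int}


def ingest(store: list[dict[str, Any]], record: dict[str, Any]) -> None:
    """Schemaless entry: keep a copy of the record exactly as received."""
    store.append(dict(record))


def to_egress(record: dict[str, Any]) -> dict[str, Any]:
    """Strict exit: require every schema field with the right type, drop extras."""
    out: dict[str, Any] = {}
    for name, expected in EGRESS_SCHEMA.items():
        if name not in record:
            raise ValueError(f"missing field {name!r}")
        value = record[name]
        if not isinstance(value, expected):
            raise TypeError(
                f"field {name!r} should be {expected.__name__}, "
                f"got {type(value).__name__}"
            )
        out[name] = value
    return out
```

In this sketch, ingestion never rejects a record, so unexpected fields and incomplete rows are preserved for later processing. Only `to_egress` enforces the contract, which keeps validation failures at the exit rather than losing data at the entrance.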
CluedIn’s Role in a Modern, Business-Centric Data Pipeline
With CluedIn and the Platinum zone, companies can implement a medallion-style architecture that meets both engineering and business needs. By positioning CluedIn between Silver and Gold, this architecture enables a more comprehensive, business-validated data pipeline, ensuring that data reaching the Gold zone is not only ready for analysis but also carries the business’s stamp of quality and accountability. Open format support, integration with data catalogs, and a flexible, schemaless ingestion model make CluedIn a strong choice for modern data pipelines, helping organizations bridge the gap between technical accuracy and business relevance.
This architecture exemplifies our commitment at CluedIn to bring data quality, business involvement, and adaptability to the forefront of data management, empowering organizations to achieve their data-driven goals with confidence.