CluedIn University

Learn the CluedIn platform one step at a time. Follow along as our data experts take you through all the steps to success with CluedIn and your customers.

Getting Started

Installing CluedIn on your machine.

How to get all the basics in place.

In this session, Stephen takes you through all the pre-requisites, downloading and installing CluedIn on your local machine.


  • Installing CluedIn Pre-Requisites
  • Installing Docker
  • Installing Docker Compose
  • Installing CluedIn

Changing your CluedIn environment.

Let's explore what comes with CluedIn.

In this session, Stephen takes you through some of the common ways that you will want to set up CluedIn.


  • Creating an Account
  • SMTP

Scenario Introduction

Let's walk through CluedIn with a defined set of data and use case.

In this session, Stephen introduces the sample data he will be working with for all the university tutorials.


  • Simple CSV, Excel and JSON files.
  • Importing data.
  • Simple Mapping

Bigger Data

Let's step it up and start connecting directly to a database.

In this session, Stephen adds a new data source to the story, this time in a database.


  • Importing data through a database.
  • Complex Mapping (made simple)
  • Ingesting large datasets.

Big Data

Let's connect using CluedIn connectors.

In this session, Stephen installs some prebuilt connectors provided by CluedIn and the community.


  • Installing a connector.
  • What is a connector?
  • How can I build my own connector?

Deployment into other Environments.

Let's deploy what we have into a cloud environment.

In this session, Stephen takes what we have so far and shows how we can deploy this into production.


  • Cloud pre-requisites.
  • Helm charts, Secrets, DNS and more.
  • Choosing a cloud provider.

Build a repeatable deployment model.

Let's set up a pipeline that will make it easy to upgrade, roll back and add to our production environment.

In this session, Stephen takes us through how we can set up continuous integration so that we can deploy our changes simply and have multiple teams work on the CluedIn implementation.


  • Creating a build and deployment process.
  • Create tests to enable auto-deployment.
  • Composing your additions with Docker.

Changing our CluedIn environment (Part 2)

Let's dive into the different parts of CluedIn and understand what we want to enable.

In this session, Stephen dives into the CluedIn configuration and decides what features we want to turn on and off.


  • Exploring Configuration.
  • Rolling out these updates into different environments.
  • Rebuilding Docker images.

Data Integration Pattern

Eventual Connectivity helps us integrate one system at a time.

In this session, Stephen explains the pattern of "Eventual Connectivity" and how it can simplify the entire integration process.


  • Mapping different data sources together.
  • Handling edge cases.
  • Blending data across data sources.

Pre-Cleaning of Data

Let's protect against bad data as much as possible.

In this session, Stephen will pre-clean some of the data to protect against potential data integration issues.


  • Introduction to CluedIn Clean.
  • Focus on Edges and Entity Codes.
  • Establishing known rules.

Data Quality Metrics

Tracking your Data Quality.

In this session, Stephen will talk through the inbuilt data quality metrics.


  • Introduction to the Metrics engine.
  • Setting KPIs.
  • Validating that cleaning data increases data quality metrics.

Creating Data Streams

Sharing data to other systems.

In this session, Stephen will talk about pushing data from CluedIn to third-party systems.


  • Creating Streams.
  • Streaming to Databases.
  • Streaming to systems such as Power BI and Tableau.

Cleaning Data that is already in use.

How to improve data quality over time?

In this session, Stephen will talk about cleaning data that is already in operational use from other systems.


  • CluedIn Clean.
  • Committing Projects.
  • Impact analysis and handling data cleanup.

Fixing Duplicate Records

How to address potential duplicate records.

In this session, Stephen will walk through identifying and fixing duplicates.


  • Merge Records.
  • Split records that you accidentally merged.
  • Building new Duplicate Detection.

Enriching Records

How to automatically and manually enrich records?

In this session, Stephen will walk through the external enrichment services and investigate how you can manually enrich records in CluedIn Clean.


  • Enabling and Disabling External Enrichment Services.
  • Using CluedIn Clean to enrich records that are not perfect matches.
  • Building a new External Enrichment Service.

Creating Rules

How to monitor data with rules to automate tasks?

In this session, Stephen will create rules to automate cleaning, masking and notifications.


  • Conditions and Actions.
  • Reprocessing Data with new rules.
  • Rolling back changes made by rules.

Creating your own Data Dictionary

How to talk to your data without being an expert?

In this session, Stephen will create Vocabularies for each source system and will create layers of Vocabularies so that different parts of the business can consume the data with ease.


  • Creating and mapping Vocabularies.
  • Nest a hierarchy of Vocabularies.
  • Changing Vocabularies and Reprocessing.

Creating your Data Glossary

How to make it even easier to talk about and share your data?

In this session, Stephen will create Glossary entries which can map and translate simple business terminology into underlying filters.


  • Glossary Terms.
  • Lexicons.
  • Mapping Glossary Terms to Vocabularies.

Setting up data lifetimes.

How to set up retention periods for your data and your data sources?

In this session, Stephen will create retention periods so that you can manage the lifetime of your data.


  • Entity-based retention.
  • Query-based retention.
  • Mesh Commands.

Setting up consent for the data you have.

How to make sure you only have the data that you have consent for?

In this session, Stephen will create consent mappings so that you can prove that you have the right to have the data you are working with.


  • Consent Mappings.
  • Capturing Consent.
  • Consent with Streams.

Viewing Jobs and Engine Statistics.

How to monitor the performance of CluedIn?

In this session, Stephen will explain the jobs engine as well as show you how to monitor the progress and state of what is happening in CluedIn.


  • Jobs.
  • Engine Room.
  • Running Administration Commands.

Adding new Processing Pipelines

How to add custom processing logic to CluedIn?

In this session, Stephen will add custom logic to the processing pipeline.


  • Subscribing to the CluedIn processing pipeline.
  • Creating Clues.
  • Deployment of new extensions.

And more to come...

Ready to Learn?

Talk to an expert today and find out how we can help with your specific challenges.