Why is my Data Lake not working?
The Enterprise is yearning to get value out of data. In an attempt to make data available to the business, a recent trend has been to establish a Data Lake, with the intention of having all data available in one place and accessible via a common querying interface. The aim is to eliminate the need to create point-to-point integrations and to understand the querying languages of the many different systems used within the business.
The Data Lake is not a new concept; in fact, it has simply been given a name that the market can relate to the technology. Traditionally, the Data Warehouse was seen as the conceptual way for a company's data to be universally accessible through a single querying interface. To some degree it achieved this, and the Data Warehouse is a critical piece of the technology stack at most enterprise companies.
So if we have the Data Warehouse, why do we need the Data Lake? If we were to weigh the need for a Data Warehouse against that of the Data Lake, the Data Warehouse is much more of a necessity, whereas the Data Lake is seen more as an optimisation in both cost and scalability. However, it is important to establish that the Data Lake does not make a company more Data Driven; quite the opposite, in fact. The Data Lake alone can be summarised as low-cost storage for data that the business is not ready to operationalise. This does, to some degree, fill a hole in the data pipeline today, but is it the right technology, and are we filling a shallow hole while forgetting about an even deeper trench?