Senthil Nayagan
I am a Data Engineer by profession, a Rustacean by interest, and an avid Content Creator at present.
Data Engineering

Introduction to Data Lake

A data lake is a centralized repository that stores large volumes of data from many sources in its natural, raw format, which keeps the data flexible for analytical use.

Data lake trade-offs

Moving from traditional data storage and processing platforms, such as databases and data warehouses, to a data lake involves trade-offs. In making the move, we give up the following capabilities in exchange for others:

  • We give up durability and consistency guarantees, such as ACID transactions, in return for the ability to process data on a highly scalable platform.
  • We give up performance features, such as indexing and caching, in return for the capacity to handle data in multiple formats.
  • We give up features like versioning and auditing in return for the ability to decouple storage from compute.
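
To make the first two trade-offs concrete, here is a minimal sketch that treats a plain directory as a stand-in for a data lake. Everything in it is an assumption for illustration: the helper names (write_batch, scan, demo), the file names, and the sample rows are hypothetical, and a real lake would sit on object storage rather than a temporary folder. It shows a multi-file write that fails part-way and leaves partial data visible (no atomicity), and a read that has to scan every file because there is no index.

```python
import csv
import json
import tempfile
from pathlib import Path
from typing import Optional


def write_batch(lake_dir: Path, batch: list, fail_after: Optional[int] = None) -> None:
    """Write each part of a batch as its own JSON-lines file.

    No transaction wraps the loop: if the job dies part-way (simulated
    with fail_after), the parts already written remain visible.
    """
    for i, rows in enumerate(batch):
        if fail_after is not None and i >= fail_after:
            raise RuntimeError("simulated job failure")
        part = lake_dir / f"part-{i:05d}.jsonl"
        with part.open("w", encoding="utf-8") as f:
            for row in rows:
                f.write(json.dumps(row) + "\n")


def scan(lake_dir: Path) -> list:
    """Read every file in the directory, whatever its format (full scan, no index)."""
    rows = []
    for path in sorted(lake_dir.iterdir()):
        if path.suffix == ".jsonl":
            with path.open(encoding="utf-8") as f:
                rows.extend(json.loads(line) for line in f if line.strip())
        elif path.suffix == ".csv":
            with path.open(newline="", encoding="utf-8") as f:
                rows.extend(csv.DictReader(f))
    return rows


def demo() -> tuple:
    """Return the row counts seen before and after a failed three-part write."""
    with tempfile.TemporaryDirectory() as tmp:
        lake = Path(tmp)

        # A CSV file from another source sits next to the JSON-lines data.
        with (lake / "legacy.csv").open("w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=["id", "amount"])
            writer.writeheader()
            writer.writerow({"id": 1, "amount": 10})
        before = len(scan(lake))

        batch = [
            [{"id": 2, "amount": 20}],
            [{"id": 3, "amount": 30}],
            [{"id": 4, "amount": 40}],
        ]
        try:
            write_batch(lake, batch, fail_after=2)  # fails before the third part
        except RuntimeError:
            pass  # nothing rolls back the two parts already written

        after = len(scan(lake))
        return before, after


if __name__ == "__main__":
    print(demo())
```

A transactional store would leave readers seeing either 1 row (the batch rolled back) or 4 rows (the batch committed). Here a reader sees 3, which is the kind of inconsistency we accept when we give up ACID guarantees.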