All Stories

Data Product vs. Data as a Product

A data product is not the same as data as a product. A data product uses data to help achieve the product's goal, whereas in data as...

Let’s Know About the Parquet File

Parquet is an open-source columnar file format for the Hadoop ecosystem, built from the ground up with complex nested data structures in mind.

Partitions and Bucketing in Spark

Partitioning and bucketing speed up data reads by reducing the cost of shuffles, the need for serialization, and the amount of network traffic.

Data Catalog

A data catalog is an inventory of an organization's data assets that enables rapid and efficient access to the most relevant data.

Case Class in Scala

A case class represents immutable data. It is a kind of class that is often used to hold data.

How To Set SLA in Apache Airflow

Apache Airflow enables us to schedule tasks as code. In Airflow, an SLA determines the maximum completion time for a task or DAG. Note that SLAs are established based on...