Tag: data-management

Data Product vs. Data as a Product

A data product is not the same as data as a product. A data product aids the accomplishment of the product's goal by using the data, whereas in data as...

Data Deluge

When the granularity of data increases, its complexity also increases. Eventually, we reach a point where we cannot handle the volume of fresh data being generated.

Tag: data-engineering

Data Product vs. Data as a Product

A data product is not the same as data as a product. A data product aids the accomplishment of the product's goal by using the data, whereas in data as...

Data Catalog

A data catalog is an inventory of an organization's data assets that enables rapid and efficient access to the most relevant data.

Introduction to Data Engineering

It's the process of designing and building systems for gathering vast quantities of raw operational data from a variety of sources and formats, analyzing, converting, and storing it at scale....

Data Deluge

When the granularity of data increases, its complexity also increases. Eventually, we reach a point where we cannot handle the volume of fresh data being generated.

Tag: programming

Case Class in Scala

The case class represents immutable data. It is a type of class that is often used for data storage.

Defining Variables Using the `def` Keyword in Scala

Difference between `lazy val` and `def`.

Rust’s Ownership and Borrowing Enforce Memory Safety

Rust's ownership and borrowing features prevent us from experiencing memory-related problems. Rust is a great choice when performance matters and it solves pain points that bother many other languages.
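As a rough illustration of the ownership and borrowing rules mentioned above, here is a minimal sketch. The function names (`measure`, `shout`, `consume`) are made up for this example and are not taken from the post.

```rust
// Borrows the string immutably; the caller keeps ownership.
fn measure(s: &str) -> usize {
    s.len()
}

// Borrows the string mutably; only one mutable borrow may exist at a time.
fn shout(s: &mut String) {
    s.push('!');
}

// Takes ownership of the string; the caller cannot use it afterwards.
fn consume(s: String) -> usize {
    s.len()
}

fn main() {
    let mut greeting = String::from("hello");

    let len = measure(&greeting); // immutable borrow ends after this call
    shout(&mut greeting);         // so a mutable borrow is allowed here
    println!("{greeting} (length before shouting: {len})");

    let moved = greeting;         // ownership moves to `moved`
    // println!("{greeting}");    // would not compile: value used after move
    println!("{}", consume(moved));
}
```

The commented-out line is the kind of use-after-move mistake that the compiler rejects at build time rather than leaving it to surface at run time.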

Tag: rust

Rust’s Ownership and Borrowing Enforce Memory Safety

Rust's ownership and borrowing features prevent us from experiencing memory-related problems. Rust is a great choice when performance matters and it solves pain points that bother many other languages.

Tag: memory-management

Rust’s Ownership and Borrowing Enforce Memory Safety

Rust's ownership and borrowing features prevent us from experiencing memory-related problems. Rust is a great choice when performance matters and it solves pain points that bother many other languages.

Tag: anti-pattern

Anti-Pattern

Anti-patterns seem quick and reasonable at first, but they typically have adverse effects in the future. They are design and code smells. They affect our software badly and add...

Tag: design-patterns

Singleton Pattern

A singleton pattern limits the number of instances of a class to one.
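One possible way to sketch this idea is with a lazily initialised static, assuming Rust 1.70 or later for `std::sync::OnceLock`. The `Config` type and `config` function are hypothetical names used only for illustration.

```rust
use std::sync::OnceLock;

// Hypothetical type that should exist only once in the program.
struct Config {
    name: String,
}

// Returns a reference to the single shared instance, creating it on first use.
fn config() -> &'static Config {
    static INSTANCE: OnceLock<Config> = OnceLock::new();
    INSTANCE.get_or_init(|| Config {
        name: String::from("default"),
    })
}

fn main() {
    let a = config();
    let b = config();
    // Both calls hand back the same instance.
    println!("{} {}", a.name, std::ptr::eq(a, b));
}
```

Every caller goes through `config()`, so the initialisation closure runs at most once, even when several threads call it concurrently.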

Anti-Pattern

Anti-patterns seem quick and reasonable at first, but they typically have adverse effects in the future. They are design and code smells. They affect our software badly and add...

Tag: data-streaming

Tag: singleton-pattern

Singleton Pattern

A singleton pattern limits the number of instances of a class to one.

Tag: coding-principles

Singleton Pattern

A singleton pattern limits the number of instances of a class to one.

Tag: scala

Case Class in Scala

The case class represents immutable data. It is a type of class that is often used for data storage.

Defining Variables Using the `def` Keyword in Scala

Difference between `lazy val` and `def`.

Tag: functional-programming

Defining Variables Using the `def` Keyword in Scala

Difference between `lazy val` and `def`.

Tag: envelope-encryption

Envelope Encryption

Envelope encryption is a way of encrypting plaintext data using a key and then encrypting that key using another key. This strategy is intended not just to make things...

Tag: data-protection

Envelope Encryption

Envelope encryption is a way of encrypting plaintext data using a key and then encrypting that key using another key. This strategy is intended not just to make things...

Tag: root-key

Envelope Encryption

Envelope encryption is a way of encrypting plaintext data using a key and then encrypting that key using another key. This strategy is intended not just to make things...

Tag: data-key

Envelope Encryption

Envelope encryption is a way of encrypting plaintext data using a key and then encrypting that key using another key. This strategy is intended not just to make things...

Tag: big-data

Data Catalog

A data catalog is an inventory of an organization's data assets that enables rapid and efficient access to the most relevant data.

Introduction to Data Engineering

It's the process of designing and building systems for gathering vast quantities of raw operational data from a variety of sources and formats, analyzing, converting, and storing it at scale....

Tag: data-pipeline

Introduction to Data Engineering

It's the process of designing and building systems for gathering vast quantities of raw operational data from a variety of sources and formats, analyzing, converting, and storing it at scale....

Tag: apache-spark

Let’s Know About the Parquet File

Parquet is an open-source file format for Hadoop that provides columnar storage and is built from the ground up with complex nested data structures in mind.

Partitions and Bucketing in Spark

Partitioning and bucketing are used to improve the reading of data by reducing the cost of shuffles, the need for serialization, and the amount of network traffic.

Need for Caching in Apache Spark

Caching is one of Spark's optimization strategies for reusing computations. It stores interim and partial results so they can be reused in subsequent computation stages.

Tag: cache

Need for Caching in Apache Spark

Caching is one of Spark's optimization strategies for reusing computations. It stores interim and partial results so they can be reused in subsequent computation stages.

Tag: data-caching

Need for Caching in Apache Spark

Caching is one of Spark's optimization strategies for reusing computations. It stores interim and partial results so they can be reused in subsequent computation stages.

Tag: airflow

How To Set SLA in Apache Airflow

Apache Airflow enables us to schedule tasks as code. In Airflow, an SLA determines the maximum completion time for a task or DAG. Note that SLAs are established based on...

Tag: sla

How To Set SLA in Apache Airflow

Apache Airflow enables us to schedule tasks as code. In Airflow, an SLA determines the maximum completion time for a task or DAG. Note that SLAs are established based on...

Tag: service-level agreement

How To Set SLA in Apache Airflow

Apache Airflow enables us to schedule tasks as code. In Airflow, an SLA determines the maximum completion time for a task or DAG. Note that SLAs are established based on...

Tag: workflow-engine

How To Set SLA in Apache Airflow

Apache Airflow enables us to schedule tasks as code. In Airflow, an SLA determines the maximum completion time for a task or DAG. Note that SLAs are established based on...

Tag: object-oriented-programming

Case Class in Scala

The case class represents immutable data. It is a type of class that is often used for data storage.

Tag: oops

Case Class in Scala

The case class represents immutable data. It is a type of class that is often used for data storage.

Tag: data-lake

Data Catalog

A data catalog is an inventory of an organization's data assets that enables rapid and efficient access to the most relevant data.

Tag: data-catalog

Data Catalog

A data catalog is an inventory of an organization's data assets that enables rapid and efficient access to the most relevant data.

Tag: data-inventory

Data Catalog

A data catalog is an inventory of an organization's data assets that enables rapid and efficient access to the most relevant data.

Tag: partition

Partitions and Bucketing in Spark

Partitioning and bucketing are used to improve the reading of data by reducing the cost of shuffles, the need for serialization, and the amount of network traffic.

Tag: bucketing

Partitions and Bucketing in Spark

Partitioning and bucketing are used to improve the reading of data by reducing the cost of shuffles, the need for serialization, and the amount of network traffic.

Tag: hadoop

Let’s Know About the Parquet File

Parquet is an open-source file format for Hadoop that provides columnar storage and is built from the ground up with complex nested data structures in mind.

Tag: parquet

Let’s Know About the Parquet File

Parquet is an open-source file format for Hadoop that provides columnar storage and is built from the ground up with complex nested data structures in mind.

Tag: columnar-format

Let’s Know About the Parquet File

Parquet is an open-source file format for Hadoop that provides columnar storage and is built from the ground up with complex nested data structures in mind.

Tag: columnar-storage

Let’s Know About the Parquet File

Parquet is an open-source file format for Hadoop that provides columnar storage and is built from the ground up with complex nested data structures in mind.

Tag: lakehouse

Tag: data-governance

Tag: data-security

Tag: solid

Tag: reverse-etl

Tag: etl

Tag: data-product

Data Product vs. Data as a Product

A data product is not the same as data as a product. A data product aids the accomplishment of the product's goal by using the data, whereas in data as...

Tag: data-as-a-product

Data Product vs. Data as a Product

A data product is not the same as data as a product. A data product aids the accomplishment of the product's goal by using the data, whereas in data as...

Tag: shuffling

Tag: data-mesh

Tag: data-science