Data Engineers face many challenges with Data Lakes. GDPR deletion requests, data quality issues, handling large metadata, and performing merges and deletes are tough problems that nearly every Data Engineer encounters with formats like Parquet, ORC and Avro. This session showcases how you can apply updates, upserts and deletes on a Delta Lake table with just a few lines of code, use time travel to go back in time and easily reproduce experiments and reports, and avoid the performance problems caused by small files.

Delta Lake was developed by Databricks and has been donated to the Linux Foundation; the code is available at http://delta.io. Delta Lake is widely adopted because of the advantages it brings to Data Lakes, and many enterprises use it as the default data format in their architecture. We will discuss, demo and showcase these features using SQL or the equivalent Python or Scala APIs.
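As a taste of the DML features the session covers, here is a minimal PySpark sketch of UPDATE, DELETE and MERGE (upsert) on a Delta table. The table path and the names (events, eventId, userId, status) are illustrative placeholders, not taken from the talk.

```python
# A minimal sketch of Delta Lake updates, deletes, and upserts (MERGE) in PySpark.
# The table path and column names below are hypothetical placeholders.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

# Configure a local Spark session with the Delta Lake extensions.
spark = (
    SparkSession.builder.appName("delta-dml-demo")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

events = DeltaTable.forPath(spark, "/tmp/delta/events")

# UPDATE: fix a mistyped status value in place.
events.update(
    condition="status = 'PENDNG'",
    set={"status": "'PENDING'"},
)

# DELETE: remove one user's rows, e.g. for a GDPR erasure request.
events.delete("userId = 'user-42'")

# MERGE (upsert): apply a batch of changes in one atomic operation.
updates = spark.read.format("delta").load("/tmp/delta/events_updates")
(
    events.alias("t")
    .merge(updates.alias("s"), "t.eventId = s.eventId")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```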
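Time travel builds on the same transaction log: every commit to a Delta table is versioned, so an earlier snapshot can be re-read by version number or timestamp. A minimal sketch, assuming the same hypothetical table as above; the version and timestamp values are made up.

```python
# A minimal sketch of Delta Lake time travel, assuming the /tmp/delta/events
# table from the previous snippet. Version and timestamp values are illustrative.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # with the Delta configs shown above

# Read the table as of an earlier version to reproduce an old report.
v0 = spark.read.format("delta").option("versionAsOf", 0).load("/tmp/delta/events")

# Or read it as of a point in time.
snapshot = (
    spark.read.format("delta")
    .option("timestampAsOf", "2024-01-01 00:00:00")
    .load("/tmp/delta/events")
)

# Inspect the table history (one row per commit).
DeltaTable.forPath(spark, "/tmp/delta/events").history().show(truncate=False)
```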
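For the small-files problem, a Delta table can be compacted into fewer, larger files. Below is a sketch of two common approaches: the optimize/executeCompaction API available in newer Delta Lake releases, and the repartition-with-dataChange rewrite that also works on older ones. The path and partition count are again placeholders.

```python
# A minimal sketch of compacting small files in a Delta table.
# executeCompaction() is available in newer Delta Lake releases; the
# repartition-based rewrite below works on older ones as well.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # with the Delta configs shown above
path = "/tmp/delta/events"  # hypothetical table path

# Newer releases: bin-pack small files into larger ones.
DeltaTable.forPath(spark, path).optimize().executeCompaction()

# Alternative: rewrite into fewer files without changing the data,
# so downstream streaming readers are not re-triggered.
(
    spark.read.format("delta").load(path)
    .repartition(16)  # illustrative target file count
    .write.format("delta")
    .option("dataChange", "false")
    .mode("overwrite")
    .save(path)
)
```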
The slides are available at tinyurl.com/bbuzz-delta-lake or attached below.