Delta Lake on Databricks: ACID Transactions, Time Travel, and Schema Evolution Explained

Celestinfo Software Solutions Pvt. Ltd. Aug 21, 2025

Last updated: September 2025

Quick answer: Delta Lake is an open-source storage layer that adds ACID transactions to Parquet files on S3, ADLS, or GCS. It enables UPDATE, DELETE, and MERGE operations that plain Parquet can't do, plus time travel (query historical versions), schema evolution (add columns without rewriting data), and OPTIMIZE/ZORDER for query performance. On Databricks, it's the default table format. Key gotcha: VACUUM's default retention is 7 days - set it lower and time travel breaks for older versions. Also, ZORDER on high-cardinality columns is wasteful.

Introduction

Plain Parquet files on a data lake give you cheap storage and fast columnar reads. They don't give you transactions, updates, or deletes. If your pipeline writes a batch of files and crashes halfway through, readers see a partially-written, inconsistent dataset. Delta Lake solves this by adding a transaction log on top of Parquet, turning your data lake into something that behaves more like a database. This guide covers every Delta Lake feature you'll actually use, with SQL examples and the performance gotchas you'll hit. For a broader comparison of where Delta Lake fits versus a traditional warehouse, see our Databricks vs. Snowflake comparison.

How the Transaction Log Works

Every Delta table has a _delta_log directory containing JSON files that record every operation: which Parquet files were added, which were removed, metadata changes, and schema changes. Each JSON file represents one commit (one atomic transaction). When you read a Delta table, the engine replays the log to determine which files constitute the current version.

Every 10 commits, Databricks writes a checkpoint file (in Parquet format) that consolidates the log state. This means reads only need to find the latest checkpoint and replay the few commits after it, rather than scanning the entire log history. For tables with thousands of commits, checkpoints are what keep read performance acceptable.
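
The replay mechanics can be sketched in a few lines of plain Python. This is an illustration of the idea, not the real Delta protocol; the in-memory `commits` list and file names are hypothetical stand-ins for the JSON files under _delta_log:

```python
# Minimal sketch: each commit is a list of actions, where "add" registers a
# Parquet file and "remove" retires one. Reading the table means replaying
# the commits in order and keeping the net set of live files.
commits = [
    [{"add": "part-000.parquet"}, {"add": "part-001.parquet"}],     # commit 0
    [{"remove": "part-000.parquet"}, {"add": "part-002.parquet"}],  # commit 1
]

def replay(commits, as_of_version=None):
    """Replay commits in order to find the live file set at a version."""
    live = set()
    for version, actions in enumerate(commits):
        if as_of_version is not None and version > as_of_version:
            break
        for action in actions:
            if "add" in action:
                live.add(action["add"])
            if "remove" in action:
                live.discard(action["remove"])
    return live
```

A checkpoint is just this replay result materialized every N commits, so a reader can start from the checkpoint instead of commit 0.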

CRUD Operations: What Parquet Can't Do

Plain Parquet is append-only. Delta Lake makes Parquet files mutable through the transaction log:

-- INSERT works the same as plain Parquet
INSERT INTO sales.orders VALUES (1001, '2026-02-27', 'SHIPPED', 149.99);

-- UPDATE: changes are written as new Parquet files, old files marked as removed in the log
UPDATE sales.orders SET status = 'DELIVERED' WHERE order_id = 1001;

-- DELETE: same mechanism, old files removed, remaining rows written to new files
DELETE FROM sales.orders WHERE order_date < '2024-01-01';

-- MERGE (upsert): the most useful operation for ETL
MERGE INTO sales.orders AS target
USING staging.new_orders AS source
ON target.order_id = source.order_id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;

Under the hood, UPDATE and DELETE rewrite the affected Parquet files entirely. If you update 1 row in a 100MB file, Delta Lake writes a new 100MB file. This is why small, frequent updates on large files are expensive - and why OPTIMIZE matters (more on that below).
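
The copy-on-write behavior can be sketched as follows. This is a toy model, assuming files are just lists of rows; the file names and row shapes are hypothetical:

```python
# Sketch of copy-on-write: updating one row in a file means rewriting the
# whole file and logging the old one as removed. No file is edited in place.
files = {"part-000.parquet": [
    {"order_id": 1001, "status": "SHIPPED"},
    {"order_id": 1002, "status": "SHIPPED"},
]}

def update(files, predicate, changes):
    """Rewrite every file containing a matching row; return (files, log)."""
    new_files, log = dict(files), []
    for name, rows in files.items():
        if any(predicate(r) for r in rows):
            rewritten = [{**r, **changes} if predicate(r) else r for r in rows]
            new_name = name.replace(".parquet", "-rewritten.parquet")
            del new_files[name]          # old file marked removed in the log
            new_files[new_name] = rewritten
            log.append({"remove": name, "add": new_name})
    return new_files, log

new_files, log = update(files, lambda r: r["order_id"] == 1001,
                        {"status": "DELIVERED"})
```

Note that the untouched row (order 1002) is rewritten too - that is exactly the 1-row-in-a-100MB-file cost described above.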

Time Travel

Every commit creates a new version. You can query any historical version by version number or timestamp:

-- Query by version number
SELECT * FROM sales.orders VERSION AS OF 42;

-- Query by timestamp
SELECT * FROM sales.orders TIMESTAMP AS OF '2026-02-20 10:00:00';

-- Restore a table to a previous version (undo accidental changes)
RESTORE TABLE sales.orders TO VERSION AS OF 42;

-- View the history of all operations on a table
DESCRIBE HISTORY sales.orders;

Critical gotcha: Time travel only works if the underlying Parquet files still exist. VACUUM (which we'll cover below) deletes old files. The default VACUUM retention is 7 days. If you run VACUUM table RETAIN 24 HOURS, any time travel query for versions older than 24 hours will fail with a "file not found" error. Plan your retention based on how far back you actually need to query.
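
Why VACUUM breaks time travel can be shown with a small sketch. The version-to-files mapping and file names here are hypothetical; the point is that the log still references files the storage layer no longer has:

```python
# Time travel needs the physical files from the target version. After VACUUM
# deletes them, replaying the log still names those files, so the read fails.
log_versions = {
    0: {"part-000.parquet"},        # files live at version 0
    1: {"part-001.parquet"},        # version 1 rewrote the data
}
on_storage = {"part-001.parquet"}   # VACUUM already removed part-000

def read_version(version):
    missing = log_versions[version] - on_storage
    if missing:
        raise FileNotFoundError(f"missing files: {sorted(missing)}")
    return log_versions[version]
```

Reading the current version works; reading version 0 raises the "file not found" error described above.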

Schema Evolution

Delta Lake supports adding new columns without rewriting existing data. Renaming or dropping columns requires enabling column mapping on the table, and most column type changes still require a rewrite - only additive changes are free:

-- Add a column
ALTER TABLE sales.orders ADD COLUMN shipping_carrier STRING;

-- Enable automatic schema evolution during writes
SET spark.databricks.delta.schema.autoMerge.enabled = true;

-- Or use mergeSchema option in a specific write
df.write.format("delta") \
  .option("mergeSchema", "true") \
  .mode("append") \
  .saveAsTable("sales.orders")

With mergeSchema, if your source data has a new column that the target table doesn't, Delta Lake adds the column automatically. Existing rows get NULL for the new column. This is incredibly useful for sources with evolving schemas (APIs, event streams), but it can also mask problems - a typo in a column name creates a new column instead of writing to the intended one. Review schema changes in DESCRIBE HISTORY regularly.
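
The mergeSchema semantics can be sketched like this - new source columns are appended to the target schema, and existing rows surface NULL (None here) for them. Column names and rows are hypothetical:

```python
# Sketch of automatic schema merging: the target schema grows to cover any
# column seen in the incoming rows; no existing data is rewritten.
target_schema = ["order_id", "amount"]
existing_rows = [{"order_id": 1, "amount": 10.0}]
incoming = [{"order_id": 2, "amount": 5.0, "shipping_carrier": "UPS"}]

def merge_schema(schema, rows):
    merged = list(schema)
    for row in rows:
        for col in row:
            if col not in merged:
                merged.append(col)   # new column appended as metadata only
    return merged

schema = merge_schema(target_schema, incoming)
# Existing rows read back None for columns they never had.
padded = [{c: r.get(c) for c in schema} for r in existing_rows + incoming]
```

This also shows the typo hazard: a misspelled column name in `incoming` would silently become a new column rather than an error.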

OPTIMIZE and ZORDER

Streaming writes and frequent appends create lots of small files. Small files = slow reads because the engine spends more time opening files than reading data. OPTIMIZE compacts small files into larger ones (target size: 1GB by default).

-- Compact all small files
OPTIMIZE sales.orders;

-- Compact and co-locate data by specific columns for faster filtering
OPTIMIZE sales.orders ZORDER BY (order_date, region);

ZORDER rearranges data within files so that rows with similar values for the specified columns are stored together. This enables file skipping - if you query WHERE region = 'US-EAST', Databricks can skip files that don't contain any US-EAST data.

Gotcha: ZORDER on high-cardinality columns (like user_id with millions of unique values) is wasteful. There aren't enough rows per value to co-locate meaningfully, so you're spending compute on OPTIMIZE without getting file-skipping benefit. Stick to medium-cardinality columns: dates, regions, categories, status codes. Two ZORDER columns is usually the sweet spot - more than 3 gives diminishing returns.
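
The core idea behind ZORDER is bit interleaving: rows that are close in every clustering column get nearby sort keys, so they land in the same files and min/max statistics can skip the rest. The real OPTIMIZE ZORDER implementation is far more involved; this sketch only shows the interleaving, with made-up values:

```python
# Z-order sketch: interleave the bits of two column values into one sort key.
def z_value(x, y, bits=8):
    """Interleave bits of x and y (small non-negative ints)."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)       # x contributes even bit slots
        z |= ((y >> i) & 1) << (2 * i + 1)   # y contributes odd bit slots
    return z

rows = [(3, 7), (3, 6), (200, 1), (2, 7)]
rows.sort(key=lambda r: z_value(*r))
```

After sorting, the three rows with small values in both columns cluster together while the outlier (200, 1) sorts far away - which is also why a high-cardinality column with millions of distinct values gains little: almost every row is its own "cluster".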

VACUUM: Storage Management

Every UPDATE, DELETE, and OPTIMIZE creates new Parquet files while the old ones remain for time travel. VACUUM removes files that are no longer referenced by any version within the retention period:

-- Remove files older than 7 days (default)
VACUUM sales.orders;

-- Remove files older than 168 hours explicitly
VACUUM sales.orders RETAIN 168 HOURS;

-- DRY RUN: see what would be deleted without actually deleting
VACUUM sales.orders DRY RUN;

Always run VACUUM DRY RUN first on important tables. And remember: reducing retention below 7 days requires setting delta.deletedFileRetentionDuration on the table and disabling the safety check with spark.databricks.delta.retentionDurationCheck.enabled = false. Think twice before doing this - if someone needs to restore data from 3 days ago and you set retention to 24 hours, that data is gone.
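
VACUUM's decision rule can be sketched as a simple filter: a file is a deletion candidate only if the log marked it removed and that removal is older than the retention window. Timestamps and file names below are hypothetical:

```python
from datetime import datetime, timedelta

# Sketch of a VACUUM dry run: list files whose log removal predates the
# retention cutoff. Nothing still referenced by a live version appears here.
now = datetime(2026, 2, 27, 12, 0)
removed_files = {                                   # file -> removal time
    "part-000.parquet": now - timedelta(days=10),
    "part-001.parquet": now - timedelta(hours=5),
}

def vacuum_dry_run(removed_files, retain_hours=168):
    cutoff = now - timedelta(hours=retain_hours)
    return sorted(f for f, t in removed_files.items() if t < cutoff)
```

With the 168-hour default, only the 10-day-old file is a candidate; drop retention to 1 hour and both files - and any time travel that needed them - are gone.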

Delta Live Tables (DLT)

DLT is Databricks' declarative pipeline framework built on Delta Lake. Instead of writing imperative ETL code (read from source, transform, write to target), you declare what the output table should look like and DLT figures out the execution:

-- Declare a streaming table that ingests files incrementally with Auto Loader
CREATE OR REFRESH STREAMING TABLE raw_events
AS SELECT * FROM cloud_files('/data/events/', 'json');

-- Declare a materialized view with quality expectations
CREATE OR REFRESH MATERIALIZED VIEW clean_orders (
  CONSTRAINT valid_amount EXPECT (amount > 0) ON VIOLATION DROP ROW
)
AS SELECT order_id, customer_id, amount, order_date
FROM LIVE.raw_events
WHERE event_type = 'ORDER';

The EXPECT constraint is DLT's built-in data quality feature. Rows that fail the expectation are either dropped, flagged, or cause the pipeline to fail, depending on the ON VIOLATION setting. This replaces a lot of manual data quality code. For managing the compute behind these pipelines, see our guide on managing compute workloads for ETL vs. analytics.
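
The ON VIOLATION DROP ROW behavior can be sketched in plain Python - failing rows are filtered out and counted rather than aborting the write. The function name, rows, and predicate are hypothetical, not the DLT API:

```python
# Sketch of expectation enforcement: keep passing rows, count violations,
# and optionally fail the pipeline instead of dropping.
def apply_expectation(rows, name, predicate, on_violation="drop"):
    kept, dropped = [], 0
    for row in rows:
        if predicate(row):
            kept.append(row)
        elif on_violation == "fail":
            raise ValueError(f"expectation {name} violated: {row}")
        else:
            dropped += 1          # "drop": exclude the row, record it
    return kept, {name: dropped}

orders = [{"order_id": 1, "amount": 49.0}, {"order_id": 2, "amount": -5.0}]
clean, metrics = apply_expectation(orders, "valid_amount",
                                   lambda r: r["amount"] > 0)
```

DLT additionally surfaces these violation counts in the pipeline event log, which is what makes expectations auditable rather than silent filters.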

Delta Lake vs. Apache Iceberg vs. Hudi

All three are open table formats that add ACID transactions to data lakes. The practical differences are ecosystem fit: Delta Lake has the tightest Databricks integration (Unity Catalog, DLT, Photon), Iceberg has the broadest multi-engine support (Spark, Trino, Flink, Dremio), and Hudi specializes in record-level upserts for streaming ingestion.

Databricks now supports Iceberg through UniForm, which writes Delta tables that are also readable as Iceberg tables. So it's not entirely an either/or choice anymore.

Key Takeaways

- Delta Lake adds ACID transactions, UPDATE/DELETE/MERGE, time travel, and schema evolution on top of Parquet files in object storage.
- UPDATE and DELETE rewrite entire Parquet files, so compact small files with OPTIMIZE and ZORDER only on medium-cardinality filter columns.
- Time travel depends on old files that VACUUM deletes; keep retention at least as long as your furthest look-back and use DRY RUN first.
- DLT expectations move data quality rules into the pipeline definition; UniForm lets Delta tables also be read as Iceberg tables.

CelestInfo Engineering Team

We build lakehouse architectures on Databricks and Snowflake. Delta Lake, Iceberg, or both - we'll help you pick what fits.

Frequently Asked Questions

What happens if I set VACUUM retention below 7 days in Delta Lake?

VACUUM's default retention is 7 days. If you set it lower (e.g., VACUUM table RETAIN 24 HOURS), any time travel queries for versions older than 24 hours will fail because the underlying Parquet files have been deleted. You must also set delta.deletedFileRetentionDuration on the table and spark.databricks.delta.retentionDurationCheck.enabled to false to bypass the safety check.

Should I use ZORDER on high-cardinality columns?

No. ZORDER works by co-locating related values in the same files, which lets Databricks skip irrelevant files during queries. High-cardinality columns (like user_id with millions of unique values) don't cluster well because there aren't enough rows per value to group meaningfully. Use ZORDER on medium-cardinality columns like date, region, or category.

How does Delta Lake compare to Apache Iceberg?

Both provide ACID transactions and time travel on data lakes. Delta Lake has tighter Databricks integration (Unity Catalog, Delta Live Tables, Photon engine). Iceberg has broader multi-engine support (Spark, Trino, Flink, Dremio) and is gaining traction for vendor-neutral architectures. If you're all-in on Databricks, Delta Lake is the natural choice. If you need to query the same tables from multiple engines, Iceberg is worth evaluating.

What is the Delta Lake transaction log and why does it matter?

The transaction log (_delta_log directory) is a series of JSON files that records every change to a Delta table: which Parquet files were added, removed, or modified, plus metadata and schema changes. It's what makes ACID transactions possible on object storage. Every 10 commits, Databricks creates a checkpoint file (Parquet format) so reads don't have to replay the entire log history.
