Ad Technology · 200 Employees · 10 Weeks

How an Ad-Tech Platform Cut Query Costs by 60% Moving from Redshift to a Hybrid Architecture

A 32-node Redshift cluster burning $47K/month, analysts blocking the bidding pipeline with attribution queries, and a data team that spent more time tuning DISTKEYs than building products. Here's how we fixed it.

The Challenge

This company runs a programmatic bidding platform that processes ad impressions across mobile, display, and connected TV channels. On a typical day, that's 8 billion impression events flowing through their system. All of it was landing in a 32-node Redshift cluster that cost $47,000/month in reserved instance pricing.

The real problem wasn't the storage - it was the contention. Their analytics team ran complex multi-touch attribution queries that needed to scan weeks of impression data, join it with click streams, and calculate conversion windows. These queries regularly took 20+ minutes and consumed most of the cluster's compute. While those queries ran, the real-time bidding data ingestion pipeline backed up. In programmatic advertising, delayed bid data means missed auctions and lost revenue.

They'd already tried Redshift Concurrency Scaling to isolate the workloads, but that just doubled their bill. The on-demand concurrency scaling charges added another $40K+ per month during peak hours. Meanwhile, the data team was spending 60% of their time manually tuning DISTKEY and SORTKEY configurations, running VACUUM operations, and debugging WLM queue configurations instead of building the attribution models the business actually needed.

The VP of Engineering described it bluntly: "We're paying a senior engineer's salary just to babysit Redshift."

What We Built

We didn't do a simple lift-and-shift. The key insight was that most analytical queries only touched the last 7 days of data, but they were paying to keep 18 months of impression history on hot, expensive Redshift storage. So we designed a hybrid architecture.
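The tiering decision above boils down to a simple age cutoff. A minimal sketch of that routing rule, assuming a strict 7-day hot window (tier names and the function are illustrative, not the client's actual configuration):

```python
from datetime import datetime, timedelta, timezone

HOT_WINDOW = timedelta(days=7)  # hot-tier retention described in the article

def tier_for(event_time: datetime, now: datetime) -> str:
    """Route an impression event (or a query's lookback range) to the
    hot tier (Snowflake) or cold tier (S3 Parquet) based on age."""
    return "hot" if now - event_time <= HOT_WINDOW else "cold"
```

In practice the same cutoff drives both where new data lands long-term and which storage a query scans: anything inside the window stays on fast, expensive storage; everything older is served from cheap object storage.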

Hot tier (last 7 days): Moved to Snowflake. This is where 95% of analyst queries land. Snowflake's per-query compute isolation was the game-changer here - analysts can run their 20-minute attribution queries on a dedicated LARGE warehouse without touching the ingestion warehouse at all. No more contention, no more WLM tuning, no more angry Slack messages from the bidding team at 2am.

Cold tier (older than 7 days): Historical impression data is written to S3 in Parquet format, partitioned by date and ad_network_id. We exposed it as Snowflake external tables so analysts can still query it when they need longer lookback windows for quarterly reports - it's just slower and cheaper. For the 18 months of history already on Redshift, we ran a one-time bulk export with Redshift UNLOAD to S3 in Parquet format.
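Partitioning by date and ad_network_id typically means Hive-style key prefixes in S3, which is what external tables can prune on. A sketch of the key layout, assuming a hypothetical bucket name (the article specifies only the two partition columns):

```python
from datetime import date

def partition_prefix(d: date, ad_network_id: int,
                     bucket: str = "impressions-archive") -> str:
    """Build a Hive-style S3 prefix for one day/network partition of
    Parquet files. Bucket name and exact layout are illustrative."""
    return f"s3://{bucket}/date={d.isoformat()}/ad_network_id={ad_network_id}/"
```

With this layout, a quarterly-report query filtered to a date range and a handful of networks only lists and reads the matching prefixes instead of the full 18 months of files.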

Ingestion pipeline: Replaced the existing Redshift COPY-based ingestion with Kinesis Data Firehose → S3 → Snowflake Snowpipe. Impression events hit a Kinesis stream, Firehose buffers them (60-second window, 128MB buffer), writes Parquet files to S3, and Snowpipe auto-ingests them into Snowflake. The whole pipeline runs with under 3 minutes of end-to-end latency. Compared to their old Redshift COPY jobs that ran every 15 minutes and sometimes backed up for hours, this was a massive improvement.
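The buffering behavior described above (flush on whichever of 60 seconds or 128 MB comes first) can be modeled in a few lines. This is a client-side sketch for intuition only; real Firehose applies these buffering hints server-side:

```python
class FirehoseStyleBuffer:
    """Minimal model of Firehose-like buffering: signal a flush when the
    buffer reaches max_bytes or max_seconds have elapsed, whichever
    comes first. Thresholds mirror the article's 128 MB / 60 s config."""

    def __init__(self, max_bytes=128 * 1024 * 1024, max_seconds=60):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.buffered = 0
        self.opened_at = 0.0

    def add(self, record_size, now):
        """Buffer one record; return True when a flush should fire."""
        self.buffered += record_size
        if self.buffered >= self.max_bytes or now - self.opened_at >= self.max_seconds:
            # Flush: write a Parquet object to S3, which triggers the
            # S3 event notification that Snowpipe auto-ingest listens for.
            self.buffered = 0
            self.opened_at = now
            return True
        return False
```

The 60-second buffer window is why end-to-end latency stays under 3 minutes: one buffer flush, one S3 write, one Snowpipe load, each taking seconds to a minute.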

The tuning problem solved itself. On Redshift, they'd spent months getting the DISTKEY on impression_id and SORTKEY on event_timestamp just right. On Snowflake, automatic micro-partition pruning handles this transparently. The data team didn't need to think about physical data layout at all.
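The reason Snowflake needs no SORTKEY equivalent is that it keeps min/max metadata for each micro-partition and skips any partition whose value range cannot match a query predicate. A simplified model of that pruning check (the dict-based metadata here is an illustration, not Snowflake's internal format):

```python
def prune(partitions, lo, hi):
    """Keep only micro-partitions whose [min, max] range for a column
    overlaps the query predicate [lo, hi]; all others are skipped
    without being read."""
    return [p for p in partitions if not (p["max"] < lo or p["min"] > hi)]
```

Because impression data arrives roughly in timestamp order, micro-partitions end up naturally clustered by event time, so a "last 7 days" filter prunes most of the table without anyone configuring a sort key.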

Results

  • 60% cost cut: $47K/mo → $19K/mo compute costs
  • Zero contention: analytics and ingestion fully isolated
  • 8B events/day ingested with <3 min end-to-end latency
  • Average analyst query time: 20 min → 45 sec

Tech Stack

Snowflake · Snowpipe · AWS Kinesis Data Firehose · Amazon S3 (Parquet) · Redshift (decommissioned)

What We Learned

  • Concurrency Scaling isn't free - it's a trap for high-volume workloads. Redshift's Concurrency Scaling sounds great in a demo, but when you've got analysts running heavy queries 8 hours a day, those on-demand charges add up fast. In this case, it was cheaper to run a dedicated Snowflake warehouse than to pay Redshift's burst pricing. The math doesn't work once you cross about 4 hours of daily concurrency scaling usage.
  • Hot/cold tiering is the obvious optimization nobody does. We see this pattern constantly: teams keep 12-24 months of data on expensive hot storage when 90%+ of queries only touch the last week. Moving cold data to S3 Parquet with external tables gave analysts the same SQL interface for historical queries while cutting storage costs by roughly 80%.
  • Snowpipe's auto-ingest changed how the team thinks about data freshness. On Redshift, they ran batch COPY jobs every 15 minutes and accepted that as "near real-time." With Snowpipe auto-ingesting from S3 event notifications, data shows up in Snowflake within minutes of hitting Kinesis. The bidding optimization team started building models that react to hourly trends instead of daily ones - something that wasn't possible before, not because of technology limits, but because the data was always too stale.
  • Decommissioning Redshift took longer than the migration itself. We had the Snowflake pipeline running in production within 6 weeks. The last 4 weeks were mostly spent on validating data parity, migrating downstream BI reports, and getting sign-off from compliance that the historical data export was complete. Don't underestimate the tail end of a migration project.
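The break-even reasoning in the first point above is simple arithmetic: Concurrency Scaling bills burst capacity per-second at the cluster's on-demand rate, while a dedicated Snowflake warehouse bills credits per-second while running. A rough comparison, with placeholder rates (the dollar figures below are illustrative assumptions, not the client's actual pricing):

```python
def monthly_concurrency_cost(cluster_hourly_rate, burst_hours_per_day, days=30):
    """Redshift Concurrency Scaling: burst capacity billed at the
    cluster's on-demand hourly rate (rate here is a placeholder)."""
    return cluster_hourly_rate * burst_hours_per_day * days

def monthly_warehouse_cost(credits_per_hour, credit_price, hours_per_day, days=30):
    """Dedicated Snowflake warehouse billed in credits while running
    (credit count and price are placeholders)."""
    return credits_per_hour * credit_price * hours_per_day * days
```

With analysts active 8 hours a day, any plausible cluster rate makes the burst bill several times a dedicated warehouse's cost, which is the "math stops working past ~4 hours/day" observation in different clothes.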

Redshift Costs Getting Out of Hand?

Query contention, runaway Concurrency Scaling bills, endless DISTKEY tuning - we've untangled it before. Tell us what you're dealing with.

Start a Conversation