Reverse ETL: Pushing Data from Your Warehouse Back to Salesforce, HubSpot, and Beyond
Quick answer: Reverse ETL syncs modeled data from your data warehouse (Snowflake, BigQuery, Redshift) back to operational tools like Salesforce, HubSpot, and Zendesk. You write a SQL query in the warehouse to define the data, map columns to destination fields, and schedule syncs. Tools like Census, Hightouch, and Polytomic handle the API integration. Don't build it yourself - API rate limits, error handling, and idempotency are harder than they look.
Last updated: October 2025
Introduction
Your data warehouse has the best version of your customer data. Product usage metrics, support ticket history, billing status, lead scores from your ML model - it's all there, cleaned and joined. But your sales team lives in Salesforce. Marketing lives in HubSpot. Support lives in Zendesk. They can't write SQL, and they shouldn't have to. Reverse ETL bridges this gap by pushing warehouse-computed data to the tools where teams actually do their work.
This isn't a new idea - teams have been exporting CSVs from warehouses and uploading them to Salesforce for years. What's new is that dedicated Reverse ETL tools automate the entire flow: query the warehouse, detect changes, handle API pagination and rate limits, retry failures, and log everything. For a broader view of how Reverse ETL fits into the integration landscape, see our 5 Data Integration Patterns guide.
Why Reverse ETL Exists
The modern data stack created an ironic problem. Fivetran and Hevo pull data into your warehouse from 200+ sources. dbt transforms it into clean, well-tested models. Your warehouse is the single source of truth. But all that work is trapped behind a SQL interface that only the data team uses.
Meanwhile, the people who make revenue-impacting decisions - sales reps, campaign managers, support agents - work in SaaS tools that have no direct connection to the warehouse. They're making decisions based on stale or incomplete data, or worse, manually maintaining their own spreadsheets. Reverse ETL solves this by pushing the warehouse's enriched data to where it's actually needed.
How Reverse ETL Works
The workflow has 4 steps:
- Define a model (SQL query). Write a SELECT statement in your warehouse that produces the rows and columns you want to sync. This query runs in Snowflake, BigQuery, or Redshift - wherever your modeled data lives.
- Map columns to destination fields. Tell the Reverse ETL tool which warehouse column maps to which field in the destination. For example, `warehouse.lead_score` maps to `Salesforce.Lead.Lead_Score__c`.
- Configure sync behavior. Choose how records are matched (by email? Salesforce ID? HubSpot contact ID?) and what happens on conflicts (upsert, create-only, update-only).
- Schedule syncs. Run the sync hourly, daily, or on a cron schedule. The tool diffs the current query results against the last sync, and only sends changed records to the destination.
```sql
-- This query powers a Census sync to Salesforce
-- It pushes lead scores and product usage data to Lead records
SELECT
    l.salesforce_lead_id  AS lead_id,
    s.lead_score          AS lead_score__c,
    s.last_product_login  AS last_product_login__c,
    s.features_used_count AS features_used__c,
    s.plan_name           AS current_plan__c
FROM analytics.mart_lead_scoring s
JOIN raw_salesforce.leads l
    ON s.email = l.email
WHERE s.lead_score IS NOT NULL;
```
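The column-mapping step amounts to renaming warehouse columns to the destination's field API names and dropping everything else. A minimal Python sketch of that idea, with a hypothetical mapping (real tools configure this in a UI, and the Salesforce field names here are illustrative):

```python
# Map warehouse column names to Salesforce field API names.
# The mapping itself is hypothetical; a Reverse ETL tool stores
# this configuration per sync.
FIELD_MAP = {
    "lead_score": "Lead_Score__c",
    "last_product_login": "Last_Product_Login__c",
    "features_used_count": "Features_Used__c",
}

def to_destination_payload(row: dict) -> dict:
    """Rename mapped columns; silently drop anything not in the mapping."""
    return {dest: row[src] for src, dest in FIELD_MAP.items() if src in row}

row = {"lead_id": "00Q123", "lead_score": 87, "last_product_login": "2025-10-01"}
print(to_destination_payload(row))
# {'Lead_Score__c': 87, 'Last_Product_Login__c': '2025-10-01'}
```

Dropping unmapped columns at this layer is deliberate: it keeps the payload to exactly the fields the destination needs, which matters once API limits come into play.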
Reverse ETL Tools
| Tool | Warehouse Support | Destinations | Differentiator |
|---|---|---|---|
| Census | Snowflake, BigQuery, Redshift, Databricks | 150+ (Salesforce, HubSpot, Google Ads, Braze, Zendesk) | Strong dbt integration, audience builder for marketing teams |
| Hightouch | Snowflake, BigQuery, Redshift, Databricks, PostgreSQL | 140+ (Salesforce, Marketo, Intercom, Amplitude) | Visual audience builder, customer studio for non-technical users |
| Polytomic | Snowflake, BigQuery, Redshift, PostgreSQL | 50+ (Salesforce, HubSpot, Notion, Airtable) | Bi-directional syncs, built-in data mapping UI |
| Fivetran (built-in) | Snowflake, BigQuery, Redshift | Growing list | Single platform for both ETL and Reverse ETL |
| Hevo (built-in) | Snowflake, BigQuery, Redshift | Growing list | No-code setup, part of existing Hevo pipeline |
Use Cases That Actually Work
1. Lead Scoring Pushed to Salesforce
Your data science team builds a lead scoring model in the warehouse using product signups, website behavior, company firmographics, and email engagement. The model runs nightly, scoring every lead. Reverse ETL pushes the score to Salesforce's Lead_Score__c field. Sales reps see the score right on the lead record when they open it - no dashboard required.
2. Customer Health Scores to Zendesk
Combine product usage data (DAU, feature adoption, error rates), billing status (MRR, payment failures), and support history (open tickets, CSAT scores) into a single health score in the warehouse. Push it to Zendesk as a custom field. When a support agent opens a ticket, they immediately see whether this is a healthy customer or one at risk of churning - and can adjust their response accordingly.
3. Product Usage Data to HubSpot
Marketing wants to segment users by product behavior: power users, trial users who haven't activated, users who stopped logging in. These segments live in the warehouse as dbt models. Reverse ETL pushes segment membership to HubSpot contact properties, enabling targeted email campaigns without marketing ever touching SQL.
4. Ad Audience Syncs to Google/Meta
Build a warehouse model that identifies high-value customers or lookalike seed audiences. Push those email lists to Google Ads Customer Match or Meta Custom Audiences. The sync runs daily, so your ad targeting always reflects the latest warehouse data instead of a stale CSV uploaded 3 weeks ago.
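On the Google side specifically, Customer Match expects SHA-256 hashes of normalized (whitespace-trimmed, lowercased) email addresses rather than raw emails. A minimal sketch of that normalization step; the upload itself goes through the Google Ads API and is omitted here:

```python
import hashlib

def normalize_and_hash(email: str) -> str:
    """Google Customer Match accepts hex-encoded SHA-256 hashes of
    lowercased, whitespace-trimmed email addresses."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()

# Normalization makes "  Jane.Doe@Example.com " and "jane.doe@example.com"
# hash to the same value, so list membership stays stable across syncs.
print(normalize_and_hash("  Jane.Doe@Example.com "))
```

Reverse ETL tools do this hashing for you; if you build the sync yourself, skipping normalization is the classic bug that silently shrinks your matched audience.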
Building It Yourself vs Buying
The temptation to build your own Reverse ETL is strong. "It's just API calls, right?" Here's what you'll actually need to build:
- Change detection. You can't send all 500,000 contacts to Salesforce every hour. You need to diff the current query results against the last sync and only send changed records. This means storing sync state somewhere reliable.
- API rate limit handling. Salesforce standard edition caps at 15,000 API calls per 24-hour period. HubSpot's batch API handles 100 records per call with 100 calls per 10 seconds. You need backoff, retry, and queuing logic.
- Idempotency. If a sync fails halfway through, you need to resume without creating duplicate records. This means upsert logic based on external IDs, not blind inserts.
- Error handling. When Salesforce rejects 47 out of 10,000 records because of a validation rule, you need to log those failures, surface them to users, and make them retryable.
- Field mapping and type coercion. Your warehouse has `TIMESTAMP_NTZ`. Salesforce expects `DateTime` in ISO 8601. HubSpot wants epoch milliseconds. Every destination has its own type system.
- Schema drift. When someone adds a new required field in Salesforce, your sync breaks. You need monitoring and alerting for schema changes in both source and destination.
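The change-detection and idempotency points can be sketched together, because keying the diff by the destination's external ID is exactly what makes retries safe. A minimal Python sketch, assuming sync state is held in an in-memory dict (a real implementation persists it in a warehouse table or key-value store); field names and destinations are illustrative:

```python
from datetime import datetime, timezone

def diff_rows(current: dict[str, dict], last_synced: dict[str, dict]) -> list[dict]:
    """Return only rows that are new or changed since the last sync.
    Keying by the destination's external ID keeps retries idempotent:
    re-sending a changed row is an upsert, never a duplicate insert."""
    return [row for key, row in current.items() if last_synced.get(key) != row]

def coerce_timestamp(dt: datetime, destination: str):
    """The same value needs a different wire format per destination."""
    if destination == "salesforce":   # ISO 8601 string
        return dt.astimezone(timezone.utc).isoformat()
    if destination == "hubspot":      # epoch milliseconds
        return int(dt.timestamp() * 1000)
    raise ValueError(f"unknown destination: {destination}")

last = {"00Q1": {"lead_score": 80}, "00Q2": {"lead_score": 55}}
now  = {"00Q1": {"lead_score": 87}, "00Q2": {"lead_score": 55}, "00Q3": {"lead_score": 91}}
print(diff_rows(now, last))  # [{'lead_score': 87}, {'lead_score': 91}]
```

The HTTP layer is deliberately omitted: rate-limit backoff, queuing, and partial-failure retry sit on top of this diff, and that is where the commercial tools earn their keep.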
For 1–2 simple destinations with small volumes, building might work. For anything more, Census or Hightouch pays for itself in engineering time saved within the first month. They've already solved these problems across 150+ integrations.
Data Modeling for Reverse ETL
Reverse ETL is only as good as the data you feed it. Build dedicated "sync-ready" mart tables in dbt with these principles:
- Clear primary keys that match the destination. If you're syncing to Salesforce, the primary key should be the Salesforce record ID (not your internal user ID). If matching by email, make sure emails are deduplicated and lowercased.
- Only the columns you're syncing. Don't expose your entire customer model. Include only the 5–10 fields that the destination actually needs. This reduces sync volume and API calls.
- Flat structure. No nested JSON, no arrays. SaaS APIs expect flat key-value pairs. If your warehouse model has nested structures, flatten them in the mart layer.
- dbt tests for sync readiness. Add `unique` and `not_null` tests on primary keys. Add `accepted_values` tests on enum fields that map to picklists in the destination. A failing dbt test should block the sync.
```sql
-- models/marts/sync_salesforce_leads.sql
-- This model powers the Census sync to Salesforce Lead records
{{ config(materialized='table') }}

SELECT
    sf.id                     AS salesforce_lead_id,
    COALESCE(s.lead_score, 0) AS lead_score,
    s.last_login_at           AS last_product_login,
    s.total_features_used     AS features_used_count,
    s.current_plan            AS plan_name,
    CURRENT_TIMESTAMP()       AS _synced_at
FROM {{ ref('mart_lead_scoring') }} s
INNER JOIN {{ ref('stg_salesforce__leads') }} sf
    ON LOWER(s.email) = LOWER(sf.email)
WHERE sf.is_converted = FALSE
  AND s.lead_score IS NOT NULL
```
Common Mistakes
- Syncing too many fields. Every extra field increases API call size and the chance of validation errors. Start with 3–5 high-value fields (lead score, health score, last login) and expand based on actual business team requests.
- Syncing too frequently. Hourly syncs sound great until you hit Salesforce's 15,000 API call daily limit by 3 PM. Match sync frequency to how often the downstream team actually checks the data. For most use cases, daily or every 6 hours is plenty.
- No monitoring. Reverse ETL failures are silent by default. If the sync breaks on a Tuesday and nobody notices until the following Monday, sales reps have been working with week-old data. Set up Slack alerts for sync failures and record count anomalies.
- Syncing raw data instead of modeled data. Reverse ETL doesn't fix messy data - it amplifies it. If your warehouse model has duplicate customer records, you'll create duplicate records in Salesforce. Clean the data in dbt first.
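On the monitoring point, a record-count anomaly check is cheap to build. A sketch with an illustrative threshold, assuming you log each sync's record count somewhere queryable:

```python
def count_anomaly(current: int, history: list[int], tolerance: float = 0.5) -> bool:
    """Flag a sync whose record count deviates more than `tolerance`
    (as a fraction) from the recent average. A sudden drop to near zero
    usually means the upstream model broke, not that nothing changed."""
    if not history:
        return False                     # no baseline yet
    avg = sum(history) / len(history)
    if avg == 0:
        return current != 0
    return abs(current - avg) / avg > tolerance

print(count_anomaly(120, [1000, 980, 1020]))  # True: sudden drop, alert
print(count_anomaly(990, [1000, 980, 1020]))  # False: within normal range
```

Wire the `True` case to a Slack webhook and you've covered the most common silent-failure mode for a few lines of code.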
API Rate Limits You'll Hit
| Destination | Rate Limit | Practical Impact |
|---|---|---|
| Salesforce (Standard) | 15,000 API calls / 24 hours | ~15,000 record updates per day using single-record API; bulk API handles 10,000 per batch |
| HubSpot | 100 records/call, 100 calls/10s | ~36,000 records/hour with batch API |
| Zendesk | 400 requests/minute (Team), 700 (Professional) | Use bulk API for updates over 100 records |
| Google Ads | 15,000 operations per mutate request | Customer Match lists can take 24–48 hours to populate after upload |
| Intercom | 1,000 requests/minute | Contact updates are rate-limited per workspace |
The gotcha: Salesforce's 15,000-call daily limit on standard edition is the one that bites most teams. If you're syncing 50,000 leads daily, you'll need Salesforce Enterprise edition (which raises the limit) or the Salesforce Bulk API (which Census and Hightouch support natively). HubSpot's batch API maxes out at 100 records per call - not per second, per call. For large contact lists, budget for sync windows of 30–60 minutes.
Key Takeaways
- Reverse ETL pushes warehouse-computed data (lead scores, health scores, segments) to SaaS tools where business teams work.
- The workflow is: SQL query in warehouse, column mapping, schedule sync. Tools like Census and Hightouch handle the API complexity.
- Don't build your own unless you have very simple requirements. API rate limits, idempotency, and error handling are harder than expected.
- Model your data for sync readiness in dbt: flat structure, clear primary keys, only the columns you need, and tests to ensure data quality.
- Start with 3–5 high-value fields and daily syncs. Expand based on what business teams actually use, not what seems comprehensive.
- Monitor sync failures actively. Silent failures mean stale data, which is worse than no data.
Q: What is Reverse ETL?
Reverse ETL is the process of syncing modeled, enriched data from your data warehouse (Snowflake, BigQuery, Redshift) back to operational SaaS tools like Salesforce, HubSpot, Zendesk, and Google Ads. It closes the loop between analytics and operations by putting warehouse-computed insights where business teams actually work.
Q: Should I build my own Reverse ETL or buy a tool?
Buy, unless you have very few destinations and simple sync requirements. Building your own means handling API rate limits, pagination, error retries, idempotency, field mapping, and schema drift for every destination. Tools like Census and Hightouch have already solved these problems across 150+ integrations.
Q: What are the API rate limits for common Reverse ETL destinations?
Salesforce standard edition allows 15,000 API calls per 24-hour period. HubSpot's batch API handles 100 records per call with 100 calls per 10 seconds. Google Ads allows 15,000 operations per mutate request. These limits constrain how much data you can sync and how frequently.
Q: How do I model data for Reverse ETL in dbt?
Build dedicated "sync-ready" mart tables in dbt with clear primary keys that match the destination's identifier (e.g., Salesforce record ID or email). Include only the columns you're syncing. Keep the model flat - no nested objects. Add dbt tests to ensure primary key uniqueness and required fields are never null.