SaaS / Series A · 45 Employees · 6 Weeks

How a B2B SaaS Company Built a Self-Serve Analytics Platform in 6 Weeks

Five data sources, zero trust in the numbers, and a Monday metrics email that took 3 hours to assemble by hand. Here's how we fixed it with Hevo, Snowflake, and dbt.

The Challenge

Product data lived in PostgreSQL. Billing was in Stripe. Support tickets were in Zendesk. Marketing attribution was in HubSpot. User behavior tracking was in Mixpanel. Five sources, zero integration between them. And the CEO wanted a weekly metrics email every Monday morning.

The head of engineering was the one assembling that email. Every Monday, he'd pull data from each source manually, paste numbers into a Google Sheet, and send it out by 10am. It took about 3 hours, and the numbers were wrong roughly 30% of the time. The Stripe API kept timing out on large date-range pulls, so billing numbers were often incomplete. HubSpot's attribution model didn't match what the marketing team expected. Nobody trusted the data, and every all-hands meeting started with the same question: "Wait, which version of churn are we looking at?"

The company had just raised a Series A, and investors wanted clean metrics: MRR, net revenue retention, and CAC payback period. The engineering team was 12 people, none of them data engineers. They didn't have the bandwidth to build a data stack from scratch, and they didn't have the time to debug flaky API integrations every week.

What We Built

We used Hevo Data to pipe all five sources into Snowflake. Hevo's pre-built connectors for Stripe, Zendesk, HubSpot, and Mixpanel each took under a day to connect. The Stripe connector was the fastest: about 2 hours from setup to first data landing in Snowflake, including the historical backfill. PostgreSQL replication was set up with Hevo's log-based CDC, which required setting wal_level=logical on their RDS instance (a parameter group change and a reboot; not a big deal, but it does mean a brief downtime window).
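On RDS specifically, wal_level isn't set directly; enabling the rds.logical_replication static parameter in the instance's parameter group flips wal_level to logical after the reboot. Once that's done, the prerequisites for a CDC tool like Hevo can be sanity-checked from psql (a minimal sketch; your slot and sender limits will depend on your parameter group):

```sql
-- Confirm logical decoding is enabled (log-based CDC requires it).
SHOW wal_level;              -- should return 'logical' after the reboot

-- List existing replication slots; the CDC pipeline will create one of its own.
SELECT slot_name, plugin, active
FROM pg_replication_slots;

-- Make sure there's headroom for one more slot and one more WAL sender.
SHOW max_replication_slots;
SHOW max_wal_senders;
```

Checking this before pointing Hevo at the database avoids the most common first-connection failure mode: the connector authenticating fine but being unable to create its replication slot.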

Once all five sources were flowing into Snowflake's raw layer, we built a dbt project with a semantic layer that codified agreed-upon metric definitions. MRR calculation, for example, had been a source of endless arguments. We defined it as the sum of all active subscription line items at month-end, excluding one-time charges, prorated for mid-month changes. Churn was defined as MRR lost from customers who fully cancelled, divided by beginning-of-month MRR. NRR included expansion, contraction, and churn. Every metric had a single dbt model with a YAML description that the whole team reviewed in a PR before it went live.
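As a sketch of what one of those models looks like (the staging model and column names here are illustrative assumptions, not the client's actual schema), the MRR definition might be encoded as:

```sql
-- models/marts/fct_mrr.sql (illustrative)
-- MRR = sum of all active subscription line items at month-end,
--       excluding one-time charges, prorated for mid-month changes.
with month_end_items as (

    select
        item.month_end_date,
        item.customer_id,
        item.prorated_monthly_amount   -- mid-month proration handled upstream in staging
    from {{ ref('stg_stripe__subscription_items') }} as item
    where item.is_active_at_month_end
      and not item.is_one_time_charge  -- one-time charges never count toward MRR

)

select
    month_end_date,
    sum(prorated_monthly_amount) as mrr
from month_end_items
group by month_end_date
```

The paired YAML file carries the plain-English definition, and that description is what the team actually reviewed in the PR; the SQL just has to match it.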

For the BI layer, we deployed Metabase. We evaluated Looker and Preset as well, but Metabase won for two reasons: the team was small enough that Looker's governance features were overkill, and Metabase's SQL-native interface felt more natural to the engineers who'd be the primary users. The Monday metrics email is now automated: a Metabase "Pulse" sends it at 6am every Monday, pulling from the dbt models in Snowflake. Zero manual work.

Results

  • 5 sources unified in Snowflake
  • Automated metrics email at 6am Monday (was manual, sent by 10am)
  • From $0 to self-serve analytics in 6 weeks
  • 100% agreement on metric definitions across teams

Tech Stack

Hevo Data · Snowflake · dbt · Metabase · PostgreSQL · Stripe API

What We Learned

  • Metric definitions are a people problem, not a tech problem. The hardest part of this project wasn't connecting data sources or writing dbt models. It was getting the CEO, the VP of Sales, and the head of marketing to agree on what "churn" means. We spent a full afternoon in a room with a whiteboard before the dbt model was written. That meeting saved weeks of back-and-forth later.
  • Hevo's historical load can spike your Snowflake bill if you're not careful. When we first connected Stripe, Hevo backfilled 3 years of invoice data in one shot. That ran up the Snowflake credits more than expected because the warehouse auto-scaled to LARGE during the initial load. We now set a MAX_CLUSTER_COUNT of 1 on the Hevo loader warehouse and accept slower initial loads in exchange for predictable costs.
  • Metabase over Looker was the right call for a 45-person company. Looker's LookML modeling layer is powerful, but it's another abstraction on top of dbt that this team didn't need. Metabase reads directly from the dbt-modeled tables in Snowflake, and the engineers can write SQL questions when they need something custom. At this team size, the simpler tool wins.
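The warehouse cap from the second point is a one-statement change in Snowflake. A sketch (the warehouse name is assumed, and note that MAX_CLUSTER_COUNT only applies to multi-cluster warehouses, an Enterprise-edition feature):

```sql
-- Pin the warehouse Hevo loads through so historical backfills run slower
-- instead of scaling out and burning credits.
ALTER WAREHOUSE hevo_loader_wh SET
    WAREHOUSE_SIZE = 'SMALL'
    MAX_CLUSTER_COUNT = 1
    AUTO_SUSPEND = 60;   -- suspend after 60s idle so gaps between loads cost nothing
```

The trade-off is explicit: initial backfills take longer, but the per-week Snowflake bill becomes predictable, which mattered more to this team.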

Data Scattered Across Too Many Tools?

If your weekly metrics are assembled by hand and nobody trusts the numbers, we've been there. Let's talk about what a realistic data stack looks like for your team size.

Start a Conversation