The Single Source of Truth

Mike Wang
May 29, 2025
5 min read

This is the second post in the Future of Revenue series. Check out the other posts below. Updated 5/29/25.

Today I'm going to show you how to create a single source of truth for all of your GTM data.

If you're part of a GTM team in 2025, you're probably in one of two camps: either you have yet to collect and organize your team’s data and you feel like you’re flying blind, or you’re experiencing data overload - you’re swimming in information but drowning in noise. Every tool you add to your stack claims it will unlock the golden insight, but instead leaves you juggling fragmented views that each paint only one piece of the picture [1].

We’ve heard the promises for years - a holistic 360-degree view of your accounts and leads - but the results have been lackluster and it’s started to feel more like a mirage than a realistic goal. Even if you’ve successfully unified your data (which for some teams can take years), and you have a big beautiful data warehouse, how do you actually leverage it to grow revenue in your day-to-day? There's always been a key piece missing - until now.

The rise of large language models (LLMs) has allowed for an incredible breakthrough - now AI can organize and interpret millions of datapoints to provide nuanced, actionable insights and personalized content at scale. You may have already tried some of these AI tools and been underwhelmed - AI is not a silver bullet!

AI without data is shallow and flat. Data without AI is idle and squandered.

The magic is in how the technology is integrated ✨.

In this series, we show you what a state-of-the-art, signals-based GTM automation system looks like, starting with the data architecture in this post.

Data Sources

Building a revenue automation system starts with robust data foundations. After all, a single source of truth that's empty is about as useful as a Tesla with no charge.

There’s a world of data out there, and below is an overview of essential types to consider. But don't worry - you don't need to catch 'em all!

The specific data sources you should prioritize will depend on several key factors, including:

  • The product you are selling
  • Your ideal customer profile (ICP)
  • The sales and marketing motions you employ

Experimentation is key - find the signals and lead fit indicators most valuable for your unique business. For example, if you are a heavily sales-driven organization with long sales cycles, maintaining high-quality call transcripts, email history, and notes may be especially beneficial.

Data Sources Table
Data Type | Possible Sources | Examples
First party data
Website Engagement | Website Tag, Pardot, Marketo | A lead screenshotted your pricing page, or spent 20 min reading about a key topic.
Content Engagement | Website Tag | A lead downloads your eBook, and reads a related blog post and case study.
Ad Engagement | Website Tag, LinkedIn Ads, Google Ads, Bing Ads | A VP of Sales clicked your ad about pipeline velocity - signaling they care about sales acceleration.
Referral Traffic | Website Tag | A lead comes from the Y Combinator site, or ChatGPT, indicating a certain type of intent or buyer profile.
Form Fills | Hubspot, Website Tag | A lead fills out a demo or contact request form.
Deal History | Salesforce, Hubspot, Stripe | A company that closed-lost last year is reengaging.
Notes on the Account, Contact, or Opportunity | Salesforce, Hubspot | Ashley described a specific painpoint they want to solve, and she loves her golden doodle.
Sales Email Threads | Outreach, Salesloft | The prospect replies “let’s revisit next quarter” or asks for technical documentation.
Marketing Email Engagement | Hubspot, Marketo, Pardot | A lead opens and clicks 4 of your last 5 newsletters, including one about a new product launch.
Meetings | Google Calendar, Calendly, Salesforce | The lead booked a 30-min intro call with an AE next week and added 3 colleagues to the invite.
Call Transcripts | Gong, Fireflies*, Fathom* | They described their budget, why they are interested, and identified their gatekeeper on the call.
Webinar Attendance | Salesforce, Hubspot | A CMO attended your webinar.
Event Attendance | Salesforce, Hubspot | A decision-maker stopped by your Dreamforce booth.
Product Usage Data | Mixpanel, Your Database* | A user signed up for a trial, invited 5 teammates, and hit a key usage milestone.
Customer Support Requests* | Intercom*, Zendesk* | A customer asks “Do you integrate with Salesforce?” - a strong upsell signal.
Third party data
Hiring | Third Party APIs | A healthtech company is hiring 10 care coordinators - signal they’re expanding patient ops and may need automation tools.
Job Listings | Third Party APIs | A university is hiring an “Online Learning Systems Manager” - signal they’re investing in edtech infrastructure.
Champion Movement (Job Changes) | Third Party APIs | Your champion just became COO at another company - prime time to reopen the deal.
Social and Review Activity | LinkedIn, G2, X*, Reddit*, GitHub* | A manufacturing VP complains on LinkedIn about supply chain inefficiencies, or a key prospect announces they are going to an upcoming conference.
News & Press | Scraping, AI Web Search, News APIs* | A retail chain just announced a major store expansion in the Northeast - triggering demand for hiring, payroll, POS systems.
Custom AI signals ✨ | Scraping, AI Web Search, Clay* | Almost anything you can think of! A real estate developer acquired new land, a prospect launched a new feature that may need your LLM ops product.
Funding Data | Third Party APIs | A fintech startup just raised a $30M Series B.
Firmographic Data | Scraping, Third Party APIs | Revenue, employee count, industry, location - may all be important for narrowing leads to your ICP.
Technographic Data | Scraping, Third Party APIs* | A direct-to-consumer brand runs Shopify + Klaviyo - indicating they are a fit for your marketing services.
Person Data | Third Party APIs | IP & fingerprint resolution → person level data, job title, and contact info for anonymous web visitors.
Intent Co-ops | Zoominfo, 6sense*, Bombora* | A regional bank is showing surging interest in “fraud detection tools”.

* Avina comes with integrations to all of the above data types and sources, except for those marked with a star.

Web Behavior Data

Not all data is created equal. Knowing that a visitor spent 15 minutes on your pricing page isn’t groundbreaking intel, but knowing precisely what caught their attention during their session can make or break a sale.

At Avina we take the web interaction data a step further by also capturing the exact text that visitors read, actions like scrolling and clicking, and screenshots. We then resolve anonymous visitors at the contact and account level with a waterfall of data providers. With the help of AI, each session transforms into a detailed narrative of the visitor’s interests and persona.

Data Unification

Now that we’ve identified the data sources we want to integrate, the next step is to build the data pipelines and transformations to unify everything. Clean, unified data is the fuel that powers effective playbooks, workflow automations, and account-based marketing. On top of that, AI requires tidy data, and will reward you with the most effective recommendations and personalization.

The data unification process involves three main steps: Integration and Ingestion, the Unification Pipeline, and Enrichment.

Step 1: Integration and Ingestion

In order to start pulling in the data we need, we’ll need to integrate with those data sources through their application programming interfaces (APIs). For example, Hubspot, Google Ads, Outreach, and many other platforms offer public APIs.

If the thought of building countless integrations makes you want to book a one-way ticket to Tulum, I've got good news. ELT tools like Airbyte, Fivetran, and Ampersand can streamline the heavy lifting - we use them extensively at Avina.

For custom web data without available APIs, you can use web scraping to pull it directly into your systems.

Once you have those integrations built out, you’ll typically store the data in a data warehouse or database like Amazon Redshift or MongoDB.
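To make the ingestion step concrete, here is a minimal sketch of pulling records from a cursor-paginated API and landing them in a raw table. Everything specific in it is an assumption for illustration: fake_fetch_page stands in for a real API client, the raw_records table layout is made up, and an in-memory SQLite database plays the role of the warehouse. It is not how any particular vendor's API or Avina's pipeline works.

```python
import json
import sqlite3
from typing import Callable, Iterator, Optional

# A page is (records, next_cursor); next_cursor is None on the last page.
Page = tuple[list[dict], Optional[str]]


def fake_fetch_page(cursor: Optional[str]) -> Page:
    """Hypothetical stand-in for a real API call (illustrative data only)."""
    pages = {
        None: ([{"id": "1", "email": "ana@example.com"}], "page2"),
        "page2": ([{"id": "2", "email": "bo@example.com"}], None),
    }
    return pages[cursor]


def iter_records(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[dict]:
    """Follow the cursor until the source reports no further pages."""
    cursor = None
    while True:
        records, cursor = fetch_page(cursor)
        yield from records
        if cursor is None:
            break


def ingest(conn: sqlite3.Connection, source: str,
           fetch_page: Callable[[Optional[str]], Page]) -> int:
    """Store each raw record as JSON, keyed by (source, source_id)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_records ("
        "source TEXT, source_id TEXT, payload TEXT, "
        "PRIMARY KEY (source, source_id))"
    )
    count = 0
    for record in iter_records(fetch_page):
        # INSERT OR REPLACE keeps re-syncs idempotent: same key, latest payload.
        conn.execute(
            "INSERT OR REPLACE INTO raw_records VALUES (?, ?, ?)",
            (source, record["id"], json.dumps(record)),
        )
        count += 1
    conn.commit()
    return count


conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
loaded = ingest(conn, "crm", fake_fetch_page)
```

Keeping the raw payload untouched at this stage is a deliberate choice in the sketch: the unification pipeline can then be re-run on the stored data without calling the source again.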

Step 2: Unification

The next step is to take all the data that we’ve pulled in and unify it. The challenge is that each source structures data differently, and duplicate records exist not just within each system but also across them.

The key is to first define a unified data model that is flexible enough to accommodate the idiosyncrasies of each source, then build a data pipeline that transforms data from each system into the unified schema. We’ll want a unified model for each of these entities:

  • Accounts
  • Contacts
  • Deals
  • Touches
  • Tactics/Channels
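As a rough illustration of what a unified model for these entities could look like, here is a sketch using Python dataclasses. The field names, the Channel values, and the source_ids mapping are all assumptions chosen for the example, not Avina's actual schema; a real model would carry many more fields.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Channel(Enum):
    """Tactics/channels a touch can belong to (illustrative values)."""
    MARKETING_EMAIL = "Marketing Emails"
    WEBSITE = "Website"
    SALES_CALL = "Sales Calls"


@dataclass
class Account:
    account_id: str
    domain: str
    name: Optional[str] = None
    # Original record IDs per source system, so merges stay traceable.
    source_ids: dict[str, str] = field(default_factory=dict)


@dataclass
class Contact:
    contact_id: str
    account_id: Optional[str] = None
    emails: list[str] = field(default_factory=list)
    linkedin_url: Optional[str] = None
    job_title: Optional[str] = None


@dataclass
class Deal:
    deal_id: str
    account_id: str
    stage: str
    amount: Optional[float] = None


@dataclass
class Touch:
    touch_id: str
    account_id: str
    channel: Channel
    occurred_at: datetime
    contact_id: Optional[str] = None
    detail: str = ""


# Every entity points back to an account, which preserves relationships.
account = Account("acc_1", "example.com", source_ids={"crm": "crm-42"})
contact = Contact("con_1", account.account_id, emails=["ana@example.com"])
touch = Touch(
    "tch_1", account.account_id, Channel.WEBSITE,
    datetime(2025, 5, 29, 15, 0, tzinfo=timezone.utc),
    contact_id=contact.contact_id, detail="Viewed pricing page",
)
```

Most fields are optional on purpose in this sketch, since no single source is likely to populate all of them.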

The data pipeline should do the following:

Data Pipeline Steps
  • Standardize the data: Transform data into common schemas while preserving entity relationships.
    Example: domain, website, and site should all map to ‘domain’
  • Clean the data: Refine and correct raw data for accuracy.
    Example: https://www.google.com → google.com
    company name "[not provided]" → set name as the domain
  • Normalize the data: Convert data into a standard, consistent format.
    Example: "$1M", "1 million" → 1000000
    "Jan 4th, 2023 5pm ET" → "2023-01-04T17:00:00.000-05:00"
  • Structure the data: Organize raw data, often using LLMs for extraction.
    Example: From a raw call transcript, we use an LLM to extract information such as painpoints, budget, and decision makers.
  • Classify the data: Categorize data using rules and AI.
    Example: Label raw data with the type of touch: utm_campaign=aug_ai_newsletter_123 → Channel = Marketing Emails
  • Deduplicate records: Identify and merge duplicate entries.
    Example: A contact has the email john@meta.com and another has john@gmail.com, but both list the LinkedIn URL linkedin.com/in/john - they are the same person.
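A few of the pipeline steps above (standardize, clean, normalize, and rule-based classification) can be sketched in plain Python. This is a simplified illustration under stated assumptions: the alias map, placeholder names, multiplier table, and channel rules are all hypothetical, and a production pipeline would need far more cases than these.

```python
import re
from urllib.parse import parse_qs, urlparse

# Hypothetical mappings; real ones would be built per source system.
FIELD_ALIASES = {"website": "domain", "site": "domain"}
PLACEHOLDER_NAMES = {"", "[not provided]"}
MULTIPLIERS = {"k": 1_000, "thousand": 1_000, "m": 1_000_000,
               "million": 1_000_000, "b": 1_000_000_000, "billion": 1_000_000_000}
CHANNEL_RULES = [("newsletter", "Marketing Emails"), ("webinar", "Webinars")]
AMOUNT_RE = re.compile(r"^\$?\s*([\d,]*\.?\d+)\s*([a-z]+)?$")


def standardize(record: dict) -> dict:
    """Rename source-specific keys to the unified schema's names."""
    return {FIELD_ALIASES.get(key, key): value for key, value in record.items()}


def clean_domain(raw: str) -> str:
    """Reduce a URL or hostname to a bare domain without 'www.'."""
    value = raw.strip().lower()
    if "://" not in value:
        value = "//" + value  # lets urlparse treat a bare host as a netloc
    host = urlparse(value).netloc.split(":")[0]
    return host.removeprefix("www.")


def normalize_amount(raw: str) -> int:
    """Convert strings like '$1M' or '1 million' to a whole number."""
    match = AMOUNT_RE.match(raw.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized amount: {raw!r}")
    number, suffix = match.groups()
    multiplier = 1 if suffix is None else MULTIPLIERS.get(suffix)
    if multiplier is None:
        raise ValueError(f"unrecognized suffix in amount: {raw!r}")
    return round(float(number.replace(",", "")) * multiplier)


def classify_channel(landing_url: str) -> str:
    """Label a touch from its utm_campaign using simple keyword rules."""
    query = parse_qs(urlparse(landing_url).query)
    campaign = query.get("utm_campaign", [""])[0].lower()
    for keyword, channel in CHANNEL_RULES:
        if keyword in campaign:
            return channel
    return "Unknown"


def transform(record: dict) -> dict:
    """Apply standardize, clean, and normalize to one raw record."""
    row = standardize(record)
    row["domain"] = clean_domain(row["domain"])
    if (row.get("name") or "").strip().lower() in PLACEHOLDER_NAMES:
        row["name"] = row["domain"]  # fall back to the domain as the name
    if "amount" in row:
        row["amount"] = normalize_amount(row["amount"])
    return row


raw = {"website": "https://www.google.com", "name": "[not provided]", "amount": "$1M"}
unified = transform(raw)
channel = classify_channel("https://example.com/?utm_campaign=aug_ai_newsletter_123")
```

Timestamps and LLM-based extraction are left out of the sketch, since both depend heavily on the time zone handling and model you choose.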

At Avina, we make heavy use of the Python library pandas, which provides an extensive set of data manipulation utilities that make our lives much easier. We then orchestrate tasks using Celery and AWS ECS to run the transformations for each customer. Other excellent tools for transformations and task orchestration include dbt, Airflow, and Hatchet.
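The deduplication step deserves its own illustration, because matching on a single field misses cases like the john@meta.com / john@gmail.com example in the pipeline steps. One common approach, sketched here in plain Python rather than pandas, is to link any two records that share an identifier and then group the linked records with a union-find structure. The record fields and the choice of identifiers (email and LinkedIn URL) are assumptions for the example, and this is not a description of Avina's actual matching logic.

```python
from collections import defaultdict


def identifiers(contact: dict) -> set[str]:
    """Collect normalized keys that can prove two records are the same person."""
    keys = set()
    if contact.get("email"):
        keys.add("email:" + contact["email"].strip().lower())
    if contact.get("linkedin_url"):
        keys.add("linkedin:" + contact["linkedin_url"].strip().lower().rstrip("/"))
    return keys


def find(parent: list[int], i: int) -> int:
    """Return the root of i's group, shortening the path along the way."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def group_duplicates(contacts: list[dict]) -> list[list[dict]]:
    """Group records that share at least one identifier, directly or transitively."""
    parent = list(range(len(contacts)))
    first_seen: dict[str, int] = {}
    for index, contact in enumerate(contacts):
        for key in identifiers(contact):
            if key in first_seen:
                # Same identifier seen before: merge the two groups.
                parent[find(parent, index)] = find(parent, first_seen[key])
            else:
                first_seen[key] = index
    groups: dict[int, list[dict]] = defaultdict(list)
    for index, contact in enumerate(contacts):
        groups[find(parent, index)].append(contact)
    return list(groups.values())


def merge(group: list[dict]) -> dict:
    """Combine a group into one contact, keeping every distinct email."""
    emails = sorted({c["email"].strip().lower() for c in group if c.get("email")})
    linkedin = next((c["linkedin_url"] for c in group if c.get("linkedin_url")), None)
    return {"emails": emails, "linkedin_url": linkedin}


contacts = [
    {"email": "john@meta.com", "linkedin_url": "linkedin.com/in/john"},
    {"email": "john@gmail.com", "linkedin_url": "linkedin.com/in/john"},
    {"email": "jane@example.com", "linkedin_url": None},
]
groups = group_duplicates(contacts)
merged = [merge(group) for group in groups]
```

Here the two John records end up in one group because they share a LinkedIn URL, while Jane stays on her own. Fuzzy matching on names or companies would be a natural extension, at the cost of more false merges.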

Step 3: Enrichment

The third and final step in building our data architecture is to fill in data gaps using a waterfall of providers.

In order to give our GTM system the most complete context and allow us to close the loop by getting in touch with our leads, we’ll want to acquire the following information wherever it’s missing:

  • Company name, domain, LinkedIn url
  • Company firmographic info (employee count, revenue, industry, location)
  • Company product summary
  • Company technographic info (what tools do they use)
  • Contact name, email, phone, LinkedIn url, job title, location

Since it’s impossible to get 100% accuracy with any one data provider, we recommend using a pool of data providers (at Avina we use People Data Labs, Wiza, etc.). It’s also more economical to enrich information only when you need it - you don’t need to enrich the emails of all 500k contacts in your CRM if you are not going to email them right away, so a system that enriches in real time is helpful.
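Here is a minimal sketch of what a provider waterfall could look like: try providers in priority order, fill only the fields that are still missing, and stop as soon as the record is complete. The three provider functions are hypothetical stubs returning made-up data, not real vendor APIs, and the required field list is an assumption for the example.

```python
from typing import Callable

Provider = Callable[[dict], dict]

REQUIRED_FIELDS = ("email", "job_title", "linkedin_url")


# Hypothetical provider stubs; real ones would call a vendor's API.
def provider_a(contact: dict) -> dict:
    return {"job_title": "VP of Sales"}


def provider_b(contact: dict) -> dict:
    return {"email": "ashley@example.com", "job_title": "Vice President, Sales"}


def provider_c(contact: dict) -> dict:
    return {"email": "a.smith@example.com"}


def enrich(contact: dict, providers: list[tuple[str, Provider]],
           required: tuple[str, ...] = REQUIRED_FIELDS) -> tuple[dict, list[str]]:
    """Fill missing fields from providers in order; return the record and calls made."""
    enriched = dict(contact)
    calls: list[str] = []
    for name, provider in providers:
        missing = [f for f in required if not enriched.get(f)]
        if not missing:
            break  # record is complete, so skip the remaining (paid) lookups
        result = provider(enriched)
        calls.append(name)
        for f in missing:
            # Earlier providers win: only fields that are still empty get filled.
            if result.get(f):
                enriched[f] = result[f]
    return enriched, calls


waterfall = [("provider_a", provider_a), ("provider_b", provider_b),
             ("provider_c", provider_c)]
lead = {"name": "Ashley", "linkedin_url": "linkedin.com/in/ashley"}
enriched_lead, calls_made = enrich(lead, waterfall)
```

In this run the third provider is never called, because the first two already filled every required field. Calling enrich only at the moment a rep or workflow needs the contact is one way to get the on-demand behavior described above.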

Congratulations! With that we’ve completed the data foundation for our cutting-edge GTM engine by establishing a powerful single source of truth for all of our leads and target accounts 🎉. If that sounded like a lot of work, well Christmas came early and Santa just showed up with…

The Avina Customer Data Platform

Avina is unique among GTM automation platforms in that it automatically creates a full-fledged customer data platform (CDP) for you, taking all of this work completely off your plate. The CDP ensures that every layer of intelligence applied on top of it is working with the most complete, accurate, and fresh information. The system syncs and unifies all of your data sources every 2 hours.

For some, hearing CDP might give them PTSD from spending months wrangling dbt materializations and schema mappings (*cough* Bizible *cough* Salesforce Data Cloud). With Avina, implementation takes a full… brace yourselves… 30 min! Don’t believe me? Try it out yourself.

On top of feeding the most complete context to the AI, you have direct access to every touchpoint in the dashboard and direct access to the database itself.

On the other hand, if a tool does not integrate with these data sources or unify them effectively, it’s simply not possible for it to have all the context on your accounts. Any insights, scores, recommendations, and personalization will inevitably miss the mark if they are not aware of key information like deal history, past communications, and marketing engagement.

Customization

While Avina is plug and play and comes with default best practices out of the box, we know one size doesn’t fit all. On the CDP side alone, customization options range from custom deduplication logic, reading custom columns and objects, and data filters, to custom new hire and social keyword monitoring, and custom call transcript extraction.

For a detailed look into all of the customization available, stay tuned for a dedicated post where we explore the full range of possibilities.

Marketing and Sales Alignment

An added benefit of having a single source of truth is that it lays the groundwork to align the sales and marketing teams.

The best companies think of GTM as one continuous motion - companies that align their sales and marketing teams see 36% higher customer retention and 38% higher win rates. True alignment can only be achieved when the entire revenue org speaks the same language, measures the same metrics, and coordinates its actions around the same data.

What’s next

Now that we’ve laid a rock-solid data foundation - a unified, enriched, and fully customizable source of truth - it’s time to put that data to work ⚡️. Clean data sitting in a warehouse doesn’t close deals. The real value comes when that data brings the perfect leads to your doorstep, fuels the right plays, drives personalized messaging, and shows up at the exact moment your reps need it most.

In the next post, we’ll show you how to make that leap - from data to decisions. We’ll explore how AI can transform raw data into clear signals and guidance: who to target, what to say, and when to say it. You’ll see how we extract deep insights from customer behavior, spot intent early, and surface the highest-impact actions across your pipeline.

The Future of Revenue: From Chaos to Clarity

Part 1: The Single Source of Truth (you are here)

Part 2: Finding Signal in the Noise (coming soon)

Part 3: Revenue on Autopilot (coming soon)

Part 4: The New GTM Frontier (coming soon)

Footnotes:
  1. If, on the other hand, you feel like you’ve crushed your GTM stack and it’s meeting your goals, we would love to chat and understand how you did it!

* * *

Avina is a signals-based GTM automation platform for B2B teams that uses cutting-edge artificial intelligence to convert data into revenue.

Ready to move from chaos to clarity? Book a demo with us here.

Have questions, feedback, or ideas? Reach out to hello@avina.io - we would love to hear from you!

Here’s what our customers are saying:

“Avina has completely changed the way I work. Before, I felt like I was shooting in the dark to get responses, but now I have the highest propensity buyers being delivered to my doorstep, and I can focus on closing those deals.”
- James, Director of Strategic Partnerships, Ingenious (Case study)
“In sales, it’s all about getting more quality at-bats. Products that help do better cold outbound are faster horses - Avina’s warm inbound alerts is a car.”
- Ethan Pope, VP Revenue Operations, Nitrogen Wealth