Product-led growth (PLG) thrives on data. But as anyone who’s worked on a company’s customer relationship management (CRM) tool like Salesforce or HubSpot knows, not all data is created equal. For every clean CRM database, there are a dozen others riddled with unnecessary users, mismatched tables, and fossils of outdated accounts.

To get some insight into how best to organize PLG data collection and reporting using CRM resources, we sat down with Rachel Bradley-Haas, former data engineer and director at Heroku, and co-founder of Big Time Data. She shared data horror stories from her experience, as well as how she advises businesses to get their CRM mapped to exactly what their product needs.

Don’t expect PLG and CRMs to be BFFs from the get-go

First, the bad news. CRMs like Salesforce and HubSpot have required entities like Accounts, Deals, and Leads. Each has a strict definition and strict relationships to the others, and none has much to do with the freemium users at the heart of a PLG strategy. Trying to shove a PLG model directly into a CRM is a recipe for disaster.

“It terrifies me when people think going straight from their production database to Salesforce without any stop gaps is a brilliant idea,” said Rachel.

Salesforce ends up being a data junkyard rather than a sales tool, loaded with mounds of information that’s useless to your sales team, like individual usage time and freemium metrics. It’s bad for your engineering team too, because they are stuck with sales terms that are meaningless to them, like Contacts and Opportunities, and have to code workarounds for them. This leads to messy situations where Deals represent downloads and Contacts represent free users. You end up with a huge disconnect between what native CRM objects are meant to represent and what you’re using them to represent.
Much as you would untangle a disorganized cable setup, Rachel and her team rip out these wonky workarounds and replace them with a functional system that is more aligned with PLG goals and tactics.

How to strain out the right data

So what does a better system look like? Rachel advocates for the “warehouse-first approach,” where all your raw product data is stored in a data warehouse separate from your CRM. Ultimately, this leaves the CRM as a sales tool only. The initial “throwing all of the data-spaghetti at the CRM-wall and seeing what sticks” is no more. Instead, you map the data from your production database and pull out only what you think sales will want to see.

Here are two examples of how this could work:

1. A new organization starts using your product. All of the data concerning their usage is stored in the warehouse, and only sales-relevant information is sent to your CRM, such as:

● The name of the organization, which triggers an Account creation
● The name of the main user, which triggers an associated Contact creation
● And more

Now, instead of having to wade through the weeds of usage data, your sales team can use your CRM as it was intended.

“The concept of an Account or Contact doesn’t exist in a production database, therefore Salesforce owns those entities,” Rachel said. “Any additional entities (organizations, relationships, and product usage) are managed by the data warehouse in terms of creating, updating, and whatnot. This enables a mirroring system where we can determine what’s going on in the product and how to surface it in Salesforce for people to act on.”

2. A new freemium user starts using your product. Instead of having thousands of inactive free users in your CRM, you can set up a trigger or alert in your data warehouse. This can automatically create an Account in your CRM when an organization reaches 10 free users, for example.
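The threshold trigger in the second example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the setup Rachel's team uses: the `WarehouseUser` record, the `accounts_to_create` function, the payload field names, and the rule that the first user seen counts as the main contact are all hypothetical. In practice the rows would come from a warehouse query, and the payloads would be handed to a reverse ETL tool or the CRM's API.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed cutoff taken from the example above; sales and marketing pick the real value.
FREE_USER_THRESHOLD = 10


@dataclass(frozen=True)
class WarehouseUser:
    """One row of a hypothetical warehouse `users` table."""
    org_name: str
    email: str
    plan: str  # e.g. "free" or "paid"


def accounts_to_create(users, existing_accounts, threshold=FREE_USER_THRESHOLD):
    """Return CRM Account payloads for orgs that have reached the threshold.

    Only the organization name and a main contact are surfaced to the CRM;
    the raw usage data stays in the warehouse.
    """
    free_users_by_org = defaultdict(list)
    for user in users:
        if user.plan == "free":
            free_users_by_org[user.org_name].append(user.email)

    payloads = []
    for org_name, emails in sorted(free_users_by_org.items()):
        # Skip orgs below the cutoff and orgs the CRM already knows about.
        if len(emails) >= threshold and org_name not in existing_accounts:
            payloads.append({
                "account_name": org_name,
                # Assumption: the first free user seen is treated as the main contact.
                "main_contact_email": emails[0],
                "free_user_count": len(emails),
            })
    return payloads


if __name__ == "__main__":
    users = [WarehouseUser("Acme", f"user{i}@acme.example", "free") for i in range(10)]
    users.append(WarehouseUser("Solo", "me@solo.example", "free"))
    # Acme has 10 free users and qualifies; Solo has 1 and stays in the warehouse only.
    print(accounts_to_create(users, existing_accounts=set()))
```

Keeping the cutoff as a parameter means the rule can change without touching the CRM, which is the point of holding this logic in the warehouse layer.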
The exact numbers that would trigger Account creation can vary. It’s up to your sales and marketing teams, depending on when they feel it’s best to engage with the client.

This type of data modeling works well with PLG motions. Moving this logic into your data warehouse via data mapping:

● Reduces clog
● Keeps data relevant
● Accommodates clients anywhere on the spectrum from individual freemium to enterprise

If you build a warehouse, the data will come

So, let’s say your business could benefit from a data warehouse. The questions then become: which one should I use, and how do I set it up? Here are some options that Rachel recommended:

● Snowflake. The cream of the crop when it comes to data warehouses, though it is pricier than other options. It might not be needed, depending on the size of your data and what you’re trying to do with it.
● AWS Postgres. A PostgreSQL database on Amazon Web Services. A great small or mid-size option, especially if you’re already making use of other AWS tools.
● An ETL (Extract, Transform, and Load) tool. When your sales team updates information in your CRM, you’ll want to send it back to the warehouse to keep everything in sync. Fivetran, Stitch, and Airbyte are all good options.
● An orchestrator. Middleware that orchestrates the flow of data between your warehouse and CRM. Airflow and dbt are both options here (Airflow for scheduling pipelines, dbt for transforming data inside the warehouse), and Hightouch acts as a “reverse ETL” tool, sending data from your warehouse to the CRM.

As Rachel put it, using tools like these allows “engineering to focus on engineering things, building out the product, evolving, and ensuring that our understanding of the data coming out of the product is simplified and modeled in the way that people downstream can consume it and use it.”

Hiring the right people for data, not just data engineers

Getting your data warehouse and CRM to talk to each other via middleware requires specialized knowledge. It’s important to hire the right people to make it work.
Rachel shared some horror stories about companies that didn’t hire data analysts or data engineers. They ended up with systems that created more problems than they solved.

“It’s a complete disaster where they’re making Stripe API calls to Salesforce, generating all these useless records, and automation chaos,” she said. “No one has a sense of ownership because it’s too technical for the RevOps [Revenue Operations] analyst to understand. Engineering says, ‘I did exactly what you wanted, I’m done, this is no longer my scope.’ And now you’re stuck with this crappy thing and no one knows how it works.”

On the opposite end of the spectrum, Rachel warned against jumping the gun and hiring data engineers right off the bat. Their specialty is setting up databases, not drawing conclusions and business actions from them. This creates its own set of issues.

“What data engineers end up doing is building very nice glorified data piping tools, but don’t think through, ‘how is someone going to consume this data, how do we need to model it to make sure people can act on it?’ So then you have a very expensive data warehouse with great clean data in it, but not in a way anyone can understand or knows what’s going on. You end up having a lot of money spent but no action taken from it,” Rachel said.

To avoid both of these situations, she recommended hiring a data analyst first. The ideal analyst understands the data, knows the tools available, and can set up the initial components of a modern data stack. If the analyst becomes overwhelmed and can no longer both analyze the data and engineer changes to the warehouse, then it may be a good idea to hire a data engineer as well.

The creation of another new product: data

Hiring data analysts? Data engineers? Sounds like your product’s data is no longer just a tool to make sales. The data itself is turning into a product that needs to be managed.
Now you’ve got an entirely new way of looking at all this data, which is what you want. Rachel made the point that your data warehouse and your CRM should both be thought of as products, not just tools.

“You need to have someone managing those products,” she said. “There are so many unique connections and automations coming out of them to not treat your data warehouse and CRM as business critical products. They’re so core to everything being automated that, if you don’t, you’re kind of screwing yourself over.”

Data should be thought of as a product for a few reasons. One is the level of customization required to squeeze the most value out of it. For example, in your data warehouse, proper product usage monitoring requires a lot of setup. You need unique identifiers for the events a user can trigger while using your product, all of which need to be mapped to user IDs, organization IDs, component IDs, and more. And, as mentioned earlier, generic one-size-fits-all CRM setups are unlikely to cut it for more complex PLG entity models.

It’s scary but not impossible to get a great PLG tech stack

The reality is that getting a good PLG stack requires work. No tool will be perfect right out of the box, and even the best hires will need some time to get familiar with your product. But Rachel left us with some reassuring words, recommending that you “[Get] into your warehouse and attempt to represent what’s really going on in your product at the right levels, one piece at a time. It’s very doable once you take time to think it through and map it out.”

Be sure to check out these data management articles for more great steps to take, one piece at a time.

● 5 Common DataOps Mistakes (and How To Avoid Them)
● Developing an Effective Long-Term Data Strategy for Your Company
● Migrating Your CRM? 5 Common Pitfalls to Avoid
● Purge Your CRM to Avoid Information Overload
● 3 Simple Salesforce Hacks with Major Sales Impact