hey everyone, has anyone developed a predictive lead score based on their CRM data? For context, a few years ago I worked at a SaaS with hundreds of inbound trial sign ups, and we developed a lead score - both co-founders were developers, which helped - that would tell us the likelihood of someone converting to a paid user based on historical data and a few key variables - country, company size, whether they use their work address, company type, and so on. I'm wondering if anyone has ever created a lead score based on the entire CRM database and all of the various datapoints captured across all objects to determine the likelihood of any given contact or company converting to SQL or opening a new deal within the next 30 days, or something similar. The lead score would be updated when a key property is changed, a new object is created, a form is submitted, and so on. This should be technically feasible, and given the well-known frustrations of building manual lead scores with manual inputs, I'd be surprised if no one has ever tried to develop something similar before. I'm asking this now because I have access to a large database (hundreds of thousands of records) and I'm currently using Clay for data enrichment, so there's a lot of potential for developing such a predictive lead score.
Nuno P. this sounds like a great idea! With the recent advances in AI tech, it should be more straightforward to calculate a lead score. Which datapoints are you currently storing in your database that a lead score could be calculated from? Also, what did you use to generate the leads?
Feels like that’s the entire purpose of lead scoring in HubSpot, for example. Give every contact an engagement and/or ICP score…and every company a fit score.
Once you create the scoring model, and publish it, yes.
Sorry, don't tools like Apollo already have lead scores..?? Or have I misunderstood the context here?
Are you basing the lead score entirely on intent?
Pat H. what do you mean by creating the scoring model? So, the idea here would be to avoid that entirely. The predictive lead score does all the calculations for you, etc. Chandana P. I think it's a different context. 🙂 Jean T. based on everything 🙂 not just intent. The idea is that anything that happens in the CRM feeds that specific lead score which is automatically calculated for you.
Nuno P. if your goal is to predict the likelihood of a prospect converting, which I think it is, I wouldn't rely on CRM data only. As a couple of other people have said, you have data about the contact and all their interactions, both on and off your website and web properties... Then you have data about the account or company, provided that you sell a high-ticket product where the buying committee is multiple people. Then you have information about the companies: firmographics, tech used, leadership structure. These are three separate but related data sets. The idea is to take all the data that you have, plus the data that you don't yet have about the past but can look up retroactively, and do factor analysis, meaning: what combinations of factors (such as size, activity on the website, company structure, number of decision makers involved...) are shared among closed won deals? Once you have this understanding, you can build a model that weighs each factor (datapoint) and certain combinations accordingly. It's not going to be scores only, it's also going to be thresholds, meaning if an ICP-fit account collects a huge score because they are actively consuming your stuff and are all over your website, but, for example, they never book a demo, then they prob should be considered a hot lead... So it's gonna be a combination of scoring as well as putting people and companies into buckets... Hth 🙂
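To make the "scores plus thresholds" idea a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the field names, the weights, the cut-offs and the bucket names are made up, and real weights would come out of the analysis of closed won deals described above. How to treat the "high score, never booked a demo" case is a judgment call, so the sketch just gives it its own bucket.

```python
from dataclasses import dataclass

# Hypothetical weights and cut-offs, purely for illustration. In practice they
# would come from analysing which factors closed-won deals have in common.
WEIGHTS = {"icp_fit": 40, "web_visit": 2, "decision_maker": 10}
HOT_THRESHOLD = 80
WARM_THRESHOLD = 40


@dataclass
class Account:
    name: str
    icp_fit: bool
    web_visits_30d: int
    decision_makers: int
    booked_demo: bool


def score(account: Account) -> int:
    """Weighted sum of a few factors, with caps so one signal cannot dominate."""
    return (
        WEIGHTS["icp_fit"] * int(account.icp_fit)
        + WEIGHTS["web_visit"] * min(account.web_visits_30d, 20)
        + WEIGHTS["decision_maker"] * min(account.decision_makers, 3)
    )


def bucket(account: Account) -> str:
    """Combine the score with a threshold rule on demo bookings."""
    s = score(account)
    if s >= HOT_THRESHOLD:
        # High score but no demo gets its own bucket, so the team can decide
        # how to route it instead of burying the case inside a single number.
        return "hot" if account.booked_demo else "high_score_no_demo"
    if s >= WARM_THRESHOLD:
        return "warm"
    return "cold"


acme = Account("Acme", icp_fit=True, web_visits_30d=25, decision_makers=2, booked_demo=False)
print(score(acme), bucket(acme))  # 100 high_score_no_demo
```

The point of keeping the bucket separate from the number is that two accounts with the same score can still need different follow-up.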
Nice thinking, Nuno P. — I love this thread and how you're framing scoring. You're talking about building a global scoring model derived from everything in the CRM, then layering on enrichment from other tools. Salesforce Data Cloud, anyone? There are really two solid ways to approach this, and I totally see where Pat H. and Dan Rényi are coming from. Salesforce and HubSpot both offer out-of-the-box solutions for lead and account scoring. Personally, I like Marketing Cloud Account Engagement (formerly Pardot) for this — though it's often underutilized or not set up correctly. You're basically describing the two ends of a spectrum I see play out a lot: On one end, you've got the full-stack enrichment and scoring beast — Clay + Clearbit + Apollo + behavioral tracking + firmographic enrichment — all piped into a custom ML model. It's amazing if you've got the data, budget, and a dev team to keep it all running. It sounds like you had that previously. But it's not just about building the system — it's about maintaining trust in it and refining it over time. That can easily turn into a nightmare, and most teams fall short with the upkeep. Doesn't Chris Walker's Passetto aim to do something like this? On the other end, you've got a Salesforce Flow that updates a score or flag based on a few key fields (email type, job title, company size, country, intent signal). It's simple but effective — and more importantly, it gives teams a way to test hypotheses and start having better conversations about lead quality. That said, what you're describing is bigger than just lead scoring. You're talking about a predictive model that listens across the entire CRM — Leads, Contacts, Accounts, Opportunities, and even Custom Objects. That's a much more complex architecture. 
A Flow can definitely prototype this — it's quick, has no code, and is great for iterating — but once you're triggering logic across many objects and records, you'll likely hit DML limits or run into orchestration complexity. That's where Salesforce Data Cloud starts to make sense. It's designed to unify data across Salesforce (and beyond), consolidate it into a single profile, and apply segmentation, calculated insights, or even AI-driven predictions in real-time. If you're already deep in the Salesforce ecosystem, it's a powerful way to score based on everything without building brittle automation chains. So, wherever you start — I'd recommend this general path:
Diagnose what's working (and what's broken) inside Salesforce.
Score based on fields your team already tracks and trusts.
Layer in external enrichment only when it improves actionability.
You don't need to track every signal like it's a Chris Walker masterclass — just start with the 20% of inputs that drive 80% of conversions. Your sales team probably already knows what those are. Start there, and refine.
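One rough way to check which inputs belong in that 20% is to compare historical conversion rates per field value. The sketch below is only an illustration in Python: the records, field names and the `converted` flag are invented, and a real run would use a CRM export with far more rows.

```python
from collections import defaultdict


def conversion_rates(records, field):
    """Map each value of `field` to (converted, total, conversion rate)."""
    counts = defaultdict(lambda: [0, 0])
    for record in records:
        tally = counts[record[field]]
        tally[0] += int(record["converted"])
        tally[1] += 1
    return {value: (won, total, won / total) for value, (won, total) in counts.items()}


# Made-up historical records; real ones would come from a CRM export.
history = [
    {"email_type": "work", "country": "US", "converted": True},
    {"email_type": "work", "country": "DE", "converted": True},
    {"email_type": "work", "country": "US", "converted": True},
    {"email_type": "work", "country": "DE", "converted": False},
    {"email_type": "free", "country": "US", "converted": False},
    {"email_type": "free", "country": "DE", "converted": True},
    {"email_type": "free", "country": "US", "converted": False},
    {"email_type": "free", "country": "DE", "converted": False},
]

for field in ("email_type", "country"):
    print(field, conversion_rates(history, field))
```

In this toy data, email type separates converters from non-converters (75% vs 25%) while country does not (50% either way), which is the kind of signal that would justify keeping one field and dropping the other.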
I apologise for taking so long to reply, Dan Rényi, Tyson P. I still need to work on this idea, but the way I used predictive lead scores before, it was all done for me. We would assign a conversion likelihood to every free trial we received and prioritise accordingly. It was based on a few properties, and the model determined the weights. We weren't the ones saying that the 'country' property is worth 20%, employee size is 10%, etc. The model determined that by itself. So I didn't have to worry about how the model was built. All I had to do was choose which properties to include in the model, and it would do the rest. So, all I'm thinking about is how I can extend that simple model (how likely is it that this free trial user will convert to a paid user within the next 30 days based on these five properties?) to a more complex one (how likely is it that this company will become a customer within the next 90 days based on all of the properties we find in the CRM?). The complex model would "run" the analysis and update the conversion likelihood whenever a key event occurred, and those events would have to be specified. Maybe I need to be less ambitious and do what Tyson P. suggests. P.S. All this was done on HubSpot, so I don't know how this would work on Salesforce. I'm also using HubSpot at the moment. To be honest, I was even thinking about developing this with a developer friend of mine and releasing it as a HubSpot marketplace plugin.
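For anyone wondering what "the model determined the weights" can look like in practice, here is a small Python sketch using logistic regression trained by plain gradient descent. This is an assumption about the general technique, not a description of what the original team built, and the three yes/no properties and the ten training rows are invented.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, labels, lr=0.5, epochs=2000):
    """Fit logistic regression weights with batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Estimated probability that a trial with properties `x` converts."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Invented trials: [work_email, company_50_plus, target_country] -> converted?
trials = [
    [1, 1, 1], [1, 1, 0], [1, 0, 1], [1, 0, 0], [0, 1, 1],
    [0, 1, 0], [0, 0, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0],
]
converted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 1]

weights, bias = train(trials, converted)
print("learned weights:", [round(w, 2) for w in weights])
print("strong fit:", round(predict(weights, bias, [1, 1, 1]), 2))
print("weak fit:", round(predict(weights, bias, [0, 0, 0]), 2))
```

Nobody types in "country is worth 20%" here; you only choose which properties go in, and the fit works out the weights. The same structure would apply to the 90-day company-level question, just with more inputs and a retrain or rescore step whenever one of the specified key events fires.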
Yeah, that totally makes sense, Nuno P., and you're definitely not alone in that thinking. I dig the ambition. I think like this, too, but the reality of most clients' CRMs usually forces me to take a simpler approach! haha That trial conversion model is honestly how more scoring setups should start. It's focused, property-driven, and tied to a specific action, like converting in 30 days. What you're exploring now — extending that logic to a full company becoming a customer — is definitely doable. It just takes more orchestration. My only concern would be around the AI model acting like a bit of a black box, but it sounds like you and your dev team handled that well. You're right that the key is defining the right trigger events and making sure your model is powered by clean, consistent data. Both HubSpot and Marketing Cloud offer out-of-the-box scoring and prediction models, but they tend to be more manual and less flexible than a proper predictive approach. That's why I always recommend starting with the simplest possible prototype, proving the signal is there, and then scaling the complexity. Also, here are some tried-and-true CRM triggers that work well for Account-level scoring, especially if you're using enrichment services and have them fully integrated:
Closed Won deals on the Account (total, recent, or within a 90-day window)
Average days to Closed Won (great for measuring velocity)
Employee size (firmographic, often from enrichment)
Number of related Contacts (can reflect buying committee complexity)
Rollup fields worth tracking:
Total lifetime value (LTV)
Total Opportunities
Last activity date (if Einstein Activity Capture or phone logging is in place)
Website form submissions across Contacts
Marketing email opens and clicks
Product usage (if available via integration)
Since you're using HubSpot as the hub for the predictive model, but most of the raw data lives in Salesforce, you'll likely need to calculate rollups and custom metrics in Salesforce first (rollup fields, rollup helper, or similar), then sync them over to HubSpot. The native integration handles a lot, but scoring across multiple objects usually takes some setup to get clean, reliable data into the Company object where it can be used effectively. Building a plugin to automate this logic could be a great move. You've clearly got the context and experience to bring it to market. Good luck, and happy to keep the convo going anytime.
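As a rough illustration of the rollup step, here is a Python sketch that aggregates contact-level and deal-level rows into one set of company-level fields. The dictionary keys, the `closed_won` stage label and the 90-day window are assumptions for the example, and this is plain Python rather than actual Salesforce rollup fields or the HubSpot sync.

```python
from datetime import date, timedelta


def company_rollups(contacts, deals, today, window_days=90):
    """Aggregate contact and deal rows into one feature dict per company."""
    cutoff = today - timedelta(days=window_days)
    rollups = {}

    def row(company_id):
        return rollups.setdefault(company_id, {
            "contacts": 0, "form_submissions": 0, "total_opportunities": 0,
            "closed_won": 0, "closed_won_recent": 0, "last_activity": None,
        })

    for contact in contacts:
        r = row(contact["company_id"])
        r["contacts"] += 1
        r["form_submissions"] += contact["form_submissions"]
        last = contact["last_activity"]
        if last is not None and (r["last_activity"] is None or last > r["last_activity"]):
            r["last_activity"] = last

    for deal in deals:
        r = row(deal["company_id"])
        r["total_opportunities"] += 1
        if deal["stage"] == "closed_won":
            r["closed_won"] += 1
            if deal["close_date"] >= cutoff:
                r["closed_won_recent"] += 1
    return rollups


# Invented sample rows standing in for a CRM export.
contacts = [
    {"company_id": "acme", "form_submissions": 2, "last_activity": date(2024, 6, 1)},
    {"company_id": "acme", "form_submissions": 1, "last_activity": date(2024, 6, 20)},
    {"company_id": "globex", "form_submissions": 0, "last_activity": None},
]
deals = [
    {"company_id": "acme", "stage": "closed_won", "close_date": date(2024, 5, 15)},
    {"company_id": "acme", "stage": "closed_won", "close_date": date(2023, 1, 10)},
    {"company_id": "acme", "stage": "open", "close_date": None},
]

features = company_rollups(contacts, deals, today=date(2024, 6, 30))
print(features["acme"])
```

Each company ends up with one flat record, which is the shape a score on the Company object needs regardless of which system calculates it.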
Yeah. It always is, man! It's such an expensive mess to fix that we have to get creative with how we work around that enormous elephant in the room.
Nuno P. "... converting to SQL or opening a new deal within the next 30 days, or something similar." SQL and new deals are usually very different. Nevertheless, you are poking at what is called a time to won model. It is a curve showing the percent that are won (can be converted as well but far less useful) as a function of time since a starting event (opp creation, MQL conversion date, whatever...). We have many blogs on how to do this. Here's one (with examples and a recipe). https://www.funnelcast.com/post/win-rates-are-curves-not-numbers Or you can use our free product. Have fun with this. It's a very robust and powerful way to model demand generation and a sales pipeline. DM me if you want help.
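For a feel of what a time-to-won curve is, here is a simplified Python sketch: the share of opportunities won within t days of the starting event, for each t up to a horizon. It is not the recipe from the linked post. The sample dates are invented, and dropping opportunities that are too young to have had the full horizon is a simplifying assumption made here.

```python
from datetime import date


def time_to_won_curve(opps, as_of, horizon_days):
    """Cumulative fraction won by day t since the start event, t = 0..horizon.

    `opps` is a list of (start_date, won_date or None). Opportunities that
    started fewer than `horizon_days` before `as_of` are left out, because
    they have not yet had a full horizon in which to be won.
    """
    mature = [(start, won) for start, won in opps if (as_of - start).days >= horizon_days]
    if not mature:
        return []
    curve = []
    for t in range(horizon_days + 1):
        won_by_t = sum(
            1 for start, won in mature
            if won is not None and (won - start).days <= t
        )
        curve.append(won_by_t / len(mature))
    return curve


# Invented opportunities: (created, won) with None for not won so far.
opps = [
    (date(2024, 1, 1), date(2024, 1, 11)),  # won after 10 days
    (date(2024, 1, 1), date(2024, 1, 31)),  # won after 30 days
    (date(2024, 1, 1), None),               # never won
    (date(2024, 1, 1), date(2024, 4, 30)),  # won, but after the 60-day horizon
    (date(2024, 6, 1), date(2024, 6, 5)),   # too recent, excluded from the cohort
]

curve = time_to_won_curve(opps, as_of=date(2024, 6, 30), horizon_days=60)
print(curve[10], curve[30], curve[60])  # 0.25 0.5 0.5
```

Reading the curve at 30 days gives the "won within 30 days" rate directly, and the same function works with any starting event (opp creation, MQL conversion date, and so on).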
