A Customer Fit model predicts who would be a good fit for your product based on who the person or company is. It uses only firmographic, demographic, and technographic data and does not include any behavioral or intent data points. The Likelihood to Buy model, by contrast, typically includes behavioral data to predict whether a prospect is ready to buy. Learn more about the different models.
Now that you have this in mind, your next question would likely be: which features exactly are used in the model?
What variables are used in the model?
The list of Computations marked LIVE in the Data Studio tells you which variables are used in the model. A computation can be used in the Decision Trees, in the Overrides, or in the Signals.
- Go to the Data Studio (studio.madkudu.com)
- Click on Computations
- If you have a model already live, the page should already be filtered to show you the Live computations
And your next question is probably: for each variable, what are the values that matter? How does the score change depending on the value? We call the value of a Computation (or variable) a Trait. For example, if the computation is "Company country", a Trait is "United States".
What traits are important in the scoring?
In a point-based model, it's simple: the higher the number of points associated with a trait (for example "If industry = Internet Software & Services, then 100") the more important this variable and trait are in the scoring.
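As a rough illustration of the point-based logic, here is a minimal Python sketch. The point table, the computation names, and the values other than the "Internet Software & Services" example are hypothetical and not taken from any real model; the sketch only assumes that a lead's score is the sum of the points of the traits it matches.

```python
# Hypothetical point table: (computation, trait) -> points.
# Only the industry example comes from this article; the rest is made up.
POINTS = {
    ("industry", "Internet Software & Services"): 100,
    ("company_country", "United States"): 40,
    ("email_type", "personal"): -50,
}


def score_lead(lead):
    """Sum the points of every (computation, trait) pair the lead matches."""
    return sum(
        POINTS.get((computation, trait), 0)
        for computation, trait in lead.items()
    )


def traits_by_importance():
    """Rank traits by the absolute number of points they carry."""
    return sorted(POINTS.items(), key=lambda item: abs(item[1]), reverse=True)


lead = {
    "industry": "Internet Software & Services",
    "company_country": "United States",
}
print(score_lead(lead))
print(traits_by_importance()[0])
```

Sorting by absolute value treats a large penalty as just as influential as a large bonus, which is one reasonable reading of "important" here.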
In a decision tree-based model, it's not as straightforward. We recommend first reading the following articles:
- how to read a Decision Tree model
- how the customer fit score is calculated
To understand the most important traits, look at the conditions used in:
- the Overrides, which include the rules implemented on top of the statistical model to force the score of leads regardless of historical data.
- the Trees, and more specifically the Split conditions of the nodes. The higher in the tree a split condition using a given computation and trait appears, the more impact this trait has on the scoring.
Usually, the first split separates business emails from personal, spam, and student emails. In this case, the scoring relies heavily on whether the lead is using a personal or a business email.
Then within the business emails, let's say you want to make a split about company size. The split condition might look like employee > 100 and it separates the population into the SMBs versus the MidMarket/Ent companies. If each group from the split has very different conversion rates, it means the company size is also a strong influencer of the score.
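To make the company-size example concrete, the sketch below splits a small set of leads on employee > 100 and compares the conversion rate of each group. The leads, field names, and helper functions are hypothetical and only illustrate the idea; they do not reflect how the Data Studio computes its splits.

```python
def conversion_rate(leads):
    """Share of leads in the group that converted (0.0 for an empty group)."""
    if not leads:
        return 0.0
    return sum(1 for lead in leads if lead["converted"]) / len(leads)


def split_by_employees(leads, threshold=100):
    """Apply the split condition `employees > threshold`."""
    above = [lead for lead in leads if lead["employees"] > threshold]
    below = [lead for lead in leads if lead["employees"] <= threshold]
    return below, above


# Made-up historical leads, for illustration only.
leads = [
    {"employees": 20, "converted": False},
    {"employees": 50, "converted": False},
    {"employees": 80, "converted": True},
    {"employees": 10, "converted": False},
    {"employees": 500, "converted": True},
    {"employees": 2000, "converted": True},
    {"employees": 150, "converted": False},
    {"employees": 900, "converted": True},
]

smb, midmarket_ent = split_by_employees(leads)
smb_rate = conversion_rate(smb)
midmarket_ent_rate = conversion_rate(midmarket_ent)
print(smb_rate, midmarket_ent_rate)
```

A large gap between the two rates is what signals that company size strongly influences the score.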
You can browse through the trees to understand the computations and traits used. If this seems cumbersome, you can go straight to the next question for a better method.
What makes a very good lead versus a low-fit lead?
Isn't this what your Sales team and execs are actually asking you? This may have been covered during your first onboarding with MadKudu, but if you were not around or it was some time ago, you may not have the answer at hand. You can get it through the Data Studio.
There are 2 ways to understand this question.
1. What does the data say about which leads are good versus bad?
This is the data-driven question. To answer it, focus only on the analysis of the decision tree(s) that model and rank your leads by conversion rate. This means analyzing your past leads to understand which ones converted better than others.
Here is how to get this information:
- Go to the Data Studio (studio.madkudu.com)
- Click on your Customer Fit model
- Click on the Model tab, then Thresholds
- Understand the weight of each Tree, and keep this page in mind because you'll need it again
- If Tree 1 predictions have a weight of 80% for the score, Tree 2 and 3 are marginal because together they only make up 20% of the score. This means most of the score is coming from Tree 1.
- Now, go to the Tree tab.
- You can navigate in the tree, as described above in this article, or go straight to the table Conversion rates by nodes at the bottom left.
- You see the list of all the nodes of the trees, ranked by descending conversion rate. The first nodes represent your best-converting leads (the very good leads), while the bottom ones represent the low fit leads. Learn more about how to read this table.
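To see why a tree with an 80% weight dominates the score, the sketch below blends per-tree predictions as a weighted average. This is only an assumption made for illustration: the article does not specify how tree predictions are combined, and the weights and predicted conversion rates are hypothetical.

```python
def blend(predictions, weights):
    """Weighted average of per-tree predictions (an assumed combination rule)."""
    total_weight = sum(weights.values())
    return sum(predictions[tree] * weights[tree] for tree in weights) / total_weight


def contribution_share(tree, predictions, weights):
    """Fraction of the blended value that comes from a single tree."""
    total = sum(predictions[t] * weights[t] for t in weights)
    return predictions[tree] * weights[tree] / total


# Hypothetical weights (in percent) and predicted conversion rates.
weights = {"tree_1": 80, "tree_2": 10, "tree_3": 10}
predictions = {"tree_1": 0.10, "tree_2": 0.02, "tree_3": 0.04}

blended = blend(predictions, weights)
tree_1_share = contribution_share("tree_1", predictions, weights)
print(round(blended, 3), round(tree_1_share, 2))
```

With these made-up numbers, almost all of the blended value comes from Tree 1, which is why it makes sense to analyze that tree first.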
So where is the limit between very good, good, medium, and low? Great question! The answer is in the Thresholds tab you were on a moment ago.
The thresholds tell you the minimum conversion rate a node must have to be labeled "very good", "good", "medium", or "low" by the model.
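The relationship between node conversion rates and thresholds can be sketched as follows. The threshold values and the conversion rates are hypothetical; use the ones shown in your own Thresholds tab and Conversion rates by nodes table.

```python
# Hypothetical thresholds: minimum conversion rate required for each segment,
# listed from the most to the least demanding.
THRESHOLDS = [("very good", 0.10), ("good", 0.05), ("medium", 0.02)]


def segment_for(rate):
    """Return the first segment whose minimum conversion rate is met."""
    for label, minimum in THRESHOLDS:
        if rate >= minimum:
            return label
    return "low"


# Made-up conversion rates keyed by node number.
nodes = {6: 0.18, 12: 0.06, 21: 0.03, 38: 0.004, 39: 0.001}

# Rank nodes by descending conversion rate, as in the table in the Tree tab.
ranked = sorted(nodes.items(), key=lambda item: item[1], reverse=True)
labels = {node_id: segment_for(rate) for node_id, rate in ranked}

for node_id, rate in ranked:
    print(node_id, rate, labels[node_id])
```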
Then you can find in the trees the definition of each node to understand what is the population in this node.
In this example, node #6 contains the very large traffic US companies in the Internet Software industry, while node #38 at the bottom contains personal emails, node #39 contains spam emails, and so on. This reveals that leads from very large traffic US companies in the Internet Software industry are very good leads.
2. What populations are scored very good versus low by the model?
If the question is about understanding the output of the model, look at all of the components of the model. These include both what the historical data says (in the Trees) and the rules added on top of the predictions (the Overrides) based on Sales feedback, Marketing strategy, etc.
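The way overrides sit on top of the tree predictions can be pictured with the minimal sketch below. The rules, field names, and the first-matching-rule-wins behavior are assumptions made for illustration; check the Overrides in the Data Studio for the rules and ordering your model actually uses.

```python
def apply_overrides(lead, model_segment, overrides):
    """Return the forced segment of the first matching override,
    otherwise keep the segment predicted by the trees."""
    for condition, forced_segment in overrides:
        if condition(lead):
            return forced_segment
    return model_segment


# Hypothetical override rules: (condition, forced segment).
overrides = [
    (lambda lead: lead.get("email_type") == "spam", "low"),
    (lambda lead: lead.get("is_target_account", False), "very good"),
]

print(apply_overrides({"email_type": "spam"}, "good", overrides))
print(apply_overrides({"email_type": "business"}, "medium", overrides))
```

In this sketch, a lead that matches no override keeps the segment derived from the historical data, which mirrors the idea that overrides only force the score for specific populations.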