What the Customer Fit Model is not
The Customer Fit score, when built from decision trees, is not the sum of points associated with each demographic or firmographic trait. MadKudu's Customer Fit model does not use a point-based system but a decision tree model, which produces a score based on the historical analysis of your conversions. Think of it as a "lookalike score".
With a point-based model, how do you know whether to attribute 100 points to the industry "Software" and 10 to "Professional Services"? How do you know whether the industry should weigh more or less than the company size in the scoring? How do you account for the fact that SMB companies aren't a good fit for you except in very specific countries like France and the UK?
This is where you'll end up trying to build a model with conditional rules, such as:
- Rule 1: IF company size > 100 AND country NOT IN (UK, France) THEN X points
- Rule 2: IF company size < 100 AND company size > 10 AND country IN (UK, France) THEN X points
Moreover, these rules use only 2 variables (company size and country). Imagine what a nightmare they would be to maintain or make sense of if you added the industry, the web traffic, etc.
The need for a decision tree
This is why a decision tree model is easier to build! The rules described above essentially become the node definitions. Each node of the tree represents a conditional rule and has a conversion rate associated with it (its point value, if you wish). The beauty here is that no guesswork was needed to assign a value to the node. It is purely based on historical analysis: in the past, have SMB companies in the UK and France converted well or not?
In other words, the trees are essentially a visual representation of all of the rules, and points are automatically assigned to these rules.
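To make the idea of "a node is a rule with a conversion rate" concrete, here is a minimal Python sketch. Everything in it is hypothetical: the historical leads, the `node_for` splits and the node labels are invented for illustration. A real decision tree also learns the splits from the data, whereas this sketch hard-codes them and only measures the conversion rate of each node.

```python
from collections import defaultdict

# Hypothetical historical leads: (company_size, country, converted).
HISTORY = [
    (50, "UK", True),
    (30, "France", True),
    (80, "UK", False),
    (40, "Germany", False),
    (20, "US", False),
    (500, "US", True),
    (900, "Germany", False),
    (300, "UK", True),
]


def node_for(company_size, country):
    """Return the node a lead falls into (hand-written, hypothetical splits)."""
    if company_size < 100:
        if country in ("UK", "France"):
            return "SMB in UK/France"
        return "SMB elsewhere"
    return "100+ employees"


def conversion_rates(history):
    """Measure the conversion rate (in %) of each node on past leads."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for size, country, converted in history:
        node = node_for(size, country)
        totals[node] += 1
        wins[node] += int(converted)
    return {node: 100 * wins[node] / totals[node] for node in totals}


rates = conversion_rates(HISTORY)
for node, rate in rates.items():
    print(f"{node}: {rate:.0f}% conversion")
```

No one decides by hand how many "points" SMB companies in the UK and France deserve: the value of the node is simply what the history shows.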
Understanding a score through an example
Let's take a practical example.
Here is how firstname.lastname@example.org ended up with a medium Customer Fit segment and a Customer Fit score of 68.
Prerequisite: you might want to read this article to understand how decision trees work.
- The lead falls into one specific node of each tree: the node whose definition matches the lead's enrichment data. Each node has a conversion rate. For example, email@example.com falls into:
  - node 10 of Tree 1, which contains the mid-market UK companies in the Professional Services industry. This node has a conversion rate of 23%.
  - node 16 of Tree 2, which contains Product Managers in mid-market companies. This node has a conversion rate of 19%.
  - node 6 of Tree 3, which contains companies with a low number of techs. This node has a conversion rate of 3%.
- The score (before being normalized) is the weighted average of these conversion rates. Each tree has its own weight in the scoring. The weights are configured to give the best-performing tree the heaviest weight and, overall, to optimize the performance of the model. In our example, Tree 1 has the heaviest weight, and Tree 3 is not used at all.
The score is:
[ weight Tree 1 x CvR node 10 + weight Tree 2 x CvR node 16 + weight Tree 3 x CvR node 6]
= [ 70% x 23 + 30% x 19 + 0% x 3 ]
= 21.8 = unscaled score (or "raw score")
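The same calculation can be written as a short Python sketch. The conversion rates and weights are the ones from the example above; the names `node_conversion_rates`, `tree_weights` and `raw_score` are only illustrative and are not part of any MadKudu API.

```python
# Conversion rate (in %) of the node the lead falls into, for each tree.
node_conversion_rates = {"tree_1": 23.0, "tree_2": 19.0, "tree_3": 3.0}

# Weight of each tree in the scoring; the weights sum to 1.
tree_weights = {"tree_1": 0.70, "tree_2": 0.30, "tree_3": 0.0}


def raw_score(rates, weights):
    """Weighted average of the node conversion rates (unscaled score)."""
    return sum(weights[tree] * rates[tree] for tree in rates)


score = raw_score(node_conversion_rates, tree_weights)
print(round(score, 1))  # 21.8
```

Because Tree 3 has a weight of 0, its 3% conversion rate has no effect on the result.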
- The segment thresholds define which ranges of conversion rate are labelled very good, good, medium or low. They are configured in the Thresholds tab of your model in the Data Studio. These thresholds are unique to your model and are adjusted based on its performance on the training dataset. They are set to control the distribution of leads per segment: we want 10-15% of leads in very good, 20-25% in good, 30-35% in medium, and the rest in low, while the majority of conversions are identified as very good or good. Learn more about model performance.
In this example, the thresholds of the model were set at 10, 22 and 45. This means that any lead with an unscaled score (the score calculated in step 2) above 45 is scored very good, a lead between 22 and 45 is scored good, a lead between 10 and 22 is scored medium, and a lead below 10 is scored low. firstname.lastname@example.org, with a score of 21.8, is between 10 and 22, so it is medium.
You may also be interested in this article to understand these conversion rate thresholds.
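As a rough sketch, the segment assignment can be expressed in a few lines of Python. The thresholds 10, 22 and 45 come from the example; the function name `segment` is hypothetical, and how a score that sits exactly on a threshold is treated is an assumption (here it stays in the lower segment).

```python
def segment(raw_score, thresholds=(10, 22, 45)):
    """Map an unscaled score to a segment label.

    Assumption: a score exactly equal to a threshold stays in the lower segment.
    """
    low_max, medium_max, good_max = thresholds
    if raw_score > good_max:
        return "very good"
    if raw_score > medium_max:
        return "good"
    if raw_score > low_max:
        return "medium"
    return "low"


print(segment(21.8))  # medium
```

Only the `thresholds` tuple changes from one model to another; the labels stay the same.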
- The normalization makes sure the output of the model is a score between 0 and 100 and that the score range of each segment is always the same:
- very good: 85 to 100
- good: 70 to 84
- medium: 50 to 69
- low: 0 to 49
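In our example, the thresholds were chosen as described in step 3, and we have the following mapping between the internal thresholds and the model output thresholds (the ones you see in your CRM):
- internal threshold 10 → output threshold 50 (low / medium)
- internal threshold 22 → output threshold 70 (medium / good)
- internal threshold 45 → output threshold 85 (good / very good)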
Why? Imagine explaining to your Sales team that a very good score is above 45, a good score is between 22 and 45, etc., and that these numbers change every time you update the model. Good luck with that! This is why we apply a normalization. It is done automatically by the Data Studio, with no configuration needed, and it translates the internal segment thresholds described in step 3 into the external thresholds your team sees in the CRM (0, 50, 70, 85).
How? A formula spreads the score before normalization across the corresponding normalized bucket. email@example.com, with a medium segment and an unscaled score of 21.8, is near the top of the medium bucket (the top being 22), and is therefore mapped to 68 when normalized.
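The exact formula is not given here, so the following Python sketch is only one plausible reading: a linear interpolation inside each bucket, rounded down. The `BUCKETS` table, the upper bound of 100 for the raw score and the rounding rule are all assumptions. Under these assumptions, the sketch maps the raw score of 21.8 from the example to 68.

```python
import math

# (raw_low, raw_high, out_low, out_high) for each segment.
# Raw bounds 10, 22, 45 come from the example; 0 and 100 are assumed limits.
BUCKETS = [
    (0.0, 10.0, 0, 49),      # low
    (10.0, 22.0, 50, 69),    # medium
    (22.0, 45.0, 70, 84),    # good
    (45.0, 100.0, 85, 100),  # very good
]


def normalize(raw_score):
    """Spread a raw score linearly across its normalized bucket (assumed formula)."""
    raw_score = min(max(raw_score, 0.0), 100.0)  # clamp to the assumed raw range
    for raw_low, raw_high, out_low, out_high in BUCKETS:
        if raw_score <= raw_high:
            fraction = (raw_score - raw_low) / (raw_high - raw_low)
            return out_low + math.floor(fraction * (out_high - out_low))
    raise AssertionError("unreachable: the raw score is clamped to 100")


print(normalize(21.8))  # 68
```

When the internal thresholds change after a model update, only the raw bounds in the table move; the output ranges your team sees in the CRM stay at 0, 50, 70 and 85.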