- You have access to the Data Studio
MadKudu's Data Studio is the platform that lets you easily build predictive models and segmentations. This article focuses on the Data Studio capabilities for Customer Fit models.
At a high level, the platform allows you to:
- understand which firmographic, demographic or technographic traits of leads are correlated with their conversion
- build a decision-tree based or point-based model from those traits
- adjust the thresholds between the customer fit segments (very good, good, medium, low)
- create the signals of the model
- validate the performance of the model on a validation dataset from your CRM or uploaded via CSV file
- preview a sample of scored leads
Now let's get into the details.
If you open a Customer Fit model, you will see the following tabs.
In this section, you can
- see some quick info on the creation and update date of the model
- save notes about the model
- quickly access some parts of the model
- get some sanity check data about the training and validation datasets that you uploaded. Learn more about the Customer Fit training and validation dataset.
In the Insights tab, see which firmographic, demographic and technographic traits are important conversion factors for your ideal customer profile. You'll find a great article on how to read these graphs here.
This section groups all the parameters that can be changed to configure the model.
If the model is a Decision Tree
We've got a full article about it just for you here.
The TL;DR: decision trees classify leads into populations with very different conversion rates, so that you can tell the high-performing populations from the low-performing ones.
If the model is point-based
For a point-based model, the logic behind the score of a lead is more straightforward. The score of a lead is the sum of all the points associated with each rule the lead complies with.
For example, if we have the following conditions:
- WHEN is_personal = 1 THEN -30
- WHEN pers_title = 'CTO' THEN 50
then a lead with a personal email but a CTO job title will have a score of -30 + 50 = 20.
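The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the "sum the points of every matching rule" logic, not how the Data Studio is implemented; the `RULES` table and the `point_based_score` function are hypothetical names, and the lead is represented as a plain dictionary.

```python
# Hypothetical rule table mirroring the two conditions above.
# Each rule is (description, predicate, points).
RULES = [
    ("is_personal = 1", lambda lead: lead.get("is_personal") == 1, -30),
    ("pers_title = 'CTO'", lambda lead: lead.get("pers_title") == "CTO", 50),
]


def point_based_score(lead):
    """Sum the points of every rule the lead complies with."""
    return sum(points for _, matches, points in RULES if matches(lead))


# A lead with a personal email and a CTO job title.
lead = {"is_personal": 1, "pers_title": "CTO"}
print(point_based_score(lead))  # 20
```

A lead that matches no rule gets a score of 0 in this sketch, since no points are added.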
A full article is available here to give you more details on creating or editing the rules of your point-based model.
If the model is a Decision Tree
The Threshold tab allows you to group the different decision trees together and configure the conversion rate thresholds to define the customer fit segments.
In this example:
- Tree #1 is allocated a weight of 50% in the total score of the lead
- Tree #2 is allocated a weight of 30% in the total score of the lead
- Tree #3 is allocated a weight of 20% in the total score of the lead
Learn more here: How is the Customer Fit score calculated.
- The threshold between Low and Medium is a 3% conversion rate. This means that all the leads in the nodes of the trees with a conversion rate lower than 3% will be scored low.
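One way to picture how the weights and the threshold interact is sketched below. It assumes that each tree places the lead in a node with a known conversion rate and that the total is the weighted average of those rates. That formula, the function names and the sample rates are assumptions made for illustration; only the 50/30/20 weights and the 3% Low/Medium threshold come from the example above, and the linked article describes the actual calculation.

```python
# Weights from the example above; everything else is hypothetical.
TREE_WEIGHTS = {"tree_1": 0.5, "tree_2": 0.3, "tree_3": 0.2}
LOW_MEDIUM_THRESHOLD = 0.03  # 3% conversion rate


def weighted_conversion_rate(node_rates):
    """Weighted average of the conversion rate of the node the lead
    falls into in each tree (assumed combination rule)."""
    return sum(TREE_WEIGHTS[tree] * node_rates[tree] for tree in TREE_WEIGHTS)


def is_low(node_rates):
    """True when the combined rate is under the Low/Medium threshold."""
    return weighted_conversion_rate(node_rates) < LOW_MEDIUM_THRESHOLD


# Hypothetical lead: weak nodes in trees 1 and 2, a strong node in tree 3.
rates = {"tree_1": 0.01, "tree_2": 0.02, "tree_3": 0.08}
print(is_low(rates))
```

Because Tree #1 carries half of the weight in this sketch, a weak node there pulls the combined rate down more than a strong node in Tree #3 can lift it.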
Learn more here how to optimize the performance of a model.
If the model is point-based
To make things easier for the end-users of the scoring, you may want to use a label (or "segment") instead of a score. For example, does your team know if a score of 20 is a good score because it's 20 out of 30 or it's a bad score because it's 20 out of 100?
The Threshold tab allows you to configure the 4 available segments: very good, good, medium, low. You can set the thresholds that define the minimum score for each segment.
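As a rough illustration of how minimum-score thresholds turn a score into a segment, here is a minimal Python sketch. The threshold values (80, 50, 20) and the names `SEGMENT_MINIMUMS` and `segment_for` are made up for the example; your own model will use the values you set in the Threshold tab.

```python
# Hypothetical minimum scores, ordered from the best segment down.
SEGMENT_MINIMUMS = [
    ("very good", 80),
    ("good", 50),
    ("medium", 20),
]


def segment_for(score):
    """Return the first segment whose minimum score is reached,
    falling back to 'low' when none is."""
    for segment, minimum in SEGMENT_MINIMUMS:
        if score >= minimum:
            return segment
    return "low"


print(segment_for(20))  # medium
print(segment_for(85))  # very good
```

With these example thresholds, the score of 20 mentioned earlier lands in the medium segment, which is exactly the kind of context a bare number does not give your team.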
Overrides allow you to force the segment of a population of leads regardless of what your historical data says about them. This lets you include feedback from your Sales team in the scoring (for example, always scoring your partners or resellers low to make sure they are never sent to Sales). Learn more about overrides and how to add one to a live model.
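Conceptually, an override is a rule that is checked after the model has produced a segment and that replaces it when the rule matches. The sketch below shows that idea only; the field names `is_partner` and `is_reseller`, the `OVERRIDES` list and the `apply_overrides` function are hypothetical and do not reflect the Data Studio's actual configuration format.

```python
# Hypothetical overrides: (predicate, forced segment).
OVERRIDES = [
    (lambda lead: bool(lead.get("is_partner") or lead.get("is_reseller")), "low"),
]


def apply_overrides(lead, model_segment, overrides=OVERRIDES):
    """Return the forced segment of the first matching override,
    or the model's own segment when no override applies."""
    for matches, forced_segment in overrides:
        if matches(lead):
            return forced_segment
    return model_segment


# A reseller the model would otherwise score 'very good'.
print(apply_overrides({"is_reseller": True}, "very good"))  # low
# A regular lead keeps the segment given by the model.
print(apply_overrides({"is_reseller": False}, "good"))  # good
```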
Signals are meant to provide information in your CRM to explain the score of a lead. They are also useful for surfacing relevant information about the prospect, even if that information is not used in the scoring.
This section allows you to check the output of the model on a sample of leads, and the performance of the model on a dataset different from the one that was used to build the model (the training dataset).
Check out a random sample of scored leads from the validation dataset in the sample page.
For each email, you will find its score, segment, and signals as you would see them in your CRM.
If you click on the Advanced mode, you will see some enrichment data points from MadKudu providers.
[For decision tree-based models] The links to the nodes of the trees show you which node each lead falls into for each tree, which explains the scores of the leads.
A model needs to be validated on a validation dataset that does not overlap with the training dataset. For that, we usually take leads that are more recent than those in the training dataset and check the performance of the model on this dataset in the Validation tab. The same metrics of Recall and Precision can be extracted from the graph, and a model is "valid" if we are still close to 65%+ Recall and 10x Precision.
Note: "Recall" is the share of actual converters that the model scored good or very good (the true positives).
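To make the metric concrete, here is a small Python sketch that computes Recall on a list of scored leads, together with Precision in its standard sense (the share of leads scored good or very good that actually converted). The sample data and the `recall_and_precision` function are hypothetical, and the "10x" Precision figure quoted in this article may be expressed on a different basis than the plain ratio computed here.

```python
POSITIVE_SEGMENTS = {"good", "very good"}


def recall_and_precision(leads):
    """leads is a list of (segment, converted) pairs.

    Recall    = converters scored good/very good / all converters
    Precision = converters scored good/very good / all leads scored good/very good
    """
    true_positives = sum(1 for seg, conv in leads if seg in POSITIVE_SEGMENTS and conv)
    converters = sum(1 for _, conv in leads if conv)
    scored_positive = sum(1 for seg, _ in leads if seg in POSITIVE_SEGMENTS)
    recall = true_positives / converters if converters else 0.0
    precision = true_positives / scored_positive if scored_positive else 0.0
    return recall, precision


# Hypothetical validation sample: 3 converters, 2 of them scored good or better.
sample = [
    ("very good", True),
    ("good", True),
    ("good", False),
    ("medium", True),
    ("low", False),
    ("low", False),
]
recall, precision = recall_and_precision(sample)
print(f"recall={recall:.0%}, precision={precision:.0%}")
```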
What does -333 mean as a value?
-333 or -1 is the placeholder used for "unknown" values in numerical computations.
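If you export such values and process them yourself, you may want to treat the placeholders as missing rather than as real numbers. The sketch below assumes that every -333 or -1 in the field you are handling means "unknown", which you should confirm for your own data; the `employees` field and the `known_or_none` helper are hypothetical.

```python
# Placeholder values that stand for "unknown" (per the note above).
UNKNOWN_SENTINELS = {-333, -1}


def known_or_none(value):
    """Return None for an 'unknown' placeholder, the value itself otherwise."""
    return None if value in UNKNOWN_SENTINELS else value


# Hypothetical numeric trait for three leads.
employees = [250, -333, -1]
print([known_or_none(v) for v in employees])  # [250, None, None]
```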