Would you like to use product usage or CRM data from a source that MadKudu does not currently integrate with? No worries, we can easily set up a transfer through Amazon S3, from Redshift or from flat files (JSON or CSV). MadKudu's preferred approach is to pull data from your S3 bucket, where the data is formatted as described below and which MadKudu accesses through an IAM role.
- Please refer to this documentation to give MadKudu access to your bucket.
- For transfer from Redshift, please refer to this documentation.
Pre-requisites
- You have access to an AWS account to create / manage an S3 bucket
How to format your data
MadKudu works with 4 types of objects:
- Event: what are users doing?
- Contact: who is the user?
- Account: which accounts do my users belong to?
- Opportunity: what deals do I open? (when you are not using Salesforce Opps or HubSpot Deals)
Depending on the type of segmentation you want to build (based on behavioral activity, on lead/account attributes, or on both), you may send us data for one or more objects.
- If you track behavioral activity in a system that does not integrate with MadKudu and would like to build a behavioral model, you need to send at least Events. If you plan on having account scoring, please send us your Accounts as well.
- If you have a homemade CRM or a CRM that does not integrate with MadKudu, you need to send at least Contacts and Opportunities so that MadKudu understands who your contacts are and who converts. If you plan on having account scoring, please send us your Accounts as well.
You want to send...
Event
To send behavioral data (product usage, web activity, marketing activity...), create a file named event with the following attributes (with headers included):
Attribute | Required | Format | Example | Description
event_key | required | String | "abc123" | A unique key identifying the event. If you do not have one, we suggest creating a combination of event_text + contact_key + event_timestamp
event_text | required | String | "signup", "login", "invited a friend" | The action taken by the user.
event_timestamp | required | Unix time | 1436172703 | The time at which the event happened
contact_key | required | String | "abc123" or "paul@madkudu.com" | The unique identifier of the user who performed the action. This needs to be the same as the contact_key field in the Contact file.
event_* | optional | String or Numeric | | Properties describing the event (e.g. event_url for the URL of the visited page, event_form_title for the title of the submitted form...)
Example in JSON format
{"event_key": "abcd1234", "event_text":"signed up", "event_timestamp":1234567890, "contact_key":"abc1234"}
{"event_key": "abcd2345", "event_text":"visit web page", "event_timestamp":1234567890, "contact_key":"paul@madkudu.com", "event_url":"http://www.domain.com/pricing"}
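The event_key suggestion in the table above (combining event_text, contact_key and event_timestamp) could be sketched in Python as follows. The helper names `make_event_key` and `to_event_line` are hypothetical, and hashing the combination with SHA-1 is an assumption made here to keep keys short; a plain concatenation would serve the same purpose.

```python
import hashlib
import json


def make_event_key(event_text, contact_key, event_timestamp):
    """Derive a stable key from the three fields the table suggests combining.

    Hashing is an assumption of this sketch, not a MadKudu requirement.
    """
    raw = f"{event_text}|{contact_key}|{event_timestamp}"
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


def to_event_line(event_text, contact_key, event_timestamp, **event_props):
    """Return one JSON record (a single line) for the event file."""
    record = {
        "event_key": make_event_key(event_text, contact_key, event_timestamp),
        "event_text": event_text,
        "event_timestamp": int(event_timestamp),
        "contact_key": contact_key,
    }
    # Keep only optional properties that follow the event_* naming.
    record.update({k: v for k, v in event_props.items() if k.startswith("event_")})
    return json.dumps(record)


line = to_event_line(
    "visit web page",
    "paul@madkudu.com",
    1436172703,
    event_url="http://www.domain.com/pricing",
)
print(line)
```

Because the key is derived only from the event's own fields, exporting the same event again produces the same event_key.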
Contact
To send enrichment or CRM data on the contact level (CRM, demographic, firmographic traits ...), create a file named contact with the following attributes (with headers included):
Attribute | Required | Format | Example | Description
contact_key | required | String | "abc123", "paul@madkudu.com" | Unique identifier for the user in your database. It can be the email. It must be the same as the one used in the Event file, if an Event file is sent as well.
email | required | String | "paul@madkudu.com" | Email of the user. Pass it even if the contact_key already contains the email
created_date | required | Unix time | 1436172703 | Creation date of the contact (required if MadKudu does not have an integration with your CRM)
contact_* | optional | String or Numeric | | Enrichment traits you know about the user (examples: contact_title, contact_country, contact_subscription_plan...)
Example in JSON format
{"contact_key":"abc1234", "email":"paul@madkudu.com"}
{"contact_key":"432535", "email":"paul@madkudu.com", "contact_title":"cto"}
Account
To send enrichment or CRM data on the account level (CRM, demographic, firmographic traits ...), create a file named account with the following attributes (with headers included):
Attribute | Required | Format | Example | Description
account_key | required | String | "def456", "madkudu.com" | A unique identifier for the account the user belongs to. It can be the domain of the account
contact_key | required if sending Contact data | String | "abc123", "paul@madkudu.com" | The unique identifier of the user who belongs to the account. This needs to be the same as the contact_key field in the Contact and Event files.
domain | required | String | "madkudu.com" | Web domain of the account
created_date | required | Unix time | 1436172703 | Creation date of the account (required if MadKudu does not have an integration with your CRM)
conversion_date | optional | Unix time | 1436172703 | Date the account converted into a paying customer.
ARR | optional (highly recommended) | Numeric | $20,000 | Annual Recurring Revenue of the account
account_* | optional | String or Numeric | | Enrichment attributes you know about the account (examples: account_industry, account_ARR, account_subscription_plan...)
Example in JSON format
{"contact_key":"abc1234", "account_key":"madkudu.com", "domain": "madkudu.com", "name": "madkudu"}
{"contact_key":"abc4983", "account_key":"ibm.com", "domain": "ibm.com", "account_ARR":"3000"}
Customer Fit training data
If you are not able to send your Opportunities as described above, MadKudu still needs to understand who converts from your historical data to configure a customer fit model. You can send us a single flat file extracted from your CRM that tells us which of your leads have converted, with the following fields (with headers included):
Attribute | Required | Format | Example | Description
email | required | String | "paul@madkudu.com" | The email of the lead, used as its unique identifier
target | required | Boolean | 1 | Indicates whether the lead converted according to your conversion definition (Opp created, Opp stage 2…)
amount | required | Numeric | 2300 | If target = 1, the amount of the opportunity converted (as defined by the conversion definition). 0 otherwise.
target_closed_won | required | Boolean | 1 | Indicates whether the lead converted into a Closed Won opp (paying customer)
amount_closed_won | required | Numeric | 2300 | Amount generated from the first Closed Won opp.
created_date | optional | Unix time | | Created date of the email
properties | optional | String or Numeric | | Numeric self-input information at time of lead creation or any other field that you’ve augmented your leads with and want MadKudu to evaluate (example: team_size, industry...)
Example in JSON format
{"email":"elon@tesla.com", "target":"1", "amount": "2499", "target_closed_won":"0", "amount_closed_won":"0", "created_date": "1234567890" }
{"email":"paul@madkudu.com", "target":"0", "amount": "0", "target_closed_won":"0", "amount_closed_won":"0", "created_date": "1234567810", "team_size":"5"}
Points of attention
The curly bracket { } and single quote ' characters are not supported inside your data values. Make sure to remove them from your data before creating your files.
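As a rough illustration of the clean-up described above, the following Python sketch removes the unsupported characters from string values before a record is written. It assumes the restriction applies to field values only (the JSON syntax itself still uses braces); `clean_value` and `clean_record` are hypothetical helper names.

```python
# Characters to delete from string values: { } and '
UNSUPPORTED = str.maketrans("", "", "{}'")


def clean_value(value):
    """Strip unsupported characters from strings; leave other types untouched."""
    if isinstance(value, str):
        return value.translate(UNSUPPORTED)
    return value


def clean_record(record):
    """Return a copy of the record with every string value cleaned."""
    return {key: clean_value(value) for key, value in record.items()}


cleaned = clean_record(
    {"contact_key": "abc123", "contact_title": "O'Brien {CTO}", "created_date": 1436172703}
)
print(cleaned)
```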
How to format the files
MadKudu currently supports two file formats:
- Newline-delimited JSON (preferred)
- CSV
Newline-delimited JSON
Our preferred format for upload is newline-delimited JSON, which is more standardized and less error-prone than CSV.
In this format, records are separated by the newline character \n. Each line is a valid JSON object:
{"event_text":"signed up", "event_timestamp":1234567890, "contact_key":"abc1234"}
{"event_text":"added a friend", "event_timestamp":1234567890, "contact_key":"paul@madkudu.com", "some_other_event_field":"some_value"}
{"contact_key":"abc1234", "email":"paul@madkudu.com"}
{"contact_key":"432535", "email":"paul@madkudu.com", "some_other_contact_field":"some value"}
Make sure to escape any " character in your data with a \ (e.g. replace " with \")
Incorrect
{"event_text":"signed up", "event_timestamp":1234567890, "contact_key":"abc1234", "key": "val"ue"}
Correct
{"event_text":"signed up", "event_timestamp":1234567890, "contact_key":"abc1234", "key": "val\"ue"}
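If the file is produced programmatically, a JSON serializer takes care of this escaping. A minimal Python sketch, using the standard json module and the same record as the example above:

```python
import json

# The value of "key" contains a double quote that must be escaped in the file.
record = {
    "event_text": "signed up",
    "event_timestamp": 1234567890,
    "contact_key": "abc1234",
    "key": 'val"ue',
}

# json.dumps escapes the embedded quote and keeps the record on a single line.
line = json.dumps(record)
print(line)
```

Writing each record with `json.dumps` and joining the records with \n yields a newline-delimited JSON file without hand-written escaping.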
CSV
We also support the .csv format, with the recommended format:
- column names (header) in the first line
- separator: ~ → separate the values with ~ (ex: abc~def~). Please do not use , or - as it easily creates parsing issues
- delimiter: " → this adds quotes around the values (abc -> "abc")
- line separator: line-break \n
- Delimit your values with " "
- Remove all line break characters (for example \n) from your fields.
- Make sure the number of fields is the same for each line.
- Escape your " characters by adding a second " character in front of it (see here for details)
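The recommendations above map fairly directly onto Python's standard csv module. The sketch below is one possible way to apply them; `to_csv` and `strip_line_breaks` are hypothetical helper names, and replacing line breaks with a space (rather than deleting them) is a choice made for this example.

```python
import csv
import io


def strip_line_breaks(value):
    """Replace line break characters inside a field with a space."""
    return str(value).replace("\r", " ").replace("\n", " ")


def to_csv(rows, header):
    """Serialize rows with ~ as separator and every value wrapped in quotes."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer,
        delimiter="~",          # separator between values
        quotechar='"',          # delimiter around each value
        quoting=csv.QUOTE_ALL,  # quote every value, not only special ones
        doublequote=True,       # escape " by doubling it
        lineterminator="\n",    # one record per line
    )
    writer.writerow(header)
    for row in rows:
        # Every line must have the same number of fields as the header.
        if len(row) != len(header):
            raise ValueError(f"expected {len(header)} fields, got {len(row)}")
        writer.writerow([strip_line_breaks(value) for value in row])
    return buffer.getvalue()


text = to_csv([["abc", 'cd"e', "ef\ng"]], header=["a", "b", "c"])
print(text)
```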
Incorrect
Values are not delimited by "
abc,cde,ef
Correct
"abc","cde","efg"
Incorrect
The " before e is not escaped. A second " should be added in front of it.
"abc","cd"e","efg"
Correct
"abc","cd""e","efg"
Data validation
JSON lines and CSV are relatively easy to corrupt (for example with " or , characters in the data).
We will validate the data on our side and warn you of any corruption issue, but it helps a lot if you follow the format requested above.
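A quick local check before uploading can catch the most common corruption. The following Python sketch is not MadKudu's validation; it only verifies that each line parses as a JSON object and carries the required Event attributes listed earlier. `find_invalid_lines` is a hypothetical helper name.

```python
import json

REQUIRED_EVENT_FIELDS = ("event_key", "event_text", "event_timestamp", "contact_key")


def find_invalid_lines(lines, required=REQUIRED_EVENT_FIELDS):
    """Return (line_number, reason) for every line that fails the checks."""
    problems = []
    for number, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append((number, f"invalid JSON: {error.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        missing = [field for field in required if field not in record]
        if missing:
            problems.append((number, "missing: " + ", ".join(missing)))
    return problems


sample = [
    '{"event_key": "abcd1234", "event_text": "signed up", "event_timestamp": 1234567890, "contact_key": "abc1234"}',
    '{"event_text": "signed up", "event_timestamp": 1234567890, "contact_key": "abc1234", "key": "val"ue"}',
    '{"event_text": "login", "contact_key": "abc1234"}',
]
problems = find_invalid_lines(sample)
for number, reason in problems:
    print(number, reason)
```

In this sample, the second line fails to parse because of the unescaped quote, and the third is flagged for its missing attributes.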
Compression
To speed up the upload, we highly recommend that you compress your files with GZIP before uploading them to S3.
You can name your files whatever you want (we recommend event, contact and account). However, please make sure to use the correct extension for your file format:
- .json.gz for compressed JSON (recommended)
- .json for uncompressed JSON
- .csv.gz for compressed CSV
- .csv for uncompressed CSV
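As an illustration, a compressed newline-delimited JSON file with the recommended extension could be produced with Python's standard gzip module. `write_gzipped_lines` is a hypothetical helper name, and the temporary directory only stands in for wherever you stage files before upload.

```python
import gzip
import os
import tempfile


def write_gzipped_lines(lines, directory, name="event"):
    """Write one record per line to <name>.json.gz and return the file path."""
    path = os.path.join(directory, f"{name}.json.gz")
    with gzip.open(path, "wt", encoding="utf-8", newline="\n") as handle:
        for line in lines:
            handle.write(line + "\n")
    return path


records = ['{"contact_key":"abc1234", "email":"paul@madkudu.com"}']

with tempfile.TemporaryDirectory() as tmp:
    path = write_gzipped_lines(records, tmp, name="contact")
    # Read the file back to confirm it decompresses to the original lines.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        round_trip = handle.read()

print(os.path.basename(path))
```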
How to store your file
We recommend that the files you want to share with MadKudu are in a dedicated folder and that you create an IAM policy and role for MadKudu to access these files.
You will also need to set up a recurring push of your data to this folder for MadKudu to score fresh data. This is done by creating distinct files, as described below.
File naming
In the S3 bucket, please upload data into separate folders by object and by date:
{object}/{year}/{month}/{day}, where the objects are
- event
- contact
- account
- opportunity
If you use the S3 API, simply “prefix” your destination file name. For example, uploading to "contact/2020/11/20/11:00:00/name_of_file.csv" will add a file named name_of_file.csv to the contact folder.
Please use this recommended file naming and storing system in the bucket for MadKudu to be able to automatically pull any new file.
s3://bucket_name/object/year/month/day/name_of_file.csv
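The folder layout above can be generated rather than typed by hand. In this Python sketch, `build_s3_key` is a hypothetical helper; using UTC dates and zero-padded month and day values are assumptions of the example, not documented requirements.

```python
from datetime import datetime, timezone

OBJECTS = {"event", "contact", "account", "opportunity"}


def build_s3_key(obj, file_name, when=None):
    """Build the {object}/{year}/{month}/{day}/{file} key for an upload."""
    if obj not in OBJECTS:
        raise ValueError(f"unknown object type: {obj}")
    # Assumption: dates are taken in UTC when no date is given.
    when = when or datetime.now(timezone.utc)
    return f"{obj}/{when:%Y/%m/%d}/{file_name}"


key = build_s3_key(
    "contact", "name_of_file.csv", datetime(2020, 11, 20, tzinfo=timezone.utc)
)
print(f"s3://bucket_name/{key}")
```

If you upload with boto3, the resulting key would be passed as the `Key` argument of the S3 client's `upload_file(Filename, Bucket, Key)` call; bucket_name remains a placeholder for your own bucket.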
Compression
To speed up file transfer, you can compress files locally before transferring them to Amazon S3. If you want to compress your files, please use the GZIP compression method and use .gz or .gzip as your file extension (we currently don’t support other methods or extensions).
Frequency: setting up a recurring push of data to MadKudu
Depending on how your data will be used in the MadKudu platform, we recommend providing fresh data through that bucket:
- every 4 hours, or at least twice a day, if used in a behavioral model
- in a single upload if used for one-off model training or analysis
To set up a recurring push of data, please upload a new file for each batch of new records, naming the files as described in File naming.
Please open a ticket here if you have any doubts; we'll be happy to assist you.
Frequently Asked Questions
I'm having an issue with S3 / I don't know how to use S3
Please open a ticket here and we will be happy to assist you.
Your file format doesn’t work for me. What do I do?
If you’re having any issues with the file format, please open a ticket here and we’ll be happy to help.
How often is the data refreshed?
After you drop data into the S3 bucket, expect results to be updated in the MadKudu platform within 6 hours.
What would happen if I send the same event more than once - will it appear twice in MadKudu?
Our system will deduplicate events based on contact_key / event_text / timestamp. If you send the same event twice, only one will be kept:
- If sent in two separate batches, only the most recent will be kept.
- If sent in the same data batch, the first one in the file will be kept.
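If you prefer not to send duplicates at all, a batch can be filtered locally before upload. The Python sketch below is not MadKudu's implementation; it only mirrors the in-batch rule described above (same contact_key, event_text and event_timestamp, first occurrence kept). `dedupe_events` is a hypothetical helper name.

```python
def dedupe_events(events):
    """Keep the first event for each (contact_key, event_text, event_timestamp)."""
    seen = set()
    unique = []
    for event in events:
        identity = (event["contact_key"], event["event_text"], event["event_timestamp"])
        if identity in seen:
            continue  # a matching event appeared earlier in the batch
        seen.add(identity)
        unique.append(event)
    return unique


batch = [
    {"event_key": "a1", "contact_key": "abc1234", "event_text": "login", "event_timestamp": 1234567890},
    {"event_key": "a2", "contact_key": "abc1234", "event_text": "login", "event_timestamp": 1234567890},
    {"event_key": "a3", "contact_key": "abc1234", "event_text": "signup", "event_timestamp": 1234567890},
]
unique = dedupe_events(batch)
print([event["event_key"] for event in unique])
```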
Can we add other attributes to the Contact records?
Yes, please send any attributes you have stored in your user table, except sensitive ones (password, credit card number).
In particular, it is always helpful to get the following:
- created_date
- lead source
- current plan / value of the plan