# Before we start...

An email address is composed of a local part and a domain: [local part]@[domain].
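As a minimal sketch of this split, the hypothetical helper below (`split_email` is an illustrative name, not part of the product) separates an address at its last `@`. Lowercasing the input is an assumption made for convenience.

```python
from typing import Tuple


def split_email(email: str) -> Tuple[str, str]:
    """Split an address into (local part, domain) at the last '@'."""
    # Lowercasing is an assumption made so later comparisons are case-insensitive.
    local, sep, domain = email.strip().lower().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a valid email address: {email!r}")
    return local, domain


print(split_email("logan@madkudu.com"))
```

For `logan@madkudu.com`, the local part is `logan` and the domain is `madkudu.com`.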

# Under which conditions would an email or domain be flagged as spam?

## Contains test

If the local part OR domain is exactly “test”

If the local part OR domain begins with “test”

If the local part contains “+test” anywhere

If the local part ends with “test” followed by any number of digits (e.g. logantest333@madkudu.com)

If the domain contains “.test” anywhere
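The five conditions above could be combined roughly as follows. This is only a sketch: the function name is hypothetical, matching is assumed to be case-insensitive, and "any number of digits" is assumed to include zero digits, so a local part that simply ends in "test" also matches.

```python
import re


def contains_test(local: str, domain: str) -> bool:
    """Sketch of the "Contains test" rule under the assumptions stated above."""
    local, domain = local.lower(), domain.lower()
    return (
        local.startswith("test")          # also covers local part == "test"
        or domain.startswith("test")      # also covers domain == "test"
        or "+test" in local               # "+test" anywhere in the local part
        # "test" followed by zero or more digits at the very end (assumption: zero allowed)
        or re.search(r"test\d*\Z", local) is not None
        or ".test" in domain              # ".test" anywhere in the domain
    )


print(contains_test("logantest333", "madkudu.com"))
```

Under these assumptions, a local part such as `contest` would also be flagged, since it ends in "test".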

## Some characters are repeated many times

If any character is repeated at least 4 times consecutively

OR

If any pair of letters is repeated at least 4 times (tetetete@gmail.com)
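Both repetition checks can be expressed with back-references in a regular expression. The sketch below assumes the check runs on the whole address, that the repeated pairs must be consecutive (as in the `tetetete` example), and that matching is case-insensitive; the names are illustrative.

```python
import re

# The same character four or more times in a row, e.g. "aaaa".
REPEATED_CHAR = re.compile(r"(.)\1{3,}")
# The same two-letter pair four or more times in a row, e.g. "tetetete".
REPEATED_PAIR = re.compile(r"([a-z]{2})\1{3,}")


def has_repeated_characters(text: str) -> bool:
    """Sketch of the "Some characters are repeated many times" rule."""
    text = text.lower()
    return bool(REPEATED_CHAR.search(text) or REPEATED_PAIR.search(text))


print(has_repeated_characters("tetetete@gmail.com"))
```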

## Numbers exceed letters

If more than half of the characters in the local part are digits (e.g. 1234aa@gmail.com)

If there is at least one more digit than there are letters in the local part

If there are at least 6 digits in the local part
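A rough sketch of these three checks, assuming that any one of them is enough to trigger the rule (the function name is hypothetical):

```python
def numbers_exceed_letters(local: str) -> bool:
    """Sketch of the "Numbers exceed letters" rule; any one condition flags it."""
    digits = sum(ch.isdigit() for ch in local)
    letters = sum(ch.isalpha() for ch in local)
    return (
        digits > len(local) / 2   # more than half of all characters are digits
        or digits >= letters + 1  # at least one more digit than letters
        or digits >= 6            # at least 6 digits overall
    )


print(numbers_exceed_letters("1234aa"))
```

For `1234aa`, 4 of the 6 characters are digits, so the first condition already applies.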

## Local part has no letters

If the local part does not contain any letters

## Spammy patterns detected!

If the email is longer than 20 characters and the 3 most common characters make up more than 70% of all its characters
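One way to picture this rule is with a character frequency count. The sketch below assumes the count runs over the whole address, including the `@` and any periods; the function name is illustrative.

```python
from collections import Counter


def has_spammy_pattern(email: str) -> bool:
    """Sketch: long address dominated by its three most frequent characters."""
    if len(email) <= 20:
        return False
    # most_common(3) returns up to three (character, count) pairs.
    top_three = Counter(email).most_common(3)
    top_count = sum(count for _, count in top_three)
    return top_count / len(email) > 0.70


print(has_spammy_pattern("aaaaaaaabbbbbbbbcccccc@x.y"))
```

In `aaaaaaaabbbbbbbbcccccc@x.y`, the characters `a`, `b`, and `c` account for 22 of 26 characters (about 85%), so the address is flagged.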

## Domain end is domain

If the strings before and after the period in the domain are the same (logan@hello.hello)
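A minimal sketch of this comparison, assuming the rule only applies to domains with exactly one period (the source does not say how subdomains are handled):

```python
def domain_end_is_domain(domain: str) -> bool:
    """Sketch: the text before and after the single period is identical."""
    parts = domain.lower().split(".")
    # Assumption: only domains of the form "name.extension" are considered.
    return len(parts) == 2 and parts[0] != "" and parts[0] == parts[1]


print(domain_end_is_domain("hello.hello"))
```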

## Domain or local part contains blacklisted word or phrase

If the domain contains any phrase from an in-house list of blacklisted words/phrases

## Contains asd or sdf twice

If the email contains the sequence of letters “asd” or “sdf” twice
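The wording leaves some room for interpretation. The sketch below assumes that either sequence must appear at least twice on its own (for example `asdasd`), rather than once each; the function name is hypothetical.

```python
def contains_asd_or_sdf_twice(email: str) -> bool:
    """Sketch: "asd" occurs at least twice, or "sdf" occurs at least twice."""
    email = email.lower()
    # str.count counts non-overlapping occurrences of the substring.
    return email.count("asd") >= 2 or email.count("sdf") >= 2


print(contains_asd_or_sdf_twice("asdasd@gmail.com"))
```

Under this reading, `asdf@gmail.com` is not flagged by this rule, because each sequence appears only once.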

## Looks like spam

If the email contains the phrase “noemail” anywhere in it

## Local part or domain length is 1

If the local part contains exactly one character (a@gmail.com) or the part of the domain before the period contains exactly one character (logan@a.com)
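As a sketch, and judging from the `logan@a.com` example, the domain length is assumed here to mean the length of the label before the first period. The function name is illustrative.

```python
def local_or_domain_length_is_one(local: str, domain: str) -> bool:
    """Sketch: one-character local part, or one-character domain label."""
    # Assumption: "a.com" counts as a one-character domain, so only the
    # text before the first period is measured.
    domain_label = domain.split(".")[0]
    return len(local) == 1 or len(domain_label) == 1


print(local_or_domain_length_is_one("logan", "a.com"))
```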

## Local part has no vowels

If the local part of the string is at least 4 characters long, does not contain any numbers, and does not contain any vowels.
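A short sketch of the three conditions, assuming the vowels are a, e, i, o, u (so "y" is not treated as a vowel) and that matching is case-insensitive:

```python
VOWELS = set("aeiou")  # assumption: "y" is not counted as a vowel


def local_part_has_no_vowels(local: str) -> bool:
    """Sketch: at least 4 characters, no digits, no vowels."""
    local = local.lower()
    return (
        len(local) >= 4
        and not any(ch.isdigit() for ch in local)
        and not any(ch in VOWELS for ch in local)
    )


print(local_part_has_no_vowels("bcdf"))
```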

## Local part low vowel ratio

If the local part is 5 characters long and there are no vowels in it

If the local part is longer than 5 characters and the fraction of vowels is less than 0.08. The fraction is computed as vowels / letters, not vowels / total number of characters, so digits are not counted in the division.
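The ratio can be sketched as below. Vowels are assumed to be a, e, i, o, u, and a local part with no letters at all is assumed not to trigger this rule, since the ratio would be undefined; the function name is hypothetical.

```python
def local_part_low_vowel_ratio(local: str) -> bool:
    """Sketch of the "Local part low vowel ratio" rule."""
    local = local.lower()
    letters = [ch for ch in local if ch.isalpha()]
    vowels = [ch for ch in letters if ch in "aeiou"]
    if len(local) == 5:
        return not vowels  # exactly 5 characters and no vowels
    if len(local) > 5:
        if not letters:
            return False   # assumption: no letters means no ratio to test
        # Digits are excluded from the denominator: vowels / letters.
        return len(vowels) / len(letters) < 0.08
    return False


print(local_part_low_vowel_ratio("bcdfghjklmnpqa"))
```

For `bcdfghjklmnpqa`, there is 1 vowel among 14 letters (about 0.071), which is below the 0.08 threshold.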

## Domain contains short gibberish

If the domain contains any of the following:

asdef

asdf

If the domain looks exactly like any of the following:

asd.com

sdf.com

fsd.com

dsa.com
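The two lists above translate into a substring check and an exact-match check. A minimal sketch, with illustrative names and case-insensitive matching assumed:

```python
GIBBERISH_SUBSTRINGS = ("asdef", "asdf")
GIBBERISH_DOMAINS = {"asd.com", "sdf.com", "fsd.com", "dsa.com"}


def domain_contains_short_gibberish(domain: str) -> bool:
    """Sketch: exact match on listed domains, or substring match on listed strings."""
    domain = domain.lower()
    return domain in GIBBERISH_DOMAINS or any(
        fragment in domain for fragment in GIBBERISH_SUBSTRINGS
    )


print(domain_contains_short_gibberish("asdf.io"))
```

Note that `asd.net` is not flagged by this rule, because the second list requires an exact match.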

## Contains absurdity

If the local part contains any of the following:

princessleia

If the local part is exactly:

sda

ads

dsa

nothing

abc

sdf
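The same two-list structure applies here, this time on the local part. A minimal sketch with illustrative names, assuming case-insensitive matching:

```python
ABSURD_SUBSTRINGS = ("princessleia",)
ABSURD_LOCAL_PARTS = {"sda", "ads", "dsa", "nothing", "abc", "sdf"}


def contains_absurdity(local: str) -> bool:
    """Sketch: substring match on the first list, exact match on the second."""
    local = local.lower()
    return local in ABSURD_LOCAL_PARTS or any(
        fragment in local for fragment in ABSURD_SUBSTRINGS
    )


print(contains_absurdity("princessleia77"))
```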

## Not F1000 or personal and has numbers in local

If the domain is neither F1000 nor personal, and there are two consecutive digits in the local part.