(Posted on Aug-21-2020) We just completed the FIRST installment of our NEW Service Provider DNSBL anti-spam data - this first one lists abusive Sendgrid IDs and sub-hosts - and we are providing it for FREE to the world! For the rest of August, the free version of our new Sendgrid lists will be fully functional and updated as fast as possible. Starting in September
As you have probably noticed, the spam coming from both large ISPs and ESPs (Email Service Providers) has slowly grown over the years - and has now reached an almost-crisis situation. Years ago, the large ESPs and ISPs feared getting listed on anti-spam lists, and tried HARD to keep the spammers off of their systems. Little by little, they grew into a “we're too big to block” attitude, and things slowly got worse and worse. This has greatly harmed email, and many end users are frustrated. Then, starting about a year ago - and GREATLY ACCELERATING in recent months - an emergency situation developed due to a large amount of phishing spams (and other criminal spams, virus download spams, etc.) that started spewing from Sendgrid in alarmingly increasing volume. Even after many pleas from many in the industry, with many helpful suggestions, the criminal spam kept coming. That created an URGENT need to get this new type of anti-spam list released ASAP.
First, the following 3 plug-ins are extremely new and subject to change without notice - and at least 2 of them require some amount of custom implementation and are NOT necessarily drag and drop. So read the instructions carefully, and keep in mind that these are almost as new to us as they are to you. Some of these require roughly 5 to 30 minutes of custom implementation.
Because this is a NEW type of anti-spam list, plug-ins are still in the early stages of development (some of those efforts are listed above). We are currently in the beginning stages of paying developers (who specialize in this area - we do not!) to develop a special SpamAssassin rule for this (again, the first version of that is in the list above) - and donations to invaluement will help us pay for that effort (we have already made a financial commitment to it). In the meantime, we are confident that MANY reading this page will have the expertise to implement this in their own systems - and this page describes HOW to do that! Also, be sure to join the SpamAssassin discussion list for updates - as it is likely that others there will be able to rush out a script/SA-RULE for this, too.
The FREE data is currently available via http download. It is shocking how LITTLE data there is - and yet how MUCH damage the spams referenced in that data are doing. Similarly, this will be a tiny fraction of a percent of your overall spam blocked, but will represent a HUGE percentage of the viruses and criminal spam that had been making it past many spam filters!
Please use CURL or WGET to download the data files - and please use the feature where it ONLY downloads the files when the one on the server is newer than the last copy you previously downloaded. There are two types of data:
(1) Sendgrid IDs that are found OFTEN in the SMTP-ENVELOPE FROM address of Sendgrid-sent messages.
EXAMPLE: <bounces+14927644-0137-rob=pvsys.com@sendgrid.net>
So in THIS example, 14927644 is the ID. Nothing more. Nothing less.
(2) Sendgrid customer domains that are sometimes found in the SMTP-ENVELOPE FROM address of Sendgrid-sent messages (when they are NOT sendgrid.net)
EXAMPLE: bounces+2128379-w71g-name=example.com@em4937.workbeez.com
So in THIS case, workbeez.com is the Sendgrid customer domain. Please try to NOT check this hostname when it is sendgrid.net, since that wastes resources.
When the Sendgrid customer uses their own domain there, it will OFTEN be a long multi-dotted hostname, where only the domain at the end is on our list - so DO NOT do an exact match on what comes after the @ symbol. Otherwise, in this case, you might be checking our list for em4937[.]workbeez[.]com when we have listed workbeez[.]com in our datafile, and that would create a false negative.
Also avoid false positives. For example, if goodsender[.]com is at the end of the ENVELOPE-FROM and we have sender[.]com blacklisted, a plain substring or ends-with match would mistakenly count that as a 'hit' on sender[.]com - do NOT do that! See the difference? A regex word boundary (\b) can be your friend here: try the regular expression \bsender\.com>?$ on the 2 examples above that contain the word 'sender'. It will correctly hit sender[.]com but miss goodsender[.]com. (Note that \b also matches right after a hyphen, so good-sender[.]com would still hit; anchoring on the preceding '@' or '.' instead, as in (?:^|[@.])sender\.com>?$, avoids that.) Remember, these domains with the word 'sender' in them are just fictitious examples - any resemblance to actual web sites is purely coincidental. The rbldnsd implementation described below ALSO handles these issues very nicely.
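The two checks described above could be sketched roughly as follows. This is only an illustration, not our official plug-in: the function names are made up for this example, the regular expression assumes the envelope sender always looks like the two EXAMPLE lines (bounces+, digits, a hyphen, then the rest), and it assumes you have already loaded the two data files into Python sets by whatever means suits your system. The domain check compares whole dot-separated labels, so it avoids both the false negative and the false positive described above.

```python
import re

# Assumed shape of a Sendgrid envelope sender, based only on the two examples:
# optional "<", "bounces+", the numeric ID, "-", anything up to "@", then the host.
_ENVELOPE_RE = re.compile(r"^<?bounces\+(\d+)-[^@]*@([^@>]+)>?$", re.IGNORECASE)


def parse_envelope_from(envelope_from):
    """Return (sendgrid_id, host), or None if the address does not fit the pattern."""
    match = _ENVELOPE_RE.match(envelope_from.strip())
    if match is None:
        return None
    return match.group(1), match.group(2).lower().rstrip(".")


def is_sendgrid_net(host):
    """True when the host is sendgrid.net itself (or a subdomain of it)."""
    return host == "sendgrid.net" or host.endswith(".sendgrid.net")


def listed_domain(host, listed_domains):
    """Return the listed domain that `host` equals or is a subdomain of, else None."""
    labels = host.split(".")
    # Try "a.b.c", then "b.c": always cut at a dot, never inside a label,
    # and never test the bare top-level label on its own.
    for i in range(len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate in listed_domains:
            return candidate
    return None


def check_envelope_from(envelope_from, listed_ids, listed_domains):
    """Return ("id", value) or ("domain", value) on a hit, else None."""
    parsed = parse_envelope_from(envelope_from)
    if parsed is None:
        return None
    sendgrid_id, host = parsed
    if sendgrid_id in listed_ids:
        return ("id", sendgrid_id)
    # Skip the domain lookup entirely for sendgrid.net, as requested above.
    if not is_sendgrid_net(host):
        domain = listed_domain(host, listed_domains)
        if domain is not None:
            return ("domain", domain)
    return None
```

With the fictitious sender[.]com entry from above, `listed_domain("em4937.sender.com", {"sender.com"})` finds the listing, while `listed_domain("goodsender.com", {"sender.com"})` does not.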
NOTE: If you are programming this on a system that has already downloaded the content of the message, a nice feature to implement is to apply it only to messages that have an X-SG-EID header.
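For systems that already have the full message, the header gate from the NOTE above might look something like this. It is a minimal sketch using Python's standard email package; the function name is just an illustration, and it only looks at whether the header is present, not at its value.

```python
from email.parser import HeaderParser


def has_sendgrid_eid(raw_message):
    """True if the message HEADERS include X-SG-EID (header names are case-insensitive)."""
    # HeaderParser parses the headers only and leaves the body as opaque text,
    # so an "X-SG-EID:" string inside the body does not count.
    headers = HeaderParser().parsestr(raw_message)
    return headers["X-SG-EID"] is not None
```

A filter could call this first and skip the ID and domain lookups entirely when it returns False.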
For each of the data files listed below, right click, then 'save as'. Later you can set them up for frequent downloads (every minute!) using CURL or WGET - but only with the setting that downloads when the server version is newer.
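With the command-line tools, the "only if newer" setting is curl's -z / --time-cond option (pointed at your local copy) or wget's -N / --timestamping option. If you would rather script it, the same idea can be sketched in Python as below. This is an illustration under assumptions: the function names are made up, it relies on the server answering an If-Modified-Since request with "304 Not Modified", and it uses the local file's modification time as the comparison point (much like curl -z does).

```python
import os
import urllib.error
import urllib.request
from email.utils import formatdate


def build_request(url, local_path):
    """Build a GET request that asks only for a copy newer than local_path."""
    request = urllib.request.Request(url)
    if os.path.exists(local_path):
        mtime = os.path.getmtime(local_path)
        # HTTP dates are expressed in GMT, e.g. "Thu, 01 Jan 1970 00:00:00 GMT".
        request.add_header("If-Modified-Since", formatdate(mtime, usegmt=True))
    return request


def fetch_if_newer(url, local_path, timeout=30):
    """Download url to local_path only if the server copy is newer.

    Returns True if the local file was replaced, False if it was already current.
    """
    try:
        with urllib.request.urlopen(build_request(url, local_path), timeout=timeout) as response:
            data = response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: keep the copy we already have
            return False
        raise
    # Write to a temporary file first so readers never see a half-written list.
    tmp_path = local_path + ".tmp"
    with open(tmp_path, "wb") as handle:
        handle.write(data)
    os.replace(tmp_path, local_path)
    return True
```

For example, `fetch_if_newer("http://www.invaluement.com/spdata/sendgrid-id-dnsbl.txt", "sendgrid-id-dnsbl.txt")` could be run from a scheduler, with the lists reloaded only when it returns True.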
Sendgrid ID file is here:
http://www.invaluement.com/spdata/sendgrid-id-dnsbl.txt
Sendgrid SMTP-ENVELOPE-FROM file is here:
http://www.invaluement.com/spdata/sendgrid-envelopefromdomain-dnsbl.txt
And this SMTP-ENVELOPE-FROM file is specially formatted for rbldnsd:
http://www.invaluement.com/spdata/sendgrid-envelopefromdomain-dnsbl-rbldnsd.txt
(so, in the rbldnsd-formatted file, the DOT at the beginning of each entry does the following: if the listing was "foo[.]com", and the hostname after the @ symbol, in the SMTP envelope FROM was "foo[.]bar[.]foo[.]bar[.]foo[.]com" - then ONE single lookup on the ENTIRE host name will return a listed answer in rbldnsd, without having to chop off labels and without having to do multiple dns queries.)
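If you load that file into your own rbldnsd instance, the single lookup described above could be sketched like this. Treat it as a rough illustration: the zone name sgdomains.dnsbl.example is a placeholder for whatever zone you configure locally, the function names are made up, and any successful A-record answer is treated as "listed" without inspecting the returned address.

```python
import socket


def query_name(envelope_host, zone):
    """Build the DNSBL query name: the ENTIRE envelope host, then the zone."""
    host = envelope_host.strip().strip(".").lower()
    return host + "." + zone.strip(".")


def is_listed(envelope_host, zone):
    """One DNS lookup on the whole hostname; no label chopping needed."""
    try:
        socket.gethostbyname(query_name(envelope_host, zone))
    except socket.gaierror:
        # No answer (or a resolver failure): treat as not listed, i.e. fail open.
        return False
    return True
```

So `query_name("foo.bar.foo.bar.foo.com", "sgdomains.dnsbl.example")` produces one name to resolve, and the leading-dot entry in the rbldnsd file takes care of matching the domain at the end.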
ENJOY - AND - PLEASE CONSIDER MAKING A DONATION - THIS TOOK A MASSIVE EFFORT - AND MUCH REVENUE WAS LOST IN RECENT WEEKS DIVERTING RESOURCES TO THIS. THANKS!
Does this go beyond blocking phishes or virus or other criminal spams?
Absolutely! But, beyond the criminal spams, we are focusing on the WORST spammers - the ones that are hitting many spamtrap addresses with impunity - yet don't have a history of legit usages. In a FEW cases, the companies listed are sometimes slightly legit - but they typically spam like crazy. Therefore, please do NOT use the domain list for general filtering - only use that for THIS application of the data. If you can't handle the fact that we are going after the WORST customers of ESPs, even when not criminal phishing or virus exploits - then don't use it - and then maybe you're a part of the problem, too? But ENOUGH IS ENOUGH - ESPs need to start feeling some economic pain for their poor abuse prevention - and if we only blocked the viruses and phishes, that would enable the cold-email spammers to continue to send massive amounts of unsolicited bulk email to users who DO NOT want it! As I said, enough is enough. It is past time that the recipients get relief from this spam!
Aren't these already being blocked with content scanners, such as ClamAV? Razor? Pyzor?
Yes, many are, but many are NOT. Also, when new such spams start spewing, we are confident that our list will pick them up AND distribute that corresponding data MUCH FASTER than such content scanners. Anti-spam "checksum-fingerprint" systems like Razor and Pyzor are excellent anti-spam tools, but will often be stuck in a game of whack-a-mole as they have to start all over with each newly reformatted spam sent from the spammer, whereas that spammer's follow-up reformatted spams are already blocked by our listing of their already-listed Sendgrid ID. Also, some spammers have found ways to evade such anti-spam techniques by applying variations to each individual spam. NOT saying that such scanners are not helpful, they ARE helpful. But they just don't even come close to eliminating the need for THIS new type of anti-spam list.
Are there other advantages over content scanners that might already be blocking some of these?
Yes! Particularly for massive systems, such as hosters and ISPs hosting millions of mailboxes - another advantage of THIS particular anti-spam data - is that it can be applied to the SMTP-ENVELOPE data, for a much more efficient process, and without having to wait for the message datafile to download. That is a HUGE performance advantage. Note that OTHER future additions to our Service Provider DNSBLs will instead involve filtering based on the content inside the body of the message. This will vary. But this FIRST offering works with SMTP-envelope data, giving it an efficiency edge over content scanners.
Are there False Positives?
First, no DNSBL is perfect, and anyone using this is expected to do so at their own risk. If that is a huge concern - try it out in low-scoring mode or marking mode first. But we applied decades of expertise into the development of this - where invaluement is already known for offering high-quality low-false-positive DNSBLs that have EARNED a reputation of being industry-leading DNSBLs with extremely rare FPs. We've applied the SAME careful approach to developing this new type of list (thus, the hundreds of hours of development time - that would take others without our expertise 10s of thousands of hours!). What we choose to NOT list - is even MORE important than what we DO list. Anyone can create a low-quality DNSBL. Knowing what to list, what to NOT list - and especially making WISE decisions for mixed-scenarios where there is MUCH abuse, but also SOME legitimate usages - is the HARD part. That is why the number of high-quality DNSBLs can be counted on one hand. We are very pleased with the quality of this list at launch-time. Very pleased!
Is the paid version better?
At this point, they are the same. Starting in September, there will be a slight delay on the updating of the free version data. Also, soon, other types of ESP DNSBLs will become available to our paying subscribers. But THIS one that is focused on Sendgrid data will remain free and public - due to the enormous amount of abuse from Sendgrid in recent months. Also, the data is not yet in the invaluement paid customer datafeeds (rsync or direct queries) - but that is coming soon (ETA: Aug-25-2020)! And then, in the coming weeks/months, more of this type of data, targeting other ESPs and ISPs, will start appearing in only the paid versions, for invaluement subscribers!
How can I help?
Follow the SpamAssassin list and collaborate with them about making new rules for this. And a donation is welcome, too - which can be applied to your future subscription should you later decide to become an invaluement customer. Such donations will also enable us to stay focused on this effort and continue to develop other variations on our Service Provider DNSBL - to deal with other ESPs, too.
Thanks for your help with this - and for any donations - please also consider subscribing and/or testing out our OTHER antispam data, described on our homepage, here!
Rob McEwen, CEO of invaluement.com
rob AT invaluement.com
+1 478-475-9032
PS - YES - this really was the WORST LAUNCH EVER - no FB pixel - no press release or ads - late in the afternoon on a Friday - Friday night in Europe - etc. Basically, this launch is horrible. So why this timing? We have been working day-and-night over the past several weeks to get this operational - even at the expense of activities that bring in our normal revenue - the amount of criminal spam that has been making it to the inbox is alarming - and so we have been trying to get this out ASAP - and *NOW* happens to be the point where it is ready. So THAT dictated the timing. It seems unethical to wait until this Tuesday and do all those other good launch strategies that we wish we had time to prepare - when spammers are exploiting innocent victims with these horrific and often criminal spams.