Where does spam come from? We've all asked ourselves this question. A cadre of systems administrators, who are actively trying to preserve this incredibly important communications channel, have invested a great deal of time into understanding the problem. Learning how to read email "headers" in order to trace spam to its senders, blocking abusive mail servers and other sources, and similar work have given us some interesting insights over the years. Lately, however, the spammers have begun to escalate their attempts to ensure delivery of their unsolicited ads for illegal (and dubious) pharmaceuticals, porn, and virtually everything else. The United Nations estimates that over 70% of all mail sent on the Internet is spam, and it's getting worse.
Back in the late nineties, spam largely came from two places: ISPs who inadvertently allowed spammers to sign up for throwaway dialup accounts or even fat pipes and hosting services, and from "open relays" - mail servers that, in the tradition of openness and trust that defined the early days of the Internet, allowed anyone to use their servers to send (relay) mail to anyone else. Servers were created to list, and allow instantaneous blocking of mail from, known fixed spammer networks and open relays, using the Internet name service known as DNS. These services, called DNS blacklists, or DNSBLs, were often, and are still, run as free services by volunteers.
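To make the DNS mechanism concrete, here is a minimal Python sketch of how a mail server might ask a DNSBL about a connecting address. The usual convention is to reverse the octets of the IPv4 address, append the blacklist's zone, and look that name up; an answer means "listed." The zone name dnsbl.example.org is a placeholder, not a real service, and the function names are our own for illustration.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    # Validate the address, then reverse its octets: 192.0.2.99 -> 99.2.0.192
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the DNSBL zone returns an address record for this IP."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        # No such name (not listed), or the lookup itself failed.
        return False


# "dnsbl.example.org" is a hypothetical zone used only for illustration.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

Note that this sketch treats a failed lookup the same as "not listed," so mail keeps flowing if the blacklist is unreachable; a real deployment would have to decide that policy for itself.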
Around the same time, a massive educational effort was launched to try to close all remaining open relays. It was eventually so successful that the spammers had to resort to other tactics by the first year of the new millennium.
One tactic, coincident with the rise of cheap, high-bandwidth broadband service (cable, DSL, etc.), involves writing or adapting viruses and worms to attack and exploit holes in the Microsoft Windows operating system. Once a vulnerable computer is infected -- often without its owner's knowledge, by viruses such as SoBig or Netsky or Bagle -- the spammers install "proxy" software. This proxy software is used to funnel their spam elsewhere, allowing the senders to hide behind the victim. Computers that have been compromised in this way are called "zombies."
Often, zombies are linked together by "command and control" software that allows spammers to create vast, multi-layered networks of compromised computers. These zombies are used in "round robin" fashion: the spammers attempt to send spam from one host after another until they find one that hasn't yet been listed in a DNS-based blacklist or locally blocked, and voila - you've got spam.
Some estimates number these zombie machines in the tens of millions, and even conservative estimates suggest that 100,000 new PCs are infected every week. These computers can be used for more than proxying or relaying spam, but we'll get to that in a bit.
We know about this "round robin" strategy, because for nearly two years anti-spammers and mail administrators and others have been collecting lists of ISP naming conventions for their dynamically assigned or otherwise consumer-oriented services. Whereas many are resigned to accepting the spam and filtering it into seldom-read folders with antispam software of varying quality, there are others, like us, who prefer to reject it as soon as it tries to enter our networks. And if ISPs won't prevent their customers from running insecure operating systems on the public Internet, it is left to us to protect ourselves.
In the process, we've collected nearly eight thousand of these naming conventions, and converted them into patterns that can be applied to the name of any remote computer trying to send us mail. If the remote system matches a known naming convention, we reject it as likely spam. This overall approach, combined with other tactics, has been remarkably effective, reducing hesketh.com's spam load (which has reached as high as 80% of all inbound mail) by over 99% with a very low "false positive" rate. We've gone from receiving 1500 spams/day in May 2003 to rejecting 3000-5000/day as spam or other forms of email abuse.
Where a legitimate mail server lacks certain credentials and can't get the problem fixed due to incompetent staff at their ISP, we whitelist those exceptions so that legitimate mail continues to flow.
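As a rough illustration of the pattern matching and whitelisting described above, here is a minimal Python sketch. The three patterns and the whitelisted host are invented for this example under reserved example domains; they are not taken from our actual list, and real ISP naming conventions vary widely.

```python
import re

# Hypothetical naming-convention patterns for dynamic or consumer address
# space. These are illustrative assumptions, not real ISP conventions.
GENERIC_PATTERNS = [
    re.compile(r"^dsl-\d+-\d+-\d+-\d+\.example\.net$"),
    re.compile(r"^cable\d+\.dyn\.example\.com$"),
    re.compile(r"^\d+\.\d+\.\d+\.\d+\.dialup\.example\.org$"),
]

# Hypothetical exception: a legitimate mail server stuck with a generic name.
WHITELIST = {"dsl-192-0-2-10.example.net"}


def should_reject(hostname: str) -> bool:
    """Return True if the remote host's name matches a known convention."""
    name = hostname.strip().rstrip(".").lower()
    if name in WHITELIST:
        return False  # whitelisted exceptions always get through
    return any(pattern.match(name) for pattern in GENERIC_PATTERNS)


print(should_reject("dsl-192-0-2-55.example.net"))  # matches a pattern
print(should_reject("mail.example.net"))            # does not match
```

Checking the whitelist before the patterns keeps the exceptions cheap to maintain: adding a legitimate server never requires touching the patterns themselves.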
When we first noticed this "round robin" phenomenon in mid-2003, we'd see in our logs that a given sender address would try, from as many as twenty different sources on as many different ISPs, often in several different countries, to deliver spam to a given local address. We've seen recent delivery attempts span as many as 400 different sources in a given day, shifting the sender address or using other evasive tactics. One day last week we saw fully one quarter of all spam delivery attempts come from addresses in a single Polish ISP's domain, suggesting that there was a single spammer behind the entire attack. Fortunately, our list of naming conventions of ISPs offering dynamic / broadband service managed to reject all of the attempts, relegating the attack to a notable statistical anomaly, without bothering or harming our user base.
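For readers who want to look for the same "round robin" signature in their own logs, here is a small Python sketch that counts how many distinct sources tried to deliver mail for each sender address. It assumes the delivery attempts have already been extracted from the mail log as (sender, source address) pairs; the log parsing is left out, and the sample data is made up.

```python
from collections import defaultdict


def sources_per_sender(attempts):
    """Count distinct source addresses seen for each sender address.

    `attempts` is an iterable of (sender, source_ip) pairs, assumed to have
    been extracted from the mail log already.
    """
    seen = defaultdict(set)
    for sender, source_ip in attempts:
        seen[sender.lower()].add(source_ip)
    return {sender: len(ips) for sender, ips in seen.items()}


# Made-up sample data using documentation address ranges.
sample = [
    ("offers@example.com", "192.0.2.1"),
    ("offers@example.com", "198.51.100.7"),
    ("Offers@example.com", "203.0.113.9"),
    ("friend@example.org", "192.0.2.1"),
]
print(sources_per_sender(sample))
```

A sender that shows up from dozens of unrelated sources in a day is a strong hint of zombie-relayed spam, though spammers who rotate the sender address will not be caught by this count alone.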
Our approach and similar methods, however, are coming under counterattack by the spammers, shifting the balance of power again. In the past few years, DNSBLs have been remarkably effective at catching most compromised or misconfigured computers soon after a spam run has begun, limiting the effectiveness of such runs. We estimate that our use of just two DNSBL services, Spamhaus.org's SBL/XBL and the Distributed Sender Blackhole List at DSBL.org, catches between 50% and 60% of our inbound spam, leaving the rest to our other custom filters. The vast majority of the spam that makes it through, today, comes from compromised webmail systems and Web hosting companies - primarily "phishing" expeditions and "advance fee fraud", otherwise known as "Nigerian 419" scams. And fortunately, most of that is easy to catch with simple message body filters.
In the past couple of years, many popular DNSBLs have been shut down by massive "distributed denial of service" (DDoS) attacks, in which tens of thousands of computers each consume a tiny bit of the resources of the target. In effect, this denies service to legitimate users (as well as costing the hosts enormous fees for stolen bandwidth). It has long been known by the antispam community that the same "zombies" can be, and are being, used in these DDoS attacks. Many of the zombie viruses and worms were derived from software that allowed similar attacks on other services, such as Internet Relay Chat (IRC), a predecessor to the widely used Instant Messaging services.
Why is the loss of DNSBLs important? It implies that local, effective filtering mechanisms, beyond the ability of spammers to disrupt using their abusive tactics, are absolutely necessary. Our blocking rules, before we shifted our DNSBL checks in front of those rules for efficiency's sake, were responsible for blocking roughly four fifths of the inbound spam here during 2004. If the DNSBLs continue to disappear under withering DDoS attacks, we'll be okay as we've developed effective blocking rules for spam sent via zombies. Unfortunately, the landscape is changing in potentially disastrous ways, partly due to the success of these antispam methods.
One of these is already being seen by AOL and Outblaze and other large ISPs or mail service providers -- instead of sending their spam directly through zombies to the victim's mail server, some spammers are sending through the mail servers (known as "smarthosts") of the compromised computer's own ISP, making it far more difficult to block without also blocking legitimate email from the same server. What this means for everyone is that the spam load is likely to get a lot worse, legitimate mail is likely to become less reliable as frustrated administrators block the mail servers of recalcitrant ISPs, and spam is likely to shift to other mediums. There have already been reports of Nigerian advance fee fraud scammers using AOL Instant Messenger, cell phone text messaging services, and other mechanisms. I have a bogus Spanish lotto letter, sent to me on paper, from last year -- showing that the scammers already adapted far older tricks to email, so they're unlikely to give up if email becomes hostile to them.
ISPs will no longer be able to ignore the spam flowing freely from their networks just because it hasn't shown up on their mail server's logs. They will need to adopt a strategy of rate-limiting the amount of email you can send from your cable modem or DSL line. Availability of already overburdened mail servers will become even worse in the future. The spam that used to be sent direct from compromised broadband customers' computers will clog the mail servers of the ISPs themselves. Corporate customers communicating with others using consumer ISP email accounts will see even more delays and lost mail, and vice versa, breaking the connection between customer and company.
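To show what rate-limiting might look like in practice, here is a minimal Python sketch of a sliding-window limiter that an ISP's outbound mail server could apply per customer address. The class name, the limits, and the idea of keying on the customer's IP address are all assumptions made for illustration; this is not a description of any particular ISP's system.

```python
from collections import defaultdict, deque


class SendRateLimiter:
    """Allow at most `max_messages` per customer in any `window_seconds`."""

    def __init__(self, max_messages: int, window_seconds: float):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self._sent = defaultdict(deque)  # customer -> timestamps of sends

    def allow(self, customer_ip: str, now: float) -> bool:
        """Record and permit a send at time `now`, or refuse if over limit."""
        recent = self._sent[customer_ip]
        # Forget sends that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_messages:
            return False
        recent.append(now)
        return True


# Hypothetical policy: two messages per minute per customer.
limiter = SendRateLimiter(max_messages=2, window_seconds=60.0)
print([limiter.allow("192.0.2.44", t) for t in (0.0, 1.0, 2.0, 60.0)])
```

A person writing to friends rarely notices a limit like this, while a zombie trying to push thousands of messages an hour runs into it almost immediately. Choosing the actual numbers is a policy decision for each ISP.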
And yet, in the midst of all this, the executives at MCI refuse to terminate hosting for a company that sells software used to find and exploit holes that can later be used to send spam. And at the same time, one of these same executives is being awarded the Association for Computing Machinery's Turing Award, the equivalent of the Nobel Prize for computer science. It's clear where the priorities of many major ISPs lie -- until it starts to cost them money instead of earning it, spam will continue unabated.
The shift from open relays to zombies was a sea change in the war against spammers. We've successfully fought the zombies, but dark clouds are once again gathering on the horizon, and spammers look likely to frustrate our efforts once again, unless ISPs take drastic actions and consumers take responsibility for securing their computers, cutting the cycle at the roots. If they don't, expect to see a lot more spam in 2005 and the years following, and say goodbye to email as an effective medium.