Bayesian Spam Filters

Published 12/26/03

Follow-Up:

Several folks wrote to warn me about the dangers of recommending “bouncing” spam back to the senders. To quote Roger Matus, whose company makes InBoxer anti-spam software:

The idea of bouncing is seductive. As you say, it “might help get you off some lists”, although it does not help for most. The problem is that one of the latest spamming methods is to spoof a legitimate address or domain. It is bad enough to be impersonated. But, when thousands of people bounce messages to the innocent victim of spoofing, the bounce feature can bring down entire email accounts and corporate servers. The bounce also adds to the network traffic, without much benefit.

Let me get his second point off the table, just because I disagree with it. Sorry! I just don’t believe that we’re talking about so much bandwidth as to make a difference. (There used to be an organization called the “Bandwidth Preservation Society” that offered suggestions like “do all your downloading late at night when demand is low” and other, similar ideas. Silly then, silly now.)

But his first point has merit. As the owner of several domains, every now and then I see bounces coming to them — these are bounces caused by some spammer spoofing one of my domains. In other words, the junk looks like it’s coming from, say, whizkid.com, but isn’t. So I get the bounces. That seems to be a good argument against bouncing spam; half the time or more you’ll be annoying some innocent person.

But there is some merit to bouncing, just as there is some merit to that CAN-SPAM Act.

If you look through your spam (yes, that’s like the vet telling you to “examine the animal’s feces”) you might notice there are two kinds. Type 1 spam comes from legitimate businesses that have deluded themselves into thinking they have a right to pitch you; maybe they’re the “business partners” of some store you shopped at. These are easy to spot: The subject line is plain as day, e.g., “Get your next car at the price you want!” or “Cheap cigarettes!” There might even be a working “unsubscribe” button and/or a valid postal address in the message.

Type 2 spam is the true, vile junk: Porn, generic drugs, and that ilk. These are the spammers I mentioned in my column who make their money just by your opening the message. They’re the folks who hide their real mailing addresses and who modify the subject line with random characters and other such tricks to try to fool spam filters.

There is no point in trying to bounce Type 2 spam (at least until someone comes up with a way of tracing the actual sender). It’ll just go to some unsuspecting schlub whose domain is being spoofed.

But there is hope in bouncing that Type 1 spam. There’s a decent chance, as many of these are legit businesses, that your bounce will go to the right place: a machine that reads it and takes your address off the list, thinking that it’s no good.
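To make the Type 1 versus Type 2 distinction a bit more concrete, here is a rough Python sketch of how a filter might decide whether a piece of spam is worth bouncing. Everything in it is my own assumption for illustration: the function names, the "unsubscribe" check, and the letters-mixed-with-digits test are guesses at the signals described above, not how InBoxer or any other real filter works.

```python
def has_random_token(subject: str) -> bool:
    """Return True if the subject contains a word that mixes letters and digits.

    Hypothetical stand-in for "random characters in the subject line":
    a word of 5+ characters containing both a letter and a digit, e.g. "xq7zk2".
    """
    for word in subject.split():
        word = word.strip(".,!?:;\"'")
        has_digit = any(c.isdigit() for c in word)
        has_alpha = any(c.isalpha() for c in word)
        if len(word) >= 5 and has_digit and has_alpha:
            return True
    return False


def should_bounce(subject: str, body: str) -> bool:
    """Guess whether a spam message is 'Type 1' and so worth bouncing.

    Assumed rule of thumb: a plain subject line plus an unsubscribe
    mention suggests a real business whose list software may read the bounce.
    """
    if has_random_token(subject):
        # Looks like Type 2: the sender is probably spoofed, so don't bounce.
        return False
    return "unsubscribe" in body.lower()


if __name__ == "__main__":
    print(should_bounce("Cheap cigarettes!", "Click here to unsubscribe."))
    print(should_bounce("Hot deal xq7zk2", "Click here to unsubscribe."))
```

A check this crude will misjudge plenty of messages (a subject mentioning an "mp3player" would trip the random-token test), so treat it as a picture of the idea rather than something to run on your inbox. When in doubt it leans toward not bouncing, since the cost of a wrong bounce lands on somebody innocent.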

So while the people who wrote to say that spam filters that bounce might not be a good idea are partially right, I think there’s a good argument for carefully bouncing your spam.


The Fray


Rich Olson says:

I have a friend who had her email address hijacked by a spammer. She got hundreds of “bounces” back – making her email address unusable.

Most of these were probably “real” bounces – but any additional ones generated by anti-spam software compound the problem. It’s not just the bandwidth (which is generally trivial) – it’s the time people have to spend sorting through it.

June 21st, 2005 at 11:23 PM

Max says:

If the bouncebacks at least included a generous snippet of the original spam content, then the bouncebacks would get tagged and shoved into the junk mail folder (or whatever) of the victim email address. That at least makes it possible to continue to use a mailbox which has had its address hijacked in this way. Sending the more common form of bounceback, which just has an error message of some sort and the original title of the offending email, obviously saves a little bandwidth but ultimately contributes to a serious problem.

August 2nd, 2005 at 12:15 PM
