Jim's Depository
I write from the end of June 2008, having just completed a quarterly spam analysis and adjustment. What follows is a brief description of the mail community, the incoming mail stream, how I process it, and the results.
The Mail Community
The Incoming Mail Stream
The Process
The Results
Maintenance
The bogofilter works best if it is trained regularly to follow spam trends. In the past I have manually sorted thousands of messages into good and bad piles for training, but that is mind-numbing. For ongoing training I do the following:
Results
The end result is that I spend dozens of man-hours per year to stop 250,000 spam messages. I'd just hire Google to front-end filter our mail for $3/address/year, but the security policy won't allow that.
comment by jim, 2 months ago
Going forward: I will have to drop dcc. Their licensing is no longer free enough to be distributed by Debian. That will slow more messages, but in practice anything dcc catches is also caught by spamassassin.
I'd like to add an adaptive whitelist out front to prevent false positives and give me a stream of known good messages for training the bogofilter. I haven't found one I like yet, but I keep looking. Maybe I'll have to write it.
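A minimal sketch of what such an adaptive whitelist might look like, assuming it learns from the recipients of outgoing mail and then lets known correspondents bypass the filters. The class name, the `route` helper, and the routing labels are all hypothetical, and the sketch ignores the fact that From headers are easily forged.

```python
from email.utils import parseaddr


class AdaptiveWhitelist:
    """Hypothetical sketch: remember addresses that local users write to."""

    def __init__(self):
        self._known = set()

    @staticmethod
    def _normalize(header_value):
        # parseaddr returns (realname, address); address is '' if unparsable.
        _, addr = parseaddr(header_value)
        return addr.strip().lower()

    def learn_outgoing(self, recipient_header):
        """Record the recipient of a message sent by a local user."""
        addr = self._normalize(recipient_header)
        if addr:
            self._known.add(addr)

    def is_known_good(self, from_header):
        """True if the sender has previously been written to."""
        addr = self._normalize(from_header)
        return bool(addr) and addr in self._known


def route(whitelist, from_header):
    """Decide where an incoming message goes (labels are illustrative)."""
    if whitelist.is_known_good(from_header):
        # Skip the spam filters and keep a copy as known-good training mail.
        return "deliver-and-train-ham"
    return "run-spam-filters"
```

Usage would be along these lines: call `learn_outgoing` for each recipient of outbound mail, then call `route` on the From header of each inbound message. Messages routed as known good could be collected as the ham stream for bogofilter training.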
comment by jim, 2 months ago
An extra note on bogofilter: Bogofilter is built with a single user in mind. I'm sure it works better when it has a single user's mail to think about and can rely on the human to tag the false positives and negatives.
In a 150-user common filter you can rely on exactly 0 of them to report their miscategorized spam. If you try to force them to comply, you will find that 10% of them do it backwards and pollute your statistics so badly you have to erase everything and start again.
That said, it works quite well, is speedy, and doesn't rely on external network servers, so it makes a good first line of defense.
The femtoblogger software is being written by Jim Studt. The content of this page is provided by anonymous individuals. If you believe something on this page is inappropriate, contact Jim Studt.