[TriLUG] Spamassassin question - Bayesian filtering

Jon Carnes jonc at nc.rr.com
Thu Mar 13 14:00:56 EST 2003


The training looks like a pain in the a**, but I think you could make it
easier on the folks by setting up some scripts to accept forwarded
messages from your local users.

Local users would forward mistakenly tagged messages to one of two
addresses:
  sa_nospam - indicating that this shouldn't have been marked as spam
  sa_spam - indicating that this spam message slipped through

It would be up to you, Jeremy, to have your script restore each message
to its original form (stripping whatever the forward added) and then
submit it back via the sa-learn command.

Just an idea, but it may work (and it would make a good contribution
back to the community).
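
To make that a bit more concrete, here is a rough sketch of the
training half of such a script. It is only an illustration, not a
tested setup: the function names and the SA_LEARN override are made up
for this example, it assumes the original message has already been
recovered into a file, and it assumes your sa-learn accepts --spam,
--ham, --forget and --file (check the man page for your version).

```shell
#!/bin/sh
# Hypothetical helpers for the sa_spam / sa_nospam aliases described above.
# Assumption: the forwarded mail has already been turned back into the
# original message and saved to a file before these functions are called.

# Map the alias a user forwarded to onto the matching sa-learn option.
sa_flag_for_alias() {
    case "$1" in
        sa_spam)   echo "--spam" ;;   # spam that slipped through
        sa_nospam) echo "--ham"  ;;   # legitimate mail wrongly tagged
        *)         return 1 ;;        # unknown alias: refuse to train
    esac
}

# Train on one message file. SA_LEARN may be overridden (e.g. with "echo")
# to preview the command without touching the Bayes database.
learn_message() {
    flag=$(sa_flag_for_alias "$1") || {
        echo "unknown alias: $1" >&2
        return 1
    }
    ${SA_LEARN:-sa-learn} "$flag" --file "$2"
}

# Conservative correction for auto-learned mistakes: forget the earlier
# verdict first, then learn the message with the right one.
relearn_message() {
    sa_flag_for_alias "$1" >/dev/null || return 1
    ${SA_LEARN:-sa-learn} --forget --file "$2" && learn_message "$1" "$2"
}
```

Something like "relearn_message sa_spam /path/to/message" would then
cover a missed spam. Running it with SA_LEARN=echo first shows the
commands without changing the database. Whether the explicit --forget
step is actually required is exactly Jeremy's question; doing it anyway
should be harmless.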

Jon 

On Thu, 2003-03-13 at 13:11, Jeremy Portzer wrote:
> Good afternoon folks,
> 
> I've been playing around with the new SpamAssassin, version 2.50, which
> includes Bayesian filtering (see http://www.paulgraham.com/spam.html for
> the paper about this, mentioned at ESR's talk, and see the man page for
> the "sa-learn" command).
> 
> As per the sa-learn man page, the default in SA 2.50 is to operate in
> unsupervised auto-learning mode.  This means that mail is added to the
> "ham" or "spam" database based on whether SpamAssassin's other rules
> mark it as spam or not.  The man page mentions that this "should be
> supplemented with some supervised training in addition, if possible."
> 
> How do I go about "supplementing" the auto-learning mode?  One problem I
> can see with auto-learning is that missed spams get learned as "ham"
> (non-spam) and could mess up the database.  So I'm collecting these
> mistakes, but how do I properly adjust the database?  Do I need to make
> it "forget" the mistaken emails first, and then run them through
> sa-learn with --spam?   Or is running them through with --spam enough?
> 
> Anyone know of resources/HOWTOs/examples with actual commands, instead
> of generalized statements like "supplement with supervised training" ?
> 
> ====
> 
> If anyone else is interested in testing SpamAssassin, it is installed on
> the TriLUG mail server now.  Just put something like this in your
> .procmailrc :
> 
> :0fw
> | /usr/bin/spamc
> 
> Then your spam will be marked with the X-Spam-Status header, which you
> can filter on if you like.
> 
> Regards,
> Jeremy
> 
> -- 
> /=====================================================================\
> | Jeremy Portzer       jeremyp at pobox.com       trilug.org/~jeremy     |
> | GPG Fingerprint: 712D 77C7 AB2D 2130 989F  E135 6F9F F7BC CC1A 7B92 |
> \=====================================================================/




