[TriLUG] stopping Cyrillic spam.

Cristóbal Palmer cristobalpalmer at gmail.com
Sat Jan 27 21:10:21 EST 2007


The other spam thread we've got going made me think it might be good
to ask this question here. I've done some googling, but what I've
found hasn't led me to anything that actually works. :(

I'm trying to filter subject lines like these:

Fwd[50]: Бюджетирование в предприятии
[87]: СЛУЖБА БРОНИРОВАНИЯ ГОСТИНИЦ Вега,
re[15]: Сделай работу грамотной
[99]: Служба бронирования "Измайлово-Тур"
fwd[6]: Управление производственным предприятием
[5]: ГОСТИНЕЧНЫЕ НОМЕРА В МОСКВЕ.
[0]: Бюджетирование производственного предприяти я
[68]: Типографские услуги
[8]: резюме соискателя на должность помощник дир ектора по персоналу
[7]: Бронирование номеров в гостиницах.
[4]: Ознакомьтесь с проблемами бюджетирования, и научитесь их решать
[1]: Ваш бизнес будет удачным!
E-MAIL РЕКЛАМА - УСПЕХ РАЗВИТИЯ БИЗНЕСА
[5]: Рассылка рекламы по интернету
ГЕНЕРАЛЬНОМУ ДИРЕКТОРУ

I'm getting a LOT of spam like this.

I want to drop another rule in my .spamassassin/user_prefs that will
filter this crap out. I've got lots of rules that look something like:

# block pharmacy crap
header ib_PHARMACY Subject =~ /.*PH.*A*R*M*A+C*Y*.*/i
describe ib_PHARMACY subject contains sequence roughly spelling 'pharmacy'
score ib_PHARMACY 3

But I can't for the life of me come up with a regex that catches all
of the subject lines I gave above. Before anybody suggests ok_locales:
I looked at that and decided I don't want to have to explicitly permit
the mail I do want. I'd rather stay permissive by default and block
specific characters or locales as they become a problem.
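
To make the goal concrete, here is an untested sketch of the kind of
rule I have in mind. It assumes these subjects arrive MIME-encoded with
a Cyrillic charset label (I have not confirmed that for every message),
and it matches the undecoded header using SpamAssassin's :raw modifier.
The rule name and the charset list are my own guesses, not something
taken from the spam itself:

```
# Sketch only: LOCAL_CYRILLIC_SUBJ is a made-up rule name, and the
# charset list is an assumption about how the spam is encoded.
# Subject:raw tests the header before MIME decoding, so the regex
# looks for an encoded-word prefix such as "=?koi8-r?B?".
header LOCAL_CYRILLIC_SUBJ Subject:raw =~ /=\?(?:koi8-r|windows-1251|iso-8859-5)\?[bq]\?/i
describe LOCAL_CYRILLIC_SUBJ Subject is MIME-encoded in a Cyrillic charset
score LOCAL_CYRILLIC_SUBJ 3
```

A rule like this would miss Cyrillic subjects sent as UTF-8 or as
unencoded 8-bit text, so at best it covers part of the problem.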

Any help would be most welcome.

Thanks,

-- 
Cristóbal M. Palmer
UNC-CH SILS Student -- ils.unc.edu/~cmpalmer
TriLUG Vice Chair
"There are many roads to enlightenment, and thus many roads back to
the One True Debian" --crimsun

