Share Your Filters here!

Feedback, Suggestions, Bug reports about G-Lock SpamCombat software.

Moderators: Alex Markov, marisp

Postby Dausenkunz » Sun Jan 11, 2004 12:54 am

First of all, I'd like to say a big thanks to the makers of SpamCombat. This software is really great.

I found that some of the filters are a bit too picky for my taste. For example, some legitimate mails have been marked as 'Online Pharmacy spam'. I suggest fixing this by reducing the maximum number of wildcard characters allowed between the letters of each word from five to a more modest setting (2 or 3).
In Edit > Blacklist:Header or Body > Online Pharmacy spam
I replaced the Value:

(?i)(pharmacy)|(medication)|(prescription)|((p(.{1,5})?h(.{1,5})?
a(.{1,5})?r(.{1,5})?m(.{1,5})?a(.{1,5})?c(.{1,5})?y))|((m(.{1,5})
?e(.{1,5})?d(.{1,5})?i(.{1,5})?c(.{1,5})?a(.{1,5})?t(.{1,5})?i(.{1,5})
?o(.{1,5})?n))|((d(.{1,3})?r(.{1,3})?u(.{1,3})?g(.{1,3})?s))

with:

(?i)(pharmacy)|(medication)|(prescription)|((p(.{1,3})?h(.{1,3})?
a(.{1,3})?r(.{1,3})?m(.{1,3})?a(.{1,3})?c(.{1,3})?y))|((m(.{1,3})
?e(.{1,3})?d(.{1,3})?i(.{1,3})?c(.{1,3})?a(.{1,3})?t(.{1,3})?i(.{1,3})
?o(.{1,3})?n))|((d(.{1,3})?r(.{1,3})?u(.{1,3})?g(.{1,3})?s))

This seems to make the filter more tolerant. If anybody else has some useful ideas for blacklist or whitelist filters they would like to share then please post them here.
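
To see what narrowing the gap does, here is a minimal Python sketch. It assumes Python's re module handles this syntax the same way SpamCombat's engine does, which is not confirmed here. The helper spaced_pattern and the sample strings are made up for illustration.

```python
import re

def spaced_pattern(word, max_gap):
    """Build p(.{1,N})?h(.{1,N})?... for one word (hypothetical helper)."""
    gap = "(.{1,%d})?" % max_gap
    return re.compile("(?i)" + gap.join(re.escape(ch) for ch in word))

loose = spaced_pattern("pharmacy", 5)  # gap used by the original value
tight = spaced_pattern("pharmacy", 3)  # gap suggested in this post

samples = [
    "Cheap p-h-a-r-m-a-c-y deals",           # 1 filler character per gap
    "p....h....a....r....m....a....c....y",  # 4 filler characters per gap
]
for text in samples:
    # Columns: matched by the loose pattern, matched by the tight pattern.
    print(bool(loose.search(text)), bool(tight.search(text)), text)
```

The second sample is caught only by the five-character version, so the tighter setting trades some coverage of heavily padded words for fewer accidental matches.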

Thanks
Dausenkunz
Newbie
Posts: 1
Joined: Sun Jan 11, 2004 12:33 am

Postby DC » Thu Jan 22, 2004 5:07 pm

Since SpamCombat has no filter by country, for those who cannot read messages written with non-Western European characters (Chinese, Korean, Japanese, Russian, Arabic, etc.) I suggested the creation of 6 filters (see "non-Western European character sets" in this Forum).

Each of these 6 filters had only one character in the “Subject” field.
These characters are rarely used in Western European languages, but I found them often in the “Subject” of 45 spam messages of this type read by SpamCombat.

The most frequent characters found in these 45 spam messages, and their ANSI codes, were:

Æ: Alt 0198, ¶: Alt 0182, ×: Alt 0215 (it is not an x or X),
ø: Alt 0248, Ø: Alt 0216, ±: Alt 0177

Since then, I have collected 400 more spam messages of this type (~20% of my total spam) and I have added an important new filter with the following character:

Ð: Alt 0208

With these seven filters I had no more false negatives of this type (bad messages marked as good).

The frequency of messages detected by each of these characters, among those 445 spam messages, was:

¶: (30%), Ð: (29%), Æ: (28%),
×: (11%), ø: (1%), Ø: (0.5%), ±: (0.5%)

DC
DC
registered user
Posts: 73
Joined: Tue Dec 30, 2003 8:45 pm
Location: Sao Paulo, Brazil

Postby Alex Markov » Fri Jan 23, 2004 2:45 pm

To blacklist emails with non-English characters in the message header and/or body, you can use a regular expression with the following syntax:

\xhh hexadecimal character (up to 2 hex digits)

For example, the regular expression below catches ANY non-English character:

Code:
[\x80-\xFF]
Alex Markov
Developer
Posts: 34
Joined: Wed Dec 04, 2002 6:05 pm

Postby DC » Fri Jan 23, 2004 4:04 pm

Thanks Alex,

However, I did not mean any non-English character, but non-Western European characters (Asian, Arabic, Cyrillic...), because I cannot read them. The seven filters I suggested do this job very well.

I tested your suggestion and I would be careful with it: all messages written in Spanish would be considered spam. The same goes for French, Italian, Portuguese or German (characters Ä, Ü, etc.).
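
This is easy to reproduce outside SpamCombat. A minimal Python sketch, assuming Python's re module treats [\x80-\xFF] the way SpamCombat's engine does and that the mail is single-byte Western text (the sample subjects are invented):

```python
import re

# Matches any character with a code from 0x80 to 0xFF.
any_high_byte = re.compile(r"[\x80-\xFF]")

subjects = {
    "English": "Meeting tomorrow",
    "Spanish": "Reuni\xf3n ma\xf1ana",         # o-acute (F3), n-tilde (F1)
    "German": "Gr\xfc\xdfe aus M\xfcnchen",    # u-umlaut (FC), sharp s (DF)
    "French": "D\xe9j\xe0 re\xe7u",            # e-acute, a-grave, c-cedilla
}

# True means the subject would be blacklisted by the broad range.
flagged = {lang: bool(any_high_byte.search(s)) for lang, s in subjects.items()}
print(flagged)
```

Only the plain English subject passes. Every accented Western European letter sits in the 0x80-0xFF range, so the broad range is too wide for this purpose.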

DC
DC
registered user
Posts: 73
Joined: Tue Dec 30, 2003 8:45 pm
Location: Sao Paulo, Brazil

Postby Alex Markov » Fri Jan 23, 2004 4:33 pm

Converted to hexadecimal, the ANSI codes of those non-Western European characters are:

0198 = C6
0182 = B6
0215 = D7
0248 = F8
0216 = D8
0177 = B1
0208 = D0

So I would recommend using the following regular expression instead of your 7 filters:

Code:
[\xC6\xB6\xD7\xF8\xD8\xB1\xD0]
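
The decimal-to-hex conversion and the resulting character class can be checked with a short Python sketch. Python's re module is used here only as a stand-in for SpamCombat's engine, which is an assumption. The variable names are illustrative.

```python
import re

# Alt codes (decimal) of the seven characters listed earlier in the thread.
alt_codes = [198, 182, 215, 248, 216, 177, 208]

# Two-digit uppercase hex for each code, e.g. 198 becomes "C6".
hex_codes = [format(code, "02X") for code in alt_codes]

# Assemble the character class exactly as written in the post.
pattern = "[" + "".join("\\x" + h for h in hex_codes) + "]"
print(pattern)

char_class = re.compile(pattern)
# chr(code) is the character with that code; each one must match.
all_match = all(char_class.search(chr(code)) for code in alt_codes)
print(all_match)
```

One class with seven escapes does the same job as seven single-character filters and is easier to extend when new characters turn up.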
Alex Markov
Developer
Posts: 34
Joined: Wed Dec 04, 2002 6:05 pm

Viagra in message body blacklist

Postby Riggs » Sun Jan 25, 2004 10:46 pm

I deleted this filter by mistake. Could someone please post the code? Thanks.
Riggs
Newbie
Posts: 3
Joined: Mon Dec 22, 2003 3:33 am

Blacklist Emails With "Viagra" in the Message Body

Postby Alex Markov » Mon Jan 26, 2004 10:19 am

Regular expression:

Code:
(?i)(v(\s{0,5})?[i1](\s{0,5})?a(\s{0,5})?g(\s{0,5})?r(\s{0,5})?a)|(v[i1|l][@a]gr[@a])|(VIAGRA)
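
For anyone who wants to try the expression before adding it, here is a minimal Python sketch. It assumes Python's re module interprets the pattern like SpamCombat does. The sample lines are invented.

```python
import re

viagra = re.compile(
    r"(?i)(v(\s{0,5})?[i1](\s{0,5})?a(\s{0,5})?g(\s{0,5})?r(\s{0,5})?a)"
    r"|(v[i1|l][@a]gr[@a])|(VIAGRA)"
)

samples = [
    "Buy V I A G R A now",       # letters separated by spaces
    "v1agra at low prices",      # digit 1 instead of i
    "vl@gr@ special",            # l and @ substitutions
    "Lunch on Friday",           # ordinary mail
    "Flight via Grand Rapids",   # ordinary mail containing "via Gra"
]

# True means the line would be blacklisted.
hits = [bool(viagra.search(text)) for text in samples]
for text, hit in zip(samples, hits):
    print(hit, text)
```

The last sample is also caught, because the optional whitespace lets "via Gra" match across a word boundary. This is worth keeping in mind if legitimate mail starts getting flagged.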
Alex Markov
Developer
Posts: 34
Joined: Wed Dec 04, 2002 6:05 pm

Blacklist Virus Emails

Postby marisp » Tue Jan 27, 2004 5:09 pm

Since version 1.30, SpamCombat shows the attached file name when an email has an attachment, so you can block emails with viruses and other suspicious attachments.

Here are the regular expressions that catch emails with the W32.Novarg virus in the email body and attachment:

W32.Novarg in Message Body - select MessageBody and add:

Code:
(?i)(mail transaction failed\. partial message is available)|(The message contains Unicode characters and has been sent as a binary attachment)|(The message cannot be represented in 7\-bit ASCII encoding and has been sent as a binary)


W32.Novarg in Attachment - select X-GSC-Attachment and add:
Code:
(?i)(document|readme|doc|text|file|data|test|message|body)\.(exe|pif|scr|cmd|bat|zip)


Here is the regular expression that blocks emails with any suspicious attachment:

Suspicious Attachment - select X-GSC-Attachment and add:
Code:
(?i)\.(exe|pif|scr|bat|cmd)


You can add these regular expressions to the SpamCombat blacklist. Because they are regular expressions, enable the RegExp checkbox when you add them.
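
As a rough check of the suspicious-attachment expression, here is a minimal Python sketch. It assumes the X-GSC-Attachment field contains just the file name and that Python's re module behaves like SpamCombat's engine. Neither is confirmed here, and the file names are invented.

```python
import re

suspicious = re.compile(r"(?i)\.(exe|pif|scr|bat|cmd)")

names = [
    "readme.PIF",           # matched regardless of case
    "setup.exe",
    "photo.jpg.scr",        # double extension
    "invoice.pdf",          # not in the list
    "notes.exercise.txt",   # contains ".exe" in the middle of the name
]

# True means the attachment name would be blacklisted.
blocked = [bool(suspicious.search(name)) for name in names]
for name, hit in zip(names, blocked):
    print(hit, name)
```

Because the expression is not anchored to the end of the name, the last sample is blocked too. That errs on the side of caution, which may be acceptable for a virus filter.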
marisp
Site Admin
Posts: 3128
Joined: Mon Feb 25, 2002 4:11 pm

Postby tomtom » Wed Jan 28, 2004 12:30 am

Where is the attached filename shown? I've tried looking at several e-mails with attachments but cannot see where the attachment filenames are displayed. I've looked in the main program display and also in the list of fields but no attachments shown. Am I on the wrong track here? :?
tomtom
registered user
Posts: 10
Joined: Mon Dec 29, 2003 11:33 pm

Postby tomtom » Wed Jan 28, 2004 10:23 am

Oops - the messages that were supposed to have attachments didn't have any. I just received a message with a 'real' attachment and it showed up OK. Forget my last post - thx.
tomtom
registered user
Posts: 10
Joined: Mon Dec 29, 2003 11:33 pm

Postby John Fitzsimons » Mon Feb 02, 2004 7:39 am

Regarding tomtom's query: shouldn't messages with attachments have something like Content-Type: multipart/alternative; in the headers (or body?)?

Also, FWIW, a partial filter that might be handy (not checked with SpamCombat yet) is one I came across in another newsgroup:

Code:
(?ms)Subject:[\s]*(test|hello|hi|error).+\b(zip|exe|bat|cmd|pif|scr)\b


Not sure how that would need to be changed to work with SpamCombat though.
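
For reference, this is how the expression behaves on a complete raw message in Python, where (?ms) lets . match across line breaks. It is only a sketch: the two messages are invented, and nothing here shows how SpamCombat would apply the pattern to its own fields.

```python
import re

novarg_like = re.compile(
    r"(?ms)Subject:[\s]*(test|hello|hi|error).+\b(zip|exe|bat|cmd|pif|scr)\b"
)

# Invented raw messages: headers, blank line, then body text.
suspect = (
    "From: someone@example.com\r\n"
    "Subject: hi\r\n"
    "\r\n"
    'Content-Disposition: attachment; filename="body.zip"\r\n'
)
harmless = (
    "From: someone@example.com\r\n"
    "Subject: Quarterly report\r\n"
    "\r\n"
    "See the attached PDF.\r\n"
)

suspect_hit = bool(novarg_like.search(suspect))
harmless_hit = bool(novarg_like.search(harmless))
print(suspect_hit, harmless_hit)
```

The pattern needs the Subject line and the attachment name in the same piece of text, so it only makes sense where the filter sees the whole raw message at once.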

Regards, John.
John Fitzsimons
registered user
Posts: 12
Joined: Fri Jan 30, 2004 12:11 am

Postby marisp » Mon Feb 02, 2004 12:17 pm

Code:
(?ms)Subject:[\s]*(test|hello|hi|error).+\b(zip|exe|bat|cmd|pif|scr)\b


I realize that the regular expression above is intended to catch emails with the W32.Novarg virus. It does not work yet in the current version of SpamCombat. Please see the post above (Blacklist Virus Emails), which describes the regular expressions for catching virus emails.
marisp
Site Admin
Posts: 3128
Joined: Mon Feb 25, 2002 4:11 pm

filtering messages with non-Western characters

Postby DC » Fri Mar 05, 2004 5:41 am

Some time ago (see above), I suggested seven filters for those who cannot read messages written with non-Western characters (Chinese, Korean, Japanese, etc.).

Since then, I have identified two more characters that catch the spam missed by the previous filters (their ANSI codes in parentheses):

½ (Alt 0189), » (Alt 0187)

The hexadecimal codes (to use in regular expressions) for these two characters are:

\xBD and \xBB.

I added these two codes to the seven I already had, and the regular expression now looks like this:

[\xC6\xB6\xD7\xF8\xD8\xB1\xD0\xBB\xBD]

I have been using this expression in the blacklist ("Subject" field) for about two weeks.
Since then, SpamCombat has classified as spam ALL the messages I received (about 200) written with non-Western characters. :)
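
A small Python sketch of what the nine-character class does. It rests on two assumptions that are not confirmed in this thread: that a Chinese subject arrives as GB2312 bytes and is displayed as single-byte Latin-1 text, and that Python's re module matches the way SpamCombat's engine does. The sample subjects are invented.

```python
import re

non_western = re.compile(r"[\xC6\xB6\xD7\xF8\xD8\xB1\xD0\xBB\xBD]")

# Assumption: GB2312 bytes read one byte per character, as a Western
# mail client without the right font or charset would show them.
chinese_as_latin1 = "中国".encode("gb2312").decode("latin-1")

subjects = {
    "Chinese (as bytes)": chinese_as_latin1,
    "Spanish": "Reuni\xf3n ma\xf1ana",
    "German": "Gr\xfc\xdfe aus M\xfcnchen",
    "English": "Meeting tomorrow",
}

# True means the subject would be blacklisted.
flagged = {name: bool(non_western.search(s)) for name, s in subjects.items()}
print(flagged)
```

Under these assumptions the Chinese subject is caught through its D0 byte, while the Spanish and German subjects pass because their accented letters are not in the class.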

DC

By the way, the page below has a nice converter of characters from normal to ANSI and to hexadecimal codes:
http://code.cside.com/3rdpage/us/unicode/converter.html
DC
registered user
Posts: 73
Joined: Tue Dec 30, 2003 8:45 pm
Location: Sao Paulo, Brazil

Postby alex0305 » Thu Mar 11, 2004 7:35 pm

How can you blacklist messages where the "To:" does not contain my email address? I've looked in help and can't see it. Thanks.
alex0305
Newbie
Posts: 4
Joined: Sat Mar 06, 2004 5:39 pm

Postby marisp » Fri Mar 12, 2004 2:23 pm

First, add a condition To: your email address to the whitelist. Because the whitelist takes priority, all emails with your email address in the To field will be automatically marked as good.

Then add this regular expression to the blacklist:
Code:
(?i)[\S\d\.]+@[\S\d\.]+

This regular expression will catch any email address in the To field and automatically mark the message as spam.
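
The whitelist-first logic can be sketched in Python as follows. The address me@example.com and the function classify are hypothetical, and the sketch only imitates the order of checks described in this post, not SpamCombat's actual implementation.

```python
import re

MY_ADDRESS = "me@example.com"  # hypothetical address, replace with your own
any_address = re.compile(r"(?i)[\S\d\.]+@[\S\d\.]+")

def classify(to_field):
    """Apply the whitelist first, then the catch-all blacklist."""
    if MY_ADDRESS.lower() in to_field.lower():
        return "good"       # whitelist: the To field contains my address
    if any_address.search(to_field):
        return "spam"       # blacklist: some other address is present
    return "undecided"      # no address at all, neither rule applies

for to_field in ["Me <me@example.com>",
                 "someone.else@example.net",
                 "undisclosed-recipients:;"]:
    print(classify(to_field), "|", to_field)
```

A To field without any address, such as an undisclosed-recipients list, is matched by neither rule, so it would need a separate filter.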
marisp
Site Admin
Posts: 3128
Joined: Mon Feb 25, 2002 4:11 pm
