Advanced Mail Filtering Rules

If the basic spam config options are not enough for you, consider using JTAN's Advanced Mail Rule Editor available from the Mailbox Configuration page. It's very cool, very powerful. It's also not for everybody.

First of all, it requires some study and consideration in order to generate consistent and useful rules. By way of trying to help, we will prevent you from entering totally inconsistent rules. Unfortunately, a picky interface is something that drives impatient beginners nuts, sending them off on rants about "annoying" and "hard to use". Moreover, you can most definitely cause yourself extreme confusion and lose mail, or worse, if you use these advanced rules carelessly. Please don't venture any further unless you agree to hold JTAN harmless for mistakes you might make. On the other hand, with these rules you can achieve far finer control than you can with the simplified controls. We do encourage you to use these rules if you can think, and if you can take responsibility for your choices.

How it Works

Your basic configuration still operates, flagging messages based on their score. Advanced rules are then applied. Each rule specifies a conditional decision on some email header or other item. Rules are evaluated in the order given. If the condition is true for the selected item, the action is taken. Normally, if the action "delivers" or "deletes" the mail, no further rules are checked.
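
The evaluation order described above can be sketched in a few lines of Python. This is only an illustration, not JTAN's actual implementation: the Rule class, the run_rules function, and the TERMINAL_ACTIONS set are names made up for the example.

```python
from dataclasses import dataclass
from typing import Callable

# Assumption for this sketch: these actions deliver or delete the mail.
TERMINAL_ACTIONS = {"Delete", "Forward", "Folder"}

@dataclass
class Rule:
    sequence: float                     # position in the rule list
    condition: Callable[[dict], bool]   # test applied to the message
    action: str                         # e.g. "Delete", "Tag", "Folder"

def run_rules(rules: list[Rule], message: dict) -> list[str]:
    """Return the actions taken, in order, stopping at the first terminal one."""
    taken = []
    for rule in sorted(rules, key=lambda r: r.sequence):
        if rule.condition(message):
            taken.append(rule.action)
            if rule.action in TERMINAL_ACTIONS:
                break  # mail was delivered or deleted; later rules never run
    return taken

rules = [
    Rule(1, lambda m: "viagra" in m["subject"].lower(), "Delete"),
    Rule(2, lambda m: "viagra" in m["subject"].lower(), "Tag"),
]
print(run_rules(rules, {"subject": "Cheap Viagra"}))  # ['Delete']
```

Rule 2 never fires for that message because rule 1 has already deleted it.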

Simple Example

Suppose you never want to see any email with the word "viagra" in the subject line. Suppose you are certain that nobody would ever send you legitimate email with that word in the subject. It's easy to make a rule that would delete this mail. Here's what the rule looks like:

  1   Subject   [ ] not   Contains   viagra   Delete   [ ] Del

Let's study the components of the rule from left to right. First we have the digit "1". That tells us this is the first rule in the sequence. If lower down in the list you have a rule that says "viagra" subjects aren't to be deleted, that rule is too late. The mail has already been deleted by the earlier rule. Sequence matters.

After the sequence number, we have the "Item" we are examining. In this case it's the Subject of the email. Other options include the From or To addresses, any header, or numeric quantities like the size of the mail or the "spamness" score.

The next two columns are the "not" checkbox followed by the "Condition". These two are used in combination with each other. The "Condition" decides something about the selected Item. In the above example, it is deciding if the Subject contains a certain word. Conditions typically have a parameter. In this case the parameter is the word we are looking for. Another example of a parameter would be with the condition "Less". You need to say what number the Item must be less than.

The "not" checkbox will reverse the sense of the condition. For example, if we wanted to select all mail that did NOT have the word "viagra" in the subject, we would have checked the "not" box.
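
If it helps to see the "not" checkbox as code, here is a minimal Python sketch of a "Contains" test with an optional negation. The function name contains and its negate argument are invented for this example. The case-insensitive comparison follows the description of "Contains" in the Conditions list.

```python
def contains(item: str, parameter: str, negate: bool = False) -> bool:
    """Case-insensitive substring test; negate plays the role of the "not" checkbox."""
    result = parameter.lower() in item.lower()
    return not result if negate else result

print(contains("Buy VIAGRA now", "viagra"))                 # True
print(contains("Lunch on Friday", "viagra", negate=True))   # True
```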

Finally we have the action and the action parameter. This is what we should do if the condition is true. Since we have selected "Delete" for the action, there is no parameter. If we had chosen an action like "Forward", then we would have to specify the forwarding address.

On the extreme right is a "Del" checkbox for deleting the rule. Don't confuse this checkbox with the "Delete" action.

More Examples

Here's another example that demonstrates the "Flags" item, and the "Folder" action. Suppose you want all your spam sent to a special IMAP folder named "spam". Use a rule like this: Item "Flags", Condition "IsSpam", Action "Folder", action parameter "spam".

With the "Flags" item, we can test to see if the message was called spam by the basic flagging configuration (or your private blacklist). Use the 'IsSpam' flag to make your rules work with the basic configuration.

Here are some other important points to keep in mind when you save to an IMAP folder. First of all, the folder needs to be created before this will work. Add the folder using your email program. MS Outlook will do this if you have set up your mailbox to use IMAP. You can also add IMAP folders with webmail. Remember this is a folder on our server, not your PC, and is therefore, implicitly, a subfolder of your INBOX at JTAN. You don't need to use the INBOX prefix, just the name of the folder. Subfolders within subfolders of your INBOX are not currently supported by advanced rules.

Another possible use for a folder is to save larger messages. This is handy when you normally check your mail with a handheld or a dialup modem that doesn't have a lot of speed. You can have large messages sent to a special "bigmail" folder using a rule that tests the "Size" item with the "Greater" condition and takes the "Folder" action with the parameter "bigmail".

We can go on giving simple examples like this, but hopefully you get the hang of it. If you still have questions, you might want to read the Gory Details section below for more detailed information about the various terms in a rule, as well as how rules fit in with basic processing.

Regular Expression Example

RegExps are very powerful tools. Books have been written about them. Before you die, you owe it to yourself to learn about them. Here is a link to a decent tutorial. A web search will turn up thousands of references.

In the JTAN rule editor, we use procmail compatible "minimal match" regular expressions. There's only one twist. A ^ character at the beginning and a $ at the end are implied. You shouldn't use them. The regexp is pinned to the ends by default. Instead, if you DON'T want a forced match of the whole string, use .* wildcards at the ends.
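
To get a feel for the implied ^ and $, here is a small Python sketch. Python's re module is not the procmail regexp engine, so treat this only as an approximation of the pinning behavior. The function name rule_regexp_matches is made up for the example.

```python
import re

def rule_regexp_matches(pattern: str, item: str) -> bool:
    """Approximate the implied ^...$ by requiring the whole item to match."""
    return re.fullmatch(pattern, item) is not None

print(rule_regexp_matches("viagra", "viagra"))              # True
print(rule_regexp_matches("viagra", "buy viagra now"))      # False
print(rule_regexp_matches(".*viagra.*", "buy viagra now"))  # True
```

The second test fails because the pattern is pinned to both ends of the string. The third succeeds because the .* wildcards soak up the text on either side.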

The following is an example that uses regular expressions. The idea is to delete mail with the word "viagra" in the subject, even if spammers might stick odd characters between the letters, like "v|i|a|g|r|a" or "v_iagra". The question mark makes the preceding element optional, so .? matches 0 or 1 characters.

Of course, the above rule will also match a subject like


  Subject: Come via grandma's favorite road.

So you need to be careful, or expect some lossage. That's why we don't usually recommend the "delete" option unless you are certain you don't mind losing mail.
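
You can try this kind of false positive for yourself with a short Python sketch. The pattern below is an assumption chosen to fit the description above, not necessarily the exact pattern from the rule. Python's re module also only approximates procmail matching.

```python
import re

# Hypothetical pattern: ".?" allows zero or one arbitrary character between
# the letters, and ".*" at both ends un-pins the match from the ends.
PATTERN = r".*v.?i.?a.?g.?r.?a.*"

subjects = [
    "v|i|a|g|r|a",
    "cheap v_iagra here",
    "Come via grandma's favorite road.",  # innocent mail, but it still matches
    "Lunch on Friday",
]
for subject in subjects:
    hit = re.fullmatch(PATTERN, subject) is not None
    print(hit, subject)
```

Only the last subject escapes the pattern. The grandma subject is caught because "via gra" fits the pattern with a single space standing in for one optional character.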

Match Replacement

The regular expression condition supports the procmail-style \/ match separator. If you use the special code \/ in your regexp, and if the regexp matches the item (ignoring the \/ code), then everything to the right of the \/ will be stored in the __MATCH__ variable and available as a string replacement in the action parameter.

For example, if you use a regexp pattern like this

  X-SaveFolder: \/[a-z]+

and you choose the "Folder" action with the parameter __MATCH__, then mail that arrives with the header X-SaveFolder: stuff will be stored in the IMAP folder "stuff", assuming it exists.

Other possible uses for this __MATCH__ replacement include dynamic remailing and forwarding, dynamic subject tagging, and other tricks. The use of __MATCH__ replacement in an action is very powerful, but it can also be hazardous. Make sure that your regexp is sufficiently restrictive so that the action is held in check. Also, it should be noted that to the right of the \/, maximal regexp matching is performed, which is usually what you want.
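
Here is a rough Python sketch of the \/ mechanism, in case it helps to see it as code. It is an approximation under stated assumptions: Python's re engine stands in for procmail's, the minimal versus maximal matching on either side of the separator is not reproduced exactly, and the function names match_with_separator and expand_parameter are made up for the example.

```python
import re
from typing import Optional

SEPARATOR = "\\/"  # the two characters: backslash, slash

def match_with_separator(pattern: str, item: str) -> Optional[str]:
    """Return the text matched right of the separator, or None.

    None means the pattern has no separator or does not match the item.
    """
    left, sep, right = pattern.partition(SEPARATOR)
    if not sep:
        return None
    # Pin the match to both ends and capture only the right-hand part.
    found = re.fullmatch(f"(?:{left})({right})", item)
    return found.group(1) if found else None

def expand_parameter(parameter: str, match: str) -> str:
    """Substitute the captured text into an action parameter."""
    return parameter.replace("__MATCH__", match)

captured = match_with_separator(r"X-SaveFolder: \/[a-z]+", "X-SaveFolder: stuff")
print(captured)                                 # stuff
print(expand_parameter("__MATCH__", captured))  # stuff
```

Notice how the restrictive [a-z]+ keeps the action in check: a header like "X-SaveFolder: Stuff9" simply fails to match, so nothing odd can end up in the folder name.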

Applying a Set of Rules to All Mailboxes

If you want to apply a set of rules to email coming in to any of your mailboxes, pick one of your mailboxes to hold your "master" rules. Then, in each of your other mailboxes, add just one rule with the "Include" action to include the rule set from your master.

The "Include" action includes the rules from the mailbox given in the parameter. Rules are always included, so there can be no Item or Condition. Only one level of include recursion is permitted. Processing continues with the next rule in the current mailbox after the include, assuming none of the included rules delivered the mail.

Therefore, you can both include master rules and have special rules for individual mailboxes. You can also override a master rule by listing other rules before the Include.
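
As a sketch of how this plays out, the following Python snippet flattens one level of Include into a single ordered rule list. It is an illustration only. The Rule class, the expand_includes function, the mailbox names, and the choice to silently skip nested Includes are all assumptions of the example.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    action: str
    parameter: str = ""
    label: str = ""   # only here to make the example output readable

def expand_includes(mailbox: str, rule_sets: dict[str, list[Rule]]) -> list[Rule]:
    """Flatten one level of Include; nested Includes are skipped, not followed."""
    expanded = []
    for rule in rule_sets[mailbox]:
        if rule.action == "Include":
            included = rule_sets.get(rule.parameter, [])
            expanded.extend(r for r in included if r.action != "Include")
        else:
            expanded.append(rule)
    return expanded

rule_sets = {
    "master": [Rule("Delete", label="master: delete viagra subjects")],
    "sales": [
        Rule("FlagOK", label="sales: trust the boss"),  # runs before the master rules
        Rule("Include", "master"),
        Rule("Folder", "bigmail", label="sales: file large mail"),
    ],
}
print([r.label for r in expand_includes("sales", rule_sets)])
```

The "sales" rule listed before the Include gets first crack at the mail, which is how a mailbox can override a master rule.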

The Gory Details

Most of the properties and features of the rule system are straightforward, but like any system that tries to be general purpose, there are always specific issues that need to be defined and explained.

First of all, let's explain how the advanced rules fit in with basic spam tagging, and the black/white lists on the "basic" config page.

SpamAssassin always scores your mail. The X-JTAN-Antispam header will always appear, giving the spam score. Similarly, the X-JTAN-SpamScore header will have a series of "s" characters ("sssssss") that indicate the spam score.

In addition to this score, the basic spam configuration will set an internal flag in mail processing that indicates the mail is 'spam' should the score exceed the threshold you have set in your basic config. After the basic scoring and flagging based on the score, if the sender appears in the blacklist in your basic config, the spam flag will be set regardless of the score. Similarly, should the sender appear in your whitelist, the internal spam flag is always cleared. The whitelist has the final word if the sender is in both the black and white list.
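
The flagging order in the paragraph above boils down to three steps, sketched here in Python. The function name basic_spam_flag and its arguments are invented for the example.

```python
def basic_spam_flag(score: float, threshold: float, sender: str,
                    blacklist: set[str], whitelist: set[str]) -> bool:
    """Compute the internal spam flag the way the basic configuration is described."""
    flag = score > threshold      # 1. flag if the score exceeds your threshold
    if sender in blacklist:
        flag = True               # 2. blacklist sets the flag regardless of score
    if sender in whitelist:
        flag = False              # 3. whitelist clears it and has the final word
    return flag

# A sender in both lists is never flagged.
print(basic_spam_flag(9.0, 5.0, "pal@example.com",
                      blacklist={"pal@example.com"},
                      whitelist={"pal@example.com"}))   # False
```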

Advanced rules are processed next after the internal spam flag is set from the basic configuration. That way, you have a chance to "undo" what the basic scoring does using advanced rules. They can flip the spam flag, or they can do their own actions based on that flag or other conditions.

Once the advanced rules complete, if the mail still hasn't been deleted or delivered, the mail will be disposed of according to the basic spam configuration. That is, if it is still flagged as spam, it will be either deleted, subject-tagged, or accepted as per your preference. Otherwise it will be delivered as normal.

Obviously, sequence can be important. Keep in mind that the basic spam scoring and spam flagging happen first, then the black and whitelists get a second crack at flagging (or unflagging) the mail, then the advanced rules get run. The advanced rules can deliver or delete. If any mail remains after the advanced rules, the basic spam action will take place if the spam flag is set.
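
The last step of that sequence, what happens to mail once the advanced rules are done with it, can be sketched like this. The function name final_disposition and the strings it returns are made up for the example. The three basic actions mirror the delete, subject-tag, and accept preferences mentioned above.

```python
def final_disposition(handled_by_rules: bool, spam_flag: bool, basic_action: str) -> str:
    """Decide what happens after the advanced rules have run.

    basic_action is your basic spam preference: "delete", "tag", or "accept".
    """
    if handled_by_rules:
        return "done"         # an advanced rule already delivered or deleted it
    if spam_flag:
        return basic_action   # still flagged: the basic spam action applies
    return "deliver"          # not flagged: normal delivery

print(final_disposition(False, True, "tag"))     # tag
print(final_disposition(False, False, "tag"))    # deliver
print(final_disposition(True, True, "delete"))   # done
```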

If you don't want to use advanced rules, keep them blank. If you don't want the basic score flagging, set your basic threshold to 100.

Because of the importance of ordering, you may want to move rules around after you have written them. To rearrange rules, change their sequence numbers to the order that you want them to be tested. A decimal fraction sequence number can be used to move a rule between two others. Rule 0 would be before rule 1. And so on.
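
Here is a tiny Python sketch of reordering with a fractional sequence number. The rule descriptions are invented for the example.

```python
rules = {
    1: "delete viagra subjects",
    2: "file spam in the spam folder",
    3: "file large mail in bigmail",
}

# Move rule 3 between rules 1 and 2 by giving it a fractional sequence number.
rules[1.5] = rules.pop(3)
# A rule numbered 0 is tested before rule 1.
rules[0] = "clear the spam flag for the boss"

ordered = [rules[seq] for seq in sorted(rules)]
for description in ordered:
    print(description)
```

The rules are now tested in the order 0, 1, 1.5, 2.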

It can be handy to include rules from another mailbox. Use the 'Include' action for this. You cannot use an Item or Condition with the Include action.

The syntax of the rules themselves can be tricky. When creating rules, it's important to keep items and conditions consistent with each other, and to provide sensible parameters when required. For example, the "Less" condition requires both a numeric Item and a numeric parameter. The interface will do its best to police this, so don't be surprised if you get an error when you say to forward mail to "5".
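
To give an idea of the kind of policing involved, here is a Python sketch of a consistency check. It is not the editor's real validation code. The function check_rule, the error messages, and the simple "@" test for an email address are assumptions of the example.

```python
NUMERIC_ITEMS = {"Size", "Spamness"}
NUMERIC_CONDITIONS = {"Less", "Greater"}

def is_number(text: str) -> bool:
    try:
        float(text)
        return True
    except ValueError:
        return False

def check_rule(item: str, condition: str, parameter: str,
               action: str, action_parameter: str) -> list[str]:
    """Return a list of problems found in a rule; an empty list means it looks sane."""
    errors = []
    if condition in NUMERIC_CONDITIONS:
        if item not in NUMERIC_ITEMS:
            errors.append(f"{condition} needs a numeric Item, not {item}")
        if not is_number(parameter):
            errors.append(f"{condition} needs a numeric parameter, not {parameter!r}")
    if action in {"Forward", "Cc"} and "@" not in action_parameter:
        errors.append(f"{action} needs an email address, not {action_parameter!r}")
    return errors

print(check_rule("Size", "Less", "50000", "Forward", "5"))
```

That last call reports one problem, because "5" is not an email address.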

Keep in mind that some actions terminate processing, while others continue processing. This can be important. Remember, your basic action will only have a chance to happen if no rule 'delivers' your mail.

Items

An "Item" is something tested by a Condition. The following Items are currently supported for rules.

Subject: The subject of the email that appears in the "Subject:" header.
To: The intended recipient of the mail as it appears in the "To:" header. This may be different than the envelope recipient. (You are always the envelope recipient!) Keep in mind that To headers are not safe to use for mail routing.
From: A good guess of the sender's reply address based on the From, Reply-To, or other headers. Keep in mind that spammers universally forge their reply address.
Header: Complete headers with the tag name. This allows you to use a regexp to match special combinations. In the case of Subject, From, and To items, the header tag name (e.g. "Subject:") is stripped off before the test is done. When using the "Header" item, the tag name is not stripped. So if you want to test for a Message-ID header with the word "spammer" in it, you need to use a regexp like "Message-ID:.*spammer.*" for it to work.
Size: The size of the mail message in bytes. This must be used with a numeric condition. It's handy to use this to keep large mails from being forwarded to your palm device or cell phone.
Spamness: The SpamAssassin score of the mail. This must be used with a numeric condition. It's handy to use this for special handling based on spamness score.
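
The difference between the "Subject" and "Header" items can be illustrated with Python's standard email module. This is just a sketch. The sample message is invented, and Python's re module stands in for the procmail-style regexp matching.

```python
import re
from email import message_from_string

RAW = (
    "From: Alice <alice@example.com>\n"
    "To: bob@example.com\n"
    "Subject: Lunch on Friday\n"
    "Message-ID: <123@spammer.example>\n"
    "\n"
    "See you there.\n"
)

msg = message_from_string(RAW)

subject_item = msg["Subject"]   # tag name stripped: just the subject text
header_items = [f"{name}: {value}" for name, value in msg.items()]  # tag name kept
size_item = len(RAW.encode("utf-8"))   # message size in bytes

# A "Header" regexp has to account for the tag name at the front.
hits = [h for h in header_items if re.fullmatch(r"Message-ID:.*spammer.*", h)]
print(subject_item)
print(hits)
```

The same regexp without the "Message-ID:" prefix would fail, because the pattern is pinned to the start of the complete header line.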

Conditions

The following Conditions are currently supported for rules.

Contains: True if the parameter appears anywhere in the item. The test is not case sensitive.
Address: True if the item matches the email address given as the parameter.
Mailer: True if the item is the email address of some automated mailer. This will match mail coming from most mailer-daemons.
Daemon: True if the item is the email address of some daemon. This is a more comprehensive test than Mailer that will match mail coming from most daemons. If you want to know the exact definition of these, see "man procmailrc".
RegExp: True if the regular expression given in the parameter matches the item. This allows you to use procmail compatible regular expression tests. The ^ and $ are implied at the beginning and end, so you don't need to use them. Rather, you need .* at the ends if you don't want your regexp "pinned" to the ends. The \/ separator is supported, setting the __MATCH__ string replacement for action parameters.
Less: True if the item is numerically less than the parameter. Both the parameter and the item must be numeric.
Greater: True if the item is numerically greater than the parameter. Both the parameter and the item must be numeric.
IsWhite: True if the sender is in your whitelist. Must be used with the "Flags" item. There is no parameter.
IsBlack: True if the sender is in your blacklist. Must be used with the "Flags" item. There is no parameter.
IsSpam: True if the mail was already flagged as spam by your basic configuration. Must be used with the "Flags" item. There is no parameter.

Actions

The following Actions are currently supported for rules.

Delete: Deletes the mail. Processing terminates.
Forward: Forwards the mail to an email address given in the parameter. Processing terminates.
Folder: Delivers the mail to the IMAP INBOX sub folder given in the parameter. The sub folder must exist, otherwise delivery is to INBOX itself. Subfolders within subfolders are not supported. Processing terminates.
Tag: Tags the subject line with text given in the parameter. Curly brackets surround the tag. Use this to create custom subject Tags. Processing continues.
FlagSpam: Sets the internal flag indicating that the mail is spam. Based on this flag, the mail will subsequently be processed according to your basic configuration for spam handling. Similar to blacklisting. Processing continues.
FlagOK: Clears any internal flag indicating that the mail is spam. Prevents the mail from subsequently being processed according to your basic configuration for spam handling. Similar to whitelisting. Processing continues.
Cc: Copies the mail to an email address given in the parameter. Processing continues.
Include: Includes the rules from the mailbox given in the parameter. Rules are always included, so there can be no Item or Condition. Only one level of include recursion is permitted. Processing continues with the next rule in the current mailbox after the include, assuming none of the included rules delivered the mail.