“The Merriam-Webster Dictionary”
oracle : one held to give divinely inspired answers or revelations
j-chkmail's oracle is a set of tests about weak spam indicators. Some examples are :
Heuristics may include, among other things, searching for regular expressions in specific parts of a message.
The main goal of these heuristics is not to detect spam on their own, since they are only weak spam indicators. Heuristic filtering isn't a main filtering method, but it can help confirm the results of the two main filtering methods : the Bayes filter and URL filtering.
The number of tests is kept small : fewer than 40 at present. Only really relevant checks are integrated into the oracle.
You will find 5 check categories in the Oracle.
Just enable it !
j-chkmail.cf
# SPAM_ORACLE
# Do heuristic filtering
# Syntax : -----
# VALUES : NO YES
SPAM_ORACLE YES
If you want to use RBLs with the Oracle, take a look at the “Expert users” section.
j-chkmail's oracle uses two configuration files :
/etc/mail/jchkmail/j-tables - this file is used to enable/disable each Oracle test and to assign odds to it.
/etc/mail/jchkmail/j-oradata - this file is used to define unwanted things and to assign odds to them. Unwanted things may be one of : CHARSET, BAD-EXPR, BOUNDARY, MAILER, HTML-TAG.
To change the names of these files, you can edit the j-chkmail.cf file :
j-chkmail.cf
# ORACLE_DATA_FILE
# Some oracle definitions
# Syntax : -----
ORACLE_DATA_FILE j-oradata
# ORACLE_SCORES_FILE
# Oracle scores
# Syntax : -----
ORACLE_SCORES_FILE j-tables
Declare the RBLs you want to use - no more than 16. The more RBLs you declare, the slower the filter will be !
j-chkmail.cf
# RBL
# Real-Time Blacklists (used at Oracle)
# Syntax : RBL[/CODE] - rbl.domain.com/127.0.0.1
RBL rbl.domain.com/127.0.0.1
Enable these RBLs in the j-oradata configuration file and assign odds to each of them.
j-oradata
R00 DISABLE odds=5.000 Realtime Blacklist
R01 DISABLE odds=4.000 Realtime Blacklist
R02 DISABLE odds=10.000 Realtime Blacklist
...
If you want to enable/disable tests or change their odds, edit the j-tables configuration file :
j-tables
C05 DISABLE odds=1.000 SMTP client sending mail to spamtrap
C06 DISABLE odds=1.000 Bad EHLO parameter
C07 DISABLE odds=1.000 Myself EHLO parameter - forged
M01 ENABLE odds=1.000 No HTML nor TEXT parts
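To check quickly which tests are currently enabled, the score lines can be read with a few lines of Python. This is only a sketch: the line layout (code, ENABLE or DISABLE, odds=value, description) is inferred from the excerpt above, and the names `parse_scores` and `SAMPLE` are made up for the illustration; they are not part of j-chkmail.

```python
import re

# Layout assumed from the excerpt: CODE  ENABLE|DISABLE  odds=N  description
LINE_RE = re.compile(r"^(\S+)\s+(ENABLE|DISABLE)\s+odds=([0-9.]+)\s+(.*)$")

def parse_scores(text):
    """Return a list of (code, enabled, odds, description) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        m = LINE_RE.match(line)
        if m:
            code, state, odds, desc = m.groups()
            entries.append((code, state == "ENABLE", float(odds), desc))
    return entries

# Sample lines copied from the excerpt above (hypothetical file content).
SAMPLE = """\
C05 DISABLE odds=1.000 SMTP client sending mail to spamtrap
C06 DISABLE odds=1.000 Bad EHLO parameter
C07 DISABLE odds=1.000 Myself EHLO parameter - forged
M01 ENABLE odds=1.000 No HTML nor TEXT parts
"""

entries = parse_scores(SAMPLE)
enabled = [code for code, on, _, _ in entries if on]
print(enabled)
```

With the four sample lines, only M01 is reported as enabled.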
If you want to modify the list of unwanted things used by some Oracle checks ( CHARSET | BAD-EXPR | BOUNDARY | MAILER | HTML-TAG ), you may edit the j-oradata file :
j-oradata
HTML-TAGS odds=1.66 <script[^<>]*>
HTML-TAGS odds=1.40 <script[^<>]+src=[^<>]+>
HTML-TAGS odds=1.45 <span[^<>]*>
BAD-EXPR odds=20.88 http[s]?://[^ /#]*#[0-9a-f]
BAD-EXPR odds=1.00 http[s]?://[^ /&]*&#[0-9]{1,3}
BAD-EXPR odds=1.03 http[s]?://[^ /@>\\n]*@
BAD-EXPR odds=6.92 http[s]?://[^ /]*[0-9]{1,3}[.][0-9]{1,3}[.][0-9]{1,3}[.][0-9]{1,3}
BAD-EXPR odds=3.91 http[s]?://[^>\n\r *]+\\*http[s]?://
CHARSET odds=13.00 ^big5$
CHARSET odds=9.00 ^euc-kr$
CHARSET odds=4519.00 ^gb2312$
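Before adding an expression to j-oradata, it can help to try it against a few sample strings. The sketch below does this with Python's `re` module, on the assumption that it behaves closely enough to the regex dialect used by j-chkmail for simple patterns like these. The helper names `load_rules` and `matching_rules` and the sample URLs are hypothetical.

```python
import re

# Three lines copied from the j-oradata excerpt above.
ORADATA = r"""
BAD-EXPR odds=6.92 http[s]?://[^ /]*[0-9]{1,3}[.][0-9]{1,3}[.][0-9]{1,3}[.][0-9]{1,3}
BAD-EXPR odds=1.00 http[s]?://[^ /&]*&#[0-9]{1,3}
CHARSET odds=13.00 ^big5$
"""

def load_rules(text):
    """Parse 'CATEGORY odds=N regex' lines into (category, odds, compiled regex)."""
    rules = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3 or not parts[1].startswith("odds="):
            continue  # not a rule line
        category, odds, pattern = parts
        rules.append((category, float(odds[len("odds="):]), re.compile(pattern)))
    return rules

def matching_rules(rules, category, sample):
    """Return (odds, pattern) for each rule of the category found in the sample."""
    return [(odds, rx.pattern) for cat, odds, rx in rules
            if cat == category and rx.search(sample)]

rules = load_rules(ORADATA)
ip_url = matching_rules(rules, "BAD-EXPR", "click http://192.168.10.5/offer")
name_url = matching_rules(rules, "BAD-EXPR", "see http://www.example.com/")
charset = matching_rules(rules, "CHARSET", "big5")
print(ip_url, name_url, charset)
```

Here the URL with a numeric IP address matches only the odds=6.92 expression, and the URL with a host name matches nothing.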
In probability theory and statistics, the odds in favour of an event or a proposition are the quantity p / (1 − p), where p is the probability of the event or proposition. In other words, an event with m to n odds in favour has probability m / (m + n). For example, if you choose a random day of the week, the odds that you choose a Sunday are 1/6 (1 to 6), while the probability is 1/7. These 'odds' are actually relative probabilities.
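The conversion between probability and odds can be checked with a short Python example. It only illustrates the arithmetic of the definition above; it does not describe how j-chkmail combines the odds of several tests internally. The function names are chosen for this illustration.

```python
from fractions import Fraction

def odds_from_probability(p):
    """Odds in favour of an event with probability p: p / (1 - p)."""
    return p / (1 - p)

def probability_from_odds(odds):
    """Inverse conversion: odds / (1 + odds)."""
    return odds / (1 + odds)

# Probability of picking a Sunday among the 7 days of the week.
sunday = Fraction(1, 7)
sunday_odds = odds_from_probability(sunday)
print(sunday_odds)                          # 1/6
print(probability_from_odds(sunday_odds))   # 1/7
```

Exact fractions are used so that the round trip gives back 1/7 without floating-point rounding.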
OBS :
/var/log/j-chkmail shows the tests that were run when checking a mail, which is useful if something gets rejected : you will find the reason there.
/var/log/j-chkmail
Mar 4 17:08:46 mx0 j-chkmail[7771]: [ID 000000 local5.info] 47CD740E.001 ORACLE - M02 text/html without text/plain ( 0.2)
Mar 4 17:08:46 mx0 j-chkmail[7771]: [ID 000000 local5.info] 47CD740E.001 ORACLE - M13 RFC2822 headers compliance ( 1.0)
Mar 4 17:08:46 mx0 j-chkmail[7771]: [ID 000000 local5.info] 47CD740E.001 ORACLE - H06 HTML tag/text ratio ( 0.5)
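To list the Oracle tests reported for a given message, the log lines can be filtered with a small Python script. This is a sketch only: the line layout is assumed from the three sample lines above, the field before "ORACLE" is taken to be a per-message identifier, and the number in parentheses is kept as a plain value without interpreting it. The name `parse_oracle_lines` is made up for the illustration.

```python
import re

# Layout assumed from the sample: <id> ORACLE - <test> <description> ( <value>)
ORACLE_RE = re.compile(
    r"(\S+)\s+ORACLE\s+-\s+(\S+)\s+(.*?)\s*\(\s*([0-9.]+)\)\s*$"
)

def parse_oracle_lines(lines):
    """Return (message_id, test_code, description, value) for each ORACLE line."""
    records = []
    for line in lines:
        m = ORACLE_RE.search(line)
        if m:
            msg_id, code, desc, value = m.groups()
            records.append((msg_id, code, desc, float(value)))
    return records

# Sample lines copied from the log excerpt above.
LOG = [
    "Mar 4 17:08:46 mx0 j-chkmail[7771]: [ID 000000 local5.info] 47CD740E.001 ORACLE - M02 text/html without text/plain ( 0.2)",
    "Mar 4 17:08:46 mx0 j-chkmail[7771]: [ID 000000 local5.info] 47CD740E.001 ORACLE - M13 RFC2822 headers compliance ( 1.0)",
    "Mar 4 17:08:46 mx0 j-chkmail[7771]: [ID 000000 local5.info] 47CD740E.001 ORACLE - H06 HTML tag/text ratio ( 0.5)",
]

records = parse_oracle_lines(LOG)
for msg_id, code, desc, value in records:
    print(msg_id, code, value, desc)
```

On a real system, the lines would be read from /var/log/j-chkmail instead of the `LOG` list.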
Terminal
$ j-chkmail -t oradata
$ j-chkmail -t oracle-checks