WARNING: This is not a HOW-TO but a HOW-I-DID. Do things your own way, and use the information on this page or not as you see fit.


1) Spamd configuration

Spamd is started by the RedHat-style init script shipped with the SpamAssassin distribution, with the following options:

   OPTIONS="-d -a -x -m 22 --socketpath=/var/run/spamd -u spamd"

Qmail's maximum number of incoming SMTP connections is set to 20, so I set the number of spamd children to 22 (-m 22).

I run spamd in unix-socket mode because, in my test over ten thousand mails, spamd was 7.8% faster over a unix socket (see section 3).
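To make the unix-socket mode concrete, here is a minimal Python sketch of a client that talks to spamd over the socket path given in the OPTIONS line. It is only an illustration, not part of this setup (qmail-scanner normally calls spamc for this). The request and response layout follows my understanding of the spamc/spamd protocol (a CHECK request with a Content-length header, answered by a "Spam: True ; hits / required" line); the function names are hypothetical.

```python
import re
import socket

SOCKET_PATH = "/var/run/spamd"  # value of --socketpath in the OPTIONS line


def build_check_request(message: bytes) -> bytes:
    """Build a CHECK request: asks only for the verdict and the score."""
    header = b"CHECK SPAMC/1.2\r\nContent-length: %d\r\n\r\n" % len(message)
    return header + message


def parse_check_response(response: bytes):
    """Return (is_spam, hits, required) from a 'Spam: True ; 37.6 / 6.5' line."""
    text = response.decode("ascii", "replace")
    match = re.search(
        r"Spam:\s*(True|False)\s*;\s*(-?[\d.]+)\s*/\s*(-?[\d.]+)", text
    )
    if match is None:
        raise ValueError("no 'Spam:' header found in spamd response")
    return match.group(1) == "True", float(match.group(2)), float(match.group(3))


def check_message(message: bytes, path: str = SOCKET_PATH):
    """Send one message to spamd over its unix socket and parse the verdict."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock.sendall(build_check_request(message))
        sock.shutdown(socket.SHUT_WR)  # tell spamd the request is complete
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return parse_check_response(b"".join(chunks))
```

Calling check_message(open("mail.txt", "rb").read()) needs a running spamd and read/write access to the socket; mail.txt is a placeholder file name.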

The spamd user was created this way:

   mkdir /var/spool/spamd
   groupadd spamd
   useradd -g spamd -d /var/spool/spamd spamd
   chown spamd:spamd /var/spool/spamd

/etc/mail/spamassassin/local.cf

   # This is the right place to customize your installation of SpamAssassin.
   # See 'perldoc Mail::SpamAssassin::Conf' for details of what can be
   # tweaked.
   #
   ###########################################################################
   #
   # Some settings are the same as the defaults, but I like to see them...

   required_hits 6.5
   rewrite_subject 0
   dns_available yes

   # Site-wide files
   use_bayes 1
   bayes_path /var/spool/spamd/bayes
   bayes_file_mode 0666
   bayes_min_ham_num 150
   bayes_min_spam_num 150

   bayes_auto_learn 1
   bayes_auto_learn_threshold_nonspam -0.5
   bayes_auto_learn_threshold_spam 11.2

   auto_whitelist_path /var/spool/spamd/whitelist
   auto_whitelist_file_mode 0666

   # DCC
   use_dcc 1
   dcc_path /usr/bin/dccproc

   # My Test (already included in spamassassin 2.64)
   # Listed in sbl-xbl.spamhaus.org
   header RCVD_IN_XBL              eval:check_rbl_txt('xbl', 'sbl-xbl.spamhaus.org.')
   describe RCVD_IN_XBL            Received via a relay in sbl-xbl.spamhaus.org
   tflags RCVD_IN_XBL              net

   # We check sbl-xbl.spamhaus.org so don't check sbl.spamhaus.org
   score RCVD_IN_SBL 0
   score RCVD_IN_XBL 0 3.6 0 3.6

   # My score
   score NO_DNS_FOR_FROM 2.550
   
   # The default score for this is -4.9, which is too much
   score BAYES_00 0 0 -2.901 -2.900
   
   # I trust my bayes database so I modified the score
   # Don't do this until you have a good database
   score BAYES_50 0 0 0.5 0.5
   score BAYES_56 0 0 0.9 0.9
   score BAYES_60 0 0 2.8 2.8
   score BAYES_70 0 0 3.5 3.5
   score BAYES_80 0 0 3.9 3.9
   score BAYES_90 0 0 5.4 5.4
   score BAYES_99 0 0 6.4 6.4
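A "score" line with four values gives one score per SpamAssassin score set: Bayes off/net off, Bayes off/net on, Bayes on/net off, Bayes on/net on. The Python sketch below is a simplified model of how the active value is picked and compared with required_hits. It assumes Bayes and network tests are both enabled, as this configuration suggests; the helper names are hypothetical and only three rules from local.cf are included.

```python
REQUIRED_HITS = 6.5  # required_hits from local.cf

# Score lines copied from local.cf (a single value applies to every score set)
SCORES = {
    "RCVD_IN_XBL": [0, 3.6, 0, 3.6],
    "BAYES_99": [0, 0, 6.4, 6.4],
    "NO_DNS_FOR_FROM": [2.550],
}


def active_score(values, use_bayes=True, use_net=True):
    """Pick the score for the current score set (0..3)."""
    if len(values) == 1:
        return values[0]
    index = (2 if use_bayes else 0) + (1 if use_net else 0)
    return values[index]


def total(rules, **mode):
    """Sum the active scores of all rules that matched a message."""
    return sum(active_score(SCORES[rule], **mode) for rule in rules)


def is_spam(rules, **mode):
    """A message is flagged once its total reaches required_hits."""
    return total(rules, **mode) >= REQUIRED_HITS


print(is_spam(["BAYES_99"]))                 # False
print(is_spam(["BAYES_99", "RCVD_IN_XBL"]))  # True
```

With these values BAYES_99 alone (6.4) stays just under the 6.5 threshold, so a message needs at least one more positive rule before it is flagged.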


    

2) Sample of qmail-queue.log

  Sun, 08 Aug 2004 07:53:18 CEST:15644: +++ starting debugging for process 15644 by uid=81
  Sun, 08 Aug 2004 07:53:19 CEST:15644: w_c: elapsed time from start 0.528421 secs
  Sun, 08 Aug 2004 07:53:19 CEST:15644: return-path='lxb57jl@hotmail.com', recips='user@domain.com'
  Sun, 08 Aug 2004 07:53:19 CEST:15644: from='"Ross Hays" ',
                                        subj='1% ۱篹 ̅5000! ez zikhv  ww',
                                        via SMTP from 211.216.136.165
  Sun, 08 Aug 2004 07:53:19 CEST:15644: s_p_d: domain_rcpt 'domain.com', scanners 'sophie_scanner,spamassassin,perlscan_scanner'
  Sun, 08 Aug 2004 07:53:19 CEST:15644: sophie: finished scan in 0.025799 secs
  Sun, 08 Aug 2004 07:53:20 CEST:15644: SA: REPORT hits = 37.6/6.5
   1.5 RCVD_NUMERIC_HELO      Received: contains a numeric HELO
   4.2 DATE_SPAMWARE_Y2K      Date header uses unusual Y2K formatting
   1.2 HTML_IMAGE_ONLY_02     BODY: HTML: images with 0-200 bytes of words
   6.4 BAYES_99               BODY: Bayesian spam probability is 99 to 100%
   0.1 HTML_MESSAGE           BODY: HTML included in message
   0.3 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
   3.6 SPAMCOP_URI_RBL        URI's domain appears in spamcop database at sc.surbl.org
   2.9 DCC_CHECK              Listed in DCC (http://rhyolite.com/anti-spam/dcc/)
   2.5 FORGED_HOTMAIL_RCVD2   hotmail.com 'From' address, but no 'Received:'
   0.7 DATE_IN_PAST_06_12     Date: is 6 to 12 hours before Received: date
   3.9 SUBJ_ILLEGAL_CHARS     Subject contains too many raw illegal characters
   0.1 MISSING_OUTLOOK_NAME   Message looks like Outlook, but isn't
   1.6 MISSING_MIMEOLE        Message has X-MSMail-Priority, but no X-MimeOLE
   1.9 RCVD_DOUBLE_IP_SPAM    Bulk email fingerprint (double IP) found
   1.5 FORGED_MUA_IMS         Forged mail pretending to be from IMS
   4.1 FORGED_IMS_HTML        IMS can't send HTML message only
   1.1 MIME_HTML_ONLY_MULTI   Multipart message only has text/html MIME parts
  Sun, 08 Aug 2004 07:53:20 CEST:15644: SA: yup, this smells like SPAM - hits=37.6 - rejecting message...
  Sun, 08 Aug 2004 07:53:20 CEST:15644: SA: finished scan in 1.730351 secs - hits=37.6
  Sun, 08 Aug 2004 07:53:20 CEST:15644: r_e: X-Qmail-Scanner-1.22st: We have reasons to believe this mail is SPAM
  Sun, 08 Aug 2004 07:53:20 CEST:15644: ------ Process 15644 finished. Total of 2.302074 secs
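Timing figures such as those in section 3 can be collected from this log. The Python sketch below shows one possible way: it pulls the "SA: finished scan in ... secs" values out of the log lines and summarizes them. It is not necessarily how the numbers on this page were produced; in particular, whether a sample or a population standard deviation was used is unknown (the sketch uses the sample form), and the function names are hypothetical.

```python
import re
import statistics

# Matches lines like "... SA: finished scan in 1.730351 secs - hits=37.6"
SCAN_TIME = re.compile(r"SA: finished scan in ([0-9.]+) secs")


def scan_times(lines):
    """Return the SpamAssassin scan time (seconds) of every matching line."""
    times = []
    for line in lines:
        match = SCAN_TIME.search(line)
        if match:
            times.append(float(match.group(1)))
    return times


def summarize(times):
    """Average, median and sample standard deviation of the scan times."""
    return {
        "average": statistics.mean(times),
        "median": statistics.median(times),
        "std_dev": statistics.stdev(times),  # needs at least two values
    }


sample = [
    "Sun, 08 Aug 2004 07:53:20 CEST:15644: SA: finished scan in 1.730351 secs - hits=37.6",
    "Sun, 08 Aug 2004 07:53:20 CEST:15644: ------ Process 15644 finished. Total of 2.302074 secs",
]
print(scan_times(sample))  # [1.730351]
```

On a real log, summarize(scan_times(open(path))) would give the three figures in one call, where path is the location of qmail-queue.log on your system.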
    

3) Spamassassin: tcp-server vs. unix-socket

Test done on a dedicated mailhub:
HW: Pentium IV 2.4 GHz, 1 GB RAM, SCSI hard disk (Adaptec 29160).
SW: RedHat 7.3, kernel 2.4.26, perl 5.6.1, spamassassin 2.63.

   Spamassassin TCP-SERVER mode

   Average: 2.0614
   Median:  1.1995
   Std_dev: 3.2359

   Spamassassin UNIX-SOCKET mode (7.8% faster)

   Average: 1.9124
   Median:  1.0033
   Std_dev: 2.7514
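The 7.8% figure can be checked from the two averages. The short Python sketch below computes the difference both ways; the 7.8% matches when the unix-socket average is taken as the baseline. The function names are hypothetical.

```python
TCP_AVG = 2.0614   # average in TCP-SERVER mode
UNIX_AVG = 1.9124  # average in UNIX-SOCKET mode


def percent_faster(slow, fast):
    """Difference relative to the faster (unix-socket) time."""
    return (slow - fast) / fast * 100


def percent_time_saved(slow, fast):
    """Difference relative to the slower (TCP) time."""
    return (slow - fast) / slow * 100


print(f"{percent_faster(TCP_AVG, UNIX_AVG):.1f}%")      # 7.8%
print(f"{percent_time_saved(TCP_AVG, UNIX_AVG):.1f}%")  # 7.2%
```

Taken relative to the TCP average, the same gap is about 7.2% less time per message.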
    


Salvatore Toribio

20040808