Everything

So, I want to use my home Ubuntu server to get my mail.  We've got three or four PCs in the house, and it's getting annoying only having my mail on one machine.  Plus, there's a lack of motion around the SAWin32 project, and I'd like to keep up to date against the scum-sucking spammers.  The setup uses several components:

FetchMail: This downloads my mail from my ISP via POP and passes it to....

Postfix: A mail server, it passes my mail to ...

SpamAssassin: This scans the mail, and if it's spammy, it tags it as spam, then passes it back to ...

Postfix, which then stores it in a Maildir where ...

Dovecot: An IMAP mail server, this then serves up my Maildir via IMAP to ...

Thunderbird on my Laptop.

 

So, Steps in Order

 

Install and Configure Postfix

https://help.ubuntu.com/community/Postfix

https://help.ubuntu.com/community/PostfixBasicSetupHowto

Install Dovecot

https://help.ubuntu.com/community/Dovecot

http://wiki.dovecot.org/

This was complicated because the guides suggested using port 10143 for IMAP, while Thunderbird defaulted to 143 and complained there was no response.  Took a while to catch that one!
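A quick way to catch this sort of port mismatch is to test which port is actually accepting connections. Here's a minimal Python sketch, assuming the server is reachable on 127.0.0.1 (swap in your server's address); the helper name port_is_open is my own, not part of Dovecot or Thunderbird.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 143 is the standard IMAP port; 10143 is the one the guides suggested.
    for port in (143, 10143):
        state = "open" if port_is_open("127.0.0.1", port) else "closed"
        print(f"IMAP port {port}: {state}")
```

Whichever port shows as open is the one to put in Thunderbird's account settings.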

 

Setup SpamAssassin

http://www.debuntu.org/postfix-and-pamassassin-how-to-filter-spam-p2

http://www.wantlinux.net/?p=17

http://blog.redbranch.net/2008/04/spamassassin-and-sa-update.html

Setup sa-update and sa-learn

http://blog.redbranch.net/2008/04/spamassassin-and-sa-update.html

 

Setup Fetchmail

 

http://ccgi.maxpower.plus.com/2007/02/10/fetch-email-with-fetchmail-and-ubuntu/

Other Useful Links

http://steveyoung.wordpress.com/?s=fetchmail

Steps:

 

1) Download Packages.

Basically, get everything from http://sourceforge.net/projects/sawin32/ 

 

2) Unpack and install

I installed all the add-ons into c:\program files\SAProxy\.  During the installation you get asked about mail servers; I leave this page blank and do it later....

 

3) Configure your mail client 

Set up your mail client to use SAProxy.  Set the POP3 server to be 127.0.0.1 (presuming you're running SAProxy locally.  I've toyed with the idea of running it on another box, but never tested it...).  Set your username to be username:server name, so for instance if your user name is your full email address and your server name is pop3.domain.com, then set the username to be that email address followed by :pop3.domain.com
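To make the username:server format concrete, here's a tiny Python sketch. The function names are hypothetical helpers of mine, and user@example.com is just a placeholder address, not a real account.

```python
def saproxy_login(username: str, server: str) -> str:
    """Join the real POP3 username and server name in the username:server form."""
    return f"{username}:{server}"


def split_saproxy_login(login: str) -> tuple:
    """Split a username:server login back into (username, server)."""
    # Split on the last colon, since the server name comes last.
    username, _, server = login.rpartition(":")
    return username, server


login = saproxy_login("user@example.com", "pop3.domain.com")
print(login)  # user@example.com:pop3.domain.com
print(split_saproxy_login(login))
```

The combined string is what goes in the username box of your mail client; the password stays as it was.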

 

Now test it; hopefully SAProxy should now be filtering mail.  If you look at all the headers (you may have to select View > Show all headers to see them) you'll see a new header called X-Spam-Status that will start with either No (SpamAssassin doesn't think this is spam) or Yes (SpamAssassin thinks this is spam).  SpamAssassin should also put **** SPAM **** into the subject of any spam messages.

Now set up your mail client to filter all messages with **** SPAM **** in the subject into another folder.
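If you'd rather check the X-Spam-Status header than the subject, the test is simple enough to sketch in Python with the standard email module. This is only an illustration: the sample message, including its score values, is made up, and is_tagged_spam is my own helper name.

```python
from email import message_from_string


def is_tagged_spam(raw_message: str) -> bool:
    """Return True if the X-Spam-Status header starts with 'Yes'."""
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    return status.strip().lower().startswith("yes")


# Made-up sample message for illustration only.
sample = (
    "From: someone@example.com\n"
    "Subject: **** SPAM **** Cheap watches\n"
    "X-Spam-Status: Yes, score=12.3 required=5.0\n"
    "\n"
    "Buy now\n"
)

print(is_tagged_spam(sample))
```

A message with no X-Spam-Status header at all is treated as not spam here, which matches what you'd want if SAProxy wasn't running when the mail arrived.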

 

4) Check sa-update

Open a command window and navigate to your SAProxy folder, then run something like the following:

 sa-update --nogpg -D

This runs sa-update with no crypto checking, and full debug messages.  You should get loads of messages, and then something like this:

[1992] dbg: channel: metadata version = 477972
[1992] dbg: dns: 5.1.3.updates.spamassassin.org => 477972, parsed as 477972
[1992] dbg: channel: current version is 477972, new version is 477972, skipping channel
[1992] dbg: diag: updates complete, exiting with code 1

This shows it works; now you can set it up to run routinely.  Create a text file (mine's called sare-sa-update-channels.txt).  In this file you can enter channels for sa-update to update one or more rule files.  Mine currently contains:

updates.spamassassin.org
72_sare_redirect_post3.0.0.cf.sare.sa-update.dostech.net
70_sare_evilnum0.cf.sare.sa-update.dostech.net
70_sare_bayes_poison_nxm.cf.sare.sa-update.dostech.net
70_sare_html1.cf.sare.sa-update.dostech.net
70_sare_html0.cf.sare.sa-update.dostech.net
70_sare_header1.cf.sare.sa-update.dostech.net
70_sare_header0.cf.sare.sa-update.dostech.net
70_sare_specific.cf.sare.sa-update.dostech.net
70_sare_adult.cf.sare.sa-update.dostech.net
72_sare_bml_post25x.cf.sare.sa-update.dostech.net
99_sare_fraud_post25x.cf.sare.sa-update.dostech.net
70_sare_spoof.cf.sare.sa-update.dostech.net
70_sare_genlsubj0.cf.sare.sa-update.dostech.net
70_sare_uri1.cf.sare.sa-update.dostech.net
70_sare_uri0.cf.sare.sa-update.dostech.net
70_sare_whitelist.cf.sare.sa-update.dostech.net
70_sare_obfu.cf.sare.sa-update.dostech.net
70_sare_stocks.cf.sare.sa-update.dostech.net

 

If I remember rightly, you need to have updates.spamassassin.org for this to work (this updates the default rules).  The rest are all from the excellent SARE Ninjas!
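Since the default channel is easy to forget, here's a small Python sketch that reads a channel list and checks updates.spamassassin.org is in it. It assumes one channel per line, as in my file; the function names are my own, and the sample is a cut-down version of the list above.

```python
DEFAULT_CHANNEL = "updates.spamassassin.org"


def read_channels(text: str) -> list:
    """Return the non-empty lines of a channel file, stripped of whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def has_default_channel(channels: list) -> bool:
    """Check that the default SpamAssassin rules channel is listed."""
    return DEFAULT_CHANNEL in channels


# Cut-down sample of the channel file, for illustration.
sample = """updates.spamassassin.org
70_sare_adult.cf.sare.sa-update.dostech.net
70_sare_spoof.cf.sare.sa-update.dostech.net
"""

channels = read_channels(sample)
print(len(channels), has_default_channel(channels))
```

To check the real file, pass its contents in instead, for example open("sare-sa-update-channels.txt").read().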

Now run;

sa-update -D --channelfile sare-sa-update-channels.txt --nogpg >> updbat.txt

This runs sa-update using your channel file and appends its output to updbat.txt.

5) Prepare some Ham and Spam

In your mail app, create two folders. I call mine Spam and NoSpam, but it's as you like.

Put some typical spam in the Spam one, and some typical non-spam in the NoSpam one (you should try and put as much as possible in here at first to help SpamAssassin learn your spam and ham).

 

6) Run sa-learn

Run sa-learn for ham and spam

Open a command window and navigate to your spamassassin folder.  Type something like  

sa-learn.exe --spam --mbox --showdots path_to_spam_mailbox_file

This should produce something like this;

Learned tokens from 376 message(s) (388 message(s) examined) 

Then do the same for your ham, i.e. 

  sa-learn.exe --ham --mbox --showdots path_to_nospam_mailbox_file

This will teach the Bayesian classifier what your spam and ham look like.
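If you want to keep an eye on how many messages sa-learn actually learns from, the summary line it prints is easy to pick apart. A minimal Python sketch, assuming the line has exactly the shape shown above; parse_summary is my own helper name.

```python
import re

# Matches the summary line sa-learn prints when it finishes.
SUMMARY = re.compile(
    r"Learned tokens from (\d+) message\(s\) \((\d+) message\(s\) examined\)"
)


def parse_summary(line: str):
    """Return (learned, examined) from an sa-learn summary line, or None."""
    match = SUMMARY.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_summary("Learned tokens from 376 message(s) (388 message(s) examined)"))
# (376, 388)
```

The same function can be run over each line of a log file to pull out the counts from every run.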

7) Write a batch file for sa-update and sa-learn

Here's mine, it's not perfect:

date /t >> updbat.txt
time /t >> updbat.txt
xcopy "C:\Documents and Settings\Rebecca\Application Data\Mozilla\Profiles\Paul\g4dhc3vd.slt\Mail\Local Folders\Spam" "C:\Program Files\SAproxy\" /Y
xcopy "C:\Documents and Settings\Rebecca\Application Data\Mozilla\Profiles\Paul\g4dhc3vd.slt\Mail\Local Folders\NoSpam" "C:\Program Files\SAproxy\" /Y
sa-learn.exe --spam --mbox --showdots Spam >> updbat.txt
sa-learn.exe --ham --mbox --showdots NoSpam >> updbat.txt
xcopy "C:\Program Files\SAproxy\nospamempty" "C:\Documents and Settings\Rebecca\Application Data\Mozilla\Profiles\Paul\g4dhc3vd.slt\Mail\Local Folders\NoSpam" /Y
xcopy "C:\Program Files\SAproxy\spamempty" "C:\Documents and Settings\Rebecca\Application Data\Mozilla\Profiles\Paul\g4dhc3vd.slt\Mail\Local Folders\Spam" /Y
sa-update --channelfile sare-sa-update-channels.txt --nogpg >> updbat.txt
time /t >> updbat.txt
exit

For some reason, I prefer to copy the spam and nospam files into the spamassassin folder first. It means if it dies, I can see what I was trying to do, plus it should work whilst my mail is open. 

 

8) Add this batch file to the scheduled programs part of the control panel

 

9) Sit back, and enjoy a spam-free life....  (occasionally check the debug text files, and put any spam or ham that gets wrongly tagged in the relevant folder....)

 

 

 

I've been meaning to post something about R for a while, but never got started, and now have a pile of things I'd like to post, so it's time to get started.

I first started using R during my Master's dissertation, where I had to do some stats.  Since then I've needed to do some ad-hoc data analysis of one sort or another on several occasions, and every time I've ended up using R to get it done.  I now use R regularly, and while I can't describe myself as an expert, I'd say I'm an enthusiastic amateur.

R is an integrated suite of software facilities for data manipulation, calculation and graphical display.  It is a full, proper programming language, even being Turing complete.  It has a suite of operators for calculations on data in many forms, in particular arrays and matrices.  It also has an enormous collection of add-on packages for pretty much any form of analysis, calculation or statistic that can be performed.

You can run R in a million different ways; the most basic is just using the basic R interpreter and the command line.  Personally I use Eclipse / StatET / LaTeX, which I'll describe another time.

A colleague recently asked about the basics, so I've cribbed my email back to him here.  I suggested either R Commander or JGR / Deducer, both of which seem an ideal mid point: some extra menu/click functionality without trying to rebuild Excel in R.  R Commander seems to be slightly better as a statistics tool, and JGR / Deducer for data exploration.
 
RCommander (http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/) is an add-on package to R.

To use R Commander, install R and run it, then in the console type

install.packages("Rcmdr", dependencies=TRUE)

This should install R Commander. To run it, in the console type

library(Rcmdr)

The alternative is JGR (Java GUI for R - http://rforge.net/JGR/ ) with its add-on, Deducer (http://www.deducer.org).  They're both 'packages' in R, which means some extra functionality that can be easily installed from within R and then used.

To install:
Ensure you have a fairly up to date Java installed
Download and install the latest version of R from CRAN (Comprehensive R Archive Network - http://cran.r-project.org/ )
Download and Install the JGR client from http://rforge.net/JGR/
Start the JGR application.  It will get any extra things it needs.
The JGR console should now be open. To load Deducer, go to 'Packages & Data' > 'Package Manager' and select Deducer and DeducerExtras.
 
Now you're using R !
 
Now you need to go and find some help. I'd recommend some places to start with R: