CPTTM Network Admin newsletter issue #6


To keep in closer contact with IT network administrators in Macau, we've created a network admin newsletter, and I've taken the liberty of adding you to our netadmin-news mailing list. If you'd like to unsubscribe, or to recommend friends who might want to subscribe, just email me at any time.

--- Simon Tam, Editor in Chief

Topics in this issue:

Centralized SPAM Filtering

Tired of spam? You may receive dozens of spam messages a day on your office computer. Some users rely on client-side filtering tools such as POPFile or the junk mail controls built into their email clients. These client-side tools filter spam well, but setting them up on every machine takes a lot of effort.

A better approach is to set up a central spam filter on the mail server in your office. CPTTM recently tested an open source spam filtering module called SpamAssassin and got good results. The details are below.

As mentioned, SpamAssassin is a spam filtering module, so it needs to be integrated with an existing mail server. We use MDaemon, which has SpamAssassin built in.

When the mail server receives an email, it passes the message to SpamAssassin, which scores it according to what it has learned. The higher the score, the more likely the email is spam. The email is then handed back to the mail server, which can act on the score, e.g. drop the email or flag its Subject line.
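As a rough illustration of that last step, here is a minimal Python sketch of how a mail server might turn a score into an action. The function name and both threshold values are made up for this example; they are not MDaemon or SpamAssassin settings.

```python
def action_for_score(score, tag_threshold=5.0, drop_threshold=12.0):
    """Decide what to do with a message, given its spam score.

    Both thresholds are illustrative values chosen for this sketch.
    """
    if score >= drop_threshold:
        return "drop"          # almost certainly spam: discard it
    if score >= tag_threshold:
        return "tag-subject"   # probably spam: deliver, but flag the Subject
    return "deliver"           # looks like ham: deliver unchanged


for score in (1.2, 6.5, 15.0):
    print(score, action_for_score(score))
```

Using two thresholds instead of one lets borderline messages still reach the user, flagged, so that a wrongly scored ham is not silently lost.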

How does SpamAssassin learn what is spam and what is ham? (Ham means normal email.)

You need to set up two folders, one for spam and one for ham, for SpamAssassin to learn from. Initially, SpamAssassin needs at least 200 spam and 200 ham messages before it will start working. Later on, when users find wrongly classified emails, they should report them to SpamAssassin. How? In MDaemon there are two special email addresses, spam@yourdomain.com and ham@yourdomain.com, where "yourdomain.com" is the domain name of your organization. Users forward each wrongly classified email AS AN ATTACHMENT (an RFC 822 compliant message) to the appropriate address, and the attachment is then delivered to the corresponding folder. More generally, the two folders can be exposed as "Public folders", which is how MDaemon implements them, so users can subscribe to them over IMAP and report a message simply by dragging and dropping it into the right folder.
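To show what "forward as attachment" means in practice, the Python sketch below wraps an untouched original message inside a new one as a message/rfc822 part, using only the standard email library. The helper name, the sample message and the sender addresses are invented for the example; most mail clients do the same thing through their "Forward as Attachment" command.

```python
from email.message import EmailMessage


def forward_as_attachment(original, sender, report_address):
    """Wrap `original`, unmodified, in a new message as a message/rfc822 part."""
    wrapper = EmailMessage()
    wrapper["From"] = sender
    wrapper["To"] = report_address
    wrapper["Subject"] = "Fwd: " + str(original.get("Subject", ""))
    wrapper.set_content("Misclassified message attached.")
    # Attaching a message object produces a message/rfc822 attachment,
    # so the original headers and body travel along intact.
    wrapper.add_attachment(original)
    return wrapper


# Hypothetical spam message that slipped through the filter.
spam = EmailMessage()
spam["From"] = "someone@example.net"
spam["To"] = "user@yourdomain.com"
spam["Subject"] = "Cheap watches"
spam.set_content("Buy now!")

report = forward_as_attachment(spam, "user@yourdomain.com", "spam@yourdomain.com")
attachment = next(report.iter_attachments())
print(attachment.get_content_type())
```

The point of attaching rather than forwarding inline is that the original headers are preserved, and the filter learns from them as well as from the body.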

In this way, SpamAssassin gets smarter and smarter.

After running SpamAssassin on our mail server for about 2 weeks, CPTTM has achieved better spam filtering results.

For example, three of our colleagues recorded the following classification accuracy:

Colleague 1    97%
Colleague 2    70%
Colleague 3    70%

We found that SpamAssassin seldom misclassifies ham as spam. Its weak point is that it is insensitive to some plain text spam, which is one of the reasons for the 70% accuracy. The other is that we still need to fine-tune the spam score threshold to suit our case. This threshold determines which emails are classified as spam.
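To make the effect of the threshold concrete, the small Python sketch below counts mistakes at a few candidate thresholds. The scores and labels are invented sample data, not our measurements, and the function name is just for illustration.

```python
# (score, is_spam) pairs: invented sample data for illustration only.
SAMPLE = [
    (0.3, False), (1.8, False), (4.1, False), (6.2, False),
    (3.5, True), (4.8, True), (7.9, True), (11.4, True), (15.0, True),
]


def evaluate(threshold, scored=SAMPLE):
    """Return (correct, ham flagged as spam, spam let through) at a threshold."""
    false_pos = sum(1 for score, spam in scored if score >= threshold and not spam)
    false_neg = sum(1 for score, spam in scored if score < threshold and spam)
    correct = len(scored) - false_pos - false_neg
    return correct, false_pos, false_neg


for threshold in (3.0, 5.0, 8.0):
    print(threshold, evaluate(threshold))
```

Lowering the threshold catches more spam but flags more ham; raising it does the opposite. Tuning means finding the point where the mix of the two kinds of error is acceptable for your users.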

SpamAssassin can be integrated with many mail servers, such as sendmail, Postfix, qmail and Exchange. MDaemon is one Windows-based mail server that incorporates SpamAssassin.

Want to learn more? Go to:

http://spamassassin.apache.org/

http://www.spamblogging.com/archives/000069.html

Data Life cycle Management Case Study In Cyberlab

Cyberlab was set up in 2001. The data we have stored over these 5 years now exceeds 40GB, but we back up to a 24GB tape, so a full backup every day is impossible. Given our limited resources, and for the sake of efficiency, spreading the data across several tapes is not a good option. The data will grow faster and faster in the future. Moreover, data becomes obsolete after some years, and resources are wasted processing it. So we came up with a solution to the problem.

On our file server, the data is stored by year.

Data - 2004 - Courses
            - Projects
     - 2005 - ...
     - 2006 - ...
     - Home - users - email

We decided to back up only the current year's data. Yes, only year 2006 is backed up to tape daily. Past data (2005 and before) is stored on permanent storage, such as DVD, and then marked READONLY on the file server. Users can still access the old data but cannot modify it.
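The policy above can be sketched in a few lines of Python. This is only an illustration under assumptions of our own: the function names are hypothetical, the year folders are assumed to be named with digits only (as in the layout above), and clearing file write bits stands in for whatever permission mechanism your file server really uses, such as share permissions or ACLs.

```python
import os
import stat
from pathlib import Path


def split_by_year(data_root, current_year):
    """Return (folder to back up daily, list of past-year folders to archive)."""
    root = Path(data_root)
    # Only folders named like a year count; "Home" and others are ignored.
    years = sorted(p for p in root.iterdir() if p.is_dir() and p.name.isdigit())
    current = [p for p in years if int(p.name) == current_year]
    past = [p for p in years if int(p.name) < current_year]
    return (current[0] if current else None), past


def make_read_only(folder):
    """Clear the write bits on every file under `folder`."""
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    for path in Path(folder).rglob("*"):
        if path.is_file():
            os.chmod(path, path.stat().st_mode & ~write_bits)
```

For example, split_by_year("/Data", 2006) would return the 2006 folder for the daily tape job and the 2004 and 2005 folders as candidates for DVD archiving followed by make_read_only. Note that this sketch only protects files; directories are left writable.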

After that, we found that users' email is one of the largest folders, containing over 12GB of data. Because email could not be backed up separately under the old structure, we changed our folder structure as shown below, making the whole email folder year-based. For each user, we create a new account in the email client, say one called "2005", and point its account folder to /Data/2005/users/email. We move all emails from 2005 into that account and point the currently active account folder to /Data/2006/users/email. That way, when we back up the 2006 folder to tape daily, the 2006 emails are backed up with it.

Data - 2004 - Courses
            - Projects
            - Home - users - email
     - 2005 - users - email
            - ...
     - 2006 - users - email
            - ...
     - Home - users

That's what we do in Cyberlab. We hope our experience gives you some ideas on how to manage your data better. If you have experience of your own in managing the data life cycle, you are WELCOME to share it with us. Just send me an email.

Free retake of Microsoft Exams

The Microsoft exam promotion (070, 072, 074, MB) is back. Candidates who register for the 2nd Shot Promotion can retake an unsuccessful exam for free. The promotion runs from February 15, 2006 through June 30, 2006.

For detailed information and registration, please refer to http://www.microsoft.com/learning/secondshot.


Book review - Red Hat Linux Networking and System Administration

Red Hat Linux experts Terry Collings and Kurt Wall start with the basics - network planning and Red Hat installation and configuration. They then show you in detail how to set up network and Internet services, from establishing a network file system to configuring mail services. Eight chapters give you the lowdown on customizing the kernel, automating tasks with scripting, performing backups, and more - the nuts-and-bolts maintenance information you need to keep your system running smoothly. And last but not least, the authors provide nearly 100 pages of proven strategies and tips for maintaining system security.

You can borrow this book from our "CPTTM IT Book Shelf" in Cyberlab. Please visit:
http://www2.cpttm.org.mo/cyberlab/mslib/



The CPTTM Network Admin Newsletter can be viewed at:
http://www2.cpttm.org.mo/cyberlab/netadmin-news/