I often use Windows as a terminal to my various UNIX systems. Sometimes it's helpful to run proprietary software - and I don't have the time or inclination to mess around with half-baked emulators/ports/binary blobs/whatevers under Linux. I either run a completely open system like OpenBSD or I run Windows.

Anyway, I never use Windows to do any real work. I always shell into a remote system to actually get things done. Either PuTTY or - if you prefer real OpenSSH like me - OpenSSH via Cygwin/X works fine for getting a terminal. WinSCP or Cygwin's OpenSSH for scp(1) are good for file transfer under most circumstances.

However, one of the nice things about Windows is the Explorer shell. It - and its KDE knock-off - is useful for certain file management operations. Why not leverage it? So I started looking for a way to mount remote filesystems via SSH, so that they appear as native Windows volumes. And I found a way to do it, for free.

Dokan user mode filesystem for Windows

Dokan is basically FUSE for Windows. That's all dandy and there are plenty of useful FUSE filesystems out there, like this one which uses my BitTorrent implementation. What's cool about Dokan is that they also do an SSHFS implementation.

Is it hard to set up?

I figured this thing was surely going to be a PITA and probably not work to boot. In fact, you just install three things - some Microsoft runtime library, the main Dokan library, and Dokan SSHFS - and there you go. There is a simple GUI app to set up remote mounts that supports all the things you'd expect: saving sessions, and both password and public key authentication.

Does it actually work?

Yes, although it doesn't seem to support symlinks. A symlink to a directory on a remote system appears as a file under Dokan. So no $HOME/public_html for you - oh well.

Final thoughts

It's fun to look at your horribly un-organised UNIX home directory in Windows, and see just how messy it is. Almost makes me want to start cleaning things up. But then I remember I know how to use locate(1) and find(1).

Niall O'Higgins is an author and software developer. He wrote the O'Reilly book MongoDB and Python. He also develops Strider Open Source Continuous Deployment and offers full-stack consulting services at FrozenRidge.co.


My C BitTorrent implementation, Unworkable, used to be hosted on an anonymous CVS repository I had running on my server at home. This was fine, until I reinstalled the machine from scratch and didn't feel like setting up the whole anonymous CVS access again. It's a pretty painful process, unfortunately, although there is this guide to setting up anonymous CVS.

No public VCS, bad for an Open Source project

So, for a while, there was no public source control for Unworkable, which sucked. It's difficult and cumbersome for other developers to write diffs and track changes without one. While I generally like to maintain full control of the source code hosting, I've had good experiences with Google Code before. They don't show ads, and their interface is clean and to the point, unlike say SourceForge, which is covered in ads and various nonsense.

Anyway, I had a CVS repository and I first wanted to convert it to Subversion, including all the history, then I wanted to import that into Google Code.

Converting from CVS to Subversion

The first thing to do was to convert my existing CVS repository to Subversion. There is a nice tool specifically for this, cvs2svn. It is in fact very easy to use, at least in my basic case - I only work with HEAD or, in SVN terminology, trunk. I simply ran:

cvs2svn --trunk-only --svnrepos ~/unworkable-svn ~/unworkable-cvs
Et voila, I have a shiny new Subversion repository in ~/unworkable-svn, with full history.

Importing Subversion repository to Google Code

Google Code lets you import an existing Subversion repository pretty easily, as long as you have an empty project. When your Google Code project is created, its repository is already at revision 1. In Subversion-land, revision 0 is sort of magic: svnsync will only initialise a destination repository that is still at revision 0, so you need to reset the Google Code repository before you can import your existing one. Google gives you a place to do this, but it's slightly confusing because they don't put it under 'Administer'. In order to reset your repository you must:
  • Log into your Google Code project with administrator privileges.
  • Browse to the 'Source' page (either 'Checkout' or 'Browse' but not 'Changes').
  • Scroll to the bottom of the page, and click the 'reset this repository' link, which is sort of hidden.
  • Choose the option "Did you just start this project and do you want to 'svnsync' content from an existing repository into this project?"
  • Click the big "Reset Repository" button which has a big red warning label beside it.
Now you are ready to import your repository. You will use the 'svnsync' tool included in Subversion 1.4 and up to do this. There are two commands: one which takes the URL of the Google Code repository and the path to your repository, and another which takes just the URL of the Google Code repository. The first one runs quite quickly; the second can take a while, as it imports each individual revision:
# This command takes both the path to your Google Code repository
# and the path to the repository you want to import
svnsync init --username YOURUSERNAME \
    https://YOURPROJECT.googlecode.com/svn file:///path/to/localrepos
# This command takes just the path to the Google Code repository.
# It will take a while to complete.
svnsync sync --username YOURUSERNAME https://YOURPROJECT.googlecode.com/svn
Once you've done that, your code is imported. Enjoy!


OpenBSD 4.5 was released the other day. I upgraded one of my servers and workstations to the new release, from 4.4-current and 4.4-release respectively. Mostly, things have gone pretty smoothly, as is usually the case with OpenBSD. The new release has plenty of incremental improvements, with the developers gradually polishing and refining things such that the operating system as a whole gets better and better.

Some of the highlights for me include the new multiplexing, resampling, low-latency audio server aucat(1), and the new even-stricter malloc(3), which can catch buffer overflows of even a single byte. Additionally, the Atheros wireless driver, ath(4), which I have in a couple of my laptops, now supports WPA. All in all, lots of nice improvements which make OpenBSD even more of a pleasure to use.

I did however come across one nasty bug after upgrading one of my servers. We use the symon system monitor on all our servers to log and graph all the sorts of system and network metrics you'd expect - CPU usage, disk, memory, network I/O, etc. This is very useful for capacity planning and also for spotting bugs or misconfigurations. Especially on a shared, multi-user system you want to keep an eye on resource utilisation over time.

Unfortunately, the 4.5-release symon package for amd64 would exit after startup. When I ran it in debug mode, it gave me a 'Bus error'. When I twiddled the symon.conf a bit more to remove some sensors, it would exit with a segfault. Finally, I could get it to run by disabling some stuff, but it would spew warning messages to standard IO. Specifically, it had a problem with the if(lo0) and if(bnx) sensors. If I disabled those, it would run, but spit out warnings. However, these sensors were pretty useful to me, so not having them was very annoying.

I noticed some recent commits to the -current symon port, so I decided to give that a shot. It's always pretty hairy building -current ports on -release, but in this case I didn't have much choice. Fortunately for me, the -current symon port built and ran fine, completely eliminating the issues I'd had with the 4.5-release package. So, this was a mild annoyance, although it does highlight the sad demise of the -stable ports tree.

Overall, 4.5 is a solid release with plenty of new features. Just be warned that the symon package might not work out of the box for you, if you rely on it.


I was converting some mod_rewrite rules from the Lighttpd webserver to Apache today. While Lighttpd and Apache both have request rewriting modules with pretty equivalent functionality, there are some significant differences nonetheless. Specifically, I was trying to rewrite a URL of the form:

/script?key=123abcxyz
to a file on the local disk:
/123/abc/123abcxyz
In Lighttpd, I had a single rewrite rule handling both the URL and its query string:
"^/script\?key=(([0-9a-f]{3})([0-9a-f]{3}).*)" => "/$2/$3/$1"
The regular expression might be a little confusing at first, but it's reasonably straightforward: group 1 captures the whole key, group 2 its first three hex digits, and group 3 the next three. My first thought was that I could do pretty much exactly the same thing with an Apache RewriteRule, like so:
RewriteRule ^/script\?key=(([0-9a-f]{3})([0-9a-f]{3}).*) /$2/$3/$1
Unfortunately this won't work - the RewriteRule pattern is only matched against the URL path - that is, /script without the query string (?key=foo). So how do you make the RewriteRule aware of the value of the query string, so that it can rewrite to the local on-disk file correctly? You must have a RewriteCond directive preceding the RewriteRule. RewriteCond can run a grouping regular expression over the query string, and RewriteRule can pull those groups out of the preceding RewriteCond with a special syntax, %N instead of $N:
RewriteCond %{QUERY_STRING} key=(([0-9a-f]{3})([0-9a-f]{3}).*)
RewriteRule ^/script /%2/%3/%1
So there you are - how to have an Apache RewriteRule operate on parts of the query string as well as on the URL. The solution ends up being a little more convoluted under Apache than under Lighttpd, but is still manageable.
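To check the grouping before touching the server config, the same regular expression can be tried out from a shell. This is only a rough sketch: `query_to_path` is a hypothetical helper, not part of Apache or Lighttpd, and it assumes a sed that supports `-E` and a query string that contains nothing but the key parameter. The back-references \1, \2 and \3 stand in for %1, %2 and %3.

```shell
#!/bin/sh
# Hypothetical helper: apply the RewriteCond regex to a query string and
# print the path that the /%2/%3/%1 substitution would produce.
# Prints nothing if the query string does not match.
query_to_path() {
    # \1 = whole key, \2 = first three hex digits, \3 = next three.
    printf '%s\n' "$1" |
        sed -n -E 's#^key=(([0-9a-f]{3})([0-9a-f]{3}).*)$#/\2/\3/\1#p'
}

query_to_path 'key=123abcxyz'    # prints /123/abc/123abcxyz
```

Unlike this sketch, the RewriteCond pattern above is not anchored, so on the server it also matches when other parameters come before key=.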


Good spam filtering with OSBF-Lua and Mutt

January 17, 2009 at 12:11 AM | categories: Technical, UNIX

I've used Mutt as my mail reader (aka MUA) for years. My personal mail goes through OpenBSD's greylister, spamd(8), which cuts out a very large portion of spam. However, my work email account, and also any personal account subscribed to mailing lists, still get a fair bit of spam. So some additional filtering is needed.

Enter OSBF-Lua. OSBF-Lua is a port of the orthogonal sparse bigrams with confidence factor classifier from the CRM114 project. It was recommended to me around a year ago by Pedro Martelletto, a talented systems hacker who is also a very nice guy. Thanks Pedro! Anyways, OSBF-Lua is very easy to set up - assuming you already have procmail or something similar installed and hooked up to your MTA. I'm going to assume you are installing this on OpenBSD, but the instructions should not differ much for any modern UNIX system.

Installing OSBF-Lua

First, install the package on your machine. Any moderately recent version of OpenBSD should have a binary package for it, and for its dependency, the Lua programming language:

$ sudo pkg_add -i osbf-lua
lua-5.1.4: complete
osbf-lua-2.0.4p1: complete
On OpenBSD, the important files will be installed to /usr/local/share/osbf-lua. Now follow the official OSBF-Lua install instructions, which I'm reproducing below with the OpenBSD-specific paths.

Configuring OSBF-Lua to filter your mail
# Do the following steps under your account, not as root

# 1) Create your local osbf-lua dir:
    mkdir $HOME/osbf-lua
# 2) Create your log and cache dirs:
    mkdir $HOME/osbf-lua/log
    mkdir $HOME/osbf-lua/cache
# Note: Old messages in the cache dir should be deleted
# regularly, typically from a cron job, to preserve disk space. 

# 3) Copy the spamfilter config file to your dir:
    cp /usr/local/share/osbf-lua/spamfilter_config.lua \
        $HOME/osbf-lua

# 4) Edit spamfilter_config.lua to set your password
    $EDITOR $HOME/osbf-lua/spamfilter_config.lua

# 5) Change to your osbf-lua dir and create the
# spamfilter databases (spam.cfc, nonspam.cfc):
    cd $HOME/osbf-lua
    lua /usr/local/share/osbf-lua/create_databases.lua
# 6) Add the following lines to your .procmailrc:

# set OSBF_LUA_DIR to where spamfilter.lua, 
# spamfilter_command.lua, etc were installed

OSBF_LUA_DIR=/usr/local/share/osbf-lua
OSBF_LUA_USER_DIR=$HOME/osbf-lua

# don't check messages greater than 350000 bytes
:0fw: .msgid.lock
* < 350000
| $OSBF_LUA_DIR/spamfilter.lua --udir $OSBF_LUA_USER_DIR
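
The note in step 2 about clearing out old cached messages can be handled with find(1). The following is a minimal sketch, not something taken from the OSBF-Lua instructions: `prune_osbf_cache` is a hypothetical helper name, and the 30-day retention period is an assumption you should adjust to however long you want to be able to retrain on old messages.

```shell
#!/bin/sh
# Hypothetical helper: delete cached messages older than a number of days.
#   $1 = cache directory, $2 = maximum age in days
prune_osbf_cache() {
    cache_dir=$1
    max_age_days=$2
    # Nothing to do if the cache directory does not exist.
    [ -d "$cache_dir" ] || return 0
    # -type f: regular files only; -mtime +N: last modified more than N days ago.
    find "$cache_dir" -type f -mtime +"$max_age_days" -exec rm -f {} +
}

# Assumed defaults: the cache dir created in step 2, 30 days of retention.
prune_osbf_cache "$HOME/osbf-lua/cache" 30
```

Saved as, say, $HOME/prune-osbf-cache.sh (a made-up name) and made executable, it could be run nightly from your crontab.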

Training OSBF-Lua from inside Mutt

You should now have OSBF-Lua hooked up to your mail pipeline. However, so far it can only be trained with the email gateway control method. That is, OSBF-Lua will notice certain commands in the Subject header and respond to those. While nifty and occasionally useful, it is tedious to train the filter by sending emails to yourself. A much easier method is to set up a Mutt macro which invokes the filter. I have mine bound to shift-s (spam) and shift-h (ham). The additions to your $HOME/.muttrc file are trivial:
macro index,pager H "<pipe-message>~/learn-nonspam.sh<enter>"
macro index,pager S "<pipe-message>~/learn-spam.sh<enter>"
Now for the simple shell scripts.

$HOME/learn-spam.sh:
#!/bin/sh
# Teach the filter that the message is spam.
UDIR=$HOME/osbf-lua
LEARN=/usr/local/share/osbf-lua/spamfilter.lua
"$LEARN" --learn=spam --udir="$UDIR"

$HOME/learn-nonspam.sh:
#!/bin/sh
# Teach the filter that the message is not spam.
UDIR=$HOME/osbf-lua
LEARN=/usr/local/share/osbf-lua/spamfilter.lua
"$LEARN" --learn=nonspam --udir="$UDIR"
Make both scripts executable with chmod +x, and you can train the filter from inside Mutt with a single keypress! That should save a lot of repetitive and time-consuming fiddling with the e-mail gateway method.
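
Since the two scripts differ only in the class name, they could also be folded into one. This is just a sketch of that idea under a few assumptions: the script name learn.sh, the function `osbf_learn`, and the LEARN and UDIR environment overrides are all made up for illustration. The --learn and --udir options are the same ones used in the scripts above.

```shell
#!/bin/sh
# learn.sh (hypothetical name): train OSBF-Lua as spam or nonspam.
# LEARN and UDIR may be overridden from the environment; the defaults
# are the OpenBSD paths used earlier in this post.
osbf_learn() {
    class=$1
    case $class in
        spam|nonspam) ;;
        *) echo "usage: learn.sh spam|nonspam" >&2; return 2 ;;
    esac
    "${LEARN:-/usr/local/share/osbf-lua/spamfilter.lua}" \
        --learn="$class" --udir="${UDIR:-$HOME/osbf-lua}"
}

# Only act when called with an argument, e.g. "learn.sh spam".
if [ "$#" -gt 0 ]; then
    osbf_learn "$@"
fi
```

The Mutt macros would then pipe the message to ~/learn.sh spam and ~/learn.sh nonspam respectively, instead of to two separate scripts.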
