TODO
====
a GUI that shows all stopped spam/passed not-spam messages and lets the user add them to the corpus?
read mbox format
add spam header to spam emails
use Mail::Audit
better docs
better installation

DONE
====
* (many) database support
* (many) incremental add/remove/rebuild of ratings
* (Roger Burton West/many) base64/MIME decode (MIME::Parser)
* (Michel Rodriguez) use strict on build
* use strict on check
* (Lee Henderson) optimize: a word found only in the "not-spam" corpus gets probability 0.01
* (Chris Shenton) recurse directories
* (Chris Shenton) ignore too-large files
* (Paul Graham) remove HTML comments
* (Paul Graham) remove all-numeric tokens
* (Paul Graham) only count duplicate interesting tokens once
* (Michael J. Pomraning) fix GPL preamble..."foobar"?
* check usage on each program
* fix split regex
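
Several of the items above (HTML comment removal, dropping all-numeric
tokens, the 0.01 shortcut, counting each interesting token once) fit
together roughly as in the sketch below. It is written in Python for
brevity and is not taken from this project's code. The function names,
the token regex, the lowercasing, the 0.4 value for unknown words, the
0.99 ceiling and the limit of 15 tokens are all assumptions made for
illustration.

```python
import re

# Assumed patterns; the project's actual split regex may differ.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
TOKEN = re.compile(r"[A-Za-z0-9$'-]+")


def tokenize(text):
    """Strip HTML comments, split into tokens, drop all-numeric tokens."""
    text = HTML_COMMENT.sub("", text)
    return [t.lower() for t in TOKEN.findall(text) if not t.isdigit()]


def word_probability(word, spam_counts, ham_counts, n_spam, n_ham):
    """Estimate how spammy a word is, as a value between 0.01 and 0.99."""
    s = spam_counts.get(word, 0)
    h = ham_counts.get(word, 0)
    if s == 0 and h == 0:
        return 0.4   # assumed default for words never seen before
    if s == 0:
        return 0.01  # shortcut: word appears only in the not-spam corpus
    if h == 0:
        return 0.99  # assumed mirror case: word appears only in spam
    spam_freq = min(1.0, s / n_spam)
    ham_freq = min(1.0, h / n_ham)
    p = spam_freq / (spam_freq + ham_freq)
    return max(0.01, min(0.99, p))


def interesting_tokens(tokens, prob, limit=15):
    """Pick the tokens whose probability is farthest from 0.5.

    Going through set() means a token repeated in the message
    is only counted once.
    """
    unique = set(tokens)
    return sorted(unique, key=lambda t: (-abs(prob(t) - 0.5), t))[:limit]
```

For example, tokenize("Buy fr<!-- x -->ee pills 100 now") yields
buy, free, pills, now: the comment hiding "free" is removed and the
all-numeric token "100" is dropped.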

EXPERIMENT WITH
===============
ignore tokens that are too short/too long
how resistant is the filter to miscategorization?