BOGOUTIL(1)
NAME
bogoutil - Dumps, loads, and maintains bogofilter database files
SYNOPSIS
bogoutil [options] {-d file | -H | -l file | -m | -w file_or_dir |
-p file_or_dir} file.db
bogoutil {-r | -R} directory
bogoutil {-h | -V}
where options is
[-v] [-n] [-D] [-a age] [-c count] [-s min,max] [-y date] [-I file]
[-x flags]
DESCRIPTION
Bogoutil is part of the bogofilter Bayesian spam filter package.
It is used to dump and load bogofilter's Berkeley DB databases to and from
text files, perform database maintenance functions, and to display the
values for specific words.
OPTIONS
The -d file option tells bogoutil to print the contents of the database
file to stdout.
The -H file_or_dir option tells bogoutil to print a histogram of the
specified database file to stdout. The output is similar to that of
bogofilter -vv. In addition, hapaxes (tokens which were seen only once) and
pure tokens (tokens which were encountered only in ham or only in spam) are
counted.
The -l file option tells bogoutil to load the data from stdin into
the database file.
The -m option tells bogoutil to perform maintenance functions on the
specified database, i.e. discard tokens that are older than desired, have
counts that are too small, or have sizes (lengths) that are too long or too
short.
The -w file_or_dir option tells bogoutil to display token information from
the database. The option takes an argument, which is either the name of the
wordlist (usually wordlist.db) or the name of the directory containing it.
Tokens can be listed on the command line or piped to bogoutil. When there
are extra arguments on the command line, bogoutil will use them as the
tokens to look up. If there are no extra arguments, bogoutil will read
tokens from stdin.
The -p file_or_dir option tells bogoutil to display the database
information for one or more tokens. The display includes a probability
column, which is the token's spam score (computed using bogofilter's default
values). Option -p takes the same arguments as option -w.
The -r option tells bogoutil to recalculate the ROBX value and print it as
a six-digit fraction.
The -R option does the same as -r, but prints more information and saves
the result in the training database.
The -I file option tells bogoutil to read its input from file rather than
stdin.
The -v option produces verbose output on stderr. This option is primarily
useful for debugging.
The -D option redirects debug output to stdout (it usually goes to stderr).
The -x flags option sets debugging flags.
Option -n stands for "replace non-ascii characters". It replaces
characters that have the high bit (0x80) set with question marks. This can
be useful if a word list has many unreadable tokens, for example from Asian
spam. The "bad" characters are converted to question marks, and matching
tokens are combined when -n is used with '-m' or '-l', but not with '-d'.
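The effect of -n can be illustrated with a minimal Python sketch. It is not
bogoutil's own code; the function names replace_high_bit and merge_tokens are
hypothetical, and tokens are assumed to be raw byte strings mapped to a
single count.

```python
from collections import Counter
from typing import Dict


def replace_high_bit(token: bytes) -> bytes:
    """Replace every byte that has the high bit (0x80) set with b'?'."""
    return bytes(0x3F if b & 0x80 else b for b in token)


def merge_tokens(counts: Dict[bytes, int]) -> Counter:
    """Combine the counts of tokens that become identical after replacement.

    Assumption: merging simply adds the counts of matching tokens.
    """
    merged: Counter = Counter()
    for token, count in counts.items():
        merged[replace_high_bit(token)] += count
    return merged
```

Two different two-byte tokens made entirely of high-bit bytes both become
"??" under this sketch, so their counts end up in a single entry.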
Option -a age indicates an acceptable token age, with older ones being
discarded. The age can be a date (in form YYYYMMDD) or a day count, i.e.
discard tokens older than age days.
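The two forms of the age argument can be sketched in Python as follows. This
is an illustration only, not bogoutil's implementation. It assumes that an
eight-digit argument is a YYYYMMDD date and anything else is a day count, and
that a token is discarded when its date is strictly earlier than the cutoff.
The names cutoff_date and is_too_old are hypothetical.

```python
from datetime import date, datetime, timedelta


def cutoff_date(age: str, today: date) -> date:
    """Turn an age argument into the oldest acceptable token date."""
    if len(age) == 8 and age.isdigit():
        # Assumption: eight digits means a YYYYMMDD date.
        return datetime.strptime(age, "%Y%m%d").date()
    # Otherwise treat the argument as a number of days before today.
    return today - timedelta(days=int(age))


def is_too_old(token_date: str, age: str, today: date) -> bool:
    """Return True if a token dated token_date (YYYYMMDD) would be discarded."""
    parsed = datetime.strptime(token_date, "%Y%m%d").date()
    return parsed < cutoff_date(age, today)
```

Passing today explicitly keeps the sketch deterministic; a real tool would
take the current date from the system clock.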
Option -c count indicates that tokens with counts less than or equal to
count are to be discarded.
Option -s min,max is used to discard tokens based on their size, i.e.
length. All tokens shorter than min or longer than max will be discarded.
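The count and size rules for -c and -s can be expressed as one predicate. The
Python sketch below only restates the rules given above; parse_size and
should_discard are hypothetical names, and token length is assumed to be
measured in characters.

```python
from typing import Optional, Tuple


def parse_size(arg: str) -> Tuple[int, int]:
    """Split a 'min,max' argument into two integers."""
    lo, hi = arg.split(",")
    return int(lo), int(hi)


def should_discard(token: str, count: int,
                   max_discard_count: Optional[int] = None,
                   size_range: Optional[Tuple[int, int]] = None) -> bool:
    """Apply the -c and -s rules; a rule set to None is not applied."""
    # -c: counts less than or equal to the given value are discarded.
    if max_discard_count is not None and count <= max_discard_count:
        return True
    # -s: tokens shorter than min or longer than max are discarded.
    if size_range is not None:
        lo, hi = size_range
        if len(token) < lo or len(token) > hi:
            return True
    return False
```

For example, with a count threshold of 1 a token seen once is discarded and a
token seen twice is kept, and with a size argument of "3,20" a two-character
token is discarded.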
Option -y date specifies the date to give to tokens that don't have
dates.
The -h option prints the help message and exits.
The -V option prints the version number and exits.
DATA FORMAT
Bogoutil reads and writes text files where each nonblank line consists of
a word, any amount of horizontal whitespace, a numeric word count, more
whitespace, and (optionally) a date in form YYYYMMDD. Blank lines are
skipped.
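A reader for this text format might look like the following Python sketch. It
follows only the description above (word, count, optional YYYYMMDD date) and
is not taken from bogoutil itself; the names Entry and parse_line are
hypothetical, and rejecting malformed lines with ValueError is an assumption.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    word: str
    count: int
    date: Optional[str]  # YYYYMMDD, or None when the line has no date


def parse_line(line: str) -> Optional[Entry]:
    """Parse one line of the text format; return None for a blank line."""
    fields = line.split()  # splits on any run of whitespace
    if not fields:
        return None  # blank lines are skipped
    if len(fields) not in (2, 3):
        raise ValueError(f"expected 'word count [date]', got {line!r}")
    word, count = fields[0], int(fields[1])
    date = fields[2] if len(fields) == 3 else None
    if date is not None and not (len(date) == 8 and date.isdigit()):
        raise ValueError(f"date must be YYYYMMDD, got {date!r}")
    return Entry(word, count, date)
```

A whole file can then be read by calling parse_line on each line and dropping
the None results.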
RETURN VALUES
0 for successful operation. 1 for most errors. 3 for I/O or other errors.
Error 3 usually means that something is seriously wrong with the database
files.
AUTHOR
Gyepi Sam <gyepi@praxis-sw.com>.
Matthias Andree <matthias.andree@gmx.de>.
David Relson <relson@osagesoftware.com>.
For updates, see the bogofilter project page:
http://bogofilter.sourceforge.net/.