{ ^_^ } sinustrom Solving life, one problem at a time!

Address book based e-mail approval in procmail

2020-08-09
Author: Zoltan Puskas
Categories: linux

Spam filters can be a bit too aggressive and sometimes mark e-mail from my friends and family as spam. Since I check my spam folder at most once a week, it’s easy for some messages to go unnoticed for a long time. To fix this, I decided to add a new rule to procmail that approves all e-mail coming from addresses in my contact list, preferably without manually syncing the address book to procmail.

I organize my contact information with an ncurses based tool called abook, since it works well with the mutt mail client and it’s also usable over SSH. It stores contact information in a text file using the INI format. All I needed to do was to somehow convert this information into a procmail allowlist.

Dynamic matching

As a first try I added the following rule to .procmailrc, which did the trick:

ABOOK=/usr/bin/abook
FORMAIL=/usr/bin/formail

:0
{
  :0 hw
  FROM=|$FORMAIL -zcx"From:" | sed -E 's/[<>]//g' | sed -E 's/.*\s(\S+@\S+)\s?.*/\1/' 

  :0
  * ? $ABOOK --mutt-query "$FROM"
  $MAILDIR
}

As the rule has two parts, I placed it in a rule block for readability.

The first rule in the block sends the message’s header through formail to extract the “From:” field, and then uses sed to get only the e-mail address. The sed part is needed to get rid of the descriptive name, so I end up with the e-mail address alone, e.g. if the “From:” field contains something like “John Doe <john@mailservice.com>”, I only need the “john@mailservice.com” part. The result is stored in the FROM variable.
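To see what the sed stage does in isolation, the pipeline can be wrapped in a small shell function and fed sample “From:” values by hand. This is only a sketch: extract_addr is a name made up for illustration, and POSIX character classes stand in for the \s and \S shorthands so that it does not depend on GNU sed.

```shell
# extract_addr (hypothetical helper): reduce a "From:" value read from
# stdin to the bare e-mail address, mirroring the two sed calls in the rule.
extract_addr() {
    sed -E 's/[<>]//g' \
        | sed -E 's/.*[[:space:]]([^[:space:]]+@[^[:space:]]+)[[:space:]]?.*/\1/'
}

printf '%s\n' 'John Doe <john@mailservice.com>' | extract_addr
printf '%s\n' 'john@mailservice.com' | extract_addr
```

Both calls print john@mailservice.com. A value in the older “john@mailservice.com (John Doe)” style passes through unchanged, because the pattern expects the address to be the last whitespace-separated word containing an @ sign, so such senders would not match a plain address lookup.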

The second rule in the block does the actual checking. It runs a query against abook with the given e-mail address. If the address is in my list, abook prints the match to its output, which is discarded, and exits with success (exit code: 0); otherwise it exits with an error (exit code: 1). The ? condition makes its decision based on that exit code, and if a match is found, the message is placed into my inbox.
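The same exit-code logic can be tried outside procmail. The sketch below is a shell analogy, not procmail’s implementation: route_by_exit is a hypothetical name; it runs whatever command it is given, throws the output away, and reports which branch the recipe would take.

```shell
# route_by_exit (hypothetical): run a command the way a "* ?" condition
# would use it, looking only at its exit status and ignoring its output.
route_by_exit() {
    if "$@" >/dev/null 2>&1; then
        echo "match: deliver to inbox"
    else
        echo "no match: fall through to the next rule"
    fi
}

route_by_exit true     # stands in for a successful abook query
route_by_exit false    # stands in for a failed one
```

With abook installed, route_by_exit abook --mutt-query john@mailservice.com exercises the real lookup instead of the stand-ins.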

Pre-generated filter matching

While the above rule works well, it involves running four extra processes on every message filtered: formail + sed + sed + abook, which makes it more resource intensive compared to running through a pre-defined rule list using procmail’s internal egrep.

Since abook does not support advanced listing or querying, I wrote a small script, called abook2procmail, to gather all email addresses and format them into a procmail rule that can be included.
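To give an idea of what such a generator has to do, here is a minimal sketch. It is not the actual abook2procmail script. It assumes that the address book stores addresses on lines of the form email=a@x,b@y, and that a single recipe matching the “From:” header and delivering to $MAILDIR (as in the dynamic rule above) is the desired output. The name gen_allowlist is made up for illustration.

```shell
# gen_allowlist (hypothetical): print a procmail recipe that matches any
# address found in an abook-style INI file.
# $1: path to the address book (assumed to contain "email=..." lines)
gen_allowlist() {
    addrs=$(sed -n 's/^email=//p' "$1" \
        | tr ',' '\n' \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' \
        | sed 's/[.+]/\\&/g' \
        | paste -s -d '|' -)

    # Refuse to emit a recipe with an empty alternation.
    [ -n "$addrs" ] || return 1

    # "." and "+" were escaped above because they are regex metacharacters
    # that commonly appear in addresses; other characters are left as-is.
    printf ':0\n* ^From:.*(%s)\n$MAILDIR\n' "$addrs"
}
```

Called as gen_allowlist ~/.abook/addressbook > abook.rc (the path is an assumption about where abook keeps its data), it would write one recipe whose condition is an alternation of all known addresses, which procmail can evaluate without starting any external process.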

With the helper script the process becomes quite easy. First an include rule file, i.e. an allowlist, is generated by running:

$ abook2procmail --procmailrc ~/.procmail/abook.rc

which is then included in ~/.procmailrc, close to the top of the file, by adding the following line:

INCLUDERC=$HOME/.procmail/abook.rc

This makes the filtering more efficient, though it requires regenerating the filter manually if the address book changes.

Automatically updating the filter

Fortunately, address book contents do not change that often, but it still makes sense to regenerate the allowlist include file automatically, so there is no need to remember to run the generator script after every update.

via crontab

Adding a personal crontab entry can trigger periodic regeneration (use crontab -e to edit your personal cron configuration). The example below updates the filter from the address book once a day at 23:00.

0 23 * * * /usr/bin/abook2procmail --procmailrc ~/.procmail/abook.rc.new && mv ~/.procmail/abook.rc.new ~/.procmail/abook.rc

The move is what makes the update atomic: procmail sees either the old file or the complete new one, never a half-written file, even if a message arrives while the new rule is being generated.
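The generate-then-rename pattern can be wrapped in a reusable function. The sketch below is one possible way to do it, with replace_atomically being a hypothetical name. It creates the temporary file next to the target, since a rename is only atomic within a single filesystem, and it cleans up if the generator fails.

```shell
# replace_atomically (hypothetical helper)
# Usage: replace_atomically TARGET COMMAND [ARGS...]
# Runs "COMMAND ARGS... TMPFILE" and, on success, renames TMPFILE over TARGET.
replace_atomically() {
    target=$1
    shift

    # Same directory as the target, so mv is a rename and not a copy.
    tmp=$(mktemp "${target}.XXXXXX") || return 1

    if "$@" "$tmp" && mv -f "$tmp" "$target"; then
        return 0
    fi

    # Generator or rename failed: leave the old target in place.
    rm -f "$tmp"
    return 1
}
```

The cron command could then be written as replace_atomically ~/.procmail/abook.rc /usr/bin/abook2procmail --procmailrc, with the temporary path appended as the last argument.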

via Git hook

I happen to store and distribute my address book via Git, so it makes sense to use a Git hook to update my filters whenever changes are pulled from the central repository. In my local checkout I added the following ~/.abook/.git/hooks/post-merge hook (the file must be executable for Git to run it):

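#!/bin/sh
# post-merge hook: regenerate the procmail allowlist after every merge.
# Written for plain POSIX sh (no "function" keyword, no "echo -n").

# Remove the temporary file and abort with a non-zero exit status.
fail_clean() {
    rm -f "$1"
    echo "failed"
    exit 1
}

regenerate() {
    printf 'Regenerating procmail allowlist... '

    TMPOUT=$(mktemp) || { echo "failed"; exit 1; }
    /usr/local/bin/abook2procmail --procmailrc "${TMPOUT}" || fail_clean "${TMPOUT}"
    mv "${TMPOUT}" ~/.procmail/abook.rc || fail_clean "${TMPOUT}"

    echo "done"
}

# Main
regenerate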

We can also update the cron job to issue a git pull instead of running the generator script directly like so:

0 23 * * * /usr/bin/git -C ~/.abook pull

This solution is much better, for several reasons. First, the procmail include is updated immediately when a new revision of the address book is pulled from the repository. Second, in combination with the updated cron job, it keeps both the address book and the filter from becoming stale. Finally, it minimizes writes to the SSD, both by using /tmp, which is tmpfs, and by initiating writes only when there is an update, as git pull leaves the working tree untouched if the local checkout is already up to date.

