
I wrote a small script to analyse spam false negatives: messages that are spam by nature but ended up in the INBOX folder because the spam filter failed to detect them. (I personally use SpamAssassin, and unfortunately this does happen, if rarely.)

The goal is to analyse spam messages that were moved to the spam folder manually, by running a script as a cron job.

Obviously, all spam messages will be analysed regardless of which email client you're using (Thunderbird, Kaiten Mail or Roundcube), because all that's needed is for a message to be moved to the spam folder.

Here is my working PHP script:

<?php
// Check whether there are messages in the SPAM folder that lack the [SPAM] label (e.g. manually moved from INBOX)
exec("grep -L '\[SPAM\]' /home/domainexample.ru/Maildir/.Junk/cur/* 2> /dev/null", $spam_messages);

if (!empty($spam_messages)) {

    $sa_learn = '/usr/bin/sa-learn --spam';
    foreach ($spam_messages as $spam_message) {
        //Learn a message that we believe is spam
        $sa_learn .= ' ' . $spam_message;
        $marked_as_spam = file_get_contents($spam_message);
        // Add the [SPAM] flag to the subject of the analysed message
        $marked_as_spam = str_replace("Subject:", "Subject: [SPAM] ", $marked_as_spam);
        file_put_contents($spam_message, utf8_encode($marked_as_spam));
    }
    // Logging the results
    $sa_learn .= ' > ' . '/home/domainexample.ru/.spamassassin/logs/' . date('d.m.Y-G:i') . '_analyzer.log';
    //Executing analyzer in background 
    shell_exec("nohup $sa_learn 2> /dev/null & echo $!");
    // Cleaning cached messages in Dovecot
    shell_exec("rm -f /var/lib/dovecot/index/domainexample.ru/.Junk/*");
}
?>

I would welcome any ideas for improving this already-working PHP script, but most importantly, I would like to learn how to write the same script in Perl and/or Bash.

Could you please suggest improvements to the current PHP script, along with complete, working examples of the same scenario in Perl and/or Bash?

Interesting script, but it's a bit of a tall order to do all of these things. If you focus on either a single language (and show us the code if it's not PHP), you probably have a better chance at getting an answer. Judging by the use of exec and shell_exec it should be easy to convert this to a Bash script. –  l0b0 Jun 7 at 6:52
 
It wouldn't be that easy for me to do, as I'm not familiar with Perl, for instance. It would take me a long time, but only a few minutes for an expert in Bash or Perl. I was just asking for an example in Bash and/or Perl. –  Ilia Rostovtsev Jun 8 at 6:57

1 Answer

Here's an untested Bash version:

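#!/usr/bin/env bash
set -o errexit -o noclobber -o nounset

# One log file per run; the name is close to the PHP script's date('d.m.Y-G:i')
log="/home/domainexample.ru/.spamassassin/logs/$(date +%d.%m.%Y-%H:%M)_analyzer.log"

# grep -L prints the files that do NOT contain [SPAM]; each one is read into $path
while IFS= read -r -u 9 path
do
    # Learn, tag and delete in a single background job, so that sed and rm
    # cannot touch the message before sa-learn has read it
    {
        /usr/bin/sa-learn --spam "$path" >> "$log" 2>&1
        sed -i -e 's/^Subject:/Subject: [SPAM] /' "$path"
        rm "$path"
    } &
done 9< <(grep -FL '[SPAM]' /home/domainexample.ru/Maildir/.Junk/cur/* 2> /dev/null)
wait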
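
To make it easier to see how `path` gets its values in the loop above, here is a minimal, self-contained sketch that runs the same `grep -FL` / `while read` pattern against a throwaway directory. The directory, file names and message contents are made up for illustration, and `sa-learn` is not called at all:

```shell
#!/usr/bin/env bash
set -o errexit -o nounset

# Hypothetical stand-in for Maildir/.Junk/cur, created in a temp directory
junk_dir=$(mktemp -d)
printf 'Subject: [SPAM] already tagged\n\nbody\n' > "$junk_dir/msg1"
printf 'Subject: cheap pills\n\nbody\n' > "$junk_dir/msg2"
printf 'Subject: you won\n\nbody\n' > "$junk_dir/msg3"

untagged=()
# grep -L lists the files with NO match; -F treats [SPAM] as a literal string.
# The list is fed to the loop on file descriptor 9, one path per line.
while IFS= read -r -u 9 path
do
    untagged+=("$(basename "$path")")
done 9< <(grep -FL '[SPAM]' "$junk_dir"/* 2> /dev/null)

# Print the names of the messages that still need to be learned and tagged
printf '%s\n' "${untagged[@]}"
rm -r "$junk_dir"
```

Each line that `grep` prints becomes one value of `path`, so only the two untagged messages reach the loop body. Note that process substitution (`<(...)`) needs Bash; it will not work under a plain `sh`.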

Some changes from the original:

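  • Uses grep's -F option, which treats the pattern as a fixed string: the brackets need no escaping and the search is faster.
  • Runs sa-learn once per message instead of once for all messages, to avoid having to accumulate the file list. This shouldn't slow down the execution, since the processes are backgrounded.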
  • Deletes files as soon as they are processed, to avoid reprocessing if the previous run failed.
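
One thing to be aware of: the `sed` expression in the script rewrites every line that starts with `Subject:`, including lines in the message body. Below is a sketch of one way to limit the change to the header block, assuming GNU sed and a message whose headers end at the first empty line. The sample message is invented for the demonstration:

```shell
#!/usr/bin/env bash
set -o errexit -o nounset

# Hypothetical sample message; the body also contains a "Subject:" line
msg=$(mktemp)
printf 'From: a@example.com\nSubject: cheap pills\n\nSubject: not a header\n' > "$msg"

# The address range 1,/^$/ covers line 1 up to the first empty line,
# i.e. the header block, so only the real Subject header is tagged
sed -i -e '1,/^$/ s/^Subject:/Subject: [SPAM]/' "$msg"

result=$(cat "$msg")
printf '%s\n' "$result"
rm "$msg"
```

The replacement here omits the trailing space, because the original header already has a space after the colon.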
 
Based on the logic above, it's so different from the PHP version that I don't see how $path is created or looped over. Unfortunately, your example isn't working. I checked it on the test server and nothing happens: logs are not created, messages are not processed, and the clean-up doesn't delete anything. What is wrong? –  Ilia Rostovtsev Jun 11 at 7:44
 
As the logs are empty, I can't tell what is going on there. The script execution just hangs until you press CTRL+C. –  Ilia Rostovtsev Jun 12 at 8:16
 
Yes. It prints everything out to the console. –  Ilia Rostovtsev Jun 12 at 8:30
 
Yes, correct, the logs are created every time and they're just empty. –  Ilia Rostovtsev Jun 12 at 8:36
 
let us continue this discussion in chat –  l0b0 Jun 12 at 8:37