Consider the following CSV file:

A; B; ;
B; ; A;
C; ; E F;
D; ; E;
E; C; ;

The fields:

  • $1: the jname, a unique id for the entry.
  • $2: a space-separated list of inconds.
  • $3: a space-separated list of outconds.

For the "link" A-B to be valid, job A must list B as an outcond, and job B must list A as an incond.

In the above example, D-E is not a valid "link" because E doesn't define D as incond. C-F is not a valid "link" because F doesn't exist.

A cond is invalid if the link it forms is invalid. The script must detect all invalid conds and report which jobs are affected.

#!/usr/bin/awk -f

BEGIN {
    FS=" *; *";
    delim = "-";
    conds[""]=0;
}

{

    icnd_size = split($2, incond_list, " ");

    for (i=1; i<=icnd_size; ++i) {
        conds[incond_list[i] delim $1]++;
    }

    ocnd_size = split($3, outcond_list, " ");

    for (i=1; i<=ocnd_size; ++i) {
        conds[$1 delim outcond_list[i]]--;
    }

}

END {

    for (i in conds) {
        sz = split(i, answer, delim);

        if (conds[i] == 1) {
            j = answer[2];
            c = answer[1];
            inorout = "INCOND";
        }

        if (conds[i] == -1) {
            j = answer[1];
            c = answer[2];
            inorout = "OUTCOND";
        }

        if (conds[i] != 0)
            print "Invalid", inorout, c, "on job", j;
    }
}

The script works on the example above, although I do not have a large data set to test it against. I see two problems with it:

  1. The script will break if a cond name contains the delim character.
  2. The script might break (and/or report false positives) if a line is inserted twice or if two lines have the same jname.

I could use any tips on addressing these two problems, as well as any critique of the code. It's literally my first awk script.
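
One possible direction for both problems, as a minimal sketch rather than a definitive fix: record incond and outcond declarations in two separate arrays keyed by awk's built-in multi-dimensional subscripts (joined with SUBSEP, a control character by default) instead of a "-" delimiter, and test membership instead of counting. A repeated line then just sets the same key again, and a repeated jname is reported separately. The names check_links, has_in, has_out and seen are hypothetical, the "Duplicate job" message is an assumption about the desired output, and the shell wrapper is only there so the awk program can be fed input on stdin.

```shell
#!/bin/sh
# check_links: read "jname; inconds; outconds;" records on stdin and
# report every cond whose link is not declared on both sides.
# (Hypothetical helper name; the awk program is the part that matters.)
check_links() {
    awk '
    BEGIN { FS = " *; *" }

    # Skip blank lines.
    $1 == "" { next }

    {
        # Warn about a repeated jname; its conds are still recorded.
        if (seen[$1]++)
            print "Duplicate job", $1, "on line", NR

        # [src, dst] subscripts are joined with SUBSEP, not "-",
        # so a "-" inside a name cannot corrupt the key.
        n = split($2, in_list, " ")
        for (i = 1; i <= n; i++)
            has_in[in_list[i], $1] = 1      # job $1 lists in_list[i] as incond

        n = split($3, out_list, " ")
        for (i = 1; i <= n; i++)
            has_out[$1, out_list[i]] = 1    # job $1 lists out_list[i] as outcond
    }

    END {
        # An outcond with no matching incond on the other job is invalid.
        for (key in has_out)
            if (!(key in has_in)) {
                split(key, part, SUBSEP)
                print "Invalid OUTCOND", part[2], "on job", part[1]
            }
        # An incond with no matching outcond on the other job is invalid.
        for (key in has_in)
            if (!(key in has_out)) {
                split(key, part, SUBSEP)
                print "Invalid INCOND", part[1], "on job", part[2]
            }
    }
    '
}
```

Usage would be something like `check_links < jobs.txt`, where jobs.txt is a hypothetical file name. The order of the reported lines is unspecified, because awk does not define the iteration order of `for (key in array)`. A name that itself contained the SUBSEP character would still break the keys, but that is far less likely than a name containing "-".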

Not really a CSV file is it! C => Comma => ','. You have a SSC. Semicolon separated file. –  Loki Astari Mar 5 '12 at 17:09
