I am trying to use awk inside a bash script for a rather common task: iterate over the lines of a well-structured file and split each line by a delimiter into an array. Here is a sample from the file:

Joe:Johnson:25
Sue:Miller:27

There are numerous examples of how this can be done on a single line in interactive mode. However, I am doing it in a script, and I would like to use the array populated by awk outside the awk process, in bash itself:

cat ${smryfile} | while read smryline; do

    echo ${smryline}

    #now i want to split the line into array 'linearray' in awk but have it usable when i get back to bash
    echo ${smryline} | awk '{split($0,$linearray,":")}'

    varX=$linearray[2]
    echo $varX
    #do something with $varX

done

I get an error:

awk: syntax error at source line 1
 context is
     >>> {split($0,$linearray <<< ,":")}
awk: illegal statement at source line 1

Is it possible to do what I am trying to do (use arrays that are defined in awk outside of its scope) and how should I do it?

  • Please note that ${var} is not the same as "$var". Verify with this: var=" a b c "; echo "$var"; echo ${var} -- you'll see the whitespace removed in the 2nd echo. Commented Feb 5, 2014 at 21:35
  • You try to split a line into an awk array, and then try to access that array under the same name as shell variable! That won't work.
    – U. Windl
    Commented Apr 8, 2019 at 22:05

3 Answers

I think that you can do what you want without awk:

cat "${smryfile}" | while IFS=: read first last varx
do
    echo "first='$first' last='$last' varx='$varx'"
    # do something
done

This produces:

first='Joe' last='Johnson' varx='25'
first='Sue' last='Miller' varx='27'

Note that this approach will work even if some names in the file include spaces.

Note also that the use of cat above is not necessary:

while IFS=: read first last varx
do
    echo "first='$first' last='$last' varx='$varx'"
    # do something
done <"${smryfile}"

A side benefit of removing cat, as per the above, is that any variables that you create in the loop will survive after the loop finishes.
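
To make that last point concrete, here is a minimal sketch, assuming bash with its default settings (that is, without `shopt -s lastpipe`). The temporary file and the `piped_count` and `redirected_count` variables are made up for illustration and are not part of the original script.

```bash
#!/usr/bin/env bash
# Hypothetical sample file standing in for ${smryfile}.
smryfile=$(mktemp)
printf '%s\n' 'Joe:Johnson:25' 'Sue:Miller:27' > "$smryfile"

# Piped version: the while loop runs in a subshell,
# so changes to piped_count are lost when the loop ends.
piped_count=0
cat "$smryfile" | while IFS=: read -r first last varx; do
    piped_count=$((piped_count + 1))
done

# Redirected version: the loop runs in the current shell,
# so redirected_count keeps its final value.
redirected_count=0
while IFS=: read -r first last varx; do
    redirected_count=$((redirected_count + 1))
done < "$smryfile"

echo "piped_count=$piped_count redirected_count=$redirected_count"
rm -f "$smryfile"
```

Under those assumptions this should print `piped_count=0 redirected_count=2`. Shells that run the last pipeline stage in the current shell (ksh, zsh, or bash with `lastpipe` enabled in a script) may behave differently.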

  • I found the same solution simultaneously and it works. Thanks.
    – amphibient
    Commented Feb 5, 2014 at 21:36
  • +1, neat solution for this use-case, although no arrays were assigned as was the original question.
    – grebneke
    Commented Feb 5, 2014 at 21:45
  • This is good information, but it doesn't technically answer the question, since it doesn't assign to an array, and thus is entirely un-scalable.
    – krb686
    Commented Oct 24, 2016 at 2:04

This should work:

linearray=($(awk -F: '{$1=$1} 1' <<<"${smryline}"))
echo "${linearray[2]}"
# output for the line Sue:Miller:27 is: 27

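Explanation: awk -F: splits the input on :. When a field is modified, awk rebuilds the record with the fields joined by the output field separator, which is a single space by default, so you can construct a bash array directly from awk's output. The no-op assignment $1=$1 is what triggers that rebuild; without it the line would come out in its original form. Note that the unquoted $(...) relies on shell word splitting, so a field that itself contains whitespace will be split into several array elements.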
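
The effect of the $1=$1 assignment can be seen in isolation with a small sketch. The `line`, `unchanged`, and `rebuilt` variable names are chosen here for illustration only.

```bash
line='Sue:Miller:27'

# No field is modified, so awk prints the record exactly as it was read.
unchanged=$(printf '%s\n' "$line" | awk -F: '1')

# Assigning to a field makes awk rebuild $0, joining the fields with OFS
# (a single space unless OFS is set to something else).
rebuilt=$(printf '%s\n' "$line" | awk -F: '{$1=$1} 1')

echo "unchanged: $unchanged"
echo "rebuilt:   $rebuilt"
```

The first line keeps its colons (`Sue:Miller:27`), while the second comes out space-separated (`Sue Miller 27`), which is the form the array assignment relies on.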

But given your example, why not extract the third column with awk -F: and loop the output:

awk -F: '{print $3}' "$smryfile" | while read -r varX; do
    echo "$varX"
done
  • Doesn't really work. The array doesn't get the expected data.
    – amphibient
    Commented Feb 5, 2014 at 21:26
  • @amphibient - right, I missed a no-op "hack" when testing and copy-pasting. Updated.
    – grebneke
    Commented Feb 5, 2014 at 21:34
  • I understand that {$1=$1} is a no-op action, but I am not understanding the role of the lone '1' that follows without an intervening ;.
    – Codex24
    Commented Feb 5, 2014 at 23:39
  • @Codex24 it's just awk shorthand for print. The same thing could be written like this: {{$1=$1;} print;}.
    – terdon
    Commented Feb 6, 2014 at 1:30
  • @Codex24 - The general pattern in awk is condition {action} condition2 {action2} .... The default action is print $0, and 1 evaluates to true, so awk '1' file is the equivalent of cat file. The point of {$1=$1} is to make awk believe a field was changed, else it would print the input unchanged including the : which we want to get rid of to create the bash array.
    – grebneke
    Commented Feb 6, 2014 at 7:24

Granted, you did ask for a solution with awk, but I think the sed solution is a bit clearer.

str="Joe:Johnson:25"
array=($(echo "$str" | sed 's/:/ /g'))
for el in "${array[@]}"; do
    echo "el = $el"
done

This gives:

el = Joe
el = Johnson
el = 25

In this case, you must not put double quotes around the command substitution; otherwise you end up with a single string element instead of one element per field.

And this doesn't work for elements that contain spaces, of course. If you had:

str="Joe:Johnson:25:Red Blue"

You'd end up with

el = Joe
el = Johnson
el = 25
el = Red
el = Blue

But if you don't mind using eval, you can insert quotes before the array assignment to get it to work.

eval "array=($(echo "\"$str\"" | sed 's/:/" "/g'))"

# After the command substitution, eval receives this string:
# array=("Joe" "Johnson" "25" "Red Blue")
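
For comparison, here is a sketch of a way to keep the embedded space without eval. It is not part of the answer above; it swaps in bash's `read -a` builtin, assumes bash (the `-a` option and `<<<` here-strings are not POSIX sh), and assumes the string contains no newlines.

```bash
#!/usr/bin/env bash
str="Joe:Johnson:25:Red Blue"

# IFS=: applies only to this read command.
# -r keeps backslashes literal; -a stores the fields in the array named "array".
IFS=: read -r -a array <<< "$str"

for el in "${array[@]}"; do
    echo "el = $el"
done
```

This yields four elements, with `Red Blue` kept together as the last one, and nothing from the input is ever evaluated as shell code.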
