This one-liner removes duplicate lines from text input without pre-sorting.
For example:
$ cat >f
q
w
e
w
r
$ awk '!a[$0]++' <f
q
w
e
r
$
The original code I found on the internet read:
awk '!_[$0]++'
This was even more perplexing to me, as I took _ to have a special meaning in awk, like in Perl, but it turned out to be just the name of an array.
Now, I understand the logic behind the one-liner: each input line is used as a key in an associative array, so each unique line is recorded only once, and the output keeps the lines in the order of first arrival.
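To make sure I have the logic right, here is how I would write it out long-hand. This is only my own sketch of what I think the one-liner does, not something taken from the awk documentation; the names dedup and seen are ones I made up for illustration.

```shell
# dedup is a hypothetical helper name wrapping the long-hand awk program
dedup() {
    awk '{
        # seen is a hypothetical array name; any identifier would do
        if (seen[$0] == 0) {
            # an unset element compares equal to 0, so this is the first time the line appears
            print $0
        }
        # count the line so that later copies are not printed
        seen[$0]++
    }' "$@"
}

# same input as in the session above
printf 'q\nw\ne\nw\nr\n' | dedup
```

With the same input as above, this prints q, w, e, r, just like the one-liner.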
What I would like to learn is how exactly this notation is interpreted by awk. For example, what does the bang sign (!) mean, and what do the other elements of this code snippet do?
How does it work?