The gawk manual's section on arrays gives a detailed explanation:
In most other languages, arrays must be declared before use, including
a specification of how many elements or components they contain. In
such languages, the declaration causes a contiguous block of memory to
be allocated for that many elements. Usually, an index in the array
must be a positive integer. For example, the index zero specifies the
first element in the array, which is actually stored at the beginning
of the block of memory. Index one specifies the second element, which
is stored in memory right after the first element, and so on. It is
impossible to add more elements to the array, because it has room only
for as many elements as given in the declaration. (Some languages
allow arbitrary starting and ending indices—e.g., ‘15 .. 27’—but the
size of the array is still fixed when the array is declared.)
....
Arrays in awk are different—they are associative. This means that each array is a collection of pairs: an index and its corresponding array element value
So you can define an array without specifying its size:
$ awk 'BEGIN{a[0]=1;a[10]=2;print length(a)}'
2
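To see that only the assigned indices exist, here is a minimal sketch. It assumes a POSIX shell and an awk whose length() accepts an array (gawk does); the shell variable name out is only for illustration. It tests membership with the in operator, then shows that merely referencing a missing element creates it:

```shell
# Sparse awk array: only the indices that were assigned are stored.
out=$(awk 'BEGIN {
    a[0] = 1; a[10] = 2
    print length(a)            # two index/value pairs are stored

    # The in operator tests for an index without creating it.
    if (5 in a)
        print "5 present"
    else
        print "5 absent"
    if (10 in a)
        print "10 present"
    else
        print "10 absent"

    x = a[5]                   # referencing a[5] creates it with an empty value
    print length(a)
}')
printf '%s\n' "$out"
```

This should print 2, "5 absent", "10 present" and 3: the indices between 0 and 10 take no space until something touches them.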
gawk arrays are not like perl arrays, which use a contiguous block of memory:
$ perl -le '$a[0]=1;$a[10]=1;print ~~@a'
11
A perl hash, on the other hand, is very similar to a gawk array:
$ perl -le '$a{0}=1;$a{10}=1;print ~~keys %a'
2
Because gawk arrays are implemented as hash tables, you can access any element in constant time on average, independent of the size of the array.