Memory allocation for sparse array in awk

Question

I have searched but could not reach a conclusion: when I define a sparse array, does awk reserve contiguous memory up to the maximum index, or does it allocate memory only for the indices actually used?

array[100000]="ID1"
array[1001]="ID2"

Similarly, when I loop over an array with for, does it scan every index up to the maximum, or does it visit only the indices that were defined, e.g. 100000 and 1001?

for(i in array){...}

I have to store values at specific indices, but I am worried about memory allocation, so it is important for me to know how awk actually allocates memory for a sparse array. Thanks.

Michael Homer · Accepted Answer · 2014-08-12 09:02:58Z


Per the gawk manual, which is a good general awk language reference:

An important aspect to remember about arrays is that array subscripts are always strings.

That is, awk arrays are always associative, and numeric keys are stringified. Only the keys that are in use are stored in the array (and maybe some extra space for the future). Numeric indices need not be contiguous, and no space is reserved for the gaps between them, so a sparse array takes no more space than any other array with the same number of elements.
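A minimal sketch of this behaviour, using the two indices from the question. The function name show_keys is only for illustration and is not part of awk; the element count is done with a loop rather than length(array) so that it does not depend on a particular awk implementation.

```shell
# Illustrative helper (the name show_keys is hypothetical).
show_keys() {
  awk 'BEGIN {
    array[100000] = "ID1"
    array[1001]   = "ID2"
    # The number 1001 and the string "1001" name the same element.
    print (("1001" in array) ? "same" : "different")
    # An index that was never assigned is simply not there.
    print ((5 in array) ? "present" : "absent")
    # Count the stored keys: only the two assigned ones exist.
    n = 0
    for (k in array) n++
    print n
  }'
}
show_keys
```

This should print same, absent and 2. Note that the test uses the in operator: reading array[5] directly in an expression would create an empty element for that key, so in is the safer existence check when memory is a concern.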

As for loops, when using the for (k in array) {body} syntax the:

loop executes body once for each index in array that the program has previously used

Again, only the indices that have been used will be included in the array iteration. Note that the order of iteration is undefined, however; it's not necessarily either numeric or the order of addition to the array.
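As a rough sketch of the loop behaviour, again with the question's two indices: the loop body runs twice, not 100000 times. Because the iteration order is undefined, this example pipes the output through sort -n to get a stable numeric order; the function name iterate_sparse is hypothetical.

```shell
# Illustrative helper (the name iterate_sparse is hypothetical).
iterate_sparse() {
  awk 'BEGIN {
    array[100000] = "ID1"
    array[1001]   = "ID2"
    # Visits only the indices that were assigned, in unspecified order.
    for (i in array) print i, array[i]
  }' | sort -n   # impose numeric order outside awk
}
iterate_sparse
```

With the sort applied, the output is the line 1001 ID2 followed by 100000 ID1.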

+1 for the nice explanation. So basically it will allocate only something like a 4-byte integer for each of the 2 values? – Aashu Aug 12 '14 at 9:26

It probably needs slightly more space than the number of actual entries, but more or less, yes. It's likely to be implemented with a hash table, which means it will average something like k * n * sizeof(void*) for some k ~2 and n the number of populated entries. It's undefined, though, so it could be exactly linear as you said. – Michael Homer Aug 12 '14 at 9:31

cuonglm · Answer 2 · 2014-08-12 12:23:14Z

With gawk, you can read a detailed explanation of arrays in its manual:

In most other languages, arrays must be declared before use, including a specification of how many elements or components they contain. In such languages, the declaration causes a contiguous block of memory to be allocated for that many elements. Usually, an index in the array must be a positive integer. For example, the index zero specifies the first element in the array, which is actually stored at the beginning of the block of memory. Index one specifies the second element, which is stored in memory right after the first element, and so on. It is impossible to add more elements to the array, because it has room only for as many elements as given in the declaration. (Some languages allow arbitrary starting and ending indices—e.g., ‘15 .. 27’—but the size of the array is still fixed when the array is declared.)

....

Arrays in awk are different—they are associative. This means that each array is a collection of pairs: an index and its corresponding array element value

So you can define an array without specifying its size:

$ awk 'BEGIN{a[0]=1;a[10]=2;print length(a)}'
2

This is unlike Perl, which uses a contiguous block of memory for arrays:

$ perl -le '$a[0]=1;$a[10]=1;print ~~@a'
11

A Perl hash, on the other hand, is very similar to a gawk array:

$ perl -le '$a{0}=1;$a{10}=1;print ~~keys %a'
2

Because gawk arrays are implemented as hash tables, you can access any element in constant time on average, independent of the size of the array.
