Array
From DocForge
An array is a linear data structure that consists of a group of elements having a single name that are accessed by indexing. In most programming languages each element has the same data type and the array occupies a contiguous area of storage.
Array elements are accessed by their numeric position in the array (index) or by named positions (keys). An associative array is one in which each element is referenced by a key, typically a string or other basic type. Most languages support multi-dimensional arrays, or arrays of arrays, in which more than one index is needed to access an element. A two-dimensional array might be thought of as the elements in a plane that are indexed by x and y coordinates, while a three-dimensional array might be thought of as a volume that is indexed by x, y, and z coordinates.
[edit] Benefits and Limitations
Arrays permit efficient (constant-time, O(1)) random access, but not efficient insertion and deletion of elements (which are O(n), where n is the size of the array). Linked lists have the opposite trade-off. Consequently, arrays are most appropriate for storing a fixed amount of data which will be accessed in an unpredictable fashion, and linked lists are best for a list of data which will be accessed sequentially and updated often with insertions or deletions.
Another advantage of arrays that has become very important on modern architectures is that iterating through an array has good locality of reference, and so is much faster than iterating through a linked list of the same size, which tends to jump around in memory. However, an array can also be accessed in a random way, as is done with large hash tables, and in this case this is not a benefit.
Arrays also are among the most compact data structures; storing 100 integers in an array takes only 100 times the space required to store an integer, plus perhaps a few bytes of overhead for the pointer to the array (4 on a 32-bit system). Any pointer-based data structure, on the other hand, must keep its pointers somewhere, and these occupy additional space. This extra space becomes more significant as the data elements become smaller. For example, an array of ASCII characters takes up one byte per character, while on a 32-bit platform, which has 4-byte pointers, a linked list requires at least five bytes per character. Conversely, for very large elements, the space difference becomes a negligible fraction of the total space.
Because arrays have a fixed size, there are some indexes which refer to invalid elements - for example, the index 17 in an array of size 5. What happens when a program attempts to refer to these varies from language to language and platform to platform.
[edit] Uses
Although useful on their own, arrays also form the basis for several more complex data structures, such as heaps, hash tables, and VLists, and can be used to represent strings, stacks and queues. They also play a more minor role in many other data structures. All of these applications benefit from the compactness and locality of arrays.
One of the disadvantages of an array is that it has a single fixed size, and although its size can be altered in many environments, this is an expensive operation. Dynamic arrays automatically perform this resizing as late as possible, when the programmer attempts to add an element to the end of the array and there is no more space. To average the high cost of resizing over a long period of time (we say it is an amortized cost), they expand by a large amount, and when the programmer attempts to expand the array again, it just uses more of this reserved space.
In the C programming language, one-dimensional character arrays are used to store null-terminated strings, so called because the end of the string is indicated with a special reserved character called a null character ('\0').
[edit] Array Support
Programming language | Base index | Bound Check | Dimensions | Dynamic |
---|---|---|---|---|
Ada | n | checked | n | init1 |
APL7 | 0 or 1 | checked | n | init1 |
assembly language | 0 | unchecked | 1 | no |
BASIC | 1 | unchecked | 1 | init1 |
C | 0 | unchecked | n2 | heap3,4 |
C++5 | 0 | unchecked | n2 | heap3 |
C# | 0 | checked | n2 | heap3,9 |
Common Lisp | 0 | checked | n | yes |
Fortran | n | varies12 | n | heap3 |
IDL | 0 | checked | n | yes |
Java5 | 0 | checked | 12 | heap3 |
Lua | 1 | checked | 12 | yes |
MATLAB | 1 | checked | n8 | yes |
Oberon-1 | 0 | checked | n | no |
Oberon-2 | 0 | checked | n | yes |
Pascal | n | checked | n | varies10 |
PERL | n | checked | 12 | yes |
PHP | 0 | checked | n | |
PL/I | n | checked | ||
Python | 0 | checked | 12 | yes |
Ruby | 0 | checked | 12 | yes |
Scheme | 0 | checked | 12 | no |
Smalltalk5 | 1 | checked | 12 | yes6 |
Visual BASIC | n | checked | n | yes |
- Size can be chosen on initialization/declaration after which it is fixed.
- Allows arrays of arrays which can be used to emulate multi-dimensional arrays.
- Size can only be chosen when memory is allocated on the heap.
- C99 allows for variable size arrays – however there is almost no compiler available to support this new feature.
- This list is strictly comparing language features. In every language (even assembler) it is possible to provide improved array handling via add on libraries. This language has improved array handling as part of its standard library.
- The class Array is fixed-size, but OrderedCollection is dynamic.
- The indexing base can be 0 or 1, but is set for a whole "workspace".
- At least 2 dimensions (scalar numbers are 1×1 arrays, vectors are 1×n or n×1 arrays).
- Allows creation of fixed-size arrays in "unsafe" code, allowing for enhanced interoperability with other languages
- Varies by implementation. Newer implementations (FreePascal and Delphi) permit heap-based dynamic arrays.
- Behaviour can be tuned using compiler switches. As in DMD 1.0 bounds are checked in debug mode and unchecked in release mode for efficiency reasons.
- Almost all Fortran implementations offer bounds checking options via compiler switches. However by default, bounds checking is usually turned off for efficiency reasons.