

Arrays

For completeness and details on several parameter expansion variants, see the article about parameter expansion and check the notes about arrays.

An array is a way for the coder to collect multiple values (data, text, numbers) under a common name, the name of the array. The specific values are accessible using an index into the array. Thus, the array (in any programming language) is a useful and common data structure.

This is a symbolic (not a real!) view of an array named NAMES. It holds four elements; counted naturally they are the 1st to the 4th, but their indexes run from 0 to 3.

NAMES
 0: Peter
 1: Anna
 2: Greg
 3: Jan

The purpose is clear: If you want the second name, you ask for index 1 of the array NAMES. All your names are in the array NAMES; you don't need 4 variables for 4 names, just one variable (the array) which contains many elements.
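
As a minimal sketch, the symbolic view above could be written in Bash like this. The variable name second is only chosen for illustration.

```bash
#!/usr/bin/env bash
# The symbolic NAMES array from above, created with one compound assignment.
NAMES=(Peter Anna Greg Jan)

# Indexes start at 0, so index 1 holds the second name.
second=${NAMES[1]}

echo "Second name: $second"            # prints: Second name: Anna
echo "Number of names: ${#NAMES[@]}"   # prints: Number of names: 4
```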

Attention:

  • As in C, the array indexes start at 0 (zero)!
  • All Bash arrays are sparse!
  • Multidimensional arrays are not implemented

Bash supports two different indexing methods for arrays:

  • integer numbers (starts at 0)
  • strings (Bash 4 and later)

The indexing by numbers is what was shown above: Every element is indexed by an integer number, and the subscript used for both assigning and dereferencing indexed arrays is evaluated in an arithmetic context.

The indexing by strings is called associative (Bash 4 and later).
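
For indexed arrays, the arithmetic evaluation of the subscript can be tried out with a short sketch like the following. The array letters and the variable i are hypothetical names used only for this illustration.

```bash
#!/usr/bin/env bash
# Hypothetical array and variable names, for illustration only.
letters=(a b c d e)
i=1

# No $ is needed inside the subscript: it is evaluated as arithmetic.
first=${letters[i-1]}     # index 0
third=${letters[i+1]}     # index 2
letters[i*4]=E            # assigns to index 4

echo "$first $third ${letters[4]}"   # prints: a c E
```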

Declaration

The following explicitly give variables array attributes, making them arrays:

Syntax Description
ARRAY=() Declares an indexed array ARRAY and initializes it to be empty. This can also be used to empty an existing array.
ARRAY[0]= Generally sets the first element of an indexed array. If no array ARRAY existed before, it is created.
declare -a ARRAY Declares an indexed array ARRAY. An existing array is left as it is (it is not reinitialized).
declare -A ARRAY (Bash 4) Declares an associative array ARRAY. This is the one and only way to create associative arrays.
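
The declaration forms from the table can be compared side by side. This is only a sketch with made-up array names; the exact text printed by declare -p differs between Bash versions, so no output is shown.

```bash
#!/usr/bin/env bash
# All array names here are hypothetical.
declare -a indexed          # indexed array, no elements yet
declare -A assoc            # associative array (needs Bash 4 or later)
empty=()                    # indexed array, explicitly empty
implicit[0]=first           # created by the first element assignment

# declare -p shows the attributes: -a for indexed, -A for associative.
declare -p indexed assoc empty implicit
```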

Storing values

Storing values in arrays is just as simple as storing values in normal variables.

Syntax Description
ARRAY[N]=VALUE Sets the element N of the indexed array ARRAY to VALUE. N can be any valid arithmetic expression. In Bash versions before 4.3, negative indexes can't be used for assignment even if ARRAY already exists (unlike ksh and zsh); Bash 4.3 and later accept them and count back from the end of the array. See "Getting values" below for negative indexes in expansions.
ARRAY[STRING]=VALUE (Bash 4) Sets the element indexed by STRING of the associative array ARRAY.
ARRAY=VALUE As above. If no index is given, as a default the zeroth element is set to VALUE. Careful, this is even true of associative arrays - there is no error if no key is specified, and the value is assigned to string index "0".
ARRAY=(E1 E2 …) Compound array assignment - sets the whole array ARRAY to the given list of elements indexed sequentially starting at zero. The array is unset before assignment unless the += operator is used. When the list is empty (ARRAY=()), the array will be set to an empty array. (Bash 4) This method obviously does not use explicit indexes, so an associative array cannot be set like that! Clearing an associative array using ARRAY=() works.
ARRAY=([X]=E1 [Y]=E2 …) Compound assignment for indexed arrays with index-value pairs declared individually (here for example X and Y). X and Y are arithmetic expressions. This syntax can be combined with the above - elements declared without an explicitly specified index are assigned sequentially, starting at the index after the last explicitly specified one, or at zero if none has been given yet.
ARRAY=([S1]=E1 [S2]=E2 …) (Bash 4) Individual mass-setting for associative arrays. The named indexes (here: S1 and S2) are strings.
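
A short sketch of the assignment forms for indexed arrays, using the hypothetical arrays mixed and list. It shows where elements without an explicit index end up, and the difference between = and += in a compound assignment.

```bash
#!/usr/bin/env bash
# Hypothetical array names, for illustration only.

# Mixed compound assignment: explicit indexes and sequential elements.
# "six" follows index 5, "three" follows index 2.
mixed=(zero [5]=five six [2]=two three)

# += with a compound assignment appends instead of replacing the array.
list=(a b)
list+=(c d)

# Assigning without an index sets element 0.
list=A

echo "${list[@]}"     # prints: A b c d
echo "${!mixed[@]}"   # prints: 0 2 3 5 6
```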

As of now, arrays can't be exported.

Getting values

Syntax Description
${ARRAY[N]} Expands to the value of the index N in the indexed array ARRAY. If N is a negative number, it's treated as an offset from the maximum assigned index plus one, so -1 refers to the last element (since Bash 4.2-alpha; using a negative index for assignment requires Bash 4.3 or later).
${ARRAY[S]} (Bash 4) Expands to the value of the index S in the associative array ARRAY.
"${ARRAY[@]}"
${ARRAY[@]}
"${ARRAY[*]}"
${ARRAY[*]}
Similar to mass-expanding positional parameters, this expands to all elements. If unquoted, both subscripts * and @ expand to the same result; if quoted, @ expands to all elements individually quoted, while * expands to all elements quoted as a whole.
"${ARRAY[@]:N:M}"
${ARRAY[@]:N:M}
"${ARRAY[*]:N:M}"
${ARRAY[*]:N:M}
Similar to what this syntax does for the characters of a single string when doing substring expansion, this expands to M elements starting with element N. This way you can mass-expand individual indexes. The rules for quoting and the subscripts * and @ are the same as above for the other mass-expansions.

For clarification: When you use the subscripts @ or * for mass-expanding, the behaviour is exactly what it is for $@ and $* when mass-expanding the positional parameters. You should read the article about positional parameters to understand what's going on.
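
The quoting rules are easiest to see by counting the resulting words. The following is a sketch only; the array files and the helper function count_args are hypothetical names introduced for this example.

```bash
#!/usr/bin/env bash
# Hypothetical array; two of the elements contain spaces.
files=("first file" "second file" third)

# Hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

quoted_at=$(count_args "${files[@]}")     # 3: each element is one word
quoted_star=$(count_args "${files[*]}")   # 1: all elements joined into one word
unquoted=$(count_args ${files[@]})        # 5: the elements are word-split again

# Slice: two elements starting at index 1.
slice=("${files[@]:1:2}")

echo "$quoted_at $quoted_star $unquoted ${#slice[@]}"   # prints: 3 1 5 2
```

In almost all scripts the quoted @ form is the one you want, because it keeps every element intact.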

Metadata

Syntax Description
${#ARRAY[N]} Expands to the length of an individual array member at index N (string length)
${#ARRAY[STRING]} (Bash 4) Expands to the length of an individual associative array member at index STRING (string length)
${#ARRAY[@]}
${#ARRAY[*]}
Expands to the number of elements in ARRAY
${!ARRAY[@]}
${!ARRAY[*]}
Expands to the indexes in ARRAY (since Bash 3.0)
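
The three kinds of metadata can be compared on a small array with a gap in its indexes. The array colors and the variable names are assumptions made for this sketch.

```bash
#!/usr/bin/env bash
# Hypothetical array with a gap between the indexes.
colors=([0]=red [1]=green [7]=turquoise)

n_elements=${#colors[@]}      # number of set elements: 3
len_last=${#colors[7]}        # string length of "turquoise": 9
indexes=("${!colors[@]}")     # the indexes in use: 0 1 7

echo "$n_elements $len_last ${indexes[*]}"   # prints: 3 9 0 1 7
```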

Destruction

The unset builtin command is used to destroy (unset) arrays or individual elements of arrays.

Syntax Description
unset ARRAY
unset ARRAY[@]
unset ARRAY[*]
Destroys a complete array
unset ARRAY[N] Destroys the array element at index N
unset ARRAY[STRING] (Bash 4) Destroys the array element of the associative array at index STRING

Destroying individual array elements using the syntax above may cause pathname expansion to occur.

Example: You are in a directory with a file named x1, and you want to destroy an array element x[1], with

unset x[1]

then pathname expansion will expand x[1] to the filename x1, so the command that actually runs is unset x1 and your array element is left untouched!

Even worse, if nullglob is set and no file matches, the word x[1] disappears from the command entirely, and unset runs without any argument.

To avoid this, either disable pathname expansion or quote the array name and index:

unset "x[1]"
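
The pitfall can be reproduced in a scratch directory. This sketch assumes that mktemp is available and that nullglob and failglob are at their default (off) settings; the variable names tmpdir, old_pwd and still_there are made up for the example.

```bash
#!/usr/bin/env bash
x=(a b c)

# Work in a temporary directory that contains a file named x1.
old_pwd=$PWD
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch x1

# Unquoted: the glob x[1] matches the file x1, so this runs "unset x1".
unset x[1]
still_there=${x[1]}          # still "b"

# Quoted: no pathname expansion, the array element is really removed.
unset 'x[1]'

cd "$old_pwd" || exit 1
rm -rf "$tmpdir"

echo "$still_there"          # prints: b
echo "${!x[@]}"              # prints: 0 2
```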

Numerical Index

Numerical indexed arrays are easy to understand and easy to use. The Purpose and Indexing chapters above more or less explain all the needed background theory.

Now, some examples and comments for you.

Let's say we have an array SENTENCE which is initialized as follows:

SENTENCE=(Be liberal in what you accept, and conservative in what you send)

Since no special code is there to prevent word splitting (no quotes), every word there will be assigned to an individual array element. When you count the words you see, you should get 12. Now let's see if Bash has the same opinion:

$ echo ${#SENTENCE[@]}
12

Yes, 12. Fine. You can take this number to walk through the array. Just subtract 1 from the number of elements, and start your walk at 0 (zero):

N_ELEMENTS=${#SENTENCE[@]}
MAX_INDEX=$((N_ELEMENTS - 1))

for ((i = 0; i <= MAX_INDEX; i++)); do
  echo "Element $i: '${SENTENCE[i]}'"
done

Always keep this in mind, because it regularly trips up newcomers: numerical array indexing begins at 0 (zero)!

The method above, walking through an array by just knowing its number of elements, only works for arrays where all elements are set, of course. If one element in the middle is removed, then the calculation is nonsense, because the number of elements doesn't correspond to the highest used index anymore (we call them "sparse arrays").
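
A sparse array can still be walked safely by iterating over its indexes instead of counting. The following sketch reuses the SENTENCE array and removes one element first; the array visited is a hypothetical helper that records which indexes the loop saw.

```bash
#!/usr/bin/env bash
SENTENCE=(Be liberal in what you accept, and conservative in what you send)
unset 'SENTENCE[1]'     # remove "liberal", leaving a gap at index 1

# The element count is now 11, but the highest index is still 11, so a
# loop from 0 to count-1 would visit the gap and never reach "send".
visited=()
for i in "${!SENTENCE[@]}"; do
  visited+=("$i")
  echo "Element $i: '${SENTENCE[i]}'"
done
```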

Associative (Bash 4)

Associative arrays (or hash tables) are not much more complicated than numerical indexed arrays. The numerical index value (in Bash a number starting at zero) is simply replaced with an arbitrary string:

# declare -A, introduced with Bash 4 to declare an associative array
declare -A SENTENCE

SENTENCE[Begin]="Be liberal in what "
SENTENCE[Middle]="you accept, and conservative "
SENTENCE[End]="in what you send"
SENTENCE["Very end"]="..."

Beware: don't rely on the fact that the elements are ordered in memory like they were declared, it could look like this:

# output from 'set' command
SENTENCE=([End]="in what you send" [Middle]="you accept, and conservative " [Begin]="Be liberal in what " ["Very end"]="..." )

This effectively means that you can get the data back with "${SENTENCE[@]}", of course (just like with numerical indexing), but you can't rely on a specific order. If you want to store ordered data, or re-order data, go with numerical indexes. For associative arrays, you usually query known index values:

for element in Begin Middle End "Very end"; do
  printf "%s" "${SENTENCE["$element"]}"
done
printf "\n"
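
When the keys are not known in advance, they can be listed with the ${!ARRAY[@]} expansion described under Metadata. This is a sketch that needs Bash 4 or later; the variables n_keys, has_end and has_missing are hypothetical, and the order in which the loop prints the keys is not predictable.

```bash
#!/usr/bin/env bash
declare -A SENTENCE=(
  [Begin]="Be liberal in what "
  [Middle]="you accept, and conservative "
  [End]="in what you send"
)

n_keys=${#SENTENCE[@]}

# Walk over all keys; the order is up to Bash.
for key in "${!SENTENCE[@]}"; do
  printf '%s => %s\n' "$key" "${SENTENCE[$key]}"
done

# ${name[key]+word} expands to word only if the key exists.
has_end=${SENTENCE[End]+yes}
has_missing=${SENTENCE[Missing]+yes}

echo "keys=$n_keys End=${has_end:-no} Missing=${has_missing:-no}"
# prints: keys=3 End=yes Missing=no
```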

A nice code example: Checking for duplicate files using an associative array indexed with the SHA sum of the files:

# Thanks to Tramp in #bash for the idea and the code

unset flist; declare -A flist;
while read -r sum fname; do 
    if [[ ${flist[$sum]} ]]; then
        printf 'rm -- "%s" # Same as >%s<\n' "$fname" "${flist[$sum]}" 
    else
        flist[$sum]="$fname"
    fi
done <  <(find . -type f -exec sha256sum {} +)  >rmdups

Integer arrays

Any type attributes applied to an array apply to all elements of the array. If the integer attribute is set for either indexed or associative arrays, then values are considered as arithmetic for both compound and ordinary assignment, and the += operator is modified depending on whether the value refers to an array or element of an array.

 ~ $ ( declare -ia 'a=(2+4 [2]=2+2 [a[2]]=a[2])' 'a+=(42 [a[4]]+=3)'; declare -p a )
declare -ai a='([0]="6" [2]="4" [4]="7" [5]="42")'

The zeroth element is assigned the result of 2+4. The next element, at index 2, is assigned the result of 2+2. The next element index is the result of a[2], which has already been given the value 4, and the value is also taken from a[2]. This shows that arrays can be self-referential, and with integer arrays this is even true of the values so long as they are arithmetic variables, not expansions (beginning with a $ sign). The next assignment argument to declare uses +=, and since this is a compound assignment, it appends to the array rather than deleting it and creating a new array. The next-highest unassigned index is 5, which is set to 42. Lastly, the element whose index is the value of a[4] (which is 4) gets 3 added to its existing value, making a[4] == 7. Note that having the integer attribute set this time causes += to add, rather than append, as it would for a non-integer array.

Lastly, notice the single quotes. Due to the undocumented manner in which compound assignments as arguments to declaration commands are parsed and evaluated, it's possible to get unexpected results (as we would in this case). I recommend always single-quoting all compound assignment arguments to declaration commands unless you specifically intend to exploit this property. Since the exact behavior is virtually unknown (as of this writing), exploiting it isn't recommended.
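
The effect of the integer attribute on ordinary element assignment can also be seen in isolation. This is a simplified sketch with the hypothetical arrays plain and numbers; it avoids compound assignments as arguments to declare altogether.

```bash
#!/usr/bin/env bash
# Hypothetical arrays: one without and one with the integer attribute.
plain=(5)
declare -ai numbers
numbers=(5)

plain[0]+=3       # string append: "53"
numbers[0]+=3     # arithmetic addition: 8

numbers[1]=2*21   # the value is evaluated as arithmetic: 42

echo "${plain[0]} ${numbers[0]} ${numbers[1]}"   # prints: 53 8 42
```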

captainmish, 2011/05/26 12:16

Handy for "slicing" - with

${ARRAY[*]:N:M}

the M is implied if omitted, so e.g.

${ARRAY[*]:5}

will expand to everything from index 5 (the sixth element) to the end of the array

zik, 2011/06/13 19:24

It isn't necessary to subtract 1 from the number of elements. Just change the termination condition to '<'.

N_ELEMENTS=${#SENTENCE[@]}
  
for ((i = 0; i < N_ELEMENTS; i++)); do
  echo "Element $i: '${SENTENCE[i]}'"
done

Altair IV, 2012/01/12 14:25

You can safely loop through sparse and associative arrays using the ${!array[@]} expansion pattern.

for i in "${!array[@]}"; do
   echo "${array[$i]}"
done

Bash 4.2+ also has negative array indexing. You can expand the second-to-last element with ${array[-2]}, for example, and even apply a parameter expansion to it at the same time!

Yclept Nemo, 2012/11/29 04:43

The isSubset function in the Indirection section could really use some explanation, particularly:

[1]

  local -a 'xkeys=("${!'"$1"'[@]}")' 'ykeys=("${!'"$2"'[@]}")'

I couldn't find any examples of similar usage of local/declare anywhere else. The BashFAQ[1] states, "the right hand side of the assignment is not parsed by the shell". Yet clearly - and also in the following example - some level of parsing is performed.

  local -a 'a=({0..5})'

In my own tests I could not produce any unwanted side effects or arbitrary code execution from the right-hand side of declare/eval, but I can't guarantee that either function is safe in this form. Furthermore, the behavior that allows these functions to dereference indirect variable references seems to be undocumented, and to work only with the array option. The following won't cause any expansion:

  local 'xkeys="${!'"$1"'[@]}"'

[2]

  set -- "${@/%/[key]}"

Sets the positional parameters to the value of the parameter expansion.

[3]

  [[ ${!2+_} && ${!1} == ${!2} ]] || return 1

In reference to ${name+_}, quoting the bash manual: "Omitting the colon results in a test only for a parameter that is unset. Put another way, if the colon is included, the operator tests for both parameter’s existence and that its value is not null; if the colon is omitted, the operator tests only for existence. "

[1] http://mywiki.wooledge.org/BashFAQ/006#Indirection
