

Arrays

For completeness and details on several parameter expansion variants, see the article about parameter expansion and check the notes about arrays.

An array is a way for the coder to collect multiple values (data, text, numbers) under a common name, the name of the array. The specific values are accessible using an index into the array. Thus, the array (in any programming language) is a useful and common data structure.

This is a symbolic (not a real!) view of an array named NAMES. It holds four elements, stored at the indexes 0 to 3.

NAMES
 0: Peter
 1: Anna
 2: Greg
 3: Jan

The purpose is clear: If you want the second name, you ask for the index 1 of the array NAMES. All your names are in the array NAMES, so you don't need 4 variables for 4 names, just one variable (the array) which contains many elements.

Attention:

  • As in C, the numerical array indexes start at 0 (zero)!
  • All Bash arrays are sparse!
  • Multidimensional arrays are not implemented

Bash supports two different indexing methods for arrays:

  • integer numbers (starts at 0)
  • strings (Bash 4 and later)

The indexing by numbers is what was shown above: Every element is indexed by an integer number, and all syntax used for both assigning and dereferencing indexed arrays is an arithmetic evaluation context.

The indexing by strings is called associative and requires Bash 4 or later.
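
A minimal sketch of the points above, assuming a hypothetical array called names that mirrors the NAMES picture. It is meant to show that the subscript of an indexed array is evaluated arithmetically and that a gap in the indexes is perfectly legal.

```bash
# Hypothetical array "names", mirroring the NAMES picture above.
names=(Peter Anna Greg Jan)

# The subscript is an arithmetic context: variables need no "$",
# and expressions are allowed.
i=1
second=${names[i]}        # Anna
third=${names[i + 1]}     # Greg

# Sparse: assigning index 10 does not create the indexes 4..9.
names[10]=Zoe
count=${#names[@]}        # 5 elements, not 11
indexes="${!names[*]}"    # 0 1 2 3 10

printf '%s\n' "$second" "$third" "$count" "$indexes"
```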

Declaration

The following explicitly give variables array attributes, making them arrays:

Syntax Description
ARRAY=() Declares an indexed array ARRAY and initializes it to be empty. This can also be used to empty an existing array.
ARRAY[0]= Generally sets the first element of an indexed array. If no array ARRAY existed before, it is created.
declare -a ARRAY Declares an indexed array ARRAY. An existing array is not initialized.
declare -A ARRAY Declares an associative array ARRAY (Bash 4 and later). Giving a variable the -A attribute (with declare, typeset or local) is the only way to create an associative array.
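
The declarations in the table can be tried out as follows. This is only a sketch. The names list, queue, cfg, plain and some_key are made up for illustration, and some_key is assumed to be unset.

```bash
# Hypothetical names: list, queue, cfg, plain, some_key.
list=()              # empty indexed array
declare -a queue     # indexed array attribute, no elements assigned yet
declare -A cfg       # associative array (Bash 4 and later)

queue[0]=first       # would also create "queue" if it did not exist yet
cfg[host]=localhost  # string key, valid only because of declare -A

# Without declare -A the subscript is evaluated arithmetically:
# the unset variable "some_key" evaluates to 0, so this writes element 0.
plain[some_key]=oops

printf '%s\n' "${#list[@]}" "${queue[0]}" "${cfg[host]}" "${plain[0]}"
```

The last assignment shows why an associative array has to be declared first: without the -A attribute, a string subscript silently ends up as a numerical index.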

Storing values

Storing values in arrays is just as simple as storing values in normal variables.

Syntax Description
ARRAY[N]=VALUE Sets the element N of the indexed array ARRAY to VALUE. N can be any valid arithmetic expression. Before Bash 4.3, negative indexes can't be used for assignment even if ARRAY already exists (unlike ksh and zsh); from Bash 4.3 on, a negative index counts back from one greater than the maximum assigned index. See "Getting values" below for negative indexes in expansions.
ARRAY[STRING]=VALUE Sets the element indexed by STRING of the associative array ARRAY (Bash 4 and later).
ARRAY=VALUE As above. If no index is given, as a default the zeroth element is set to VALUE. Careful, this is even true of associative arrays - there is no error if no key is specified, and the value is assigned to string index "0".
ARRAY=(E1 E2 …) Compound array assignment - sets the whole array ARRAY to the given list of elements indexed sequentially starting at zero. The array is unset before assignment unless the += operator is used. When the list is empty (ARRAY=()), the array will be set to an empty array. This method does not use explicit indexes, so an associative array (Bash 4 and later) can't be set like that. Clearing an associative array using ARRAY=() does work.
ARRAY=([X]=E1 [Y]=E2 …) Compound assignment for indexed arrays with index-value pairs declared individually (here for example X and Y). X and Y are arithmetic expressions. This syntax can be combined with the above - elements declared without an explicitly specified index are assigned sequentially starting at either the last element with an explicit index, or zero.
ARRAY=([S1]=E1 [S2]=E2 …) Individual mass-setting for associative arrays (Bash 4 and later). The named indexes (here: S1 and S2) are strings.
ARRAY+=(E1 E2 …) Append to ARRAY.
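
The assignment forms from the table, put together in one short sketch. The array names arr, mixed and colors are made up for illustration, and the comments state the indexes each line is expected to produce.

```bash
# Hypothetical arrays: arr, mixed, colors.
arr=(a b c)               # indexes 0 1 2
arr[5]=f                  # explicit index, leaves 3 and 4 unset
arr+=(g h)                # appended after the highest index: 6 and 7

mixed=([3]=x y [10]=z)    # "y" has no explicit index and lands on 4

declare -A colors=([sky]=blue [grass]=green)
colors[sun]=yellow        # add one more string-indexed element

arr_keys="${!arr[*]}"       # 0 1 2 5 6 7
mixed_keys="${!mixed[*]}"   # 3 4 10

printf '%s\n' "$arr_keys" "$mixed_keys" "${#colors[@]}"
```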

As of now, arrays can't be exported.

Getting values

Syntax Description
${ARRAY[N]} Expands to the value of the index N in the indexed array ARRAY. If N is a negative number, it is treated as an offset from one greater than the maximum assigned index, so -1 refers to the last element (Bash 4.2-alpha and later; for negative indexes in assignments, see "Storing values" above).
${ARRAY[S]} Expands to the value of the index S in the associative array ARRAY (Bash 4 and later).
"${ARRAY[@]}"
${ARRAY[@]}
"${ARRAY[*]}"
${ARRAY[*]}
Similar to mass-expanding positional parameters, this expands to all elements. If unquoted, both subscripts * and @ expand to the same result, if quoted, @ expands to all elements individually quoted, * expands to all elements quoted as a whole.
"${ARRAY[@]:N:M}"
${ARRAY[@]:N:M}
"${ARRAY[*]:N:M}"
${ARRAY[*]:N:M}
Similar to what this syntax does for the characters of a single string when doing substring expansion, this expands to M elements starting with element N. This way you can mass-expand individual indexes. The rules for quoting and the subscripts * and @ are the same as above for the other mass-expansions.

For clarification: When you use the subscripts @ or * for mass-expanding, then the behaviour is exactly what it is for $@ and $* when mass-expanding the positional parameters. You should read the article about positional parameters to understand what's going on.
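
The difference between the quoted and unquoted forms is easiest to see by counting the resulting words. The sketch below assumes a hypothetical array files and a small helper function count_args that simply prints how many arguments it received.

```bash
# Hypothetical array "files" and helper "count_args".
files=("a b" "c" "d e")
count_args() { echo "$#"; }

n_at_quoted=$(count_args "${files[@]}")     # 3: one word per element
n_star_quoted=$(count_args "${files[*]}")   # 1: all elements as one word
n_unquoted=$(count_args ${files[@]})        # 5: elements are word-split

slice=("${files[@]:1:2}")                   # 2 elements starting at index 1
joined=$(IFS=,; echo "${files[*]}")         # joined with first char of IFS

printf '%s\n' "$n_at_quoted" "$n_star_quoted" "$n_unquoted" "${slice[1]}" "$joined"
```

In practice "${ARRAY[@]}" is the form to reach for whenever elements may contain whitespace, and "${ARRAY[*]}" is mostly useful for joining elements into one string.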

Metadata

Syntax Description
${#ARRAY[N]} Expands to the length of an individual array member at index N (string length)
${#ARRAY[STRING]} Expands to the length of an individual associative array member at index STRING (string length; Bash 4 and later)
${#ARRAY[@]}
${#ARRAY[*]}
Expands to the number of elements in ARRAY
${!ARRAY[@]}
${!ARRAY[*]}
Expands to the indexes in ARRAY (since Bash 3.0)
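
A short sketch of the metadata expansions on a deliberately sparse array. The name arr is made up for illustration.

```bash
# Hypothetical sparse array "arr" with elements at indexes 2 and 7 only.
arr=([2]=apple [7]=kiwi)

len_elem=${#arr[2]}      # 5: string length of "apple"
n=${#arr[@]}             # 2: number of elements
keys=("${!arr[@]}")      # the indexes themselves: 2 and 7
len_zero=${#arr}         # 0: without a subscript, element 0 is meant

printf '%s\n' "$len_elem" "$n" "${keys[*]}" "$len_zero"
```

Note that ${#arr} without a subscript is not the element count. It is the string length of element 0, which is unset here.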

Destruction

The unset builtin command is used to destroy (unset) arrays or individual elements of arrays.

Syntax Description
unset ARRAY
unset ARRAY[@]
unset ARRAY[*]
Destroys a complete array
unset ARRAY[N] Destroys the array element at index N
unset ARRAY[STRING] Destroys the array element of the associative array at index STRING (Bash 4 and later)

Specifying unquoted array elements as arguments to any command, such as with the syntax above, may cause pathname expansion to occur, because the square brackets are glob characters.

Example: You are in a directory with a file named x1, and you want to destroy an array element x[1], with

unset x[1]

then pathname expansion will turn x[1] into the filename x1, so unset receives x1 instead of x[1] and the array element is left untouched!

Even worse, if nullglob is set and no file matches, the argument x[1] disappears from the command line altogether.

To avoid this, either disable pathname expansion or quote the array name and index:

unset 'x[1]'
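
The pitfall can be reproduced in a scratch directory. This is a sketch only. It assumes mktemp -d is available, default glob settings (no nullglob or failglob), and uses the hypothetical names tmp and orig_dir.

```bash
# Work in a throwaway directory that contains a file named x1.
orig_dir=$PWD
tmp=$(mktemp -d) || exit 1
cd "$tmp" || exit 1
touch x1

x=(a b c)

unset x[1]                 # the glob x[1] matches the file x1: runs "unset x1"
after_unquoted=${#x[@]}    # still 3, nothing was removed

unset 'x[1]'               # quoted: unset really sees x[1]
after_quoted=${#x[@]}      # 2

cd "$orig_dir" && rm -rf "$tmp"
printf '%s\n' "$after_unquoted" "$after_quoted"
```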

Numerical Index

Numerical indexed arrays are easy to understand and easy to use. The Purpose and Indexing chapters above more or less explain all the needed background theory.

Now, some examples and comments for you.

Let's say we have an array sentence which is initialized as follows:

sentence=(Be liberal in what you accept, and conservative in what you send)

Since nothing there prevents word splitting (no quotes), every word will be assigned to an individual array element. When you count the words you see, you should get 12. Now let's see if Bash has the same opinion:

$ echo ${#sentence[@]}
12

Yes, 12. Fine. You can take this number to walk through the array. Just subtract 1 from the number of elements, and start your walk at 0 (zero):

((n_elements=${#sentence[@]}, max_index=n_elements - 1))

for ((i = 0; i <= max_index; i++)); do
  echo "Element $i: '${sentence[i]}'"
done

Always keep this in mind, since it is a common stumbling block for beginners: numerical array indexing begins at 0 (zero)!

The method above, walking through an array by just knowing its number of elements, only works for arrays where all elements are set, of course. If one element in the middle is removed, then the calculation is nonsense, because the number of elements doesn't correspond to the highest used index anymore (we call them "sparse arrays").
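
One way to walk a sparse array safely is to iterate over the indexes that actually exist, using the ${!ARRAY[@]} expansion from the Metadata section. A sketch, with a shortened sentence array and a hypothetical result array visited:

```bash
sentence=(Be liberal in what you accept)
unset 'sentence[2]'          # punch a hole: "in" is gone

n=${#sentence[@]}            # 5 elements, but the highest index is still 5

# Iterate over the existing indexes instead of counting from 0 to n-1.
visited=()
for i in "${!sentence[@]}"; do
  visited+=("$i:${sentence[i]}")
done

printf '%s\n' "${visited[@]}"
```

A counting loop from 0 to n-1 would print an empty element for index 2 and never reach index 5.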

Associative (Bash 4)

Associative arrays (or hash tables) are not much more complicated than numerical indexed arrays. The numerical index value (in Bash a number starting at zero) is simply replaced with an arbitrary string:

# declare -A, introduced with Bash 4 to declare an associative array
declare -A sentence

sentence[Begin]="Be liberal in what "
sentence[Middle]="you accept, and conservative "
sentence[End]="in what you send"
sentence["Very end"]="..."

Beware: don't rely on the elements being stored in the order in which they were declared. It could look like this:

# output from 'set' command
sentence=([End]="in what you send" [Middle]="you accept, and conservative " [Begin]="Be liberal in what " ["Very end"]="...")

This effectively means you can get the data back with "${sentence[@]}", of course (just like with numerical indexing), but you can't rely on a specific order. If you want to store ordered data, or re-order data, go with numerical indexes. For associative arrays, you usually query known index values:

for element in Begin Middle End "Very end"; do
  printf "%s" "${sentence["$element"]}"
done
printf "\n"
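
When the keys are not known in advance, two things are commonly needed: testing whether a key exists, and getting the keys in a predictable order. The following is a sketch under the assumption that no key contains a newline. The names has_begin, has_middle and sorted_keys are made up for illustration.

```bash
declare -A sentence=([Begin]="Be liberal" [End]="in what you send")

# ${name[key]+word} expands to "word" only if the element is set,
# even when its value is the empty string.
has_begin=no
has_middle=no
[[ ${sentence[Begin]+set} ]] && has_begin=yes
[[ ${sentence[Middle]+set} ]] && has_middle=yes

# Sort the keys externally to get a stable iteration order.
mapfile -t sorted_keys < <(printf '%s\n' "${!sentence[@]}" | sort)

for key in "${sorted_keys[@]}"; do
  printf '%s => %s\n' "$key" "${sentence[$key]}"
done
```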

A nice code example: Checking for duplicate files using an associative array indexed with the SHA sum of the files:

# Thanks to Tramp in #bash for the idea and the code

unset flist; declare -A flist;
while read -r sum fname; do 
    if [[ ${flist[$sum]} ]]; then
        printf 'rm -- "%s" # Same as >%s<\n' "$fname" "${flist[$sum]}" 
    else
        flist[$sum]="$fname"
    fi
done <  <(find . -type f -exec sha256sum {} +)  >rmdups

Integer arrays

Any type attributes applied to an array apply to all elements of the array. If the integer attribute is set for either indexed or associative arrays, then values are considered as arithmetic for both compound and ordinary assignment, and the += operator is modified in the same way as for ordinary integer variables.

 ~ $ ( declare -ia 'a=(2+4 [2]=2+2 [a[2]]=a[2])' 'a+=(42 [a[4]]+=3)'; declare -p a )
declare -ai a='([0]="6" [2]="4" [4]="7" [5]="42")'

The zeroth element is assigned the result of 2+4. The next element, at index 2, is assigned the result of 2+2. The last index in the first assignment is the result of a[2], which has already been assigned as 4, and its value is also given as a[2]. This shows that even though any existing arrays named a have already been unset by using = instead of +=, arithmetic variables within keys can self-reference any elements already assigned within the same compound assignment. With integer arrays this is also true of the values, so long as they are arithmetic variables, not expansions (beginning with a $ sign).

The next assignment argument to declare uses +=, and since this is a compound assignment, it appends to the array rather than deleting it and creating a new array. The next-highest unassigned index is 5, which is set to 42. Lastly, the element whose index is the value of a[4] (which is 4) gets 3 added to its existing value, making a[4] == 7. Note that having the integer attribute set this time causes += to add, rather than append, as it would for a non-integer array.

The single quotes force the assignments to be evaluated by declare. It's possible to get unexpected results without them (as we would in this case) due to the way in which these compound assignment arguments are evaluated. A special-case of this is shown in the next section.
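
A simpler sketch of the same attribute, without the self-referencing keys. The array names nums and strs are made up for illustration, and the expression 2*3 is quoted so that it cannot be mistaken for a glob pattern.

```bash
# Hypothetical arrays: "nums" has the integer attribute, "strs" does not.
declare -ai nums
nums=(1+1 '2*3')     # values are evaluated arithmetically: 2 and 6
nums[0]+=5           # integer attribute: += adds, element 0 becomes 7
nums+=(10)           # compound +=: still appends a new element

declare -a strs=(1+1)
strs[0]+=5           # no integer attribute: += appends the string "5"

printf '%s\n' "${nums[*]}" "${strs[0]}"
```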

Indirection

Arrays can be expanded indirectly using the indirect parameter expansion syntax. Parameters whose values are of the form: name[index], name[@], or name[*] when expanded indirectly produce the expected results. This is mainly useful for passing arrays (especially multiple arrays) by name to a function.
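
Before the more involved examples, here is a minimal sketch of the basic mechanism. The function print_first_and_count and the array fruits are hypothetical. The function receives only the name of an array and builds the strings name[0] and name[@] for indirect expansion.

```bash
# Hypothetical helper: takes the NAME of an indexed array as $1.
print_first_and_count() {
    local ref_first="$1[0]" ref_all="$1[@]"

    # "${!ref_all}" expands like "${name[@]}": one word per element.
    local copy=("${!ref_all}")

    printf '%s %s\n' "${!ref_first}" "${#copy[@]}"
}

fruits=("red apple" banana cherry)
result=$(print_first_and_count fruits)
echo "$result"
```

The sketch breaks if the caller's array happens to be named like one of the function's local variables (copy, ref_first or ref_all), and the copy does not preserve the original indexes of a sparse array.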

The following example is an "isSubset"-like predicate which returns true if all key-value pairs of the array given as the first argument to isSubset correspond to a key-value pair of the array given as the second argument. It demonstrates both indirect array expansion and indirect key-passing without eval using the aforementioned special compound assignment expansion.

isSubset() {
    local -a 'xkeys=("${!'"$1"'[@]}")' 'ykeys=("${!'"$2"'[@]}")'
    set -- "${@/%/[key]}"

    (( ${#xkeys[@]} <= ${#ykeys[@]} )) || return 1

    local key
    for key in "${xkeys[@]}"; do
        [[ ${!2+_} && ${!1} == ${!2} ]] || return 1
    done
}

main() {
    # "a" is a subset of "b"
    local -a 'a=({0..5})' 'b=({0..10})'
    isSubset a b
    echo $? # true

    # "a" contains a key not in "b"
    local -a 'a=([5]=5 {6..11})' 'b=({0..10})'
    isSubset a b
    echo $? # false

    # "a" contains an element whose value != the corresponding member of "b"
    local -a 'a=([5]=5 6 8 9 10)' 'b=({0..10})'
    isSubset a b
    echo $? # false
}

main

This script is one way of implementing a crude multidimensional associative array by storing array definitions in an array and referencing them through indirection. The script takes two keys and dynamically calls a function whose name is resolved from the array.

callFuncs() {
    # Set up indirect references as positional parameters to minimize local name collisions.
    set -- "${@:1:3}" ${2+'a["$1"]' "$1"'["$2"]'}

    # The only way to test for set but null parameters is unfortunately to test each individually.
    local x
    for x; do
        [[ $x ]] || return 0
    done

    local -A a=(
        [foo]='(
            [r]=f
            [s]=g
            [t]=h
            )'
        [bar]='(
            [u]=i
            [v]=j
            [w]=k
            )'
        [baz]='(
            [x]=l
            [y]=m
            [z]=n
            )'
        ) \
        ${4+${a["$1"]+"${1}=${!3}"}} # For example, if "$1" is "bar" then define a new array: bar=([u]=i [v]=j [w]=k)

    ${4+${a["$1"]+"${!4-:}"}} # Now just lookup the new array. for inputs: "bar" "v", the function named "j" will be called, which prints "j" to stdout.
}

main() {
    # Define functions named {f..n} which just print their own names.
    local fun='() { echo "$FUNCNAME"; }' x

    for x in {f..n}; do
        eval "${x}${fun}"
    done

    callFuncs "$@"
}

main "$@"

captainmish, 2011/05/26 12:16

Handy for "slicing" - with

${ARRAY[*]:N:M}

the M is implied if omitted, so eg

${ARRAY[*]:5}

will expand to elements 6 to {end of array}

zik, 2011/06/13 19:24

It isn't necessary to subtract 1 from the number of elements. Just change the termination condition to '<'.

N_ELEMENTS=${#SENTENCE[@]}
  
for ((i = 0; i < N_ELEMENTS; i++)); do    # the increment is i plus plus but preview isn't showing that way.
  echo "Element $i: '${SENTENCE[i]}'"
done

Altair IV, 2012/01/12 14:25

You can safely loop through sparse and associative arrays using the ${!array[@]} expansion pattern.

for i in "${!array[@]}"; do
   echo "${array[$i]}"
done

Bash 4.2+ also has negative array indexing. You can expand the second-to-last element with ${array[-2]}, for example, and even apply a parameter expansion to it at the same time!

Yclept Nemo, 2012/11/29 04:43

The isSubset function in the Indirection section could really use some explanation, particularly:

[1]

  local -a 'xkeys=("${!'"$1"'[@]}")' 'ykeys=("${!'"$2"'[@]}")'

I couldn't find any examples of similar usage of local/declare anywhere else. The BashFAQ[1] states, "the right hand side of the assignment is not parsed by the shell". Yet clearly - and also in the following example - some level of parsing is performed.

  local -a 'a=({0..5})'

While I tested myself and for the right-hand side of declare/eval could not produce any unwanted side effects or arbitrary code execution, I can't guarantee that in this form either function is safe. Furthermore the behavior that allows these functions to dereference indirect variable references seems undocumented, and only to work with the array option. The following won't cause any expansion:

  local 'xkeys="${!'"$1"'[@]}"'

[2]

  set -- "${@/%/[key]}"

Sets the positional parameters to the value of the parameter expansion.

[3]

  [[ ${!2+_} && ${!1} == ${!2} ]] || return 1

In reference to ${name+_}, quoting the bash manual: "Omitting the colon results in a test only for a parameter that is unset. Put another way, if the colon is included, the operator tests for both parameter’s existence and that its value is not null; if the colon is omitted, the operator tests only for existence. "

[1] http://mywiki.wooledge.org/BashFAQ/006#Indirection
